Collaborative Filtering January 16, 2007

Posted by Andre Vellino in Collaborative filtering, Digital library, Recommender service.

One area of research that I think will be fruitful, and beneficial to the end-user of a science library portal, is collaborative filtering (CF). The general idea is to use information about users’ past usage to help cluster or rank search results, offer serendipitous recommendations, and automatically update library “alerts”.

Things like that have been tried before, mostly for commercial purposes. A9 (owned by Amazon) and the Google portal do it, so why not a digital library? One challenge in applying this kind of technique fruitfully in a sparsely frequented scientific e-library is that there may not be enough usage data to draw meaningful conclusions without additional information. It will probably need to be combined with other techniques, such as text analysis of retrieved items, query analysis, and explicit personal profiles.

I like some things about the suggester at LibraryThing (though not the unsuggester, except perhaps for entertainment value). What I like best about the portal version of the “suggester” is the explanation feature: why the recommender system made the recommendation. This could be especially useful for eliminating unwanted collections of suggestions in those instances where the sample space is sparse and heterogeneous. Which leads me to wish for a refinement of the explanation feature: recommendations could be returned in clusters, perhaps even a hierarchical clustering, based on the reason(s) the recommendation was made.
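To make the wish concrete, here is a minimal sketch of recommendations grouped by the reason they were made. It assumes, purely for illustration, that the "reason" is the item in the user's history that triggered the suggestion, and that a co-occurrence table (how often two items were used by the same people) is already available. The function names, the data layout, and the paper identifiers are all hypothetical, and the grouping is flat rather than hierarchical.

```python
def recommend_with_reasons(history, cooccurrence):
    """Group suggestions by the history item that caused them.

    history: list of items the user has already used.
    cooccurrence: {item: {other_item: count}} (assumed precomputed from usage logs).
    Returns {reason_item: [suggested items, strongest first]}.
    """
    seen = set(history)
    clusters = {}
    for source in history:
        neighbours = cooccurrence.get(source, {})
        # Skip anything the user already has; rank by count, then by name for stability.
        candidates = [(count, other) for other, count in neighbours.items()
                      if other not in seen]
        candidates.sort(key=lambda pair: (-pair[0], pair[1]))
        if candidates:
            clusters[source] = [other for _, other in candidates]
    return clusters


def drop_cluster(clusters, reason):
    """Discard every suggestion that was made because of `reason`."""
    return {r: items for r, items in clusters.items() if r != reason}


# Hypothetical usage data.
history = ["paper_a", "paper_b"]
cooccurrence = {
    "paper_a": {"paper_c": 3, "paper_d": 1, "paper_b": 5},
    "paper_b": {"paper_e": 2},
}
clusters = recommend_with_reasons(history, cooccurrence)
```

With suggestions grouped this way, a user who sees an unwanted cluster can dismiss it in one step by its reason, for example `drop_cluster(clusters, "paper_b")`.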

People used to be worried about the privacy issues with data from search analytics, but I think it’s clear that if the value to the user reaches a certain level, privacy no longer matters (much). Furthermore, there may even be some general social acceptance and understanding about what an automated recommendation service can do for you. Chris Anderson, in The Long Tail, credits Amazon’s CF system for the rediscovery of forgotten book treasures such as Touching the Void. People now seem to want software to help them, provided it doesn’t look like a paper clip and beep at you :-).

Comments»

1. Peter Turney - January 17, 2007

On the topic of collaborative filtering:

Web Page Recommendation System:
StumbleUpon Home: http://www.stumbleupon.com/
My StumbleUpon: http://pdturney.stumbleupon.com/

Collaborative Email Filtering:
Home: http://www.cloudmark.com/
Desktop Spam Filter: http://www.cloudmark.com/desktop/

I’ve been using both of these for at least a year. They work very well.

2. Andre Vellino - January 17, 2007

I thought about subscribing to StumbleUpon when you first pointed it out to me, but it didn’t earn my trust. I just wasn’t comfortable with it, though I don’t know why. I’m not even sure I trust Google these days.

I’ll have to look more closely at Cloudmark, but it doesn’t appear to be very different from SMTP RBLs.

3. Daniel Lemire - January 18, 2007

I agree that collaborative filtering should be easily explained. This is why we should seek extremely simple-to-understand algorithms.

Example:
http://en.wikipedia.org/wiki/Slope_One
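
For readers who have not met it, the following is a minimal sketch of the weighted Slope One scheme, written for illustration rather than taken from any particular implementation. It assumes ratings are held in a plain dictionary of the form {user: {item: rating}}; the function names and the sample data are hypothetical.

```python
from collections import defaultdict


def train_slope_one(ratings):
    """Compute average rating differences between every pair of items.

    ratings: {user: {item: rating}}
    Returns (dev, freq) where dev[i][j] is the mean of (rating_i - rating_j)
    over users who rated both, and freq[i][j] is how many such users exist.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    freq = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[i][j] += r_i - r_j
                    freq[i][j] += 1
    dev = {i: {j: diff_sum[i][j] / freq[i][j] for j in diff_sum[i]}
           for i in diff_sum}
    freq = {i: dict(row) for i, row in freq.items()}
    return dev, freq


def predict(user_ratings, item, dev, freq):
    """Weighted Slope One prediction of `item` for one user, or None if impossible."""
    numerator = 0.0
    denominator = 0
    for j, r_j in user_ratings.items():
        n = freq.get(item, {}).get(j, 0)
        if j != item and n > 0:
            # Each co-rated item votes (its rating + average difference),
            # weighted by the number of users behind that difference.
            numerator += (dev[item][j] + r_j) * n
            denominator += n
    return numerator / denominator if denominator else None


# Hypothetical ratings for three users and three items.
ratings = {
    "u1": {"A": 5, "B": 3, "C": 2},
    "u2": {"A": 3, "B": 4},
    "u3": {"B": 2, "C": 5},
}
dev, freq = train_slope_one(ratings)
prediction = predict(ratings["u3"], "A", dev, freq)
```

Here the prediction for u3 on item A is 13/3, about 4.33, and each term in the sum can be shown to the user as "people who rated B rated A this much higher on average", which is what makes the method easy to explain.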

