
Tensors for Multi-Dimensional Recommenders

November 24, 2007

Posted by Andre Vellino in Collaborative filtering, Digital library, Recommender service.

It’s good to know that the intellectual heavy-lifting is there when you need it. The ideas in Peter Turney’s recent tech report on tensors may well be of some use to me when it comes to implementing the multi-dimensional component of the Synthese Recommender that Dave Zeber and I are developing at CISTI. The charts from our presentation at the WPRS workshop at Web Intelligence give some indication of how we intend to build these multi-dimensional matrices.
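To make the idea a little more concrete, here is a minimal sketch, in Python with NumPy rather than the MATLAB of Peter’s report, of the kind of structure involved: a third-order rating tensor indexed by user, item and context, together with a mode-1 unfolding, which is the usual starting point for Tucker/HOSVD-style decompositions. The dimensions and toy ratings are made up purely for illustration.

```python
import numpy as np

# Hypothetical third-order rating tensor: users x items x contexts.
# A zero entry means "no rating observed".
n_users, n_items, n_contexts = 4, 5, 3
T = np.zeros((n_users, n_items, n_contexts))

# (user, item, context, rating) tuples -- toy data for illustration only.
ratings = [(0, 1, 0, 4.0), (0, 3, 1, 5.0), (1, 1, 0, 3.0),
           (2, 4, 2, 2.0), (3, 0, 1, 4.5), (3, 3, 2, 1.0)]
for u, i, c, r in ratings:
    T[u, i, c] = r

# Mode-1 unfolding: each row collects one user's ratings across all
# (item, context) pairs. Tensor decompositions such as HOSVD work on
# SVDs of unfoldings like this one.
T_mode1 = T.reshape(n_users, n_items * n_contexts)

# Leading left singular vectors of the unfolding give a low-rank
# "user factor" matrix -- one ingredient of a Tucker-style decomposition.
U, s, Vt = np.linalg.svd(T_mode1, full_matrices=False)
user_factors = U[:, :2]
print(user_factors.shape)  # (4, 2)
```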

This component of our system is still some way in the future, but I expect the resulting tensors may be a challenge even for Peter’s MATLAB code (given in the paper). Fortunately, I think this can be combined with another idea for distributing recommenders across different subject domains, as described by F. Ricci et al. at RecSys 2007. That’s one way of both reducing the dimensionality of the original tensors and parallelizing the computation, and I agree with Daniel Lemire’s observation that Peter’s code could be optimized for parallel processors as well.
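For what it’s worth, here is a hedged sketch (in Python/NumPy, not taken from either paper) of the general partitioning scheme: split the user-item ratings by subject domain, build an item-item similarity model within each domain independently, and note that each per-domain computation can run on a separate processor. The domain labels and ratings are invented for illustration.

```python
import numpy as np

def item_similarities(R):
    """Cosine similarity between the item columns of a ratings matrix R."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for unrated items
    Rn = R / norms
    return Rn.T @ Rn

# Hypothetical user x item ratings matrix and a subject-domain label per item.
R = np.array([[5, 0, 3, 0, 1],
              [4, 0, 0, 2, 0],
              [0, 1, 0, 5, 4],
              [0, 2, 0, 4, 0]], dtype=float)
item_domain = np.array(["math", "math", "math", "bio", "bio"])

# Partition the columns by domain and build one similarity model per domain.
# Each call to item_similarities is independent, so the loop can be farmed
# out to separate processors or machines.
models = {}
for domain in np.unique(item_domain):
    cols = np.where(item_domain == domain)[0]
    models[domain] = (cols, item_similarities(R[:, cols]))

print({d: sim.shape for d, (cols, sim) in models.items()})
```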

Comments»

1. Daniel Lemire - November 24, 2007

I haven’t read the Ricci et al. paper, but the typical approach to scale this up is to partition the problem.

2. Andre Vellino - November 24, 2007

Yes, that’s Ricci’s approach. But the claim in that paper is that partitioning and distributing recommendations produces better accuracy as well.

3. James Bowery - November 25, 2007

The claim of better accuracy seems rather strange since by partitioning the problem you have less information for each of the partitions. How does he explain that?

4. Andre Vellino - November 25, 2007

Yes, this result does seem odd. As I recall from the presentation (I have only skimmed the paper, which is to be studied more carefully at a later date), the greater accuracy comes from the fact that if you break a large, sparse matrix up into pieces that are relatively dense (and zero elsewhere), the CF results computed on those denser sub-matrices are significantly more accurate. I think the trick is in how you “integrate” the distributed CF results.
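One plausible way to read “integrate” here (my own guess, not Ricci’s method) is to let each dense partition produce its own prediction and then combine them, weighting each one by how much evidence it had, e.g. the number of co-rated neighbours in that partition. A toy sketch:

```python
def combine_predictions(preds):
    """Combine per-partition CF predictions for one (user, item) pair.

    preds: list of (predicted_rating, support) pairs, where 'support'
    measures the evidence behind the prediction (e.g. the number of
    co-rated neighbours in that partition). Denser partitions carry
    more support and therefore more weight.
    """
    total = sum(support for _, support in preds)
    if total == 0:
        return None                      # no partition had any evidence
    return sum(r * support for r, support in preds) / total

# Two hypothetical partitions predict 4.2 (from 30 co-ratings)
# and 3.1 (from 5 co-ratings) for the same user-item pair.
print(combine_predictions([(4.2, 30), (3.1, 5)]))  # weighted toward 4.2
```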

5. James Bowery - March 3, 2008

That involves an interesting step: finding the combination of row and column orderings that yields the most densely populated partitions.
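A crude illustration of that step (a naive density heuristic, not the method from any of the papers mentioned): sort the rows and columns by how many ratings they contain, so that heavily-rated users and items gather in one block. A real system would use proper co-clustering, but the sketch shows the kind of reordering involved.

```python
import numpy as np

def density_ordering(R):
    """Reorder the rows and columns of a ratings matrix so that the most
    densely rated users and items come first. A crude stand-in for proper
    co-clustering, just to illustrate the reordering step."""
    mask = (R != 0)
    row_order = np.argsort(-mask.sum(axis=1))   # busiest users first
    col_order = np.argsort(-mask.sum(axis=0))   # most-rated items first
    return R[np.ix_(row_order, col_order)], row_order, col_order

R = np.array([[0, 0, 3, 0],
              [5, 4, 0, 0],
              [0, 0, 0, 1],
              [4, 5, 3, 0]], dtype=float)
reordered, rows, cols = density_ordering(R)
# The upper-left block of `reordered` is now the densest region,
# a natural candidate for the first partition.
print(reordered)
```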

