Recommendation Engines

The Netflix Prize was awarded to the team with the algorithm that most accurately guessed people’s movie tastes. Accurate according to one particular measure: root-mean-squared error, essentially the L₂ norm of the prediction errors.

In my opinion, that’s the wrong measure of success. Netflix selected for algorithms that predicted well across all data, penalizing large misses extra. But that’s not what makes a recommendation algorithm good.
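To make the measure concrete, here is a minimal sketch of RMSE in Python (the ratings are made-up numbers, just for illustration):

```python
import math

def rmse(predicted, actual):
    """Root-mean-squared error over all predictions."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

# Hypothetical ratings on Netflix's 1-5 star scale.
actual    = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]
print(round(rmse(predicted, actual), 4))  # → 0.6124
```

Note that every prediction enters the sum with equal weight, and squaring means a large miss costs more than two small ones.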

The best algorithm, I think, should observe my tastes and recommend just one product that I’ve never heard of (or at least never tried), that I absolutely love. It’s OK if I like a movie and you show me another one by the same director — but I could have done that myself. The best algorithm would say:

You like Cowboy Bebop + Out Of Africa + Winged Migration, so you will like Seven Samurai.

Cowboy Bebop indicates that I like Asian sh*t; Out Of Africa is an old classic; Winged Migration doesn’t have a lot of talking. Put them together and you get an Asian classic without a lot of talking.

That’s just an example of a recommendation that would fit my criteria of goodness.

In other words,

  1. only the “most recommended” movie matters
  2. it should blow me away
  3. it should be surprising

RMSE fails #1 because, under it, accuracy on the highest recommendation counts no more than accuracy on every other prediction.
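A toy example makes the failure obvious. Assuming a made-up four-item catalog where item 0 is the one recommendation I would actually love, RMSE cannot tell the difference between badly missing on that item and badly missing on an item I don’t care about:

```python
import math

def rmse(predicted, actual):
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

actual = [5.0, 3.0, 3.0, 3.0]   # item 0 is the movie I'd love

# Case A: miss badly on the top item, nail everything else.
miss_top  = [2.0, 3.0, 3.0, 3.0]
# Case B: nail the top item, miss badly on a movie I don't care about.
miss_tail = [5.0, 3.0, 3.0, 6.0]

print(rmse(miss_top, actual))   # → 1.5
print(rmse(miss_tail, actual))  # → 1.5  (identical score)
```

Both cases score identically, yet only Case B gets the one recommendation that matters right.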

As a result, today’s recommendation engines are conservative in the wrong ways, cobbling together whatever machine-learning fad is current.

