Sunday, May 3, 2009

Netflix working on top secret project?

The people at Netflix are a clever bunch. Very clever indeed. All this time, they have led us to believe that the goal of this contest was to improve their movie recommendation engine. Well, we at Pragmatic Theory have uncovered the truth behind this sham.

The reality is that the goal of this contest is to keep the brightest minds in the world occupied, working on this futile project, so that their scientists can be the first to complete work on their real mission: time travel.

This might sound a bit unrealistic, and you may ask us "Do you have any proof, sir?"... oh but of course... yes, we have discovered hard evidence that not only are they working on a time travel machine, but in fact they have already found a breach in the space-time continuum... on with the facts.

By closely examining the Netflix dataset, one can find that 7 movies have a release year of NULL. While this is strange in itself, and puzzled us at first, we fortunately found help in the good old Netflix prize forum. Some great people have corrected this obvious mistake by finding the proper DVD release years. But here's the kicker... upon examining the dataset more closely, it seems that many customers have rated these movies two, three, even four years before their actual release date. To us, the only logical explanation is that Netflix has a working time portal and has allowed a select few customers the privilege of using it. Of course, these movie buffs did the only logical thing when being propelled into the future... rent some new releases.
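For fellow investigators, here is a minimal sketch of how one might hunt for these time travelers. It assumes the ratings are already loaded as (customer, movie, rating, date) tuples and that the corrected release years sit in a dictionary; the function name, the tuple layout, and the sample values are all made up for illustration and are not the actual Netflix data format.

```python
from datetime import date

def ratings_before_release(ratings, release_years):
    """Return ratings whose date falls in a year before the movie's release year.

    ratings: iterable of (customer_id, movie_id, rating, rating_date) tuples
             (hypothetical layout, not the real file format).
    release_years: dict mapping movie_id to release year, or None if unknown.
    """
    early = []
    for customer_id, movie_id, rating, rating_date in ratings:
        year = release_years.get(movie_id)
        # Skip movies whose release year is still unknown (NULL).
        if year is not None and rating_date.year < year:
            early.append((customer_id, movie_id, rating_date.year, year))
    return early

# Made-up sample data, for illustration only.
release_years = {1: 2003, 2: 2005, 3: None}
ratings = [
    (101, 1, 4, date(2004, 5, 1)),  # rated after release: nothing suspicious
    (102, 2, 5, date(2002, 3, 9)),  # rated three years early: time traveler
    (103, 3, 3, date(2001, 1, 1)),  # release year unknown: skipped
]
time_travelers = ratings_before_release(ratings, release_years)
```

Comparing whole years rather than exact dates is a deliberate simplification here, since only release years are available for these movies.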

What? This isn't enough proof? Ahh but there's more...

In this other post on the forum, the prizemaster indicates that it is OK to use the data published as part of the KDD cup 2006. This is an additional set of ratings of the same movies, by the same users, but in the year 2006. Great! More data is good. But wait a minute... close examination of the dataset shows that 6 customer-movie pairs are found both in the Netflix prize quiz set AND in the KDD cup set... how can these people have rated the same movie in 2005 AND in 2006... bingo. Time travel.
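The overlap check itself is just a set intersection. The sketch below assumes each dataset has been reduced to (customer, movie) pairs; the function name and the sample pairs are invented for illustration and do not come from either dataset.

```python
def shared_pairs(quiz_pairs, kdd_pairs):
    """Return the customer-movie pairs that appear in both collections, sorted."""
    return sorted(set(quiz_pairs) & set(kdd_pairs))

# Made-up (customer_id, movie_id) pairs, for illustration only.
quiz_pairs = [(101, 7), (102, 9), (103, 12)]
kdd_pairs = [(102, 9), (104, 7), (103, 12)]

overlap = shared_pairs(quiz_pairs, kdd_pairs)
```

Any pair that lands in `overlap` was rated in both periods, which is either a re-rating or, obviously, time travel.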

Oh, I can hear you from here: "People are allowed to re-rate movies on the Netflix site"... ahhhh but why would someone re-rate a movie, only to give it the same rating? Impossible.

Those Netflix chaps thought they had it all planned out... good thing that we have uncovered this little plot... now perhaps we can beat them to the punch... if I can just find a street long enough to get this DeLorean to hit 88 MPH...

9 comments:

Kenneth Hoste said...

I knew it!

Great post. ;-)

Anonymous said...

The Netflix prize set had redundant movie/user pairs. People rerate the same movies. As for why they used the same rating? Maybe they accidentally clicked another rating and then clicked back?

Hassan said...

Anonymous is clearly a desperate Netflix scientist who is not happy because you caught them red-handed. Busted!

Anonymous said...

But Sir(s), I haven't looked at this "futile project" in a while, but if I remember correctly, a subset of the ratings contains records that have been randomly altered (to avoid possible identification of customers via their ratings). I forget what proportion of records were altered. It is written somewhere.

Good luck,

Algorista said...

Hahahahahahaha...
Very great !

aangtce said...

Oh Anonymous why don't you just give up and focus your efforts on the timetravel?? Or are you really just the minister of misinformation???

Steve said...

Any thoughts on using the fact that something is a re-rating to improve predictions?
Maybe re-ratings tend to be up or down. I doubt it's that novel an idea, but I don't recall seeing much on it.

jb said...

Pragmatic Theory is now back on top?
Go go go!! :)

nevin said...

Calisse!! Also, if you look at the leaderboard, quite often the top five or so leaders have submitted scores tens of hours (if not days) after the actual current time... I'm on board with your theory. [It's currently 21-06-2009 00:16 am eastern std. and your latest score was submitted 22-06-2009 02:16 am]