Thursday, October 1, 2009
The slides that we used for the Netflix Grand Prize technical presentation on September 21, 2009 are available here.
Tuesday, July 28, 2009
By a nose... or is it a hair...
Wow. What a crazy 24 hours that was.
After being quietly confident about our position but utterly nervous about the general silence on the leaderboard, we were struck by lightning when the newest coalition of coalitions, The Ensemble, submitted an entry just above ours, taking over first place merely 24 hours before the contest deadline. We were, of course, expecting the other parties to come to an agreement and join forces, but we were somehow hoping that they would come up short.
Well, with 24 hours to go and at least 0.01% to make up, we weren't going to go down without a fight. Many people offered their help (a big thanks to all who did). Predictors were blended in. New techniques were tried out. Code was written. Nothing seemed to tip the scale... In the end, with less than half an hour to go, Yehuda and Martin P. scraped up a few new predictors, Michael and Andreas worked some more blending magic, and we barely made it to a 10.09% tie for first. We had accomplished the day's mission and were now hoping that our test set score would be good enough to edge out a win.
Four short minutes before the end of the competition, another lightning bolt: The Ensemble had submitted at 10.10% and appeared to have sealed the deal. We could now only pray that they had overfit the quiz set. We too had done our fair share of quiz set blending, but we had the advantage of a month's worth of experiments to tweak the regularization. Either way, the contest was now over; it was out of our hands and all we could do was wait.
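(For the technically curious: "quiz set blending" here means fitting a linear combination of many predictors' outputs against a held-out set of ratings. Below is a minimal sketch of that idea in Python, with toy data and assumed names, not our actual code; the ridge penalty lam is the regularization knob mentioned above.)

import numpy as np

def ridge_blend(preds, targets, lam=1.0):
    """Closed-form ridge blend: minimize ||P w - y||^2 + lam * ||w||^2."""
    P = np.asarray(preds, dtype=float)
    y = np.asarray(targets, dtype=float)
    k = P.shape[1]
    return np.linalg.solve(P.T @ P + lam * np.eye(k), P.T @ y)

# Toy demonstration: three synthetic predictors with different noise levels.
rng = np.random.default_rng(0)
y = rng.uniform(1, 5, 1000)                       # "true" ratings
P = np.column_stack([y + rng.normal(0, s, 1000)   # noisy predictor outputs
                     for s in (0.9, 1.0, 1.1)])
w = ridge_blend(P, y, lam=10.0)
print("weights:", w, "blend RMSE:", np.sqrt(np.mean((P @ w - y) ** 2)))

The larger lam is, the less the blending weights can chase noise in the held-out set; picking it well is exactly the kind of thing a month of experiments buys you.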
The wait was excruciating. Emails were exchanged internally, without much enthusiasm. Hope was slim. Then word came out on Twitter that The Ensemble had won. It was over. We had lost.
All of a sudden, when we were no longer expecting it, an email from Netflix came in... Subj: "Netflix Prize Grand Prize Verification"... "Congratulations! The test subset performance of your team "BellKor's Pragmatic Chaos" on the following submission makes your team the current top contender for the Grand Prize."... Sorrow turned into Joy... We had succeeded. We couldn't believe it. After having lost all hope, we had come out on top. Now the only thing standing between us and the Grand Prize was the verification process. Truly an amazing and unexpected finish.
We won't know the details of the test set results for a little while, but it's possible that we actually finished with the same test score as The Ensemble. If that is the case, then the tie breaker would be the submission time of those tied entries, which are most likely both the ones from July 26th. That would mean that we have the lead only because we submitted our final result 20 minutes before theirs. Almost three years of competition may have come down to 20 short minutes. Again, amazing.
As we enter the evaluation process, we would like to thank everyone for their participation in this contest. In the end, there can only be one winner, but it wouldn't have been such a great competition without everyone else out there who worked long, hard hours on this crazy project. We have met truly great and interesting people along the way. Walking this long and winding road with all of you was certainly the best part of this adventure.
Cheers!
Tuesday, June 23, 2009
What's in a name?
As most of our readers must have already seen, we made a big splash today by forming a coalition with our closest competitors. There will be time to answer all of the burning questions about the combined team, but for now, I would like to start on a lighter note: the team name.
While BellKor's Pragmatic Chaos may not be the sexiest of names, in the end it was chosen because we felt it best served the main purpose: to give credit to each joining team and to provide instant recognition of what the new team represented. Also, it had a better ring to it than some other combinations, like Pragmatic BellKor Chaos or PT in BK in BT.
This was a tough decision, because we came up with quite a few creative ideas. Here is a rundown of all the names that were discussed along the way. Credit goes out to all members of the coalition.
First runner up:
The Usual Suspects - This idea stems from a quote in the movie Casablanca ("Round up the usual suspects"). While this is certainly a catchy name, we (PT) didn't feel that it included us entirely: we have not been officially recognized in the past, so we are not immediate "usual suspects".
Second runner up:
Million Dollar Baby - A most appropriate movie reference. This was the early favorite, but was eliminated because we felt that putting emphasis on the financial aspect didn't represent the spirit of the contest or of our coalition. This is also why we eliminated Show Me The Money as a potential name.
Category cocky:
Resistance is Futile - With the release of the new Star Trek movie, we thought that this quote was pretty cool... but perhaps a bit too aggressive.
The Dream Team - Again, a bit too arrogant, but this one is also a funny movie reference. Imagine a bunch of patients in a psychiatric ward working on the Netflix prize...
Catch Us If You Can - Another movie reference, but we didn't want to tempt people into actually catching us...
Category movie quotes:
Gonna Need a Bigger Boat (Jaws) - I love this one... I can imagine 7 guys from around the globe, who don't know each other very well, piled into a small life raft trying to get away from a huge shark... "Yeah, hmmm, I think we're gonna need a bigger boat here..."
Other suggested quotes:
Not in Kansas Anymore (Wizard of Oz)
Go Ahead, Make My Day (Sudden Impact)
Like A Box of Chocolates (Forrest Gump)
Another Nice Mess (Laurel and Hardy)
The Kindness of Strangers (A Streetcar Named Desire)
Miscellaneous:
Going All In - I actually liked this poker reference a lot... how it indicated that this was the final hand, win or lose.
All Aboard
First and Ten
Mission accomplished
42 - The answer to life, the universe and everything.
A small step for math
You see, we engineers, math whizzes and scientists can also be creative... but in the end, we do make the most logical choice... that's just the way we are.
Sunday, May 3, 2009
Netflix working on top secret project?
The people at Netflix are a clever bunch. Very clever indeed. All this time, they have led us to believe that the goal of this contest was to improve their movie recommendation engines. Well we, at Pragmatic Theory, have uncovered the truth behind this sham.
The reality is that the goal of this contest is to keep the brightest minds in the world occupied, working on this futile project, so that their scientists can be the first to complete work on their real mission: time travel.
This might sound a bit unrealistic, and you may ask us, "Do you have any proof, sir?"... oh but of course... yes, we have discovered hard evidence that not only are they working on a time travel machine, but in fact they have already found a breach in the space-time continuum... On with the facts.
By closely examining the Netflix dataset, one can find that 7 movies have a release year of NULL. While this is strange in itself, and puzzled us at first, we fortunately found help in the good old Netflix prize forum. Some great people have corrected this obvious mistake by finding the proper DVD release years. But here's the kicker... upon examining the dataset closer, it seems that many customers have rated these movies two, three, even four years before their actual release date. To us, the only logical explanation is that Netflix has a working time portal and has allowed a select few customers the privilege to use it. Of course, these movie buffs did the only logical thing when being propelled into the future... rent some new releases.
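For skeptics who want to reproduce the "evidence", here is a quick sketch of the check, assuming the standard dataset layout (movie_titles.txt with MovieID,YearOfRelease,Title lines, and one mv_*.txt training file per movie whose rating lines are CustomerID,Rating,Date). Paths and details are illustrative, not our actual tooling:

import csv, glob, os

DATA_DIR = "netflix_prize_data"   # hypothetical location of the dataset

# Build MovieID -> release year; 7 movies ship with a NULL year, and the
# forum-supplied corrected years would be plugged into this map by hand.
release_year = {}
with open(os.path.join(DATA_DIR, "movie_titles.txt"), encoding="latin-1") as f:
    for row in csv.reader(f):
        if row[1] != "NULL":
            release_year[row[0]] = int(row[1])

# Each training file starts with a "MovieID:" header line.
for path in glob.glob(os.path.join(DATA_DIR, "training_set", "mv_*.txt")):
    with open(path) as f:
        movie_id = f.readline().strip().rstrip(":")
        year = release_year.get(movie_id)
        if year is None:
            continue
        for line in f:
            customer, rating, date = line.strip().split(",")
            if int(date[:4]) < year:   # rated before the release year!
                print(movie_id, customer, date, "<- released", year)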
What? This isn't enough proof? Ahh but there's more...
In this other post on the forum, the prizemaster indicates that it is OK to use the data published as part of the KDD cup 2006. This is an additional set of ratings of the same movies, by the same users, but in the year 2006. Great! More data is good. But wait a minute... close examination of the dataset shows that 6 customer-movie pairs are found both in the Netflix prize quiz set AND in the KDD cup set... how can these people have rated the same movie in 2005 AND in 2006... bingo. Time travel.
Oh, I can hear you from here: "People are allowed to re-rate movies on the Netflix site"... ahhhh but why would someone re-rate a movie, only to give it the same rating? Impossible.
Those Netflix chaps thought they had it all planned out... good thing that we have uncovered this little plot... now perhaps we can beat them to the punch... if I can just find a street long enough to get this DeLorean to hit 88 MPH...
Friday, March 13, 2009
Friday The 13th
[LP] Hey, I saw you finally made it to number one in that Netflix contest. Did you implement that idea I gave you... you know, the one where you boost the ratings of horror films on Friday the 13th?
[PT] (sigh)
No, we didn't use movie metadata in our final push to number one. Instead, we relied on our original methodology... As the full moon sailed high, at midnight on Friday the 13th, we slaughtered a goat and offered its liver up to the gods...
Seriously, we're very happy about our recent progress and achieving this milestone in a little over a year. Thanks to everyone who has supported our team in various ways. Since we don't know how long we'll be on top, we captured this moment for posterity: http://pragmatictheory.googlepages.com/numberone
Now I'll go back to my BBQ. That goat will make a great roast.
Tuesday, March 3, 2009
All the way around the sun...
Thirty-one million five-hundred-thirty-six thousand seconds ago, we were two guys with an itch and a bit of spare time.
Five-hundred-twenty-five thousand six hundred minutes ago, we read an article about an interesting contest.
Eight thousand seven hundred sixty hours ago, we got a couple of ideas.
Three-hundred and sixty-five days ago, we knew nothing about collaborative filtering or matrix factorization.
Fifty-two weeks ago, we wondered how far engineering could take us in this world full of scholars, PhDs and other geniuses.
Twelve months ago, we decided to give it a shot.
Today marks the one year anniversary of team Pragmatic Theory.
Let's take a look back at some of the interesting milestones we achieved this year:
- March 9th 2008: First submission ever. Not very impressive: 0.9862
- May 10th 2008: Cracking the top 40. A little over 2 months in: 0.8822
- June 1st 2008: Cracking the top 10. The hill is getting steeper: 0.8731
- June 18th 2008: Above the Progress Prize 2007 line. Sixth place: 0.8707
- June 20th 2008: Cracking the top 5. June was a good month: 0.8699
- September 8th 2008: Second Place. Not for very long: 0.8655
- November 21st 2008: New York Times article revealing that we are, indeed, geeks.
- December 26th 2008: Back in second place and top individual team. Merry Christmas: 0.8620
- January 11th 2009: Breaking the Progress Prize 2008. Things are getting interesting: 0.8614
- February 27th 2009: When you type "pragmatic the" in Google, it actually suggests "pragmatic theory netflix".
To mark this wonderful anniversary, we decided to reveal some of our secrets. Follow this link to find out more.
Thursday, February 12, 2009
8756
We have been working recently on variants of BellKor's integrated model as described in their 2008 progress prize paper. We obtained results very similar to the published numbers: our implementation achieved 0.8790 RMSE on the Quiz Set (f=200), compared to the reported 0.8789.
This model proved superior to our own flavor of integrated model. What is interesting, however, is that we were able to leverage the best of both models and combine them. This combined model achieved a Quiz set RMSE of 0.8756 (f=200). This is, to our knowledge, the best reported number for a single model without blending. On today's leaderboard, this result alone would rank 47th.
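For readers who haven't seen the paper: the integrated model from BellKor's 2008 progress prize report (Koren's "Factorization Meets the Neighborhood") folds a bias baseline, an SVD++-style factor term and item-item neighborhood corrections into a single prediction rule. The sketch below shows the shape of that rule with randomly initialized parameters; it is a simplified illustration (a real implementation trains everything by stochastic gradient descent and restricts the neighborhood sums to the k most similar items), not our code or BellKor's:

import numpy as np
from collections import defaultdict
from types import SimpleNamespace

def toy_model(n_users, n_items, f, rng):
    """Randomly initialized parameters, only so the sketch executes."""
    return SimpleNamespace(
        mu=3.6,                                  # global rating mean
        bu=rng.normal(0, 0.1, n_users),          # user biases
        bi=rng.normal(0, 0.1, n_items),          # item biases
        p=rng.normal(0, 0.1, (n_users, f)),      # user factors
        q=rng.normal(0, 0.1, (n_items, f)),      # item factors
        y=rng.normal(0, 0.1, (n_items, f)),      # implicit-feedback factors
        w=defaultdict(float),                    # (i, j) -> explicit item-item weight
        c=defaultdict(float),                    # (i, j) -> implicit item-item weight
    )

def predict(m, u, i, rated, implicit):
    """rated: list of (item j, rating r_uj); implicit: items the user touched."""
    def baseline(uu, jj):
        return m.mu + m.bu[uu] + m.bi[jj]
    # SVD++-style term: user factors enriched with implicit feedback
    fb = sum(m.y[j] for j in implicit) / np.sqrt(len(implicit)) if implicit else 0.0
    pred = baseline(u, i) + m.q[i] @ (m.p[u] + fb)
    # Neighborhood corrections over the user's rated items
    if rated:
        s = 1.0 / np.sqrt(len(rated))
        pred += s * sum((r - baseline(u, j)) * m.w[(i, j)] for j, r in rated)
        pred += s * sum(m.c[(i, j)] for j, _ in rated)
    return pred

rng = np.random.default_rng(1)
m = toy_model(n_users=5, n_items=10, f=20, rng=rng)
print(predict(m, u=0, i=3, rated=[(1, 4.0), (7, 2.0)], implicit=[1, 7, 9]))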