Tuesday, July 28, 2009

By a nose... or is it a hair...

Wow. What a crazy 24 hours that was.

After being quietly confident about our position but utterly nervous about the general silence on the leaderboard, we were struck by lightning when the newest coalition of coalitions, The Ensemble, submitted an entry just above ours, taking over first place merely 24 hours before the contest deadline. We were, of course, expecting the other parties to come to an agreement and join forces, but we were somehow hoping that they would come up short.

Well, with 24 hours to go and at least 0.01% to come up with, we weren't going to go down without a fight. Many people offered their help (a big thanks to all who did). Predictors were blended in. New techniques were tried out. Code was written. Nothing seemed to be helping to tip the scale... In the end, with less than half an hour to go, Yehuda and Martin P. scraped up a few new predictors, Michael and Andreas worked some more blending magic and we barely made it to a 10.09% tie for first. We had accomplished the day's mission and were now hoping that our test set score would be good enough to edge out a win.

Four short minutes before the end of the competition, another lightning bolt. The Ensemble had submitted at 10.10% and appeared to have sealed the deal. We could now only pray that they had overfit the quiz set. We too had done our fair share of quiz set blending, but we had the advantage of having had a month's worth of experiments to tweak the regularization. In any case, the contest was now over; it was out of our hands and all we could do was wait.

The wait was excruciating. Emails were exchanged internally, without much enthusiasm. Hope was slim. Then the word came out on Twitter that The Ensemble had won. It was over. We had lost.

All of a sudden, when we were no longer expecting it, an email from Netflix came in... Subj: "Netflix Prize Grand Prize Verification"... "Congratulations! The test subset performance of your team “BellKor's Pragmatic Chaos” on the following submission makes your team the current top contender for the Grand Prize."... Sorrow turned into Joy... We had succeeded. We couldn't believe it. After having lost all hope, we had come out on top. Now the only thing standing between us and the Grand Prize was the verification process. Truly an amazing and unexpected finish.

We won't know the details of the test set results for a little while, but it's possible that we actually finished with the same test score as The Ensemble. If that is the case, then the tie breaker would be the submission time of those tied entries, which are most likely both the ones from July 26th. That would mean that we have the lead only because we submitted our final result 20 minutes before theirs. Almost three years of competition may have come down to 20 short minutes. Again, amazing.
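For the curious, the tie-break described above can be sketched in a few lines of Python. This is only an illustration of the rule as we understand it (best test score first, earlier submission time wins a tie), not Netflix's actual judging code. The `Entry` class, the `pick_winner` function, the team names, scores and timestamps are all hypothetical placeholders; the real test set scores have not been published.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Entry:
    """One submission. All field values used below are made up for illustration."""
    team: str
    test_improvement: float  # percent improvement on the test set (hypothetical)
    submitted_at: datetime   # submission time (hypothetical)


def pick_winner(entries):
    """Return the entry with the highest test improvement.

    If several entries share the same improvement, the one submitted
    earliest is preferred (the assumed tie-break rule).
    """
    # Sort key: larger improvement first (hence the minus sign),
    # then earlier submission time.
    return min(entries, key=lambda e: (-e.test_improvement, e.submitted_at))


# Two hypothetical entries with identical scores, 20 minutes apart.
tied = [
    Entry("Team B", 10.0, datetime(2009, 7, 26, 18, 20)),
    Entry("Team A", 10.0, datetime(2009, 7, 26, 18, 0)),
]
winner = pick_winner(tied)
print(winner.team)
```

With identical scores, the earlier of the two hypothetical submissions is selected; if one score were strictly higher, the submission time would not matter at all.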

As we enter the evaluation process, we would like to thank everyone for their participation in this contest. In the end, there can only be one winner, but it wouldn't have been such a great competition without everyone else out there who worked long and hard hours on this crazy project. We have met truly great and interesting people along the way. Walking this long and winding road with all of you was certainly the best part of this adventure.



Patent_Merc said...

I would like to congratulate you on your success and hard work. I have been watching and learning from everyone's posts throughout the competition and will miss my daily "Netflix Leaderboard/Forum" fix.


Anonymous said...

Thanks for sharing. That must have been intense!

I'm wondering if it was even closer than a nose... or a hair....

How about closer than a gnat's pube? :-)

Anonymous said...

The Ensemble submitted 10.09 on July 25th. Your team submitted 10.09 on July 26th.

Then the Ensemble submitted 10.10 on July 26 after you.

So it looks to me that the Ensemble got to 10.09 first.

We have already seen the papers from BellKor and BigChaos.

You should have joined the Ensemble :)

Anonymous said...

Since the test set was a lottery draw wherein any team could have come out on top, isn't it a little awkward for BPC to justify why their results would be better, since the other team (The Ensemble) is visibly better than them on the leaderboard?

Anonymous said...

Anonymous #2,

Please note that the Test set is not a lottery draw, but the ultimate goal of this competition. Competitors put much thought into how to optimize performance on that Test set. This is unlike the Quiz set, or leaderboard, which should have been taken just as a proxy for the Test set. For more explanation, please refer to the competition FAQ.

Anonymous said...

Wow, that was a really dramatic twist! Congratulations to you and your teammates!