(a.k.a. why one submission a day is enough)
Serious attempts at the Netflix challenge require blending results from many algorithms. Blending is an operation that transforms multiple estimates into a single, higher-accuracy estimate. This is a brief tutorial on the steps involved. Experienced Netflix participants need not read further.
Step 1: construct a reduced training set
To blend the models, you need to construct a reduced training set by excluding from the Netflix-provided training set all ratings present in the probe set.
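As a minimal sketch of this step, assuming the ratings are already loaded into NumPy arrays (the column layout and the name `reduce_training_set` are illustrative, not part of the Netflix data format):

```python
import numpy as np

def reduce_training_set(train, probe):
    """Return the rows of `train` whose (user, movie) pair is not in `probe`.

    Assumed layout (hypothetical, for illustration only):
      train: int array of shape (n, 3), columns = user, movie, rating
      probe: int array of shape (m, 2), columns = user, movie
    """
    # Build a set of probe pairs for fast membership tests.
    probe_pairs = set(map(tuple, probe[:, :2].tolist()))
    # Keep a training row only if its pair does not appear in the probe set.
    keep = np.array(
        [(u, m) not in probe_pairs for u, m in train[:, :2].tolist()],
        dtype=bool,
    )
    return train[keep]

# Tiny made-up example: two of the four training ratings are probe ratings.
train = np.array([[1, 10, 4], [1, 11, 3], [2, 10, 5], [3, 12, 1]])
probe = np.array([[1, 11], [3, 12]])
reduced = reduce_training_set(train, probe)
```

On the real data set a pure-Python loop like this would be slow; it is only meant to make the set difference explicit.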
Step 2: train your different models on the reduced training set
For now we train each individual model on the reduced training set. Later we will re-train all the models on the full training set. To re-train in a consistent way, it is critical at this step to carefully record all the parameters used, the number of training epochs, etc.
Step 3: predict the probe set
For each model trained on the reduced training set, predict the probe set.
Step 4: select your favorite blending recipe
This step takes as input each model's predictions for the probe set, together with the real probe set ratings. The output is a function that mixes the individual model predictions into a blended prediction, hopefully better than any individual result. A simple linear regression will get you a long way, but feel free to experiment with your favorite machine learning algorithm. The key point is that any unknown coefficients (for example, the linear regression coefficients) can be chosen to minimize the error between the blended prediction and the real probe set ratings.
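The linear regression variant might look like the sketch below. It assumes the probe predictions are stacked as one column per model; the function names and the synthetic data are made up for illustration, and the intercept term is a choice of this sketch rather than something the recipe requires.

```python
import numpy as np

def rmse(pred, truth):
    """Root mean squared error between two 1-D arrays."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def fit_linear_blend(probe_preds, probe_ratings):
    """Least-squares blending weights.

    probe_preds: shape (n_ratings, n_models), one column per model.
    Returns n_models + 1 weights; the last one is the intercept.
    """
    X = np.column_stack([probe_preds, np.ones(len(probe_preds))])
    weights, *_ = np.linalg.lstsq(X, probe_ratings, rcond=None)
    return weights

def apply_blend(preds, weights):
    """Blend predictions with weights from fit_linear_blend."""
    return preds @ weights[:-1] + weights[-1]

# Synthetic stand-in for two models' probe predictions (not real data).
rng = np.random.default_rng(0)
truth = rng.integers(1, 6, size=1000).astype(float)
preds = np.column_stack([
    truth + rng.normal(0.0, 1.0, size=1000),
    truth + rng.normal(0.0, 1.2, size=1000),
])
weights = fit_linear_blend(preds, truth)
blended = apply_blend(preds, weights)
```

On the data used for fitting, the blended RMSE can never be worse than that of any single model, because "use model k alone" is one of the weight vectors the regression can choose. That guarantee does not extend to unseen data.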
N.B. If overfitting the blending function is an issue, partition the probe set in two. Use one part to fit the function and the other for validation.
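One way to do the split, sketched under the assumption of a random 50/50 partition (the fraction, the seed, and the helper names are arbitrary choices for illustration):

```python
import numpy as np

def split_probe(n_probe, holdout_fraction=0.5, seed=0):
    """Randomly partition probe indices into (fit_idx, holdout_idx)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_probe)
    n_holdout = int(n_probe * holdout_fraction)
    return order[n_holdout:], order[:n_holdout]

def holdout_rmse(probe_preds, probe_ratings, fit_idx, holdout_idx):
    """Fit a linear blend on one part and report RMSE on the other."""
    X = np.column_stack([probe_preds, np.ones(len(probe_preds))])
    weights, *_ = np.linalg.lstsq(X[fit_idx], probe_ratings[fit_idx], rcond=None)
    residual = X[holdout_idx] @ weights - probe_ratings[holdout_idx]
    return float(np.sqrt(np.mean(residual ** 2)))

# Synthetic stand-in for two models' probe predictions (not real data).
rng = np.random.default_rng(1)
truth = rng.integers(1, 6, size=2000).astype(float)
preds = np.column_stack([
    truth + rng.normal(0.0, 1.0, size=2000),
    truth + rng.normal(0.0, 1.2, size=2000),
])
fit_idx, holdout_idx = split_probe(len(truth))
score = holdout_rmse(preds, truth, fit_idx, holdout_idx)
```

A held-out RMSE that is much worse than the RMSE on the fitting part suggests the blending function is too flexible.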
Step 5: re-train all models using the full training set
At this point, we are preparing for our final predictions. To get the best possible accuracy, we re-train all models using the full training set. Adding the probe set data to the training data can improve the RMSE by 0.0060 or more.
Step 6: blend the re-trained models
Here's the leap of faith. We assume that the blending function computed at step 4 is still valid for blending the re-trained models. For this to work, the two sets of models must be computed with a rigorously equivalent methodology. Also, the function selected at step 4 must generalize well and avoid overfitting. This is not an issue with a simple linear regression, but may become problematic for complex machine learning methods with many degrees of freedom.
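For a linear blend, this step amounts to reusing the step 4 weights unchanged on the re-trained models' qualifying set predictions. The sketch below assumes one column per model, in the same order as at step 4, with the intercept stored last; the weights and predictions shown are made-up numbers.

```python
import numpy as np

def blend_qualifying(qualifying_preds, weights):
    """Apply step 4 weights to predictions from the re-trained models.

    qualifying_preds: shape (n_ratings, n_models), columns in the SAME
    model order that was used when the weights were fitted.
    weights: n_models coefficients followed by one intercept.
    """
    qualifying_preds = np.asarray(qualifying_preds, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if qualifying_preds.shape[1] + 1 != len(weights):
        raise ValueError("number of models does not match the fitted weights")
    return qualifying_preds @ weights[:-1] + weights[-1]

# Hypothetical weights from step 4 and predictions from two re-trained models.
weights = np.array([0.6, 0.4, 0.1])
qualifying_preds = np.array([[3.0, 4.0], [1.0, 2.0]])
blended = blend_qualifying(qualifying_preds, weights)
```

The shape check catches a model being added or dropped between steps 4 and 6. It cannot catch two columns being swapped, so the model order has to be tracked separately.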
Step 7: clamp the final predictions
Here's a hint: clamping values between 1 and 5 is not optimal.
If this is done well, improvements in the models can be measured after step 4 by comparing accuracy on the probe set. Values on the qualifying set will be better by 0.0060 or more, but this offset should be very consistent from one submission to another. Lately I have been getting 0.0068 +/- 0.0001.