NFL Week 1 Results: How Did Bing's Prediction Engine Do?
Note: This recap is from the 2014 season. You can find the 2015 Week 1 recap right here.
Based on Week 1, it looks like Microsoft Bing (along with its Cortana app for Windows Phone users) might be able to sneak into the playoffs this year, at least as a wildcard team.
Microsoft’s search engine revealed last week that it would be predicting the outcomes of NFL games all season, and they backed up their predictions with the bold claim that they expected to get the majority of the games right.
So how did they do?
Week 1 Recap and Analysis
First, let’s review their Week 1 predictions:
As I mentioned in my analysis last week, Bing favored the home team in 14 of 16 games (and agreed with the Vegas odds in 15 of 16). I thought this might be giving too big an edge to home-field advantage. After all, the home team only wins 57% of the time. In my own prediction, I gave Bing slightly better odds than this, predicting they would go 12-4.
It seems I gave Bing a little too much credit, but they did make good on their promise of getting the majority of the games correct. Their Week 1 record was 10-6, giving them a 62.5% success rate at predicting the winners. In what is something of a coincidence, the home teams were also 10-6, which is slightly better than the historical trends.
While we don’t know the precise mechanics of Bing’s prediction algorithm, we do know that website mentions and social signals play a role. With active discussion of gameday meet-ups, home teams might get a slight edge on social media. However, the more popular teams (which also tend to be the better teams) will almost always get the most online attention from fans. That could explain why Bing would (incorrectly) favor the visiting Patriots over the blossoming Dolphins and (correctly) favor the visiting 49ers over the struggling Cowboys.
Let's break things down a bit further:
- Bing gave four predictions with chances at 70% or more (Broncos, Seahawks, Steelers, and Eagles). All four of these predictions were correct.
- Half of Bing's predictions fell in the 60%-69.9% range. Bing's record here was 4-4.
- The remaining four games were listed with chances between 50% and 59.9%. Bing went 2-2 here.
So far, the prediction engine generally performs better when it is more confident, at least to an extent. When an outcome has a chance of 68% or better, Bing is a perfect 5-0.
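To make the arithmetic behind that breakdown easy to check, here is a minimal Python sketch. It uses only the bucket records quoted above, since I don't have Bing's game-by-game numbers in a usable form. The names `week1_buckets` and `hit_rate` are my own, purely for illustration.

```python
# Week 1 records per confidence bucket as (correct, incorrect),
# copied from the breakdown above. Not game-by-game data.
week1_buckets = {
    "70%+": (4, 0),
    "60%-69.9%": (4, 4),
    "50%-59.9%": (2, 2),
}


def hit_rate(correct, incorrect):
    """Fraction of picks in a bucket that turned out to be right."""
    return correct / (correct + incorrect)


# Success rate within each bucket.
bucket_rates = {
    label: hit_rate(correct, incorrect)
    for label, (correct, incorrect) in week1_buckets.items()
}

# Overall record is just the sum across buckets.
total_correct = sum(correct for correct, _ in week1_buckets.values())
total_incorrect = sum(incorrect for _, incorrect in week1_buckets.values())
overall_rate = hit_rate(total_correct, total_incorrect)

for label, (correct, incorrect) in week1_buckets.items():
    print(f"{label}: {correct}-{incorrect} ({bucket_rates[label]:.1%})")
print(f"Overall: {total_correct}-{total_incorrect} ({overall_rate:.1%})")
```

The buckets add up to the 10-6 overall record, and only the top bucket beat a coin flip. With just 16 games, that is a pattern to watch and not a conclusion.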
Bing's Biggest Error is Minnesota's Big Win
Bing’s biggest flub: giving a 67.4% chance to the St. Louis Rams, who fell to the Minnesota Vikings in an embarrassing 34-6 blowout, the biggest margin of defeat in Week 1.
Why was the search engine so wrong? Well, this may reflect the danger of predicting based on social signals and website mentions. The Rams were arguably the most talked about team leading up to Week 1. From weeks of media coverage regarding Michael Sam to plenty of talk about who would fill the shoes of injured starting quarterback Sam Bradford, the Rams might have been the preseason leaders in web discussion. Is Bing’s prediction machine capable of distinguishing between controversy and pregame hype? Given this prediction, it seems not. The Rams, after all, have arguably the weakest offensive attack in the NFL. Even home-field advantage and lots of internet chatter weren’t enough to propel them to a single touchdown.
Bing's Week 2 Predictions
Bing has released its preliminary predictions for most of Week 2 (we’re still waiting for the Sunday night and Monday night games):
Bing seems to have abandoned its home team trend as it moves into Week 2, giving the edge to six visiting teams. This week’s picks also tend to favor last week’s winners, with only two games giving the edge to an 0-1 team over a 1-0 team. Bing again is pulling for the 0-1 Patriots on the road against the 1-0 Vikings, and they also favor the 0-1 Packers at home over the 1-0 Jets. Could the Minnesota Vikings pull another Bing upset this week and give the Patriots an 0-2 start?
Looking at the percentages, Bing seems far less confident in its predictions this week. During Week 1, the prediction engine put the chances at 60% or more for 12 of the games, including four games with chances over 70% (all four of which were correct). So far this week, Bing has only four games with chances over 60%, the highest being a relatively modest 62.9%. Has Bing lost its confidence already, or are these games that much tougher to call?
Will Bing Make Adjustments?
We should expect to see adjustments to these predictions as the week goes along, especially since we haven't seen predictions for the Sunday night or Monday night games yet. Based on their prediction patterns, I assume Bing will favor the 49ers over the Bears (with a 60%+ chance) and the Eagles over the Colts (with a chance close to 50%).
It will be very interesting to see how Bing handles the vast amount of talk about the Baltimore Ravens and Ray Rice. Will they mistake this coverage for a sign that the Ravens are on the upswing? Or will they be able to read this chatter correctly, as a sign of a franchise in turmoil?
Come back next week to find out how Bing did in Week 2. My prediction: Even though they seem less confident, Bing will improve on their Week 1 performance and go 11-5.
Bing has now released its full predictions for Week 2. In a bit of a surprise, they favored the 0-1 Colts over the 1-0 Eagles for the Monday night game (chalk this up to the home-field advantage factor).
They have also upped the chances for some of the games, with increases of as much as 16%. It's unclear at the moment what has caused such drastic adjustments. They haven't made any changes to the predicted winners yet, and the chances for that interesting Steelers/Ravens match-up remain at 59.8%.
The 80% chance for a 49ers win marks the most confident Bing has been yet in a prediction. We'll see what that looks like on gameday.
- Games with 50%-59.9% chance: 7
- Games with 60%-69.9% chance: 5
- Games with 70%-79.9% chance: 3
- Games with 80%+ chance: 1
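For a rough side-by-side of how confident Bing looks in each week, the sketch below compares the share of games listed at 60% or better. The Week 1 counts come from the earlier breakdown (four games at 50%-59.9%, eight at 60%-69.9%, four at 70% or more) and the Week 2 counts from the list above. The helper name `share_at_or_above_60` is hypothetical, and the bucket labels are just my shorthand.

```python
# Number of games per confidence bucket, taken from the article's own tallies.
week1_counts = {"50%-59.9%": 4, "60%-69.9%": 8, "70%+": 4}
week2_counts = {"50%-59.9%": 7, "60%-69.9%": 5, "70%-79.9%": 3, "80%+": 1}


def share_at_or_above_60(counts):
    """Fraction of games whose listed chance is 60% or better."""
    total = sum(counts.values())
    # Every bucket except the 50%-59.9% one is at or above 60%.
    confident = sum(n for label, n in counts.items() if not label.startswith("50%"))
    return confident / total


week1_share = share_at_or_above_60(week1_counts)
week2_share = share_at_or_above_60(week2_counts)

print(f"Week 1: {week1_share:.1%} of games at 60% or better")
print(f"Week 2: {week2_share:.1%} of games at 60% or better")
```

By this measure, the updated Week 2 picks are still less confident than Week 1 overall (9 of 16 games versus 12 of 16), even though Week 2 now includes the single most confident pick so far.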
I'm going to stick with my 11-5 prediction for Bing, although a Vikings upset is looking more likely now.