How pre-World Cup statistical predictions compare to current standings
The 2018 World Cup is being billed as the most unpredictable ever, but is it really? The data used to make some forecasts suggests it is.
Despite the popularity of things like SuperBru, nobody really makes predictions in sport with the expectation of being right. We’re not just saying that because the Daily Maverick’s office pool has been a bit of a nightmare for this writer.
It’s felt like this World Cup has seen more upsets than usual. The curse of the champions struck again with Germany booted out at the group stage and much-fancied Brazil also sent packing when everyone thought they would win the trophy, again.
We have our final four – Belgium, Croatia, England and France – but what were the chances of them making it this far before the tournament kicked off?
Big tournaments are a hotbed of PR goals for investment banks. Everyone is looking for predictions and angles, but sometimes it all goes horribly wrong.
Just ask Goldman Sachs.
The firm used a complicated statistical model for predictions, relying on machine learning to use one million variations to pick a winner ahead of the tournament.
Before kick off, it forecast a final between Brazil and Germany. Towards the end of the group stage, the model was tweaked and predicted a final between Brazil and England.
The changed model correctly predicted the current England vs Croatia semi-final, but it suggests that the Croats will win. Considering how far the predictions have gone awry, however, England fans might still dare to dream.
To be fair the firm did issue caution, saying: “it is difficult to assess how much faith one should have in these predictions. We capture the stochastic nature of the tournament carefully using state-of-the-art statistical methods and we consider a lot of information in doing so (including player-level data). But the forecasts remain highly uncertain, even with the fanciest statistical techniques, simply because football is quite an unpredictable game. This is, of course, precisely why the World Cup will be so exciting to watch.”
But just how wrong were the predictions before kick off? Let’s take a look.
Of the teams who reached the final four, only France were given a reasonable chance of becoming champions in the original forecast.
The table below is responsive and you can sort it accordingly to see how predictions hit and miss.
Of the four teams picked as most likely to make it to the semis, only Les Bleus were given a chance. Staggeringly, despite being perennial dark horses, Croatia were given no hope at all. The quarter-final predictions were also way off.
Predictions for the last 16 were a bit more accurate, with Germany and Poland’s early exit plus Switzerland and Japan’s progress the only obvious anomalies.
Goldman Sachs aren’t the only folks with dodgy data, though.
Kitman Labs, a company that provides data to elite sports teams, also got it wrong. They used a more transparent and straightforward data model to compile data from the previous five World Cups, predicting that it was all down to goals scored in the group stage.
They noted: “teams were more likely to progress through to the knockout rounds when they scored one more goal than they conceded during the group stage. Data also showed, on average, that teams who were successful in making it to the quarter-finals of the Fifa World Cup score four to five more goals than they conceded during the group stages.”
This model was successful at predicting the last 16 teams, with Kitman Labs being spot on about 14 of the 16 sides.
But at the quarter-final stage, things became a bit more complex, with the data scientists predicting Uruguay, Russia, Croatia, Brazil, England and Belgium as the sides to progress.
For the semis, they put down Belgium, Uruguay, England and Croatia. Early predictions suggested Uruguay would go on to win the whole thing.
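For the curious, the Kitman Labs rule of thumb quoted above can be sketched in a few lines of Python. This is only an illustration, not the company's actual model: the team names and scorelines are made up, the `GroupRecord` and `predicted_stage` names are hypothetical, and the thresholds (a goal difference of at least one to reach the last 16, at least four to reach the quarter-finals) are our own reading of the quote.

```python
from dataclasses import dataclass


@dataclass
class GroupRecord:
    """Group-stage goals for one team (illustrative data only)."""
    team: str
    goals_for: int
    goals_against: int

    @property
    def goal_difference(self) -> int:
        return self.goals_for - self.goals_against


def predicted_stage(record: GroupRecord,
                    advance_margin: int = 1,
                    quarter_final_margin: int = 4) -> str:
    """Guess how far a team goes from its group-stage goal difference.

    The default margins are assumptions taken from the quoted rule of
    thumb, not published model parameters.
    """
    gd = record.goal_difference
    if gd >= quarter_final_margin:
        return "quarter-finals"
    if gd >= advance_margin:
        return "last 16"
    return "group stage exit"


# Hypothetical teams and scorelines, not real tournament results.
records = [
    GroupRecord("Team A", goals_for=7, goals_against=1),
    GroupRecord("Team B", goals_for=3, goals_against=2),
    GroupRecord("Team C", goals_for=2, goals_against=4),
]

predictions = {r.team: predicted_stage(r) for r in records}
for team, stage in predictions.items():
    print(f"{team}: {stage}")
```

A rule this blunt says nothing about opponents or the draw, which may be part of why it held up for the last 16 and wobbled after that.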
The difference between the two models is intriguing, showing that perhaps less is more.
But machine predictions are just one way of doing things. What about public opinion?
Before the tournament kicked off, an Ipsos poll of roughly 12,000 respondents across 27 countries was also way off the mark. Germany and Brazil were the respondents’ favourites for the trophy.
Over on SuperBru, predictions have also been hit and miss. As one example, just 33 percent of users guessed the winner between Belgium and Brazil in the quarter-final.
With so many variables, you might as well flip a coin to predict who is going to win the whole thing in the end. DM
Daily Maverick © All rights reserved