Predicting World Cup dark horses from press coverage using the AYLIEN News API – Monthly Media Roundup

The biggest competition in the world’s most popular sport returns this week, and from bars to office pools, predicting the winners and losers will be the topic of countless conversations. And just like in every tournament, these conversations will be dominated by the underdogs that do well and the giants that crash out early.


Since most people’s expectations about their country’s team are formed by what they read in the sports coverage, we’re going to look into the publishing trends around the World Cup. From these trends, we’re going to try to predict the teams that could defy everyone’s expectations, and the teams that are most likely to disappoint their fans.


To make this prediction, we’re going to use the News API, which has gathered, analyzed, and indexed over 100,000 stories about soccer since the beginning of May. Specifically, we’re going to answer:

  • Which teams are being hyped by the media?

  • What is the tone of the coverage toward each country?

  • In Euro 2016, did the media hype the right teams before the tournament?

  • With all of this in mind, who look like the best underdogs to keep an eye on?


With these questions in mind, let’s dive in! Remember, the News API has a free two-week trial, so if you think of some points you’d like to dig into more, you can sign up for free here.

First, how much coverage is there to look into?

By using the Time Series endpoint, we can see the daily volume of stories published about any topic. Below, we used the Time Series endpoint to search for the number of stories published mentioning the World Cup every day since the beginning of May.

You can see that the coverage of the World Cup in the global media picked up steam very quickly after the final games of the Premier League on the weekend of May 12th/13th.


Since this blog is about looking into the media hype of the World Cup, we’re going to confine our analysis to the coverage that makes up the spike since May 15th. This leaves us with just under 40,000 stories in the soccer category that talk about the World Cup, and are not simply referencing the World Cup while talking about their domestic leagues.
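For readers who want to reproduce the daily-volume query, here is a minimal sketch in Python. The parameter names (`text`, `published_at.start`, `published_at.end`, `period`) and the `time_series` response shape are assumptions on our part rather than a copy of the query we ran, and the end date is a placeholder, so check them against the News API documentation before relying on them.

```python
from urllib.parse import urlencode


def build_time_series_params(text, start, end, period="+1DAY"):
    """Build query parameters for a daily story-count request.

    All parameter names below are assumptions about the Time Series
    endpoint; verify them against the News API documentation.
    """
    return {
        "text": text,                 # assumed: term to search for in stories
        "published_at.start": start,  # assumed: start of the date window
        "published_at.end": end,      # assumed: end of the date window
        "period": period,             # assumed: bucket size (one day here)
    }


def daily_counts(response_json):
    """Turn an assumed response shape into (date, count) pairs."""
    points = response_json.get("time_series", [])
    return [(point["published_at"][:10], point["count"]) for point in points]


# Placeholder window: beginning of May up to an arbitrary end date.
params = build_time_series_params(
    "World Cup", "2018-05-01T00:00:00Z", "2018-06-12T00:00:00Z"
)
query_string = urlencode(params)  # ready to append to the endpoint URL
```

Moving the start date to May 15th would restrict the same query to the spike described above.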


So what can we find out from all of these stories?


English fans are the most likely to be disappointed

In every sports tournament, some teams are hyped more than others. To understand this hype, we used the News API to see how much the press is talking about each team and then check the FIFA World rankings of each team. This will let us roughly gauge which teams’ press coverage is merited by their success on the pitch and which teams are getting hyped for other reasons.

FIFA World Rankings are determined by how many points a team gathers, and these points are awarded based on factors like match results, the importance of each match, and the strength of the opposition (in other words, a 2-0 win against Brazil will earn more points than a 2-0 win against Ireland).


Below are the FIFA rankings of the 32 teams taking part in the tournament, from highest to lowest. You can see that England are almost dead center, at 14th out of 32.


Using the News API, we can see how many times each of these countries is mentioned in the soccer coverage. To do this, we made a call to the Trends endpoint.
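A call like this might be sketched as follows. The `field` value `entities.body.text`, the other parameter names, and the `trends` response shape (a list of `value`/`count` pairs) are assumptions, and `mention_counts` is a hypothetical helper we introduce only for illustration.

```python
def build_trends_params(field, text, start, end):
    """Build query parameters for a Trends request.

    Parameter names are assumptions; verify them against the
    News API documentation.
    """
    return {
        "field": field,               # assumed: the field to aggregate on
        "text": text,                 # assumed: term to search for in stories
        "published_at.start": start,  # assumed: start of the date window
        "published_at.end": end,      # assumed: end of the date window
    }


def mention_counts(response_json, teams):
    """Keep only the counts for the teams we care about.

    Assumes the response looks like
    {"trends": [{"value": <entity name>, "count": <stories>}, ...]}.
    """
    wanted = set(teams)
    return {
        item["value"]: item["count"]
        for item in response_json.get("trends", [])
        if item["value"] in wanted
    }


# Placeholder end date; the start matches the May 15th cut-off used above.
params = build_trends_params(
    "entities.body.text", "World Cup",
    "2018-05-15T00:00:00Z", "2018-06-12T00:00:00Z",
)
```

Filtering to the list of participating countries matters because an entity field will also return players, clubs, and cities alongside the teams.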

When we look at the number of times each of these countries is mentioned in stories about the World Cup, we can see that the media coverage is skewed, covering some teams much more than their FIFA rankings justify:

The biggest outlier is clearly Russia, which, as the second lowest-ranked team, is the most-mentioned nation. But since they are the host nation, ‘Russia’ will naturally be mentioned more than any other country, so we can disregard this.


Other than Russia, the clearest outlier is England, who despite ranking dead-middle are the second most-mentioned team in the press coverage. This disparity essentially means that of all the countries taking part in the tournament, the hype around the English team is the most out of sync with their FIFA ranking. So we think this also means that England fans are the most likely to be disappointed by the outcome.
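One way to put a number on this disparity is to compare each team's position in the FIFA ordering with its position in the mention ordering. The sketch below is our own illustration of that comparison, not the method behind the charts; `hype_gaps` is a hypothetical helper, and the team names and counts are made-up placeholders rather than real data.

```python
def hype_gaps(fifa_order, mention_counts):
    """Compare FIFA position with mention position for each team.

    fifa_order: participating teams, best-ranked first.
    mention_counts: team -> number of stories mentioning it.
    Returns team -> (FIFA position - mention position). A positive
    value means the team is mentioned more than its ranking suggests;
    a negative value means it is mentioned less.
    """
    # sorted() is stable, so teams with equal counts keep their FIFA order.
    by_mentions = sorted(
        fifa_order, key=lambda team: mention_counts.get(team, 0), reverse=True
    )
    mention_pos = {team: i + 1 for i, team in enumerate(by_mentions)}
    return {
        team: (i + 1) - mention_pos[team] for i, team in enumerate(fifa_order)
    }


# Placeholder data only: four fictional teams, best-ranked first.
fifa_order = ["Team A", "Team B", "Team C", "Team D"]
counts = {"Team A": 500, "Team B": 100, "Team C": 900, "Team D": 300}

gaps = hype_gaps(fifa_order, counts)
most_hyped = max(gaps, key=gaps.get)   # largest positive gap
least_hyped = min(gaps, key=gaps.get)  # largest negative gap
```

With real data, the host nation would need to be excluded first, for the reason given above.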


Notice how despite the team’s high FIFA ranking, **Poland are the fourth-least mentioned team** in the tournament. In a similar vein, it’s interesting that Belgium are right behind Brazil in the FIFA rankings, but are mentioned in the media less than half as much.


If you’re looking to pick an outsider for your office pool, these are some great nuggets of information to keep in mind.


Which countries are being talked about in a positive tone?

Knowing which countries are being mentioned is great, but using the News API, we can dive in further and analyze the sentiment of each article.


Using the Trends endpoint again, we searched for what percentage of the coverage of each country was positive, negative, and neutral. We saw in our previous post about sports that the coverage was very positive in general, and we can see that trend in the coverage of the World Cup:

The most obvious outlier here is Egypt’s high number of negative stories, which we’re guessing is largely due to Egypt’s star player, Mohamed Salah, being on the receiving end of a disgraceful tackle in the Champions League final.


Other than Egypt, we can see that Peru and Iceland received quite negative coverage. We can also see that Brazil, Portugal, South Korea, and Croatia are mentioned in the most positive stories.
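The positive, negative, and neutral percentages for a country can be derived from raw story counts along these lines. This is a minimal sketch: `polarity_shares` is a hypothetical helper, and it assumes you already have one count per polarity label for the country in question, however you retrieved them.

```python
def polarity_shares(counts):
    """Convert story counts per polarity label into percentages.

    counts: e.g. {"positive": ..., "neutral": ..., "negative": ...}
    Returns the same labels mapped to a percentage of the total,
    rounded to one decimal place.
    """
    total = sum(counts.values())
    if total == 0:
        # No stories at all: report 0% everywhere instead of dividing by zero.
        return {label: 0.0 for label in counts}
    return {
        label: round(100.0 * n / total, 1) for label, n in counts.items()
    }
```

Comparing percentages rather than raw counts keeps heavily covered teams from dominating the comparison simply because more is written about them.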


Did the coverage of Euro 2016 tell us anything?

So now that we have an overview of which teams are being hyped and which are being covered in positive and negative lights, we can also look back at the coverage of Euro 2016 and see how the media’s favourite teams fared.


To do this, we’re going to copy the steps we took above – analyze the coverage of the Euros from the end of the domestic league until the tournament began. This gives us an overview of which teams the media were talking about on the eve of the tournament.


Once again, we’re going to rank the teams by their FIFA World Rankings as they stood in May 2016, and once again, we can see that both England and the host nation (France) were the most-mentioned countries in the coverage:

On the eve of Euro 2016, Portugal, the eventual winners, were rated by bookmakers as the seventh most likely team to win the tournament at 20/1. It’s a little bit annoying for us to see their coverage on the chart above, because it’s clear that they really didn’t stick out very much – they’re not mentioned too much or too little relative to their position on the FIFA rankings.


But looking at the chart above, the fate of two of the three least-mentioned teams is quite interesting:

  • Iceland got as far as the quarter finals, knocking out the most-mentioned team (England) on the way

  • Croatia topped their group, beating Spain and only narrowly losing after extra time in the knockouts to the eventual champions, Portugal.


Keep in mind that the only team mentioned less than either of these teams was Albania – who were dead last in the FIFA World Rankings. So keeping an eye out for teams that are mentioned much less than their FIFA rankings justify can help you spot potential dark horses.


In the upcoming World Cup, some teams that clearly meet these criteria and could cause upsets are

So if you’re making a prediction on who the outsiders will be, remember that these teams aren’t being talked about in the press as much as the others, just like Iceland and Croatia were two years ago.


If you see something in the data here you’d like to look into, you can try out the News API completely free of charge for two weeks by clicking on the link below, or check out some more content showing the News API in action.