Football, Python

Probability of World Cup Group Draw

The lineup for the 2018 FIFA World Cup is now complete, and the teams have been sorted into pots ahead of the group stage draw on December 1st. Not all draws are equally likely, so which teams is your team likely to face?

The format for the group stage was decided in September. Each team has been allocated to a pot (1 to 4) based on “sporting principles”: the FIFA World Rankings (as of October 2017) were used to order the teams from highest to lowest ranked, after Russia, who, as hosts, are the top seeds. The pots are:

| POT 1 | POT 2 | POT 3 | POT 4 |
| --- | --- | --- | --- |
| Russia | Spain | Denmark | Serbia |
| Germany | Peru | Iceland | Nigeria |
| Brazil | Switzerland | Costa Rica | Australia |
| Portugal | England | Sweden | Japan |
| Argentina | Colombia | Tunisia | Morocco |
| Belgium | Mexico | Egypt | Panama |
| Poland | Uruguay | Senegal | Korea Republic |
| France | Croatia | Iran | Saudi Arabia |

The exact rules for the draw are described here (with a handy video explanation). Roughly speaking, Russia will definitely be assigned to Group A; after that, teams are drawn at random from each pot in turn and placed in the next available group (in alphabetical order) that they are allowed to join, subject to constraints such as keeping teams from the same confederation apart. This process is repeated until all teams have been drawn.
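As a rough illustration of the procedure described above, here is a minimal sketch of one simulated draw. It is not the code from GitHub. The confederation lookup, the rule of at most one team per confederation in a group (two for UEFA), and the look-ahead that skips a group when placing a team there would strand a later team from the same pot are my own assumptions about the official rules, and every function name is hypothetical.

```python
import random
from collections import Counter

# Confederation lookup, assembled by hand for this sketch (an assumption).
CONFEDERATIONS = {
    "UEFA": ["Russia", "Germany", "Portugal", "Belgium", "Poland", "France", "Spain",
             "Switzerland", "England", "Croatia", "Denmark", "Iceland", "Sweden", "Serbia"],
    "CONMEBOL": ["Brazil", "Argentina", "Peru", "Colombia", "Uruguay"],
    "CONCACAF": ["Mexico", "Costa Rica", "Panama"],
    "CAF": ["Tunisia", "Egypt", "Senegal", "Nigeria", "Morocco"],
    "AFC": ["Iran", "Australia", "Japan", "Korea Republic", "Saudi Arabia"],
}
CONFED = {team: c for c, teams in CONFEDERATIONS.items() for team in teams}

POTS = [
    ["Russia", "Germany", "Brazil", "Portugal", "Argentina", "Belgium", "Poland", "France"],
    ["Spain", "Peru", "Switzerland", "England", "Colombia", "Mexico", "Uruguay", "Croatia"],
    ["Denmark", "Iceland", "Costa Rica", "Sweden", "Tunisia", "Egypt", "Senegal", "Iran"],
    ["Serbia", "Nigeria", "Australia", "Japan", "Morocco", "Panama", "Korea Republic",
     "Saudi Arabia"],
]


def allowed(group, team):
    """Assumed rule: one team per confederation per group, except UEFA (up to two)."""
    same = sum(CONFED[other] == CONFED[team] for other in group)
    return same < (2 if CONFED[team] == "UEFA" else 1)


def can_finish(groups, teams, size):
    """Backtracking check that every remaining team of this pot still fits somewhere."""
    if not teams:
        return True
    for group in groups:
        if len(group) == size and allowed(group, teams[0]):
            group.append(teams[0])
            ok = can_finish(groups, teams[1:], size)
            group.pop()
            if ok:
                return True
    return False


def simulate_draw(rng):
    """Return eight groups (A to H) as lists of four teams."""
    groups = [[] for _ in range(8)]
    for size, pot in enumerate(POTS):
        order = rng.sample(pot, len(pot))  # random drawing order for this pot
        if size == 0:  # the hosts go straight into Group A
            order.remove("Russia")
            order.insert(0, "Russia")
        for i, team in enumerate(order):
            for group in groups:  # groups are tried in alphabetical order
                if len(group) == size and allowed(group, team):
                    group.append(team)
                    if can_finish(groups, order[i + 1:], size):
                        break
                    group.pop()  # would strand a later team, so skip this group
            else:
                raise RuntimeError("no valid group found")
    return groups


def opponent_shares(picked_team, sims, seed=0):
    """Estimate how often each team shares a group with `picked_team`."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(sims):
        for group in simulate_draw(rng):
            if picked_team in group:
                counts.update(t for t in group if t != picked_team)
    return {team: n / sims for team, n in counts.items()}
```

Calling `opponent_shares("England", 10_000)` returns, for each possible opponent, the fraction of simulated draws in which it shares a group with England. Whether those fractions agree with the published figures depends on how closely the assumptions above match the official procedure.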

However, because of the rules on splitting up confederations, not every combination of teams from the 4 pots is actually possible. So if you focus on one team, you can look at the probability of it drawing each of the other teams. For example, the probabilities for England's draw are:

| POT 1 | POT 2 | POT 3 | POT 4 |
| --- | --- | --- | --- |
| Russia (12.5%) | | Denmark (7.1%) | Serbia (6.5%) |
| Germany (9.5%) | | Iceland (7.1%) | Nigeria (12.0%) |
| Brazil (20.0%) | | Costa Rica (17.0%) | Australia (13.5%) |
| Portugal (9.5%) | England | Sweden (7.1%) | Japan (13.5%) |
| Argentina (20.0%) | | Tunisia (15.4%) | Morocco (12.1%) |
| Belgium (9.5%) | | Egypt (15.4%) | Panama (15.5%) |
| Poland (9.5%) | | Senegal (15.4%) | Korea Republic (13.5%) |
| France (9.5%) | | Iran (15.4%) | Saudi Arabia (13.5%) |

The above was created using 1,000,000 simulations.
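To give a feel for how much noise is left at that sample size, the standard error of a simulated proportion can be worked out directly. This is a small sketch that assumes the simulated draws are independent; the function name `standard_error` is my own.

```python
import math


def standard_error(p, sims):
    """Standard error of a proportion p estimated from `sims` independent simulations."""
    return math.sqrt(p * (1 - p) / sims)


for sims in (100_000, 1_000_000):
    se = standard_error(0.20, sims)
    print(f"{sims:>9,} simulations: about ±{100 * se:.2f} percentage points")
```

For a probability of around 20%, that is roughly 0.04 percentage points with 1,000,000 simulations and 0.13 with 100,000, so results from shorter runs can be expected to wobble in the first decimal place.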

If you want to run your own experiment (with any team of your choice), the code I used can be found on GitHub. Running 100,000 simulations will take about 3 minutes. To change the number of simulations and/or the team of interest, just edit the first two lines of the code (specifically, change the variables ‘sims’ and ‘pickedTeam’).

As for England, a South American giant from pot 1, in the form of either Brazil or Argentina (20.0% each, so about 40% combined), is a daunting but realistic prospect for us in the summer.


EDIT:

It was pointed out to me on Twitter by Julien Guyon that the code I originally posted wasn’t quite right. Specifically, not all of my simulations were resulting in possible draws, and I was simply excluding those cases. However, excluding them does have an impact on the probabilities. This has now been fixed. Julien has posted a lot of tweets showing various probabilities related to the World Cup draw, so be sure to check out his Twitter feed. The code also had to be updated when FIFA announced the detailed rules of how they would do the draw.
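To see why throwing away impossible draws can shift the probabilities, here is a toy example. It is entirely hypothetical and much smaller than the real draw: three teams, three one-team groups, and a made-up list of which groups each team may join. It enumerates every drawing order and compares two approaches: placing each team in the first allowed free group and discarding orders that hit a dead end, versus skipping a group whenever taking it would leave a later team with nowhere to go. I am not claiming either function matches the original or the fixed code.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy constraints: which groups each team is allowed to join.
ALLOWED = {"a": {1, 2, 3}, "b": {2, 3}, "c": {1, 2}}
GROUPS = (1, 2, 3)


def completable(teams, free):
    """True if every team in `teams` can still get its own allowed group from `free`."""
    if not teams:
        return True
    first, rest = teams[0], teams[1:]
    return any(completable(rest, free - {g}) for g in sorted(free) if g in ALLOWED[first])


def run_draw(order, lookahead):
    """Place teams in drawn order; return the final assignment, or None on a dead end."""
    free, placed = set(GROUPS), {}
    for i, team in enumerate(order):
        for g in GROUPS:
            if g in free and g in ALLOWED[team]:
                if lookahead and not completable(order[i + 1:], free - {g}):
                    continue  # this group would strand a later team
                placed[team] = g
                free.remove(g)
                break
        else:
            return None
    return tuple(sorted(placed.items()))


def outcome_shares(lookahead):
    """Share of each final assignment over all equally likely drawing orders."""
    results = [run_draw(order, lookahead) for order in permutations(ALLOWED)]
    valid = [r for r in results if r is not None]  # dead ends are discarded here
    return {outcome: n / len(valid) for outcome, n in Counter(valid).items()}


discard = outcome_shares(lookahead=False)
skip = outcome_shares(lookahead=True)
for outcome in sorted(skip):
    print(outcome, f"discard: {discard[outcome]:.3f}", f"skip: {skip[outcome]:.3f}")
```

In this toy case the assignment a→1, b→3, c→2 comes up in 1/4 of the surviving draws when dead ends are discarded, but in 1/3 of draws when groups are skipped instead, so the two approaches give different answers even though both only ever produce valid draws.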


7 thoughts on “Probability of World Cup Group Draw”

  1. Could you share the probabilities for Argentina? I see it’s not difficult to modify the code but I don’t know how to apply it. Great job!

    1. The code runs in Python (which isn’t worth installing just for this if you don’t already use it!). Here are the results for Argentina:

      Pot 1:
      Argentina

      Pot 2:
      Spain 20%
      Peru 0%
      Switzerland 20%
      England 20%
      Colombia 0%
      Mexico 20%
      Uruguay 0%
      Croatia 20%

      Pot 3:
      Denmark 18.5%
      Iceland 18.5%
      Costa Rica 8.9%
      Sweden 18.5%
      Tunisia 8.9%
      Egypt 8.9%
      Senegal 8.9%
      Iran 8.9%

      Pot 4:
      Serbia 21.8%
      Nigeria 13.1%
      Australia 10.4%
      Japan 10.4%
      Morocco 13.1%
      Panama 10.4%
      Korea Republic 10.4%
      Saudi Arabia 10.4%

    1. You need to install NumPy on your computer. You can find out how here (I recommend using pip). As for the Tunisia probabilities, here you go (this run used 100,000 simulations, hence some minor differences):

      Pot 1:
      Russia : 16.56 %
      Germany : 12.96 %
      Brazil : 9.09 %
      Portugal : 13.12 %
      Argentina : 9.01 %
      Belgium : 13.14 %
      Poland : 13.07 %
      France : 13.07 %

      Pot 2:
      Spain : 15.4 %
      Peru : 9.35 %
      Switzerland : 15.67 %
      England : 15.39 %
      Colombia : 9.24 %
      Mexico : 10.11 %
      Uruguay : 9.46 %
      Croatia : 15.38 %

      Pot 3:
      Tunisia : 100 %

      Pot 4:
      Serbia : 18.29 %
      Nigeria : 0.0 %
      Australia : 16.41 %
      Japan : 16.71 %
      Morocco : 0.0 %
      Panama : 15.43 %
      Korea Republic : 16.67 %
      Saudi Arabia : 16.49 %

  2. Hi Sean,
    Great job!
    But I got an error when running the Python code:

    Traceback (most recent call last):
      File "", line 31, in
        groups = addToGroups(groups, pot2, 'team2')
      File "", line 37, in addToGroups
        if (np.array(remainingDict.values()) < 0).any():
    TypeError: '<' not supported between instances of 'dict_values' and 'int'

    It probably comes from this command: getattr(group, team) in the function addToGroups.

    Do you have any thoughts about it?

    Thanks.

    Best regards,

    Fred

    1. That is very strange. Haven’t seen that one before. What version of Python are you running? Were you running the version directly pulled from Git (i.e. for England) or in a different set up?
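For reference, the `dict_values` in that error message points to Python 3, where `dict.values()` returns a view rather than a list. Passing the view straight to `np.array` wraps the whole view as a single object, so the `< 0` comparison fails. Converting the view to a list first is one likely fix. The sketch below uses a made-up dictionary, since I am only guessing at what `remainingDict` holds in the real code.

```python
import numpy as np

# Hypothetical stand-in for the remainingDict seen in the traceback.
remaining_dict = {"UEFA": 2, "CAF": 0, "AFC": -1}

# In Python 3, .values() is a view, so turn it into a list before building the array.
remaining = np.array(list(remaining_dict.values()))
has_negative = bool((remaining < 0).any())
print(has_negative)  # True, because of the -1 entry
```

Applied to the line in the traceback, that would mean writing `np.array(list(remainingDict.values()))` in place of `np.array(remainingDict.values())`.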
