Edit (17/03/2022): This post from 2020 is the oldest post from this site and somehow also the most popular. Unfortunately, a lot of the code was outdated so I decided to update this post.
Passmaps or passing networks are one of the most popular visualizations in football. And probably for good reason. If I could pick a single graphic to tell the match story, I’d probably pick a (well-made) passmap.
They pack a lot of useful information about a single match in an intuitive manner: passing trends, networks, players’ roles in a given system, and even how well they’re performing those roles.
Let’s go over creating your own in Python (using Statsbomb’s open data).
(If you’re just interested in the code, the GitHub link to the notebook is here)
Pre-requisites
I’m gonna be using Python so you’ll need that.
Other than that, we’ll also be using the following libraries:
To create a passmap for a match, we’ll need some event data. Statsbomb publish some of theirs for free, and the statsbombpy library makes it easy to access.
Basic Overview
What really is a passmap?
This is the popular version by 11tegen11. At first glance, it might seem like there’s a lot going on here, but let’s take a closer look at what information it’s supposed to convey to us. Specifically, two main things:
the average position of the player, and
the number of passes between any two given players.
(Note: average position doesn’t necessarily have to be the mean, in some instances it makes sense to use other aggregations as well - like median)
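To see why the choice of aggregation matters, here is a small sketch with made-up y-coordinates (not taken from the match data) for a winger who plays most of the game on one flank and a short spell on the other. The numbers and variable names are purely illustrative.

```python
from statistics import mean, median

# Hypothetical y-coordinates of one player's passes (pitch width assumed to run 0-80).
# Seven passes near one touchline, then three after a switch to the other wing.
pass_y = [8, 10, 12, 9, 11, 10, 13, 70, 72, 68]

mean_y = mean(pass_y)      # dragged towards the middle by the three far-side passes
median_y = median(pass_y)  # stays on the wing where most passes were played

print(f"mean y: {mean_y:.1f}, median y: {median_y:.1f}")
```

The mean puts the player somewhere they rarely stood, while the median keeps them on their main flank.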
Apart from that, we also have
players’ names, and
players’ dot sizes (which indicate the total number of passes played by the player).
Finally we have some aesthetic details - match details, watermark, team’s logo. For the purpose of this post, we are going to ignore the last two.
(Again, to just check the notebook, find the link here)
Getting Started
Imports
Code cell
Loading the data
Statsbomb have a unique match_id for every match in the open-data repository. The match we’re going to look at is the FIFA WC 2022 Final between Argentina and France.
Code cell
Next, we’ll pick the match_id and get all events for it using the .events function. It will return a Pandas dataframe.
Code cell
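A minimal sketch of these two steps, assuming statsbombpy's sb.matches and sb.events calls. The helper names pick_match_id and load_final_events are hypothetical, and the competition_id/season_id values for the 2022 World Cup are assumptions worth checking against sb.competitions().

```python
import pandas as pd


def pick_match_id(matches: pd.DataFrame, team_a: str, team_b: str) -> int:
    """Return the match_id of the single fixture between the two teams."""
    home, away = matches["home_team"], matches["away_team"]
    is_fixture = ((home == team_a) & (away == team_b)) | (
        (home == team_b) & (away == team_a)
    )
    found = matches.loc[is_fixture, "match_id"]
    if len(found) != 1:
        raise ValueError(f"expected exactly one match, found {len(found)}")
    return int(found.iloc[0])


def load_final_events() -> pd.DataFrame:
    """Fetch all events for the final (needs statsbombpy and network access)."""
    from statsbombpy import sb  # imported lazily so pick_match_id works offline

    # Assumption: competition_id=43 / season_id=106 identify the 2022 World Cup.
    matches = sb.matches(competition_id=43, season_id=106)
    match_id = pick_match_id(matches, "Argentina", "France")
    return sb.events(match_id=match_id)
```

Calling df = load_final_events() would then give the events dataframe used in the rest of the post.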
The dataframe df contains the lineups for both teams as the first two dictionaries in the column tactics.
Code cell
This is important, we’ll need the names from here.
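Pulling the names out of one of those dictionaries could look roughly like this. The nested layout (a "lineup" list whose entries hold a "player" dict with a "name") is an assumption about the data format, the example dictionary is made up and heavily trimmed, and lineup_names is a hypothetical helper.

```python
def lineup_names(tactics):
    """Return the player names listed in one `tactics` dictionary."""
    # Assumed layout: {"formation": ..., "lineup": [{"player": {"name": ...}}, ...]}
    return [entry["player"]["name"] for entry in tactics["lineup"]]


# Made-up, trimmed stand-in for one entry of the `tactics` column.
example_tactics = {
    "formation": 442,
    "lineup": [
        {"player": {"id": 1, "name": "Player A"}, "jersey_number": 23},
        {"player": {"id": 2, "name": "Player B"}, "jersey_number": 13},
    ],
}

print(lineup_names(example_tactics))
```

With the real data, the same function would be applied to df["tactics"].iloc[0] and df["tactics"].iloc[1], one per team.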
Helper Functions for Plotting
Let’s write a couple of helper functions and classes to draw the arrows that represent passes and to create a matching entry for them in the legend.
Code cell
Passmap Logic
The passmap logic is essentially just two groupbys.
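Both groupbys run on a dataframe called team_pass_df: one team's completed passes, with the start location split into pass_start_x and pass_start_y. Here is a rough sketch of how it could be built from the events dataframe. It assumes statsbombpy's flattened column names (type, team, location, pass_outcome, pass_recipient, minute) and that a missing pass_outcome marks a completed pass. build_team_pass_df is a hypothetical helper and the events below are toy data.

```python
import pandas as pd


def build_team_pass_df(events: pd.DataFrame, team: str) -> pd.DataFrame:
    """Keep one team's completed passes and split `location` into x/y columns."""
    passes = events[(events["type"] == "Pass") & (events["team"] == team)]
    # Assumption: completed passes have no pass_outcome recorded.
    passes = passes[passes["pass_outcome"].isna()].copy()
    passes["pass_start_x"] = [loc[0] for loc in passes["location"]]
    passes["pass_start_y"] = [loc[1] for loc in passes["location"]]
    columns = ["minute", "player", "pass_recipient", "pass_start_x", "pass_start_y"]
    return passes[columns]


# Toy events: one completed pass, one incomplete pass, a shot, an opposition pass.
events = pd.DataFrame({
    "type": ["Pass", "Pass", "Shot", "Pass"],
    "team": ["Argentina", "Argentina", "Argentina", "France"],
    "player": ["A", "B", "A", "X"],
    "pass_recipient": ["B", None, None, "Y"],
    "pass_outcome": [None, "Incomplete", None, None],
    "location": [[30.0, 40.0], [50.0, 20.0], [100.0, 40.0], [60.0, 60.0]],
    "minute": [1, 2, 3, 4],
})

team_pass_df = build_team_pass_df(events, "Argentina")
print(team_pass_df)
```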
Groupby 1 is used to get the players’ average positions - we can aggregate on the player names and calculate a mean/median value for x and y
player_location_df = (
    team_pass_df
    .groupby(['player'])
    .agg(x=('pass_start_x', 'mean'), y=('pass_start_y', 'mean'), total=('pass_start_x', 'size'))
    .reset_index()
)
"""
    player                                   x          y  total
0   Alexis Mac Allister              68.737500  23.845833     24
1   Cristian Gabriel Romero          33.905128  53.346154     39
2   Damián Emiliano Martínez         13.400000  44.663636     11
3   Enzo Fernandez                   56.135000  36.532500     40
4   Julián Álvarez                   76.346154  37.538462     13
5   Lionel Andrés Messi Cuccittini   72.172000  50.708000     25
6   Nahuel Molina Lucero             53.695000  68.920000     20
7   Nicolás Alejandro Tagliafico     55.591667   8.387500     24
8   Nicolás Hernán Otamendi          37.979070  23.820930     43
9   Rodrigo Javier De Paul           59.708824  58.770588     34
10  Ángel Fabián Di María Hernández  81.928571  12.909524     21
"""
Groupby 2 is used to calculate the number of successful passes between all pairs of players - the starting 11 probably makes the most sense in most cases (except for red cards and early substitutions)
players_passes_df = (
    team_pass_df
    .groupby(['player', 'pass_recipient'])
    .agg(passes=('pass_start_x', 'size'))
    .reset_index()
)
"""
    player                           pass_recipient  passes
0   Alexis Mac Allister              Enzo Fernandez       6
1   Alexis Mac Allister              Julián Álvarez       1
..  ...                              ...                ...
80  Ángel Fabián Di María Hernández  Julián Álvarez       2
"""
Once we have those two, we can loop over the result of groupby 2 and use each player’s corresponding average position from groupby 1 to plot an arrow for every pair of players.
The following function does exactly all that.
Code cell
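The data side of that loop can be sketched without any plotting: attach the passer's and the recipient's average position to every pair, then iterate over the rows. This is only an illustration on toy dataframes shaped like the two groupby results. build_edges and its min_passes threshold are hypothetical names, not part of the original function.

```python
import pandas as pd

# Toy stand-ins for the results of groupby 1 and groupby 2.
player_location_df = pd.DataFrame({
    "player": ["A", "B", "C"],
    "x": [30.0, 60.0, 90.0],
    "y": [40.0, 20.0, 60.0],
    "total": [10, 8, 5],
})
players_passes_df = pd.DataFrame({
    "player": ["A", "B", "A"],
    "pass_recipient": ["B", "A", "C"],
    "passes": [6, 4, 1],
})


def build_edges(players_passes_df, player_location_df, min_passes=2):
    """Attach start and end coordinates to each passer/recipient pair."""
    positions = player_location_df[["player", "x", "y"]]
    end_positions = positions.rename(
        columns={"player": "pass_recipient", "x": "end_x", "y": "end_y"}
    )
    frequent = players_passes_df[players_passes_df["passes"] >= min_passes]
    # Inner merges drop pairs where either player has no average position.
    return frequent.merge(positions, on="player").merge(end_positions, on="pass_recipient")


edges = build_edges(players_passes_df, player_location_df)
for edge in edges.itertuples(index=False):
    # Each row is one arrow: from (x, y) to (end_x, end_y).
    print(f"{edge.player} -> {edge.pass_recipient}: {edge.passes} passes")
```

Each row of edges then maps to one arrow, and the passes column can drive its width or opacity.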
Note: This bit was just to ensure we avoid overlaps between our arrows.
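One common way to keep the A-to-B and B-to-A arrows from sitting on top of each other is to nudge every arrow sideways, perpendicular to its own direction. The sketch below shows that general idea rather than the exact approach used in the notebook. offset_segment and the shift size are hypothetical.

```python
import math


def offset_segment(x1, y1, x2, y2, shift=1.0):
    """Move a segment sideways, perpendicular to its direction, by `shift` units."""
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        # Start and end coincide, so there is no direction to offset from.
        return (x1, y1), (x2, y2)
    # Unit vector at right angles to the pass direction.
    nx, ny = -dy / length, dx / length
    return (x1 + shift * nx, y1 + shift * ny), (x2 + shift * nx, y2 + shift * ny)


# Passes in opposite directions between the same two average positions.
a_to_b = offset_segment(30, 40, 60, 40)
b_to_a = offset_segment(60, 40, 30, 40)
print(a_to_b, b_to_a)
```

Because the perpendicular flips with the direction of travel, the two arrows end up on opposite sides of the line joining the players.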
Finally, we’ll set up some variables and call our draw_passmap function.
Code cell
One of the advantages of passmaps is that they’re fairly customizable. There’s a bunch of different things you can try depending on what you’re trying to look at. Some examples:
Aggregation
Method can be different: median is useful when you have players changing wings.
Aggregation period can be different: first half, first 60 minutes, etc.
The players being aggregated could also be different: France had two early subs and hence their passmap is only until the 40th minute. It might make sense to consider a different set of 11 players. How would we go about doing this? Perhaps using Ben Torvaney’s window function idea outlined here
Player Nodes - Size could indicate something other than total passes (xT maybe).
Could increase/decrease minimum number of passes considered. Could also add other useful metrics.
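Changing the aggregation period mostly comes down to filtering the passes by minute before running the groupbys. A rough sketch on toy data, assuming the events carry a minute column and that substitutions show up with type "Substitution". Both helper names are hypothetical and the minutes are made up.

```python
import pandas as pd


def first_substitution_minute(events, team):
    """Minute of the team's first substitution, or None if there is none."""
    subs = events[(events["type"] == "Substitution") & (events["team"] == team)]
    return int(subs["minute"].min()) if not subs.empty else None


def passes_in_window(team_pass_df, start_minute=0, end_minute=None):
    """Keep passes with start_minute <= minute < end_minute."""
    in_window = team_pass_df["minute"] >= start_minute
    if end_minute is not None:
        in_window &= team_pass_df["minute"] < end_minute
    return team_pass_df[in_window]


# Toy data with made-up minutes.
events = pd.DataFrame({
    "type": ["Pass", "Substitution", "Pass", "Substitution"],
    "team": ["France", "France", "France", "Argentina"],
    "minute": [10, 41, 55, 64],
})
team_pass_df = pd.DataFrame({
    "minute": [5, 30, 41, 70],
    "player": ["X", "Y", "X", "Z"],
})

cutoff = first_substitution_minute(events, "France")
windowed = passes_in_window(team_pass_df, end_minute=cutoff)
print(cutoff, len(windowed))
```

Swapping end_minute for 45 or 60 gives the first-half and first-60-minutes variants.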
This is why instead of a static viz, a more interactive thing makes sense, where you can use a bunch of widgets to mess around with all these small things.
Karun Singh does this and more in his blog post here.