So we finally come to the famous Expected Goals, usually abbreviated as xG.
First things first: it’s not The Stat That Solves All Our Problems, and has never been claimed as such. It’s a number like any other, which tells us some things and doesn’t tell us others. But for the moment (and a very strong emphasis on those three words), it appears to be the number which best sums up a team’s overall performance level.
Note I said ‘sums up.’ To understand how a team is performing, we have lots and lots of measurements, some simple, some complex. They’re all useful in some way. Moreover, there’ll always be things our numbers haven’t touched, for which new measurements will need to be developed. But right now xG is the shorthand version, the place to go if you’re looking for a single number that tells the story.
In that sense, it’s a vast oversimplification, as any single number has to be. But it works well enough to be useful, and as a result has shouldered to the front of the statistical queue. It’s managed to do so even though there’s no settled agreement on how it should be calculated.
That’s right — there is no one xG statistic. Instead there are several different versions of xG, each calculated by a different method, each developed by an individual analytics expert. How, then, can the stat have become so current, so successful?
Because it measures something we want to know, and performs the job well. Expected Goals, as its name implies, tells you how many goals your team should expect to score, and allow, based on its actions on the field of play. It’s an indisputable fact that you can play badly and win, or play well and lose. What xG tries to do is cut through the noise of actual goal totals to get to the true level of performance.
We’ve already seen this idea with total shots ratio (TSR) and shots on target ratio (SoTR). Remember the Swansea City example: TSR and SoTR showed that although the Swans finished in the top half of the table in 2014/15, their underlying performance level was significantly lower. Expected Goals does the exact same thing as TSR and SoTR, only better.
At the same time, we’re still figuring out how best to calculate xG, meaning what statistically measurable actions, combined in which ways, most clearly lead to scoring and allowing goals. Hence the many versions of the stat.
These versions come in two basic forms: shot-based and non-shot-based. In this piece we’ll talk only about shot-based xG, because it’s more widely used, and also much easier to grasp. In fact, it’s very simple: every shot your team takes is assigned a probability that it will result in a goal. Then the probabilities for all your shots are added together, and the total is your xG. Your opponents’ shot probabilities are added in the same way, and that total is usually abbreviated as xGA. And that’s it.
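To make the addition concrete, here is a minimal Python sketch. The shot probabilities are invented purely for illustration (a real xG model would supply them), and the function name is my own.

```python
# Hypothetical per-shot goal probabilities for one team in one match.
# These numbers are made up for illustration; a real model would supply them.
shots_for = [0.05, 0.32, 0.08, 0.76, 0.03]
shots_against = [0.11, 0.04, 0.45]


def expected_goals(shot_probabilities):
    """Shot-based xG: the sum of each shot's probability of becoming a goal."""
    return sum(shot_probabilities)


xg = expected_goals(shots_for)        # our shots
xga = expected_goals(shots_against)   # our opponents' shots

print(f"xG  = {xg:.2f}")
print(f"xGA = {xga:.2f}")
```

Notice that one high-probability chance contributes more to the total than several speculative efforts put together, which is exactly why xG says more than a raw shot count.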
The devil is in the details: how, exactly, do you calculate the probability? The model developed by Michael Caley, one of the most prominent statisticians, uses a large number of variables, including distance, angle, part of body used, type of pass preceding the shot, dribbles involved, and so on. Two pieces explaining the model, and the links on those pages, are an excellent introduction to his method, plus a frank evaluation of its strengths and weaknesses.
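To give a feel for how variables like these might be turned into a probability, here is a toy sketch using a logistic function, one common way of mapping a weighted sum of features onto a number between 0 and 1. To be clear about the assumptions: this is not Caley’s model, I am not claiming his model takes this form, and every coefficient below is invented for illustration rather than fitted to any data.

```python
import math

# Invented coefficients, for illustration only. They are not fitted to data
# and are not taken from Caley's model or any other published model.
INTERCEPT = -0.5
COEF_DISTANCE = -0.12   # assumption: each extra metre from goal lowers the chance
COEF_ANGLE = 1.2        # assumption: a wider view of the goal (radians) raises it
COEF_HEADER = -0.8      # assumption: headers are harder to score than foot shots


def shot_probability(distance_m, angle_rad, is_header=False):
    """Toy logistic model: map shot features to a probability between 0 and 1."""
    score = (INTERCEPT
             + COEF_DISTANCE * distance_m
             + COEF_ANGLE * angle_rad
             + COEF_HEADER * (1 if is_header else 0))
    return 1.0 / (1.0 + math.exp(-score))


close_central = shot_probability(distance_m=8, angle_rad=0.8)
long_range = shot_probability(distance_m=25, angle_rad=0.25)
close_header = shot_probability(distance_m=8, angle_rad=0.8, is_header=True)

for label, p in [("close, central", close_central),
                 ("long range", long_range),
                 ("close header", close_header)]:
    print(f"{label:<15} {p:.3f}")
```

A real model would estimate its weights from many thousands of recorded shots, and would include the other variables mentioned above, such as the type of pass and any dribbles involved.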
Here, then, is the current league table. After each team, there are four numbers: xG, then actual goals scored in parentheses, then xGA, then actual goals against in parentheses.
- Chelsea 40.6 (51) 19.7 (17)
- Tottenham 46.6 (46) 23.0 (16)
- Man City 53.8 (49) 21.9 (29)
- Arsenal 44.1 (52) 30.3 (28)
- Liverpool 47.1 (52) 24.7 (30)
- Man United 39.7 (36) 22.1 (21)
- Everton 35.6 (40) 30.4 (27)
- West Brom 25.6 (32) 30.3 (29)
- West Ham 33.2 (32) 40.0 (41)
- Watford 23.3 (29) 34.3 (40)
- Stoke City 30.6 (29) 33.8 (36)
- Burnley 23.7 (26) 40.6 (35)
- Southampton 32.5 (24) 24.7 (31)
- Bournemouth 29.9 (35) 41.6 (47)
- Middlesbrough 20.2 (19) 31.2 (27)
- Leicester City 29.8 (24) 35.1 (41)
- Swansea City 27.0 (29) 51.7 (54)
- Hull City 21.0 (22) 49.3 (47)
- Crystal Palace 29.1 (32) 33.5 (45)
- Sunderland 23.9 (24) 42.1 (42)
Let’s look first at xG, with a particular eye on who’s clearly overperforming or underperforming. Significant overperformers, scoring more than five goals above their xG, are, in order: Chelsea (by a large margin), Arsenal, West Brom, Watford and Bournemouth. Significant underperformers, scoring more than five goals below their xG, are Southampton (by a large margin) and Leicester.
When we go to xGA, we find the big overperformers in defence are Tottenham and Burnley. Significant underperformers are more plentiful, but Crystal Palace are in a class by themselves, before a drop to Man City, Southampton, Leicester, Watford, Bournemouth, and Liverpool.
If we put the two together, we can find the sides most overperforming and underperforming their goal difference. At the top of the list are Chelsea, almost all of it on the attacking side. They have an outstanding conversion rate of 14.9%. They’ve also scored a number of goals early, from relatively few shots, and then defended well. A good counter-attacking side will often overperform xG, because they’ll rack up goals late when their opponents are trying to score. It may be no coincidence that Southampton, the greatest underperformer both in attack and overall, are one of the least effective teams on the counter, because they have limited pace.
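For anyone who wants to check the arithmetic, here is a short Python sketch that works directly from the league table above. The function name and the sign convention (positive means overperforming) are my own choices, not part of any standard xG toolkit.

```python
# (team, xG, goals scored, xGA, goals against), copied from the table above.
TABLE = [
    ("Chelsea", 40.6, 51, 19.7, 17), ("Tottenham", 46.6, 46, 23.0, 16),
    ("Man City", 53.8, 49, 21.9, 29), ("Arsenal", 44.1, 52, 30.3, 28),
    ("Liverpool", 47.1, 52, 24.7, 30), ("Man United", 39.7, 36, 22.1, 21),
    ("Everton", 35.6, 40, 30.4, 27), ("West Brom", 25.6, 32, 30.3, 29),
    ("West Ham", 33.2, 32, 40.0, 41), ("Watford", 23.3, 29, 34.3, 40),
    ("Stoke City", 30.6, 29, 33.8, 36), ("Burnley", 23.7, 26, 40.6, 35),
    ("Southampton", 32.5, 24, 24.7, 31), ("Bournemouth", 29.9, 35, 41.6, 47),
    ("Middlesbrough", 20.2, 19, 31.2, 27), ("Leicester City", 29.8, 24, 35.1, 41),
    ("Swansea City", 27.0, 29, 51.7, 54), ("Hull City", 21.0, 22, 49.3, 47),
    ("Crystal Palace", 29.1, 32, 33.5, 45), ("Sunderland", 23.9, 24, 42.1, 42),
]


def performance(table):
    """Return {team: (attack, defence, overall)}; positive means overperforming."""
    result = {}
    for team, xg, goals, xga, conceded in table:
        attack = goals - xg          # goals scored beyond expectation
        defence = xga - conceded     # goals prevented beyond expectation
        result[team] = (round(attack, 1), round(defence, 1),
                        round(attack + defence, 1))
    return result


perf = performance(TABLE)
by_overall = sorted(perf, key=lambda team: perf[team][2], reverse=True)

# Show the three biggest overperformers and underperformers on goal difference.
for team in by_overall[:3] + by_overall[-3:]:
    attack, defence, overall = perf[team]
    print(f"{team:<15} attack {attack:+5.1f}  defence {defence:+5.1f}"
          f"  overall {overall:+5.1f}")
```

The overall figure is simply actual goal difference minus expected goal difference, split into its attacking and defensive halves so you can see where the gap comes from.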
As always, if you’re greatly overperforming or underperforming, the odds are that things will shift a bit toward the mean. But unlike with Sh% and Sv%, and as with TSR and SoTR, a team’s results can diverge significantly from its Expected Goals for long periods of time.
Analytics types have been waiting for Chelsea to drop all season, but they’ve continued their remarkable results, and accordingly have a huge lead in the table.
We can’t look at every team individually, so let’s focus on Burnley, one of the surprise teams this year. As we’ve seen, their TSR and SoTR have them battling relegation, and so does their xG. But they’re the third-biggest overperformer in goal difference. How? In attack, they’ve squeezed out a number of even matches with late goals. They’re also exceeding xG at unprecedented levels from shots outside the area.
In defence, they’ve held firm against some very shot-heavy onslaughts. This is great for Burnley fans, and great for the league – but again, remember Swansea. All the measures have Burnley’s true level near the bottom, and if the squad isn’t strengthened significantly next year, they’re much more likely to be relegated than to finish mid-table. They may yet fade this year as well.
One of the best things about xG is that it shows your overall shot quality and that of your opponents. Just divide your total xG and xGA by the actual number of shots taken, and you get a measure of how well you’re attacking (xG/shot) or defending (xGA/shot). Man City, unsurprisingly, have a big lead in xG/shot: theirs is 0.140 to second-place Arsenal’s 0.119, meaning the average Man City shot is about two percentage points likelier to go in than the average Arsenal shot. But Arsenal have scored three more goals, and since the teams have taken roughly the same number of shots, that suggests the Gunners are converting their shots at a significantly higher rate, and they are: 14.0% to 12.7%.
So Man City get excellent shots, but don’t convert enough of them; they’re 4.8 goals behind their xG. When we go to Man United, though, we find that they aren’t getting enough good shots. They take more shots than any side except Tottenham and Liverpool, but their xG/shot is 0.098, only the 15th best in the league. Their conversion rate is even worse: 18th in the league. If we go back to last week, we find that they’ve been a bit unlucky, with a low Sh%. And even though that Sh% regressed toward the mean with three goals against Leicester, they’re still not helping themselves enough. Southampton, the great underperformer in attack, rank even lower in shot quality. They’re getting poor shots, and scoring even fewer than they should.
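The Man City and Arsenal comparison above can be reproduced with a few lines of Python. The text doesn’t give shot counts, so this sketch backs them out from xG and xG/shot; because both inputs are rounded, the results are approximate, and the function names are mine.

```python
# Season figures quoted in the text: total xG, xG per shot, and actual goals.
TEAMS = {
    "Man City": {"xg": 53.8, "xg_per_shot": 0.140, "goals": 49},
    "Arsenal": {"xg": 44.1, "xg_per_shot": 0.119, "goals": 52},
}


def implied_shots(xg, xg_per_shot):
    """Shot count implied by xG and xG/shot. Approximate: both inputs are rounded."""
    return xg / xg_per_shot


def conversion_rate(goals, shots):
    """Share of shots that became goals."""
    return goals / shots


conversion = {}
for name, figures in TEAMS.items():
    shots = implied_shots(figures["xg"], figures["xg_per_shot"])
    conversion[name] = conversion_rate(figures["goals"], shots)
    print(f"{name}: about {shots:.0f} shots, conversion {conversion[name]:.1%}")
```

Comparing conversion rate with xG/shot for the same team is the quick test: a side converting below its xG/shot is scoring fewer than its chances deserve, and a side converting above it is scoring more.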
Now to xGA/shot. You’d guess that Chelsea would lead the pack here, but in fact they’re only fifth best, behind leaders Burnley, Middlesbrough, West Brom, and Stoke. What these teams have in common is that when they defend, they tend to defend deep. So their opponents tend to have high numbers of lower quality shots.
Let’s now take a closer look at the Fabianski/Forster comparison from last week. Remember, they’re the two keepers with the lowest Sv%. When we look at their teams’ xGA/shot, we find something revealing. Swansea, just now fixing a super-leaky defence, have an xGA/shot of 0.149, easily the worst in the league. So Fabianski has faced a barrage of very difficult shots, which is a factor in his low save percentage. But Southampton’s xGA/shot is a midtable 0.100, which means Fraser Forster is registering a historically low save percentage against a set of average shots. In other words, he’s having a bad year. West Ham’s second goal last weekend was just another in a long line of savable shots that have gone in.
To finish, let’s examine another useful application: xG totals in individual matches. A large percentage of matches can be effectively evaluated by xG, aided by shot maps. Let’s look at a few from last weekend, with shot maps available on Caley Graphics if you keep scrolling down.
The big game, Chelsea-Arsenal, wound up 3-1, with Chelsea leading 1.8 – 0.9 in expected goals. The xG numbers show a solid advantage, almost a full goal. A pretty typical game for Chelsea: taking the lead early, scoring on a counter and a defensive error, allowing some shots when ahead but nothing dramatic. So they exceed their xG, and their xGA is almost on the nose.
Now to Leicester – Manchester United: goals 0-3, expected goals 0.5 – 2.1. Once again expected goals tells a clear story. United dominated, and if you look at the shot maps you’ll see Leicester got only low-chance shots. Like Chelsea, United outscored their xG. In fact, unless you’re Barcelona, it’s rare to hit three xG in a game. A side that scores three goals is usually indebted to good finishing.
Southampton – West Ham is fascinating, because the result, 1-3, is the opposite of the xG totals, 1.3 – 0.5. The Saints had most of the play, and got close to their xG, but typically, could only get off low-quality shots. West Ham had only a few opportunities, but put them away, scoring fully 2.5 goals above their xG. This game also displays a little of what are called “score effects.” When a team is ahead in the latter stages of a game, they’ll defend deep. The result, as here, is that the team behind gets shots of low quality, which increase their xG, but usually come to nothing.
And to take this even further, we can look at Crystal Palace – Sunderland, a game where xG doesn’t tell the story at all. The result was (look away, Palace fans) 0-4, but the xG totals have Palace ahead 1.6 – 0.9. That’s right: xG by itself would tell us that Crystal Palace were the better side. What happened, of course, is that Sunderland scored three goals off not-at-all-easy chances, then sat tight the entire second half. So at the individual game level, xG can be way off the mark.
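The four matches above make a handy test case. This small Python sketch, using only the scores and xG totals quoted in this piece, flags the games where xG and the final score point to different sides. The helper names are my own, and the teams are simply listed in the order given in the text.

```python
# (first team, second team, goals, goals, xG, xG), as quoted in the text.
MATCHES = [
    ("Chelsea", "Arsenal", 3, 1, 1.8, 0.9),
    ("Leicester", "Man United", 0, 3, 0.5, 2.1),
    ("Southampton", "West Ham", 1, 3, 1.3, 0.5),
    ("Crystal Palace", "Sunderland", 0, 4, 1.6, 0.9),
]


def sign(value):
    """Return 1, 0 or -1 depending on whether value is positive, zero or negative."""
    return (value > 0) - (value < 0)


def xg_agrees_with_result(goals_a, goals_b, xg_a, xg_b):
    """True when the side ahead on xG is also the side ahead on goals."""
    return sign(goals_a - goals_b) == sign(xg_a - xg_b)


disagreements = [
    f"{team_a} - {team_b}"
    for team_a, team_b, goals_a, goals_b, xg_a, xg_b in MATCHES
    if not xg_agrees_with_result(goals_a, goals_b, xg_a, xg_b)
]

for match in disagreements:
    print("xG and result disagree:", match)
```

Two of the four come out flagged, which is a fair reminder that a single match is a very small sample of shots.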
It’s good to close with an example that shows the limitations of xG, lest we be tempted to think it has all the answers. As we noted at the beginning, it’s a stat like any other, and there are several competing versions. Analytics people themselves know best of all where it succeeds and where it can fail. But if used carefully, it’s our best single stat for overall performance, and offers a logical endpoint for our series.
Next week, then, I’ll have some final thoughts on football stats, plus some excellent links for those who want to explore this fascinating field for themselves.