Win probability estimates: what are they good for?

This post doesn’t contain any (original) graphs. Normal programming will resume shortly.

Win probability graphs have taken over the wonkier corners of sports media. American sports media, in particular, feature them heavily.

I’ve calculated some simple win probability estimates for AFL footy in the past. My estimates show how likely a team is to come back from a given margin at quarter time, half time, or three-quarter time. For example, if a home team is up by 12 points at three-quarter time, I estimate that they’re 81.3% likely to go on to win the game.

My model is very simple. Using all home-and-away games since 1920, it looks at what the margin was at each break, and whether the team went on to win or not. Other models – particularly for American sports – are far more complicated. For example, baseball win probability models take into account not only the score and the inning, but the number of outs, the number of runners on base, and other factors. They are, in short, good and cool.
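If you're curious what the simple version looks like in code, here's a minimal sketch in Python. It assumes a pandas DataFrame of historical games with hypothetical column names (qt3_margin and home_win); the real model does this for each break, but the idea is the same.

```python
import pandas as pd

# A minimal empirical model in the spirit described above. Assumes a
# DataFrame of historical home-and-away games with hypothetical columns:
#   qt3_margin - home team's margin at three-quarter time (points)
#   home_win   - 1 if the home team went on to win, 0 otherwise

def win_prob_by_margin(games: pd.DataFrame, bin_width: int = 6) -> pd.DataFrame:
    """Estimate P(home win) for each bucket of three-quarter-time margins."""
    buckets = (games["qt3_margin"] // bin_width) * bin_width  # one-goal-wide bins
    table = games.groupby(buckets)["home_win"].agg(["mean", "count"])
    return table.rename(columns={"mean": "win_prob", "count": "n_games"})

# Usage: look up the bucket containing a 12-point three-quarter-time lead.
# table = win_prob_by_margin(games)
# print(table.loc[12])  # something like win_prob 0.813, as in the example above
```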

Win probability plots copped a fair bit of flak today, because the New England Patriots came back from 28-3 down in the third quarter to pull off an improbable victory against the Atlanta Falcons. Here's what ESPN's win probability plot for the game looked like:

2017 Super Bowl win probability plot

I think a fair bit of the criticism directed towards win probability graphs, and the models they’re based on, has been undeserved. I’ll address that, for some reason, in the form of an entirely hypothetical dialogue with a win probability hater. Here goes!

In-game win probability is only useful for gambling and gambling sucks. Therefore, win probability graphs suck.

Nope! In fact, these plots aren't even that useful for gambling. For them to be useful for gambling, your model would have to reliably outperform the bookies in estimating the odds of victory. And any model that could reliably outperform the bookies would, if it were public, be used by the bookies to set their odds, thereby eliminating your ability to make money from it.

If anything, these plots provide an alternative to live betting odds: an estimate of each team's win probability unencumbered by gambling advertising.

What are these plots useful for, if not gambling?

As a baseball fan, I often keep half an eye on the Fangraphs live scoreboard to figure out when a game’s getting interesting. With up to 15 games played at any one time, it’s useful to have an easy overview of which games are still up for grabs and which are all but over. When should you turn on a game, or turn one off? Win probability graphs can tell you that.

Why do you need a model for that? Can’t you just look at the score?

We all have our implicit models of how matches are likely to go. If a footy team is up by 10 goals with 5 minutes to go in the final quarter, we can say with a high degree of certainty that they're going to win. If they're up by 10 goals at quarter time, we'll also be pretty sure they're going to win, but a bit less certain. How much less certain should we be? That's where a model comes in. It can give us a much more rigorous estimate of how likely a team is to go on to win from a particular position. At what point does a three-goal lead become 'safe'? Again, a model gives an answer to that.
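To make that concrete, here's a toy version of the 'safe lead' query you could run against a fitted model. The win probabilities in the table are made up for illustration, not real estimates.

```python
# A toy query against a fitted model. The win probabilities below are
# made up for illustration - they are not real estimates.
win_prob_at_3qt = {0: 0.57, 6: 0.68, 12: 0.81, 18: 0.90, 24: 0.95, 30: 0.98}

def smallest_safe_lead(table: dict[int, float], threshold: float = 0.95) -> int | None:
    """Return the smallest lead whose estimated win probability clears the threshold."""
    for lead in sorted(table):
        if table[lead] >= threshold:
            return lead
    return None  # no lead in the table is 'safe' at this threshold

print(smallest_safe_lead(win_prob_at_3qt))  # -> 24: a four-goal lead is 'safe' at 95%
```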

Is that it?

No! How much did a particular goal affect a team’s chance of winning? A model can tell us that.
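The calculation is simple once you have a model: a goal's contribution is just the change in the estimated win probability from before the goal to after it – what baseball analysts call 'win probability added'. A sketch, with invented numbers:

```python
# "Win probability added": the change in the model's estimated win
# probability from just before a goal to just after it. All numbers
# below are invented for illustration.
events = [
    ("Q4 05:12 goal (home)", 0.62, 0.71),
    ("Q4 02:30 goal (away)", 0.71, 0.55),
    ("Q4 00:40 goal (home)", 0.55, 0.93),
]

for label, wp_before, wp_after in events:
    print(f"{label}: {wp_after - wp_before:+.0%} to the home team's chances")
```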

Why bother with all that?

Why bother with anything? It’s all just a momentary diversion on the road to the grave. Figuring out how sports work is just some small way to impose artificial order on our ultimately meaningless existence and derive some fleeting pleasure from understanding our arbitrary universe just a tiny bit more.

Uh, moving on… Nate Silver got the election wrong, so why do you still bother with all this stats stuff? Doesn’t the election show it’s all nonsense?

Nate Silver didn’t get the election wrong though, did he? The night before the Presidential election, FiveThirtyEight’s Polls Only forecast model gave Donald Trump a 29% chance of winning. That’s about the same odds as getting a one or a two if I roll a die. If I roll a die, it’s most likely that the number that comes up won’t be a one or a two – but we all understand that a one or a two might still come up. Why is that hard for people to understand when it comes to probabilistic forecasts? One-in-three events happen about one time in three, and that’s what happened in the Presidential election.

What if you say a team has a 1% chance of winning, but then they go on to win? Surely that shows you got it wrong.

Not necessarily.

If you assign a one-in-100 probability to something happening, it should happen about one time in a hundred. If one-in-100 events keep happening – much more often than one time in a hundred – then there’s probably something wrong with your model. By the same token, if events you predict will come up one time in a hundred never happen, your model has some problems as well.
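That kind of check is easy to run. Here's a sketch that bins a model's predictions and compares the average predicted probability with the observed win rate in each bin; preds and outcomes are hypothetical names for whatever your predictions and results look like.

```python
import numpy as np

# A basic calibration check: bin the model's predictions, then compare the
# average predicted probability with the observed win rate in each bin.
# Assumes `preds` (predicted probabilities) and `outcomes` (0/1 results).

def calibration_table(preds: np.ndarray, outcomes: np.ndarray, n_bins: int = 10) -> None:
    bin_ids = np.clip((preds * n_bins).astype(int), 0, n_bins - 1)
    for b in range(n_bins):
        in_bin = bin_ids == b
        if in_bin.any():
            print(f"predicted ~{preds[in_bin].mean():.2f}, "
                  f"observed {outcomes[in_bin].mean():.2f} (n={in_bin.sum()})")

# For a well-calibrated model, the two columns should roughly agree in every bin.
```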

If win probabilities are estimates from a model, why are they presented like they’re a fact?

This is a fair point! Any estimate of win probability is just that – an estimate. With any win probability model, there will be some uncertainty about that estimate. Some of the uncertainty depends on the sample size of games used to estimate the model – this can and should be depicted graphically as confidence intervals around your probability estimate.
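For the sample-size part, a standard binomial interval does the job. Here's a sketch using the Wilson score interval, with made-up counts:

```python
from math import sqrt

# Wilson score interval for an empirical win rate - a standard way to put
# a confidence interval around an estimate like "81.2% from 500 games".
# The counts below are made up for illustration.

def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    p = wins / n
    centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = (z / (1 + z**2 / n)) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

lo, hi = wilson_interval(406, 500)  # 406 wins in 500 games: an 81.2% raw rate
print(f"95% interval: {lo:.1%} to {hi:.1%}")  # roughly 77.5% to 84.4%
```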

Some of the uncertainty about a win probability estimate comes from the inadequacy of the model itself, and this is harder to quantify. If the model doesn't take into account every factor that could affect the result of the game, its estimated probabilities might not be quite right.

The estimates are estimates, and they should be presented as such. But a lot of win probability models generate pretty good estimates, and the quality of their predictions is easily tested.
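One easy test is the Brier score – the mean squared difference between your predicted probabilities and what actually happened. A quick sketch, with made-up numbers:

```python
import numpy as np

# The Brier score: the mean squared difference between predicted
# probabilities and actual 0/1 outcomes. Lower is better; a model that
# always says 50% scores 0.25. Example numbers are made up.

def brier_score(preds: np.ndarray, outcomes: np.ndarray) -> float:
    return float(np.mean((preds - outcomes) ** 2))

preds = np.array([0.90, 0.20, 0.95, 0.30])
outcomes = np.array([1, 0, 1, 0])
print(brier_score(preds, outcomes))  # ~0.036
```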

Isn’t there a danger that people will misinterpret probabilities?

Yes! A 90% probability doesn't sound that different to a 99% probability, right? But a 90% probability means the event fails one time in ten, while a 99% probability means it fails one time in a hundred. That's a big difference, and people routinely misunderstand it.

But this is a reason to explain estimates more rigorously and carefully, not to abandon win probability models.

Shut up with all this nerd stuff – I just want to watch the game!

Cool! Me too. But enjoying the aesthetic spectacle of sport for its own sake and delving beneath the surface to figure it out with numbers are not mutually exclusive things. You can enjoy both.

Creator of The Arc