Stat’s More Like It: Squeaky bum time is upon us

While some crucial World Cup qualifiers get played (well, at least where I’m from), and we have to suffer the only annoying thing about not being in League 1 any more – an international break – it’s perhaps a good time to try and keep our nervous dispositions in check and have a look at what our chances of Championship survival are. Stats are from stats.football365.com and statto.com.

Last season, when we were coming up to the business end of the football season, I took a look at what fixtures we had left compared to our promotion rivals. The verdict wasn’t positive: United (and Milton Keynes) both had easier run-ins than we did. What even statistics couldn’t account for was United’s wonderful knack of self-combustion, and that amazing run we went on when Dave Jones was hired, so, happily, what seemed statistically most likely didn’t happen. And the strangest things really do happen sometimes in football. Just look at Kidderminster Harriers in the Conference: They didn’t get a single win in their first 10 matches this season, but have won a staggering 21 of their last 28 matches.

Daren't Watch

Relegation battles – not for the squeamish.

But even if, unlike me, you don’t see a certain beauty in a well-ordered spreadsheet, graphs that look like a goalkeeper’s jersey from the late 80s/early 90s and numbers, numbers and more numbers in all their glory, you should still pay attention to statistics. They’re usually better than your gut feeling, and a lot less likely to just say what you’ve been going on about all season. Churchill once said that he never trusted a statistic he hadn’t manipulated himself, and there’s a truth in that. How you set up the numbers makes a massive difference. In this installment of “Stat’s More Like It” I’ll do the best I can to show how I’ve ended up where I have with the statistics I’m going to show.

We’re coming up to the part of the season where men and mice are sorted accordingly. It’s squeaky bum time. It’s never boring being a Wednesdayite. Fans of boring midtable teams that always seem to be midtable don’t know what they’re asking for when they complain about a lack of excitement. And the most faint-hearted among us will take refuge in all manner of girly “I daren’t look”-poses earlier and earlier into games (none mentioned, none forgotten!).

Relegation battle – recent form

Mystic Meg

Not easy to predict the future

Already radio phone-ins, TV punditry and football podcasts are awash with people making cocksure predictions about how the season will end – they’re often as wide of the mark as the lady on the left. And sometimes they’ll say something like “we have a better run-in than they do”.

Normally people will make a quick scan of the matches each team battling for promotion/survival has left, and judge the run-in on what league position the opponents in the fixture list are in.

Just looking at where a team is in the table when considering if they’ll be tough opponents is too simple.

You’ll see why here:

[Table: the Championship table, with points won in the last 8 matches and in the last 4 matches in the two rightmost columns]

The table shows the normal table in the first handful of columns. The second column from the right is the number of points the team has won in the last 8 matches.

Simply looking at where teams are in the table, we’d not get the full picture, especially this season. Notice how many of the teams at the bottom have won close to the number of points teams at the top have in the last 8 matches. What’s even more noticeable is teams like Burnley, Derby, Millwall and Blackburn – all above the drop zone – who have all won just 6 points in their last 8 matches.

8 matches can also be quite a while, though. And if a team has done badly a while ago, but had an upturn in form in their last 4 matches for instance, you won’t really know that just looking at their points haul from the last 8 matches.

Look at the table above again, and pay attention to the column furthest to the right. As I said before, Millwall have not done very well in their last 8 matches, winning just 6 points. But if you look at the last column there, they’ve won all those 6 points in their last 4 matches.

A very, very different picture emerges when you look at just those last 4 matches. Where there were already quite a few teams at the bottom going at a rate close to those at the top in their last 8, the last 4 matches ram home one of the home truths about the Championship: It’s quite possibly the tightest, most unpredictable division in all of football.

Bar the form teams Forest and Bolton, no other team in the division have won more than 2 of their last 4 matches. That’s really quite extraordinary, and it speaks volumes about how tight the relegation fight could be, if everyone truly does keep being able to beat everyone.

How it’d all end if teams kept up their form

It’d be one thing to simply assume teams would continue the form of their last 8 to the end of the season. The good people at stats.football365 have actually done that, and if we assumed teams would win as many points per game for the rest of the season, as they have in their last 8 matches, the table would be very, very different, especially below the top:
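For anyone who wants to see that assumption as arithmetic, here is a minimal sketch of it. The function name and the example figures are my own invention for illustration; they are not taken from the stats.football365 table.

```python
def project_final_points(current_points, form_points, form_matches, matches_remaining):
    """Current total plus recent points per game carried over the remaining matches."""
    points_per_game = form_points / form_matches
    return current_points + points_per_game * matches_remaining


# Made-up figures, not taken from the table: 40 points so far,
# 6 points from the last 8 matches, 8 matches left to play.
print(project_final_points(40, 6, 8, 8))  # 46.0
```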

[Table: projected final table if every team kept up the points per game of their last 8 matches]

So, we’d all take that, surely? Wednesday quite a distance above the relegation zone, and another extraordinary thing: Two of the teams relegated from the Premier League last season relegated again. Barnsley’s stonking form would see them romp into 8th (but still quite a distance off the play-offs) from 20th at the moment.

As Millwall’s form showed, just looking at the last 8 matches can be a bit treacherous. But only using the results of the last 4 matches may also be problematic. So what I’ve done is take the average of points per game in the last 8 and points per game in the last 4. We then get a form that tries to balance out the problems of just using either one of them.
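A rough sketch of that averaging, using Millwall’s figures from above (6 points in the last 8 matches, all 6 of them won in the last 4). The function names are mine, for illustration only, and not something lifted from the spreadsheet.

```python
def points_per_game(points, matches):
    """Average points won per match over a run of matches."""
    return points / matches


def weighted_form(points_last_8, points_last_4):
    """Mean of points per game over the last 8 and over the last 4 matches."""
    ppg_8 = points_per_game(points_last_8, 8)
    ppg_4 = points_per_game(points_last_4, 4)
    return (ppg_8 + ppg_4) / 2


# Millwall: 6 points in the last 8 matches, all won in the last 4.
print(weighted_form(points_last_8=6, points_last_4=6))  # 1.125
```

Because the last 4 matches are also part of the last 8, they effectively count twice, which is what tilts the number towards the most recent results.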

Weighted form

And how would the table look, if we used that form and assumed teams would continue it for the rest of the season?

[Table: projected final table if every team kept up their weighted form from the last 8 and last 4 matches]

Well, I wish I hadn’t done that now: As you can see we’re in the last of the three relegation spots.

As a way to make up for letting the numbers predict we’ll get relegated, let me say this: Maybe even using the weighted form of the last 4 and the last 8 is too simplistic? Using the points won in any number of the last matches played doesn’t consider the opponents that teams played in those matches.

Run-in – what role does it play?

We’ve just played 5 opponents that were all, on paper especially, very hard, and our points haul suffered due to that. And that’s one reason the last graph there sends us down.

So we need to figure out a way to factor in what opponents teams still have to play. Whose run-in is easier than the others? As mentioned at the beginning, I tried something similar last season. And I’ll do something similar albeit more advanced to get a measure of how strong an opponent is likely to be. That way we’ll be able to see who has the better run-in.

A full Hillsborough

Home form – not something for us, but nearly half of all Championship matches have been home wins this season.

First off, it makes a difference whether you’re playing at home or not. Well, we’ve actually won more points away from home than at Hillsborough, but in all the matches played in the Championship this season, nearly half of them (45 percent) have been home wins. Just under one in every three (31 percent) have been away wins. Being at home is an advantage, and I’ll factor that into the model.

How, then, can we say something about a team’s strength? Points total is obviously a good measure, and at season’s end the only measure, of which teams are good. So that’ll be included in the model too.

As demonstrated, a team’s points total does not tell the whole story. Middlesbrough, in 9th, have won just half as many points as Barnsley, in 20th, in recent matches, despite having 10 points more. It seems only fair to include a team’s form in the model too: I’ll include points won in the last 8 and 4 matches in the model.

Even if nearly half of Championship matches end in a home win, not all teams do well at home; we’re a good example of that. And Bolton and Leeds – both with 11 home wins and just 3 away wins – are both better home teams than the divisional norm. We need to factor in whether an opponent is “a good home team” when you’re playing them away, and vice versa when you’re at home to them. Points won at home and away respectively go into the model, as do home and away form (last 5 matches).

We can then make a measure of how good a home team and away team each of the teams are. I’ve only looked at the bottom 14 in the Championship, as we’re concerned with relegation, not promotion. Who has the hardest run-in then?

The run-in score (patent pending…)

To find that out I’ve used the model with the numbers mentioned above, and made a point score that shows how tough the run-in is; the higher the score, the tougher the run-in.

I’ll admit that those run-in point values can be a bit hard to interpret. A good guide may be that if our last 8 matches were going to be as tough as the run of fixtures we’ve just had – playing Palace, Forest, Watford, Leicester, Cardiff – we’d have a run-in score of 403 points.

The worst possible run-in you could theoretically have, playing away to the 4 hardest home teams and at home to the 4 hardest away teams, would give you a run-in score of 549. On the other hand, the easiest run-in would have the score 285.

What sort of run-in scores do the bottom 14 teams have then?

[Figure: run-in scores of the bottom 14 teams, with a red line marking the average]

The red line is the average number of run-in points; if you have fewer run-in points than that and are on the left side of the red line, your run-in is easier than average. If your score is to the right of the red line, your run-in is harder than average.

Finish line ahead

Home straight – it’s a mad scramble for the finish line at the bottom of the Championship, and while the difference in run-ins isn’t great, it could be crucial.

What are we seeing then? Well, our run-in is almost bang on average. And both Wolves and Bristol City below us have harder run-ins. You almost have to feel for Barnsley (but only almost): Having put together such a terrific run as they have recently, they have the hardest run-in of all bottom 14 teams. And, using the run-in points, theirs (423 points) is even harder than that rotten run of fixtures we’ve just had (403 points).

Birmingham and Blackburn have the easiest run-ins, whereas Charlton, in particular, and Blackpool and Ipswich have harder run-ins than most of their rivals.

The most striking thing, perhaps, is that there’s not that big a difference between the run-ins of about 8 or 9 of the teams in there. But having put us in the bottom 3 before, I’m glad this does, on balance, point in the other direction.

Expected total points: Points haul, recent form and run-ins all combined

What I’d really like, though, is to put that image of Wednesday in the bottom 3 soundly to bed by combining the points haul and recent form of teams with how their run-in looks.

It’s a bit tricky to do so, to say the least. But I’ve conjured up a way to do so (this is where you might want to recall the Churchill-quote I mentioned earlier!), and this is how it looks:

[Figure: expected final points totals for the bottom 14 teams]

Pitch invasion – if the stats have their way, we may see another one of these at Hillsborough on May 4th after averting relegation on the final day.

WAHEY! WE’RE SAVED!
But not by much: Only 2.3 points separate us and relegated Charlton, who drop from 14th currently and back into League 1 (don’t let the door hit you on the way out). They’re joined by Wolves and Bristol City who fail to claw their way out of trouble.

But look at the expected points totals: Bristol City and Wolves would be finishing on the highest number of points a team in 24th and 23rd respectively has finished on in the last 25 years of Championship/Division II football – and by quite a distance too: Luton, with 45 points (1995-96), are the highest points finishers in 24th in that period. The two university Uniteds, Oxford (1993-94) and Cambridge (1992-93), both finished on 49 points in 23rd.

In fact, Wolves’ expected points haul would’ve seen them save themselves from relegation in all but 1 of the last 25 years of second tier football; even Bristol City would’ve equalled or bettered the points total of the team in 21st in half of those last 25 seasons.

Word of warning

Statistics do not give all the answers, but they do make us wiser, or – as a saying goes over here – confused at a higher level.

Before I go overboard with my lovely numbers – and I’m sure, by now, you’d agree they’re decent when they can predict our survival – there’s an obvious word of warning: With the difference in expected points total being so small – just around 3 points from Ipswich in 14th to Charlton in 22nd – how I’ve conjured up the numbers could prove vital. As could, obviously, a single win for one of these teams.

If you can’t be arsed with the technical details (and I don’t blame you), I’ll leave you with this thought to take away: Yes, it will be unbelievably tight at the bottom, but there’s an okay to decent chance we might actually survive, when you look strictly at the numbers. Factors like losing the man directly involved in half our goals this season to injury aren’t accounted for.

But just for the fun of it, I’ll put my neck on the line and say we won’t get relegated, and that Charlton, Wolves and Bristol City will. You’re welcome to taunt me at the end of the season if I’m not right (Twitter: @ploehmann), but rest assured I’ll come back, next season as well, with more numbers regardless of how my prediction fares!

Technical explanation – how is the expected points total calculated?

I’ve used the weighted home and away strengths calculated earlier, and put that in relation to the maximum possible home and away strength. So if you’ve won all your games this season, you’d be 100 %. If you’re Bolton and have done very well at home this season, you’d be 90 % (conversely Charlton is 31 %). The same goes for away teams (Forest is best at 80 %, and Blackburn worst at 26 %). We’re 48 % and 55 % respectively.

I’ve then used the number of home and away matches remaining and multiplied the maximum points from those by the home and away strength percentages from before. Using us as an example, we’d get 48 % of the maximum points from our 4 remaining home matches (12 points), that is 5.76 points. The same is done with all the other bottom 14 teams, both home and away.
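As a small sketch of that step, assuming the usual 3 points for a win and using our 48 % home strength with 4 home matches left (the function and constant names are mine, for illustration only):

```python
POINTS_FOR_WIN = 3


def expected_points(strength, matches_remaining):
    """Share of the maximum points still available, for a strength between 0 and 1."""
    max_points = matches_remaining * POINTS_FOR_WIN
    return strength * max_points


# Wednesday at home: 48 % strength, 4 home matches remaining.
print(round(expected_points(0.48, 4), 2))  # 5.76
```

The away figure is worked out the same way, with the away strength and the number of away matches remaining.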

Now the run-in is factored in: I’ve taken how far a team’s run-in score is from the average run-in score of 362 expressed as a percentage. I’ve had to invert the number, because a high number was bad, and for this to work, I need the highest percentage to be good. If you’re over the average, it’s more than 100 %; if you’re under the average it’s less than 100 %. As we’re on 361, it’s 98 % for us. Barnsley’s run-in means they’re at 83 %; Birmingham, by contrast, are on 116 %.
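One way the inversion could work is sketched below. This is an assumption on my part, since the exact formula isn’t spelled out here: it takes 100 % and subtracts the relative distance from the average. That reproduces the 83 % quoted for Barnsley (run-in score 423), but it gives roughly 100 % rather than 98 % for a score of 361, so the spreadsheet may well use slightly different inputs or rounding.

```python
AVERAGE_RUN_IN = 362  # average run-in score quoted in the text


def run_in_percentage(run_in_score, average=AVERAGE_RUN_IN):
    """Inverted distance from the average: the harder the run-in, the lower the value.

    ASSUMPTION: 1 minus the relative distance from the average; the article
    does not state the exact inversion used.
    """
    return 1 - (run_in_score - average) / average


# Barnsley have a run-in score of 423.
print(round(run_in_percentage(423) * 100))  # 83
```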

I then make another multiplication, this time taking the number of points from remaining matches from before and multiplying it by the run-in percentage. Using Barnsley as an example, they would get another 16 points to finish on 60. But those 16 points are then multiplied by 83 %, meaning it’s 13.28 points instead, and an expected points total of 57.3. So as you can see, even with the modest difference in how hard a run-in teams have, in this model of mine at least it does play a significant part.
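The Barnsley example can be checked with a few lines. Their current total of 44 points is simply the 60 they would finish on minus the 16 still to come; the function name is mine, for illustration only.

```python
def expected_total(current_points, remaining_points, run_in_pct):
    """Current points plus remaining-match points scaled by the run-in percentage."""
    return current_points + remaining_points * run_in_pct


# Barnsley: 44 points now, 16 expected from remaining matches, run-in at 83 %.
barnsley = expected_total(current_points=44, remaining_points=16, run_in_pct=0.83)
print(round(barnsley, 1))  # 57.3
```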

If you want to look at the raw data behind the above, you should be able to access the spreadsheets used (and the spreadsheet used for the last article I did) here – let me know if it doesn’t work:

https://docs.google.com/folder/d/0B1reCxmmHxr_a2xEcVpHNDdNcVE/edit?usp=sharing

Peter

Owls Alive
TWITTER: @OwlsAlive or @ploehmann

