McCormick’s Craic: Lies, damn lies and statistics! What can we really learn from crunching numbers?

Who was it said “There are three kinds of lies. Lies, damn lies and statistics”? We asked one hundred people. 37% said “Benjamin Disraeli,” 55% said “Mark Twain” and 42% said “I don’t give a toss!” Half Man Half Biscuit said “If 8 out of 10 cats do prefer Whiskers do the other two prefer Lesley Judd?” John McCormick having set off on a train of thought about whether or not Arsene Wenger uses statistics to help him to decide whether or not to sell players (he doesn’t need stats to tell him Chamakh was a waste of money) has been considering the place of the numbers game in football.

John McCormick - vital statistics

After M. Salut posted some comments about Ibrahimovic, regular Salut contributor Jeremy Robson and I engaged in a debate, some of which concerned the use of statistics in making footballing decisions. We’re probably not as far apart as that debate suggests but I thought I’d put pen to paper (actually fingers to keyboard, but we oldies are bound by metaphor) and give you my thinking about the use of statistics in football.

I must say, though, that I’m not really a stats person. I don’t download or analyse figures. I rarely even read the Opta stats for SAFC when they are posted on the ALS website. Nor do I want to get into semantics over the differences between information, data and statistics, or even discuss how many Sessegnons can dance on the head of a pin (*the answer to that is at the bottom, by the way). I just think, properly used, hard data has an important role to play in the modern football club.

I’ll use a recent trip to Southport to outline some of my themes and then I’ll move on to football scenarios to add a bit of depth:

Last Thursday, when it was sunny, I strolled down to the station and caught a train to Southport, which is a pleasant ride up the coast. I spent the day in Southport and got back about 7.00pm.
Did I have a good time? What do you think?

There’s actually nothing to tell you what the day was like but because of the way I presented (some may say slanted, or biased) the information you might be inclined to think I did have a good time. Even so, you might be reluctant to make a firm decision. So here’s another bit of information:
Last Thursday I spent 7 hours in the Accident and Emergency unit of Southport General Hospital.
Now what do you think?

It’s probable that your view has changed. The second piece of information is presented without any slant but most people’s experiences of A&E units aren’t good and that personal experience will affect judgements. I think you will now be of the opinion that I didn’t have a good day.

However, there are other possibilities to consider. I might be a retired health professional called in to cover absence, or a lay minister volunteering to work with casualties. It’s possible I went home feeling happy and fulfilled.
But the truth is that my friends, my family and I have no association with Southport General in any professional or volunteer capacity.

So what do you think now? You probably will be more confident that I didn’t have a good day. You could be wrong but by considering all of the information, discarding any which is irrelevant, taking account of bias (in the information or in yourself) and, above all, asking the right questions to get more information, you are likely to reach a much better decision than if you used gut feeling, guesswork or a coin toss.

My argument is that the same applies to football. There’s a lot of information out there in the form of hard facts. If clubs can identify the questions to ask and find the data which provides the answers they might just gain an advantage. The key is to ask the right questions and use the right data. I think from what Jeremy said that he believes clubs don’t always do this. I have to agree with him, and if clubs don’t ask the right questions or make decisions using the relevant data their decisions are indeed meaningless.

Let’s consider this in the context of a hypothetical situation. If Opta stats, which for the purpose of argument we will assume are based on enough data (i.e. enough games played) to be valid, show Phil McBardsley made fewer tackles in 2012-13 than he did in 2011-12, and fewer tackles in 2013-14 than in 2012-13, should the club sell him?

Hamish McBardsley by Jake

Such data might indicate McBardsley is slowing down and needs to be moved on. However, there are other possibilities. Could it be that he has gained experience and intercepts passes before he needs to tackle, or that someone further up the pitch is making life difficult for the opposition and the ball doesn’t come as deep? If so, it could be a mistake to get rid of him. Or could changes be happening because of MON’s coaching? In this case the data might be the only sure way of telling that it’s working.
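
To show how one follow-up question can change the picture, here is a minimal sketch in Python. Every figure in it is made up for illustration (McBardsley is hypothetical, after all), and the `SeasonStats` record and the helper functions are my own assumptions, not anything Opta or a club provides. It checks whether tackles are falling season on season, then asks the same question of tackles and interceptions combined.

```python
from dataclasses import dataclass


@dataclass
class SeasonStats:
    """One season of defensive numbers for a single player (hypothetical)."""
    season: str
    tackles: int
    interceptions: int
    minutes: int


def per_90(count: int, minutes: int) -> float:
    """Scale a raw count to a per-90-minutes rate so seasons are comparable."""
    return 90.0 * count / minutes


def is_declining(values: list[float]) -> bool:
    """True only if every value is strictly lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Made-up figures, purely for illustration.
seasons = [
    SeasonStats("2011-12", tackles=95, interceptions=40, minutes=2700),
    SeasonStats("2012-13", tackles=80, interceptions=55, minutes=2700),
    SeasonStats("2013-14", tackles=66, interceptions=70, minutes=2700),
]

tackles_p90 = [per_90(s.tackles, s.minutes) for s in seasons]
actions_p90 = [per_90(s.tackles + s.interceptions, s.minutes) for s in seasons]

tackles_falling = is_declining(tackles_p90)
actions_falling = is_declining(actions_p90)

if tackles_falling and not actions_falling:
    verdict = "Tackles are down but total ball-winning is not: investigate further."
elif tackles_falling:
    verdict = "Tackles and total ball-winning are both down: a stronger warning sign."
else:
    verdict = "No steady decline in tackles."

print(verdict)
```

With these invented numbers the tackle count drops every year while the combined total holds steady, which is the "he now intercepts instead" story rather than the "he is slowing down" one. Real data would need more questions still, such as where on the pitch the ball is being won.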

On its own the data about tackles is not enough for any decision to be made, but it does flag up something that needs investigation. More questions need to be asked and only when these are all answered might the manager be in a position to make a decision. Clubs are increasingly turning to data to help them answer questions and I think this is a force for the good. Done properly, this must be better than working on intuition and gut feeling. It isn’t all down to the readily available Opta stats, though. Clubs are very private in the data they collect and what they do with it.
Here’s a real-life situation: Match of the Day showed a game against Stoke, who were playing Delap, where a defender conceded a corner rather than a throw-in. Motty’s comment was something like “That’s one way of preventing a long throw-in”. It was, but was it good play?

If this was just a one-off event during a game then that’s OK. It’s nothing more than a talking point. But what if a coach had just watched a “Football Focus” sequence which featured half a dozen goals coming from Delap’s long throws? There’s something called the “availability heuristic” that states information recently received can stay in the memory and influence judgements. The coach might have thought long throws were more dangerous than really was the case and given his players the wrong advice.

To make the best decision the coach would need to know the probabilities of Stoke scoring from a long throw and from a corner. That’s not as simple as it seems because you have to factor in things like goals directly arising from the kick or throw, add in goals from what rugby would call the “second phase” and take account of other effects, such as sendings-off or penalties.

But such calculations can be made, and they will be valid as long as enough data is collected. If someone had carried out such an analysis and it showed that a long throw was more dangerous than a corner, then conceding a corner could be a smart tactic. But there are always other considerations. How does your own team handle corners? How does this compare with their handling of balls played in from the wings, which might be the best data you can get to help you focus on long throws? Then what about the effects of weather, sunlight, boggy ground, pitch slope, ice, wind, rain? And don’t forget players and their foibles. It’s a complex situation but in setting up for any game asking the right questions and getting the correct answers could give your team the edge.
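
For anyone who wants to see what such a calculation might look like, here is a rough sketch in Python. The counts are invented for illustration and are not real Stoke figures, and the function and variable names are my own. It adds direct and "second phase" goals together for each type of set piece, works out the goal rate, and uses a standard Wilson score interval to show how much uncertainty is left at that sample size.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


# Invented counts, purely for illustration.
long_throws = {"taken": 200, "direct_goals": 6, "second_phase_goals": 4}
corners = {"taken": 300, "direct_goals": 5, "second_phase_goals": 4}


def goal_rate(set_piece: dict) -> float:
    """Goals per set piece, counting direct and second-phase goals together."""
    goals = set_piece["direct_goals"] + set_piece["second_phase_goals"]
    return goals / set_piece["taken"]


def interval(set_piece: dict) -> tuple[float, float]:
    goals = set_piece["direct_goals"] + set_piece["second_phase_goals"]
    return wilson_interval(goals, set_piece["taken"])


throw_rate, corner_rate = goal_rate(long_throws), goal_rate(corners)
throw_lo, throw_hi = interval(long_throws)
corner_lo, corner_hi = interval(corners)

# If the two intervals overlap, the data cannot yet separate the two threats.
intervals_overlap = throw_lo <= corner_hi and corner_lo <= throw_hi

print(f"Long throws: {throw_rate:.1%} (interval {throw_lo:.1%} to {throw_hi:.1%})")
print(f"Corners:     {corner_rate:.1%} (interval {corner_lo:.1%} to {corner_hi:.1%})")
print("Inconclusive, collect more data" if intervals_overlap else "Rates look different")
```

With these made-up numbers the long throw looks more dangerous on the raw rate, yet the intervals overlap, so a coach could not yet say that conceding the corner is the smarter option. That is the "enough data" condition in practice, and it still leaves out penalties, sendings-off and everything else mentioned above.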

The same applies in setting up for any season. You have a limited squad. How will you cope with the speed of Bale, the aggression of Rooney, the experience of Sir Alex, the hostility of ASDA in the derby? What data will you use to maximise your chances of a successful season when you select your squad? Our medal-winning cyclists evaluate everything they think might be relevant and operate on what they call the aggregation of marginal gains. Football should do the same.

So here’s a question for you. MON is setting up for 2012-13 and that includes moving players in and out. Should he keep Cattermole or get rid?

Catts adds to his card collection
This site and others such as ALS show it’s an emotive area for fans. That’s fine but do we want MON to behave emotionally? Should he just rely on his judgement, what he sees in training and what his scouts tell him? Or are there statistics, data, or information (take your pick) which might help him make an optimal decision? If so, what are they and where are they? That’s the issue clubs need to get to grips with.

(*Only one Sessegnon can dance on the head of a pin, ‘cos there’s only one Sessegnon)

6 thoughts on “McCormick’s Craic: Lies, damn lies and statistics! What can we really learn from crunching numbers?”

  1. I think the software that some managers use does a lot of this over the course of games so would indicate levels of fatigue/stamina problems etc. Sam Allardyce is a keen user of this approach as is Brian Laws. I don’t think Steve Bruce is an advocate (see elsewhere on Salut) 🙂

    As you say John, you wouldn’t want to make some of the data public because of the effect that it would have on a player’s worth etc. With someone like Andy Reid for example, maintaining a current inventory of the corned beef and cheese pasty trays at the local Greggs would probably be a more useful guide to form and fitness than any bespoke software.

  2. Jeremy and Geoff have touched on a point I just mentioned in passing. What statistics/data do clubs collect? They keep a lot of it to themselves. Apparently key indicators of potential are acceleration to above 7m/sec and the ability to stop abruptly, change direction or reverse when moving beyond that speed. I don’t think I’ve ever seen such data but I bet sensible managers get it when they say things such as “so and so is training with us..” I’d also expect them to collect data when people are having “medicals”.

    If your data shows one of your stars is beginning to flag you’d be daft to let everyone know. If you find one of your targets isn’t meeting expectations you might want to reconsider the transfer.

    I bet, though, that some clubs/managers use data more effectively than others

  3. That’s precisely the sort of analysis that these raw data stats do not perform, Geoff. Number of passes does not tell the whole story, or even part of it. What happened as a result of that pass? Very much a case of never mind the quality, feel the width.

  4. If we ever saw stats which gave the number of times a player made an off the ball run which created space for his team, denied space to the opposition, caused an opponent to be marked, or persuaded them to hoof the ball forwards then I’d be able to take them a wee bit more seriously. Stats are only as good as the planning behind them – and that is usually woeful.

  5. There’s some tantalizingly hot nurses at Southport Hospital – I have no doubt he had a great time. Not misleading at all.

  6. Great article John which raises some excellent points.

    The comment about Delap’s throws is a good one because it touches on another issue regarding stats which is currency and validity. Football often relies on outmoded concepts and facts which were historically correct at some time but which are often no longer true. It is well over a year since Delap’s throw-ins created a goal for Stoke. Any Potter will tell you that but teams still behave and defend as if that threat remains and is as significant as it once was.

    What is also interesting is the relative infrequency of goals resulting from corner kicks. In the last World Cup only about 8% of corners led to a goal being scored. I remember reading other articles about this (free kicks tend to be much more dangerous to defend than corners) but there is an interesting article to be found here. http://www.scienceofsocceronline.com/2010/12/corner-kicks-by-numbers.html

    My guess is that if you were to stop fans going into the ground at 2-45pm on any given Saturday, their estimation would be that a much higher percentage of goals would result from corners. Stats for the PL are remarkably low too, perhaps surprisingly low. It would be very interesting to compare the threat from corners with throw-ins taken in the last 25 yards from goal for example. There’s a challenge to the Sports Science community.

    Some of the research that Tom Reilly has done provides real scientific evidence about what can make teams effective and how best to deploy creative midfield players. It may sound like a theory built out of common sense but it provides clear evidence of something that we have all known for years. Tom Reilly’s team developed the idea of Zone 14, which depends on a player with the trickery to move fast through it and change the direction of play within 8 seconds. They have the stats to prove it from studying major championships. There’s a good article here about it http://www.telegraph.co.uk/sport/football/3028353/Scientists-find-footballs-golden-square.html

    It depends not only on the creative player such as Iniesta, Messi, David Vaughan etc to do what he does best, but also on the strikers or wide players to do their bit as well. Very interesting you may say, but what do you do with that information? Well, that comes down to a coach who can develop plays in training to exploit this knowledge, and that requires more than simple interpretation. It demands skill, knowledge and practice as well as the element of surprise.

    We aren’t far apart in our thoughts at all John. This type of analysis goes way beyond Opta indexes which portray players like Britton at Swansea as some sort of world-beater when he simply isn’t. Some of the stats used in football don’t help, they simply hinder or create an illusion of some kind. The quality of experience and final result are what counts and that doesn’t matter if it’s a goal or a hospital visit.
