Deep Backward Point

Blog against the machine.

Tag: statistics

Ballsiness

Of all the popular numbers, miles per gallon is a bit of a liar. It doesn’t say what many people think it says.

Take three cars. The Honda CR-V gets 20 mpg, the Honda Civic 30 mpg and the Insight 40 mpg. So the Insight is 10 better than the Civic. The Civic is 10 better than the CR-V. Simple, right?

Wrong. Let’s flip the number around to gallons consumed per 100 miles. Now the CR-V costs 5 gallons to get to 100 miles, the Civic 3.3 and the Insight 2.5.

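The Insight is still better than the rest, but not by as much. Moving from the CR-V to the Civic saves about 1.7 gallons per 100 miles; moving from the Civic to the Insight saves only about 0.8. When judging cost-effectiveness, gallons per 100 miles is the better number.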
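Here is a minimal sketch of that flip, using only the three mpg figures quoted above. The function and variable names are my own, made up for illustration.

```python
def gallons_per_100_miles(mpg: float) -> float:
    """Flip miles-per-gallon into gallons burned over a fixed 100 miles."""
    return 100.0 / mpg

# Fuel economy figures quoted in the post.
cars = {"CR-V": 20, "Civic": 30, "Insight": 40}
consumption = {name: gallons_per_100_miles(mpg) for name, mpg in cars.items()}

for name, gallons in consumption.items():
    print(f"{name}: {gallons:.1f} gallons per 100 miles")

# Equal 10 mpg steps do not save equal amounts of fuel.
saved_crv_to_civic = consumption["CR-V"] - consumption["Civic"]
saved_civic_to_insight = consumption["Civic"] - consumption["Insight"]
print(f"CR-V to Civic saves {saved_crv_to_civic:.1f} gallons per 100 miles")
print(f"Civic to Insight saves {saved_civic_to_insight:.1f} gallons per 100 miles")
```

The same 10 mpg step is worth roughly twice as much fuel at the thirsty end of the range as at the frugal end, which is the point of holding the miles constant.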

What you hold constant at 100 matters. Holding the number of miles constant at 100 is a better way of understanding performance, because it maps well to reality– your commute distance, the distance to the mall and the number of miles you will drive in a year are largely constant. So what you really want to know is how many gallons of gasoline it will take you to get there.

Of course, car companies want you to dream about where you can go on a tank of gas.

Now let’s take the Strike Rate in cricket. It tells us how many runs a batsman would score if he faced 100 balls.

This is a very useful number. It tells me that if I had a team full of Sehwags (ODI SR: 105), then we would make 315 per 300-ball ODI. And since Sehwag averages 35, a team full of Sehwags would get to about 315 for 9 in 50 overs.

In T20 Internationals, a team full of Sehwags (T20I SR: 152) would make about 183 per 120-ball match. And since Sehwag averages 23, it would be 183 for 8 in 20 overs.

This is a useful number because the number of balls is a constant in limited overs cricket. This makes comparisons proportional. A team full of 70 SR batsmen would get to 210 in an ODI, a 60 SR team would get to 180 and an 80 SR team would get to 240.
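The back-of-the-envelope sums above can be sketched in a few lines. This assumes the same naive model as the post: every batsman scores at exactly his strike rate and gets out at exactly his average. The function names are hypothetical.

```python
def projected_total(strike_rate: float, balls: int) -> float:
    """Runs scored if every ball is faced at the given strike rate."""
    return strike_rate * balls / 100

def projected_wickets(total: float, average: float) -> float:
    """Wickets lost if each batsman is dismissed at exactly his average."""
    return total / average

ODI_BALLS = 300  # 50 overs of 6 balls

# A team full of Sehwags: ODI SR 105, average 35 (figures from the post).
sehwag_total = projected_total(105, ODI_BALLS)
sehwag_wickets = projected_wickets(sehwag_total, 35)
print(f"Sehwag XI: {sehwag_total:.0f} for {sehwag_wickets:.0f}")

# Totals scale in proportion to strike rate because the balls are fixed.
for sr in (60, 70, 80):
    print(f"SR {sr}: {projected_total(sr, ODI_BALLS):.0f}")
```

Because the number of balls is fixed, a 10-point gap in strike rate is always worth the same 30 runs in an ODI, wherever on the scale it sits.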

Of course, we could flip this around, the way we flipped the miles per gallon.

The new flipped number would be the number of balls it would take for a batsman to get to 100. Let’s call this new stat ballsiness.

As in, how ballsy is Sehwag? For Sehwag in ODIs, the number of balls he takes per 100 runs is 95.23. So the answer is, very ballsy.

In general, this is useless. There is no purpose in holding the number of runs to be scored as constant, because the overarching reality of limited overs cricket is limited overs.

But a few recent tournaments have turned this assumption on its head. The catalyst? The bonus point.

To recap, in the recent CB Series in Australia and the Asia Cup in Bangladesh, a team could get a bonus point by scoring at a run rate that was 1.25 times their opponent’s. So if Australia bat first and score 200 in 50 overs (RR: 4), then India would have to chase it down in about 40 overs (RR: 5) to get the bonus point.

Now we have a situation where the number of runs to get is a constant and you are trying to minimize the number of deliveries taken to get there. So our new flipped number (balls per 100 runs, or ballsiness) becomes useful.

Now, (a team of) Sehwags would chase 201 runs in about 31.5 overs.

Virat Kohli (ODI SR: 85) has a ballsiness of 117. So a Kohli XI would chase the same target of 201 in about 39.1 overs.

Jonathan Trott (ODI SR: 78) has a ballsiness of 128, so a Trott team would chase the same target in about 43 overs.
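For anyone who wants to play along, here is a rough sketch of the chase sums. It uses the ballsiness figures as rounded in this post and rounds to the nearest ball, which is my assumption about how the overs were worked out. The function names are made up for illustration.

```python
def ballsiness(strike_rate: float) -> float:
    """Balls needed per 100 runs: the strike rate flipped around."""
    return 100 * 100 / strike_rate

def balls_to_chase(target: int, balls_per_100_runs: float) -> int:
    """Deliveries needed to reach the target, rounded to the nearest ball."""
    return round(target * balls_per_100_runs / 100)

def as_overs(balls: int) -> str:
    """Cricket notation: completed overs, then balls into the next over."""
    return f"{balls // 6}.{balls % 6}"

TARGET = 201
# Ballsiness figures as rounded in the post (an assumption of this sketch).
for name, b in {"Sehwag": 95.23, "Kohli": 117, "Trott": 128}.items():
    print(f"{name} XI: {as_overs(balls_to_chase(TARGET, b))} overs")
```

Note that the overs are in cricket notation, so 42.5 means 42 overs and 5 balls, which is where "about 43" for the Trott XI comes from.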

*

So what’s the meaning of all of this? Not much really, except to stimulate some thought. People talk about Moneyball all the time, but fans can’t understand many of the newly invented statistics. Like the Duckworth-Lewis method, these new-fangled statistics add barriers between the fans and their game. My idea is to find ways of looking at numbers that improve our understanding and our discourse.

My other idea was to force you to imagine a team full of Trotts.

How to Win at Twenty20: A Statistical Analysis

As Sri Lanka beat Australia twice batting first, I began to wonder if there was an inherent bias in T20 towards sides batting first, batting second or winning the toss.

So I ran the numbers in StatsGuru, and as always, things aren’t as simple as they seem.

Win % in Twenty20 Internationals

The blue bars are total win percentage of the team. Red bars are win percentage after winning the toss. Green is when batting first, purple is when batting second. These statistics are only for these eight teams when they play each other.

A few things stand out like a sore thumb:

  • Australia do significantly better when they win the toss.
  • India do worse when they win the toss.
  • India like to bat first.
  • Sri Lanka doesn’t care– win the toss, lose the toss, bat first, bat second, their win percentage remains the same.
  • South Africa like to win the toss. And bat first.

Overall, sides batting first seem to have a slight advantage. This was not true in the IPL, at least anecdotally. I’d love to run the numbers for each IPL, to see how they differ and how they have changed over time.

The Pakistani Fountain of Youth in Numbers: a Chart on Catching Talent Young

Recently Jarrod Kimber was gushing over the new Pakistan quick Junaid Khan. In doing so, he said that in addition to flair and skill, it is the youth of new Pakistan bowlers that makes them so appealing. Of course, I’m paraphrasing. Jarrod never says anything so dull.

This got me thinking about how early Pakistan cricketers start in International cricket. Anecdotally, it seemed Pakistan had the most young debutants. This led me to StatsGuru. Which led to this chart (click the chart for an awesome large version)– the bars represent % of total debutants who were under 22, and there’s one bar per decade, per team:

Debuts Under Age 22 (as % of Total Debuts) by Decade in ODI cricket

I started by just getting the per team numbers for all 40 years of ODI cricket. This was great, and demonstrated the same trend (younger debuts in the sub-continent, older in England/Aus), but I wanted to see how these numbers changed over time. So I pulled the numbers separately for each decade of One Day cricket.

A few points that stand out for me:

  • Pakistan and Sri Lanka have consistently favored youth. The remarkable thing is that their numbers remain high regardless of the fortunes of their team.
  • English players have historically taken time to prove themselves worthy of an international cap, until the last decade. Perhaps this is a reason for their recent success?
  • West Indies has oscillated dramatically between starting older and starting young.
  • Do teams turn to youth when they are struggling? This is obviously the case with Bangladesh and Zimbabwe– I didn’t include their data here– but how about other teams? That is what I was trying to get at by splitting the data into four decades.
  • In countries like England, there is actually something going on at the other end of the spectrum. Andrew Strauss was effectively forced out of the side at age 34. A combination of the under-22 and over-34 problem is why the total centuries by the entire current English Test side is less than Tendulkar+Dravid*.
  • Finally, it doesn’t help to compare 1970’s statistics to other decades. It was the first decade of ODI cricket, and most “debuts” were actually established players. It does make sense, however, to compare 1970’s numbers between teams. Even in that early decade, Pakistan is substantially ahead of the rest.

Starting players late means they have shorter shelf-lives. Not only that, it makes for poorer branding. The reason adjectives like exciting get thrown around a lot more for Pakistani bowlers and Indian batsmen is that at 19 they’re blowing top-notch opposition out of the water.

I also pulled overall (40 year) numbers for Bangladesh and Zimbabwe, but I don’t consider them interesting. They have such a poor record that they have no choice but to turn to the teenagers. If you’re interested, Bangladesh is 66% under 22 debuts, and Zimbabwe is 51%.

* And talent, of course.

The History of One Day Cricket: Part I

The One Day International has changed dramatically in its 40 years of existence. Here is part one of my analysis of the game:

Highest Score per team, per year

We’ve come a long way since the ’70s. It used to be a 60-over innings and teams barely got a couple of hundred runs. In 1977, no team made more than 250 in their allotted 60 overs. Every year since 2004, each of the top eight teams has had a 300+ score. We’ve come a long way, baby.

Take a look at how Jayasuriya and company changed the game in 1996. That year’s high score is an outlier, so different from the years around it, and it wouldn’t be surpassed until the batting powerplay was instituted in 2006.

Highest Score of World Cup 2011: 370 by India against Bangladesh

High Scores in One Day History

Runs per over per team, per year

We’ve gone from a par average of 4 to a par average of 5.5. In 1994, every team had a yearly run rate of 5 or under. By 2010, every team was over 5. In fact, South Africa finished 2010 at 6 runs per over for the year.

Top 8 Teams Run Rate at World Cup 2011: 5.38

Run Rate by Year in One Day History

Runs per wicket per team, per year

Now here’s something that hasn’t changed much as the game has changed. Even though teams are scoring at a (much) faster pace, the runs per wicket has been largely steady. Barring some outliers (West Indies in the early days, Australia in the last 10 years), the average has barely increased from the upper 20’s to the low 30’s.

In both this chart and the runs per over, Sri Lanka’s progress between say 1983 and 1996 has been the most dramatic. On this chart, Sri Lanka goes from about 18 in 1984 to 38 in 1997. Of note: Australia crossed 50 runs per wicket in 2001.

Also, look how the mighty have fallen. West Indies dominates every chart here for the first decade and then drops off the map. Finally, the era of Aussie dominance ended in 2008: the orange dot on all three charts falls from the top that year.

World Cup 2011: Matches Among Top 8 Teams:
Side Batting First: 29.58 Runs per Wicket
Side Batting Second: 31.61 Runs per Wicket
Overall: 30.49

Average per wicket per year in One Day History

In the next installment, I will present three charts on how the balance of power in one day internationals has changed over 40 years.

Notes:

  • Only the top eight teams (no Zimbabwe, no Bangladesh) have been considered.
  • The runs per over are for the entire year, with each dot representing a different team.
  • The runs per wicket are for the entire year, with each dot representing a different team.
  • The highest score is the highest score for a particular team in that year.
  • The color code for each country is consistent across all charts.
  • Statistics until the end of 2010 are reflected in the charts.

A Statistician’s Perfect World Cup

Statistician Anantha Narayanan spends some time looking back at World Cups past. But that’s not what’s interesting at that link.

In the comments section of the article, Narayanan describes the “best” format for a future World Cup:

However the best is the all-play-all and then nothing or a 3-match final.

This is a terrible idea. It’s a statistician’s idea of a perfect World Cup, wherein we would produce the most statistically perfect winner in an unacceptably dull tournament. Imagine that two teams dominated the first couple of weeks of the tournament– every other game would become irrelevant, since only two teams could get anywhere.

In other words, it would be a lot like the current World Cup– where we can safely ignore all matches until the end of March– but worse.