Wednesday, November 19, 2014

Alpha average

Cricket has had a curious relationship with numbers. The sport compiles the achievements of its players in rich and detailed statistics. It celebrates a batting average in the 50s and a bowling average in the 20s as some kind of irrefutable stamp of greatness. At other times, cricket and its patrons are notorious for ignoring the numbers completely in favour of subjective narratives. Looking at the numbers alone, arguing for Gavaskar over Viswanath is a no-brainer. But I strongly advise against trying that argument with Indian cricket fans of that generation.

To be fair, it's not necessarily contradictory. In fact, for a sport which has undergone a tremendous amount of change over the years, some of its statistical measures have survived rather well. One such measure is the batting average of 50: it is one of cricket's great constants. Of course, it's not unimpeachable. There are 50ers who aren't quite hall of famers, and there are hall of famers who never reached the 50 benchmark. But its value as a currency of greatness has stood the test of time through almost the whole of Test cricket's life. Even with noticeable exceptions, that's quite an incredible feat.

Even while putting the 50 benchmark on a pedestal, cricket fans tend to get into the nuances. How is the 50 spread out? Is it skewed in favour of performances at home? How is the record against the top-ranked teams of that era? Didn't he play in an era of mediocre bowlers and flat pitches? And so on. Some of these questions are entirely subjective; others may not be. Take the point about playing in an era of mediocre bowlers and flat pitches. Is there a way to make it more objective for comparisons? I make an attempt.

A batsman can only play in the era that he belongs to. If a particular era is more favourable for run scoring than others, it is so for all players of that era, and vice versa. So, instead of looking at absolute averages, we can perhaps look at the incremental average of a batsman over his contemporaries. Now, how do we define contemporaries? Gooch was Tendulkar's contemporary and so is Murali Vijay.

I am going to take the first and last days of a batsman's Test career and calculate the aggregate average of all batsmen in that time period. For instance, for Kallis, the time period is from December 14, 1995 to December 30, 2013. The aggregate average (runs scored by all batsmen divided by the number of dismissals) in that period is 31.43, and the career average of Kallis was 55.37. On average he scored 23.94 runs more per dismissal than the aggregate average. Let's call this incremental average the alpha average. Batsmen can now be compared across eras on the basis of the alpha average.
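The arithmetic behind the alpha average is simple enough to sketch in a few lines. This is only an illustration in Python, assuming the career and aggregate averages have already been looked up; the function names are made up for this example and do not come from any statistics library.

```python
def batting_average(runs: int, dismissals: int) -> float:
    """Runs scored per dismissal."""
    if dismissals <= 0:
        raise ValueError("average is undefined without at least one dismissal")
    return runs / dismissals


def alpha_average(career_average: float, era_average: float) -> float:
    """Career average minus the aggregate average over the same span."""
    return career_average - era_average


# Kallis: career average 55.37 against the aggregate average of 31.43
# listed for his span in the alpha average table.
kallis_alpha = alpha_average(55.37, 31.43)
print(f"Kallis alpha average: {kallis_alpha:.2f}")
```

The same `alpha_average` call works for any batsman once the aggregate average for his own debut-to-last-Test window is known.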

I get it: the incremental percentage of runs scored over the aggregate average is a marginally sounder indicator. But I like the absolute alpha average because of my bias for the portability of a standalone metric. I have given both below.
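For comparison, the percentage version divides the same difference by the aggregate average. A minimal sketch in Python, with an illustrative function name and the figures taken from the tables in this post:

```python
def alpha_percentage(career_average: float, era_average: float) -> float:
    """Alpha average expressed as a percentage of the era's aggregate average."""
    if era_average <= 0:
        raise ValueError("era average must be positive")
    return 100.0 * (career_average - era_average) / era_average


# (career average, aggregate average of the era) pairs from this post.
for label, career, era in [("Bradman", 99.94, 31.85), ("Kallis", 55.37, 31.43)]:
    print(f"{label}: {alpha_percentage(career, era):.0f}%")
```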

For the purpose of this exercise, I have taken all batsmen who average 50 or more and have played a minimum of 20 Tests. The cut-off date is Sep 18, 2014.

Span       Mat  Ave    Average rank
1928-1948  52   99.94  1
1963-1970  23   60.97  2
1930-1954  22   60.83  3
1924-1935  54   60.73  4
1931-1939  20   59.23  5
2000-2014  128  58.76  6
1955-1968  82   58.67  7
1948-1958  48   58.61  8
1927-1947  85   58.45  9
1954-1974  93   57.78  10
1908-1930  61   56.94  11
1948-1960  44   56.68  12
1937-1955  79   56.67  13
1995-2013  166  55.37  14
1970-1984  87   53.86  15
1935-1951  34   53.81  16
1989-2013  200  53.78  17
1994-2014  158  53.10  18
1990-2006  131  52.88  19
2009-2014  44   52.65  20
1976-1993  124  52.57  21
1996-2012  164  52.31  22
1998-2010  90   52.29  23
1995-2012  168  51.85  24
1920-1929  20   51.62  25
1992-2002  63   51.54  26
2005-2013  79   51.52  27
2004-2014  105  51.50  28
2000-2014  91   51.41  29
2004-2014  79   51.32  30
1971-1987  125  51.12  31
1985-2004  168  51.06  32
2004-2014  95   51.02  33
2012-2014  22   50.94  34
1994-2009  103  50.73  35
1978-1994  156  50.56  36
1974-1991  121  50.23  37
1937-1957  78   50.06  38

Average  Average rank  Aggregate Average  Alpha average  Alpha Avg Rank  Avg rank - alpha avg rank
99.94    1             31.85              68.09          1               0
60.97    2             30.73              30.24          2               0
60.73    4             30.87              29.86          3               1
60.83    3             31.05              29.78          4               -1
58.61    8             29.19              29.42          5               3
58.67    7             29.87              28.80          6               1
59.23    5             30.57              28.66          7               -2
56.68    12            28.70              27.98          8               4
57.78    10            29.97              27.81          9               1
58.45    9             31.21              27.24          10              -1
56.94    11            29.82              27.12          11              0
58.76    6             32.33              26.43          12              -6
56.67    13            30.58              26.09          13              0
55.37    14            31.43              23.94          14              0
53.86    15            30.42              23.44          15              0
53.78    17            31.17              22.61          16              1
52.57    21            30.09              22.48          17              4
52.88    19            30.57              22.31          18              1
51.54    26            29.67              21.87          19              7
53.10    18            31.35              21.75          20              -2
53.81    16            32.49              21.32          21              -5
51.06    32            30.16              20.90          22              10
52.31    22            31.42              20.89          23              -1
51.12    31            30.43              20.69          24              7
52.29    23            31.62              20.67          25              -2
51.85    24            31.49              20.36          26              -2
50.56    36            30.22              20.34          27              9
50.06    38            29.77              20.29          28              10
52.65    20            32.36              20.29          29              -9
50.23    37            30.20              20.03          30              7
50.73    35            30.82              19.91          31              4
51.62    25            31.73              19.89          32              -7
51.41    29            32.11              19.30          33              -4
50.94    34            31.68              19.26          34              0
51.50    28            32.43              19.07          35              -7
51.32    30            32.51              18.81          36              -6
51.52    27            32.86              18.66          37              -10
51.02    33            32.54              18.48          38              -5

Out of the 38 players, 8 drop by 5 ranking slots or more when the parameter is changed from average to alpha average: Hussey (-10), Mathews (-9), Ryder (-7), Clarke (-7), Sangakkara (-6), Amla (-6), Nourse (-5) and De Villiers (-5). Except for Nourse and Ryder, these players belong to the last decade and a half. This lends credibility to the notion that run scoring has become easier in recent times, partly because of a dearth of quality bowlers, flatter tracks and perhaps even better bats.

On the other hand, 6 players move up by 5 ranking slots or more when you change the parameter: Flower, Gavaskar, Richards, Border, Waugh and Compton, with Hayden just short of that mark at 4. Curiously, none of them averages more than 51.54.
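The rank comparison itself can be sketched as follows. This is a minimal illustration in Python, not the spreadsheet behind the tables: `rank_by` and `rank_shift` are made-up helper names, and the three sample rows (labelled A, B and C) are the table entries with career averages 58.76, 58.61 and 56.68, so the ranks are relative to this small sample and not to the full list of 38.

```python
def rank_by(values):
    """Rank names from highest value (rank 1) to lowest; ties keep input order."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_shift(career, era):
    """Average rank minus alpha average rank; positive means the batsman moves up."""
    alpha = {name: career[name] - era[name] for name in career}
    average_rank = rank_by(career)
    alpha_rank = rank_by(alpha)
    return {name: average_rank[name] - alpha_rank[name] for name in career}


# Hypothetical labels; (career average, aggregate average) pairs from the table.
career = {"A": 58.76, "B": 58.61, "C": 56.68}
era = {"A": 32.33, "B": 29.19, "C": 28.70}
print(rank_shift(career, era))
```

Within this sample, A has the best raw average but the highest aggregate average to beat, so it drops to last on alpha average while B and C each move up one place.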

Ranking based on alpha percentage:
Average  Average rank  Aggregate Average  Alpha average  Alpha percentage  Alpha % Rank  Avg rank - alpha % rank
99.94    1             31.85              68.09          214%              1             0
58.61    8             29.19              29.42          101%              2             6
60.97    2             30.73              30.24          98%               3             -1
56.68    12            28.70              27.98          97%               4             8
60.73    4             30.87              29.86          97%               5             -1
58.67    7             29.87              28.80          96%               6             1
60.83    3             31.05              29.78          96%               7             -4
59.23    5             30.57              28.66          94%               8             -3
57.78    10            29.97              27.81          93%               9             1
56.94    11            29.82              27.12          91%               10            1
58.45    9             31.21              27.24          87%               11            -2
56.67    13            30.58              26.09          85%               12            1
58.76    6             32.33              26.43          82%               13            -7
53.86    15            30.42              23.44          77%               14            1
55.37    14            31.43              23.94          76%               15            -1
52.57    21            30.09              22.48          75%               16            5
51.54    26            29.67              21.87          74%               17            9
52.88    19            30.57              22.31          73%               18            1
53.78    17            31.17              22.61          73%               19            -2
53.10    18            31.35              21.75          69%               20            -2
51.06    32            30.16              20.90          69%               21            11
50.06    38            29.77              20.29          68%               22            16
51.12    31            30.43              20.69          68%               23            8
50.56    36            30.22              20.34          67%               24            12
52.31    22            31.42              20.89          66%               25            -3
50.23    37            30.20              20.03          66%               26            11
53.81    16            32.49              21.32          66%               27            -11
52.29    23            31.62              20.67          65%               28            -5
51.85    24            31.49              20.36          65%               29            -5
50.73    35            30.82              19.91          65%               30            5
52.65    20            32.36              20.29          63%               31            -11
51.62    25            31.73              19.89          63%               32            -7
50.94    34            31.68              19.26          61%               33            1
51.41    29            32.11              19.30          60%               34            -5
51.50    28            32.43              19.07          59%               35            -7
51.32    30            32.51              18.81          58%               36            -6
51.02    33            32.54              18.48          57%               37            -4
51.52    27            32.86              18.66          57%               38            -11

While the change in ranks gives a sense of who has been relatively better off or worse off for playing in the era that they did, just sorting the alpha average from high to low presents an interesting picture too. At the bottom, players of the last 15 years crowd out the others; Sangakkara is the standout exception. But at the same time, players whose careers stretched from the 1990s into the 2000s and even the 2010s, like Tendulkar, Kallis, Dravid, Lara, Ponting and Chanderpaul, are still somewhere in the middle. This could mean one of two things: 1) run scoring in the 90s was difficult and these guys were a class above the rest, or 2) they made up for their relative drought of the 90s by feasting in the 2000s and boosting their overall alpha average. Let's see:


             Individual Average     Aggregate Average      Alpha Average          Matches
Player       1990s  2000s  2010s    1990s  2000s  2010s    1990s  2000s  2010s    1990s  2000s  2010s
Tendulkar    56.70  54.35  50.01    29.63  32.02  32.03    27.07  22.33  17.98    73     89     38
Kallis       41.08  58.70  53.68    29.13  32.02  32.04    11.95  26.68  21.64    32     101    33
Dravid       49.96  54.85  46.18    28.99  32.02  32.51    20.97  22.83  13.67    34     103    27
Lara         51.60  54.06  N.A      29.41  31.66  N.A      22.19  22.40  N.A      65     66     0
Ponting      44.51  58.38  37.30    29.19  32.02  32.57    15.32  26.36  4.73     33     107    28
Chanderpaul  40.61  52.31  71.78    29.23  32.02  32.22    11.38  20.29  39.56    37     86     35

Note: The 1990s column runs from the date of debut (in Tendulkar's case, Nov 15, 1989) to 31 Dec 1999. The 2000s column covers the whole decade, or from 01 Jan 2000 to the player's last Test if his career ended within it. The 2010s column runs from 01 Jan 2010 to the day of the last Test played.
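The decade split is the same subtraction applied window by window. A small sketch in Python, assuming the per-decade averages are already available; `alpha_by_period` is an illustrative name, and a missing window (such as Lara's 2010s) is carried through as `None`:

```python
def alpha_by_period(individual, aggregate):
    """Alpha average per period; None where either figure is unavailable."""
    result = {}
    for period, average in individual.items():
        era = aggregate.get(period)
        if average is None or era is None:
            result[period] = None
        else:
            result[period] = round(average - era, 2)
    return result


# Tendulkar's row from the decade table.
tendulkar = {"1990s": 56.70, "2000s": 54.35, "2010s": 50.01}
tendulkar_era = {"1990s": 29.63, "2000s": 32.02, "2010s": 32.03}
print(alpha_by_period(tendulkar, tendulkar_era))

# Lara did not play in the 2010s.
lara = {"1990s": 51.60, "2000s": 54.06, "2010s": None}
lara_era = {"1990s": 29.41, "2000s": 31.66, "2010s": None}
print(alpha_by_period(lara, lara_era))
```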

Tendulkar's alpha average was way higher in the 1990s than in the 2000s. It is both a testimony to the fact that he was a notch above the rest in the 90s, when run scoring was harder, and a reflection of his relatively more mortal self in the 2000s, when he didn't quite tower above the rest.

In the case of Kallis, Ponting and Chanderpaul, the lower alpha averages of the 1990s are more an indication of the early phase of their careers, when they were trying to establish themselves, than of a struggle in a harder decade for batting. But they were clearly helped by the fact that they reached their batting peak in a relatively easier time for batting. Chanderpaul seems to be peaking at a stage where most careers are on the wane.

For a player establishing himself in the 1990s, Dravid was considerably better than Kallis, Ponting and Chanderpaul. But he didn't capitalize in the next decade as much as Kallis and Ponting did, though his alpha average in the 2000s is very healthy too.

The decade of play makes no difference to Lara: he was equally consistent in both, though he didn't particularly tower over the rest in either decade.

While it is fair to compare the incremental average over the aggregate average, it still leaves room for ambiguity because tail enders and keepers are included in the mix. There are the often-repeated assertions in cricket that the ability of tail enders with the bat has improved considerably in modern times, and that wicket keeper batsmen aren’t the same since Gilchrist happened. It’ll be a worthwhile exercise to test these notions separately, but for the sake of this theme, let’s just remove them from the equation, compare just the top order aggregate averages and work out the batsman’s alpha average based on that.


Average  Average rank  Top 6 Aggregate Avg  Alpha Top 6 Avg  Alpha Avg (Top 6) Rank  Alpha avg rank - Alpha top 6 avg rank  Alpha Avg Rank
99.94    1             39.99                59.95            1                       0                                      1
60.97    2             38.07                22.90            2                       0                                      2
60.73    4             38.20                22.53            3                       0                                      3
58.61    8             36.17                22.44            4                       1                                      5
60.83    3             38.94                21.89            5                       -1                                     4
58.67    7             36.98                21.69            6                       0                                      6
56.68    12            35.63                21.05            7                       1                                      8
59.23    5             38.27                20.96            8                       -1                                     7
57.78    10            36.88                20.90            9                       0                                      9
56.94    11            36.49                20.45            10                      1                                      11
58.45    9             38.92                19.53            11                      -1                                     10
58.76    6             39.82                18.94            12                      0                                      12
56.67    13            38.13                18.54            13                      0                                      13
53.86    15            37.23                16.63            14                      1                                      15
55.37    14            38.80                16.57            15                      -1                                     14
52.57    21            36.86                15.71            16                      1                                      17
53.78    17            38.50                15.28            17                      -1                                     16
52.88    19            37.92                14.96            18                      0                                      18
51.54    26            36.80                14.74            19                      0                                      19
53.10    18            38.63                14.47            20                      0                                      20
51.12    31            37.13                13.99            21                      3                                      24
51.06    32            37.30                13.76            22                      0                                      22
52.31    22            38.84                13.47            23                      0                                      23
50.56    36            37.09                13.47            24                      3                                      27
50.23    37            36.94                13.29            25                      5                                      30
50.06    38            36.80                13.26            26                      2                                      28
51.62    25            38.43                13.19            27                      5                                      32
52.29    23            39.14                13.15            28                      -3                                     25
52.65    20            39.50                13.15            29                      0                                      29
51.85    24            38.86                12.99            30                      -4                                     26
53.81    16            41.11                12.70            31                      -10                                    21
50.73    35            38.11                12.62            32                      -1                                     31
50.94    34            38.68                12.26            33                      1                                      34
51.41    29            39.53                11.88            34                      -1                                     33
51.50    28            39.69                11.81            35                      0                                      35
51.32    30            39.80                11.52            36                      0                                      36
51.52    27            40.23                11.29            37                      0                                      37
51.02    33            39.83                11.19            38                      0                                      38
  
As convincing as the argument to separate the keepers and tail enders was, the result isn’t particularly telling. Richards and Ryder move up 5 ranks, possibly indicating the relatively better batting abilities of the tail enders and/or keepers of their time: they did much better when compared against their contemporary top 6 than against the aggregate of all 11. The rest are all marginal differences in rank except Nourse, who quite amazingly moves down by 10 ranks. The top 6 of his time averaged a whopping 41.11! Does the fact that his career coincided with Bradman's have anything to do with this handsome top order average?

The aggregate record during Bradman's time:


           Players  Span       Mat  Inns  NO   Runs    HS    Ave    100
overall    409      1928-1948  128  4376  584  120798  364   31.85  254

Bradman's record:


       Mat  Inns  NO  Runs  HS   Ave    100
Tests  52   80    10  6996  334  99.94  29

Removing Bradman alone from the 409 players over those two decades:

           Players  Span       Mat  Inns  NO   Runs    HS    Ave    100
Aggregate  408      1928-1948  128  4296  574  113802  364   30.58  225

The top 6 record of his time:
Players  Span       Mat  Inns  NO   Runs   HS   Ave    100
259      1928-1948  128  2612  196  96631  364  39.99  238

Removing Bradman from the list:



Players  Span       Mat  Inns  NO   Runs   HS   Ave    100
258      1928-1948  128  2532  186  89635  364  38.21  209

Bradman as an individual lifts the aggregate average of his era by 1.27 runs compared with the other 408 players on their own, and lifts the top 6 average by an even more impressive 1.78 runs. A big part of the reason is, well, he's Bradman. But part of it is also that he played in 41% of the total matches played in his time. Contrast that with someone like George Headley, who played only 11% of the matches of his time.
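The effect of taking one batsman out of an aggregate can be reproduced from the innings, not outs and runs in the tables above. A rough sketch in Python follows; the `BattingLine` class and its method names are invented for this illustration. Statistics sites and spreadsheets round or truncate differently, so the second decimal may differ by 0.01 from the figures quoted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BattingLine:
    innings: int
    not_outs: int
    runs: int

    @property
    def average(self) -> float:
        """Runs per dismissal (innings minus not outs)."""
        return self.runs / (self.innings - self.not_outs)

    def without(self, other: "BattingLine") -> "BattingLine":
        """The aggregate that remains once another line is removed from it."""
        return BattingLine(self.innings - other.innings,
                           self.not_outs - other.not_outs,
                           self.runs - other.runs)


bradman = BattingLine(innings=80, not_outs=10, runs=6996)
era_all = BattingLine(innings=4376, not_outs=584, runs=120798)
era_top6 = BattingLine(innings=2612, not_outs=196, runs=96631)

lift_all = era_all.average - era_all.without(bradman).average
lift_top6 = era_top6.average - era_top6.without(bradman).average
share_of_matches = 100 * 52 / 128  # Bradman's 52 Tests out of 128 played

print(f"lift for all batsmen: {lift_all:.2f}")
print(f"lift for the top 6:   {lift_top6:.2f}")
print(f"share of matches:     {share_of_matches:.0f}%")
```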

To get some perspective on that, I compared Bradman's influence on his era with that of another batsman from a different era whose average was in the healthy 50s and who featured in 60% of the matches of his time: Jack Hobbs. He improved the average of the batsmen of his era by 0.83 runs for all batsmen and by 1.04 for the top 6 of his time.

Modern batsmen play a much smaller proportion of the overall matches, and the number of matches has shot up 4 to 5 times from the earlier eras, so they are at a disadvantage when it comes to making such a significant impact individually. Sangakkara, who averages the most among modern batsmen and has played 20% of the matches in his time, improves the average of his era by 0.27 for all batsmen and by 0.31 for the top 6 of his time.

End of digression. Back to the main thread.

Getting back to the point made earlier about computing the alpha average by considering only the top 6 batsmen, there are two widely floated theories in modern cricket: wicket keepers bat way better since Gilchrist, and tail enders aren't mugs with the bat any more.

Let's consider wicket keepers first.

This is the aggregate record of wicket keeper batsmen in Test cricket:

           Players  Span       Mat   Inns  NO   Runs    HS    Ave    100
Aggregate  258      1877-2014  2139  6706  902  155553  232*  26.80  191

Wicket keeper batting record decade wise:

Decade  Players  Mat  Inns  NO   Runs   HS    Ave    100
2000s   47       464  1490  179  41705  232*  31.81  65
2010s   62       195  641   71   19597  224   34.38  35
1970s   24       198  624   83   14764  152   27.29  11
1990s   37       347  1086  135  25950  173   27.28  30
1930s   26       89   272   45   5829   149   25.67  9
1940s   15       45   130   18   2648   152   23.64  3
1980s   39       266  773   109  15696  210*  23.63  14
1960s   35       186  588   73   12150  192   23.59  12
1950s   38       164  505   68   9005   209   20.60  10
1890s   13       32   106   23   1637   134*  19.72  1
1920s   16       51   157   36   2373   84    19.61  0
1900s   9        41   133   26   1825   115   17.05  1
1880s   11       29   97    16   1293   82    15.96  0
1910s   9        29   93    20   954    72    13.06  0
1870s   3        3    11    0    127    38    11.54  0

Note: A decisive shift in trend could have happened at any time. I have compared by decade at my own discretion rather than for any compelling rationale, with the comfort that even if something decisive happened midway through a decade, it would still show up in these filters.

Prima facie, the idea of the wicket keeper batsman seems to have been a serious thing since the 1930s. From then until almost the 1990s, there have been spikes and troughs in different periods. The average jumped by 3.70 runs from the 1960s to the 1970s, but the jump loses significance because it was not followed up in the next decade. That is where the jump of 4.53 runs from the 1990s to the 2000s achieves greater significance. More than normal volatility, this looks like a big leap for the discipline.
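The decade-on-decade jumps quoted here come from putting the decade table in chronological order and differencing neighbouring averages. A short sketch in Python, with the averages copied from the table and `decade_changes` as an illustrative name:

```python
keeper_average = {
    "1870s": 11.54, "1880s": 15.96, "1890s": 19.72, "1900s": 17.05,
    "1910s": 13.06, "1920s": 19.61, "1930s": 25.67, "1940s": 23.64,
    "1950s": 20.60, "1960s": 23.59, "1970s": 27.29, "1980s": 23.63,
    "1990s": 27.28, "2000s": 31.81, "2010s": 34.38,
}


def decade_changes(averages):
    """Change in average from each decade to the next, in chronological order."""
    decades = sorted(averages)  # labels such as "1990s" sort chronologically
    return {
        (earlier, later): round(averages[later] - averages[earlier], 2)
        for earlier, later in zip(decades, decades[1:])
    }


changes = decade_changes(keeper_average)
for (earlier, later), change in changes.items():
    print(f"{earlier} to {later}: {change:+.2f}")
```

Printing every pair, rather than only the two discussed in the text, makes it easy to see which jumps were followed up in the next decade and which were reversed.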

That Gilchrist made his debut towards the end of 1999 makes it appealing to think of him as the single biggest factor, but that's a bit too simplistic. Even excluding him from the records, the average in the 2000s is a very impressive 30.26.

List of wicket keepers with at least 1000 runs and average above 30 in the 2000s:
Mat  Inns  Runs  HS    Ave    100
20   34    1972  232*  73.03  6
91   129   5130  204*  46.63  16
25   40    1390  131*  42.12  2
23   38    1404  169   41.29  2
48   81    3117  230   40.48  7
40   62    2176  148   40.29  3
40   66    2056  124*  36.07  3
47   80    2525  158*  34.12  6
48   79    2394  143   31.92  3
30   40    1044  154*  30.70  2
104  152   4024  122*  30.02  2

Somewhere between the end of the 1990s and the 2000s, cricket seems either to have moved towards placing greater emphasis on the batting credentials of keepers or to have just been fortunate enough to be flooded with a rich vein of talent clustered around the same time. That is the key difference in the 2000s. Gilchrist is no doubt a phenomenon, but so was Les Ames in his time. That Gilchrist played in an era of great keeper batsmen is what made his era so special.

List of keeper batsmen with more than 1000 runs in the 2010s:
Mat  Inns  Runs  HS    Ave    100
19   31    1726  169   59.51  6
17   27    1041  124   45.26  3
24   44    1673  200   40.80  3
54   83    2709  126   39.26  5
48   78    2632  224   37.07  3
34   58    1629  136   31.32  2
26   39    1031  120   30.32  2

All of them average more than 30, with De Villiers setting a whole new benchmark for wicket keeper batsmen.

Stats or anecdotes, there is no disputing the notion that wicket keeper batsmen have had a huge leap in terms of ability and achievement with the bat since the 2000s.

On to the lower order
This is the record of batsmen from no.8 to no.11 in the history of cricket:

           Players  Mat   Inns   Runs    HS    Ave    100
Aggregate  1928     2136  24067  277051  257*  15.25  96

Quite obviously, there'll be exceptions here of top order batsmen batting lower down the order for various reasons. But spread over such a large population, it's insignificant.

Decade  Matches  Average of Nos. 8 to 11
1920s   51       17.72
2010s   195      16.54
1980s   266      16.05
1960s   186      15.78
1890s   32       15.75
1940s   45       15.57
2000s   464      15.51
1870s   3        15.30
1900s   41       15.12
1970s   198      14.79
1990s   347      14.34
1930s   89       14.22
1950s   164      14.05
1910s   29       14.01
1880s   29       12.57

The total range of averages across all decades is 5.15 runs, between the 1880s and the 1920s. The best decade for run making, the 2000s, has a tail ender average of 15.51, which puts it in 7th place, behind even the 1980s, the glory era of fast bowling.
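Both figures, the 5.15-run spread and the 7th place for the 2000s, can be checked directly from the decade table. A brief sketch in Python, with the averages copied from that table and variable names chosen for this example:

```python
tail_average = {
    "1870s": 15.30, "1880s": 12.57, "1890s": 15.75, "1900s": 15.12,
    "1910s": 14.01, "1920s": 17.72, "1930s": 14.22, "1940s": 15.57,
    "1950s": 14.05, "1960s": 15.78, "1970s": 14.79, "1980s": 16.05,
    "1990s": 14.34, "2000s": 15.51, "2010s": 16.54,
}

# Decades ordered from the best tail average to the worst.
ranked = sorted(tail_average, key=tail_average.get, reverse=True)
spread = round(tail_average[ranked[0]] - tail_average[ranked[-1]], 2)
place_2000s = ranked.index("2000s") + 1

print(f"best: {ranked[0]}, worst: {ranked[-1]}, spread: {spread}")
print(f"2000s place: {place_2000s} of {len(ranked)}")
```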

There is no real improving or worsening pattern over time here.

List of batsmen (8 to 11) who have scored at least 1000 runs and average >= 20:

Span       Mat  Runs  HS    Ave    100
1997-2012  53   1577  122*  35.04  2
1995-2008  75   2330  111   32.81  2
1978-1993  61   1967  116   32.78  2
1997-2012  100  3502  140   30.71  5
1981-1992  55   1667  173   27.78  2
1976-1986  63   1598  102   27.55  1
1986-1993  47   1180  73    26.81  0
2000-2006  40   1125  82    26.78  0
1994-2009  105  2785  100*  25.55  1
1920-1937  40   1003  65*   25.07  0
1973-1990  56   1641  103   24.49  1
2007-2014  73   2120  169   24.09  1
2007-2014  57   1559  123*  23.26  1
2007-2013  33   1185  106   22.78  1
1946-1959  41   1062  104   22.59  1
1985-2002  87   2160  257*  22.26  2
2008-2013  60   1370  85    22.09  0
1993-2005  48   1187  127*  21.98  1
1978-1995  53   1212  75    21.26  0
1998-2006  54   1421  59    20.89  0
1999-2008  75   1415  64    20.80  0
1946-1960  51   1103  118   20.42  2

Most of the solid records belong to wicket keepers and to all rounders who were batting out of position. Even here, the numbers are heavily skewed in favour of the 1980s and beyond. But that hardly seems to affect the aggregate numbers of the recent decades.

Anecdotally, the theory that tail enders have become better batsmen in recent times sounds so right. Cricket fans can recollect a sizeable number of matches where tail enders have played out spell after spell of incisive bowling to bat out a draw. The ugly, blindfold-like slog is history. Cricket is a lot more professional, with an ever-expanding support staff assisting cricketers to hone their secondary skill. Bats are better and so is protective equipment. It all adds up to make perfect sense.


But at a macro level, stats just don't add up to make a convincing case.