Informational constraints in cricket
Strategies in cricket are either high-specificity-and-low-generalizability or low-specificity-and-high-generalizability. Sometimes, constraints on information dictate which is used.
Information economics describes a behavior called statistical discrimination. When employers are confronted with absent or imperfect information about the productivity of a prospective employee, they react by using available low-quality information to assess the suitability of the applicant. This can result in the use of manifold stereotypes in recruitment, such as "women are better communicators than men" and "neurodivergent people are more attentive to detail". The manifestations are numerous. For example, most sports fans, who lack a mental barometer for sporting quality, use the gender of players as a measure of athletic ability, producing a differential assessment of the quality of men's and women's sports that ceases to exist when the gender of female players is obscured. Claudia Goldin famously showed that "blind" auditions result in increased recruitment of female musicians.
You may have noticed that I have used two examples of stereotypes that are true in general. Sex differences in communication have long been studied and, indeed, hyper-attentivity to detail is one of the diagnostic criteria used in many autism diagnoses. I did this for a reason: it is also true in general that left-handed batters in cricket are worse off against right-arm offspin and right-handed batters are worse off against right-arm legspin. But that is not always the case. It has been pointed out an annoyingly high number of times on Twitter, sometimes by this very author, that Devon Conway and Nitish Rana are two examples of left-handers who are elite at playing offspin. When opposition captains bowl offies at these southpaws, cricket Twitter scoffs. My modest thesis in this article is this: I think that this happens so often in cricket not because opposition captains are stupid but because informational constraints in cricket cause a form of statistical discrimination similar to the kind described above.
Recently, as is public knowledge by now, the India cricket side has been left without an advanced cricket data analyst. Data is information. This information can also be accumulated intuitively by players who know each other inside-out, but when faced with a Sri Lanka side that has a new captain and plenty of new faces, and other international cricketers who don't take part in the IPL, the resultant informational constraints are likely to be laid bare. This is what happened in the tied T20I in Kandy. On-air commentators were convinced that Yashasvi Jaiswal, who had been warming up by bowling his part-time legspin before each of the T20Is, would bowl at some point. Instead, the ball was tossed to Rinku Singh, a much less accomplished spinner, but one who could bowl offbreak to the left-handed pair at the crease. That was hardly the only instance: throughout the series Riyan Parag bowled legbreak to the Sri Lankan right-handers and offbreak to the left-handers. In the final over of the final T20I, Suryakumar Yadav, having decided to bowl it himself, sent down four offbreaks and two legbreaks, each perfectly matched to the handedness of the opposition batter.
A slightly closer inspection would reveal that since 2020 Sri Lanka's right-handers don't average any less against away-spin than they do against in-spin while their left-handers, in fact, are better off against offies compared to leggies in this period. (This is an interesting quirk and deserves deeper attention but may vanish at the level of batters rather than the level of teams.) That is not to say that any of SKY, Parag or Rinku were making a mistake by doing what they did. In fact, it says the opposite. Contemporary elite cricketers are smart enough and intuitive enough to use low-specificity information that is accurate at a general level but suffers from considerable individual deviation when confronted with absence of high-specificity information. What India were doing, in particular, is a smart way of using this low-specificity information because the underlying stereotype - that away-spin is tougher to face than in-spin - is true as opposed to, say, race stereotypes used in recruitment.
There are at least two other examples of informational constraints shaping cricketing decisions that are interesting to me. During the IPL, I had privately noted that teams were being a lot "saner" in the way they employed tactics such as bowling to the long side of the boundary. These venue-specific tactics, I think, gain most utility when there are hardly any pitch-specific or opponent-specific tactics that can be expected to work, either because pitches are so flat or because the opponent in question is somebody like Heinrich Klaasen. The IPL, as most will agree, is currently not only among the most data-intensive but also the most data-curious of cricket competitions in the world. (By "data-curiousness" I am referring to player acceptance of data-driven insights. This acceptance in the IPL may be forced - liberal businesspeople who believe they have privately benefited from good-quality data use may push for more data-driven cricketing decision-making - but it exists today nonetheless.) Players equipped with these data insights were falling back on more outwardly obvious tactics such as bowling to the long side only on the flattest of pitches or when all else was going wrong.
With the advent of the T20 World Cup, there was a slight change. I noticed more usage of this stratagem, particularly in contests involving at least one Associate-level nation. This must be influenced partly by the fact that in most matches in the West Indies the short side of the boundary was also the side to which the wind blew, creating a compound effect, but it can also be explained by the framework I have motivated here. Information about Associate-level batters is lacking for players from Full-Member nations while at the same time they are perhaps not as data-curious as they might have been in the IPL. This results in more knowledge constraints, at least at the intuitive level for players, causing them to appeal to more low-specificity-but-high-generalizability tactics. I saw even more of this when I switched on a few matches of the MLC, where my educated guess is that data-intensiveness and data-curiousness are both low. Bowling to the side that the breeze was blowing against was the name of the game even for many established international bowlers.
What I have made in this article are claims, not necessarily factual ones (just as I intend to do in this blog going forward). I think that while it may be possible to put these claims to the test, this might be cumbersome. The easiest outcome variable to focus on, obviously, would be the percentage of away-spin overs faced by batters in any league, which is computable using popular cricket data sources. More interesting to me personally would be to look at the prevalence of the two venue-specific tactics - bowling to the long side and bowling against the wind - using data on boundary shot magnitudes to create a proxy for the distance from the pitch to both square boundaries in every match at every venue, and data on wind vectors which are widely used as instrumental variables in applied economics currently. A neat analysis would also require a measure of the "quality" of a competition - such as the ones CricViz have promulgated - but it would be no less interesting to simply compare the prominence of these tactics in different leagues and competitions.
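As a rough illustration of the "easiest outcome variable" mentioned above, here is a minimal Python sketch of how the share of away-spin faced could be computed per competition. Everything about the data layout is an assumption: the field names (`competition`, `batter_hand`, `bowler_style`) and the style labels are hypothetical and do not correspond to any particular data provider. The sketch counts balls rather than overs, and it classifies each bowler by stock delivery only, so googlies, carrom balls and other variations are ignored.

```python
from collections import defaultdict

# Direction of the stock ball relative to a RIGHT-handed batter.
# Style labels are hypothetical; real data sources will use their own.
TURNS_AWAY_FROM_RHB = {
    "right-arm legbreak": True,
    "left-arm orthodox": True,
    "right-arm offbreak": False,
    "left-arm wrist spin": False,
}


def is_away_spin(batter_hand, bowler_style):
    """True if the bowler's stock ball turns away from this batter."""
    away_from_rhb = TURNS_AWAY_FROM_RHB[bowler_style]
    # For a left-hander the direction is simply mirrored.
    return away_from_rhb if batter_hand == "R" else not away_from_rhb


def away_spin_share(balls):
    """Share of spin balls that were away-spin, per competition.

    `balls` is an iterable of dicts with the (assumed) keys
    'competition', 'batter_hand' ('R' or 'L') and 'bowler_style'.
    Balls from bowlers whose style is not a known spin type are skipped.
    """
    counts = defaultdict(lambda: [0, 0])  # competition -> [away, total]
    for ball in balls:
        style = ball["bowler_style"]
        if style not in TURNS_AWAY_FROM_RHB:
            continue  # pace or unknown style
        tally = counts[ball["competition"]]
        tally[0] += int(is_away_spin(ball["batter_hand"], style))
        tally[1] += 1
    return {comp: away / total for comp, (away, total) in counts.items()}
```

If the framework in this article holds, one would expect this share to be higher in competitions where information is scarcer. The sketch says nothing about why any difference arises.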
I would expect the data to show that these trends have held in general since 2020, but even if it did not I would be slow to reconsider my intuition, simply because of how noisy cricket data can be in these regards. Further, causality is just difficult to prove (are information constraints causing the adoption of these strategies, or is it that, say, players just don't want to win in a low-quality league as much as they do in the IPL?), which is why I am content with making these observations and not seeking confirmation in the data. My economics education has unfortunately instilled in me a joy-killing skepticism of simple data-driven conclusions and a stubborn focus on alternative causal channels.
Lastly, I want to emphasize that the point of this article is not to critique player choices. When high-quality information is unavailable, it is the Bayesian thing to do to use available low-quality information in decision-making tasks (although it is wise to remain more alert in these cases to new data accumulation). Using low-quality information is likely to result in more favourable outcomes than using no information at all, i.e., choosing at random. In other words, it is indeed true that hitting to the long side of the boundary is tougher on average, even though this may not be the case when the batter in question is an accomplished off-side whacker. But this caveat is not always information that is available to bowling teams. And that is why it is to their credit that they are smart enough to go to the second-best option and why I think one shouldn't be quick to criticize bowlers for satisficing. Maybe it's the best they can know.
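The logic of that second-best choice can be put in a few lines of Python. This is a toy sketch, not an estimate: the policy names are made up for illustration, and the run costs of a favourable and an unfavourable matchup (6 and 9 runs per over) are hypothetical numbers, as is any value passed for `p_rule_holds`, the share of batters for whom the general rule actually applies.

```python
def matchup_hit_rate(policy, p_rule_holds):
    """Probability of bowling the option the batter is actually weaker against.

    policy: 'random', 'rule_of_thumb' or 'full_information' (hypothetical labels).
    p_rule_holds: share of batters for whom the general rule is true.
    """
    if policy == "random":
        return 0.5  # coin flip between the two options
    if policy == "rule_of_thumb":
        return p_rule_holds  # right exactly when the rule applies
    if policy == "full_information":
        return 1.0  # batter-specific knowledge, always right
    raise ValueError(f"unknown policy: {policy}")


def expected_runs_per_over(policy, p_rule_holds, runs_good=6.0, runs_bad=9.0):
    """Expected runs conceded; runs_good and runs_bad are illustrative only."""
    hit = matchup_hit_rate(policy, p_rule_holds)
    return hit * runs_good + (1 - hit) * runs_bad
```

Under these assumptions the rule of thumb concedes fewer runs than a coin flip whenever `p_rule_holds` is above 0.5, and more whenever it is below. That is the sense in which a stereotype that is true in general is worth using, while one that is false is worse than nothing. Full information always does at least as well, which is the gap a data analyst would close.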