By James Willoughby
The tendency for humans to seek out and try to explain patterns is acutely obvious in horse racing analysis. The sport is governed by a multitude of statistics, and often those numbers produce sequences into which people tend to read a cause. There is no better example than the concept of “Trainer Form”: the apparent correlation of performance between horses from the same yard. Trainer Form can be good or bad, but the case du jour, which concerns nine-times champion jumps trainer Paul Nicholls, is decidedly negative.
Following a spate of below-par performances from Nicholls-trained horses in the last few weeks, a growing belief has propagated that horses in his powerful Somerset stable are collectively off colour. This was first a meme of social media, before print journalists picked up on it, though the Racing Post is reporting the trainer himself unconcerned by a barren spell of over two weeks.
The achievement of a racehorse can be described by any number of performance measures. A simple guide is its finishing rank, and if not successful, its lengths behind the winner. An example of a more sophisticated appraisal is the Racing Post Rating, which places performance on a scale covering the universe of horses in the same domain. These data are known to statisticians as random variables because some unknown fraction of each is either outside the racehorse's control (luck, health, suitability to conditions) or else beyond our understanding.
Most often, the identification of Trainer Form in the media is motivated by a recent sample of wins or losses skewed to the positive or negative. In the present case, the performance of Nicholls-trained runners is summarised by a relatively poor win-loss report in the Racing Post story.
Figure 1: Strike-rate of Paul Nicholls-trained runners in December in recent seasons.
Nicholls' strike-rate in December this year is indeed less than half the lowest of the four previous seasons. The reporter Jon Lees, one of the most responsible in the business, includes quotes from the trainer in which he comes across as unconcerned, but the views of his assistant Harry Derham are more cautionary.
The win-loss record of a sample of racehorses is governed by countless variables, of which at least 50 can be captured or calculated from the data supplied by the Racing Post itself. The most significant driver of racehorse performance is, of course, intrinsic merit, but a horse's handicap mark is also highly important, as is its suitability to ground conditions. There are many other factors familiar to followers of the sport.
In the Racing Post story, an inference is invited by use of the win-loss record of Nicholls runners partitioned by calendar month. But the fortunes of the yard could also be driven by other factors, one of which Nicholls clearly believes is important: “We've lots of horses to run in the spring that want better ground,” he said.
At the win-loss level, a trainer's record can show clustering of wins and losses for a number of reasons. An influx of new talent may lead to a slew of wins at the start of the season, for instance, but when the stable's horses pick up penalties or ascend the handicap, it is more difficult to continue winning. The opposite trend–a slow start and a belated winning spree–may be related to a switch in ground conditions typical of changing seasons, such as Nicholls identifies, or may come about for strategic reasons, owing to the season being back-loaded with valuable prizes.
These factors and others are dynamic. Coupled with the randomness of the racecourse, and the fact that strike-rate makes no distinction between a horse finishing last and one finishing second, they leave obvious potential for any signal of form to be accompanied by noise.
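To get a rough feel for how much noise a monthly strike-rate can carry, here is a minimal sketch that treats every runner as an independent trial with the same win probability. That is a deliberate simplification rather than a description of how races work, and the figure of 40 runners in a month is hypothetical. Only the 25.2% baseline is taken from this article (it is Nicholls' overall December strike-rate, quoted further down).

```python
import math


def strike_rate_sd(win_prob: float, runners: int) -> float:
    """Standard deviation of a strike-rate under a simple binomial model.

    Assumes independent runners that all share the same win probability,
    which is an illustrative simplification.
    """
    if runners <= 0:
        raise ValueError("runners must be positive")
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be between 0 and 1")
    return math.sqrt(win_prob * (1.0 - win_prob) / runners)


# Hypothetical month: 40 runners, each with a 25.2% chance of winning.
sd = strike_rate_sd(0.252, 40)
print(f"one standard deviation of the monthly strike-rate: {sd:.1%}")
```

Under those assumptions one standard deviation comes out a little under seven percentage points, so a month can land well away from the baseline without anything having changed in the yard.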
If there really is any information in the win-loss metric which provokes interest in Trainer Form, it ought to show up in data. So, has Nicholls' win-loss record in December predicted his performance in January? Is a win-loss record clean enough to know what will happen next to the yard?
Figure 2: Paul Nicholls 1997-2015: New Year Form Reversal.
No. Figure 2 contains a circle for each season of December-January strike-rates (there is some overplotting). On the X-axis is the trainer's strike-rate in December and on the Y-axis is the trainer's strike-rate in January. Overall, Nicholls has a 25.2% strike-rate in December and a 20.9% strike-rate in January, so the axes are scaled to allow for this different baseline. (Incidentally, I would read nothing about seasonality into the difference between these two strike-rates; it is a motivation for further study.)
The purple “best-fit” line has a negative (downward) slope, meaning that higher-than-average strike-rates in December are associated with lower-than-average strike-rates in January, and vice versa. Also, the graph is divided into quadrants to make it clearer that the Nicholls yard tends to experience a reversal in relative efficiency over the New Year period: there are more points in the upper-left and lower-right quadrants, showing that the stable's win-loss record in December is a poor guide to its relative fortunes in January.
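The two checks described for Figure 2, the sign of the best-fit slope and the count of points in each pair of quadrants, can be sketched as follows. This is an illustration under stated assumptions, not the method used to draw the figure: the function name is hypothetical, the quadrants are centred on the simple mean of the season strike-rates rather than on the overall 25.2% and 20.9% baselines, and the numbers in the usage example are made up and are not Nicholls' real figures.

```python
from statistics import mean


def slope_and_quadrants(december, january):
    """Least-squares slope of January on December strike-rates, plus the
    number of seasons showing reversal or momentum relative to the means.

    Reversal: above average in one month and below average in the other.
    Momentum: above average in both months, or below average in both.
    """
    if len(december) != len(january) or len(december) < 2:
        raise ValueError("need two equal-length lists of at least two seasons")
    mx, my = mean(december), mean(january)
    sxx = sum((x - mx) ** 2 for x in december)
    if sxx == 0:
        raise ValueError("December strike-rates must not all be identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(december, january))
    slope = sxy / sxx
    reversal = sum(1 for x, y in zip(december, january) if (x - mx) * (y - my) < 0)
    momentum = sum(1 for x, y in zip(december, january) if (x - mx) * (y - my) > 0)
    return slope, reversal, momentum


# Made-up strike-rates for four seasons, for illustration only.
dec = [0.30, 0.20, 0.28, 0.22]
jan = [0.18, 0.24, 0.19, 0.23]
slope, reversal, momentum = slope_and_quadrants(dec, jan)
print(f"slope={slope:.2f} reversal={reversal} momentum={momentum}")
```

A negative slope together with more reversal seasons than momentum seasons is the pattern the figure is described as showing.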
The data shows that, at least for the Nicholls stable in this part of the year, there is no momentum in either direction as far as win-loss record is concerned. We have established that Nicholls' strike-rate is 4.3 percentage points lower in January than in December (25.2% minus 20.9%) on average, but I would certainly take 5-4 against the opposite coming true this season (Figure 1 shows I have 10% to beat if securing the price!).
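For readers less used to fractional odds, the price quoted above can be turned into a break-even probability. The sketch below uses the standard conversion (stake divided by stake plus winnings) and ignores any bookmaker margin; the function name is hypothetical.

```python
from fractions import Fraction


def break_even_probability(win: int, stake: int) -> Fraction:
    """Probability at which a bet at fractional odds of win-stake breaks even.

    At 5-4 against, a stake of 4 returns a profit of 5 if the bet wins.
    """
    if win <= 0 or stake <= 0:
        raise ValueError("odds terms must be positive")
    return Fraction(stake, win + stake)


p = break_even_probability(5, 4)
print(f"5-4 against breaks even at {p} = {float(p):.1%}")  # 4/9 = 44.4%
```

In other words, a bet at 5-4 only makes sense if the bettor puts the chance of the outcome above about 44%.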
The message is that we should consider a horse's chance on a case-specific basis, rather than relying on the magical belief that performance has a trend after all other factors are taken into account. More often than not, the opposite is true. Whatever the metric attempting to capture Trainer Form, regression to the mean is more likely than momentum: it is hard to keep winning when penalties mount up, the handicapper has his say and the ground changes.
Any system featuring proactive handicapping and seasonality of ground conditions is likely to drive form in cycles. And that is the most rational reason for any clustering of wins, regardless of whether “Trainer Form” is real or an illusion.