Plant and Stowe, 2018

Article
Journal of Sports Economics
1-20
© The Author(s) 2018
Reprints and permission:
sagepub.com/journalsPermissions.nav
DOI: 10.1177/1527002518777977
journals.sagepub.com/home/jse
Is Moneyball Relevant on the Racetrack? A New Approach to Evaluating Future Racehorses
Emily J. Plant1 and C. Jill Stowe2
Abstract
The market for racehorses is volatile and inefficient, and the ability to identify and
exploit undervalued characteristics which predict performance can be profitable. In
this article, we evaluate whether quantitative measures of physical structure and
movement can help predict racing success and whether these measures are
appropriately valued by the market. We discover an interesting dynamic: One
measure predicting late racing development is a significant predictor of career
earnings but is not valued by the auction market; a different measure predicting early
racing development is valued by the marketplace but does not predict career
earnings.
Keywords
performance measurement, performance prediction, biomechanics, volatile markets,
market valuation, Thoroughbred racehorse
In the book Moneyball: The Art of Winning an Unfair Game, the author describes
how the management group of the Oakland Athletics baseball team applied quantitative analysis (coined “sabermetrics”) to uncover and exploit market inefficiencies
1 School of Business Administration, University of Montana, Missoula, MT, USA
2 Department of Agricultural Economics, University of Kentucky, Lexington, KY, USA
Corresponding Author:
C. Jill Stowe, Department of Agricultural Economics, University of Kentucky, 307 CE Barnhart, Lexington,
KY 40546, USA.
Email: jill.stowe@uky.edu
and, even under significant budget constraints, outperform most of its competition
(Lewis, 2003). Hakes and Sauer (2006) confirm the “Moneyball hypothesis” using
simple econometric tools. Since that time, the “Moneyball approach,” where statistical analysis is utilized to identify factors which are indicators of success but may be
undervalued by the market, has been applied in a number of different environments.
One such environment in which this approach is starting to gain popularity is in
the Thoroughbred racing industry. The Thoroughbred industry can be characterized
as one in which decisions have traditionally been made by instinct and experience,
with limited statistical evidence available for supporting decision-making, even for
transactions which represent a significant investment. However, there is a recent
trend among market participants to utilize quantitative analysis in the purchase
decision concerning future racehorses. Similar to sabermetrics, buyers of future
racehorses attempt to uncover market inefficiencies through the identification of
undervalued characteristics. Successful market participants can gain an advantage
over uninformed buyers by exploiting these inefficiencies.
Young Thoroughbred horses are commonly sold in public auctions; the market
for these horses is characterized by asymmetric information, where sellers know
more about the quality of a horse than buyers, but where neither party has full
information. The most popular public auctions occur when horses are yearlings,
which is roughly 7–12 months before they start their first race. Buyers of Thoroughbred yearlings have few options at their disposal when it comes to gathering
information on the expected future racing quality of these horses. Traditionally,
buyers rely on several key pieces of information when selecting a young racehorse
at auction: the opinion of expert agents skilled in selecting young horses, known as
“bloodstock agents”; analysis of the performance history of genetically related
horses, such as the sire (father), dam (mother), and siblings of the horse available
for sale; and likely the buyer’s own expertise or opinion.
Recently, interest in quantitative approaches has increased the utilization of
alternative technologies in predicting the future performance of a racehorse. While
veterinarians provide novel information with testing methods including cardiovascular assessment, genetic testing, and analysis of airway capacity, other lines of
inquiry attempt to quantify how the athlete moves. The analysis of motion is known
as biomechanics, in which quantitative measurements of physical conformation and
movement are utilized in forecasting a horse’s future racing potential. This idea is
not unlike how baseball teams attempt to quantify measures of past performance in
order to make decisions that improve future performance. This study suggests that if
biomechanics tools can identify characteristics undervalued by the market, users of
these tools gain an advantage over other buyers. In this article, we evaluate whether
biomechanics tools provide additional information in predicting the future success
of a racehorse, and if so, whether these measures are appropriately valued by the
market. In any type of volatile, inefficient market such as this, the ability to identify
and exploit undervalued characteristics stands to benefit the holders of that
information.
The auction market for young Thoroughbreds is inefficient in the sense
that sales prices are, in general, not good predictors of racetrack success. A working
paper by Chezum, Stowe, and Schuckers (2014) investigates an efficient markets
hypothesis using auction prices on young Thoroughbreds and finds that while prices
marginally incorporate information that becomes available as a Thoroughbred ages,
variation in auction prices explains very little of the variation in career racetrack
earnings (on the order of about 1% for weanlings up to about 8% for 2-year-olds).
However, most research in this line of inquiry focuses on identifying attributes that
determine hammer prices in auctions based on information readily available to the
seller. It is this set of characteristics on which buyers of Thoroughbreds have traditionally relied in selecting prospective racehorses. According to these studies, characteristics which are consistently identified as price determinants include pedigree,
gender, and sale characteristics. In pricing studies, the quality of the sire is best
measured by stud fee; the quality of the dam is measured both by her own performance on the track and that of her progeny. Gender and sale characteristics, like the
horse’s placement in the sale (higher quality horses are generally presented earlier in
the sales), are other characteristics traditionally valued by the market (see Buzby &
Jessup, 1994; Chezum & Wimmer, 1997; Commer, 1991; Parsons & Smith, 2008;
Plant & Stowe, 2013; Robbins & Kennedy, 2001; Vickner & Koch, 2001; Wimmer
& Chezum, 2003).
To our knowledge, no economics studies have looked at the determinants of
career performance. However, genetic approaches have been used to study heritability of racing performance to assist in selection of horses for breeding and
racing. Racing performance is a function of both genetic attributes and a multitude
of environmental factors including trainer, jockey, track conditions, and weather.
A recent study of a large population of Australian Thoroughbreds (164,046 horses
over an 11-year period) was conducted to reestablish which traits have a measurable genetic component and are thereby better suited for selection (Velie, Hamilton, & Wade, 2014). The authors found that the heritability estimate for career
earnings was .19, which lies toward the low end of the range from previous studies
conducted on smaller populations in European and Asian countries (.10 to .79).
The authors conclude that racing performance traits are complex and have relatively low estimates of heritability, due in large part to the fact that horse racing is
influenced by many environmental factors; however, the estimate does suggest
that from a population standpoint, the racing success of sires and dams will potentially be reflected in their progeny.
The use of biomechanics is rooted in the idea that an optimally proportioned
human will demonstrate superior athletic performance. This same concept can be
applied in many different arenas, including horse racing. Biomechanics tools allow
for objective measurement of variables related to a horse’s physical structure and
movement. Measurements of important bones and muscle groups, cardiac function, and pulmonary structures are analyzed for their potential to predict
performance as an alternative or supplement to the traditionally used characteristics described above.
Using data from Thoroughbred auctions and biomechanical measurements provided by a company specializing in this type of approach, this article examines the
effectiveness of biomechanics data in predicting future racetrack performance.
Then, we investigate whether the market values this information appropriately by
examining its relationship with sales prices.
Background and Data
The data consist of 841 horses born between 2004 and 2007 that were offered for
sale in public auctions in the United States as 1-year-old horses, known as
“yearlings.” These yearlings had biomechanical measurements taken by the company that provided the proprietary data to the authors. The yearlings in this data set
had already passed some initial selection criteria as established by the clients of
the biomechanics company, based on the objectives of the buyers who enlisted
the company to help select which yearlings to purchase.1 Supplementary data
are retrieved from the American Produce Records (Brisnet.com, n.d.), various
annual issues of the BloodHorse Stallion Register, and EquiBase (Equibase.com,
n.d.).
The basic theory behind equine biomechanics measurements is that the physical
makeup and mechanics of a Thoroughbred are directly correlated with their ability as
a racehorse. The biomechanics company takes a battery of measurements of a
horse’s entire body. These measurements are then compiled into a score which is
an assessment of the biomechanical efficiency of the horse. In the case of yearlings
sold at public auction, the horses are measured either shortly before the time of the
auction or on the sales grounds of the auction in the days prior to the sale of the horse.
Clients pay the biomechanics company a fee to measure yearlings they are interested
in purchasing and use that information to assist in their purchasing decision.
The use of biomechanics data is another attempt to gain an advantage in a market
where investing appears to be a losing proposition. Neibergs and Vinzant (1999)
demonstrate that in what is considered the lowest quality of racing, the claiming
ranks, there is a positive average profitability in the bottom tiers of claiming races.
However, when taken as a whole, over 80% of all horses in training fail to recover
the variable costs associated with their training, much less the fixed cost of the
investment (Simon, 1998). These studies may not consider the terminal value of the
horse at the end of its racing career (its value in the breeding shed), which may
recover some of the difference. From this evidence, in order to explain racehorse
ownership, Gamrat and Sauer (2000) test three models: a participation model, a
constant returns model, and a championship model. Their results suggest that utility
of participation is the best model to explain the relationship between the rate of
return on racehorse ownership and auction prices. Hence, even though the purpose of
this article is not to explain the discrepancy between price and earnings, the observation that these horses are, on average, largely unprofitable is not entirely surprising and can be explained through the use of models incorporating nonpecuniary
returns to participating. Moreover, it is worth noting that Chezum et al. (2014) show
that the markets for the top quality, or “select,” yearlings are the least efficient of all,
and as we will discuss shortly, that market is most similar to the sample studied in
this article.
The most central measurement provided is the biomechanical efficiency score,
which is composed of a proprietary group of measurements and mathematical
calculations. Yearlings (approximately 14–16 months of age) are measured and assigned predicted biomechanical scores for early racing development and late racing development (BIOEARLY and BIOLATE, respectively). In short, the company’s modeling forecasts a
horse’s future biomechanical efficiency in its 2-year-old and 3-year-old racing seasons; both of these ages correspond to meaningful stages in a
horse’s racing career and may be linked to buyers’ investment objectives. The
effective scoring range is 90–100, and higher scores indicate more exceptional
biomechanical efficiency.
Predicting the racing potential of a horse at 2 and 3 years old is meaningful
for buyers. Racing a horse as a 2-year-old, the time period for which the
BIOEARLY score is predicted, results in an earlier potential return on investment
relative to waiting until the horse’s 3-year-old year, the time period predicted by
the BIOLATE score. However, the BIOLATE time period represents the important and coveted “Classics” in racing, which are some of the most richly rewarding and important races, such as the Kentucky Derby. Depending on whether
their objective is to maximize early returns or to increase their odds of successful performance in the big 3-year-old races, prospective buyers can utilize the
appropriate score.
Another physical measurement recorded is the width of the horse’s jaw,
which might be related to the amount of space available for air intake. The
measurements are divided into well below average, below average, average,
above average, and well above average. The variable JAWBELOWAVG = 1 if
the yearling’s jaw width is “below average” or “well below average”; similarly,
the variable JAWABOVEAVG = 1 if its jaw width is categorized as “above
average” or “well above average.”
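The jaw-width coding described above can be sketched in a few lines of Python. This is only an illustration: the function name `jaw_dummies` and the lowercase category strings are assumptions, not part of the article or of the company's data format. The "average" category serves as the omitted baseline, so both indicators are 0 for it.

```python
# Hypothetical category labels; the article names five jaw-width categories.
BELOW = {"well below average", "below average"}
ABOVE = {"above average", "well above average"}


def jaw_dummies(category: str) -> dict:
    """Map a jaw-width category to the two indicator variables."""
    label = category.strip().lower()
    return {
        "JAWBELOWAVG": int(label in BELOW),
        "JAWABOVEAVG": int(label in ABOVE),
    }


# "average" is the baseline: neither indicator is switched on.
for cat in ["well below average", "average", "above average"]:
    print(cat, jaw_dummies(cat))
```

Coding two indicators against an omitted middle category lets a regression estimate a separate premium for wide jaws and a separate discount for narrow jaws.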
A series of measurements related to the horse’s heart are also reported. The first
measurement is indicative of the stroke volume of the horse’s heart (HEARTSCORE). This continuous heart score measures the capacity of the horse’s cardiovascular system to deliver oxygen to and remove metabolic waste from the muscles.
Higher scores are desirable as they indicate higher oxygen delivery and metabolic
waste removal. Another measurement reported is the pace of the horse’s heartbeat
(PACE), with several categories of paces identified, including “normal,” “slow,”
“slow/block,” and “block.” According to the biomechanics company, paces other
than “normal” are desirable when selecting a racehorse.2 The variable PACE = 1 if
the pace is normal and 0 otherwise.
In addition to biomechanical measurements, other common predictors of performance relate to pedigree quality and gender. In pricing studies, the contribution of
the pedigree from the sire’s side is measured by the sire’s stud fee (STUDFEE) at the
time of the sale. In addition, the number of foal crops3 produced by the sire (SIRECROPNO) is included because the market values offspring differently depending on
the stage of the sire’s career, with a special distinction for sires with their
first crop of yearlings (FRESHMAN). The quality of these sires is unknown, as they
have no progeny on the racetrack at the time of the sale. Although sale year stud fee
appears to function as the best measure of sire quality in pricing studies, little
research has investigated the quality of sires as it relates to progeny performance.
Consequently, variables representing the sire’s own performance on the racetrack
and the performance of his progeny on the racetrack, both measured in earnings per
start (SIREEARNPERSTART and SIREPROGEARNPERSTART, respectively), are
included.4 Pedigree quality on the dam’s side is measured by whether the horse’s
dam earned black type on the racetrack (DAMBT) and whether the horse’s dam has
produced any black-type earning progeny (DAMPROGBT).5 The gender of the
horse is included as a control variable (COLT). Colts (uncastrated male horses)
stand to collect higher earnings than fillies (female horses), as they generally win
the most prestigious races with the highest purses. Finally, an indicator variable
controlling for whether the horse was sold at a select sale (SELECT) is included to
account for any unobserved differences, both in pedigree quality and in physical
conformation, which may exist among horses chosen to be sold in these high-quality sales.
A racehorse’s success on the track can be measured in a number of different
ways. For example, performance can be measured by career earnings, number of
wins, or whether a horse earns black type. While these measures describe
the success of a given horse’s race career differently, in this article, career earnings
(EARN) are ultimately used to demonstrate a horse’s success on the racetrack as
this is the most robust and general indicator of quality. Descriptions and summary
statistics of all variables used in the model are provided in Tables 1 and 2,
respectively.
Because of the selection criteria used prior to measurement, the sample used
in this study is not representative of the entire yearling population. In short, the
horses in our sample earn more money and perform better at the highest levels
of racing than the average Thoroughbred. Because we are looking at yearlings
that have a higher propensity to be successful than the average yearling,
BIOEARLY and BIOLATE have a tight range; even with that limitation, we find
evidence of the dichotomy between the two measures in predicting career racetrack earnings and sales prices. With that said, it would be helpful to obtain
measures on a more representative sample to investigate whether that relationship is replicated.
Table 1. Variable Names, Descriptions, and Expected Signs.

| Variable | Description |
| --- | --- |
| BIOEARLY | Predicted biomechanical efficiency score for the yearling when it reaches the age of 30 months (2.5 years) |
| BIOLATE | Predicted biomechanical efficiency score for the yearling when it reaches the age of 42 months (3.5 years) |
| HEARTSCORE | Score measuring the efficiency of the cardiovascular system |
| PACE | Descriptive of the pace of the heartbeat. PACE = 1 if normal |
| JAWABOVEAVG | A measurement of the width of the yearling’s jaw. JAWABOVEAVG = 1 if the yearling’s jaw width was categorized as “above average” or “well above average” |
| JAWBELOWAVG | A measurement of the width of the yearling’s jaw. JAWBELOWAVG = 1 if the yearling’s jaw width was categorized as “below average” or “well below average” |
| COLT | COLT = 1 if the yearling is a colt and 0 if the yearling is a filly |
| SELECT | SELECT = 1 if the yearling was sold at a select sale or in the select portion of the sale |
| LNSTUDFEE | The natural log of the advertised stud fee of the yearling’s sire in the year of the sale |
| SIRECROPNO | The number of crops that the yearling’s sire had produced at the time of its birth |
| FRESHMAN | FRESHMAN = 1 if the yearling is from the sire’s first crop of foals |
| LNSIREEARNPERSTART | The natural log of career racetrack earnings divided by number of starts for the yearling’s sire |
| LNSIREPROGEARNPERSTARTER | The natural log of the average racetrack earnings among the progeny produced by the yearling’s sire provided they had started at least one race + 1 |
| DAMBT | This indicator variable is set equal to 1 if the yearling’s dam earned black type (came in first, second, or third in a stakes race) during her racing career |
| DAMPROGBT | This indicator variable is set equal to 1 if the yearling’s dam produced at least one foal that earned black type (came in first, second, or third in a stakes race) prior to the date of the sale |
| LNEARN | The natural log of career racetrack earnings + 1 |
| LNPRICE | The natural log of hammer price |
We include a comparison of the sample used in this article to the entire sample of
yearlings sold at Keeneland 2005–2008 in Table 3 to help the reader better understand the limitations of the sample. Based on relative average sales prices and
performance metrics, it appears that the sample used in this article more closely
mirrors the select portion of the Keeneland Sales, although only 33% of our sample
were actually sold in the select portion of a sale.
Table 2. Summary Statistics.

| Variable | n | Mean | SD | Min. | Max. |
| --- | --- | --- | --- | --- | --- |
| BIOEARLY | 841 | 97.51 | 0.65 | 94.0 | 98.9 |
| BIOLATE | 841 | 97.85 | 0.58 | 94.7 | 99.0 |
| HEARTSCORE | 841 | 6.81 | 0.58 | 0 | 8.5 |
| PACE | 841 | 0.78 | 0.42 | 0 | 1 |
| JAWABOVEAVG | 841 | 0.28 | 0.45 | 0 | 1 |
| JAWBELOWAVG | 841 | 0.15 | 0.35 | 0 | 1 |
| COLT | 841 | 0.57 | 0.49 | 0 | 1 |
| SELECT | 841 | 0.33 | 0.47 | 0 | 1 |
| STUDFEE | 841 | US$68,793.70 | US$74,869.99 | US$3,000 | US$500,000 |
| SIRECROPNO | 841 | 7.20 | 5.01 | 1 | 24 |
| FRESHMAN | 841 | 0.13 | 0.34 | 0 | 1 |
| SIREEARNPERSTART | 841 | US$106,352.30 | US$121,330.00 | 0 | US$845,906.10 |
| SIREPROGEARNPERSTARTER | 841 | US$21,921.60 | US$15,417.15 | 0 | US$66,784.05 |
| DAMBT | 841 | 0.39 | 0.49 | 0 | 1 |
| DAMPROGBT | 841 | 0.32 | 0.47 | 0 | 1 |
| EARN | 841 | US$78,313.32 | US$194,816.40 | US$0 | US$3,614,500 |
| PRICE | 841 | US$279,550.80 | US$461,373.30 | US$2,200 | US$9,200,000 |
Table 3. Comparison of Sales Price and Performance Metrics Between Yearlings Sold at the
Keeneland September Sale 2005–2008 and the Sample Used in This Article.

| Sample (a) | Average Price | Average Earnings Per Start | Average Earnings Per Starter | % Runners | % Winners | % 2 YO Winners | % Stakes Winners | % Graded Stakes Winners |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Keeneland select | US$471,724 | US$6,112 | US$70,284 | 83 | 59 | 15 | 7.1 | 3.8 |
| Keeneland US$100K+ | US$291,970 | US$6,073 | US$77,571 | 86 | 64 | 16 | 7.7 | 4.1 |
| Keeneland total | US$103,217 | US$3,863 | US$56,342 | 83 | 61 | 17 | 5.2 | 2.3 |
| P&S select | US$518,911 | US$7,180 | US$100,971 | 79 | 61 | 18 | 6.6 | 4.9 |
| P&S US$100K+ | US$379,411 | US$6,927 | US$100,053 | 81 | 66 | 19 | 6.7 | 4.6 |
| P&S total | US$278,783 | US$6,777 | US$98,375 | 81 | 65 | 18 | 6.4 | 3.6 |

Note: P&S = Plant & Stowe (201X); YO = year old.
a. Keeneland statistics from Gillies (2011).
Empirical Models
We develop models to predict performance (earnings) and price. Explanatory factors
include the commonly used pedigree and sibling performance variables as well as
the new biomechanical measurements, which may be useful in predicting future
performance. Then, we explore whether these factors are appropriately valued by
the market in a public auction. The performance and price models are estimated
using seemingly unrelated regressions (SUR). Adopting this approach allows the
error term to be correlated across models and enables the equality of coefficient
estimates across models to be formally tested.
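As a rough sketch of what the SUR setup implies here, the two equations can be stacked into one system. The notation below ($y_E$, $y_P$, $X$, $\Sigma$) is introduced only for illustration and does not appear in the article; $E$ indexes the earnings equation, $P$ the price equation, and $X$ is the common $n \times k$ regressor matrix.

```latex
\[
\begin{bmatrix} y_E \\ y_P \end{bmatrix}
=
\begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix}
\begin{bmatrix} \beta_E \\ \beta_P \end{bmatrix}
+
\begin{bmatrix} \varepsilon_E \\ \varepsilon_P \end{bmatrix},
\qquad
\operatorname{Var}\!\begin{bmatrix} \varepsilon_E \\ \varepsilon_P \end{bmatrix}
= \Sigma \otimes I_n,
\qquad
\Sigma =
\begin{bmatrix} \sigma_{EE} & \sigma_{EP} \\ \sigma_{EP} & \sigma_{PP} \end{bmatrix}.
\]
\[
\hat{\beta}_{\mathrm{SUR}}
= \bigl[ Z' (\hat{\Sigma}^{-1} \otimes I_n) Z \bigr]^{-1}
  Z' (\hat{\Sigma}^{-1} \otimes I_n)\, y,
\qquad
Z = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix},
\quad
y = \begin{bmatrix} y_E \\ y_P \end{bmatrix}.
\]
```

When both equations share the same regressors, as they do in this article, the SUR point estimates coincide with equation-by-equation ordinary least squares. The practical gain from the joint system is the estimated cross-equation covariance $\sigma_{EP}$, which is what makes formal tests of coefficient equality across the two models possible.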
The performance model assumes that success on the racetrack is measured by
lifetime earnings and that this success is a function of both genetic and biomechanical factors.6,7
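\[
\begin{aligned}
\text{(Performance)}\quad \mathrm{LNEARN} ={}& \alpha + \beta_1\,\mathrm{BIOEARLY} + \beta_2\,\mathrm{BIOLATE} \\
&+ \beta_3\,\mathrm{HEARTSCORE} + \beta_4\,\mathrm{PACE} + \beta_5\,\mathrm{JAWABOVEAVG} \\
&+ \beta_6\,\mathrm{JAWBELOWAVG} + \beta_7\,\mathrm{COLT} + \beta_8\,\mathrm{LNSTUDFEE} \\
&+ \beta_9\,\mathrm{FRESHMAN} + \beta_{10}\,\mathrm{SIRECROPNO} \\
&+ \beta_{11}\,\mathrm{SIREEARNPERSTART} \\
&+ \beta_{12}\,\mathrm{SIREPROGEARNPERSTARTER} + \beta_{13}\,\mathrm{DAMBT} \\
&+ \beta_{14}\,\mathrm{DAMPROGBT} + \beta_{15}\,\mathrm{SELECT} + \varepsilon.
\end{aligned}
\]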
The dependent variable is the natural log of career earnings. Independent variables include those related to the horse’s physical structure: The predicted biomechanical efficiency scores for early racing development and late racing
development, the heart score and heart pace, dummy variables indicating whether
a horse’s jaw was above average width or below average width, and the yearling’s
gender. Paternal pedigree variables include the natural log of the yearling’s sire’s
sale year stud fee, whether the horse is part of the sire’s first crop of yearlings, the number
of foal crops the sire has produced, the sire’s own racetrack performance measured
by earnings per start, and the earnings per starter of the sire’s progeny. Maternal
pedigree variables include whether the yearling’s dam earned black type during her
racing career and whether the dam has produced any progeny earning black type.
Finally, we control for whether the horse was sold in the select portion of the sale.
The price model analyzes the determinants of (the natural log of) auction prices
using the following hedonic pricing model:
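\[
\begin{aligned}
\text{(Price)}\quad \mathrm{LNPRICE} ={}& \alpha + \beta_1\,\mathrm{BIOEARLY} + \beta_2\,\mathrm{BIOLATE} + \beta_3\,\mathrm{HEARTSCORE} \\
&+ \beta_4\,\mathrm{PACE} + \beta_5\,\mathrm{JAWABOVEAVG} + \beta_6\,\mathrm{JAWBELOWAVG} \\
&+ \beta_7\,\mathrm{COLT} + \beta_8\,\mathrm{LNSTUDFEE} + \beta_9\,\mathrm{FRESHMAN} \\
&+ \beta_{10}\,\mathrm{SIRECROPNO} + \beta_{11}\,\mathrm{SIREEARNPERSTART} \\
&+ \beta_{12}\,\mathrm{SIREPROGEARNPERSTARTER} + \beta_{13}\,\mathrm{DAMBT} \\
&+ \beta_{14}\,\mathrm{DAMPROGBT} + \beta_{15}\,\mathrm{SELECT} + \varepsilon.
\end{aligned}
\]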
The results from the price model can be used to investigate the extent to which
factors in the performance model are valued by the market. Results from both
models, along with marginal values, are presented in the next section. Depending
on the model, the marginal value of a variable is the change in career racetrack
earnings or sales price given a one-unit increase in the relevant independent variable.
Marginal values are calculated at sample means for continuous variables and at zero
for all dummy variables except the variable under consideration.
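To make the marginal-value idea concrete, the sketch below applies the usual first-order approximations for a log-dependent-variable model, using the sample means from Table 2 and coefficient estimates from Table 4. It is an approximation under stated assumptions, not the authors' exact procedure: the helper names are hypothetical, and the sample mean of the dependent variable stands in for the fitted value at which the article evaluates its marginal values.

```python
import math

# Sample means from Table 2 (US$).
MEAN_EARN = 78_313.32
MEAN_PRICE = 279_550.80
MEAN_STUDFEE = 68_793.70


def mv_level(beta: float, mean_y: float) -> float:
    """ln(Y) on x in levels: dY/dx is approximately beta * Y."""
    return beta * mean_y


def mv_log(beta: float, mean_y: float, mean_x: float) -> float:
    """ln(Y) on ln(x): dY/dx is approximately beta * Y / x."""
    return beta * mean_y / mean_x


def dummy_pct_change(beta: float) -> float:
    """Proportional change in Y when a dummy switches from 0 to 1."""
    return math.exp(beta) - 1.0


# Coefficient estimates taken from Table 4.
biolate_earn = mv_level(0.966, MEAN_EARN)        # BIOLATE in the earnings model
bioearly_price = mv_level(0.198, MEAN_PRICE)     # BIOEARLY in the price model
studfee_price = mv_log(0.573, MEAN_PRICE, MEAN_STUDFEE)  # LNSTUDFEE in the price model

print(round(biolate_earn, 2), round(bioearly_price, 2), round(studfee_price, 2))
print(round(dummy_pct_change(0.209), 3))         # COLT in the price model
```

Under these assumptions the three dollar figures land close to the reported marginal values of US$75,618.96, US$55,322.68, and US$2.33. The small gaps are consistent with the article evaluating at a slightly different point or with unrounded coefficients.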
Results and Discussion
Results from the SUR estimation are presented in Table 4. The proportion of variance explained by the explanatory variables (the reported “R²,” which is not a well-defined concept when a generalized least squares approach is used) is .0346 for the
performance model and .4529 for the price model. The Breusch–
Pagan test for correlation of the residuals rejects the null hypothesis that the residuals
are not correlated at the 1% level, χ²(1) = 15.423, p = .0001. Both the performance
and price models are significant at the 5% level.
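For a two-equation system, the Breusch–Pagan test of residual independence has a simple form: the statistic is the number of observations times the squared correlation between the two residual series, referred to a chi-square distribution with 1 degree of freedom. The sketch below works backward from the reported statistic. The function names are hypothetical, and the implied correlation is a back-of-the-envelope figure, not a value reported in the article.

```python
import math


def bp_lm_two_equations(n_obs: int, resid_corr: float) -> float:
    """Breusch-Pagan LM statistic for two equations: n * r^2."""
    return n_obs * resid_corr ** 2


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


n_obs = 841            # sample size reported in the article
lm_reported = 15.423   # reported test statistic

# Magnitude of the residual correlation implied by the statistic
# (the sign cannot be recovered from the statistic alone).
implied_corr = math.sqrt(lm_reported / n_obs)
p_value = chi2_sf_df1(lm_reported)

print(round(implied_corr, 3), p_value)
```

The implied cross-equation residual correlation is modest in size, roughly 0.14 in absolute value, yet with 841 observations it is enough to reject independence, and the computed tail probability rounds to the reported p = .0001.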
Previous research (Chezum, Stowe, & Schuckers, 2014) has shown that it is difficult to predict the success of a horse on the racetrack. Results from the performance
model suggest that even after including biomechanical measurements to improve the
ability to predict performance, forecasting racing success is still essentially a shot in
the dark. However, the results presented in Table 4 provide evidence that attributes
relating to the physical structure of the horse are statistically significant in predicting
performance. The biomechanical efficiency score predicting 3-year-old racing success, BIOLATE, is positive and statistically significant (p < .05). Neither the heart- nor
jaw-related biomechanical measures are significant at conventional levels.
In addition to the biomechanical determinants, three other variables are statistically significant at the 10% level or better. First, as expected, colts earn significantly
more than fillies (p < .05); purses for sex-restricted races (i.e., races only for
females) are almost always lower than races open to either gender, and the most
prestigious races are almost always won by colts. Progeny by freshman sires earn
more than progeny by other sires (p < .10). Finally, the natural log of sire earnings
per start is negative and significant at the 10% level.
Results from the price model are also presented in Table 4. The results suggest
that buyers take into account many of the biomechanical measurements in their
bidding decisions. The biomechanical efficiency score predicting 2-year-old
racing success, BIOEARLY, is positive and significant at the 1% level. This result
indicates that yearling buyers value a horse that physically appears to be an early-performing type. Thus, the market values horses who are positioned to get to the
races earlier because they are more likely to begin “earning their keep” before their
later developing peers. In addition, PACE is significantly negatively related to
price, as expected (p < .05). Buyers pay more for yearlings whose jaw width
measured “above average” or “well above average,” all else equal (p < .10), but
pay less for yearlings whose jaw width was measured as “below average” or “well
below average” (p < .01).
The pedigree and gender variables affect sales prices as expected. Stud fee, which
is an indicator of a sire’s quality, is positive and significant at the 1% level. Prices are
Table 4. Results From Performance and Price Models.

| Variable | Performance Model (LNEARN): Coefficient Estimate (SE) | Performance Model: Marginal Value (a) | Price Model (LNPRICE): Coefficient Estimate (SE) | Price Model: Marginal Value (a) |
| --- | --- | --- | --- | --- |
| BIOEARLY | 0.362 (.375) | | .198*** (.071) | US$55,322.68 |
| BIOLATE | 0.966** (.425) | US$75,618.96 | .009 (.080) | |
| HEARTSCORE | 0.160 (.275) | | .013 (.052) | |
| PACE | 0.266 (.372) | | −.180** (.070) | −US$49,248.54 |
| JAWABOVEAVG | 0.547 (.356) | | .121* (.067) | US$34,127.27 |
| JAWBELOWAVG | 0.130 (.446) | | −.238*** (.084) | −US$60,229.90 |
| COLT | 0.779** (.323) | US$75,024.53 | .209*** (.061) | US$61,806.05 |
| LNSTUDFEE | 0.198 (.211) | | .573*** (.040) | US$2.33 |
| SIRECROPNO | 0.051 (.038) | | −.018** (.007) | −US$4,948.69 |
| FRESHMAN | 1.123* (.627) | US$153,724.94 | .159 (.118) | |
| LNSIREEARNPERSTART | −0.255* (.152) | −US$0.19 | .041 (.029) | |
| LNSIREPROGEARNPERSTARTER | 0.056 (.049) | | .007 (.009) | |
| DAMBT | 0.386 (.321) | | .120** (.060) | US$31,396.08 |
| DAMPROGBT | 0.379 (.336) | | .278*** (.063) | US$76,621.59 |
| SELECT | 0.079 (.365) | | .684*** (.069) | US$157,497.42 |
| Constant | 47.072* (27.228) | | 14.317*** (5.125) | |
| Regression statistics | | | | |
| R² | .0346 | | .4529 | |
| n | 841 | | 841 | |
| χ² (14,826) | 30.16 | | 696.30 | |
| Prob. > χ² | .0113 | | .0000 | |

***, **, and * represent significance at the 1%, 5%, and 10% levels, respectively.
a. The marginal value is the change in the dependent variable given a one-unit increase in the relevant independent variable. Marginal values are calculated at sample means for continuous variables and at zero for all dummy variables except the variable under consideration.
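Table 5. Correlation Matrix Between Biomechanics Scores and Pedigree Variables.

| Variable | BIOEARLY | BIOLATE | STUDFEE | SIREEARN | SIREPROGEARN | DAMBT |
| --- | --- | --- | --- | --- | --- | --- |
| BIOEARLY | | | | | | |
| BIOLATE | .7725 | | | | | |
| STUDFEE | .0263 | .0334 | | | | |
| SIREEARN | .0574 | .0056 | .0486 | | | |
| SIREPROGEARN | .0089 | .0307 | .3966 | .1363 | | |
| DAMBT | .0240 | .0094 | .1415 | .0468 | .0188 | |
| DAMPROGBT | .1057 | .1042 | .0203 | .0064 | .0060 | .0092 |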
decreasing in sire crop number, but there is no additional premium for yearlings by
freshman sires. Neither the sire’s own on-track performance nor average earnings for
each of his racing progeny are significantly related to price. Dam quality, as measured
by whether the yearling’s dam has produced any black-type winners, is significant at
the 1% level. Yearlings whose dams had earned black type themselves command a
higher price (p < .05). In this sample, on average, colts sell for significantly more than
fillies (p < .01). Finally, yearlings sold in the select portion of the sale sell for
significantly more than those in the nonselect portions of the sale (p < .01).
One might be concerned that the biomechanical measures are correlated with
pedigree quality variables, so that including both in a regression may introduce
multicollinearity. The correlation matrix presented in Table 5 alleviates those
concerns.8
The results from the performance and pricing models present an intriguing
dichotomy. The variable measuring late racing performance (BIOLATE) significantly predicts racing performance, while the variable measuring early racing performance (BIOEARLY) significantly predicts sale price. In other words, yearling
buyers are willing to pay more for horses who physically appear to be able to
perform well in their 2-year-old year, but this variable does not predict racetrack
earnings. On the other hand, yearling buyers do not pay a premium for later developing horses, when in fact this measurement does predict racetrack earnings. Buyers
also adjust their bids based on the pace of the horse’s heartbeat as well as the width
of its jaw; however, none of these factors are statistically related to performance at
conventional levels of significance.
The equality of coefficients across the models can be formally tested. All of the
coefficients for the biomechanical measures are significantly different in the two
models except for HEARTSCORE: BIOEARLY (p < .01); PACE, JAWABOVEAVG,
and JAWBELOWAVG (p < .05); and BIOLATE (p < .1). Remaining variables that
indicate statistical difference at the 10% level or better are COLT, LNSTUDFEE, SIRECROPNO, DAMBT, DAMPROGBT, and SELECT.
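The article does not state which statistic underlies these cross-model comparisons. One standard choice after joint estimation, shown here only as a sketch in our own notation, is a Wald test on the difference between the $k$-th coefficient in the earnings equation ($E$) and in the price equation ($P$):

```latex
\[
W_k =
\frac{\bigl(\hat{\beta}^{E}_{k} - \hat{\beta}^{P}_{k}\bigr)^{2}}
     {\widehat{\operatorname{Var}}\bigl(\hat{\beta}^{E}_{k}\bigr)
      + \widehat{\operatorname{Var}}\bigl(\hat{\beta}^{P}_{k}\bigr)
      - 2\,\widehat{\operatorname{Cov}}\bigl(\hat{\beta}^{E}_{k}, \hat{\beta}^{P}_{k}\bigr)}
\;\overset{a}{\sim}\; \chi^{2}(1)
\quad \text{under } H_0 : \beta^{E}_{k} = \beta^{P}_{k}.
\]
```

The covariance term in the denominator is available only because the two equations are estimated as a system; separate regressions would not supply it.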
Billy Beane used quantitative measurements to find baseball players who were
undervalued by the marketplace. This innovative approach allowed Beane to more
efficiently build the Oakland Athletics baseball team, paying far less for high performance than competing teams. In the same vein, this research demonstrates that
yearling racehorse buyers can more effectively select better performing horses
while paying less (i.e., at least until the use of these measurements becomes more
commonplace). Yearling buyers who use the BIOLATE score to select racehorses
could realize a monetary advantage (on average) against competing buyers, as that
score appears to have some predictive power for future career earnings, at the same
time that it is undervalued by the market. When evaluated at the mean, a one-unit
increase in the BIOLATE score translates into, on average, over an additional
US$75,000 (US$75,618.96) in career earnings. Buyers do not pay an additional
amount for higher BIOLATE scores in the sales ring, but they do pay an average of
US$55,322.68 more for each unit increase in the BIOEARLY score which does not
appear to predict career earnings. Buyers pay US$49,248.54 less for a horse with a
normal heart pace, yet this variable does not influence career earnings in our
sample. They also pay an average of US$34,127.27 more for a yearling with a
wider than average jaw but discount yearlings by an average of US$60,299.90 for a
narrower than average jaw. Neither of these measurements is significantly related
to a horse’s career racetrack earnings.
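One common way to turn a coefficient from a regression with a logged dependent variable into a dollar figure "evaluated at the mean" is to multiply it by the sample mean of the outcome. The sketch below shows that approximation alongside the exact percentage-change version and the log-log case. The function names and input values are hypothetical, and the article does not state which variant it uses.

```python
import math


def dollar_effect_at_mean(beta, mean_y):
    """Approximate dollar change in y for a one-unit change in x when the
    model is ln(y) = ... + beta * x, evaluated at the sample mean of y."""
    return beta * mean_y


def dollar_effect_exact(beta, mean_y):
    """Exact change implied by the same model: y is scaled by exp(beta)."""
    return mean_y * (math.exp(beta) - 1.0)


def log_log_dollar_effect(beta, mean_y, mean_x):
    """Dollar change in y per one-dollar change in x when both variables
    enter in logs (beta is an elasticity), evaluated at the sample means."""
    return beta * mean_y / mean_x


# Hypothetical inputs, chosen only to show the mechanics.
print(dollar_effect_at_mean(0.5, 100_000.0))
print(round(dollar_effect_exact(0.5, 100_000.0), 2))
print(log_log_dollar_effect(0.6, 100_000.0, 30_000.0))
```

The linear approximation and the exact form diverge as the coefficient grows, so the choice matters more for large coefficients than for small ones.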
Colts earn, on average, US$74,141.44 more than fillies, all else equal; Table
3 suggests that buyers pay about US$64,402.43 more for colts than fillies,
suggesting that the market values colts roughly appropriately.
Sire quality is measured in a number of ways: sale year stud fee, the sire’s
earnings per start on the track, and the sire’s earnings per starter produced. Buyers
pay an average of US$2.33 more for each US$1 increase in stud fee; however, Table
4 indicates that stud fee is not statistically significant in the earnings regression. The
sire’s crop number in which the yearling was born is negatively related to sales price;
each additional crop number reduces average price by nearly US$5,000. This suggests that, all else equal, buyers pay the most for yearlings by freshman sires. There
has been discussion on whether the quality of a sire’s foals diminishes over time,
which may explain this result; however, sire crop number is not a significant predictor of earnings. Progeny by first-year sires earned an average of US$153,724.94
more than progeny by all other sires.9
The dam quality variables are not statistically significant in the earnings regression
but are related to price. Buyers will pay US$31,396.08 more, on average, for a yearling
whose dam earned black type on the track even though this characteristic does not
influence earnings. The ability to produce quality racehorses is even more highly
valued; buyers pay over US$76,000 (US$76,621.59) more on average for a yearling
whose dam has already produced a black-type earning progeny. Interestingly, at least
in our sample, this propensity does not translate into higher career earnings. Finally,
buyers pay an average of over US$157,000 (US$157,497.42) more for a horse sold at a
select sale, but again, this attribute does not factor significantly into career earnings.
As mentioned earlier, controlling for the influence of freshman sires may be
problematic. More specifically, SIREPROGEARNPERSTARTER, which can be
thought of as representing expected earnings for the sire’s progeny, is set equal to 0
for freshman sires. To mitigate this issue, we investigate further by running SUR
only on the data for freshman sires and then separately for yearlings by all other
sires. These results are presented in Table 6.
Segmenting the sample in this way is enlightening in interpreting the divergent
results on BIOEARLY and BIOLATE. First, examine the results from the performance regressions in the first two columns. It appears that the significance of the
BIOLATE variable in the pooled performance model is driven largely by the
relationship between BIOLATE and career earnings for progeny by freshman sires;
it is significant at the 5% level in the model for progeny by freshman sires but is
insignificant (p < .13) in the model for progeny by all other sires. Recall that
freshman sires have not yet had any progeny race at the time of the sale, so the
quality of their offspring as racehorses is unknown. However, for all other sires,
there are several widely available sources of progeny performance and sire ranking
statistics (e.g., http://www.thoroughbreddailynews.com/tdn-sire-stats/) which may
negate the need for the biomechanical measurements. Without that information for
freshman sires, the predicted late maturity of the progeny as measured by the
BIOLATE variable is the best predictor of earnings. The performance model for
the freshman sire subsample is significant at the 10% level, while the model
for the remaining sires falls just beyond that (p < .11), so caution should be taken to
not overinterpret the results.
A similar but opposite result is seen when examining the BIOEARLY variable and
its relationship to sales price. Consider the results from the price regressions in the last
two columns of Table 6. In this case, it appears that its significance in the pooled
pricing model is driven largely by the relationship between BIOEARLY and sales price
in the pricing model for progeny by nonfreshman sires. It is positive and significant at
the 5% level in that model, while it is insignificant at conventional levels in the model
restricted to freshman sires (p < .17). This result may illustrate buyers’ objectives.
First, from the buyer’s perspective, there is some evidence that freshman sires are
different. A number of papers find that buyers are willing to pay a premium for
progeny of freshman sires. In this article’s pooled model, even though FRESHMAN
is not significant, the variable SIRECROPNO is negative and significant, suggesting
that all else equal, buyers pay the most for progeny of unproven sires.10 Other papers
have identified this result as a desire by buyers to correctly predict the next successful
sire. However, once a sire is no longer a new, unproven commodity, the results of this
model suggest that buyers are looking to maximize early return on investment as
evidenced by the positive and significant result on BIOEARLY.
Table 6. Results From Performance and Price Models Divided Into Samples for Yearlings by Freshman Sires and Nonfreshman Sires.

Cell entries are coefficient estimates (SE).

| Variable | Performance Model (LNEARN): Freshman Sires | Performance Model (LNEARN): All Other Sires | Pricing Model (LNPRICE): Freshman Sires | Pricing Model (LNPRICE): All Other Sires |
|---|---|---|---|---|
| BIOEARLY | 1.487 (0.979) | .188 (.409) | .307 (.221) | .161** (.075) |
| BIOLATE | 2.614** (1.106) | .713 (.461) | .003 (.249) | .029 (.084) |
| HEARTSCORE | 1.731** (0.857) | .008 (.291) | .290 (.192) | .035 (.053) |
| PACE | 0.504 (1.018) | .334 (.400) | .106 (.229) | .189** (.073) |
| JAWABOVEAVG | 1.234 (0.820) | .430 (.390) | .079 (.185) | .126* (.071) |
| JAWBELOWAVG | 0.050 (1.029) | .128 (.491) | .615*** (.232) | .170* (.090) |
| COLT | 0.448 (0.764) | .811** (.357) | .034 (.172) | .256*** (.065) |
| LNSTUDFEE | 0.077 (0.552) | .222 (.228) | .462*** (.124) | .583*** (.042) |
| SIRECROPNO | — | .050 (.039) | — | .018** (.007) |
| LNSIREEARNPERSTART | 0.476 (0.318) | .255 (.174) | .017 (.072) | .043 (.032) |
| LNSIREPROGEARNPERSTARTER | — | .061 (.050) | — | .007 (.009) |
| DAMBT | 0.671 (0.787) | .581* (.352) | .163 (.177) | .125* (.065) |
| DAMBTPROG | 0.071 (0.782) | .390 (.368) | .095 (.176) | .299*** (.067) |
| SELECT | 0.453 (0.877) | .208 (.401) | .787*** (.198) | .671*** (.073) |
| Constant | 108.924** (54.896) | 38.176 (30.891) | 21.536* (12.366) | 12.773** (5.655) |
| Regression statistics | | | | |
| R2 | .1474 | .0282 | .3667 | .4678 |
| n | 115 | 726 | 115 | 726 |
| χ2 | 19.98 | 21.04 | 66.58 | 638.19 |
| Prob > χ2 | .0693 | .1005 | .0000 | .000 |

***, **, and * represent significance at the 1%, 5%, and 10% levels, respectively.

Conclusion
In this article, we apply a Moneyball approach to selecting future racehorses by
investigating whether factors influencing a horse’s success on the racetrack are
appropriately valued. In addition to standard pedigree attributes, quantitative measurements from biomechanical data are introduced for the first time. As in Chezum
et al. (2014), we find that very little can be used to predict the success of a racehorse.
However, we do find evidence of at least one biomechanical measurement which
was significantly related to career racetrack performance while being undervalued
by the market; in a volatile, inefficient market, any advantage in selecting winning
racehorses translates to higher payoff in terms of purse money earned, likely
increased residual breeding value, and added nonmonetary utility from enjoyment.
Perhaps just as useful to buyers of similar yearlings, other attributes were highly
valued by the market but did not translate to better career performance. These results
depend to some extent on whether the yearlings are by freshman sires or more
proven sires.
As always, there are some limitations which must be considered. First, the sample
of horses included in this study represents a selected portion of the crop which is
likely higher in quality than the average racehorse. Thus, we are looking at yearlings
that have a higher propensity to be successful than the average yearling. Accordingly, BIOEARLY and BIOLATE have a tight range; even with that limitation, we
find robust evidence of the dichotomy between the two measures in predicting career
racetrack earnings and sales prices. It would be helpful to obtain measures on horses spanning a wider
range of quality to investigate whether that relationship is replicated.
Second, we measure future performance by success on the racetrack, measured
specifically by career earnings. However, the value of a future racehorse is not only
its expected racing potential but also its residual breeding value. For colts, the
competition is so fierce at the highest level of racing that only a small proportion11
will ever accomplish enough on the racetrack to have the opportunity to stand at stud
in a prominent breeding area, so expected future breeding value is near zero,12 and
lifetime earnings may be a good measure of its total value. However, fillies typically
have a positive residual breeding value, as most13 of them will enter second careers
as breeding stock; measuring only racetrack earnings and nothing else may underestimate their value somewhat. Gamrat and Sauer (2000) find that in the 1980s, a
mare’s average residual breeding value was about 3 times her racetrack earnings.
This may help explain to some extent why buyers pay more for horses sold at select
sales and for those with higher quality pedigrees even if those attributes do not
translate directly into increased racetrack earnings.
Much like the lottery, individuals choose to participate in the Thoroughbred
market not because they are assured a secure investment, but rather because of the
possibility of “catching lightning in a jar.” Each year, many thousands of yearlings
are bought and sold, each carrying the hope that it will one day become a successful
racehorse. While the probability of success is indeed slim, the allure of finding the
proverbial needle in a haystack sustains the industry. With the competition inherent
in the auction marketplace, buyers use available tools to increase the chance of
success, trying to gain an informational advantage over other buyers. More research
is needed to determine whether the result from this test of biomechanical
measurements would replicate in other marketplaces, such as those which sell horses
under 1 year of age or 2-year-olds; or in measuring breeding stock stallions and
mares to predict progeny performance.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of
this article.
Notes
1. Different buyers have different goals in the purchase decision: For example, some may
select male horses to race and hope to sell as a stallion prospect, while others would
choose to buy fillies with the intention of breeding them once they are done racing. Some
buyers would focus on choosing offspring of a certain sire or horses who are related to
other successful racehorses in their female family line.
2. An equine heart should be efficient at performing the task specific to that animal’s
athletic performance. A large heart isn’t necessarily required—a long-distance horse will
be well served by a slow pumping heart with a large stroke volume, while a sprinter needs
a faster, smaller, and more powerful pump.
3. The Thoroughbred breeding season operates on a seasonal basis, with the Northern
Hemisphere breeding season operating from mid-February to early July. Each group of
foals born in a given year is known as a “crop.”
4. One limitation of this modeling choice is that SIREPROGEARNPERSTART = 0 for
freshman sires. Consequently, it is measured with error in the sense that it is not representative of what these sires’ progeny will earn once they reach the racetrack. This has the
potential to bias the slope coefficient toward 0. In one set of models, we avoid this
potential problem by dividing the sample into yearlings by freshman sires and yearlings
by all other sires.
5. A horse earns black type (literally, their names on the sales catalog page are printed in
bold font, black letters) if it finishes first, second, or third in a qualifying stakes race.
6. Other performance measures, such as number of (graded/stakes) starts, number of
(graded/stakes) wins, earnings per start, winning percentage, and likelihood of earning
black type, were also tested; however, the selected model had the best fit and the greatest
significance of explanatory variables.
7. We also investigated models analyzing whether precocity, or success on the racetrack
as a 2-year-old, can be predicted by these same variables but found no significant results.
8. Since the biomechanical efficiency scores play a role in predicting performance and/or
price, another approach to determining whether the nonrandom sample may be introducing selection bias is to utilize a Heckman approach. The determinants of these scores can
be estimated, but we only have observations for the yearlings that were selected to be
measured. To investigate this issue, we utilize data from the 2007 Keeneland September
Yearling Sale as an example. A total of 4,882 yearlings were sold at that sale, but only 263
were selected by the company’s clients for measurements.
The first step of the model estimates the likelihood that a yearling is selected to be
measured. This choice is assumed to be a function of its pedigree quality, gender, and
age (using stud fee, freshman sire, dam black type, dam progeny black type, colt, and
month born as explanatory variables). The second step of the model estimates the determinants of the BIOEARLY or BIOLATE scores, which are assumed to be a function of both
the sire’s and dam’s racing and producing quality as well as gender (using sire earnings,
progeny earnings, sire earnings, dam black type, dam progeny black type, and colt as
explanatory variables).
In neither model were the inverse Mills ratios significant, and we cannot reject the null
hypothesis that the errors are uncorrelated. In other words, in terms of biomechanical
efficiency scores, the data are consistent with no selection.
9. This value seems suspiciously large. As mentioned earlier, this result may be picking up
measurement error in SIREEARNPERSTARTER. The unconditional sample means support this result in direction but not magnitude (yearlings by freshman sires earned an
average of US$105,079.20, while average earnings for yearlings by nonfreshman sires
were US$74,073.54).
10. Furthermore, if SIRECROPNO is dropped from the model, FRESHMAN becomes significant at the 5% level.
11. An average of around 70 colts a year entered stud between the years 2004 and 2007,
where the average entering stud fee was about US$10,800. About 45% of these stallions
stood in the elite location of central Kentucky, and only 30% of each “entering crop”
stood for US$10,000 or more in their first year. The total number of foals born during this
time period ranges from a high of 35,050 in 2005 to a low of 34,356 in 2007. If we assume
that half of the foals born are male, the data inform us that only about .40% of the total
number of colts born in the United States will enter stud in this country, about .18% of the
total will stand in Kentucky, and .12% will have a first stud fee of over US$10,000.
12. Oppenheim (2015) explains the industry method of estimating a prospective sire’s value; the
rule of thumb is that 300 times a stallion’s entering stud fee is a conservative estimate of a
prospective sire’s earning potential. So on average, the expected residual breeding value
for a colt is 0.004 × 300 × US$10,800 = US$12,960, which is negligible relative to the
purchase prices for these yearlings.
13. The competition for fillies to enter the bloodstock pool is much less fierce: because each male horse
can potentially mate with 100+ mares each year, a relatively small group of stallions can
serve a relatively large group of mares.
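The arithmetic in Notes 11 and 12 can be reproduced in a few lines. The sketch below uses only the figures quoted in those notes; the variable names are illustrative, and using the midpoint of the reported foal-crop range as the annual crop is an assumption.

```python
# Figures quoted in Notes 11 and 12.
colts_entering_stud = 70          # average per year, 2004 to 2007
foals_high, foals_low = 35_050, 34_356
share_kentucky = 0.45
share_fee_10k_plus = 0.30
avg_entering_fee = 10_800         # US$
fee_multiple = 300                # rule of thumb from Oppenheim (2015)

# Assumption: use the midpoint of the foal-crop range, half of it male.
colts_born = (foals_high + foals_low) / 2 / 2

p_stud = colts_entering_stud / colts_born
p_kentucky = p_stud * share_kentucky
p_fee_10k = p_stud * share_fee_10k_plus

# Note 12 rounds the probability of entering stud to 0.004.
expected_breeding_value = 0.004 * fee_multiple * avg_entering_fee

print(f"enter stud: {p_stud:.2%}")
print(f"stand in Kentucky: {p_kentucky:.2%}")
print(f"first fee of US$10,000 or more: {p_fee_10k:.2%}")
print(f"expected residual breeding value: US${expected_breeding_value:,.0f}")
```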
References
Brisnet.com. (n.d.). Brisnet.com American produce records. Brisnet.com. Retrieved January
10 to May 1, 2013, from http://www.brisnet.com/php/AprOnline/index.php/
Buzby, J. C., & Jessup, E. L. (1994). The relative impact of macroeconomic and
yearling-specific variables in determining thoroughbred yearling price. Applied Economics, 26, 1–8.
Chezum, B., Stowe, C. J., & Schuckers, M. (2014). Some evidence of information aggregation
in auction prices (Working paper).
Chezum, B., & Wimmer, B. (1997). Roses or lemons: Adverse selection in the market for
thoroughbred yearlings. Review of Economics and Statistics, 79, 521–526.
Commer, M. Jr. (1991). The effect of non-phenotypic data on thoroughbred prices in the
MidAtlantic market. The Professional Animal Scientist, 7, 18–24.
Equibase.com. (n.d.). Equibase stats central-horse profile. Equibase.com. Retrieved January
10 to May 1, 2013, from http://www.equibase.com/stats/View.cfm?tf=year&tb=horse
Gamrat, F. A., & Sauer, R. D. (2000). The utility of sport and returns to ownership: Evidence
from the thoroughbred market. Journal of Sports Economics, 1, 219–235.
Gillies, S. T. (2011). Keeneland September preview: Flat is the new measure of success. The
Blood-Horse Market Watch, 1–3.
Hakes, J. K., & Sauer, R. D. (2006). An economic evaluation of the Moneyball hypothesis.
Journal of Economic Perspectives, 20, 173–185.
Lewis, M. (2003). Moneyball: The art of winning an unfair game. New York, NY: W.W. Norton.
Neibergs, J. S., & Vinzant, P. L. (1999). Maximum-likelihood estimates of racehorse earnings
and profitability. Journal of Agribusiness, 17, 37–48.
Oppenheim, B. (2015). Good gamble. Thoroughbred Daily News. Retrieved December 9, 2015,
from http://www.thoroughbreddailynews.com/bill-oppenheim-good-gamble-shared-archive/
Parsons, C., & Smith, I. (2008). The price of thoroughbred yearlings in Britain. Journal of
Sports Economics, 9, 43–66.
Plant, E., & Stowe, C. J. (2013). The price of disclosure in the thoroughbred yearling market.
Journal of Agricultural and Applied Economics, 45, 1–15.
Simon, M. (1998, April 4). All about purses. Thoroughbred Times, pp. 18–22.
Robbins, M., & Kennedy, P. E. (2001). Buyer behavior in a regional thoroughbred yearling
market. Applied Economics, 33, 969–977.
Velie, B. D., Hamilton, N. A., & Wade, C. M. (2014). Heritability of racing performance in
the Australian thoroughbred racing population. Animal Genetics, 46, 23–29.
Vickner, S., & Koch, S.I. (2001, Fall). Hedonic pricing, information, and the market for
thoroughbred yearlings. Journal of Agribusiness, 19, 173–189.
Wimmer, B. S., & Chezum, B. (2003). An empirical examination of quality certification in a
‘Lemon’s market.’ Economic Inquiry, 41, 279–291.
Author Biographies
Emily J. Plant holds a PhD in Marketing from the University of Kentucky, along with a BS in
Business from Indiana University and an MBA from Xavier University. Formerly an associate
professor at the University of Montana, Plant now exercises her research skills and passion for
all things equine full time as a consultant in the Thoroughbred racing and breeding industry.
C. Jill Stowe holds a PhD in Economics from Texas A&M University along with a BS in
Mathematics from Texas Tech University. Stowe is currently an associate professor of Agricultural Economics at the University of Kentucky. Her research interests include economics
of the equine industry, decision-making under risk and uncertainty, behavioral economics,
game theory, sports economics, and industrial organization.