Appendix 8
Selection of tooth wear traits in Brown and Chapman, and Dudley
When we applied the scoring methods proposed by Brown and Chapman, and
Dudley, to our samples, it was evident that the presence or absence of many traits
was correlated. To reduce the number of redundant traits, we fitted multiple linear
regression models of age on tooth wear traits, incorporating a selection procedure
for inclusion of traits in models based on the prediction error sum of squares statistic
(PRESS). Ideally, we would have liked to fit all possible models based on every
possible subset of traits. However, evaluation of all possible regression models was
not feasible due to the large number of traits to be considered. To help identify useful
sets of predictor traits, a sequential stepwise-like selection procedure was therefore
adopted, where traits were entered or removed from a regression model according to
a specified criterion. Since the primary objective here was prediction, a
cross-validation statistic was considered an appropriate criterion for entry or removal
of variables. The prediction error sum of squares was used because this statistic is derived
from cross-validated predictions and can be considered a summary measure of
prediction quality of a regression model, with lower values of PRESS indicating
better predictive models.
Leave-one-out cross-validation was used: each sample was omitted in turn, the
regression model was fitted to the remaining samples, and a prediction was made
for the omitted sample.
For leave-one-out cross-validation, the PRESS statistic is defined as follows:
$$\mathrm{PRESS} = \sum_{i=1}^{n} \left( y_i - \hat{y}_{(i)} \right)^2$$
where $\hat{y}_{(i)}$ is the age predicted for sample $i$ by a regression model fitted
with that sample (actual age $y_i$) omitted, and $n$ is the number of samples. PRESS
is therefore the sum of the squared differences between actual ages and
cross-validated predicted ages.
Within the stepwise regression procedure, the trait which was most highly correlated
with the actual age was selected at the first step. Thereafter, at each additional step
of the procedure, traits were either entered or removed from the regression model
one at a time based on the value of the PRESS statistic, with lower values of PRESS
indicating improved predictive models. The stepwise procedure was stopped when
there was no decrease in the PRESS statistic with additional steps (i.e. no
improvement in the predictive quality of models with further entry or removal of
predictor variables). The variable selection process described above was carried out
using the GLMSELECT procedure in SAS 9.2 (SAS Institute, Cary, NC).
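The selection logic described above can be sketched in Python as follows. This is an illustration only and does not reproduce the SAS implementation: the function names `press` and `stepwise_press` are hypothetical, the first trait is assumed to be chosen by absolute Pearson correlation with age, and at each later step the single entry or removal giving the lowest PRESS is assumed to be taken.

```python
import numpy as np


def press(X, y, cols):
    """Leave-one-out PRESS for a least-squares model of y on columns `cols` of X."""
    n = len(y)
    A = np.column_stack([np.ones(n), X[:, cols]])  # intercept plus chosen traits
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        total += (y[i] - A[i] @ beta) ** 2
    return float(total)


def stepwise_press(X, y):
    """Stepwise-like selection of trait columns using PRESS as the criterion."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]

    # Step 1: start with the trait most highly correlated with age
    # (absolute correlation is an assumption of this sketch).
    corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(p)]
    selected = [int(np.nanargmax(corr))]
    best = press(X, y, selected)

    while True:
        # Candidate models: enter one unused trait, or remove one used trait.
        candidates = []
        for j in range(p):
            if j not in selected:
                candidates.append(selected + [j])
            elif len(selected) > 1:
                candidates.append([k for k in selected if k != j])
        if not candidates:
            break
        scores = [press(X, y, c) for c in candidates]
        i = int(np.argmin(scores))
        if scores[i] >= best:  # no decrease in PRESS: stop
            break
        selected, best = candidates[i], scores[i]

    return selected, best
```

Because PRESS must strictly decrease at every accepted step and the number of trait subsets is finite, the loop always terminates. A call such as `stepwise_press(traits, ages)` returns the indices of the retained trait columns together with the PRESS of the final model.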
Predictive regression equations were derived by applying the stepwise-like
regression technique described above to the combined data from all three technicians,
with a factor representing technician included in the models as the first term.
Cross-validated predicted ages, calculated using leave-one-out cross-validation and
adjusted to the average effect of technician, were then derived for each sample from
the appropriate predictive regression equations.
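One possible reading of "adjusted to the average effect of technician" is sketched below in Python. It is an assumption-laden illustration, not the procedure used in SAS: the function names `effect_code` and `loo_adjusted_predictions` are hypothetical, the data are assumed to hold one row per scoring record, rows are omitted one at a time, and the technician factor is given sum-to-zero coding so that the unweighted average technician effect is zero and is absorbed by the intercept.

```python
import numpy as np


def effect_code(tech):
    """Sum-to-zero coding of a factor: k levels give k - 1 columns,
    with the last level coded as -1 in every column."""
    levels = sorted(set(tech))
    k = len(levels)
    Z = np.zeros((len(tech), k - 1))
    for r, t in enumerate(tech):
        idx = levels.index(t)
        if idx < k - 1:
            Z[r, idx] = 1.0
        else:
            Z[r, :] = -1.0
    return Z


def loo_adjusted_predictions(traits, tech, age):
    """Leave-one-out predicted ages at the average technician effect.

    traits: (n, p) array of selected trait scores.
    tech:   length-n sequence of technician labels.
    age:    (n,) array of known ages.
    """
    traits = np.asarray(traits, dtype=float)
    age = np.asarray(age, dtype=float)
    n = len(age)
    Z = effect_code(tech)
    # Technician enters first, followed by the traits.
    A = np.column_stack([np.ones(n), Z, traits])
    # For prediction, set the technician columns to zero, which under
    # sum-to-zero coding corresponds to the average technician.
    A_avg = A.copy()
    A_avg[:, 1:1 + Z.shape[1]] = 0.0
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(A[keep], age[keep], rcond=None)
        preds[i] = A_avg[i] @ beta
    return preds
```

Under this coding the prediction for a record does not depend on which technician scored it, only on the trait scores and the fitted average level across technicians.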
As there is some evidence that tooth wear differs by sex (Van Deelen et al., 2000;
Høye, 2006; Carranza et al., 2008), analyses were carried out separately for males
and females (by-sex). Additionally, combined analyses using samples of both sexes
were carried out with a factor representing sex being included in these models
(unisex).
References
Carranza, J., Mateos, C., Alarcos, S., Sanchez-Prieto, C. B. & Valencia, J. (2008).
Sex-specific strategies of dentine depletion in red deer. Biological Journal of
the Linnean Society 93, 487-497.
Høye, T. T. (2006). Age determination in roe deer - a new approach to tooth wear
evaluated on known age individuals. Acta Theriologica 51, 205-214.
Van Deelen, T. R., Hollis, K. M., Anchor, C. & Etter, D. R. (2000). Sex affects age
determination and wear of molariform teeth in white-tailed deer. Journal of
Wildlife Management 64, 1076-1083.