Supplemental Appendix

Model description and robustness checks

We are interested in a statistical modeling approach that is open to exogenous shocks that might affect the total population, e.g. environmental targets. We therefore stayed away from the traditional logistic models and described growth by an additive sequence of specific terms. An additive sequence may provide an adequate statistical representation, as more than 50% of the human height increment between birth and adulthood is caused by leg growth. The complex phenomenon of leg growth, i.e. long bone growth, is beyond the scope of this paper, but we want to stress that long bones essentially grow additively via endochondral ossification. The specific architecture of the proliferating chondrocytes within the epiphyseal growth plates results in a mainly unidirectional length increment of the bone. Growth of the chondrocytes depends on multiple interactions of endocrine signals, nutrition, oxygenation, physical stress, etc. (28) that all add in a cumulative, i.e. additive, way to long bone length. This was the reason to anticipate a model consisting of an additive rather than a multiplicative sequence of specific terms.

To check the robustness of our empirical results and guard against misspecification, we also investigate the statistical evidence for this model. We consider three archetypical models of height. These are defined in equations (1), (2), and (3). These three specifications characterize height as trend stationary (model 1), as first order trend stationary, i.e. stationary in differences (model 2), and as trend stationary in logarithmic differences (model 3). The evidence for these three baseline specifications is evaluated on the basis of the marginal likelihood; see below for details. Note that these different models imply different inference results when regressing height differences or logarithmic growth rates on past height.
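As a rough illustration of why the specification matters for such regressions, the following sketch simulates two simplified height panels and regresses height differences on past height. The functional forms are assumptions made for this example only (a common trend plus transitory shocks for specification 1, a drift plus accumulating shocks for specification 2), and the function names `simulate_heights` and `slope_on_past_height` as well as all parameter values are hypothetical; this is not the estimated model.

```python
import numpy as np


def simulate_heights(model, n_individuals=2000, n_periods=10, seed=0):
    """Simulate a height panel (rows: individuals, columns: periods).

    All numbers (baseline 135 cm, drift 5 cm per period, unit shock
    variance) are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    baseline = 135.0 + rng.normal(0.0, 1.0, size=(n_individuals, 1))
    trend = 5.0 * np.arange(n_periods)
    shocks = rng.normal(0.0, 1.0, size=(n_individuals, n_periods))
    if model == 1:
        # Trend stationary in levels: shocks are transitory.
        return baseline + trend + shocks
    if model == 2:
        # Stationary in differences: shocks accumulate in the level.
        return baseline + trend + np.cumsum(shocks, axis=1)
    raise ValueError("model must be 1 or 2")


def slope_on_past_height(heights):
    """Pooled OLS slope of height differences on past height.

    Both variables are demeaned within each period, which removes the
    common trend before the regression.
    """
    past = heights[:, :-1]
    diff = np.diff(heights, axis=1)
    past_c = past - past.mean(axis=0, keepdims=True)
    diff_c = diff - diff.mean(axis=0, keepdims=True)
    return float((past_c * diff_c).sum() / (past_c ** 2).sum())


for model in (1, 2):
    slope = slope_on_past_height(simulate_heights(model))
    print(f"model {model}: slope of height differences on past height = {slope:.3f}")
```

Under these assumptions the slope is clearly negative when shocks are transitory and close to zero when they accumulate, so a negative coefficient on past height is informative only once the underlying form of stationarity has been checked.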
If model specification 1 is supported by the data, the corresponding regression coefficient is expected to be negative for both dependent variables, i.e. height differences and logarithmic growth rates. This effect is known as regression to the mean. However, when model specification 2 holds, a significant impact of past height is only expected when regressing logarithmic growth rates on past height, whilst under model specification 3 it is to be expected when regressing height differences on past height. Thus, in order to guard against spurious regression results, we check the evidence for the underlying different characterizations of stationarity. The evidence according to the marginal likelihood is strongly in favor of specification 2. The corresponding log marginal likelihoods are -6400.8 (1), -2506.9 (2), and -6652.9 (3) for boys and -5156.6 (1), -1956.7 (2), and -4364.0 (3) for girls, respectively.

The described model can be summarized as follows. Define thus for all at time point where denotes observed height. Then for and can be modelled (reduced form) conditional on as height differences where is a latent individual specific growth component, 'average' growth, time specific dynamic transmission of growth, as time specific captures a backward looking mechanism conceptualized as and captures growth tempo via the difference between bone age and calendar age. Nonlinearities occurring in growth due to puberty are controlled via takes value 1 when a puberty control status is reached at time or before. denotes the time point of a certain puberty control, i.e. Note that this parameterization allows a time dependent modeling of the influence puberty controls exhibit on growth.

Model estimation, model comparison and handling of missing data

Bayesian estimation is performed using MCMC techniques (Gibbs sampling). Conjugate distributions are chosen a priori. Hyperparameters are set as follows.
Means and variances of conjugate normal distributions are set to zero and 1000, respectively. For conjugate gamma distributions, both hyperparameters are set to 1. Gibbs sampling is conducted via iterative sampling from the following full conditional distributions, yielding a sample from the joint posterior distribution. 10000 iterations are performed, of which 2000 draws are discarded as burn-in. Posterior sample characteristics serve as estimators. The algorithm involves the following steps.

1. Sample from the full conditional distribution of given as a univariate normal distribution with parameters and where , and and , and denote the prior moments of a conjugate normal density.
2. Define with and of dimension , where denotes a diagonal unit matrix and and denote the prior moments of a conjugate normal density; then can be sampled from a bivariate normal density with moments and
3. Define , and with . Then is sampled from an -dimensional multivariate normal distribution with moments and
4. Finally, sample for from an inverse Gamma distribution with moments and , where and denote the corresponding hyperparameters.

A total of 69 boys and 60 girls were observed from 9-18 years and 9-17 years, respectively. Two strategies are employed to deal with missing values. The first strategy uses complete case analysis, which allows for handling missing values in height and in measured Tanner stages. However, no complete case remains when considering bone age as a tempo control. Several approaches to deal with missing values in explanatory variables are discussed in the literature.
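The Gibbs steps listed above alternate between normal full conditionals for the mean parameters and an inverse Gamma full conditional for the variance. A minimal sketch of that alternation is given here for a plain linear regression rather than for the full growth model, which is a simplifying assumption of this example. It uses the stated hyperparameters (prior mean 0, prior variance 1000, gamma hyperparameters both 1) and the stated 10000 iterations with 2000 burn-in draws; the function name `gibbs_linear_regression` and the synthetic data are hypothetical.

```python
import numpy as np


def gibbs_linear_regression(y, X, n_iter=10000, burn_in=2000,
                            prior_mean=0.0, prior_var=1000.0,
                            prior_shape=1.0, prior_rate=1.0, seed=0):
    """Gibbs sampler for y = X @ beta + e with e ~ N(0, sigma2).

    Priors: beta ~ N(prior_mean, prior_var * I) and
    1 / sigma2 ~ Gamma(prior_shape, rate=prior_rate).
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    prior_prec = np.eye(k) / prior_var
    prior_prec_mean = prior_prec @ np.full(k, prior_mean)
    XtX, Xty = X.T @ X, X.T @ y
    sigma2 = 1.0  # arbitrary starting value
    beta_draws = np.empty((n_iter, k))
    sigma2_draws = np.empty(n_iter)
    for it in range(n_iter):
        # beta | sigma2, y: multivariate normal full conditional.
        post_cov = np.linalg.inv(prior_prec + XtX / sigma2)
        post_mean = post_cov @ (prior_prec_mean + Xty / sigma2)
        beta = post_mean + np.linalg.cholesky(post_cov) @ rng.standard_normal(k)
        # sigma2 | beta, y: inverse Gamma full conditional, drawn as the
        # reciprocal of a Gamma(shape, scale = 1 / rate) variate.
        resid = y - X @ beta
        shape = prior_shape + 0.5 * n
        rate = prior_rate + 0.5 * (resid @ resid)
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)
        beta_draws[it] = beta
        sigma2_draws[it] = sigma2
    # Discard the burn-in draws; the remainder approximates the posterior.
    return beta_draws[burn_in:], sigma2_draws[burn_in:]


# Usage on synthetic data (the true values 1.0, 2.0 and 0.25 are made up).
rng = np.random.default_rng(1)
x = rng.normal(size=200)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=200)
beta_draws, sigma2_draws = gibbs_linear_regression(y, X)
print("posterior means:", beta_draws.mean(axis=0), sigma2_draws.mean())
```

The posterior means of the retained draws then serve as point estimators, in the same way that posterior sample characteristics are used above.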
Based on the comprehensive review provided by (29) of multiple imputation as one way to deal with missing values in survey data, (30) discuss the use of multiple imputation by chained equations (MICE) algorithms based on classification and regression trees (CART) to mimic the full conditional distribution of missing values when the data structure does not allow for rich and yet computationally feasible full parametric models. Note that imputation in a panel model context is of particular relevance, since even a single missing value would cause the loss of all observations of an individual. However, in the context of missing values in observed bone ages, parametric models approximating the full conditional distribution are at hand, making use of the dynamic panel structure of the data.

Consider a panel data setup with individuals i and observation periods t = 1, …, T. Let z_it denote observed bone age for individual i at period t. For the pattern of missing values, see Table A1. Note that dealing with missing values is straightforward when Bayesian estimation is performed via Markov Chain Monte Carlo (MCMC) methodology. In each iteration of the Gibbs sampler, a new set of imputed values is generated, thus incorporating the uncertainty concerning the missing values within the parameter estimation of the structural model of interest. Given the unsystematic pattern of missing values within the data, the algorithm has the following structure. Define parametric models providing the approximation for the full conditional distributions of each variable in terms of linear regressions. z_i1 is thereby regressed on the following two periods, i.e. For all following periods regression is performed on one previous and one consecutive period, i.e.
and thus for the last period the model takes the form

These approximations of the full conditional distribution are then incorporated within the MCMC sampling scheme, where draws of are obtained from the corresponding predictive distributions, whose parameters are sampled from the sampling distributions, i.e. , where denotes the set of regressors involved in the approximations within the conditional distributions of .

The Bayesian framework allows the different specifications to be compared via the marginal likelihood, which gives the evidence of the sample data under a specific model. This concept incorporates the parameter uncertainty and the uncertainty stemming from the missing values, and it provides a consistent model assessment even for smaller samples, as it is not based on asymptotic properties. The derivation of the marginal likelihood follows the approach proposed by (15). The starting point of the derivation is to decompose the log marginal likelihood of all data into

As this identity holds for each point θ within the parameter space of the model, it is calculated at a point θ* within the highest density region, where θ* is the posterior mean. The first component gives the log likelihood, which is calculated as

where the draws are obtained from special shortened Gibbs runs iterating through the following conditional distributions

References

28. Nilsson O, Marino R, De Luca F, Phillip M, Baron J. Endocrine regulation of the growth plate. Horm Res, 2005;64:157-65.
29. Raghunathan TE, Lepkowski JM, van Hoewyk J, Solenberger P. A multivariate technique for multiply imputing missing values using a sequence of regression models. Survey Meth, 2002;27:85-96.
30. Burgette LF, Reiter JP. Multiple imputation for missing data via sequential regression trees. Am J Epid, 2010;172:1070-1076.

Table A1: Number of missing values in bone ages

year    9  10  11  12  13  14  15  16  17  18
boys   14  10  15  16  10  11   9  11   8   8
girls   7  10   9  10   7   3  10   8   4  13