Case study: Lactation curves for camels

Data provided by Fred Aloo (Institut 480, Uni Hohenheim, Prof. Valle-Zarate, June 2003)

Short description of objective of study and research question

Rendille pastoralists in northern Kenya group their camels (all belonging to the Rendille camel breed) into different performance and adaptation types:
- the Dabach type, said to have the highest milk yield during the rainy season but to lose body condition rapidly during the dry season, with a consequent drop in milk yield,
- the Godan type, which has a medium but stable milk yield during both the rainy and the dry season,
- the Coitte type, with a low milk yield but good body condition throughout, and
- the Aitimaso type, said to be in between the Dabach and the Godan type, with a higher milk yield than the Godan but a better drought tolerance than the Dabach type.

The study aims at estimating the milk yield of these different Rendille camel types in order to assess whether the indigenous classification can be trusted and hence used as part of on-farm performance estimation.

Performance estimation of livestock kept under their natural production conditions is particularly difficult in remote areas under adverse conditions, such as the marginal dryland areas. Performance and adaptation of such livestock species and breeds can also hardly be determined on research stations, because of strong genotype-environment interactions. Therefore, the development and use of methods for on-farm performance testing are necessary when aiming at assessing and improving livestock production systems in marginal areas. These methods seem to be more practical when the livestock keepers are involved in the performance recording. The results of the study will therefore contribute to the assessment of coherence and complementarities of indigenous and scientific knowledge and will further be used for the fine-tuning of on-farm performance testing methods suitable for the area.

Study design

- Cross-sectional study, i.e., the survey was conducted at one point in time (2.5 months data collection period from March to May 2003; this was towards the end of the dry season).
- Camels are classified into four different types: Dabach, Aitimaso, Godan and Coitte.
- For each type, camels were surveyed which were in lactation months 1, 2, ..., 12.
- For each lactation month and camel type, there are replicate camels.
- Each camel was surveyed only once, i.e., there are no repeated measurements.
- Lactation curves were to be fitted to each type.

Questions:
- What model to fit to the lactation curve?
- Are regression curves different among types and if so, in what respect?

The following model has been shown to fit well to lactation curves of camels:

$$\eta = \alpha x^{\beta} \exp(\gamma x)$$

where $x$ is the number of lactation months. This model is known as the incomplete gamma model, due to its relation to the incomplete gamma function, and as Wood's function (Wood, 1967, 1972).

Suggestion for analysis

The model can be linearized by taking logarithms:

$$\log(\eta) = \log(\alpha) + \beta \log(x) + \gamma x = \alpha' + \beta z_1 + \gamma z_2$$

where
$\alpha' = \log(\alpha)$
$z_1 = \log(x)$
$z_2 = x$

This can be fitted by multiple linear regression of y' = log(y) on $z_1$ and $z_2$. To compare lactation curves, it is useful to fit curves simultaneously for all four types. The joint model can be written

$$y_{ijt} = \alpha_i + \beta_i z_{1t} + \gamma_i z_{2t} + e_{ijt}$$

where
$y_{ijt}$ = logged milk yield of the j-th camel of the i-th type at the t-th month

Since there are replicate camels per month and type, the lack-of-fit can be tested by adding a lack-of-fit effect:

$$y_{ijt} = \alpha_i + \beta_i z_{1t} + \gamma_i z_{2t} + \lambda_{it} + e_{ijt} \qquad (1)$$

where
$\lambda_{it}$ = lack-of-fit effect for the i-th type and t-th month

The ANOVA for model (1), using sequential SS (Type I SS in SAS), yields

Source                DF     Type I SS   Mean Square   F Value   Pr > F
Type                   3    5.21121050    1.73707017      8.86   <.0001
Lactmonth*Type         4   14.11557956    3.52889489     17.99   <.0001
LOG_Lactmonth*Type     4    4.19206207    1.04801552      5.34   0.0004
Type*lackfit          33   10.05184539    0.30460138      1.55   0.0314

The lack-of-fit is significant, so one may consider modifications of model (1).
Specifically, we tried the following three models:

$$y_{ijt} = \alpha_i + \beta_i z_{1t} + \gamma_i z_{2t} + \delta_i z_{3t} + \lambda_{it} + e_{ijt} \qquad (2)$$

where (a) $z_{3t} = x_t^2$, (b) $z_{3t} = [\log(x_t)]^2$ and (c) $z_{3t} = x_t \log(x_t)$.

The ANOVAs are:

(a) $z_{3t} = x_t^2$

Source                  DF     Type I SS   Mean Square   F Value   Pr > F
Type                     3    5.21121050    1.73707017      8.82   <.0001
Lactmonth*Type           4   14.11557956    3.52889489     17.93   <.0001
LOG_Lactmonth*Type       4    4.19206207    1.04801552      5.32   0.0004
Lactmon*Lactmon*Type     4    4.30056889    1.07514222      5.46   0.0003
Type*lackfit            30    5.75127650    0.19170922      0.97   0.5092

R^2 = 0.31 (without lack-of-fit term)

(b) $z_{3t} = [\log(x_t)]^2$

Source                  DF     Type I SS   Mean Square   F Value   Pr > F
Type                     3    5.21121050    1.73707017      8.86   <.0001
Lactmonth*Type           4   14.11557956    3.52889489     17.99   <.0001
LOG_Lactmonth*Type       4    4.19206207    1.04801552      5.34   0.0004
LOG_Lac*LOG_Lac*Type     4    2.91910479    0.72977620      3.72   0.0057
Type*lackfit            29    7.13274064    0.24595657      1.25   0.1788

R^2 = 0.29 (without lack-of-fit term)

(c) $z_{3t} = x_t \log(x_t)$

Source                  DF     Type I SS   Mean Square   F Value   Pr > F
Type                     3    5.21121050    1.73707017      8.82   <.0001
Lactmonth*Type           4   14.11557956    3.52889489     17.93   <.0001
LOG_Lactmonth*Type       4    4.19206207    1.04801552      5.32   0.0004
Lactmon*LOG_Lac*Type     4    3.64047308    0.91011827      4.62   0.0012
Type*lackfit            30    6.41137231    0.21371241      1.09   0.3522

R^2 = 0.30 (without lack-of-fit term)

While model (a) had the largest R^2, it showed a slight upward twist for the highest lactation month (not shown), which is not plausible biologically. This problem did not occur with models (b) and (c), and among these, (c) fit slightly better. Thus, all subsequent analyses were based on model (c).

We tested the type-by-covariate interactions based on the model

$$y_{ijt} = \alpha_i + \beta_i z_{1t} + \gamma_i z_{2t} + \delta_i z_{3t} + e_{ijt} \qquad (3)$$

where $z_{3t} = x_t \log(x_t)$.
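
As a worked step added for illustration (it is not part of the original analysis), the model with the extra covariate $x_t \log(x_t)$ can be back-transformed to the original scale. The notation is an assumption made here: $\alpha_i$, $\beta_i$, $\gamma_i$ and $\delta_i$ denote the type-specific intercept and the coefficients of $\log(x_t)$, $x_t$ and $x_t \log(x_t)$, and $\eta_{it}$ denotes the back-transformed curve.

```latex
\begin{align*}
\log(\eta_{it}) &= \alpha_i + \beta_i \log(x_t) + \gamma_i x_t + \delta_i x_t \log(x_t) \\
\eta_{it} &= e^{\alpha_i} \, x_t^{\beta_i} \, e^{\gamma_i x_t} \, x_t^{\delta_i x_t}
           = e^{\alpha_i} \, x_t^{\beta_i + \delta_i x_t} \, e^{\gamma_i x_t}
\end{align*}
```

Read this way, the additional term lets the exponent of $x_t$ change linearly with the lactation month, and setting $\delta_i = 0$ recovers the incomplete gamma form.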
Type-specific effects were partitioned into a main effect and an interaction effect as follows:

$$\alpha_i = \alpha + a_i, \quad \beta_i = \beta + b_i, \quad \gamma_i = \gamma + c_i, \quad \delta_i = \delta + d_i$$

The ANOVA is as follows:

Source                  DF     Type I SS   Mean Square   F Value   Pr > F
Type                     3    5.21121050    1.73707017      8.75   <.0001
Lactmonth                1   13.55550713   13.55550713     68.32   <.0001
Lactmonth*Type           3    0.56007242    0.18669081      0.94   0.4211
LOG_Lactmonth            1    3.81456270    3.81456270     19.22   <.0001
LOG_Lactmonth*Type       3    0.37749937    0.12583312      0.63   0.5935
Lactmonth*LOG_Lactmo     1    2.29964353    2.29964353     11.59   0.0007
Lactmon*LOG_Lac*Type     3    1.34082956    0.44694319      2.25   0.0822

All three interaction terms ($b_i$, $c_i$, $d_i$) were non-significant. The results suggest that the curves are identical for the four groups, except for the intercept ($a_i$ significant), i.e., the curves are parallel. Thus, we fit the reduced model

$$y_{ijt} = \alpha_i + \beta z_{1t} + \gamma z_{2t} + \delta z_{3t} + e_{ijt} \qquad (4)$$

For illustration, the fitted model for Type 1 is shown in Fig. 1 (log-scale) and Fig. 2 (original scale; obtained by back-transformation of the model fitted on the log-scale). A residual plot across all types is inconspicuous and not suggestive of variance heterogeneity (Fig. 3).

Fig. 1: Fitted model (4c) for Type=1 on log-scale, with 95% prediction limits. (Axes: log(yield) versus Month.)

Fig. 2: Model for Type 1 on original scale based on back-transformation of model (4c) fitted to log-transformed data. (Axes: Yield versus Month.)

Fig. 3: Plot of studentized residual versus predicted value for fitted model (4c). All types.

Finally, we compute least squares means based on the reduced model (4). A letter display is obtained by the method of Piepho (2003).

Tab. 1: Means for types based on model (4) using log-transformed data. Means in a column followed by the same letter are not significantly different.

        Scale
Type    log§         original$
1       0.601 a      1.82 a
2       0.476 b      1.61 b
3       0.309 c      1.36 c
4       0.486 abc    1.63 abc

§ least squares mean
$ back-transformed least squares mean; this is a median estimate

A final improvement

Statistical inference (tests of effects) is expected to be fairly robust to nonnormality. Predictions of individual observations, however, are more sensitive. A Q-Q-plot of studentized residuals based on model (4) with log-transformed data shows some departure from normality (Fig. 4). Thus, we used the Box-Cox transformation, which yielded y^0.29 as the optimal normalising transformation. Normality of studentized residuals was satisfactory on this scale (Fig. 5). Thus, we re-did all analyses on this new transformed scale. ANOVA results were the same as on the log-scale (ANOVA tables not shown), including the comparison of means (Tab. 2). Fitted curves and prediction intervals are also quite similar, though the intervals have become somewhat narrower.

Fig. 4: Normal Q-Q-plot based on model (4c) using log-transformed data. (Axes: Residual versus Normal score.)

Fig. 5: Normal Q-Q-plot based on model (4c) using power-transformed data (y^0.29). (Axes: Residual versus Normal score.)

Fig. 6: Fitted model (4c) for Type=1 on power-transformed scale (y^0.29), with 95% prediction limits. (Axes: Yield^0.29 versus Month.)

Fig. 7: Fitted model for Type=1 on original scale, with 95% prediction limits. Model on original scale based on back-transformation of model (4c) fitted to power-transformed data (y^0.29). (Axes: Yield versus Month.)

Tab. 2: Means for types based on model (4) using power-transformed data (y^0.29). Means in a column followed by the same letter are not significantly different.

        Scale
Type    y^0.29§      original$
1       1.202 a      1.88 a
2       1.161 b      1.67 b
3       1.107 c      1.42 c
4       1.163 abc    1.68 abc

§ least squares mean
$ back-transformed least squares mean; this is a median estimate

Other types of model

Inverse polynomials are a flexible class of models (McCullagh and Nelder, 1989, Generalized linear models. Chapman and Hall, London) which may be used to fit lactation curves, e.g.,

$$\eta = \exp(\alpha + \beta x + \gamma / x)$$

and

$$\eta^{-1} = \alpha + \beta x + \gamma / x$$

These are easily linearized and so can be used in the same way as the incomplete gamma model. One may also consider using the power transformation to gain more flexibility.

Yet another model (Emmans and Fisher, 1986):

$$\log(\eta) = \alpha + \beta x + \gamma \exp(\delta x)$$

References

Wood PDP 1967 Algebraic model of the lactation curve in cattle. Nature 216: 164-165.
Wood PDP 1972 A note on seasonal fluctuations in milk production. Anim. Prod. 15: 89-92.

SAS code

SAS macros for multiple comparisons and for Box-Cox-transformation are available at www.uni-hohenheim.de/bioinformatik. Details on their use can be found in the lecture notes "Gemischte Modelle in den Life Sciences" (see homepage under "Veranstaltungen").

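
The code below refers to the variables LOG_yield, LOG_lactmonth and lackfit in a data set s, but their construction is not shown. A minimal sketch of how they might be derived is given here. It assumes a raw data set with the variables type, lactmonth and yield from the data listing; the name raw is a placeholder. It further assumes that lackfit is simply a copy of the lactation month that is used as a classification variable.

```sas
/* Sketch only: "raw" is a hypothetical input data set holding   */
/* type, lactmonth and yield as given in the data listing.       */
data s;
  set raw;
  LOG_yield     = log(yield);      /* natural log of milk yield              */
  LOG_lactmonth = log(lactmonth);  /* natural log of lactation month         */
  lackfit       = lactmonth;       /* assumed: month copy for CLASS statement */
run;
```

With lackfit defined this way, the term type*lackfit has one level per observed type-by-month cell, which is what a lack-of-fit effect for the i-th type and t-th month requires.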
/*lack of fit for regression model*/
proc glm data=s;
  class type lackfit;
  model LOG_yield=type type*lactmonth type*LOG_lactmonth
                  type*lactmonth*LOG_lactmonth type*lackfit;
run; quit;

/*fit model with interaction*/
proc glm data=s;
  class type;
  model LOG_yield=type
                  lactmonth type*lactmonth
                  log_lactmonth type*LOG_lactmonth
                  lactmonth*log_lactmonth type*lactmonth*log_lactmonth;
run; quit;

/*interactions not significant, so reduce model
  -> curves run parallel on log-scale/not on original scale*/
proc glm data=s;
  class type;
  model LOG_yield=type lactmonth log_lactmonth lactmonth*log_lactmonth;
  output out=res student=student predicted=pred_log_scale
         LCL=LCL_log_scale UCL=UCL_log_scale;
run; quit;

proc sort data=res out=res;
  by type lactmonth;
run;

/*back-transform predictions and limits to the original scale*/
data res;
  set res;
  pred_original_scale=exp(pred_log_scale);
  LCL_original_scale=exp(LCL_log_scale);
  UCL_original_scale=exp(UCL_log_scale);
run;

filename fileref 'd:\hpp\beratung\aloo\fig.cgm';
goptions reset=all dev=win target=cgmof97L gsfname=fileref
         keymap=winansi ftext=hwcgm001 gsflen=8092 gsfmode=replace
         hsize=17 cm vsize=12 cm htext=1.1;

axis1 label=none offset=(0 cm, 0 cm) w=10 minor=none major=(w=10);
axis2 label=none offset=(0 cm, 0 cm) minor=none w=10 major=(w=10);

/* plot fitted curves on log-scale*/
symbol1 i=spline value=none color=black w=2;
symbol2 i=none value=dot color=black h=1.5;
symbol3 i=spline value=none color=black w=2 line=3;
symbol4 i=spline value=none color=black w=2 line=3;

proc gplot data=res;
  plot pred_log_scale*lactmonth log_yield*lactmonth
       LCL_log_scale*lactmonth UCL_log_scale*lactmonth
       /overlay haxis=axis1 vaxis=axis2 nolegend;
  by type;
run; quit;

/* plot fitted curves on original scale*/
proc gplot data=res;
  plot pred_original_scale*lactmonth yield*lactmonth
       LCL_original_scale*lactmonth UCL_original_scale*lactmonth
       /overlay haxis=axis1 vaxis=axis2 nolegend;
  by type;
run; quit;

/*Residual plots*/
symbol i=none value=dot;
proc gplot data=res;
  plot student*pred_log_scale/overlay haxis=axis1 vaxis=axis2;
run; quit;

proc univariate data=res;
  qqplot
    student/normal nohlabel novlabel;
run;

/*pairwise comparisons*/
ods output diffs=diffs;
ods output lsmeans=lsmeans;
proc mixed data=s;
  class type;
  model LOG_yield=type lactmonth log_lactmonth lactmonth*log_lactmonth;
  lsmeans type/pdiff;
  %mult(trt=type);
run;

%boxcox(phimin=0.2, phimax=0.4, steps=20,
        model=type lactmonth log_lactmonth lactmonth*log_lactmonth,
        class=type, data=s, response=yield);

Data

Type Lactmonth Yield
2 2 0.76
2 4 3.00
2 3 3.00
3 3 1.52
2 2 2.00
3 4 0.65
1 4 3.13
2 2 2.08
1 4 5.06
2 4 4.00
3 4 3.46
1 5 3.60
3 5 3.20
2 8 1.00
1 8 1.20
2 4 1.40
1 4 4.00
3 4 2.80
2 4 3.80
1 4 3.60
3 1 0.90
2 4 3.00
4 4 1.60
1 4 2.80
1 1 1.70
1 4 3.90
3 4 2.20
3 4 2.40
1 4 1.80
2 4 1.80
2 5 2.20
1 4 2.40
2 4 2.40
3 5 1.40
3 5 1.60
1 4 2.00
1 4 1.96
4 5 2.20
3 5 2.10
2 5 1.00
3 12 0.70
2 12 0.76
2 12 0.52
1 4 3.42
1 4 2.00
4 4 1.60
4 4 1.90
3 11 1.40
1 11 1.04
1 12 1.00
3 4 1.50
2 4 2.50
4 4 0.50
4 4 1.70
1 3 2.00
1 3 3.80
1 3 2.40
2 3 1.44
4 4 2.80
1 4 4.20
2 4 2.40
2 4 2.80
1 5 1.30
1 4 1.30
3 1 1.60
2 3 2.00
1 5 2.84
1 2 1.90
2 12 0.78
1 3 1.22
1 3 1.80
4 4 4.40
1 12 0.80
1 3 1.00
3 2 2.50
2 5 2.00
2 3 1.60
2 12 1.50
2 12 3.60
1 3 4.20
2 5 2.20
1 12 1.00
1 5 2.80
4 12 1.46
3 5 2.40
4 12 1.00
3 3 1.42
2 3 1.56
1 3 3.50
2 5 2.20
1 4 3.20
1 10 1.92
2 9 1.50
1 5 1.40
2 7 2.80
1 3 1.84
3 12 1.60
3 4 2.80
2 4 4.80
2 2 4.00
1 11 1.00
4 7 1.20
2 3 1.24
4 4 2.50
3 5 2.80
3 4 2.70
3 5 2.50
4 5 2.90
2 6 1.00
3 5 1.20
1 5 2.60
1 5 1.46
4 5 2.30
3 4 1.20
2 12 0.60
2 12 1.82
2 12 0.84
4 3 2.42
4 12 0.68
1 11 0.60
1 11 0.90
3 8 1.00
1 6 2.68
4 3 3.20
4 5 1.60
1 10 3.00
2 12 1.70
1 5 1.88
2 3 2.00
1 7 3.00
1 7 2.00
4 4 1.64
4 6 2.00
1 5 2.42
1 12 1.00
2 1 2.08
1 5 1.40
4 4 2.48
4 4 2.20
4 5 1.60
4 5 2.80
1 5 2.00
4 5 2.60
4 5 2.40
4 5 0.80
4 4 2.40
3 4 1.40
1 6 4.40
1 12 2.60
1 3 2.60
1 5 1.84
1 5 2.40
2 5 1.80
2 4 2.60
1 4 2.20
1 5 2.40
2 5 2.00
2 5 1.00
2 2 1.00
1 12 1.60
3 5 1.40
2 6 1.70
4 6 1.60
1 6 1.80
3 6 0.90
4 5 1.50
1 5 1.00
1 6 1.80
2 3 2.40
3 6 2.40
4 6 2.00
4 6 2.00
1 6 1.76
3 7 0.84
4 7 2.40
1 7 1.60
1 6 3.20
1 5 2.00
1 5 2.00
2 5 2.20
3 4 0.80
1 5 2.80
3 6 1.50
3 7 1.00
1 2 1.00
1 2 2.40
1 6 2.20
1 3 2.62
3 6 1.24
3 3 1.80
1 3 2.40
2 8 1.60
1 8 1.80
1 9 1.40
1 9 1.06
1 8 1.22
1 8 0.90
1 11 2.40
1 9 1.00
2 8 1.00
3 7 0.76
2 11 0.30
1 11 1.00
4 9 0.80
1 8 1.40
1 9 1.80
4 9 1.00
1 10 1.00
2 7 0.84
2 10 0.82
3 9 0.20
3 8 0.22
3 9 0.50
1 7 1.60
2 5 1.52
2 7 1.60
2 5 2.00
2 8 0.40
2 6 2.00
1 7 1.00
2 5 0.90
1 10 0.84
1 6 1.00
2 10 1.30
2 7 0.90
1 7 1.22
2 7 1.40
2 7 1.00
2 7 1.00
2 6 1.40
1 1 1.40
2 11 1.42
2 11 1.00
1 8 1.40
2 8 1.80
2 8 1.60
1 8 1.58
2 8 2.00
1 9 2.60
2 11 1.00
1 10 0.60
2 9 1.40
2 9 1.60
3 9 0.80
1 10 1.60
2 6 1.80
2 7 1.60
1 9 2.00
2 10 0.60
3 7 1.20
3 5 1.00
2 9 1.70
1 6 0.72
1 9 1.40
2 10 1.00
2 11 1.44
2 12 0.50
1 8 1.00
3 7 0.40
1 8 1.60
2 10 1.40
2 10 1.00
1 10 1.40
1 1 1.60
2 1 1.00
2 7 1.60
1 7 1.80
3 12 1.20
2 11 1.20
1 10 1.32
1 11 1.38
3 9 1.30
2 11 1.60
3 7 1.80
2 1 2.20
1 1 2.40
2 1 0.80
3 1 0.90
2 1 1.00
2 12 0.80
2 11 1.30
3 11 0.76
2 6 1.26
3 6 1.22
1 12 2.60
3 12 2.00
2 6 2.40
2 7 3.20
3 6 1.00
1 5 2.80
1 7 4.00
2 5 1.60
1 3 1.20
2 10 2.98
2 5 2.80
2 6 2.40
3 5 1.60
3 5 4.40
1 5 3.60
3 7 1.30
2 6 2.00
3 6 1.80
3 6 1.50
3 12 1.60
3 12 1.80
2 5 1.00
1 2 2.90
3 3 1.60
2 6 3.00
2 6 2.40
2 6 1.80
2 6 4.46
2 6 4.00
3 6 2.00
3 6 1.70
3 6 1.60
3 5 1.60
3 8 1.00
3 7 1.04
3 10 0.90
3 7 1.30
3 11 1.30
3 2 1.60
3 8 1.30
3 9 0.80
3 10 1.10
3 9 1.60
3 8 2.00
3 11 1.50
1 11 1.60
3 1 1.00
1 5 2.60
1 3 3.80
4 4 4.00
2 1 1.50
4 5 2.26
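
To use the listing with the SAS code, it first has to be read into a SAS data set. A minimal sketch follows, assuming the records are arranged one per line in the order Type, Lactmonth, Yield. The data set name raw is a placeholder, and only the first three records of the listing are shown.

```sas
/* Sketch only: "raw" is a hypothetical data set name.            */
/* Each record is read in the order Type, Lactmonth, Yield.       */
data raw;
  input type lactmonth yield;
  datalines;
2 2 0.76
2 4 3.00
2 3 3.00
;
run;
```

The remaining records would be placed between the datalines statement and the closing semicolon in the same three-column layout.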