regression exercise solution/template

PROBLEM

Individual chicks were depleted of their vitamin K reserves and then fed dried liver for 3 days at dosage levels ranging from 1.6 to 14.8 mg per gram of chick per day. At the end of this period, the concentration of clotting agent that would clot samples of its blood in 3 minutes was measured for each chick. The data for this experiment can be found in the file clotting.dat. The first column in the file contains the dose and the second column contains the concentration of clotting agent.

(a) Find an adequate model relating concentration of clotting agent (= dependent variable = y) to dosage of dried liver (= independent variable = x). Use everything at your disposal for checking the adequacy of the model (i.e., residual plots, outlier and influence diagnostics, tests of lack-of-fit, etc.).

(b) Given the model you chose in (a), with 95% confidence, predict the average concentration of clotting agent for a dosage level of 10.0 mg.

SOLUTIONS

(a) A plot of concentration of clotting agent versus dosage of dried liver showed that the relationship is non-linear. Because the raw data plot resembled the plot in Figure 14 (d), we attempted square-root and log10 transformations on both the concentration and the dosage. After several attempts, a linear relationship was achieved with the transformation

y' = log10(y) and x' = log10(x)

where y = concentration of clotting agent and x = dosage of dried liver. Consequently, a straight-line model was fit to the transformed data.

The ANOVA F-test was used to determine whether there was a significant linear relationship between log concentration and log dosage. The test statistic was F = 421.099 with observed significance level P < 0.001. Consequently, assuming that all assumptions are met, at α = 0.05 we concluded that there was a significant linear relationship between log concentration and log dosage.
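As a side note on the chosen transformation: a straight line in log10(y) versus log10(x) corresponds to a power-law relationship on the original scale. The short derivation below is only a sketch; the coefficients β0 and β1 are generic symbols introduced here for illustration (they are not taken from the output), and the error term is ignored.

```latex
\begin{align*}
\log_{10}(y) &= \beta_0 + \beta_1 \log_{10}(x) \\
y &= 10^{\beta_0 + \beta_1 \log_{10}(x)} = 10^{\beta_0}\, x^{\beta_1}
\end{align*}
```

The slope β1 therefore acts as an exponent on dosage, and any fitted value on the log10 scale is returned to the concentration scale by raising 10 to that value.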
A plot of the studentized deleted residuals versus dosage of dried liver showed no systematic pattern, indicating that a linear model was adequate. Furthermore, none of the residuals was greater than 2 in absolute value. Hence, there appeared to be no outliers.

The critical values for Cook's D, leverage and Dfits are:

                   Cook's D    Leverage    Dfits
critical values    0.2667      0.2667      0.73030

Chick #2 has two of the statistics (Cook's distance and Dfits) above the critical values, so it may be an influential point. We ran the regression again with chick #2 deleted; the results were not significantly different, so we left chick #2 in the data set.

In summary, we had no outliers or (actually) influential observations and an adequately fitting straight-line model relating log10(concentration) to log10(dosage). Hence, the final fitted model was

ŷ' = 3.010 - 1.892 x'

where ŷ' is the predicted log10(concentration) and x' = log10(dosage).

(b) Since we needed to predict the average concentration of clotting agent for a dosage level of 10 mg, a confidence interval for the mean was required. Furthermore, since 10 mg was not a dosage value contained in the original data set, we appended an extra observation with a dosage value of 10 and no value for concentration. From the output, in terms of log10 concentration, the point estimate was

ŷ' = 1.11837

Hence, in terms of concentration,

ŷ = 10**1.11837 = 13.1

Furthermore, in terms of log10 concentration, the 95% confidence interval for the mean was (1.0422, 1.1945). Hence, in terms of concentration, the 95% confidence interval for the mean was

(10^1.0422, 10^1.1945) = (11.02, 15.65)

Consequently, with 95% confidence, we predict the average concentration of clotting agent for a dosage of 10 mg to be between 11.02 and 15.65.
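The arithmetic in (b) can be checked by hand. The sketch below assumes the fitted line reads ŷ' = 3.010 - 1.892x'; the minus sign is an inference from the reported point estimate, since x' = log10(10) = 1. It then back-transforms the reported estimate and interval endpoints by raising 10 to each value.

```latex
\begin{align*}
x' &= \log_{10}(10) = 1 \\
\hat{y}' &= 3.010 - 1.892(1) = 1.118 \quad (\text{reported: } 1.11837) \\
\hat{y} &= 10^{1.11837} \approx 13.1 \\
\text{95\% CI for the mean} &: \left(10^{1.0422},\ 10^{1.1945}\right) \approx (11.02,\ 15.65)
\end{align*}
```

Because the interval is constructed on the log10 scale and then back-transformed, it is not symmetric about 13.1 on the concentration scale.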
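/* read the raw data: dose in column 1, concentration in column 2 */
data clot;
  infile 'C:\Documents and Settings\rcarta\Desktop\clotting.dat';
  input dose conc;
run;
proc print data=clot; run;

/* log10-transform both variables */
data clot2;
  set clot;
  y = log10(conc);
  x = log10(dose);
  drop dose conc;
run;
proc print data=clot2; run;

/* straight-line fit with predicted values, residuals and influence diagnostics */
proc reg data=clot2;
  model y = x / p r influence;
run;
quit;

/*************** check influential point */
/* keep every observation except the one with x between 0.3424 and 0.35 */
data clot22;
  set clot2;
  if x < 0.3424 or x > 0.35;
run;
proc print data=clot22; run;

proc reg data=clot22;
  model y = x / p r influence;
run;
quit;
/************** end check influential point */

/* extra observation: dosage of 10 mg, concentration missing */
data oneob;
  y = .;
  x = log10(10);
run;
proc print data=oneob; run;

data both;
  set clot2 oneob;
run;
proc print data=both; run;

/* clm = confidence limits for the mean, cli = prediction limits */
proc reg data=both;
  model y = x / clm cli;
run;
quit;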
