Supplementary materials for: Routine educational outcome measures in health studies: Key Stage 1 in the ORACLE Children Study follow up of randomized trial cohorts.

David R Jones, Katie Pike, Sara Kenyon, Laura Pike, Brian Henderson, Peter Brocklehurst, Neil Marlow, Alison Salt, David J Taylor.

Contents

For SPL cohort (analysed in main paper)
  Extended version of Table 1 in main paper: Characteristics of responders and non-responders, and characteristics of responder by treatment group
  Table S1 Highest equivalent (HEL) score derived from raw score data, compared with level returned from teacher's assessment, for Mathematics, 2004 onwards.
  Table S2 Numbers (percentages) failing to achieve level 2 or higher in KS1 level data for Mathematics anchored using PIPS score reference data (>12; <12) from 104,750 children.
  Additional more detailed data and analyses
    Different methods of analysis: 1) Dichotomising at level 2; 2) Extended analysis retaining most categories; 3) Ordinal logistic regression; 4) Poisson regression
    Adjusting for covariates: 1) Ordinal logistic regression; 2) Poisson regression
    Mapping categorical scores to continuous scores: 1) Unadjusted models; 2) Adjusted models
    Use of raw score data: 1) Level scores for those with raw score data available; 2) Descriptive analyses of level 2 test raw scores; 3) Modelling level 2 raw score data; 4) Extending analyses for other tests sat
    Standardisation/anchoring using PIPS data: 1) Exploratory analyses on PIPS data; 2) Analysis of the relationship between PIPS scores and KS1 levels; 3) Anchoring KS1 level data
    Appendix A: Additional graphs

For PROM cohort (not analysed in main paper)
  Tables 1-5 for PROM cohort (followed up from ORACLE I trial) corresponding to those in main paper for SPL cohort
  Characteristics of responders and non-responders, and characteristics of responder by treatment group (extends Table 1)
  Additional more detailed data and analyses (including example Stata commands)
    Different methods of analysis: 1) Dichotomising at level 2; 2) Extended analysis retaining most categories; 3) Ordinal logistic regression; 4) Poisson regression
    Adjusting for covariates: 1) Ordinal logistic regression; 2) Poisson regression
    Mapping categorical scores to continuous scores: 1) Unadjusted models; 2) Adjusted models
    Use of raw score data: 1) Level scores for those with raw score data available; 2) Descriptive analyses of level 2 test raw scores; 3) Modelling level 2 raw score data; 4) Extending analyses for other tests sat
    Standardisation/anchoring using PIPS data: 1) Exploratory analyses on PIPS data; 2) Analysis of the relationship between PIPS scores and KS1 levels; 3) Anchoring KS1 level data
    Appendix A: Additional graphs


Additional tables and analyses for SPL cohort (followed up from ORACLE II trial, as presented in main paper)

Characteristics of responders and non-responders, and characteristics of responder by treatment group (extending Table 1 in main paper)

The four treatment columns refer to those with consent to KS1 data being collected.

                                                    Consent to KS1 data Contact made but    Erythromycin and    Erythromycin only   Co-amoxiclav only   Double placebo
                                                    being collected     no consent          Co-amoxiclav
Number of women                                     1776                1418                434                 467                 429                 446
Maternal age - Median (IQR) years                   27.3 (23.3, 31.5)   25.0 (21.4, 29.0)   27.3 (23.2, 31.3)   27.3 (22.9, 31.7)   27.3 (23.3, 31.3)   27.2 (23.6, 31.6)
Gestational age at trial entry - Median (IQR) days  219 (201, 232)      219 (201, 232)      220 (202, 232)      220 (202, 232)      216 (199, 230)      218 (201, 232)
Multiple births                                     131 (7.4%)          79 (5.6%)           26 (6.0%)           38 (8.1%)           31 (7.2%)           36 (8.1%)
Maternal antibiotics                                178 (10.0%)         165 (11.6%)         42 (9.7%)           50 (10.7%)          36 (8.4%)           50 (11.2%)
Number of children                                  1899                1493                459                 504                 459                 477
Delivery within 48 hrs                              241 (12.7%)         125 (8.4%)          50 (10.9%)          62 (12.3%)          57 (12.4%)          72 (15.1%)
Delivery within 7 days                              339 (17.9%)         189 (12.7%)         69 (15.0%)          93 (18.5%)          82 (17.9%)          95 (19.9%)
Gestational age at delivery - Median (IQR) days     265 (242, 277)      268 (249, 279)      267 (248, 279)      264 (241, 276)      264 (240, 278)      265 (241, 276)
Birthweight - Median (IQR) g                        2920 (2210, 3400)   2940 (2350, 3390)   2980 (2300, 3480)   2896 (2207, 3355)   2920 (2200, 3410)   2892 (2180, 3360)
Males                                               1018 (53.6%)        786 (52.6%)         240 (52.3%)         284 (56.3%)         228 (49.7%)         266 (55.8%)
Admission to neonatal unit                          588 (31.0%)         390 (26.1%)         127 (27.7%)         164 (32.5%)         144 (31.4%)         153 (32.1%)
Ventilated                                          185 (9.7%)          105 (7.0%)          46 (10.0%)          41 (8.1%)           44 (9.6%)           54 (11.3%)
Respiratory distress syndrome                       202 (10.6%)         121 (8.1%)          47 (10.2%)          46 (9.1%)           48 (10.5%)          61 (12.8%)
Oxygen at 28 days                                   91 (4.8%)           57 (3.8%)           21 (4.6%)           18 (3.6%)           20 (4.4%)           32 (6.7%)
Positive blood culture                              40 (2.1%)           31 (2.1%)           10 (2.2%)           8 (1.6%)            8 (1.7%)            14 (2.9%)
Necrotising enterocolitis                           19 (1.0%)           15 (1.0%)           6 (1.3%)            3 (0.6%)            7 (1.5%)            3 (0.6%)
Abnormal cerebral ultrasonography                   28 (1.5%)           20 (1.3%)           5 (1.1%)            9 (1.8%)            5 (1.1%)            9 (1.9%)
Social deprivation - lowest quartile
  Income                                            910 (47.9%)         898 (60.1%)         202 (44.0%)         260 (51.6%)         226 (49.2%)         222 (46.5%)
  Education                                         854 (45.0%)         846 (56.7%)         190 (41.4%)         243 (48.2%)         196 (42.7%)         225 (47.2%)
  Child Poverty                                     884 (46.6%)         877 (58.7%)         195 (42.5%)         254 (50.4%)         220 (47.9%)         215 (45.1%)
Ethnicity: White, n/N (%)                           1449/1546 (93.7%)   1311/1693 (77.4%)   349/376 (92.8%)     398/419 (95.0%)     350/372 (94.1%)     352/379 (92.9%)


Table S1 Highest equivalent (HEL) score derived from raw score data, compared with level returned from teacher's assessment, for Mathematics, 2004 onwards.

                    Teacher assessed level
HEL                 Below Level 1     Level 1           Level 2C          Level 2B          Level 2A          Level 3 or above  Missing
Under Level 1       4                 8                 4                 1                 0                 0                 0
Level 1             1                 25                11                0                 0                 0                 0
Level 2C            0                 12                225               24                1                 0                 0
Level 2B            0                 1                 34                294               29                0                 0
Level 2A            0                 1                 9                 60                351               13                1
Level 3 or above    0                 0                 0                 0                 41                292               0
Missing             0                 0                 0                 0                 8                 2                 0

Data from earlier years are excluded because the format of tests changed.


Table S2 Numbers (percentages) failing to achieve level 2 or higher in KS1 level data for Mathematics anchored using PIPS score reference data (>12; <12) from 104,750 children. Mantel-Haenszel odds ratios (95% confidence intervals).

Anonymous data from DfE
                              Erythromycin       No Erythromycin    Co-amoxiclav       No Co-amoxiclav
N                             1641               1598               1608               1631
Failing to achieve level 2    279 (17.0%)        263 (16.5%)        268 (16.7%)        274 (16.8%)
MH OR (95% CI)                1.04 (0.86, 1.25)                     1.00 (0.83, 1.20)

School/parental data
                              Erythromycin       No Erythromycin    Co-amoxiclav       No Co-amoxiclav
N                             963                936                918                981
Failing to achieve level 2    121 (12.6%)        110 (11.8%)        105 (11.4%)        126 (12.8%)
MH OR (95% CI)                1.08 (0.82, 1.43)                     0.88 (0.67, 1.16)


Different methods of analysis

1) Dichotomising at level 2

The table below shows the results of using Mantel-Haenszel methods stratified by test year, dichotomising into scoring level 2 and above versus failing to achieve level 2. Cells give the number (percentage) below level 2.

                  Parental data                                                       DfE data
                  Erythromycin     No Erythromycin  Co-amoxiclav     No Co-amoxiclav  Erythromycin     No Erythromycin  Co-amoxiclav     No Co-amoxiclav
N                 963              936              918              981              1641             1598             1608             1631
Reading           165 (17.1%)      153 (16.3%)      142 (15.5%)      176 (17.9%)      377 (23.0%)      367 (23.0%)      366 (22.8%)      378 (23.2%)
Writing           188 (19.5%)      178 (19.0%)      168 (18.3%)      198 (20.2%)      413 (25.2%)      413 (25.8%)      395 (24.6%)      431 (26.4%)
Maths             95 (9.9%)        82 (8.8%)        79 (8.6%)        98 (10.0%)       239 (14.6%)      225 (14.1%)      230 (14.3%)      224 (13.7%)

MH OR (95% CI)    Parental data                               DfE data
                  Erythromycin          Co-amoxiclav          Erythromycin          Co-amoxiclav
Reading           1.06 (0.83, 1.35)     0.84 (0.66, 1.06)     1.00 (0.85, 1.18)     0.98 (0.83, 1.15)
Writing           1.04 (0.82, 1.30)     0.88 (0.70, 1.11)     0.97 (0.83, 1.13)     0.91 (0.77, 1.06)
Maths             1.14 (0.83, 1.55)     0.84 (0.62, 1.15)     1.04 (0.85, 1.26)     1.00 (0.82, 1.21)

The parental and DfE results are broadly similar, with no statistically significant treatment differences. However, variations between treatment groups are slightly more extreme for the parental data.

2) Extended analysis retaining most categories

Reading
                  Parental data                                                       DfE data
                  Erythromycin     No Erythromycin  Co-amoxiclav     No Co-amoxiclav  Erythromycin     No Erythromycin  Co-amoxiclav     No Co-amoxiclav
Under level 1     18 (1.9%)        21 (2.2%)        17 (1.9%)        22 (2.2%)        86 (5.2%)        77 (4.8%)        77 (4.8%)        86 (5.3%)
Level 1           147 (15.3%)      132 (14.1%)      125 (13.6%)      154 (15.7%)      291 (17.7%)      290 (18.1%)      289 (18.0%)      292 (17.9%)
Level 2C          145 (15.1%)      127 (13.6%)      127 (13.8%)      145 (14.8%)      225 (13.7%)      214 (13.4%)      219 (13.6%)      220 (13.5%)
Level 2B          221 (22.9%)      218 (23.3%)      220 (24.0%)      219 (22.3%)      416 (25.4%)      416 (26.0%)      425 (26.4%)      407 (25.0%)
Level 2A          222 (23.1%)      216 (23.1%)      199 (21.7%)      239 (24.4%)      323 (19.7%)      311 (19.5%)      283 (17.6%)      351 (21.5%)
Level 3 or over   204 (21.2%)      220 (23.5%)      225 (24.5%)      199 (20.3%)      287 (17.5%)      283 (17.7%)      304 (18.9%)      266 (16.3%)
Missing           6 (0.6%)         2 (0.2%)         5 (0.5%)         3 (0.3%)         13 (0.8%)        7 (0.4%)         11 (0.7%)        9 (0.6%)

Writing
                  Parental data                                                       DfE data
                  Erythromycin     No Erythromycin  Co-amoxiclav     No Co-amoxiclav  Erythromycin     No Erythromycin  Co-amoxiclav     No Co-amoxiclav
Under level 1     44 (4.6%)        34 (3.6%)        33 (3.6%)        45 (4.6%)        125 (7.6%)       117 (7.3%)       124 (7.7%)       118 (7.2%)
Level 1           144 (15.0%)      144 (15.4%)      135 (14.7%)      153 (15.6%)      288 (17.6%)      296 (18.5%)      271 (16.9%)      313 (19.2%)
Level 2C          244 (25.3%)      203 (21.7%)      194 (21.1%)      253 (25.8%)      393 (23.9%)      336 (21.0%)      360 (22.4%)      369 (22.6%)
Level 2B          248 (25.8%)      256 (27.4%)      258 (28.1%)      246 (25.1%)      451 (27.5%)      490 (30.7%)      481 (29.9%)      460 (28.2%)
Level 2A          170 (17.7%)      181 (19.3%)      178 (19.4%)      173 (17.6%)      237 (14.4%)      210 (13.1%)      215 (13.4%)      232 (14.2%)
Level 3 or over   111 (11.5%)      117 (12.5%)      119 (13.0%)      109 (11.1%)      144 (8.8%)       148 (9.3%)       154 (9.6%)       138 (8.5%)
Missing           2 (0.2%)         1 (0.1%)         1 (0.1%)         2 (0.2%)         3 (0.2%)         1 (0.1%)         3 (0.2%)         1 (0.1%)

Maths
                  Parental data                                                       DfE data
                  Erythromycin     No Erythromycin  Co-amoxiclav     No Co-amoxiclav  Erythromycin     No Erythromycin  Co-amoxiclav     No Co-amoxiclav
Under level 1     14 (1.5%)        12 (1.3%)        11 (1.2%)        15 (1.5%)        53 (3.2%)        53 (3.3%)        50 (3.1%)        56 (3.4%)
Level 1           81 (8.4%)        70 (7.5%)        68 (7.4%)        83 (8.5%)        186 (11.3%)      172 (10.8%)      180 (11.2%)      178 (10.9%)
Level 2C          174 (18.1%)      183 (19.6%)      168 (18.3%)      189 (19.3%)      309 (18.8%)      300 (18.8%)      297 (18.5%)      312 (19.1%)
Level 2B          226 (23.5%)      231 (24.7%)      227 (24.7%)      230 (23.4%)      476 (29.0%)      488 (30.5%)      485 (30.2%)      479 (29.4%)
Level 2A          273 (28.3%)      255 (27.2%)      251 (27.3%)      277 (28.2%)      337 (20.5%)      340 (21.3%)      328 (20.4%)      349 (21.4%)
Level 3 or over   193 (20.0%)      184 (19.7%)      192 (20.9%)      185 (18.9%)      278 (16.9%)      243 (15.2%)      266 (16.5%)      255 (15.6%)
Missing           2 (0.2%)         1 (0.1%)         1 (0.1%)         2 (0.2%)         2 (0.1%)         2 (0.1%)         2 (0.1%)         2 (0.1%)

The DfE data have a higher proportion of lower grades than the parental data, reflecting that parents of lower achieving children are less likely to give consent to collect their child's results. There are no major treatment differences, although test year and paper sat have not yet been taken into account. Any minor differences in one dataset (either parental or DfE) are generally not replicated in the other.

3) Ordinal logistic regression

Ordinal logistic regression for the level achieved (6 groups) with explanatory variables indicating allocation to Erythromycin and/or Co-amoxiclav, and also school year:

Parental data - OR (95% CI)
          Models with no interactions                 Model with interaction
Subject   Erythromycin          Co-amoxiclav          Erythromycin          Co-amoxiclav          Erythromycin*Co-amoxiclav
Reading   1.10 (0.94, 1.29)     0.86 (0.73, 1.01)     1.04 (0.83, 1.30)     0.82 (0.65, 1.03)     1.01 (0.81, 1.53)
Writing   1.13 (0.96, 1.33)     0.82 (0.70, 0.96)     1.08 (0.86, 1.35)     0.79 (0.63, 0.99)     1.09 (0.79, 1.50)
Maths     0.98 (0.83, 1.15)     0.91 (0.77, 1.07)     0.97 (0.78, 1.21)     0.90 (0.72, 1.13)     1.01 (0.73, 1.39)

DfE data - OR (95% CI)
          Models with no interactions                 Model with interaction
Subject   Erythromycin          Co-amoxiclav          Erythromycin          Co-amoxiclav          Erythromycin*Co-amoxiclav
Reading   1.01 (0.89, 1.14)     0.98 (0.87, 1.11)     0.89 (0.75, 1.05)     0.86 (0.72, 1.02)     1.30 (1.02, 1.66)
Writing   1.02 (0.91, 1.16)     0.94 (0.83, 1.06)     0.89 (0.75, 1.06)     0.81 (0.68, 0.96)     1.33 (1.04, 1.70)
Maths     0.98 (0.87, 1.11)     0.98 (0.86, 1.10)     0.88 (0.74, 1.05)     0.88 (0.74, 1.04)     1.24 (0.97, 1.58)

For the parental dataset there is evidence of an improvement in writing score associated with co-amoxiclav. There is some evidence this may also be apparent for reading, although this is not formally significant. The writing result is replicated in the model with interaction terms.

For the DfE dataset there is no evidence of treatment differences in the models with no interactions, and estimates are generally closer to one than for the parental data. However, in the models with interactions there is evidence of interaction effects between erythromycin and co-amoxiclav (formally for reading and writing), and evidence of an improvement in writing associated with co-amoxiclav.

Ordinal logistic regression relies on the proportional odds assumption. This was tested via likelihood ratio (LR) tests, the p-values from which are given below:

Parental data - p-values from LR tests for proportional odds
Subject Erythromycin Co-amoxiclav Reading Writing Maths 0.43 0.12 0.65 0.20 0.09 0.64 Model with interaction 0.15 0.11 0.37

DfE data - p-values from LR tests for proportional odds
Subject Erythromycin Co-amoxiclav Reading Writing Maths 0.001 <0.001 <0.001 <0.001 0.001 <0.001 Model with interaction <0.001 0.005 <0.001

These tests indicate the assumptions are valid for the parental data, but not valid for any subject using the DfE data. One hypothesis could be that the assumptions are not valid due to adjusting for school year, as the models are testing for proportionality across each year and between the antibiotic and no antibiotic (is this correct?). Also see graphs in Appendix A, Section 1. It is also worth noting that there are very low numbers of children tested in 2001 and 2002.
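As an informal complement to the LR tests, crude cumulative odds ratios can be compared across cut points: under proportional odds they should be roughly equal. The sketch below does this for DfE reading, erythromycin versus no erythromycin, using the counts from the extended analysis table in section 2). It ignores school year and is not the test used in this supplement; the function name is illustrative only.

```python
# Counts by KS1 reading level (lowest to highest), DfE data, taken from the
# extended analysis table; children with a missing level are excluded.
LEVELS = ["Under level 1", "Level 1", "Level 2C", "Level 2B", "Level 2A", "Level 3 or over"]
erythromycin = [86, 291, 225, 416, 323, 287]
no_erythromycin = [77, 290, 214, 416, 311, 283]


def cumulative_odds_ratios(treated, control):
    """Crude odds ratio of scoring at or below each level (every cut point except the top)."""
    n_treated, n_control = sum(treated), sum(control)
    ratios = []
    cum_treated = cum_control = 0
    for t, c in zip(treated[:-1], control[:-1]):
        cum_treated += t
        cum_control += c
        odds_treated = cum_treated / (n_treated - cum_treated)
        odds_control = cum_control / (n_control - cum_control)
        ratios.append(odds_treated / odds_control)
    return ratios


ratios = cumulative_odds_ratios(erythromycin, no_erythromycin)
for level, ratio in zip(LEVELS, ratios):
    print(f"at or below {level}: OR = {ratio:.2f}")
```

The second cut point (at or below Level 1) is the "below level 2" dichotomy used in section 1); its crude value sits close to the year-stratified Mantel-Haenszel estimate reported there.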
If we exclude school year from the models, the LR tests for proportional odds give the following p-values:

Parental data - p-values from LR tests for proportional odds excluding school year from the models
Subject Erythromycin Co-amoxiclav Reading Writing Maths 0.88 0.56 0.80 0.34 0.49 0.84 Model with interaction 0.25 0.30 0.37

DfE data - p-values from LR tests for proportional odds excluding school year from the models
Subject Erythromycin Co-amoxiclav Reading Writing Maths 0.97 0.11 0.65 0.03 0.40 0.85 Model with interaction 0.13 0.41 0.34

4) Poisson regression

Poisson regression for the level achieved (scaled 1 to 6) with explanatory variables indicating allocation to Erythromycin and/or Co-amoxiclav, and also school year:

Parental data - RR (95% CI)
          Models with no interactions                 Model with interaction
Subject   Erythromycin          Co-amoxiclav          Erythromycin          Co-amoxiclav          Erythromycin*Co-amoxiclav
Reading   1.03 (0.97, 1.08)     0.96 (0.91, 1.01)     1.00 (0.93, 1.08)     0.94 (0.87, 1.01)     1.05 (0.94, 1.17)
Writing   1.03 (0.98, 1.08)     0.96 (0.91, 1.00)     1.01 (0.95, 1.08)     0.94 (0.88, 1.01)     1.03 (0.93, 1.14)
Maths     1.00 (0.94, 1.05)     0.97 (0.92, 1.03)     0.99 (0.92, 1.06)     0.96 (0.89, 1.04)     1.02 (0.91, 1.14)

DfE data - RR (95% CI)
          Models with no interactions                 Model with interaction
Subject   Erythromycin          Co-amoxiclav          Erythromycin          Co-amoxiclav          Erythromycin*Co-amoxiclav
Reading   1.00 (0.96, 1.04)     0.99 (0.95, 1.03)     0.97 (0.91, 1.02)     0.96 (0.90, 1.01)     1.08 (1.00, 1.17)
Writing   1.00 (0.97, 1.04)     0.99 (0.95, 1.02)     0.97 (0.92, 1.03)     0.95 (0.91, 1.01)     1.07 (0.99, 1.15)
Maths     0.99 (0.96, 1.03)     0.99 (0.95, 1.03)     0.96 (0.91, 1.02)     0.96 (0.91, 1.02)     1.06 (0.98, 1.15)

There are no formally significant treatment differences. However, treatment effects are in the same direction as from ordinal logistic regression. There is evidence of some interaction effects for DfE data (only formally evident for reading), in a similar manner to ordinal logistic regression.
Confidence intervals are much narrower than for ordinal regression, and point estimates are generally more conservative (closer to 1).

For illustrations of residual plots, etc, to assess the assumptions of the models, see Appendix A, Section 2.


Adjusting for covariates

Only the parental data can be used here, because the DfE data are anonymous. Models were fitted including terms indicating treatment allocation and school year, and allowance was made for the following variables:

Baseline factors: Maternal age (years), gestation at randomisation and birth (days), multiple births, maternal antibiotics, delivery within 48 hours and 7 days, birthweight (grams), sex

Social factors: Ethnicity (white/non white), smoking in family, damp/mould problems, family history of asthma, social deprivation scores for income, education and child poverty (on continuous scales with higher scores indicating higher deprivation)

Neonatal outcomes (two models were fitted, allowing for and excluding these variables): Admission to neonatal unit, ventilated, respiratory distress syndrome, oxygenation at 28 days, positive blood culture, necrotising enterocolitis, abnormal ultrasound scan

1) Ordinal logistic regression (reading used first as an example)

Not allowing neonatal outcomes: the 'best' fitting models are given below, as OR (95% CI).

Models with no treatment interactions:
Subject         Treatment             Smoking in family     Sex                   Gestation at randomisation
Erythromycin    1.11 (0.94, 1.30)     1.88 (1.60, 2.23)     1.87 (1.59, 2.21)     1.00 (0.99, 1.00)
Co-amoxiclav    0.91 (0.78, 1.08)     1.88 (1.60, 2.23)     1.86 (1.58, 2.20)     1.00 (0.99, 1.00)

Model with interaction:
Subject                       OR (95% CI)
Erythromycin                  1.08 (0.86, 1.35)
Co-amoxiclav                  0.90 (0.71, 1.13)
Erythromycin*Co-amoxiclav     1.05 (0.76, 1.45)
Smoking in family             1.88 (1.59, 2.22)
Sex                           1.86 (1.58, 2.20)
Gestation at randomisation    1.00 (0.99, 1.00)

Allowing neonatal outcomes: the 'best' fitting models are given below, as OR (95% CI).

Models with no treatment interactions:
Subject         Treatment             Social dep score - education   Sex                   Oxygenation at 28 days
Erythromycin    1.10 (0.94, 1.29)     1.00 (1.00, 1.00)              1.92 (1.63, 2.26)     1.96 (1.35, 2.84)
Co-amoxiclav    0.92 (0.78, 1.08)     2.21 (1.87, 2.62)              1.92 (1.63, 2.25)     1.94 (1.34, 2.82)

Model with interaction:
Subject                         OR (95% CI)
Erythromycin                    1.02 (0.82, 1.28)
Co-amoxiclav                    0.85 (0.68, 1.07)
Erythromycin*Co-amoxiclav       1.16 (0.84, 1.60)
Social dep score - education    1.00 (1.00, 1.00)
Sex                             1.92 (1.63, 2.25)
Oxygenation at 28 days          1.94 (1.33, 2.81)

For all models, conclusions about treatment effects are unchanged after adjustment (unadjusted ORs were Erythromycin 1.10 (0.94, 1.29) and Co-amoxiclav 0.86 (0.73, 1.01)). Adjustment has brought OR estimates for co-amoxiclav closer to one.

Smoking in family, having a high education social deprivation score, being randomised at low gestations and being oxygenated at 28 days are related to poorer KS1 grades (as expected).

N.B. Smoking is missing for 383 (18%) children, so including it makes part of the dataset unusable.

Proportional odds assumption: the table below gives the p-values from likelihood ratio tests for proportional odds:

Not allowing neonatal outcomes Allowing neonatal outcomes Erythromycin Co-amoxiclav 0.60 0.44 0.41 0.27 Model with interaction 0.42 0.19

Results are very similar to unadjusted modelling, with proportional odds assumptions appearing valid. Repeating the analysis without adjusting for academic year yields the following results:

Not allowing neonatal outcomes Allowing neonatal outcomes Erythromycin Co-amoxiclav 0.76 0.61 0.52 0.37 Model with interaction 0.53 0.27

Again assumptions appear valid, with results similar to the unadjusted models.

2) Poisson regression (reading used first as an example)

Not allowing neonatal outcomes: the 'best' fitting models are given below, as RR (95% CI).

Models with no treatment interactions:
Subject         Treatment             Smoking in family     Sex
Erythromycin    1.03 (0.97, 1.09)     1.19 (1.13, 1.26)     1.19 (1.13, 1.26)
Co-amoxiclav    0.97 (0.92, 1.03)     1.19 (1.12, 1.26)     1.19 (1.13, 1.26)

Model with interaction:
Subject                       RR (95% CI)
Erythromycin                  1.02 (0.94, 1.10)
Co-amoxiclav                  0.96 (0.89, 1.04)
Erythromycin*Co-amoxiclav     1.02 (0.92, 1.14)
Smoking in family             1.19 (1.12, 1.26)
Sex                           1.19 (1.12, 1.26)

Allowing neonatal outcomes: the 'best' fitting models are given below, as RR (95% CI).

Models with no treatment interactions:
Subject         Treatment             Social dep score - education   Sex                   Oxygenation at 28 days
Erythromycin    1.03 (0.97, 1.08)     1.00 (1.00, 1.00)              1.20 (1.14, 1.27)     1.20 (1.07, 1.35)
Co-amoxiclav    0.98 (0.92, 1.03)     1.00 (1.00, 1.00)              1.20 (1.14, 1.27)     1.20 (1.07, 1.35)

Model with interaction:
Subject                         RR (95% CI)
Erythromycin                    1.00 (0.93, 1.08)
Co-amoxiclav                    0.95 (0.88, 1.03)
Erythromycin*Co-amoxiclav       1.05 (0.95, 1.17)
Social dep score - education    1.00 (1.00, 1.00)
Sex                             1.20 (1.14, 1.27)
Oxygenation at 28 days          1.20 (1.07, 1.35)

Results are very similar to those for ordinal regression modelling. For all models, conclusions about treatment effects are unchanged after adjustment (unadjusted RRs were Erythromycin 1.03 (0.97, 1.08) and Co-amoxiclav 0.96 (0.91, 1.01)).

Smoking in family, being male, having a high education social deprivation score and being oxygenated at 28 days are related to poorer KS1 grades (as expected).

N.B. Smoking is missing for 383 (18%) children, so including it makes part of the dataset unusable.

For tests of model assumptions see Appendix A, Section 3.


Mapping categorical to continuous scores

1) Unadjusted models

Categorical scores are mapped to continuous outcomes according to the following: (W, 1, 2C, 2B, 2A, 3) → (3, 9, 13, 15, 17, 21).
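The mapping above can be sketched in a few lines. This is a minimal illustration only: the function names and the example list of levels are hypothetical, and missing levels are simply skipped, which is an assumption rather than something stated in the text.

```python
# Point score for each KS1 level, as given in the text:
# (W, 1, 2C, 2B, 2A, 3) -> (3, 9, 13, 15, 17, 21)
LEVEL_TO_POINTS = {"W": 3, "1": 9, "2C": 13, "2B": 15, "2A": 17, "3": 21}


def to_points(level):
    """Return the point score for a KS1 level, or None if missing or unrecognised."""
    if level is None:
        return None
    return LEVEL_TO_POINTS.get(level.strip().upper())


def mean_points(levels):
    """Mean point score over the children with a recognised level (assumption: others are dropped)."""
    scores = [p for p in map(to_points, levels) if p is not None]
    return sum(scores) / len(scores) if scores else None


# Hypothetical levels for six children, one of them missing.
example_levels = ["2B", "2a", "W", None, "3", "1"]
print([to_points(level) for level in example_levels])
print(mean_points(example_levels))
```

Note that the point scale is unevenly spaced (6 points from W to level 1, but 2 points between the level 2 sub-grades), so a one-level difference does not always correspond to the same change in the continuous outcome.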
Linear regression is then used to estimate treatment effects (allowing for test year) with the results displayed below:

Parental data - estimates (95% CI)
          Models with no interactions                 Model with interaction
Subject   Erythromycin          Co-amoxiclav          Erythromycin          Co-amoxiclav          Erythromycin*Co-amoxiclav
Reading   -0.19 (-0.57, 0.19)   0.39 (0.01, 0.77)     0.04 (-0.49, 0.56)    0.62 (0.08, 1.16)     -0.46 (-1.22, 0.30)
Writing   -0.25 (-0.61, 0.12)   0.41 (0.04, 0.78)     0.00 (-0.51, 0.51)    0.66 (0.13, 1.18)     -0.50 (-1.24, 0.24)
Maths     -0.01 (-0.34, 0.32)   0.23 (-0.10, 0.56)    0.13 (-0.33, 0.58)    0.37 (-0.10, 0.84)    -0.27 (-0.93, 0.39)

DfE data - estimates (95% CI)
          Models with no interactions                 Model with interaction
Subject   Erythromycin          Co-amoxiclav          Erythromycin          Co-amoxiclav          Erythromycin*Co-amoxiclav
Reading   -0.04 (-0.36, 0.28)   0.13 (-0.19, 0.45)    0.32 (-0.12, 0.77)    0.50 (0.05, 0.95)     -0.74 (-1.37, -0.10)
Writing   -0.04 (-0.34, 0.26)   0.14 (-0.17, 0.44)    0.33 (-0.10, 0.76)    0.51 (0.08, 0.95)     -0.74 (-1.35, -0.14)
Maths     0.06 (-0.22, 0.34)    0.07 (-0.20, 0.35)    0.38 (-0.01, 0.77)    0.39 (0.00, 0.79)     -0.63 (-1.19, -0.08)

N.B. These estimates will be in the opposite direction to the estimates for ordinal logistic regression and Poisson regression, as the scales for ordinal and Poisson regression are purposely set to estimate degree of disability, not ability. The continuous score scale estimates degree of ability.

There is some evidence of treatment differences. For the parental data there is evidence of improvements in reading and writing scores associated with co-amoxiclav, in the models both with and without interaction terms. For the DfE data this is only apparent in the models with interaction terms, and there is also evidence of interactions between erythromycin and co-amoxiclav for all three subjects.
One of the assumptions of the model is normality of the outcome variables; a histogram of parental reading scores is given below:

[Figure: histogram of parental reading scores; x-axis read_cts, y-axis Density]

The histogram provides evidence that the assumptions of the model are not met, and therefore this method is not advisable.

Further residual plots to assess model assumptions are given in Appendix A, Section 4. These plots provide evidence that other assumptions are also not met.

2) Adjusted models

Adjusting for covariates identifies the same variables as important to the model as the two alternative methods do.

Not allowing neonatal outcomes: the 'best' fitting models are given below, as Coeff (95% CI).

Models with no treatment interactions:
Subject         Treatment              Social dep score - education   Sex                     Gestation at birth
Erythromycin    -0.19 (-0.56, 0.18)    0.00 (0.00, 0.00)              -1.49 (-1.86, -1.12)    0.01 (0.00, 0.01)
Co-amoxiclav    0.25 (-0.12, 0.62)     0.00 (0.00, 0.00)              -1.48 (-1.85, -1.11)    0.01 (0.00, 0.01)

Model with interaction:
Subject                         Coeff (95% CI)
Erythromycin                    0.10 (-0.41, 0.62)
Co-amoxiclav                    0.54 (0.02, 1.07)
Erythromycin*Co-amoxiclav       -0.60 (-1.33, 0.15)
Social dep score - education    0.00 (0.00, 0.00)
Sex                             -1.47 (-1.84, -1.10)
Gestation at birth              0.01 (0.00, 0.01)

Allowing neonatal outcomes: the 'best' fitting models are given below, as Coeff (95% CI).

Models with no treatment interactions:
Subject         Treatment              Social dep score - education   Sex                     Oxygenation at 28 days
Erythromycin    -0.20 (-0.54, 0.17)    0.00 (0.00, 0.00)              -1.49 (-1.86, -1.13)    -1.70 (-2.56, -0.84)
Co-amoxiclav    0.24 (-0.12, 0.61)     0.00 (0.00, 0.00)              -1.48 (-1.85, -1.12)    -1.67 (-2.53, -0.81)

Model with interaction:
Subject                         Coeff (95% CI)
Erythromycin                    0.05 (-0.46, 0.56)
Co-amoxiclav                    0.50 (-0.02, 1.03)
Erythromycin*Co-amoxiclav       -0.51 (-1.25, 0.22)
Social dep score - education    0.00 (0.00, 0.00)
Sex                             -1.48 (-1.85, -1.11)
Oxygenation at 28 days          -1.67 (-2.53, -0.81)

Treatment effects are broadly similar to those from unadjusted models, although the association between co-amoxiclav and reading score is somewhat reduced for most methods of adjustment. Once again estimated effects will be in the opposite direction to those from ordinal or Poisson regression.

For model assumptions see Appendix A, Section 5. The residual plots are much better than for the unadjusted models, although there is still some evidence of grouping of residuals into six groups according to the six groupings of KS1 level.


Use of raw score data

The maths raw score data have been examined. Data are available on 1590 PROM children and are quite complicated, because children could sit different combinations of tests and so the amount of data for each child varies. The tests available were: task ab (pre 2003), task c (pre 2003), test 23 (testing levels 2 and 3 and pre 2003), test 2 (level 2 test, 2003 onwards), test 3 (level 3 test, 2003 onwards). It was decided to exclude the data from pre 2003 (138 PROM children) due to the different nature of the data.

1) Level scores for those with raw score data available

The PROM KS1 maths levels (from teachers) are tabulated below for those with raw score data compared to those without, from 2003 onwards:

                    Raw score      No raw score
N                   1452           338
Below level 1       5 (0%)         19 (6%)
Level 1             47 (3%)        94 (28%)
Level 2C            283 (19%)      51 (15%)
Level 2B            379 (26%)      54 (16%)
Level 2A            430 (30%)      75 (22%)
Level 3 or above    307 (21%)      43 (13%)
Missing             1 (0%)         2 (1%)

Raw scores are therefore less often available for the lower grades, but this could be due to weak children not being entered for the tests and merely being awarded a level via teacher assessment.

2) Descriptive analyses of level 2 test raw scores

The raw scores just from those who sat the level 2 test (regardless of whether they also sat the level 3 test) are examined initially.
The table below gives the distribution of level 2 raw scores by teacher assessed level, and by test sat. Each cell gives N, then median (IQR); the final column gives the overall range.

                    Test 2003             Test 2004             Test 2005             Test 2007             TOTAL                 Range
Under Level 1       1, 4 (., .)           2, 4.5 (3, 6)         1, 4 (., .)           1, 4 (., .)           5, 4 (., .)           (3, 6)
Level 1             3, 6 (6, 8)           21, 5 (4, 6)          20, 6 (5, 8.5)        3, 6 (6, 9)           47, 6 (5, 8)          (0, 19)
Level 2C            38, 10 (8, 12)        80, 11 (8, 12)        133, 11 (8, 13)       32, 9 (7, 11)         283, 11 (8, 12)       (1, 22)
Level 2B            52, 16 (14.5, 16.5)   94, 16 (14, 18)       188, 16 (15, 18)      45, 16 (15, 18)       379, 16 (15, 18)      (0, 28)
Level 2A            60, 22 (20, 23)       120, 22 (20, 24)      195, 22 (20, 24)      40, 23 (22, 25.5)     415, 22 (20, 24)      (9, 30)
Level 3 or above    50, 26 (24, 28)       60, 26 (23, 27)       57, 25 (22, 28)       27, 27 (25, 28)       194, 26 (24, 28)      (18, 30)

The scores appear to be broadly similar over the tests sat. The next table gives similar distributions but by year of assessment:

                    2003                2004                2005                2006                2007                TOTAL               Range
Under Level 1       1, 4 (., .)         1, 3 (., .)         0                   2, 5 (4, 6)         1, 4 (., .)         5, 4 (., .)         (3, 6)
Level 1             2, 6 (., .)         14, 5 (5, 6)        11, 7 (5, 8)        15, 6 (5, 10)       5, 6 (6, 9)         47, 6 (5, 8)        (0, 19)
Level 2C            33, 10 (8, 11)      56, 11 (8.5, 12)    66, 12 (8, 14)      85, 10 (9, 13)      43, 9 (7, 11)       283, 11 (8, 12)     (1, 22)
Level 2B            46, 16 (14, 16)     56, 16 (13, 17.5)   116, 17 (15, 18)    108, 15 (14, 18)    53, 16 (15, 18)     379, 16 (15, 18)    (0, 28)
Level 2A            53, 21 (20, 23)     89, 22 (20, 24)     107, 22 (20, 24)    105, 22 (20, 24)    61, 23 (21, 25)     415, 22 (20, 24)    (9, 30)
Level 3 or above    48, 26 (24, 28)     40, 26 (24, 27)     30, 24 (22, 27)     43, 25 (23, 28)     33, 26 (25, 28)     194, 26 (24, 28)    (18, 30)

Again scores are broadly similar for each year the tests are sat.
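Summaries of this kind (N, median and IQR per teacher assessed level) can be sketched with the Python standard library. The raw scores below are hypothetical, not study data, and the quartile method is an assumption: statistical packages use slightly different quartile definitions, so values may differ a little from those tabulated. Groups with a single child return no IQR, mirroring the "(., .)" entries above.

```python
import statistics


def median_iqr(scores):
    """Return (median, lower quartile, upper quartile); quartiles are None for fewer than 2 scores."""
    ordered = sorted(scores)
    median = statistics.median(ordered)
    if len(ordered) < 2:
        return median, None, None
    # "inclusive" treats the data as the whole population of interest (an assumption).
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return median, q1, q3


# Hypothetical level 2 raw scores grouped by teacher assessed level.
scores_by_level = {
    "Under Level 1": [4],
    "Level 2C": [8, 10, 11, 12, 13],
    "Level 2B": [14, 15, 16, 18, 18],
}

for level, scores in scores_by_level.items():
    median, q1, q3 = median_iqr(scores)
    print(f"{level}: N={len(scores)}, median={median}, IQR=({q1}, {q3})")
```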
Level from raw score The equivalent level derived from the level 2 raw score is tabulated by the overall teacher assessment awarded: 0 Under Level 1 Level 1 Level 2C Level 2B Level 2A Under Level 1 0 4 1 0 0 0 Level 1 0 8 25 12 1 1 Teacher awarded level Level 2C Level 2B 1 0 4 1 11 0 225 24 34 294 9 60 Level 2A 0 0 0 1 29 385 Level 3 or above 0 0 0 0 2 192 The above table demonstrates agreement between the teacher awarded score and level score for 933/1323 (71%) of children. When scores do disagree it is more common for the teacher to award a level higher than that achieved in the test compared to lower, although at this stage we do not present information on whether a higher test (level 3 test) has also been sat. This will be expanded upon later. SPL cohort 24 3) Modelling level 2 raw score 0 .02 Density .04 .06 The level 2 raw scores are now modelled using normal least squares. Firstly the assumption of normality of the scores is investigated: 0 10 20 30 2 score There is some doubt as to the normality of the scores, mainly due to the ‘tail’ of low scoring pupils. Unadjusted models In the table below are results of fitting models adjusting only for academic year the child sat the test, or the paper sat: Adjusting for Academic year Paper sat SPL cohort Models with no interactions Erythromycin Co-amoxiclav -0.46 (-1.14, 0.23) 0.02 (-0.67, 0.71) -0.45 (-1.14, 0.24) 0.02 (-0.66, 0.71) Erythromycin -0.75 (-1.70, 0.21) -0.74 (-1.70, 0.22) Model with interaction Co-amoxiclav Erythromycin* Co-amoxiclav -0.30 (-1.28, 0.68) 0.60 (-0.78, 1.97) -0.30 (-1.27, 0.68) 0.59 (-0.78, 1.97) 25 Firstly results are very similar regardless of whether the academic year or the paper sat is adjusted for in the model. There are no statistically significant treatment differences and none of the improvements associated with co-amoxiclav observed earlier for other types of modelling are apparent. N.B. 
Again these estimates will be in the opposite direction to the estimates when looking at KS1 levels for ordinal logistic regression and Poisson regression, as the scales for ordinal and Poisson regression are purposely set to estimate degree of disability, not ability. The raw scores estimate degree of ability.

Residual plots to determine model assumptions are given in Appendix A, Section 6. A histogram of the standardised residuals shows a ‘tail’ of negative residuals; on examination this group relates to those scoring poorly (5 out of 30 or below), and therefore the models do not seem to be accurate for low scoring children. The normal probability plot shows distinct groups of residuals, reflecting the fact that the scores are technically ordinal and not continuous.

Adjusted models

The models allowing for academic year have been adjusted for covariates.

Not allowing neonatal outcomes – the ‘best’ fitting models are given below:

Models with no treatment interactions:

Model          Treatment             Social dep – education score   Weight (g)
Erythromycin   -0.50 (-1.18, 0.18)   0.00 (0.00, 0.00)              0.00 (0.00, 0.00)
Co-amoxiclav   0.09 (-0.77, 0.59)    0.00 (0.00, 0.00)              0.00 (0.00, 0.00)

Model with interaction:

Term                           Coeff (95% CI)
Erythromycin                   -0.70 (-1.65, 0.24)
Co-amoxiclav                   -0.31 (-1.28, 0.65)
Erythromycin*Co-amoxiclav      0.42 (-0.94, 1.77)
Social dep – education score   0.00 (0.00, 0.00)
Weight (g)                     0.00 (0.00, 0.00)

Allowing neonatal outcomes – the ‘best’ fitting models are given below:

Models with no treatment interactions:

Model          Treatment             Social dep – education score   Ventilated             Weight (g)
Erythromycin   -0.43 (-1.62, 0.76)   0.00 (0.00, 0.00)              -1.80 (-3.25, -0.35)   0.00 (0.00, 0.00)
Co-amoxiclav   0.16 (-1.03, 1.35)    0.00 (0.00, 0.00)              -1.80 (-3.25, -0.35)   0.00 (0.00, 0.00)

Model with interaction:

Term                           Coeff (95% CI)
Erythromycin                   -1.19 (-2.86, 0.47)
Co-amoxiclav                   -0.63 (-2.30, 1.03)
Erythromycin*Co-amoxiclav      1.59 (-0.80, 3.99)
Social dep – education score   0.00 (0.00, 0.00)
Ventilated                     -1.86 (-3.31, -0.40)
Weight (g)                     0.00 (0.00, 0.00)

Treatment effects are largely unaltered from unadjusted models. Once again estimated effects will be in the opposite direction to when using ordinal or Poisson regression. Being ventilated, low birth weight and having worse social deprivation on the child poverty scale are all associated with poorer KS1 performance.

For model assumptions see Appendix A, Section 7. The residual plots are much better than for the unadjusted models.

4) Extending analysis for other tests sat

Initially the combination of tests sat (level 2 only, level 2 and level 3, level 3 only) has been compared to the teacher assessed maths level:

Teacher assessed level   Level 2 test only   Level 2 and level 3 tests   Level 3 test only
Below level 1            5 (1%)              0                           0
Level 1                  47 (5%)             0                           0
Level 2C                 281 (29%)           2 (1%)                      0
Level 2B                 375 (39%)           4 (1%)                      0
Level 2A                 250 (26%)           165 (46%)                   15 (12%)
>= Level 3               3 (0%)              191 (53%)                   113 (88%)
Total                    961                 362                         128

The above gives evidence that the combination of tests sat is predictive (to some degree) of level achieved. Now the combination of tests sat by treatment:

                   Level 2 test only   Level 2 and level 3 tests   Level 3 test only
Erythromycin       489 (51%)           180 (50%)                   58 (45%)
No Erythromycin    473 (49%)           182 (50%)                   70 (55%)
Co-amoxiclav       461 (48%)           180 (50%)                   62 (48%)
No Co-amoxiclav    501 (52%)           182 (50%)                   66 (52%)
Total              962                 362                         128

The ‘highest equivalent level (HEL)’ from all raw scores has been derived. This is an extension of the equivalent level corresponding to the level 2 test (from part 2 above), and is the highest level from all tests the child sat. So for example if child 1 achieves level 2B in the level 2 test and fails the level 3 test their HEL will be level 2B. If child 2 achieves level 2B in the level 2 test and level 3 in the level 3 test their HEL will be level 3. This is therefore the best predictor of teacher assessed level from the raw score data.
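The HEL rule described above (take the highest level across all tests sat) can be sketched as follows. This is only an illustration and not the authors' code: the function name `highest_equivalent_level`, the level ordering list and the use of `None` for a test that yielded no level are assumptions introduced here.

```python
# Ordered KS1 levels, lowest to highest (labels as used in the tables).
LEVELS = ["Under Level 1", "Level 1", "Level 2C", "Level 2B", "Level 2A",
          "Level 3 or above"]


def highest_equivalent_level(test_levels):
    """Return the highest level achieved across all tests a child sat.

    test_levels: one entry per test sat; None marks a test that yielded
    no level (for example a failed level 3 test). This encoding is an
    assumption made for the sketch.
    Returns None when no test yielded a level (HEL missing).
    """
    achieved = [level for level in test_levels if level is not None]
    if not achieved:
        return None
    # The position in LEVELS defines the ordering of the categories.
    return max(achieved, key=LEVELS.index)


# The two worked examples from the text.
child_1 = highest_equivalent_level(["Level 2B", None])
child_2 = highest_equivalent_level(["Level 2B", "Level 3 or above"])
print(child_1, "|", child_2)
```

Treating the levels as an ordered list keeps the rule independent of how many tests a child sat.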
The HEL is tabulated below against teacher assessed level:

HEL                Teacher assessed level
                   < Level 1   Level 1   Level 2C   Level 2B   Level 2A   >= Level 3   Missing
Under Level 1      4           8         4          1          0          0            0
Level 1            1           25        11         0          0          0            0
Level 2C           0           12        225        24         1          0            0
Level 2B           0           1         34         294        29         0            0
Level 2A           0           1         9          60         351        13           1
Level 3 or above   0           0         0          0          41         292          0
Missing            0           0         0          0          8          2            0

Levels agree for 1191/1452 (82%) of children; HEL levels are higher than teacher assessed for 159 (11%) of children, and teacher assessed levels are higher than HEL for 91 (6%) of children.

Modelling HEL and teacher assessed level adjusting for combination of tests sat

Poisson regression has been used for this. Ordinal logistic regression was also attempted but there were issues with convergence in some models. The models have been fitted twice: once adjusting for academic year and once adjusting for Level 2 test sat. The models were fitted both without and with adjustment for test combination. When adjusting, the three groups outlined above are used, with Level 2 only as the baseline.
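The agreement figures quoted for the HEL table above can be checked directly from the cell counts. The sketch below is illustrative only; the variable names are assumptions, and rows or columns labelled Missing are excluded from the agreement and disagreement counts but kept in the denominator, which is what reproduces the quoted totals.

```python
LEVELS = ["Under Level 1", "Level 1", "Level 2C", "Level 2B", "Level 2A",
          "Level 3 or above"]

# Rows: HEL (six levels, then Missing).
# Columns: teacher assessed level (six levels, then Missing).
hel_counts = [
    [4, 8, 4, 1, 0, 0, 0],
    [1, 25, 11, 0, 0, 0, 0],
    [0, 12, 225, 24, 1, 0, 0],
    [0, 1, 34, 294, 29, 0, 0],
    [0, 1, 9, 60, 351, 13, 1],
    [0, 0, 0, 0, 41, 292, 0],
    [0, 0, 0, 0, 8, 2, 0],
]

n_levels = len(LEVELS)
total = sum(sum(row) for row in hel_counts)
# Diagonal cells: HEL equals the teacher assessed level.
agree = sum(hel_counts[i][i] for i in range(n_levels))
# Below the diagonal: HEL row index exceeds teacher column index.
hel_higher = sum(hel_counts[i][j]
                 for i in range(n_levels) for j in range(n_levels) if i > j)
# Above the diagonal: teacher assessed level exceeds HEL.
teacher_higher = sum(hel_counts[i][j]
                     for i in range(n_levels) for j in range(n_levels) if j > i)

for label, value in [("agree", agree), ("HEL higher", hel_higher),
                     ("teacher higher", teacher_higher)]:
    print(f"{label}: {value}/{total} ({100 * value / total:.0f}%)")
```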
SPL cohort 29 Highest Equivalent Level (HEL) Unadjusted Adjusting for test combination Erythromycin Co-amoxiclav Erythromycin* Co-amoxiclav Erythromycin Co-amoxiclav Erythromycin* Co-amoxiclav Papers 2 and 3 Paper 3 only Adjusting for academic year Erythromycin Co-amoxiclav Model with model model interactions 1.04 (0.97, 1.11) 1.05 (0.96, 1.15) 0.99 (0.93, 1.06) 1.01 (0.92, 1.11) 0.98 (0.86, 1.11) 1.02 (0.96, 1.09) 0.46 (0.41, 0.50) 0.32 (0.27, 0.39) Erythromycin model 1.03 (0.96, 1.10) 1.04 (0.95, 1.14) 1.02 (0.93, 1.12) 0.96 (0.84, 1.09) 1.02 (0.96, 1.10) 1.00 (0.94, 1.07) 0.46 (0.41, 0.50) 0.32 (0.27, 0.39) 0.46 (0.41, 0.50) 0.32 (0.27, 0.39) 0.46 (0.41, 0.50) Dropped due to collinearity Adjusting for test sat Co-amoxiclav Model with model interactions 1.04 (0.95, 1.14) 0.99 (0.93, 1.06) 1.00 (0.91, 1.10) 0.98 (0.86, 1.12) 1.00 (0.94, 1.07) 1.05 (0.95, 1.15) 1.02 (0.93, 1.13) 0.96 (0.84, 1.09) 0.46 (0.41, 0.50) Dropped due to collinearity 0.46 (0.41, 0.50) Dropped due to collinearity Teacher assessed level Unadjusted Adjusting for test combination Erythromycin Co-amoxiclav Erythromycin* Co-amoxiclav Erythromycin Co-amoxiclav Erythromycin* Co-amoxiclav Papers 2 and 3 Paper 3 only Adjusting for academic year Erythromycin Co-amoxiclav Model with model model interactions 1.02 (0.96, 1.09) 1.04 (0.95, 1.14) 0.98 (0.92, 1.05) 1.01 (0.92, 1.10) 0.96 (0.85, 1.09) 1.00 (0.94, 1.07) 0.48 (0.43, 0.52) 0.36 (0.30, 0.42) Erythromycin model 1.02 (0.95, 1.09) Adjusting for test sat Co-amoxiclav model 0.98 (0.92, 1.05) 1.03 (0.94, 1.13) 1.02 (0.93, 1.11) 0.96 (0.84, 1.09) 1.01 (0.95, 1.08) 0.99 (0.93, 1.06) 0.48 (0.43, 0.52) 0.36 (0.30, 0.42) 0.48 (0.43, 0.52) 0.36 (0.30, 0.42) 0.48 (0.43, 0.52) Dropped due to collinearity Model with interactions 1.03 (0.94, 1.13) 1.00 (0.91, 1.10) 0.96 (0.85, 1.10) 0.99 (0.93, 1.06) 1.04 (0.95, 1.14) 1.02 (0.93, 1.12) 0.95 (0.83, 1.08) 0.48 (0.43, 0.52) Dropped due to collinearity 0.48 (0.43, 0.52) Dropped due to collinearity Therefore there 
are no treatment differences evident when adjusting for tests sat. Sitting the Level 2 and 3 tests increases the level achieved, and sitting the Level 3 test increases the level achieved further, compared to sitting only the Level 2 test.

Combined raw test score

The next step is to devise a combined raw test score (level 2 and level 3 tests); this would extend the modelling of level 2 raw score data implemented in section 3).

Standardisation/anchoring using PIPS data

To begin with the Maths data has been used; data is available from 2001 to 2007 on 104,750 children. The data consists of a PIPS score and KS1 level, along with the year the child sat the test. Data is not available on which test the child sat.

1) Exploratory analyses on PIPS data

Mean and 95% CI for Maths PIPS score each year, and overall

Year      N         Mean (95% CI)
2001      21,078    20.79 (20.68, 20.89)
2002      17,152    20.80 (20.69, 20.91)
2003      15,547    20.62 (20.50, 20.73)
2004      11,190    20.73 (20.59, 20.87)
2005      10,787    20.83 (20.69, 20.97)
2006      13,827    20.74 (20.62, 20.86)
2007      15,169    20.56 (20.44, 20.67)
Overall   104,750   20.72 (20.68, 20.77)

Variations year on year are very minor. Furthermore there is no evidence of an increasing or decreasing trend in scores over time, which the graph below illustrates more clearly:

[Figure: mean Maths PIPS score with 95% CI by school year, 2001 to 2007; y-axis: PIPS score (20.4 to 21)]

The horizontal red lines represent the mean and 95% CI for the overall scores. The only year for which 95% CIs don’t overlap with the Overall CI is 2007. The table and graph suggest that PIPS scores are fairly constant over time, suggesting that standards have not changed.

Histogram of overall Maths PIPS score (histograms by year are available in Appendix A – section 8)

[Figure: histogram of overall Maths PIPS score; x-axis: mathsPIPS (0 to 40), y-axis: density]

The data appears to broadly follow a normal distribution, although the tail for the lower scores is noticeably larger than the tail for the upper scores.
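As a consistency check on the yearly table in this section, the overall mean can be recovered as the N-weighted average of the yearly means, and the statement about overlapping confidence intervals can be verified from the tabulated limits. The sketch below uses only the published (rounded) figures; the variable and function names are assumptions for illustration.

```python
# Year: (N, mean, (lower, upper)) for Maths PIPS score, from the yearly table.
pips_by_year = {
    2001: (21078, 20.79, (20.68, 20.89)),
    2002: (17152, 20.80, (20.69, 20.91)),
    2003: (15547, 20.62, (20.50, 20.73)),
    2004: (11190, 20.73, (20.59, 20.87)),
    2005: (10787, 20.83, (20.69, 20.97)),
    2006: (13827, 20.74, (20.62, 20.86)),
    2007: (15169, 20.56, (20.44, 20.67)),
}
overall_ci = (20.68, 20.77)

total_n = sum(n for n, _, _ in pips_by_year.values())
# N-weighted average of the yearly means.
pooled_mean = sum(n * mean for n, mean, _ in pips_by_year.values()) / total_n


def overlaps(a, b):
    """True when closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


non_overlapping = [year for year, (_, _, ci) in pips_by_year.items()
                   if not overlaps(ci, overall_ci)]

print(total_n, round(pooled_mean, 2), non_overlapping)
```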
2) Analyses of the relationship between PIPS score and KS1 level

The relationship between PIPS score and KS1 level is examined, both overall and by year. This is to: 1) assess the appropriateness of the use of PIPS data with KS1 levels and 2) look for evidence of changes over the years in KS1 test standards.

Box plot of PIPS score by KS1 level (box plots by year are available in Appendix A – section 8)

[Figure: box plot of PIPS score (0 to 40) by KS1 level: Below level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3+]

There is a trend of increasing PIPS score with increasing KS1 level, although there is a moderate amount of overlap between the levels.

Mean and 95% CI for PIPS score, by KS1 level and school year

For this the data has been standardised to enable easier identification of trends. The data has been standardised relative to the 2001 data, so that the 2001 data has mean 50 and standard deviation 10. The mean (95% CI) standardised PIPS scores by KS1 level and school year are given below:

KS1 level       Year      N        Mean    (95% CI)
Below level 1   2001      403      31.70   (31.11, 32.28)
                2002      215      30.63   (29.87, 31.38)
                2003      246      31.49   (30.79, 32.19)
                2004      171      30.64   (29.82, 31.47)
                2005      117      29.73   (28.94, 30.52)
                2006      246      30.35   (29.63, 31.07)
                2007      264      30.43   (29.84, 31.02)
                Overall   1662     30.88   (30.61, 31.15)
Level 1         2001      1305     35.40   (35.04, 35.75)
                2002      1183     35.74   (35.38, 36.09)
                2003      939      34.96   (34.59, 35.32)
                2004      618      34.30   (33.84, 34.76)
                2005      642      34.98   (34.54, 35.42)
                2006      821      35.05   (34.63, 35.48)
                2007      926      34.96   (34.58, 35.33)
                Overall   6434     35.14   (34.99, 35.29)
Level 2C        2001      3431     41.59   (41.37, 41.81)
                2002      2527     41.54   (41.30, 41.79)
                2003      2555     41.33   (41.08, 41.58)
                2004      1635     40.50   (40.19, 40.81)
                2005      1772     40.40   (40.11, 40.69)
                2006      2135     40.94   (40.67, 41.21)
                2007      2161     40.56   (40.29, 40.83)
                Overall   16,216   41.08   (40.98, 41.18)
Level 2B        2001      5203     47.85   (47.68, 48.03)
                2002      3342     47.17   (46.95, 47.38)
                2003      3075     47.18   (46.96, 47.40)
                2004      2378     46.68   (46.42, 46.93)
                2005      2217     47.22   (46.96, 47.49)
                2006      3065     47.27   (47.05, 47.49)
                2007      3315     46.86   (46.64, 47.07)
                Overall   22,595   47.25   (47.17, 47.33)
Level 2A        2001      4602     53.31   (53.14, 53.48)
                2002      4139     52.30   (52.12, 52.49)
                2003      3657     52.40   (52.21, 52.59)
                2004      2720     52.40   (52.18, 52.62)
                2005      3020     52.96   (52.75, 53.17)
                2006      3993     53.54   (53.36, 53.72)
                2007      4053     52.85   (52.67, 53.03)
                Overall   26,184   52.85   (52.78, 52.92)
Level 3+        2001      5579     59.63   (59.48, 59.79)
                2002      5129     59.04   (58.88, 59.20)
                2003      4471     58.98   (58.81, 59.15)
                2004      3397     58.96   (58.76, 59.16)
                2005      2817     59.93   (59.74, 60.13)
                2006      3172     59.84   (59.65, 60.03)
                2007      3736     59.47   (59.30, 59.64)
                Overall   28,301   59.37   (59.31, 59.44)
Missing         2001      555      45.47   (44.60, 46.33)
                2002      617      43.98   (43.12, 44.84)
                2003      604      45.15   (44.25, 46.06)
                2004      271      44.91   (43.54, 46.28)
                2005      202      44.76   (43.29, 46.23)
                2006      395      46.68   (45.61, 47.74)
                2007      714      47.64   (46.88, 48.40)
                Overall   3358     45.65   (45.29, 46.02)

For all years there are strong distinctions between the mean (95% CI) PIPS scores for each KS1 level. There are some differences between years in mean PIPS scores for each level. These are represented graphically in Appendix A – section 8. These plots do not demonstrate any trends in levels over time; there are some variations, but these appear to be random as they are not supported by all levels, or by all years.

The correlation coefficient for PIPS score and KS1 level is 0.79, indicating a relatively strong correlation between the two measures. If a regression model is fitted with PIPS score as the outcome and KS1 level as the explanatory variable the adjusted R² value is 0.63, and the coefficient estimate for KS1 level is 4.44 (4.42, 4.47). Adding in school year to the regression model does not alter the value of R². All of this provides evidence that the PIPS scores are closely related to KS1 levels, and that overall standards have not changed over time as PIPS scores are relatively stable over time.

3) Anchoring KS1 level data

The KS1 level scores for the students for whom we have PIPS scores have been dichotomised at level 2 and above, and below level 2.
These have been tabulated against PIPS scores dichotomised at above 12 and 12 and below for each year:

Year   PIPS score   >= Level 2        < Level 2        Total
2001   >12          17,069 (97.82%)   380 (2.18%)      17,449
       <=12         1,746 (56.80%)    1,328 (43.20%)   3,074
2002   >12          13,838 (97.51%)   354 (2.49%)      14,192
       <=12         1,299 (55.44%)    1,044 (44.56%)   2,343
2003   >12          12,452 (98.10%)   241 (1.90%)      12,693
       <=12         1,306 (58.04%)    944 (41.96%)     2,250
2004   >12          9,141 (98.45%)    144 (1.55%)      9,285
       <=12         989 (60.53%)      645 (39.47%)     1,634
2005   >12          8,819 (98.37%)    146 (1.63%)      8,965
       <=12         1,007 (62.16%)    613 (37.84%)     1,620
2006   >12          11,224 (98.14%)   213 (1.86%)      11,437
       <=12         1,141 (57.19%)    854 (42.81%)     1,995
2007   >12          11,965 (98.22%)   217 (1.78%)      12,182
       <=12         1,300 (57.19%)    973 (42.81%)     2,273

If the tests were identical over time we would expect identical percentages for each year in the table above. For percentages for 2002-2007 to be identical to those from 2001, KS1 levels will need ‘reassigning’ as indicated in the table below:

Year   Movement
2002   -1.67%   <level 2 moved to >=level 2
2003   1.52%    >=level 2 moved to <level 2
2004   4.36%    >=level 2 moved to <level 2
2005   5.91%    >=level 2 moved to <level 2
2006   0.71%    >=level 2 moved to <level 2
2007   0.79%    >=level 2 moved to <level 2

We have applied this to the ORACLE KS1 data to anchor the data according to the PIPS data. However it would be most logical when reassigning from >=level 2 to <level 2 to reassign those who scored >=level 2 with the lowest score, and vice versa when reassigning in the opposite direction. We do not know this information without reverting to raw score data. Therefore the only solution is to reassign equally from each treatment group.
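The movement percentages listed above are arithmetically consistent with the yearly PIPS cross-tabulation in the following way: take the percentage below level 2 in each PIPS stratum (rounded to two decimals, as tabulated), subtract each year's value from the 2001 value, and sum the two differences. This is an observation about the published numbers, offered as a sketch; it is not a statement of how the authors carried out the calculation, and the names used are assumptions.

```python
# Year: (>=L2 with PIPS>12, <L2 with PIPS>12, >=L2 with PIPS<=12, <L2 with PIPS<=12)
pips_counts = {
    2001: (17069, 380, 1746, 1328),
    2002: (13838, 354, 1299, 1044),
    2003: (12452, 241, 1306, 944),
    2004: (9141, 144, 989, 645),
    2005: (8819, 146, 1007, 613),
    2006: (11224, 213, 1141, 854),
    2007: (11965, 217, 1300, 973),
}


def pct_below(below, at_or_above):
    """Percentage below level 2 within one PIPS stratum, to 2 decimals."""
    return round(100 * below / (below + at_or_above), 2)


def strata_pcts(year):
    ge_hi, lt_hi, ge_lo, lt_lo = pips_counts[year]
    return pct_below(lt_hi, ge_hi), pct_below(lt_lo, ge_lo)


ref_hi, ref_lo = strata_pcts(2001)

# Positive: percentage to move from >=level 2 to <level 2.
# Negative: percentage to move from <level 2 to >=level 2.
movement = {}
for year in sorted(pips_counts):
    if year == 2001:
        continue  # 2001 is the reference year
    hi, lo = strata_pcts(year)
    movement[year] = round((ref_hi - hi) + (ref_lo - lo), 2)

for year, value in movement.items():
    print(year, f"{value:+.2f}%")
```

Using the unrounded percentages instead gives 4.35% rather than 4.36% for 2004, which is why the sketch rounds each stratum percentage first.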
This reassignment has been done; the tables below describe how many children have been moved in each group for both parental and DfE data. Cells give total children / number of children to move; 2001 has totals only.

Parental data

Year   Percentage to move   Erythromycin &   Erythromycin   Co-amoxiclav   Double
       and direction        Co-amoxiclav     only           only           placebo
2001                        1                4              2              1
2002   1.67% down           27 / 0           30 / 1         19 / 0         25 / 0
2003   1.52% up             52 / 1           58 / 1         50 / 1         53 / 1
2004   4.36% up             90 / 4           92 / 4         85 / 4         86 / 4
2005   5.91% up             102 / 6          120 / 7        100 / 6        127 / 8
2006   0.71% up             119 / 1          129 / 1        124 / 1        114 / 1
2007   0.79% up             68 / 1           71 / 1         79 / 1         71 / 1

DfE data

Year   Percentage to move   Erythromycin &   Erythromycin   Co-amoxiclav   Double
       and direction        Co-amoxiclav     only           only           placebo
2001                        1                4              2              2
2002   1.67% down           48 / 1           47 / 1         39 / 1         38 / 1
2003   1.52% up             79 / 1           81 / 1         76 / 1         79 / 1
2004   4.36% up             131 / 6          125 / 5        126 / 5        135 / 6
2005   5.91% up             195 / 12         191 / 11       167 / 10       197 / 12
2006   0.71% up             224 / 2          224 / 2        211 / 1        223 / 2
2007   0.79% up             139 / 1          152 / 1        170 / 1        133 / 1

The data have now been reanalysed using Mantel-Haenszel methods as done in part 1. Results are below:

Maths
Data source     Group             N      Below level 2   MH OR (95% CI)
Parental data   Erythromycin      963    121 (12.6%)     1.08 (0.82, 1.43)
                No Erythromycin   936    110 (11.8%)
                Co-amoxiclav      918    105 (11.4%)     0.88 (0.67, 1.16)
                No Co-amoxiclav   981    126 (12.8%)
DfE data        Erythromycin      1641   279 (17.0%)     1.04 (0.86, 1.25)
                No Erythromycin   1598   263 (16.5%)
                Co-amoxiclav      1608   268 (16.7%)     1.00 (0.83, 1.20)
                No Co-amoxiclav   1631   274 (16.8%)

ORs are similar to those obtained earlier from MH methods (page 4). If anything estimates from the anchored data are closer to one.

Appendix A

Section 1 – Unadjusted Ordinal Logistic Regression, Proportional Odds Assumptions

The graphs overleaf illustrate the assumptions for the parental data for reading level associated with Erythromycin, and maths level associated with Co-amoxiclav:

[Figure: Reading, Erythromycin. Percentage (0 to 100) at each level (Under level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3 or above) for Eryth and No Eryth, by year and in total. N (Eryth / No Eryth): 2001 5 / 3; 2002 57 / 44; 2003 105 / 102; 2004 182 / 171; 2005 222 / 227; 2006 247 / 237; 2007 139 / 150; Total 957 / 934]

The following graphs are from the DfE data – writing for both Erythromycin and Co-amoxiclav

[Figure: Writing, Co-amoxiclav. Percentage (0 to 100) at each level (Under level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3 or above) for Co-amox and No Co-amox, by year 2001 to 2007 and in total]

Section 2 – Unadjusted Poisson Regression Assumptions

The following plots assess the assumptions and viability of the parental data reading with erythromycin model:

[Figure: Pearson’s residuals against fitted values (x-axis: predicted mean readscale)]
[Figure: Pearson’s residuals against linear predictor]
[Figure: Standardized Pearson’s residuals against id]

Section 3 - Adjusted Poisson Regression

Again plots are for the parental dataset reading with erythromycin model:

[Figure: Pearson’s residuals against fitted values (x-axis: predicted mean readscale)]
[Figure: Pearson’s residuals against linear predictor]
[Figure: Standardized Pearson’s residuals against id]
[Figure: Leverage against id (y-axis: hat diagonal)]

Section 4 – Mapping categories to continuous scores

Residual plots using the parental dataset and the reading with erythromycin model:

[Figure: Histogram of standardised residuals]
[Figure: Normal probability plot of standardised residuals]
[Figure: Plot of standardised residuals against fitted values]
[Figure: Plot of standardised residuals against id]

Section 5 – Adjusted mapping categorical to continuous models

Residual plots using the reading with erythromycin model, shown separately for models not allowing neonatal outcomes and models allowing neonatal outcomes:

[Figure: Histogram of residuals (not allowing / allowing neonatal outcomes)]
[Figure: Normal probability plot of standardised residuals (not allowing / allowing neonatal outcomes)]
[Figure: Plot of standardised residuals against fitted values (not allowing / allowing neonatal outcomes)]
[Figure: Plot of standardised residuals against gestation at birth (x-axis: gest_at_birth)]

Section 6 – Unadjusted raw score modelling

Residual plots using the maths raw score adjusting for paper sat with erythromycin model:

[Figure: Histogram of residuals]
[Figure: Normal probability plot of standardised residuals]
[Figure: Plot of standardised residuals against fitted values]
[Figure: Plot of standardised residuals against id]

Section 7 – Adjusted raw score modelling

Residual plots using the maths raw score with erythromycin model, allowing for neonatal outcomes and adjusting for academic year:

[Figure: Histogram of standardised residuals]
[Figure: Normal probability plot of standardised residuals]
[Figure: Plot of standardised residuals against fitted values]
[Figure: Plot of standardised residuals against id]

Section 8 – PIPS scores

Histograms of PIPS scores by academic year

[Figure: histograms of mathsPIPS (0 to 40) by school year, one panel per year 2001 to 2007; y-axis: density]

Boxplots of PIPS score for KS1 level, by school year

[Figure: box plots of PIPS score (0 to 40) by KS1 level, one panel per year 2001 to 2007]

(KS1 labels have been omitted for space – but all boxes are in the order Below level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3+)

[Figure: PIPS score by school year, 2001 to 2007, one panel per KS1 level: Below level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3+]

For PROM cohort (not analysed in main paper)

Tables 1-5 for PROM cohort (followed up from ORACLE I trial) corresponding to those for SPL cohort in main paper
Characteristics of responders and non-responders, and characteristics of responder by treatment group (extending Table 1)
Additional more detailed data and analyses (including example Stata commands)
Different methods of analysis
  1) Dichotomising at level 2
  2) Extended analysis retaining most categories
  3) Ordinal logistic regression
  4) Poisson regression
Adjusting for covariates
  1) Ordinal logistic regression
  2) Poisson regression
Mapping categorical scores to continuous scores
  1) Unadjusted models
  2) Adjusted models
Use of raw score data
  1) Level scores for those with raw score data available
  2) Descriptive analyses of level 2 test raw scores
  3) Modelling level 2 raw score data
  4) Extending analyses for other tests sat
Standardisation/anchoring using PIPS data
  1) Exploratory analyses on PIPS data
  2) Analysis of the relationship between PIPS scores and KS1 levels
  3) Anchoring KS1 level data

(Table 1) Characteristics of groups consenting/not to collection of KS1 data from the child’s school. No contact was made with parents/carers to seek consent in 5 cases.
                                                   Consent to KS1 data   Contact made but
                                                   being collected       no consent
Number of women                                    2025 (100%)           1170 (100%)
Maternal age - Median (IQR) years                  28.9 (24.6, 32.8)     25.6 (21.0, 30.3)
Gestation age at trial entry – Median (IQR) days   226 (207, 238)        224 (204, 237)
Multiple births                                    128 (6.3%)            61 (5.2%)
Number of children                                 2149 (100%)           1223 (100%)
Delivery within 48hrs                              777 (36.2%)           403 (33.0%)
Delivery within 7 days                             1310 (61.0%)          687 (56.2%)
Gestational age at delivery – Median (IQR) days    237 (222, 248)        239 (222, 253)
Birthweight - Median (IQR) g                       2100 (1660, 2550)     2145 (1642, 2680)
Males                                              1155 (53.7%)          672 (54.9%)
Admission to Neonatal unit                         1582 (73.6%)          831 (67.9%)
Ventilated                                         413 (19.2%)           221 (18.1%)
Respiratory Distress Syndrome                      428 (19.9%)           210 (17.2%)
Oxygen at 28 days                                  180 (8.4%)            98 (8.0%)
Positive blood culture                             128 (6.0%)            46 (3.8%)
Necrotising enterocolitis (suspected or proven)    49 (2.3%)             21 (1.7%)
Abnormal cerebral ultrasonography                  55 (2.6%)             41 (3.4%)
Social deprivation: number (%) in lowest quartile for:
  Income                                           897 (41.7%)           754 (61.7%)
  Education                                        861 (40.1%)           722 (59.0%)
  Child Poverty                                    893 (41.6%)           738 (60.3%)
Ethnicity                                          1764                  1474
  White                                            1655 (93.8%)          1108 (75.2%)

(Table 2) Educational attainment in reading, writing and mathematics at Key Stage 1, for children in England only: data from i) DFE, and ii) schools, with parental consent and direct from parents. Numbers (percentages) are those failing to achieve level 2 or higher.
Mantel-Haenszel Odds ratios (95% confidence intervals)

Anonymized data from DFE

Subject   Group             N      Below level 2: n, % (95% CI)   MH OR (95% CI)
Reading   Erythromycin      1596   360  22.6% (20.5%, 24.7%)      1.03 (0.87, 1.21)
          No Erythromycin   1642   363  22.1% (20.1%, 24.2%)
          Co-amoxiclav      1623   354  21.8% (19.8%, 23.9%)      0.94 (0.79, 1.11)
          No Co-amoxiclav   1615   369  22.8% (20.8%, 25.0%)
Writing   Erythromycin      1596   418  26.2% (24.0%, 28.4%)      1.01 (0.86, 1.18)
          No Erythromycin   1642   426  25.9% (23.8%, 28.1%)
          Co-amoxiclav      1623   405  25.0% (22.9%, 27.1%)      0.88 (0.75, 1.03)
          No Co-amoxiclav   1615   439  27.2% (25.0%, 29.4%)
Maths     Erythromycin      1596   257  16.1% (14.3%, 18.0%)      1.03 (0.85, 1.25)
          No Erythromycin   1642   257  15.7% (13.9%, 17.5%)
          Co-amoxiclav      1623   250  15.4% (13.7%, 17.3%)      0.92 (0.76, 1.11)
          No Co-amoxiclav   1615   264  16.3% (14.6%, 18.2%)

Data from schools/parents

Subject   Group             N      Below level 2: n, % (95% CI)   MH OR (95% CI)
Reading   Erythromycin      1030   167  16.2% (14.0%, 18.6%)      0.95 (0.76, 1.20)
          No Erythromycin   1119   189  16.9% (14.7%, 19.2%)
          Co-amoxiclav      1100   172  15.6% (13.5%, 17.9%)      0.87 (0.69, 1.09)
          No Co-amoxiclav   1049   184  17.5% (15.3%, 20.0%)
Writing   Erythromycin      1030   206  20.0% (17.6%, 22.6%)      0.94 (0.76, 1.17)
          No Erythromycin   1119   233  20.8% (18.5%, 23.3%)
          Co-amoxiclav      1100   213  19.4% (17.1%, 21.8%)      0.87 (0.70, 1.07)
          No Co-amoxiclav   1049   226  21.5% (19.1%, 24.2%)
Maths     Erythromycin      1030   115  11.2% (9.3%, 13.2%)       1.12 (0.85, 1.47)
          No Erythromycin   1119   113  10.1% (8.4%, 12.0%)
          Co-amoxiclav      1100   114  10.4% (8.6%, 12.3%)       0.94 (0.71, 1.24)
          No Co-amoxiclav   1049   114  10.9% (9.0%, 12.9%)

(Table 3) Educational attainment in reading, writing and mathematics at Key Stage 1, for England only, for children whose mothers had PROM: data from i) DFE, and ii) schools, with parental consent and direct from parents. Overall Relative Risks (RR) and 95% Confidence Intervals are from Poisson models for level achieved (scaled 1-6) adjusting for test year, 2002-7.

Subject   DFE data                                Parental/school data
          Erythromycin        Co-amoxiclav        Erythromycin        Co-amoxiclav
Reading   1.03 (0.99, 1.07)   0.98 (0.94, 1.02)   1.01 (0.96, 1.06)   0.99 (0.94, 1.04)
Writing   1.01 (0.97, 1.05)   0.98 (0.94, 1.01)   1.01 (0.97, 1.06)   0.97 (0.93, 1.02)
Maths     1.01 (0.97, 1.06)   0.99 (0.95, 1.03)   1.00 (0.95, 1.06)   1.00 (0.95, 1.05)

There is no evidence of overdispersion when these Poisson models are fitted. Inclusion of a treatment interaction term does not alter estimates.
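To show how the counts in Table 2 relate to an odds ratio, the sketch below computes a crude (unstratified) odds ratio with a Woolf-type 95% confidence interval. This is a simpler calculation than the Mantel-Haenszel estimate reported in the table, which stratifies by test year; the per-year counts needed for that are not given here. The function name `crude_odds_ratio` is an assumption for illustration.

```python
import math


def crude_odds_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Unstratified odds ratio (group A vs group B) with a Woolf 95% CI."""
    a, b = events_a, n_a - events_a   # group A: below level 2, level 2 or higher
    c, d = events_b, n_b - events_b   # group B: below level 2, level 2 or higher
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# DFE reading data from Table 2: erythromycin 360/1596, no erythromycin 363/1642.
or_reading, lo_reading, hi_reading = crude_odds_ratio(360, 1596, 363, 1642)
print(f"{or_reading:.2f} ({lo_reading:.2f}, {hi_reading:.2f})")
```

For these counts the crude estimate lies very close to the stratified value in Table 2, which is what one would expect if test year is roughly balanced across randomised groups.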
(Table S1) Highest equivalent (HEL) score derived from raw score data, compared with level returned from teacher’s assessment, for Mathematics, 2004 onwards.

HEL                Teacher-assessed level
                   < Level 1   Level 1   Level 2C   Level 2B   Level 2A   >= Level 3   Missing
Under Level 1      3           14        2          0          0          0            0
Level 1            0           26        9          1          0          0            0
Level 2C           1           17        247        20         1          0            0
Level 2B           0           8         27         332        14         1            0
Level 2A           0           1         7          67         389        15           0
Level 3 or above   0           0         0          3          42         350          1
Missing            0           0         1          0          6          2            0

(Table S2) KS1 level data for Mathematics anchored using PIPS score. Numbers (percentages) failing to achieve level 2 or higher. Mantel-Haenszel Odds ratios (95% confidence intervals)

Data source            Group             N      Below level 2   MH OR (95% CI)
DFE data               Erythromycin      1596   296 (18.5%)     1.03 (0.86, 1.24)
                       No Erythromycin   1642   296 (18.0%)
                       Co-amoxiclav      1623   289 (17.8%)     0.93 (0.78, 1.11)
                       No Co-amoxiclav   1615   303 (18.8%)
Parental/school data   Erythromycin      1030   142 (13.8%)     1.11 (0.87, 1.43)
                       No Erythromycin   1119   140 (12.5%)
                       Co-amoxiclav      1100   140 (12.7%)     0.93 (0.72, 1.20)
                       No Co-amoxiclav   1049   142 (13.5%)

Characteristics of responders and non-responders, and characteristics of responder by treatment group

Responders and non-responders

                                                   Consent to KS1 data   Contact made but
                                                   being collected       no consent
Number of women                                    2025 (63%)            1170
Maternal age - Median (IQR) years                  28.9 (24.6, 32.8)     25.6 (21.0, 30.3)
Gestation age at trial entry – Median (IQR) days   226 (207, 238)        224 (204, 237)
Multiple births                                    128 (6.3%)            61 (5.2%)
Maternal antibiotics                               490 (24.2%)           290 (24.8%)
Number of children                                 2149                  1223
Delivery within 48hrs                              777 (36.2%)           403 (33.0%)
Delivery within 7 days                             1310 (61.0%)          687 (56.2%)
Gestational age at delivery – Median (IQR) days    237 (222, 248)        239 (222, 253)
Birthweight - Median (IQR) g                       2100 (1660, 2550)     2145 (1642, 2680)
Males                                              1155 (53.7%)          672 (54.9%)
Admission to Neonatal unit                         1582 (73.6%)          831 (67.9%)
Ventilated                                         413 (19.2%)           221 (18.1%)
Respiratory Distress Syndrome                      428 (19.9%)           210 (17.2%)
Oxygen at 28 days                                  180 (8.4%)            98 (8.0%)
Positive blood culture                             128 (6.0%)            46 (3.8%)
Necrotising enterocolitis (suspected or proven)    49 (2.3%)             21 (1.7%)
Abnormal cerebral ultrasonography                  55 (2.6%)             41 (3.4%)
Social deprivation - lowest quartile
  Income                                           897 (41.7%)           754 (61.7%)
  Education                                        861 (40.1%)           722 (59.0%)
  Child Poverty                                    893 (41.6%)           738 (60.3%)
Ethnicity                                          1764                  1474
  White                                            1655 (93.8%)          1108 (75.2%)

Responders by treatment group

                                                   Erythromycin and     Erythromycin        Co-amoxiclav        Double
                                                   Co-amoxiclav         only                only                placebo
Number of women                                    490 (63%)            475 (60%)           547 (66%)           513 (64%)
Maternal age - Median (IQR) years                  29.4 (24.9, 33.1)    28.3 (24.1, 32.4)   29.3 (24.6, 33.2)   28.6 (24.8, 32.3)
Gestation age at trial entry – Median (IQR) days   225 (209, 237)       226 (204, 238)      226 (209, 239)      227 (207, 237)
Multiple births                                    37 (7.6%)            31 (6.5%)           30 (5.5%)           30 (5.8%)
Maternal antibiotics                               103 (21.0%)          120 (25.3%)         128 (23.4%)         139 (27.1%)
Number of children                                 524                  506                 576                 543
Delivery within 48hrs                              184 (35.1%)          175 (34.6%)         193 (33.5%)         225 (41.4%)
Delivery within 7 days                             298 (56.9%)          313 (61.9%)         339 (58.9%)         360 (66.3%)
Gestational age at delivery – Median (IQR) days    236 (222, 249)       236 (221, 247)      238 (222, 248)      237 (221, 248)
Birthweight - Median (IQR) g                       2097.5 (1680, 2595)  2070 (1600, 2450)   2120 (1690, 2555)   2090 (1660, 2560)
Males                                              275 (52.5%)          266 (52.6%)         298 (51.7%)         316 (58.2%)
Admission to Neonatal unit                         383 (73.1%)          375 (74.1%)         421 (73.1%)         403 (74.2%)
Ventilated                                         99 (18.9%)           103 (20.4%)         107 (18.6%)         104 (19.2%)
Respiratory Distress Syndrome                      111 (21.2%)          112 (22.1%)         105 (18.2%)         100 (18.4%)
Oxygen at 28 days                                  36 (6.9%)            42 (8.3%)           51 (8.9%)           51 (9.4%)
Positive blood culture                             28 (5.3%)            26 (5.1%)           34 (5.9%)           40 (7.4%)
Necrotising enterocolitis (suspected or proven)    10 (1.9%)            11 (2.2%)           16 (2.8%)           12 (2.2%)
Abnormal cerebral ultrasonography                  13 (2.5%)            15 (3.0%)           14 (2.4%)           13 (2.4%)
Social deprivation - lowest quartile
  Income                                           220 (42.0%)          211 (41.7%)         247 (42.9%)         219 (40.3%)
  Education                                        197 (37.6%)          224 (44.3%)         230 (39.9%)         210 (38.7%)
  Child Poverty                                    215 (41.0%)          210 (41.5%)         244 (42.4%)         224 (41.3%)
Ethnicity                                          428                  426                 475                 435
  White                                            394 (92.1%)          401 (94.1%)         446 (93.9%)         414 (95.2%)

Different methods of analysis

5) Dichotomising at level 2

The table below shows the results of using Mantel-Haenszel methods stratifying by test year, dichotomising into scoring level 2 and above, and failing to achieve level 2:

Parental data

Subject   Group             N      Below level 2   MH OR (95% CI)
Reading   Erythromycin      1030   167 (16.2%)     0.95 (0.76, 1.20)
          No Erythromycin   1119   189 (16.9%)
          Co-amoxiclav      1100   172 (15.6%)     0.87 (0.69, 1.09)
          No Co-amoxiclav   1049   184 (17.5%)
Writing   Erythromycin      1030   206 (20.0%)     0.94 (0.76, 1.17)
          No Erythromycin   1119   233 (20.8%)
          Co-amoxiclav      1100   213 (19.4%)     0.87 (0.70, 1.07)
          No Co-amoxiclav   1049   226 (21.5%)
Maths     Erythromycin      1030   115 (11.2%)     1.12 (0.85, 1.47)
          No Erythromycin   1119   113 (10.1%)
          Co-amoxiclav      1100   114 (10.4%)     0.94 (0.71, 1.24)
          No Co-amoxiclav   1049   114 (10.9%)

DfE data

Subject   Group             N      Below level 2   MH OR (95% CI)
Reading   Erythromycin      1596   360 (22.6%)     1.03 (0.87, 1.21)
          No Erythromycin   1642   363 (22.1%)
          Co-amoxiclav      1623   354 (21.8%)     0.94 (0.79, 1.11)
          No Co-amoxiclav   1615   369 (22.8%)
Writing   Erythromycin      1596   418 (26.2%)     1.01 (0.86, 1.18)
          No Erythromycin   1642   426 (25.9%)
          Co-amoxiclav      1623   405 (25.0%)     0.88 (0.75, 1.03)
          No Co-amoxiclav   1615   439 (27.2%)
Maths     Erythromycin      1596   257 (16.1%)     1.03 (0.85, 1.25)
          No Erythromycin   1642   257 (15.7%)
          Co-amoxiclav      1623   250 (15.4%)     0.92 (0.76, 1.11)
          No Co-amoxiclav   1615   264 (16.3%)

Example Stata command:

    mhodds read_di eryth, by(academic_year)

where read_di = 0 - level 2 or higher, 1 - below level 2

6) Extended analysis retaining most categories

Reading            Parental data                                                  DfE data
                   Erythromycin  No Erythromycin  Co-amoxiclav  No Co-amoxiclav   Erythromycin  No Erythromycin  Co-amoxiclav  No Co-amoxiclav
Under level 1      27 (2.6%)     32 (2.9%)        28 (2.5%)     31 (3.0%)         90 (5.6%)     102 (6.2%)       86 (5.3%)     106 (6.6%)
Level 1            140 (13.6%)   157 (14.0%)      144 (13.1%)   153 (14.6%)       270 (16.9%)   261 (15.9%)      268 (16.5%)   263 (16.3%)
Level 2C           160 (15.5%)   155 (13.9%)      170 (15.5%)   145 (13.8%)       225 (14.1%)   218 (13.3%)      218 (13.4%)   225 (13.9%)
Level 2B           228 (22.1%)   247 (22.1%)      241 (21.9%)   234 (22.3%)       439 (27.5%)   412 (25.1%)      427 (26.3%)   424 (26.3%)
Level 2A           231 (22.4%)   244 (21.8%)      246 (22.4%)   229 (21.8%)       276 (17.3%)   284 (17.3%)      276 (17.0%)   284 (17.6%)
Level 3 or over    238 (23.1%)   283 (25.3%)      266 (24.2%)   255 (24.3%)       286 (17.9%)   356 (21.7%)      336 (20.7%)   306 (18.9%)
Missing            6 (0.6%)      1 (0.1%)         5 (0.5%)      2 (0.2%)          10 (0.8%)     9 (0.7%)         12 (0.9%)     7 (0.5%)

Writing            Parental data                                                  DfE data
                   Erythromycin  No Erythromycin  Co-amoxiclav  No Co-amoxiclav   Erythromycin  No Erythromycin  Co-amoxiclav  No Co-amoxiclav
Under level 1      54 (5.2%)     59 (5.3%)        55 (5.0%)     58 (5.5%)         133 (8.3%)    150 (9.1%)       131 (8.1%)    152 (9.4%)
Level 1            152 (14.8%)   174 (15.5%)      158 (14.4%)   168 (16.0%)       285 (17.9%)   276 (16.8%)      274 (16.9%)   287 (17.8%)
Level 2C           237 (23.0%)   244 (21.8%)      239 (21.7%)   242 (23.1%)       331 (20.7%)   348 (21.2%)      335 (20.6%)   344 (21.3%)
Level 2B           269 (26.1%)   273 (24.4%)      289 (26.3%)   253 (24.1%)       478 (29.9%)   450 (27.4%)      474 (29.2%)   454 (28.1%)
Level 2A           199 (19.3%)   212 (18.9%)      211 (19.2%)   200 (19.1%)       222 (13.9%)   233 (14.2%)      236 (14.5%)   219 (13.6%)
Level 3 or over    117 (11.4%)   156 (13.9%)      146 (13.3%)   127 (12.1%)       144 (9.0%)    182 (11.1%)      169 (10.4%)   157 (9.7%)
Missing            2 (0.2%)      1 (0.1%)         2 (0.2%)      1 (0.1%)          3 (0.2%)      3 (0.2%)         4 (0.3%)      2 (0.1%)

Maths              Parental data                                                  DfE data
                   Erythromycin  No Erythromycin  Co-amoxiclav  No Co-amoxiclav   Erythromycin  No Erythromycin  Co-amoxiclav  No Co-amoxiclav
Under level 1      17 (1.7%)     24 (2.1%)        20 (1.8%)     21 (2.0%)         63 (3.9%)     70 (4.3%)        61 (3.8%)     72 (4.5%)
Level 1            98 (9.5%)     89 (8.0%)        94 (8.5%)     93 (8.9%)         194 (12.2%)   187 (11.4%)      189 (11.6%)   192 (11.9%)
Level 2C           161 (15.6%)   200 (17.9%)      171 (15.5%)   190 (18.1%)       262 (16.4%)   265 (16.1%)      247 (15.2%)   280 (17.3%)
Level 2B           251 (24.4%)   263 (23.5%)      283 (25.7%)   231 (22.0%)       457 (28.6%)   454 (27.6%)      481 (29.6%)   430 (26.6%)
Level 2A           277 (26.9%)   285 (25.5%)      301 (27.4%)   261 (24.9%)       331 (20.7%)   343 (20.9%)      345 (21.3%)   329 (20.4%)
Level 3 or over    223 (21.7%)   257 (23.0%)      228 (20.7%)   252 (24.0%)       285 (17.9%)   318 (19.4%)      296 (18.2%)   307 (19.0%)
Missing            3 (0.3%)      1 (0.1%)         3 (0.3%)      1 (0.1%)          4 (0.3%)      5 (0.4%)         4 (0.3%)      5 (0.4%)

7) Ordinal logistic regression

Ordinal logistic regression for the level achieved (6 groups) with explanatory variables indicating allocation to Erythromycin and/or Co-amoxiclav, and also school year:

Parental data – OR (95% CI)

Subject   Models with no interactions             Model with interaction
          Erythromycin        Co-amoxiclav        Erythromycin        Co-amoxiclav        Erythromycin*Co-amoxiclav
Reading   1.05 (0.91, 1.22)   0.97 (0.84, 1.13)   1.05 (0.84, 1.30)   0.97 (0.79, 1.19)   1.01 (0.75, 1.36)
Writing   1.05 (0.90, 1.22)   0.90 (0.77, 1.04)   1.05 (0.85, 1.30)   0.90 (0.73, 1.11)   0.99 (0.74, 1.34)
Maths     1.01 (0.87, 1.18)   1.02 (0.88, 1.18)   1.03 (0.83, 1.28)   1.03 (0.84, 1.27)   0.97 (0.72, 1.31)

DfE data – OR (95% CI)

Subject   Models with no interactions             Model with interaction
          Erythromycin        Co-amoxiclav        Erythromycin        Co-amoxiclav        Erythromycin*Co-amoxiclav
Reading   1.12 (0.99, 1.26)   0.93 (0.82, 1.05)   1.15 (0.96, 1.36)   0.96 (0.80, 1.14)   0.95 (0.74, 1.21)
Writing   1.04 (0.92, 1.18)   0.90 (0.79, 1.01)   1.03 (0.87, 1.22)   0.89 (0.75, 1.06)   1.02 (0.80, 1.30)
Maths     1.06 (0.94, 1.20)   0.95 (0.84, 1.08)   1.06 (0.89, 1.26)   0.96 (0.80, 1.14)   1.00 (0.78, 1.27)

Example Stata command:

    ologit read_scale eryth academic_year, or

where read_scale = {1, 2, 3, 4, 5, 6}

Ordinal logistic regression relies on the proportional odds assumption.
This was tested via likelihood ratio (LR) tests, the p-values from which are given below: Parental data – p-values from LR tests for proportional odds Subject Erythromycin Co-amoxiclav Reading Writing Maths 0.42 0.18 0.05 0.39 0.35 0.01 Model with interaction 0.74 0.30 0.07 DfE data – p-values from LR tests for proportional odds Subject Erythromycin Co-amoxiclav Reading Writing Maths 0.01 <0.001 <0.001 0.03 <0.001 <0.001 Example Stata command: Model with interaction 0.06 <0.001 <0.001 omodel logit read_scale eryth academic_year where read_scale = {1, 2, 3, 4, 5, 6} These tests indicate the assumptions are valid for reading and writing using the parental data, but not valid for maths using the parental data or for any subject using the DfE data. To investigate impact of adjusting for school year on the assumptions, if we exclude school year from the models the following pvalues are given from the LR tests for proportional odds: (See also graphs in Appendix A, Section 1.). PROM cohort 73 Parental data - p-values from LR tests for proportional odds excluding school year from the models Subject Erythromycin Co-amoxiclav Reading Writing Maths 0.76 0.46 0.31 0.71 0.89 0.06 Model with interaction 0.93 0.54 0.25 DfE data - p-values from LR tests for proportional odds excluding school year from the models Subject Erythromycin Co-amoxiclav Reading Writing Maths 0.22 0.19 0.90 0.68 0.98 0.21 8) Model with interaction 0.52 0.74 0.47 Poisson regression Poisson regression for the level achieved (scaled 1 to 6) with explanatory variables indicating allocation to Erythromycin and/or Co-amoxiclav, and also school year: Parental data – RR (95% CI) Subject Reading Writing Maths Models with no interactions Erythromycin Co-amoxiclav 1.01 (0.96, 1.06) 1.01 (0.97, 1.06) 1.00 (0.95, 1.06) 0.99 (0.94, 1.04) 0.97 (0.93, 1.02) 1.00 (0.95, 1.05) Model with interaction Co-amoxiclav Erythromycin* Co-amoxiclav 1.01 (0.94, 1.08) 0.98 (0.92, 1.06) 1.01 (0.91, 1.12) 1.01 (0.95, 1.08) 0.97 (0.91, 
1.04) 1.00 (0.91, 1.10) 1.01 (0.93, 1.08) 1.00 (0.94, 1.08) 0.99 (0.90, 1.10) Erythromycin DfE data – RR (95% CI) Subject Reading PROM cohort Models with no interactions Erythromycin Co-amoxiclav 1.03 (0.99, 1.07) 0.98 (0.94, 1.02) Model with interaction Co-amoxiclav Erythromycin* Co-amoxiclav 1.03 (0.98, 1.09) 0.98 (0.93, 1.04) 0.99 (0.91, 1.07) Erythromycin 74 Writing Maths 1.01 (0.97, 1.05) 1.01 (0.97, 1.06) Example Stata command: 0.98 (0.94, 1.01) 0.99 (0.95, 1.03) 1.01 (0.96, 1.06) 1.01 (0.96, 1.07) 0.97 (0.92, 1.03) 0.99 (0.93, 1.05) 1.00 (0.93, 1.08) 1.00 (0.92, 1.08) poisson read_scale eryth academic_year, irr where read_scale = {1, 2, 3, 4, 5, 6} Again there are no statistically significant treatment estimates and inclusion of interaction terms does not alter estimates. Confidence intervals are much smaller than for ordinal regression, and point estimates are generally more conservative (closer to 1). For illustrations of residual plots, etc, to assess the assumptions of the models, see Appendix A, Section 2. Adjusting for covariates Parental data can only be used due to the anonymous nature of DfE data. 
Models were fitted including terms for treatment allocation and school year, and allowance was made for the following variables:

Baseline factors: maternal age (years), gestation at randomisation and at birth (days), multiple births, maternal antibiotics, delivery within 48 hours and within 7 days, birthweight (grams), sex.

Social factors: ethnicity (white/non-white), smoking in family, damp/mould problems, family history of asthma, social deprivation scores for income, education and child poverty (on continuous scales, with higher scores indicating higher deprivation).

Neonatal outcomes (two models were fitted – allowing for and excluding these variables): admission to neonatal unit, ventilated, respiratory distress syndrome, oxygenation at 28 days, positive blood culture, necrotising enterocolitis, abnormal ultrasound scan.

3) Ordinal logistic regression – Reading data

Not allowing neonatal outcomes – the ‘best’ fitting models are given below:

Models with no treatment interactions, OR (95% CI):

Variable              Erythromycin model   Co-amoxiclav model
Treatment             1.05 (0.89, 1.24)    1.02 (0.87, 1.21)
Smoking in family     2.21 (1.87, 2.62)    2.21 (1.87, 2.62)
Sex                   1.76 (1.49, 2.08)    1.76 (1.49, 2.08)
Gestation at birth    0.99 (0.99, 1.00)    0.99 (0.99, 1.00)

Model with interaction:

Variable                     OR (95% CI)
Erythromycin                 1.08 (0.85, 1.37)
Co-amoxiclav                 1.05 (0.83, 1.33)
Erythromycin*Co-amoxiclav    0.95 (0.68, 1.33)
Smoking in family            2.21 (1.87, 2.62)
Sex                          1.76 (1.49, 2.09)
Gestation at birth           0.99 (0.99, 1.00)

Example Stata command:
ologit read_scale eryth academic_year smoking sex gest_at_birth, or
where read_scale = {1, 2, 3, 4, 5, 6}

Allowing neonatal outcomes – the ‘best’ fitting models are given below:

Models with no treatment interactions, OR (95% CI):

Variable                  Erythromycin model   Co-amoxiclav model
Treatment                 1.06 (0.90, 1.25)    1.01 (0.85, 1.19)
Smoking in family         2.21 (1.87, 2.62)    2.21 (1.87, 2.62)
Sex                       1.77 (1.50, 2.10)    1.77 (1.49, 2.09)
Oxygenation at 28 days    2.44 (1.79, 3.34)    2.44 (1.78, 3.33)
Delivery within 7 days    0.82 (0.69, 0.98)    0.82 (0.69, 0.98)

Model with interaction:

Variable                     OR (95% CI)
Erythromycin                 1.09 (0.86, 1.38)
Co-amoxiclav                 1.04 (0.82, 1.31)
Erythromycin*Co-amoxiclav    0.95 (0.68, 1.33)
Smoking in family            2.21 (1.87, 2.62)
Sex                          1.78 (1.50, 2.10)
Oxygenation at 28 days       2.44 (1.79, 3.34)
Delivery within 7 days       0.82 (0.69, 0.98)

For all models, conclusions about treatment effects are unchanged after adjustment (unadjusted ORs were Erythromycin 1.05 (0.91, 1.22) and Co-amoxiclav 0.97 (0.84, 1.13)). However, the Co-amoxiclav point estimate has shifted from below 1 to above 1. Smoking in the family, being male, being born at a low gestation and being oxygenated at 28 days are related to poorer KS1 grades (as expected). Delivery within 7 days of treatment, however, is related to better KS1 grades, although it would be expected to be associated with babies born at low gestations (see later).

N.B. Smoking is missing for 383 (18%) children.

Proportional odds assumption – the table below gives the p-values from likelihood ratio tests for proportional odds:

                                  Erythromycin   Co-amoxiclav   Model with interaction
Not allowing neonatal outcomes    0.36           0.43           0.66
Allowing neonatal outcomes        0.17           0.22           0.41

Example Stata command:
omodel logit read_scale eryth academic_year smoking sex gest_at_birth
where read_scale = {1, 2, 3, 4, 5, 6}

Results are very similar to those from the unadjusted modelling, with the proportional odds assumption appearing valid. Repeating the analysis without adjusting for academic year yields the following results:

                                  Erythromycin   Co-amoxiclav   Model with interaction
Not allowing neonatal outcomes    0.46           0.55           0.77
Allowing neonatal outcomes        0.28           0.35           0.57

Again the assumptions appear valid, with results similar to the unadjusted models (although the p-values are slightly reduced).
4) Poisson regression – Reading data Not allowing neonatal outcomes – the ‘best’ fitting models are given below: PROM cohort 77 Models with no treatment interactions: Subject Treatment Smoking in family Sex Gestation at birth Erythromycin 1.01 (0.96, 1.07) 1.25 (1.18, 1.32) 1.17 (1.11, 1.24) 1.00 (1.00, 1.00) Co-amoxiclav 1.00 (0.94, 1.05) 1.25 (1.18, 1.32) 1.17 (1.11, 1.24) 1.00 (1.00, 1.00) Model with interaction: Subject Erythromycin Co-amoxiclav Erythromycin*Co-amoxiclav Smoking in family Sex Gestation at birth OR (95% CI) 1.01 (0.94, 1.10) 1.00 (0.93, 1.08) 1.00 (0.89, 1.11) 1.25 (1.18, 1.32) 1.17 (1.11, 1.24) 1.00 (1.00, 1.00) Example Stata command: poisson read_scale eryth academic_year smoking sex gest_at_birth, irr where read_scale = {1, 2, 3, 4, 5, 6} Allowing neonatal outcomes – the ‘best’ fitting models are given below: Models with no treatment interactions: Subject Treatment Smoking in family Sex Oxygenation at 28 days Erythromycin 1.02 (0.96, 1.07) 1.24 (1.18, 1.32) 1.17 (1.11, 1.24) 1.24 (1.13, 1.36) Model with interaction: Subject Erythromycin Co-amoxiclav Erythromycin*Co-amoxiclav PROM cohort OR (95% CI) 1.02 (0.94, 1.10) 1.00 (0.93, 1.08) 1.00 (0.89, 1.12) Co-amoxiclav 1.00 (0.94, 1.06) 1.24 (1.18, 1.32) 1.17 (1.11, 1.24) 1.24 (1.13, 1.36) 78 Smoking in family Sex Oxygenation at 28 days 1.24 (1.18, 1.32) 1.17 (1.11, 1.24) 1.24 (1.13, 1.36) Results are very similar to those for ordinal regression modelling. For tests of model assumptions see Appendix A, Section 3. Mapping categorical to continuous scores 3) Unadjusted models (W, 1, 2C, 2B, 2A, 3) → (3, 9, 13, 15, 17, 21). 
Categorical scores are mapped to continuous outcomes according to the following: Linear regression is then used to estimate treatment effects (allowing for test year) with the results displayed below: Parental data – estimates (95% CIs) Subject Reading Writing Maths Models with no interactions Erythromycin Co-amoxiclav -0.08 (-0.45, 0.29) 0.11 (-0.26, 0.48) -0.11 (-0.47, 0.26) 0.26 (-0.10, 0.63) -0.04 (-0.37, 0.29) -0.05 (-0.38, 0.27) Erythromycin -0.01 (-0.54, 0.52) -0.13 (-0.65, 0.39) -0.05 (-0.52, 0.42) Model with interaction Co-amoxiclav Erythromycin* Co-amoxiclav 0.17 (-0.34, 0.69) -0.13 (-0.87, 0.60) 0.24 (-0.27, 0.74) 0.05 (-0.68, 0.78) -0.06 (-0.52, 0.39) 0.02 (-0.64, 0.68) Erythromycin -0.26 (-0.72, 0.21) -0.07 (-0.52, 0.38) -0.11 (-0.53, 0.30) Model with interaction Co-amoxiclav Erythromycin* Co-amoxiclav 0.21 (-0.25, 0.68) 0.05 (-0.61, 0.71) 0.30 (-0.15, 0.74) -0.02 (-0.66, 0.62) 0.10 (-0.31, 0.52) 0.02 (-0.58, 0.61) DfE data – estimates (95% CI) Subject Reading Writing Maths Models with no interactions Erythromycin Co-amoxiclav -0.23 (-0.56, 0.10) 0.24 (-0.09, 0.57) -0.08 (-0.40, 0.23) 0.29 (-0.03, 0.61) -0.11 (-0.40, 0.19) 0.11 (-0.18, 0.41) Example Stata command: regress read_cts eryth academic_year where read_cts = {3, 9, 13, 15, 17, 21} PROM cohort 79 N.B. These estimates will be in the opposite direction to the estimates for ordinal logistic regression and Poisson regression, as the scales for ordinal and Poisson regression are purposely set to estimate degree of disability, not ability. The continuous score scale estimates degree of ability. As with the earlier methods there is no evidence of any statistically significant treatment effects, and estimates are broadly similar to those using the .3 .2 .1 0 Density .4 .5 alternative methods. 
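As a minimal sketch of the mapping used for the continuous outcome, the Python snippet below converts KS1 level labels to the point scores (W, 1, 2C, 2B, 2A, 3) → (3, 9, 13, 15, 17, 21) stated in this section. The names `KS1_POINTS` and `to_point_score` and the example labels are illustrative assumptions and are not part of the original analysis, which was carried out in Stata.

```python
# Mapping of KS1 levels to point scores, as stated in the text:
# (W, 1, 2C, 2B, 2A, 3) -> (3, 9, 13, 15, 17, 21).
KS1_POINTS = {"W": 3, "1": 9, "2C": 13, "2B": 15, "2A": 17, "3": 21}


def to_point_score(level):
    """Return the point score for a KS1 level label, or None if missing/unknown."""
    return KS1_POINTS.get(level)


# Hypothetical levels for five children (illustrative only, not trial data).
levels = ["2B", "W", "3", "2C", None]
scores = [to_point_score(level) for level in levels]

# Missing levels are dropped before summarising.
observed = [s for s in scores if s is not None]
mean_score = sum(observed) / len(observed)
print(scores, mean_score)
```

The resulting score is what a variable such as read_cts would hold before being passed to a linear regression.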
One of the assumptions of the model is normality of the outcome variables; a histogram of parental reading scores is given below: 0 20 read_cts The histogram provides evidence that the assumptions of the model are not met, and therefore this method is not advisable. Further residual plots to determine model assumptions are given in Appendix A, Section 4. These plots provide evidence that other assumptions are also not met. 4) Adjusted models Adjusting for covariates gives the same variables proving important to the model when using the alternative two methods. Not allowing neonatal outcomes – the ‘best’ fitting models are given below: PROM cohort 80 Models with no treatment interactions: Subject Treatment Smoking in family Sex Gestation at birth Erythromycin -0.07 (-0.46, 0.32) -1.91 (-2.31, -1.52) -1.29 (-1.68, -0.89) 0.02 (0.01, 0.02) Co-amoxiclav 0.06 (-0.09, 0.21) -1.91 (-2.31, -1.52) -1.28 (-1.67, -0.89) 0.02 (0.01, 0.02) Model with interaction: Subject Erythromycin Co-amoxiclav Erythromycin*Co-amoxiclav Smoking in family Sex Gestation at birth OR (95% CI) -0.05 (-0.62, 0.50) 0.04 (-0.50, 0.59) -0.02 (-0.81, 0.77) -1.91 (-2.31, -1.52) -1.28 (-1.68, -0.89) 0.02 (0.01, 0.02) Example Stata command: regress read_cts eryth academic_year smoking sex gest_at_birth where read_cts = {3, 9, 13, 15, 17, 21} Allowing neonatal outcomes – the ‘best’ fitting models are given below: Models with no treatment interactions: Subject Treatment Smoking in family Sex Oxygenation at 28 days Delivery within 7 days PROM cohort Erythromycin -0.09 (-0.48, 0.30) -1.89 (-2.28, -1.49) -1.29 (-1.68, -0.90) -2.02 (-2.72, -1.32) 0.49 (0.09, 0.89) Co-amoxiclav 0.06 (-0.33, 0.45) -1.88 (-2.28, -1.49) -1.28 (-1.67, -0.89) -2.02 (-2.72, -1.31) 0.49 (0.09, 0.89) 81 Model with interaction: Subject Erythromycin Co-amoxiclav Erythromycin*Co-amoxiclav Smoking in family Sex Oxygenation at 28 days Delivery within 7 days OR (95% CI) -0.07 (-0.63, 0.48) 0.07 (-0.47, 0.62) -0.03 (-0.81, 0.75) -1.88 (-2.28, 
-1.49) -1.29 (-1.68, -0.89) -2.02 (-2.72, -1.32) 0.49 (0.09, 0.89)

Treatment effects are largely unaltered from the unadjusted models. Once again, estimated effects are in the opposite direction to those from ordinal or Poisson regression. Again, smoking in the family, being male, lower gestation and oxygenation at 28 days are all associated with poorer KS1 performance, and delivery within 7 days is again related to better KS1 performance. For model assumptions see Appendix A, Section 5. The residual plots are much better than for the unadjusted models, although there is still some evidence of residuals falling into six groups, corresponding to the six KS1 levels.

Delivery within 7 days

This relationship is in the opposite direction to what would be expected. It is not due to a complicated relationship within the models between this variable and others, as similar effects are observed when the variable is included in univariate models with reading scores as the outcome. It appears to arise because the majority (65%) of women who deliver within 7 days do so at gestations over 32 weeks. Therefore, although at early gestations delivering within 7 days is associated with giving birth early, the majority of women who deliver within 7 days are already at higher gestations.

Use of raw score data

The maths raw score data have been examined. Data are available on 1795 PROM children and are quite complicated, because children could sit different combinations of tests and the amount of data for each child therefore varies. The tests available were: task ab (pre 2003), task c (pre 2003), test 23 (testing levels 2 and 3, and pre 2003), test 2 (level 2 test, 2003 onwards) and test 3 (level 3 test, 2003 onwards). It was decided to exclude the data from pre 2003 (188 PROM children) due to the different nature of the data.
5) Level scores for those with raw score data available

The PROM KS1 maths levels (from teachers) are tabulated below for those with raw score data compared to those without, from 2003 onwards:

                    Raw score     No raw score
N                   1607          391
Below level 1       4 (0%)        35 (9%)
Level 1             66 (4%)       105 (27%)
Level 2C            293 (18%)     53 (14%)
Level 2B            423 (26%)     63 (16%)
Level 2A            452 (28%)     67 (17%)
Level 3 or above    368 (23%)     65 (17%)
Missing             1 (0%)        3 (1%)

There are therefore slightly fewer raw scores available for the lower grades, but this could be due to weaker children not being entered for the tests and merely being awarded a level via teacher assessment.

6) Descriptive analyses of level 2 test raw scores

The raw scores from those who sat the level 2 test (regardless of whether they also sat the level 3 test) are examined initially. The table below gives the distribution of level 2 raw scores by teacher assessed level and by test sat. Cells show N; median (IQR):

Teacher level     2003 test            2004 test            2005 test            2007 test            Total                Range
Under Level 1     1; 4 (., .)          2; 6.5 (4, 9)        0                    1; 4 (., .)          4; 4 (4, 6.5)        (4, 9)
Level 1           9; 5 (5, 6)          22; 6 (4, 8)         26; 6 (5, 8)         9; 7 (5, 9)          66; 6 (5, 8)         (0, 23)
Level 2C          57; 10 (9, 12)       74; 10 (9, 12)       136; 10 (9, 13)      26; 10.5 (9, 12)     293; 10 (9, 12)      (2, 22)
Level 2B          59; 16 (15, 17)      126; 16 (14, 18)     173; 17 (14, 18)     64; 17 (15, 18)      422; 16 (15, 18)     (5, 28)
Level 2A          69; 21 (20, 24)      123; 22 (20, 24)     201; 22 (20, 24)     48; 22.5 (21, 24.5)  441; 22 (20, 24)     (11, 30)
Level 3 or above  59; 25 (22, 28)      64; 26 (24.5, 28)    74; 26 (24, 27)      19; 26 (25, 27)      216; 26 (24, 27)     (13, 30)

The next table gives similar distributions but by year of assessment. Cells show N; median (IQR):

Teacher level     2003                 2004                 2005                 2006                 2007                 Total                Range
Under Level 1     1; 4 (., .)          2; 6.5 (4, 9)        0                    0                    1; 4 (., .)          4; 4 (4, 6.5)        (4, 9)
Level 1           8; 5 (4.5, 5.5)      11; 6 (3, 8)         15; 7 (6, 12)        21; 5 (4, 7)         11; 7 (5, 9)         66; 6 (5, 8)         (0, 23)
Level 2C          55; 10 (9, 12)       47; 10 (9, 11)       76; 10 (9, 12.5)     84; 10.5 (8.5, 12)   31; 11 (9, 12)       293; 10 (9, 12)      (2, 22)
Level 2B          49; 16 (15, 17)      90; 16 (14, 17)      98; 16 (14, 18)      105; 17 (15, 19)     80; 17 (15, 18.5)    422; 16 (15, 18)     (5, 28)
Level 2A          65; 21 (20, 23)      79; 21 (20, 24)      103; 22 (20, 24)     123; 22 (20, 24)     71; 23 (21, 25)      441; 22 (20, 24)     (11, 30)
Level 3 or above  56; 25 (22, 28)      40; 26 (24, 28)      42; 26 (24, 27)      53; 26 (24, 27)      25; 26 (24, 26)      216; 26 (24, 27)     (13, 30)

Level from raw score

The equivalent level derived from the level 2 raw score is tabulated against the overall teacher awarded level (rows: level from raw score; columns: teacher awarded level):

Level from raw score   Under Level 1   Level 1   Level 2C   Level 2B   Level 2A   Level 3 or above
0                      0               0         1          0          0          0
Under Level 1          3               14        2          0          0          0
Level 1                0               26        9          1          0          0
Level 2C               1               17        247        20         1          0
Level 2B               0               8         27         332        14         3
Level 2A               0               1         7          69         426        213

The above table demonstrates agreement between the teacher awarded level and the level derived from the raw score for 1034/1442 (72%) of children. When the two disagree, it is more common for the teacher to award a level higher than that achieved in the test than a lower one, although at this stage we do not present information on whether a higher test (level 3 test) has also been sat. This will be expanded upon later.

7) Modelling level 2 raw score

The level 2 raw scores are now modelled using ordinary least squares. Firstly the assumption of normality of the scores is investigated:

[Figure: histogram of level 2 raw scores; x-axis: level 2 score (0 to 30), y-axis: density (0 to .06)]

There is some doubt as to the normality of the scores, mainly due to the ‘tail’ of low scoring pupils.
Unadjusted models

The table below gives the results of fitting models adjusting only for either the academic year in which the child sat the test or the paper sat. Estimates (95% CI):

                 Models with no interactions                  Model with interaction
Adjusting for    Erythromycin          Co-amoxiclav           Erythromycin          Co-amoxiclav          Erythromycin*Co-amoxiclav
Academic year    -0.22 (-0.87, 0.43)   0.40 (-0.25, 1.05)     -0.51 (-1.45, 0.42)   0.11 (-0.80, 1.02)    0.59 (-0.72, 1.89)
Paper sat        -0.24 (-0.89, 0.41)   0.40 (-0.25, 1.05)     -0.53 (-1.47, 0.41)   0.11 (-0.80, 1.02)    0.58 (-0.72, 1.88)

Example Stata command:
regress score eryth academic_year
where score = Maths raw score

Firstly, results are very similar regardless of whether the academic year or the paper sat is adjusted for in the model. There are no statistically significant treatment differences. Estimates are similar to those given when converting the categorical level score to a continuous score for DfE data, and somewhat similar for parental data, although the Co-amoxiclav estimates are in the other direction.

N.B. Again these estimates will be in the opposite direction to the estimates when looking at KS1 levels for ordinal logistic regression and Poisson regression, as the scales for ordinal and Poisson regression are purposely set to estimate degree of disability, not ability. The raw scores estimate degree of ability.

Residual plots to determine model assumptions are given in Appendix A, Section 6. A histogram of the standardised residuals shows a ‘tail’ of negative residuals. On examination this group relates to those scoring poorly (5 out of 30 or below), and therefore the models do not seem to be accurate for low scoring children. The normal probability plot shows distinct groups of residuals, reflecting the fact that the scores are technically ordinal and not continuous.
Adjusted models

The models allowing for academic year have been adjusted for covariates.

Not allowing neonatal outcomes – the ‘best’ fitting models are given below:

Models with no treatment interactions, Coeff (95% CI):

Variable                            Erythromycin model     Co-amoxiclav model
Treatment                           -0.15 (-0.86, 0.56)    0.50 (-0.21, 1.21)
Social dep – child poverty score    0.00 (0.00, 0.00)      0.00 (0.00, 0.00)
Gestation at birth                  0.02 (0.01, 0.04)      0.02 (0.01, 0.04)
White                               1.90 (0.47, 3.32)      1.97 (0.55, 3.40)

Model with interaction:

Variable                            Coeff (95% CI)
Erythromycin                        -0.37 (-1.39, 0.66)
Co-amoxiclav                        0.27 (-0.73, 1.27)
Erythromycin*Co-amoxiclav           0.46 (-0.97, 1.88)
Social dep – child poverty score    0.00 (0.00, 0.00)
Gestation at birth                  0.02 (0.01, 0.04)
White                               1.96 (0.54, 3.39)

Example Stata command:
regress score eryth academic_year child_pov gest_at_birth white
where score = Maths raw score

Allowing neonatal outcomes – the ‘best’ fitting models are given below:

Models with no treatment interactions, Coeff (95% CI):

Variable                            Erythromycin model      Co-amoxiclav model
Treatment                           -0.20 (-0.91, 0.51)     0.50 (-0.21, 1.21)
Social dep – child poverty score    0.00 (0.00, 0.00)       0.00 (0.00, 0.00)
Oxygenation at 28 days              -3.14 (-4.45, -1.83)    -3.12 (-4.42, -1.81)
White                               2.02 (0.60, 3.44)       2.10 (0.68, 3.52)

Model with interaction:

Variable                            Coeff (95% CI)
Erythromycin                        -0.36 (-1.38, 0.65)
Co-amoxiclav                        0.32 (-0.68, 1.31)
Erythromycin*Co-amoxiclav           0.36 (-1.06, 1.77)
Social dep – child poverty score    0.00 (0.00, 0.00)
Oxygenation at 28 days              -3.11 (-4.42, -1.80)
White                               2.08 (0.67, 3.50)

Treatment effects are largely unaltered from the unadjusted models. Once again, estimated effects are in the opposite direction to those from ordinal or Poisson regression. Being non-white, oxygenation at 28 days, being born earlier and having worse social deprivation on the child poverty scale are all associated with poorer KS1 performance. For model assumptions see Appendix A, Section 7. The residual plots are much better than for the unadjusted models.
8) Extending analysis for other tests sat

Initially the combination of tests sat (level 2 only, level 2 and level 3, level 3 only) has been compared to the teacher assessed maths level:

                             Below level 1   Level 1    Level 2C     Level 2B     Level 2A     >= Level 3   Total
Level 2 test only            4 (0%)          66 (6%)    293 (28%)    412 (40%)    250 (24%)    7 (1%)       1032
Level 2 and level 3 tests    0               0          0            10 (2%)      191 (47%)    209 (51%)    410
Level 3 test only            0               0          0            1 (1%)       11 (7%)      152 (92%)    165

The above gives evidence that the combination of tests sat is predictive (to some degree) of the level achieved.

Now the combination of tests sat by treatment:

                    Level 2 test only   Level 2 and level 3 tests   Level 3 test only
Erythromycin        502 (49%)           203 (50%)                   74 (45%)
No Erythromycin     530 (51%)           207 (50%)                   91 (55%)
Co-amoxiclav        542 (53%)           205 (50%)                   75 (45%)
No Co-amoxiclav     490 (47%)           205 (50%)                   90 (55%)
Total               1032                410                         165

The ‘highest equivalent level (HEL)’ from all raw scores has been derived. This is an extension of the equivalent level corresponding to the level 2 test (from part 2 above), and is the highest level from all tests the child sat. So, for example, if child 1 achieves level 2B in the level 2 test and fails the level 3 test, their HEL will be level 2B. If child 2 achieves level 2B in the level 2 test and level 3 in the level 3 test, their HEL will be level 3. This is therefore the best predictor of teacher assessed level from the raw score data. It is tabulated below against teacher assessed level (rows: HEL; columns: teacher assessed level):

HEL                 < Level 1   Level 1   Level 2C   Level 2B   Level 2A   >= Level 3   Missing
Under Level 1       3           14        2          0          0          0            0
Level 1             0           26        9          1          0          0            0
Level 2C            1           17        247        20         1          0            0
Level 2B            0           8         27         332        14         1            0
Level 2A            0           1         7          67         389        15           0
Level 3 or above    0           0         0          3          42         350          1
Missing             0           0         1          0          6          2            0

Levels agree for 1347/1607 (84%) of children, HEL levels are higher than teacher assessed levels for 173 (11%) of children, and teacher assessed levels are higher than HEL for 77 (5%) of children.
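The agreement figures quoted above can be reproduced from the HEL cross-tabulation. The sketch below is a check of that arithmetic only; it assumes the table is read with HEL in rows and teacher assessed level in columns, and the names `LEVELS` and `TABLE` are illustrative rather than taken from the original analysis.

```python
# Cross-tabulation of HEL (rows) against teacher assessed level (columns),
# both ordered from lowest level to highest, with "Missing" last.
LEVELS = ["<1", "1", "2C", "2B", "2A", "3+", "Missing"]
TABLE = [
    [3, 14, 2, 0, 0, 0, 0],
    [0, 26, 9, 1, 0, 0, 0],
    [1, 17, 247, 20, 1, 0, 0],
    [0, 8, 27, 332, 14, 1, 0],
    [0, 1, 7, 67, 389, 15, 0],
    [0, 0, 0, 3, 42, 350, 1],
    [0, 0, 1, 0, 6, 2, 0],
]

k = LEVELS.index("Missing")  # only non-missing levels can be compared
total = sum(sum(row) for row in TABLE)

# Diagonal cells: HEL and teacher assessed level agree.
agree = sum(TABLE[i][i] for i in range(k))
# Below the diagonal: HEL is higher than the teacher assessed level.
hel_higher = sum(TABLE[i][j] for i in range(k) for j in range(k) if i > j)
# Above the diagonal: teacher assessed level is higher than HEL.
teacher_higher = sum(TABLE[i][j] for i in range(k) for j in range(k) if j > i)

for label, n in [("agree", agree), ("HEL higher", hel_higher),
                 ("teacher higher", teacher_higher)]:
    print(f"{label}: {n}/{total} ({round(100 * n / total)}%)")
```

The remaining 10 children have a missing HEL or a missing teacher assessed level and so fall into none of the three groups.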
Modelling HEL and teacher assessed level adjusting for combination of tests sat PROM cohort 88 Poisson regression has been used for this. Ordinal logistic regression was also attempted but there were issues with convergence in some models. The models have been fitted twice – once adjusting for academic year and one adjusting for Level 2 test sat. The models were fitted both without adjustment for test combination and with. When adjusting the three groups outlined above are used – with Level 2 only as the baseline. Highest Equivalent Level (HEL) Unadjusted Adjusting for test combination Erythromycin Co-amoxiclav Erythromycin* Co-amoxiclav Erythromycin Co-amoxiclav Erythromycin* Co-amoxiclav Papers 2 and 3 Paper 3 only Example Stata command: Adjusting for academic year Erythromycin Co-amoxiclav Model with model model interactions 1.03 (0.96, 1.09) 1.05 (0.96, 1.15) 1.00 (0.94, 1.07) 1.02 (0.94, 1.12) 0.96 (0.85, 1.09) 1.02 (0.95, 1.08) 0.46 (0.42, 0.50) 0.32 (0.28, 0.38) Erythromycin model 1.02 (0.95, 1.08) 1.02 (0.93, 1.11) 0.98 (0.89, 1.06) 0.99 (0.88, 1.13) 1.02 (0.96, 1.09) 0.97 (0.91, 1.03) 0.46 (0.42, 0.50) 0.32 (0.28, 0.38) 0.46 (0.42, 0.50) 0.32 (0.28, 0.38) 0.46 (0.42, 0.50) Dropped due to collinearity Adjusting for test sat Co-amoxiclav Model with model interactions 1.03 (0.94, 1.13) 0.98 (0.92, 1.05) 1.00 (0.91, 1.09) 0.97 (0.86, 1.11) 0.97 (0.91, 1.03) 1.02 (0.93, 1.12) 0.97 (0.89, 1.07) 0.99 (0.87, 1.13) 0.46 (0.42, 0.50) Dropped due to collinearity 0.46 (0.42, 0.50) Dropped due to collinearity poisson maths_HEL eryth academic_year, irr where maths_HEL = {1, 2, 3, 4, 5, 6} Teacher assessed level Unadjusted Adjusting for test combination PROM cohort Erythromycin Co-amoxiclav Erythromycin* Co-amoxiclav Erythromycin Co-amoxiclav Erythromycin* Co-amoxiclav Papers 2 and 3 Adjusting for academic year Erythromycin Co-amoxiclav Model with model model interactions 1.01 (0.95, 1.07) 1.04 (0.95, 1.13) 1.01 (0.95, 1.07) 1.04 (0.95, 1.13) 0.94 (0.83, 1.06) 1.00 
(0.94, 1.06) 0.48 (0.44, 0.52) Erythromycin model 0.99 (0.93, 1.06) Adjusting for test sat Co-amoxiclav model 0.99 (0.93, 1.05) 1.01 (0.92, 1.10) 0.99 (0.91, 1.08) 0.98 (0.87, 1.11) 1.00 (0.93, 1.06) 0.98 (0.92, 1.04) 0.48 (0.44, 0.52) 0.48 (0.44, 0.52) 0.48 (0.44, 0.52) Model with interactions 1.01 (0.93, 1.11) 1.01 (0.93, 1.10) 0.96 (0.85, 1.09) 0.98 (0.92, 1.04) 1.01 (0.92, 1.10) 0.99 (0.90, 1.08) 0.98 (0.86, 1.11) 0.48 (0.44, 0.52) 0.48 (0.44, 0.52) 89 Paper 3 only Example Stata command: 0.34 (0.30, 0.40) 0.34 (0.30, 0.40) 0.34 (0.30, 0.40) Dropped due to collinearity Dropped due to collinearity Dropped due to collinearity poisson maths_scale eryth academic_year, irr where maths_scale = {1, 2, 3, 4, 5, 6} Therefore there are no treatment differences evident when adjusting for tests sat. Sitting the Level 2 and 3 tests increases the level achieved, and sitting the Level 3 test increases the level achieved further, compared to sitting only the Level 2 test. Standardisation/anchoring using PIPS data To begin with the Maths data has been used, data is available from 2001 to 2007 on 104,750 children. The data consists of a PIPS score and KS1 level, along with the year the child sat the test. Data is not available on which test the child sat. 4) Exploratory analyses on PIPS data Mean and 95% CI for Maths PIPS score each year, and overall N Mean (95% CI) 2001 21,078 20.79 (20.68, 20.89) 2002 17,152 20.80 (20.69, 20.91) 2003 15,547 20.62 (20.50, 20.73) 2004 11,190 20.73 (20.59, 20.87) 2005 10,787 20.83 (20.69, 20.97) 2006 13,827 20.74 (20.62, 20.86) 2007 15,169 20.56 (20.44, 20.67) Overall 104,750 20.72 (20.68, 20.77) Variations year on year are very minor. Furthermore there is no evidence of an increasing or decreasing trend in scores over time, which the graph below illustrates more clearly: PROM cohort 20.4 20.6 PIPS score 20.8 21 90 2001 2002 2003 2004 School year 2005 2006 2007 The horizontal red lines represent the mean and 95% CI for the overall scores. 
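As a minimal sketch, the yearly 95% CIs from the table of mean Maths PIPS scores can be compared with the overall interval as follows. The names `YEARLY_CI`, `OVERALL_CI` and `overlaps` are illustrative assumptions, and intervals that share an endpoint are treated here as overlapping.

```python
# 95% CIs for the mean Maths PIPS score, copied from the table above.
YEARLY_CI = {
    2001: (20.68, 20.89),
    2002: (20.69, 20.91),
    2003: (20.50, 20.73),
    2004: (20.59, 20.87),
    2005: (20.69, 20.97),
    2006: (20.62, 20.86),
    2007: (20.44, 20.67),
}
OVERALL_CI = (20.68, 20.77)


def overlaps(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Years whose interval lies entirely outside the overall interval.
non_overlapping = [year for year, ci in YEARLY_CI.items()
                   if not overlaps(ci, OVERALL_CI)]
print(non_overlapping)
```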
The only year for which the 95% CI does not overlap with the overall CI is 2007. The table and graph suggest that PIPS scores are fairly constant over time, suggesting that standards have not changed.

Histogram of overall Maths PIPS score (histograms by year are available in Appendix A – section 8)

[Figure: histogram of overall Maths PIPS score; x-axis: mathsPIPS (0 to 40), y-axis: density (0 to .08)]

The data appear to broadly follow a normal distribution, although the tail for the lower scores is noticeably larger than the tail for the upper scores.

5) Analyses of the relationship between PIPS score and KS1 level

The relationship between PIPS score and KS1 level is examined, both overall and by year. This is to: 1) assess the appropriateness of the use of PIPS data with KS1 levels and 2) look for evidence of changes over the years in KS1 test standards.

Box plot of PIPS score by KS1 level (box plots by year are available in Appendix A – section 8)

[Figure: box plots of PIPS score (0 to 40) by KS1 level: Below level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3+]

There is a trend of increasing PIPS score with increasing KS1 level, although there is a moderate amount of overlap between the levels.

Mean and 95% CI for PIPS score, by KS1 level and school year

For this the data have been standardised to enable easier identification of trends. The data have been standardised relative to the 2001 data, so that the 2001 data have mean 50 and standard deviation 10.
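A minimal sketch of this kind of standardisation is given below: each score is rescaled using the mean and standard deviation of the reference year, so that the reference year itself has mean 50 and standard deviation 10. The function name `standardise`, the use of the population standard deviation and the example scores are all assumptions made for illustration; the source does not state which form of the standard deviation was used.

```python
from statistics import mean, pstdev


def standardise(scores, ref_scores, target_mean=50.0, target_sd=10.0):
    """Rescale scores so that ref_scores would have the target mean and SD."""
    ref_mean = mean(ref_scores)
    ref_sd = pstdev(ref_scores)  # population SD (an assumption, see lead-in)
    return [target_mean + target_sd * (x - ref_mean) / ref_sd for x in scores]


# Hypothetical raw PIPS scores (illustrative only, not the PIPS reference data).
scores_2001 = [12, 18, 21, 24, 30]
scores_2002 = [10, 19, 22, 25, 29]

std_2001 = standardise(scores_2001, scores_2001)  # reference year
std_2002 = standardise(scores_2002, scores_2001)  # later year on the 2001 scale
print(std_2001)
print(std_2002)
```

Because later years are rescaled with the 2001 mean and standard deviation rather than their own, any drift in scores over time remains visible on the standardised scale.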
The mean (95% CI) standardised PIPS scores by KS1 level and school year are given below: 2001 Below level 1 N Mean (95% CI) Level 1 Level 2C 403 31.70 (31.11, 2002 215 30.63 32.28) (29.87, 2003 246 31.49 31.38) (30.79, 2004 171 30.64 32.19) (29.82, 2005 117 29.73 31.47) (28.94, 2006 246 30.35 30.52) (29.63, 2007 264 30.43 31.07) (29.84, Overall 1662 30.88 31.02) (30.61, N 1305 1183 939 618 642 821 926 6434 Mean 35.40 35.74 34.96 34.30 34.98 35.05 34.96 35.14 (95% CI) (35.04, N Mean (95% CI) 3431 41.59 (41.37, PROM cohort 35.75) (35.38, 41.81) 2527 41.54 (41.30, 36.09) (34.59, 41.79) 2555 41.33 (41.08, 35.32) (33.84, 41.58) 1635 40.50 (40.19, 34.76) (34.54, 40.81) 1772 40.40 (40.11, 35.42) (34.63, 40.69) 2135 40.94 (40.67, 35.48) (34.58, 41.21) 2161 40.56 (40.29, 31.15) 35.33) (34.99, 35.29) 40.83) 16,216 41.08 (40.98, 41.18) 93 Level 2B N 5203 3342 3075 2378 2217 3065 3315 Mean 47.85 47.17 47.18 46.68 47.22 47.27 46.86 (95% CI) Level 2A (46.95, 47.38) (46.96, 47.40) (46.42, 46.93) (46.96, 47.49) (47.05, 47.49) (46.64, 4602 4139 3657 2720 3020 3993 4053 Mean 53.31 52.30 52.40 52.40 52.96 53.54 52.85 N Mean (95% CI) Missing 48.03) N (95% CI) Level 3+ (47.68, N Mean (95% CI) (53.14, 53.48) 5579 52.49) 5129 59.63 (59.48, (52.12, (58.88, 52.59) 4471 59.04 59.79) (52.21, (58.81, 52.62) 3397 58.98 59.20) (52.18, (58.76, 53.17) 2817 58.96 59.15) (52.75, (59.74, 53.72) 3172 59.93 59.16) (53.36, (59.65, 47.25 47.07) (59.30, (52.78, 59.37 59.64) (59.31, 617 604 271 202 395 714 3358 45.47 43.98 45.15 44.91 44.76 46.68 47.64 45.65 46.33) (43.12, 44.84) (44.25, 46.06) (43.54, 46.28) (43.29, 46.23) (45.61, 47.74) (46.88, 52.92) 28,301 555 (44.60. 47.33) 52.85 53.03) 59.47 60.03) (47.17, 26,184 3736 59.84 60.13) (52.67, 22,595 48.40) (45.29, 59.44) 46.02) For all years there are strong distinctions between the mean (95% CI) PIPS scores for each KS1 level. There are some differences between years in mean PIPS scores for each level. 
These are represented graphically in Appendix A – section 8. These plots do not demonstrate any trends in levels over time; there are some variations, but these appear to be random as they are not supported by all levels, or by all years.

The correlation coefficient for PIPS score and KS1 level is 0.79, indicating a relatively strong correlation between the two measures. If a regression model is fitted with PIPS score as the outcome and KS1 level as the explanatory variable, the adjusted R² value is 0.63 and the coefficient estimate for KS1 level is 4.44 (4.42, 4.47). Adding school year to the regression model does not alter the value of R².

Example Stata command:
regress mathsPIPS mathsKS1_cat academic_year
where mathsPIPS = maths PIPS score, mathsKS1_cat = {1, 2, 3, 4, 5, 6}

All of this provides evidence that the PIPS scores are closely related to KS1 levels, and that overall standards have not changed over time, as PIPS scores are relatively stable over time.

6) Anchoring KS1 level data

The KS1 level scores for the students for whom we have PIPS scores have been dichotomised at level 2 and above, and below level 2.
These have been tabulated against PIPS scores dichotomised at above 12 and 12 and below for each year:

Year   PIPS score   >= Level 2         < Level 2          Total
2001   >12          17,069 (97.82%)    380 (2.18%)        17,449
       <=12         1,746 (56.80%)     1,328 (43.20%)     3,074
2002   >12          13,838 (97.51%)    354 (2.49%)        14,192
       <=12         1,299 (55.44%)     1,044 (44.56%)     2,343
2003   >12          12,452 (98.10%)    241 (1.90%)        12,693
       <=12         1,306 (58.04%)     944 (41.96%)       2,250
2004   >12          9,141 (98.45%)     144 (1.55%)        9,285
       <=12         989 (60.53%)       645 (39.47%)       1,634
2005   >12          8,819 (98.37%)     146 (1.63%)        8,965
       <=12         1,007 (62.16%)     613 (37.84%)       1,620
2006   >12          11,224 (98.14%)    213 (1.86%)        11,437
       <=12         1,141 (57.19%)     854 (42.81%)       1,995
2007   >12          11,965 (98.22%)    217 (1.78%)        12,182
       <=12         1,300 (57.19%)     973 (42.81%)       2,273

If the tests were identical over time we would expect identical percentages for each year in the table above. For the percentages for 2002-2007 to be identical to those from 2001, KS1 levels will need ‘reassigning’ as indicated in the table below:

Year   Movement
2002   1.67% <level 2 moved to >=level 2
2003   1.52% >=level 2 moved to <level 2
2004   4.36% >=level 2 moved to <level 2
2005   5.91% >=level 2 moved to <level 2
2006   0.71% >=level 2 moved to <level 2
2007   0.79% >=level 2 moved to <level 2

We have applied this to the ORACLE KS1 data to anchor the data according to the PIPS data. When reassigning from >=level 2 to <level 2 it would be most logical to reassign those who scored >=level 2 with the lowest scores, and vice versa when reassigning in the opposite direction. However, we do not know this information without reverting to raw score data. Therefore the only solution is to reassign equally from each treatment group.
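The source does not state how the movement percentages were calculated. The published figures are, however, consistent with summing, across the two PIPS bands, the percentage-point difference from 2001 in the proportion at level 2 or higher. The sketch below checks that arithmetic using the percentages as printed in the yearly table; it is an inference from the numbers rather than a documented method, and the names `PCT_LEVEL2_PLUS` and `movement` are illustrative.

```python
# Percentage of children at level 2 or higher within each PIPS band,
# copied from the yearly table (two decimal places as printed).
# Each entry is (PIPS > 12, PIPS <= 12).
PCT_LEVEL2_PLUS = {
    2001: (97.82, 56.80),
    2002: (97.51, 55.44),
    2003: (98.10, 58.04),
    2004: (98.45, 60.53),
    2005: (98.37, 62.16),
    2006: (98.14, 57.19),
    2007: (98.22, 57.19),
}
REFERENCE_YEAR = 2001


def movement(year):
    """Signed movement (percentage points) relative to the reference year.

    Assumed rule: sum of the differences in the two PIPS bands.
    """
    ref_high, ref_low = PCT_LEVEL2_PLUS[REFERENCE_YEAR]
    high, low = PCT_LEVEL2_PLUS[year]
    return round((high - ref_high) + (low - ref_low), 2)


movements = {year: movement(year)
             for year in PCT_LEVEL2_PLUS if year != REFERENCE_YEAR}

for year, value in movements.items():
    # Negative: fewer at level 2+ than in 2001, so children move up to >=level 2.
    if value < 0:
        direction = "<level 2 moved to >=level 2"
    else:
        direction = ">=level 2 moved to <level 2"
    print(year, f"{abs(value):.2f}%", direction)
```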
This has been done. The tables below show how many children have been moved in each group, for both parental and DfE data (N = total children; Move = number of children to move):

Parental data

       Percentage to move   Erythromycin &   Erythromycin   Co-amoxiclav   Double
       and direction        Co-amoxiclav     only           only           placebo
Year                          N   Move         N   Move       N   Move       N   Move
2001   -                      3   -            0   -          1   -          1   -
2002   1.67% down            39   1           29   0         34   1         44   1
2003   1.52% up              81   1           73   1         74   1         67   1
2004   4.36% up              81   4           86   4         96   4         82   4
2005   5.91% up             104   6          130   8        133   8        118   7
2006   0.71% up             128   1          121   1        152   1        140   1
2007   0.79% up              88   1           67   1         86   1         91   1

DfE data

       Percentage to move   Erythromycin &   Erythromycin   Co-amoxiclav   Double
       and direction        Co-amoxiclav     only           only           placebo
Year                          N   Move         N   Move       N   Move       N   Move
2001   -                      5   -            3   -          5   -          4   -
2002   1.67% down            53   1           47   1         46   1         62   1
2003   1.52% up             107   2           97   1         95   1         91   1
2004   4.36% up             111   5          128   6        138   6        111   5
2005   5.91% up             161  10          186  11        180  11        178  11
2006   0.71% up             219   2          214   2        228   2        219   2
2007   0.79% up             136   1          129   1        139   1        146   1

The data have now been reanalysed using Mantel-Haenszel methods as done in part 1.
Results are below:

Data       Group             N      Maths below level 2   Maths MH OR (95% CI)
Parental   Erythromycin      1030   142 (13.8%)           1.11 (0.87, 1.43)
           No Erythromycin   1119   140 (12.5%)
           Co-amoxiclav      1100   140 (12.7%)           0.93 (0.72, 1.20)
           No Co-amoxiclav   1049   142 (13.5%)
DfE        Erythromycin      1596   296 (18.5%)           1.03 (0.86, 1.24)
           No Erythromycin   1642   296 (18.0%)
           Co-amoxiclav      1623   289 (17.8%)           0.93 (0.78, 1.11)
           No Co-amoxiclav   1615   303 (18.8%)

Example Stata command:
mhodds maths_di eryth, by(academic_year)
where maths_di = 0 - level 2 or higher, 1 - below level 2

ORs are very similar to those obtained from the unanchored data on page 4.

Appendix A

Section 1 – Unadjusted Ordinal Logistic Regression, Proportional Odds Assumptions

The graphs overleaf illustrate the assumptions for the parental data for reading level associated with Erythromycin, and maths level associated with Co-amoxiclav:

[Figure: Reading, Erythromycin. Stacked percentage bars (0 to 100) of reading level (Under level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3 or above) for Eryth and No Eryth, by year and in total. N (Eryth, No Eryth): 2001: 3, 2; 2002: 68, 78; 2003: 152, 141; 2004: 167, 178; 2005: 231, 250; 2006: 248, 292; 2007: 155, 177; Total: 1024, 1118.]

The following graphs are from the DfE data – writing for both Erythromycin and Co-amoxiclav

[Figure: Writing, Co-amoxiclav. Stacked percentage bars (0 to 100) of writing level (Under level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3 or above) for Co-amox and No Co-amox, by year and in total. N (Co-amox, No Co-amox): 2001: 10, 7; 2002: 99, 109; 2003: 202, 188; 2004: 248, 238; 2005: 339, 364; 2006: 446, 432; Total: 1344, 1338.]

Section 2 – Unadjusted Poisson Regression Assumptions

The following plots assess the assumptions and viability of the parental data reading
with erythromycin model:

[Figure: Pearson's residuals against fitted values]
[Figure: Pearson's residuals against linear predictor]
[Figure: Standardized Pearson's residuals against id]

Section 3 - Adjusted Poisson Regression

Again plots are for the parental dataset reading with erythromycin model, allowing for neonatal outcomes:

[Figure: Pearson's residuals against fitted values]
[Figure: Pearson's residuals against linear predictor]
[Figure: Standardized Pearson's residuals against id]
[Figure: Leverage against id]

Section 4 – Mapping categories to continuous scores

Residual plots using the parental dataset and the reading with erythromycin model:

[Figure: Histogram of standardised residuals]
[Figure: Normal probability plot of standardised residuals]
[Figure: Plot of standardised residuals against fitted values (linear prediction)]
[Figure: Plot of standardised residuals against id]

Section 5 – Adjusted mapping categorical to continuous models

Residual plots using the reading with erythromycin model, shown for the models not allowing and allowing for neonatal outcomes:

[Figure: Histogram of residuals]
[Figure: Normal probability plot of standardised residuals]
[Figure: Plot of standardised residuals against fitted values]
[Figure: Plot of standardised residuals against gestation at birth (gest_at_rnd)]

Section 6 – Unadjusted raw score modelling

Residual plots using the maths raw score adjusting for paper sat with erythromycin model:

[Figure: Histogram of residuals]
[Figure: Normal probability plot of standardised residuals]
[Figure: Plot of standardised residuals against fitted values]
[Figure: Plot of standardised residuals against id]

Section 7 – Adjusted raw score modelling

Residual plots using the maths raw score with erythromycin model, allowing for neonatal outcomes and adjusting for academic year:

[Figure: Histogram of standardised residuals]
[Figure: Normal probability plot of standardised residuals]
[Figure: Plot of standardised residuals against fitted values]
[Figure: Plot of standardised residuals against id]

Section 8 – PIPS scores

Histograms of PIPS scores by academic year

[Figure: Density histograms of mathsPIPS, one panel per school year, 2001 to 2007]

Boxplots of PIPS score for KS1 level, by school year

[Figure: Boxplots of PIPS score, one panel per school year, 2001 to 2007]

(KS1 labels have been omitted for space – but all boxes are in the order Below level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3+)

[Figure: PIPS score (y-axis) against school year, 2001 to 2007 (x-axis), one panel per KS1 level: Below level 1, Level 1, Level 2C, Level 2B, Level 2A, Level 3+]