APPENDIX

Table of contents

I. Missing data
    Missing data analysis
    Missing data imputation
    Sensitivity analysis
        Survival prediction model
        Pulmonary fibrosis prediction model
        Pulmonary hypertension prediction model
II. Statistical Analysis
    Standardised mortality ratio analysis
    Pulmonary complications and survival prediction models building
III. Threshold analysis
    Threshold analysis for survival model
    Threshold analysis for pulmonary fibrosis model
    Threshold analysis for pulmonary hypertension model
IV. Pulmonary complications and survival prediction models interpretation
    Survival model
    Pulmonary fibrosis model
    Pulmonary hypertension model
V. Analysis of cumulative incidence and antibody associations of pulmonary arterial hypertension and clinically significant pulmonary fibrosis-associated pulmonary hypertension separately
VI. References

I. Missing data

Missing data analysis

The prediction models used 33 predictor and outcome variables, which are listed in Table I.1. A relatively small proportion of data relating to demographic and general clinical characteristics of the subjects, as well as organ complications and vital status of the patients, were missing. Presence of PF could not be ascertained for 4 (1%) of the patients because no HRCT results were available, although in 2 of those pulmonary function tests were available and, based on those, csPF could be reasonably excluded. In addition, ethnicity data were missing for 15 (3.8%) patients, Raynaud’s phenomenon onset date was not recorded in 25 (6.3%), smoking history in 55 (13.8%) and autoantibody specificity data were missing in 22 (5.5%) of the patients. A significant proportion of the patients had incomplete data regarding clinical assessments and test results in the first years of their disease.
For survival and csPF prediction models we used data available within the first 3 years from disease onset, while for the PH prediction model data were used if available within the first 5 years from disease onset. For that reason the dataset used for the survival/PF prediction model has a larger proportion of missing data compared to the dataset used for PH prediction. In particular, mRss was assessed within the first 3 years of disease onset in 61% of the subjects and within the first 5 years in 76% of them. PFT results within 3 years of disease onset were available in 65% of the patients and within 5 years in 81% of them. Similar patterns were observed in blood test results and assessments of Raynaud’s phenomenon severity, presence of digital ulcers or gangrene, oesophageal involvement and tendon friction rubs (Table I.1).

Table I.1. Missing data in the variables used for the prediction models

| Group | Variable | Coding | Missing data in the survival/PF model, n (%) | Missing data in the PH model, n (%) |
|---|---|---|---|---|
| Demographic and clinical characteristics | Ethnicity | Other/Asian/Black/Caucasian | 15 (3.8) | 15 (3.8) |
| Demographic and clinical characteristics | Male gender | Y/N | 0 (0.0) | 0 (0.0) |
| Demographic and clinical characteristics | Age at scleroderma onset | Years | 0 (0.0) | 0 (0.0) |
| Demographic and clinical characteristics | Diffuse subset | Y/N | 0 (0.0) | 0 (0.0) |
| Demographic and clinical characteristics | Raynaud's duration at onset of scleroderma | Months | 25 (6.3) | 25 (6.3) |
| Demographic and clinical characteristics | Smoking history | Non-/Past-/Active smoker | 55 (13.8) | 55 (13.8) |
| Demographic and clinical characteristics | Polymyositis/Dermatomyositis overlap | Y/N | 0 (0.0) | 0 (0.0) |
| Demographic and clinical characteristics | Rheumatoid arthritis overlap | Y/N | 0 (0.0) | 0 (0.0) |
| Demographic and clinical characteristics | Sjogren’s syndrome overlap | Y/N | 0 (0.0) | 0 (0.0) |
| Demographic and clinical characteristics | Systemic lupus erythematosus overlap | Y/N | 0 (0.0) | 0 (0.0) |
| Auto-antibodies | Auto-antibodies | Other/ACA/ATA/ARA/U3RNP/U1RNP/PMScl/ThRNP/ANA/ANAneg* | 22 (5.5) | 22 (5.5) |
| Organ complications and vital status | Pulmonary fibrosis | No/Mild/Clinically significant | 4 (1.0) | 4 (1.0) |
| Organ complications and vital status | Pulmonary hypertension | Y/N | 0 (0.0) | 0 (0.0) |
| Organ complications and vital status | Cardiac scleroderma | Y/N | 0 (0.0) | 0 (0.0) |
| Organ complications and vital status | Scleroderma renal crisis | Y/N | 0 (0.0) | 0 (0.0) |
| Organ complications and vital status | Death | Y/N | 0 (0.0) | 0 (0.0) |
| Organ complications and vital status | Time to clinically significant pulmonary fibrosis | Months | 2 (0.5) | 2 (0.5) |
| Organ complications and vital status | Time to pulmonary hypertension | Months | 0 (0.0) | 0 (0.0) |
| Organ complications and vital status | Time to cardiac scleroderma | Months | 0 (0.0) | 0 (0.0) |
| Organ complications and vital status | Time to scleroderma renal crisis | Months | 0 (0.0) | 0 (0.0) |
| Organ complications and vital status | Time to death | Months | 0 (0.0) | 0 (0.0) |
| First available assessments | Modified Rodnan skin score | 0-51 | 156 (39.2) | 96 (24.1) |
| First available assessments | Forced vital capacity (FVC) | % predicted | 139 (34.9) | 74 (18.6) |
| First available assessments | Carbon monoxide diffusion capacity (DLCO) | % predicted | 139 (34.9) | 76 (19.1) |
| First available assessments | Corrected diffusion capacity (KCO) | % predicted | 267 (67.1) | 179 (45.0) |
| First available assessments | Haemoglobin | g/dL | 150 (37.7) | 76 (19.1) |
| First available assessments | Erythrocyte sedimentation rate (ESR) | mm/1st h | 156 (39.2) | 84 (21.1) |
| First available assessments | Serum creatinine | μmol/L | 145 (36.4) | 77 (19.3) |
| First available assessments | Proteinuria | Y/N | 135 (33.9) | 83 (20.9) |
| First available assessments | Raynaud's grade 2-3 | Y/N | 153 (38.4) | 93 (23.4) |
| First available assessments | Digital ulcers and/or gangrene | Y/N | 152 (38.2) | 92 (23.1) |
| First available assessments | Friction rubs | Y/N | 136 (34.2) | 84 (21.1) |
| First available assessments | Oesophageal involvement | Y/N | 182 (45.7) | 149 (37.4) |

*ACA: anti-centromere antibody; ATA: anti-topoisomerase I antibody; ARA: anti-RNA polymerase antibody; U3RNP: anti-U3RNP antibody; U1RNP: anti-U1RNP antibody; PMScl: anti-PmScl antibody; ThRNP: anti-ThRNP antibody; ANA: non-identified anti-nuclear antibody; ANAneg: anti-nuclear antibody negative.

Overall, for models using data available within the first 3 years of disease, only 88 subjects (22%) had complete data for all variables, while for the model using data available within 5 years of disease onset, 136 (34%) of the subjects had complete data. As a result, it was judged that analysing only cases with complete data would lose a substantial amount of available information, and therefore reduce precision. In addition, complete case analysis gives valid unbiased results only when data are missing completely at random (1,2). On the other hand, multiple random imputation gives unbiased results with improved precision while coping with data missing at random (the probability that an observation is missing is not related to the missing value itself, but can be related to other variables in the analysis) (2,3). Neither approach would produce valid results if missing data are not random.
For that reason we analysed our data to assess whether the assumption that data are missing at random or completely at random is justified. Comparison between complete and incomplete cases showed some differences. In both datasets (3 year dataset and 5 year dataset) FVC and DLCO were significantly higher in the complete cases than in those that had at least one missing variable (difference of 6% and 8% for FVC and 8% and 9% for DLCO in the two datasets respectively, p<0.05). More importantly, complete cases were more often of the diffuse SSc subset and ARA positive, while having a lower proportion of overlap syndromes, less frequent U1RNP positivity and a lower cumulative incidence of PH and PF. A very strong relationship was found between missing values and disease duration at the time of patient referral. For the 3 and 5 year datasets, at the time patients were first seen in our centre, complete cases had a mean disease duration of 8 and 10 months respectively, compared to 32 and 35 months in the incomplete cases.

We also looked for any patterns in the missing data that might suggest that values were not missing at random. In particular, as lower values of PFT results are generally associated with pulmonary complications in SSc, we compared the frequency of lung complications in patients with and without missing PFTs. Similarly, higher serum creatinine levels and presence of proteinuria are associated with renal involvement; therefore we compared the frequency of SRC in patients with and without missing creatinine and proteinuria data. In both the 3 year and 5 year datasets we found a significantly higher frequency of PH among patients with missing PFT data. On the other hand, there was no difference in the frequency of csPF among those patients. In addition, we observed significantly longer disease duration at first visit among patients with missing PFT values.
PH patients on average had longer disease duration at first visit (42 months) compared to non-PH patients (24 months, p<0.001), which can explain why PH frequency was higher among patients with missing baseline PFT data. For markers of renal function we found no difference in SRC frequency between patients with missing and non-missing creatinine and proteinuria information in the 3 year dataset. In the 5 year dataset there was a significantly lower proportion of SRC cases in the group with missing proteinuria data (1.2% v 7.6%, p=0.038), but no difference in SRC frequency between patients with missing and non-missing creatinine data.

Although patients with milder skin and organ disease are generally seen less frequently than those with significant clinical problems, in this analysis we used only the first available measurements and assessments, rather than repeated ones, and these were used if taken within comparatively wide time windows (first 3 or 5 years of disease). As a result, frequency of follow-up, which is strongly dependent on disease severity, is unlikely to have substantially affected the patterns of missing data.

The majority of patients seen in our centre are not local to the hospital and often, as part of shared care, they undergo disease monitoring, including regular blood tests and organ disease screening, in their local hospitals. Relevant results are generally forwarded to us and, since they are a necessary part of initial patient assessment, locally performed tests are often sent to us as part of patient referral; therefore shared care is not necessarily associated with a larger proportion of missing data. It is often difficult to ascertain the mechanisms leading to missing information and sometimes those can be different for the different variables and even for different values.
For example, blood test results obtained in a local hospital are less likely to be forwarded to a specialist centre, especially if the patient has mild disease, while lung function results are routinely sent to us, regardless of disease severity. On the other hand, missing lung function results, particularly DLCO measurements, are often the result of poor patient technique at the time of testing and are generally due to SSc mouth involvement, although in some cases measurements could be too low to detect, if lung involvement is very advanced. Consistently in both datasets we observed a significantly lower frequency of the diffuse cutaneous subset and longer disease duration at first assessment among patients with missing values for mRss, Hb, ESR, serum creatinine and proteinuria, which may reflect that patients with more severe skin disease get diagnosed and referred to a specialist centre earlier and therefore have less missing data relating to their early disease. For that reason, our data were assumed to be missing at random (related to referral pattern and disease subset) and disease duration at first assessment was included in the imputation model, although not in the survival and pulmonary complication prediction analysis.

Missing data imputation

We performed multiple random imputation of missing variables using SPSS. All variables were included in the imputation model and those having no missing data were used as predictors only. Constraints based on the range of the available values were used for all missing continuous variables. In addition, those with positively skewed distribution (ESR, Cr, mRss and time between Raynaud’s and SSc onset) were log-transformed, while negatively skewed ones (Hb) were square-root transformed in order to achieve near normal distribution prior to the imputation procedure. The imputed values were subsequently exponentiated for the Cox regression analysis.
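The log-transform step described above can be illustrated with a minimal Python sketch. This is not the SPSS procedure used in the study: the function names and the ESR values are hypothetical, and clamping back-transformed values to the observed range is only an assumed stand-in for the range constraints that SPSS applies during imputation.

```python
import math

def log_transform(values):
    """Log-transform a positively skewed variable (e.g. ESR) before imputation."""
    return [math.log(v) for v in values]

def back_transform(imputed_log_values, lower, upper):
    """Exponentiate imputed values and keep them within the observed range.

    Clamping after imputation is an illustrative assumption, not the SPSS method.
    """
    restored = [math.exp(z) for z in imputed_log_values]
    return [min(max(v, lower), upper) for v in restored]

# Hypothetical observed ESR values (mm in the first hour)
observed_esr = [8.0, 15.0, 42.0, 90.0]
log_esr = log_transform(observed_esr)

# Hypothetical values drawn by an imputation model on the log scale
imputed_log = [2.5, 5.0]
imputed_esr = back_transform(imputed_log, min(observed_esr), max(observed_esr))
```

In this sketch the second imputed value would fall above the largest observed ESR after exponentiation, so it is pulled back to the observed maximum.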
Variables with the smallest proportion of missing values were imputed first and those with the largest were imputed last. Linear regression was used for imputation of continuous variables and logistic regression for imputation of categorical variables. The imputation method used was fully conditional specification and 25 imputed datasets were created. The prediction model results presented are pooled from the analysis of the 25 imputed datasets.

Comparisons between observed and imputed values demonstrated significantly greater frequency of proteinuria and friction rubs and lower levels of mRss in the imputed group for both 3 year and 5 year datasets. In addition, the 5 year dataset had lower imputed DLCO levels (mean DLCO of 54%) compared to the observed DLCO values (mean DLCO of 66%, p=0.006). When comparing observed and imputed mRss and DLCO levels for lcSSc and dcSSc subset separately, the only difference we found was for DLCO levels in lcSSc patients (mean observed 67%, mean imputed 54%, p=0.003).

Sensitivity analysis

Sensitivity analysis was performed to assess the degree to which missing data imputation has affected the findings of the analysis. Missing values were substituted with their minimum and maximum imputed values and prediction models were derived in both datasets.

Survival prediction model

When missing values were substituted with their imputed minimum, the derived survival model was very similar to the one based on the multiply imputed dataset. The only additional variable in the model was presence of proteinuria, which together with serum creatinine levels is a marker of renal involvement (Table I.2).

Table I.2. Survival prediction model derived in the dataset with missing values substituted with their imputed minimum values

| Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|
| DcSSc | 0.425 | 0.023 | 1.530 | 1.059-2.210 |
| Age at onset | 0.049 | <0.001 | 1.050 | 1.035-1.065 |
| DLCO | -0.017 | <0.001 | 0.984 | 0.976-0.991 |
| Hb | -0.179 | 0.001 | 0.836 | 0.752-0.931 |
| Cr | 0.003 | 0.001 | 1.003 | 1.001-1.004 |
| Proteinuria | 1.431 | 0.001 | 4.182 | 1.811-9.656 |
| PH3y | 1.372 | 0.001 | 3.943 | 1.761-8.828 |
| Cardiac SSc3y | 1.843 | <0.001 | 6.318 | 2.746-14.538 |

Derivation of the survival model in a dataset where missing values were substituted with their imputed maximum demonstrated that subset, age, presence of PH and cardiac SSc are significantly associated with survival. Although lung function results did not remain in the final model, presence of csPF and ATA positivity, which are associated with lower FVC and DLCO, did. In this model, smoking history was also a significant predictor of survival. Serum creatinine levels were no longer in the model, but were substituted by history of SRC (Table I.3).

Table I.3. Survival prediction model derived in the dataset with missing values substituted with their imputed maximum values

| Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|
| DcSSc | 0.379 | 0.035 | 1.460 | 1.028-2.075 |
| Age at onset | 0.051 | <0.001 | 1.052 | 1.036-1.068 |
| ATA | 0.634 | 0.001 | 1.886 | 1.306-2.723 |
| PF3y | 0.527 | 0.009 | 1.694 | 1.141-2.515 |
| Smoking history - no | reference category | | | |
| Smoking history - ex | 0.286 | 0.182 | 1.330 | 0.875-2.024 |
| Smoking history - current | 0.774 | <0.001 | 2.168 | 1.462-3.216 |
| RC3y | 1.078 | <0.001 | 2.939 | 1.650-5.236 |
| PH3y | 1.398 | 0.001 | 4.047 | 1.795-9.124 |
| Cardiac SSc3y | 1.404 | 0.004 | 4.072 | 1.554-10.670 |

When the model derived from the imputed dataset was tested in the datasets where missing values were substituted by their imputed minimums and maximums, the model performed comparatively well with only serum creatinine levels not showing significant association with the outcome in the dataset using imputed maximums of the missing variables (Table I.4).
Table I.4. Performance of the survival prediction model derived from the imputed dataset in the datasets using imputed minimums and maximums of missing values

| Dataset | Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|---|
| Imputed minimum values | DcSSc | 0.461 | 0.014 | 1.585 | 1.099-2.287 |
| Imputed minimum values | Age at onset | 0.049 | <0.001 | 1.050 | 1.035-1.065 |
| Imputed minimum values | DLCO | -0.017 | <0.001 | 0.984 | 0.976-0.991 |
| Imputed minimum values | Hb | -0.177 | 0.001 | 0.838 | 0.754-0.932 |
| Imputed minimum values | Cr | 0.003 | 0.001 | 1.003 | 1.001-1.004 |
| Imputed minimum values | PH3y | 1.395 | 0.001 | 4.034 | 1.810-8.991 |
| Imputed minimum values | Cardiac SSc3y | 1.799 | <0.001 | 6.045 | 2.631-13.889 |
| Imputed maximum values | DcSSc | 0.572 | 0.002 | 1.771 | 1.235-2.540 |
| Imputed maximum values | Age at onset | 0.051 | <0.001 | 1.052 | 1.037-1.068 |
| Imputed maximum values | DLCO | -0.015 | <0.001 | 0.985 | 0.978-0.993 |
| Imputed maximum values | Hb | -0.180 | 0.001 | 0.835 | 0.750-0.931 |
| Imputed maximum values | Cr | 0.0001 | 0.803 | 1.000 | 0.999-1.001 |
| Imputed maximum values | PH3y | 1.423 | 0.001 | 4.151 | 1.799-9.577 |
| Imputed maximum values | Cardiac SSc3y | 1.857 | <0.001 | 6.407 | 2.736-15.004 |

Pulmonary fibrosis prediction model

The PF prediction models derived in the datasets with missing values substituted with their imputed minimums and maximums were similar to the model derived in the multiply imputed dataset. In particular, ACA remained a significant negative predictor, while ATA was a strong positive predictor of csPF. Reduction in DLCO also significantly increased the risk of csPF in both models, although FVC remained a significant predictor only in the model derived in the dataset with missing values substituted with their imputed maximums. Age at onset was also present only in this model, while diffuse subset was associated with a significant increase in the hazard for csPF only in the model derived from the dataset where missing values were substituted with their imputed minimums (Tables I.5 and I.6).

Table I.5. Clinically significant pulmonary fibrosis prediction model derived in the dataset with missing values substituted with their imputed minimum values

| Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|
| DcSSc | 0.538 | 0.011 | 1.712 | 1.129-2.597 |
| DLCO | -0.026 | <0.001 | 0.974 | 0.965-0.983 |
| ESR | 0.023 | <0.001 | 1.024 | 1.015-1.033 |
| ACA | -1.658 | <0.001 | 0.191 | 0.090-0.405 |
| ATA*T(years) | 0.188 | <0.001 | 1.207 | 1.106-1.317 |

Table I.6. Clinically significant pulmonary fibrosis prediction model derived in the dataset with missing values substituted with their imputed maximum values

| Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|
| Age at onset | 0.017 | 0.022 | 1.017 | 1.002-1.032 |
| FVC | -0.015 | 0.003 | 0.985 | 0.975-0.995 |
| DLCO | -0.012 | 0.033 | 0.988 | 0.977-0.999 |
| ACA | -2.051 | <0.001 | 0.129 | 0.046-0.356 |
| ATA*T(years) | 0.173 | <0.001 | 1.188 | 1.096-1.288 |

When the model derived in the multiply imputed dataset was tested in the two datasets using the minimum and maximum imputed values for the missing variables, the model demonstrated a comparatively good fit, with FVC showing no association with csPF in the dataset using imputed minimums of the missing values and disease subset showing no association with csPF in the dataset using imputed maximums of the missing values (Table I.7).

Table I.7. Performance of the clinically significant pulmonary fibrosis prediction model derived from the imputed dataset in the datasets using imputed minimums and maximums of missing values

| Dataset | Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|---|
| Imputed minimum values | DcSSc | 0.790 | <0.001 | 2.203 | 1.463-3.317 |
| Imputed minimum values | Age at onset | 0.015 | 0.050 | 1.015 | 1.000-1.030 |
| Imputed minimum values | FVC | 0.004 | 0.451 | 1.004 | 0.994-1.014 |
| Imputed minimum values | DLCO | -0.025 | 0.001 | 0.975 | 0.961-0.990 |
| Imputed minimum values | ACA | -1.659 | <0.001 | 0.190 | 0.089-0.405 |
| Imputed minimum values | ATA*T(years) | 0.180 | <0.001 | 1.198 | 1.099-1.305 |
| Imputed maximum values | DcSSc | 0.129 | 0.518 | 1.138 | 0.769-1.682 |
| Imputed maximum values | Age at onset | 0.018 | 0.019 | 1.018 | 1.003-1.033 |
| Imputed maximum values | FVC | -0.015 | 0.005 | 0.985 | 0.975-0.996 |
| Imputed maximum values | DLCO | -0.012 | 0.040 | 0.988 | 0.977-0.999 |
| Imputed maximum values | ACA | -1.992 | <0.001 | 0.136 | 0.048-0.384 |
| Imputed maximum values | ATA*T(years) | 0.175 | <0.001 | 1.191 | 1.098-1.291 |

Pulmonary hypertension prediction model

Derivation of prediction models for PH in the datasets with missing values substituted with their imputed minimums and maximums demonstrated almost identical results (Tables I.8 and I.9). Both showed, similar to the model derived from the multiply imputed dataset, that age at onset, DLCO, ARA, AFA, SRC and its interaction with DLCO were significant predictors of PH development. On the other hand, unlike the originally developed model, ATA and serum creatinine levels did not show any significant association with PH, although both models included presence of proteinuria.

Table I.8. Pulmonary hypertension prediction model derived in the dataset with missing values substituted with their imputed minimum values

| Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|
| Age at onset | 0.035 | 0.001 | 1.035 | 1.014-1.057 |
| DLCO | -0.046 | <0.001 | 0.955 | 0.939-0.971 |
| Proteinuria | 2.069 | <0.001 | 7.914 | 3.093-20.248 |
| Raynaud’s severity | 0.992 | 0.015 | 2.698 | 1.217-5.982 |
| ARA | 1.405 | 0.003 | 4.076 | 1.628-10.204 |
| AFA | 1.208 | 0.004 | 3.347 | 1.485-7.545 |
| RC5y | -4.212 | 0.011 | 0.015 | 0.001-0.377 |
| DLCO*SRC5y | 0.072 | 0.006 | 1.075 | 1.021-1.132 |

Table I.9. Pulmonary hypertension prediction model derived in the dataset with missing values substituted with their imputed maximum values

| Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|
| Age at onset | 0.032 | 0.003 | 1.033 | 1.011-1.055 |
| DLCO | -0.031 | <0.001 | 0.970 | 0.959-0.980 |
| Proteinuria | 1.560 | <0.001 | 4.758 | 2.720-8.323 |
| ARA | 1.056 | 0.023 | 2.876 | 1.159-7.137 |
| AFA | 1.348 | 0.001 | 3.848 | 1.697-8.727 |
| RC5y | -2.717 | 0.087 | 0.066 | 0.003-1.483 |
| DLCO*SRC5y | 0.044 | 0.034 | 1.045 | 1.003-1.087 |

The original PH prediction model, derived in the multiply imputed dataset, demonstrated a relatively good fit in the datasets using the imputed minimum and maximum values in the missing variables (Table I.10). ATA did not show a significant association with PH development in either test dataset, although there was a trend towards significance in the dataset using the imputed maximum of the missing values. ARA showed significant association with PH only in the dataset using imputed minimums of the missing values and, while history of SRC in the first 5 years of disease was significant in both datasets, its interaction with DLCO was significant only in the dataset using imputed maximums of the missing values.

Table I.10. Performance of the pulmonary hypertension prediction model derived from the imputed dataset in the datasets using imputed minimums and maximums of missing values

| Dataset | Variable | β | p-value | HR | 95.0% CI for HR |
|---|---|---|---|---|---|
| Imputed minimum values | Age at onset | 0.030 | 0.004 | 1.031 | 1.010-1.052 |
| Imputed minimum values | DLCO | -0.030 | <0.001 | 0.970 | 0.960-0.981 |
| Imputed minimum values | Cr | 0.004 | 0.006 | 1.004 | 1.001-1.007 |
| Imputed minimum values | ATA | -0.497 | 0.228 | 0.608 | 0.271-1.364 |
| Imputed minimum values | ARA | 1.103 | 0.027 | 3.013 | 1.134-8.007 |
| Imputed minimum values | AFA | 1.088 | 0.009 | 2.969 | 1.317-6.695 |
| Imputed minimum values | RC5y | -2.839 | 0.048 | 0.059 | 0.004-0.970 |
| Imputed minimum values | DLCO*RC5y | 0.034 | 0.107 | 1.035 | 0.993-1.079 |
| Imputed maximum values | Age at onset | 0.033 | 0.002 | 1.034 | 1.012-1.056 |
| Imputed maximum values | DLCO | -0.033 | <0.001 | 0.968 | 0.956-0.980 |
| Imputed maximum values | Cr | 0.004 | <0.001 | 1.004 | 1.002-1.006 |
| Imputed maximum values | ATA | -0.782 | 0.062 | 0.458 | 0.201-1.040 |
| Imputed maximum values | ARA | 0.681 | 0.141 | 1.975 | 0.798-4.890 |
| Imputed maximum values | AFA | 1.485 | 0.001 | 4.414 | 1.903-10.237 |
| Imputed maximum values | RC5y | -4.093 | 0.011 | 0.017 | 0.001-0.394 |
| Imputed maximum values | DLCO*RC5y | 0.050 | 0.019 | 1.051 | 1.008-1.096 |

Although the final models that were derived during the sensitivity analysis had some differences from the ones derived in the multiply imputed datasets, the predictor variables included did overlap to a great extent. In the majority of cases the different predictor variables reflected similar clinical problems, for example serum creatinine levels, proteinuria and history of SRC. This suggests that, even though the missing value imputation has affected the specific variables found to be predictors of survival and pulmonary complications, the general associations described in the prediction models are independent of the methods used for handling of missing data.

II. Statistical Analysis

Standardised mortality ratio analysis

Standardised mortality ratios (SMRs) were calculated by dividing the number of observed deaths by the number of expected deaths in our cohort, and 95% CIs for SMRs were calculated as SMR ± 1.98 × SE(SMR) = SMR ± 1.98 × (number of observed deaths)^(1/2) / (number of expected deaths) (4). Yearly expected numbers of deaths were calculated by multiplying gender-specific mortality rates by the number of patients at risk in each age group.
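The SMR calculation described above can be sketched in a few lines of Python. This is an illustration only: the function names, the age and gender groups, the mortality rates and the death count are hypothetical, and the multiplier of 1.98 is taken from the formula as stated in this appendix (1.96 is the more usual normal-approximation value).

```python
import math

def expected_deaths(mortality_rates, patients_at_risk):
    """Sum of (mortality rate x number of patients at risk) over age/gender groups."""
    return sum(mortality_rates[group] * patients_at_risk[group]
               for group in patients_at_risk)

def smr_with_ci(observed, expected, z=1.98):
    """Return (SMR, lower, upper) using SE(SMR) = sqrt(observed) / expected."""
    smr = observed / expected
    se = math.sqrt(observed) / expected
    return smr, smr - z * se, smr + z * se

# Hypothetical yearly mortality rates and numbers at risk (not study data)
rates = {("F", "40-44"): 0.002, ("F", "45-49"): 0.003, ("M", "40-44"): 0.003}
at_risk = {("F", "40-44"): 100, ("F", "45-49"): 150, ("M", "40-44"): 50}

expected = expected_deaths(rates, at_risk)
smr, lower, upper = smr_with_ci(observed=4, expected=expected)
```

In a full analysis the expected deaths would be accumulated over each year of follow-up, using the life-table rates that correspond to that year.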
We used the published interim life tables for England, publicly available on the website of the UK Office for National Statistics (5). As our group consisted of patients with disease onset between 1995 and 1999, for the first year of follow-up we used expected mortality rates from the 1996-1998 life table and then the consecutive yearly rates for the subsequent follow-up years. At the time of data analysis the most recent life table available was based on data from years 2008-2010, which allowed for 13 years of follow-up.

Pulmonary complications and survival prediction models building

To build pulmonary complications and survival prediction models, we used Cox regression analysis. Initially, univariable analysis was used to identify variables that significantly predict the outcome of interest (p≤0.05). They were subsequently included in multivariable analysis. In addition, variables that did not show significant association, but were judged clinically relevant, were forced into the multivariable models, to assess for potential interaction effects. The proportional hazards assumption was tested using log minus log plots and Schoenfeld residual plots. If there was an indication of proportionality violation, extended Cox regression models, allowing for use of time-dependent covariates, were used to fit time interaction terms. These were kept in the final model, if statistically significant (p≤0.05). The difference between -2 log likelihood values was used to compare the fit of nested models and Harrell’s c was calculated to assess discrimination of the final models where possible.

III. Threshold analysis

In order to make the prediction models easier to apply in practice, threshold analysis was used to look for appropriate categorisation of continuous variables. We compared KM survival estimates and Cox regression-derived hazard ratios to identify predictor variable cut-off points associated with the most significant separation in outcome.
The models were then run using the categorised predictor variables. To calculate risk scores for death, csPF and PH, we summed the rounded β estimates (or rounded doubled β estimates, where appropriate) for each variable category.

Threshold analysis for survival model

Continuous predictor variables were stratified to find optimal levels for categorisation. Age was divided into groups of <20, 21-30, 31-35, 36-40, 41-45, 46-50, 51-55, 56-60, 61-65, 66-70, 71-75 and >75 years. Cox regression analysis showed that the hazard of death increased significantly for patients with age at disease onset >60 years (HR 3.3, 95%CIs 2.3-4.7, p<0.001). Similarly, DLCO was divided into groups with levels of ≤30, 30-35, 35-40, 40-45, 45-50, 50-55, 55-60, 60-65, 65-70, 70-75, 75-80, 80-85 and >85% and the risk of death was significantly higher for patients with DLCO<65% (HR 2.3, 95%CIs 1.6-3.3, p<0.001). For haemoglobin we used the normal range as cut-off points and patients with Hb<11.5g/dL had a significantly increased hazard of death (HR 1.9, 95%CIs 1.2-3, p=0.004). Finally, serum creatinine levels were divided into groups of <70, 70-80, 80-90, 90-100, 100-150, 150-200, and >200μmol/L and, in a univariable Cox regression, levels greater than 100 μmol/L were associated with a significant increase in the hazard of death (HR 2.1, 95%CIs 1.3-3.5, p=0.004). The model with categorical variables is described in Table 4.

Threshold analysis for pulmonary fibrosis model

Age at onset, FVC and DLCO were categorised to make the model easier to use. Age was split into groups of ≤30, 31-35, 36-40, 41-45, 46-50, 51-55, 56-60, 61-65 and >65 years. The greatest separation in KM estimates of csPF development was seen between patients with age of up to 55 years and those above 55 years, although the difference remained non-significant, consistent with the findings of the univariable analysis. FVC was divided into groups of ≤30, 31-35, 36-40, 41-45, 46-50, 51-55, 56-60, 61-65, 66-70, 71-75, 76-80 and >80%.
Three groups differed distinctly in their association with development of csPF: patients with FVC>80%, FVC between 65% and 80% (HR 2.7, 95%CIs 1.6-4.6, p<0.001, compared to FVC>80%) and FVC<65% (HR 8.3, 95%CIs 4.88-14.1, p<0.001, compared to FVC>80%). Finally, DLCO was similarly divided into groups of ≤30, 31-35, 36-40, 41-45, 46-50, 51-55, 56-60, 61-65, 66-70, 71-75, 76-80 and >80%. DLCO of up to 55% was associated with HR 5.7 (95%CIs 3.46-9.52, p<0.001) compared to DLCO>55%. The final model using categorised variables is shown in Table 4. Rounded doubled β values were used as risk points. The interaction of ATA and disease duration had β of 0.141; therefore it could contribute a risk point for approximately every 4 years of follow-up.

Threshold analysis for pulmonary hypertension model

Threshold analysis, looking for optimal categorisation of the continuous predictors in the model (age at onset, serum creatinine levels and DLCO), was undertaken to make the model easier to apply in practice. Age was initially split into groups of <36, 36-40, 41-45, 46-50, 51-55, 56-60, 61-65 and >65. Univariable Cox regression analysis demonstrated that the hazard of PH increased significantly for groups of age greater than 55. Further analysis showed that age of SSc onset greater than 55 was associated with HR 2.2 (95% CIs 1.3-3.8, p=0.002) and at the end of follow-up 22% (n=24) of those patients had developed PH compared to 13% (n=37) among those aged 55 or younger, p=0.029. Similarly, creatinine levels were initially split into groups of <70, 70-80, 80-90, 90-100, 100-150, 150-200, 200-300 and >300μmol/L and a significant increase in HR was seen for groups with levels above 90, although separation was relatively mild. Further analysis demonstrated that the greatest separation in risk for PH was seen with a cut-off point of 85 μmol/L. Levels of 85 μmol/L or above were associated with HR of 2.3 (95% CIs 1.3-4, p=0.004) compared to levels below 85 μmol/L.
Finally, DLCO was split into groups of <55, 55-60, 61-65, 66-70, 71-80 and >80%. There was no significant increase in hazard for PH development in the groups with DLCO>65% and the hazard associated with DLCO<55% was much greater than that of DLCO between 55% and 65%; therefore DLCO was split into 3 groups. Univariably, DLCO between 55% and 65% was associated with HR 2.9 (95% CIs 1.1-7.9, p=0.033) and DLCO<55% was associated with HR 8.7 (95% CIs 3.8-19.9, p<0.001) compared to DLCO>65%. Details of the model using categorical variables are presented in Table 4.

IV. Pulmonary complications and survival prediction models interpretation

Survival model

Each 1 year increment in age at disease onset increased the hazard of death by 5%. Patients with dcSSc were 51% more likely to die compared to patients with lcSSc, if other characteristics were the same. A decrease of 1% in DLCO at baseline increased the hazard of death by 2%; a decrease of 1 g/dL in baseline haemoglobin level increased the hazard of death by 21% and an increase in serum creatinine levels of 1 μmol/L increased the hazard of death by 0.3%.

Pulmonary fibrosis model

Our findings show that patients with dcSSc have a 77% higher hazard of developing csPF and that, for each 1 year age increment at disease onset, the hazard of csPF increases by 2%. If all other characteristics are the same, 1% lower FVC or DLCO is associated with a 3% higher hazard of csPF. As previously shown, ACA was protective and patients who were ACA positive had over 80% reduction in the risk of csPF. On the other hand, ATA positivity increased the risk of developing csPF by 16% for every year of disease duration. For example, a patient who is ATA positive will have a hazard ratio of exp(0.149)=1.16, i.e. a 1.16-1=16% higher hazard of developing csPF after one year of disease, compared to patients with otherwise similar characteristics who do not carry ATA.
For an ATA-positive patient, the hazard rises to exp(5×0.149)≈2.10 after 5 years of SSc, which is a 110% increase in the hazard of csPF. Risk score points for csPF were calculated from the rounded doubled β values of the regression analysis. The interaction between ATA and time in years was associated with a β of 0.141. In order to have a β value of around 0.5 (corresponding to 1 risk point), time was recalculated in 4-year periods. This yielded a β value of 0.566 for the interaction. The interpretation is that in patients who are ATA positive, 1 risk point should be added to the risk score predicting csPF development for every 4 years of disease duration from onset. We believe these results are due to the strong association between ATA and csPF and the early development of csPF in the disease course. The longer an ATA-positive patient has had SSc, the greater the risk of csPF development, if it has not developed already.

Pulmonary hypertension model

The model demonstrated that older age at disease onset, higher serum creatinine and presence of ARA or AFA are associated with increased risk of PH, while ATA positivity reduced the hazard of PH. In particular, if other predictor variables were kept constant, patients who carried ARA were more than 3 times as likely to develop PH, and those who carried AFA nearly 4 times as likely, compared to those who carried other autoantibodies. Each additional year of age at disease onset contributed a 3% increase in the hazard of PH, while a 1 μmol/L increase in serum creatinine increased the hazard of PH by 0.4%. ATA positivity was associated with a 59% reduction in the hazard of PH. The interaction between history of SRC in the first 5 years of disease and DLCO revealed an interesting effect, where the effect of one variable could be positive or negative depending on the value of the other.
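A sign-changing interaction of this kind is easier to follow when the two hazard ratios are written out as functions. The Python sketch below assumes the values reported for this model: a hazard ratio of 0.939 per 1% higher DLCO in patients without SRC, an interaction ratio of 1.082, and DLCO centred at 70.8%. As a simplification, the hazard ratio for SRC at the centring value is taken to be exactly 1, whereas the model reports it as approximately 1. Function names are hypothetical.

```python
HR_DLCO = 0.939         # HR per 1% higher DLCO, patients without SRC
HR_INTERACTION = 1.082  # multiplier per 1% higher DLCO when SRC is present
DLCO_CENTRE = 70.8      # DLCO (%) at which the HR for SRC is assumed to equal 1


def hr_per_percent_lower_dlco(src):
    """HR of PH for a 1% lower DLCO, with or without a history of SRC."""
    per_percent_higher = HR_DLCO * (HR_INTERACTION if src else 1.0)
    return 1.0 / per_percent_higher


def hr_for_src(dlco_percent):
    """HR of PH for SRC vs no SRC at a given DLCO (%)."""
    return HR_INTERACTION ** (dlco_percent - DLCO_CENTRE)


print(hr_per_percent_lower_dlco(src=False))  # above 1: lower DLCO raises the hazard
print(hr_per_percent_lower_dlco(src=True))   # slightly below 1: the effect is cancelled out
print(hr_for_src(85.0), hr_for_src(55.0))    # above 1 at preserved DLCO, below 1 at low DLCO
```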
When other covariates were kept unchanged, in patients who did not develop SRC within the first 5 years of their disease, a 1% lower DLCO was associated with a 6% increase in the hazard of PH (1-0.939≈0.06). On the other hand, in patients with a history of SRC, a 1% lower DLCO was in fact associated with a 1.6% reduction in the hazard of PH (1-0.939×1.082=-0.016). Similarly, history of SRC had a different effect on the overall hazard of PH depending on the DLCO level, with the associated HR being greater than one (increased hazard) when DLCO>70.8% and less than one (reduced hazard) when DLCO<70.8%. For that reason, and in order to make the interaction easier to interpret, DLCO was centred at 70.8%. When DLCO is 70.8%, the HR associated with SRC is therefore approximately equal to one (no effect) (95% CI 0.2-5.7, p=0.999). With every 1% increase in DLCO above this value, presence of SRC increases the hazard of PH by about 8%, while for DLCO lower than 70.8%, presence of SRC reduces the hazard of PH. In practical terms this means that when patients had preserved DLCO, history of SRC was associated with a small increase in the hazard of PH, while when patients had low DLCO, history of SRC was associated with a reduced hazard of PH, which cancelled out the effect of the low DLCO. This suggests that while lower DLCO strongly predicts PH in patients who have not had SRC, DLCO levels are not a particularly powerful predictor of PH in patients who have had SRC.

V. Analysis of cumulative incidence and antibody associations of pulmonary arterial hypertension and clinically significant pulmonary fibrosis-associated pulmonary hypertension separately

Figure A1. Cumulative incidence of PAH and PF-associated PH in dcSSc and lcSSc patients.

There was no significant difference in the cumulative incidence of PAH or PF-associated PH between the two major disease subsets of SSc.
In addition, associations between ACA/ATA and PAH/PF-associated PH mirrored the results of the analysis based on data for PAH and PF-PH pooled together. As shown below, ACA was a strong negative predictor of PF-PH, in keeping with its protective effect in relation to PF development. ATA had a significant negative association with PAH, similar to its negative association with PH as a whole, while it was not significantly associated with PF-PH development alone. ACA had no association with PAH.

Outcome  Antibody  β       p-value  HR     95.0% CI for HR
PF-PH    ACA       -1.553  0.036    0.212  0.050-0.900
PF-PH    ATA        0.208  0.660    1.231  0.488-3.102
PAH      ACA       -0.095  0.804    0.910  0.431-1.919
PAH      ATA       -2.161  0.035    0.115  0.015-0.858

VI. References

1. Breslow NE, Day NE. Rates and rate standardization. In: Statistical Methods in Cancer Research. Volume II - The Design and Analysis of Cohort Studies. Oxford University Press; 1987. p. 48-82.
2. http://www.ons.gov.uk
3. Belin TR, Hu MY, Young AS, Grusky O. Using multiple imputation to incorporate cases with missing items in a mental health services study. Health Services and Outcomes Research Methodology. 2000; 1(1):7-22.
4. Mackinnon A. The use and reporting of multiple imputation in medical research - a review. J Intern Med. 2010; 268(6):586-93.
5. Sterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009; 338:b2393.