eAppendix of “Effect of Hepatitis C on Hepatocellular Carcinoma Mediated through Hepatitis B Viral Load”

Section 1. Study Design and Mediation Analyses of Follow-up HBV DNA

We chose to study the mediation effect among the HBV-positive population (n=3,851) rather than the entire population (n=23,820) because only the former had information on the mediator, namely the HBV DNA level. Had we studied the general population (n=23,820), the majority of subjects would have had no HBV DNA, i.e., no mediator to mediate the effect of HCV. The distribution of the mediator would then be highly skewed, with >80% of subjects having HBV DNA=0, and the results of the mediation analyses would be difficult to interpret.

We collected serum HBV DNA during follow-up; the time from study entry to the measurement is illustrated in eFigure 1. Of the follow-up measurements, 45.2% were taken within year 1, 27.4% during years 2-5, 11.4% during years 6-10, and 16.0% after 11 years. Because the follow-up HBV DNA was measured at various time points, the time scale of the mediation analyses was realigned accordingly. Specifically, in the Cox survival models, we treated the time of the follow-up measurement as the new study entry, and treated HCV infection and the covariates (age, gender, alcohol consumption, cigarette smoking and ALT) as past history. With this realignment of the time scale, we attempted to account for the fluctuation of HBV DNA over time. Although anti-HCV was relatively stable during follow-up, the realignment may introduce time-varying HCV RNA because it was measured at different times prior to the new study entry. We therefore also conducted survival analyses using the baseline measurement as the study entry, in which HCV RNA was measured at the same time but the time to the follow-up HBV DNA measurement varied, as shown in eFigure 1. The results were very similar between the two sets of analyses.
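The realignment of the time scale described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the `Subject` record, its field names, and the rule of dropping subjects whose follow-up ended before the HBV DNA measurement are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Subject:
    # Hypothetical record layout; the field names are not from the study data.
    followup_years: float     # study entry to HCC diagnosis or censoring
    hcc: bool                 # True if follow-up ended with an HCC diagnosis
    measurement_years: float  # study entry to the follow-up HBV DNA measurement


def realign(subject: Subject) -> Optional[Tuple[float, bool]]:
    """Re-express follow-up with the HBV DNA measurement as the new study entry.

    Returns (years from the new entry to HCC or censoring, HCC indicator),
    or None when follow-up ended at or before the measurement (assumed rule).
    """
    remaining = subject.followup_years - subject.measurement_years
    if remaining <= 0:
        return None  # not at risk at the new entry under the assumed rule
    return remaining, subject.hcc


# Two made-up subjects: the first stays in the analysis, the second drops out.
cohort = [Subject(10.0, True, 2.5), Subject(3.0, False, 4.0)]
realigned = [realign(s) for s in cohort]
```

The realigned times would then serve as the survival outcome in a Cox model, with HCV status and the covariates treated as history at the new entry.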
The results presented in the main text and Figures 4 and 5 were based on the realignment using the follow-up measurement time as the new entry. It was also reassuring that the findings from analyses based on the follow-up HBV DNA were consistent with those based on the baseline HBV DNA. Although the results were robust across the various analyses, neither set of analyses fully accounted for simultaneous fluctuation of HBV and HCV activity, because the two were not measured regularly at high frequency. This is a limitation of our study.

In addition to follow-up time, one may use age as the time scale, which may be more biologically relevant. Our previous study showed that using age or follow-up time as the time scale gave very similar results if age and recruitment time were fully adjusted for [1]. If we used age as the time scale, both the exposure and the mediator would become time-varying, since they were measured at different ages. We chose follow-up time rather than age as the underlying time scale because the theory of time-varying mediation has not yet been established, and mediation analyses for survival data with time-varying exposures and mediators are not available.

Section 2. Additional Discussion on Mediation Analyses

Intuition behind direct and indirect effects. As stated in text, the direct (DE) and indirect effects (IE) of $e_1$ vs. $e_0$ can be simplified to $(e_1 - e_0)\beta_E$ and $(e_1 - e_0)\beta_M\alpha_E$, respectively, provided there is no HBV-by-HCV cross-product interaction. Note that the DE coefficient $\beta_E$ is the association of HCV infection ($E$) with HCC risk in the Cox model adjusting for covariates ($\boldsymbol{X}$) and HBV viral load ($M$):

$$\log h(t \mid E_i, M_i, \boldsymbol{X}_i) = \log h_0(t) + \boldsymbol{\beta}_X^{T}\boldsymbol{X}_i + \beta_E E_i + \beta_M M_i.$$

The IE is a product of two associations: $\beta_M$, the association of HBV viral load ($M$) with HCC risk adjusting for covariates and HCV infection, and $\alpha_E$, the association of HCV infection with HBV viral load:

$$M_i = \boldsymbol{\alpha}_X^{T}\boldsymbol{X}_i + \alpha_E E_i + \epsilon_{Mi}.$$
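As a rough numerical illustration of the simplified expressions above, the sketch below computes the two effects on the log hazard ratio scale and exponentiates them. The function name and the coefficient values are hypothetical, chosen only to show the arithmetic; they are not estimates from the study.

```python
import math


def effects_without_interaction(beta_E, beta_M, alpha_E, e1, e0):
    """Log-HR direct and indirect effects of exposure level e1 vs. e0,
    assuming no exposure-by-mediator (HBV-by-HCV) interaction."""
    de_log_hr = (e1 - e0) * beta_E            # direct effect
    ie_log_hr = (e1 - e0) * beta_M * alpha_E  # indirect effect via the mediator
    return de_log_hr, ie_log_hr


# Hypothetical coefficients: the exposure raises the hazard directly (beta_E > 0)
# but lowers the mediator (alpha_E < 0), which itself raises the hazard (beta_M > 0).
de, ie = effects_without_interaction(beta_E=0.9, beta_M=0.4, alpha_E=-0.5, e1=1, e0=0)
de_hr, ie_hr = math.exp(de), math.exp(ie)  # same effects on the hazard ratio scale
```

With these made-up values the direct effect has a hazard ratio above 1 and the indirect effect a hazard ratio below 1, which is the qualitative pattern a suppressive mediator produces.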
Therefore, $\beta_E$ (i.e., DE) can be interpreted as the association of HCV infection with HCC risk beyond that attributable to HBV, or a ‘direct effect’ other than effects through HBV; the causal interpretation can be justified under the no-unmeasured-confounding assumptions discussed in text. Similarly, $\beta_M\alpha_E$ (i.e., IE) is proportional to the association of HBV viral load with HCC risk beyond that of HCV, and if HBV viral load is affected by HCV, i.e., $\alpha_E \neq 0$, the quantity can be further interpreted as an ‘indirect effect’, again under the no-unmeasured-confounding assumptions. If there exists an HBV-by-HCV statistical interaction, the expressions for DE and IE are more complex [2], as shown in (3) and (4) in main text.

Variances of $\mathrm{DE}_{\mathrm{logHR}}$ and $\mathrm{IE}_{\mathrm{logHR}}$. The variances of $\mathrm{DE}_{\mathrm{logHR}}$ and $\mathrm{IE}_{\mathrm{logHR}}$ can be approximated with the delta method [2,3]. Specifically, the two variances can be expressed as:

$$\text{Variance of } \mathrm{DE}_{\mathrm{logHR}} = \mathbf{D}_{\mathrm{DE}}^{T}\,\Sigma\,\mathbf{D}_{\mathrm{DE}}, \qquad \text{Variance of } \mathrm{IE}_{\mathrm{logHR}} = \mathbf{D}_{\mathrm{IE}}^{T}\,\Sigma\,\mathbf{D}_{\mathrm{IE}},$$

where

$$\mathbf{D}_{\mathrm{DE}} = (e_1 - e_0) \times \left(\hat{\beta}_{EM}\boldsymbol{X}^{T},\ \hat{\beta}_{EM}e_0,\ 1,\ \hat{\beta}_{EM}\hat{\sigma}_M^{2},\ \boldsymbol{X}^{T}\hat{\boldsymbol{\alpha}}_X + e_0\hat{\alpha}_E + \hat{\beta}_M\hat{\sigma}_M^{2} + \hat{\beta}_{EM}\hat{\sigma}_M^{2}(e_1 + e_0)\right)^{T},$$

$$\mathbf{D}_{\mathrm{IE}} = (e_1 - e_0) \times \left(\boldsymbol{0},\ \hat{\beta}_M + \hat{\beta}_{EM}e_1,\ 0,\ \hat{\alpha}_E,\ \hat{\alpha}_E e_1\right)^{T},$$

and

$$\Sigma = \begin{bmatrix} \Sigma_{\alpha} & \boldsymbol{0} \\ \boldsymbol{0} & \Sigma_{\beta} \end{bmatrix}.$$

Here $\Sigma_{\alpha}$ is the covariance matrix of $(\hat{\boldsymbol{\alpha}}_X^{T}, \hat{\alpha}_E)^{T}$, $\Sigma_{\beta}$ is the covariance matrix of $(\hat{\beta}_E, \hat{\beta}_M, \hat{\beta}_{EM})^{T}$, and $\hat{\sigma}_M^{2}$ is the estimated residual variance of the mediator model. Note that the delta method approximation requires a large sample size, which is satisfied in our study (n=3,851). Alternatively, the variances of $\mathrm{DE}_{\mathrm{logHR}}$ and $\mathrm{IE}_{\mathrm{logHR}}$ can be approximated by bootstrapping, which does not require a large sample size but is computationally more costly.

Marginal Effect (ME) and Total Effect (TE). In addition to direct and indirect effects, one may be interested in an ‘overall effect’ of HCV infection on HCC risk, regardless of whether it is mediated through HBV viral load, i.e., both the black and gray pathways in Figure 1 of main text. Such an overall effect can be estimated in two ways.
One is to simply fit a Cox regression model of HCC incidence on HCV infection, adjusting for covariates but ignoring HBV viral load:

$$\log h(t \mid E_i, \boldsymbol{X}_i) = \log h_0(t) + \boldsymbol{\beta}_X^{*T}\boldsymbol{X}_i + \beta_E^{*} E_i,$$

and $\beta_E^{*}$ represents the overall effect, termed the Marginal Effect (ME) in text. The interpretation is the same as in conventional survival analyses using the Cox model: it is the association of HCV infection with HCC risk on the log hazard scale in the population with HBV infection, and under the no-unmeasured-confounding assumptions the association can be further interpreted as a causal overall effect.

The other way to estimate the overall effect is through mediation analyses, which gives the Total Effect (TE). Once the DE and IE are estimated, one can simply sum the two effects on the log hazard ratio scale: $\mathrm{TE}_{\mathrm{logHR}} = \mathrm{DE}_{\mathrm{logHR}} + \mathrm{IE}_{\mathrm{logHR}}$, or equivalently, take the product of the two effects on the hazard ratio scale: $\mathrm{TE}_{\mathrm{HR}} = \mathrm{DE}_{\mathrm{HR}} \times \mathrm{IE}_{\mathrm{HR}}$. The interpretation of TE is identical to that of ME, and the estimation is even more explicit: the overall effect is decomposed into DE and IE, which are then summed back on the log hazard ratio scale. In Tables 4 and 5, TE is very similar to ME but slightly more significant, probably because the mediation analyses account for effect heterogeneity mediated by HBV viral load.

Section 3. Additional Discussion on Unmeasured Confounding

Despite the extensive adjustment for potential confounders, we still considered immunity of the host as a possible unmeasured confounding factor that may violate assumptions 1)-3) in text. Specifically, an impaired immune response can affect both the risk of HCC and HCV viral replication, which may violate assumption 1); similarly, an impaired immune response can affect HCC risk and HBV viral activity, which may violate assumption 2) and thus assumption 3). Although we did not directly measure the immunity of our participants, we argue that age and liver function should serve as reliable proxies for immune function.
Assumption 4) can be violated if immunity affects HCC risk and HBV viral replication, and immunity is impaired by HCV infection. We think that such an immune characteristic is unlikely to exist: if an immune factor can affect HBV viral activity, it is more likely to affect the activity of HCV as well than to be caused by HCV.

We conducted various analyses to address potential confounding by immunity and to ensure the robustness of our findings. Participants with co-infection may represent a selective population carrying unmeasured characteristics such as impaired immunity. The results of mediation analyses among this co-infected population were consistent with the main analyses, revealing an apparent suppressive effect of HCV on HCC risk mediated by the HBV viral load (eTable 4). On the other hand, results from the analyses with HCV serostatus may be confounded by the unmeasured ability to clear HCV; for example, participants with positive anti-HCV may have undetectable HCV viral load due to immune clearance. Our mediation analyses with HCV viral load (Table 5) showed a suppressive indirect effect similar to that based on the anti-HCV serostatus (Table 4). Finally, we investigated the possibility that a very high HBV DNA level may result from impaired immunity. We conducted mediation analyses in the population with HBV DNA <$10^6$ copies/mL, which again revealed similar findings (eTable 5).

eFigure 1. Time to follow-up HBV DNA measurement since study entry in 2,889 subjects.

eTable 1. Associations of HBV viral load with HCV viral load in co-infected subjects.
| | Estimate | 95% CI | P Value |
|---|---|---|---|
| For participants seropositive for HBsAg and anti-HCV (n=195) | | | |
| Serum HCV RNA level* | -0.15† | -0.23, -0.066 | 0.0006 |
| Pearson correlation | -0.23 | -0.36, -0.097 | 0.0010 |
| For participants seropositive for HBsAg and anti-HCV (n=125) | | | |
| Serum HCV RNA level* | -0.13† | -0.22, -0.046 | 0.0030 |
| Pearson correlation | -0.25 | -0.41, -0.082 | 0.0042 |

* Adjustment for age (30-39 (referent), 40-49, 50-59, 60-65 years), gender, alcohol consumption (yes/no), smoking (yes/no), and ALT (<15 (referent), 15-44, ≥45 IU/L).
† Change in log10 HBV viral load (copies/mL) per 1 unit increase of log10 HCV viral load (IU/mL) or between positive and negative anti-HCV serostatus. Viral loads of both hepatitis B and C viruses were log10 transformed in the linear regression.

eTable 2. Total, direct and indirect effects of anti-HCV serostatus (positive vs. negative) on the development of HCC among participants seropositive for HBsAg, with serum HBV viral load as the mediator.

| Effect of Baseline Anti-HCV Serostatus | HR* | 95% CI | P Value |
|---|---|---|---|
| Mediator: baseline serum HBV viral load (n=3851, no. of HCC=278) | | | |
| Total effect | 1.96 | 1.34, 2.88 | 0.0005 |
| Direct effect | 2.41 | 1.64, 3.52 | <0.0001 |
| Indirect effect | 0.82 | 0.71, 0.93 | 0.0031 |
| Mediator: follow-up serum HBV viral load (n=2888, no. of HCC=190) | | | |
| Total effect | 2.80 | 1.76, 4.47 | <0.0001 |
| Direct effect | 3.34 | 2.04, 5.47 | <0.0001 |
| Indirect effect | 0.84 | 0.71, 0.99 | 0.044 |

* Hazard ratio of HCC cases comparing subjects seropositive for anti-HCV with seronegative ones, adjusting for age (30-39 (referent), 40-49, 50-59, 60-65 years), gender, alcohol consumption (yes/no), smoking (yes/no), and ALT (<15 (referent), 15-44, ≥45 IU/L).

eTable 3. Total, direct and indirect effects of serum HCV RNA level (404,000 vs. 800 IU/mL) on the development of HCC among participants seropositive for HBsAg, with HBV viral load as the mediator.

| Effect of Baseline Serum HCV RNA Level | HR* | 95% CI | P Value |
|---|---|---|---|
| Mediator: baseline serum HBV viral load (n=3851, no. of HCC=278) | | | |
| Total effect | 1.16 | 0.89, 1.50 | 0.28 |
| Direct effect | 1.42 | 1.07, 1.90 | 0.02 |
| Indirect effect | 0.81 | 0.74, 0.89 | <0.0001 |
| Mediator: follow-up serum HBV viral load (n=2888, no. of HCC=190) | | | |
| Total effect | 1.04 | 0.72, 1.50 | 0.85 |
| Direct effect | 1.14 | 0.73, 1.80 | 0.56 |
| Indirect effect | 0.91 | 0.79, 1.04 | 0.16 |

* Hazard ratio of HCC cases comparing serum HCV RNA level of 404,000 vs. 800 IU/mL, adjusting for age (30-39 (referent), 40-49, 50-59, 60-65 years), gender, alcohol consumption (yes/no), smoking (yes/no), and ALT (<15 (referent), 15-44, ≥45 IU/L).

eTable 4. Marginal, total, direct and indirect effects of serum HCV RNA level (404,000 vs. 800 IU/mL) on the development of HCC among participants seropositive for both HBsAg and anti-HCV, with HBV viral load as the mediator.

| Effect of Baseline Serum HCV RNA Level | HR* | 95% CI | P Value |
|---|---|---|---|
| Mediator: baseline serum HBV DNA level (n=195, no. of HCC=33) | | | |
| Marginal effect | 0.82 | 0.57, 1.18 | 0.29 |
| Total effect | 0.87 | 0.60, 1.25 | 0.45 |
| Direct effect | 0.97 | 0.67, 1.41 | 0.87 |
| Indirect effect | 0.90 | 0.81, 0.98 | 0.021 |
| Mediator: follow-up serum HBV DNA level (n=125, no. of HCC=23) | | | |
| Marginal effect | 0.73 | 0.48, 1.11 | 0.14 |
| Total effect | 0.72 | 0.46, 1.11 | 0.13 |
| Direct effect | 0.79 | 0.51, 1.22 | 0.29 |
| Indirect effect | 0.91 | 0.80, 1.02 | 0.11 |

* Hazard ratio of HCC cases comparing serum HCV RNA level of 404,000 vs. 800 IU/mL, adjusting for age (30-39 (referent), 40-49, 50-59, 60-65 years), gender, alcohol consumption (yes/no), smoking (yes/no), and ALT (<15 (referent), 15-44, ≥45 IU/L).

eTable 5. Total, direct and indirect effects of serum HCV RNA level (404,000 vs. 800 IU/mL) on the development of HCC among participants seropositive for HBsAg and baseline HBV DNA <$10^6$ copies/mL.
| Effect of Baseline Serum HCV RNA Level | HR* | 95% CI | P Value |
|---|---|---|---|
| Mediator: baseline serum HBV viral load | | | |
| Total effect | 1.48 | 1.13, 1.93 | 0.0044 |
| Direct effect | 1.74 | 1.33, 2.28 | <0.0001 |
| Indirect effect | 0.85 | 0.79, 0.91 | <0.0001 |
| Mediator: follow-up serum HBV viral load | | | |
| Total effect | 1.55 | 1.13, 2.14 | 0.0074 |
| Direct effect | 1.72 | 1.26, 2.37 | 0.0008 |
| Indirect effect | 0.90 | 0.84, 0.97 | 0.0041 |

* Hazard ratio of HCC cases comparing serum HCV RNA level of 404,000 vs. 800 IU/mL, adjusting for age (30-39 (referent), 40-49, 50-59, 60-65 years), gender, alcohol consumption (yes/no), smoking (yes/no), and ALT (<15 (referent), 15-44, ≥45 IU/L).

REFERENCES

1. Huang YT, Jen CL, Yang HI, Lee MH, Su J, Lu SN, Iloeje UH, Chen CJ. Lifetime risk and sex difference of hepatocellular carcinoma among patients with chronic hepatitis B and C. J Clin Oncol 2011;29(27):3643-50.
2. VanderWeele TJ. Causal mediation analysis with survival data. Epidemiology 2011;22(4):582-5.
3. VanderWeele TJ, Vansteelandt S. Odds ratios for mediation analysis for a dichotomous outcome. Am J Epidemiol 2010;172(12):1339-48.