Online supporting information for the following article published in Indoor Air
DOI: TO BE ADDED BY THE PRODUCTION EDITOR

Do time-averaged, whole-building, effective VOC emissions depend on the air exchange rate? A statistical analysis of trends for 46 VOCs in U.S. offices

Adams Rackes a,*, Michael S. Waring a

a Department of Civil, Architectural and Environmental Engineering, Drexel University, 3141 Chestnut Street, Curtis 251, Philadelphia, PA 19104, USA
* Corresponding author. Tel: +1 215 895 1502; fax: +1 215 895 1363.
E-mail addresses: aer37@drexel.edu (A. Rackes), msw59@drexel.edu (M.S. Waring)

This supporting information (SI) document accompanies the article Do time-averaged, whole-building, effective VOC emissions depend on the air exchange rate? A statistical analysis of trends for 46 VOCs in U.S. offices. We have included here additional details about processing the data, the development of metrics, and fusing the criteria for evaluating model fit. We also present the complete Stan models we used so our assumptions are clear, and in case they may be of use to others. The SI also includes other information we hope will be useful to future researchers, including tables of parametric models that did not fit in the main article, and our advice for putting together all the models developed in this work, in order to simulate time-averaged VOC concentration responses in U.S. offices.

TABLE OF CONTENTS
Data processing
Human breath emission distributions for individuals
Exploratory assessment of air exchange rate (AER) impact on emissions
Complete Stan models for Bayesian estimation and inference
Building parameter lognormal models in IP units
Outdoor concentration lognormal models
DEM “suggestively preferred” resampling, criteria, and data fusion logic
Regression coefficients βΔC and βCin and relative air exchange efficacy
Detailed information on assessing the role of other factors
Advice for conducting Monte Carlo simulations with our models
References (SI)

Rackes and Waring 2015

Data processing

During the BASE study, three methods were used to measure VOCs, each sampling for ~8 to 10 hours. In coordinating data, we used only VOC concentrations measured with the EPA preferred method for each compound, even when concentrations obtained by other methods were available. (For a detailed account of the study protocol and measurement methods, see U.S. EPA (2003).) For each VOC measurement, the BASE study reported either a real value or codes indicating that the concentration was either below the limit of quantification (LOQ), below the limit of detection (LOD), or an error or missing measurement. We replaced entries below the LOQ or LOD with half the value of the respective threshold for that VOC (Hornung and Reed, 1990). Any duplicate samples were averaged with the primary sample values. After removing a small number of extreme outliers, we used the outdoor concentration directly and the average of the three indoor readings for the indoor concentration. We discarded any measurements where either the indoor or outdoor concentration was missing, since both are needed to apply Equation 1.
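The censored-value substitution and indoor averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code applied to the BASE data: the function names, the flag strings ("LOQ", "LOD", "ERR"), and the example thresholds are hypothetical.

```python
from statistics import mean

def substitute_censored(value, flag, loq, lod):
    """Return a usable concentration, or None for an error or missing entry.

    `flag` is a hypothetical code: None for a real value, "LOQ" or "LOD"
    for censored entries, and anything else for an error or missing value.
    """
    if flag is None:
        return value
    if flag == "LOQ":
        return loq / 2.0  # half the limit of quantification
    if flag == "LOD":
        return lod / 2.0  # half the limit of detection
    return None

def indoor_concentration(readings):
    """Average the usable indoor readings; None if no reading is usable."""
    usable = [r for r in readings if r is not None]
    return mean(usable) if usable else None

# Hypothetical example: one real value, one below the LOQ, one error code.
readings = [
    substitute_censored(1.2, None, loq=0.35, lod=0.10),
    substitute_censored(None, "LOQ", loq=0.35, lod=0.10),
    substitute_censored(None, "ERR", loq=0.35, lod=0.10),
]
cin = indoor_concentration(readings)
```

A building-VOC pair would then be dropped whenever either the indoor or the outdoor value came back as None.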
Ventilation measurements were also taken in each building, and subsequently analyzed by National Institute of Standards and Technology (NIST), whose final report contains detailed descriptions of procedures and findings (Persily and Gorfain, 2008). Multiple measurements were taken, usually with separate morning and afternoon readings on consecutive days. We coordinated each primary ventilation measurement (averaging multiple measurements) with its associated VOC sampling period, and we also recorded the estimate uncertainty. We excluded one building for which the report did not provide any ventilation measurements. In subsequent model fitting, we added the effect of infiltration to the ventilation measurements, as described in the article. We did not apply this adjustment to the 14 buildings in the BASE study where CO2 tracer gas decay tests were used to derive total outdoor air flow rates.

Human breath emission distributions for individuals

We developed our per-occupant emissions estimates from literature, restricting our scope to human breath emissions as discussed in the paper body. VOCs have been measured in human breath, either via exhalation into a tube or sampling near the mouth and subtracting background concentration (Fenske and Paulson, 1999; Moser et al., 2005; Riess et al., 2010; Smith et al., 2007; Wallace et al., 1991). Most studies report point estimates, but two (Riess et al., 2010; Wallace et al., 1991) gave enough information for some VOCs to develop lognormal distributions for breath concentration.

To combine VOC estimates from multiple studies, we used this method:
• If only point estimates were available, their values were averaged.
• If one distribution was available, we used it directly, and any other point estimates were ignored.
• If two distributions were available, their parameters were combined geometrically, weighted by the number of samples, and again any point estimates were ignored.
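The two-distribution case can be sketched as below. The reading of "combined geometrically, weighted by the number of samples" as a sample-weighted geometric mean of the GMs and of the GSDs is an assumption, as are the function name and the example numbers.

```python
import math

def combine_lognormal(gm_a, gsd_a, n_a, gm_b, gsd_b, n_b):
    """Sample-weighted geometric combination of two lognormal parameter sets.

    Assumed interpretation: average ln(GM) and ln(GSD) across the two
    studies, weighting each study by its number of samples.
    """
    w_a = n_a / (n_a + n_b)
    w_b = n_b / (n_a + n_b)
    gm = math.exp(w_a * math.log(gm_a) + w_b * math.log(gm_b))
    gsd = math.exp(w_a * math.log(gsd_a) + w_b * math.log(gsd_b))
    return gm, gsd

# Hypothetical example: two studies with equal sample counts.
gm, gsd = combine_lognormal(10.0, 2.0, 30, 40.0, 2.0, 30)
```

With equal sample counts, the combined GM reduces to the ordinary geometric mean of the two study GMs.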
VOC breath concentrations were converted to emissions by assuming atmospheric pressure, a breath temperature of 35.6 °C (Carpenter and Buttram, 1998), and a breathing rate of 0.78 m3/h, typical for sedentary activity (U.S. EPA, 2011). This procedure yielded breath concentration distributions for 18 VOCs and point estimates for 23 more, for a total of 41 VOCs, of which 26 were measured in the BASE study. The final estimates are listed in Table SI-1 for VOCs measured in the BASE study and Table SI-2 for VOCs not in BASE.

It is worth emphasizing that the distributions in Tables SI-1 and SI-2 are for individuals. The average breath emissions term in Equation 1, Ep,ij, does not follow the individual distribution since it is the average of emissions from many (typically at least 50) occupants, each of which is a lognormal random variable.

Table SI-1 Human breath emissions for compounds included in the BASE study

Classification | Compound | CAS
Alcohol | 1-butanol | 71-36-3
Alcohol | 2-ethyl-1-hexanol | 104-76-7
Alcohol | 2-methyl-1-propanol | 78-83-1
Alcohol | 2-propanol | 67-63-0
Alcohol | ethanol | 64-17-5
Alkane | 3-methyl pentane | 96-14-0
Alkane | decane | 124-18-5
Alkane | dodecane | 112-40-3
Alkane | hexane | 110-54-3
Alkane | octane | 111-65-9
Alkane | nonane | 111-84-2
Alkane | undecane | 1120-21-4
Alkene | a-pinene | 80-56-8
Alkene | d-limonene | 5989-27-5
Aromatic | 1,2,4-trichlorobenzene | 120-82-1
Aromatic | 1,2,4-trimethylbenzene | 95-63-6
Aromatic | 1,2-dichlorobenzene | 95-50-1
Aromatic | 1,3,5-trimethylbenzene | 108-67-8
Aromatic | 1,4-dichlorobenzene | 106-46-7
Aromatic | 4-ethyltoluene | 622-96-8
Aromatic | 4-phenylcyclohexene | 4994-16-5
Aromatic | benzene | 71-43-2
Aromatic | chlorobenzene | 108-90-7
Aromatic | ethylbenzene | 100-41-4
Aromatic | m & p-xylenes | 1330-20-7
Aromatic | naphthalene | 91-20-3
Aromatic | o-xylene | 95-47-6
Aromatic | phenol | 108-95-2
Aromatic | styrene | 100-42-5
Aromatic | toluene | 108-88-3
Carbonyl | 2-butanone (MEK) | 78-93-3
Carbonyl | 4-methyl-2-pentanone | 108-10-1
Carbonyl | acetaldehyde | 75-07-0
Carbonyl | acetone | 67-64-1
Carbonyl | formaldehyde | 50-00-0
Carbonyl | hexanal | 66-25-1
Carbonyl | heptanal | 111-71-7
Carbonyl | nonanal | 124-19-6
Carbonyl | pentanal | 110-62-3
Ester | 2,2,4-trimethyl-1,3-pentanediol diisobutyrate | 6846-50-0
Ester | 2,2,4-trimethyl-1,3-pentanediol monoisobutyrate | 25265-77-4
Ester | butyl acetate | 123-86-4
Ester | ethyl acetate | 141-78-6
Ether | 2-butoxyethanol | 111-76-2
Ether | t-butyl methyl ether | 1634-04-4
Halocarbon | 1,1,1-trichloroethane | 71-55-6
Halocarbon | 1,2-dichloroethane | 107-06-2
Halocarbon | bromomethane | 74-83-9
Halocarbon | carbon tetrachloride | 56-23-5
Halocarbon | chloroethane | 75-00-3
Halocarbon | chloroform | 67-66-3
Halocarbon | chloromethane | 74-87-3
Halocarbon | dichlorodifluoromethane | 75-71-8
Halocarbon | methylene chloride | 75-09-2
Halocarbon | tetrachloroethene | 127-18-4
Halocarbon | trichloroethene | 79-01-6
Halocarbon | trichlorofluoromethane | 75-69-4
Halocarbon | trichlorotrifluoroethane | 76-13-1
Organosulfur | carbon disulfide | 75-15-0
Organosulfur | dimethyl disulfide | 624-92-0

Human breath emissions
GM (µg/h/occ): 285.00 159.00 0.24 0.10 0.67 0.41 0.46 0.48 12.30 0.46 3.78 0.32 6.70 2.23 18.10 0.10 31.80 29.00 51.52 725.90 3.94 34.00 46.00 2.16 0.27 2.67
GSD (-): 3.24 1.90 1.75 1.75 1.99 1.98 1.50 1.68 1.85
Sources: 1,3 3 5 5 5 5 5 5 5 5 2 5 2,5 2,5 2 5 2 1 2,4 3,4 2 1 1 5 5 5

Source key:
1 - Fenske and Paulson, 1999
2 - Moser et al., 2004
3 - Smith et al., 2007
4 - Riess et al., 2010
5 - Wallace et al., 1991

Table SI-2 Human breath emissions for additional compounds not included in the BASE study. These compounds are included for reference only.

Classification | Compound | CAS | GM (µg/h/occ) | Sources
Alcohol | Methanol | 67-56-1 | 178.80 | 2,3,4
Alkane | Isopentane | 78-78-4 | 0.73 | 1
Alkane | Pentane | 109-66-0 | 27.00 | 1
Alkene | Butene | 106-98-9 | 126.00 | 1
Alkene | Ethylene | 74-85-1 | 20.00 | 1
Alkene | Isoprene | 78-79-5 | 114.80 | 2,3
Aromatic | Furan | 110-00-9 | 29.00 | 1
Carbonyl | 2,6-Di-tert-Butyl-p-Benzoquinone | 719-22-2 | 4.14 | 1
Carboxylic acid | Acetic Acid | 64-19-7 | 113.90 | 2
Carboxylic acid | Formic acid | 64-18-6 | 49.12 | 2
Inorganic | Ammonia | 7664-41-7 | 437.00 | 3
Inorganic | Hydrogen Sulfide | 7783-06-4 | 0.40 | 2
Inorganic | Nitric Oxide | 10102-43-9 | 23.22 | 4
Nitrile | Acetonitrile | 75-05-8 | 11.49 | 2
Organosulfur | Dimethyl Sulfide | 78-18-3 | 20.75 | 2

GSD (-): 1.82 1.79 1.81 1.96 1.62 2.23 1.36 1.80 1.90

Source key:
1 - Fenske and Paulson, 1999
2 - Moser et al., 2004
3 - Smith et al., 2007
4 - Riess et al., 2010
5 - Wallace et al., 1991

Exploratory assessment of air exchange rate (AER) impact on emissions

For an exploratory assessment of the influence of the
AER on effective emissions, we divided the emission rates calculated according to Equation 2 into three bins according to the associated AER. The bins (less than 0.5 h-1, between 0.5 h-1 and 3 h-1, and above 3 h-1) approximately correspond to the lower 25% of AERs, the interquartile range of AERs, and the upper 25% of AERs, respectively. We fit a lognormal model to the data in each bin. If the effective emissions were independent of the AER, one would expect the three distributions to look similar.

Figure SI-1 shows the probability densities of the lognormal models in the three bins for six VOCs: d-limonene, formaldehyde, acetaldehyde, decane, toluene, and butanol. We use these compounds to illustrate results, since they represent a range of associations between emissions and air exchange. For all these VOCs, the plots show that buildings with higher AERs had emission rates that were both higher on average and much more variable. Conversely, buildings with lower air exchange had both lower emissions and much tighter distributions.

Fig. SI-1 Lognormal density estimates of effective emission rates in three AER ranges. The labeled points are distribution medians.

Figure SI-1 indicates that the impact of AER on the effective emissions distribution is most pronounced in the upper AER bin, which has a very large spread for all the VOCs depicted. At least part of this spread is a result of back-calculating emissions and may not reflect a real phenomenon. This is because, when inferring effective emissions from the measured quantities, Ez is proportional to the measured AER (per Equation 2), and larger AERs in the BASE study were generally subject to larger measurement uncertainties. If we compare the distribution modes, which are less affected by variability inflation, we would probably conclude from Figure SI-1 that there appear to be systematic and consistent increases in effective emissions with increasing AER for d-limonene, formaldehyde, and acetaldehyde.
For decane and toluene, the relation is more subtle but potentially present; for butanol, it could easily be merely an artifact of amplified measurement uncertainty.

This screening exercise shows that there is good reason to doubt whether emissions are in fact independent of air exchange, and does so without making any assumptions about the nature of the relation between Ez and λ. However, this model-free assessment does not allow us to generalize discrete findings to estimate a presumably continuous impact, and it also sacrifices statistical power by treating data in each bin in isolation. Therefore, a more sophisticated treatment of measurement uncertainty is clearly needed before drawing any conclusions. Bayesian estimation, as discussed in the main article, provides this treatment.

Complete Stan models for Bayesian estimation and inference

AER model

data {
  int<lower=0> N;  // number of buildings
  vector[N] Qvent_meas;  // measured ventilation in L/s-m2
  vector[N] Qunc;  // uncertainty in ventilation measurement, L/s-m2
  vector[N] h_meas;  // measured ceiling height, m
  real aer_inf_mu;  // log infiltration GM
  real aer_inf_sigma;  // log infiltration GSD
}
parameters {
  vector<lower=0>[N] Qvent;  // true ventilation in L/s-m2
  vector<lower=0>[N] aer_inf;  // true infiltration AER, h-1
  real aer_tot_mu;  // mu parameter of AER distribution
  real<lower=0> aer_tot_sigma;  // sigma parameter of AER distribution
}
transformed parameters {
  vector<lower=0>[N] aer_vent;  // true ventilation AER, h-1
  vector<lower=0>[N] aer_tot;  // true total AER, h-1
  for (n in 1:N)
    aer_vent[n] <- 3.6 * Qvent[n] / h_meas[n];  // convert from L/s-m2 to h-1
  aer_tot <- aer_vent + aer_inf;
}
model {
  Qvent_meas ~ normal(Qvent,Qunc);  // measurement model
  aer_inf ~ lognormal(aer_inf_mu, aer_inf_sigma);  // infiltration model
  aer_tot ~ lognormal(aer_tot_mu, aer_tot_sigma);  // total AER model
}

Description of sampling for posterior replications

We describe the process of sampling to compute
replications in more detail. For the IEM, the routine did the following Ni times (one for each building): randomly sampled from the lognormal distribution for Ez,ij; used the governing equation (Equation 7) to calculate Cin,ij given the estimated true values of Cout,ij, λj, Ep,ij, Pz,j, Az,j, and Vz,j; and applied the measurement equations for Cin,ij and λj to give simulated measurements. The distribution parameters µEz,i and σEz,i and all true values of latent variables were those proposed by the MCMC sampling on a given iteration. The procedure was equivalent for the DEM. In the code that follows, posterior replications are created in the “generated quantities” block, where the suffix “_rng” stands for random number generator.

IEM

data {
  int<lower=0> N;
  real<lower=0> C_mn_spike_pct;  // mean percent of spike recovered
  real<lower=0> C_sd_spike_pct;  // SD percent of spike recovered
  vector[N] Cin_meas;
  vector[N] Cout_meas;
  vector<lower=0>[N] Pz;  // assumed perfectly measured
  real<lower=0> Ep_mu;  // literature estimate of Ep normal per occ distribution mean
  real<lower=0> Ep_sigma;  // literature estimate of Ep normal distribution sd
  vector<lower=0>[N] Az;  // zone area
  vector<lower=0>[N] Vz;  // zone volume
  vector<lower=0>[N] AER_meas;  // vector of AER measurements, h-1
  vector<lower=0>[N] AER_meas_unc;  // vector of AER measurement uncertainties, h-1
}
parameters {
  vector<lower=0>[N] Cout;  // Cout in ug/m3 in each building zone
  vector<lower=0>[N] Ep_ave;  // Average Ep in ug/h-occ in each building zone
  vector<lower=0>[N] Ez;  // Ez in ug/h-m2 in each building zone
  vector<lower=0>[N] AER;  // AER in h-1 in each building zone
  real Ez_mu;  // log of Ez GM
  real<lower=0> Ez_sigma;  // log of Ez GSD
}
transformed parameters {
  vector<lower=0>[N] Cin;
  vector<lower=0>[N] S;
  for (n in 1:N) {
    S[n] <- AER[n]*Cout[n] + Ep_ave[n]*Pz[n]/Vz[n];
    Cin[n] <- (S[n] + Ez[n]*Az[n]/Vz[n])/AER[n];
  }
}
model {
  Cout_meas ~ normal(Cout*C_mn_spike_pct, Cout*C_sd_spike_pct);  // Cout measurement model
  Cin_meas ~ normal(Cin*C_mn_spike_pct, Cin*C_sd_spike_pct);  // Cin measurement model
  AER_meas ~ normal(AER, AER_meas_unc);  // AER measurement model
  for (n in 1:N) {
    Ep_ave[n] ~ normal(Ep_mu, Ep_sigma/sqrt(Pz[n]));  // Ep model, var. inv. prop. to # occ.
  }
  AER ~ lognormal(log(1.129337), log(2.589431));  // sector-wide AER distribution
  Ez ~ lognormal(Ez_mu,Ez_sigma);  // Ez model
}
generated quantities {
  vector<lower=0>[N] Ez_pred;
  vector<lower=0>[N] Cin_meas_prct_error;
  vector<lower=0>[N] Cin_meas_pred;
  vector[N] AER_meas_pred;
  for (n in 1:N) {
    Ez_pred[n] <- lognormal_rng(Ez_mu,Ez_sigma);
    Cin_meas_prct_error[n] <- normal_rng(C_mn_spike_pct,C_sd_spike_pct);
    Cin_meas_pred[n] <- Cin_meas_prct_error[n]*((S[n] + Ez_pred[n]*Az[n]/Vz[n])/AER[n]);
    AER_meas_pred[n] <- 0;
    while (AER_meas_pred[n] <= 0)
      AER_meas_pred[n] <- normal_rng(AER[n],AER_meas_unc[n]);
  }
}

DEM

data {
  int<lower=0> N;
  real<lower=0> C_mn_spike_pct;  // mean percent of spike recovered
  real<lower=0> C_sd_spike_pct;  // SD percent of spike recovered
  vector[N] Cin_meas;
  vector[N] Cout_meas;
  vector<lower=0>[N] Pz;  // assumed perfectly measured
  real<lower=0> Ep_mu;  // literature estimate of Ep normal distribution mean
  real<lower=0> Ep_sigma;  // literature estimate of Ep normal distribution sd
  vector<lower=0>[N] Az;  // zone area
  vector<lower=0>[N] Vz;  // zone volume
  vector<lower=0>[N] AER_meas;  // vector of AER measurements, h-1
  vector<lower=0>[N] AER_meas_unc;  // vector of AER measurement uncertainties, h-1
  real kL_priorGM;  // prior distribution from bootstrapped curve fits
  real kL_priorGSD;  // prior distribution from bootstrapped curve fits
}
parameters {
  vector<lower=0>[N] Cout;  // Cout in ug/m3 in each building zone
  vector<lower=0>[N] Ep_ave;  // Average Ep in ug/h-occ in each building zone
  vector<lower=0>[N] AER;  // AER in h-1 in each building zone
  real<lower=0.000001,upper=100> kL;  // air/boundary layer coupling time constant h-1
  vector<lower=0,upper=10000>[N] Ceq;  // boundary layer concentrations
  real<lower=0> Ceq_mu;  // boundary layer concentration, log(GM)
  real<lower=0,upper=log(3)> Ceq_sigma;  // boundary layer concentration, log(GSD)
}
transformed parameters {
  vector<lower=0>[N] Cin;
  vector<lower=0>[N] S;
  for (n in 1:N) {
    S[n] <- AER[n]*Cout[n] + Ep_ave[n]*Pz[n]/Vz[n];
    Cin[n] <- (S[n] + kL*Ceq[n])/(AER[n] + kL);
  }
}
model {
  Cout_meas ~ normal(Cout*C_mn_spike_pct, Cout*C_sd_spike_pct);  // Cout measurement model
  Cin_meas ~ normal(Cin*C_mn_spike_pct, Cin*C_sd_spike_pct);  // Cin measurement model
  AER_meas ~ normal(AER, AER_meas_unc);  // AER measurement model
  for (n in 1:N) {
    Ep_ave[n] ~ normal(Ep_mu, Ep_sigma/sqrt(Pz[n]));  // Ep model, var. inv. prop. to # occ.
  }
  AER ~ lognormal(log(1.129337), log(2.589431));  // sector-wide AER distribution
  Ceq ~ lognormal(Ceq_mu,Ceq_sigma);  // Ceq distribution
  kL ~ lognormal(log(kL_priorGM),log(kL_priorGSD)) T[0.000001,100];  // kL prior
}
generated quantities {
  vector<lower=0>[N] Ceq_pred;
  vector<lower=0>[N] Cin_meas_prct_error;
  vector<lower=0>[N] Cin_meas_pred;
  vector[N] AER_meas_pred;
  for (n in 1:N) {
    Ceq_pred[n] <- lognormal_rng(Ceq_mu,Ceq_sigma);
    Cin_meas_prct_error[n] <- normal_rng(C_mn_spike_pct,C_sd_spike_pct);
    Cin_meas_pred[n] <- Cin_meas_prct_error[n]*((S[n] + kL*Ceq_pred[n])/(AER[n] + kL));
    AER_meas_pred[n] <- 0;
    while (AER_meas_pred[n] <= 0)
      AER_meas_pred[n] <- normal_rng(AER[n],AER_meas_unc[n]);
  }
}

Building parameter lognormal models in IP units

Table SI-3 Building parameter lognormal models, in IP units

Building parameter | Units | Estimation method | GM | GSD | Acceptable fit? (KS test p-value)
Air exchange rate, λ | h-1 | Bayes | 1.13 | 2.59 | Yes (0.35)
Air exchange rate, λ | h-1 | MLE | 1.21 | 2.71 | Yes (0.40)
Outdoor airflow per occupant | ft3/min/occ | MLE | 70.3 | 2.71 | Yes (0.81)
Outdoor airflow per floor area | ft3/min/ft2 | MLE | 0.24 | 2.67 | Yes (0.41)
Area served by system, Az | ft2 | MLE | 15202 | 1.49 | Yes (0.82)
Suspended ceiling height | ft | MLE | 8.87 | 1.08 | No (0.00)
Total ceiling height, hz | ft | MLE | 12.04 | 1.12 | Yes (0.41)
Occupants per system, Pz | occ | MLE | 52.69 | 1.39 | Yes (0.67)
Occupant density, Pz/Az | occ/ft2 | MLE | 0 | 1.44 | Yes (0.93)

Outdoor concentration lognormal models

See Table SI-4.

Table SI-4 Outdoor concentration data information and lognormal model parameters and goodness-of-fit test results

Compound
Alcohols 1-butanol 2-ethyl-1-hexanol 2-methyl-1-propanol 2-propanol ethanol
Alkanes 3-methyl pentane decane dodecane hexane octane nonane undecane
Alkenes a-pinene d-limonene
Aromatics 1,2,4-trichlorobenzene 1,2,4-trimethylbenzene 1,2-dichlorobenzene 1,3,5-trimethylbenzene 1,4-dichlorobenzene 4-ethyltoluene 4-phenylcyclohexene benzene chlorobenzene ethylbenzene m & p-xylenes naphthalene o-xylene phenol styrene toluene
Carbonyls 2-butanone (MEK) 4-methyl-2-pentanone acetaldehyde acetone formaldehyde hexanal heptanal nonanal pentanal
Esters 2,2,4-trimethyl-1,3-pentanediol diisobutyrate 2,2,4-trimethyl-1,3-pentanediol monoisobutyrate butyl acetate ethyl acetate
Ethers 2-butoxyethanol t-butyl methyl ether
Halocarbons 1,1,1-trichloroethane 1,2-dichloroethane bromomethane carbon tetrachloride chloroethane chloroform chloromethane dichlorodifluoromethane methylene chloride tetrachloroethene trichloroethene trichlorofluoromethane trichlorotrifluoroethane
Organosulfurs carbon disulfide dimethyl disulfide
N N > LOQ LOQ (µg/m 3 ) Ez model?
GM (µg/m 3 ) GSD (-) p-value, χ 2 test Result, χ2 test 32 33 13 13 13 1 0 0 6 13 0.66 0.33 1.70 4.70 2.50 Yes Yes Yes 0.16 0.05 3.18 25.22 2.03 1.45 3.69 2.12 0.87 0.06 0.89 0.79 0 0 0 0 59 63 61 33 60 61 61 22 41 21 29 26 29 32 1.31 0.35 0.53 0.66 0.35 0.35 0.35 Yes Yes Yes Yes Yes Yes Yes 0.83 0.48 0.27 1.40 0.27 0.30 0.37 2.58 2.65 2.71 2.82 2.88 2.58 2.45 0.67 0.49 0.26 0.15 0.50 0.11 0.55 0 0 0 0 0 0 0 61 61 8 16 0.35 0.35 Yes Yes 0.11 0.19 2.69 2.98 0.62 0.68 0 0 26 65 59 62 61 61 33 66 60 64 66 62 66 39 63 66 0 53 0 23 11 35 0 64 0 49 63 18 54 32 20 63 0.71 0.35 0.35 0.35 0.35 0.35 0.33 0.35 0.35 0.35 0.35 0.35 0.35 0.33 0.35 0.35 Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes 0.12 0.88 0.06 0.26 0.14 0.39 2.61 0.07 0.68 2.25 0.21 0.83 0.99 0.26 3.41 1.60 2.73 1.23 2.76 3.40 2.68 1.84 1.73 2.50 2.92 2.59 2.76 2.88 2.49 2.47 0.06 0.41 0.00 0.92 0.59 0.43 0.30 0.00 0.94 0.92 0.09 0.69 0.80 0.93 0.69 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 62 61 85 65 98 35 13 37 32 53 10 84 65 95 18 2 30 5 0.61 0.35 0.49 1.73 0.39 0.66 2.50 0.66 0.66 Yes Yes Yes Yes Yes Yes Yes Yes 1.24 0.14 2.61 7.72 2.70 0.57 0.65 1.04 0.22 2.09 2.68 2.14 1.66 2.74 2.18 4.07 1.63 2.50 0.01 0.20 0.89 0.65 0.12 0.12 0.83 0.70 0.54 1 0 0 0 0 0 0 0 0 33 0 0.33 Yes 0.05 1.32 0.06 0 33 61 63 2 18 22 0.66 0.35 0.35 Yes Yes Yes 0.14 0.18 0.25 2.10 3.05 2.89 0.53 0.66 0.34 0 0 0 33 58 6 10 0.66 1.69 Yes - 0.20 0.54 2.51 3.88 0.37 0.20 0 0 60 85 85 58 85 85 86 85 98 62 58 97 85 35 1 4 0 3 4 86 78 40 41 10 18 1 0.69 1.00 1.50 1.73 1.50 0.70 1.00 1.80 2.40 0.35 0.35 3.50 2.80 Yes Yes Yes Yes Yes Yes Yes Yes 0.72 0.15 0.25 0.37 0.25 0.13 2.50 3.92 1.44 0.58 0.11 1.35 0.48 2.10 1.27 1.76 1.74 1.64 2.26 1.40 2.41 3.29 3.37 2.79 2.57 1.55 0.03 0.70 0.93 0.00 0.00 0.00 0.53 0.00 0.22 0.81 0.60 0.00 0.54 1 0 0 1 1 1 0 1 0 0 0 1 0 86 85 43 8 1.30 3.80 Yes - 0.87 0.78 4.46 2.40 0.52 0.00 0 1 Rackes and Waring 2015 - 11 DEM “suggestively preferred” resampling, criteria, and data fusion logic As discussed in the paper 
section “Judging the DEM as suggestively better than the IEM,” we assessed whether the IEM or DEM posterior replications were more similar to the measurements. We used resampling to limit the undue impact of outliers or overly influential points. For the BASE measurements, we employed repeated bootstrap samples; for the models, we repeatedly selected a single replication at random (of the 4000), then applied bootstrap sampling to the N buildings. This process yielded distributions of each metric for the data and each model. Our analysis compared the medians of these distributions.

Once the three metrics, namely the arithmetic mean (AM), βCin, and βΔC, were calculated, we turned each into a criterion by using thresholds to determine whether either model was meaningfully more similar to the data according to each metric. To be preferred according to the AM, a model’s percent difference from the data had to be meaningfully better than the other model’s (at least 10% better). For βCin and βΔC, the same condition applied with a more stringent threshold (20%), along with two additional conditions: the model’s absolute difference from the data also had to be meaningfully better than the other model’s (at least 0.10 better), and the model slope had to be within 0.25 of the data slope (i.e., a model could not be preferred merely for being closer to the data when neither model was similar to the data). We also used a fourth criterion: substantially greater likelihood according to both density estimates (multivariate normal and 2D histogram), with at least one of the two likelihood ratio tests having a p-value less than 0.20.

We combined the criteria cautiously in an attempt to avoid spurious conclusions: we deemed the DEM to be suggestively preferred for any VOC where it met at least two of the four criteria.
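The fusion logic above can be sketched as a handful of small predicates. This is an illustrative reading, not the code used in the analysis: the function names are hypothetical, "at least 10% better" and "at least 20% better" are interpreted as gaps in percentage points, and the likelihood criterion is reduced to two boolean flags plus the two p-values.

```python
def prefers_by_am(pct_diff_model, pct_diff_other, margin=10.0):
    """AM criterion: the model's percent difference from the data must be
    at least `margin` percentage points smaller than the other model's."""
    return abs(pct_diff_other) - abs(pct_diff_model) >= margin

def prefers_by_slope(slope_model, slope_other, slope_data,
                     pct_margin=20.0, abs_margin=0.10, max_gap=0.25):
    """Slope criterion (beta_Cin or beta_dC): all three conditions must hold.
    Assumes slope_data is nonzero."""
    abs_model = abs(slope_model - slope_data)
    abs_other = abs(slope_other - slope_data)
    pct_model = 100.0 * abs_model / abs(slope_data)
    pct_other = 100.0 * abs_other / abs(slope_data)
    return (pct_other - pct_model >= pct_margin      # percent difference
            and abs_other - abs_model >= abs_margin  # absolute difference
            and abs_model <= max_gap)                # similar to the data

def prefers_by_likelihood(higher_mvn, higher_hist, p_mvn, p_hist, p_max=0.20):
    """Likelihood criterion: greater likelihood under both density estimates
    and at least one likelihood ratio test with p < p_max."""
    return higher_mvn and higher_hist and min(p_mvn, p_hist) < p_max

def dem_suggestively_preferred(criteria_met):
    """The DEM is suggestively preferred if it meets at least two criteria."""
    return sum(bool(c) for c in criteria_met) >= 2

# Hypothetical VOC: the DEM wins on beta_Cin and on likelihood only.
criteria = [
    prefers_by_am(5.0, 12.0),                       # AM
    prefers_by_slope(0.55, 0.95, 0.60),             # beta_Cin
    prefers_by_slope(0.58, 0.62, 0.60),             # beta_dC
    prefers_by_likelihood(True, True, 0.15, 0.40),  # likelihood
]
preferred = dem_suggestively_preferred(criteria)
```

Requiring two criteria rather than one keeps a single noisy metric from deciding the outcome on its own.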
Regression coefficients βΔC and βCin and relative air exchange efficacy

When practitioners or designers consider a ventilation intervention, they often expect an impact based on an inverse relation to AER. For example, for doubling the AER, the ideal percent change would be −50% (halving), and for halving the AER, the ideal percent change would be +100% (doubling). This inverse impact only occurs under idealized circumstances, with no outdoor concentration or emissions response. We define the relative air exchange efficacy ηΔC as

$$\eta_{\Delta C} = \frac{\text{actual change in } \Delta C \text{ due to an AER change}}{\text{ideal inverse change in } \Delta C \text{ due to an AER change}} \tag{9}$$

and ηCin analogously with respect to Cin. Note that ηΔC will be different for different interventions, but if the process is geometric, it will be the same for any given ratio of after-to-before intervention AERs, regardless of the AERs themselves.

We will now show, as stated in the paper section “Comparing emission rate models: similarity and equivalence,” that βΔC is a good approximation of ηΔC and βCin is a good approximation of ηCin. First, note that for an ideal intervention, ηΔC and ηCin equal 1, by definition. The slopes βΔC and βCin will also be exactly 1. (This can easily be seen by negating the natural logarithm of Equation 1, which yields a linear expression with a coefficient of 1 for ln(λ): −ln(ΔC) = −ln(Cin − Cout) = ln(λ) − ln((EpPz + EzAz)/Vz).) If emissions are AER-dependent, ηΔC and βΔC will be less than 1, reflecting the lower efficacy of air exchange at reducing indoor concentration from indoor sources. If there is also an outdoor concentration, ηCin and βCin will be less than ηΔC and βΔC, respectively, since an outdoor concentration floor further reduces the possible percent change in indoor concentration. All of these measures will also be reduced further by measurement and process noise, which corrupts the linearity of the relation between ln(λ) and ln(Cin) or ln(ΔC).
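As a numerical illustration of these statements, the sketch below evaluates Equation 9 directly for a power-law response, −ln(ΔC) = α + β ln(λ), and compares the resulting efficacy with β. The function names, the arbitrary source strength G, and the chosen AER values are assumptions made only for this example.

```python
import math

def delta_c(lam, beta, alpha=0.0):
    """Indoor-outdoor difference under the fit -ln(dC) = alpha + beta*ln(lam)."""
    return math.exp(-alpha) * lam ** (-beta)

def efficacy(beta, lam1, lam2, alpha=0.0):
    """Equation 9: actual change in dC over the ideal inverse change."""
    dc1 = delta_c(lam1, beta, alpha)
    dc2 = delta_c(lam2, beta, alpha)
    ideal_dc2 = dc1 * lam1 / lam2  # ideal response: dC scales as 1/lam
    return (dc2 - dc1) / (ideal_dc2 - dc1)

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Ideal case: dC = G/lam, so the slope of -ln(dC) on ln(lam) is 1.
lams = [0.25, 0.5, 1.0, 2.0, 4.0]
G = 3.0  # arbitrary illustrative source strength
slope_ideal = ols_slope([math.log(l) for l in lams],
                        [-math.log(G / l) for l in lams])

# Largest gap between eta and beta over 0 <= beta <= 1 for a doubled AER.
betas = [i / 100 for i in range(101)]
max_gap_doubling = max(abs(efficacy(b, 1.0, 2.0) - b) for b in betas)
```

Under these assumptions the efficacy depends only on the ratio λ2/λ1, and for a doubling of the AER the gap between η and β stays below about 0.09.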
Beyond these observations, we can formally develop the relation between ηΔC and βΔC (the development is identical for ηCin and βCin). If the line of best fit for −ln(ΔC) as a function of ln(λ) has intercept α and slope β, then the regression model for ΔC holds that:

$$\Delta C = \frac{e^{-\alpha}}{\lambda^{\beta}}$$

According to the regression model, the actual change in ΔC due to an AER change from λ1 to λ2 will be:

$$\Delta C_2 - \Delta C_1 = \frac{e^{-\alpha}}{\lambda_2^{\beta}} - \frac{e^{-\alpha}}{\lambda_1^{\beta}} = e^{-\alpha}\left(\frac{1}{\lambda_2^{\beta}} - \frac{1}{\lambda_1^{\beta}}\right)$$

The ideal change, for which ΔC = G/λ with G a constant, would be:

$$\Delta C_2 - \Delta C_1 = \frac{G}{\lambda_2} - \frac{G}{\lambda_1} = G\left(\frac{1}{\lambda_2} - \frac{1}{\lambda_1}\right)$$

Therefore:

$$\eta_{\Delta C} = \frac{e^{-\alpha}\left(\dfrac{1}{\lambda_2^{\beta}} - \dfrac{1}{\lambda_1^{\beta}}\right)}{G\left(\dfrac{1}{\lambda_2} - \dfrac{1}{\lambda_1}\right)}$$

Since the initial concentration is the same under the regression model or the idealization, we know that:

$$\Delta C_1 = \frac{e^{-\alpha}}{\lambda_1^{\beta}} = \frac{G}{\lambda_1} \;\Rightarrow\; G = e^{-\alpha}\,\lambda_1^{1-\beta}$$

Therefore:

$$\eta_{\Delta C} = \frac{e^{-\alpha}\left(\dfrac{1}{\lambda_2^{\beta}} - \dfrac{1}{\lambda_1^{\beta}}\right)}{e^{-\alpha}\,\lambda_1^{1-\beta}\left(\dfrac{1}{\lambda_2} - \dfrac{1}{\lambda_1}\right)} = \frac{\lambda_1^{\beta}\left(\dfrac{1}{\lambda_2^{\beta}} - \dfrac{1}{\lambda_1^{\beta}}\right)}{\lambda_1\left(\dfrac{1}{\lambda_2} - \dfrac{1}{\lambda_1}\right)} = \frac{\left(\dfrac{\lambda_1}{\lambda_2}\right)^{\beta} - 1}{\dfrac{\lambda_1}{\lambda_2} - 1}$$

If we let c be the ratio of the post-intervention AER to the baseline AER, i.e., c = λ2/λ1, we can write

$$\eta_{\Delta C} = \frac{(1/c)^{\beta} - 1}{1/c - 1}$$

This expression looks like:

[Figure: relative AER efficacy, η, plotted against the slope, β, from 0 to 1, with one curve for each c = λ2/λ1 of 1/3, 1/2, 2/3, 3/2, 2, and 3.]

For c = 2/3 or 3/2, the error in using β to approximate η is never more than about 0.05. For c = 1/2 or 2, the error is never more than about 0.09. For c = 1/3 or 3, the error is never more than 0.14.

Detailed information on assessing the role of other factors

To assess the influence of other factors on effective VOC emission rates in the BASE buildings, we conducted multiple linear regressions of Ez,ij or Ceq,ij against the AER, the age of the building at the time of the measurement (since year of construction or last major renovation, whichever was later), and the indoor and outdoor temperature and relative humidity (RH). Only detectable emissions, as defined in the text of the paper following Equation 2, were included. Because there was no obvious best dependence relation, we used four fits.
In the first two, the dependent variable was Ez,ij calculated via Equation 2. In the other two, the dependent variable was the median estimate of Ceq,ij under the DEM. (Recall that Bayesian estimation provides estimates of all latent variables, including Ceq for each VOC in each building.) For each dependent variable, there was both an untransformed and a log-transformed regression; in the log-transformed versions, the logarithm of the AER was used as a predictor. We make no claim that these four regressions are ideal, but we suggest that any strong associations between long-term effective emissions and environmental factors would show up in at least some of them.

Table SI-5 shows the number of VOCs for which each factor, as well as the regression model as a whole, was significant. Results from the first two regressions again confirmed that the AER is a primary driver of effective emissions. It was a significant predictor for 24 VOCs in the linear Ez regression and 22 VOCs in the log Ez regression, always with a positive impact. As expected, it was generally not significant in the regressions for Ceq.

All the remaining environmental factors proved significant for only a small number of VOCs (always between zero and three) under all regression models. A cautious interpretation, since α = 0.05 and there are 46 VOCs, is that these few significant associations were probably just due to chance. This interpretation is especially likely given that VOCs that might be expected to behave similarly did not necessarily share associations, and different VOCs sometimes appeared significant under the four regression models. We note that using the specific regression model derived by Parthasarathy, Maddalena, et al. (2011), which was a very good predictor for temperature and humidity impacts on formaldehyde emissions in temporary housing units, did not lead to more VOCs with significant relations.
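The chance argument can be made concrete with a short calculation. The sketch below assumes, as an idealization not made in the text, that the 46 per-VOC tests for a given factor are independent and that no true association exists; the function name is hypothetical.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """P(X >= k) for X ~ Binomial(n, alpha)."""
    return 1.0 - sum(comb(n, i) * alpha**i * (1 - alpha)**(n - i)
                     for i in range(k))

n_vocs, alpha = 46, 0.05

# Expected number of spuriously significant VOCs for one factor.
expected_false_positives = n_vocs * alpha

# Chance of seeing at least one, and at least four, spurious associations.
p_one_or_more = prob_at_least(1, n_vocs, alpha)
p_four_or_more = prob_at_least(4, n_vocs, alpha)
```

Under these assumptions a little over two significant VOCs per factor would be expected from chance alone, which is the same order as the zero to three observed for the non-AER factors.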
Table SI-5 Number of VOCs for which age and environmental variables were significant predictors in four multiple linear regressions. The "+" and "-" symbols refer to the number of positive and negative associations. Entries are the number of times each factor was significant (α = 0.05), out of 46 VOCs; the "Model" column is the number of VOCs for which the regression model as a whole was significant.

| Regression | Dependence structure | Model | λ | Age | Tin | Tout | RHin | RHout |
|---|---|---|---|---|---|---|---|---|
| Linear Ez | Ez ~ λ, age, Tin, Tout, RHin, RHout | 27 | 24+, 0- | 0+, 0- | 0+, 2- | 0+, 0- | 0+, 1- | 0+, 2- |
| Log Ez | ln(Ez) ~ ln(λ), age, Tin, Tout, RHin, RHout | 20 | 22+, 0- | 0+, 0- | 0+, 1- | 3+, 0- | 1+, 1- | 1+, 2- |
| Linear Ceq | Ceq ~ λ, age, Tin, Tout, RHin, RHout | 2 | 4+, 0- | 0+, 0- | 0+, 1- | 3+, 0- | 1+, 2- | 1+, 2- |
| Log Ceq | ln(Ceq) ~ ln(λ), age, Tin, Tout, RHin, RHout | 4 | 3+, 0- | 0+, 0- | 0+, 0- | 3+, 0- | 1+, 2- | 1+, 2- |

Though these may be due to chance, the significant associations that showed up in at least three regressions (all four unless otherwise noted) were:

• Indoor temperature, Tin. There was a negative association for dichlorobenzene (in all regressions but log Ceq). There were no VOCs with positive associations.
• Outdoor temperature, Tout. There were positive associations for 2-ethyl-1-hexanol, styrene, and trichlorofluoromethane (in all regressions but linear Ez). There were no VOCs with negative associations.
• Indoor relative humidity, RHin. There was a positive association for acetaldehyde (in all regressions but linear Ez). There was a negative association for styrene (in all regressions but linear Ez).
• Outdoor relative humidity, RHout. There was a positive association for methylene chloride (in all regressions but linear Ez). There were negative associations for 2,2,4-trimethyl-1,3-pentanediol monoisobutyrate (TMPD-MIB) and trichloroethene.
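As a rough check on the chance interpretation, the following sketch computes how many significant results per factor would be expected from chance alone when 46 VOCs are each tested at α = 0.05, and the probability of seeing at least a given number. It is an illustration only and not part of the original analysis: it assumes that the 46 tests are independent and that every null hypothesis is true, which is an idealization (VOC emissions measured in the same buildings are likely correlated), and the function names are hypothetical.

```python
from math import comb

N_VOCS = 46    # number of VOCs tested per factor (from the text)
ALPHA = 0.05   # significance level (from the text)


def expected_false_positives(n=N_VOCS, alpha=ALPHA):
    """Expected count of significant results if every null hypothesis is true."""
    return n * alpha


def prob_at_least(k, n=N_VOCS, alpha=ALPHA):
    """P(X >= k) for X ~ Binomial(n, alpha).

    Assumption (not from the source): the n tests are independent.
    """
    below_k = sum(comb(n, i) * alpha**i * (1 - alpha) ** (n - i) for i in range(k))
    return 1.0 - below_k


if __name__ == "__main__":
    print(f"expected by chance: {expected_false_positives():.1f}")
    for k in (1, 2, 3, 4):
        print(f"P(at least {k} significant) = {prob_at_least(k):.2f}")
```

Under these assumptions about 2.3 significant results per factor would be expected by chance alone, which is comparable to the zero to three significant VOCs reported for age, temperature, and RH in Table SI-5.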
It was surprising to find essentially no evidence of a relation between temperature or RH and effective emissions, since others have found that higher temperature and RH can both lead to increased emissions of formaldehyde (Myers, 1985; Parthasarathy, Maddalena, et al., 2011) and other VOCs (Haghighat and De Bellis, 1998; Lee and Kim, 2012). Higher indoor RH might also be expected to increase effective emissions by displacing sorbed VOC molecules, but no relation between Ez and RHin was observed for more sorptive VOCs with lower vapor pressures. Indeed, higher RHout, which could plausibly affect the humidity near exterior walls, was actually associated with reduced emissions of TMPD-MIB, a paint coalescing agent.

Perhaps even more unexpected, almost no VOCs exhibited any relation between the age of the building and the effective emission rate. Indeed, given that the mass transfer model we have used for the DEM is a pseudo-steady state snapshot of the long-term emissions decay model of Sherman and Hult (2013), it was surprising to find no sign of decay for any VOC across the office sector. This was true even though there were 30 buildings less than a decade old, including ten that were 1–3 years old. These ages are well short of the elapsed emission time that would be considered "long" for thick materials with low diffusivity (Qian et al., 2007), so it is unlikely that material emissions represented in this dataset had already been greatly depleted. There is some evidence from chamber tests that formaldehyde emissions from some pressed wood products remain stable after about three months, presumably due to continued hydrolysis of resins (Brown, 1999), but that effect has not been observed for other VOCs.

Advice for conducting Monte Carlo simulations with our models

To perform a Monte Carlo simulation, we recommend the following steps (for each VOC):

1. Sample building parameters independently from the distributions for Pz/Az, Az, and h (Table 1). If you are evaluating a specific ventilation strategy, sample the infiltration-only AER from the lognormal distribution with GM 0.085 h⁻¹ and GSD 2.0 (Chan, 2006; Rackes and Waring, 2013), and add the AER from your ventilation strategy to it. If you would like to use the sector-wide AER distribution in your analysis, sample from the lognormal distribution with the Bayesian estimates (Table 1).

2. Check Table SI-1 for the human breath emission rate. If it is a point estimate, use it directly for the average emission Ep. If it is a distribution, calculate the occupant count Pz,j for each building as Pz,j = round((Pz,j/Az,j) × Az,j), using the sampled occupant density and floor area. Then, for each building j, sample from the individual breath emission distribution Pz,j times and use the average of these individual emissions as Ep,j.

3. Sample Cout from the lognormal distributions given in Table SI-4.

4. Calculate Soth according to the slightly modified version of Equation 6 as follows:

$$S_{\mathrm{oth},ij} = C_{\mathrm{out},ij}\,\lambda_j + \frac{E_{\mathrm{p},ij}\,P_{\mathrm{z},j}}{V_{\mathrm{z},j}} = C_{\mathrm{out},ij}\,\lambda_j + \frac{E_{\mathrm{p},ij}\,(P_{\mathrm{z},j}/A_{\mathrm{z},j})}{h_{\mathrm{z},j}} \tag{6′-SI}$$

(The second form is used since Pz,j and Vz,j are both proportional to Az,j and are therefore highly correlated and would require a joint distribution, while the occupant density Pz,j/Az,j and ceiling height hz,j are statistically independent and can be sampled directly from their marginal distributions.)

5. For each building, draw a sample from the uniform distribution on [0, 1]. If the result is less than or equal to pe, the probability of a building having detectable emissions (Table 2), continue to Step 6. If not, calculate the indoor concentration according to:

$$C_{\mathrm{in},ij} = \frac{S_{\mathrm{oth},ij}}{\lambda_j} \tag{9-SI}$$

6. If the compound is DEM-preferred (statistically or suggestively better modeled by the DEM) according to Table 2, sample from the lognormal distribution for Ceq using median parameter estimates from Table 2. Use the median kL estimate from Table 2.
Calculate indoor concentration with Equation 5:

$$C_{\mathrm{in},ij} = \frac{S_{\mathrm{oth},ij} + (kL)_i\,C_{\mathrm{eq},ij}}{\lambda_j + (kL)_i} \quad \text{(DEM)} \tag{5}$$

7. If the compound is not DEM-preferred (either effectively equivalent or no preference), sample from the lognormal distribution for Ez using median parameter estimates from Table 2. Calculate indoor concentration with Equation 7:

$$C_{\mathrm{in},ij} = \frac{S_{\mathrm{oth},ij} + E_{\mathrm{z},ij}/h_{\mathrm{z},j}}{\lambda_j} \quad \text{(IEM)} \tag{7}$$

References (SI)

Brown, S.K. (1999) Chamber assessment of formaldehyde and VOC emissions from wood-based panels, Indoor Air, 9, 209–215.

Carpenter, D.A. and Buttram, J.M. (1998) Breath temperature: An Alabama perspective, Newsletter of the International Association for Chemical Testing, 16–17.

Chan, W.R. (2006) Assessing the effectiveness of shelter-in-place as an emergency response to large-scale outdoor chemical releases, University of California, Berkeley, Available from: http://gradworks.umi.com/32/46/3246850.html (accessed 2 January 2015).

Fenske, J.D. and Paulson, S.E. (1999) Human breath emissions of VOCs, J. Air Waste Manag. Assoc., 49, 594–598.

Haghighat, F. and De Bellis, L. (1998) Material emission rates: Literature review, and the impact of indoor air temperature and relative humidity, Build. Environ., 33, 261–277.

Hornung, R.W. and Reed, L.D. (1990) Estimation of average concentration in the presence of nondetectable values, Appl. Occup. Environ. Hyg., 5, 46–51.

Lee, Y.-K. and Kim, H.-J. (2012) The effect of temperature on VOCs and carbonyl compounds emission from wooden flooring by thermal extractor test method, Build. Environ., 53, 95–99.

Moser, B., Bodrogi, F., Eibl, G., Lechner, M., Rieder, J. and Lirk, P. (2005) Mass spectrometric profile of exhaled breath - field study by PTR-MS, Respir. Physiol. Neurobiol., 145, 295–300.

Myers, G. (1985) The effects of temperature and humidity on formaldehyde emission from UF-bonded boards - A literature critique, For. Prod. J., 35, 20–31.

Parthasarathy, S., Maddalena, R.L., Russell, M.L. and Apte, M.G. (2011) Effect of temperature and humidity on formaldehyde emissions in temporary housing units, J. Air Waste Manag. Assoc., 61, 689–695.

Persily, A.K. and Gorfain, J. (2008) Analysis of ventilation data from the US Environmental Protection Agency building assessment survey and evaluation (BASE) study, US Department of Commerce, National Institute of Standards and Technology, Building and Fire Research Laboratory, Available from: http://www.bfrl.nist.gov/IAQanalysis/docs/NISTIR7145Analysis%20of%20Ventilation%20Data%20from%20the%20U.S.%20Enviro.pdf (accessed 2 January 2015).

Qian, K., Zhang, Y., Little, J.C. and Wang, X. (2007) Dimensionless correlations to predict VOC emissions from dry building materials, Atmos. Environ., 41, 352–359.

Rackes, A. and Waring, M.S. (2013) Modeling impacts of dynamic ventilation strategies on indoor air quality of offices in six US cities, Build. Environ., 60, 243–253.

Riess, U., Tegtbur, U., Fauck, C., Fuhrmann, F., Markewitz, D. and Salthammer, T. (2010) Experimental setup and analytical methods for the non-invasive determination of volatile organic compounds, formaldehyde and NOx in exhaled human breath, Anal. Chim. Acta, 669, 53–62.

Sherman, M.H. and Hult, E.L. (2013) Impacts of contaminant storage on indoor air quality: Model development, Atmos. Environ., 72, 41–49.

Smith, D., Turner, C. and Španěl, P. (2007) Volatile metabolites in the exhaled breath of healthy volunteers: their levels and distributions, J. Breath Res., 1, 014004.

U.S. EPA (2003) A Standardized EPA Protocol for Characterizing Indoor Air Quality in Large Office Buildings, Washington, DC, U.S. Environmental Protection Agency, Available from: http://www.epa.gov/iaq/base/pdfs/2003_base_protocol.pdf (accessed 2 January 2015).

U.S. EPA (2011) Exposure Factors Handbook, Washington, DC, U.S. Environmental Protection Agency, Available from: http://www.epa.gov/ncea/efh/pdfs/efh-complete.pdf (accessed 2 January 2015).

Wallace, L., Nelson, W., Ziegenfus, R., Pellizzari, E., Michael, L., Whitmore, R., Zelon, H., Hartwell, T., Perritt, R. and Westerdahl, D. (1991) The Los Angeles TEAM study: Personal exposures, indoor-outdoor air concentrations, and breath concentrations of 25 volatile organic compounds, J. Expo. Anal. Environ. Epidemiol., 1, 157–92.
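The Monte Carlo steps recommended in the section "Advice for conducting Monte Carlo simulations with our models" (Steps 1 to 7) can be sketched in code as follows. This is an illustrative sketch only, not the implementation used in this work. Every entry in EXAMPLE_PARAMS is a hypothetical placeholder rather than a value from Table 1, Table 2, Table SI-1, or Table SI-4. The lognormal form assumed for individual breath emissions, the units in the comments, the handling of a zero occupant count, and all function and key names are likewise assumptions. Only the infiltration parameters (0.085 h-1, interpreted here as the geometric mean, and GSD 2.0) are taken from the text.

```python
import math
import random


def lognormal(rng, gm, gsd):
    """Sample a lognormal variate given its geometric mean and GSD."""
    return rng.lognormvariate(math.log(gm), math.log(gsd))


def s_other(c_out, aer, e_p, occ_density, height):
    """Equation 6'-SI: outdoor-air term plus human breath term."""
    return c_out * aer + e_p * occ_density / height


def c_in_dem(s_oth, aer, k_l, c_eq):
    """Equation 5 (DEM)."""
    return (s_oth + k_l * c_eq) / (aer + k_l)


def c_in_iem(s_oth, aer, e_z, height):
    """Equation 7 (IEM)."""
    return (s_oth + e_z / height) / aer


def simulate_once(rng, p, ventilation_aer=0.0):
    """One draw of C_in for one VOC in one building (Steps 1 to 7)."""
    # Step 1: building parameters; infiltration GM and GSD are from the text.
    occ_density = lognormal(rng, *p["occ_density"])
    area = lognormal(rng, *p["area"])
    height = lognormal(rng, *p["height"])
    aer = lognormal(rng, 0.085, 2.0) + ventilation_aer
    # Step 2: point estimate, or average over the rounded occupant count.
    n_occ = round(occ_density * area)
    if p["breath_dist"] is None:
        e_p = p["breath_point"]
    elif n_occ == 0:
        e_p = 0.0  # assumption: no occupants means no breath emission
    else:
        e_p = sum(lognormal(rng, *p["breath_dist"]) for _ in range(n_occ)) / n_occ
    # Steps 3 and 4: outdoor concentration and the non-building source term.
    c_out = lognormal(rng, *p["c_out"])
    s_oth = s_other(c_out, aer, e_p, occ_density, height)
    # Step 5: buildings without detectable emissions use Equation 9-SI.
    if rng.random() > p["p_e"]:
        return s_oth / aer
    # Step 6: DEM-preferred compounds.
    if p["dem_preferred"]:
        return c_in_dem(s_oth, aer, p["k_l"], lognormal(rng, *p["c_eq"]))
    # Step 7: all other compounds use the IEM.
    return c_in_iem(s_oth, aer, lognormal(rng, *p["e_z"]), height)


# Hypothetical placeholder values as (GM, GSD) pairs; NOT the fitted estimates.
EXAMPLE_PARAMS = {
    "occ_density": (0.04, 1.6),   # P_z/A_z, persons per m2 (assumed units)
    "area": (1500.0, 2.0),        # A_z, m2
    "height": (2.8, 1.1),         # h_z, m
    "breath_point": 0.0,          # E_p point estimate, used if breath_dist is None
    "breath_dist": (100.0, 2.0),  # individual breath emission, ug/h per person
    "c_out": (1.0, 2.5),          # C_out, ug/m3
    "p_e": 0.8,                   # probability of detectable emissions
    "dem_preferred": True,
    "k_l": 0.5,                   # kL, 1/h
    "c_eq": (10.0, 2.5),          # C_eq, ug/m3
    "e_z": (20.0, 2.5),           # E_z, ug/(m2 h)
}

if __name__ == "__main__":
    rng = random.Random(1)
    draws = sorted(simulate_once(rng, EXAMPLE_PARAMS, ventilation_aer=0.5)
                   for _ in range(1000))
    print("median C_in:", draws[len(draws) // 2])
```

Sampling occupant density and ceiling height separately, rather than Pz and Vz, follows the reasoning given with Equation 6′-SI. To use the sketch in practice, the placeholder entries would need to be replaced with the corresponding estimates for the VOC of interest.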