Appendix 1. EM Algorithm for Estimating Stratum-Level Bayesian Hyperparameters

The EM algorithm is used to estimate the unknown population parameters $\beta_{s,p}$ and $\Sigma_{s,p}$ from the following setup,

$$\hat{\beta}_{cs,p} \sim \mathrm{MVN}(\beta_{cs,p}, \hat{V}_{cs,p})$$

$$\beta_{cs,p} \sim \mathrm{MVN}(\beta_{s,p} Z_{cs}, \Sigma_{s,p})$$

where $p = (1, 2, \ldots, P)$ is used to index the set of parameters associated with the $p$th synthetic variable of interest and the $p$th regression model from which the direct estimates $\hat{\beta}_{cs}$ and $\hat{V}_{cs}$ were obtained in Step 1. The E step consists of solving the following expectations,

$$\beta^{*}_{cs,p} = E(\beta_{cs,p}) = \left(\hat{V}^{-1}_{cs,p} + \Sigma^{-1}_{s,p}\right)^{-1} \left(\hat{V}^{-1}_{cs,p} \hat{\beta}_{cs,p} + \Sigma^{-1}_{s,p} \beta_{s,p} Z_{cs}\right)$$

$$\left[\beta_{cs,p} \beta_{cs,p}^{T}\right]^{*} = E\left[\beta_{cs,p} \beta_{cs,p}^{T}\right] = \left(\hat{V}^{-1}_{cs,p} + \Sigma^{-1}_{s,p}\right)^{-1} + \beta^{*}_{cs,p} \left(\beta^{*}_{cs,p}\right)^{T}$$

Once these expectations are computed they are then incorporated into the maximization (M-step) of the unknown hyperparameters $\beta_{s,p}$ and $\Sigma_{s,p}$ using the following equations,

$$\hat{\beta}_{s,p} = \left[\sum_{c=1}^{C_s} \beta^{*}_{cs,p} Z_{cs}^{T}\right] \left[\sum_{c=1}^{C_s} Z_{cs} Z_{cs}^{T}\right]^{-1},$$

and

$$\hat{\Sigma}_{s,p} = \left[\sum_{s=1}^{S} \left[\sum_{c=1}^{C_s} \left(\beta^{*}_{cs,p} - \hat{\beta}_{s,p} Z_{cs}\right) \left(\beta^{*}_{cs,p} - \hat{\beta}_{s,p} Z_{cs}\right)^{T}\right] \Big/ C_s\right]$$

After convergence the maximum likelihood estimates are incorporated into the posterior distribution of $\beta_{cs,p}$ shown in equation [7].

Appendix 2. EM Algorithm for Estimating Overall Bayesian Hyperparameters

The EM algorithm is used to estimate the unknown population parameters $\beta_{p}$ and $\Omega_{p}$ from the following setup,

$$\hat{\beta}_{s,p} \sim \mathrm{MVN}(\beta_{s,p}, \hat{V}_{s,p})$$

$$\beta_{s,p} \sim \mathrm{MVN}(\beta_{p} Z_{s}, \Omega_{p})$$

where $p = (1, 2, \ldots, P)$ is used to index the set of parameters associated with the $p$th synthetic variable of interest and the $p$th regression model from which the hyperparameter estimates $\hat{\beta}_{s}$ and $\hat{V}_{s}$ were obtained via the EM algorithm.
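As a rough illustration of how the E and M steps of Appendix 1 alternate (the Appendix 2 updates have the same form with the indices changed), the following is a minimal sketch for the scalar special case: one coefficient per area and a covariate vector consisting of an intercept and a single covariate. It is not the authors' implementation. The function name `em_scalar`, the starting values, the stopping rule, and the simulated inputs are all assumptions made for illustration. The variance update here also adds the posterior variance from the E step, as in a textbook EM for this normal-normal model, which is an assumption about how the second-moment expectation enters the M step.

```python
import random


def em_scalar(beta_hat, v_hat, x, n_iter=500, tol=1e-10):
    """Textbook EM for a scalar two-level normal model (illustrative only).

    Assumed model: beta_hat[c] ~ N(beta[c], v_hat[c]) and
    beta[c] ~ N(b0 + b1 * x[c], sigma2), with v_hat treated as known.
    """
    C = len(beta_hat)
    b0, b1, sigma2 = 0.0, 0.0, 1.0  # arbitrary starting values (assumption)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    det = C * sxx - sx * sx  # determinant of sum_c Z_c Z_c^T with Z_c = (1, x_c)
    for _ in range(n_iter):
        # E step: posterior variance w_c and posterior mean beta_star_c
        w = [1.0 / (1.0 / v + 1.0 / sigma2) for v in v_hat]
        star = [wc * (bh / v + (b0 + b1 * xi) / sigma2)
                for wc, bh, v, xi in zip(w, beta_hat, v_hat, x)]
        # M step for the mean: least squares of beta_star on (1, x)
        sy = sum(star)
        sxy = sum(xi * s for xi, s in zip(x, star))
        new_b0 = (sxx * sy - sx * sxy) / det
        new_b1 = (C * sxy - sx * sy) / det
        # M step for the variance: squared residual plus posterior variance
        new_sigma2 = sum((s - new_b0 - new_b1 * xi) ** 2 + wc
                         for s, xi, wc in zip(star, x, w)) / C
        change = max(abs(new_b0 - b0), abs(new_b1 - b1),
                     abs(new_sigma2 - sigma2))
        b0, b1, sigma2 = new_b0, new_b1, new_sigma2
        if change < tol:
            break
    return b0, b1, sigma2


# Simulated inputs (hypothetical values, not taken from the NHIS application)
rng = random.Random(0)
C = 500
x = [rng.uniform(-1.0, 1.0) for _ in range(C)]
v_hat = [0.2] * C
beta = [1.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]
beta_hat = [b + rng.gauss(0.0, 0.2 ** 0.5) for b in beta]

b0_est, b1_est, sigma2_est = em_scalar(beta_hat, v_hat, x)
print(b0_est, b1_est, sigma2_est)
```

With the simulated inputs above, the estimates should land near the generating values (intercept 1, slope 0.5, variance 1), up to sampling error. A multivariate version would replace the scalar divisions with matrix inverses, as in the equations of the appendix.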
The E step consists of solving the following expectations,

$$\beta^{*}_{s,p} = E(\beta_{s,p}) = \left(\hat{V}^{-1}_{s,p} + \Omega^{-1}_{p}\right)^{-1} \left(\hat{V}^{-1}_{s,p} \hat{\beta}_{s,p} + \Omega^{-1}_{p} \beta_{p} Z_{s}\right)$$

$$\left[\beta_{s,p} \beta_{s,p}^{T}\right]^{*} = E\left[\beta_{s,p} \beta_{s,p}^{T}\right] = \left(\hat{V}^{-1}_{s,p} + \Omega^{-1}_{p}\right)^{-1} + \beta^{*}_{s,p} \left(\beta^{*}_{s,p}\right)^{T}$$

Once these expectations are computed they are then incorporated into the maximization (M-step) of the unknown hyperparameters $\beta_{p}$ and $\Omega_{p}$ using the following equations,

$$\hat{\beta}_{p} = \beta^{*}_{s,p} Z_{s}^{T} \left(Z_{s} Z_{s}^{T}\right)^{-1},$$

and

$$\hat{\Omega}_{p} = \left[\sum_{s=1}^{S} \left[\sum_{c=1}^{C_s} \left(\beta^{*}_{s,p} - \hat{\beta}_{p} Z_{s}\right) \left(\beta^{*}_{s,p} - \hat{\beta}_{p} Z_{s}\right)^{T}\right] \Big/ S\right]$$

After convergence the maximum likelihood estimates are incorporated into the posterior distribution of $\beta_{s,p}$ shown in equation [8].

Appendix 3. Simulation Study

This section evaluates the repeated sampling properties of the small area inferences based on a simulation application. In this simulation, the 2003-2005 NHIS data is treated as a population from which subsamples are drawn. 500 random subsamples are drawn from each PSU with replacement. Each subsample accounts for approximately 30% of the total sample in each PSU. Each NHIS subsample is used as the basis for constructing a synthetic population from which 100 synthetic samples are drawn. A total of 50,000 synthetic data sets are generated.

Two types of inferences can be obtained from the synthetic data: conditional and unconditional. Conditional synthetic inferences are obtained from synthetic samples that are based on a single observed sample drawn from the observed population. This is the situation most commonly encountered in practice where a survey is carried out on a single population-based sample and the synthetic data is generated conditional on that sample. Unconditional inferences are obtained from synthetic samples that are based on multiple, or repeated, population-based samples. Unconditional inferences are not feasible in practice but can be achieved through simulation.
To obtain conditional inferences, 500 sets of 10 synthetic samples are randomly selected (with replacement) from the 100 synthetic samples generated conditional on each of the 500 NHIS subsamples. For each set of 10 synthetic samples, a synthetic estimate and associated confidence interval are obtained for each variable in each PSU using the combining rule equations [1] and [2] in Section 2.2. To obtain unconditional inferences, 100 sets of 10 synthetic samples are randomly selected with replacement across each of the 100 NHIS subsamples and estimates are obtained again using the relevant combining rules.

We use two measures to evaluate the validity of the synthetic data estimates. The first one is confidence interval coverage (CIC). For conditional inferences, CIC is defined as the proportion of times that the synthetic data confidence interval $[L_{\hat{Q}_s,syn}, U_{\hat{Q}_s,syn}]$ contains the actual estimate $\hat{y}_{act}$. For a single interval, coverage is recorded as

$$I\left(\hat{y}_{act} \in [L_{\hat{Q}_s,syn}, U_{\hat{Q}_s,syn}]\right)$$

where $I(\cdot)$ is an indicator function that equals 1 if $L_{\hat{Q}_s,syn} \le \hat{y}_{act} \le U_{\hat{Q}_s,syn}$ and 0 otherwise. For unconditional inferences, the only difference is that CIC is calculated as the proportion of times that the synthetic data confidence interval contains the "true" population value $Q_{pop}$, i.e., $L_{\hat{Q}_s,syn} \le Q_{pop} \le U_{\hat{Q}_s,syn}$.

The second evaluative measure is referred to as the confidence interval overlap (CIO; Karr et al., 2006). CIO is defined as the average relative overlap between the synthetic and actual data confidence intervals.
For every estimate the average overlap is calculated by,

$$\mathrm{CIO} = \frac{1}{2}\left(\frac{U_{over} - L_{over}}{U_{act} - L_{act}} + \frac{U_{over} - L_{over}}{U_{syn} - L_{syn}}\right),$$

where $U_{act}$ and $L_{act}$ denote the upper and the lower bound of the confidence interval for the actual estimate $\hat{y}_{act}$, $U_{syn}$ and $L_{syn}$ denote the upper and the lower bound of the confidence interval for the synthetic data estimate $\hat{Q}_s$, and $U_{over}$ and $L_{over}$ denote the upper and lower bound of the overlap between the confidence intervals of the actual and synthetic data estimates. CIO can take on any value between 0 and 1. A value of 0 means that there is no overlap between the two intervals and a value of 1 means that the two intervals coincide exactly; a synthetic data interval that covers the actual data interval but is wider than it scores below 1. Calculating the confidence interval overlap is only possible for conditional, not unconditional, inferences. This measure yields a more accurate assessment of data utility in the sense that it accounts for the significance level of the estimate. That is, estimates with low significance might still have a high confidence interval overlap and therefore a high data utility even if their point estimates differ considerably from each other.

A3.1 Validity of Univariate Estimates

Table A1 shows CIC and CIO values for estimated means obtained from sampled PSUs. The conditional CIC and CIO values are high, ranging from 0.91 to 0.99 and from 0.92 to 0.99, respectively. Furthermore, all of the unconditional CIC values correspond closely to the CIC values obtained from the actual data. The same pattern holds true for estimates obtained from nonsampled counties (Table A2). For both conditional and unconditional inferences, all CIC and CIO values equal 0.99, reflecting the large amount of variation in the synthetic data estimates, which results in wide confidence intervals. Overall, the simulation results suggest that the synthetic data method yields reasonably valid univariate inferences for sampled and nonsampled small areas.

Table A1.
Simulation-Based Confidence Interval Results for Estimated Means Obtained from Synthetic Data Sets Across all Sampled PSUs

                           Conditional Inference    Unconditional Inference
                           CIC       CIO            CIC       CIC (Actual)
BMI                        0.99      0.99           0.99      0.97
Age                        0.91      0.92           0.99      0.98
Smoker                     0.99      0.98           0.99      0.98
Moderate activity          0.99      0.99           0.99      0.98
Male                       0.99      0.98           0.99      0.98
Hypertension               0.99      0.97           0.99      0.97
Fair/poor health status    0.99      0.92           0.99      0.97

Abbreviations: CIC – Confidence Interval Coverage; CIO – Confidence Interval Overlap

Table A2. Simulation-Based Confidence Interval Results for Estimated Means Obtained from Synthetic Data Sets Across all Nonsampled PSUs

                           Conditional Inference    Unconditional Inference
                           CIC       CIO            CIC       CIC (Actual)
BMI                        0.99      0.99           0.99      0.99
Age                        0.99      0.99           0.99      0.99
Smoker                     0.99      0.99           0.99      0.99
Moderate activity          0.99      0.99           0.99      0.99
Male                       0.99      0.99           0.99      0.99
Hypertension               0.99      0.99           0.99      0.99
Fair/poor health status    0.99      0.99           0.99      0.99

Abbreviations: CIC – Confidence Interval Coverage; CIO – Confidence Interval Overlap

A3.2 Validity of Multivariate Estimates

Simulation results for multivariate estimands are shown in Tables A3 and A4 for sampled and nonsampled areas, respectively. These tables show average CIC and CIO values for regression coefficients for the dependent variable log(BMI) estimated within each PSU (or county). For the sampled PSUs, the conditional CIC and CIO values are high and range from 0.98-0.99 and 0.94-0.99, respectively, indicating good confidence interval coverage and overlap for these multivariate estimands. The unconditional CIC values equal 0.99, which either meets or exceeds the true CIC values obtained from the actual data. For the nonsampled counties, the confidence interval coverage and overlap are similarly high for all coefficient estimates, ranging from 0.98-0.99. As was the case for the univariate estimands, the analytic validity of the multivariate synthetic data estimands seems to be generally high from a repeated sampling perspective.

Table A3.
Simulation-Based Confidence Interval Results for Linear Regression Coefficients Obtained from Synthetic Data Sets Across all Sampled PSUs

                           Conditional Inference    Unconditional Inference
                           CIC       CIO            CIC       CIC (Actual)
Regression of BMI(log) on
  Intercept                0.99      0.98           0.99      0.97
  Age                      0.99      0.98           0.99      0.97
  Smoker                   0.99      0.98           0.99      0.98
  Moderate activity        0.99      0.98           0.99      0.97
  Male                     0.99      0.98           0.99      0.98
  Hypertension             0.99      0.99           0.99      0.98
  Fair/poor health         0.99      0.94           0.99      0.96

Abbreviations: CIC – Confidence Interval Coverage; CIO – Confidence Interval Overlap

Table A4. Simulation-Based Confidence Interval Results for Linear Regression Coefficients Obtained from Synthetic Data Sets Across all Nonsampled PSUs

                           Conditional Inference    Unconditional Inference
                           CIC       CIO            CIC       CIC (Actual)
Regression of BMI(log) on
  Intercept                0.99      0.99           0.99      0.99
  Age                      0.99      0.99           0.99      0.99
  Smoker                   0.99      0.99           0.99      0.99
  Moderate activity        0.99      0.99           0.99      0.99
  Male                     0.99      0.99           0.99      0.99
  Hypertension             0.99      0.99           0.99      0.99
  Fair/poor health         0.99      0.98           0.99      0.99

Abbreviations: CIC – Confidence Interval Coverage; CIO – Confidence Interval Overlap
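The two evaluation measures used in this appendix, confidence interval coverage and confidence interval overlap, can be computed with a few lines of code. The following is a minimal sketch, assuming each confidence interval is given as a (lower, upper) pair. The function names `covers`, `coverage_rate`, and `interval_overlap` and the example intervals are hypothetical and are not taken from the simulation study.

```python
def covers(lower, upper, value):
    """Coverage indicator: 1 if lower <= value <= upper, else 0."""
    return 1 if lower <= value <= upper else 0


def coverage_rate(intervals, value):
    """CIC: proportion of (lower, upper) intervals that contain value."""
    return sum(covers(lo, hi, value) for lo, hi in intervals) / len(intervals)


def interval_overlap(l_act, u_act, l_syn, u_syn):
    """CIO: average relative overlap of the actual and synthetic intervals."""
    l_over = max(l_act, l_syn)  # lower bound of the overlap region
    u_over = min(u_act, u_syn)  # upper bound of the overlap region
    if u_over <= l_over:
        return 0.0  # the intervals do not overlap
    width = u_over - l_over
    return 0.5 * (width / (u_act - l_act) + width / (u_syn - l_syn))


# Hypothetical synthetic-data intervals checked against an actual estimate of 1.5
synthetic_intervals = [(1.0, 2.0), (1.6, 2.4), (0.5, 1.7), (1.2, 1.9)]
cic = coverage_rate(synthetic_intervals, 1.5)

cio_identical = interval_overlap(0.0, 2.0, 0.0, 2.0)  # same interval
cio_disjoint = interval_overlap(0.0, 2.0, 3.0, 4.0)   # no overlap
cio_wider = interval_overlap(1.0, 2.0, 0.0, 4.0)      # synthetic covers actual
print(cic, cio_identical, cio_disjoint, cio_wider)
```

In this example three of the four hypothetical intervals contain 1.5, so the coverage rate is 0.75. The overlap is 1 for identical intervals, 0 for disjoint ones, and 0.625 when the synthetic interval covers the actual interval but is four times as wide.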