Pak. J. Statist. 2011 Vol. 27(4), 391-396

A NEARLY PSEUDO-OPTIMAL METHOD FOR KEEPING CALIBRATION WEIGHTS FROM FALLING BELOW UNITY IN THE ABSENCE OF NONRESPONSE OR FRAME ERRORS

Phillip S. Kott
Senior Research Statistician, RTI International
6110 Research Blvd., Rockville, MD 20852
Email: pkott@rti.org

ABSTRACT

Calibration weighting is a technique for adjusting randomization-based weights so that the estimator of a population total becomes unbiased under a linear prediction model. In the absence of nonresponse or frame errors, one set of calibration weights has been shown to be asymptotically optimal in some sense under Poisson sampling. Unfortunately, although it is desirable that each weight be at least one (so that the corresponding element “represents itself”), there is no guarantee that will be the case. We will see how to construct an asymptotically equivalent set of weights so that no weight is less than unity. One consequence is that it often will be a simple matter to construct a variance measure that simultaneously estimates the prediction variance and the randomization mean squared error of the estimator.

KEY WORDS: Generalized exponential form, Prediction model, Randomization-based.

1. INTRODUCTION

Suppose we want to estimate the total of a target variable y in a population U based on data from a probability sample S. If we know the selection probability, $\pi_k$, for each sample element k in S, then we could estimate $T_y = \sum_{k \in U} y_k$ with the expansion (Horvitz-Thompson) estimator

$$t_y^E = \sum_{k \in S} \frac{y_k}{\pi_k} = \sum_{k \in U} \frac{y_k I_k}{\pi_k},$$

where $I_k = 1$ when $k \in S$ and 0 otherwise. In randomization-based inference, the $I_k$ are random variables. It is easy to see that $E_I(t_y^E) = T_y$. In other words, $t_y^E$ is a randomization-unbiased estimator for $T_y$. We can also write $t_y^E = \sum_{k \in U} d_k y_k = \sum_{k \in S} d_k y_k$, where $d_k = I_k/\pi_k$.

Deville and Särndal (1992) coined the term “calibration estimator” to describe an estimator of the form $t_y^{CAL} = \sum_{k \in S} w_k y_k$, where the set of calibration weights $\{w_k\}$ is chosen to satisfy the calibration equation:

$$\sum_{k \in S} w_k \mathbf{x}_k = \sum_{k \in U} \mathbf{x}_k = \mathbf{T}_x \tag{1.1}$$

for some row vector of P benchmark variables, $\mathbf{x}_k = (x_{1k}, ..., x_{Pk})$, about which $\mathbf{T}_x$ is known. Another requirement is that $t_y^{CAL}$ be randomization consistent under mild conditions. This means that as the sample size grows arbitrarily large, the relative randomization mean squared error of $t_y^{CAL}$ must tend toward zero. For more on the randomization-based properties of the calibration estimator, see Kott (2008).

Since equation (1.1) holds, $t_y^{CAL}$ estimates $T_y$ perfectly when $y_k = \mathbf{x}_k \boldsymbol{\beta}$ exactly. With that in mind, it is reasonable to expect $t_y^{CAL}$ to be a good estimator when $y_k$ and $\mathbf{x}_k \boldsymbol{\beta}$ are close. This can be formalized by assuming the $y_k$ are random variables satisfying the linear prediction model:

$$y_k = \mathbf{x}_k \boldsymbol{\beta} + \epsilon_k, \tag{1.2}$$

where $E(\epsilon_k \mid \{\mathbf{x}_g, I_g;\, g \in U\}) = 0$ for all $k \in U$. Under this model, it is easy to see that $t_y^{CAL}$ is an unbiased estimator for $T_y$ in the sense that $E(t_y^{CAL} - T_y) = 0$ (suppressing the conditioning on the $\mathbf{x}_g$ and $I_g$ for notational convenience).

Section 2 describes the general regression (GREG) estimator, which translates into the linear calibration estimator with $w_k = d_k(1 + q_k \mathbf{x}_k \mathbf{g})$ for an appropriately defined vector $\mathbf{g}$. Although each $q_k$ is often set to 1 in practice, the calibration estimator can be shown to be optimal in some sense under Poisson sampling when $q_k = (1 - \pi_k)/\pi_k$. There is no guarantee with linear calibration that all the calibration weights will be nonnegative. Many computer programs cannot handle negative weights. Brewer (1999) argues that calibrated weights should never fall below unity, because when an element’s weight is less than 1, it “does not even represent itself.”
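To fix ideas, consider the following small numerical sketch (in Python) of linear calibration with the pseudo-optimal choice $q_k = (1-\pi_k)/\pi_k = d_k - 1$. The population size, benchmark variables, and selection probabilities are illustrative assumptions only; the sketch simply solves for $\mathbf{g}$, forms $w_k = d_k(1 + q_k \mathbf{x}_k \mathbf{g})$, checks the calibration equation, and reports whether any weight happens to fall below unity on this particular draw.

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative population: N elements, two benchmark variables (an intercept and a size measure).
N = 1000
x = np.column_stack([np.ones(N), rng.gamma(2.0, 5.0, N)])   # x_k = (1, size_k)
T_x = x.sum(axis=0)                                          # known population benchmark totals

# Poisson sampling with selection probabilities roughly proportional to size.
pi = np.clip(0.05 * x[:, 1] / x[:, 1].mean(), 0.01, 1.0)
in_sample = rng.random(N) < pi
xs, d = x[in_sample], 1.0 / pi[in_sample]                    # design weights d_k = 1/pi_k
q = d - 1.0                                                  # pseudo-optimal q_k = (1 - pi_k)/pi_k

# Linear (GREG-form) calibration: w_k = d_k (1 + q_k x_k g), with g chosen so that
# the calibration equation sum_S w_k x_k = T_x holds exactly.
A = (q * d)[:, None] * xs                                    # rows q_k d_k x_k
g = np.linalg.solve(xs.T @ A, T_x - d @ xs)                  # (sum q_k d_k x_k' x_k)^{-1} (T_x - sum d_k x_k)'
w = d * (1.0 + (xs @ g) * q)

print("calibration equation holds:", np.allclose(w @ xs, T_x))
print("min weight:", round(float(w.min()), 3), "| weights below 1:", int(np.sum(w < 1)))
```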
Section 3 shows how the nonlinear generalized exponential form (Folsom and Singh 2000) can be combined with a particular set of instrumental variables to create calibration weights that never fall below unity but are asymptotically equivalent to “optimal” weights. Section 4 provides a discussion of the simultaneous estimation of model variance and randomization mean squared error, which is aided by the absence of calibration weights less than 1.

2. THE GENERAL REGRESSION ESTIMATOR

The general regression or GREG estimator has the form:

$$t_y^{GREG} = t_y^E + \left(\mathbf{T}_x - \sum_{k \in S} d_k \mathbf{x}_k\right)\left(\sum_{k \in S} q_k d_k \mathbf{x}_k'\mathbf{x}_k\right)^{-1}\sum_{k \in S} q_k d_k \mathbf{x}_k' y_k. \tag{2.1}$$

This assumes that the matrix $\sum_{k \in S} q_k d_k \mathbf{x}_k'\mathbf{x}_k$ is invertible, which we do here. The GREG estimator in equation (2.1) can be put in calibration form as $t_y^{GREG} = \sum_{k \in S} w_k y_k$ by setting

$$w_k = d_k + \left(\mathbf{T}_x - \sum_{j \in S} d_j \mathbf{x}_j\right)\left(\sum_{j \in S} q_j d_j \mathbf{x}_j'\mathbf{x}_j\right)^{-1} q_k d_k \mathbf{x}_k'.$$

Strictly speaking, the $w_k$ are functions of the realized sample, S, and the $q_k$, but we suppress that in the notation for convenience.

To choose values for the $q_k$ that are optimal in some sense, we follow the reasoning in Rao (1994) and first consider randomization-unbiased estimators of the form

$$t_y(\mathbf{b}) = t_y^E + \left(\mathbf{T}_x - \sum_{k \in S} d_k \mathbf{x}_k\right)\mathbf{b},$$

where $\mathbf{b}$ is a known vector. It is not hard to see that the choice of $\mathbf{b}$ that minimizes the randomization variance of $t_y(\mathbf{b})$ is $\mathbf{b} = \left[\mathrm{Var}_I\!\left(\sum_S d_k \mathbf{x}_k\right)\right]^{-1}\mathrm{Cov}_I\!\left(t_y^E, \sum_S d_k \mathbf{x}_k\right)'$. Although this $\mathbf{b}$ is generally unknown, both $\mathrm{Var}_I(\cdot)$ and $\mathrm{Cov}_I(\cdot)$ can be estimated from the sample. Under Poisson sampling, plugging unbiased and consistent estimates for $\mathrm{Var}_I(\cdot)$ and $\mathrm{Cov}_I(\cdot)$ into $\mathbf{b}$ would yield the calibration estimator satisfying equation (1.1) with $q_k = (1 - \pi_k)/\pi_k = d_k - 1$. That calibration estimator is thus asymptotically optimal in some sense under Poisson sampling. Bankier (2002) named it “pseudo-optimal,” observing that it is asymptotically optimal under stratified simple random sampling when the strata are large and their indicator functions are components of $\mathbf{x}_k$.

Using different reasoning, Brewer (1999) suggests setting $q_k = (1 - \pi_k)/z_k$, where $z_k$ is some measure of size. With this setting, when element k is a certainty selection, which means $d_k$ is 1, $w_k$ is forced to equal 1 as well. In practice, this tends to limit the number of weights falling below unity under linear calibration. It does not guarantee their elimination, however.

When one or more $w_k$ falls below 1 after linear calibration, Brewer advises setting those $w_k$ to 1, removing the corresponding elements from the sample and population in equation (1.1), and running linear calibration again on the remainder of the sample. This can be done iteratively if necessary.
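A minimal Python sketch of this iterative fix follows. The data, the benchmark totals, and the choice $q_k = 1$ are made up for the example; this is a schematic rendering of the procedure just described, not code taken from Brewer (1999) or from any packaged implementation.

```python
import numpy as np

def linear_calibration(xs, d, q, T_x):
    """GREG-form weights w_k = d_k (1 + q_k x_k g) meeting sum_S w_k x_k = T_x."""
    A = (q * d)[:, None] * xs                      # rows q_k d_k x_k
    g = np.linalg.solve(xs.T @ A, T_x - d @ xs)
    return d * (1.0 + (xs @ g) * q)

def brewer_recalibrate(xs, d, q, T_x, max_iter=20):
    """Brewer's fix: any weight that calibrates below 1 is set to 1, the element is
    removed from both sides of the calibration equation, and linear calibration is
    rerun on the remaining elements; repeat until no weight falls below 1."""
    w = np.empty(len(d))
    active = np.ones(len(d), dtype=bool)           # elements still being calibrated
    T_rest = T_x.astype(float)                     # benchmark totals net of fixed elements
    for _ in range(max_iter):
        if not active.any():
            return w
        w_act = linear_calibration(xs[active], d[active], q[active], T_rest)
        if np.all(w_act >= 1.0):
            w[active] = w_act
            return w
        idx = np.flatnonzero(active)
        fix = idx[w_act < 1.0]                     # elements whose weights fell below 1
        w[fix] = 1.0                               # they now "represent only themselves"
        active[fix] = False
        T_rest = T_rest - xs[fix].sum(axis=0)
    w[active] = w_act
    return w

# A made-up example: q_k = 1 (as is common in practice) and benchmark totals set
# below the design-weighted sample totals so that calibration shrinks the weights.
rng = np.random.default_rng(3)
xs = np.column_stack([np.ones(40), rng.gamma(2.0, 5.0, 40)])
d = rng.uniform(1.0, 8.0, 40)
q = np.ones(40)
T_x = 0.8 * (d @ xs)
w = brewer_recalibrate(xs, d, q, T_x)
print("weights set to 1:", int(np.sum(w == 1.0)), "| min weight:", round(float(w.min()), 3))
```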
3. THE GENERALIZED EXPONENTIAL FORM

In an earlier work, Brewer (1995) generalizes the GREG by replacing each $q_j \mathbf{x}_j$ with the instrumental vector $\mathbf{h}_j$. This generalization assumes that $\sum_{k \in S} d_k \mathbf{h}_k'\mathbf{x}_k$ is invertible, which we do here. After the replacements are made, one can write the resulting estimator in calibration form with $w_k = d_k(1 + \mathbf{h}_k \mathbf{g})$, where $\mathbf{g} = \left(\sum_{k \in S} d_k \mathbf{x}_k'\mathbf{h}_k\right)^{-1}\left(\mathbf{T}_x - \sum_{k \in S} d_k \mathbf{x}_k\right)'$. As the sample size grows arbitrarily large, note that $\mathbf{g}$, and so each $\mathbf{h}_k \mathbf{g}$, converges to zero under mild conditions.

Folsom and Singh (2000) suggest using a series of linear approximations to find, if possible, a vector $\mathbf{g}$ such that weights of the form $w_k = d_k f_k(\mathbf{h}_k \mathbf{g})$, where

$$f_k(\mathbf{h}_k \mathbf{g}) = \frac{\ell_k(u_k - c_k) + u_k(c_k - \ell_k)\exp(A_k \mathbf{h}_k \mathbf{g})}{(u_k - c_k) + (c_k - \ell_k)\exp(A_k \mathbf{h}_k \mathbf{g})}, \tag{3.1}$$

$$A_k = \frac{u_k - \ell_k}{(c_k - \ell_k)(u_k - c_k)}, \quad \text{and} \quad u_k > c_k > \ell_k \ge 0,$$

satisfy the calibration equation in (1.1). Note that each $w_k$ in this formulation is bounded between $d_k \ell_k$ and $d_k u_k$ and so cannot be negative. The upper bound on $w_k$ can be arbitrarily large. In fact, $u_k$ can be infinity.

Folsom and Singh observe that with this generalized exponential form:

$$\frac{d f_k(z)}{dz} = \frac{(u_k - f_k(z))(f_k(z) - \ell_k)}{(u_k - c_k)(c_k - \ell_k)}.$$

Consequently, when all $c_k = 1$, $f_k(0)$ and its derivative, $d f_k(0)/dz$, are unity for all k. As a result, when all $c_k = 1$, calibration weights having the form $w_k = d_k f_k(\mathbf{h}_k \mathbf{g})$ are asymptotically equivalent to linear calibration weights of the form $w_k = d_k(1 + \mathbf{h}_k \mathbf{g})$ (recall that as the sample size grows, the $\mathbf{h}_k \mathbf{g}$ converge to zero).

This suggests the following method for producing nearly pseudo-optimal calibration weights that are never below unity. When possible, set $w_k = d_k f_k(\mathbf{h}_k \mathbf{g})$, where the $f_k(\cdot)$ are defined in equation (3.1), $\mathbf{h}_k = (d_k - 1)\mathbf{x}_k$, $c_k = 1$, and $\ell_k = 1/d_k$. This setting for $\ell_k$ ensures that no calibration weight is less than unity. Since $\ell_k$ must be smaller than $c_k$ for the generalized exponential form in equation (3.1) to be used, certainties (i.e., elements such that $d_k = 1$) need special handling. Thus, as with the pseudo-optimal weights, when $d_k = 1$, set $w_k = 1$. An equivalent scheme removes certainties from both the sample and the population and sets their calibration weights to unity before computing the calibration weights for the rest of the sample with the generalized exponential form as described above. The calibration equation effectively becomes $\sum_{k \in S-C} w_k \mathbf{x}_k = \sum_{k \in U-C} \mathbf{x}_k$, where C denotes the set of certainties.

This fitting of the generalized exponential form can be done with the SUDAAN® (RTI International, 2012) procedure WTADJX by labeling the components of $\mathbf{x}_k$ the “calibration variables” in the CALVARS statement, and the components of $\mathbf{h}_k = (d_k - 1)\mathbf{x}_k$ the “model variables” in the MODEL statement. The ADJUST=POST setting is used with the values of $\sum_{k \in U-C} \mathbf{x}_k$ being the population totals in the POSTWT statement. The CENTER is set to 1 for all elements, while LOWER is $1/d_k$. UPPER can be any value greater than 1, or it can be left unspecified.
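For readers without access to SUDAAN, the following Python sketch illustrates the same recipe: it solves the calibration equation by Newton-type steps (the “series of linear approximations” of Folsom and Singh) with $\mathbf{h}_k = (d_k - 1)\mathbf{x}_k$, $c_k = 1$, and $\ell_k = 1/d_k$. The finite upper bound $u_k = 100$, the convergence tolerance, and the toy data are assumptions made only for the illustration; they are not part of the method itself.

```python
import numpy as np

def gem_weights(xs, d, T_x, u=100.0, max_iter=50, tol=1e-10):
    """Nearly pseudo-optimal calibration with the generalized exponential form:
    w_k = d_k f_k(h_k g), h_k = (d_k - 1) x_k, c_k = 1, l_k = 1/d_k, so every
    weight is at least 1.  Certainties (d_k = 1) are pre-set to w_k = 1."""
    w = np.ones(len(d))
    cert = np.isclose(d, 1.0)                       # certainty selections
    xs_, d_ = xs[~cert], d[~cert]
    T_rest = T_x - xs[cert].sum(axis=0)             # calibrate the rest to the U - C totals
    h = (d_ - 1.0)[:, None] * xs_                   # instrument vectors h_k = (d_k - 1) x_k
    l, c = 1.0 / d_, 1.0                            # lower bound l_k = 1/d_k, center c_k = 1
    A = (u - l) / ((c - l) * (u - c))

    def f_and_deriv(z):
        # The generalized exponential form (3.1) and its derivative in z = h_k g.
        e = np.exp(np.clip(A * z, -700.0, 700.0))   # guard against overflow
        f = (l * (u - c) + u * (c - l) * e) / ((u - c) + (c - l) * e)
        df = (u - f) * (f - l) / ((u - c) * (c - l))
        return f, df

    g = np.zeros(xs.shape[1])                       # start at g = 0, where every f_k = 1
    for _ in range(max_iter):                       # Newton steps: successive linear approximations
        f, df = f_and_deriv(h @ g)
        resid = T_rest - (d_ * f) @ xs_             # calibration-equation residual
        if np.max(np.abs(resid)) < tol * (1.0 + np.max(np.abs(T_rest))):
            break
        J = xs_.T @ ((d_ * df)[:, None] * h)        # Jacobian of sum_k d_k f_k(h_k g) x_k
        g = g + np.linalg.solve(J, resid)
    f, _ = f_and_deriv(h @ g)
    w[~cert] = d_ * f
    return w

# Illustrative use with made-up data: totals set to 85% of the design-weighted totals.
rng = np.random.default_rng(7)
xs = np.column_stack([np.ones(60), rng.gamma(2.0, 5.0, 60)])
d = rng.uniform(1.0, 10.0, 60)
T_x = 0.85 * (d @ xs)
w = gem_weights(xs, d, T_x)
print("calibrated:", np.allclose(w @ xs, T_x), "| min weight:", round(float(w.min()), 4))
```

In practice this fitting would be carried out with the WTADJX settings described above; the sketch is only meant to make the calculation transparent.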
4. DISCUSSION

Consider the prediction model in equation (1.2). If the $\epsilon_k$ are uncorrelated, each with variance $\sigma_k^2$, then, because equation (1.1) is satisfied, the model variance of the calibration estimator for $T_y$ is

$$E\left(t_y^{CAL} - T_y\right)^2 = E\left(\sum_{k \in S} w_k y_k - \sum_{k \in U} y_k\right)^2 = E\left(\sum_{k \in S} w_k \epsilon_k - \sum_{k \in U} \epsilon_k\right)^2$$
$$= \sum_{k \in S} w_k^2 \sigma_k^2 - 2\sum_{k \in S} w_k \sigma_k^2 + \sum_{k \in U} \sigma_k^2 = \sum_{k \in S} \left(w_k^2 - w_k\right)\sigma_k^2 - \left(\sum_{k \in S} w_k \sigma_k^2 - \sum_{k \in U} \sigma_k^2\right).$$

As Kott and Brewer (2001) point out, there are reasons to ignore the last term, $\sum_{k \in S} w_k \sigma_k^2 - \sum_{k \in U} \sigma_k^2$. It is exactly zero when $\sigma_k^2 = \mathbf{x}_k \boldsymbol{\zeta}$ for some $\boldsymbol{\zeta}$ (because then equation (1.1) forces $\sum_{k \in S} w_k \sigma_k^2 = \sum_{k \in U} \sigma_k^2$). Even if that equality does not hold, $\sum_{k \in S} w_k \sigma_k^2 - \sum_{k \in U} \sigma_k^2$ will be relatively small compared to $\sum_{k \in S} w_k^2 \sigma_k^2$ when $w_k^2 \gg w_k$ for almost all k, as is often the case. Thus, $\sum_{k \in S}(w_k^2 - w_k)\sigma_k^2$ is frequently a good approximation of the model variance of $t_y^{CAL}$.

To estimate the model variance of $t_y^{CAL}$, one can replace each $\sigma_k^2$ with $e_k^2$, where $e_k = y_k - \mathbf{x}_k \left(\sum_{j \in S} d_j \mathbf{h}_j'\mathbf{x}_j\right)^{-1}\sum_{j \in S} d_j \mathbf{h}_j' y_j$. Under Poisson sampling, Kott (2008) discusses why $v_1 = \sum_{k \in S}(w_k^2 - w_k)e_k^2$ is also a good estimator of the randomization mean squared error of $t_y^{CAL}$. Notice that replacing the $d_j$ in $e_k$ by $w_j$ (or by $d_j[d f_j(\mathbf{h}_j \mathbf{g})/dz]$, as Folsom and Singh (2000) would do) has no asymptotic impact on $v_1$, since $w_j/d_j$ (and $d f_j(\mathbf{h}_j \mathbf{g})/dz$) converges to 1 as the sample size grows arbitrarily large.

Observe that to guarantee that $v_1$ is nonnegative, no positive $w_k$ can be less than unity. Using the calibration procedure described in the previous section avoids that possibility.

Kott (2008) also discusses a simultaneous estimator for the model variance and randomization mean squared error of a calibration estimator under stratified simple random sampling, where $n_h$ is the sample size in stratum $S_h$:

$$v_2 = \sum_h \frac{n_h}{n_h - 1}\sum_{k \in S_h}\left[\left(w_k^2 - w_k\right)^{1/2} e_k - \frac{\sum_{j \in S_h}\left(w_j^2 - w_j\right)^{1/2} e_j}{n_h}\right]^2.$$

Note that for this estimator even to be defined, no positive $w_k$ can be less than unity. Again, using the calibration procedure described here avoids that possibility.

REFERENCES

1. Bankier, M. (2002). Regression estimators for the 2001 Canadian Census. Presented at the International Conference on Recent Advances in Survey Sampling.
2. Brewer, K.R.W. (1999). Cosmetic calibration with unequal probability sampling. Survey Methodology, 25, 205-212.
3. Brewer, K.R.W. (1995). Combining design-based and model-based inference. In Cox, B.G., Binder, D.A., Chinnappa, B.N., Christianson, A., Colledge, M.J., and Kott, P.S. (eds.), Business Survey Methods, Wiley, New York, 589-606.
4. Brewer, K.R.W. (1994). Survey sampling inference: some past perspectives and present prospects. Pak. J. Statist., 10(1)A, 213-233.
5. Deville, J-C. and Särndal, C-E. (1992). Calibration estimators in survey sampling. J. Amer. Statist. Assoc., 87, 376-382.
6. Folsom, R.E. and Singh, A.C. (2000). The generalized exponential model for sampling weight calibration for extreme values, nonresponse, and poststratification. Proceedings of the Section on Survey Research Methods, American Statistical Association, Washington DC, 598-603.
7. Kott, P.S. (2008). Calibration weighting: Combining probability samples and linear prediction models. In Pfeffermann, D. and Rao, C.R. (eds.), Handbook of Statistics 29B, Sample Surveys: Theory, Methods, and Inference. Elsevier, New York, 55-82.
8. Kott, P.S. and Brewer, K.R.W. (2001). Estimating the model variance of a randomization-consistent regression estimator. Proceedings of the Section on Survey Research Methods, American Statistical Association, Washington DC.
9. Rao, J.N.K. (1994). Estimating totals and distribution functions using auxiliary information at the estimation stage. J. Official Statist., 25, 1-21.
10. RTI International (2012). SUDAAN Language Manual, Release 11.0. Research Triangle Park, NC: RTI International.