Pak. J. Statist. 2011 Vol. 27(4), 391-396

A NEARLY PSEUDO-OPTIMAL METHOD FOR KEEPING CALIBRATION
WEIGHTS FROM FALLING BELOW UNITY IN THE ABSENCE OF
NONRESPONSE OR FRAME ERRORS
Phillip S. Kott
Senior Research Statistician, RTI International
6110 Research Blvd.; Rockville, MD 20852
Email: pkott@rti.org
ABSTRACT
Calibration weighting is a technique for adjusting randomization-based weights so
that the estimator of a population total becomes unbiased under a linear prediction model.
In the absence of nonresponse or frame errors, one set of calibration weights has been
shown to be asymptotically optimal in some sense under Poisson sampling.
Unfortunately, although it is desirable that each weight be at least one (so that the
corresponding element “represents itself”), there is no guarantee that will be the case. We
will see how to construct an asymptotically equivalent set of weights so that no weight is
less than unity. One consequence is that it often will be a simple matter to construct a
variance measure that simultaneously estimates the prediction variance and the
randomization mean squared error of the estimator.
KEY WORDS:
Generalized exponential form, Prediction model, Randomization-based.
1. INTRODUCTION
Suppose we want to estimate the total of a target variable y in a population U based on data from a probability sample S. If we know the selection probability, π_k, for each sample element k in S, then we can estimate T_y = Σ_{k∈U} y_k with the expansion (Horvitz-Thompson) estimator t_yE = Σ_{k∈S} y_k/π_k = Σ_{k∈U} y_k I_k/π_k, where I_k = 1 when k ∈ S and 0 otherwise. In randomization-based inference, the I_k are random variables. It is easy to see that E_I(t_yE) = T_y. In other words, t_yE is a randomization-unbiased estimator for T_y.
We can also write t_yE = Σ_{k∈U} d_k y_k = Σ_{k∈S} d_k y_k, where d_k = I_k/π_k. Deville and Särndal (1992) coined the term “calibration estimator” to describe an estimator of the form t_yCAL = Σ_{k∈S} w_k y_k, where the set of calibration weights {w_k} is chosen to satisfy the calibration equation:

Σ_{k∈S} w_k x_k = Σ_{k∈U} x_k = T_x.   (1.1)

© 2011 Pakistan Journal of Statistics
for some row vector of P benchmark variables, x_k = (x_1k, ..., x_Pk), about which T_x is known. Another requirement is that t_yCAL be randomization consistent under mild conditions. This means that as the sample size grows arbitrarily large, the relative randomization mean squared error of t_yCAL must tend toward zero. For more on the randomization-based properties of the calibration estimator, see Kott (2008).
Since equation (1.1) holds, t_yCAL estimates T_y perfectly when y_k = x_kβ exactly. With that in mind, it is reasonable to expect t_yCAL to be a good estimator when y_k and x_kβ are close. This can be formalized by assuming the y_k are random variables satisfying the linear prediction model:

y_k = x_kβ + ε_k,   (1.2)

where E(ε_k | {x_g, I_g; g∈U}) = 0 for all k ∈ U. Under this model, it is easy to see that t_yCAL is an unbiased estimator for T_y in the sense that E(t_yCAL − T_y) = 0 (suppressing the conditioning on the x_g and I_g for notational convenience).
Section 2 describes the general regression (GREG) estimator, which translates into the linear calibration estimator with w_k = d_k(1 + q_k x_k g) for an appropriately defined vector g. Although each q_k is often set to 1 in practice, the calibration estimator can be shown to be optimal in some sense under Poisson sampling when q_k = (1 − π_k)/π_k.

There is no guarantee with linear calibration that all the calibration weights will be nonnegative. Many computer programs cannot handle negative weights.

Brewer (1999) argues that calibration weights should never fall below unity, because when an element’s weight is less than 1, it “does not even represent itself.” Section 3 shows how the nonlinear generalized exponential form (Folsom and Singh 2000) can be combined with a particular set of instrumental variables to create calibration weights that never fall below unity but are asymptotically equivalent to the “optimal” weights.

Section 4 provides a discussion of the simultaneous estimation of model variance and randomization mean squared error, which is aided by the absence of calibration weights less than 1.
2. THE GENERAL REGRESSION ESTIMATOR
The general regression or GREG estimator has the form:

t_yGREG = t_yE + (T_x − Σ_{k∈S} d_k x_k)(Σ_{k∈S} q_k d_k x_k′x_k)⁻¹ Σ_{k∈S} q_k d_k x_k′y_k.   (2.1)

This assumes that the matrix Σ_{k∈S} q_k d_k x_k′x_k is invertible, which we assume here.

The GREG estimator in equation (2.1) can be put in calibration form, t_yGREG = Σ_{k∈S} w_k y_k, by setting
1



wk  d k   Tx   d j x j    q j d j x j x j  qk d k x k
jS

  jS

Strictly speaking, the wk are functions of the realized sample, S, and the qk, but we
suppress that in the notation for convenience.
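The GREG weights above can be sketched in a few lines of NumPy; the sample data, selection probabilities, and benchmark totals T_x below are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample: n = 8 elements, P = 2 benchmark variables per element.
n = 8
X = np.column_stack([np.ones(n), rng.uniform(1.0, 5.0, n)])  # rows are x_k
pi = rng.uniform(0.2, 0.8, n)                                # selection probabilities
d = 1.0 / pi                                                 # design weights d_k
q = d - 1.0                                                  # pseudo-optimal q_k = (1 - pi_k)/pi_k
Tx = np.array([12.0, 40.0])                                  # assumed known population totals

# w_k = d_k + (Tx - sum_S d_j x_j)(sum_S q_j d_j x_j' x_j)^{-1} q_k d_k x_k'
A = X.T * (q * d)                      # column k holds q_k d_k x_k'
M = A @ X                              # sum_S q_k d_k x_k' x_k
w = d + (Tx - d @ X) @ np.linalg.solve(M, A)

# By construction the weights satisfy the calibration equation (1.1):
# sum_S w_k x_k reproduces Tx exactly.
```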
To choose values for the q_k that are optimal in some sense, we follow the reasoning in Rao (1994) and first consider randomization-unbiased estimators of the form

t_y(b) = t_yE + (T_x − Σ_{k∈S} d_k x_k) b,

where b is a known column vector. It is not hard to see that the choice of b that minimizes the randomization variance of t_y(b) is b = [Var_I(Σ_S d_k x_k)]⁻¹ Cov_I(t_yE, Σ_S d_k x_k). Although this b is generally unknown, both Var_I(.) and Cov_I(.) can be estimated from the sample. Under Poisson sampling, plugging unbiased (and consistent) estimates for Var_I(.) and Cov_I(.) into b yields the calibration estimator satisfying equation (1.1) with q_k = (1 − π_k)/π_k = d_k − 1. That calibration estimator is thus asymptotically optimal in some sense under Poisson sampling. Bankier (2002) named it “pseudo-optimal,” observing that it is asymptotically optimal under stratified simple random sampling when the strata are large and their indicator functions are components of x_k.
Using different reasoning, Brewer (1999) suggests setting q_k = (1 − π_k)/z_k, where z_k is some measure of size. With this setting, when element k is a certainty selection, which means d_k is 1, w_k is forced to equal 1 as well. In practice, this tends to limit the number of weights falling below unity under linear calibration. It does not guarantee their elimination, however. When one or more w_k fall below 1 after linear calibration, Brewer advises setting those w_k to 1, removing the corresponding elements from the sample and population in equation (1.1), and running linear calibration again on the remainder of the sample. This can be done iteratively if necessary.
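A toy numerical illustration (all values made up) of why this concern arises: with linear calibration, a benchmark total well below the expanded sample total can push every weight below unity even though the calibration equation still holds:

```python
import numpy as np

# Three sampled elements, one benchmark variable x_k = 1, d_k = 2, q_k = d_k - 1.
d = np.array([2.0, 2.0, 2.0])
x = np.ones(3)
q = d - 1.0
Tx = 2.5                      # assumed population total, well below sum d_k x_k = 6

# Linear calibration: w_k = d_k + (Tx - sum d_j x_j) q_k d_k x_k / sum q_j d_j x_j^2
w = d + (Tx - d @ x) * q * d * x / np.sum(q * d * x**2)

print(w)       # each weight is 5/6, below unity
print(w @ x)   # yet the calibration equation still holds: the sum is Tx
```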
3. THE GENERALIZED EXPONENTIAL FORM
In an earlier work, Brewer (1995) generalizes the GREG by replacing each q_j x_j with the instrumental vector h_j. This generalization assumes that Σ_{k∈S} d_k h_k′x_k is invertible, which we assume here. After the replacements are made, one can write the resulting estimator in calibration form with w_k = d_k(1 + h_k g), where g = (Σ_{k∈S} d_k x_k′h_k)⁻¹(T_x − Σ_{k∈S} d_k x_k)′. Note that as the sample size grows arbitrarily large, g, and so each h_k g, converges to zero under mild conditions.
Folsom and Singh (2000) suggest using a series of linear approximations to find (if possible) a vector g such that weights of the form w_k = d_k f_k(h_k g), where

f_k(h_k g) = [ℓ_k(u_k − c_k) + u_k(c_k − ℓ_k) exp(A_k h_k g)] / [(u_k − c_k) + (c_k − ℓ_k) exp(A_k h_k g)],   (3.1)

A_k = (u_k − ℓ_k) / [(c_k − ℓ_k)(u_k − c_k)], and u_k > c_k > ℓ_k ≥ 0,

satisfy the calibration equation in (1.1). Note that each w_k in this formulation is bounded between d_k ℓ_k and d_k u_k and so cannot be negative. The upper bound on w_k can be arbitrarily large. In fact, u_k can be infinity.
Folsom and Singh observe that with this generalized exponential form:

df_k(z)/dz = (u_k − f_k(z))(f_k(z) − ℓ_k) / [(u_k − c_k)(c_k − ℓ_k)].

Consequently, when all c_k = 1, f_k(0) and its derivative, df_k(0)/dz, are unity for all k. As a result, when all c_k = 1, calibration weights having the form w_k = d_k f_k(h_k g) are asymptotically equivalent to linear calibration weights of the form w_k = d_k(1 + h_k g) (recall that as the sample size grows, the h_k g converge to zero). This suggests the following method for producing nearly pseudo-optimal calibration weights that are never below unity.
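A small sketch of f_k from equation (3.1) (the bounds ℓ_k, c_k, u_k below are chosen arbitrarily) confirming that with c_k = 1 both f_k(0) and a numerical approximation of df_k(0)/dz are unity:

```python
import numpy as np

def f_k(z, lo, c, u):
    """Generalized exponential form of equation (3.1), evaluated at z = h_k g."""
    A = (u - lo) / ((c - lo) * (u - c))
    e = np.exp(A * z)
    return (lo * (u - c) + u * (c - lo) * e) / ((u - c) + (c - lo) * e)

lo, c, u = 0.2, 1.0, 3.0           # arbitrary bounds with u > c > lo >= 0
print(f_k(0.0, lo, c, u))          # 1.0, since f_k(0) = c_k
eps = 1e-6                         # numerical derivative at zero is also ~1
print((f_k(eps, lo, c, u) - f_k(-eps, lo, c, u)) / (2 * eps))
```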
When possible, set w_k = d_k f_k(h_k g), where the f_k(.) are defined in equation (3.1), h_k = (d_k − 1)x_k, c_k = 1, and ℓ_k = 1/d_k. This setting for ℓ_k ensures that no calibration weight is less than unity. Since ℓ_k must be smaller than c_k for the generalized exponential form in equation (3.1) to be used, certainties (i.e., elements such that d_k = 1) need special handling. Thus, as with the pseudo-optimal weights, when d_k = 1, set w_k = 1.

An equivalent scheme removes certainties from both the sample and the population, sets their calibration weights to unity, and then computes the calibration weights for the rest of the sample with the generalized exponential form as described above. The calibration equation effectively becomes Σ_{k∈S−C} w_k x_k = Σ_{k∈U−C} x_k, where C denotes the set of certainties.
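Under the assumptions above (c_k = 1, ℓ_k = 1/d_k, h_k = (d_k − 1)x_k, certainties already removed), the "series of linear approximations" can be sketched as a Newton iteration on g. The upper bound u and all data values below are made up; this is a sketch, not the SUDAAN implementation:

```python
import numpy as np

def nearly_pseudo_optimal_weights(d, X, Tx, u=10.0, tol=1e-10, max_iter=50):
    """Solve sum_S d_k f_k(h_k g) x_k = Tx by Newton iteration, with
    h_k = (d_k - 1) x_k, c_k = 1, and l_k = 1/d_k, so every w_k >= 1.
    Certainties (d_k = 1) must be removed beforehand, as in the text."""
    lo, c = 1.0 / d, 1.0
    H = (d - 1.0)[:, None] * X                # instrumental vectors h_k
    g = np.zeros(X.shape[1])
    for _ in range(max_iter):
        z = H @ g
        A = (u - lo) / ((c - lo) * (u - c))
        e = np.exp(A * z)
        f = (lo * (u - c) + u * (c - lo) * e) / ((u - c) + (c - lo) * e)
        resid = Tx - (d * f) @ X              # calibration residual
        if np.max(np.abs(resid)) < tol:
            break
        dfdz = (u - f) * (f - lo) / ((u - c) * (c - lo))
        J = X.T @ (H * (d * dfdz)[:, None])   # Jacobian of sum_S d_k f_k x_k
        g = g + np.linalg.solve(J, resid)
    return d * f

# Made-up example: five noncertainty elements, two benchmark variables.
d = np.array([2.0, 3.0, 1.5, 4.0, 2.5])
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
Tx = np.array([13.5, 42.0])                   # assumed known totals
w = nearly_pseudo_optimal_weights(d, X, Tx)
# w satisfies the calibration equation, and no weight falls below unity.
```

Because f_k(0) = 1 and df_k(0)/dz = 1 when c_k = 1, the first Newton step reproduces linear calibration; subsequent steps bend the weights away from the bounds.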
This fitting of the generalized exponential form can be done with the SUDAAN® (RTI International, 2012) procedure WTADJX by labeling the components of x_k the “calibration variables” in the CALVARS statement and the components of h_k = (d_k − 1)x_k the “model variables” in the MODEL statement. The ADJUST=POST setting is used, with the values of Σ_{k∈U−C} x_k being the population totals in the POSTWT statement. The CENTER value is set to 1 for all elements, while LOWER is 1/d_k. UPPER can be any value greater than 1, or it can be left unspecified.
4. DISCUSSION
Consider the prediction model in equation (1.2). If the ε_k are uncorrelated, each with variance σ_k², then, because equation (1.1) is satisfied, the model variance of the calibration estimator for T_y is

E[(t_yCAL − T_y)²] = E[(Σ_{k∈S} w_k y_k − Σ_{k∈U} y_k)²]
                   = E[(Σ_{k∈S} w_k ε_k − Σ_{k∈U} ε_k)²]
                   = Σ_{k∈S} w_k²σ_k² − 2 Σ_{k∈S} w_k σ_k² + Σ_{k∈U} σ_k²
                   = Σ_{k∈S} (w_k² − w_k)σ_k² − (Σ_{k∈S} w_k σ_k² − Σ_{k∈U} σ_k²).
As Kott and Brewer (2001) point out, there are reasons to ignore the last term, Σ_{k∈S} w_k σ_k² − Σ_{k∈U} σ_k². It is exactly zero when σ_k² = x_k ζ for some ζ. Even if that equality does not hold, Σ_{k∈S} w_k σ_k² − Σ_{k∈U} σ_k² will be relatively small compared to Σ_{k∈S} w_k²σ_k² when w_k² >> w_k for almost all k, as is often the case. Thus, Σ_{k∈S} (w_k² − w_k)σ_k² is frequently a good approximation of the model variance of t_yCAL.
To estimate the model variance of t_yCAL, one can replace each σ_k² with e_k², where e_k = y_k − x_k(Σ_{j∈S} d_j h_j′x_j)⁻¹ Σ_{j∈S} d_j h_j′y_j. Under Poisson sampling, Kott (2008) discusses why v1 = Σ_{k∈S} (w_k² − w_k)e_k² is also a good estimator of the randomization mean squared error of t_yCAL. Notice that replacing the d_j in e_k by w_j (or by d_j[df_j(h_j g)/dz], as Folsom and Singh (2000) would do) has no asymptotic impact on v1, since w_j/d_j (and df_j(h_j g)/dz) converges to 1 as the sample size grows arbitrarily large. Observe that to guarantee that v1 is nonnegative, no positive w_k can be less than unity. Using the calibration procedure described in the previous section avoids that possibility.
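A sketch of v1 under these definitions (all inputs hypothetical; for simplicity the tiny check below uses h_k = x_k, a legitimate instrument choice):

```python
import numpy as np

def v1_estimator(d, w, X, H, y):
    """v1 = sum_S (w_k^2 - w_k) e_k^2, with residuals
    e_k = y_k - x_k (sum_S d_j h_j' x_j)^{-1} sum_S d_j h_j' y_j."""
    B = np.linalg.solve((H * d[:, None]).T @ X, (H * d[:, None]).T @ y)
    e = y - X @ B
    return np.sum((w**2 - w) * e**2)

# Tiny made-up check: one benchmark x_k = 1, h_k = x_k, w_k = d_k = 2, so
# B is the d-weighted mean of y and v1 = sum (4 - 2) e_k^2.
d = np.full(4, 2.0)
X = np.ones((4, 1))
H = X.copy()
y = np.array([1.0, 2.0, 3.0, 4.0])
print(v1_estimator(d, d, X, H, y))   # 10.0
```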
Kott (2008) also discusses a simultaneous estimator for the model variance and randomization mean squared error of a calibration estimator under stratified simple random sampling, where n is the sample size in stratum S:

v2 = [n/(n − 1)] Σ_{k∈S} [(w_k² − w_k)^{1/2} e_k − Σ_{j∈S} (w_j² − w_j)^{1/2} e_j / n]².

Note that for this estimator even to be defined, no positive w_k can be less than unity. Again, using the calibration procedure described here avoids that possibility.
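Assuming the single-stratum form of v2 given above, a minimal sketch (inputs made up; note that the square root is undefined if any w_k lies strictly between 0 and 1):

```python
import numpy as np

def v2_estimator(w, e):
    """v2 = [n/(n-1)] sum_S (a_k - mean(a))^2 with a_k = sqrt(w_k^2 - w_k) e_k,
    for a single stratum of size n = len(w); requires w_k^2 - w_k >= 0."""
    n = len(w)
    a = np.sqrt(w**2 - w) * e
    return n / (n - 1) * np.sum((a - a.mean())**2)

# Made-up residuals and weights for a stratum of size 2; the result is about 8.
print(v2_estimator(np.array([2.0, 2.0]), np.array([1.0, -1.0])))
```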
REFERENCES
1. Bankier, M. (2002). Regression estimators for the 2001 Canadian Census. Presented
at the International Conference in Recent Advances in Survey Sampling.
2. Brewer, K.R.W. (1999). Cosmetic calibration with unequal probability sampling.
Survey Methodology, 25, 205-212.
3. Brewer, K.R.W. (1995). Combining design-based and model-based inference. In Cox, B.G., Binder, D.A., Chinnappa, B.N., Christianson, A., Colledge, M.J. and Kott, P.S., eds., Business Survey Methods, Wiley, New York, 589-606.
4. Brewer, K.R.W. (1994). Survey sampling inference: some past perspectives and
present prospects. Pak. J. Statist., 10(1)A, 213-233.
5. Deville, J-C. and Särndal, C-E. (1992). Calibration estimators in survey sampling.
J. Amer. Statist. Assoc., 87, 376-382.
6. Folsom, R.E. and Singh, A.C. (2000). The generalized exponential model for
sampling weight calibration for extreme values, nonresponse, and poststratification.
Proceedings of the Section on Survey Research Methods, American Statistical
Association, Washington DC, 598-603.
7. Kott, P.S. (2008). Calibration weighting: Combining probability samples and linear
prediction models. In Pfeffermann, D. and Rao, C.R. eds., Handbook of statistics 29B,
sample surveys: Theory, methods, and inference. Elsevier, New York, pp. 55-82.
8. Kott, P.S. and Brewer, K.R.W. (2001). Estimating the model variance of a
randomization-consistent regression estimator. Proceedings of the Section on Survey
Research Methods, American Statistical Association, Washington DC.
9. Rao, J.N.K. (1994). Estimating totals and distribution functions using auxiliary information at the estimation stage. J. Official Statist., 25, 1-21.
10. RTI International (2012). SUDAAN Language Manual, Release 11.0. Research
Triangle Park, NC: RTI International.