Title stata.com ratio — Estimate ratios Syntax Remarks and examples Also see Menu Stored results Description Methods and formulas Options References Syntax Basic syntax ratio name: varname / varname Full syntax ratio ( name: varname / varname) ( name: varname / varname) . . . if in weight , options options Description Model stdize(varname) stdweight(varname) nostdrescale variable identifying strata for standardization weight variable for standardization do not rescale the standard weight variable if/in/over over(varlist , nolabel ) group over subpopulations defined by varlist; optionally, suppress group labels SE/Cluster vce(vcetype) vcetype may be linearized, cluster clustvar, bootstrap, or jackknife Reporting level(#) noheader nolegend display options set confidence level; default is level(95) suppress table header suppress table legend control column formats and line width coeflegend display legend instead of statistics bootstrap, jackknife, mi estimate, rolling, statsby, and svy are allowed; see [U] 11.1.10 Prefix commands. vce(bootstrap) and vce(jackknife) are not allowed with the mi estimate prefix; see [MI] mi estimate. Weights are not allowed with the bootstrap prefix; see [R] bootstrap. vce() and weights are not allowed with the svy prefix; see [SVY] svy. fweights, iweights, and pweights are allowed; see [U] 11.1.6 weight. coeflegend does not appear in the dialog box. See [U] 20 Estimation and postestimation commands for more capabilities of estimation commands. 1 2 ratio — Estimate ratios Menu Statistics > Summaries, tables, and tests > Summary and descriptive statistics > Ratios Description ratio produces estimates of ratios, along with standard errors. Options Model stdize(varname) specifies that the point estimates be adjusted by direct standardization across the strata identified by varname. This option requires the stdweight() option. stdweight(varname) specifies the weight variable associated with the standard strata identified in the stdize() option. The standardization weights must be constant within the standard strata. nostdrescale prevents the standardization weights from being rescaled within the over() groups. This option requires stdize() but is ignored if the over() option is not specified. if/in/over over(varlist , nolabel ) specifies that estimates be computed for multiple subpopulations, which are identified by the different values of the variables in varlist. When this option is supplied with one variable name, such as over(varname), the value labels of varname are used to identify the subpopulations. If varname does not have labeled values (or there are unlabeled values), the values themselves are used, provided that they are nonnegative integers. Noninteger values, negative values, and labels that are not valid Stata names are substituted with a default identifier. When over() is supplied with multiple variable names, each subpopulation is assigned a unique default identifier. nolabel requests that value labels attached to the variables identifying the subpopulations be ignored. SE/Cluster vce(vcetype) specifies the type of standard error reported, which includes types that are derived from asymptotic theory (linearized), that allow for intragroup correlation (cluster clustvar), and that use bootstrap or jackknife methods (bootstrap, jackknife); see [R] vce option. vce(linearized), the default, uses the linearized or sandwich estimator of variance. Reporting level(#); see [R] estimation options. noheader prevents the table header from being displayed. This option implies nolegend. nolegend prevents the table legend identifying the subpopulations from being displayed. display options: cformat(% fmt) and nolstretch; see [R] estimation options. The following option is available with ratio but is not shown in the dialog box: coeflegend; see [R] estimation options. ratio — Estimate ratios Remarks and examples 3 stata.com Example 1 Using the fuel data from example 3 of [R] ttest, we estimate the ratio of mileage for the cars without the fuel treatment (mpg1) to those with the fuel treatment (mpg2). . use http://www.stata-press.com/data/r13/fuel . ratio myratio: mpg1/mpg2 Ratio estimation myratio: mpg1/mpg2 Ratio myratio .9230769 Number of obs Linearized Std. Err. .032493 = 12 [95% Conf. Interval] .8515603 .9945936 Using these results, we can test to see if this ratio is significantly different from one. . test _b[myratio] = 1 ( 1) myratio = 1 F( 1, 11) = Prob > F = 5.60 0.0373 We find that the ratio is different from one at the 5% significance level but not at the 1% significance level. Example 2 Using state-level census data, we want to test whether the marriage rate is equal to the death rate. . use http://www.stata-press.com/data/r13/census2 (1980 Census data by state) . ratio (deathrate: death/pop) (marrate: marriage/pop) Ratio estimation Number of obs = deathrate: death/pop marrate: marriage/pop Ratio deathrate marrate .0087368 .0105577 Linearized Std. Err. .0002052 .0006184 . test _b[deathrate] = _b[marrate] ( 1) deathrate - marrate = 0 F( 1, 49) = 6.93 Prob > F = 0.0113 50 [95% Conf. Interval] .0083244 .009315 .0091492 .0118005 4 ratio — Estimate ratios Stored results ratio stores the following in e(): Scalars e(N) e(N over) e(N stdize) e(N clust) e(k eq) e(df r) e(rank) Macros e(cmd) e(cmdline) e(varlist) e(stdize) e(stdweight) e(wtype) e(wexp) e(title) e(cluster) e(over) e(over labels) e(over namelist) e(namelist) e(vce) e(vcetype) e(properties) e(estat cmd) e(marginsnotok) Matrices e(b) e(V) e( N) e( N stdsum) e( p stdize) e(error) Functions e(sample) number of observations number of subpopulations number of standard strata number of clusters number of equations in e(b) sample degrees of freedom rank of e(V) ratio command as typed varlist varname from stdize() varname from stdweight() weight type weight expression title in estimation output name of cluster variable varlist from over() labels from over() variables names from e(over labels) ratio identifiers vcetype specified in vce() title used to label Std. Err. b V program used to implement estat predictions disallowed by margins vector of ratio estimates (co)variance estimates vector of numbers of nonmissing observations number of nonmissing observations within the standard strata standardizing proportions error code corresponding to e(b) marks estimation sample Methods and formulas Methods and formulas are presented under the following headings: The ratio estimator Survey data The survey ratio estimator The standardized ratio estimator The poststratified ratio estimator The standardized poststratified ratio estimator Subpopulation estimation ratio — Estimate ratios 5 The ratio estimator Let R = Y /X be the ratio to be estimated, where Y and X are totals; see [R] total. The estimate b = Yb /X b (the ratio of the sample totals). From the delta method (that is, a first-order for R is R b is Taylor expansion), the approximate variance of the sampling distribution of the linearized R b ≈ V (R) o 1 n b b + R2 V (X) b V (Y ) − 2RCov(Yb , X) 2 X b, R b, and the estimated variances and covariance of X b and Yb leads to the Direct substitution of X following variance estimator: n o bCov( d Yb , X) b +R b2 Vb (X) b b = 1 Vb (Yb ) − 2R Vb (R) b2 X (1) Survey data See [SVY] variance estimation, [SVY] direct standardization, and [SVY] poststratification for discussions that provide background information for the following formulas. The survey ratio estimator Let Yj and Xj be survey items for the j th individual in the population, where j = 1, . . . , M and M is the size of the population. The associated population ratio for the items of interest is R = Y /X where M M X X Y = Yj and X= Xj j=1 j=1 Let yj and xj be the corresponding survey items for the j th sampled individual from the population, where j = 1, . . . , m and m is the number of observations in the sample. b for the population ratio R is R b = Yb /X b , where The estimator R Yb = m X j=1 wj yj and b= X m X w j xj j=1 and wj is a sampling weight. The score variable for the ratio estimator is b b b b = yj − Rxj = Xyj − Y xj zj (R) b b2 X X 6 ratio — Estimate ratios The standardized ratio estimator Let Dg denote the set of sampled observations that belong to the g th standard stratum and define IDg (j) to indicate if the j th observation is a member of the g th standard stratum; where g = 1, . . . , LD and LD is the number of standard strata. Also, let πg denote the fraction of the population that belongs to the g th standard stratum, thus π1 + · · · + πLD = 1. Note that πg is derived from the stdweight() option. The estimator for the standardized ratio is LD X bD = R πg g=1 where Ybg = m X Ybg bg X IDg (j) wj yj j=1 bg is similarly defined. The score variable for the standardized ratio is and X bD ) = zj (R LD X πg IDg (j) g=1 bg yj − Ybg xj X bg2 X The poststratified ratio estimator Let Pk denote the set of sampled observations that belong to poststratum k , and define IPk (j) to indicate if the j th observation is a member of poststratum k , where k = 1, . . . , LP and LP is the number of poststrata. Also, let Mk denote the population size for poststratum k . Pk and Mk are identified by specifying the poststrata() and postweight() options on svyset; see [SVY] svyset. The estimator for the poststratified ratio is bP bP = Y R bP X where Yb P = LP X Mk k=1 ck M Ybk = LP m X Mk X k=1 ck M IPk (j) wj yj j=1 b P is similarly defined. The score variable for the poststratified ratio is and X bP ) = zj (R bP zj (X bP ) b P zj (Yb P ) − Yb P zj (X bP ) zj (Yb P ) − R X = b P )2 bP (X X where bP zj (Y ) = LP X k=1 b P ) is similarly defined. and zj (X Mk IPk (j) ck M Ybk yj − ck M ! ratio — Estimate ratios 7 The standardized poststratified ratio estimator The estimator for the standardized poststratified ratio is bDP = R LD X πg g g=1 where YbgP = Lp X Mk k=1 ck M Ybg,k = YbgP bP X Lp m X Mk X k=1 ck M IDg (j)IPk (j) wj yj j=1 bgP is similarly defined. The score variable for the standardized poststratified ratio is and X bDP ) = zj (R LD X πg g=1 where zj (YbgP ) = LP X k=1 and bP ) zj (X g bgP ) bgP zj (YbgP ) − YbgP zj (X X bgP )2 (X Mk IPk (j) ck M ( Ybg,k IDg (j)yj − ck M ) is similarly defined. Subpopulation estimation Let S denote the set of sampled observations that belong to the subpopulation of interest, and define IS (j) to indicate if the j th observation falls within the subpopulation. bS = Yb S /X b S , where The estimator for the subpopulation ratio is R Yb S = m X IS (j) wj yj bS = X and j=1 m X IS (j) wj xj j=1 Its score variable is bS bS bS bS ) = IS (j) yj − R xj = IS (j) X yj − Y xj zj (R bS b S )2 X (X The estimator for the standardized subpopulation ratio is bDS = R LD X g=1 where YbgS = m X πg YbgS bS X g IDg (j)IS (j) wj yj j=1 b S is similarly defined. Its score variable is and X g bDS ) = zj (R LD X g=1 πg IDg (j)IS (j) bgS yj − YbgS xj X bgS )2 (X 8 ratio — Estimate ratios The estimator for the poststratified subpopulation ratio is bPS bP S = Y R bPS X where Yb P S = LP X Mk k=1 ck M YbkS = LP m X Mk X ck M k=1 IPk (j)IS (j) wj yj j=1 b P S is similarly defined. Its score variable is and X bPS b PS bPS bPS bP S ) = X zj (Y ) − Y zj (X ) zj (R b P S )2 (X where bPS zj ( Y )= LP X k=1 Mk IPk (j) ck M ( Yb S IS (j) yj − k ck M ) b P S ) is similarly defined. and zj (X The estimator for the standardized poststratified subpopulation ratio is bDP S = R LD X πg g=1 where YbgP S = Lp X Mk k=1 ck M S Ybg,k = Lp m X Mk X k=1 ck M YbgP S bgP S X IDg (j)IPk (j)IS (j) wj yj j=1 bgP S is similarly defined. Its score variable is and X bDP S ) = zj ( R LD X g=1 where zj (YbgP S ) = LP X k=1 and bgP S ) zj (X πg bgP S zj (YbgP S ) − YbgP S zj (X bgP S ) X bgP S )2 (X Mk IPk (j) ck M ( S Ybg,k IDg (j)IS (j) yj − ck M ) is similarly defined. References Cochran, W. G. 1977. Sampling Techniques. 3rd ed. New York: Wiley. Stuart, A., and J. K. Ord. 1994. Kendall’s Advanced Theory of Statistics: Distribution Theory, Vol I. 6th ed. London: Arnold. ratio — Estimate ratios Also see [R] ratio postestimation — Postestimation tools for ratio [R] mean — Estimate means [R] proportion — Estimate proportions [R] total — Estimate totals [MI] estimation — Estimation commands for use with mi estimate [SVY] direct standardization — Direct standardization of means, proportions, and ratios [SVY] poststratification — Poststratification for survey data [SVY] subpopulation estimation — Subpopulation estimation for survey data [SVY] svy estimation — Estimation commands for survey data [SVY] variance estimation — Variance estimation for survey data [U] 20 Estimation and postestimation commands 9