THE AUSTRALIAN JOURNAL OF STATISTICS
Published by THE STATISTICAL SOCIETY OF AUSTRALIA
VOL. 5, No. 3, NOVEMBER 1963

RATIO ESTIMATION AND FINITE POPULATIONS: SOME RESULTS DEDUCIBLE FROM THE ASSUMPTION OF AN UNDERLYING STOCHASTIC PROCESS

K. R. W. BREWER
Commonwealth Bureau of Census and Statistics, Canberra

1. Introduction

Practically all the formulae used in the application of sampling methods to finite populations are derived solely from the properties of the particular finite population concerned. The population is treated as an entity in itself, completely independent of any stochastic process which may have generated it. As a result, a number of important questions have hitherto remained unanswered and, in a sense, unanswerable. In this paper consideration is given to the implications of the existence of an underlying stochastic process, and, on the basis of this assumption, the following results are obtained:

(1) An expression for the conditional variance of a ratio estimator, subject to a particular sample of population units having been selected.
(2) The optimum probabilities of selection of the individual units in the population.
(3) The likely accuracy of a ratio estimate based on the largest population units deliberately selected (a "partial collection").
(4) The extent of the diminution in the variance of the ratio estimator obtained by sampling without replacement (with total probability of selection proportional to size or some function of size) instead of with replacement (with the probability of selection at each draw proportional to size or the same function of size).
(5) An extension of the expressions and estimates for conditional variance to two-stage sampling. (These formulae can easily be generalized to any desired number of stages.)

Manuscript received September 13, 1962; revised June 18, 1963.

2.
A Generalized Form of the Ratio Estimator

For the purpose of this discussion, "ratio estimation" will be taken to mean the use of any formula in which the ratio of the item $Y$ to the benchmark item $Z$ is estimated by the ratio of some linear function of the sample values $y_i$ ($i = 1, 2, \ldots, n$ referring to the order of selection in the sample) to the same linear function of the sample values $z_i$, in which each coefficient in the linear function can depend only on the benchmark item value of the particular sample unit to which it applies. Using this definition, the expression "ratio estimation" includes three important particular cases:

(1) Unbiased estimation with probability of selection proportional to size, "size" being in this instance the values of the benchmark item $Z_I$ ($I = 1, 2, \ldots, N$ referring to some ordering of units within the population). The coefficients in the linear function of the $y_i$ and $z_i$ are proportional to $z_i^{-1}$ and the estimator of the ratio $Y/Z$ reduces to $n^{-1}\sum'(y_i/z_i)$.
NOTE: $\sum'$ is used in this paper to denote a summation over $i = 1, 2, \ldots, n$ and $\sum$ is used for a summation over $I = 1, 2, \ldots, N$.
(2) Ordinary ratio estimation with equal probabilities of selection for all units. The coefficients of the linear functions are all unity and the estimator of the ratio $Y/Z$ is $\sum' y_i / \sum' z_i$.
(3) A modified form of regression estimation in which it is assumed that the constant term is zero. The coefficients are proportional to $z_i$ and the estimator of $Y/Z$ is $\sum' y_i z_i / \sum' z_i^2$.

It will be noted that in the first two cases the probabilities of selection were specified, and, because in both these cases the coefficients of the linear functions were inversely proportional to the probabilities of selection, the estimators were consistent. The modified regression estimator is a consistent estimator when selection is made with probability inversely proportional to size.
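The three particular cases above differ only in the coefficient attached to each sample unit, so they can be sketched with a single function. The following is a minimal illustration, not part of the original paper; the function name `generalized_ratio` and the sample values are hypothetical.

```python
from typing import Callable, Sequence


def generalized_ratio(y: Sequence[float], z: Sequence[float],
                      coef: Callable[[float], float]) -> float:
    """Estimate Y/Z as sum(c_i * y_i) / sum(c_i * z_i), with c_i = coef(z_i).

    The coefficient may depend only on the benchmark value z_i of the unit,
    as required by the definition of ratio estimation in this section.
    """
    num = sum(coef(zi) * yi for yi, zi in zip(y, z))
    den = sum(coef(zi) * zi for zi in z)
    return num / den


# Hypothetical sample values, chosen only for illustration.
y_sample = [2.0, 3.0, 9.0]
z_sample = [1.0, 2.0, 3.0]

# Case (1): coefficients proportional to 1/z_i, giving the mean of y_i/z_i.
pps_ratio = generalized_ratio(y_sample, z_sample, lambda zi: 1.0 / zi)
# Case (2): all coefficients unity, giving sum(y_i)/sum(z_i).
ordinary_ratio = generalized_ratio(y_sample, z_sample, lambda zi: 1.0)
# Case (3): coefficients proportional to z_i, giving sum(y_i z_i)/sum(z_i^2).
regression_ratio = generalized_ratio(y_sample, z_sample, lambda zi: zi)
```

Multiplying any of these ratios by the known benchmark total $Z$ gives the corresponding estimate of the total $Y$.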
The discussion in this paper will be limited (except in Section 6) to such consistent ratio estimators. The consistent ratio estimator of $Y$ is then $y''$, where

$$y'' = Z \sum' p_i^{-1} y_i \Big/ \sum' p_i^{-1} z_i \tag{1}$$

where $P_I$ is the probability of selecting the $I$th population unit for the sample at any draw, and $p_i$ is the particular value of $P_I$ corresponding to the sample unit selected at the $i$th draw.

It should be noted that, when sampling is without replacement, $P_I$ cannot strictly be regarded as the probability of selection of the $I$th population unit at any single draw. It is therefore necessary to define $P_I$ as one $n$th part of the total probability of selection of the $I$th population unit in a sample of $n$. Except where the contrary is explicitly stated, it will be taken that "selection without replacement" means strictly what it says, e.g. if systematic selection from a randomly ordered population (with probability proportional to size) is used to ensure selection without replacement, then the size of each population unit should be no larger than the skip interval.

3. The Concept of an Underlying Stochastic Process

The typical case, in which ratio estimation is employed to estimate the total value of the item $Y$ from a sample, is that in which the values $Y_I$ are in some sense dependent on the known values $Z_I$ of the benchmark item. To take an imaginary example, if the problem were to estimate the production of butter in Australia each month, the value of butter production $Y_I$ by a given factory in a given month in fact depends to a large extent on the known value of butter production $Z_I$ by the same factory over the year covered by the last Factory Census. Alternatively, it may be regarded as dependent on the known wage bill for that factory in that month, in which case "wages" would be a useful benchmark item. In other words, given the value of the benchmark item $Z_I$ we may make a stochastic estimate of $Y_I$.
The existence of a stochastic relation between $Y_I$ and $Z_I$ is, in fact, a hidden assumption behind the decision to use ratio estimation in nearly all practical cases. For although such a decision is justified formally on the grounds of a high correlation between $Y_I$ and $Z_I$, the ground for assuming that a high correlation exists is almost invariably that there is some form of stochastic dependence between $Y_I$ and $Z_I$.

From this point of view, the actual finite population with which the sampling statistician is confronted may be regarded as one particular state of affairs (namely the state of affairs which in fact exists) from all the possible states of affairs which might have existed, given the benchmark item information. It may, in fact, be regarded as a sample of one from an infinite number of finite populations, all with the same values of $Z_I$ but with values of $Y_I$ varying from population to population and stochastically dependent on the $Z_I$.

The simplest form which this dependence can take, in which case the generalized form of ratio estimation described in the previous section is appropriate, is one which appears to have been first suggested by Cochran (1953) (see particularly pages 123-4 and 211-2). The form of this dependence is that the value of $Y_I$ for the $k$th population is $Y_{kI}$, where

$$Y_{kI} = \beta Z_I + U_{kI} \tag{2}$$

and the $U_{kI}$ are independently distributed with zero mean and variance $\sigma_I^2$.

The value of $\sigma_I^2$ may be regarded as a function of $Z_I$. If the $Y_I$ are subject to proportionate variations, $\sigma_I^2$ will be proportional to $Z_I^2$. If on the other hand they are subject to equal variations, $\sigma_I^2$ will be a constant. If they may be regarded as made up of large numbers of small equal elementary units, each subject to equal and independent variations, $\sigma_I^2$ will be proportional to $Z_I$. In general it will be assumed that

$$\sigma_I^2 = \sigma^2 Z_I^{2\gamma} \tag{3}$$

where $\sigma^2$ and $\gamma$ are constants. It may be seen intuitively that $\gamma$ is nearly always between 0 and 1. According to Cochran (1953), p.
212, it is usually between $\tfrac{1}{2}$ and 1 and this has been borne out in empirical studies by the author. The model described by equations (2) and (3) will form the basis for the results obtained in this paper.

An approach somewhat analogous to that used in this paper was employed in an article by Godambe (1955). Godambe's theory virtually uses that part of Cochran's model which may be expressed by equation (2) without equation (3), but the particular form in which he uses it is applicable only to sampling without replacement. The most important result in Godambe's paper duplicated in the present one is his equation (15), which corresponds to the one numbered (22) in this paper. Some related problems were also investigated in a very similar fashion in a paper by Des Raj (1958).

4. Conditional Variances for Particular Samples Drawn Without Replacement

The customary definition of the mean square error of a ratio estimator is based on the average value of the squared deviation from the population total over all possible samples of a given size. This mean square error is a function of the individual population values and of the numerical size of the sample only. The accuracy of any sample estimate of a population total is usually measured by the sample estimate of this mean square error, and though this estimate may vary from sample to sample, the variations have no relevance to the accuracy of each particular sample, but are only unavoidable deviations from the "true" mean square error. Thus no account is taken of whether the sample units selected are (by chance) large or small.

It is, nevertheless, reasonable to suppose that if large units were selected by chance the resulting sample estimates would be more accurate than if small units were selected by chance. As long as the population is regarded as an entity in itself, there is no way of measuring these different accuracies and, indeed, no reason to assume that the accuracies do differ.
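The supposition that a sample of large units is more accurate than a sample of small units can be checked numerically once populations are generated from the model of equations (2) and (3). The sketch below is not from the paper: it assumes normally distributed $U_I$ (the model itself specifies only their mean and variance), uses the ordinary ratio estimator, and all names and numerical values are hypothetical.

```python
import random


def simulate_population(z, beta, sigma, gamma, rng):
    """Draw one population Y_I = beta*Z_I + U_I, with sd(U_I) = sigma * Z_I**gamma.

    Normality of U_I is an assumption made here for convenience only.
    """
    return [beta * zi + rng.gauss(0.0, sigma * zi ** gamma) for zi in z]


def ratio_estimate(y, z, sample_idx):
    """Ordinary ratio estimate of the total Y from the units in sample_idx."""
    z_total = sum(z)
    return z_total * sum(y[i] for i in sample_idx) / sum(z[i] for i in sample_idx)


def conditional_mse(z, sample_idx, beta, sigma, gamma, reps, seed):
    """Mean squared error about the realised total Y, for one fixed sample,
    averaged over repeated populations drawn from the model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        y = simulate_population(z, beta, sigma, gamma, rng)
        total += (ratio_estimate(y, z, sample_idx) - sum(y)) ** 2
    return total / reps


z_values = [float(k) for k in range(1, 21)]   # hypothetical benchmark values
small_units = [0, 1, 2, 3, 4]                 # indices of the five smallest units
large_units = [15, 16, 17, 18, 19]            # indices of the five largest units

mse_small = conditional_mse(z_values, small_units, 2.0, 1.0, 0.5, 2000, seed=1)
mse_large = conditional_mse(z_values, large_units, 2.0, 1.0, 0.5, 2000, seed=1)
```

With these settings `mse_large` comes out well below `mse_small`, which is the difference in accuracy that the conditional variances of this section are designed to measure.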
When the assumption is made of an underlying stochastic process, more particularly of the model described by equations (2) and (3), it becomes possible to measure these different accuracies. In order to do this it is necessary to regard the $y''$ of equation (1) as an estimator not only of $Y$, but also of $\beta Z$, or, since the population from which the sample is drawn is being regarded as one (the $k$th) of an infinite number of possible populations, it is better to say that $y''_k$ must be regarded as an estimator both of $Y_k$ and $\beta Z$. In some situations the statistician is interested in all possible states of affairs but in others he may be interested only in the one existing state. This, in practical situations, determines whether he wants to estimate $Y_k$ or $\beta Z$. The theory for both these situations is developed here.

Denoting by $E_i$ the expectation over all possible populations subject to the particular values of $Z_I$ selected,

$$E_i\, y''_k = \beta Z. \tag{4}$$

Thus, given the model described by equation (2), $y''_k$ is an unbiased estimator of $\beta Z$ no matter which population units are selected in sample, and therefore it is also an unbiased estimator of $\beta Z$ over all possible samples of size $n$.

The variance of $y''_k$ as an estimator of $\beta Z$ depends on whether selection is with or without replacement. The treatment of the case where selection is without replacement is the simpler, as it is known that every sample unit is a different population unit. Consequently, this case will be considered first, and sampling with replacement later (in Section 7). For convenience of notation, the subscript $k$ will be omitted from the remainder of this paper.

Given, then, that selection is without replacement, the conditional variance of $y''$ as an estimator of $\beta Z$, subject to the particular values of $Z_I$ selected, is

$$\begin{aligned}
({}_{\beta Z}\sigma^2_{y''} \mid i) &= E_i (y'' - \beta Z)^2 \\
&= Z^2 E_i \Big\{ \sum' p_i^{-1} u_i \Big/ \sum' p_i^{-1} z_i \Big\}^2 \\
&= Z^2 \sum' p_i^{-2} \sigma_i^2 \Big/ \Big( \sum' p_i^{-1} z_i \Big)^2 \\
&= Z^2 \sum' p_i^{-2} \sigma_i^2 \big/ n^2 z'^2
\end{aligned} \tag{5}$$

where $z' = n^{-1} \sum' p_i^{-1} z_i$ is the unbiased sample estimator of $Z$.
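Equation (5) can be evaluated directly once the $\sigma_i^2$ are supplied. The sketch below is an illustration rather than the author's computation: it substitutes $\sigma_i^2 = \sigma^2 z_i^{2\gamma}$ from equation (3), and the function name and numerical values are hypothetical.

```python
def conditional_variance_beta_z(z_sample, p_sample, z_total, sigma2, gamma):
    """Conditional variance of y'' as an estimator of beta*Z, equation (5),
    with sigma_i^2 = sigma2 * z_i**(2*gamma) taken from equation (3)."""
    num = sum(sigma2 * zi ** (2 * gamma) / pi ** 2
              for zi, pi in zip(z_sample, p_sample))
    den = sum(zi / pi for zi, pi in zip(z_sample, p_sample)) ** 2
    return z_total ** 2 * num / den


# Hypothetical population with benchmark values 1..20 (total Z = 210),
# equal selection probabilities, sigma^2 = 1 and gamma = 1/2.
z_total = 210.0
equal_p = [0.05] * 5
var_small = conditional_variance_beta_z([1.0, 2.0, 3.0, 4.0, 5.0], equal_p,
                                        z_total, 1.0, 0.5)
var_large = conditional_variance_beta_z([16.0, 17.0, 18.0, 19.0, 20.0], equal_p,
                                        z_total, 1.0, 0.5)
```

With equal probabilities and $\gamma = \tfrac{1}{2}$ the expression reduces to $\sigma^2 Z^2 / \sum' z_i$, so the sample of large units has the smaller conditional variance.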
Similarly, the conditional variance of $y''$ regarded as an estimator of $Y$, subject to the particular values of $Z_I$ selected, is

$$\begin{aligned}
(\sigma^2_{y''} \mid i) &= E_i (y'' - Y)^2 \\
&= E_i \Big[ Z \sum' p_i^{-1} u_i \Big/ \sum' p_i^{-1} z_i - \sum U_I \Big]^2 \\
&= E_i \Big[ \sum' \Big( p_i^{-1} Z - \sum' p_j^{-1} z_j \Big) u_i \Big/ \sum' p_j^{-1} z_j - \sum'' U_I \Big]^2 \\
&= Z^2 \sum' \big( p_i^{-1} - n z' Z^{-1} \big)^2 \sigma_i^2 \big/ n^2 z'^2 + \sum'' \sigma_I^2
\end{aligned} \tag{6}$$

where $\sum''$ indicates a summation over all the population units not selected in sample and $z' = n^{-1} \sum' p_i^{-1} z_i$ as in equation (5). Substituting for $\sigma_i^2$ from equation (3), equations (5) and (6) become

$$({}_{\beta Z}\sigma^2_{y''} \mid i) = Z^2 \sigma^2 \sum' p_i^{-2} z_i^{2\gamma} \big/ n^2 z'^2 \tag{7}$$

and

$$(\sigma^2_{y''} \mid i) = \sigma^2 \Big[ Z^2 \sum' \big( p_i^{-1} - n z' Z^{-1} \big)^2 z_i^{2\gamma} \big/ n^2 z'^2 + \sum'' Z_I^{2\gamma} \Big]. \tag{8}$$

Both these expressions are functions of the particular values of $Z_I$ selected, and their different values indicate the variation of the accuracy of $y''$ (as an estimator of $\beta Z$ and $Y$ respectively) for different samples.

It is to be noted that, if the model defined by equations (2) and (3) describes perfectly the true dependence of $Y_I$ on $Z_I$, the significance of equation (4) is that $y''$ is an unbiased estimator of $\beta Z$ regardless of the values of the individual $P_I$, that is of how the sample is chosen.

5. Optimum Selection Probabilities

Denoting by ${}_{\beta Z}\sigma^2_{y''}$ and $\sigma^2_{y''}$ the expectations over all possible samples of size $n$ of the expressions on the left-hand sides of equations (5) and (6), these may be termed the expected variances of $y''$ regarded as an estimator of $\beta Z$ and $Y$ respectively. The optimum values of $P_I$ are those which minimize these two expressions. An approximate formula for the optimum selection probabilities which minimize ${}_{\beta Z}\sigma^2_{y''}$ can be derived as follows:

(9) [equation missing in the source]

The expression in square brackets on the right-hand side tends to unity as $n$ increases. The difference, which is well known to be of the order of $n^{-1}$, will be neglected for the remainder of this paper. Thus

$$\begin{aligned}
{}_{\beta Z}\sigma^2_{y''} &= Z^2 E \sum' p_i^{-2} \sigma_i^2 \Big/ \Big( E \sum' p_i^{-1} z_i \Big)^2 \\
&= n^{-1} \sum P_I^{-1} \sigma_I^2.
\end{aligned} \tag{10}$$
Differentiating this expression partially with respect to $P_I$, holding the sum of the $P_I$ equal to unity by means of a Lagrangian multiplier, and equating the resulting partial differential coefficient to zero, the optimum values of $P_I$ which result are given by

$$P_I = \sigma_I \Big/ \sum \sigma_I. \tag{11}$$

The analogy is close between this formula and the usual optimum allocation for stratified sampling, where with the usual notation,

$$n_h = n N_h S_h \Big/ \sum_h N_h S_h \tag{12}$$

as may be seen by putting $n$ and all the $N_h$ in equation (12) equal to unity. More particularly, it follows from equations (3) and (11) that

$$P_I = Z_I^{\gamma} \Big/ \sum Z_I^{\gamma}. \tag{13}$$

The optimum probabilities of selection are thus proportional to $Z_I^{\gamma}$. This result also holds when the object is to minimize $\sigma^2_{y''}$, for ${}_{\beta Z}\sigma^2_{y''}$ can be split into two components, one from each stage of sampling, that is

$$\begin{aligned}
{}_{\beta Z}\sigma^2_{y''} &= E(y'' - \beta Z)^2 \\
&= E(y'' - Y)^2 + E(Y - \beta Z)^2 \\
&= \sigma^2_{y''} + \sum \sigma_I^2.
\end{aligned} \tag{14}$$

The second term, which represents the first stage variance, is not dependent on the $P_I$. Hence whatever values of $P_I$ minimize ${}_{\beta Z}\sigma^2_{y''}$ also minimize $\sigma^2_{y''}$.

The formula $P_I \propto Z_I^{\gamma}$ implies that, when $\gamma = 0$, equal probabilities of selection are optimum, and when $\gamma = 1$, the optimum probabilities of selection are proportional to size, in which case the ratio estimator is unbiased.

6. Ratio Estimation from "Partial Collections"

A possible objection to the above derivation of optimum probabilities of selection is the following. Assuming that equations (2) and (3) describe the manner in which the population has been generated and that $0 \le \gamma \le 1$, it is tedious but not difficult to show that, for a given fixed sample size of $n$, the most accurate ratio estimate for the population total can be constructed from a "partial collection", that is, the $n$ units with the largest values of $Z_I$. In view of this, the optimum probabilities calculated above appear to be irrelevant.
It must indeed be conceded that a ratio estimator based on such a partial collection is, on these assumptions, an unbiased estimator of $\beta Z$ and that its conditional variance is less than the expected variance of any possible estimator of the type described by equation (1). On the other hand it may well be unwise to abandon a sampling plan for which a variance can be calculated, regardless of any assumptions, for one giving only a conditional variance valid on the assumption that equations (2) and (3) describe the generation of the population.

Nevertheless, if the statistician is satisfied that equations (2) and (3) hold sufficiently well for his purpose, it is possible for him to minimize the conditional variance of his estimate by choosing to select the $n$ units with the largest values of $Z_I$.

It is still possible, and in fact usual, with partial collections, to use equation (1) for estimation by writing $n$ for $p_i^{-1}$. The conditional variances appropriate to such a method of estimation can be obtained from equations (7) and (8) writing $n$ for $p_i^{-1}$ and $\sum' z_i$ for $z'$. This is arithmetically equivalent to treating the partial collection as though it were a sample drawn with equal probabilities of selection for all units in the population.

However, the conditional variances may be reduced still further, in fact minimized, by treating the partial collection as though it were a sample drawn with unequal probabilities. The conditional variance of $y''$ regarded as an estimator of $\beta Z$ can be minimized by treating $p_i^{-1}$ as though it were proportional to $z_i \sigma_i^{-2}$. This yields the equations

$$\begin{aligned}
y'' &= Z \sum' y_i z_i \sigma_i^{-2} \Big/ \sum' z_i^2 \sigma_i^{-2} \\
&= Z \sum' y_i z_i^{1-2\gamma} \Big/ \sum' z_i^{2-2\gamma}
\end{aligned} \tag{15}$$

and

$$\begin{aligned}
({}_{\beta Z}\sigma^2_{y''} \mid i) &= Z^2 \Big/ \sum' z_i^2 \sigma_i^{-2} \\
&= Z^2 \sigma^2 \Big/ \sum' z_i^{2-2\gamma}.
\end{aligned} \tag{16}$$

The conditional variance of $y''$ regarded as an estimator of $Y$ is minimized by treating $p_i^{-1}$ as though it were proportional to $(Z - \sum' z_j)\, z_i \sigma_i^{-2} + \sum' z_j^2 \sigma_j^{-2}$. This
yields the equations

$$\begin{aligned}
y'' &= \sum' y_i + \Big( Z - \sum' z_i \Big) \sum' y_i z_i \sigma_i^{-2} \Big/ \sum' z_i^2 \sigma_i^{-2} \\
&= \sum' y_i + \Big( Z - \sum' z_i \Big) \sum' y_i z_i^{1-2\gamma} \Big/ \sum' z_i^{2-2\gamma}
\end{aligned} \tag{17}$$

and

$$\begin{aligned}
(\sigma^2_{y''} \mid i) &= \Big( Z - \sum' z_i \Big)^2 \Big/ \sum' z_i^2 \sigma_i^{-2} + \sum'' \sigma_I^2 \\
&= \Big( Z - \sum' z_i \Big)^2 \sigma^2 \Big/ \sum' z_i^{2-2\gamma} + \sigma^2 \sum'' Z_I^{2\gamma}.
\end{aligned} \tag{18}$$

It is interesting to note that the approach of this section can be used to analyse the results obtained from any sample, no matter how it was selected, provided the model of equations (2) and (3) is known to hold to a sufficient degree of accuracy.

7. Comparison between Sampling Without and Sampling With Replacement

In this section it will be assumed that sampling with replacement is made with the probability of selection at each draw the same for a given population unit. The expression for the ratio estimator $y''$ is that given by equation (1). The expected variance of $y''$ as an estimator of $\beta Z$ will be written ${}_{\beta Z}\tilde{\sigma}^2_{y''}$, the tilde indicating that sampling is with replacement. Then

$$\begin{aligned}
{}_{\beta Z}\tilde{\sigma}^2_{y''} &= E(y'' - \beta Z)^2 \\
&= Z^2 E\, E_i \Big\{ \sum' p_i^{-1} u_i \Big/ \sum' p_i^{-1} z_i \Big\}^2 \\
&= Z^2 E \Big( \Big[ \sum' p_i^{-2} \sigma_i^2 + \sum'\sum'_{i \ne j} \delta_{ij}\, p_i^{-2} \sigma_i^2 \Big] \Big/ \Big( \sum' p_i^{-1} z_i \Big)^2 \Big)
\end{aligned} \tag{19}$$

where $\sum'\sum'_{i \ne j}$ indicates a double summation over $i = 1, 2, \ldots, n$ and over $j = 1, 2, \ldots, n$ with $i \ne j$, and where $\delta_{ij}$ is unity when the $i$th and $j$th sample units are the same population unit and zero in all other cases. Then

$$\begin{aligned}
{}_{\beta Z}\tilde{\sigma}^2_{y''} &\approx Z^2 \big\{ E\, p_i^{-2} \sigma_i^2 + (n-1)\, E\, \delta_{ij}\, p_i^{-2} \sigma_i^2 \big\} \big/ n \big( E\, p_i^{-1} z_i \big)^2 \\
&= n^{-1} \Big\{ \sum P_I^{-1} \sigma_I^2 + (n-1) \sum \sigma_I^2 \Big\}.
\end{aligned} \tag{20}$$

The expected variance of $y''$ as an estimator of $Y$ is therefore, from equation (14),

$$\tilde{\sigma}^2_{y''} \approx n^{-1} \Big\{ \sum P_I^{-1} \sigma_I^2 - \sum \sigma_I^2 \Big\}. \tag{21}$$

The corresponding expression for sampling without replacement is obtained by comparing equations (10) and (14),

$$\sigma^2_{y''} = n^{-1} \Big\{ \sum P_I^{-1} \sigma_I^2 - n \sum \sigma_I^2 \Big\}. \tag{22}$$

The absolute amount by which the expected variance is reduced by sampling without (as opposed to with) replacement is therefore $n^{-1}(n-1) \sum \sigma_I^2$ and is independent of the probabilities of selection. Thus if a worthwhile saving is achieved with equal probabilities of selection, the same absolute saving with more nearly optimum probabilities of selection is even more worthwhile.
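The statement that the absolute saving does not depend on the selection probabilities can be verified with a few lines of arithmetic. The following sketch evaluates equations (21) and (22) for equal probabilities and for the optimum probabilities of equation (13); the function names and the small population are hypothetical and serve only as an illustration.

```python
def optimum_probabilities(z, gamma):
    """Equation (13): P_I proportional to Z_I**gamma, normalised to sum to one."""
    weights = [zi ** gamma for zi in z]
    total = sum(weights)
    return [w / total for w in weights]


def expected_variances(p, sigma2_units, n):
    """Return (with replacement, without replacement) expected variances of y''
    as an estimator of Y, following equations (21) and (22)."""
    a = sum(s / pi for s, pi in zip(sigma2_units, p))  # sum of sigma_I^2 / P_I
    b = sum(sigma2_units)                              # sum of sigma_I^2
    return (a - b) / n, (a - n * b) / n


# Hypothetical population: four units, sigma^2 = 1, gamma = 1/2, sample size 2.
z_units = [1.0, 4.0, 9.0, 16.0]
gamma, n = 0.5, 2
sigma2_units = [zi ** (2 * gamma) for zi in z_units]   # equation (3)

equal_p = [1.0 / len(z_units)] * len(z_units)
optimum_p = optimum_probabilities(z_units, gamma)

with_equal, without_equal = expected_variances(equal_p, sigma2_units, n)
with_opt, without_opt = expected_variances(optimum_p, sigma2_units, n)

saving_equal = with_equal - without_equal
saving_opt = with_opt - without_opt
```

Both savings equal $n^{-1}(n-1)\sum\sigma_I^2$, but the saving is a larger fraction of the variance under the optimum probabilities, because the with-replacement variance it is subtracted from is smaller there.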
The factor by which the expected variance is reduced is in fact

$$\sigma^2_{y''} \big/ \tilde{\sigma}^2_{y''} = \sum \big( P_I^{-1} - n \big) \sigma_I^2 \Big/ \sum \big( P_I^{-1} - 1 \big) \sigma_I^2. \tag{23}$$

There are a number of interesting special cases of this formula.

(i) If all $P_I^{-1} = N$ it reduces to $(N-n)/(N-1)$, which is the well-known reduction factor for equal probabilities of selection.

(ii) If selection is with probability proportional to size and $R_I = Z_I/Z$,

$$\sigma^2_{y''} \big/ \tilde{\sigma}^2_{y''} = \sum \big( R_I^{-1} - n \big) \sigma_I^2 \Big/ \sum \big( R_I^{-1} - 1 \big) \sigma_I^2. \tag{24}$$

If $\sigma_I^2 = \sigma^2 Z_I^{2\gamma}$, then with $\gamma = \tfrac{1}{2}$ this again gives $(N-n)/(N-1)$ but with $\gamma = 1$ it gives $(1 - n \sum R_I^2)/(1 - \sum R_I^2)$.

(iii) If $\sigma_I^2 = \sigma^2 Z_I^{2\gamma}$, then

$$\sigma^2_{y''} \big/ \tilde{\sigma}^2_{y''} = \sum \big( P_I^{-1} - n \big) Z_I^{2\gamma} \Big/ \sum \big( P_I^{-1} - 1 \big) Z_I^{2\gamma}. \tag{25}$$

If, in addition, optimum probabilities of selection are used with $P_I = Z_I^{\gamma} / \sum Z_I^{\gamma}$,

$$\sigma^2_{y''} \big/ \tilde{\sigma}^2_{y''} = \Big\{ \Big( \sum Z_I^{\gamma} \Big)^2 - n \sum Z_I^{2\gamma} \Big\} \Big/ \Big\{ \Big( \sum Z_I^{\gamma} \Big)^2 - \sum Z_I^{2\gamma} \Big\}. \tag{26}$$

8. Selection with Minimum Replacement

When certain population units are so large that no scheme of selection without replacement is compatible with the total probability of selection being proportional to size, the variance is minimized by taking these large units out of the population, enumerating them separately, and selecting the remainder of the sample from the rump population in the usual way. If this is inconvenient for any reason, the population may be kept in its original state and each large unit selected at least a certain minimum number of times and at most once more often, as in systematic selection.

In dealing with selection without replacement $P_I$ was defined as one $n$th part of the total probability of selection. When selection is with minimum replacement $P_I$ must be defined as one $n$th part of the number of times the $I$th unit is certain of selection plus one $n$th part of the probability of it being selected an extra time also. Thus if the unit is large enough to be selected $\nu_I$ times with certainty, $P_I$ will lie between $\nu_I/n$ and $(\nu_I+1)/n$.

The expected variance of the ratio estimator may be derived using a method analogous to that of the previous section.
Consider the expression $E_i \big( \sum' p_i^{-1} u_i \big)^2$ which appears in the numerator in the formula for conditional variance. A unit which appears in the sample $\nu_I$ times with certainty will contribute either $\nu_I$ or $\nu_I + 1$ terms to the summation, all of magnitude $P_I^{-1} U_I$. The probability that there will be $\nu_I$ such terms is $\nu_I + 1 - nP_I$ and the probability that there will be $\nu_I + 1$ such terms is $nP_I - \nu_I$. Thus with probability $\nu_I + 1 - nP_I$ the contribution of this unit to $E_i \big( \sum' p_i^{-1} u_i \big)^2$ is $\nu_I^2 P_I^{-2} \sigma_I^2$ and with probability $nP_I - \nu_I$ its contribution is $(\nu_I + 1)^2 P_I^{-2} \sigma_I^2$. Hence its expected contribution is:

$$\begin{aligned}
&(\nu_I + 1 - nP_I)\, \nu_I^2 P_I^{-2} \sigma_I^2 + (nP_I - \nu_I)(\nu_I + 1)^2 P_I^{-2} \sigma_I^2 \\
&\quad = \{ nP_I (2\nu_I + 1) - \nu_I (\nu_I + 1) \}\, P_I^{-2} \sigma_I^2.
\end{aligned} \tag{27}$$

The expected variance of the ratio estimator (regarded as an estimator of $\beta Z$) is therefore

$${}_{\beta Z}\sigma^2_{y''} = n^{-2} \sum \{ nP_I (2\nu_I + 1) - \nu_I (\nu_I + 1) \}\, P_I^{-2} \sigma_I^2 \tag{28}$$

and the expected variance of the ratio estimator (regarded as an estimator of $Y$) is

$$\sigma^2_{y''} = n^{-2} \sum \{ nP_I (2\nu_I + 1) - \nu_I (\nu_I + 1) - n^2 P_I^2 \}\, P_I^{-2} \sigma_I^2. \tag{29}$$

9. Estimators of Model Parameters

The problem of estimating variance is effectively that of estimating the $\sigma_I^2$, which are the only unknown terms in all the variance expressions considered in this paper. Since only one value of $Y_I$ is available, the individual $\sigma_I^2$ cannot be estimated without the use of the assumption expressed by equation (3), in which case the problem is to estimate $\sigma^2$ and $\gamma$. This in turn involves the estimation of $\beta$.

(i) Estimation of $\beta$ and $\sigma^2$ with $\gamma$ assumed known

In practice the value of $\gamma$ for any population is likely to remain stable over a long period and it would suffice to determine its value at infrequent intervals. In the meantime $\beta$ and $\sigma^2$ could be estimated, usually from samples, using an assumed value of $\gamma$, in the following fashion.
Assuming that the equations (2) and (3) describe the manner in which the population has been generated, it was indicated in Section 6 that the minimum conditional variance estimator of $\beta$ is

$$b = \sum' y_i z_i^{1-2\gamma} \Big/ \sum' z_i^{2-2\gamma} \tag{30}$$

and that its conditional variance is

$$(\sigma_b^2 \mid i) = \sigma^2 \Big/ \sum' z_i^{2-2\gamma}. \tag{31}$$

Now from the fact that the $y_i$ are distributed with mean $\beta z_i$ and variance $\sigma^2 z_i^{2\gamma}$, it can be shown that the $y_i - b z_i$ are distributed with zero mean and variance

$$\sigma^2 z_i^{2\gamma} \Big( 1 - z_i^{2-2\gamma} \Big/ \sum' z_i^{2-2\gamma} \Big) \tag{32}$$

and that consequently an unbiased estimator of $\sigma^2$ is

$$s^2 = n^{-1} \sum' \Big\{ (y_i - b z_i)^2 \Big/ z_i^{2\gamma} \Big( 1 - z_i^{2-2\gamma} \Big/ \sum' z_i^{2-2\gamma} \Big) \Big\}. \tag{33}$$

(ii) Simultaneous estimation of $\beta$, $\sigma^2$ and $\gamma$

If $\gamma$ is unknown, it is necessary to assume that the distribution of the $U_I$ is normal and to use the method of maximum likelihood. The maximum likelihood estimator of $\beta$ is then

$$b = \sum' y_i z_i^{1-2g} \Big/ \sum' z_i^{2-2g} \tag{34}$$

where $g$ is the maximum likelihood estimator of $\gamma$. Except for the substitution of $g$ for $\gamma$, it will be seen that this estimator is identical with that of equation (30). The estimator of $\sigma^2$, however, is smaller than the unbiased estimator of equation (33), as is usual with maximum likelihood estimators of variance, and takes the form

$$s^2_{ML} = n^{-1} \sum' (y_i - b z_i)^2 \big/ z_i^{2g}. \tag{35}$$

Putting

$$\hat{u}_i = y_i - b z_i, \tag{36}$$

equation (35) can be written

$$s^2_{ML} = n^{-1} \sum' \hat{u}_i^2 \big/ z_i^{2g}. \tag{37}$$

It is not possible to write an explicit formula for the maximum likelihood estimator of $\gamma$, but the method yields the following implicit expression:

$$\operatorname{cov} \{ \hat{u}_i^2 z_i^{-2g},\ \log z_i \} = 0. \tag{38}$$

This is in accordance with what might be expected from another line of reasoning, namely, that since over all possible populations generated by the hypothesized stochastic process the covariance of $u_i^2 z_i^{-2\gamma}$ with all possible functions of $z_i$ is zero, any reasonable estimator of $\gamma$ might be expected to equate the covariance of the $\hat{u}_i^2 z_i^{-2g}$ with some function of $z_i$ to zero. The method of maximum likelihood is still needed to indicate that the most appropriate function of the $z_i$ for this purpose is the logarithmic.
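The estimators of equations (30), (33) and (38) can be sketched as follows. This is an illustrative implementation, not the author's procedure: the function names are hypothetical, the sample must contain at least two units, and `scan_gamma` simply tabulates the covariance over trial values of $g$ so that the value at which it changes sign can be located.

```python
import math


def fit_beta(y, z, gamma):
    """Equation (30): b = sum(y_i z_i^(1-2*gamma)) / sum(z_i^(2-2*gamma))."""
    num = sum(yi * zi ** (1 - 2 * gamma) for yi, zi in zip(y, z))
    den = sum(zi ** (2 - 2 * gamma) for zi in z)
    return num / den


def unbiased_sigma2(y, z, gamma):
    """Equation (33): unbiased estimator of sigma^2 for a known gamma."""
    b = fit_beta(y, z, gamma)
    s = sum(zi ** (2 - 2 * gamma) for zi in z)
    terms = [
        (yi - b * zi) ** 2 / (zi ** (2 * gamma) * (1 - zi ** (2 - 2 * gamma) / s))
        for yi, zi in zip(y, z)
    ]
    return sum(terms) / len(terms)


def gamma_criterion(y, z, g):
    """Left-hand side of equation (38): cov(u_i^2 z_i^(-2g), log z_i),
    with residuals u_i = y_i - b z_i and b computed from the same trial g."""
    b = fit_beta(y, z, g)
    w = [(yi - b * zi) ** 2 / zi ** (2 * g) for yi, zi in zip(y, z)]
    logs = [math.log(zi) for zi in z]
    w_bar = sum(w) / len(w)
    l_bar = sum(logs) / len(logs)
    return sum((wi - w_bar) * (li - l_bar) for wi, li in zip(w, logs)) / len(w)


def scan_gamma(y, z, grid):
    """Tabulate the criterion over trial values of g."""
    return [(g, gamma_criterion(y, z, g)) for g in grid]


# Artificial data: pairs of units with equal z and deviations +/- z**0.75,
# so that b = 2 for every trial g and the squared residuals equal z**1.5.
z_demo = [1.0, 1.0, 2.0, 2.0, 4.0, 4.0, 8.0, 8.0]
signs = [1.0, -1.0] * 4
y_demo = [2.0 * zi + s * zi ** 0.75 for zi, s in zip(z_demo, signs)]
table = scan_gamma(y_demo, z_demo, [0.5, 0.75, 1.0])
```

For these artificial data the criterion is positive at $g = 0.5$, essentially zero at $g = 0.75$ and negative at $g = 1.0$, so the tabulation brackets the value of $g$ that satisfies equation (38).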
It does not appear possible to find any iterative formula for $g$, and graphical methods must be used. However, there would not appear to be much point in calculating the covariances for assumed values of $\gamma$ much less than $\tfrac{1}{2}$ or more than unity. Empirical studies suggest that the correlation between $\hat{u}_i^2 z_i^{-2\gamma}$ and $\log z_i$ varies approximately linearly with $\gamma$ over a considerable range where the value of the correlation is in the region of zero. The author is indebted to Mr. N. F. Nettheim for suggesting this method of estimating $\gamma$.

Unfortunately the logarithmic function lacks robustness in that it varies very rapidly when $z_i$ is insignificantly small, and one or two small units could easily dominate the whole covariance expression in equation (38). To avoid this, it is necessary either to truncate the distribution at some arbitrary point or to use some other statistic which approximates to $\log z_i$ in the upper tail. The work of Simon (1955) and of Simon and Bonini (1958) indicates that most populations of the type to which estimation sampling is usually applied fit closely the Yule distribution, which approximates to a Pareto distribution in the upper tail. It would then follow that the logarithms of the rank orders (reckoning the rank order of the largest to be one, that of the second largest to be two, etc.) or of the rank orders diminished by $\tfrac{1}{2}$, would, in the upper tail, be spaced roughly proportionately to the spacing of the $\log z_i$.

10. Conditional Variances for Two-Stage Sampling

So far, it has been shown that $y''$ is, given the assumption of the underlying stochastic process, an unbiased estimator of $\beta Z$ and that, if all the sample units are different population units, the conditional variance, subject to the selection of a particular set of $Z_I$, is given by equation (7).
It has also been shown that, regarding $y''$ as an estimator of $Y$, the conditional variance of $y''$ could be regarded as the second stage variance of the hypothetical sample design. This conditional variance is given in equation (8).

If, however, the $y_i$ are not accurately known, but are themselves estimated by $y_i''$ (on the basis of a second stage sample of the same kind) then

$$y_i'' = z_i \sum\nolimits'_{2i}\, p_{ii'}^{-1} y_{ii'} \Big/ \sum\nolimits'_{2i}\, p_{ii'}^{-1} z_{ii'} \tag{39}$$

and

$$y'' = Z \sum' p_i^{-1} y_i'' \Big/ \sum' p_i^{-1} z_i \tag{40}$$

where $\sum'_{2i}$ is used to denote a summation over $i' = 1, 2, \ldots, n_i$, $\sum_{2I}$ for a summation over $I' = 1, 2, \ldots, N_I$, and $\sum''_{2i}$ indicates a summation over all the second-stage population units in the $i$th first-stage population unit which were not selected at the second stage of sampling.

To arrive at a formula for the conditional variance of $y''$ it is necessary to redefine the stochastic distribution by the following set of equations:

$$Y_{II'} = \beta_{2I} Z_{II'} + U_{II'} \tag{41}$$

and

$$\beta_{2I} = \beta_1 + Z_I^{-1} U_I \tag{42}$$

from which it follows that

$$Y_I = \beta_1 Z_I + U_I + \sum\nolimits_{2I} U_{II'} \tag{43}$$

where $U_I$ and $U_{II'}$ are independent random variables with zero means and variances $\sigma_I^2$ and $\sigma_{II'}^2$ respectively and

$$\sigma_I^2 = \sigma_1^2 Z_I^{2\gamma_1} \tag{44}$$

$$\sigma_{II'}^2 = \sigma_{2I}^2 Z_{II'}^{2\gamma_{2I}}. \tag{45}$$

It then follows immediately that for selection without replacement

[the equations that follow (45) are missing in the source]

11. Estimators of Model Parameters in Two-Stage Sampling

It is clear that the second stage parameters $\beta_{2i}$, $\sigma_{2i}^2$ and $\gamma_{2i}$ can be estimated in the same way as the parameters $\beta$, $\sigma^2$ and $\gamma$ in Section 9, but for the estimation of $\beta_1$, $\sigma_1^2$ and $\gamma_1$ it is no longer possible to use similar formulae, as the $y_i$ are not exactly known and the $y_i''$ themselves have a conditional variance arising from the second stage of sampling, which can, however, be estimated.

It is theoretically possible to use maximum likelihood for estimating the first stage parameters, but in practice the formulae are unduly complicated, involving the simultaneous solution of two, or, if $\gamma_1$ is unknown, three intrinsic equations.
For practical purposes it is probably best to use a sub-optimum procedure leading to a simpler set of equations.

[the text and equations between this point and equation (52) are missing in the source]

the conditional variance of $\hat{u}_i$ is

$$E_{ii'}\, \hat{u}_i^2 = E_{ii'} (y_i'' - b_1 z_i)^2 \tag{52}$$

[the remainder of equation (52) and equation (53) are not recoverable from the source]

(ii) Simultaneous estimation of $\beta_1$, $\sigma_1^2$ and $\gamma_1$

By analogy with equation (38) an estimator of $\gamma_1$ is given by

(54) [equation not recoverable from the source]

As with equation (38), it is necessary to use graphical methods of solution.

References

Cochran, W. G. (1953). Sampling Techniques. John Wiley & Sons, Inc., New York; Chapman & Hall Ltd., London.
Godambe, V. P. (1955). "A Unified Theory of Sampling from Finite Populations." J. Roy. Statist. Soc., B17, 269-278.
Raj, Des (1958). "On the Relative Accuracy of Some Sampling Techniques." J. Amer. Statist. Assoc., 53, 98-101.
Simon, H. A. (1955). "On a Class of Skew Distribution Functions." Biometrika, 42, 425-440.
Simon, H. A., and Bonini, C. P. (1958). "The Size Distribution of Business Firms." American Economic Review, 48, 602-617.