Lohr 2.2
a)
Unit 1 is included in samples 1 and 3. $\pi_1$ is therefore 1/8 + 1/8 = 1/4
Unit 2 is included in samples 2 and 4. $\pi_2$ is therefore 1/4 + 3/8 = 5/8
Unit 3 is included in samples 1 and 2. $\pi_3$ is therefore 1/8 + 1/4 = 3/8
Unit 4 is included in samples 3, 4 and 5. $\pi_4$ is therefore 1/8 + 3/8 + 1/8 = 5/8
Unit 5 is included in samples 1 and 5. $\pi_5$ is therefore 1/8 + 1/8 = 1/4
Unit 6 is included in samples 1, 3 and 4. $\pi_6$ is therefore 1/8 + 1/8 + 3/8 = 5/8
Unit 7 is included in samples 2 and 5. $\pi_7$ is therefore 1/4 + 1/8 = 3/8
Unit 8 is included in samples 2, 3, 4 and 5. $\pi_8$ is therefore 1/4 + 1/8 + 3/8 + 1/8 = 7/8
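The sums in a) can be rechecked mechanically. The sketch below assumes the sample memberships and selection probabilities implied by the unit-by-unit listing (e.g. unit 2 is in samples 2 and 4, with probabilities 1/4 and 3/8); the function name `inclusion_probabilities` is only illustrative.

```python
from fractions import Fraction as F

# Sample memberships and selection probabilities, read off the
# unit-by-unit sums in a) (assumed reading of the listing).
samples = {
    1: {1, 3, 5, 6},
    2: {2, 3, 7, 8},
    3: {1, 4, 6, 8},
    4: {2, 4, 6, 8},
    5: {4, 5, 7, 8},
}
prob = {1: F(1, 8), 2: F(1, 4), 3: F(1, 8), 4: F(3, 8), 5: F(1, 8)}


def inclusion_probabilities(samples, prob):
    """pi_i = sum of P(S) over all samples S that contain unit i."""
    units = sorted(set().union(*samples.values()))
    return {
        i: sum(prob[s] for s, members in samples.items() if i in members)
        for i in units
    }


pi = inclusion_probabilities(samples, prob)
for i, p in pi.items():
    print(f"pi_{i} = {p}")
```

A useful sanity check is that the inclusion probabilities add up to the sample size, here 4.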
b)
Sample   $\bar y_S$
1        (1 + 4 + 7 + 7)/4 = 4.75
2        (2 + 4 + 7 + 8)/4 = 5.25
3        (1 + 4 + 7 + 8)/4 = 5
4        (2 + 4 + 7 + 8)/4 = 5.25
5        (4 + 7 + 7 + 8)/4 = 6.5

Thus, the sampling distribution is

Sample   $\hat t = 8 \cdot \bar y_S$   Prob.
1        8 · 4.75 = 38           1/8
2        8 · 5.25 = 42           1/4
3        8 · 5 = 40              1/8
4        8 · 5.25 = 42           3/8
5        8 · 6.5 = 52            1/8

$\hat t$   Prob.
38    1/8
40    1/8
42    5/8
52    1/8
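The distribution of the estimated total can be tabulated the same way. This is a sketch only: the y-values per unit are inferred from the sample listings in b), and the sample memberships and probabilities are the ones implied by a).

```python
from collections import defaultdict
from fractions import Fraction as F

# y-value of each unit, inferred from the listings in b) (assumption)
y = {1: 1, 2: 2, 3: 4, 4: 4, 5: 7, 6: 7, 7: 7, 8: 8}
samples = {
    1: {1, 3, 5, 6},
    2: {2, 3, 7, 8},
    3: {1, 4, 6, 8},
    4: {2, 4, 6, 8},
    5: {4, 5, 7, 8},
}
prob = {1: F(1, 8), 2: F(1, 4), 3: F(1, 8), 4: F(3, 8), 5: F(1, 8)}
N = 8


def sampling_distribution(samples, prob, y, N):
    """Distribution of t_hat = N * (sample mean) over the possible samples."""
    dist = defaultdict(F)  # Fraction() is 0
    for s, members in samples.items():
        ybar = F(sum(y[i] for i in members), len(members))
        dist[N * ybar] += prob[s]  # samples with equal t_hat are merged
    return dict(sorted(dist.items()))


dist = sampling_distribution(samples, prob, y, N)
for t_hat, p in dist.items():
    print(t_hat, p)

# Expected value of the estimator, for comparison with the true total
expected = sum(t * p for t, p in dist.items())
true_total = sum(y.values())
```

Comparing `expected` with `true_total` shows whether the estimator is unbiased under this design.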
Lohr 2.6
a)
[Histogram of the sample: frequency (axis 0 to 30) against number of refereed publications (axis 0 to 10)]
b)
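\[ \hat{\bar y}_U = \bar y_S = \frac{1}{50}\sum_{i \in S} y_i = \frac{1}{50}\left(28 \cdot 0 + 4 \cdot 1 + 3 \cdot 2 + \dots + 1 \cdot 10\right) = \frac{89}{50} = 1.78 \]
\[ s^2 = \frac{1}{49}\sum_{i \in S}\left(y_i - \bar y_S\right)^2 = \frac{1}{49}\left(28\,(0 - 1.78)^2 + 4\,(1 - 1.78)^2 + \dots + 1\,(10 - 1.78)^2\right) = 7.1955 \]
\[ SE\left(\hat{\bar y}_U\right) = \sqrt{\left(1 - \frac{n}{N}\right)\frac{s^2}{n}} = \sqrt{\left(1 - \frac{50}{807}\right)\frac{7.1955}{50}} = 0.37 \]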
d)
\[ \hat p = \frac{\#\,\text{sample units with no refereed publications}}{n} = \frac{28}{50} = 0.56 \]
\[ SE(\hat p) = \sqrt{\left(1 - \frac{n}{N}\right)\frac{\hat p\,(1 - \hat p)}{n - 1}} = \sqrt{\left(1 - \frac{50}{807}\right)\frac{0.56 \cdot 0.44}{49}} = 0.0687 \]
95% C.I.:
\[ p = 0.56 \pm 1.96 \cdot 0.0687 = (0.43,\ 0.69) \]
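The standard errors and the interval above follow from the summary figures alone. A minimal sketch, using only the numbers quoted in the solution (N = 807, n = 50, s² = 7.1955, 28 units without publications); the function names are illustrative.

```python
import math

N, n = 807, 50   # population and sample size from the exercise
s2 = 7.1955      # sample variance computed in b)


def se_mean(s2, n, N):
    """SE of the sample mean under SRS, with finite population correction."""
    return math.sqrt((1 - n / N) * s2 / n)


def se_proportion(p, n, N):
    """SE of a sample proportion under SRS (divisor n - 1, as in d))."""
    return math.sqrt((1 - n / N) * p * (1 - p) / (n - 1))


p_hat = 28 / n
se_p = se_proportion(p_hat, n, N)
ci = (p_hat - 1.96 * se_p, p_hat + 1.96 * se_p)

print(round(se_mean(s2, n, N), 2))   # SE of the mean
print(round(se_p, 4))                # SE of the proportion
print([round(c, 2) for c in ci])     # approximate 95% C.I.
```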
Lohr 2.26
Assume $N = n \cdot k$
⇒ $k$ possible samples. Exactly one of them contains unit $i$
\[ \Pr(\text{Unit } i \text{ is in sample}) = \frac{1}{k} = \frac{1}{N/n} = \frac{n}{N} \]
but
\[ \Pr(S) = \frac{1}{k} \neq \frac{1}{\binom{N}{n}} \]
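A small enumeration makes the argument concrete. The sketch assumes that the k possible samples are the systematic ones (a start r among the first k units, then every k-th unit); that reading, the sizes N = 12 and n = 3, and the function name are all assumptions for illustration.

```python
from fractions import Fraction as F
from math import comb


def systematic_samples(N, n):
    """All k = N/n samples: start at r = 1..k, then take every k-th unit."""
    assert N % n == 0, "this sketch assumes N = n * k"
    k = N // n
    return [tuple(range(r, N + 1, k)) for r in range(1, k + 1)]


N, n = 12, 3  # hypothetical sizes
samples = systematic_samples(N, n)
k = len(samples)

# Each of the k samples is selected with probability 1/k.
incl = {i: sum(F(1, k) for s in samples if i in s) for i in range(1, N + 1)}

print(samples)
print(set(incl.values()), F(n, N))   # every unit has inclusion probability n/N
print(F(1, k), F(1, comb(N, n)))     # Pr(S) under this design vs. under SRS
```

Every unit has the same inclusion probability as under SRS, but each possible sample is far more likely than the 1 in C(N, n) an SRS would give it.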
Stratified sampling
• Population divided into strata (singular: stratum). [Males and Females; Different
regions; Age classes, etc.]. Stratum sizes N1, N2, … , NH
• Sampling made from each stratum to account for different variation within
different strata ⇒ Increases precision. Sample sizes n1, n2, … , nH
• Estimation of population totals and population means (proportions) by weighting
sample means from all strata. Weights computed from relative stratum sizes
(Nh / N )
\[ \hat{\bar y}_U = \bar y_{str} = \sum_{h=1}^{H} \frac{N_h}{N}\,\bar y_h \]
• Design planning: Select the total sample size based on precision requirements
(typically length of confidence intervals). Allocate the sample units over strata –
proportional allocation (only size-based), optimal allocation (size-, variation- and
cost-based).
Lohr 3.2
a)
Sample strat1   $\bar y_1$   $\Pr(S_1)$     Sample strat2   $\bar y_2$   $\Pr(S_2)$
1,2             1.5     1/6         4,7             5.5     1/6
1,4             2.5     1/6         4,7             5.5     1/6
1,8             4.5     1/6         4,7             5.5     1/6
2,4             3       1/6         7,7             7       1/6
2,8             5       1/6         7,7             7       1/6
4,8             6       1/6         7,7             7       1/6
b) There are 36 combinations of samples, but only 12 distinct pairs of stratum means
$(\bar y_1, \bar y_2)$, each produced by 3 of the 36 combinations:

$\bar y_1$   $\bar y_2$   $\hat t_{str} = 8 \cdot (0.5 \cdot \bar y_1 + 0.5 \cdot \bar y_2)$   Prob
1.5     5.5     28     3 · (1/6) · (1/6) = 1/12
2.5     5.5     32     3 · (1/6) · (1/6) = 1/12
3       5.5     34     3 · (1/6) · (1/6) = 1/12
1.5     7       34     3 · (1/6) · (1/6) = 1/12
2.5     7       38     3 · (1/6) · (1/6) = 1/12
4.5     5.5     40     3 · (1/6) · (1/6) = 1/12
3       7       40     3 · (1/6) · (1/6) = 1/12
5       5.5     42     3 · (1/6) · (1/6) = 1/12
4.5     7       46     3 · (1/6) · (1/6) = 1/12
6       5.5     46     3 · (1/6) · (1/6) = 1/12
5       7       48     3 · (1/6) · (1/6) = 1/12
6       7       52     3 · (1/6) · (1/6) = 1/12

Some pairs give the same total estimate. Collecting equal values gives the sampling distribution

$\hat t_{str}$   Prob
28     1/12
32     1/12
34     1/6
38     1/12
40     1/6
42     1/12
46     1/6
48     1/12
52     1/12
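The enumeration in a) and b) can be reproduced by brute force. The sketch below takes the stratum y-values from the table in a), lists all SRS samples of size 2 per stratum, and builds the distribution of the stratified total together with its mean and variance. Variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction as F
from itertools import combinations

stratum1 = [1, 2, 4, 8]  # y-values, as listed in the table in a)
stratum2 = [4, 7, 7, 7]
N = len(stratum1) + len(stratum2)
n_h = 2                  # sample size in each stratum


def stratum_means(values, n):
    """Means of all equally likely SRS samples of size n from one stratum."""
    return [F(sum(c), n) for c in combinations(values, n)]


means1 = stratum_means(stratum1, n_h)
means2 = stratum_means(stratum2, n_h)
w1, w2 = F(len(stratum1), N), F(len(stratum2), N)  # weights N_h / N

dist = defaultdict(F)  # Fraction() is 0
for m1 in means1:
    for m2 in means2:
        t_hat = N * (w1 * m1 + w2 * m2)
        dist[t_hat] += F(1, len(means1) * len(means2))  # 36 equally likely pairs

mean = sum(t * p for t, p in dist.items())
var = sum((t - mean) ** 2 * p for t, p in dist.items())

for t_hat in sorted(dist):
    print(t_hat, dist[t_hat])
print(mean, float(var))
```

Exact fractions are used so that the probabilities come out as 1/12 and 1/6 rather than rounded decimals.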
c)
Using the sampling distribution from b):
\[ E(\hat t_{str}) = 28 \cdot \frac{1}{12} + 32 \cdot \frac{1}{12} + 34 \cdot \frac{1}{6} + 38 \cdot \frac{1}{12} + 40 \cdot \frac{1}{6} + 42 \cdot \frac{1}{12} + 46 \cdot \frac{1}{6} + 48 \cdot \frac{1}{12} + 52 \cdot \frac{1}{12} = 40 \]
\[ V(\hat t_{str}) = (28 - 40)^2 \cdot \frac{1}{12} + (32 - 40)^2 \cdot \frac{1}{12} + (34 - 40)^2 \cdot \frac{1}{6} + \dots + (52 - 40)^2 \cdot \frac{1}{12} = \frac{568}{12} = 47.33 \]
Compare with the values given by the formulas:
\[ \bar y_{1U} = (1 + 2 + 4 + 8)/4 = 3.75 \]
\[ S_1^2 = \frac{1}{3}\left((1 - 3.75)^2 + (2 - 3.75)^2 + (4 - 3.75)^2 + (8 - 3.75)^2\right) = \frac{28.75}{3} \]
\[ \bar y_{2U} = (4 + 7 + 7 + 7)/4 = 6.25 \]
\[ S_2^2 = \frac{1}{3}\left((4 - 6.25)^2 + (7 - 6.25)^2 + (7 - 6.25)^2 + (7 - 6.25)^2\right) = \frac{6.75}{3} = 2.25 \]
\[ E(\hat t_{str}) = 8 \cdot \left(\frac{1}{2}\,\bar y_{1U} + \frac{1}{2}\,\bar y_{2U}\right) = 8 \cdot \left(\frac{1}{2} \cdot 3.75 + \frac{1}{2} \cdot 6.25\right) = 40 \]
\[ V(\hat t_{str}) = 8^2 \cdot \left[\left(\frac{1}{2}\right)^2\left(1 - \frac{2}{4}\right)\frac{28.75/3}{2} + \left(\frac{1}{2}\right)^2\left(1 - \frac{2}{4}\right)\frac{2.25}{2}\right] = 47.33 \]
Lohr 3.7
a)
Stratum, h       $N_h$   $n_h$   $\bar y_h$                                       $s_h^2$
Biological, 1    102    7     (1·0 + 2·1 + … + 0·8)/7 = 22/7 = 3.143     6.810
Physical, 2      310    19    40/19 = 2.105                              8.211
Social, 3        217    13    16/13 = 1.231                              4.359
Humanities, 4    178    11    5/11 = 0.455                               0.873
Total            807    50

where, for the first stratum,
\[ s_1^2 = \frac{1}{6}\left(1 \cdot (0 - 3.143)^2 + 2 \cdot (1 - 3.143)^2 + \dots + 0 \cdot (8 - 3.143)^2\right) = 6.810 \]
\[ \hat t_{str} = N \cdot \sum_{h=1}^{4} \frac{N_h}{N}\,\bar y_h = \sum_{h=1}^{4} N_h\,\bar y_h = 102 \cdot \frac{22}{7} + 310 \cdot \frac{40}{19} + 217 \cdot \frac{16}{13} + 178 \cdot \frac{5}{11} = 1321 \]
\[ SE(\hat t_{str}) = \sqrt{\sum_{h=1}^{4}\left(1 - \frac{n_h}{N_h}\right) N_h^2\,\frac{s_h^2}{n_h}} \]
\[ = \sqrt{\left(1 - \frac{7}{102}\right) 102^2\,\frac{6.810}{7} + \left(1 - \frac{19}{310}\right) 310^2\,\frac{8.211}{19} + \left(1 - \frac{13}{217}\right) 217^2\,\frac{4.359}{13} + \left(1 - \frac{11}{178}\right) 178^2\,\frac{0.873}{11}} = 256.2 \]
b) To compare with the results from 2.6 b, divide both estimates by N:
\[ \bar y_{str} = \frac{\hat t_{str}}{N} = \frac{1321}{807} = 1.64 \quad \text{Compare with 1.78 from 2.6 b} \]
\[ SE(\bar y_{str}) = \frac{SE(\hat t_{str})}{N} = \frac{256.2}{807} = 0.32 \quad \text{Compare with 0.37 from 2.6 b} \]
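The stratified total and its standard error can be recomputed directly from the table in a). A sketch, assuming the tabulated stratum sizes, sample sums and sample variances; the dictionary layout and function name are illustrative.

```python
import math

# stratum: (N_h, n_h, sum of y over the stratum sample, s_h^2), from the table in a)
strata = {
    "Biological": (102, 7, 22, 6.810),
    "Physical": (310, 19, 40, 8.211),
    "Social": (217, 13, 16, 4.359),
    "Humanities": (178, 11, 5, 0.873),
}


def stratified_total(strata):
    """t_hat = sum N_h * ybar_h, SE from sum (1 - n_h/N_h) N_h^2 s_h^2 / n_h."""
    t_hat = sum(N_h * y_sum / n_h for N_h, n_h, y_sum, _ in strata.values())
    var = sum(
        (1 - n_h / N_h) * N_h**2 * s2 / n_h
        for N_h, n_h, _, s2 in strata.values()
    )
    return t_hat, math.sqrt(var)


t_hat, se_t = stratified_total(strata)
N = sum(N_h for N_h, *_ in strata.values())

print(round(t_hat), round(se_t, 1))           # total and its SE
print(round(t_hat / N, 2), round(se_t / N, 2))  # mean and its SE
```

Printing the four variance terms separately is an easy way to see which stratum dominates the standard error.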
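c)
\[ \hat p_{str} = \sum_{h=1}^{4} \frac{N_h}{N}\,\hat p_h = \frac{102}{807} \cdot \frac{1}{7} + \frac{310}{807} \cdot \frac{10}{19} + \frac{217}{807} \cdot \frac{9}{13} + \frac{178}{807} \cdot \frac{8}{11} = 0.57 \]
\[ SE(\hat p_{str}) = \sqrt{\sum_{h=1}^{4}\left(1 - \frac{n_h}{N_h}\right)\left(\frac{N_h}{N}\right)^2 \frac{\hat p_h\,(1 - \hat p_h)}{n_h - 1}} \]
\[ = \sqrt{\left(1 - \frac{7}{102}\right)\left(\frac{102}{807}\right)^2 \frac{(1/7)(6/7)}{6} + \left(1 - \frac{19}{310}\right)\left(\frac{310}{807}\right)^2 \frac{(10/19)(9/19)}{18} + \left(1 - \frac{13}{217}\right)\left(\frac{217}{807}\right)^2 \frac{(9/13)(4/13)}{12} + \left(1 - \frac{11}{178}\right)\left(\frac{178}{807}\right)^2 \frac{(8/11)(3/11)}{10}} = 0.0658 \]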
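The same recipe works for the stratified proportion. A minimal sketch, assuming the stratum counts of units without publications that appear in the sum in c) (1 of 7, 10 of 19, 9 of 13, 8 of 11); names are illustrative.

```python
import math

# (N_h, n_h, number of sampled units with no publications), from the sum in c)
strata = [(102, 7, 1), (310, 19, 10), (217, 13, 9), (178, 11, 8)]
N = sum(N_h for N_h, _, _ in strata)


def stratified_proportion(strata, N):
    """Weighted proportion and its SE, with divisor n_h - 1 as in c)."""
    p_str, var = 0.0, 0.0
    for N_h, n_h, a_h in strata:
        p_h = a_h / n_h
        w_h = N_h / N
        p_str += w_h * p_h
        var += (1 - n_h / N_h) * w_h**2 * p_h * (1 - p_h) / (n_h - 1)
    return p_str, math.sqrt(var)


p_str, se_p = stratified_proportion(strata, N)
margin = 1.96 * se_p  # half-width of an approximate 95% C.I.

print(round(p_str, 2), round(se_p, 4), round(margin, 3))
```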
d) In 2.6 d) the standard error of the corresponding estimated proportion
was 0.0687. Thus, the standard error is slightly smaller here, i.e. the precision has improved a bit.
A corresponding confidence interval here gets the error margin
1.96 · 0.0658 = 0.129
compared to 0.135 in 2.6 d)
Lohr 3.22
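a)
\[ n_h = n \cdot \frac{N_h S_h / \sqrt{c_h}}{\sum_l N_l S_l / \sqrt{c_l}} = n \cdot \frac{(N_h/N)\, S_h / \sqrt{c_h}}{\sum_l (N_l/N)\, S_l / \sqrt{c_l}} \]
For 0/1-data $\sigma^2 = p\,(1 - p)$ and in general $\sigma^2 = \frac{N - 1}{N} S^2$
\[ \Rightarrow S_h = \sqrt{\frac{N_h}{N_h - 1}\, p_h\,(1 - p_h)} \approx \sqrt{p_h\,(1 - p_h)} \quad \text{for large } N_h \]
\[ \Rightarrow n_h \approx n \cdot \frac{(N_h/N) \sqrt{p_h\,(1 - p_h)} / \sqrt{c_h}}{\sum_l (N_l/N) \sqrt{p_l\,(1 - p_l)} / \sqrt{c_l}} \]
Now, $c_h = c$, $h = 1, 2$
\[ \Rightarrow n_h = n \cdot \frac{(N_h/N) \sqrt{p_h\,(1 - p_h)}}{\sum_l (N_l/N) \sqrt{p_l\,(1 - p_l)}} \]
\[ \Rightarrow n_1 = \frac{0.4 \sqrt{0.10 \cdot 0.90}}{0.4 \sqrt{0.10 \cdot 0.90} + 0.6 \sqrt{0.03 \cdot 0.97}} \cdot 2000 = 1079 \]
\[ \Rightarrow n_2 = 2000 - 1079 = 921 \]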
b)
\[ V(\hat p_{str}) = \sum_h \left(1 - \frac{n_h}{N_h}\right)\left(\frac{N_h}{N}\right)^2 \frac{S_h^2}{n_h} = \sum_h \left(1 - \frac{n_h}{N_h}\right)\left(\frac{N_h}{N}\right)^2 \frac{N_h}{N_h - 1} \cdot \frac{p_h\,(1 - p_h)}{n_h} \]
$N_h$ can be assumed very large
\[ \Rightarrow V(\hat p_{str}) \approx \sum_h \left(\frac{N_h}{N}\right)^2 \frac{p_h\,(1 - p_h)}{n_h} \]
Under proportional allocation, $n_h = \frac{N_h}{N} \cdot n$
\[ \Rightarrow V(\hat p_{str}) = \sum_h \frac{N_h}{N} \cdot \frac{p_h\,(1 - p_h)}{n} = 0.4 \cdot \frac{0.10 \cdot 0.90}{2000} + 0.6 \cdot \frac{0.03 \cdot 0.97}{2000} = 2.673 \cdot 10^{-5} \]
Under optimal allocation (see a) ), $n_1 = 1079$, $n_2 = 921$
\[ \Rightarrow V(\hat p_{str}) = 0.4^2 \cdot \frac{0.10 \cdot 0.90}{1079} + 0.6^2 \cdot \frac{0.03 \cdot 0.97}{921} = 2.472 \cdot 10^{-5} \]
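Both parts of the exercise reduce to a few lines of arithmetic. The sketch below assumes large strata (finite population corrections ignored), equal costs unless a cost list is passed, and the figures used above (relative sizes 0.4 and 0.6, proportions 0.10 and 0.03, n = 2000). Function names are illustrative.

```python
import math

W = [0.4, 0.6]    # relative stratum sizes N_h / N
p = [0.10, 0.03]  # anticipated stratum proportions
n = 2000          # total sample size


def optimal_allocation(W, p, n, cost=None):
    """n_h proportional to W_h * sqrt(p_h (1 - p_h)) / sqrt(c_h)."""
    cost = cost or [1.0] * len(W)  # equal costs by default
    score = [
        w * math.sqrt(q * (1 - q)) / math.sqrt(c)
        for w, q, c in zip(W, p, cost)
    ]
    return [n * s / sum(score) for s in score]


def variance(W, p, n_h):
    """Approximate V(p_str) = sum W_h^2 p_h (1 - p_h) / n_h (large N_h)."""
    return sum(w**2 * q * (1 - q) / m for w, q, m in zip(W, p, n_h))


n_opt = [round(x) for x in optimal_allocation(W, p, n)]
n_prop = [w * n for w in W]

v_prop = variance(W, p, n_prop)
v_opt = variance(W, p, n_opt)
print(n_opt, v_prop, v_opt)
```

The ratio `v_opt / v_prop` gives a quick measure of how much the optimal allocation gains over the proportional one.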
Cluster sampling
• Population divided into heterogeneous groups - clusters, each serving as a
mirror of the whole population [communities, living areas, schools, classes
within a school]
• Clusters are not the same as strata. Care should be taken so that two clusters by
definition do not have different population properties.
• Cluster sampling is a tool for economising the sampling. The precision of
estimates is usually worse than for simple random sampling (SRS) of
observation units.
• Cluster sampling can be made as one-stage, two-stage or multi-stage sampling,
with primary units [highest level, e.g. communities], secondary units [e.g. living
areas], tertiary units [e.g. individuals]
• Clusters can be of equal or non-equal sizes [different formulas for estimators]
• Sampling at different stages can be made differently – with equal or unequal
(Ch. 6) probabilities
• Stratified sampling can be involved. E.g. If communities are clusters we still
may consider individuals in owned homes to have different living habits than
individuals in rented homes. Thus we may stratify within communities.
• National surveys are almost always made with cluster sampling in a complex
fashion.
• Formulas are always more complicated, due to the more complex structure of the
sampling – Cost is lower.
Lohr 5.11
Claims are primary units, fields are secondary units.
215 (interesting) fields in each claim ⇒ Clusters with equal sizes, M = 215
One-stage sampling: SRS of primary units, checking all secondary units within a
sampled primary unit.
N = 828 (claims)
n = 85 (number of sampled claims)

Claim, i   $M_i = M$   $t_i$
1          215       4
2          215       3
3          215       2
4          215       2
5          215       2
6          215       2
7          215       1
⋮          ⋮         ⋮
28         215       1
29         215       0
⋮          ⋮         ⋮
85         215       0
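a)
\[ \hat t = \frac{N}{n} \sum_{i \in S} t_i = \frac{828}{85} \cdot (4 + 3 + 4 \cdot 2 + 22 \cdot 1 + 57 \cdot 0) = \frac{30636}{85} \approx 360 \]
\[ \hat{\bar y} = \frac{\hat t}{N \cdot M} = \frac{30636/85}{828 \cdot 215} = 0.0020 \]
\[ s_t^2 = \frac{1}{n - 1} \sum_{i \in S}\left(t_i - \frac{\hat t}{N}\right)^2 = \frac{1}{84}\left[\left(4 - \frac{30636/85}{828}\right)^2 + \left(3 - \frac{30636/85}{828}\right)^2 + \dots + 57 \cdot \left(0 - \frac{30636/85}{828}\right)^2\right] = 0.5583 \]
\[ SE\left(\hat{\bar y}\right) = \frac{1}{M}\sqrt{\left(1 - \frac{n}{N}\right)\frac{s_t^2}{n}} = \frac{1}{215}\sqrt{\left(1 - \frac{85}{828}\right)\frac{0.5583}{85}} = 0.00036 \]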
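The one-stage cluster estimates can be recomputed from the error counts alone. A sketch, assuming the counts implied by the sum for the total (one claim with 4 errors, one with 3, four with 2, 22 with 1 and 57 with 0); the function name and return layout are illustrative.

```python
import math

N, n, M = 828, 85, 215
# errors per sampled claim: value -> number of claims with that value
counts = {4: 1, 3: 1, 2: 4, 1: 22, 0: 57}
assert sum(counts.values()) == n


def one_stage_cluster(counts, N, n, M):
    """Total, mean per field and their SEs for equal-sized clusters."""
    t_bar = sum(t * k for t, k in counts.items()) / n   # equals t_hat / N
    s2_t = sum(k * (t - t_bar) ** 2 for t, k in counts.items()) / (n - 1)
    t_hat = N * t_bar
    se_t = N * math.sqrt((1 - n / N) * s2_t / n)
    # mean per field and its SE: divide the total and its SE by N * M
    return t_hat, se_t, t_hat / (N * M), se_t / (N * M), s2_t


t_hat, se_t, y_hat, se_y, s2_t = one_stage_cluster(counts, N, n, M)
print(round(t_hat), round(se_t, 1))
print(round(y_hat, 4), round(se_y, 5), round(s2_t, 4))
```

The standard error of the estimated total is returned as well, since it differs from that of the mean only by the factor N · M.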
b)
\[ \hat t = 360 \]
\[ SE(\hat t) = N \cdot \sqrt{\left(1 - \frac{n}{N}\right)\frac{s_t^2}{n}} = 828 \cdot \sqrt{\left(1 - \frac{85}{828}\right)\frac{0.5583}{85}} = 63.6 \]
Lohr 5.6
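N = 580 (cases)
$M_i = M = 24$ (cans) [equal cluster sizes]
n = 12 (sample of cases, primary sample)
$m_i = m = 3$ (sample size of secondary sample)
\[ \hat t_{unb} = \sum_{i \in S} \sum_{j \in S_i} w_{ij}\, y_{ij} = \sum_{i \in S} \sum_{j \in S_i} \frac{N}{n} \cdot \frac{M_i}{m_i}\, y_{ij} = \frac{N}{n} \cdot \frac{M}{m} \sum_{i \in S} \sum_{j \in S_i} y_{ij} \]
\[ = \frac{580}{12} \cdot \frac{24}{3} \cdot \left[(1 + 5 + 7) + (4 + 2 + 4) + \dots + (0 + 0 + 0)\right] = \frac{580}{12} \cdot \frac{24}{3} \cdot 131 = \frac{580 \cdot 262}{3} \approx 50653 \]
\[ \Rightarrow \frac{\hat t_{unb}}{N} = \frac{262}{3} \quad \text{(estimated mean total per case)} \]
\[ s_t^2 = \frac{1}{n - 1} \sum_{i \in S}\left(\hat t_i - \frac{\hat t_{unb}}{N}\right)^2, \qquad \hat t_i = \frac{M_i}{m_i} \sum_{j \in S_i} y_{ij} \]
\[ = \frac{1}{11}\left[\left(\frac{24}{3}(1 + 5 + 7) - \frac{262}{3}\right)^2 + \dots + \left(\frac{24}{3}(0 + 0 + 0) - \frac{262}{3}\right)^2\right] = 2611.88 \]
\[ s_i^2 = \frac{1}{m_i - 1} \sum_{j \in S_i}\left(y_{ij} - \bar y_i\right)^2 \]
\[ s_1^2 = \frac{1}{2}\left[\left(1 - \frac{1 + 5 + 7}{3}\right)^2 + \left(5 - \frac{1 + 5 + 7}{3}\right)^2 + \left(7 - \frac{1 + 5 + 7}{3}\right)^2\right] = 9.33 \quad \text{etc.} \]
\[ \hat V(\hat t_{unb}) = N^2\left(1 - \frac{n}{N}\right)\frac{s_t^2}{n} + \frac{N}{n} \sum_{i \in S}\left(1 - \frac{m_i}{M_i}\right) M_i^2\,\frac{s_i^2}{m_i} \]
\[ = 580^2\left(1 - \frac{12}{580}\right)\frac{2611.88}{12} + \frac{580}{12}\left(1 - \frac{3}{24}\right)\frac{24^2}{3}\left(9.33 + 1.33 + 1 + 3 + 7 + 12.33 + 5.33 + 2.33 + 4 + 2.33 + 6.33 + 0\right) \]
\[ = 71704800 + 441200 = 72146000 \]
Compare with
\[ \hat V_{WR}(\hat t_{unb}) = N^2\,\frac{s_t^2}{n} = 580^2 \cdot \frac{2611.88}{12} = 73219700 \]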
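The two variance components can be evaluated separately. The sketch uses only summary figures quoted in the solution ($s_t^2$ = 2611.88 and the twelve rounded within-case variances) and applies the general two-stage formula with each within-case term divided by $m_i$. Because the inputs are rounded, the results agree with hand calculation only to a few significant digits. Function and variable names are illustrative.

```python
N, n = 580, 12   # cases in population / sampled cases
M, m = 24, 3     # cans per case / sampled cans per case
s2_t = 2611.88   # sample variance of the estimated case totals
# within-case sample variances s_i^2 (rounded values from the solution)
s2_i = [9.33, 1.33, 1, 3, 7, 12.33, 5.33, 2.33, 4, 2.33, 6.33, 0]


def two_stage_variance(N, n, M, m, s2_t, s2_i):
    """Between- and within-cluster parts of the estimated variance of t_unb."""
    between = N**2 * (1 - n / N) * s2_t / n
    within = (N / n) * sum((1 - m / M) * M**2 * s2 / m for s2 in s2_i)
    return between, within


between, within = two_stage_variance(N, n, M, m, s2_t, s2_i)
v_unb = between + within
v_wr = N**2 * s2_t / n            # with-replacement approximation
t_hat = (N / n) * (M / m) * 131   # 131 = sum of all sampled y_ij

print(round(t_hat), round(between), round(within), round(v_unb), round(v_wr))
```

The between-case term dominates, which is why the simpler with-replacement estimate comes out close to the full one.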