Quiz_Preps_1_thru_9_KEY

Stat 109
Quiz 1 Prep
Given the data set find the following.
Use Summation Notation to express your answer.
There are 10 problems here, expect 4 or 5 on Quiz 1.
1) $\sum_{i=1}^{n} x_i$
2) $\sum_{i=1}^{n} x_i^2$
3) $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
4) $\sum_{i=1}^{n} (x_i - \bar{x})$
5) $\sum_{i=1}^{n} (x_i - \bar{x})^2$
Name ____________
Let
𝑥1 = 6, 𝑥2 = −4, 𝑥3 = 7, 𝑥4 = 2,
Let
𝑓1 = 4, 𝑓2 = 5, 𝑓3 = −3, 𝑓4 = 1
6) Variance: $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
7) Variance: $\frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]$
8) $\sum_{i=1}^{n} f_i x_i$
9) $\sum_{i=1}^{n} f_i x_i^2$
10) $\frac{1}{n}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2$
Stat 109
Quiz 1 Prep
Solution
Let
𝑥1 = 6, 𝑥2 = −4, 𝑥3 = 7, 𝑥4 = 2,
Let
𝑓1 = 4, 𝑓2 = 5, 𝑓3 = −3, 𝑓4 = 1
1) $\sum_{i=1}^{n} x_i = \sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4$
$= 6 + (-4) + 7 + 2 = 11$
ANSWER: $\sum_{i=1}^{4} x_i = 11$
2) $\sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{4} x_i^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2$
$= 6^2 + (-4)^2 + 7^2 + 2^2$
$= 36 + 16 + 49 + 4 = 105$
ANSWER: $\sum_{i=1}^{4} x_i^2 = 105$
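3) $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{4}(x_1 + x_2 + x_3 + x_4)$
$= \frac{1}{4}(6 + (-4) + 7 + 2) = \frac{11}{4}$
ANSWER: $\bar{x} = \frac{1}{4}\sum_{i=1}^{4} x_i = \frac{11}{4} = 2.75$
4) $\sum_{i=1}^{n} (x_i - \bar{x}) = (x_1 - \bar{x}) + (x_2 - \bar{x}) + (x_3 - \bar{x}) + (x_4 - \bar{x})$
$= (6 - \frac{11}{4}) + (-4 - \frac{11}{4}) + (7 - \frac{11}{4}) + (2 - \frac{11}{4})$
$= (\frac{24}{4} - \frac{11}{4}) + (\frac{-16}{4} - \frac{11}{4}) + (\frac{28}{4} - \frac{11}{4}) + (\frac{8}{4} - \frac{11}{4})$
$= \frac{13}{4} + (\frac{-27}{4}) + \frac{17}{4} + (\frac{-3}{4}) = \frac{0}{4} = 0$
ANSWER: $\sum_{i=1}^{4} (x_i - \bar{x}) = 0$
5) $\sum_{i=1}^{n} (x_i - \bar{x})^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + (x_4 - \bar{x})^2$
$= (6 - 2.75)^2 + (-4 - 2.75)^2 + (7 - 2.75)^2 + (2 - 2.75)^2$
$= (3.25)^2 + (-6.75)^2 + (4.25)^2 + (-0.75)^2$
$= 10.5625 + 45.5625 + 18.0625 + 0.5625 = 74.75$
ANSWER: $\sum_{i=1}^{4} (x_i - \bar{x})^2 = 74.75$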
6) Variance: $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + (x_4 - \bar{x})^2}{n-1}$
$= \frac{(6 - 2.75)^2 + (-4 - 2.75)^2 + (7 - 2.75)^2 + (2 - 2.75)^2}{4-1}$
$= \frac{74.75}{3} = 24.916\overline{6}$
ANSWER: $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = 24.916\overline{6}$
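7) Variance: $\frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]$
$= \frac{1}{3}\left[6^2 + (-4)^2 + 7^2 + 2^2 - \frac{(6 + (-4) + 7 + 2)^2}{4}\right]$
$= \frac{1}{3}\left[105 - \frac{11^2}{4}\right] = \frac{1}{3}\left[\frac{420 - 121}{4}\right] = \frac{1}{3}\left[\frac{299}{4}\right] = \frac{299}{12} = 24.916\overline{6}$
ANSWER: $\frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right] = 24.91667$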
8) $\sum_{i=1}^{n} f_i x_i = f_1 x_1 + f_2 x_2 + f_3 x_3 + f_4 x_4$
$= 4(6) + 5(-4) + (-3)(7) + 1(2)$
$= 24 - 20 - 21 + 2 = -15$
ANSWER: $\sum_{i=1}^{4} f_i x_i = -15$
9) $\sum_{i=1}^{n} f_i x_i^2 = f_1 x_1^2 + f_2 x_2^2 + f_3 x_3^2 + f_4 x_4^2$
$= 4(6)^2 + 5(-4)^2 + (-3)(7)^2 + 1(2)^2$
$= 4(36) + 5(16) - 3(49) + 1(4)$
$= 144 + 80 - 147 + 4 = 81$
ANSWER: $\sum_{i=1}^{4} f_i x_i^2 = 81$
10) $\frac{1}{n}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \frac{1}{n}\left[f_1 (x_1 - \bar{x})^2 + f_2 (x_2 - \bar{x})^2 + f_3 (x_3 - \bar{x})^2 + f_4 (x_4 - \bar{x})^2\right]$
$= \frac{1}{4}\left[4(6 - 2.75)^2 + 5(-4 - 2.75)^2 + (-3)(7 - 2.75)^2 + 1(2 - 2.75)^2\right]$
$= \frac{1}{4}\left[4(3.25)^2 + 5(-6.75)^2 - 3(4.25)^2 + 1(-0.75)^2\right]$
$= \frac{1}{4}\left[4(10.5625) + 5(45.5625) - 3(18.0625) + 1(0.5625)\right]$
$= \frac{1}{4}\left[42.25 + 227.8125 - 54.1875 + 0.5625\right]$
$= \frac{216.4375}{4} = 54.109375$
ANSWER: $\frac{1}{n}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = 54.109375$
Stat 109
Quiz 2 Prep
NAME_________
Problem 1.) Given random collected data reporting on the average weekly
hours that 22 students spend in front of a computer, find the 5 key numbers for
a boxplot and express them with correct notation. Then draw the box plot on the
given number line. Express any outliers with small circles.

Data (average weekly hours, n = 22):
1 1 1 1 1
2 4 4 6 6
6 10 10 10 10
15 15 19 24 25
27 58

This Data set is (Circle one): 1.) Skewed left 2.) Symmetrical 3.) Skewed right

Stat 109
Quiz 2 Prep
NAME_________
Problem 2.) Given random collected data reporting on the number of trials it
took each of 28 sixth graders to shoot a basket from the free-throw line, find the
5 key numbers for a boxplot and express them with correct notation. Then draw
the box plot on the given number line. Express any outliers with small circles.

Data (number of trials, n = 28):
1 1 1 2 3 3 4
5 5 6 8 9 11 12
12 15 16 18 19 20 22
25 26 26 42 45 54 58

This Data set is (Circle one): 1.) Skewed left 2.) Symmetrical 3.) Skewed right

Stat 109
Quiz 2 Prep
Solution
Problem 1.) Given random collected data reporting on the average weekly
hours that 22 students spend in front of a computer, find the 5 key numbers for
a boxplot and express them with correct notation. Then draw the box plot on the
given number line. Express any outliers with small circles.

Data (average weekly hours, n = 22):
1 1 1 1 1
2 4 4 6 6
6 10 10 10 10
15 15 19 24 25
27 58

2.) Find the median:
$\tilde{x} = \left(\frac{n+1}{2}\right)^{th} = \left(\frac{22+1}{2}\right)^{th} = \left(\frac{23}{2}\right)^{th} = 11.5^{th}$
$= 11^{th} + 0.5(12^{th} - 11^{th})$
$= 6 + 0.5(10 - 6)$
$= 6 + 0.5(4)$
$= 8$

3.) Determine the Quartile Criterion
If N is divisible by 4, we average:
$Q_1 = \frac{[\frac{n}{4}]^{th} + [\frac{n}{4} + 1]^{th}}{2}$ and $Q_3 = \frac{[\frac{3n}{4}]^{th} + [\frac{3n}{4} + 1]^{th}}{2}$
If N is not divisible by 4:
$Q_1 = [\frac{n}{4} + 1]^{th}$ and $Q_3 = [\frac{3n}{4} + 1]^{th}$
Where square brackets indicate that we round any decimal down to find the nth value in the data set.
N = 22 is not divisible by 4, so: $Q_1 = [\frac{n}{4} + 1]^{th}$ and $Q_3 = [\frac{3n}{4} + 1]^{th}$

4.) Find the 1st and 3rd Quartiles.
$Q_1 = [\frac{n}{4} + 1]^{th} = [\frac{22}{4} + 1]^{th} = [5.5 + 1]^{th} = [6.5]^{th} = 6^{th}$ value
$Q_1 = 2$
$Q_3 = [\frac{3n}{4} + 1]^{th} = [\frac{66}{4} + 1]^{th} = [16.5 + 1]^{th} = [17.5]^{th} = 17^{th}$ value
$Q_3 = 15$

5.) Find the IQR and Step.
IQR = Interquartile range
IQR = Q3 – Q1
IQR = 15 – 2
IQR = 13
Step = 1.5 × IQR
Step = 1.5 × 13
Step = 19.5

6.) Find LOT & UOT
LOT = Q1 − Step
LOT = 2 − 19.5
LOT = −17.5
UOT = Q3 + Step
UOT = 15 + 19.5
UOT = 34.5

Note: The whiskers of the boxplot extend to the last
data point that exists within the outlier thresholds.
It is an error to extend the whiskers to the outlier
thresholds.

[Boxplot on the number line; the outlier is marked with a small circle.]

This Data set is (Circle one): 1.) Skewed left 2.) Symmetrical 3.) Skewed right

Stat 109
Quiz 2 Prep
Solution
Problem 2.) Given random collected data reporting on the number of trials it
took each of 28 sixth graders to shoot a basket from the free-throw line, find the
5 key numbers for a boxplot and express them with correct notation. Then draw
the box plot on the given number line. Express any outliers with small circles.

Data (number of trials, n = 28):
1 1 1 2 3 3 4
5 5 6 8 9 11 12
12 15 16 18 19 20 22
25 26 26 42 45 54 58

2.) Find the median:
$\tilde{x} = \left(\frac{n+1}{2}\right)^{th} = \left(\frac{28+1}{2}\right)^{th} = \left(\frac{29}{2}\right)^{th} = 14.5^{th}$
$= 14^{th} + 0.5(15^{th} - 14^{th})$
$= 12 + 0.5(12 - 12)$
$= 12 + 0.5(0)$
$= 12$

3.) Determine the Quartile Criterion
If N is divisible by 4, we average:
$Q_1 = \frac{[\frac{n}{4}]^{th} + [\frac{n}{4} + 1]^{th}}{2}$ and $Q_3 = \frac{[\frac{3n}{4}]^{th} + [\frac{3n}{4} + 1]^{th}}{2}$
If N is not divisible by 4:
$Q_1 = [\frac{n}{4} + 1]^{th}$ and $Q_3 = [\frac{3n}{4} + 1]^{th}$
Where square brackets indicate that we round any decimal down to find the nth value in the data set.
N = 28 is divisible by 4, so: $Q_1 = \frac{[\frac{n}{4}]^{th} + [\frac{n}{4} + 1]^{th}}{2}$ and $Q_3 = \frac{[\frac{3n}{4}]^{th} + [\frac{3n}{4} + 1]^{th}}{2}$

4.) Find the 1st and 3rd Quartiles.
$Q_1 = \frac{[\frac{n}{4}]^{th} + [\frac{n}{4} + 1]^{th}}{2} = \frac{[\frac{28}{4}]^{th} + [\frac{28}{4} + 1]^{th}}{2} = \frac{[7]^{th} + [8]^{th}}{2}$
$Q_1 = \frac{4 + 5}{2} = \frac{9}{2}$
$Q_1 = 4.5$
$Q_3 = \frac{[\frac{3n}{4}]^{th} + [\frac{3n}{4} + 1]^{th}}{2} = \frac{[\frac{3 \cdot 28}{4}]^{th} + [\frac{3 \cdot 28}{4} + 1]^{th}}{2} = \frac{[\frac{84}{4}]^{th} + [\frac{84}{4} + 1]^{th}}{2} = \frac{[21]^{st} + [22]^{nd}}{2}$
$Q_3 = \frac{22 + 25}{2} = \frac{47}{2}$
$Q_3 = 23.5$

Problem 2 Continued:
$Q_1 = 4.5$
$Q_3 = 23.5$

5.) Find the IQR and Step.
IQR = Interquartile range
IQR = Q3 – Q1
IQR = 23.5 – 4.5
IQR = 19
Step = 1.5 × IQR
Step = 1.5 × 19
Step = 28.5

6.) Find LOT & UOT
LOT = Q1 − Step
LOT = 4.5 − 28.5
LOT = −24
UOT = Q3 + Step
UOT = 23.5 + 28.5
UOT = 52

Note: The whiskers of the boxplot extend to the last
data point that exists within the outlier thresholds.
It is an error to extend the whiskers to the outlier
thresholds.

[Boxplot on the number line; the two outliers are marked with small circles.]

This Data set is (Circle one): 1.) Skewed left 2.) Symmetrical 3.) Skewed right
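The quartile criterion and outlier thresholds used in both Quiz 2 problems can be sketched in code. This is a minimal sketch of the handout's rule as read from the worked solutions above; the function names are hypothetical, and the rule differs from the default quantile methods in most software, so library results may not match.

```python
import math

def nth(sorted_data, k):
    """Return the k-th ordered value (1-indexed, as in the handout)."""
    return sorted_data[k - 1]

def handout_median(sorted_data):
    """Median at position (n + 1) / 2, interpolating for a .5 position."""
    position = (len(sorted_data) + 1) / 2
    k = math.floor(position)
    frac = position - k
    if frac == 0:
        return nth(sorted_data, k)
    low, high = nth(sorted_data, k), nth(sorted_data, k + 1)
    return low + frac * (high - low)

def handout_quartiles(sorted_data):
    """Average two positions when n is divisible by 4, else round down."""
    n = len(sorted_data)
    if n % 4 == 0:
        q1 = (nth(sorted_data, n // 4) + nth(sorted_data, n // 4 + 1)) / 2
        q3 = (nth(sorted_data, 3 * n // 4) + nth(sorted_data, 3 * n // 4 + 1)) / 2
    else:
        q1 = nth(sorted_data, math.floor(n / 4 + 1))
        q3 = nth(sorted_data, math.floor(3 * n / 4 + 1))
    return q1, q3

def boxplot_summary(data):
    """Collect the key numbers, thresholds, whisker ends, and outliers."""
    values = sorted(data)
    q1, q3 = handout_quartiles(values)
    step = 1.5 * (q3 - q1)
    lot, uot = q1 - step, q3 + step
    inside = [v for v in values if lot <= v <= uot]
    return {
        "median": handout_median(values), "q1": q1, "q3": q3,
        "iqr": q3 - q1, "step": step, "lot": lot, "uot": uot,
        # Whiskers stop at the last data points inside the thresholds.
        "whiskers": (inside[0], inside[-1]),
        "outliers": [v for v in values if v < lot or v > uot],
    }

hours = [1, 2, 6, 15, 27, 1, 4, 10, 15, 58, 1, 4, 10, 19,
         1, 6, 10, 24, 1, 6, 10, 25]
trials = [1, 1, 1, 2, 3, 3, 4, 5, 5, 6, 8, 9, 11, 12,
          12, 15, 16, 18, 19, 20, 22, 25, 26, 26, 42, 45, 54, 58]

hours_summary = boxplot_summary(hours)
trials_summary = boxplot_summary(trials)
print(hours_summary)
print(trials_summary)
```

The `whiskers` entry reflects the note in the key: whiskers end at the last data point inside the thresholds, not at LOT or UOT themselves.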

Stat 109
Quiz 3 Prep
Name_____________
In surveys related to colony collapse disorder (CCD), a federation
of beekeepers surveyed the hives in the southern United States and
found that 38% of the bee hives have varroa mites, 22% of the
bee hives have IAPV (Israeli acute paralysis virus), and 7% of
the hives have both ailments. Using the declared event variables
below, find the probability of drawing a bee hive as described
in each scenario. Use complete probability notation to express your answer.
Note that Quiz 3 will likely ask for 4 scenarios. Ten are given here
for practice.

Declare Variables: 2pts
Let one letter signify one event:
M = Hive has Varroa mites.
I = Hive has IAPV.
It is an error to couple event variables. Examples of this error:
MI: Has both mites and IAPV
INM: Has IAPV but not Mites
Use words to describe the assigned variable. These are NOT declarations:
M = 0.38
I = 0.22
B = 0.07

1.) The hive does not have varroa mites. 4.5pts
2.) The hive does not have IAPV. 4.5pts
3.) The hive has both varroa mites and IAPV. 4.5pts
4.) The hive has neither varroa mites nor IAPV. 4.5pts
5.) The hive has varroa mites but not IAPV. 4.5pts
6.) The hive either has varroa mites or does not have IAPV. 4.5pts
7.) The hive either does not have varroa mites or has IAPV. 4.5pts
8.) The hive either does not have varroa mites or does not have IAPV. 4.5pts
9.) The hive either has varroa mites or has IAPV. 4.5pts
10.) The hive does not have varroa mites but does have IAPV. 4.5pts
Stat 109
Quiz 3 Prep
Solution
Use a sketch to represent all possible events in the event space.
Find the notation associated with each event and piece together
with either addition or subtraction the specified events.

Declare Event Variables:
Assign one variable to one event.
DO NOT assign variables without event descriptions.
M = Hive has Varroa mites.
I = Hive has IAPV.
M = 0.38 (No Credit)
I = 0.22 (No Credit)
B = Both (No Credit)

Find the probability of drawing a bee hive from the Southern U.S. such that:
1.) The hive does not have varroa mites.
$P(\bar{M}) = 1 - P(M) = 1 - 0.38$
$P(\bar{M}) = 0.62$
2.) The hive does not have IAPV.
$P(\bar{I}) = 1 - P(I) = 1 - 0.22$
$P(\bar{I}) = 0.78$
3.) The hive has both varroa mites and IAPV.
$P(M \cap I) = 0.07$ (Given)
4.) The hive has neither varroa mites nor IAPV.
$P(\bar{M} \cap \bar{I}) = P(\bar{M}) - P(I) + P(M \cap I)$
$= 0.62 - 0.22 + 0.07$
$= 0.47$
5.) The hive has varroa mites but not IAPV.
$P(M \cap \bar{I}) = P(M) - P(M \cap I)$
$= 0.38 - 0.07$
$= 0.31$
6.) The hive either has varroa mites or does not have IAPV.
$P(M \cup \bar{I}) = P(\bar{I}) + P(M \cap I)$
$= 0.78 + 0.07$
$= 0.85$
7.) The hive either does not have varroa mites or has IAPV.
$P(\bar{M} \cup I) = P(\bar{M}) + P(M \cap I)$
$= 0.62 + 0.07$
$= 0.69$
8.) The hive either does not have varroa mites or does not have IAPV.
$P(\bar{M} \cup \bar{I}) = 1 - P(M \cap I)$
$= 1 - 0.07$
$= 0.93$
9.) The hive either has varroa mites or has IAPV.
$P(M \cup I) = P(M) + P(I) - P(M \cap I)$
$= 0.38 + 0.22 - 0.07$
$= 0.53$
10.) The hive does not have varroa mites but does have IAPV.
$P(\bar{M} \cap I) = P(I) - P(M \cap I)$
$= 0.22 - 0.07$
$= 0.15$
Let’s use some sketches to support the use of probability notation for each
answer. Recall that the entire event space must have probabilities that sum to 1,
then let this segmented sketch have a probability area that equals 1.
Where:
M = Event that a hive has varroa mites.
I = Event that a hive has IAPV.
First we shade the given probabilities (as portions):
$P(M) = 0.38$
$P(I) = 0.22$
$P(M \cap I) = 0.07$
Now we can use these shaded portions to support the notation given in the calculation of the
probability of each event.
1.) The hive does not have varroa mites.
$P(\bar{M}) = 1 - P(M) = 1 - 0.38$
$P(\bar{M}) = 0.62$
[Shaded sketch]
2.) The hive does not have IAPV.
$P(\bar{I}) = 1 - P(I) = 1 - 0.22$
$P(\bar{I}) = 0.78$
[Shaded sketch]
3.) The hive has both varroa mites and IAPV.
$P(M \cap I) = 0.07$ (Given)
4.) The hive has neither varroa mites nor IAPV.
$P(\bar{M} \cap \bar{I}) = P(\bar{M}) - P(I) + P(M \cap I)$
$= 0.62 - 0.22 + 0.07$
$P(\bar{M} \cap \bar{I}) = 0.47$
[Shaded sketch]
5.) The hive has varroa mites but not IAPV.
$P(M \cap \bar{I}) = P(M) - P(M \cap I)$
$= 0.38 - 0.07$
$P(M \cap \bar{I}) = 0.31$
[Shaded sketch]
6.) The hive either has varroa mites or does not have IAPV.
$P(M \cup \bar{I}) = P(\bar{I}) + P(M \cap I)$
$= 0.78 + 0.07$
$P(M \cup \bar{I}) = 0.85$
[Shaded sketch]
7.) The hive either does not have varroa mites or has IAPV.
$P(\bar{M} \cup I) = P(\bar{M}) + P(M \cap I)$
$= 0.62 + 0.07$
$P(\bar{M} \cup I) = 0.69$
[Shaded sketch]
8.) The hive either does not have varroa mites or does not have IAPV.
$P(\bar{M} \cup \bar{I}) = 1 - P(M \cap I)$
$= 1 - 0.07$
$P(\bar{M} \cup \bar{I}) = 0.93$
[Shaded sketch]
9.) The hive either has varroa mites or has IAPV.
$P(M \cup I) = P(M) + P(I) - P(M \cap I)$
$= 0.38 + 0.22 - 0.07$
$P(M \cup I) = 0.53$
[Shaded sketch]
10.) The hive does not have varroa mites but does have IAPV.
$P(\bar{M} \cap I) = P(I) - P(M \cap I)$
$= 0.22 - 0.07$
$P(\bar{M} \cap I) = 0.15$
[Shaded sketch]
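The ten Quiz 3 answers all follow from the three given probabilities, which a short script can confirm. This is a sketch for checking hand work only; the dictionary keyed by problem number is an illustrative choice, and exact fractions are used so that no rounding noise appears.

```python
from fractions import Fraction

# Given: P(M), P(I), and P(M and I).
p_m = Fraction(38, 100)
p_i = Fraction(22, 100)
p_both = Fraction(7, 100)

p_not_m = 1 - p_m
p_not_i = 1 - p_i

# Each entry uses the same formula as the key, keyed by problem number.
answers = {
    1: p_not_m,                   # P(not M)
    2: p_not_i,                   # P(not I)
    3: p_both,                    # P(M and I), given
    4: p_not_m - p_i + p_both,    # P(not M and not I)
    5: p_m - p_both,              # P(M and not I)
    6: p_not_i + p_both,          # P(M or not I)
    7: p_not_m + p_both,          # P(not M or I)
    8: 1 - p_both,                # P(not M or not I)
    9: p_m + p_i - p_both,        # P(M or I)
    10: p_i - p_both,             # P(not M and I)
}

for number, value in answers.items():
    print(f"{number}.) {float(value):.2f}")

# The four disjoint regions of the sketch must cover the whole event space.
total = answers[3] + answers[4] + answers[5] + answers[10]
print("Sum of the four regions:", float(total))
```

The final check mirrors the sketch argument: the regions "both", "neither", "mites only", and "IAPV only" are disjoint and together have probability 1.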
Stat 109
Quiz 4 Prep
Name__________
1.) Given that 70% of pet owners in a college town have cats only while the rest have dogs only
and that 20% of cats have fleas while only 10% of dogs have fleas, find the probability that a
randomly drawn pet will have fleas.
Declare event variables: 2pts
Notation: 2pts
Calculation: 4pts
2.) Find the probability of drawing a dog given that the pet has fleas:
Notation: 2pts
Calculation: 4pts
3.) Are the events of the kind of pet drawn and whether or not it has fleas independent events?
Use complete probability notation and associated values to support your answer.
Notation: 2pts
Calculation: 4pts
Stat 109
Quiz 4 Prep
Solution
1.) Given that 70% of pet owners in a college town have cats only while the rest have dogs
only and that 20% of cats have fleas while only 10% of dogs have fleas, find the probability that a
randomly drawn pet will have fleas.

First define the event space variables.
Let one letter signify one event.
C = event that a cat is drawn.
D = event that a dog is drawn.
F = event that a pet with fleas is drawn.

It is an error to couple event variables.
Examples of this error:
C = event that a cat with fleas is drawn.
D = event that a dog without fleas is drawn.

These are not declarations!
C = 0.70
D = 0.30

(The probability of drawing a pet with fleas)
= (The probability of drawing a pet with fleas given the pet is a dog) × (The probability of drawing a dog)
+ (The probability of drawing a pet with fleas given the pet is a cat) × (The probability of drawing a cat)

Answer with Notation:
$P(F) = P(F|D) \cdot P(D) + P(F|C) \cdot P(C)$
$P(F) = .10(.30) + .20(.70)$
$P(F) = 0.17$
Answering in English:
“There is a 17% chance that a randomly drawn pet will have fleas.”

2.) Find the probability of drawing a dog given that the pet has fleas:

(The probability of drawing a dog given the pet has fleas)
= (The probability of drawing a dog that has fleas) ÷ (The probability of drawing any pet with fleas)
= (The probability of drawing a pet with fleas given that it is a dog) × (The probability of drawing a dog) ÷ (The probability of drawing any pet with fleas)

Answer with Notation:
$P(D|F) = \frac{P(D \cap F)}{P(F)} = \frac{P(F|D) \cdot P(D)}{P(F)}$
$P(D|F) = \frac{0.10(0.30)}{0.17}$
$P(D|F) = .1765$
Answering in English:
“There is a 17.6% chance that a randomly drawn flea-bearing pet will be a dog.”
3.) Are the events of the kind of pet drawn and whether or not it has fleas independent events?
Use complete probability notation and associated values to support your answer.

Explanation
• On problems #1 and #2, along with the given information, we found that:
$P(D) = 0.30$
$P(F) = 0.17$
$P(D|F) = 0.1765$
• An intersection is equal to a conditional probability times the probability of the given event. In this case the conditional probability is the probability of drawing a dog from all flea-bearing pets. We multiply this by the probability of drawing a flea-bearing pet, and the product gives us the intersection of dogs with fleas as a probability.
• We can determine whether events are independent of each other by multiplying their probabilities and checking whether the product is equal to the intersection. If the product is equal, we can claim that the two events are independent of each other.

#3) Notation and Answer
$P(D \cap F) = P(D|F) \cdot P(F)$ (Always true)
$P(D \cap F) = 0.1765 \cdot (0.17)$
$P(D \cap F) = 0.030005$
$P(D \cap F) = P(D) \cdot P(F)$ (True only if D and F are independent)
$P(D) \cdot P(F) = 0.30 \cdot (0.17)$
$P(D) \cdot P(F) = 0.051$
Since
$P(D \cap F) \neq P(D) \cdot P(F)$
$0.030005 \neq 0.051$
The event of a pet having fleas and what kind of pet is drawn are not independent.
The rate of flea infection must depend upon the type of pet drawn.
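The three Quiz 4 results can be reproduced with exact arithmetic. This is a sketch under the problem's stated percentages; the variable names are illustrative. It also shows that the exact intersection is 0.03, so the 0.030005 in the key comes from using the rounded value 0.1765 for P(D|F).

```python
from fractions import Fraction

# Given information from the problem statement.
p_c = Fraction(70, 100)          # P(C): a cat is drawn
p_d = 1 - p_c                    # P(D): a dog is drawn
p_f_given_c = Fraction(20, 100)  # P(F|C)
p_f_given_d = Fraction(10, 100)  # P(F|D)

# 1.) Total probability of fleas.
p_f = p_f_given_d * p_d + p_f_given_c * p_c

# 2.) Conditional probability of a dog given fleas.
p_d_and_f = p_f_given_d * p_d
p_d_given_f = p_d_and_f / p_f

# 3.) Independence check: does P(D and F) equal P(D) * P(F)?
product = p_d * p_f
independent = p_d_and_f == product

print(f"P(F) = {float(p_f):.2f}")
print(f"P(D|F) = {float(p_d_given_f):.4f}")
print(f"P(D and F) = {float(p_d_and_f):.3f}, P(D)P(F) = {float(product):.3f}")
print("Independent?", independent)
```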
Stat 109
Quiz 5 Prep
Name ____________
For all Confidence Intervals present your answer with the percentage level of confidence, the
parameter of interest (mean or median), the high and low values of the interval, the units of
measure, all in a brief descriptive sentence.
For Example: “Find the 85% Confidence Interval for the mean cholesterol level for Eskimos”.
Answer: “The 85% CI for mean Eskimo cholesterol is (220.2, 245.6) mg/dL.”
1.) Entomologists were interested in the adult life span of a particular species of mayfly. A random
sample of the adult life span in hours for 24 mayflies can be found in an MS Excel file at the site:
users.humboldt.edu/tpayer → Stat 109 → Data sets → Mayfly.csv. Previous research has found
that this species had a standard deviation of σ = 1.5 hours. Find a 90% confidence interval on the
mean adult life span of this species of mayfly. Use hand calculation.
2.) Human birth weights in India are (approximately) normally distributed. Find a 95% confidence
interval for the mean population Indian birth weight given a random sample of 17 birth weights with
a sample mean and sd of 2900 ± 598 grams ($\bar{x} \pm sd$). Use hand calculation.
3.) There is a third problem on the back page!
Stat 109
Quiz 5 Prep backpage
Table 6: t-Table for Confidence Intervals.

          Level of Confidence (percent)
df        80        90        95        98        99        99.9
1         3.07768   6.31375   12.7062   31.8206   63.6570   636.607
2         1.88562   2.91999   4.3027    6.9646    9.9248    31.598
3         1.63778   2.35341   3.1825    4.5407    5.8410    12.924
4         1.53320   2.13184   2.7764    3.7470    4.6041    8.610
5         1.47589   2.01505   2.5706    3.3649    4.0321    6.869
6         1.43977   1.94317   2.4469    3.1427    3.7075    5.959
7         1.41493   1.89456   2.3646    2.9980    3.4995    5.408
8         1.39685   1.85953   2.3060    2.8965    3.3554    5.041
9         1.38303   1.83313   2.2622    2.8215    3.2498    4.781
10        1.37215   1.81244   2.2281    2.7638    3.1693    4.587
11        1.36342   1.79588   2.2010    2.7181    3.1058    4.437
12        1.35621   1.78228   2.1788    2.6810    3.0545    4.318
13        1.35019   1.77094   2.1604    2.6503    3.0123    4.221
14        1.34502   1.76133   2.1448    2.6245    2.9768    4.140
15        1.34060   1.75307   2.1315    2.6025    2.9467    4.073
16        1.33677   1.74587   2.1199    2.5835    2.9208    4.015
17        1.33338   1.73962   2.1098    2.5669    2.8982    3.965
18        1.33036   1.73407   2.1009    2.5524    2.8784    3.922
19        1.32775   1.72911   2.0930    2.5395    2.8610    3.883
20        1.32533   1.72474   2.0860    2.5280    2.8453    3.849
21        1.32320   1.72074   2.0796    2.5176    2.8314    3.819
22        1.32125   1.71715   2.0739    2.5083    2.8187    3.792
23        1.31944   1.71389   2.0687    2.4999    2.8074    3.768
24        1.31783   1.71087   2.0639    2.4921    2.7969    3.745
25        1.31636   1.70813   2.0595    2.4851    2.7874    3.725
26        1.31497   1.70563   2.0556    2.4786    2.7787    3.707
27        1.31369   1.70326   2.0519    2.4727    2.7707    3.690
28        1.31253   1.70112   2.0484    2.4671    2.7633    3.674
29        1.31142   1.69911   2.0452    2.4620    2.7564    3.659
30        1.31038   1.69724   2.0423    2.4573    2.7500    3.646
40        1.30308   1.68386   2.0211    2.4232    2.7045    3.551
50        1.29868   1.67589   2.0085    2.4033    2.6778    3.496
60        1.29581   1.67065   2.0003    2.3902    2.6604    3.460
70        1.29376   1.66692   1.9944    2.3808    2.6480    3.435
80        1.29222   1.66413   1.9901    2.3739    2.6387    3.416
90        1.29103   1.66196   1.9867    2.3685    2.6316    3.402
100       1.29007   1.66024   1.9840    2.3642    2.6259    3.391
1000      1.28200   1.64600   1.9620    2.3300    2.5810    3.300
∞         1.28155   1.64485   1.9600    2.3264    2.5758    3.291

3.) If we have a 95% confidence interval, this means 95% of what must be true?
Answer in the context of the last problem.

Confidence (Percent)   z
80                     1.282
90                     1.645
95                     1.960
98                     2.326
99                     2.576
99.8                   3.090
99.9                   3.291
99.99                  3.891
99.999                 4.491
Stat 109
Quiz 5 Prep KEY
1a.) First we load the data file into R:
a1) Open the Mayfly file:
users.humboldt.edu/tpayer → Stat 109 → Data sets → Mayfly.csv
a2) Then select the web address of this file from your browser and press
Ctrl + C to copy it.
a3) Open up R Studio and load the data set.
• Open the Environment Tab
• Import Dataset
• From Web URL
• Click once inside the dialogue box.
• Press Ctrl + V to paste the web address.
• Click OK.
• Keep or rename the file.
• Click on Yes for Heading.
• Click on Import.
1.b) With the data loaded we load the nortest package and test the data for normality and find
the summarized values of the sample size, mean, and sd.

R Editor Code
• Open the Packages tab in the lower right panel and check that the nortest package is checked.

R Console code, with each command explained:

> install.packages("nortest")
Install a package in R by referencing its name in quotes with the install.packages command.

> library(nortest)
Add the package contents to R’s Library, this time without quotes on the package name.

> attach(Mayfly)
Attaches the Mayfly data set to R’s working memory. (Necessary for data access.)

> ad.test(Mayfly[,1])
> ad.test(Hours)
Either one of these lines of code will run the Anderson-Darling test for normality on the Hours
data set within the file called Mayfly. The first command for normality asks for the first
column in the file called Mayfly, while the second command uses the column header of that
data set to reference the data.

Anderson-Darling normality test:
data: Hours
A = 0.2322, p-value = 0.7756

• We conclude that the mayfly data set passes normality because by Anderson-Darling: (p > α), with p-value = 0.7756.

> NROW(Hours)
[1] 24
> mean(Hours)
[1] 8.208333
> sd(Hours)
[1] 1.227552
The sample size, mean and sd of the mayfly data set are found by applying the respective
commands to the column header name of the data set.
$N = 24$, $\bar{x} = 8.208$, $s = sd = 1.228$

Common R Studio Error
> ad.test(Mayfly)
Error in `[.data.frame`(x, complete.cases(x)) : undefined columns selected

R Studio Error Explained
We cannot apply a command to a filename; rather, we must apply the command to a data set
within said file using either the column header name (Hours) or a column number index
Mayfly[,1] for that file.
Problem #1, Continued
Use the summarized data taken from R’s output to construct a confidence interval by hand. Our
goal is to build a confidence interval about the mean mayfly life span. But we must decide which
of 2 confidence interval formulas to use, neither of which will be given on the quiz. Which one
do we apply?
When we know the true standard deviation value, σ, then we will use σ with
a Z-interval. This gives a better approximation, but σ is often unknowable.
$\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
If we do not know the true standard deviation value, σ, then all we have is
the sample standard deviation: s. In this case we use s with a t-interval.
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
For a review on why we make this choice revisit Week 6 Day 1 Lecture Notes.
If σ is known, we use the z-interval for the mean mayfly life span.
Since σ was given as σ = 1.5:
$\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 8.208 \pm 1.645 \frac{1.5}{\sqrt{24}}$
“The 90% CI on the mean adult life span for this species of mayfly is (7.70, 8.71) hours”. 6pts
2.) Human birth weights in India are (approximately) normally distributed. Find a 95%
confidence interval for the mean population Indian birth weight given a random
sample of 17 birth weights with a sample mean and sd of 2900 ± 598 grams ($\bar{x} \pm sd$).
Use hand calculation.
Answer: Data is assumed to be normally distributed as we have summarized data. 1pt
Note! Even though we cannot verify that this data set is normally distributed,
(because summarized data can’t be verified) we still have to acknowledge that we are
making this assumption. Without this we are not justified in using mean values.
Problem #2 Continued:
Since σ is not known, we use the t-interval:
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 2900 \pm 2.1199 \frac{598}{\sqrt{17}}$
“The 95% CI on the mean birth weight in India is (2592.5, 3207.5) grams”. 6pts
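Both hand calculations share one pattern: center ± critical value × standard deviation / √n. A small sketch makes the arithmetic explicit; the function name `interval` is a hypothetical helper, and the critical values 1.645 and 2.1199 are simply the table values used in the key (z at 90%, and t at 95% with df = 16).

```python
import math

def interval(center, critical, sd, n):
    """Return (low, high) for center +/- critical * sd / sqrt(n)."""
    margin = critical * sd / math.sqrt(n)
    return center - margin, center + margin

# Problem 1: sigma is known, so use the z critical value for 90%.
mayfly_low, mayfly_high = interval(8.208, 1.645, 1.5, 24)

# Problem 2: sigma is unknown, so use the t critical value (95%, df = 16).
weight_low, weight_high = interval(2900, 2.1199, 598, 17)

print(f"90% CI, mean mayfly life span: ({mayfly_low:.2f}, {mayfly_high:.2f}) hours")
print(f"95% CI, mean birth weight: ({weight_low:.1f}, {weight_high:.1f}) grams")
```

Reporting the lower bound first and stating the units follows the point key for these problems.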
Point Key For problems #1 and #2: (2 x 7pts each)
-3pts for using the wrong CI formula.
-1pt for each normality check.
-2pts for using the wrong value within the correct table.
-2pts for calculator errors, -0.5pt for round off errors.
-0.5pt for reporting the CI with switched bounds.
The lower bound should come first like this: (1.472, 2.376)
Switched bounds put the upper bound first in error: (2.376, 1.472)
-0.5pt for reporting the CI without units of measure.
-1.5pts for using the sample standard deviation, s,
instead of the true standard deviation, σ, when it is
available.
Procedural Question:
What happens if my df value lands in between the
gaps for larger df sample sizes of the t-table?
Suppose the confidence level is 95% and the
sample size is n = 79. This means your df at n − 1 is
78. Except that there is no corresponding row for
df = 78.
[Figure: a truncated t-table for this example.]
Do we round up to df = 80?
No, because this overstates your sample size.
Do we round down to df = 70?
Yes! Because df = 70 is the closest value in the
table that does not overstate the sample size.
How about interpolating, using 80% of the distance
between 70 and 80 to estimate where the df = 78
value will be?
No, because the t-curve is not linear, and this detail
work we save for R.
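The round-down rule for a missing df row can be sketched as a table lookup. This is a minimal illustration only: the dictionary holds just the 95% column for the larger df rows of Table 6, and the function name is hypothetical.

```python
import bisect

# 95% column of the t-table for the rows from df = 30 upward.
T_95 = {30: 2.0423, 40: 2.0211, 50: 2.0085, 60: 2.0003, 70: 1.9944,
        80: 1.9901, 90: 1.9867, 100: 1.9840, 1000: 1.9620}

def table_row_for(df, table):
    """Return (row_df, t) for the largest tabulated df not exceeding df."""
    rows = sorted(table)
    index = bisect.bisect_right(rows, df) - 1
    if index < 0:
        raise ValueError("df is below the smallest row in this table")
    row_df = rows[index]
    return row_df, table[row_df]

row_df, t_critical = table_row_for(78, T_95)
print(f"n = 79 gives df = 78; use the df = {row_df} row, t = {t_critical}")
```

Rounding down picks the slightly larger critical value, so the interval is a little wider and the sample size is never overstated.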
3.) Answer: 95% confidence means that 95% of all randomly drawn samples will form
confidence intervals that bracket the true mean Indian birth weight. The interval of (2592.5,
3207.5) grams may be one of the 95% CI that contains the true mean Indian birth weight or one
of the 5% that do not. The method we have used works 95% of the time. Thus we have 95%
confidence.
Also to the point: 95% confidence refers to the likelihood of bounding the true mean Indian
birth weight within a confidence interval calculated from a random sample of the
population. Once the confidence interval is calculated we do not know whether the interval
is one of the 95% that contains the true mean birth weight or one of the 5% that fails to catch
the true mean. We cannot verify whether the interval of (2592.5, 3207.5) grams contains the
true mean Indian birth weight, but the method we have used works 95% of the time. Thus we
have 95% confidence.
NOTE to Student:
Both statements above for problem 3 are solid interpretations of
the meaning of confidence in regards to a specific confidence interval. The first sentence
captures the key point. But unfortunately this remains a troubling concept for many students.
Consider a random selection of typical responses from previous quizzes and exams below.
Here are some common errors in the interpretation of the meaning of confidence with corrective responses.
a.)
“95% confidence means that 95% of all Indian birth weights will be between (2592.5, 3207.5) g.”
No, we are not answering whether Indian birth weights span a particular interval, but whether the
true mean of Indian birth weight will be caught within the confidence interval we have calculated
from a random sample. We have a 95% chance of bracketing the true mean birth weight before
the random draw of the data is made. The interval of (2592.5, 3207.5) may be one of the 95%
that contain the true mean birth weight, or one of the 5% that miss it.
b.)
“There is a 95% chance that the confidence interval of (2592.5, 3207.5) g contains the true
mean Indian birth weight.”
No, absolutely not! The interval either contains the true mean or it does not. There is no chance
about it. Referencing 95% chance was appropriate before the random sample was drawn. Once
we have drawn the sample data, our confidence interval is fixed and it either brackets the true
mean or it does not.
c.)
“95% of the data will be true to the interval (2592.5, 3207.5) g.”
What?? No Credit here. Stringing together the terms of data, interval, and “truth” in a vague
sentence does not explain the meaning of confidence. Look, our goal is to try to get the best
estimate we can on the true mean Indian birth weight. The problem is that while this value exists,
it is unknowable, so we resort to statistical sampling to form a confidence interval about a
randomly drawn sample mean. The method we use brackets the true mean value in 95% of all
confidence intervals formed from simulated trials. It is this 95% success rate that is the basis of
our confidence in bracketing the true mean Indian birth weight.
d.)
“95% of all true mean Indian birth weights will be contained within the interval of (2592.5,
3207.5) g.”
Oh no: First there is only one true mean birth weight. Second, you have reversed the process: it is
not whether 95% of birth weights (as true mean values or not) will be contained within the
confidence interval, but whether the confidence interval brackets the one true mean. We could
say that 95% of all confidence intervals formed from random samples bracket the true mean
Indian birth weight.
e.)
“95% of the time the sample mean will exist within the interval of (2592.5, 3207.5) g.”
No! The sample mean will always exist within the confidence interval, assuming that we make
no calculation errors. The sample mean is the dead center of the confidence interval; we add and
subtract a margin of error from the sample mean to build the confidence interval according to the
t-interval formula: $\bar{x} \pm t_{critical} \frac{s}{\sqrt{n}}$. The question is whether the
constructed confidence interval contains the true mean Indian
birth weight. Because this is real data and the true mean is
unknowable, we cannot verify whether this interval brackets the
true mean. Instead we cite that in simulated trials 95% of all
confidence intervals drawn from random samples did bracket the
true mean value, and thus we have 95% confidence.
Recall the lecture of Week 6 Day 1:
We calculated a 98% confidence interval (CI) on the true mean
goliath frog weight in pounds and found (4.32, 5.42) lbs. Here we are
assuming that we do not know that the true mean weight of the frogs in
the pond is μ = 4.62 lbs.
Once we have the confidence interval it is important that we do not
talk about probability or chance. Probability and chance refer to events that
have yet to occur. Once we have rolled the dice, drawn the card from the
deck, or for a biologist, taken a random sample of goliath frog weights, all
talk of chance is over. This confidence interval either caught the true mean
weight of the frog or it did not. The fact that we do not know the true mean
value in most situations does not change this. We have only one random
sample and its resulting confidence interval from which to approximate the
true mean frog weight. This particular interval that we were working with,
(4.32, 5.43) lbs is either one of the 98% of all confidence intervals
calculated from random samples that does contain the true mean frog
weight or it is one of the 2% of all confidence intervals calculated from
random samples that does not contain the true mean frog weight. Our 98%
confidence level comes from the fact that if we were to repeat our random
sampling millions of times over and construct millions of confidence
intervals from these samples we would be assured that 98% of these
confidence intervals would bracket the true mean frog weight. This 98%
probability of success with this method can be verified in computer
simulations and is the basis of our 98% confidence level.
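The repeated-sampling claim can be checked with a small simulation. This is a sketch under stated assumptions, not the lecture's simulation: it assumes normally distributed weights, borrows μ = 4.62 lbs from the frog example, and invents σ = 1.0 lb, n = 20, and 20,000 trials purely for illustration. It uses a z-interval with known σ to keep the code short.

```python
import math
import random
from statistics import NormalDist

def coverage_rate(true_mean, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals that bracket the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one random sample and compute its mean.
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        # Each interval either brackets the true mean or it does not.
        if sample_mean - margin <= true_mean <= sample_mean + margin:
            hits += 1
    return hits / trials

rate = coverage_rate(true_mean=4.62, sigma=1.0, n=20,
                     confidence=0.98, trials=20000)
print(f"Share of intervals that bracket the true mean: {rate:.3f}")
```

Any single interval in the loop is a hit or a miss with no chance involved; the 98% describes the long-run share of hits across all the simulated samples.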
Stat 109
Quiz 6 PREP

Table 8: Test of Hypothesis z-Table.
• Draw a decision line for the reject H0 and do not reject H0 regions on either side of the
  z Critical value(s).
• Find the bracketed p-value and compare it against α (LOS).
• For the following scenarios state a full statistical conclusion of the hypothesis comparing
  the z Sample with z Critical values and bracketed p-values with α values.

1.)  H0: p = 0.15     HA: p > 0.15     z Sample = -1.89    α = 0.05
2.)  H0: p = 0.48     HA: p < 0.48     z Sample = 2.79     α = 0.01
3.)  H0: p = 0.007    HA: p ≠ 0.007    z Sample = -1.35    α = 0.10
Name________
Level of Significance
One-tailed    Two-tailed    z Critical
.10           .20           1.282
.05           .10           1.645
.025          .05           1.960
.01           .02           2.326
.005          .01           2.576
.001          .002          3.090
.0005         .001          3.291
.00005        .0001         3.891
.000005       .00001        4.491
Stat 109
Quiz 6 PREP Solution

1.)
H0: p = 0.15     HA: p > 0.15     z Sample = -1.89    α = 0.05
Decision line (right-tailed): Do Not Reject H0 | Reject H0, with ZCritical = 1.645
At the 5% LOS we do not Reject H0 because:
ZSample < ZCritical    (-1.89 < 1.645)
p > α                  (p > 0.10) > 0.05

2.)
H0: p = 0.48     HA: p < 0.48     z Sample = 2.79     α = 0.01
Decision line (left-tailed): Reject H0 | Do Not Reject H0, with ZCritical = -2.326
At the 1% LOS we do not Reject H0 because:
ZSample > ZCritical    (2.79 > -2.326)
p > α                  (p > 0.10) > 0.01

3.)
H0: p = 0.007    HA: p ≠ 0.007    z Sample = -1.35    α = 0.10
Decision lines (two-tailed): Reject H0 | Do Not Reject H0 | Reject H0,
with ZCritical = -1.645 and ZCritical = +1.645
At the 10% LOS we do not Reject H0 because:
-ZCritical < ZSample < +ZCritical    (-1.645 < -1.35 < 1.645)
p > α                                (0.10 < p < 0.20) > 0.10
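The three conclusions above can be cross-checked with a short Python sketch. It computes exact critical values and p-values from the standard normal distribution instead of bracketing them from Table 8. The function name `z_test_decision` and its `tail` argument are illustrative choices, and the tail direction for each scenario is taken from the critical values shown in the solution.

```python
from statistics import NormalDist


def z_test_decision(z_sample, alpha, tail):
    """Return (critical values, p-value, decision) for a z test.

    tail is 'right' (HA: >), 'left' (HA: <) or 'two' (HA: not equal).
    """
    std = NormalDist()
    if tail == "right":
        crit = (std.inv_cdf(1 - alpha),)
        p_value = 1 - std.cdf(z_sample)
    elif tail == "left":
        crit = (std.inv_cdf(alpha),)
        p_value = std.cdf(z_sample)
    elif tail == "two":
        c = std.inv_cdf(1 - alpha / 2)
        crit = (-c, c)
        p_value = 2 * (1 - std.cdf(abs(z_sample)))
    else:
        raise ValueError("tail must be 'right', 'left' or 'two'")
    decision = "Reject H0" if p_value < alpha else "Do not reject H0"
    return crit, p_value, decision


# (z Sample, alpha, tail) for scenarios 1 to 3
scenarios = [(-1.89, 0.05, "right"), (2.79, 0.01, "left"), (-1.35, 0.10, "two")]
for z, alpha, tail in scenarios:
    crit, p_value, decision = z_test_decision(z, alpha, tail)
    crit_text = ", ".join(f"{c:.3f}" for c in crit)
    print(f"z = {z}, alpha = {alpha}: z critical = {crit_text}, "
          f"p = {p_value:.4f}, {decision}")
```

Comparing the p-value with α and comparing z Sample with z Critical always lead to the same decision, so either comparison can be used to check the other.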
Stat 109
Quiz 7 Prep
Name____________
Given a scenario that requires a test of hypothesis and the resulting p-value, find the
following:
• Declare the parameter with units of measure.
• State the hypothesis.
• Describe what the p-value means in context of the problem. Use an
  English sentence that uses the given parameter and its units of measure.
A plant physiologist conducted an experiment to determine whether mechanical stress can
retard the growth of soybean plants. Young plants were randomly allocated to two groups of
13 plants each. Plants in one group were mechanically agitated by shaking for 20 minutes
twice daily, while plants in the other group were not agitated. After 16 days of growth, the
stem length in cm of each plant was measured, with the results given in the table below.
Assume normality and use an appropriate t-test to test the hypothesis that stress tends to
retard plant growth. Assume that the test's p-value yields: p = 0.02.

Stressed      Control
24.7          25.2
25.7          29.5
26.5          30.1
27.0          30.1
27.1          30.2
27.2          30.2
27.3          30.3
27.7          30.6
28.7          31.1
28.9          31.2
29.7          31.4
30.0          33.5
30.6          34.3
x̄ = 27.78     x̄ = 30.59
sd = 1.726    sd = 2.134
Solution:
Declare parameter:
μ1 = Mean stem length in cm of plant after 16 days for stressed plants.
μ2 = Mean stem length in cm of plant after 16 days for non-stressed plants.
State the hypothesis:
𝐻0 : 𝜇1 = 𝜇2
𝐻𝐴 : 𝜇1 < 𝜇2
Describe what the p-value means in the context of the problem:
If it’s true that there is no difference between the mean stem lengths (in cm) of
stressed versus non-stressed soybean plants, then only 2% of all random draws
will show that shaking the plants retards growth to an even greater extent than our
data set does.
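The summary statistics quoted in the soybean table can be reproduced with a few lines of Python. This is a small check using only the standard library; `stdev` is the sample standard deviation (divisor n - 1), which is assumed to be what the table labels "sd".

```python
from statistics import mean, stdev

# Stem lengths in cm after 16 days, copied from the table
stressed = [24.7, 25.7, 26.5, 27.0, 27.1, 27.2, 27.3,
            27.7, 28.7, 28.9, 29.7, 30.0, 30.6]
control = [25.2, 29.5, 30.1, 30.1, 30.2, 30.2, 30.3,
           30.6, 31.1, 31.2, 31.4, 33.5, 34.3]

for label, data in (("Stressed", stressed), ("Control", control)):
    print(f"{label}: n = {len(data)}, "
          f"mean = {mean(data):.2f} cm, sd = {stdev(data):.3f} cm")
```

The sample mean of the stressed group is lower than that of the control group, which matches the direction of the alternate hypothesis.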
Stat 109
Quiz 7
Prep Solution
If the null hypothesis is true, then p% of all random draws will
contradict the null hypothesis to an even greater extent than our data set does.

Template:
If it’s true that [the null hypothesis expressed with its parameter and units of measure],
then p% of all random draws will show that [the key point of the alternate hypothesis]
to an even greater extent than our data set does.

Applied to the soybean problem:
If it’s true that [there is no difference between the mean stem lengths (in cm) of stressed
versus non-stressed soybean plants], then 2% of all random draws will show that [shaking
plants retards growth] to an even greater extent than our data set does.
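The sentence "p% of all random draws will contradict the null hypothesis to an even greater extent than our data set does" can be illustrated with a simulation. The sketch below makes simplifying assumptions: the test statistic is treated as standard normal when the null hypothesis is true (the soybean problem actually uses a t statistic), and the observed statistic `z_observed` is a hypothetical value chosen so that its left-tail p-value equals the 0.02 assumed in the problem.

```python
import random
from statistics import NormalDist

P_VALUE = 0.02                               # p-value assumed in the soybean problem
z_observed = NormalDist().inv_cdf(P_VALUE)   # hypothetical statistic, about -2.05


def fraction_more_extreme(reps=100_000, seed=109):
    """Share of draws made under H0 that fall below the observed statistic."""
    rng = random.Random(seed)
    more_extreme = sum(rng.gauss(0.0, 1.0) < z_observed for _ in range(reps))
    return more_extreme / reps


print(f"Observed statistic: {z_observed:.3f}")
print(f"Fraction of null draws that are more extreme: {fraction_more_extreme():.4f}")
```

The printed fraction lands close to 0.02: when the null hypothesis is true, about 2% of random draws are more extreme than the observed result in the direction of the alternate hypothesis.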
Stat 109
Quiz 7 Prep
Consider these examples:
Try to complete each of these examples for practice without
looking at the answers on the next page.
Name_______________
For each example:
• Declare the parameter.
• State the hypothesis.
• Interpret the p-value.
1.)
A pediatrician wants to determine how effective aspirin is in decreasing body temperature of
5-year-olds with the flu. She records the body temperature in Fahrenheit of 14 of her patients
before and one hour after administering aspirin. She runs a paired t-test. Her hypothesis test
yields a p-value of p = 0.03. Interpret the meaning of the p-value. Given the p-value is a
probability and it is read as a percentage, then 3% of what must be true?
2.)
A kinesiologist suspects that the proportion of national baseball league pitchers that are
left-handed will be greater than the 10% reported in the general U.S. population. He runs a one
proportion hypothesis test on a random selection of 48 pitchers from the national baseball league
and finds that 13 of the pitchers are left-handed. The hypothesis test reports a p-value of
p = 0.23. Interpret the meaning of the p-value. Given the p-value is a probability and it is read
as a percentage, then 23% of what must be true?
3.)
A botanist suspects that the rate at which trees absorb carbon dioxide (measured in units of kg C
per square meter of ground area) will increase if the trees are fertilized. She runs a 2-sample t-test
comparing similar stands of neighboring trees where one stand receives fertilizer and the controls
did not. Her hypothesis test yields a p-value of p = 0.04. Interpret the meaning of the p-value.
Given the p-value is a probability and it is read as a percentage, then 4% of what must be true?
4.)
A doctor suspects that the incidence of allergic reactions in adults will be lower in populations
that grew up as children in homes where they regularly made contact with household pets. She
samples from 2 populations: 2300 adults that had pets as youngsters, and 1600 adults that never
had a pet as a child. The adults that had pets as children had an incidence of allergic reactions
of 2%. In the population of adults that never had a pet as a child the incidence of allergic
reactions was 3.5%. Does the evidence support the doctor's suspicion? A 2-sample proportion
test yielded a p-value of 0.016.
Stat 109
Quiz 7 Prep Key
1.)
• Declare the parameter.
μd = Mean difference in temperatures in °F of 5-year-old flu patients between before and
after aspirin ingestion.
• State the hypothesis.
Ho: μd = 0
Ha: μd > 0
We use “ > ” because Before - After > 0, that is, (Large - small) > 0.
• Interpret the p-value.
If it’s true that aspirin does not lower the temperature of 5-year-olds with the flu,
then 3% of all random samples of 5-year-olds with the flu that have taken aspirin
will have lowered temperatures to an even greater extent than was shown in this
sample.
2.)
• Declare the parameter.
p = Proportion of left-handers among pitchers in the national baseball league.
• State the hypothesis.
Ho: p = 0.10
Ha: p > 0.10
• Interpret the p-value.
If it’s true that the proportion of left-handers in the national baseball league does
not exceed 10%, then 23% of all random samples of national league pitchers will
show an even greater proportion of left-handers than our sample of 13/48.
3.)
• Declare the parameter.
μ1 = Mean carbon absorbed in kg C per square meter of ground area for fertilized trees.
μ2 = Mean carbon absorbed in kg C per square meter of ground area for unfertilized trees.
• State the hypothesis.
Ho: μ1 = μ2
Ha: μ1 > μ2
• Interpret the p-value.
If it’s true that fertilized trees do not absorb any more carbon than non-fertilized
trees, then 4% of all random samples of fertilized trees will show rates of carbon
absorption that exceed what was found in this sample.
4.)
• Declare the parameter.
p1 = proportion of adults that have allergies, and had pets as children.
p2 = proportion of adults that have allergies, but had no pets as children.
• State the hypothesis.
Ho: p1 = p2
Ha: p1 < p2
• Interpret the p-value.
If it’s true that having pets as children does not reduce one’s chances of having
allergies as an adult later in life, then 1.6% of all random draws comparing the
adult allergy rates between those that had pets as children and those that did not
will show results where adults that did have pets as children had an even lower
comparative rate of allergies than was seen in our sample data.
Stat 109
Quiz 8 Prep
NAME______________
For each of the 5 problems (only 4 are shown here for the prep) determine whether the
scenario describes an independent comparison of means or a dependent comparison of means.
• Declare the parameter with units of measure.
• Make the hypothesis statement.
1.)
In a study of kidney function, 40 adult male frogs, Rana pipiens, had their Oxygen, O2,
consumption measured. 20 of the frogs had renal (kidney) damage while the other 20 frogs were
used as controls. The researchers suspect that the frogs with renal damage will consume more
oxygen in ml/g/hour than the controls. The mean O2 consumed by both groups of frogs was
recorded and compared.
2.)
Beta wave, or beta rhythm, is the term used to designate the frequency range of brain
activity above 12 Hz (12 transitions or cycles per second). Beta states are the states associated
with normal waking consciousness. Low amplitude beta waves with multiple and varying
frequencies are often associated with active, busy, or anxious thinking and active concentration.
Researchers suspect that sedative-hypnotic drugs such as benzodiazepines or barbiturates will
reduce the mean amplitude in a rat’s beta waves. 24 rats had their beta waves recorded as they
engaged in solving maze pathways. The beta waves of the same rats were then recorded when
the rats were subjected to a small dose of barbiturates and introduced to a new maze puzzle. The
mean amplitude of the beta waves for the rats in both experiments were compared.
3.)
The hygiene hypothesis proposes that our immune systems are fortified if we are exposed
to animals when we are young. A nurse tests the hypothesis on a group of adult volunteers that
had been separated into two groups: Group 1 had either a cat, a dog, or a rodent as a pet when
they were children, while Group 2 did not have any pets when they were children. Each group was
given a skin scratch test for pet dander and the resulting skin rash that developed was recorded
for each individual in mm². The mean area of skin reactivity in mm² for each group was recorded
and compared.
4.)
A nurse tests the hypothesis that comfrey compresses can reduce skin reactivity to
irritants on the skin on a group of adult volunteers. Each volunteer had both of their forearms
scratch tested with pet dander. Immediately after the scratch test each volunteer had their left
forearm prepared with a comfrey compress while their right forearm was treated with a placebo
compress. The mean area of skin reactivity in mm² for each arm was recorded and compared.
Stat 109
Quiz 8 Prep Solution
For each of the 5 problems (only 4 are shown here for the prep) determine whether the
scenario describes an independent comparison of means or a dependent comparison of means.
• Declare the parameter with units of measure.
• Make the hypothesis statement.
1) In a study of kidney function, 40 adult male frogs, Rana pipiens, had their Oxygen, O2,
consumption measured. 20 of the frogs had renal (kidney) damage while the other 20
frogs were used as controls. The researchers suspect that the frogs with renal damage will
consume more oxygen in ml/g/hour than the controls. The mean O2 consumed by both
groups of frogs was recorded and compared.
Declare parameters:
μ1 = mean O2 consumption of renal damaged frogs in ml/g/hour.
μ2 = mean O2 consumption of control frogs in ml/g/hour.
State hypothesis:
H0: μ1 = μ2
HA: μ1 > μ2
2) Beta wave, or beta rhythm, is the term used to designate the frequency range of brain activity
above 12 Hz (12 transitions or cycles per second). Beta states are the states associated with normal
waking consciousness. Low amplitude beta waves with multiple and varying frequencies are often
associated with active, busy, or anxious thinking and active concentration. Researchers suspect that
sedative-hypnotic drugs such as benzodiazepines or barbiturates will reduce the mean amplitude in a
rat’s beta waves. 24 rats had their beta waves recorded as they engaged in solving maze pathways.
The beta waves of the same rats were then recorded when the rats were subjected to a small dose of
barbiturates and introduced to a new maze puzzle. The mean amplitude of the beta waves for the rats
in both experiments were compared.
Declare parameters:
μd = mean difference in Beta-wave amplitude (control - barbiturate dosed) for the same rats.
State hypothesis:
H0: μd = 0
HA: μd > 0
3.)
The hygiene hypothesis proposes that our immune systems are fortified if we are exposed
to animals when we are young. A nurse tests the hypothesis on a group of adult volunteers that
had been separated into two groups: Group 1 had either a cat, a dog, or a rodent as a pet when
they were children, while Group 2 did not have any pets when they were children. Each group was
given a skin scratch test for pet dander and the resulting skin rash that developed was recorded
for each individual in mm². The mean area of skin reactivity in mm² for each group was recorded
and compared.
Declare parameters:
μ1 = mean skin reactivity in mm² for adults who had pets as children.
μ2 = mean skin reactivity in mm² for adults who did not have pets as children.
State hypothesis:
H0: μ1 = μ2
HA: μ1 < μ2
4.)
A nurse tests the hypothesis that comfrey compresses can reduce skin reactivity to
irritants on the skin on a group of adult volunteers. Each volunteer had both of their forearms
scratch tested with pet dander. Immediately after the scratch test each volunteer had their left
forearm prepared with a comfrey compress while their right forearm was treated with a placebo
compress. The mean area of skin reactivity in mm² for each arm was recorded and compared.
Declare parameters:
μd = mean difference in skin reactivity in mm² for adults between their comfrey treated left
forearm and their placebo treated right forearm (comfrey - placebo).
State hypothesis:
H0: μd = 0
HA: μd < 0
Stat 109
Quiz 9 Prep
Given a brief description of the categorical data displayed in a 2x2 table, determine which
of the 3 Chi-square hypothesis tests to apply:
1) Goodness of Fit Test
2) Independence of Attributes
3) McNemar’s Test
Name____________

1)
Whitehall Laboratories, makers of Advil, published a study comparing reports of upset
stomach as a side effect among ibuprofen users versus placebo users. Is the rate of upset
stomach in the ibuprofen group significantly different from that experienced in the control
group?

Dosage Group:       Control    Ibuprofen
Upset stomach       8          6
No upset stomach    664        645

Circle the correct Chi-Square test to apply:
1) Goodness of Fit Test
2) Independence of Attributes
3) McNemar’s Test
2)
In an ecological study of the Carolina Junco, 54 birds were captured from a certain
population; of these, 40 were male. Is this evidence that males outnumber females in the
population?

Junco Counts:    Observed    Expected
Male             40          27
Female           14          27

Circle the correct Chi-Square test to apply:
1) Goodness of Fit Test
2) Independence of Attributes
3) McNemar’s Test
3)
The shrub Xerospirea hartwegiana inhabits dry climates of Mexico. Botanists were interested
in the shrub's ability to regenerate after trauma. 35 Xerospirea shrubs were randomly selected
and all shrubs were topped off 2 inches above ground level. 15 of the shrub stumps were
subjected to a propane torch to simulate a fire. Does the presence of fire significantly reduce
the shrub's ability to regenerate?

Shrub Results:    Cut Only    Cut and Burned
Died              1           4
Regenerated       19          11

Circle the correct Chi-Square test to apply:
1) Goodness of Fit Test
2) Independence of Attributes
3) McNemar’s Test
4)
ACME Laboratories, makers of a new
sunscreen ointment tested the ointment for the
side effect of itchy skin. Concerned that itchy
skin might be sex dependent, researchers applied
the ointment to 40 women and their brothers in an
attempt to match genetic skin type but still
control for differences of sex. Is the reaction to
the ointment dependent on one’s sex?
                          Brother has itchy skin    Brother has no reaction
Sister has itchy skin     5                         8
Sister has no reaction    1                         27

Circle the correct Chi-Square test to apply:
1.) Goodness of Fit Test
2.) Independence of Attributes
3.) McNemar’s Test
5)
In an attempt to determine if caffeine quickens one’s reaction time, researchers tested
the reaction time of 64 college students before and after taking caffeine. Using a threshold
of 0.35 seconds to push a button in response to a bell, each student's success or failure was
recorded before and after taking 2 cups of coffee.

No Caffeine ↓    Success with Caffeine    Failure with Caffeine
Success          14                       3
Failure          18                       27

Circle the correct Chi-Square test to apply:
1.) Goodness of Fit Test
2.) Independence of Attributes
3.) McNemar’s Test
6)
The shrub Xerospirea hartwegiana inhabits dry climates of Mexico. Botanists were interested
in the role fire plays in the colonization of the shrub. 35 random transects of a high mesa
plain were taken the year before and then visited again a year after the area was overrun by
wildfire, and the presence or absence of Xerospirea hartwegiana shrubs in each transect was
recorded before and after the wildfire. Does the presence of the shrub before fire
significantly reduce the shrub's presence after the fire?

Before Wildfire ↓    After Fire Present    After Fire Absent
Present              1                     19
Absent               4                     11

Circle the correct Chi-Square test to apply:
1.) Goodness of Fit Test
2.) Independence of Attributes
3.) McNemar’s Test
Stat 109
Quiz 9 Prep Solution
1.) Independence of Attributes.
The researchers of the experiment compare 2 populations of treatments (ibuprofen vs control)
for their effect on upset stomach. In general, an independence of attributes test will count
one set of attributes (upset stomach or not) among 2 or more populations (ibuprofen vs control).
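As an illustration of the test chosen for problem 1, the Pearson chi-square statistic for the ibuprofen table can be computed by hand in Python. This is a minimal sketch and not necessarily the exact procedure used in the course: it applies no continuity correction, the function name is an illustrative choice, and the p-value uses the fact that for 1 degree of freedom the chi-square upper tail equals erfc(sqrt(χ²/2)).

```python
from math import erfc, sqrt


def chi_square_independence_2x2(table):
    """Pearson chi-square statistic and p-value (df = 1) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column attributes are independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Rows: upset stomach, no upset stomach; columns: control, ibuprofen
counts = [[8, 6], [664, 645]]
chi2, p_value = chi_square_independence_2x2(counts)
print(f"chi-square = {chi2:.4f}, p-value = {p_value:.3f}")
```

The statistic is small relative to the usual 1-degree-of-freedom critical values, so these counts give little evidence of a difference in upset-stomach rates between the two groups.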
2.) Goodness of Fit Test.
Here the comparison is between an observed count and an expected count. If the sex
distribution of the Carolina Junco is evenly distributed we would expect to see an even count
of both sexes. These expected counts are then compared against the observed counts of each
sex. The even “expected count” was implied in this example but a Goodness of Fit test can also
specify that the expected count should be based on percentages or ratios. For example, “The
ornithologist suspects that 70% of the Junco population will be male.” Or: “The ornithologist
suspects that the male Junco population will outnumber the females by a ratio of 2:1.”
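The comparison of observed and expected counts described above can be worked through for the Junco data. The sketch below computes the usual goodness of fit statistic, the sum of (observed - expected)² / expected, with 1 degree of freedom for two categories. The p-value formula erfc(sqrt(χ²/2)) holds only for 1 degree of freedom, and the statistic itself is non-directional.

```python
from math import erfc, sqrt

observed = {"male": 40, "female": 14}   # captured birds
expected = {"male": 27, "female": 27}   # even split of the 54 birds

# Goodness of fit statistic: sum over categories of (O - E)^2 / E
chi2 = sum((observed[sex] - expected[sex]) ** 2 / expected[sex] for sex in observed)

# df = number of categories - 1 = 1, so the upper tail is erfc(sqrt(chi2 / 2))
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}, p-value = {p_value:.5f}")
```

A different expectation, such as the 70% male example in the text, would only change the `expected` counts (37.8 males and 16.2 females out of 54).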
3.) Independence of Attributes.
The researchers of the experiment compare 2 populations of treatments (cut vs cut & burn)
for their effect upon the regeneration of the shrub. In general, an independence of
attributes test will count one set of attributes (regenerate or not) among 2 or more
populations (cut vs cut & burn).
4.) McNemar’s Test
The researchers of the experiment compare the responses of itchy skin among the paired
populations of brother and sister. The point of doing this is to have as similar skin types as
possible to enable a good comparison of male and female reactions to the ointment without
having a dissimilar genetic skin type reaction confound the results. Note how both the column
headers and row descriptors of the 2x2 table describe an attribute (brother vs sister) and
result (itchy skin or not) simultaneously. We use McNemar’s Test whenever we want to discern a
difference between categorically paired data sets. Is itchy skin more likely for one sex vs
another?
5.) McNemar’s Test
The researchers of the experiment compare the responses of reaction times among the paired
populations of students before and after ingesting caffeine. The point of doing this is to
have as similar metabolisms as possible to enable a good comparison of before and after
reaction times to caffeine without having dissimilar metabolisms confound the results. Note
how both the column headers and row descriptors of the 2x2 table describe an attribute (with
vs without caffeine) and result (success vs failure) simultaneously. We use a McNemar’s test
whenever we want to discern a difference between categorically paired data sets. Is reaction
time quickened by caffeine? We pair before and after reaction times with a McNemar’s test to
find out.
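For paired categorical data such as the caffeine table, one common form of McNemar’s statistic uses only the two discordant cells, the students whose result changed between the two conditions. The sketch below uses the form (b - c)² / (b + c) without a continuity correction; the course may use a different variant, and the function name is an illustrative choice. The cell labels follow the table as read from the worksheet, and swapping the two discordant cells does not change the statistic.

```python
from math import erfc, sqrt


def mcnemar(b, c):
    """McNemar chi-square (no continuity correction) and p-value with df = 1."""
    chi2 = (b - c) ** 2 / (b + c)
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Discordant pairs from the caffeine table
success_then_failure = 3    # success without caffeine, failure with caffeine
failure_then_success = 18   # failure without caffeine, success with caffeine

chi2, p_value = mcnemar(success_then_failure, failure_then_success)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.4f}")
```

The 14 students who succeeded both times and the 27 who failed both times do not enter the statistic, because they carry no information about a change between conditions.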
6.) McNemar’s Test
The researchers of the experiment compare the presence of the shrub, Xerospirea hartwegiana,
among the paired populations of transects taken before and after a wildfire. The point of
doing this is to have as similar transects as possible to enable a good comparison of before
and after transect lines without having dissimilar soil types or micro-climates confound the
results. Note how both the column headers and row descriptors of the 2x2 table describe an
attribute (before vs after wildfire) and result (present vs absent) simultaneously. We use a
McNemar’s test whenever we want to discern a difference between categorically paired data
sets. Is the presence of the shrub significantly different after a fire? We pair before and
after plant presence transects with a McNemar’s test to find out.
7.)
Consider this Example:
The strength of anti-bacterial soap was challenged by a company that sells bar soap. The
company claims that bar soaps work just as well as the anti-bacterial soap in minimizing the
bacteria left on one’s hands after washing. A comparison with 20 volunteers was made in which
each person washed their hands, dug in a compost heap with bare hands for 5 minutes, and then
washed their hands with the anti-bacterial soap. Each person set their washed hand in an agar
dish and the resulting bacteria count was made in 24 hours. The same 20 people repeated the
process with bar soap hand washing. Each went back to the compost heap and dug bare handed for
another 5 minutes in the compost bin, this time washed their hand with the company’s bar soap,
and again placed the washed hand in an agar dish, and a bacteria count was made 24 hours later.
A threshold of 300 CFU (colony forming units) was used as the standard of a “clean” hand. Do
the resulting bacteria counts support the bar-soap company’s claim? Is this paired data?

                       “Clean Counts” < 300 CFU    “Dirty Counts” > 300 CFU
Anti-Bacterial Soap    15                          5
Bar Soap               12                          8
Revisited: Distinguishing Paired vs Unpaired Categorical Data
Paired Categorical Data will have the 2 treatments (Anti-Bacterial Soap vs Bar Soap) separated
into row and column responses of “Clean” and “Dirty” counts. Each cell represents the particular
response of a treatment from both categories of treatments. 20 people placed their hands in the
compost pile while using anti-bacterial soap, and then repeated the experience while using bar
soap. If the two resulting bacteria counts are compared for each individual, then we have paired
categorical data. Repeating the point for emphasis: Susan’s bacteria count with anti-bacterial
soap is compared with Susan’s bacteria count with bar soap hand washing, John’s bacteria count
with anti-bacterial soap is compared with John’s bacteria count with bar soap hand washing, and
so on, so that each person’s 2 bacteria counts are compared against each other. For example, in
the first cell n1,1 we see that 10 people had “Clean” bacteria counts while using Anti-Bacterial
Soap, and these same 10 people had “Clean” bacteria counts while using Bar Soap.

Anti-Bacterial Soap ↓       Bar Soap “Clean Counts” < 300 CFU    Bar Soap “Dirty Counts” > 300 CFU
“Clean Counts” < 300 CFU    10                                   3
“Dirty Counts” > 300 CFU    6                                    1

Circle the correct Chi-Square test to apply:
1) Goodness of Fit Test
2) Independence of Attributes
3) McNemar’s Test
Unpaired Categorical Data will have rows relegated to treatments, while columns will hold
responses. Each cell represents the particular response of a treatment from only one category of
treatment. For example, in the first cell n1,1 we see that 15 people had “Clean” bacteria counts
while using Anti-Bacterial Soap. Those 15 people are not referenced again under a second
treatment nor a second, paired response.

                       “Clean Counts” < 300 CFU    “Dirty Counts” > 300 CFU
Anti-Bacterial Soap    15                          5
Bar Soap               12                          8

Circle the correct Chi-Square test to apply:
1) Goodness of Fit Test
2) Independence of Attributes
3) McNemar’s Test
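The practical difference between the two layouts shows up in which cells each test uses. The sketch below applies a McNemar statistic (no continuity correction) to the paired table and a Pearson chi-square statistic to the unpaired table. The function names are illustrative, neither calculation is claimed to be the course's exact procedure, and the cell layout of the paired table is an assumption read from the worksheet; transposing it would not change the McNemar statistic.

```python
def mcnemar_chi2(paired):
    """McNemar statistic: uses only the two discordant (off-diagonal) cells."""
    b, c = paired[0][1], paired[1][0]
    return (b - c) ** 2 / (b + c)


def independence_chi2(table):
    """Pearson chi-square statistic: uses every cell and its expected count."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(table))
        for j in range(len(table[0]))
    )


# Paired layout: rows = result with anti-bacterial soap, columns = result with bar soap
paired = [[10, 3], [6, 1]]
# Unpaired layout: rows = soap type, columns = clean / dirty counts
unpaired = [[15, 5], [12, 8]]

print(f"McNemar chi-square (paired table):        {mcnemar_chi2(paired):.3f}")
print(f"Independence chi-square (unpaired table): {independence_chi2(unpaired):.3f}")
```

In the paired table only the 3 + 6 = 9 people whose result changed between soaps contribute to the statistic, whereas in the unpaired table all four cells contribute.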