Document 10684130

advertisement
Problems: T-TESTS
1. Two sets o f 60 randomly s e l e c t e d 5 t h grade c h i l d r e n were taught s p e l l i n g
according t o one o f two methods. One method was t h e o l d one t h a t had been
used f o r t h e l a s t twenty years. The o t h e r was a new, computer a s s i s t e d
technique. You wish t o know i f t h e new method i s an improvement over t h e o l d
one. A t t h e end o f t h e school year, 40 o f t h e 60 c h i l d r e n who used t h e new
method passed a vocabulary t e s t , whereas o n l y 24 o f t h e 60 c h i l d r e n who used
t h e o l d method passed t h e t e s t . (Note: Use a method o t h e r than c h i - s q u a r e t o
answer t h i s question.)
a. S t a t e t h e a p p r o p r i a t e n u l l and a l t e r n a t i v e hypotheses.
b. What i s your r e j e c t i o n r u l e ?
(Use t h e .O1 s i g n i f i c a n c e l e v e l .)
c. S t a t e your conclusions i n words
2. The Women's Youth C r i s i s Centers (WYCCs) i n Minneapolis are r u n e n t i r e l y by
women v o l u n t e e r s from l o c a l r e l i g i o u s groups. These women founded t h e WYCCs
i n 1985 w i t h t h e express goal o f "reducing t h e pregnancy r a t e among teenagers
i n our h i g h schools." (Based on h i g h school h e a l t h records i n 1985, t h e
pregnancy r a t e was such t h a t one out o f every twenty teenage women i n
Minneapolis h i g h schools had been pregnant p r e v i o u s l y o r became pregnant
d u r i n g t h a t year.) A t t h e founding o f t h e WYCCs t h e r e was considerable
disagreement among t h e "founding mothers" as t o whether r e l i g i o u s terminology
(e.g., references t o God, prayer, etc.) should be used by c o u n c i l o r s d u r i n g
t h e i r "pregnancy-preventative counseling." Unable t o r e s o l v e t h i s
disagreement, they s e t up two counseling centers: t h e R e l i g i o u s Center (RC),
where r e l i g i o u s terminology was used d u r i n g counseling, and t h e Secular Center
(SC), where no such terminology was used. To keep t r a c k o f t h e c e n t e r s '
performance, h i g h school h e a l t h records o f teenaged v i s i t o r s t o e i t h e r c e n t e r
were obtained a f t e r they graduated from h i g h school. I n 1992 1000 teenage
women had been counseled a t t h e WYCCs (300 a t t h e R e l i g i o u s Center and 700 a t
t h e Secular Center) and had subsequently graduated from h i g h school. O f t h e
300 teenage v i s i t o r s t o t h e R e l i g i o u s Center, 12 became pregnant a f t e r t h e i r
i n i t i a l v i s i t t o t h e RC; o f t h e 700 teenage v i s i t o r s t o t h e Secular Center, 21
became pregnant a f t e r t h e i r i n i t i a l v i s i t t o t h e SC.
a. Do t h e data suggest t h a t r e l i g i o u s terminology c o n t r i b u t e s t o t h e
e f f e c t i v e n e s s o f pregnancy-preventative counseling? (Use the .05 l e v e l o f
s i g n i f i c a n c e and show your work.)
b. Using t h e 1985 pregnancy r a t e among teenage women i n Minneapolis h i g h
schools as your p o i n t o f comparison, do your data p r o v i d e evidence t h a t
women v i s i t o r s t o the WYCCs are l e s s l i k e l y than t h i s t o become pregnant
a f t e r t h e i r i n i t i a l v i s i t t o one o f the centers? ( H i n t s : T h i s question
does n o t r e q u i r e t h a t you d i s t i n g u i s h between t h e two c e n t e r s i n any way.
As a r e s u l t , you w i l l want t o combine your data on t h e R e l i g i o u s and
Secular Centers. And once again, use a
.05 and show your work.)
-
c. C a l c u l a t e the P-value o f your f i n d i n g i n p a r t b. S t a t e i n words what
t h i s P-value means. ( H i n t : This statement should be phrased something
1 i k e , "The P-value o f
i s the p r o b a b i l i t y that .
.")
.
1G
d. A colleague challenges your conclusion i n p a r t b, by showing you t h a t
f i v e percent o f teenage women i n Minneapolis h i g h schools had been pregnant
p r e v i o u s l y o r became pregnant d u r i n q 1992. (I.e., t h e pregnancy r a t e i n
1992 i s t h e same as i t was i n 1985.) J1! a sentence o r two, how might you
e x p l a i n t h e seeming inconsistency between t h e P-value ( c a l c u l a t e d i n p a r t
c ) t h a t suggests a decrease i n t h e pregnancy r a t e and t h e t r u t h ( j u s t made
known t o you by your colleague) t h a t t h e r e has been no change i n t h e
pregnancy r a t e ?
3. I n t h e 1984 NORC survey, respondents were asked i f they were " a f r a i d t o
walk a t n i g h t i n t h e i r neighborhood." (Note: Do not use chi-square t o answer
t h i s question.)
a. Construct a 95% confidence i n t e r v a l f o r t h e d i f f e r e n c e i n p r o p o r t i o n s
between males and females who are a f r a i d t o walk i n t h e i r neighborhoods a t
night. Interpret the interval.
b. Use t h e same data t o t e s t t h e n u l l hypothesis t h a t t h e two p r o p o r t i o n s
are equal.
c. You have now used two techniques (confidence i n t e r v a l and hypothesis
t e s t ) t o compare t h e p r o p o r t i o n s o f males vs. females who are a f r a i d t o
walk i n t h e i r neighborhoods a t n i g h t . Compare these two techniques by
e x p l a i n i n g why t h e n u l l hypothesis i s o n l y r e j e c t e d a t t h e .05 l e v e l o f
s i g n i f i c a n c e when t h e 95% confidence i n t e r v a l does n o t c o n t a i n t h e value o f
zero.
You w i l l need t h e f o l l o w i n g o n e - l i n e SPSS program t o g e t t h e d a t a t o do t h i s
problem:
crosstabs t a b l e s = f e a r by sex.
4. During t h e l a s t 35 years Isaacs Counseling Services (ICS) has t r a i n e d
thousands o f s o c i a l workers f o r employment i n w e l f a r e bureaus throughout t h e
U n i t e d States. The board o f d i r e c t o r s a t I C S has r e c e n t l y demanded t h a t t h e
company evaluate t h e e f f e c t i v e n e s s w i t h which t h e i r t r a i n i n g i n s t i l l s
a l t r u i s t i c values i n those whom they t r a i n . You are i n charge o f making t h i s
e v a l u a t i o n . A f t e r t r y i n g various methods o f e v a l u a t i n g a l t r u i s m , you decide
t h a t t h e b e s t a l t r u i s m measure r e q u i r e s t h a t t h e respondent choose between
a l t r u i s t i c and n o n a l t r u i s t i c a l t e r n a t i v e s i n a r e a l - l i f e s i t u a t i o n . Here i s
t h e s i t u a t i o n t h a t you s e t up: Each person-being-evaluated i s l e f t i n a
w a i t i n g room w i t h another person. A f t e r a few minutes o f w a i t i n g , t h e " o t h e r
person" slumps i n h i s c h a i r , then s l o w l y s l i d e s down o f f t h e c h a i r and onto
t h e f l o o r (presumably) unconscious.
When faced w i t h t h i s s i t u a t i o n , 35 o u t o f a random sample o f 42 ICS graduates
t o o k t h e a l t r u i s t i c a l t e r n a t i v e , and t r i e d t o r e v i v e ( o r otherwise h e l p ) t h e
o t h e r person. I n c o n t r a s t , 26 o u t o f a random sample o f 39 s o c i a l workers who
were n o t graduates o f t h e I C S program took t h i s same a l t r u i s t i c a l t e r n a t i v e .
( H i n t : Do n o t make t h e mistake o f t r y i n g t o use a c h i - s q u a r e t e s t i n answering
t h i s question.)
a. Give the null and alternative hypotheses that you would use to test whether ICS graduates are more altruistic than other social workers? b. What rejection rule would you use in a test of the hypotheses in part a? (Use the .05 significance level.) c. Do you have evidence at the .05 significance level that ICS graduates are more altruistic than other social workers? Explain your answer. d. Assume that with your altruism measure, precisely 10% more ICS graduates than other social workers are altruistic. Given this fact, how much power did you have as you tested the hypotheses given in part a? 5. There are two competing theories of old-age-despair. Global Neglect Theory argues that old people's despair will be lessened if others (who could be relatives of the old people) do kind things for them. Relative Neglect Theory argues that old people's despair will increase if they believe that relatives are neglecting them. Professor I. Hatemom (an outspoken proponent of relative neglect theory) warns, "Nursing homes should be hesitant to make old people's lives too comfortable. Kindness by non-relatives (such as the nursing staff) will only increase the despair of those whose families already neglect them." You remain uncommitted to either theory and design an experiment to test whether either theory is true. You obtain visitation records from 1000 Iowa nursing homes and randomly sample one old resident from each--ensuring only that the resident is in good health and that the resident has had no visitors for at least 5 years. The 1000 residents are randomly divided into two equal groups. The 500 residents in one group receive a bouquet of flowers ("Complements of the State of Iowa") each month for the two years of your study; the other 500 residents receive no flowers. At the end of two years you find that 30 o f the residents, who received flowers, had committed suicide and 15 of the residents, who had not received flowers had committed suicide. (NOTE: You should not use chi-square in answering this problem.) a. Taking suicide as your measure of despair, give the appropriate null and a1 ternative hypotheses. b. Using the .05 level of significance, state your rejection rule. c. Which theory do the data support? Justify your answer in the light of your findings. (Hint: What two pieces of information are needed to justify your answer?) d. Imagine that among healthy Iowan nursing home residents (who have had no visitors for at least 5 years) the probability that a bouquet-per-month recipient commits suicide is .02 greater than the probability that a non-flower-recipient commits suicide. What is the power o f the hypothesis test that you have performed in the previous parts of this problem? (Be sure to use the significance level and variance estimates from previous parts of this problem.) 6. You a r e doing research on worker morale i n two companies. About two y e a r s
ago one o f t h e companies i n t r o d u c e d a p o l i c y according t o which workers who
show " c r e a t i v i t y " i n t h e i r jobs r e c e i v e h i g h e r pay increases than n o n c r e a t i v e
workers. The purpose o f t h i s p o l i c y was t o p r o v i d e workers w i t h a sense t h a t
rewards f o r t h e i r work were a l l o c a t e d f a i r l y . Pay increases f o r workers i n
both companies were announced l a s t month. A f t e r these announcements you asked
t h e f o l l o w i n g question t o two randomly sampled groups o f workers (one from
each company) :
On a s c a l e from 1 t o 10, where 1 i s t h e l e a s t f a i r o f companies and 10
i s t h e most f a i r o f companies, how f a i r do you t h i n k your company i s i n
a l l o c a t i n g rewards t o i t s workers?
(Note: For t h e purposes o f t h i s a n a l y s i s , assume t h a t responses t o t h e
" f a i r n e s s question" y i e l d a 1 0 - p o i n t i n t e r v a l - l e v e l measure.) Means and
standard d e v i a t i o n s on t h i s measure are as f o l l o w s :
Com~any
C r e a t i v i t y rewarded
C r e a t i v i t y n o t rewarded
Standard
Deviation
4.8
6.7
Mean
7.1
4.6
Sample
Size
21
25
a. Do you have evidence a t t h e .05 s i g n i f i c a n c e l e v e l t h a t t h e p o l i c y o f
rewarding c r e a t i v i t y achieved i t s purpose? (Use t h e above d a t a t o j u s t i f y
y o u r answer.)
b. I f t h e company t h a t rewarded c r e a t i v i t y r e a l l y d i d have workers w i t h a
2 - p o i n t h i g h e r average score on t h e f a i r n e s s scale than d i d workers i n t h e
o t h e r company, what i s t h e p r o b a b i l i t y o f making a Type I 1 e r r o r i n an
a n a l y s i s o f a d i f f e r e n t p a i r o f samples ( w i t h t h e same standard d e v i a t i o n s
and sample s i z e s ) than those analyzed i n p a r t a? (Use t h e .05 s i g n i f i c a n c e
l e v e l and show your work.)
7. You design an experiment t o t e s t whether p h y s i c a l c o n t a c t increases
m o t i v a t i o n . You randomly d i v i d e a group o f 6 t h grade c h i l d r e n i n t o two groups
o f t w e n t y - f i v e c h i l d r e n each. C h i l d r e n from each group are i n d i v i d u a l l y asked
t o perform a task. Each c h i l d ' s performance i s then assigned a m o t i v a t i o n
score from 1 t o 100 p o i n t s (by a panel o f judges), according t o how
e n e r g e t i c a l l y t h e c h i l d accomplished t h e t a s k . (High scores correspond t o
h i g h m o t i v a t i o n . ) The two groups a r e t r e a t e d i d e n t i c a l l y , except t h a t f o r
each o f t h e c h i l d r e n i n one group, t h e experimenter touches t h e c h i l d ' s
shoulder j u s t b e f o r e i t begins t h e task. C h i l d r e n from t h e o t h e r group are
n o t touched. Your data on t h e c h i l d r e n ' s m o t i v a t i o n scores a r e as f o l l o w s :
Group Mean
Standard D e v i a t i o n
Touched
35.89
87.48
Not touched 32.62 52.60 a. What a r e your n u l l and a l t e r n a t i v e hypotheses?
b. What do you conclude? I.e., do your data suggest t h a t p h y s i c a l c o n t a c t
increases m o t i v a t i o n a t t h e .05 l e v e l o f s i g n i f i c a n c e ? ( J u s t i f y y o u r
answer. )
c. For j u s t t h i s p a r t o f the question, imagine t h a t you do n o t know t h e
m o t i v a t i o n scores f o r t h e two group means. Imagine a l s o t h a t you know t h a t
when 6 t h graders are touched, they have m o t i v a t i o n scores f i v e p o i n t s
h i g h e r on t h e average t h a n 6 t h graders who are n o t touched. What i s t h e
power o f your t e s t o f t h e hypotheses l i s t e d i n p a r t "a."? (Use t h e .05
l e v e l o f s i g n i f i c a n c e and t h e above standard d e v i a t i o n measures.)
8. You wish t o determine whether telephone i n t e r v i e w e r s make fewer e r r o r s when
they use CATI (Computer-Aided Telephone I n t e r v i e w i n g ) techniques i n s t e a d o f
t r a d i t i o n a l noncomputer-aided telephone i n t e r v i e w i n g techniques. You h i r e 50
i n t e r v i e w e r s . You randomly s e l e c t h a l f o f them and t r a i n these 25 i n CATI
techniques. The o t h e r 25 i n t e r v i e w e r s r e c e i v e t r a d i t i o n a l t r a i n i n g as
telephone i n t e r v i e w e r s .
You do n o t t e l l your i n t e r v i e w e r s t h a t thev are t h e s u b j e c t s o f y o u r research.
Unknown t o t h e i n t e r v i e w e r s , they each i n t e r v i e w t h e same 10 people, who are
s h i l l s i n t h e experiment. (That i s , each o f these 10 s h i l l s i s a c t u a l l y
someone cooperating w i t h you i n your research.) Each s h i l l g i v e s t h e same 100
answers on h i s / h e r i n t e r v i e w ; each s h i l l g i v e s answers t h a t d i f f e r from those
given by t h e o t h e r 9 s h i l l s . Thus each i n t e r v i e w e r should r e c o r d t h e same 100
answers as d a t a from each o f t h e 10 i n t e r v i e w s . Note t h a t i f an i n t e r v i e w e r
c o r r e c t l y recorded each answer from each subject, t h e i n t e r v i e w e r would have a
t o t a l o f 1,000 c o r r e c t answers. I f no answers were c o r r e c t l y recorded by an
i n t e r v i e w e r , t h e i n t e r v i e w e r would have a "correctness score" o f zero.
I n t e r v i e w s are tape recorded t o assure t h a t s h i l l s g i v e t h e 100 answers they
a r e supposed t o g i v e . Your d a t a are as f o l l o w s :
Table 1: Mean Number o f Items C o r r e c t l y Recorded w i t h Variance and Sample
Size f o r I n t e r v i e w e r s Who D i d versus those Who D i d Not Use CATI
I n t e r v i e w i n g Techniques.
Mean
CAT1 I n t e r v i e w e r s
Other I n t e r v i e w e r s
990
978
Standard d e v i a t i o n
10.2
16.8
S a m ~ l es i z e
25
25
a. S t a t e t h e a p p r o p r i a t e n u l l and a l t e r n a t i v e hypotheses
b. What i s your r e j e c t i o n r u l e ?
(Use t h e .05 l e v e l o f s i g n i f i c a n c e . )
c. What do you conclude? I.e., do your d a t a suggest t h a t CATI systems
decrease i n t e r v i e w e r e r r o r s ? ( J u s t i f y your answer.)
9. You wish t o determine i f women who work o u t s i d e t h e home have worse
marriages than women who work a t home as homemakers. From a l i s t o f a l l Iowa
r e s i d e n t s you randomly sample 250 people. Each o f these people i s contacted
t o determine t h e i r gender, m a r i t a l s t a t u s , and working hours. A f t e r
d i s c a r d i n g a l l males, s i n g l e people, and p a r t - t i m e workers, you a r r i v e a t your
f i n a l sample o f 25 married women who have f u l l - t i m e j o b s o u t s i d e t h e i r homes
and 61 married women who work a t home as homemakers. Each o f these 86 women
completes a q u e s t i o n n a i r e which i n c l u d e s a s e r i e s o f items ( t h e renowned
Roberts M a r i t a l Qua1i t y Index) t h a t assess t h e q u a l i t y o f t h e respondent's
marriage. Scores on t h e Roberts M a r i t a l Q u a l i t y Index range from 0 = an
awful marriage t o 100 = a marriage made i n heaven ( t h a t i s , a "wonderful"
marriage).
follows:
Means and variances on t h e Roberts M a r i t a l Q u a l i t y Index are as
Work o u t s i d e home
Work a t home
Variance 12.8 3.9 79.8
a. What a r e y o u r n u l l and a l t e r n a t i v e hypotheses?
b. What would be y o u r r e j e c t i o n r u l e ?
throughout .)
(Use t h e .05 s i g n i f i c a n c e l e v e l
c . What do you conclude? I.e., what do your d a t a suggest about t h e
hypothesized r e l a t i o n between Iowa women's m a r i t a l q u a l i t y and t h e i r
working s t a t u s ?
d. Assume t h a t women who work o u t s i d e t h e home do i n f a c t have worse
marriages t h a n women who work a t home. I n p a r t i c u l a r , working women have
Roberts M a r i t a l Q u a l i t y Index scores t h a t a r e 3 p o i n t s l o w e r on average
t h a n t h e scores o f women who work a t home as homemakers. Assuming t h e
v a r i a n c e e s t i m a t e d e r i v e d i n p a r t b, what would be t h e power o f y o u r
hypothesis t e s t ?
Download