21-f11-bgunderson-iln-differenceinmeanspart1

Author(s): Brenda Gunderson, Ph.D., 2011
License: Unless otherwise noted, this material is made available under the
terms of the Creative Commons Attribution–Non-commercial–Share
Alike 3.0 License: http://creativecommons.org/licenses/by-nc-sa/3.0/
We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your
ability to use, share, and adapt it. The citation key on the following slide provides information about how you
may share and adapt this material.
Copyright holders of content included in this material should contact open.michigan@umich.edu with any
questions, corrections, or clarification regarding the use of content.
For more information about how to cite these materials visit http://open.umich.edu/education/about/terms-of-use.
Any medical information in this material is intended to inform and educate and is not a tool for self-diagnosis
or a replacement for medical evaluation, advice, diagnosis or treatment by a healthcare professional. Please
speak to your physician if you have questions about your medical condition.
Viewer discretion is advised: Some medical content is graphic and may not be suitable for all viewers.
Some material sourced from:
Mind on Statistics
Utts/Heckard, 3rd Edition, Duxbury, 2006
Text Only: ISBN 0495667161
Bundled version: ISBN 1111978301
Material from this publication used with permission.
Attribution Key
for more information see: http://open.umich.edu/wiki/AttributionPolicy
Use + Share + Adapt
{ Content the copyright holder, author, or law permits you to use, share and adapt. }
Public Domain – Government: Works that are produced by the U.S. Government. (17 USC §
105)
Public Domain – Expired: Works that are no longer protected due to an expired copyright term.
Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.
Creative Commons – Zero Waiver
Creative Commons – Attribution License
Creative Commons – Attribution Share Alike License
Creative Commons – Attribution Noncommercial License
Creative Commons – Attribution Noncommercial Share Alike License
GNU – Free Documentation License
Make Your Own Assessment
{ Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright. }
Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (17 USC § 102(b)) *laws in
your jurisdiction may differ
{ Content Open.Michigan has used under a Fair Use determination. }
Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act. (17 USC § 107) *laws in your
jurisdiction may differ
Our determination DOES NOT mean that all uses of this 3rd-party content are Fair Uses and we DO NOT guarantee that
your use of the content is Fair.
To use this content you should do your own independent analysis to determine whether or not your use will be Fair.
Stat 250 Gunderson Lecture Notes
Learning about the Difference in Population Means
Part 1: Distribution for a Difference in Sample Means
Chapter 9: Section 8, SD Module 5
The Independent Samples Scenario
Recall that two samples are said to be independent samples when the measurements in one
sample are not related to the measurements in the other sample. Independent samples are
generated in a variety of ways. Some common ways:
• Random samples are taken separately from two populations and the same response
variable is recorded for each individual.
• One random sample is taken and a variable is recorded for each individual, but then
units are categorized as belonging to one population or another, e.g. male/female.
• Participants are randomly assigned to one of two treatment conditions, such as diet or
exercise, and the same response variable, such as weight loss, is recorded for each
individual unit.
If the response variable is quantitative, a researcher might compare two independent groups by
looking at the difference between the two means. Here is an example of an independent
samples scenario from page 319 of your text.
From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012.
Used with permission.
You can review page 338 for the short discussion and example regarding the difference in two
population means.
9.8 SD Module 5:
Sampling Distribution for the
Difference in Two Sample Means
Who are the speed demons? (Example 9.10)
A survey of college students included the question: “What is the fastest you have ever driven a
car? _____ mph”. Data for 87 males and 102 females responding to this question resulted in a
mean speed of 107 mph for males and a mean speed of 88 mph for females.
A Typical Summary of Responses for a Two Independent Samples Problem:
Population     Sample Size    Sample Mean (mph)    Sample Standard Deviation (mph)
1 (males)      $n_1 = 87$     $\bar{x}_1 = 107$    $s_1 = 17$
2 (females)    $n_2 = 102$    $\bar{x}_2 = 88$     $s_2 = 14$
Let $\mu_1$ be the population mean fastest speed for the male college student population.
Let $\mu_2$ be the population mean fastest speed for the female college student population.
We want to learn about $\mu_1$ and $\mu_2$ and how they compare to each other. We could estimate
the difference in population means $\mu_1 - \mu_2$ with the difference in the sample means $\bar{x}_1 - \bar{x}_2$.
Will it be a good estimate? Can anyone say how close this observed difference in sample means
$\bar{x}_1 - \bar{x}_2$ of 19 mph is to the true difference in population means $\mu_1 - \mu_2$? _________
If we were to repeat this survey (with samples of the same sizes), would we get the same value
for the difference in sample means? ___________________
Is a difference in the sample means of 19 mph large enough to convince us that there is a real
difference in the means for the populations of students?
So what are the possible values for the difference in sample means $\bar{x}_1 - \bar{x}_2$ if we took many sets
of independent random samples of the same sizes from these two populations? What would
the distribution of the possible $\bar{x}_1 - \bar{x}_2$ values look like?
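One way to get a feel for these questions is to simulate the repeated sampling. The sketch below is only an illustration: it assumes both populations are normal, and the population means and standard deviations passed in (105, 17, 90, 14) are hypothetical values chosen for the demonstration, not facts about the student populations. The function name `simulate_differences` is likewise made up for this example.

```python
import random
import statistics

def simulate_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps=2000, seed=0):
    """Return `reps` simulated values of (sample mean 1 - sample mean 2)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    diffs = []
    for _ in range(reps):
        # Draw two independent random samples, one from each population
        sample1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        sample2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        diffs.append(statistics.fmean(sample1) - statistics.fmean(sample2))
    return diffs

# ASSUMPTION: hypothetical population values, used only for illustration
diffs = simulate_differences(mu1=105, sigma1=17, n1=87,
                             mu2=90, sigma2=14, n2=102)

print("center of simulated differences:", round(statistics.fmean(diffs), 2))
print("spread of simulated differences:", round(statistics.stdev(diffs), 2))
```

Each run of the loop plays the role of "repeating the survey", so the list `diffs` approximates the distribution of possible differences in sample means under the assumed populations.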
What can we say about the distribution of the difference in two sample means?
Using results from Section 8.8 (differences of independent random variables) and Section 9.6
(sampling distribution for a sample mean), the sampling distribution of the difference in two
sample means $\bar{x}_1 - \bar{x}_2$ can be determined.
From Section 8.8 we learned that the mean of the difference is just the difference in the two
means, and that the variance of the difference of two independent random variables is the sum
of the variances.
From Section 9.6 we saw that the standard deviation of a sample mean is $\sigma / \sqrt{n}$.
We will apply these ideas to our newest statistic of interest, the difference in two sample
means $\bar{x}_1 - \bar{x}_2$.
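Written out, the facts from Sections 8.8 and 9.6 combine as follows. This is a sketch of the reasoning, using $\sigma_1$ and $\sigma_2$ for the two population standard deviations and assuming the two samples are independent:

```latex
\begin{align*}
E(\bar{x}_1 - \bar{x}_2) &= E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 \\
\mathrm{Var}(\bar{x}_1 - \bar{x}_2) &= \mathrm{Var}(\bar{x}_1) + \mathrm{Var}(\bar{x}_2)
  = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \\
\mathrm{s.d.}(\bar{x}_1 - \bar{x}_2) &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
\end{align*}
```

Note that the variances add even though the means are subtracted.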
Sampling Distribution of the Difference in Two (Independent) Sample Means
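If the two populations are normally distributed (or sample sizes are both large enough), then
$\bar{x}_1 - \bar{x}_2$ is (approximately) normal with mean $\mu_1 - \mu_2$ and standard deviation
$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.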
Since the population standard deviations $\sigma_1$ and $\sigma_2$ are generally not known, we will use the
data to compute the standard error of the difference in sample means.
Standard Error of the Difference in Sample Means
s.e.$(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
where $s_1$ and $s_2$ are the two sample standard deviations
The standard error of $\bar{x}_1 - \bar{x}_2$ estimates, roughly, the average distance of the possible
$\bar{x}_1 - \bar{x}_2$ values from $\mu_1 - \mu_2$. The possible $\bar{x}_1 - \bar{x}_2$ values result from considering all possible
independent random samples of the same sizes from the same two populations.
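For the speed example, the standard error can be computed directly from the summary table. The sketch below applies the standard error formula for two independent samples; the helper name `se_diff_means` is made up for this illustration.

```python
import math

def se_diff_means(s1, n1, s2, n2):
    """Standard error of (x-bar1 - x-bar2) for two independent samples."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Summary statistics from the speed example: males (1), females (2)
se = se_diff_means(s1=17, n1=87, s2=14, n2=102)
print(round(se, 2))  # 2.29
```

So the possible differences in sample means are, roughly, about 2.3 mph away from the true difference on average.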
From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012.
Used with permission.
Moreover, we can use this standard error to produce a range of values that we are very
confident will contain the difference in the population means $\mu_1 - \mu_2$, namely,
$\bar{x}_1 - \bar{x}_2 \pm (\text{a few}) \times \text{s.e.}(\bar{x}_1 - \bar{x}_2)$. This is the basis for the confidence interval for the difference in
population means discussed in Chapter 11.
Looking ahead:
Do you think the ‘few’ in the above expression will be a z* value or a t* value?
What do you think will be the degrees of freedom?
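As a rough numerical illustration of "sample estimate ± (a few) standard errors" for the speed example, the sketch below uses a multiplier of 2. That value is an assumption made only for illustration; the exact multiplier is the topic of Chapter 11.

```python
import math

# Summary statistics from the speed example
n1, xbar1, s1 = 87, 107, 17    # males
n2, xbar2, s2 = 102, 88, 14    # females

diff = xbar1 - xbar2                        # observed difference: 19 mph
se = math.sqrt(s1**2 / n1 + s2**2 / n2)     # standard error of the difference

multiplier = 2  # ASSUMPTION: a stand-in for "a few", not the exact z* or t* value
lower = diff - multiplier * se
upper = diff + multiplier * se
print(round(lower, 2), round(upper, 2))  # 14.42 23.58
```

Under this rough multiplier, the range runs from about 14.4 mph to about 23.6 mph.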
We will use the standard error of the difference in the sample means to compute a
standardized test statistic for testing hypotheses about the difference in the population means
$\mu_1 - \mu_2$, namely,
$\dfrac{\text{Sample statistic} - \text{Null value}}{\text{(Null) standard error}}$.
This is the basis for testing covered in Chapter 13.
Looking ahead:
Do you think the standardized test statistic will be a z statistic or a t statistic?
What do you think will be the most common null value used? $H_0$: $\mu_1 - \mu_2$ = _______
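The standardized statistic can be sketched for the speed example as follows. The null value of 0 used here is a hypothetical choice for illustration (it corresponds to "no difference between the population means"), and the function name `standardized_statistic` is made up for this example.

```python
import math

def standardized_statistic(sample_stat, null_value, std_error):
    """(Sample statistic - Null value) / (Null) standard error."""
    return (sample_stat - null_value) / std_error

# Standard error of the difference, from the speed example summary table
se = math.sqrt(17**2 / 87 + 14**2 / 102)

# ASSUMPTION: null value of 0, chosen only for illustration
stat = standardized_statistic(sample_stat=107 - 88, null_value=0, std_error=se)
print(round(stat, 2))  # 8.3
```

Under that assumed null value, the observed difference of 19 mph sits roughly 8 standard errors away from the null value.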
Additional Notes
A place to … jot down questions you may have and ask during office hours, take a few extra notes, write
out an extra practice problem or summary completed in lecture, create your own short summary about
this chapter.