Chapter 12 Notes filled in

AP STATISTICS
Ch. 12: Sample Surveys
Vocabulary:
POPULATION- All experimental units that you want to draw a conclusion about
- Does not necessarily have to be a large group
SAMPLING FRAME- list of individuals from whom the sample is drawn. Not always the population of interest.
Ex: phone book, registered voter list, list of tax returns, school roster, etc.
SAMPLE – small group of the population that you use in your experiment/study/survey.
- Hopefully representative of the population
PARAMETER- describes a population. Often unknown. Fixed value.
Ex: population mean
STATISTIC – describes a sample of the population. Changes from sample to sample. We use the statistics from
repeated samples to estimate the value of the parameter.
Ex: sample mean
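To see the difference between a parameter and a statistic, here is a minimal Python sketch. Everything in it is made up for illustration: the population of 10,000 values, the sample size of 40, and the names population, mu, and sample_means. The parameter is computed once from the whole population; the statistic is recomputed for each new sample.

```python
import random
import statistics

rng = random.Random(12)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up measurements.
population = [rng.gauss(50, 10) for _ in range(10_000)]

# PARAMETER: describes the whole population, one fixed value.
mu = statistics.mean(population)

# STATISTIC: describes one sample, so it changes from sample to sample.
sample_means = [statistics.mean(rng.sample(population, 40)) for _ in range(5)]

print(f"parameter mu = {mu:.2f}")
for i, x_bar in enumerate(sample_means, start=1):
    print(f"sample {i}: x-bar = {x_bar:.2f}")
```

Each run of the loop gives a slightly different x-bar, while mu never changes. In a real study mu is usually unknown, which is why the sample values are used to estimate it.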
VALUE         PARAMETER     STATISTIC
________________________________________________
Mean          μ             x̄
Std. Dev.     σ             s_x
Proportion    p             p̂
________________________________________________
A sample is said to be representative (or unbiased) if the statistics accurately reflect the population parameter.
EXAMPLE 1:
A polling agency takes a sample of 1500 American citizens from a list of tax returns and asks them if they are
lactose intolerant. 12% say yes. This is interesting, since it has been shown that 15% of the population is
lactose intolerant.
12% = statistic
15% = parameter
Population? Sampling frame? Sample?
Population: All American citizens
Sampling Frame: List of tax returns
Sample: 1500 American citizens
Parameter of Interest: The percent of American citizens that are lactose intolerant
EXAMPLE 2:
A random sample of 1000 people who signed a card saying they intended to quit smoking were contacted a year after
they signed the card. It turned out that 210 (21%) of the sampled individuals had not smoked over the past six
months.
21% = statistic
Population = All smokers
Sampling frame= People who signed a card saying they intended to quit smoking
Sample = 1000 people who signed the card
Parameter of interest = Percent of people who had not smoked over the past six months (quit smoking)
EXAMPLE 3:
On Tuesday, the bottles of tomato ketchup filled in a plant were supposed to contain an average of 14 ounces of
ketchup. Quality control inspectors sampled 50 bottles at random from the day’s production. These bottles contained
an average of 13.8 ounces of ketchup.
14 = parameter
13.8 = statistic
Population? Sample? Sampling frame?
Population: All ketchup bottles
Sampling Frame: Ketchup bottles from one day’s production
Sample: 50 bottles
Parameter: The average number of ounces of ketchup in the bottles
EXAMPLE 4:
A researcher wants to find out which of two pain relievers works better. He takes 100 volunteers and randomly gives
half of them medicine #1 and the other half medicine #2. 17% of people taking medicine #1 report improvement in
their pain and 20% of people taking medicine #2 report improvement in their pain.
17% = statistic
20% = statistic
Population? Sampling frame? Sample?
Population: All people
Sampling Frame: All Volunteers
Sample: 100 volunteers, 50 taking medicine #1 and 50 taking medicine #2
Parameter: Percent of people who showed improvement in their pain
BIAS VS. VARIABILITY
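BIAS – a consistent tendency for repeated measurements to miss the population parameter in the same direction. Basically about accuracy: high bias means low accuracy.
VARIABILITY – how spread out repeated measurements are (doesn’t matter if they are accurate or not). Basically about reliability: low variability means consistent measurements.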

To reduce bias… use random sampling

To reduce variability … use larger samples
SAMPLING VARIABILITY –
- Different samples give different results (even if they are from the same population)
- Samples of different sizes vary by different amounts
- Bigger samples are better: their results vary less from sample to sample
SAMPLING DISTRIBUTION- The distribution of a statistic over lots of samples of the same size, usually shown as a histogram of the sample results
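[Figure: histogram of sample results with the true parameter marked]
Larger samples yield smaller variability:
[Figure: two histograms, one for lots of samples of size 100 and one for lots of samples of size 1000, each with the true parameter marked]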
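The claim that larger samples yield smaller variability can be checked with a small simulation. This is only a sketch: the true proportion of 0.15 is borrowed from Example 1 as an assumption, and the helper names sample_proportion and spread_of_p_hat are made up here.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
TRUE_P = 0.15           # assumed true proportion (the parameter)

def sample_proportion(n):
    """p-hat for one simulated random sample of size n."""
    return sum(rng.random() < TRUE_P for _ in range(n)) / n

def spread_of_p_hat(n, repeats=500):
    """Standard deviation of p-hat across many samples of the same size."""
    return statistics.stdev([sample_proportion(n) for _ in range(repeats)])

spread_100 = spread_of_p_hat(100)
spread_1000 = spread_of_p_hat(1000)
print(f"samples of size 100:  spread of p-hat = {spread_100:.4f}")
print(f"samples of size 1000: spread of p-hat = {spread_1000:.4f}")
```

Both sets of samples center near the same true parameter; the samples of size 1000 simply cluster more tightly around it.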
Bias vs. Variability:
[Figure: four sampling distributions, each with the true parameter marked, labeled:
high bias, low variability
low bias, high variability
high bias, high variability
low bias, low variability]
BIAS VS. VARIABILITY EXAMPLES:
* bias -> accuracy
* variability -> reliability
[Figure: four example panels labeled:
high bias, low variability
low bias, high variability
high bias, high variability
low bias, low variability]
UNBIASED ESTIMATOR- A statistic whose sampling distribution (histogram) is centered at the true parameter
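A rough way to check whether an estimator is unbiased is to simulate its sampling distribution and compare the center to the true parameter. The sketch below assumes a made-up population built from two groups; sampling from the whole population is compared with sampling from only one group. The names group_a, group_b, and center_and_spread are hypothetical.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population made of two groups with different typical values.
group_a = [rng.gauss(60, 5) for _ in range(5_000)]
group_b = [rng.gauss(70, 5) for _ in range(5_000)]
population = group_a + group_b
true_mean = statistics.mean(population)  # the parameter

def center_and_spread(frame, n, repeats=1_000):
    """Center and spread of x-bar over many random samples of size n drawn from frame."""
    means = [statistics.mean(rng.sample(frame, n)) for _ in range(repeats)]
    return statistics.mean(means), statistics.stdev(means)

# Random samples from the whole population vs. samples that leave out group A.
srs_center, srs_spread = center_and_spread(population, 25)
biased_center, biased_spread = center_and_spread(group_b, 25)

print(f"true mean                = {true_mean:.2f}")
print(f"SRS from everyone:  center = {srs_center:.2f}, spread = {srs_spread:.2f}")
print(f"group B only:       center = {biased_center:.2f}, spread = {biased_spread:.2f}")
```

The samples drawn from everyone center on the true mean (low bias). The samples drawn from group B only are tightly clustered but centered in the wrong place, which is the high bias, low variability case.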
SAMPLING DESIGNS
GOOD SAMPLING DESIGNS
1) Simple Random Sample (SRS)- Every experimental unit has the same chance of being picked for the sample
and every possible sample has the same chance of being picked.
- give every subject in the population a number
- use a table of random digits (TRD) and read across to select your sample
- ignore repeats and any numbers that were not assigned
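The steps above can be sketched in Python. The row of digits below is made up for illustration and is NOT a line from any actual table of random digits; the function name srs_from_digits is also hypothetical. Subjects are assumed to be numbered 01 to 18 with two-digit labels.

```python
def srs_from_digits(digits, labels, n):
    """Read two-digit groups across a row of random digits and keep the first
    n labels that are in range and not repeats."""
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = digits[i:i + 2]
        if label in labels and label not in chosen:  # skip unassigned numbers and repeats
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

labels = {f"{k:02d}" for k in range(1, 19)}  # subjects numbered 01 through 18
row = "73051 10560 94807 03318 55416 08092 17745 60213"  # hypothetical digits

chosen = srs_from_digits(row, labels, 5)
print(chosen)  # ['05', '11', '18', '17', '02']
```

Reading across in pairs gives 73, 05, 11, 05, 60, and so on. 73 and 60 are skipped because no subject has that number, and the second 05 is skipped because it is a repeat.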
Example: Take an SRS of 5 from the following list. Start at line 31 in the table.
01 Smith       07 Jones       13 Holloway
02 DeNizzo     08 David       14 Adams
03 Schaefer    09 Gray        15 Capito
04 Meyers      10 Gingrich    16 Card
05 Dietrich    11 Moreland    17 Hall
06 Walsh       12 Whitter     18 Jordan
Sample:
Example: Take an SRS of 4 from the following list of math teachers. Start at line 18.
01 McGlone     07 McCuen        13 Wilson
02 Szarko      08 Bellavance    14 Woodring
03 Stotler     09 Kelly         15 Wheeles
04 Timmins     10 Arden         16 McNelis
05 Gemgnani    11 O’Brien       17 Robinson
06 Lorenz      12 Lake          18 Bainbridge
Sample:
2) STRATIFIED RANDOM SAMPLE (not an SRS)
- Divide the population into groups with something in common (called STRATA)
  o Example: gender, age, etc.
- Take a separate SRS in each stratum and combine these to make the full sample
  o Can sometimes be a % of each stratum
Example: We want to take an accurate sample of CB South students. There are 540 sophomores, 585 juniors,
and 530 seniors. Take a stratified sample.
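One possible way to work this example is to take the same percent of each stratum (proportional allocation). The sketch below assumes a total sample of 100 students, which the example does not specify, and uses made-up student IDs in place of a real roster.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

strata_sizes = {"sophomores": 540, "juniors": 585, "seniors": 530}  # from the example
total = sum(strata_sizes.values())  # 1655 students
TOTAL_SAMPLE = 100                  # assumed overall sample size

stratified_sample = {}
for grade, size in strata_sizes.items():
    roster = [f"{grade[:2]}-{i:03d}" for i in range(1, size + 1)]  # hypothetical IDs
    n_from_stratum = round(size / total * TOTAL_SAMPLE)  # same percent of each stratum
    stratified_sample[grade] = rng.sample(roster, n_from_stratum)  # separate SRS per stratum

for grade, students in stratified_sample.items():
    print(grade, len(students))
```

With a total of 100, this takes 33 sophomores, 35 juniors, and 32 seniors, so each grade makes up about the same share of the sample as it does of the school.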
3) SYSTEMATIC RANDOM SAMPLE - The first experimental unit is selected at random, and each additional experimental unit is selected at a predetermined interval.
Examples:
- Surveying every 5th person that comes through the back door at CB South
- Selecting a random person and then every 10th person on a list
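A systematic sample from a list can be sketched as follows. The list of 100 people and the function name systematic_sample are made up for illustration; the random start is assumed to fall among the first 10 people so that every person has a chance of being picked.

```python
import random

def systematic_sample(units, interval, rng):
    """Pick a random start among the first `interval` units, then take every
    interval-th unit after it."""
    start = rng.randrange(interval)
    return units[start::interval]

rng = random.Random(5)  # fixed seed so the sketch is repeatable
people = [f"person {i}" for i in range(1, 101)]  # hypothetical list of 100 people

sample = systematic_sample(people, 10, rng)
print(sample)
```

Only the starting point is random. Once it is chosen, the rest of the sample is fixed by the interval.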
4) CLUSTER SAMPLE –
Population is broken down into groups. All members of one (or more) group are taken as the sample.
Example: The CB South population is broken down into sophomores, juniors, and seniors. We then sample ALL
of the members of one grade level.
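A minimal sketch of this cluster sample, using made-up student IDs and the grade sizes from the stratified example. Notice the contrast with a stratified sample: here one whole group is chosen at random, and nobody from the other groups is sampled.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical rosters; the sizes come from the stratified example.
clusters = {
    "sophomores": [f"so-{i:03d}" for i in range(1, 541)],
    "juniors": [f"ju-{i:03d}" for i in range(1, 586)],
    "seniors": [f"se-{i:03d}" for i in range(1, 531)],
}

chosen_grade = rng.choice(sorted(clusters))  # pick one cluster at random
cluster_sample = clusters[chosen_grade]      # take ALL members of that cluster

print(chosen_grade, len(cluster_sample))
```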
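5) MULTI-STAGE SAMPLE - Used for large populations. Sampling is done in stages, with each stage choosing smaller groups from within the groups chosen in the stage before.
Example: sampling the population of the USA:
- Counties (pick 50)
- zip codes
- two streets
- 3 houses
CB South Example:
- Block
- 10 classes
- 5 students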
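The CB South example can be sketched in stages. The notes only list the stages, so the details here are assumptions: one block is picked at random, then 10 classes from that block, then 5 students from each class. The school layout (4 blocks, 30 classes per block, 25 students per class) is entirely made up.

```python
import random

rng = random.Random(9)  # fixed seed so the sketch is repeatable

# Hypothetical school: 4 blocks, 30 classes per block, 25 students per class.
school = {
    f"block {b}": {
        f"class {b}-{c}": [f"student {b}-{c}-{s}" for s in range(1, 26)]
        for c in range(1, 31)
    }
    for b in range(1, 5)
}

block = rng.choice(sorted(school))               # stage 1: pick one block
classes = rng.sample(sorted(school[block]), 10)  # stage 2: pick 10 classes in that block
sample = [
    student
    for cls in classes
    for student in rng.sample(school[block][cls], 5)  # stage 3: 5 students per class
]

print(block, len(classes), len(sample))
```

Under these assumptions the final sample has 10 x 5 = 50 students, and a full list of students is only needed for the 10 classes that were chosen.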
BIASED SAMPLING METHODS:
1) VOLUNTARY RESPONSE SAMPLES - The sample chooses itself by responding to a general appeal. Ex: call-in, write-in, etc.
2) CONVENIENCE SAMPLES - Selecting the individuals that are easiest to reach/contact
TYPES OF BIAS IN A SAMPLE:

UNDERCOVERAGE - Sampling in a way that leaves out a certain portion of the population that should be in your sample.
Ex: telephone polls, registered voter list, etc.

NONRESPONSE - Bias introduced when a large number of those sampled do not respond. (You were selected, and
you chose not to respond)
Ex: Don’t answer the phone or hang up, don’t mail back a questionnaire, refuse to answer questions

RESPONSE BIAS - Anything in the survey design that influences responses
Ex:
- respondents lying
- responses to try to please the interviewer
- unwillingness to reveal personal facts or information
- leading or confusing questions

VOLUNTARY RESPONSE BIAS - People are most eager to volunteer when they have a strong opinion on the matter, so strong opinions are overrepresented in the sample
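A small simulation can show how voluntary response distorts a result. All numbers here are assumptions chosen for illustration: a population split evenly on a proposal, with opponents (who feel strongly) responding to a call-in poll five times as often as supporters.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical population: exactly half support the proposal (True), half oppose it (False).
population = [True] * 5_000 + [False] * 5_000
true_p = sum(population) / len(population)  # parameter: 0.5

# Assumed response rates for a call-in poll: opponents call in far more often.
RESPONSE_RATE = {True: 0.05, False: 0.25}

volunteers = [opinion for opinion in population if rng.random() < RESPONSE_RATE[opinion]]
p_hat_voluntary = sum(volunteers) / len(volunteers)

srs = rng.sample(population, 500)  # for comparison: an SRS of 500
p_hat_srs = sum(srs) / len(srs)

print(f"true proportion in favor      = {true_p:.2f}")
print(f"voluntary response estimate   = {p_hat_voluntary:.2f}")
print(f"simple random sample estimate = {p_hat_srs:.2f}")
```

The call-in poll gets plenty of responses, but its estimate lands far below the true 50% because the people who chose to respond are not representative. A bigger voluntary response sample would not fix this; it would only reduce variability, not bias.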