Uploaded by hisilicon.dfr

AP Stats Notes

Census
- Collects data from every individual in the population
Population
- The entire group of individuals about which we want info in a statistical study
Sample
- Part of the population from which we actually collect info
- Use info to draw conclusions/make inferences about the entire population
- Inference: drawing conclusions about a population on the basis of sample data
Creating a Sample Survey
1) Define population of interest
- Sampling frame: list of individuals from which the sample is actually drawn
2) Determine what variable you want to measure
3) Decide how to choose a representative sample
Bias
- Using a method that will consistently overestimate/underestimate the value that you want to know
- The design of a study systematically favors certain outcomes (can stem from flaws in the design or data collection, not necessarily a personal bias)
- Exam tip: indicate the direction of the bias and explain why
Biased Sampling Methods
Convenience Sample
- Choosing individuals who are easiest to reach
- Bias → often results in sample of like-minded people
Voluntary Response Sample
- Choosing individuals who voluntarily respond to a general appeal/invitation
- Bias → people w/strong opinions (often in same direction) are most likely to respond
- Internet, write-in, and call-in opinion polls
Personal Choice
- Creates bias
- Interviewers choose in a convenience sample (CS); individuals choose in a voluntary response sample (VRS)
- Combat this problem by relying on chance (random = due to chance)
Random Sampling
- The use of chance to select a sample
- Central principle of statistical sampling
Simple Random Sample (SRS)
- An SRS of n individuals → chosen from the population so that every set of n individuals has an
equal chance of being selected
Choosing an SRS
- Label: Assign a numerical label to every individual in a population
- Use Table D to select random numbers
- Table D: table of random digits
- Or generate random integers on a calculator/use the hat method
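The label-then-draw procedure above can be sketched in Python. This is only an illustration: the population list, the seed, and the function name `choose_srs` are hypothetical, and `random.sample` stands in for Table D or a calculator.

```python
import random

def choose_srs(population, n, seed=None):
    """Return a simple random sample of n individuals (without replacement)."""
    rng = random.Random(seed)
    # Label: give every individual a numerical label 1..N.
    labels = list(range(1, len(population) + 1))
    # Randomize: draw n distinct labels; every set of n labels is equally likely.
    chosen = rng.sample(labels, n)
    # Select: look up the individuals that carry the chosen labels.
    return [population[label - 1] for label in chosen]

# Hypothetical population of 30 students.
students = [f"student_{i}" for i in range(1, 31)]
sample = choose_srs(students, 5, seed=1)
print(sample)
```

Sampling without replacement matters here: a repeated label would be skipped when reading Table D, and `random.sample` never repeats one.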
Stratified Random Sample
- Involves sampling important groups (strata) separately → combined to form one stratified random sample
- Ex. dividing HS by grade level, dividing districts by income level, etc.
1) Classify population into groups of similar individuals (strata)
2) Select an SRS from each stratum
3) Combine to form the full sample
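The three steps above might look like this in Python. It is a rough sketch under assumptions: the school data, the helper name `stratified_sample`, and the choice of an equal sample size per stratum are all made up for illustration.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Take an SRS of n_per_stratum individuals from every stratum and combine them."""
    rng = random.Random(seed)
    # 1) Classify the population into strata.
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    # 2) Select an SRS from each stratum, and 3) combine into one sample.
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# Hypothetical high school: 40 students in each of grades 9-12, stored as (grade, id).
school = [(grade, i) for grade in (9, 10, 11, 12) for i in range(40)]
strat = stratified_sample(school, stratum_of=lambda s: s[0], n_per_stratum=5, seed=2)
print(strat)
```

Every grade is guaranteed to appear in the sample, which a single SRS of 20 students would not guarantee.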
Cluster Sample
- Often used for convenience (easier than an SRS when the population is large or spread out)
- Clusters: ideally mirror the characteristics of the population and contain variety
1) Classify population into clusters (groups near each other)
2) Select an SRS of clusters (choose a couple of entire clusters out of all of them)
3) Combine selected clusters into a sample
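A comparable sketch for cluster sampling, again with invented data: the homerooms and the function name `cluster_sample` are hypothetical. The difference from stratified sampling is that chance picks whole clusters, and everyone in a chosen cluster is included.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick an SRS of whole clusters and include every individual in them."""
    rng = random.Random(seed)
    # 2) Select an SRS of cluster names (the population is already classified into clusters).
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    # 3) Combine all members of the chosen clusters into one sample.
    sample = []
    for name in chosen_names:
        sample.extend(clusters[name])
    return chosen_names, sample

# Hypothetical school with 10 homerooms of 25 students each.
homerooms = {
    f"room_{r}": [f"room_{r}_student_{i}" for i in range(25)]
    for r in range(1, 11)
}
chosen_rooms, cluster = cluster_sample(homerooms, 2, seed=3)
print(chosen_rooms, len(cluster))
```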
Multistage Sample
- Two or more methods combined
Margin of Error
- Sets bounds on the size of the likely error
- Tells us how much sampling variability to expect
- Results from random samples fall within
Errors
- Errors in sample surveys can introduce bias
- Two main sources: sampling errors and nonsampling errors
Sampling Errors
- Use of bad sampling methods
- Undercoverage: when some groups in the population are left out of the process of choosing the sample
- Sampling frame should list all individuals in the population (not often available)
- Ex. calling landline telephone numbers → excludes people w/only a cell phone or no phone at all; visiting households → excludes students in dorms, etc.
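Nonsampling Errors
- Errors unrelated to how the sample was chosen (nonresponse, response bias, question wording)
- Nonresponse: occurs when an individual chosen for the sample can’t be contacted or refuses to participate
- Can only occur after a sample has been selected
- Different from voluntary response (individuals were already selected)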
Response Bias
- Systematic pattern of incorrect responses
- People falsely tell interviewers that they voted bc it’s a social expectation
- Race/gender of interviewer can affect responses
- Forcing respondents to recall past events can lead to inaccurate info
Wording of Questions
- Confusing/leading questions can introduce strong bias and change outcomes
- Order of questions matters too
Observational Study
- Observes individuals and measures variables of interest but doesn’t attempt to influence responses
- Compare groups, examine relationships between variables, describe groups/situations
Experiment
- Deliberately imposes treatment on individuals to measure their responses
- Determine whether the treatment causes a change in the response
- Can help understand cause and effect
Confounding/Lurking Variable
- Variable that isn’t explanatory/response but can influence the RV
Confounding
- When two variables are associated in a way that their effects on a RV can’t be distinguished from
one another
- Observational studies of the effect of one V on another often fail bc of confounding
Treatment
- Specific condition applied to individuals
- Can be a combination of Vs if there are multiple EVs
Experimental Units
- Smallest collection of individuals to which treatments are applied
- Subjects: when EUs are humans
Factors
- EVs in an experiment
- When studying the joint effects of factors, each treatment is formed by combining a specific value
(level) of each factor
EV vs. Treatment
- Combinations of levels of EVs/factors form treatments
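To make the factor/level/treatment relationship concrete, here is a small Python sketch. The two factors and their levels are invented for illustration; only the idea that each treatment is one combination of levels comes from the notes.

```python
from itertools import product

# Hypothetical experiment: two factors (EVs), each with its own levels.
factors = {
    "fertilizer": ["none", "low", "high"],
    "watering": ["daily", "weekly"],
}

# Each treatment combines one level of every factor.
treatments = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for t in treatments:
    print(t)
```

With 3 levels of one factor and 2 of the other, there are 3 × 2 = 6 treatments.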
Random Assignment
- Experimental units are assigned to treatments at random
- Solution to problem of bias
Comparative Experimental Design
- Compares two or more treatments
- Random when EUs are assigned to treatments by chance
Completely Randomized Design
- Treatments are assigned to all the EUs by chance
- Happens after participants have been selected
- Can compare any number of treatments
- Group sizes can come out unequal if each EU is assigned independently (ex. by coin flip)
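One possible way to carry out a completely randomized design is to shuffle the EUs and deal them out to the treatments in turn. The sketch below assumes that approach; the unit labels, treatment names, and the function name `completely_randomized` are hypothetical.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Assign every experimental unit to a treatment purely by chance."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance decides the order
    groups = {t: [] for t in treatments}
    # Deal the shuffled units out in turn so group sizes differ by at most one.
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical: 12 subjects labeled 1-12, three treatments.
groups = completely_randomized(range(1, 13), ["A", "B", "placebo"], seed=4)
print(groups)
```

Shuffling first and then dealing is a design choice: it keeps the groups balanced in size while still leaving the assignment of every unit to chance.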
Control Group
- Provides a baseline for comparing the effects of other treatments
Principles of Experimental Design
1) Comparison
- 2 or more treatments that you can compare the results from
2) Control
- Control for confounding variables that might affect responses
- Use comparative design and ensure that the only systematic difference between the groups is the treatment administered
- Compare to a placebo
- Treatment with natural option for EV
3) Random Assignment
- Use chance to assign individuals to treatment groups
- Helps reduce effects of confounding variables that you can’t control
- Forms groups of EUs that should be similar before the treatments are applied
- As a result, differences should be due to treatment or chance
4) Replication
- Using enough EUs to distinguish a difference in the effects of the treatments from chance
variation
Placebo Effect
- Response to a dummy treatment
Double-Blind
- Neither subjects nor those who interact with them/measure the RV know which treatment a
subject received
Single-Blind
- Either the subjects or those who interact with them/measure the RV don’t know which treatment a subject received, but not both
- Sometimes, it isn’t possible for subjects to not know
Statistically Significant
- An observed effect so large that it would rarely occur by chance
- A statistically significant association in data from a well-designed experiment implies causation
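"Would rarely occur by chance" can be checked by asking how often random assignment alone would produce a difference as large as the one observed. The sketch below uses an exact randomization test for that purpose; this technique, the made-up response values, and the function name `randomization_p_value` are assumptions for illustration, not something the notes prescribe.

```python
from itertools import combinations

def randomization_p_value(group_a, group_b):
    """One-sided exact p-value: share of possible random assignments whose
    difference in means (a minus b) is at least as large as the observed one."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    total = sum(pooled)
    as_extreme = 0
    count = 0
    # Enumerate every way chance could have split the units into the two groups.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            as_extreme += 1
        count += 1
    return as_extreme / count

# Hypothetical responses from 4 treated and 4 control units.
treatment = [10, 11, 12, 13]
control = [1, 2, 3, 4]
p = randomization_p_value(treatment, control)
print(p)
```

Here only 1 of the 70 possible assignments gives a difference as large as the observed one, so the proportion is 1/70 (about 0.014). Full enumeration is only practical for small groups; larger experiments would need a random subset of assignments instead.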
After participants have been selected:
Completely Randomized Design
- See above
Block
- A group of EUs that are known before the experiment to be similar in some way that is expected
to affect the response to the treatments
- Formed based on important and unavoidable sources of variability (confounding variables)
Randomized Block Design
- Random assignment of EUs to treatments is carried out separately within each block
- Control what you can, block what you can’t, and randomize to create comparable groups
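A randomized block design repeats the random assignment separately inside each block. The following sketch assumes hypothetical field plots blocked by soil type; the data and the function name `randomized_block` are invented for illustration.

```python
import random

def randomized_block(units, block_of, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    # Form blocks of units that are similar in a way expected to affect the response.
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for name in sorted(blocks):
        members = blocks[name][:]
        rng.shuffle(members)  # chance operates only inside this block
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical: 6 clay plots and 6 sand plots, two treatments.
plots = [(soil, i) for soil in ("clay", "sand") for i in range(6)]
plan = randomized_block(plots, block_of=lambda p: p[0],
                        treatments=["new", "standard"], seed=5)
print(plan)
```

Because each soil type gets both treatments in equal numbers, soil type cannot be confused with the treatment effect.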
Matched Pairs Design
- A type of randomized block design
- Create blocks by matching pairs of similar EUs
- Use chance to decide which member of a pair gets the first treatment
- The other member gets the second treatment
- Random assignment of subjects to treatments is done within each matched pair
- Sometimes, a “pair” consists of just one EU that gets both treatments, one after the other
- EU serves as own control → order of treatments can influence response, so the order should also be decided by chance
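The coin-flip step within each pair can be sketched as follows. The pairs of names, the treatment labels, and the function name `matched_pairs` are hypothetical.

```python
import random

def matched_pairs(pairs, treatments=("first", "second"), seed=None):
    """Within each matched pair, use chance to decide who gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for a, b in pairs:
        # Virtual coin flip: swap the pair half of the time.
        if rng.random() < 0.5:
            a, b = b, a
        assignment[a] = treatments[0]
        assignment[b] = treatments[1]  # the other member gets the other treatment
    return assignment

# Hypothetical pairs of similar subjects.
twins = [("Ana", "Bea"), ("Cal", "Dev"), ("Eli", "Fay")]
pair_plan = matched_pairs(twins, seed=6)
print(pair_plan)
```

When a "pair" is a single EU receiving both treatments, the same coin flip could decide which treatment comes first.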
Establishing Cause & Effect
- Well-designed experiment w/randomized treatments and statistically significant results
- EV occurs before RV
- Correlation between EV and RV