• Population: The overall group to which the research findings are intended to apply
• Sample: Any subset of the population, whether or not randomly drawn
• “Member” or “element” of the population: An individual unit of the population; we call it a “case” once it’s been drawn in a sample
• Sampling frame: A listing of all “elements” or members of the population
• Why do we sample?
– To describe characteristics of a population when measuring each member is too expensive or impractical
– To test hypotheses using inferential (probability) statistics - the kind used in our articles
• Statistic
– A mathematical depiction of any characteristic that can be numerically measured; say, the mean (arithmetic average)
– A mathematical way to describe the relationship between two or more characteristics; say, correlation (r statistic)
• Summary statistic: An overall measure; say, the mean
• Population parameter: A statistic of the population; e.g., the mean
• Sample statistic: The same measure, computed from a sample
• Sampling error: Unintended differences between a population parameter and an equivalent statistic from an unbiased sample
– Inevitable result of sampling
– Try it out in class! Calculate the parameter, mean age. Then take a random sample (more about that later) and compare the parameter to the sample statistic.
– Any difference between the two is “sampling error.” It tends to shrink as the size of the sample increases
• Rule of thumb
– To minimize sampling error, sample size should be at least 30 for populations of about 500; for larger populations sample size should be larger
Our class, as the “population”
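The class exercise above can also be tried on a computer. The sketch below is only an illustration: the 40 ages are invented, and the helper names (`sampling_error`, `average_abs_error`) are hypothetical. It draws many simple random samples of different sizes and reports the average size of the gap between the sample mean and the population mean.

```python
import random
import statistics

def sampling_error(population, sample_size, rng):
    """Sample mean minus population mean for one simple random sample."""
    sample = rng.sample(population, sample_size)  # drawn without replacement
    return statistics.mean(sample) - statistics.mean(population)

def average_abs_error(population, sample_size, trials, rng):
    """Average size of the sampling error over many repeated samples."""
    errors = [abs(sampling_error(population, sample_size, rng))
              for _ in range(trials)]
    return statistics.mean(errors)

rng = random.Random(1)  # fixed seed so a run can be repeated

# Hypothetical ages for a "class" of 40 students (invented for illustration).
class_ages = [rng.randint(19, 45) for _ in range(40)]

for n in (5, 10, 20, 40):
    print(n, round(average_abs_error(class_ages, n, 2000, rng), 3))
```

When the sample is the whole class (n = 40), the error is exactly zero, because the "sample" and the population are the same cases.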
• Samples should accurately reflect, or represent, the population from which they are drawn
– If a sample is representative, then we can generalize (apply, infer) our findings from the sample to the population
– Inference means to make a conclusion about something from its components. That’s why we’re studying “inferential statistics” - to make inferences about populations.
– Warning: We can generalize only to the population from which the sample was drawn, never to other populations
• Probability sampling: Each element or “case” in the population has a known, nonzero chance of being selected
– “Gold standard” to which all sampling techniques aspire
– In simple random sampling, every element’s probability of being selected is the same
– Simple random sampling is the most common way to obtain a probability sample
Simple random sampling: every element has an equal chance of being selected
Data from Sheba Wachtel, warden
Population: 200 inmates
Mean sentence: 2.94 years
Purpose of simple random sampling is to obtain a sample for further study that adequately represents the population.
Draw a sample of 30 and compare the population parameter and sample statistic.
How much error is there?
• “Strata” are layers – like the layers of a cake
• Purpose of stratified sampling is to compare the statistics of subgroups
– Do violent offenders draw longer sentences than property offenders?
• Can designate strata before or after sampling
Jay’s inmates
Property crimes: 150 (mean sentence: 2.88 years)
Violent crimes: 50 (mean sentence: 3.12 years)
Variables
• Type of crime
• Sentence length
Strata
• Property offenders
• Violent offenders
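The inmate exercise (draw 30 of 200, compare parameter and statistic, then split the sample into strata) might be sketched as follows. The individual sentence lengths are not given in these slides, so the code makes them up; its means will not match the 2.94, 2.88 and 3.12 figures above. The names `mean_sentence` and `by_stratum` are hypothetical helpers.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so a run can be repeated

# Hypothetical population: 150 property and 50 violent offenders, with
# invented sentence lengths in years (NOT the warden's actual data).
population = (
    [("property", round(rng.uniform(1.0, 5.0), 1)) for _ in range(150)]
    + [("violent", round(rng.uniform(1.0, 5.5), 1)) for _ in range(50)]
)

def mean_sentence(cases):
    """Mean sentence length for a list of (crime type, sentence) cases."""
    return statistics.mean([sentence for _, sentence in cases])

def by_stratum(cases):
    """Group cases by type of crime (designating strata after sampling)."""
    groups = {}
    for crime, sentence in cases:
        groups.setdefault(crime, []).append((crime, sentence))
    return groups

sample = rng.sample(population, 30)  # simple random sample, no replacement

print("parameter:", round(mean_sentence(population), 2))
print("statistic:", round(mean_sentence(sample), 2))
print("sampling error:",
      round(mean_sentence(sample) - mean_sentence(population), 2))

# Compare the subgroups that happened to land in the sample.
for crime, cases in by_stratum(sample).items():
    print(crime, len(cases), round(mean_sentence(cases), 2))
```

Because the sample is drawn without regard to strata, the violent-offender subgroup will usually be small, which makes its mean less stable.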
• With or without replacement?
– With replacement: Return each case to the population before drawing the next
• Keeps the probability of being drawn the same
• Makes it possible to redraw the same case
– Without replacement: Drawn cases are not returned to the population
• Probability of undrawn cases being selected increases as cases are drawn
– In social science research sampling without replacement is by far the most common
• Most sampling frames are sufficiently large so that as elements are drawn changes in the probability of being drawn are small
• Proportionate or disproportionate? (when concerned with strata)
– Proportionate: Draw a sample from the population without regard to strata, so each stratum shows up in roughly its population proportion
• If strata differ substantially in size, many cases may have to be drawn to get a satisfactory sample size in the smaller stratum
– Disproportionate: Stratify first, then draw samples of desired size from each
• Disproportionate is most frequently used. But there is a “catch”…
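To put numbers on the with/without replacement point above, here is a small arithmetic sketch. The function name `chance_on_draw` is hypothetical; the population of 200 echoes the examples in these slides, and the population of 20,000 is an assumed "large frame" for contrast.

```python
from fractions import Fraction

def chance_on_draw(population_size, draw_number, replacement):
    """Chance that one particular not-yet-drawn case is picked on a given draw."""
    if replacement:
        # Every case goes back, so the chance never changes.
        return Fraction(1, population_size)
    # Without replacement, earlier draws have shrunk the pool.
    remaining = population_size - (draw_number - 1)
    return Fraction(1, remaining)

for size in (200, 20000):
    first = chance_on_draw(size, 1, replacement=False)
    thirtieth = chance_on_draw(size, 30, replacement=False)
    print(size, float(first), float(thirtieth))
```

By the 30th draw from 200 cases, 171 cases remain, so the chance has moved from 1/200 to 1/171. With 20,000 cases the shift is far smaller, which is why sampling without replacement causes little trouble in large frames.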
Stratified proportionate random sampling
Hypothesis: Gender affects cynicism (two-tailed)
Male cops are more cynical than female cops (one-tailed)
Compare average cynicism scores
Stratified disproportionate random sampling
Compare average cynicism scores
Note: don’t recombine these into a single sample!
Is there more likely to be a personal relationship between suspects and victims in violent crimes or in crimes against property?
You have full access to crime data for “Sin City” in 2009. These statistics show there were 200 crimes, of which 75 percent were property crimes and 25 percent were violent crimes. For each crime, you know whether the victim and the suspect were acquainted (yes/no).
1. Identify the population.
2. How would you sample?
3. Would you stratify? How?
4. Use proportionate and disproportionate techniques. Which is better? Why?
Stratified proportionate random sampling
Research question: Is there more likely to be a personal relationship between suspect and victim in violent crimes or in crimes against property?
(In a sample of 30, expect 7.5 violent crimes, 25%, and 22.5 property crimes, 75%)
Compare proportions of these cases where suspects knew the victim
Stratified disproportionate random sampling
Compare proportions within each where suspect and victim were acquainted
(Note: cannot combine results)
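One way to see both points, the expected stratum counts and the warning about combining, is the sketch below. The stratum sizes (150 property, 50 violent) follow from the “Sin City” figures; the acquaintance rates are invented purely for illustration, and the function names are hypothetical.

```python
def expected_counts(stratum_sizes, sample_size):
    """Expected cases per stratum when sampling without regard to strata."""
    total = sum(stratum_sizes.values())
    return {name: sample_size * size / total
            for name, size in stratum_sizes.items()}

def pooled_proportion(stratum_props, weights):
    """Average of the stratum proportions, weighted by the given counts."""
    total = sum(weights.values())
    return sum(stratum_props[s] * weights[s] for s in stratum_props) / total

strata = {"property": 150, "violent": 50}  # 75% and 25% of 200 crimes
print(expected_counts(strata, 30))         # proportionate sample of 30

# Hypothetical share of cases where suspect and victim were acquainted.
acquainted = {"property": 0.20, "violent": 0.60}

# Weighted by each stratum's real share of the population.
print(pooled_proportion(acquainted, strata))
# Naively recombining a disproportionate sample of 15 + 15 cases.
print(pooled_proportion(acquainted, {"property": 15, "violent": 15}))
```

The naive recombination counts violent crimes as half of all crimes, when they are really a quarter, so the combined figure overstates how often suspect and victim know each other. Comparing the two strata side by side avoids the problem.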
Making cops “kinder” and “gentler”
The Anywhere Police Department has 200 patrol officers, of which 150 are males and 50 are females. Chief Jay wants to test a program that’s supposed to reduce officer cynicism.
Hypothesis: Officers who endure the training program will be less cynical
Dependent variable: Score on cynicism scale (1-5, low to high)
Independent variable: Cynicism reduction program (yes/no)
Stratified disproportionate random sampling
Does the training program reduce officer cynicism?
Population: 200 patrol officers
150 males (75%)
• CONTROL GROUP: randomly assign 25 officers
• EXPERIMENTAL GROUP: randomly assign 25 officers
50 females (25%)
• EXPERIMENTAL GROUP: randomly assign 25 officers
• CONTROL GROUP: randomly assign 25 officers
For each group, pre-measure dependent variable officer cynicism
Apply the intervention (adjust the value of the independent variable, Jay’s program): NO for each control group, YES for each experimental group
For each group, post-measure dependent variable officer cynicism
Also compare within-group changes – what do they tell us?
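The draw-then-assign step in this design could be sketched as below. The officer IDs are invented, and `draw_and_assign` is a hypothetical helper. It takes 50 officers from each stratum (disproportionate, since males are 75% of the department) and splits each 50 at random into two groups of 25.

```python
import random

def draw_and_assign(stratum, n_per_group, rng):
    """Draw 2 * n_per_group officers from one stratum, then split them."""
    # random.sample returns cases in random selection order, so slicing
    # the result in half is itself a random assignment.
    drawn = rng.sample(stratum, 2 * n_per_group)
    return {"control": drawn[:n_per_group],
            "experimental": drawn[n_per_group:]}

rng = random.Random(3)  # fixed seed so a run can be repeated

males = [f"M{i:03d}" for i in range(150)]   # hypothetical officer IDs
females = [f"F{i:03d}" for i in range(50)]

groups = {"male": draw_and_assign(males, 25, rng),
          "female": draw_and_assign(females, 25, rng)}

for sex, assigned in groups.items():
    print(sex, len(assigned["control"]), len(assigned["experimental"]))
```

Note that all 50 female officers end up in the study, while only 50 of the 150 male officers do. That is what makes the sample disproportionate.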
• Systematic sampling
– Randomly select the first element, then choose every 5th, 10th, etc., with the interval depending on the size of the sampling frame (number of cases or elements in the population) and the sample size wanted
– Problem: A sampling list that is ordered in a regular pattern could result in a non-representative sample
• Cluster sampling
– Method
• Divide population into equal-sized groups (clusters) chosen on the basis of a neutral characteristic
• Draw a random sample of clusters. The study sample contains every element of the chosen clusters.
– Often done to study public opinion (city divided into blocks)
– Rule of equally-sized clusters usually violated
– The “neutral” characteristic may not actually be neutral, and may affect outcomes!
– Since not everyone in the population has an equal chance of being selected, there may be considerable sampling error
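Systematic and cluster sampling, as described above, can each be sketched in a few lines. Everything concrete here is an assumption made for illustration: the roster ordered in squads of ten with a sergeant listed first, the city of 12 blocks, and the function names.

```python
import random

def systematic_sample(frame, sample_size, rng):
    """Random start, then every k-th element (assumes sample_size <= len(frame))."""
    k = len(frame) // sample_size  # sampling interval
    start = rng.randrange(k)       # random start within the first interval
    return frame[start::k][:sample_size]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every element inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for name in chosen for element in clusters[name]]

rng = random.Random(11)  # fixed seed so a run can be repeated

# Hypothetical roster ordered by squad: a sergeant, then nine officers.
roster = ["sergeant" if i % 10 == 0 else "officer" for i in range(200)]
every_tenth = systematic_sample(roster, 20, rng)
print(set(every_tenth))  # only one rank appears: the ordering problem

# Hypothetical city of 12 blocks with 5 to 15 households each.
blocks = {
    f"block-{b}": [f"block-{b}/house-{h}" for h in range(rng.randint(5, 15))]
    for b in range(12)
}
households = cluster_sample(blocks, 3, rng)
print(len(households), "households from 3 blocks")
```

With an interval of 10 on a roster that repeats every 10 names, the sample is either all sergeants or all officers, depending on the random start. The blocks here are deliberately unequal in size, mirroring the point that the equal-size rule is usually violated.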
• Accidental sample
– Subjects who happen to be encountered by researchers
– Example – observer ride-alongs in police cars
• Quota sample
– Elements are included in proportion to their known representation in the population
• Purposive/“convenience” sample
– Researcher uses best judgment to select elements that typify the population
– Example: Interview all burglars arrested during the past month
• Issues
– Can your findings be “generalized” or projected to a larger population?
– Are your findings valid only for those actually included in your samples?