RM_Sampling

Sampling

• Population: The overall group to which the research findings are intended to apply

• Sample: Any subset of the population, whether or not randomly drawn

• “Member” or “element” of the population: An individual unit of the population; we call it a “case” once it’s been drawn in a sample

• Sampling frame: A listing of all “elements” or members of the population

• Why do we sample?

– To describe characteristics of a population when measuring each member is too expensive or impractical

– To test hypotheses using inferential (probability) statistics - the kind used in our articles

Sampling error

• Statistic

– A mathematical depiction of any characteristic that can be numerically measured; say, the mean (arithmetic average)

– A mathematical way to describe the relationship between two or more characteristics; say, correlation (r statistic)

• Summary statistic: An overall measure; say, the mean

• Population parameter: A statistic of the population; e.g., the mean

• Sample statistic: Same, of a sample

• Sampling error: Unintended differences between a population parameter and an equivalent statistic from an unbiased sample

– Inevitable result of sampling

– Try it out in class! Calculate the parameter, mean age. Then take a random sample (more about that later) and compare the parameter to the sample statistic.

– Any difference between the two is “sampling error.” It should tend to decrease as the size of the sample increases

• Rule of thumb

– To minimize sampling error, sample size should be at least 30 for populations of about 500; for larger populations, sample size should be larger
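The in-class exercise above can be sketched in Python. This is a minimal illustration, not the class data: the population of 500 ages is randomly generated, and the function name `mean_abs_sampling_error` and the number of trials are assumptions made for the example. It averages the absolute sampling error over many random samples so the effect of sample size is visible.

```python
import random
import statistics

def mean_abs_sampling_error(population, sample_size, trials=300, seed=0):
    """Average |sample mean - population mean| over repeated random samples."""
    rng = random.Random(seed)
    parameter = statistics.mean(population)      # population parameter
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)   # drawn without replacement
        errors.append(abs(statistics.mean(sample) - parameter))
    return statistics.mean(errors)

# Hypothetical population: ages of 500 people (made-up values).
age_rng = random.Random(42)
ages = [age_rng.randint(18, 60) for _ in range(500)]

for n in (5, 30, 100, 500):
    print(n, round(mean_abs_sampling_error(ages, n), 3))
```

When the sample is the whole population (n = 500 here), the sample statistic equals the parameter and the error is zero.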

Our class, as the “population”

Sampling accuracy

• Samples should accurately reflect, or represent, the population from which they are drawn

– If a sample is representative, then we can generalize (apply, infer) our findings from the sample to the population

– Inference means to make a conclusion about something from its components. That’s why we’re studying “inferential statistics” - to make inferences about populations.

– Warning: We can never generalize to other populations – only the population from which the sample was drawn

• Probability sampling: Each element or “case” in the population has exactly the same chance to be selected

– “Gold standard” to which all sampling techniques aspire

– Every element’s probability of being selected is the same

Random sampling is the most common way to obtain a probability sample

Probability sampling exercises using random sampling

Every element has an equal chance of being selected

Data from Jay’s correctional center (Sheba Wachtel, warden)

Simple random sampling

Population: 200 inmates

Mean sentence: 2.94 years

Purpose of simple random sampling is to obtain a sample for further study that adequately represents the population.

Draw a sample of 30 and compare the population parameter and sample statistic.

How much error is there?
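As a rough sketch of this exercise, the snippet below draws a simple random sample of 30 from a population of 200 and reports the difference between the sample statistic and the population parameter. The sentence lengths are randomly generated stand-ins, not the correctional center's data, so the printed means will not match the slide's 2.94 years.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical sentence lengths (years) for 200 inmates; made-up values.
sentences = [round(rng.uniform(0.5, 6.0), 1) for _ in range(200)]

parameter = statistics.mean(sentences)   # population parameter (mean sentence)
sample = rng.sample(sentences, 30)       # simple random sample, no replacement
statistic = statistics.mean(sample)      # sample statistic (mean sentence)
sampling_error = statistic - parameter

print(f"population mean {parameter:.2f}, sample mean {statistic:.2f}, "
      f"sampling error {sampling_error:+.2f}")
```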

A population “distribution”

Stratified random sampling

• “Strata” are layers – like the layers of a cake

• Purpose of stratified sampling is to compare the statistics of subgroups

– Do violent offenders draw longer sentences than property offenders?

• Can designate strata before or after sampling

Jay’s inmates

• Property crimes: 150 inmates, mean sentence 2.88 years

• Violent crimes: 50 inmates, mean sentence 3.12 years

Variables

• Type of crime

• Sentence length

Strata

• Property offenders

• Violent offenders
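One way to picture designating strata after sampling: draw a simple random sample, then group the drawn cases by type of crime and compare mean sentences. In this sketch only the stratum sizes (150 and 50) come from the slide; the sentence values, the sample size of 60, and the helper name `mean_by_stratum` are assumptions for illustration.

```python
import random
import statistics
from collections import defaultdict

rng = random.Random(7)

# Hypothetical inmate records: (type of crime, sentence in years).
# Sentence values are made up; only the stratum sizes follow the slide.
population = (
    [("property", round(rng.uniform(0.5, 5.5), 1)) for _ in range(150)]
    + [("violent", round(rng.uniform(0.5, 6.0), 1)) for _ in range(50)]
)

def mean_by_stratum(records):
    """Group records by stratum and return the mean sentence in each."""
    groups = defaultdict(list)
    for crime_type, sentence in records:
        groups[crime_type].append(sentence)
    return {crime_type: statistics.mean(values)
            for crime_type, values in groups.items()}

sample = rng.sample(population, 60)   # strata are designated after sampling
print(mean_by_stratum(population))    # parameter for each stratum
print(mean_by_stratum(sample))        # statistic for each stratum
```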

Random sampling methods

With or without replacement?

– With replacement: Return each case to the population before drawing the next

• Keeps the probability of being drawn the same

• Makes it possible to redraw the same case

– Without replacement: Drawn cases are not returned to the population

• Probability of undrawn cases being selected increases as cases are drawn

– In social science research sampling without replacement is by far the most common

• Most sampling frames are sufficiently large so that as elements are drawn changes in the probability of being drawn are small
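The effect of replacement on selection probability can be worked out directly. The sketch below assumes a population of 200 (the size used in the examples here) and computes the chance that one particular not-yet-drawn case is picked on a given draw; `draw_probability` is a name invented for this illustration.

```python
from fractions import Fraction

def draw_probability(population_size, draw_number, replacement):
    """Chance that one particular not-yet-drawn case is picked on a given draw."""
    if replacement:
        # Every case goes back into the pool, so the pool never shrinks.
        return Fraction(1, population_size)
    remaining = population_size - (draw_number - 1)   # cases still in the pool
    return Fraction(1, remaining)

N = 200
for k in (1, 10, 30):
    print(k, draw_probability(N, k, True), draw_probability(N, k, False))
# 1 1/200 1/200
# 10 1/200 1/191
# 30 1/200 1/171
```

Even by the 30th draw the probability has only moved from 1/200 to 1/171, which is why the change is usually treated as small.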

Proportionate or disproportionate? (when concerned with strata)

– Proportionate: Draw a sample from the population without regard to strata

• If strata are of substantially different size, may have to draw numerous cases to get a satisfactory sample size in the smaller stratum

– Disproportionate: Stratify first, then draw samples of desired size from each

• Disproportionate is most frequently used. But there is a “catch”…

Stratified proportionate random sampling

Hypothesis: Gender affects cynicism (two-tailed)

Male cops are more cynical than female cops (one-tailed)

Sin City: 200 officers

• 150 male (75%)

• 50 female (25%)

Randomly select 30 officers

– Expect 22.5 males

– Expect 7.5 females

Compare average cynicism scores

Is there a problem? Hint: how many females in the sample?
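The expected counts, and how far a single sample can stray from them, can be checked with a short calculation. This sketch assumes the 30 officers are drawn without replacement, so the number of females follows a hypergeometric distribution; the function name `prob_females` is made up for the example.

```python
from math import comb

def prob_females(k, females=50, males=150, n=30):
    """Exact chance of exactly k females in a random sample of n drawn without replacement."""
    return comb(females, k) * comb(males, n - k) / comb(females + males, n)

expected_females = 30 * 50 / 200    # 7.5
expected_males = 30 * 150 / 200     # 22.5

# Chance that a single sample contains 5 or fewer females.
p_five_or_fewer = sum(prob_females(k) for k in range(0, 6))

print(expected_females, expected_males)
print(f"P(5 or fewer females) = {p_five_or_fewer:.3f}")
```

A mean cynicism score computed from only a handful of female officers rests on very few cases, which is the problem the hint points to.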

Stratified disproportionate random sampling

Sin City: 200 patrol officers

• 150 male (75%)

• 50 female (25%)

Randomly select 30 cases from each category

– 30 males

– 30 females

Compare average cynicism scores

Note: don’t recombine these into a single sample!
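Why recombining is a problem can be shown with arithmetic. No cynicism scores are given for the officers, so this sketch borrows the stratum sizes and mean sentences from the inmate example instead, and it assumes, purely for illustration, that each stratum sample reproduces its stratum mean exactly.

```python
# Stratum sizes and mean sentences taken from the inmate example.
strata = {
    "property": {"size": 150, "mean": 2.88},
    "violent": {"size": 50, "mean": 3.12},
}
total = sum(s["size"] for s in strata.values())

# Pooling equal-sized stratum samples treats each stratum as half the population.
naive_pooled = sum(s["mean"] for s in strata.values()) / len(strata)

# Weighting each stratum by its share of the population recovers the overall mean.
weighted = sum(s["mean"] * s["size"] / total for s in strata.values())

print(round(naive_pooled, 2), round(weighted, 2))   # 3.0 2.94
```

The pooled figure overstates the population mean because the smaller stratum is over-represented in a disproportionate sample.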

Sampling exercise - Sin City

Is there more likely to be a personal relationship between suspects and victims in violent crimes or in crimes against property?

You have full access to crime data for “Sin City” in 2009. These statistics show there were 200 crimes, of which 75 percent were property crimes and 25 percent were violent crimes. For each crime, you know whether the victim and the suspect were acquainted (yes/no).

1. Identify the population.

2. How would you sample?

3. Would you stratify? How?

4. Use proportionate and disproportionate techniques. Which is better? Why?

Stratified proportionate random sampling

Research question: Is there more likely to be a personal relationship between suspect and victim in violent crimes or in crimes against property?

Sin City: 200 crimes in 2009

• 50 violent (25%)

• 150 property (75%)

Randomly select 30 cases (15% of the population)

– Expect 7.5 violent (25%)

– Expect 22.5 property (75%)

Compare proportions of these cases where suspects knew the victim

Stratified disproportionate random sampling

Sin City: 200 crimes in 2009

• 50 violent (25%)

• 150 property (75%)

Randomly select 30 cases from each category

– 30 violent

– 30 property

Compare proportions within each where suspect and victim were acquainted

(Note: cannot combine results)
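The disproportionate design for this exercise might look like the sketch below. Only the stratum sizes (150 property, 50 violent) and the per-stratum sample size of 30 come from the exercise; the acquainted/not-acquainted values are random placeholders with no built-in difference between strata, and the function names are invented for the example.

```python
import random

rng = random.Random(3)

# Hypothetical crime records: (crime type, were suspect and victim acquainted?).
# The yes/no values are random placeholders, not real findings.
crimes = (
    [("property", rng.random() < 0.5) for _ in range(150)]
    + [("violent", rng.random() < 0.5) for _ in range(50)]
)

def stratified_sample(records, per_stratum):
    """Stratify first, then draw the same number of cases from each stratum."""
    samples = {}
    for stratum in sorted({kind for kind, _ in records}):
        members = [r for r in records if r[0] == stratum]
        samples[stratum] = rng.sample(members, per_stratum)
    return samples

def proportion_acquainted(cases):
    """Share of cases in which suspect and victim were acquainted."""
    return sum(1 for _, acquainted in cases if acquainted) / len(cases)

samples = stratified_sample(crimes, 30)
for stratum, cases in samples.items():
    print(stratum, round(proportion_acquainted(cases), 2))
```

The two proportions are compared with each other; they are kept separate rather than pooled into one figure.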

Sampling in experiments

Making cops “kinder” and “gentler”

The Anywhere Police Department has 200 patrol officers, of which 150 are males and 50 are females. Chief Jay wants to test a program that’s supposed to reduce officer cynicism.

Hypothesis: Officers who endure the training program will be less cynical

Dependent variable: Score on cynicism scale (1-5, low to high)

Independent variable: Cynicism reduction program (yes/no)

Stratified disproportionate random sampling

Does the training program reduce officer cynicism?

Population: 200 patrol officers

• 150 males (75%)

– Randomly assign 25 officers to the CONTROL GROUP

– Randomly assign 25 officers to the EXPERIMENTAL GROUP

• 50 females (25%)

– Randomly assign 25 officers to the EXPERIMENTAL GROUP

– Randomly assign 25 officers to the CONTROL GROUP

For each group, pre-measure dependent variable officer cynicism

Apply the intervention (adjust the value of the independent variable, Jay’s program): NO for both control groups, YES for both experimental groups

For each group, post-measure dependent variable officer cynicism

Also compare within-group changes – what do they tell us?
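The random assignment step in this design can be sketched as follows. The officer IDs are hypothetical and `assign_groups` is a name made up for the example; only the counts (150 males, 50 females, 25 officers per group) come from the slides.

```python
import random

def assign_groups(officers, per_group, rng):
    """Randomly pick 2 * per_group officers and split them into two groups."""
    chosen = rng.sample(officers, 2 * per_group)   # random order, no repeats
    return {"experimental": chosen[:per_group], "control": chosen[per_group:]}

rng = random.Random(11)

# Hypothetical officer IDs.
males = [f"M{i:03d}" for i in range(1, 151)]
females = [f"F{i:03d}" for i in range(1, 51)]

design = {
    "male": assign_groups(males, 25, rng),
    "female": assign_groups(females, 25, rng),
}

for gender, groups in design.items():
    for name, members in groups.items():
        print(gender, name, len(members))
```

With these counts every female officer ends up in one of the two groups, while only 50 of the 150 male officers are used.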

Quasi-probability sampling

• Systematic sampling

– Randomly select the first element, then choose every 5th, 10th, etc., depending on the size of the sampling frame (number of cases or elements in the population)

– Problem: Sampling list that is ordered in a particular way could result in a non-representative sample

• Cluster sampling

– Method

• Divide population into equal-sized groups (clusters) chosen on the basis of a neutral characteristic

• Draw a random sample of clusters. The study sample contains every element of the chosen clusters.

– Often done to study public opinion (city divided into blocks)

– Rule of equally-sized clusters usually violated

– The “neutral” characteristic may not actually be neutral, and it may affect outcomes!

– Since not everyone in the population has an equal chance of being selected, there may be considerable sampling error
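Both quasi-probability methods above can be sketched in a few lines. The sampling frame here is a hypothetical list of 200 numbered elements, and the cluster example assumes that position in the list is the “neutral” characteristic used to form equal-sized clusters; the function names are invented for illustration.

```python
import random

def systematic_sample(frame, interval, rng):
    """Random start within the first interval, then every interval-th element."""
    start = rng.randrange(interval)
    return frame[start::interval]

def cluster_sample(frame, cluster_size, n_clusters, rng):
    """Split the frame into consecutive clusters, choose clusters at random,
    and keep every element of the chosen clusters."""
    clusters = [frame[i:i + cluster_size]
                for i in range(0, len(frame), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return [element for cluster in chosen for element in cluster]

rng = random.Random(5)
frame = list(range(1, 201))   # hypothetical sampling frame of 200 elements

print(systematic_sample(frame, 10, rng))        # 20 elements, spaced 10 apart
print(len(cluster_sample(frame, 20, 3, rng)))   # 3 clusters of 20 -> 60
```

If the frame were ordered in a repeating pattern that lines up with the interval, the systematic sample would pick up the same kind of element every time, which is the ordering problem noted above.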

Non-probability sampling

• Accidental (“convenience”) sample

– Subjects who happen to be encountered by researchers

– Example – observer ride-alongs in police cars

• Quota sample

– Elements are included in proportion to their known representation in the population

• Purposive sample

– Researcher uses best judgment to select elements that typify the population

– Example: Interview all burglars arrested during the past month

• Issues

– Can your findings be “generalized” or projected to a larger population?

– Are your findings valid only for those actually included in your samples?
