Theme1 L2 slides

UNIVERSITY OF PRETORIA
Theme 1: Introduction to data
Presenter: TM Malatji
1
Contents
• Populations and samples
• Anecdotal evidence
• Sources of bias
• Confounding variables
• Observational studies
• Experimental studies
• Principles of experimental design
• Sampling strategies
• Correlation vs. Causation
2
Theme 1: Populations and samples
• Population
– An entire group that you want to draw
conclusions about.
• Sample
– The specific group that you will collect data
from.
3
Theme 1: Populations and samples
• Research question:
– Over the last 10 years, what is the average
time to complete a degree for University of
Pretoria undergraduate students?
– What is the population? What is the sample?
• Population: All the graduates from the
University of Pretoria from the last 10
years.
• Sample: The selected alumni who will be
asked about their completion time.
5
Theme 1: Why use samples?
• Samples are used when:
– The population is too large to collect data
from it.
– We do not have access to the entire
population.
– The population is unlimited in size and is
hypothetical. E.g. The effects of a new
medical procedure.
6
Theme 1: Populations and samples
• Generally, samples should:
– Be based on well-defined selection criteria.
– Be unbiased in the make-up of the sample
cases.
– Be random, to allow fair selection of cases.
– Include the different variations that are
present in the population.
7
Theme 1: Sources of bias
• Non-response
– If many of the selected cases do not respond,
is the data still representative of the population?
• Convenience sample
– If only easily accessible cases are selected,
is the data representative of the population?
8
Theme 1: Why random sampling?
9
Theme 1: Anecdotal evidence
• Anecdotal evidence is based on individual
accounts, rather than on reliable research or
statistics, and so may not be valid.
• The data:
– Represents one or two cases.
– Is not representative of the population.
10
Theme 1: Relationship between variables
• Independent/Explanatory
• Dependent/Response
Explanatory variable → (might affect) → Response variable
• Association ≠ Causation
11
Theme 1: Confounding variables
• Confounding variables
– Third party variable affecting both the
supposed explanatory and response
variables.
• Example:
You find that more workers are employed in
provinces in which the market provides higher
salaries. Does this mean that higher salaries lead
to higher employment rates?
12
Theme 1: Confounding variables
• Example (continued): job sector is a possible
confounding variable.
Job sector → Average salary
Job sector → Number employed
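The confounding idea can be sketched with a few lines of Python. This is only an illustration on made-up numbers (the sector names, salaries and employment counts are hypothetical, not data from the lecture): within each job sector, salary and employment are unrelated, yet the pooled data shows a strong association because job sector drives both.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

# Hypothetical records: (job_sector, average_salary, number_employed)
records = [
    ("sector_A", 10, 100), ("sector_A", 10, 120),
    ("sector_A", 12, 100), ("sector_A", 12, 120),
    ("sector_B", 20, 200), ("sector_B", 20, 220),
    ("sector_B", 22, 200), ("sector_B", 22, 220),
]

# Association when job sector is ignored
overall = pearson([r[1] for r in records], [r[2] for r in records])

# Association within each job sector (the confounder held fixed)
within = {}
for sector in ("sector_A", "sector_B"):
    rows = [r for r in records if r[0] == sector]
    within[sector] = pearson([r[1] for r in rows], [r[2] for r in rows])

print("overall:", round(overall, 3))
print("within sectors:", within)
```

In this toy data the pooled correlation is high while the within-sector correlations are zero, which is why the association alone does not show that higher salaries lead to higher employment.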
13
Theme 1: Observational studies
• Data is collected by monitoring what happens in
a sample, without intervening.
• This type of study comes in two forms:
– Prospective study: A study in which events
are recorded as they unfold.
• A number of workers are observed as they
grind a part, to determine whether they
differ in how they conduct the grinding
process, in order to improve the
process.
14
Theme 1: Observational studies
• This study comes in two forms:
– Retrospective study: Data is collected after
events have occurred.
• A number of workers are interviewed and
asked to describe the method that they
normally use when they grind a part in
order to understand the grinding process
and improve it.
15
Theme 1: Experimental studies
• A study in which a treatment is given to cases.
• A randomized experiment is one in which cases
are randomly assigned to groups, such as a
treatment group and a control group.
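Random assignment can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the case labels, the 50/50 split and the `random_assignment` helper are assumptions made for the example.

```python
import random

def random_assignment(cases, seed=None):
    """Shuffle the cases and split them into treatment and control groups."""
    rng = random.Random(seed)      # seeded generator so the split is reproducible
    shuffled = list(cases)         # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical cases
cases = [f"case_{i}" for i in range(1, 21)]
treatment, control = random_assignment(cases, seed=1)

print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides which group a case lands in, the groups are expected to be similar on average, including on variables nobody thought to measure.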
16
Theme 1: Principles of experimental design
• Control possible confounders
• Randomize into treatment and control groups
• Use a sufficiently large sample or replicate the
experiment
• Block variables that can influence the study
• Blinding
– Single or double
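Blocking and randomization can be combined: cases are first grouped by a variable that may influence the outcome, and random assignment then happens inside each block. The sketch below assumes a hypothetical blocking variable (experience level) and a hypothetical helper `blocked_assignment`; it is one possible way to do this, not the only one.

```python
import random
from collections import defaultdict

def blocked_assignment(cases, block_of, seed=None):
    """Randomly split each block in half between treatment and control."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for case in cases:
        blocks[block_of(case)].append(case)   # group cases by blocking variable

    treatment, control = [], []
    for members in blocks.values():
        rng.shuffle(members)                  # randomize within the block
        half = len(members) // 2
        treatment.extend(members[:half])
        control.extend(members[half:])
    return treatment, control

# Hypothetical cases: (id, experience level)
cases = [(i, "experienced" if i % 2 == 0 else "novice") for i in range(12)]
treatment, control = blocked_assignment(cases, block_of=lambda c: c[1], seed=1)

print("treatment:", treatment)
print("control:  ", control)
```

Each group ends up with the same number of experienced and novice cases, so the blocking variable cannot explain a difference between the groups.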
19
Theme 1: Sampling strategies
• Simple random sampling – Each population
member is equally likely to be selected.
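A simple random sample can be drawn with Python's standard library. The population of ID numbers and the sample size below are hypothetical, chosen only for illustration.

```python
import random

# Hypothetical population: ID numbers 1 to 1000
population = list(range(1, 1001))

rng = random.Random(42)                 # seeded so the draw is reproducible
sample = rng.sample(population, k=50)   # sampling without replacement

print(sorted(sample))
```

`random.sample` draws without replacement, so no member appears twice and every member has the same chance of being included.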
20
Theme 1: Sampling strategies
• Stratified sampling – The population is divided
into strata with similar characteristics in each
stratum (homogeneous), and cases are randomly
sampled from every stratum.
21
Theme 1: Sampling strategies
Stratified sampling
• Example:
You are interested in how having a doctoral degree
affects the wage gap between men and women among
graduates of a certain university.
Because only a small proportion of this university’s
graduates have obtained a doctoral degree, using a
simple random sample would likely give you a sample
size too small to properly compare the differences
between men and women with a doctoral degree
versus those without one.
22
Theme 1: Sampling strategies
Stratified sampling
• Example (continued): strata for the doctoral degree
and wage gap question.
Characteristic    Strata
Gender            Female, Male
Degree            Bachelor’s, Master’s, Doctorate
Groups:
1. Male bachelor’s graduates,
2. Female bachelor’s graduates,
3. Male master’s graduates,
4. Female master’s graduates,
5. Male doctoral graduates,
6. Female doctoral graduates.
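The six groups above can be sampled as in the sketch below. The graduate records, the share of each degree and the `stratified_sample` helper are all hypothetical; the point is only that drawing the same number of cases from every stratum guarantees enough doctoral graduates, which a simple random sample would not.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Draw a simple random sample of up to n_per_stratum cases from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    return {
        name: rng.sample(members, k=min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical graduates: (id, gender, degree); doctorates are deliberately rare
degrees = ["bachelor's"] * 70 + ["master's"] * 24 + ["doctorate"] * 6
genders = ["female", "male"]
population = [(i, genders[i % 2], degrees[i % 100]) for i in range(1000)]

samples = stratified_sample(
    population,
    stratum_of=lambda g: (g[1], g[2]),   # stratum = (gender, degree)
    n_per_stratum=20,
    seed=3,
)

for stratum, members in sorted(samples.items()):
    print(stratum, len(members))
```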
23
Theme 1: Sampling strategies
Stratified sampling
[Figure: the population divided into six strata: female bachelor’s,
female master’s, female doctoral, male bachelor’s, male master’s and
male doctoral graduates.]
24
Theme 1: Sampling strategies
Stratified sampling
[Figure: cases are drawn from each of the six strata to form the
sample. Labels shown: Salary, Gender, Job history, Qualification.]
25
Theme 1: Sampling strategies
• Cluster sampling – The population is divided into
clusters with diverse characteristics in each
cluster (non-homogeneous), and whole clusters
are randomly selected.
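Cluster sampling can be sketched as choosing whole clusters at random and keeping every case inside them. The cluster names, their sizes and the `cluster_sample` helper are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and include every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k=n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical clusters, each holding a mix of graduates
clusters = {
    f"cluster_{c}": [f"graduate_{c}_{i}" for i in range(30)]
    for c in range(5)
}

sample = cluster_sample(clusters, n_clusters=2, seed=7)

for name, members in sample.items():
    print(name, len(members))
```

This only works well when each cluster resembles the population, which is why clusters should be internally diverse.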
26
Theme 1: Sampling strategies
Cluster sampling
[Figure: five clusters, each containing a mix of male and female
bachelor’s, master’s and doctoral graduates.]
27
Theme 1: Sampling strategies
Cluster sampling
[Figure: the sample is formed from selected clusters, each containing
a mix of male and female bachelor’s, master’s and doctoral graduates.]
28
Theme 1: Sampling strategies
• Multistage cluster sampling – Clusters are sampled
first, and then cases for study are selected from
within the chosen clusters.
[Figure: four clusters, each containing a mix of male and female
bachelor’s, master’s and doctoral graduates.]
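The two stages can be sketched as below: clusters are chosen at random, then a simple random sample is taken inside each chosen cluster. The cluster contents, the sample sizes and the `multistage_sample` helper are hypothetical.

```python
import random

def multistage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: sample cases within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k=n_clusters)          # stage 1
    return {
        name: rng.sample(clusters[name],                         # stage 2
                         k=min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical clusters, each holding a mix of graduates
clusters = {
    f"cluster_{c}": [f"graduate_{c}_{i}" for i in range(30)]
    for c in range(4)
}

sample = multistage_sample(clusters, n_clusters=2, n_per_cluster=10, seed=5)

for name, members in sample.items():
    print(name, members)
```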
29
Theme 1: Study conclusions:
Correlation vs. Causation
• Observational study:
– A study in which cases are observed or
outcomes are measured without any
intervention to affect the outcomes (e.g. No
treatment given).
• Experimental study:
– A study in which an intervention is
introduced and the effects are studied.
30
Theme 1: Study conclusions:
Correlation vs. Causation
• How does sleep deprivation affect your ability to drive? A recent
study measured the effects on 19 professional drivers. Each driver
participated in two experimental sessions: one after normal sleep
and one after 27 hours of total sleep deprivation. The treatments
were assigned in random order. In each session, performance was
measured on a variety of tasks including a driving simulation.
a. Correlation statement generalized to all drivers
• E.g. Sleep deprivation is associated with decreased
performance ability of professional drivers.
b. Causal statement generalized to all drivers
• E.g. Sleep deprivation decreases the performance ability of
professional drivers.
c. Causal statement about the sample
• E.g. Sleep deprivation decreases the performance ability of
19 sampled professional drivers.
31
Theme 1: L2 Summary
• Populations and samples
• Anecdotal evidence
• Sources of bias
• Confounding variables
• Observational studies
• Experimental studies
• Principles of experimental design
• Sampling strategies
• Correlation vs. Causation
• PS: Please complete the homework as an exercise.
32
Thank you!
Happy studying!
33