
Ch1 notes

5/19/22
Sampling & Data
Introductory Statistics
Ellen Smyth
Data
Data is information we gather through experiments and surveys (sampling).
1. Experiment on coronavirus vaccine
• Getting the virus or not
• Symptoms of vaccine vs placebo
• Severity of virus
2. Survey (observational study) on effectiveness of a TV ad
• Data: percentage who went to Starbucks since ad aired
Statistics
Statistics involves:
Designing studies,
Analyzing resulting data, and
Translating data into knowledge and understanding.
Sample Statistics & Population Parameters
Parameter: numerical summary of population
Statistic: numerical summary of sample
Population and Samples
• Population: Subjects of interest
• Sample: Subset for whom we have data
• Often want answers about large group but can’t measure all, so a subset is chosen
• Use statistical techniques to make conclusions
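To make the parameter/statistic distinction concrete, here is a minimal sketch. The population is simulated (hypothetical heights in cm), and the seed and sizes are arbitrary choices for illustration, not values from the course.

```python
import random
import statistics

# Hypothetical population: simulated heights (cm) of 10,000 subjects.
rng = random.Random(1)  # fixed seed only so the sketch is repeatable
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Parameter: numerical summary of the whole population.
population_mean = statistics.mean(population)

# Sample: the subset for whom we actually have data.
sample = rng.sample(population, 100)

# Statistic: numerical summary of the sample, used to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, which is why the statistic is computed in the first place; the simulation knows both only because it built the population itself.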
Randomness
Each possible outcome in the entire population has an equal chance of being chosen
Observational Study
Merely observe values of response and explanatory variables without doing anything to control the subjects (survey, census, or just tracking their data anonymously)
Simple Random Sampling (SRS)
Each possible sample of set size n has equal chance of being selected
• To get a truly simple random sample, we must either:
1. Put names in hat
2. Assign each subject a number and use a random number generator to choose subjects
Cluster Random Sample
1. Divide population into large number of clusters, such as city blocks
2. Select simple random sample of clusters
3. Use all subjects in clusters as sample
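Both procedures above (simple random sampling with a random number generator, and the three cluster steps) can be sketched in a few lines. The block names, resident labels, and sample sizes below are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(subjects, n, rng):
    """Every possible sample of size n is equally likely to be chosen."""
    return rng.sample(subjects, n)

def cluster_sample(clusters, n_clusters, rng):
    """Pick an SRS of clusters, then keep every subject in the chosen clusters."""
    chosen = simple_random_sample(sorted(clusters), n_clusters, rng)
    return [subject for name in chosen for subject in clusters[name]]

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Hypothetical population: 20 city blocks with 5 residents each.
clusters = {
    f"block{b:02d}": [f"block{b:02d}-resident{r}" for r in range(5)]
    for b in range(20)
}
population = [s for members in clusters.values() for s in members]

srs = simple_random_sample(population, 10, rng)  # 10 residents from anywhere
clustered = cluster_sample(clusters, 3, rng)     # all residents of 3 blocks
print(srs)
print(clustered)
```

Note that the cluster sample needs only a list of blocks, not a list of every resident, which is what makes it usable when a full sampling frame is unavailable.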
Cluster Random Sample
Advantages
• Works when sampling frame is unavailable
• Lower cost
Disadvantage
• Need larger sample size for same reliability
Stratified Random Sample
1. Divide the population into groups (strata)
2. Select SRS from each stratum
3. Combine samples from each for total sample
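The stratified procedure (divide into strata, take an SRS from each, combine) might look like the sketch below. The strata (class years) and the equal sample size per stratum are assumptions made for illustration; sampling in proportion to stratum size is another common choice.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Draw an SRS of n_per_stratum subjects from every stratum, then combine."""
    combined = []
    for name in sorted(strata):
        combined.extend(rng.sample(strata[name], n_per_stratum))
    return combined

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

# Hypothetical strata: 50 students in each class year.
strata = {
    year: [f"{year}-{i}" for i in range(50)]
    for year in ("freshman", "sophomore", "junior", "senior")
}

sample = stratified_sample(strata, 5, rng)
print(sample)
```

Unlike a cluster sample, every group is guaranteed to appear in the final sample.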
Systematic Random Sample
Process:
• Use random number generator for 1st subject
• Choose every nth person after that, where n = population / sample size
Convenience Samples: Poor Ways to Sample
Convenience Sample: easy to get
• Unlikely to represent population
• Often severe biases
• Results apply only to observed subjects
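The systematic process described earlier (random first subject, then every nth subject, with n = population / sample size) can be sketched as follows. Picking the random start from among the first n subjects is an assumption of this sketch; the notes only say a random number generator is used for the first subject.

```python
import random

def systematic_sample(subjects, sample_size, rng):
    """Random start, then every nth subject, where n = population / sample size."""
    step = len(subjects) // sample_size  # n = population / sample size
    start = rng.randrange(step)          # assumed: random start among the first n
    return subjects[start::step][:sample_size]

rng = random.Random(3)  # fixed seed only so the sketch is repeatable
population = list(range(100))  # hypothetical subject ID numbers 0..99

sample = systematic_sample(population, 10, rng)
print(sample)
```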
Experiment
Imposes certain conditions to control subjects and observes outcomes
Elements of an Experiment
Experimental unit: Single subject or individual to be measured
Explanatory variable: Explains by coming first and possibly causing change in the other variable
Response variable: Affected or responding variable
Treatment: Entire set of values of explanatory variable in experiment
Control Group
Group in a randomized experiment that receives an inactive treatment (placebo) but is otherwise managed exactly as the other groups
Subjects
Subjects: The entities that we measure in a study
Subjects could be
1. individuals,
2. schools,
3. rats,
4. counties,
5. widgets
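For a randomized experiment with a control group, one simple way to split subjects is to shuffle them and cut the list in half. This is only a sketch: the subject labels are hypothetical, and the even two-group split is an assumption, not something the notes prescribe.

```python
import random

def randomize_groups(subjects, rng):
    """Randomly split subjects into a treatment group and a control (placebo) group."""
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

rng = random.Random(11)  # fixed seed only so the sketch is repeatable
subjects = [f"subject{i}" for i in range(20)]  # hypothetical subject labels

groups = randomize_groups(subjects, rng)
print(groups["treatment"])
print(groups["control"])
```

Because chance alone decides who gets the placebo, the two groups should differ only in the treatment received.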
Variable
A variable is any characteristic that is recorded for the subjects in a study
Examples: Marital status, Height, Weight, IQ
A variable can be classified as either
• Categorical, or
• Quantitative (Discrete or Continuous)
Sampling Error
Sampling Error: the natural variation that results from selecting a sample to represent a larger population
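Sampling error is easy to see by drawing several samples from the same population and comparing each sample mean with the population mean. The population below is simulated (hypothetical ages), so the numbers are illustrative only.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical population: ages of 5,000 subjects.
population = [rng.randint(18, 65) for _ in range(5_000)]
population_mean = statistics.mean(population)

# Five separate simple random samples of 50 subjects each.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(5)]

# Sampling error of each sample: statistic minus parameter.
errors = [m - population_mean for m in sample_means]

for m, e in zip(sample_means, errors):
    print(f"sample mean {m:.2f}, sampling error {e:+.2f}")
```

No sample was drawn badly here; the differences come purely from which subjects happened to be selected.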
Categorical Variable
A variable is categorical if each observation belongs to one of a set of categories.
Examples:
1. Gender (Male or Female)
2. Religion (Catholic, Jewish, …)
3. Type of residence (Apt, Condo, …)
4. Belief in life after death (Yes or No)
Quantitative Variable
A variable is called quantitative if observations take numerical values for different magnitudes of the variable.
Examples:
1. Age
2. Number of siblings
3. Annual Income
Discrete Quantitative Variable
• A quantitative variable is discrete if its possible values form a set of separate numbers: 0, 1, 2, 3, ….
• Examples:
1. Number of pets in a household
2. Number of children in a family
3. Number of foreign languages spoken by an individual
Continuous Quantitative Variable
A quantitative variable is continuous if its possible values form an interval
Measurements
Examples:
1. Height
2. Weight
3. Age
4. Blood pressure
Quantitative vs. Categorical
• For Quantitative variables, key features are the center and spread (variability).
• For Categorical variables, a key feature is the percentage of observations in each of the categories.
Proportion & Percentage (Rel. Freq.)
Proportions and percentages are also called relative frequencies.
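As a small illustration of these summaries, the sketch below computes a center and spread for a quantitative variable and the relative frequencies for a categorical one. The data values are made up, and using the mean and standard deviation for center and spread is one common choice, not the only one.

```python
from collections import Counter
import statistics

# Hypothetical data for seven subjects.
ages = [19, 21, 20, 35, 22, 20, 24]                                  # quantitative
residence = ["Apt", "Condo", "Apt", "House", "Apt", "Condo", "Apt"]  # categorical

# Quantitative: center and spread.
center = statistics.mean(ages)
spread = statistics.stdev(ages)

# Categorical: proportion (relative frequency) and percentage per category.
counts = Counter(residence)
proportions = {category: k / len(residence) for category, k in counts.items()}
percentages = {category: 100 * p for category, p in proportions.items()}

print(f"mean age {center}, standard deviation {spread:.2f}")
for category in counts:
    print(f"{category}: proportion {proportions[category]:.3f}, {percentages[category]:.1f}%")
```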
Count all but leading zeros
Outlier
An outlier falls far from the rest of the data
As you work on:
• Discussions
• Homework
• Projects
• Quizzes
Tools
• Formula card
• Calculator
• Lecture Notes
• Textbook & Knewton Instruction
Message Me
Look to next digit:
• Round up if 5 or more
• Round down if 4 or less
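The rounding rule in these notes (look to the next digit, round up if 5 or more, round down if 4 or less) is not what Python's built-in round does: round(2.5) gives 2, because ties go to the nearest even digit. A minimal sketch of the classroom rule using the standard decimal module follows; the helper name round_half_up is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places):
    """Look to the next digit: round up if it is 5 or more, down if 4 or less.

    `value` is passed as a string so no binary floating-point error creeps in.
    """
    quantum = Decimal(10) ** -places  # e.g. places=2 gives Decimal("0.01")
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_half_up("2.345", 2))
print(round_half_up("2.344", 2))
print(round_half_up("0.5", 0))
```

Passing numbers as strings is a deliberate choice here: a float such as 2.345 is stored as a nearby binary value, so the digit being inspected may not be the one written on paper.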
pollev.com/smyth
The sample
The population
Very specific example of a parameter
Exactly how you’d acquire a simple random sample, including details for how students would be selected, assuming you had unlimited resources
Very specific example of a statistic
Whether the previous simple random sample would give us a truly representative sample for APSU students