Chapter 1 – Introduction to Statistics
Statistics - the science of collecting, organizing,
analyzing, and interpreting Data.
Chapter 1: Collecting Data
Chapter 2: Organizing/Analyzing Data
Chapters 3-8: Interpreting Data
Types of Data Sets:
Population - data set consisting of _________
outcomes, measurements, or responses of interest
Sample - data set which is a __________ of the
population data set
Examples:
• If we are interested in measuring the salaries of
American high-school teachers, the population
data set would be a list of the salaries of every
high-school teacher in America. A sample data
set could be obtained by selecting 100 high-school teachers from across the country and
listing their salaries.
• A polling organization wants to know whether
Americans favor increased defense spending.
The ___________ data set would consist of the
responses of every American. A common way of
choosing a ___________ data set would be to
randomly call 1000 Americans and gather their
responses to the question of whether they favor
increased defense spending.
Suppose we wanted to determine the average number
of course hours for HCC students. What would be
the population data set? A possible sample set?
Types of Measurements:
Parameter - a numerical measurement made using
the population data set
Statistic - a numerical measurement made using a
sample data set
Ex:
• Using the teacher salary data sets, we could
calculate the average salary for the high-school
teachers. The average calculated from the
population data set would be the __________.
The average calculated from the sample of 100
teachers would be a _________.
Notice that unless the population is very small it is
probably impossible to gather the population data
set, and so it is usually impossible to calculate the
parameter we are interested in. (For example, why
would it be impossible to find Americans’ actual average
calories consumed per day?)
The main idea of the science of statistics is that we
can get around this difficulty by selecting a sample,
calculating the sample statistic, and using the sample
statistic to estimate the parameter.
Unfortunately, statistical estimates can never be
100% certain. (But we can attach a confidence level to them, such as 90%, 95%, or 99%.)
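The idea of estimating a parameter from a statistic can be tried out on a computer. The sketch below is only an illustration: the population of 50,000 salaries is made up (drawn from a normal distribution with an assumed mean and spread), and the variable names are hypothetical. It computes the population average (the parameter) and the average of a random sample of 100 (the statistic) so the two can be compared.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: 50,000 made-up salaries (not real data).
population = [random.gauss(60000, 12000) for _ in range(50000)]

# Parameter: a numerical measurement made using the population data set.
parameter = statistics.mean(population)

# Statistic: the same measurement made using a sample of 100 salaries.
sample = random.sample(population, 100)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:,.0f}")
print(f"sample mean (statistic):     {statistic:,.0f}")
```

The two numbers will be close but not identical, which is why an estimate based on a sample cannot be 100% certain.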
Descriptive and Inferential Statistics:
_____________ statistics just reports the “facts” as
gathered from the data (organization, summarization,
and display of data); ____________ statistics draws
conclusions about the population from the sample
data.
Ex: Determine the descriptive and inferential
statistics based on this data from The Journal of
Family Issues:
Still Alive at 65:
Unmarried Men: 70%
Married Men: 90%
What can you conclude? Does this data prove your
statement?
1.2 Types of Data:
Qualitative Data – characteristics, labels, or nonnumerical data
Ex: Eye Color, Political Party, Name
Quantitative Data - numerical measurements or
quantities
Ex: Height, Area, Time, Income, Blood Alcohol Level
There are subcategories as well (see chart p. 14):
NOMINAL (qualitative): names, labels, or categories
only; any numbers used are not numerically meaningful
ORDINAL (qualitative or quantitative): can be put in
order (ranked), but differences between values are
not meaningful
INTERVAL (quantitative): can be ordered, and
differences between values are meaningful;
no absolute zero (zero does not mean “none”)
RATIO (quantitative): can also make meaningful
statements about the ratios between values (ex: twice
as much rain); zero means “none”
Ex: Choose the best-suited level of measurement:
_________ 1. Hours spent watching movies
_________ 2. 5 most popular baby names
_________ 3. Car make and model (ex: F-150)
_________ 4. Sea level of ridges
_________ 5. Numbers on soccer jerseys
1.3 Experimental Design:
How to design a statistical study
1. Identify what you are interested in, and what the
POPULATION is. (Ch 1.)
2. * Decide upon the METHOD OF DATA
COLLECTION, make a plan, COLLECT the
data. (Ch 1.)
3. DESCRIBE the data (descriptive statistics:
histogram, mean, standard deviation, etc. This
will be a topic of Ch 2.)
4. INTERPRET the data (Inferential statistics:
write and interpret a hypothesis. This will be a
topic of chapters 3 through 9.)
5. Consider possible errors or omissions.
* We are here.
1.3 Methods of Data Collection:

Survey – an investigation of characteristics of a population

Census - collect measurements from the entire population.
Used when population is small.
Examples:
• Determine average grade on a Statistics exam
• Measure salaries of all 50 state governors

Sample - choose a sample from your population and collect measurements.
Used when population is large. (Most Common)
Examples:
• Opinion Polls
• Determine average income in U.S.

Simulation - program a computer with a mathematical or physical model to simulate population data.
Used when impossible to collect sample data.
Examples:
• Temperature at the core of the Sun
• Weight of an adult Tyrannosaurus Rex

Observational Study – watch the subjects and take note, but do not interfere.
Example:
• Effects of institutionalization on orphaned children

Experiment - collect a sample and split the sample into two groups:
The Case Group receives treatment.
The Control Group does not.
Used to measure the effect of treatment by comparing the characteristics of the case and control groups.
Example:
• A sample of 200 cancer patients is selected. An experimental drug is given to 100 patients and the remaining 100 patients receive a placebo. The survival rates of the two groups are then compared.

Placebo; Placebo Effect; Single Blind Experiment; Double Blind Experiment
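The split into case and control groups in the cancer-drug example can be sketched in a few lines. This assumes the split is made at random, which the notes do not state; the patient labels and the helper name `assign_groups` are hypothetical.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the original list is left untouched
    rng.shuffle(shuffled)       # put the subjects in a random order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical labels for the 200 patients in the sample.
patients = [f"patient_{i:03d}" for i in range(1, 201)]

# Case group receives the drug; control group receives the placebo.
case_group, control_group = assign_groups(patients, seed=42)

print(len(case_group), len(control_group))
```

Leaving the assignment to chance, rather than to the researcher's choice, helps keep the two groups comparable before treatment begins.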
Methods of Sampling:

Random* Sampling - The sample is chosen as a result of chance (every member has an equal likelihood of being chosen).
Examples:
• Telephone polling of random telephone numbers
• Drawing names out of a hat
(*Demo random # gen on calc.)

Systematic Sampling - The population is placed on a list, a random starting point is chosen, and then every k-th member is selected.
Examples:
• Choosing a sample of registered voters by choosing every 25th voter from the county registration roll
• Testing every 300th assembly line product

Stratified Sampling - The population is divided into groups (strata), usually with meaningful differences, and a sample is chosen from each group.
Examples:
• Choosing 200 men and 200 women for a sample
• Stratify the population by income level and then choose a sample of low, middle, and high income individuals

Cluster Sampling - The population is divided into groups (clusters), often naturally occurring ones such as locations, and then a sample is chosen by randomly selecting entire groups.
Example:
• Randomly choose 10 polling stations in a city and exit poll all voters at those stations

Convenience Sampling - Choose individuals for a sample because they are easy to include.
Examples:
• Internet Polls
• Mail-In Customer Survey
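The first four sampling methods can be compared side by side in code. What follows is a minimal sketch, not a standard library for sampling: the function names, the made-up voter roll of 1,000 records, and the group sizes are all assumptions chosen to echo the examples above.

```python
import random

def simple_random_sample(population, n, rng):
    # Every member has the same chance of being chosen.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Random starting point, then every k-th member of the list.
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    # Group members by stratum, then sample from each group.
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    # Randomly pick entire groups and keep every member of them.
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)

# Hypothetical voter roll: (id, sex, polling station). Not real data.
voters = [(i, "M" if i % 2 else "F", f"station_{i % 20}") for i in range(1000)]

srs = simple_random_sample(voters, 50, rng)
every_25th = systematic_sample(voters, 25, rng)
by_sex = stratified_sample(voters, lambda v: v[1], 200, rng)

# Build the clusters: voters grouped by polling station.
stations = {}
for v in voters:
    stations.setdefault(v[2], []).append(v)
exit_poll = cluster_sample(stations, 10, rng)

print(len(srs), len(every_25th), len(by_sex), len(exit_poll))
```

Convenience sampling is left out on purpose: it involves no chance mechanism, so there is nothing to randomize.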
Group work Quiz, and article,