Statistics Through Applications
Chapter 1:
How Do We Get “Good”
Data?
Copyright © 2004 by W. H. Freeman & Company
Individuals & Variables
• Individuals are the objects described by a set of
data.
– People, animals, or things
• A Variable is any characteristic of an individual.
Variables can take different values for different
individuals.
– Categorical variable: places an individual into one of
several categories (job type, gender, race)
– Quantitative variable: takes numerical values for
which ordering and averaging make sense (age, weight,
salary)
Example: A few lines from a teacher’s gradebook
Name            Sex  Homeroom  Grade  Calc No.  Test 1
Hsu, Danny      M    Blair     12     B319      81
Iris, Francine  F    Kingsley  12     B298      92
Ruiz, Ricardo   M    Alfonzo   11     B304      87
• What individuals does this data describe?
• What variables does this data describe?
• Which of these are categorical?
• Which are quantitative?
Good Data is Valid, Unbiased & Reliable
• Valid – relevant and
appropriate
• Unbiased – not
consistently lower or
higher than actuality
• Reliable – as little
variation as possible
Good Data is Compared Fairly
• Often a rate expressed as a percent or
fraction is a more valid measure than a
simple count of occurrences
– Two schools both had 1900 students pass
TAKS. One school has 2000 students and the
other has 2500. Did they perform equally
well?
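The comparison in this example can be worked as a short calculation. This is a minimal sketch: the labels "School A" and "School B" are hypothetical names for the two schools, and the counts come from the example above.

```python
# Same count of passing students, different enrollments.
passed = 1900
enrollments = {"School A": 2000, "School B": 2500}  # school labels are hypothetical

# Convert each count to a rate (percent of enrolled students who passed).
pass_rates = {name: 100 * passed / total for name, total in enrollments.items()}

for name, rate in pass_rates.items():
    print(f"{name}: {passed} of {enrollments[name]} passed ({rate:.0f}%)")
```

The counts are identical, but the rates are not, which is why the rate is the fairer basis for comparison.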
Percent Change
• Percent change = (amount of change / starting value) × 100
• From July 2008 to July 2009, the Dow Jones Industrial
Average dropped from 11,496.57 to 8163.60. Find the
percent change.
• What is another way to describe a 100% increase?
• What can be said about a 100% decrease?
• What can be said about a decrease greater than 100%?
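The percent change formula translates directly into a small function. This is a minimal sketch: the function name `percent_change` and the 50-to-100 and 50-to-0 values are illustrative assumptions, while the Dow Jones figures are the ones given above.

```python
def percent_change(start, end):
    """Percent change = (amount of change / starting value) x 100."""
    return (end - start) / start * 100

# Dow Jones Industrial Average, July 2008 to July 2009 (figures from the slide).
dow = percent_change(11496.57, 8163.60)
print(f"Dow Jones change: {dow:.1f}%")

# Hypothetical values to explore the questions about 100% changes.
doubling = percent_change(50, 100)   # the value doubles
wipe_out = percent_change(50, 0)     # the value drops to zero
print(f"50 to 100: {doubling:.0f}%")
print(f"50 to 0: {wipe_out:.0f}%")
```

A negative result indicates a decrease. Trying a few start and end values is a quick way to see what a quantity that cannot go below zero can and cannot do.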
Even Good Data needs to be Read Carefully
• Summertime is Burglary Time – or is it?
– An advertisement for a home security system
says, “When you go on vacation, burglars go to
work. According to FBI statistics, over 26% of
home burglaries take place between Memorial
Day and Labor Day.”
• Only one in two cameras is actually in operation, but this could soon
increase to as many as one in three
Watford Observer, 2 August 2002
• Whereas five years ago the [professional conduct committee] panels
sat for only 90 days a year, in 2000 the number of days was 242 and in
2001 it was 479. This year the number of days will be higher still...
General Medical Council newsletter,
13 August 2002
• Westchester County is a suburban area covering 438 square miles
immediately north of New York City. The county is home to 800,000
deer.
Fine Gardening, September/October 1989
• Continental Airlines once advertised that it had “decreased lost baggage
by 100% in the past six months.”
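Two of the claims quoted above can be sanity-checked with quick arithmetic. The sketch below is only an illustration: the helper names and the choice of 2004 as the sample year are assumptions, not part of the original slides. It computes what share of a year falls between Memorial Day (the last Monday in May) and Labor Day (the first Monday in September), and how many deer per square mile the Westchester figure implies.

```python
from datetime import date, timedelta

def memorial_day(year):
    """Last Monday in May."""
    d = date(year, 5, 31)
    return d - timedelta(days=d.weekday())  # weekday() is 0 for Monday

def labor_day(year):
    """First Monday in September."""
    d = date(year, 9, 1)
    return d + timedelta(days=(7 - d.weekday()) % 7)

def summer_share(year):
    """Percent of the year that lies between Memorial Day and Labor Day."""
    summer_days = (labor_day(year) - memorial_day(year)).days
    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    return 100 * summer_days / days_in_year

YEAR = 2004  # sample year, chosen only for illustration
print(f"Summer is {summer_share(YEAR):.1f}% of {YEAR}")

# Westchester County claim: 800,000 deer in 438 square miles.
deer_per_square_mile = 800_000 / 438
print(f"Implied density: {deer_per_square_mile:.0f} deer per square mile")
```

If the summer share of the year comes out close to the advertised share of burglaries, the statistic says little about summer being riskier. A density in the thousands per square mile suggests the deer figure deserves a second look.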
Even Good Data Varies
• How Long is a Minute?
– How accurate are you and your classmates at knowing
how long a minute is?
– Get a partner and a stopwatch. You will take turns
timing and guessing. Using the stopwatch, the timer
tells the guesser when to start. When the guesser
believes that a minute has passed, he says “Stop.” At
that point, the timer stops the stopwatch and records the
time that passed to the nearest tenth of a second. Do
not tell your partner how much time actually passed!
– Reset the stopwatch and switch roles. Continue timing
and measuring until each person has been timed
three times.
Analyzing How Long is a Minute?
• Was your data valid?
• Was either partner’s data biased?
• Which partner was more reliable?
• How about the class as a whole? Add your
data (all 6 measures) to the class list and
graph.
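One way to put numbers on the bias and reliability questions is sketched below. The recorded times are made up for illustration and are not real class data. The sketch uses the mean's distance from 60 seconds as a measure of bias and the standard deviation as a measure of (un)reliability, which is one reasonable choice among several.

```python
import statistics

TARGET = 60.0  # seconds in a minute

# Hypothetical recorded times in seconds, three per partner (not real data).
times = {
    "Partner 1": [52.3, 55.1, 53.8],
    "Partner 2": [58.0, 66.4, 61.2],
}

summary = {}
for name, guesses in times.items():
    bias = statistics.mean(guesses) - TARGET  # negative: guesses run short
    spread = statistics.stdev(guesses)        # smaller spread: more reliable
    summary[name] = (bias, spread)
    print(f"{name}: bias {bias:+.1f} s, spread {spread:.1f} s")
```

In this made-up data, one partner is further from 60 seconds on average but more consistent, which shows that bias and reliability are separate properties.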
Use Averages to Improve Reliability
• No measuring process is perfectly reliable.
• The average of several repeated
measurements of the same individual is
more reliable (and less variable) than a
single measurement.
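The claim about averages can be illustrated with a small simulation. This is a sketch under stated assumptions: the true value of 60 seconds, the noise level of 5 seconds, the normal-noise model, and the 2000 repetitions are all invented for the demonstration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 60.0  # hypothetical true value being measured, in seconds
NOISE_SD = 5.0     # hypothetical variation of one measurement, in seconds
TRIALS = 2000

def measure():
    """One simulated measurement: the true value plus random noise."""
    return random.gauss(TRUE_VALUE, NOISE_SD)

# Variation when each result is a single measurement.
singles = [measure() for _ in range(TRIALS)]

# Variation when each result is the average of three measurements.
averages = [statistics.mean([measure() for _ in range(3)]) for _ in range(TRIALS)]

spread_singles = statistics.stdev(singles)
spread_averages = statistics.stdev(averages)
print(f"Spread of single measurements: {spread_singles:.2f} s")
print(f"Spread of 3-measurement averages: {spread_averages:.2f} s")
```

The averages cluster more tightly around the true value than the single measurements do. Averaging reduces variation, but it does not remove bias.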
The Statistical Problem Solving
Process - APAC
• A – Ask a question of interest
• P – Produce data
• A – Analyze and describe/graph the data
• C – Conclusion, answering the question
Using APAC
• Which element of APAC
is shown here?
• What is a reasonable
question of interest?
• How do you think the data
were produced?
– What are the individuals?
– What is the variable?
– Is it quantitative or
categorical?
• What can be concluded?
First Homework Problem
• According to the National Institute on Media and the
Family, a preschooler’s risk of obesity jumps 6% for every
hour of television watched per day. The risk increases by
31% if the TV is in their bedroom.
– 1. What element of APAC is given here?
– 2. What is a reasonable question of interest in this case?
– 3. The actual study that produced these results involved 2761 low-income adults in New York with children aged 1 to 4 years. Who
are the individuals in this study?
– 4. What variable(s) were measured?