1.2 KEY TERMS

• Statistics: The collection, analysis, interpretation and presentation of
data
• Population: A collection of persons, things or objects under study (ex.
All students at FBHS)
• Sample: A portion or subset of the larger population to collect data
from and study (ex. This class as a subset of students at FBHS)
• Statistic: A number that represents a property of the sample (ex. If this
class is a sample of the school, the average GPA of this class)
• Parameter: A number that is a property of the population (ex. The
average GPA of all FBHS students)
• Variable: (X or Y) a characteristic of interest for each person or thing in
a population
• Numerical variables: Take on values with number measures
(ex. Weight in lbs., time in min.)
• Categorical variables: Place the person or thing in a category
(ex. If X is favorite colors, examples could be purple, blue, etc.)
• Data: values for the variable from each individual
IDENTIFY THE POPULATION, SAMPLE, PARAMETER,
STATISTIC, VARIABLE AND DATA:
To determine the average time FBHS students
take to get ready in the morning, we ask each
student in one English class how long they take.
• Population = all FBHS students
• Sample = students in the English class
• Parameter = average time for all FBHS students
• Statistic = average time for the students in the English class
• Variable = X = time it takes one student to get ready
• Data = ex. 20 min., 1 hour, 35 min.
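The difference between the parameter and the statistic in this example can be sketched in a few lines of Python. The times below are made up for illustration and are not from the slides; in practice the whole population is rarely observed, and it is listed here only so the two numbers can be compared.

```python
from statistics import mean

# Hypothetical getting-ready times in minutes (made-up values).
# Treat this short list as the entire population of students.
population_times = [20, 60, 35, 45, 15, 30, 50, 25, 40, 30]

# Assumption for the sketch: the first four students are the English class.
sample_times = population_times[:4]

parameter = mean(population_times)  # property of the whole population
statistic = mean(sample_times)      # property of the sample only

print(f"Parameter (population mean): {parameter} min")
print(f"Statistic (sample mean): {statistic} min")
```

The two values differ because the sample covers only part of the population.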
IDENTIFY THE POPULATION, SAMPLE, PARAMETER,
STATISTIC, VARIABLE AND DATA:
A phone survey is conducted to determine what
proportion of eligible voters have already decided
for whom they will vote for president.
• Population = all eligible voters
• Sample = voters called
• Parameter = proportion of all eligible voters who have already
decided
• Statistic = proportion of called voters who have already
decided
• Variable = Y = whether one voter has already decided
• Data = decided, undecided
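A sample proportion can be computed directly from categorical data like this. The sketch below uses made-up survey responses (an assumption, not data from the slides), one value per voter called.

```python
# Hypothetical phone-survey responses, one per voter called (made-up data).
responses = ["decided", "undecided", "decided", "decided", "undecided",
             "decided", "undecided", "decided"]

# The statistic is the proportion of called voters who have decided.
decided_count = responses.count("decided")
sample_proportion = decided_count / len(responses)

print(f"{decided_count} of {len(responses)} called voters have decided")
print(f"Sample proportion: {sample_proportion:.3f}")
```

The statistic here is an estimate of the parameter, which would require asking every eligible voter.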
PRACTICE: IDENTIFY THE POPULATION, SAMPLE,
PARAMETER, STATISTIC, VARIABLE AND DATA:
To determine how long, on average, U.S. high school
students sleep a night, all students from 10 randomly
selected schools are asked.
• Population = U.S. high school students
• Sample = students in the 10 chosen schools
• Parameter = average time sleeping of all U.S. high school
students
• Statistic = average time sleeping of students in the
10 chosen schools
• Variable = X = time one student sleeps
• Data = hours, ex. 7 hours, 5 hours, 9 hours
1.3 KEY TERMS
• Qualitative data: categorize or describe attributes of a population
(ex. Hair color, favorite music, blood type)
• Quantitative data: Always numbers (ex. Height, number
of people with a certain trait)
• Quantitative discrete: data that result from counting and can
take on only certain numerical values (ex. Number of days per
week someone exercises, number of magazines a
person reads)
• Quantitative continuous: data that result from
measuring (ex. Time spent in line, weight of fruit)
WORK COLLABORATIVELY TO DETERMINE THE CORRECT DATA TYPE (QUANTITATIVE OR
QUALITATIVE). INDICATE WHETHER QUANTITATIVE DATA ARE CONTINUOUS OR DISCRETE.
HINT: DATA THAT ARE DISCRETE OFTEN START WITH THE WORDS "THE NUMBER OF."
a. the number of pairs of shoes you own
b. the type of car you drive
c. where you go on vacation
d. the distance it is from your home to the nearest grocery store
e. the number of classes you take per school year
f. the tuition for your classes
g. the type of calculator you use
h. movie ratings
i. political party preferences
j. weights of sumo wrestlers
k. amount of money (in dollars) won playing poker
FIGURE 1.3
What type of data does this pie chart display?
FIGURE 1.4
What type of data does the graph display?
FIGURE 1.5
Pie chart to display qualitative data.
FIGURE 1.6
This bar graph displays the same data as the previous chart.
FIGURE 1.7
What do you notice about this bar graph?
FIGURE 1.8
What do you notice about this bar graph?
FIGURE 1.9
Bar graph with Other/Unknown Category
FIGURE 1.10
Pareto Chart With Bars Sorted by Size
FIGURE 1.11
Organization matters!
1.2 KEY TERMS (CONT.)
• Random sampling: each member of a population has an equal
chance of being selected for the sample. The goal is that the
sample has the same characteristics as the population
• Simple random sample: choose n objects at random (ex. Put
numbers in a hat, use a random number generator)
• Stratified sample: divide the population into groups (strata)
and randomly choose an equal proportion from each stratum
• Cluster sample: divide the population into groups (clusters)
and randomly select some of the clusters; every member of a
selected cluster is in the sample
• Systematic sample: randomly select a starting point and
take every nth piece of data from a listing of the population
• Convenience sample: using results that are readily available
• Sampling bias: when a sample is collected and some
members of the population are not as likely to be
chosen as others
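The four random sampling methods above can be sketched with Python's standard `random` module. The population of 20 student IDs, the grade-level strata, the cluster size of 5 and the step of 4 are all assumptions chosen for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20 student IDs, each tagged with a grade level.
population = [(i, "junior" if i < 12 else "senior") for i in range(20)]

# Simple random sample: choose n members at random.
simple = random.sample(population, 5)

# Stratified sample: split into strata, take the same proportion (25%) of each.
strata = {}
for member in population:
    strata.setdefault(member[1], []).append(member)
stratified = []
for group in strata.values():
    stratified.extend(random.sample(group, len(group) // 4))

# Cluster sample: split into clusters of 5, randomly keep two whole clusters.
clusters = [population[i:i + 5] for i in range(0, len(population), 5)]
cluster_sample = [m for c in random.sample(clusters, 2) for m in c]

# Systematic sample: random starting point, then every 4th member of the list.
step = 4
start = random.randrange(step)
systematic = population[start::step]

print("simple:", simple)
print("stratified:", stratified)
print("cluster:", cluster_sample)
print("systematic:", systematic)
```

A convenience sample has no counterpart here because it involves no random selection at all.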
FIGURE 1.12
VARIATION HAPPENS!
• Samples WILL be different, even when they are well chosen. We try to
choose representative samples to get as close as we can to the
parameter, but any statistic WILL have variability
• There are some critical errors to look for:
• Problems with samples: a sample that is not representative of the
population
• Self-selected samples: responses only by people who choose to
respond
• Sample size issues: samples that are too small may be unreliable;
larger samples are better when possible
• Undue influence: collecting data or asking questions in a way
that influences the response
• Non-response or refusal of subject to participate: The collected
responses may no longer be representative of the population
• Causality: A relationship between two variables does not mean
that one causes the other to occur
• Self-funded or self-interest studies: A study performed by a
person or organization in order to support their claim
• Misleading use of data: improperly displayed graphs,
incomplete data, or lack of context
• Confounding: When the effects of multiple factors on a
response cannot be separated
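The first point on this slide, that a statistic varies from sample to sample, can be seen in a short simulation. The population below is simulated (an assumption for illustration, not real sleep data), and the sample size of 30 is arbitrary.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 sleep times in hours (simulated values).
population = [round(random.gauss(7, 1.2), 1) for _ in range(1000)]
parameter = mean(population)

# Draw five simple random samples of 30 and compute each sample mean.
sample_means = [mean(random.sample(population, 30)) for _ in range(5)]

print(f"Parameter (population mean): {parameter:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean: {m:.2f}")
```

Each sample mean lands near the parameter, but they do not all agree with it or with each other.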
FIGURE 1.14
As the graphs show, Acme consistently outperforms the Other Guys!
Examining this statement, as a statistician, what do you notice?
FIGURE 1.15
What critical errors do you notice with this graph?
1.4 KEY TERMS
• Frequency: The number of times a value of data occurs
• Relative Frequency: the ratio (fraction or proportion) of the number
of times a value of the data occurs in the set of all outcomes to the
total number of outcomes
• Cumulative relative frequency: the sum of the relative frequency
of a value and the relative frequencies of all previous values
CREATE A FREQUENCY TABLE FOR HOW LONG
EVERYONE SLEPT LAST NIGHT
Hours | Tally | Frequency | Relative Frequency | Cumulative Relative Frequency
3     |       |           |                    |
4     |       |           |                    |
5     |       |           |                    |
6     |       |           |                    |
7     |       |           |                    |
8     |       |           |                    |
9     |       |           |                    |
• What percent of people slept 8 hours?
• What percent of people slept up to 7 hours?
• What percent of people slept at least 6 hours?
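One way to fill in a frequency table like this is sketched below. The list of hours is made up (an assumption standing in for the class's own answers), and `table` is just an illustrative name.

```python
from collections import Counter

# Hypothetical hours slept by 20 people (made-up data).
hours = [7, 8, 6, 7, 5, 8, 9, 7, 6, 8, 7, 4, 8, 7, 6, 3, 7, 8, 9, 6]

counts = Counter(hours)  # frequency of each value
n = len(hours)

# table[value] = (frequency, relative frequency, cumulative relative frequency)
table = {}
running = 0
for value in sorted(counts):
    running += counts[value]
    table[value] = (counts[value], counts[value] / n, running / n)
    print(value, *table[value])

# The three questions from the slide, answered from the table.
pct_8 = 100 * table[8][1]        # slept exactly 8 hours
pct_up_to_7 = 100 * table[7][2]  # slept 7 hours or fewer
pct_at_least_6 = 100 * sum(counts[v] for v in counts if v >= 6) / n
print(pct_8, pct_up_to_7, pct_at_least_6)
```

"Up to 7 hours" reads straight off the cumulative column, while "at least 6 hours" is everything not counted in the cumulative value for 5 hours.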
FIGURE 1.13
Which graph represents cumulative relative frequency?