Statistics and Probability - Middle Tennessee State University

Statistics and Probability: 5
Anna Bargagliotti
The University of Memphis
Fundamentals of Statistics (purpose and vocab)
Probability Problems
Regression, correlation, causation
Random Variables II
Counting Problems
Fundamentals of Statistics
(purpose and vocab)
What is statistics?
Definition 1: Statistics is a collection of procedures and principles
for gaining and analyzing information in order to help people
make decisions when faced with uncertainty. (Seeing Through
Statistics, by Jennifer Utts)
Definition 2: Statistics is the science of collecting, organizing, and
interpreting data OR Statistics are the data that describe or
summarize something. (Our text)
Definition 3: Statistics is a mathematical science pertaining to the
collection, analysis, interpretation or explanation, and
presentation of data. (Wikipedia)
Definition 4: A set of concepts, rules, and procedures that help us
organize numerical information, understand, and make decisions.
Experimental Design
• Observational (Descriptive)
• Experimental
• What do we need to answer before we do a study?
– What is your population?
– Once the population for your experiment has
been determined, you select a representative sample.
Types of Information
• What types of variables exist?
– Categorical
• Ordinal
• Nominal
– Quantitative
• Discrete
• Continuous
What can be used to represent
information graphically?
Box plot
Stem and Leaf
Any others?
What types of descriptive
questions do statisticians ask?
• Where is the center?
• What is the spread?
• What is the shape?
What types of research questions
do statisticians ask?
• What is the relationship between or among variables?
• What types of relationships are there between or
among variables?
• What is the cause of an outcome?
• Are samples, procedures, groups, trials, or
experiments different or the same?
Examples and Problems
– Researchers choose two groups of 100 high
school students. One group receives $5 for
every week of perfect attendance; the
other group receives nothing.
• What may be potential research questions the
researcher is trying to study?
• Is there a control group?
• What is the treatment?
• Is this observational or experimental?
Categorical? Quantitative?
Ordinal? Nominal? Discrete?
A person’s height
A person’s degree
A person’s race
A person’s SAT score
A person’s shoe size
Amount of time it takes to assemble a puzzle
The state in which a person lives
Rating of a newly elected politician
Population in the city of Memphis
Represent the information in one
graph in the clearest manner
• The polls for the Democratic party in the state of
Tennessee are showing: Clinton has 33% support,
Edwards 40%, and Obama 34%. In the state of
Florida: Clinton has 43%, Edwards 35%, and
Obama 34%.
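One way to put both states on a single display is a grouped bar chart. The following is a minimal text-only sketch, not the only reasonable answer; the helper name `grouped_bar_rows` and the choice of one `#` per percentage point are assumptions made for illustration.

```python
# Poll percentages copied from the exercise above.
polls = {
    "Tennessee": {"Clinton": 33, "Edwards": 40, "Obama": 34},
    "Florida": {"Clinton": 43, "Edwards": 35, "Obama": 34},
}


def grouped_bar_rows(polls):
    """Return text rows for a grouped bar chart, one '#' per percentage point."""
    rows = []
    for state, support in polls.items():
        rows.append(state)  # group label
        for candidate, pct in support.items():
            rows.append(f"  {candidate:<8} {'#' * pct} {pct}%")
    return rows


for row in grouped_bar_rows(polls):
    print(row)

# Sum of the reported percentages in each state.
totals = {state: sum(support.values()) for state, support in polls.items()}
print(totals)
```

As given, the percentages in each state add up to more than 100%, which is worth keeping in mind when choosing a chart type.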
Represent the information in one
graph in the clearest manner
• These data represent the percentage of
people in each of the 50 states who support
universal health care.
57, 60, 50, 52, 42, 51, 39, 42, 49, 56, 38, 70,
43, 58, 48, 59, 57, 43, 40, 44, 33, 47, 46, 57,
51, 59, 63, 50, 48, 41, 47, 36, 56, 61, 50, 61,
49, 47, 32, 57, 61, 51, 60, 67, 41, 53, 45, 52,
48, 70
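A stem and leaf display is one candidate graph for these 50 percentages. The sketch below assumes two-digit values, with the tens digit as the stem and the ones digit as the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# The 50 state percentages from the exercise above.
support = [
    57, 60, 50, 52, 42, 51, 39, 42, 49, 56, 38, 70,
    43, 58, 48, 59, 57, 43, 40, 44, 33, 47, 46, 57,
    51, 59, 63, 50, 48, 41, 47, 36, 56, 61, 50, 61,
    49, 47, 32, 57, 61, 51, 60, 67, 41, 53, 45, 52,
    48, 70,
]


def stem_and_leaf(values):
    """Map each tens digit (stem) to the sorted list of ones digits (leaves)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    return {stem: leaves[stem] for stem in sorted(leaves)}


plot = stem_and_leaf(support)
for stem, leaf_digits in plot.items():
    print(f"{stem} | {' '.join(str(d) for d in leaf_digits)}")
```

Reading the rows from top to bottom gives a rough picture of center, spread, and shape in a single display.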
Determine if the following
statements make sense
• Your bar graph must be wrong because your bars
are wider than those shown in the solutions
• Your pie chart must be wrong because when I
added the percentages on your wedges, they
totaled 124%
• I was unable to make a bar chart, because the data
categories were qualitative rather than quantitative.
• I rearranged the bars on my histogram so that the
tallest bar comes first.
Find the descriptive statistics
• For the previous data set, find the measures
of center.
• For the previous data set, find the shape.
– What does finding the “shape” mean?
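The measures of center for the 50 state percentages can be checked with a short sketch. It uses Python's standard `statistics` module and treats mean, median, and mode as the measures of center, which is an assumption about what the exercise intends.

```python
import statistics

# The 50 state percentages from the previous data set.
support = [
    57, 60, 50, 52, 42, 51, 39, 42, 49, 56, 38, 70,
    43, 58, 48, 59, 57, 43, 40, 44, 33, 47, 46, 57,
    51, 59, 63, 50, 48, 41, 47, 36, 56, 61, 50, 61,
    49, 47, 32, 57, 61, 51, 60, 67, 41, 53, 45, 52,
    48, 70,
]

mean = statistics.mean(support)        # arithmetic average
median = statistics.median(support)    # middle value of the sorted data
modes = statistics.multimode(support)  # most frequent value(s)

print(f"mean = {mean:.2f}, median = {median}, mode(s) = {modes}")
```

For these data the mean is 50.74, the median is 50, and the only mode is 57. A mean close to the median is one informal hint about the shape of the distribution.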
Finding the spread: For each measure below, is it a
good measure of spread? Why or why not? When would it be
good? When would it be bad?
• Range: the difference between the highest and lowest score
in a distribution.
• Interquartile Range: measure of the spread of the middle
50% of the scores. Defined as the difference between the 75th
and 25th percentiles.
• Variance: sum of the squared deviations from the mean
divided by N-1 (where N is the sample size).
• Standard Deviation: square root of the variance. Measure of
variability in the same units as the data.
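Written as formulas, the four measures look like this. The symbols are assumptions introduced here for illustration: the data are x_1 through x_N, x-bar is the sample mean, and Q_1 and Q_3 are the 25th and 75th percentiles.

```latex
\begin{align*}
\text{Range} &= x_{\max} - x_{\min} \\
\text{IQR} &= Q_3 - Q_1 \\
s^2 &= \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \\
s &= \sqrt{s^2}
\end{align*}
```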
Find the measures of spread
• Consider the following set of 100-meter
dash running times: (time in seconds)
9.92, 9.97, 9.99, 10.01, 10.06, 10.07, 10.08,
10.10, 10.13, 10.13, 9.89, 9.98
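A minimal sketch for checking the measures of spread on these running times follows. Textbooks and software use several conventions for percentiles; this sketch relies on the default method of `statistics.quantiles`, so the interquartile range may differ slightly from a hand calculation done with another convention.

```python
import math
import statistics

# 100-meter dash times in seconds, from the exercise above.
times = [9.92, 9.97, 9.99, 10.01, 10.06, 10.07, 10.08,
         10.10, 10.13, 10.13, 9.89, 9.98]

n = len(times)
mean = sum(times) / n

# Range: highest value minus lowest value.
data_range = max(times) - min(times)

# Interquartile range: 75th percentile minus 25th percentile
# (default "exclusive" method of statistics.quantiles).
q1, _, q3 = statistics.quantiles(times, n=4)
iqr = q3 - q1

# Sample variance: sum of squared deviations from the mean, divided by N-1.
variance = sum((t - mean) ** 2 for t in times) / (n - 1)

# Standard deviation: square root of the variance, in seconds.
std_dev = math.sqrt(variance)

print(f"range = {data_range:.2f} s")
print(f"IQR = {iqr:.4f} s")
print(f"variance = {variance:.6f} s^2")
print(f"standard deviation = {std_dev:.4f} s")
```

The range comes out to 0.24 seconds and the standard deviation to roughly 0.08 seconds, both in the same units as the data.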
What does this all have to do
with random variables?
• Recall: what is a random variable?
• What are the concepts we talked about with
random variables?
• How do those concepts relate to what we
talked about today?
• How do we use random variables in statistics?
• How do probability concepts come into play?