Statistics and Probability - Middle Tennessee State University

Statistics and Probability: 5 sessions
Anna Bargagliotti
The University of Memphis
• Fundamentals of Statistics (purpose and vocab)
• Probability Problems
• Regression, correlation, causation
• Random Variables II
• Counting Problems
Fundamentals of Statistics
(purpose and vocab)
What is statistics?
Definition 1: Statistics is a collection of procedures and principles
for gaining and analyzing information in order to help people
make decisions when faced with uncertainty. (Seeing Through
Statistics, by Jennifer Utts)
Definition 2: Statistics is the science of collecting, organizing, and
interpreting data OR Statistics are the data that describe or
summarize something. (Our text)
Definition 3: Statistics is a mathematical science pertaining to the
collection, analysis, interpretation or explanation, and
presentation of data. (Wikipedia)
Definition 4: A set of concepts, rules, and procedures that help us
organize numerical information, understand, and make decisions.
(http://bobhall.tamu.edu/FiniteMath/Module8/Introduction.html)
Experimental Design
• Observational (Descriptive)
• Experimental
• What do we need to answer before we do a
study?
– What is your population?
– Once the population for your experiment has
been determined, you select a representative
sample.
Types of Information
• What types of variables exist?
– Categorical
• Ordinal
• Nominal
– Quantitative
• Discrete
• Continuous
What can be used to represent
information graphically?
• Bar
• Histogram
• Box plot
• Scatterplot
• Pie
• Stem and Leaf
• Any others?
What types of descriptive
questions do statisticians ask?
• Where is the center?
• What is the spread?
• What is the shape?
What types of research questions
do statisticians ask?
• What is the relationship between or among
variables?
• What types of relationships are there between or
among variables?
• What is the cause of an outcome?
• Are samples, procedures, groups, trials, or
experiments different or the same?
Examples and Problems
– Researchers chose two groups of 100 high
school students. One group receives $5 for
every week they have perfect attendance; the
other group gets nothing.
• What may be potential research questions the
researcher is trying to study?
• Is there a control group?
• What is the treatment?
• Is this observational or experimental?
Categorical? Quantitative?
Ordinal? Nominal? Discrete?
Continuous?
• A person’s height
• A person’s degree
• A person’s race
• A person’s SAT score
• A person’s shoe size
• Amount of time it takes to assemble a puzzle
• The state in which a person lives
• Rating of a newly elected politician
• Population in the city of Memphis
Represent the information in one
graph in the clearest manner
• The polls for the Democratic party in the state of
Tennessee are showing: Clinton has 33% support,
Edwards 40%, and Obama 34%. In the state of
Florida: Clinton has 43%, Edwards 35%, and
Obama 34%.
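One possible way to put both states in a single graph is a grouped bar chart, with the two states side by side for each candidate. The sketch below draws a rough text version of that idea. The percentages are copied as given above; the function name `grouped_bars` and the one-character-per-point scale are assumptions made for illustration, and other displays may be just as clear.

```python
# Poll percentages copied from the slide above.
polls = {
    "Tennessee": {"Clinton": 33, "Edwards": 40, "Obama": 34},
    "Florida": {"Clinton": 43, "Edwards": 35, "Obama": 34},
}


def grouped_bars(data):
    """Return text lines forming a grouped bar chart (hypothetical helper).

    Bars are grouped by candidate so the two states can be compared
    directly; each '#' stands for one percentage point.
    """
    lines = []
    candidates = list(next(iter(data.values())))
    for candidate in candidates:
        lines.append(candidate)
        for state, results in data.items():
            pct = results[candidate]
            lines.append(f"  {state:<10}{'#' * pct} {pct}%")
    return lines


chart = grouped_bars(polls)
print("\n".join(chart))
```

Grouping by candidate rather than by state makes the state-to-state comparison the easiest one to read; grouping by state would emphasize the ranking within each state instead.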
Represent the information in one
graph in the clearest manner
• These data represent the percentage of
people in each of the 50 states who support
universal health care.
57, 60, 50, 52, 42, 51, 39, 42, 49, 56, 38, 70,
43, 58, 48, 59, 57, 43, 40, 44, 33, 47, 46, 57,
51, 59, 63, 50, 48, 41, 47, 36, 56, 61, 50, 61,
49, 47, 32, 57, 61, 51, 60, 67, 41, 53, 45, 52,
48, 70
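For a single quantitative variable like this one, a stem-and-leaf plot or a histogram is a natural candidate. As a rough sketch, the following builds a stem-and-leaf display from the 50 values above; the helper name `stem_and_leaf` is an assumption for illustration, and it only handles two-digit values like these.

```python
from collections import defaultdict

# Percentages for the 50 states, copied from the slide above.
support = [
    57, 60, 50, 52, 42, 51, 39, 42, 49, 56, 38, 70,
    43, 58, 48, 59, 57, 43, 40, 44, 33, 47, 46, 57,
    51, 59, 63, 50, 48, 41, 47, 36, 56, 61, 50, 61,
    49, 47, 32, 57, 61, 51, 60, 67, 41, 53, 45, 52,
    48, 70,
]


def stem_and_leaf(values):
    """Map each tens digit (stem) to the sorted list of ones digits (leaves)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)


plot = stem_and_leaf(support)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Each row plays the role of one histogram bar of width 10, so the display shows the shape of the distribution while keeping every individual value readable.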
Determine if the following
statements make sense
• Your bar graph must be wrong because your bars
are wider than those shown in the solutions
manual.
• Your pie chart must be wrong because when I
added the percentages on your wedges, they
totaled 124%.
• I was unable to make a bar chart, because the data
categories were qualitative rather than
quantitative.
• I rearranged the bars on my histogram so that the
tallest bar comes first.
Find the descriptive statistics
• For the previous data set, find the measures
of center.
• For the previous data set, find the shape.
– What does finding the “shape” mean?
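As a minimal sketch, the usual measures of center (mean, median, mode) for the 50-state data set can be computed with Python's standard `statistics` module. The variable names are assumptions for illustration.

```python
import statistics

# Percentages for the 50 states, copied from the earlier slide.
support = [
    57, 60, 50, 52, 42, 51, 39, 42, 49, 56, 38, 70,
    43, 58, 48, 59, 57, 43, 40, 44, 33, 47, 46, 57,
    51, 59, 63, 50, 48, 41, 47, 36, 56, 61, 50, 61,
    49, 47, 32, 57, 61, 51, 60, 67, 41, 53, 45, 52,
    48, 70,
]

mean = statistics.mean(support)        # 50.74
median = statistics.median(support)    # 50.0 (average of the 25th and 26th sorted values)
modes = statistics.multimode(support)  # [57], which occurs four times

print(f"mean = {mean}, median = {median}, mode(s) = {modes}")
```

The mean and median are close here, which is one rough hint that the distribution is fairly symmetric; a histogram or stem-and-leaf plot is still the more reliable way to judge shape.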
Finding the spread: for each measure, why is it or
isn't it a good measure of spread? When would it be
good? When would it be bad?
• Range: the difference between the highest and lowest score
in a distribution.
• Interquartile Range: measure of the spread of the middle
50% of the scores. Defined as the difference between the 75th
and 25th percentiles.
• Variance: sum of the squared deviations from the mean
divided by N-1 (where N is the sample size).
• Standard Deviation: square root of the variance. Measure of
variability in the same units as the data.
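Written as formulas, the four measures look roughly like this. The symbols are assumptions introduced for illustration: $x_1, \dots, x_N$ are the sample values, $\bar{x}$ is their mean, and $Q_1$, $Q_3$ are the 25th and 75th percentiles.

```latex
\begin{align*}
\text{Range} &= x_{\max} - x_{\min} \\
\text{IQR} &= Q_3 - Q_1 \\
s^2 &= \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \\
s &= \sqrt{s^2}
\end{align*}
```

The range uses only the two most extreme values, so a single outlier can change it a lot, while the IQR ignores the top and bottom quarters of the data entirely.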
Find the measures of spread
• Consider the following set of 100-meter
dash running times: (time in seconds)
9.92, 9.97, 9.99, 10.01, 10.06, 10.07, 10.08,
10.10, 10.13, 10.13, 9.89, 9.98
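A rough sketch of the calculation for these running times, using Python's standard `statistics` module. Percentile conventions differ between textbooks and software; this example assumes the module's default ("exclusive") quartile method, so a hand calculation using another convention may give a slightly different IQR.

```python
import statistics

# 100-meter dash times in seconds, copied from the slide above.
times = [9.92, 9.97, 9.99, 10.01, 10.06, 10.07, 10.08,
         10.10, 10.13, 10.13, 9.89, 9.98]

# Range: highest minus lowest value (about 0.24 s).
data_range = max(times) - min(times)

# Quartiles with the default "exclusive" method; IQR = Q3 - Q1 (about 0.1225 s).
q1, _, q3 = statistics.quantiles(times, n=4)
iqr = q3 - q1

# Sample variance divides by N - 1 (about 0.00633 s^2).
variance = statistics.variance(times)

# Standard deviation is the square root of the variance (about 0.0796 s).
std_dev = statistics.stdev(times)

print(f"range = {data_range:.2f}")
print(f"IQR = {iqr:.4f}")
print(f"variance = {variance:.5f}")
print(f"standard deviation = {std_dev:.4f}")
```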
What does this all have to do
with random variables?
• Recall: what is a random variable?
• What are the concepts we talked about with
random variables?
• How do those concepts relate to what we
talked about today?
• How do we use random variables in statistics?
• How do probability concepts come into
play?