Week_1_Review of Statistics STAT 410

STAT 410: Applied Statistics Methods – Linear Models
Week 1: Review of Statistics
Introduction to Statistics
Learning Objectives:
A. Introduction to Statistics
1. Explain that Statistics is the science of learning from data; specifically, the study of the
variability in data.
2. Explain the process of Statistics: data production/collection, data analysis, and
statistical inference/interpretation.
3. Given a set of raw data, identify the individuals and the variables.
4. Given a variable, determine whether it is categorical or quantitative.
5. Be able to read an article and identify the process of Statistics.
Vocabulary: Statistics, population, individual, sample, statistic, descriptive statistics, inferential
statistics, parameter, variable, quantitative variable, qualitative (categorical) variable, discrete
variable, continuous variable, nominal variable, ordinal variable.
Introduction to the Practice of Statistics
The first misconception we must break is that Statistics is just another math class. In fact,
Statistics is a science, as its definition makes clear.
Statistics:
Here is a comparison:
Mathematics
• Deductive Reasoning
• One correct answer, 100% correct
• Certainty
• Based on logic
• No context necessary
Statistics
• Inductive Reasoning
• A best answer supported by evidence
• Uncertainty
• Based on data
• Context matters
Because Statistics is a science, it follows its own “scientific method”, called the Process of Statistics.
The Process of Statistics
The process or science of Statistics can be broken down into six steps.
Step 1: Identify the research objective (question)
Step 2: Collect the data needed to answer the question from Step 1
Step 3: Explore the data (Descriptive statistics)
Step 4: Draw Statistical Inferences
Step 5: Formulate general conclusions based on statistical inference
Step 6: Revisit, revise, or retest as necessary
From Introduction to Statistical Investigations. ISBN: 978-1-118-95667-0
1. Statement of research question: There may be more than one way to phrase the question or
more than one question to ask in this situation. The research question must be well-defined
in order to direct the process through the next steps.
2. Producing Data: Once the Research Question is defined, we must determine the best way to
collect information to help us answer that question; we often call such information data.
Before we can determine how to collect our data we must determine the following based on
our Research Question:
Population:
Individual:
Sample:
We must also determine what we wish to measure (information to collect) from our individuals
in order to answer our Research Question.
Variable:
We must also determine the best way to collect and measure the data from our sample.
Observational Study:
Experimental Study:
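As a minimal sketch of how population, individual, sample, and variable relate, assume a small made-up population of students. Every name and value below is hypothetical. Each record is one individual, each key is one variable, and the sample is a simple random sample drawn with Python's standard library.

```python
import random

# Hypothetical population: every student enrolled in a made-up course.
# Each dictionary is one individual; each key is one variable.
population = [
    {"id": 1, "major": "Biology", "height_cm": 170.2},
    {"id": 2, "major": "Statistics", "height_cm": 165.0},
    {"id": 3, "major": "Biology", "height_cm": 181.5},
    {"id": 4, "major": "History", "height_cm": 158.9},
    {"id": 5, "major": "Statistics", "height_cm": 175.3},
    {"id": 6, "major": "History", "height_cm": 168.4},
]


def draw_sample(individuals, n, seed=0):
    """Return a simple random sample of n individuals, without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.sample(individuals, n)


sample = draw_sample(population, n=3)
variables = sorted(population[0].keys())

print("Variables:", variables)
print("Sampled individuals:", [person["id"] for person in sample])
```

The sample is a subset of the population, and the same variables are recorded for every individual in it.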
There are two main types of variables:
Quantitative variable:
Examples:
Qualitative (categorical) variable:
Examples:
We can further break down variable types as follows:
Qualitative Variables
Nominal Variable:
Ordinal Variable:
Quantitative Variables
Discrete Variable:
Continuous Variable:
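The following sketch illustrates why the variable type matters: it determines which summaries make sense. The survey data, the variable names, and the `summarize` helper are all hypothetical, and the type labels are assigned by hand rather than detected automatically.

```python
from statistics import mean, median_low, mode

# Hypothetical survey responses; names and values are made up.
survey = {
    "eye_color": ["brown", "blue", "brown", "green"],
    "class_year": ["freshman", "junior", "senior", "junior"],
    "siblings": [0, 2, 1, 3],
    "height_cm": [170.2, 165.0, 181.5, 158.9],
}

# Variable types assigned by the analyst: (main type, subtype).
variable_types = {
    "eye_color": ("qualitative", "nominal"),      # categories with no order
    "class_year": ("qualitative", "ordinal"),     # categories with an order
    "siblings": ("quantitative", "discrete"),     # counts
    "height_cm": ("quantitative", "continuous"),  # measurements
}

YEAR_ORDER = ["freshman", "sophomore", "junior", "senior"]


def summarize(name):
    """Return a summary of center that suits the variable's type."""
    values = survey[name]
    kind, subtype = variable_types[name]
    if kind == "quantitative":
        return mean(values)  # arithmetic is meaningful
    if subtype == "ordinal":
        # Only the ordering is meaningful, so take the median category.
        ranks = [YEAR_ORDER.index(value) for value in values]
        return YEAR_ORDER[median_low(ranks)]
    return mode(values)  # nominal: only the most frequent category


for name in survey:
    print(name, variable_types[name], summarize(name))
```

A mean of eye colors is undefined, while a mean number of siblings is meaningful even though no individual can have that exact value.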
It is important to realize that a variable in a study can be obtained in two ways: it may be
measured directly or derived from other variables.
Measured Variables:
Examples:
Derived Variables:
Examples:
It is important to consider whether a derived statistic is appropriate for a given research
question. Example: http://www.thedailybeast.com/articles/2015/08/24/how-the-media-failsbasic-math.html
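One common illustration of a derived variable, offered here as an assumption rather than as the example intended by these notes, is body mass index (BMI), which is computed from two measured variables. The values and the `body_mass_index` helper are hypothetical.

```python
# Measured variables (hypothetical values): recorded directly for each individual.
individuals = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 82.5, "height_m": 1.80},
]


def body_mass_index(weight_kg, height_m):
    """Derived variable: BMI = weight in kilograms / (height in meters) squared."""
    return weight_kg / height_m ** 2


# Add the derived variable to each individual's record.
for person in individuals:
    person["bmi"] = body_mass_index(person["weight_kg"], person["height_m"])
    print(person)
```

Whether BMI is an appropriate derived variable depends on the research question; it combines weight and height in one particular way and discards other information.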
3. Exploratory Data Analysis/Description: Once we have collected our data, we want to
summarize it so that we can begin to answer our Research Question.
Statistic (general):
Descriptive statistics:
List some of the descriptive statistics learned in previous statistics courses. When do you use
them (what types of variables and research questions)?
List some of the graphic types learned in previous statistics courses. When do you use them
(what types of variables and research questions)?
The goal of Exploratory Data Analysis (EDA) is to develop intuition about the data, measure the
observed effect in the data (differences, relationships, etc.), and produce information to inform
your audience about the data you collected to answer your research question. Descriptive
Statistics and Graphics provide the audience with information to assess the validity and
generalizability of your inferences and conclusions.
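As a minimal sketch of descriptive statistics, the code below summarizes one quantitative and one categorical variable using only Python's standard library. The data values and the two helper functions are hypothetical.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical sample data; the values are made up for illustration.
heights_cm = [170.2, 165.0, 181.5, 158.9, 175.3, 168.4]  # quantitative
majors = ["Biology", "Statistics", "Biology",
          "History", "Statistics", "Biology"]            # categorical


def describe_quantitative(values):
    """Center and spread for a quantitative variable."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


def describe_categorical(values):
    """Count and proportion of each category for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {category: (count, count / total) for category, count in counts.items()}


height_summary = describe_quantitative(heights_cm)
major_summary = describe_categorical(majors)
print(height_summary)
print(major_summary)
```

Means and standard deviations apply to quantitative variables; counts and proportions are the natural summaries for categorical ones.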
4. Statistical Inference: Once we have summarized the data from our sample, we want to use
that data to say something about our population.
Inferential statistics:
Parameter:
Inferential statistics allow you to draw conclusions about your specific statistical hypothesis
and to assess the confidence you have in those conclusions (statistical significance). They
let you use sample statistics to draw conclusions about the population parameters of
interest, which usually cannot be measured directly.
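The distinction between a parameter and a statistic can be sketched with a simulation. The code assumes a made-up population of 10,000 heights whose mean is known only because we generated it. In a real study the parameter would be unknown. All names and numbers are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(410)  # fixed seed so the sketch is reproducible

# Hypothetical population of heights (cm). In practice we could not list it.
population = [rng.gauss(170, 10) for _ in range(10_000)]
parameter = mean(population)  # population mean: a parameter


def sample_mean(n):
    """Mean of one simple random sample of size n: a statistic."""
    return mean(rng.sample(population, n))


# Five different samples give five different statistics.
sample_means = [sample_mean(50) for _ in range(5)]

print("Population mean (parameter):", round(parameter, 2))
print("Sample means (statistics):", [round(m, 2) for m in sample_means])
```

The parameter is a single fixed number, while the statistic changes from sample to sample. Inference is about quantifying that sample-to-sample variability.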
Built into Statistical Inference is the concept of probability: how likely an observed result
would be if a hypothesis were true. Probability allows us to assess the certainty of our
conclusions in the context of the study.
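One way to make "how likely a result would be if a hypothesis were true" concrete is a permutation test, shown here as a sketch and not necessarily the method used in this course. Under the hypothesis that group membership is unrelated to height, the group labels can be shuffled, and the p-value is approximated by the share of shuffles that produce a difference in means at least as large as the one observed. The data and the function name are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Approximate two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign group labels at random
        shuffled_diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if shuffled_diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Hypothetical heights (cm) for two groups; the values are made up.
group_a = [178.1, 182.4, 175.0, 180.3, 176.8, 184.2, 179.5, 181.0]
group_b = [164.2, 168.9, 161.5, 166.3, 170.1, 163.7, 167.4, 165.8]

p_value = permutation_p_value(group_a, group_b)
print("Approximate p-value:", p_value)
```

A small p-value means a difference this large would rarely arise from random labeling alone, which is evidence against the hypothesis of no difference. It does not make the conclusion certain.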
What were some of the key concepts learned about inferential statistics from your previous
courses? (Think about p-values, models, confidence intervals, hypothesis tests, etc.)
5. Draw general conclusions: The conclusions you draw about the specific parameter of
interest may support more general conclusions about the population. For instance,
showing that there is a statistically significant difference between the mean height of men
and the mean height of women can be generalized to “On average, men are taller than women”.
6. Revisit, revise or retest: At the end of the process of statistics you are rarely finished
exploring the research question. You may have many new questions you want to explore as
a result of the conclusions of the process. Or you may need to revisit the question because
you were unable to answer the research question through the process. The Process of
Statistics is continuous and living, as conclusions will never be 100% certain.
SUMMARY: Each step of the Process of Statistics must be well executed in order
to conduct a meaningful and ethical statistical analysis.
To consider the Process of Statistics more generally, let’s look at the following article:
http://fivethirtyeight.com/features/science-isnt-broken/