STAT 410: Applied Statistics Methods – Linear Models
Week 1: Review of Statistics

Introduction to Statistics

Learning Objectives:

A. Introduction to Statistics
1. Explain that Statistics is the science of learning from data; specifically, the study of the variability in data.
2. Explain the process of Statistics: data production/collection, data analysis, and statistical inference/interpretation.
3. Given a set of raw data, identify the individuals and the variables.
4. Given a variable, determine whether it is categorical or quantitative.
5. Be able to read an article and identify the process of Statistics.

Vocabulary: Statistics, population, individual, sample, statistic, descriptive statistics, inferential statistics, parameter, variable, quantitative variable, qualitative (categorical) variable, discrete variable, continuous variable, nominal variable, ordinal variable.

Introduction to the Practice of Statistics

The first misconception we must break is that Statistics is just another math class. The fact is that Statistics is a science, as is evidenced by its definition.

Statistics:

Here is a comparison:

Mathematics                          Statistics
Deductive reasoning                  Inductive reasoning
One correct answer, 100% correct     A best answer supported by evidence
Certainty                            Uncertainty
Based on logic                       Based on data
No context necessary                 Context matters

Since Statistics is a science, it follows its own “scientific method” called the Process of Statistics.

The Process of Statistics

The process or science of Statistics can be broken down into six steps.
Step 1: Identify the research objective (question)
Step 2: Collect the data needed to answer the question from Step 1
Step 3: Explore the data (descriptive statistics)
Step 4: Draw statistical inferences
Step 5: Formulate general conclusions based on statistical inference
Step 6: Revisit, revise, or retest as necessary

From Introduction to Statistical Investigations. ISBN: 978-1-118-95667-0

1. Statement of research question: There may be more than one way to phrase the question, or more than one question to ask, in a given situation. The research question must be well defined in order to direct the process through the next steps.

2. Producing Data: Once the Research Question is defined, we must determine the best way to collect information to help us answer that question; we call such information data. Before we can determine how to collect our data, we must determine the following based on our Research Question:

Population:

Individual:

Sample:

We must also determine what we wish to measure (the information to collect) from our individuals in order to answer our Research Question.

Variable:

We must also determine the best way to collect and measure the data from our sample.

Observational Study:

Experimental Study:

There are two main types of variables:

Quantitative variable:
Examples:

Qualitative (categorical) variable:
Examples:

We can further break down variable types as follows:

Qualitative Variables
Nominal Variable:
Ordinal Variable:

Quantitative Variables
Discrete Variable:
Continuous Variable:

It is important to realize that there are two ways to obtain a variable in a study: it may be measured directly or derived from other variables.
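As a minimal sketch of the variable types above and of the measured/derived distinction, the Python snippet below uses a small made-up data set. The records, the field names, and the choice of body mass index (BMI) as the derived variable are illustrative assumptions, not part of the course materials.

```python
# Hypothetical individuals (one dict per individual); all values are made up.
individuals = [
    {"id": 1, "eye_color": "brown", "class_year": "junior",
     "siblings": 2, "height_m": 1.75, "weight_kg": 70.0},
    {"id": 2, "eye_color": "blue", "class_year": "senior",
     "siblings": 0, "height_m": 1.62, "weight_kg": 58.5},
]

# One possible classification of each variable: (main type, subtype).
variable_types = {
    "eye_color": ("qualitative", "nominal"),      # categories with no order
    "class_year": ("qualitative", "ordinal"),     # categories with an order
    "siblings": ("quantitative", "discrete"),     # countable values
    "height_m": ("quantitative", "continuous"),   # any value in a range
    "weight_kg": ("quantitative", "continuous"),
}


def bmi(weight_kg, height_m):
    """Derived variable: computed from two measured variables."""
    return weight_kg / height_m ** 2


# Height and weight are measured directly; BMI is derived from them.
for person in individuals:
    person["bmi"] = bmi(person["weight_kg"], person["height_m"])
```

Whether a derived variable such as BMI is appropriate depends on the research question; the code only shows how it is computed from the measured values.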
Measured Variables:
Examples:

Derived Variables:
Examples:

It is important to consider whether a derived statistic is appropriate for a given research question.
Example: http://www.thedailybeast.com/articles/2015/08/24/how-the-media-failsbasic-math.html

3. Exploratory Data Analysis/Description: Once we have collected our data, we want to summarize it so that we can begin to answer our Research Question.

Statistic (general):

Descriptive statistics:

List some of the descriptive statistics learned in previous statistics courses. When do you use them (what types of variables and research questions)?

List some of the graphic types learned in previous statistics courses. When do you use them (what types of variables and research questions)?

The goal of Exploratory Data Analysis (EDA) is to develop intuition about the data, measure the observed effect in the data (differences, relationships, etc.), and produce information to inform your audience about the data you collected to answer your research question. Descriptive statistics and graphics provide the audience with information to assess the validity and generalizability of your inferences and conclusions.

4. Statistical Inference: Once we have summarized the data from our sample, we want to use the data to say something about our population.

Inferential statistics:

Parameter:

Inferential statistics allow you to draw conclusions about your specific statistical hypothesis and to assess the confidence you have in your conclusion (statistical significance). Inferential statistics allow you to use the sample statistics to draw conclusions about the population parameters of interest (which are unmeasurable).
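The relationship between a sample statistic and a population parameter can be sketched in Python as follows. This is an illustration under stated assumptions: the population is simulated (so its mean is known only because we generated it), the heights are made-up values, and the interval uses the large-sample normal approximation rather than any method prescribed by the course.

```python
import random
import statistics

random.seed(410)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated heights in cm (made-up values).
population = [random.gauss(170, 10) for _ in range(10_000)]

# Parameter: in practice unmeasurable; known here only because we simulated it.
mu = statistics.mean(population)

# Simple random sample of 100 individuals from the population.
sample = random.sample(population, 100)

# Statistics: computed from the sample and used to estimate the parameter.
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)          # sample standard deviation
se = s / len(sample) ** 0.5           # standard error of the mean

# Approximate 95% confidence interval for mu (normal approximation).
z = statistics.NormalDist().inv_cdf(0.975)
ci = (x_bar - z * se, x_bar + z * se)

print(f"parameter mu = {mu:.2f}, statistic x_bar = {x_bar:.2f}")
print(f"approximate 95% CI for mu: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Rerunning with a different seed gives a different sample, a different statistic, and a different interval, while the parameter stays fixed; that sample-to-sample variability is what inference has to account for.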
Built into Statistical Inference is the concept of probability: how likely a result would be to be observed if a hypothesis were true or false. Probability allows us to assess the certainty of our conclusions in the context of the study.

What were some of the key concepts learned about inferential statistics from your previous courses? (Think about p-values, models, confidence intervals, hypothesis tests, etc.)

5. Draw general conclusions: The conclusions you draw about the specific parameter of interest may provide you with more general conclusions about the population. For instance, showing that there is a statistically significant difference between the mean height of men and the mean height of women can be generalized to “On average, men are taller than women”.

6. Revisit, revise, or retest: At the end of the process of statistics, you are rarely finished exploring the research question. You may have many new questions you want to explore as a result of the conclusions of the process. Or you may need to revisit the question because you were unable to answer the research question through the process. The Process of Statistics is continuous and living, as conclusions will never be 100% certain.

SUMMARY: Each step of the Process of Statistics must be well executed in order to conduct a meaningful and ethical statistical analysis.

To consider the Process of Statistics more generally, let’s look at the following article: http://fivethirtyeight.com/features/science-isnt-broken/
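The idea from step 4 of asking how likely a result would be if a hypothesis were true can be sketched with the height example from step 5. The snippet below is a minimal illustration, not the course's method: the heights are made-up values, the helper `permutation_p_value` is a hypothetical name, and a permutation test is used only because it needs nothing beyond the Python standard library.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical sample heights in cm (made-up values, not real data).
men = [178.2, 175.1, 181.4, 169.8, 176.5, 183.0, 172.3, 179.9]
women = [164.0, 167.2, 160.5, 170.1, 162.8, 165.9, 158.7, 168.4]

# Observed effect: difference in sample means (a statistic).
observed = statistics.mean(men) - statistics.mean(women)


def permutation_p_value(a, b, observed_diff, n_perm=5000):
    """Estimate how often a difference at least as large as the observed
    one arises when group labels are shuffled, that is, when the
    hypothesis of no difference between the groups is true."""
    pooled = a + b  # new list, so the original samples are not modified
    count = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        diff = (statistics.mean(pooled[:len(a)])
                - statistics.mean(pooled[len(a):]))
        if abs(diff) >= abs(observed_diff):
            count += 1
    return count / n_perm


p_value = permutation_p_value(men, women, observed)
print(f"observed difference = {observed:.2f} cm, p-value = {p_value:.4f}")
```

A small p-value says that a difference this large would rarely arise from shuffled labels alone; whether that supports a general conclusion about the population still depends on how the data were collected in step 2.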