Chapters 1 and 2
Week 1, Monday

Chapter 1: Stats Starts Here

What is Statistics?
“Statistics is a way of reasoning, along with a collection of tools and methods, designed to help us understand the world” -- Textbook, page 2
Involves:
1) Collecting, analyzing, presenting, interpreting data
2) Making decisions

Chapter 2: Data

What are Data?
“Data are values along with their context” -- Textbook, page 2
“We can make the meaning clear if we organize the values into a data table” -- Textbook, page 8
In the data table below, the columns are “variables” and the rows are “cases” (also called “records”).

Name  Student ID  Gender  Age  Status  GPA
Joe   00001       Male    23   Grad    4.0
Amy   00002       Female  19   Ugrad   3.5
Bob   00003       Male    32   Ugrad   3.0

Sample VS Population
“Often, the cases are a sample of cases selected from some larger population that we’d like to understand” -- Textbook, page 9
Example: The data set above is a sample of three students from the population “All University of Akron Students”
Goal: A sample that is representative of the population

Types of Variables
Categorical: “When a variable names categories and answers questions about how cases fall into those categories” (Gender, Status)
Quantitative: “When a measured variable with units answers questions about the quantity of what is measured” (Age, GPA)
Pitfalls:
1) Often numeric values are quantitative, but not always!
(Student ID is not a “measured variable with units”)
2) We could turn Age into a categorical variable by assigning labels: “younger” for students under 22 and “older” for students over 22

The data table with Age recoded as a categorical variable:

Name  Student ID  Gender  Age      Status  GPA
Joe   00001       Male    Older    Grad    4.0
Amy   00002       Female  Younger  Ugrad   3.5
Bob   00003       Male    Older    Ugrad   3.0

Types of Variables
Identifier: A unique value for each case (“[When] there are as many categories as individuals and only one individual in each category”) whose value is not “useful” -- Textbook, page 12 (Student ID)

Chapter 3
Week 1, Wednesday and Friday

Chapter 3: Displaying and Describing Categorical Data

Data Set for Chapter 3 Slides
Data are from a sample of 8 students from a graduate level Statistics class
An identifier (Name)
Three categorical variables:
Gender (male, female)
Handed (right, left)
Grade (A, B, C, D, F)

Frequency Table

Grade  Count      Grade  %
A      2          A      25
B      3          B      37.5
C      2          C      25
D      1          D      12.5

Frequency Table: Displays counts for each category.
Relative Frequency Table: Displays percentages/proportions (describes the distribution: names the possible categories and tells how frequently they occur).

Graphing Categorical Data
Bar Chart: Displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison.
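As a minimal sketch of how the counts, the percentages, and the bar heights for the Grade variable are obtained, the Python below tallies the eight grades. The slides give only the counts, so the list of individual grades is rebuilt from them; the variable names are illustrative and not from the textbook.

```python
from collections import Counter

# Assumption: the slides list only the counts (A 2, B 3, C 2, D 1),
# so the eight individual grades are rebuilt from those counts.
grades = ["A"] * 2 + ["B"] * 3 + ["C"] * 2 + ["D"] * 1

counts = Counter(grades)          # frequency table: grade -> count
total = sum(counts.values())      # number of cases in the sample

# Relative frequency table: each count as a percent of all cases.
percents = {grade: 100 * count / total for grade, count in counts.items()}

# Print both tables plus a text bar chart (one '#' per student).
for grade in sorted(counts):
    bar = "#" * counts[grade]
    print(f"{grade}  count={counts[grade]}  percent={percents[grade]:.1f}  {bar}")
```

Each bar has the same width and a length equal to its count, so the area of a bar matches the value it represents.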
Pie Chart: Shows the whole group of cases as a circle, slicing it into pieces whose size is proportional to the fraction of the whole in each category.
Area Principle: The area occupied by a part of the graph should correspond to the magnitude of the value it represents.

Contingency Table

         GENDER
GRADE    Male  Female  Total
A        0     2       2
B        3     0       3
C        1     1       2
D        1     0       1
Total    5     3       8

Contingency Table: A two-way table for categorical variables showing how the individuals are distributed along each variable.
Marginal Distribution: Can be obtained from the contingency table by converting the row (or column) totals to percents of the grand total.

Marginal distribution of Grade:

Grade  Fraction  Percent
A      2/8       25%
B      3/8       37.5%
C      2/8       25%
D      1/8       12.5%

Marginal distribution of Gender:

Gender  Fraction  Percent
M       5/8       62.5%
F       3/8       37.5%

In future assignments you’ll have to answer the following types of questions from a contingency table:
1) What is the percent of students that earned an A? 2/8 = 25%
2) What is the percent of students that are female?
3/8 = 37.5%
3) What is the percent of females that earned an A? (Called a “conditional probability”) 2/3 = 66.7% (the denominator is the 3 females, not all 8 students)
4) What is the percent of students that earned an A or B? (2 + 3)/8 = 5/8 = 62.5%
5) What is the percent of students that earned an A and B? 0/8 = 0% (each student earns exactly one grade)
6) What is the percent of students that are female and earned a C? 1/8 = 12.5%
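The six answers can be checked with a short sketch like the one below. The counts are copied from the contingency table; the dictionary layout and the helper name pct are illustrative assumptions, not something the slides or the textbook prescribe.

```python
# Counts copied from the contingency table: counts[gender][grade].
counts = {
    "Male":   {"A": 0, "B": 3, "C": 1, "D": 1},
    "Female": {"A": 2, "B": 0, "C": 1, "D": 0},
}


def pct(part, whole):
    """Return part/whole as a percentage (hypothetical helper)."""
    return 100 * part / whole


# Margins of the table: totals per gender, per grade, and overall.
gender_totals = {g: sum(row.values()) for g, row in counts.items()}
grade_totals = {gr: sum(counts[g][gr] for g in counts) for gr in "ABCD"}
grand_total = sum(gender_totals.values())

answers = {
    1: pct(grade_totals["A"], grand_total),                 # earned an A
    2: pct(gender_totals["Female"], grand_total),           # are female
    3: pct(counts["Female"]["A"], gender_totals["Female"]), # females that earned an A
    4: pct(grade_totals["A"] + grade_totals["B"], grand_total),  # A or B
    5: pct(0, grand_total),                                 # A and B: no student has both
    6: pct(counts["Female"]["C"], grand_total),             # female and earned a C
}

for number, value in answers.items():
    print(f"{number}) {value:.1f}%")
```

The main thing that changes between questions is the denominator: the grand total of 8 for questions 1, 2, 4, 5 and 6, and the female column total of 3 for question 3.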