What is Statistics?

Definition of Statistics
Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

Branches of Statistics
The study of statistics has two major branches: descriptive (exploratory) statistics and inferential statistics.
Descriptive statistics is the branch of statistics that involves the organization, summarization, and display of data. In this course, Chapters 1 through 5 cover descriptive statistics.
Inferential statistics is the branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability. In this course, inferential statistics begins in Chapter 9.

Chapter 1 Picturing Distributions with Graphs

Chapter outline
Individuals and variables
Categorical variables: pie charts and bar graphs
Quantitative variables: histograms
Interpreting histograms
Quantitative variables: stemplots
Time plots

Examining Distributions: Introduction
Definitions:
Individuals: the objects described by a set of data
Variable: any characteristic of an individual
Examples
College student data: the individuals are all currently enrolled students; the variables are date of birth, gender, major, GPA, and so on
Employee data: the individuals are all employees; the variables are age, gender, salary, and job type

Variables
Categorical variable: places an individual into a category or group
Quantitative variable: takes numerical values
Distribution of a variable: what values it takes and how often it takes these values
We can see distributions easily using graphs. We can also describe distributions using numbers that summarize the data.
Example 1.1 (Page 5)

Exploratory data analysis describes the main features of data.
1. Examine each variable
2. Study the relationships among the variables
3. Start with graphs and add some numerical summaries.

Categorical variables: bar graphs and pie charts
The distribution of a categorical variable lists the categories and gives the count or percent of individuals in each category.
Graphs: bar graph, pie chart
Pie chart: Figure 1.1 (P. 7). A pie chart must include all categories.
Bar graph: Figure 1.2 (P. 8). The height of each bar shows the individual's weight. [gaps between bars; the order of the bars is not important]
Note: For now, these graphs display a single variable (for example: college major, tire model, final exam grade).
The pie chart in Figure 1.1 shows each material as a part of the whole.

Quantitative variables: histograms
How to make histograms
Step 1. Choose the classes. Divide the range of the data into classes of equal width.
Step 2. Count the individuals in each class.
Step 3. Draw the histogram.
Example 1.3 (P. 11)

Interpreting histograms
Interpretation: What do we see? Look for the overall pattern and for striking deviations from it.
Overall pattern: shape, center, and spread. The shape may be symmetric, skewed to the right or left, or clustered.
Striking deviations: outliers
Example 1.5 (P. 13)
Example 1.6 (P. 14)

Quantitative variables: stemplots
A stemplot is another way to display the distribution of a quantitative variable.
How to make stemplots
1. Sort the data in increasing order.
2. Separate each observation into a stem, consisting of all but the final digit, and a leaf, the final digit.
3. Write the stems in a vertical column with the smallest at the top, and draw a vertical line at the right of this column.
4. Write each leaf in the row to the right of its stem, in increasing order out from the stem.

Data: 80, 52, 86, 94, 76, 48, 92, 69, 79, 45
Step 1. Sort the data in increasing order.
Step 2. Decide the stems.
Step 3. Fill in the leaves.

Examples and Exercises
Example 1.7 (P. 16) using Table 1.1 (P. 10)
Example 1.8 (P. 16)
Tips
1. Rounding
2. Splitting stems

For small data sets, a stemplot is quicker to make than a histogram and presents more detailed information.
A stemplot keeps the actual data values.

Time plots
A time plot is for variables that are measured at intervals over time.
Example 1. The cost of raw materials for a manufacturing process each month.
Example 2. The price of a stock at the end of each day.
To display change over time, make a time plot. Plot each observation against the time at which it was measured.
1. Put time on the horizontal scale
2. Put the variable on the vertical scale
3. Connect the data points by lines
Special case: time series (for a variable measured at regular intervals)
You can see: 1) seasonal variation, 2) trend
Example 1.9 (P. 18)

Free tutoring
The Math Assistance Complex (MAC)
122 Kell Hall
MAC website (online tutoring available): www.gsu.edu/~wwwclc/mathlab.htm
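
Worked sketch: stemplot for the data 80, 52, 86, 94, 76, 48, 92, 69, 79, 45
The stemplot steps from this chapter can be sketched in a few lines of Python. This is only an illustration, not part of the course materials. It assumes the observations are non-negative whole numbers, so the final digit is the leaf and the remaining digits are the stem. The function name stemplot is a hypothetical helper chosen here.

```python
from collections import defaultdict

def stemplot(values):
    """Return the rows of a stemplot as strings, one row per stem.

    Assumes non-negative whole-number observations (an assumption of
    this sketch, not a requirement stated in the notes).
    """
    leaves = defaultdict(list)
    for v in sorted(values):            # sort the data in increasing order
        stem, leaf = divmod(v, 10)      # stem = all but final digit, leaf = final digit
        leaves[stem].append(leaf)

    rows = []
    # Stems go in a column with the smallest at the top; stems with no
    # observations are still listed so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        # Leaves go to the right of the stem, in increasing order.
        row = f"{stem} | " + " ".join(str(d) for d in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows

data = [80, 52, 86, 94, 76, 48, 92, 69, 79, 45]
for row in stemplot(data):
    print(row)
```

For this data set the stems are the tens digits 4 through 9, and each row lists the ones digits of the observations that share that stem, so every original value can be read back from the plot.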