Stat 226 – Introduction to Business Statistics I
Spring 2009, Section A
Professor: Dr. Petrutza Caragea
Tuesdays and Thursdays 9:30-10:50 a.m.

Introduction

What is Statistics?

Statistics is the science of collecting, describing and interpreting data, allowing for data-based decision making.

“I like to think of statistics as the science of learning from data...” (Jon Kettenring, ASA President 1997)

In business and industry, statistics can be used to quantify unknowns in order to optimize resources, e.g.
  1. Predict the demand for products and services.
  2. Check the quality of items manufactured in a facility.
  3. Manage investment portfolios.
  4. Forecast how much risk activities entail, and calculate fair and competitive insurance rates.

Introduction: Descriptive vs. Inferential

We distinguish between descriptive and inferential statistics.

Descriptive statistics is the collection, presentation and description of data in the form of graphs, tables and numerical summaries such as averages, variances etc.
Goals:
  - look for patterns
  - summarize and present data
  - quick information
  - compare several groups, i.e. one can easily look for differences and similarities

Compared to inferential statistics:

Inferential statistics deals with the interpretation of data as well as drawing conclusions and making generalizations based on data for a larger group of subjects.
Goals:
  - making data-based decisions
  - generalizing information obtained from descriptive analysis to a larger group of individuals

Example: Before movies are released they are previewed by a selected audience. Assume 200 people are asked to provide an overall rating for a movie, yielding the following responses:
  24% very satisfied
  26% satisfied
  33% in between
  12% dissatisfied
   5% very dissatisfied

⇒ 24% of the 200 previewers were very satisfied with the movie: this is a descriptive statement based on a sample of 200 previewers.
⇒ 24% of all people who will see the movie will be very satisfied with the movie: this is an inferential statement for the entire population of individuals.

Introduction: Population vs. Sample

Population
The population in a study is the entire group of individuals or subjects about which we want to gain information.
Examples:
  - all ISU students currently enrolled
  - all Audi A6 vehicles manufactured in a year
  - all customers banking with Wells Fargo

Sample
A sample is a subgroup (or part) of a population from which we obtain information in order to draw conclusions about the entire population.
Examples:
  - every 5th ISU student currently enrolled
  - all Audi A6 vehicles manufactured on a single day
  - 100 randomly chosen customers banking with Wells Fargo

We need to be careful: the terms population and sample are relative. Consider all college students in the US; then all ISU students are no longer the population of interest but rather a sample.
⇒ Clearly formulate what the population of interest is!
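The two kinds of sample listed in the examples (a fixed number of randomly chosen individuals, and every 5th individual on a list) can be sketched in a few lines of Python. This is only an illustration: the population of 5,000 numbered customers, the seed, and all variable names are made up and do not come from the course material.

```python
import random

# Hypothetical population: ID numbers for 5,000 bank customers (made-up size).
population = list(range(1, 5001))

rng = random.Random(226)  # fixed seed so the sketch is reproducible

# Simple random sample: 100 customers chosen without replacement.
random_sample = rng.sample(population, k=100)

# Systematic sample: every 5th individual on the list, starting with the 5th.
systematic_sample = population[4::5]

print(len(population), len(random_sample), len(systematic_sample))
```

Either sample is a subgroup of the population; which sampling scheme is appropriate depends on the study and is not settled by this sketch.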
When using numerical summaries to describe samples or populations we need to distinguish between a so-called statistic and a parameter:
  - any numerical summary describing a sample is called a statistic
  - any numerical summary describing a population is called a parameter

Example: movie preview
  - 24% of the 200 previewers: 24% – statistic
  - 24% of all people going to see the movie: 24% – parameter

It is important to distinguish between a population parameter and a sample statistic. A parameter is a numerical summary of a population. Populations typically consist of too many individuals for all of them to be observed. For example, it would be impossible to know the average summer earnings of all university students. This would require us to identify, find, and question thousands of students. Therefore we will hardly ever know the true parameter value of a population. It is, however, feasible to select a sample of 100 students (using proper randomization) and then compute the average earnings of these 100 students. Any numerical measure computed from a subset of the population (typically a sample) is a statistic and can be observed.

Introduction: Parameter vs. Statistic

Parameter
A parameter is a numerical summary for the entire population. It typically remains unknown as we cannot observe the entire population. We will use the information based on the data, such as a sample mean, to get an idea of what the value of the unknown population parameter is; this process is inferential.

Statistic
Statistics are numerical summaries (e.g. an average) that are obtained from real data. We can actually observe a statistic; statistics are descriptive.
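The summer-earnings example can be sketched as a small simulation. The population below is entirely hypothetical (20,000 simulated earnings with a made-up center and spread); in a real study the population mean could not be computed this way, which is exactly why the sample mean is used instead.

```python
import random
from statistics import mean

rng = random.Random(2009)  # fixed seed so the sketch is reproducible

# Hypothetical population: summer earnings (in dollars) of 20,000 students.
# The values are simulated for illustration only.
population = [max(0.0, rng.gauss(4000, 1200)) for _ in range(20000)]

# Parameter: the population mean. In practice this is unknown.
parameter = mean(population)

# Statistic: the mean of a random sample of 100 students.
sample = rng.sample(population, k=100)
statistic = mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The two printed numbers will generally be close but not equal: the statistic is observed from 100 students, while the parameter describes all 20,000.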
Introduction: Individuals and Variables

some more definitions...

Individuals
Individuals are subjects/objects of the population of interest; they can be people but also business firms, common stocks or any other object that we want to study. Examples?

A Variable
A variable is any characteristic of an individual that we are interested in. A variable typically will take on different values for different individuals. Examples?

Introduction: Kinds of variables

Categorical variables
Individuals can be placed into one of several categories. We distinguish nominal and ordinal variables.
  nominal: no order possible
    - gender
    - religion
    - race
    - colors
  ordinal: order is possible
    - grades
    - educational degrees

Quantitative variables
Quantitative variables take numerical values for which arithmetic operations such as adding and averaging make sense, e.g.
  - height of a person
  - weight of a person
  - temperature
  - time it takes to run a mile
  - currency exchange rates

Introduction

Distribution
The distribution of a variable describes WHAT values the variable takes and HOW often it takes these values.

Depending on the type of the data (categorical or quantitative) we need to use different graphical and numerical tools to analyze and summarize the data at hand. We will start by describing data graphically:
  - bar graphs, pie charts and Pareto charts can be used to graphically summarize categorical data.
  - a common graphical display for quantitative data is a histogram.
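As a rough illustration of the distribution of a categorical variable, the sketch below tabulates the movie-preview ratings: which values occur (WHAT) and how often (HOW often). The counts are rebuilt from the stated percentages of 200 previewers; the variable names and the text-based bar display are illustrative choices, not part of the course material.

```python
from collections import Counter

# Ratings rebuilt from the movie-preview percentages (200 previewers):
# 24% of 200 = 48, 26% = 52, 33% = 66, 12% = 24, 5% = 10.
ratings = (
    ["very satisfied"] * 48
    + ["satisfied"] * 52
    + ["in between"] * 66
    + ["dissatisfied"] * 24
    + ["very dissatisfied"] * 10
)

# Count how often each category occurs and convert counts to percentages.
counts = Counter(ratings)
n = len(ratings)
percent = {category: 100 * count / n for category, count in counts.items()}

# Print a simple text bar graph: one '#' per two previewers.
for category, count in counts.items():
    bar = "#" * (count // 2)
    print(f"{category:<18} {count:>3} {percent[category]:5.1f}% {bar}")
```

For a quantitative variable the same idea applies, except that values are first grouped into intervals, which is what a histogram displays.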