Chapter 1 Introduction and Data Collection

1.1 WHY LEARN STATISTICS?

GOOD TUNES

Good Tunes is a four-store home entertainment systems retailer. It wants to grow to eight stores in the next three years, and it wants to prepare an electronic slide show to present to banks. It decides to conduct a survey to show that its customers rate it as superior. We will follow this case throughout this book. What will Good Tunes have to do?

STATISTICS IS EVERYWHERE

WASHINGTON - U.S. employers appear to be laying off workers again as the economic recovery weakens. The number of Americans applying for unemployment benefits reached the half-million mark last week for the first time since November.

TALLAHASSEE, Fla. - A poll shows Florida Democratic gubernatorial hopeful Alex Sink on par with or ahead of her prospective Republican opponents. A Quinnipiac University poll released Thursday shows Sink with 31 percent to Bill McCollum's 29 percent in a hypothetical matchup. That's within the poll's margin of error of plus or minus 3 percentage points.

WASHINGTON - The U.S. economy faces even more difficult times ahead with chronic high unemployment rates and slow manufacturing growth hurting the recovery, Congressional Budget Office Director Douglas Elmendorf said on Thursday. The U.S. unemployment rate will not fall to around 5.0 percent until 2014, Elmendorf wrote in his blog about CBO's new economic and budget outlook.

WASHINGTON - Obama said that 60 percent of job losses are coming from small businesses, the sector that under a strong economy hires the most workers.

WASHINGTON - A Pew poll found that 21% of Americans believe that President Obama is a Muslim.

HONOLULU - The marketing department at a major national bank discovered that 30% of customers who purchased premium checking accounts would accept a proposal for overdraft protection.
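The Quinnipiac item above says that a 31 percent to 29 percent lead is "within the poll's margin of error of plus or minus 3 percentage points." The arithmetic behind that statement can be sketched in Python. The function name within_margin and the simple rule it applies (compare the gap between the two shares with the stated margin) are assumptions made for illustration; a pollster may apply a more careful rule to the difference between two candidates.

```python
def within_margin(share_a: float, share_b: float, margin: float) -> bool:
    """Return True when the gap between two poll shares is no larger than the margin.

    All values are in percentage points. This is the simple reading used in the
    news item, not a formal test of the difference between two proportions.
    """
    return abs(share_a - share_b) <= margin


# Figures quoted in the news item: Sink 31%, McCollum 29%, margin +/- 3 points.
sink, mccollum, margin = 31.0, 29.0, 3.0

gap = abs(sink - mccollum)
print(f"Gap: {gap:.0f} points; margin of error: +/- {margin:.0f} points")
print("Lead is within the margin of error:", within_margin(sink, mccollum, margin))
```

Under this reading, a 2-point gap is smaller than the 3-point margin, so the poll alone does not show which candidate is ahead.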
Statistics is a branch of mathematics that transforms numbers into useful information for decision makers.
  Understanding the numbers by reducing them to meaningful patterns.
  Reducing variance in decision making (but not eliminating it).
  Data (numbers) versus information (summarized data).
  Purpose: To visualize patterns in the data.

Alternative definition: Statistics is a pseudoscience akin to numerology and astrology but lacking the precision of the former and the predictive power of the latter.

1.2 STATISTICS IN THE BUSINESS WORLD

Applications
  To summarize business data
  To draw conclusions from that data
  To make reliable forecasts of business activities
  To improve business processes

Descriptive Statistics
  Focuses on collecting, summarizing, presenting, and analyzing a set of data.

Inferential Statistics
  Uses data that have been collected from a small group (the sample) to draw conclusions about a larger group (the population).

1.3 BASIC VOCABULARY OF STATISTICS

A VARIABLE IS A CHARACTERISTIC OF AN ITEM OR INDIVIDUAL.
  Something that varies: sales, expenses, headcount, advertising budget, etc.
  The different values are the data associated with the variable, or just "the data."
  Variables need operational definitions. Example: "Junior." What is the operational definition of "Junior"?

POPULATIONS AND SAMPLES
  A population consists of all the items or individuals about which you want to draw a conclusion.
  A sample is the portion of a population selected for analysis.

PARAMETERS AND STATISTICS
  A parameter is a numerical measure that describes a characteristic of a population.
  A statistic is a numerical measure that describes a characteristic of a sample.
  Example: WASHINGTON - A Pew poll found that 21% of Americans believe that President Obama is a Muslim.
  What is the numerical measure in this item? Is it a parameter or a statistic?

1.5 DATA COLLECTION

CIRCUMSTANCES THAT REQUIRE DATA COLLECTION
  A marketing research analyst needs to assess the effectiveness of a new television advertisement.
  A pharmaceuticals manufacturer needs to determine whether a new drug is more effective than those in current use.
  An operations manager wants to improve a manufacturing or service process.
  An auditor wants to review the financial transactions of a company in order to determine whether the company is in compliance with generally accepted accounting principles.

SOURCES OF DATA
  DATA DISTRIBUTED BY AN ORGANIZATION OR INDIVIDUAL. Marketing research firms, investment services, etc.
  A DESIGNED EXPERIMENT. Which of two packaging options do consumers prefer?
  A SURVEY. Who will you vote for? What do you think about companies X, Y, and Z?
  AN OBSERVATIONAL STUDY. Notice what products shoppers in a store look at.
  CORPORATE DATA. (NOT IN THE BOOK.) Accounting, payroll, billing, human resources, shipping, etc. Part of observational study?

SELECTING A SOURCE OF DATA
  Each has strengths and weaknesses.
  Must select the best type for an application.

1.6 TYPES OF VARIABLES

Data Type | Subtype | Example Question | Response Format
Categorical | Unordered | On what island do you live? | Check boxes listing the islands.
Categorical | Ordered | How do you rate the firm? | Check boxes with the choices Excellent, Very Good, Average, Poor, and Unsatisfactory.
Numerical | Discrete | How many phone numbers do you have? | Number: ______
Numerical | Continuous | How much did you spend on breakfast? | Amount: ______

FIGURE 1.1: QUESTIONS ABOUT THE GOOD TUNES CUSTOMER EXPERIENCE

CLASSIFY THE TYPE OF DATA THAT EACH QUESTION WILL GENERATE. CRITIQUE EACH QUESTION.

1. How many days did it take from the time you ordered your merchandise to the time you received it? _____
2. Did you buy any merchandise that was featured in the Good Tunes Sunday newspaper sales flyer for the week of your purchase? Yes _____ No _____
3. Was this your first purchase at Good Tunes? Yes _____ No _____
4. Are you likely to buy additional merchandise from Good Tunes in the next 12 months? Yes _____ No _____
5. How much money (in U.S. dollars) do you expect to spend on stereo and consumer electronics equipment in the next 12 months? __________
6. How do you rate the overall service provided by Good Tunes with respect to other retailers of home electronics? □ Excellent □ Very Good □ Fair □ Poor
7. How do you rate the selection of products offered by Good Tunes with respect to other retailers of home electronics? □ Excellent □ Very Good □ Fair □ Poor

STAGES IN A DATA-CENTRIC PROJECT
1. Define the problem.
2. Define data requirements.
3. Define the methodology for data collection and analysis.
4. Collect the data.
5. Analyze the results.
6. Present the results.
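The data types in Section 1.6 shape the "Analyze the results" stage: answers to categorical questions are typically counted, while answers to numerical questions can be averaged. The Python sketch below illustrates this for questions like those in Figure 1.1. The response values, the variable names, and the summarize helper are all hypothetical, invented for illustration; they are not Good Tunes data.

```python
from collections import Counter
from statistics import mean


def summarize(values, data_type):
    """Summarize one survey variable according to its data type.

    Categorical variables get a frequency count per category;
    numerical variables get their arithmetic mean.
    """
    if data_type == "categorical":
        return dict(Counter(values))
    if data_type == "numerical":
        return mean(values)
    raise ValueError(f"unknown data type: {data_type}")


# Hypothetical answers from five customers (invented for illustration only).
first_purchase = ["Yes", "No", "No", "Yes", "No"]  # like question 3
service_rating = ["Excellent", "Fair", "Very Good", "Excellent", "Poor"]  # like question 6
delivery_days = [3, 5, 2, 7, 3]  # like question 1

print(summarize(first_purchase, "categorical"))
print(summarize(service_rating, "categorical"))
print(summarize(delivery_days, "numerical"))
```

Taking a mean of the "Yes"/"No" answers would be meaningless, which is one practical reason to classify each question before collecting the data.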