Business Statistics
Course: BUSN 2429
Instructor: Bassem Hamid
“Introduction to Business Statistics” (Chapter 1 & 7)

Business Statistics Map
Introduction + Descriptive Statistics
1. Introduction to Business Statistics
2. Displaying Descriptive Statistics
3. Calculating Descriptive Statistics
Probability + Probability Distributions
4. Introduction to Probabilities
5. Discrete Probability Distribution
6. Continuous Probability Distribution
Inferential Statistics
7. Sampling & Sampling Distribution
8. Confidence Intervals
9. Hypothesis Test-One Sample Test
10. Hypothesis Test-Two Samples Test
14. Correlation and Single Regression Model

Do you know the meaning of the following terms: Problem, Statistics, Data, Sample, Population, Information?
These terms are the elements of the research project diagram.
(Figure: Elements of the Research Diagram)
Do you know how information (knowledge) is created?
Information is created by conducting a research project.
(Figure: The Output of the Research Diagram)

Outlines
This chapter covers the following points:
• #1 Business Statistics
• #2 Research Project Diagram/Steps
• #3 Why Sample?
• #4 Sampling Methods
• #5 Data Classifications
• #6 Data Collection Methods
• #7 Ethics and Statistics

Objectives
After completing this chapter, you will be able to:
• #1 Define statistics terms: business statistics, data, variables, population and sample
• #2 Distinguish between descriptive and inferential statistics
• #3 Understand how statistics is used in the business world
• #4 Classify data by the level of measurement
• #5 Understand the difference between data collection methods and sampling methods
• #6 Understand the ethical implications of misusing statistics

#1 Business Statistics
• #1.1 Statistics Definition
• #1.2 Statistics Terms
• #1.3 Statistics Branches
• #1.4 Statistics Applications

#1.1 Statistics Definition
• Statistics is the mathematical science that deals with the collection, analysis and presentation of data
o Statistics includes converting data into meaningful information through statistical tools and techniques
o Managers use the information to make business decisions
Process concept (input, workstation and output):
Collect data (raw materials) -> Convert with statistical tools, descriptive and inferential (workstation) -> Present information (processed data)

#1.2 Statistics Terms
• Population: consists of all possible subjects of interest
• Sample: a group of subjects selected from the population
• Variable: a characteristic that can assume different values (e.g., student grade, X)
• Data: the values assigned to a measurement or observation, i.e., the values that a variable can assume (e.g., student grade, X = 90)
Population (data) or sample (data) -> Workstation (statistics tools) -> Information (make decision)

Example: Population vs. Sample
The warehouse has just received 1,000 pcs of cellphone covers. Can we measure the width of all the cellphone covers to ensure they conform to specifications?
The variable: the width of the cellphone cover
The data: W = 40 mm
Population (1,000 pcs): sending every piece to the test center is not FEASIBLE
Sample (50 pcs): sending a sample to the test center is FEASIBLE

#1.3 Statistics Branches
Descriptive Statistics & Inferential Statistics

Descriptive Statistics
• Descriptive statistics includes collecting, summarizing, and displaying data using graphs, charts and tables (for a sample or a population)
o Sample: randomly selected (30 students). The average spending on summer activities is $1200
o Population: my Stat class (300 students). The average spending on summer activities is $1250
(Figures: bar chart of average spending ($) on summer activities; gender breakdown, male and female, with shares of 30% and 70%)

Inferential Statistics
• Inferential statistics includes making claims or conclusions about the population based on a sample and probability theory
Population: my Stat class (300 students). What is the average spending on summer activities?
Sample: randomly selected (30 students). The average spending on summer activities is $1200.
Conclusion: based on the sample mean and probability theory, the mean spending for the population is between $1100 and $1300.

Your Turn #1
Identify each of the following as either descriptive or inferential statistics
• Julie, who cuts and styles hair in her salon, had 23 customers last week
• A recent poll showed that 75% of Americans had a favorable opinion of the president of the United States
• The average exam score for my statistics exam was an 88
• Predicting election results by asking voters their intentions

Your Turn #1 (Answers)
Identify each of the following as either descriptive or inferential statistics
• Julie, who cuts and styles hair in her salon, had 23 customers last week. Descriptive statistics
• A recent poll showed that 75% of Americans had a favorable opinion of the president of the United States. Inferential statistics (it is not feasible to ask every American in the country for an opinion; a sample and probability are used)
• The average exam score for my statistics exam was an 88. Descriptive statistics (describes the sample data)
• Predicting election results by asking voters their intentions. Inferential statistics (a sample and probability are used)

#1.4 Statistics Applications
• Marketing
o Conduct market research to identify the characteristics of the target market
o The income of a certain group is between $4000 and $4800
• Operations
o Develop a multiple regression model to estimate productivity based on its factors
o Productivity = Y = a·X1 (wage) + b·X2 (experience) + c·X3 (training)
• Finance and Economics
o Construct a scatter plot to show the relationship between construction materials and house price
o There is a correlation between these variables: the higher the cost of materials, the higher the house price.
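As a rough illustration of the Finance and Economics example above, the sketch below computes the Pearson correlation coefficient for a few pairs of material cost and house price. The figures are hypothetical (they are not course data), and the function name pearson_r is chosen only for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, assumed for illustration (thousands of dollars)
materials_cost = [50, 60, 70, 80, 90]
house_price = [200, 230, 270, 300, 340]

r = pearson_r(materials_cost, house_price)
print(f"r = {r:.3f}")
```

A value of r close to +1 indicates a strong positive linear relationship, which is what the scatter plot on the slide is meant to show; a value close to -1 would indicate a strong negative one.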
Applying Concept “Attendance and Grades”
• A study conducted on a sample of 500 students out of 10,000 students at Manatee Community College revealed that
o Students who attended class 90% to 100% of the time usually received an A.
o Students who attended class 80% to less than 90% of the time usually received a B or C in the class.
o Students who attended class less than 80% of the time usually received a D or an F or eventually withdrew from the class.
Please answer the following questions.
• 1. What are the variables under study?
• 2. What are the data in the study?
• 3. Are descriptive statistics, inferential statistics, or both used?
• 4. What is the population under study?
• 5. Was a sample collected? If so, from where?
• 6. From the information given, comment on the relationship between the variables

Answers
• 1. What are the variables under study?
o The variables are grades and attendance
• 2. What are the data in the study?
o The data consist of specific grades (A, B, C, D and F) and attendance percentages (e.g., 80%)
• 3. Are descriptive statistics, inferential statistics, or both used?
o These are descriptive statistics
• 4. What is the population under study?
o The population under study is the students at Manatee Community College
• 5. Was a sample collected? If so, from where?
o The data were collected from a sample of 500 students at the college
• 6. From the information given, comment on the relationship between the variables
o Based on the data, it appears that, in general, the better your attendance, the higher your grade

#2 Research Project Diagram and Steps
Population (data) or sample (data) -> Workstation -> Information
#1 Define Business Problem
#2 Define Population/Sample
o “Sampling Methods”: Random, Systematic, Stratified, Cluster
#3 Define Data
o “Data Collection Methods”: Direct Observation, Focus Group, Experiment, Survey, Interview
o “Data Types/Levels”: Qualitative, Quantitative; Nominal Level, Ordinal Level, Interval Level, Ratio Level
#4 Select Statistical Tools
o Descriptive Statistics, Inferential Statistics
#5 Solve Business Problem

Research Project Steps “Create Knowledge/Make a Decision/Solve a Problem”
The researcher should
1. Define the problem
- State the business problem clearly
- Productivity (output/input = units produced/number of hours = 8 units/8 hours = 1 unit per hour) in XYZ factory is low. Why? Does maturity matter? Is there a relationship between the employees’ age and their productivity?
2. Define the population/sample (Sampling Methods)
- Sampling methods: Random, Systematic, Stratified, Cluster
- Randomly select 60 employees (sample) from 1000 employees (population) in XYZ factory
3. Define the nature of data (Data Collection Methods & Data Level/Types)
- Direct Observation, Focus Group, Experiment, Survey & Interview
- Qualitative, Quantitative, Nominal Level, Ordinal Level, Interval Level & Ratio Level
- Use a survey. The data are quantitative (independent variable: employee age; dependent variable: employee productivity)
4. Use statistics tools to process the data and create information (Descriptive/Inferential Tools)
- Descriptive tools & inferential tools
- Scatter chart (age, the independent variable, against productivity, the dependent variable)
5. Create Knowledge/Make a Decision/Solve a Problem
- There is a positive correlation between employee age and productivity.
- Hire older employees

#3 Why Sample?
• Population: consists of all possible subjects of interest
• Sample: a group of subjects selected from the population
• Why sample?
o Examining the entire population would be expensive and time consuming
o We cannot examine everything if the test is destructive
• Sampling has three principles
o Examine a part of the whole, randomize, and consider the sample size
• If a sample is selected properly and the analysis is performed correctly, the sample information can be used to make an accurate assessment of the entire population

#4 Sampling Methods
• Sampling from a Population
o Nonprobability Sampling
▪ Convenience
o Probability Sampling
▪ Simple Random
▪ Systematic
▪ Stratified
▪ Cluster
▪ Resampling

Probability Sampling
A probability sample is a sample in which each member of the population has a known, nonzero chance of being selected for the sample

Simple Random Sampling
A simple random sample is a sample in which every member of the population has an equal chance of being chosen

Systematic Sampling
In systematic sampling, every kth member of the population is chosen for the sample.
The value of k is determined by dividing the size of the population (N) by the size of the sample (n)
(Figure: systematic sampling with k = 3; population, N; sample, n)

Systematic Sampling
• Formula for the systematic sampling constant, k:
$k = \frac{N}{n}$
N = size of the population
n = size of the sample
• Example:
o Select a systematic sample of size n = 30 from a population of N = 270
$k = \frac{N}{n} = \frac{270}{30} = 9$
o From a list of all population values, choose every 9th value for the sample
o 9, 18, 27, 36, ..., 270

Systematic Sampling
• Advantages of systematic sampling:
o Easy to do manually
o Can avoid bias by not allowing judgment or convenience to affect the sample
• Disadvantages:
o One concern about systematic sampling is periodicity, which is a pattern in the population that is consistent with the value of k
o Example: sampling every 8 hours might obtain values only from the beginning or end of a shift, which might not be representative of all values during the day

Stratified Sampling
Stratified sampling divides the population into mutually exclusive groups, or strata.
Each one is considered as part of the population
A random sample is selected from each stratum
Strata are based on important variables that can have an impact on the data collected and the results that are achieved

Example: production shifts as strata
o Population: First Shift, Second Shift, Third Shift (each a part of the population)
o Select a random sample from each shift
o Gather all the selected samples to form the final sample
This type of sampling divides the population into groups (strata) according to characteristics that are important to the study and then selects a random sample from each group
Using stratified sampling helps ensure that all shifts are represented in the sample

Stratified Sampling
• Examples of stratified samples:
o For factory production shifts, strata could be 1st shift, 2nd shift, and 3rd shift
o For an undergraduate population, strata could be class standing: Freshman, Sophomore, Junior, and Senior
o For a population of workers, strata might be different age categories of workers (young, old, etc.)
o For a population of firms, strata might be large, mid-sized and small
• Using stratified sampling helps ensure that all classes, shifts, or ages are represented in the sample

Cluster Sampling
Cluster sampling involves dividing the population into mutually exclusive groups, or clusters, that are each representative of the population (mini populations)
Clusters are then randomly selected to form the final sample
These clusters are often selected based on geography to help simplify the sampling process

Example: patients in hospitals in Greater Vancouver
o Clusters: Mini Population 1 (Hospital 1: Surrey), Mini Population 2 (Hospital 2: Burnaby), ..., Mini Population n (Hospital n: other cities)
o Use all members of the randomly selected clusters to form the final sample
This type of sampling divides the population into groups (clusters) by some means (e.g., geographic area); the researcher then randomly selects some of the clusters and uses all members of the selected clusters as the sample. It would be very costly and time consuming to obtain a random sample of patients since they are spread over a large area

Cluster Sampling
• Examples of cluster samples:
o Individual cities where a new product is introduced
o Customer account balances arranged in clusters by first letter of last name
o Patients in Greater Vancouver
o Students in higher education institutions (universities and colleges)

Cluster vs. Stratified
Cluster: economical sampling process. Stratified: accurate sampling process.
Susan would like to conduct a survey of homeowners in the Meadowbrook neighborhood to get their opinions on proposed road modifications in the area. What are the sampling method options for Susan?
o Susan selects the first 20 homes that she passes as she walks into the entrance of the neighborhood. (Convenience sample)
o Susan selects every third house on each street in the neighborhood. (Systematic sample)
o Susan randomly chooses two streets in the neighborhood and selects every home on these streets. (Cluster sample)
o Susan ensures that her sample contains two-story, split-level, and ranch homes in numbers that correspond to the number of such homes in the neighborhood. (Stratified sample)
(Figure: the streets of the neighborhood, Street 1 to Street n off Main Street, grouped into Cluster 1 and Cluster 2; the homes grouped into Strata 1, Strata 2 and Strata 3)

Resampling
Resampling is a statistical technique where many samples are repeatedly drawn from a population
One type of resampling method is the bootstrap method
o It involves using computer software to extract many samples with replacement in order to estimate a parameter of the population, such as a mean or proportion

Nonprobability Sampling
A nonprobability sample is a sample in which the probability of a population member being selected for the sample is not known

Nonprobability Sampling: Convenience
A convenience sample is used when sample values are selected simply because they are easily accessible
• Advantages:
o Quick and easy to get sample data
o Provides general information about the population
• Disadvantages:
o May not be representative of the population

Your Turn #2
Identify the type of sampling technique for each of the following:
• The first Monday of each month, I ask the customers who come to my store to fill out a satisfaction survey
• I randomly select four stores in a mall and ask each customer in those stores about his or her opinion of the latest health care legislation
• I position myself on a busy intersection of a city street and ask people what their opinions are of a local sports team
• Susan would like to conduct a survey of homeowners in the Meadowbrook neighborhood to get their opinions on proposed road modifications in the area. Susan ensures that her sample contains two-story, split-level, and ranch homes in numbers that correspond to the number of such homes in the neighborhood.
• Using computer software, I randomly select 20 employees to participate in a job satisfaction survey.

Your Turn #2 (Answers)
Identify the type of sampling technique for each of the following:
• The first Monday of each month, I ask the customers who come to my store to fill out a satisfaction survey.
Systematic sampling
• I randomly select four stores in a mall and ask each customer in those stores about his or her opinion of the latest health care legislation. Cluster sampling
• I position myself on a busy intersection of a city street and ask people what their opinions are of a local sports team. Convenience sampling
• Susan would like to conduct a survey of homeowners in the Meadowbrook neighborhood to get their opinions on proposed road modifications in the area. Susan ensures that her sample contains two-story, split-level, and ranch homes in numbers that correspond to the number of such homes in the neighborhood. Stratified sampling
• Using computer software, I randomly select 20 employees to participate in a job satisfaction survey. Simple random sampling

#5 Data Classification
• #5.2.0 Data Definition
• #5.2.1 Data can be divided into the following groups
• #5.2.1.1 Qualitative Data
• #5.2.1.2 Quantitative Data
• #5.2.2 Data can be divided into the following levels of measurement
• #5.2.2.1 The Nominal Level of Measurement (Qualitative)
• #5.2.2.2 The Ordinal Level of Measurement (Qualitative)
• #5.2.2.3 The Interval Level of Measurement (Quantitative)
• #5.2.2.4 The Ratio Level of Measurement (Quantitative)

#5.2.0 Data Definition
• Variable: a characteristic that can assume different values (e.g., student grade, X)
• Data: the values assigned to a measurement or observation, i.e., the values that a variable can assume (e.g., student grade, X = 90)
#5.2.1.1 Qualitative Data
• Qualitative data: non-numerical variables that cannot be ranked or ordered and can be placed into distinct categories according to some characteristic or attribute
o Geographic locations (Burnaby, Richmond)
o Gender (Female or Male)

#5.2.1.2 Quantitative Data
• Quantitative data: numerical variables that can be ordered or ranked.
o People can be ranked according to their years of experience
▪ 10-15 years of work experience: 10 persons (Class A)
▪ 16-20 years of work experience: 15 persons (Class B)
o Discrete: assumes only certain values. Discrete data are obtained by counting and do not include fractions or decimals.
▪ The number of students in a classroom (0, 1, 2, 3, ...)
o Continuous: assumes an infinite number of values between any two specific values (a range). Continuous data are obtained by measuring and often include fractions and decimals
▪ A temperature reading such as 15.768 °C, or any value in the range -8.5 °C to 40.563 °C

#5.2.2.1 The Nominal Level of Measurement (Qualitative)
• The nominal level of measurement: classifies data into categories in which no order or ranking can be imposed on the data. It strictly deals with qualitative data.
o Classifies university instructors according to the subject taught (Physics, Math)
o Classifies people according to their gender (Male, Female)

#5.2.2.2 The Ordinal Level of Measurement (Qualitative)
• The ordinal level of measurement: classifies data into categories that can be ranked; however, a precise difference between the ranks does not exist (it cannot be measured). It strictly deals with qualitative data.
o Classifies people according to their build (small, medium and large)
o Classifies people according to their education level (PhD, Master and Bachelor)
(Figure: no exact limit between builds)

#5.2.2.3 The Interval Level of Measurement (Quantitative)
• The interval level of measurement: ranks data, and precise differences between units of measure do exist (they can be measured), but there is no meaningful zero. It strictly deals with quantitative data.
o Temperature. There is a difference between 24 degrees and 25 degrees, but 0 degrees does not mean that there is no heat at all.
o Students' marks. There is a difference between 59 and 60.

#5.2.2.4 The Ratio Level of Measurement (Quantitative)
• The ratio level of measurement: possesses all the characteristics of interval measurements, and a true zero does exist. In addition, true ratios exist when the same variable is measured on two different members of the population. It strictly deals with quantitative data.
o “0” income means there is no income
o One person can lift 200 pounds while another can lift 100 pounds. The ratio is 2:1

Summary of the levels of measurement
• Nominal (qualitative data)
o Classifies data into categories in which no order or ranking can be imposed
o Example: Male & Female
• Ordinal (qualitative data)
o Classifies data into categories that can be ranked
o A precise difference between the ranks does not exist
o Example: Big, Medium & Small
• Interval (quantitative data)
o Classifies data into categories that can be ranked
o A precise difference between units of measure does exist
o A true zero does not exist: “0” degrees does not mean that there is no heat at all
• Ratio (quantitative data)
o Possesses all the characteristics of interval measurements
o A true zero does exist: “0” income means there is no income
Example grade bands: A: 90-100, B: 80-89.99, C: 70-79.99

Your Turn #3
Identify the type of data (Quantitative/Qualitative) and level of measurement for each of the following data sources
• Your IQ scores
• The price for one gallon of gasoline
• The letter grade earned in your statistics course
• The number of boxes of Frosted Flakes on the shelf of a grocery store
• The types of cars driven by students in your class

Your Turn #3 (Answers)
Identify the type of data (Quantitative/Qualitative) and level of measurement for each of the following data sources
• Your IQ scores. Quantitative/Interval. The differences between IQ scores are meaningful, but there is no true zero point because an IQ of “0” does not indicate the absence of intelligence
• The price for one gallon of gasoline. Quantitative/Ratio. The differences between prices are meaningful, and there is a true zero point because gasoline that is “0” per gallon is free
• The letter grade earned in your statistics course. Qualitative/Ordinal. You can rank letter grades, but the difference between the grades cannot be consistently measured
• The number of boxes of Frosted Flakes on the shelf of a grocery store. Quantitative/Ratio. The differences between inventory levels are meaningful, and there is a true zero point because zero boxes on the shelf indicates an absence of the product.
• The types of cars driven by students in your class. Qualitative/Nominal. The types of cars are merely labels with no ranking or meaningful difference.
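The difference between the interval and ratio levels in the answers above can be checked with a short sketch. It uses two hypothetical temperature readings (assumed for illustration) and the 200-pound and 100-pound lifts from the ratio-level slide; the helper name celsius_to_kelvin is illustrative. Because the zero of the Celsius scale is arbitrary, the ratio of two Celsius readings changes when the same temperatures are expressed in Kelvin, while their difference does not. For a ratio-level variable such as weight, the ratio stays the same in any unit.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15


# Interval level: hypothetical temperature readings in degrees Celsius
t1_c, t2_c = 10.0, 20.0

diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)
ratio_c = t2_c / t1_c
ratio_k = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

print(f"difference: {diff_c:.1f} (Celsius) vs {diff_k:.1f} (Kelvin)")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")

# Ratio level: weights lifted, taken from the slide (200 lb and 100 lb)
LB_TO_KG = 0.45359237  # exact definition of the pound in kilograms
lift_a_lb, lift_b_lb = 200.0, 100.0

ratio_lb = lift_a_lb / lift_b_lb
ratio_kg = (lift_a_lb * LB_TO_KG) / (lift_b_lb * LB_TO_KG)

print(f"lift ratio: {ratio_lb:.3f} (pounds) vs {ratio_kg:.3f} (kilograms)")
```

The Celsius ratio of 2 does not mean "twice as hot", which is why temperature in Celsius is treated as interval rather than ratio data.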
66 #6 Data Collection Methods • #6.1.1 Secondary Data • #6.1.2 Primary Data • #6.1.2.1 Direct Observation • #6.1.2.2 Focus Group • #6.1.2.3 Experiment • #6.1.2.4 Survey • #6.1.2.5 Interview 67 Research Project Diagram and Steps #2 #3 #4 Population (Data) Workstation (Information) Sample (Data) Define Population/Sample “Sampling Methods” -Random -Systematic -Stratified -Cluster Define Data “Data Collection Methods” “Data Types/levels” -Direct Observation -Focus Group -Experiment -Survey -Interview -Qualitative -Quantitative -Nominal Level -Ordinal Level -Interval Level -Ratio Level Select Statistical Tools -Descriptive Statistics -Inferential Statistics #5 Solve Business Problem Define Business Problem #1 #6.1.1 Secondary Data • Secondary Data: are the data collected by someone else and made available for others to use. oU.S. Department of labor collect tons of data on topics such as consumer prices, inflation and unemployment oIndividuals or organization do not have source of control over the reliability of the data Secondary Data Primary Data Direct Observation Focus Group Data Sources Statistic Tools Information Experiment Survey Interview 69 #6.1.2 Primary Data • Primary Data: are data collected by the person or organization that eventually use the data oExpensive to acquire oThe individuals or organization have source of control over the reliability of the data Secondary Data Primary Data Direct Observation Focus Group Data Sources Statistic Tools Information Experiment Survey Interview 70 #6.1.2.1 Primary Data • Direct Observation: is a method of gathering data while the subject of interest are in their natural environment, often unaware they are being watched oWatching how baby interacts with a toy oThe subject will be unlikely influenced by the data collection process Secondary Data Primary Data Direct Observation Focus Group Data Sources Statistic Tools Information Experiment Survey Interview 71 #6.1.2.2 Primary Data • Focus Group: is a direct observational 
technique whereby individuals are often paid to discuss their attitudes toward products or services in a group setting controlled by a moderator
  o Example: students and instructors serve as a focus group to give feedback on a new textbook

#6.1.2.3 Primary Data
• Experiment: the subjects are exposed to certain treatments and the data of interest are recorded
  o Example: many factors influence the golden-brown color of French fries (e.g., frying time, temperature, potato thickness, and type of potato)

#6.1.2.4 Primary Data
• Survey: involves directly asking people a series of questions; it can be administered by e-mail, via the Web, through the mail, face to face, or over the telephone
  o The survey questions should be carefully designed to avoid bias

#6.1.2.5 Primary Data
• Interview: used to gather data from people.
  o Structured Interview: an interview in which the questions are scripted.
  o Unstructured Interview: an interview that begins with one or more broad questions, with further questions based on the responses

Your Turn #4
Identify whether the data required for each example are primary or secondary. For primary data, determine the best way to collect the data. In other words, should the data be collected via observation, experiment, or survey?
• Apple would like to measure the satisfaction levels of customers who purchased its new iPad product.
• A manager of an electronics store would like to investigate the impact that price has on the demand for laptop computers. Each week, the price of a Dell laptop is adjusted and the demand for each week is recorded.
• Cleveland State University needs to determine the current inflation rate to determine the annual salary increase for its staff for the upcoming year.
• McDonald’s would like to determine the average wait time for customers who use its drive-through windows during the lunch hour.

Your Turn #4: Answers
• Apple would like to measure the satisfaction levels of customers who purchased its new iPad product.
  o Primary data through a survey
• A manager of an electronics store would like to investigate the impact that price has on the demand for laptop computers. Each week, the price of a Dell laptop is adjusted and the demand for each week is recorded.
  o Primary data through an experiment
• Cleveland State University needs to determine the current inflation rate to determine the annual salary increase for its staff for the upcoming year.
  o Secondary data
• McDonald’s would like to determine the average wait time for customers who use its drive-through windows during the lunch hour.
  o Primary data through direct observation

Applying Concept “Safe Travel”
• Read the following information about the transportation industry and answer the questions that follow
• The chart shows the number of job-related injuries for each of the transportation industries

Industry         Number of Injuries
Railroad         4,520
Intercity Bus    5,100
Subway           6,850
Trucking         7,144
Airline          9,950

• 1. What are the variables under study?
• 2. Categorize each variable as quantitative or qualitative
• 3. Categorize each quantitative variable as discrete or continuous
• 4.
Identify the level of measurement for each variable
• 5. What is the message of this graph?

Applying Concept “Safe Travel”: Answers
• 1. What are the variables under study?
  o The variables are the industry and the number of job-related injuries
• 2. Categorize each variable as quantitative or qualitative
  o The type of industry is a qualitative variable, while the number of job-related injuries is quantitative
• 3. Categorize each quantitative variable as discrete or continuous
  o The number of job-related injuries is discrete
• 4. Identify the level of measurement for each variable
  o The type of industry is nominal, and the number of job-related injuries is ratio
• 5. What is the message of this graph?
  o The railroad is the safest transportation industry (it has the fewest job-related injuries)

#7 Ethics and Statistics
• Biased sample: a sample that does not represent the intended population
  o Can lead to distorted findings
  o Can occur either intentionally or unintentionally
• Examples of misuse: changing the graph scale vs. choosing a sample that is not representative of the population

Summary
• Statistics
• Data, Variables
• Sample, Population
• Statistics Branches:
  o Descriptive Statistics
  o Inferential Statistics
• Sampling Methods
• Data Classification
• Data Collection Methods
• Research Project Diagram
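
Worked sketch for Applying Concept “Safe Travel”
The answers to the “Safe Travel” exercise can be checked with a few lines of Python. This is only a minimal sketch: the injury counts are copied from the slide, while every variable name (for example `injuries_by_industry`) is an illustrative assumption and not part of the course material.

```python
# Values copied from the "Safe Travel" table; all names here are illustrative.
injuries_by_industry = {
    "Railroad": 4520,
    "Intercity Bus": 5100,
    "Subway": 6850,
    "Trucking": 7144,
    "Airline": 9950,
}

# Industry is a qualitative, nominal variable: its values are only labels.
industries = list(injuries_by_industry)

# Number of injuries is quantitative and discrete: every value is a whole-number count.
all_counts_are_whole = all(isinstance(n, int) for n in injuries_by_industry.values())

# The "message" of the chart: which industry has the fewest and the most injuries.
fewest = min(injuries_by_industry, key=injuries_by_industry.get)
most = max(injuries_by_industry, key=injuries_by_industry.get)

# Ratio level: zero means "no injuries", so a ratio between two counts is meaningful.
airline_to_railroad = injuries_by_industry["Airline"] / injuries_by_industry["Railroad"]

print(f"Industries (nominal labels): {industries}")
print(f"All counts are whole numbers: {all_counts_are_whole}")
print(f"Fewest injuries: {fewest}")
print(f"Most injuries: {most}")
print(f"Airline-to-railroad ratio: {airline_to_railroad:.1f}")
```

The ratio in the last line is the kind of statement that only ratio-level data support: the airline industry reports roughly 2.2 times as many job-related injuries as the railroad industry.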