CHAPTER 1: INTRODUCTION TO STATISTICS
SQQS1013 ELEMENTARY STATISTICS

1.1 WHAT IS STATISTICS?
› The word statistics is derived from the classical Latin root status, which means “state”.
› Statistics has become the universal language of the sciences.
› As potential users of statistics, we need to master both the “science” and the “art” of using statistical methodology correctly.

In Today’s Business World You Need To Think Differently About Statistics
› Modern-day information technology enables businesses to apply statistics in new ways to solve business problems, utilizing:
– Vast amounts of summarized, unsummarized, numerical, and nonnumerical data (facts about the world).
– Software to perform calculations.

Statistics Are The Methods That Allow People To Effectively Work With Data
› Statistics provides a formal basis to:
– Summarize and visualize data.
– Reach conclusions from data.
– Make reliable predictions about the activities.
– Improve the processes.

To Properly Apply Statistics You Should Follow A Framework To Minimize Possible Errors
DCOVA
• Define the data you want to study to solve a problem or meet an objective.
• Collect the data from appropriate sources.
• Organize the data collected by developing tables.
• Visualize the data by developing charts.
• Analyze the data collected to reach conclusions and present those results.

Nowadays statistics is used in almost all fields of human effort:
• EDUCATION
• AGRICULTURE
• BUSINESSES
• HEALTH/PUBLIC HEALTH
• SPORTS
• FINANCIAL

ASPECTS IN STATISTICS
Statistics
• Theoretical Statistics: Development, derivation and proof of theorems, formulas, rules and laws.
• Applied Statistics: Applications of those theorems, formulas, rules and laws to solve real problems.
• Descriptive Statistics: Methods for collecting, organizing, analyzing and summarizing data.
• Inferential Statistics: Methods that use results obtained from a sample to derive conclusions about a population.

A few common examples of descriptive and inferential statistics

Examples of Descriptive Statistics:
i) Average marks obtained by all the students.
ii) Grades or percentile of the scores.
iii) Average score in cricket.
iv) Estimation of the number of damaged or cavity teeth by a dentist.

Examples of Inferential Statistics:
i) Estimation of the number of students (boys and girls separately) in a school.
ii) Population of a particular county or city.
iii) Frequency of the variables.
iv) Prediction by a dentist about the teeth that are susceptible to cavities or damage in future.

1.2 BASIC TERMS IN STATISTICS

POPULATION
• A collection of all individuals about which information is desired.

Finite population
• When the membership of a population can be (or could be) physically listed.
• e.g. the employees of a given company, the airplanes owned by an airline, or the potential consumers in a target market.

Infinite population
• When the membership is unlimited.
• e.g. the germs in the body of a malaria patient.

SAMPLE
• A subset of the population.

PARAMETER
• Numbers that summarize data for an entire population.
• A Greek letter is used to symbolize a parameter.
• Average/Mean: µ
• Standard deviation: σ
• e.g. The “average” age at time of admission for all students who have ever attended our college.

STATISTIC
• Numbers that summarize data from a sample.
• A letter of the English alphabet is used to symbolize a statistic.
• Average/Mean: x̄ (X-bar)
• Standard deviation: s
• e.g. The “average” height, found by using the set of 25 heights.

VARIABLE
• A characteristic of interest about each individual element of a population or sample.
• e.g. student’s age at entrance into college, the color of student’s hair

DATA VALUE
• The value of the variable associated with one element of a population or sample.
• This value may be a number, a word, or a symbol.
• e.g. Farah entered college at age “23”, her hair is “brown”

DATA
• The set of values collected for the variable from each of the elements that belong to the sample.
• e.g. The set of 25 heights collected from 25 students.

CENSUS
• A survey that includes every element in the population.
• One might say a census is a 100% sample survey.

SAMPLE SURVEY
• A survey that includes every element in the selected sample only.

Give an example of a census and a sample survey.

EXAMPLE 1.1
1. At Sintok Community College, 150 students are randomly selected and asked the distance of their house to campus. From this group, a mean of 5.2 km is computed.
a. What is the population?
b. What is the sample?
c. What is the parameter?
d. What is the statistic?
e. What is the variable of the study?

1.2 VARIABLES
Categorical (qualitative) variables take categories as their values such as “yes”, “no”, or “blue”, “brown”, “green”.
Numerical (quantitative) variables have values that represent a counted or measured quantity.
Discrete variables arise from a counting process. Can assume only values corresponding to isolated points along a line interval; that is, there is a gap between any two values. e.g. Number of courses for which you are currently registered.
Continuous variables arise from a measuring process.
Can assume any value along a line interval, including every possible value between any two values. e.g. Weight of books and supplies you are carrying as you attend class today.

EXAMPLE 1.2 TYPES OF VARIABLES
Question                                                        Responses         Variable Type
Do you have a Facebook profile?                                 Yes or No         Categorical
How many text messages have you sent in the past three days?    ---------------   Numerical (discrete)
How long did the mobile app update take to download?            ---------------   Numerical (continuous)

Variables
• Categorical
  – Nominal. Examples: Marital Status, Political Party, Eye Color (Defined Categories)
  – Ordinal. Examples: Ratings: Good, Better, Best; Low, Med, High (Ordered Categories)
• Numerical
  – Discrete. Examples: Number of Children, Defects per hour (Counted items)
  – Continuous. Examples: Weight, Voltage (Measured characteristics)

1.2 SCALE OF MEASUREMENT
› Data can also be classified by how they are categorized, counted or measured.
› This type of classification uses measurement scales with 4 common types of scales: nominal, ordinal, interval and ratio.

A nominal scale classifies data into distinct categories in which no ranking is implied.
Categorical Variables              Categories
Do you have a Facebook profile?    Yes, No
Type of investment                 Growth, Value, Other
Cellular Provider                  Celcom, Maxis, UMobile, Digi, None

An ordinal scale classifies data into distinct categories in which ranking is implied.
Categorical Variable               Ordered Categories
Student class designation          Freshman, Sophomore, Junior, Senior
Product satisfaction               Very unsatisfied, Fairly unsatisfied, Neutral, Fairly satisfied, Very satisfied
Faculty rank                       Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings     AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student Grades                     A, B, C, D, F

An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.
A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.

EXAMPLE 1.2
1. Classify each variable as discrete or continuous.
a. Ages of people working in a large factory.
b. Number of cups of coffee served at a restaurant.
c. The amount of a drug injected into a rat.
d. The time it takes a student to walk to school.
e. The number of liters of milk sold each day at a grocery store.
2. Classify each as nominal-level, ordinal-level, interval-level or ratio-level.
a. Rating of movies as U, SX and LP.
b. Number of candy bars sold on a fund drive.
c. Classification of automobile as subcompact, compact, standard and luxury.
d. Temperatures of hair dryers.
e. Weights of suitcases on a commercial airline.

1.3 SOURCE OF DATA
Primary Sources: The data collector is the one using the data for analysis:
– Data from a political survey.
– Data collected from an experiment.
– Observed data.
Secondary Sources: The person performing data analysis is not the data collector:
– Analyzing census data.
– Examining data from print journals or data published on the internet.

EXAMPLES OF SURVEY DATA
› A survey asking people which laundry detergent has the best stain-removing abilities.
› Political polls of registered voters during political campaigns.
› People being surveyed to determine their satisfaction with a recent product or service experience.

EXAMPLES OF DATA FROM A DESIGNED EXPERIMENT
› Consumer testing of different versions of a product to help determine which product should be pursued further.
› Material testing to determine which supplier’s material should be used in a product.
› Market testing on alternative product promotions to determine which promotion to use more broadly.

EXAMPLES OF DATA COLLECTED FROM OBSERVATIONAL STUDIES
› Market researchers utilizing focus groups to elicit unstructured responses to open-ended questions.
› Measuring the time it takes for customers to be served in a fast food establishment.
› Measuring the volume of traffic through an intersection to determine if some form of advertising at the intersection is justified.

EXAMPLES OF DATA DISTRIBUTED BY AN ORGANIZATION OR INDIVIDUAL (SECONDARY DATA)
› Financial data on a company provided by investment services.
› Industry or market data from market research firms and trade associations.
› Stock prices, weather conditions, and sports statistics in daily newspapers.

TYPES OF SAMPLES
Samples
• Non Probability Samples
  – Judgment
  – Convenience
• Probability Samples
  – Simple Random
  – Stratified
  – Systematic
  – Cluster

Why do we have to study statistics?
– To read and understand various statistical studies in related fields.
– To communicate and explain the results of a study in a related field using our own words.
– To become better consumers and citizens.

Can Statistics Lie?
› Faulty or invalid statistics can be produced if any tasks in the DCOVA framework are applied incorrectly.
› Many statistical methods are valid only if the data being analyzed have certain properties.
› For inferential methods you should always look for logical causality.

Chapter Summary
In this chapter we have seen:
› Statistics is a way of thinking that can lead to better decisions.
› Statistics requires analytical skills and is an important part of your education.
› Recent developments such as the use of business analytics and “big data” have made knowing statistics even more critical.
› The DCOVA framework guides your application of statistics.
› How to define variables.
› Understanding the different measurement scales.
› How to collect data.
› Identifying different ways to collect a sample.
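
Supplementary sketch: parameter versus statistic
The terms from Section 1.2 and two of the probability samples listed under TYPES OF SAMPLES can be tried out in a few lines of Python. This is only a minimal sketch under stated assumptions: the population of 2,000 distances is synthetic and hypothetical (it is not data from Example 1.1), the seed and the names population, simple_random and systematic are chosen for illustration, and only the standard random and statistics modules are used.

```python
import random
import statistics

# Hypothetical finite population: distance (km) from home to campus for
# 2,000 students. These values are synthetic, generated for illustration only.
rng = random.Random(1013)  # fixed seed so repeated runs give the same numbers
population = [round(rng.uniform(0.5, 12.0), 1) for _ in range(2000)]

# Parameters summarize the ENTIRE population (Greek letters: mu, sigma).
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population standard deviation

# Simple random sample: every element has the same chance of being chosen.
n = 150
simple_random = rng.sample(population, n)

# Systematic sample: random starting point, then every k-th element.
k = len(population) // n          # sampling interval
start = rng.randrange(k)          # random start within the first interval
systematic = population[start::k][:n]

def describe(sample):
    """Return the statistics x-bar and s computed from a sample."""
    return statistics.mean(sample), statistics.stdev(sample)  # sample std dev

x_bar_srs, s_srs = describe(simple_random)
x_bar_sys, s_sys = describe(systematic)

print(f"Parameter  (population): mu    = {mu:.2f} km, sigma = {sigma:.2f} km")
print(f"Statistic  (simple rnd): x-bar = {x_bar_srs:.2f} km, s = {s_srs:.2f} km")
print(f"Statistic  (systematic): x-bar = {x_bar_sys:.2f} km, s = {s_sys:.2f} km")
```

The two sample means will usually be close to, but not equal to, the population mean. That gap is why a statistic is treated as an estimate of a parameter rather than as the parameter itself.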