Ch 1.1 & 1.2 Basic Definitions for Statistics

Objective A : Basic Definition

A1. Definition

What is Statistics?
• Statistics is the science of collecting, organizing, summarizing, and analyzing data to draw conclusions.

Statistics has two branches: descriptive statistics and inferential statistics.
• Descriptive statistics consists of collecting, organizing, summarizing, and presenting data.
• Inferential statistics consists of generalizing from samples to populations, performing estimations and hypothesis tests, and making predictions.

Population versus Sample
• A population consists of all individuals (persons or objects) that are being studied.
• A sample is a subset of the population.

Parameter versus Statistic
• A parameter is a numerical summary of a population.
• A statistic is a numerical summary of a sample.

Example 1: Identify the population and sample in the study.
A farmer wanted to learn about the weight of his soybean crop. He randomly sampled 100 plants and weighed the soybeans on each plant.

Example 2: Determine whether the value in each statement (the average age in (a), the percentage in (b)) is a parameter or a statistic.
(a) Only 12 men have walked on the moon. The average age of these men at the time of their moonwalks was 39 years, 11 months, 15 days.
(b) In a national survey on substance abuse, 66.4% of respondents who were full-time college students aged 18 to 22 reported using alcohol within the past month.

A2. Variables and Type of Data

Variable
• A variable is a characteristic that can assume different values; the values it assumes are called data.

Qualitative versus Quantitative variable
• Qualitative (categorical) variables are variables that can be placed into distinct categories.
• Quantitative (numerical) variables take numerical values on which arithmetic operations can be meaningfully performed.

A quantitative variable is either a discrete variable or a continuous variable.
• A discrete variable can assume a countable number of values.
• A continuous variable can assume an infinite number of values between any two specific values. These values often include fractions and decimals.
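As a rough illustration of the parameter versus statistic distinction from A1, the Python sketch below builds a made-up population of exam scores and compares its mean with the mean of a random sample of 100. The data, the seed, and the names population, sample, population_mean, and sample_mean are all assumptions chosen for illustration; none of them come from these notes.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: exam scores for 1,000 made-up students.
population = [random.gauss(70, 5) for _ in range(1000)]

# Parameter: a numerical summary of the whole population.
population_mean = mean(population)

# Statistic: a numerical summary of a sample (a subset of the population).
sample = random.sample(population, 100)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values are usually close but not identical. The parameter is fixed for a given population, while the statistic changes from sample to sample.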
Example 1: Classify each variable as qualitative or quantitative. If the variable is quantitative, further classify the data as discrete or continuous.
(a) Number of students attending a university for Fall 2012.
(b) Colors of football caps in a store.
(c) Social security number.
(d) Water temperature of a swimming pool.

Objective B : Level of measurement of a Variable

In addition to being classified as qualitative or quantitative, variables can be classified by how they are categorized, counted, or measured. Four common types of measurement scales are used: nominal, ordinal, interval, and ratio.
• The nominal level of measurement classifies data into categories in which no order or ranking can be imposed on the data.
For example:
• The ordinal level of measurement classifies data into categories that can be ranked.
For example:
• The interval level of measurement has the properties of the ordinal level of measurement, and the differences in the values of the variable have meaning. There is no true zero: a value of zero does not mean the absence of the quantity.
For example:
• The ratio level of measurement has the properties of the interval level of measurement, and the ratios of the values of the variable have meaning. There exists a true zero.
For example:

Objective C : Observational Study versus Designed Experiment

Definition
In an observational study, the researcher observes the behavior of the individuals without trying to influence the outcome of the study.
In a designed experiment, the researcher controls one of the variables and tries to determine how the manipulation influences other variables.
The independent variable, also called the explanatory variable, is the one that the researcher controls in a designed experiment.
The dependent variable, also called the response variable, is the resultant variable.
Confounding in a study occurs when the effects of two or more explanatory variables are not separated.
A lurking variable is an explanatory variable that was not considered in a study, but that affects the value of the response variable in the study.

Example 1: Determine whether the study depicts an observational study or an experiment.
(a) Rats with cancer are divided into two groups. One group receives 5 mg of a medication that is thought to fight cancer, and the other receives 10 mg. After 2 years, the spread of the cancer is measured.
(b) Conservation agents netted 320 large trout in a lake and determined how many were carrying parasites.

Example 2: Identify the explanatory variable and the response variable for the following studies.
(a) Rats with cancer are divided into two groups. One group receives 5 mg of a medication that is thought to fight cancer, and the other receives 10 mg. After 2 years, the spread of the cancer is measured.
(b) A researcher wants to determine whether young couples who marry are more likely to gain weight than those who stay single.

Ch 1.3 to 1.5 Sampling

Objective A : Ch 1.3 Simple Random Sampling

Simple Random Sampling
The goal of sampling is to obtain as much information as possible about the population at the least cost.
• A random sample is obtained by using chance methods or random numbers.

Example: Steps for Obtaining a Simple Random Sample
1. Number all the individuals in the population of interest.
2. Use a random number table, graphing calculator, or statistical software to randomly generate n numbers, where n is the desired sample size.

Objective B : Ch 1.4 More Sampling Methods

• A systematic sample is obtained by selecting every kth individual from the population.
• A stratified sample is obtained by dividing the population into non-overlapping groups (called strata) according to some similar characteristic, then sampling from each stratum.
• A cluster sample is obtained by dividing the population into groups called clusters, such as geographic areas or schools in a large district.
Then all the individuals within the randomly selected clusters are selected.
• A convenience sample is a sample in which the individuals are easily obtained and not based on randomness.

Example 1: Identify the type of sampling used. (random, systematic, stratified, cluster, convenience)
(a) Every tenth customer entering a grocery store is asked to select her or his favorite color.
(b) A farmer divides his orchard into 30 subsections, randomly selects 4, and samples all the trees within the 4 subsections to approximate the yield of his orchard.
(c) A survey regarding download time on a certain website is administered on the Internet by a market research firm to anyone who would like to take it.
(d) In an effort to identify whether an advertising campaign has been effective, a marketing firm conducts a nationwide poll by randomly selecting individuals from a list of known users of the product.
(e) A school official divides the student population into five classes: freshman, sophomore, junior, senior, and graduate student. The official takes a simple random sample from each class and asks the members’ opinions regarding student services.

Objective C : Ch 1.5 Bias in Sampling

• Sampling bias means that the technique used to obtain the individuals to be in the sample tends to favor one part of the population over another.
• Non-response bias exists when individuals selected to be in the sample do not wish to respond.
• Response bias exists when the answers on the survey do not reflect the true feelings of the respondent.

Example 1: The survey has bias. Determine the type of bias. (sampling, non-response, response)
(a) To determine the public’s opinion of the police department, the police chief obtains a cluster sample of 15 census tracts within his jurisdiction and samples all households in the randomly selected tracts. Uniformed police officers go door to door to conduct the survey.
(b) The village of Oak Lawn wishes to conduct a study regarding the income level of households within the village. The village manager selects 10 homes in the southwest corner of the village and sends an interviewer to the homes to determine household income.
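The sampling methods defined in Ch 1.3 and Ch 1.4 can be sketched in Python as follows. This is a minimal illustration and not a prescribed procedure. The 20-person population, the group labels A to D, the function names, and the fixed seed are all hypothetical. The random starting point in the systematic sample is a common convention assumed here; the notes only say "every kth individual".

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20 numbered individuals, each tagged with a group.
population = [{"id": i, "group": "ABCD"[(i - 1) // 5]} for i in range(1, 21)]

def simple_random_sample(pop, n):
    """Draw n individuals by chance, without replacement."""
    return random.sample(pop, n)

def systematic_sample(pop, k):
    """Pick a random start among the first k, then take every kth individual."""
    start = random.randrange(k)  # assumed convention: random starting point
    return pop[start::k]

def stratified_sample(pop, n_per_stratum):
    """Split into strata by group, then draw a random sample from each stratum."""
    strata = {}
    for person in pop:
        strata.setdefault(person["group"], []).append(person)
    chosen = []
    for members in strata.values():
        chosen.extend(random.sample(members, n_per_stratum))
    return chosen

def cluster_sample(pop, n_clusters):
    """Randomly select whole clusters and keep every individual inside them."""
    clusters = sorted({person["group"] for person in pop})
    picked = set(random.sample(clusters, n_clusters))
    return [person for person in pop if person["group"] in picked]

srs = simple_random_sample(population, 5)
systematic = systematic_sample(population, 4)
stratified = stratified_sample(population, 2)
cluster = cluster_sample(population, 2)

for name, sample in [("simple random", srs), ("systematic", systematic),
                     ("stratified", stratified), ("cluster", cluster)]:
    print(name, [p["id"] for p in sample])
```

Note the difference between the last two methods: the stratified sample takes a few individuals from every group, while the cluster sample takes every individual from a few randomly chosen groups.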