Chapter 1 - Introduction to Statistics

Statistics - the science of collecting, organizing, analyzing, and interpreting data.

Chapter 1: Collecting Data
Chapter 2: Organizing/Analyzing Data
Chapters 3-8: Interpreting Data

Types of Data Sets:

Population - data set consisting of _________ outcomes, measurements, or responses of interest
Sample - data set which is a __________ of the population data set

Examples:

If we are interested in measuring the salaries of American high-school teachers, the population data set would be a list of the salaries of every high-school teacher in America. A sample data set could be obtained by selecting 100 high-school teachers from across the country and listing their salaries.

A polling organization wants to know whether Americans favor increased defense spending. The ___________ data set would consist of the responses of every American. A common way of choosing a ___________ data set would be to randomly call 1000 Americans and gather their responses to the question of whether they favor increased defense spending.

Suppose we wanted to determine the average number of course hours for HCC students. What would be the population data set? A possible sample set?

Types of Measurements:

Parameter - a numerical measurement made using the population data set
Statistic - a numerical measurement made using a sample data set

Ex: Using the teacher salary data sets, we could calculate the average salary for the high-school teachers. The average calculated from the population data set would be the __________. The average calculated from the sample of 100 teachers would be a _________.

Notice that unless the population is very small, it is probably impossible to gather the entire population data set, and so it is usually impossible to calculate the parameter we are interested in. (For example, why would it be impossible to find Americans' actual average calories consumed per day?)
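Ex (sketch): The difference between a parameter and a statistic can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the salaries are randomly generated, not real teacher data, and the names population, sample, parameter, and statistic are chosen just for this illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: made-up salaries for 5,000 teachers (not real data).
population = [random.gauss(60000, 8000) for _ in range(5000)]

# Parameter: a numerical measurement made using the population data set.
parameter = statistics.mean(population)

# Statistic: the same measurement made using a sample of 100 teachers.
sample = random.sample(population, 100)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In this toy setting the parameter can be computed only because the whole population is stored in a list. With a real population we would usually have only the sample and its statistic.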
The main idea of the science of statistics is that we can get around the difficulty of not having the whole population data set by selecting a sample, calculating the sample statistic, and using the sample statistic to estimate the parameter. Unfortunately, statistical estimates can never be 100% certain. (But they can be 90% or 95% or 99% certain.)

Descriptive and Inferential Statistics:

_____________ statistics just reports the "facts" as gathered from the data (organization, summarization, and display of data).
____________ statistics draws conclusions about the population from the sample data given.

Ex: Determine the descriptive and inferential statistics based on this data from The Journal of Family Issues:

Still Alive at 65:
Unmarried Men: 70%
Married Men: 90%

What can you conclude? Does this data prove your statement?

1.2 Types of Data:

Qualitative Data - characteristics, labels, or nonnumerical data
Ex: Eye Color, Political Party, Name

Quantitative Data - numerical measurements or quantities
Ex: Height, Area, Time, Income, Blood Alcohol Level

There are subcategories as well (see chart p. 14):

NOMINAL (qualitative): names or labels only; no meaningful numbers
ORDINAL (qualitative or quantitative): can be ordered or ranked, but differences between values are not meaningful
INTERVAL (quantitative): numerically meaningful, and can find differences between values; no absolute zero (zero does not mean "none")
RATIO (quantitative): can also make meaningful statements about the ratios between values (ex: twice as much rain); zero means "none"

Ex: Choose the best suited level of measurement:

1. Hours spent watching movies _________
2. 5 most popular baby names _________
3. Car make and model (ex: F-150) _________
4. Sea level of ridges _________
5. Numbers on soccer jerseys _________

1.3 Experimental Design: How to design a statistical study

1. Identify what you are interested in, and what the POPULATION is. (Ch 1.)
2. * Decide upon the METHOD OF DATA COLLECTION, make a plan, COLLECT the data. (Ch 1.)
3. DESCRIBE the data (descriptive statistics: histogram, mean, standard deviation, etc. This will be a topic of Ch 2.)
4. INTERPRET the data (inferential statistics: write and interpret a hypothesis. This will be a topic of chapters 3 through 9.)
5. Consider possible errors or omissions.

* We are here.

1.3 Methods of Data Collection:

Survey - an investigation of characteristics of a population.

Census - collect measurements from the entire population. Used when population is small.
Examples: Determine average grade on a Statistics exam; Measure salaries of all 50 state governors

Sample - choose a sample from your population and collect measurements. Used when population is large. (Most Common)
Examples: Opinion polls; Determine average income in U.S.

Simulation - program a computer with a mathematical or physical model to simulate population data. Used when impossible to collect sample data.
Examples: Temperature at the core of the Sun; Weight of an adult Tyrannosaurus Rex

Observational Study - watch the subjects and take note, but do not interfere.
Example: Effects of institutionalization on orphaned children

Experiment - collect a sample and split the sample into two groups: the Case Group receives treatment; the Control Group does not. Used to measure the effect of treatment by comparing the characteristics of the case and control groups.
Example: A sample of 200 cancer patients is selected. An experimental drug is given to 100 patients and the remaining 100 patients receive a placebo. The survival rates of the two groups are then compared.

Key terms: Placebo; Placebo Effect; Single-Blind Experiment; Double-Blind Experiment

Methods of Sampling:

Random* Sampling - the sample is chosen as a result of chance (equal likelihood of any member being chosen).
Examples: Telephone polling of random telephone numbers; Drawing names out of a hat (*Demo random # gen on calc.)

Systematic Sampling - the population is placed on a list, a random starting point is chosen, and then every k-th member is selected.
Examples: Choosing a sample of registered voters by choosing every 25th voter from the county registration roll; Testing every 300th assembly line product

Stratified Sampling - the population is divided into groups (strata), usually with meaningful differences, and a sample is chosen from each group.
Examples: Choosing 200 men and 200 women for a sample; Stratify the population by income level and then choose a sample of low, middle, and high income individuals

Cluster Sampling - the population is divided into naturally occurring groups (clusters), and then a sample is chosen by randomly selecting entire groups.
Example: Randomly choose 10 polling stations in a city and exit poll all voters at those stations

Convenience Sampling - choose individuals for a sample because they are easy to include.
Examples: Internet polls; Mail-in customer survey

Group work: quiz and article
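Ex (sketch): The random, systematic, stratified, and cluster sampling methods described above can be tried out in Python. This is a minimal sketch, not a definitive implementation. The population of 1,000 voters is made up, and the field names (id, sex, station) and function names are assumptions chosen only for this illustration.

```python
import random
from collections import defaultdict

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 voters (made-up data): each has an id,
# a stratum label ("men"/"women"), and one of 10 polling stations.
population = [
    {"id": i, "sex": "men" if i % 2 == 0 else "women", "station": i // 100}
    for i in range(1000)
]

def group_by(members, key):
    """Split members into groups keyed by members[key]."""
    groups = defaultdict(list)
    for m in members:
        groups[m[key]].append(m)
    return groups

def random_sample(members, n):
    # Every member is equally likely to be chosen.
    return random.sample(members, n)

def systematic_sample(members, k):
    # Random starting point, then every k-th member of the list.
    start = random.randrange(k)
    return members[start::k]

def stratified_sample(members, key, n_per_stratum):
    # Choose a random sample from each stratum.
    chosen = []
    for stratum in group_by(members, key).values():
        chosen.extend(random.sample(stratum, n_per_stratum))
    return chosen

def cluster_sample(members, key, n_clusters):
    # Randomly select entire clusters and keep everyone in them.
    clusters = group_by(members, key)
    picked = random.sample(list(clusters), n_clusters)
    return [m for c in picked for m in clusters[c]]

simple = random_sample(population, 100)
systematic = systematic_sample(population, 25)
stratified = stratified_sample(population, "sex", 200)
cluster = cluster_sample(population, "station", 3)

for name, s in [("random", simple), ("systematic", systematic),
                ("stratified", stratified), ("cluster", cluster)]:
    print(f"{name:>10}: {len(s)} members")
```

Convenience sampling is left out on purpose: it involves no chance mechanism, so there is nothing to randomize.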