Data Distributions Interactive Presentation

Data Collection and Frequency Tables
1. Why does sample size matter?
2. How could the way data is collected affect answers to survey questions?
3. What are some ways to make random selection and why is randomness desirable?

Vocabulary
• Data – facts or numbers that are collected

Types of Data
• Categorical data – data that is a name or category
• Numerical data – data that is a number

Vocabulary
• Population – the entire group you want to find information about
  EXAMPLE:
• Sample – a group of people within a population
  EXAMPLE:

Sample or population?
REMEMBER!
• Sample statistics will be more accurate as sample size INCREASES!!

Vocabulary
• Survey – given to investigate behaviors or opinions by questioning a sample from the population

Vocabulary
• Census – a survey of an entire population

Vocabulary
• Parameter – a measured characteristic of a population
  Example: a number that represents the average shoe size of ALL 7th graders
• Statistic – a measured characteristic of a sample
  Example: the average shoe size of our class (the representative length from the sample)

Vocabulary Review
Definition
• Facts or numbers that are collected
• A measured characteristic of a sample
• Data that is a number
• Given to investigate opinions or behaviors by questioning a group of people
• A group of people within a population
• Survey of an entire population
• A measured characteristic of a population
• Data that is a name or category
• The group you want to find information about

Discussion 1
• A school principal wants to know the average amount of time it takes her students to reach school each morning. To find this out, she asked 20 students in each grade “How long does it usually take you to reach school in the morning?” Explain how the words population, sample, data, and survey fit this situation.

Discussion 2
• An automotive shop has 25 workers. The owner wants to reward his workers with a company outing.
He is considering a day at a baseball game, a day at an amusement park, or a dinner for the workers at a restaurant. He decides to conduct a survey so he can make the best choice. Formulate a single question he could ask. Should he use a sample or a census?

Frequency Table
• After you choose a question, you need to collect and organize your data. A good way to do this is to use a frequency table (frequency distribution).
• Frequency distribution (frequency table) – a table that organizes data to show how many times each item or group of items appears

Your Turn!
• Maxine took a census of all the students in Ms. Alvarez’s class. The data below show the number of pets owned by each student:
  0, 1, 3, 2, 1, 4, 2, 1, 0, 3, 5, 2, 2, 1, 3, 2, 1, 4, 5, 0, 0, 1, 2, 1, 2
  Organize the data in an ungrouped frequency table. Use the data to determine how many more students have 1 pet than have no pets.

  # of pets    Tallies    Frequency
  0
  1
  2
  3
  4
  5

Questions:
1. _______ students have 1 pet.
2. _______ students have no pets.
3. How many more students have 1 pet than have no pets? _______
4. The data are organized in the frequency table above. The data show that _____ more students have 1 pet than have no pets.

• Try another frequency table problem: A survey of 200 people asked “On your dream vacation, how would you get where you are going?” The results are shown in the frequency table:

  Transportation    Number of people
  Airplane          125
  Automobile        6
  Boat              42
  Train             27

1. What percent of those surveyed chose boat?
Challenge Question:
2. What percent did not choose airplane?

Answers:
1. 21% of those surveyed chose boat (42 out of 200).
2. 37.5% of those surveyed did not choose airplane (75 out of 200).

Getting the Idea
• A frequency distribution presents data in a table. It is easy to read the data in a frequency distribution, but it is not easy to get the “whole picture” from the list of numbers. Graphs are used to show data. We will show you a variety of graphs you can use to display your data later on in this unit!
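The two frequency-table problems above can be checked with a short Python sketch. It only uses the data given on the slides; the function name `frequency_table` and the variable names are illustrative choices, not part of the lesson.

```python
from collections import Counter

# Number of pets owned by each student in Ms. Alvarez's class (from the slide).
pets = [0, 1, 3, 2, 1, 4, 2, 1, 0, 3, 5, 2, 2, 1, 3, 2, 1, 4, 5, 0, 0, 1, 2, 1, 2]


def frequency_table(data):
    """Return an ungrouped frequency table as {value: count}, sorted by value."""
    counts = Counter(data)
    return {value: counts[value] for value in sorted(counts)}


pet_table = frequency_table(pets)
for value, count in pet_table.items():
    # One tally mark per student, then the frequency.
    print(f"{value} pets: {'|' * count} ({count})")

# How many more students have 1 pet than have no pets?
difference = pet_table[1] - pet_table[0]
print("Difference:", difference)

# Dream-vacation survey (from the slide).
vacation = {"Airplane": 125, "Automobile": 6, "Boat": 42, "Train": 27}
total = sum(vacation.values())
percent_boat = 100 * vacation["Boat"] / total
percent_not_airplane = 100 * (total - vacation["Airplane"]) / total
print(percent_boat, percent_not_airplane)
```

The percentages agree with the answers on the slide, and the tally printout is the same table students fill in by hand.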
Ticket-out-the-door
• The 2,000 members of a club were mailed postcards, asking them to suggest locations for next year’s annual meeting. Only 150 returned the postcards. How do the new ideas from this lesson fit this situation?

Frequency table sample
1. The 2,000 members of the club represent _____________.
2. The 150 members who sent back the postcards represent the ___________.
3. What the members write on the postcards is called ________.
4. The act of collecting the information on the postcards is called a _________.
5. A good way to organize this data is to use a ________ __________.

• How can I describe and interpret a data set in a meaningful way?
• VOCABULARY: central tendency, mean, median, mode

Measures of central tendency:
1. Mean
2. Median
3. Mode

Vocabulary
These are measures of central tendency!
• Mean – the average (add up the values and divide by the # of values)
• Median – the middle number in a list of numbers (Hint: write the numbers in order)
• Mode – the value that occurs the most

EXAMPLE 1
• Find the mean, median, and mode of the data in the table:
  9, 8, 9, 8, 7, 8, 9, 10, 10, 7, 8, 9, 8, 8, 10, 8, 8, 9, 10, 8, 8, 10, 9, 9, 9
  *Hint to help with mean* Use the frequency column to find the TOTAL number of students

  Score    Frequency
  10
  9
  8
  7

Example 1 Answers:

EXAMPLE 2
Zack wants to have a mean score of 80 on his health quizzes. He scored 70, 75, 82, and 90 on his first four quizzes. What score must he earn on his fifth quiz to have a mean score of exactly 80 for all five quizzes?
SMART STRATEGY: Use what you know about MEAN!
Step 1: Find the sum of the 4 scores you know.
Step 2: Find the sum if Zack has a mean score of 80 on all 5 quizzes.
Step 3: What number would you need to add to 317 to get a sum of 400?
Step 4: Check your answer

Example 3
• This stem and leaf plot shows the number of miles Jamal biked per week for each of the past 10 weeks:

  Stem | Leaf
  3    | 6 6
  4    | 0 3 3 5 7 8 9
  5    | 3
  Key: 5 | 3 = 53 miles

This week, Jamal was ill so he only biked 11 miles.
How does this change the median and mean of the data?

Which would be the best measure for each situation?
1. Would you use mean, median, or mode to describe the typical selling price of a bicycle?
2. Would you use mean, median, or mode to determine the most popular toy sold at a store?

MMMR Rap
• M to the M to the M to the R, Remember this rhyme and you’ll go far
• Mode, Median, Mean & Range, Now singing this song might feel strange.
• Mode, Mode now I’ve been told, is the number you will see the most
• Median now he’s the man, the one in the middle, line HIM up the best you can
• From small to large, small to large remember this & you’re in charge
• Now mean mean you may wonder, just add add add all your numbers
• Then you just simply divide & you’ll have one number to your surprise
• Last but not least is our friend the range
• He’s not the best & he’s kind of strange
• You start with the high & subtract the low, that’s the range now that’s fo sho!

• EQ: What are measures of variation?
VOCABULARY: Variation, Range, Quartiles, Interquartile Range, Outlier, 5 Number Summary

Vocabulary
• Variability – how a data set is spread out
• Range – the difference between the greatest and least values in a data set
  27, 39, 40, 22, 19, 41, 58, 40, 53, 49
  *HINT: Largest Number – Smallest Number
  58 – 19 = 39
• Quartile: the three numbers that split an ordered data set into four equal groups
  Lower Quartile (median of the lower half of the data)
  The median of the data set
  Upper Quartile (median of the upper half of the data)
• 5 Number Summary: the 5 numbers that divide a set of data into 4 equal groups.
  1. Minimum or Lower Extreme
  2. Lower Quartile (Q1)
  3. Median (Q2)
  4. Upper Quartile (Q3)
  5. Maximum or Upper Extreme
• Interquartile range: the difference between the third and first quartiles. (Note that the first and third quartiles are sometimes called the lower and upper quartiles.)
IQR = UQ - LQ
• Outlier – a number that is much greater than or much less than the rest of the numbers in a data set

EXAMPLE 1
Below are the weekly earnings for eight Kroger employees. Find all measures of variation:
$260, $175, $215, $350, $320, $235, $240, $280
You are looking for:
1. Lower extreme
2. Quartile 1
3. Median
4. Quartile 3
5. Upper extreme
6. Range
7. IQR

EQ: How can I use box-and-whisker plots to display and analyze data?
• Box-and-Whisker Plot: a five number summary of data organized into quartiles

The box-and-whisker plot below shows the weights, in pounds, of the dogs that were weighed this morning at a veterinarian’s office. Approximately what percent of the dogs weighed less than 25 pounds?
[Box-and-whisker plot on a number line from 0 to 100 pounds, marked in tens]
1. The box-and-whisker plot shows that the lower quartile of the data is _______ pounds.
2. The lower quartile is the median of the lower __________ of the data set.
3. The quartiles divide the data into ____________.
4. What fraction of the data is less than the lower quartile? _______
5. What percent is equivalent to that fraction?

The double box-and-whisker plot below shows the number of points scored in games by two basketball players on the same team. Find the range and interquartile range for each player. Who was the most consistent scorer?

Step 1: Put the data in order from least to greatest
Step 2: Find the median
Step 3: Find the Lower Quartile
Step 4: Find the Upper Quartile
Step 5: Draw a number line
Step 6: Place a point above the median, lower quartile, and upper quartile
Step 7: Draw a box (with a vertical line through the median)
Step 8: Place a point above the lower extreme
Step 9: Place a point above the upper extreme
Step 10: Draw the whiskers

• 15 shoppers rated a brand of paper towel on a scale from 0-10:
  2, 6, 6, 6, 7, 8, 8, 8, 9, 10, 10

• How do I collect data on a population that is too large to study?
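Looking back at the measures-of-variation example with the weekly earnings of the eight Kroger employees, the seven requested values can be checked with the sketch below. It assumes the "median of each half" rule for quartiles given in the vocabulary slide; calculators and software sometimes use other quartile conventions and may give slightly different Q1 and Q3. The function name `five_number_summary` is an illustrative choice.

```python
from statistics import median


def five_number_summary(data):
    """Five-number summary using the median-of-each-half rule for quartiles."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # With an odd count, the middle value belongs to neither half.
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return {
        "min": values[0],
        "q1": median(lower),
        "median": median(values),
        "q3": median(upper),
        "max": values[-1],
    }


# Weekly earnings in dollars (from the slide).
earnings = [260, 175, 215, 350, 320, 235, 240, 280]
summary = five_number_summary(earnings)
data_range = summary["max"] - summary["min"]
iqr = summary["q3"] - summary["q1"]

print(summary)
print("Range:", data_range, "IQR:", iqr)
```

The same five values are the ones plotted in Steps 1 through 10 when drawing a box-and-whisker plot by hand.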
• VOCABULARY: sample, population

Vocabulary
A sample is a _________ selected group that is ___________ of the population.
If the sample is __________ of the population, then the measures of central tendency and of variation for the ________ and the __________ should be similar.
The larger the _______ size, the more accurate the _________.

Example 1
Owen took a random sample of 10 students who take piano lessons at a music school and recorded their ages. The director of the school took a census of all 30 students who take piano lessons at the school and recorded their ages. Owen’s sample data and the director’s census data are shown below:
Owen’s Sample Data: 5, 12, 12, 12, 12, 13, 13, 17, 17, 25
Director’s Census Data: 5, 7, 8, 9, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12, 12, 13, 13, 13, 13, 14, 15, 16, 17, 17, 17, 18, 19, 21, 21, 25, 30
Find and compare the mean, median, mode, and range of the sample and the census.

Example 2
• Which of the two samples has measures that are closer to those of the actual population?

Example 3
The manager of an online bookstore kept track of the number of books in each box that was shipped for 100 orders. His assistant randomly selected two samples from his data and calculated the mean and median for each:
Sample A: 4, 7, 9, 9, 10, 11, 12, 15, 20, 26
Sample B: 1, 4, 4, 9, 12
Which sample is more likely to have a mean and a median that are good approximations of the actual mean (12.5) and the actual median (11.5) of the population? Calculate the mean and median of each sample to determine if your guess was correct or not.

• How can I best organize categorical and numerical data?
• VOCABULARY: categorical data, numerical data, line plot, pictograph

Vocabulary
• Categorical data – data that is a name or category
• Numerical data – data that is a number
What are some examples?

Vocabulary
• Line plot – each data item is shown as a mark above a number line; good for showing numerical data
  How many brothers and sisters do you have?
Class Example

• Pictograph – a graph that shows data using symbols or pictures

Example 2
• Jenny keeps statistics during basketball practice. She recorded the number of free throws each player on the team successfully made out of 15 attempts. Her data are listed below:
  10, 14, 15, 12, 12, 9, 8, 14, 12, 5, 13, 10, 10, 12, 11
  Create a line plot to display these data. Then identify the mode.

Example 3
Leslie surveyed a sample of her classmates. She asked them to name the number of different states they have lived in. She displayed the results of her survey in the line plot below:
Identify any outliers for the data. Then find the median and the range, with and without the outlier(s). Does removing the outlier(s) change those measures?

• How can I collect, organize, and analyze data in a meaningful way?
• VOCABULARY: histogram, bar graph

• Bar graph – uses bars to display categorical data
• The bars have spaces between them
• All the bars are the same width
[Bar graph: Washington Warriors Victories; axes: Year and Number of Victories]

Steps to making a BAR GRAPH
1. Study your data from the frequency table and determine a scale
2. Draw and label the graph.

  Day    Visitors
  M      115
  T      113
  W      133
  R      56
  F      84

Your turn to try …
• Using the frequency table below, draw a bar graph

  School Days Per Year
  Country     School Days
  Belgium     175
  Japan       243
  Nigeria     190
  S. Korea    220
  USA         180

• Histogram – uses bars to show the frequency of data within equal intervals
  Since the intervals leave no gaps, the bars of a histogram do not have spaces between them!!

Steps to Creating a Histogram
1. Draw and label the axes of your histogram
2. List the intervals from the frequency table on the horizontal axis
3. Use the totals from the table to set the scale on the vertical axis
4. Draw the bar for each interval
5. The bars should be touching, the same width, and shaded

• Example: Top 30 requested songs

  Weeks    Frequency
  1-5      4
  6-10     11
  11-15    9
  16-20    4
  21-25    0
  26-30    2

Example 1: Double Bar Graph
The double bar graph shows the number of tickets sold by four theatres yesterday. What was the mean number of tickets sold by these theatres?

Example 2: Histogram
• The number of words that students in a typing class can type in a minute are listed below. First make a frequency table and then a histogram of the data.
• 25, 19, 23, 29, 34, 26, 30, 34, 33, 20, 35, 35, 25, 29, 36, 22, 34, 15
Question 1: What percent of students can type 30 or more words per minute?
Question 2: How many students type 24 or fewer words per minute?

EQ: How can I use line graphs and circle graphs to display and analyze data?
• Line graph – a type of graph that shows change over time using a line connecting data points
  Shows trends over time!!

People at the Sandwich Shop
During what time interval did the greatest number of people come into the sandwich shop? By how much did it increase?

• Circle Graph – displays categories of data as parts of a whole
  Shows Percents!

Example 1
As they exited the voting booths, 2,000 people were asked to identify the mayoral candidate for whom they had voted. Of the people surveyed, how many voted for Milton? How many voted for Johnson? How many voted for Dunbar?

Example 2
Mandy asked a sample of students at her school to name their favorite subject. Her results are shown above. If 12 students chose social studies as their favorite subject, what is the total number of students surveyed?
SMART STRATEGY: Set up a PROPORTION!
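Returning to the typing-class data from Example 2: Histogram, the sketch below builds a grouped frequency table and answers the two questions. The interval width of 5 words per minute starting at 15 is an assumption made for illustration; the slide does not specify the intervals. The function name `grouped_frequency` is likewise hypothetical.

```python
# Words typed per minute by each student (from the slide).
words_per_minute = [25, 19, 23, 29, 34, 26, 30, 34, 33,
                    20, 35, 35, 25, 29, 36, 22, 34, 15]


def grouped_frequency(data, start, width):
    """Count values in equal intervals: start..start+width-1, and so on."""
    table = {}
    low = start
    while low <= max(data):
        high = low + width - 1
        table[f"{low}-{high}"] = sum(low <= x <= high for x in data)
        low += width
    return table


# Assumed intervals: 15-19, 20-24, 25-29, 30-34, 35-39.
typing_table = grouped_frequency(words_per_minute, start=15, width=5)
for interval, count in typing_table.items():
    # A row of '#' marks stands in for each histogram bar.
    print(f"{interval:>5} | {'#' * count} {count}")

# Question 1: percent of students typing 30 or more words per minute.
at_least_30 = sum(x >= 30 for x in words_per_minute)
percent_at_least_30 = round(100 * at_least_30 / len(words_per_minute), 1)

# Question 2: number of students typing 24 or fewer words per minute.
at_most_24 = sum(x <= 24 for x in words_per_minute)

print(percent_at_least_30, at_most_24)
```

Choosing intervals that line up with 24 and 30 lets both questions be read straight off the frequency table.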
Making a Circle Graph

  Type of Movie    Number of Students    Percent of Total (# of students/Total)    Degrees in a circle    Size of angle (Percent x 360)
  Funny
  Scary
  Romantic
  Action

Interactivate: Circle Graph

Ticket-out-the-door
• Create your own circle graph based on the survey data below.
  *HINT: A total of 50 students were surveyed!!

  Favorite type of ice cream    Number of Students    Percent of Total (# of students/Total)    Degrees in a circle    Size of angle (Percent x 360)
  Vanilla                       15                                                              360
  Chocolate                     25                                                              360
  Strawberry                    10                                                              360

EQ: How can I use scatter plots to display and analyze data?
• Scatter Plot: a graph in which ordered pairs of data are plotted. You can use a scatter plot to determine whether a relationship, or correlation, exists between 2 sets of data
  HINT: LOOK FOR TRENDS (PATTERNS)!
As x increases, y ______________.
As x increases, y ______________.
As x increases, y ______________.
As x increases, y ______________.
As x increases, y ______________.
As x increases, y ______________.

Interactivate: Scatter Plot

Graphs that help us analyze data:
• Pictographs
• Histograms
• Bar graphs
• Line graphs
• Circle graphs
• Line plots
• Box-and-whisker plots
• Scatter plots
NLVM – Check it out!
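As a check on the circle-graph ticket-out-the-door, the sketch below fills in the "Percent of Total" and "Size of angle" columns for the ice cream survey. It follows the table's own rule (percent of total times 360 degrees); the function name `circle_graph_rows` is an illustrative choice.

```python
# Favorite ice cream survey of 50 students (from the slide).
ice_cream = {"Vanilla": 15, "Chocolate": 25, "Strawberry": 10}


def circle_graph_rows(counts):
    """Return {category: (percent of total, angle in degrees)}."""
    total = sum(counts.values())
    rows = {}
    for category, count in counts.items():
        percent = 100 * count / total   # share of the whole, as a percent
        angle = 360 * count / total     # same share of a full 360-degree circle
        rows[category] = (percent, angle)
    return rows


rows = circle_graph_rows(ice_cream)
for category, (percent, angle) in rows.items():
    print(f"{category:<10} {percent:5.1f}%  {angle:6.1f} degrees")

# The angles of all sectors should add up to a full circle.
total_angle = sum(angle for _, angle in rows.values())
print("Total:", total_angle)
```

Checking that the angles sum to 360 is a quick way to catch an arithmetic slip before drawing the sectors with a protractor.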