Statistics Survey Project

Group Members:
Group Topic:

Deadlines:
March 4     Part One: Qualitative Data
March 18    Part Two: Quantitative Data
April 29    Part Three: Summary

TOPIC:

Question Type                  Question Number   Question   Person Responsible
Part One: Qualitative Data     1
                               2
                               3
                               4
Part Two: Quantitative Data    5
                               6
                               7
                               8

Part One: Qualitative Data

Group Member   Question Number   Actual Question
               1
               2
               3
               4

1. Produce a Pareto chart for this data. Be sure to label each axis, give your chart a title, and make it easy for the reader to understand your data. See Example A.
2. Produce a pie chart for this data. Be sure that each sector is clearly labeled with the number of responses, the percent of responses, and the category that it represents. Give this graph a title. See Example B.
3. Complete a contingency table for this data compared to gender from the demographic questions. Divide your data into TWO categories only. Complete the table with the actual counts AND the percentages in each gender row. Write a short conclusion about whether you find an association with gender or an overall trend. Describe that association or trend. See Examples C, D, and E.
4. Further investigate this question by producing a contingency table for EACH GRADE LEVEL, comparing your question categories to gender again. FOR EACH GRADE LEVEL, complete the table with the actual counts AND the percentages in each gender row. Write a short conclusion about whether you find an association with gender or an overall trend. Describe that association or trend.
5. Write a summary of what you have learned about DVHS students from the results of this question and your analysis. Be specific, and refer to your visual displays to support your claims.

Qualitative Data Examples

What is your favorite type of vehicle?
Data: Truck car Car Jeep Truck SUV car SUV Car Car Jeep truck SUV SUV Car motorcycle Truck Motorcycle Bicycle skateboard

EXAMPLE A Pareto Chart

Frequency Distribution:

Category      Tally      Count   Percent
Truck         XXXX       4       20
Car           XXXXX X    6       30
Jeep          XX         2       10
SUV           XXXX       4       20
Motorcycle    XX         2       10
Bicycle       X          1       5
Skateboard    X          1       5
TOTAL                    20      100

[Pareto chart: "Favorite Type of Vehicle". Vertical axis: PERCENT. Horizontal axis: TYPE OF VEHICLE. Bars in descending order: Car 30, SUV 20, Truck 20, Jeep 10, Motorcycle 10, Bicycle 5, Skateboard 5.]

EXAMPLE B Pie Chart

[Pie chart: "Favorite Type of Vehicle". Sectors labeled with category, count, and percent: Car, 6, 30%; Truck, 4, 20%; SUV, 4, 20%; Jeep, 2, 10%; Motorcycle, 2, 10%; Bicycle, 1, 5%; Skateboard, 1, 5%.]

EXAMPLE C Contingency Table

          Motorized, 4 wheel Vehicles   Other      TOTAL
Male      9 = 82%                       2 = 18%    11 = 100%
Female    7 = 78%                       2 = 22%    9 = 100%

This contingency table indicates that there is NOT an association between favorite type of vehicle and gender. There is an overall trend that most students prefer motorized vehicles with four wheels, regardless of gender. This is a strong trend.

EXAMPLE D Contingency Table

          Motorized, 4 wheel Vehicles   Other      TOTAL
Male      2 = 18%                       9 = 82%    11 = 100%
Female    7 = 78%                       2 = 22%    9 = 100%

This contingency table indicates that there IS an association between favorite type of vehicle and gender. Females are more likely to prefer motorized, four-wheel vehicles, and males are more likely to prefer other types of vehicles. This is a strong association.

EXAMPLE E Contingency Table

          Motorized, 4 wheel Vehicles   Other      TOTAL
Male      6 = 50%                       6 = 50%    12 = 100%
Female    4 = 50%                       4 = 50%    8 = 100%

This contingency table is inconclusive. It shows neither an overall trend nor an association with gender.

Part Two: Quantitative Data

Group Member   Question Number   Actual Question
               5
               6
               7
               8

1. Enter the data into a calculator and run the 1-Variable Stats. Record the results. Label each value not only with its symbol or abbreviation, but with its description. For example, Sx = 2.563, the standard deviation for this sample. See Example F.
2. Using the statistics you found in step 1, produce a boxplot to represent this data.
Be sure that your graph is clearly labeled and easy for the reader to understand. See Example G.
3. Group your data into 4-6 reasonable intervals. The intervals must be equal in size. Complete a grouped frequency table for your data. Use that table to create a histogram to represent this data. Be sure that your axes are labeled, and that your graph has a title and is easy to read. See Example H.
4. Test your data for outliers. State which outlier test you are going to use, and show all the steps to apply that test clearly. Make a definite conclusion about which data values are outliers. How do the outliers affect the data? See Examples I, J, and K.
5. Perform an analysis of your data compared to the demographic information collected. Do that by completing the four contingency tables described below. Divide your data into TWO reasonable categories, and use those same categories in each contingency table. See Example L.
   a) Gender (M, F) and your two categories
   b) Grade level (non-seniors, seniors) and your two categories
   c) Age (under 16, 16 and up) and your two categories
   d) Employment (Job, No job) and your two categories
   Complete each table with the actual counts AND the percentages in each demographic row. Write a short conclusion about whether you find an association with the demographic information or an overall trend. Describe that association or trend.
6. Write a summary of what you have learned about DVHS students from the results of this question and your analysis. Be specific, and refer to your visual displays to support your claims.

Quantitative Data Examples

EXAMPLE F 1-Var Stats

Question: How many times have you stayed up late because of a homework assignment or test, in the last two weeks?

Data: 3 4 0 1 1 0 1 1 0 4 3 2 9 2 2 0

Calculator Results
x̄ = 2.06
Sx = 2.294
n = 16
minX = 0
Q1 = 0.5
Med = 1.5
Q3 = 3
maxX = 9

Meaning?
The mean for this sample is 2.06 late nights. This is the average.
The standard deviation for this sample is 2.294 late nights. This measures how spread out the data is.
The sample size is 16.
The minimum number of late nights was 0.
The first quartile for this data is 0.5. About 25% of the data is below this value.
The median for this data is 1.5. About 50% of the data is below this value.
The third quartile for this data is 3. About 75% of the data is below this value.
The maximum number of late nights was 9.

EXAMPLE G BoxPlot

Made with http://www.imathas.com/stattools/boxplot.html

[Boxplot: "Times up late for schoolwork".]

EXAMPLE H Grouped Frequency Table

Number of Late Nights (interval notation)   Frequency
[0, 2]                                      11
(2, 4]                                      4
(4, 6]                                      0
(6, 8]                                      0
(8, 10]                                     1

[Histogram: "Times up Late for Schoolwork". Vertical axis: Number of students. Horizontal axis: Number of Late Nights.]

EXAMPLE I Empirical Rule TEST for OUTLIERS

From our calculator work: x̄ = 2.06 and Sx = 2.294

Lowest Expected Value = x̄ - 3Sx = 2.06 - 3(2.294) = -4.822
No data is lower than this, so there are no outliers on this end of the data.

Highest Expected Value = x̄ + 3Sx = 2.06 + 3(2.294) = 8.942
The data value of 9 is larger than this, so 9 is an outlier.

EXAMPLE J IQR TEST for OUTLIERS

From our calculator work: Q1 = 0.5, Med = 1.5, Q3 = 3
IQR = Q3 - Q1 = 3 - 0.5 = 2.5

Lowest Expected Value = Q1 - 1.5(IQR) = 0.5 - 1.5(2.5) = -3.25
No data is lower than this, so there are no outliers on this end of the data.

Highest Expected Value = Q3 + 1.5(IQR) = 3 + 1.5(2.5) = 6.75
The data value of 9 is larger than this, so 9 is an outlier.

EXAMPLE K Z-SCORE TEST for OUTLIERS

z = (x - x̄) / Sx

From our calculator work: x̄ = 2.06 and Sx = 2.294

Lowest Expected Value: z = -3.
-3 = (x - 2.06) / 2.294, so x = -4.822
No data is lower than this, so there are no outliers on this end of the data.

Highest Expected Value: z = 3.
3 = (x - 2.06) / 2.294, so x = 8.942
The data value of 9 is larger than this, so 9 is an outlier.
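
The outlier tests in Examples I, J, and K can also be checked with a short script. The sketch below is only an illustration in Python and is not part of the assignment; the function names are made up for this example. It assumes the calculator finds Q1 and Q3 as the medians of the lower and upper halves of the sorted data, which matches the values in Example F. Because the script uses the unrounded mean and standard deviation, its cutoffs may differ from the hand-worked ones in the third decimal place.

```python
import statistics

# Data from Example F (late nights in the last two weeks).
data = [3, 4, 0, 1, 1, 0, 1, 1, 0, 4, 3, 2, 9, 2, 2, 0]


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: this mirrors the calculator's 1-Var Stats quartiles.
    Other tools may use a different rule and give different values.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[(n + 1) // 2:]  # skips the middle value when n is odd
    return statistics.median(lower), statistics.median(upper)


def values_outside(low, high, values):
    """Return the data values below `low` or above `high`."""
    return [v for v in values if v < low or v > high]


mean = statistics.mean(data)
sx = statistics.stdev(data)  # sample standard deviation (Sx)

# Examples I and K: cutoffs at 3 standard deviations from the mean
# (the same as z = -3 and z = 3).
sd_low, sd_high = mean - 3 * sx, mean + 3 * sx
sd_outliers = values_outside(sd_low, sd_high, data)

# Example J: cutoffs at 1.5 * IQR beyond the quartiles.
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1
iqr_low, iqr_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = values_outside(iqr_low, iqr_high, data)

print(f"mean = {mean:.2f}, Sx = {sx:.3f}, Q1 = {q1}, Q3 = {q3}")
print(f"3-SD cutoffs: {sd_low:.3f} to {sd_high:.3f}, outliers: {sd_outliers}")
print(f"IQR cutoffs: {iqr_low} to {iqr_high}, outliers: {iqr_outliers}")
```

Both tests flag the same value here, but that will not always happen; the IQR cutoff (6.75) is stricter than the 3-standard-deviation cutoff for this data.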
EXAMPLE L Contingency Table

          Less than twice   Twice or more   TOTAL
Male      6 = 86%           1 = 14%         7 = 100%
Female    7 = 78%           2 = 22%         9 = 100%

There is not an association with gender. There is a strong trend that most students have stayed up late for schoolwork less than twice in the past two weeks.

Part Three: Summary

1. Complete the table shown. Each group member should complete the two rows that pertain to their questions. Have your group edit your work and give feedback.

Question Number   Actual Question   Summary of Our Analysis   Proposed Revisions to the Question
1
2
3
4
5
6
7
8

2. Explain what recommendations you would make to DVHS administration based on your survey data and analysis. Back these recommendations with your data. Be clear about what changes you are asking for and why. Each group member should contribute at least one recommendation. Be sure to indicate which contribution is from each member.
3. Discuss any potential lurking variables that may have influenced your data. What were they? How could they have influenced your data? Do you think they were a major or a minor influence? How could we measure them for next time?
4. Discuss any possible sources of bias that may have influenced your results. What were they? How could they have influenced your data? Do you think they were a major or a minor influence? How could we change our process to avoid or at least minimize them for next time?
5. What was your margin of error for this survey? How does that affect our analysis? How could the margin of error be improved next time?
6. Discuss what you (personally) have learned from this project. Be specific.
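
For item 5 above, this handout does not say which margin of error formula to use, so follow the one given in class. As a rough illustration only, the Python sketch below shows two common approximations for a survey proportion at about 95% confidence: the standard formula z * sqrt(p(1 - p) / n) with the worst case p = 0.5, and the rule of thumb 1 / sqrt(n). The function names, the value z = 1.96, and the sample sizes are assumptions chosen for this example; n = 16 is borrowed from Example F.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumptions (not stated in the handout): a simple random sample,
    about 95% confidence (z = 1.96), and the worst case p = 0.5.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)


def rule_of_thumb(n):
    """Quick approximation of the 95% margin of error: 1 / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1 / math.sqrt(n)


# n = 16 matches Example F; the larger sizes are hypothetical.
for n in (16, 100, 400):
    print(f"n = {n}: formula {margin_of_error(n):.1%}, "
          f"rule of thumb {rule_of_thumb(n):.1%}")
```

Under either approximation, the margin of error shrinks as the sample size grows, and quadrupling the sample size cuts it in half.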