Statistics and Risk Management
Software Proficiency

Video URL: jukebox.esc13.net/untdeveloper/Videos/Software%20Proficiency.mov

Vocabulary List:

Measure of Central Tendency: a single value, such as the mean, median, or mode, that describes the center of a data set.
Causality: the relationship between cause and effect.
Spreadsheet: a computer application used to organize and calculate tabulated data.
Commercial Software: proprietary software products available for a cost.
Freeware: software available for no cost and with no limitations on use.
Shareware: software available on a trial basis or with limited functionality, with the option to pay for a full version.
Contingency Table: a cross-tabulation of the frequencies of two variables assigned to rows and columns.
Histogram: a type of graph in which the widths of rectangular bars represent the limits of a class interval, or range of related values, and the heights of the rectangular bars represent the frequency of each class.

Copyright © Texas Education Agency, 2012. All rights reserved.

Resources:

The Importance of Business Statistics
In this article, Six Sigma breaks down the importance of business statistics for the non-statistician. The article touches on the need for sampling and other data-gathering strategies in the business world.
http://www.sixsigmaonline.org/six-sigma-training-certificationinformation/the-importance-of-business-statistics.html

Statistical Thinking in a Technological Environment
Through their research, Dani Ben-Zvi and Alex Friedlander have developed a method for teaching statistical thinking in a technological environment. Developed mainly for seventh through ninth grades, their method includes activities, research projects, and thinking processes.
http://www.dartmouth.edu/~chance/teaching_aids/IASE/4.Ben-Zvi.pdf

Spreadsheet: Statistical Functions
This site defines each of the various Microsoft Excel functions available to use in statistical calculations.
The site also provides some information and examples on how to select cells and ranges for calculations.
http://www.medcalc.org/manual/statistical_functions.php

Software Proficiency Practice Test
Name:_____________________

Use the Internet to locate answers; compare with the key.

List the five types of information required in order to analyze data:
1.
2.
3.
4.
5.

List the nine statistical tests which can be completed with this software.
6.
7.
8.
9.
10.
11.
12.
13.
14.

Software Proficiency Practice Test

LOG INTO SPREADSHEET SOFTWARE

You will first need to load the Analysis ToolPak:
File, Options, Add-ins
In the Manage box, select Excel Add-ins, then click Go
In the Add-Ins available box, select the Analysis ToolPak (make sure to check the box), then click OK
The ToolPak add-in will be located under DATA, Data Analysis

Check which options are available in your ToolPak:
1. ANOVA_____
2. Histogram_____
3. Chi-Square_____
4. T-Test_____
5. Z-Test_____
6. Sampling_____

Now, look under functions available. Go to Functions, More Functions, Statistical functions. Identify which options are available under functions, answering yes or no:
7. Chi-Square_____
8. Confidence_____
9. ANOVA_____
10. Z Test_____
11. T Test_____

Software Proficiency Practice Test KEY

List the five types of information required in order to analyze data:
1. Data Library
2. Dataset
3. Dependent Variable
4. Predictor Variable
5. Grouping Variable

List the nine statistical tests which can be completed with this software.
6. Descriptive
7. Histogram
8. Chi-Square Test
9. Box Plot
10. Stem and Leaf
11. Normal Quantile Plot
12. T-test/Confidence Interval
13. ANOVA
14. Correlation/Regression

Check which options are available in your ToolPak:
15. ANOVA_____ yes
16. Histogram_____ yes
17. Chi-Square_____ no
18.
T-Test_____ yes
19. Z-Test_____ yes
20. Sampling_____ yes

Now, look under functions available. Go to Functions, More Functions, Statistical functions. Identify which options are available under functions, answering yes or no:
21. Chi-Square_____ yes
22. Confidence_____ yes
23. ANOVA_____ no
24. Z Test_____ yes
25. T Test_____ yes

Student Assignment 8.1a
Software Proficiency
Name:_____________________

Your class just took an exam worth 100 points. There are 30 students in your class. The scores were as follows.

Student   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Score    73  78  89  78  81  85  91  97  92  84  86  84  79  98  92

Student  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
Score    79  85  88  87  84  87  86  79  75  86  81  83  85  78  90

Using Excel, create a new worksheet for this week’s assignment. First enter the case numbers 1-30 in Column A and the test scores in Column B. Next create a histogram type graph in the upper right of the worksheet. Save your work. You will continue to add to it all this week.

Student Assignment 8.1b
Software Proficiency
Name:_____________________

Next, using the Data Analysis Add-In, create a Histogram type graph in the upper right of the worksheet. Often raw data needs to be categorized into bins so a histogram can be created. Example: the scores you have can be divided into bins by score range: 6 scores between 90 and 100, 16 scores between 80 and 89, and so forth. If you do not supply bins, Excel's Histogram tool will divide the raw data into bins for you. Play with this function and learn to use it well. Look in the Analysis section of the Data tab. Click "Data Analysis" and highlight the "Histogram" tool from the Analysis Tools box. Click "OK." You will need to specify an Input Range and an Output Range. The Bin Range can be left blank. Select "Chart Output" in the output options section to generate a histogram graph.
Click "OK." If you end with a histogram that looks like a bell curve, you have done well. The bins will populate with what Excel thinks are good bins. In effect, it attempted to curve the resulting grades.

Again using the Data Analysis Add-In, create a Pie Chart type graph in the lower right of the worksheet. Hint: Use the Histogram tool again and then change it to a pie chart. Right-click on the completed histogram to change the chart type to pie. Save your spreadsheet work. You will continue to add to it all week.

Student Assignment 8.1c
Software Proficiency
Name:_____________________

Below are the grades on a four-point scale.

Student   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Grade     2   2   3   2   3   3   4   4   4   3   3   3   2   4   4

Student  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
Grade     2   3   3   3   3   3   3   2   2   3   3   3   3   2   4

Add the grade point assignments into Column C. Perform a chi-square goodness-of-fit test to see if the observed grades are significantly different from the expected 33.33% for each group. Using the bin data and an expected count of 10 for each of the three groups, what does =CHISQ.TEST(Actual Range, Expected Range) show? Was this an appropriate analysis test? What assumption(s) are made? Does this really mean anything?

Student Assignment 8.1d
Software Proficiency
Name:_____________________

Below is the gender of each student.

Student   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Gender    M   M   M   F   F   F   F   M   F   F   F   F   F   M   F

Student  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
Gender    F   M   F   M   F   M   M   F   M   M   F   M   M   M   M

Using spreadsheet software, enter the Gender in column D and use a t-Test calculation to see if there is a significant difference in grades between the female group and the male group of students. Copy the C & D columns to a new worksheet and then sort all of the 2 x 30 cells by Gender.
Then use the formula =T.TEST(male grade range, female grade range, 2, 2) to get a result. What does the result tell you? For fun, “raise up” the scores for all female students and see what happens to the t-Test results.

Student Assignment 8.1e
Software Proficiency
Name:_____________________

Below are the study times for each student.

Student   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Minutes  30  15  30  45  60  45  60  90 100  60  60  60  30  90 180

Student  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
Minutes  15  45  60  30  15  45  45  15  30  30  60  45  30  45 120

Using spreadsheet software, enter the study times in column E and calculate the correlation between the study times and exam scores. Copy columns B & E to another worksheet and use the =CORREL(Score Range, Study Time Range) function. What does this tell you?

Explore Activity: Statistics Software vs. Spreadsheet

The spreadsheet has become the go-to tool for numerical work in the 21st century. Spreadsheet programs were designed as programmable calculators, not as tools for statistical analysis. There are numerous add-ons to spreadsheet programs that attempt to work around the limitations, but it is hard to beat actual statistical software. In your lessons, you were introduced to one free option, PSPP, which is modeled after one of the standard professional software packages. In order to proceed with this activity, you will need to be able to install software on your computer (or your teacher will need permission to install it on school computers).

Option 1 – Your assignments in these lessons involved using spreadsheet software to analyze data. Repeat the analyses using dedicated statistical software.

Option 2 – Find a data set that is of interest to you and then conduct an analysis using one of the statistical program offerings.
Copy the data into PSPP (or a spreadsheet program) to investigate whether doctors treat overweight patients differently than normal-weight patients. This is an opportunity to see statistics used in a real-life situation.

Write a report about your set of data including:
a. A description of your data source (i.e., background on where you found the data, the question of interest you are investigating, etc.).
b. An appropriate graph.
c. Descriptive statistics output from computer software (this would include a report of the mean, standard deviation, and other relevant statistics).
d. An interpretation of the importance or relevance of the statistics with regard to your question of interest.
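
The spreadsheet results from Assignments 8.1a, 8.1c, and 8.1e can be cross-checked outside Excel. The sketch below is one possible way to do so in Python using only its standard library. Python is not part of the original lesson, and the variable and function names (scores, grades, minutes, pearson_r) are illustrative assumptions. It computes the descriptive statistics requested in item c of the report, the chi-square goodness-of-fit test against an expected count of 10 per grade group, and the correlation between exam scores and study time.

```python
import math
import statistics

# Exam scores for students 1-30 (Assignment 8.1a, Column B).
scores = [73, 78, 89, 78, 81, 85, 91, 97, 92, 84, 86, 84, 79, 98, 92,
          79, 85, 88, 87, 84, 87, 86, 79, 75, 86, 81, 83, 85, 78, 90]

# Four-point grades for students 1-30 (Assignment 8.1c, Column C).
grades = [2, 2, 3, 2, 3, 3, 4, 4, 4, 3, 3, 3, 2, 4, 4,
          2, 3, 3, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 2, 4]

# Study times in minutes for students 1-30 (Assignment 8.1e, Column E).
minutes = [30, 15, 30, 45, 60, 45, 60, 90, 100, 60, 60, 60, 30, 90, 180,
           15, 45, 60, 30, 15, 45, 45, 15, 30, 30, 60, 45, 30, 45, 120]

# Descriptive statistics for the exam scores.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
sd_score = statistics.stdev(scores)  # sample standard deviation

# Chi-square goodness of fit: observed grade counts vs. 10 per group.
observed = [grades.count(g) for g in (2, 3, 4)]
expected = [len(grades) / 3] * 3
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# Three groups give 2 degrees of freedom, and for exactly 2 degrees of
# freedom the upper-tail chi-square probability is exp(-x / 2).
p_value = math.exp(-chi_sq / 2)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(scores, minutes)

print(f"mean={mean_score:.2f} median={median_score} sd={sd_score:.2f}")
print(f"observed counts (grades 2, 3, 4): {observed}")
print(f"chi-square={chi_sq:.2f} p={p_value:.4f}")  # chi-square=5.60 p=0.0608
print(f"correlation between score and study time: r={r:.3f}")
```

Excel's CHISQ.TEST returns the p-value rather than the chi-square statistic, so its result should be compared with p_value here. The closed-form p-value only holds for 2 degrees of freedom; a data set with a different number of groups would need a statistics package such as PSPP.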