Introduction to StatCrunch

Go into Course Compass and click on “Multimedia Library”, click “StatCrunch”, click “Find Now”, and then select any of the blue links below.

To load data from an Excel file into StatCrunch, do the following. In StatCrunch, click on “Data”. From the drop-down menu select “Load Data” and then “From File”. Browse to find the file, then click “Okay”. (The file must be an Excel 97-2003 file with the .xls extension.)

Summarizing Data Numerically

Let’s find summary statistics for how many hours per day students in the sample reported watching TV. Click on “Stat” and then select “Summary Stats” followed by “Column”. Select the “TV” column and then click “Next”. If there are any summary statistics that you are not interested in, click on them in the column on the left and they will be removed from the column on the right. When you are done, click “Calculate”.

Graphs:

All of the graphs are found by clicking on the “Graphics” button at the top of the spreadsheet.

Begin by making a histogram of the “Number of States Visited” data. Click “Graphics” and then “Histogram”. Select the “States_Visited” column. Click “Next” and then “Next” two more times. On the next screen, enter “Number of States Visited” for the X axis label. Click “Create Graph”.

Next, make a bar graph of the gender data. Click on “Graphics” and select “Bar Plot with Data”. Label the X axis appropriately.

Finally, make a boxplot of the “state” data according to “class year”. Click “Graphics”, select “Boxplot” and select the column “States_Visited”. Next to the “Group by” field select “Year”. Click “Next”. If you would like a modified boxplot, select “Use fences to identify outliers”. Click “Next” and then type appropriate labels for the X axis label and the Y axis label. When you are done, click “Create Graph”.

Copying StatCrunch output into a PowerPoint presentation:

Open Microsoft’s “Paint” program.
Click on the output (graph or table) that you want to copy and press the “Ctrl” and “Print Screen” keys at the same time. Go to the Paint program and press “Ctrl” and “V” at the same time. Select the graph/table that you would like to copy, then press the “Ctrl” and “C” keys at the same time. Go to the PowerPoint presentation and press “Ctrl” and “V” at the same time. Your graph/table should appear in the PowerPoint presentation.

Confidence Interval

Estimate one population proportion (appropriate for a categorical variable)

Let’s find a confidence interval to estimate the proportion of all SU students who smoke. Click “Stat”, then “Proportions”, then “One Sample” and finally “with Data”. In the pop-up window, enter the following:

Outcomes in: Smoke
Success: Y

Click “Next” and then select the “Confidence Interval” button and choose a 0.95 level. Finally, click “Calculate”. The lower number of the confidence interval is displayed as “L. Limit” and the upper number is displayed as “U. Limit”.

Estimate one population mean (appropriate for a numerical variable)

Let’s find a confidence interval to estimate the mean number of states that all SU students have visited. Click “Stat”, then “t-Statistics”, then “One sample” and finally “With Data”. In the pop-up window select the “States_Visited” column and then click “Next”. Select the “Confidence Interval” button and enter the appropriate confidence level. Finally, click “Calculate”.

Two Sample Inference

Test a claim involving two population proportions and perform a CI to estimate the difference in two population proportions

Let’s find a confidence interval to estimate the difference in the proportion of male and female SU students who smoke. First we need to separate the male and female smoking status data.
To do this, click “Data”, click “Split Column”, and in the pop-up window that appears, enter the following:

Column: Smoke
Group Column: Gender

Click “Create Groups!” The data will appear in the table in columns labeled “F Smoke” and “M Smoke”.

To construct the confidence interval, click “Stat”, then “Proportions”, then “Two Samples” and finally “with Data”. In the pop-up window, enter the following:

Sample 1 in: M Smoke
Success: Y
Sample 2 in: F Smoke
Success: Y

(I chose sample 1 to be males because I think they will smoke more often.)

Click “Next” and then select the “Confidence Interval” button and choose a 0.95 level. Finally, click “Calculate”. The lower number of the confidence interval is displayed as “L. Limit” and the upper number is displayed as “U. Limit”.

If you want to do a hypothesis test instead, follow the directions above, but select the “Hypothesis Test” button rather than the “Confidence Interval” button, and then select the appropriate direction for the alternative hypothesis.

Test a claim involving two population means and perform a CI to estimate the difference between two population means

Let’s say that we would like to see if there is a difference in how often male and female students at SU read the paper per week. We first need to split the “Paper” data according to gender. Do this the same way as you did above with the smoking data. Once the data has been separated into two columns, click on “Stat”, “T-Statistics”, “Two Sample”, “With Data”. Enter the female data into “Sample 1” and enter the male data into “Sample 2”. If you are interested in doing a hypothesis test, select the “Hypothesis Test” button and the appropriate sign for the alternative hypothesis. If you want a confidence interval, select the “Confidence Interval” button and enter the appropriate level. Finally, click “Calculate”.
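
For readers who want to see the arithmetic behind the two-proportion confidence interval described earlier in this section, here is a minimal Python sketch. It is not StatCrunch’s own code. It assumes the usual large-sample (Wald) formula for the difference in two proportions, the function names split_column and two_proportion_ci are hypothetical, and the gender and smoke values are made up for illustration (they are not the class survey data).

```python
from math import sqrt
from statistics import NormalDist


def split_column(values, groups):
    """Group `values` by the matching entry in `groups` (like "Split Column")."""
    out = {}
    for value, group in zip(values, groups):
        out.setdefault(group, []).append(value)
    return out


def two_proportion_ci(sample1, sample2, success="Y", level=0.95):
    """Large-sample (Wald) interval for p1 - p2; an assumed formula."""
    n1, n2 = len(sample1), len(sample2)
    p1 = sample1.count(success) / n1
    p2 = sample2.count(success) / n2
    # Critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Standard error of the difference in sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Made-up data for illustration only (not the class survey).
gender = ["M", "F", "M", "F", "M", "F", "M", "F", "M", "F"]
smoke = ["Y", "N", "N", "N", "Y", "Y", "N", "N", "Y", "N"]

by_gender = split_column(smoke, gender)
lower, upper = two_proportion_ci(by_gender["M"], by_gender["F"])
print(f"L. Limit: {lower:.4f}  U. Limit: {upper:.4f}")
```

Sample 1 is the male group here, matching the choice made above, so a positive difference means a higher smoking proportion among males. With a sample this small the large-sample formula is not reliable; the data only illustrates the steps.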
Test a claim about association between two categorical variables (chi-squared)

Suppose that we are interested in seeing if there is an association between class year and whether or not one smokes. Click on “Stat”, then “Table”, then “Contingency” and finally “with data”. In the pop-up window that appears, select “Year” as the row variable and “Smoke” as the column variable. Click “Next”. If you would like percents tallied or to see the expected cell counts, click the appropriate buttons. When you are done, click “Calculate”.
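
To show what the expected cell counts and the chi-squared statistic represent, here is a minimal Python sketch of the standard calculation. It is an illustration under stated assumptions, not StatCrunch’s implementation: the function name chi_square_statistic is hypothetical, and the year and smoke values are made up.

```python
from collections import Counter


def chi_square_statistic(rows, cols):
    """Return (observed, expected, statistic, df) for two categorical lists."""
    observed = Counter(zip(rows, cols))  # cell counts of the contingency table
    row_totals = Counter(rows)
    col_totals = Counter(cols)
    n = len(rows)
    expected = {}
    statistic = 0.0
    for r in row_totals:
        for c in col_totals:
            # Expected count if the two variables were not associated.
            exp = row_totals[r] * col_totals[c] / n
            expected[(r, c)] = exp
            statistic += (observed[(r, c)] - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return observed, expected, statistic, df


# Made-up data for illustration only (not the class survey).
year = ["Fr", "Fr", "Fr", "Fr", "Sr", "Sr", "Sr", "Sr"]
smoke = ["Y", "N", "N", "N", "Y", "Y", "Y", "N"]

observed, expected, statistic, df = chi_square_statistic(year, smoke)
print("Expected counts:", expected)
print("Chi-squared statistic:", statistic, "with df =", df)
```

The p-value is left out of the sketch because it requires the chi-squared distribution, which is not in the Python standard library. The tiny made-up table is only there to make the arithmetic easy to follow by hand.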