Minitab Guide for Math 355

[Figure: Heights of Math 355 Students, Spring, 2002; x-axis: Height, y-axis: Frequency]

[Figure: Heights by Gender; x-axis: Gender (Female, Male), y-axis: Height]

Descriptive Statistics: Height by Gender

Variable  Gender    N     Mean   Median   TrMean   StDev
Height    Female   12   66.875   68.000   67.100   3.206
          Male     23   70.935   71.000   70.833   3.498

Variable  Gender   SE Mean   Minimum   Maximum       Q1       Q3
Height    Female     0.925    60.500    71.000   64.250   69.750
          Male       0.729    65.000    79.000   67.000   74.000

To go with Introduction to the Practice of Statistics, Fourth Edition

Table of Contents

Chapter 1
  Graphs for Categorical Variables
    Bar Charts
    Pie Charts
  Numerically Describing One Categorical Variable
  Graphical Displays of Quantitative Variables
    Stem-and-Leaf Plots
    Histograms
    Time Plots
    Dot Plots
    Box Plots
  Numerically Describing Quantitative Variables
  Transforming Data (Changing Units)
  Normal Quantile Plots
  Generating Random Data

Chapter 2
  Scatterplots
  Correlation
  Determining the Regression Line

Chapter 3
  Selecting a Simple Random Sample

Chapter 5
  Calculating Binomial Probabilities
  Calculating Normal Curve Probabilities and Percentiles

Chapter 7
  The One-Sample t Confidence Interval
  The One-Sample t Test
  The Two-Sample t Significance Test
  The Two-Sample t Confidence Interval

Chapter 8
  Calculating a Confidence Interval for a Population Proportion
  Large-Sample Significance Test for a Population Proportion
  Calculating Confidence Intervals for Comparing Two Proportions
  Significance Tests for Comparing Two Proportions

Chapter 9
  Numerically Describing Two Categorical Variables
  Chi-Square Test for Two-Way Tables

Miscellaneous
  Adding Titles and Footnotes to Graphs

Chapter 1

Graphs for Categorical Variables: Bar Charts and Pie Charts

To draw a bar graph of a categorical variable:
1) Use Graph > Chart.
2) In the dialog box, specify the column containing the raw data for the categorical variable in question as the X variable. Do not make any other specifications.
3) Click OK.

To draw a pie chart of a categorical variable:
1) Use Graph > Pie Chart.
2) Click on “Chart data in” and specify the column containing the raw data.
3) Add a title. Click OK.

Numerically Describing One Categorical Variable

When you have column(s) of raw data, to determine how many and what percent fall into each of the categories:
1) Use Stat > Tables > Tally.
2) In the dialog box, under “Variables”, specify the desired column(s) containing the raw data for the categorical variable(s) in question, and click on any desired options for counts and percents.

Graphical Displays of Quantitative Variables: Stem-and-Leaf Plots, Histograms, Time Plots, Dot Plots and Box Plots

To draw a stemplot of a quantitative variable:
1) Use Graph > Stem-and-Leaf.
2) In the dialog box, specify the column(s) containing the data as your Variables. Click OK.
3) To produce separate stemplots, you can click on “By variable” and specify the column that contains your (quantitative) sorting variable.
4) To split stems, change the increment as follows: to split each stem into two, let Increment = 5 times the leaf unit; to split each stem into five, let Increment = 2 times the leaf unit; to leave the stems unsplit, let Increment = 10 times the leaf unit.

To draw a histogram of a quantitative variable:
1) Use Graph > Histogram.
2) In the dialog box, specify under X a column containing the raw data for a quantitative variable.
3) Use the Options button if you want to display percents rather than counts or if you want to control how many intervals are created.

To draw a time plot:
1) Use Graph > Time Series Plot.
2) In the dialog box, specify under Y the column containing the data, and then click OK.

Alternatively, if you have the data in one column and the corresponding time (year, month, observation order, etc.)
in another column:
1) Use Graph > Plot.
2) In the dialog box, specify the column containing the data under Y and the column containing the time/order under X.
3) Click on “Display” and select “Connect”.
4) Click OK.

To draw a dotplot:
1) Use Graph > Dotplot.
2) You can group according to a sorting variable by selecting “By variable” and specifying the column that contains the sorting variable (which can be either quantitative or categorical).

To draw a boxplot:
1) Use Graph > Boxplot.
2) In the dialog box, specify a column containing the raw data for a quantitative variable under Y. It is not necessary to enter anything under X.

To create side-by-side boxplots to compare a quantitative response variable across categories of an explanatory variable:
1) Use Graph > Boxplot.
2) In the dialog box, specify the response variable under Y and specify the categorical explanatory variable under X.

Describing Quantitative Variables Numerically

To determine summary statistics for a quantitative variable:
1) Use Stat > Basic Statistics > Display Descriptive Statistics.
2) In the dialog box under “Variables” specify the column(s) containing the raw data for the quantitative variable(s).
3) The Graphs button of the dialog box provides options for several different graphs.
4) If you wish to compare numerical summaries of a quantitative variable across categories, check “By variable” and specify the variable that defines the categories.

[The output from Stat > Basic Statistics > Display Descriptive Statistics includes the mean, the standard deviation, and the elements of the five-number summary (the minimum, the first quartile, the median, the third quartile, and the maximum).]

Transforming Data (Changing Units)

To change the units of your data:
1) Use Calc > Calculator.

Normal Quantile Plots

To draw a normal quantile plot:
1) Use Graph > Probability Plot.
2) In the dialog box, specify the column containing the data, and click OK.
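
For readers who want to see what a normal quantile plot displays, the following is a minimal Python sketch of the points behind such a plot: each sorted observation is paired with the standard normal score for its position in the sample. The function name normal_quantile_points, the plotting position (i - 0.5)/n, and the sample values are assumptions made for illustration; Minitab's Probability Plot may use a different plotting position and a percent scale.

```python
from statistics import NormalDist


def normal_quantile_points(data):
    """Pair each sorted observation with a standard normal score."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Assumed plotting position; statistical packages use several variants.
        p = (i - 0.5) / n
        points.append((std_normal.inv_cdf(p), value))
    return points


# Hypothetical heights in inches, for illustration only.
heights = [60.5, 64.0, 66.5, 68.0, 69.5, 71.0]
for z, x in normal_quantile_points(heights):
    print(f"{z:7.3f}  {x:5.1f}")
```

If the printed pairs fall close to a straight line when plotted, the data are roughly normal.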

To Generate Random Data from a Specified Probability Distribution

To generate n observations from the standard normal distribution:
1) Use Calc > Random Data > Normal.
2) In the dialog box, specify the number of observations by inputting n as the number of rows of data, and specify the column(s) you want the data to be stored in.
3) Click OK.

[You can generate data from any normal distribution by indicating the desired mean and standard deviation.]

To generate n observations from the uniform distribution:
1) Use Calc > Random Data > Uniform.
2) In the dialog box, specify the number of observations by inputting n as the number of rows of data, and specify the column(s) you want the data to be stored in, as well as the lower and upper endpoints. Click OK.

Chapter 2

Graphing the Relationship Between Two Quantitative Variables

To draw a scatterplot:
1) Use Graph > Plot.
2) In the dialog box, specify the columns containing the raw data for Y (the response variable) and X (the explanatory variable).
3) To mark different subgroups with different symbols, use the “Data display:” area of the Plot dialog box. Put the word Group under “For each,” and specify the column that defines the groups under “Group Variables.”

Correlation

To calculate a correlation coefficient:
1) Use Stat > Basic Statistics > Correlation.
2) Specify two columns as the Variables.
3) Click OK.

Determining the Regression Line

To find a least-squares regression equation:
1) Use Stat > Regression > Regression.
2) In the dialog box, specify the column containing the raw data for the response variable (Y) as the “Response,” and specify the column containing the data for the explanatory variable (X) as the “Predictors.”
3) You can get residual plots by using the Graphs button.

To find a regression line and also have Minitab draw this line onto a scatterplot of the data, use Stat > Regression > Fitted Line Plot. Specify the response variable (Y) and the predictor (X) in the dialog box.
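
As a check on the correlation and regression output, the quantities can also be computed directly from their definitions. The sketch below is a minimal illustration in Python, not a description of how Minitab computes them; the function name correlation_and_line and the data values are hypothetical.

```python
from math import sqrt
from statistics import mean


def correlation_and_line(x, y):
    """Return (r, intercept, slope) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return r, intercept, slope


# Hypothetical explanatory (x) and response (y) values, for illustration only.
x = [61.0, 64.0, 67.0, 70.0, 73.0]
y = [110.0, 128.0, 141.0, 160.0, 172.0]
r, a, b = correlation_and_line(x, y)
print(f"r = {r:.3f}; fitted line: y = {a:.2f} + {b:.3f} x")
```

The slope equals r times the ratio of the standard deviation of y to that of x, so the sign of the slope always matches the sign of r.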

Chapter 3

Selecting a Simple Random Sample

To sample values from a column:
1) Use Calc > Random Data > Sample from Columns.
2) In the dialog box, specify how many items (rows) will be selected from a particular column, and specify a column where the sample will be stored.

To create a column of ID numbers:
1) Use Calc > Make Patterned Data > Simple Set of Numbers.
2) In the dialog box, specify a column for storing the ID numbers, and specify the first and last possible ID number for the population.

Note: Items can be randomly selected from a column of names or data values, so it may not be necessary to assign ID numbers to the units in the population in order to select a sample.

Chapter 5

Calculating Binomial Probabilities

1) Use Calc > Probability Distributions > Binomial.
2) In the dialog box, select either “Probability” or “Cumulative Probability” depending on whether you want P(X = k) or P(X ≤ k).
3) Specify the number of trials and the probability of success.
4) Click on “Input Constant” and fill in the corresponding box with the value of k.
5) Click OK.

Calculating Normal Curve Probabilities and Percentiles

1) Use Calc > Probability Distributions > Normal.
2) Specify the mean and standard deviation.
3) To find P(X ≤ k), select “Cumulative Probability” and also specify the value of k in the box labeled “Input Constant.”
4) To find a percentile, select “Inverse Cumulative Probability” and also specify the cumulative probability for the percentile in the box labeled “Input Constant.”

Note: It is not necessary to compute z-scores when using Minitab for determining normal curve probabilities.

Chapter 7

Calculating the One-Sample t Confidence Interval

1) Use Stat > Basic Statistics > 1-Sample t.
2) Under “Variables” specify the column that contains the raw data.
3) To change the confidence level, use the Options button. The default confidence level is 95%.

For paired data:
1) First calculate a column of differences using Calc > Calculator.
2) Use the 1-Sample t procedure above, using your calculated differences as your data.

Alternatively, for paired data:
1) Use Stat > Basic Statistics > Paired t.
2) Specify the two columns (first sample and second sample) that contain the raw data for the pair of measurements. Note the direction of the subtraction.
3) To change the confidence level, use the Options button. The default confidence level is 95%.

Note: For both the 1-Sample and Paired t, the Graphs button can be used to create visual displays of the data.

The One-Sample t Test

1) Use Stat > Basic Statistics > 1-Sample t.
2) Under “Variables” specify the column that contains the raw data, and specify the null value of the mean in the “Test mean:” box.
3) To specify the alternative hypothesis, use the Options button.

For paired data:
1) Calculate a column of differences using Calc > Calculator.
2) Use the 1-Sample t procedure, letting “Test mean:” equal 0.

Alternatively, for paired data:
1) Use Stat > Basic Statistics > Paired t.
2) Specify the two columns (first sample and second sample) that contain the raw data for the pair of measurements. Note the direction of the subtraction.
3) Use the Options button to specify the alternative hypothesis.

Note: For both the 1-Sample and Paired t, the Graphs button can be used to create visual displays of the data.

The Two-Sample t Significance Test

1) Use Stat > Basic Statistics > 2-Sample t.
2) Specify the location of the data (see note below).
3) To specify the alternative hypothesis, use the Options button.

Note: The raw data for the response may be in one column (Samples), and the raw data for group categories (Subscripts) may be in a second column. Or the raw data for the two independent groups may be in two different columns (first and second).

Use the Graphs button to create a comparative dotplot or a comparative boxplot.
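
For reference, the t statistic for two independent samples, in the form that does not assume equal population variances, can be sketched in Python as follows. This is an illustration under stated assumptions, not Minitab's implementation: the function name welch_t and the data are hypothetical, and the degrees of freedom use the Satterthwaite approximation, which software may round or truncate.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t, df) for two independent samples without pooling variances."""
    n1, n2 = len(sample1), len(sample2)
    # Each sample variance divided by its sample size.
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    # Satterthwaite approximation to the degrees of freedom (an assumption here).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical measurements for two groups, for illustration only.
group_a = [66.0, 64.5, 68.0, 70.0, 63.5]
group_b = [71.0, 69.5, 74.0, 67.0, 72.5, 70.0]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, approximate df = {df:.1f}")
```

The sign of t follows the direction of the subtraction (first sample minus second sample).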

Note: In Minitab, the default option for the two-sample t procedures is the “unpooled” version in which the population variances are not assumed to be equal. To get the pooled version, check the dialog box item that says “Assume Equal Variances.” Check with your instructor to see which you should be using.

The Two-Sample t Confidence Interval

1) Use Stat > Basic Statistics > 2-Sample t.
2) Specify the location of the data (see note above).
3) To change the confidence level, use the Options button.

Use the Graphs button to create a comparative dotplot or a comparative boxplot.

Chapter 8

Calculating a Confidence Interval for a Population Proportion

1) Use Stat > Basic Statistics > 1 Proportion.
2) If the raw data are in a column of the worksheet, specify that column.
3) If the data have already been summarized, click on “Summarized Data,” and then specify the sample size (“Number of trials:”) and the count (“Number of successes:”) of how many observations have the characteristic of interest. If using the Wilson estimate, enter n + 4 for the number of trials and X + 2 for the number of successes.
4) Use the Options button, and click on “Use test and interval based on normal distribution.”
5) To change the confidence level, use the Options button. The default confidence level is 95%.

Large-Sample Significance Test for a Population Proportion

1) Use Stat > Basic Statistics > 1 Proportion.
2) If the raw data are in a column of the worksheet, specify that column.
3) If the data have already been summarized, click on “Summarized Data,” and then specify the sample size (“Number of trials:”) and the count (“Number of successes:”).
4) Use the Options button, specify the value of p0, select the type of alternative hypothesis, and click on “Use test and interval based on normal distribution.”

Calculating Confidence Intervals for Comparing Two Proportions

1) Use Stat > Basic Statistics > 2 Proportions.
2) If the data have already been summarized, click on “Summarized Data,” and then specify the sample size (“Trials:”) and the number of “successes:” for each group (use n1 + 2 and n2 + 2, and X1 + 1 and X2 + 1 if using the Wilson estimate).
3) Use the Options button to change the confidence level. The default level is 95%.

Note: There are three possibilities for inputting data.
1) The raw data for the response (Samples) may be in one column of the worksheet, and the raw data for group categories (Subscripts) may be in a second column.
2) The raw data for the two independent groups may be in two different columns.
3) The data may already be summarized.

Significance Tests for Comparing Two Proportions

1) Use Stat > Basic Statistics > 2 Proportions. See note above for information about inputting data.
2) To specify the alternative hypothesis, use the Options button.
3) To compute the z-statistic described in this section, use Options, and then click “Use pooled estimate of p for test.”

Chapter 9

Numerically Describing Two Categorical Variables

To create a two-way table for two categorical variables starting with raw data:
1) Use Stat > Tables > Cross Tabulation.
2) In the dialog box, specify the two columns containing the raw data as the “Classification variables:”, and then choose any desired percents (row and/or column and/or total).

Chi-Square Test for Two-Way Tables

If the raw data are stored in columns of the worksheet:
1) Use Stat > Tables > Cross Tabulation.
2) In the dialog box, specify the two columns containing the raw data as the “Classification variables:” and select “Chi-square analysis.” Click OK.

If the data are already summarized into counts:
1) Enter the table of counts into columns of the worksheet.
2) Use Stat > Tables > Chi-Square Test.
3) In the dialog box, specify the Columns containing the table.
4) Click OK.
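
To see where the chi-square statistic for a two-way table comes from, the calculation can be sketched by hand in Python. This is a minimal illustration, not Minitab's code: the function name chi_square_statistic and the table of counts are hypothetical, and the sketch returns only the statistic and its degrees of freedom, not a P-value.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical 2 x 3 table of counts, for illustration only.
counts = [
    [12, 18, 10],
    [20, 15, 25],
]
chi2, df = chi_square_statistic(counts)
print(f"chi-square = {chi2:.3f}, df = {df}")
```

Each row of the list corresponds to one row of the two-way table, in the same layout as the counts entered into the worksheet columns.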

Miscellaneous

To Add Titles and/or Footnotes to Graphs

Click on the “Annotation” option in the dialog box and select “Title” and/or “Footnote.”