Minitab Manual - Kennesaw State University

Minitab Reference Manual:
A gentle overview
Table of Contents
1. Introduction to Minitab
 Importing Data
2. Data Analysis and Statistical Concepts
 Concept 1 – Measurements of Central Tendency
 Concept 2 – Measurements of Dispersion
 Concept 3 – Visualization of Univariate Data
 Concept 4 – Visualization of Multivariate Data
 Concept 5 – Random Number Generation And Simple Sampling
 Concept 6 – Confidence Intervals
Developed and maintained by the Center for Statistics and Analytical Services of Kennesaw State University
These reference manuals have been developed to assist students in the basics of statistical computing – sort of a
“Statistical Computing for Dummies”. It is not our intention to use this manual to teach statistical concepts [1] but rather
to demonstrate how to utilize previously taught statistical and data analysis concepts the way that professionals and
practitioners apply them – through the able assistance of computing. Proficiency in software allows students to focus
more on the interpretation of the output and on the application of results rather than on the mathematical computations.
We should pause here and strongly make the point that computers should serve as a medium of expediency of calculation
– not as a substitution for the ability to execute a calculation.
In the Basic Concepts manual, we present statistical concepts, context for their use, and formulas where appropriate. We
provide exercises to execute these concepts by hand. Then, in each subsequent manual, the concepts are applied in a
consistent manner using each of the five major statistical computing packages – Excel, SPSS, Minitab, R and SAS.
[1] Readers of this manual are assumed to have completed some introductory statistics course. For individuals wishing to review statistical
concepts, we recommend Introduction to Stats by DeVeaux, Velleman and Bock.
Minitab
What is Minitab?
Minitab was first developed in 1972 at Penn State University (Go Nittany Lions!). Initially, it was developed as a teaching
tool to help professors teach basic statistics. It is still used for that purpose at more than 4000 colleges and universities
around the world. One reason for its popularity in this venue is that it has a user-friendly, menu-driven interface, much
like SPSS. Because it offers accurate and customizable analysis tools for quality control and other important business and
industry functions, it is also now widely used by companies of all sizes. It is currently the package of choice at many
manufacturing Fortune 500 companies including Ford Motor Company, 3M, Honeywell International, and Samsung.
Getting data into Minitab
Prior to actually executing any of the statistical concepts from the Basic Concepts Manual, we first need to get the
WidgeOne.xls dataset into the Minitab system and convert it into a Minitab file.
When you open Minitab you should see the following screen:
This is the Session Window.
This is the Data Window.
As shown above, the display consists of two windows. The Session Window is at the top and is where you will see
commands and results displayed. The Data Window is at the bottom. It is the worksheet where the data is displayed in a
spreadsheet format. You now need to open up the WidgeOne dataset in Minitab.
In the File menu choose Open Worksheet as shown below:
You will next see a typical window for opening a file. Remember the WidgeOne.xls dataset is an Excel file. Your computer
will initially be looking for Minitab files only. You have the option of looking for files of any type. The window below
shows the system being instructed to look for Excel files. It also shows the WidgeOne dataset being selected:
Click Open.
You should then see the following display:
The three worksheets in the Excel file WidgeOne.xls have all been converted to separate Minitab worksheets, named
Attendance, Employees and Plant_Survey. The statistical analysis for this guide is exclusively on the Plant_Survey
worksheet. You can close out of the others now. Make sure to go to the File menu and save the Plant_Survey worksheet
for future use.
Concept 1: Using Minitab for Measurements of Central Tendency
Minitab is a menu-based system. Thus it is only a matter of finding what you want to do on the menus and customizing
your request. For most computations, you should find Minitab (like SPSS) to be easier than Excel. In order to find the
two most predominant measures of central tendency (the mean and median) we start in the Stat menu. Within that menu,
choose Basic Statistics and Display Descriptive Statistics as shown:
Next you will see the following:
We need to choose the variables for which we are interested in finding the mean and median. We will choose only the
quantitative variables JOBGRADE, PRDCTY, SOCREL, YRONJOB and JOBSAT. We make these selections by clicking on the variables while
holding down the Control Key and then clicking on the Select button. This button will appear darker once a variable has
been highlighted.
This interface is common to almost every function in Minitab. The Select button will not activate until at least one variable
has been highlighted for selection:
After you have selected the variables for analysis, your screen should look like this:
Now click on the Statistics button. This will show you the statistics selected for display. There are many more statistics on
this list than you have been exposed to in this course.
The following dialogue box will appear:
We can select several different descriptive statistics. Statistics are selected by clicking in the box next to each until a check
mark appears. In this case, we have selected only the mean and median.
Once your selections are made, click the OK button and then click OK again in the Display Descriptive Statistics window.
We obtain the following display containing the means and medians of our five variables. The display appears in our
Minitab Session window:
Results for: Plant_Survey

Descriptive Statistics: JOBGRADE, PRDCTY, SOCREL, YRONJOB, JOBSAT

Variable    N    Mean  Median
JOBGRADE   40   6.600   6.500
PRDCTY     40   84.58   84.81
SOCREL     40   5.500   5.000
YRONJOB    40   8.290   8.350
JOBSAT     40   6.850   6.600

Notice that these figures are consistent with what we had generated using Excel and SPSS and what we had computed by
hand. Again, it is nice when numbers match!
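For readers who want to double-check a mean and median outside Minitab, here is a minimal Python sketch. The five values are made up for illustration and are not taken from the Plant_Survey worksheet; only the standard-library `statistics` module is used.

```python
from statistics import mean, median

# Hypothetical years-on-job values (illustration only, not the WidgeOne data)
sample = [2.0, 5.5, 8.0, 9.5, 12.0]

sample_mean = mean(sample)      # arithmetic average of the values
sample_median = median(sample)  # middle value once the data are sorted

print(f"Mean = {sample_mean:.3f}, Median = {sample_median:.3f}")
```

Replacing `sample` with a column exported from your own worksheet should let you compare the result against the Session Window output.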
Before we go on to looking at subsets of the data, let’s recode the values of the qualitative variables we will be using with
meaningful labels. The variables Plant and Gender are coded with single letters (N = Norcross, D = Dallas, etc.) We wish
to recode these variables, so our displays and graphs will have more descriptive names. These are the steps we use to
accomplish this. In the Data menu, select Code and then Text-to-Text as shown below:
We choose Text-to-Text because we wish to change text values (like D) to other text values (like Dallas).
We see a box like the one below. In this example, we have chosen the variable Plant as the column to code the data from
and also as the column to code the data into. This means we will recode the data within the same column rather than
choosing another one for the recoded values. Fill in the rest of the box as below:
Click OK. The Minitab Data Window should now look like this:
Perform the same type of recode for the Gender variable (M = Male, F = Female). Your data will then appear as follows:
Now we are ready to look at subsets of the data that will be determined by these qualitative variables. For example, what
if we wanted to know the measurements of central tendency of these variables by gender and by plant? We would
proceed exactly as before – Stat>Basic Statistics>Display Descriptive Statistics. We again choose the same five variables.
This time we will click inside the box called By variables. Once we click inside this box, the menu of variable choices
grows to include the qualitative variables from our set. Minitab knew we could not calculate means and medians for
qualitative variables and did not include those in the variable selection box. However, when we wish to subset the data,
these variables do come in as options. Please note that you should only place qualitative variables in the By variables box.
For the first analysis, choose Plant and then click on the Select button. You should see the following display:
Click on the Statistics button to choose Mean and Median again. Also check the N Total box this time, so we get the
frequency of each subset. Click OK and OK.
The following display will appear in your Session Window:
Descriptive Statistics: JOBGRADE, PRDCTY, SOCREL, YRONJOB, JOBSAT

Variable  Plant       N    Mean  Median
JOBGRADE  Dallas     23   6.870   7.000
          Norcross   17   6.235   6.000
PRDCTY    Dallas     23   88.34   90.42
          Norcross   17   79.49   79.86
SOCREL    Dallas     23   5.522   5.000
          Norcross   17   5.471   5.000
YRONJOB   Dallas     23   8.104   8.000
          Norcross   17   8.541   9.000
JOBSAT    Dallas     23   7.148   7.000
          Norcross   17   6.447   6.300

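One quick sanity check on subset output is that the subset means, weighted by their subset sizes, should reproduce the overall mean. The sketch below does this for JOBGRADE using the N and Mean values printed in the Session Window; because those means are rounded to three decimals, the result agrees only approximately.

```python
# Subset sizes and means for JOBGRADE, copied from the Minitab output above
groups = {
    "Dallas": {"n": 23, "mean": 6.870},
    "Norcross": {"n": 17, "mean": 6.235},
}

total_n = sum(g["n"] for g in groups.values())

# Weight each subset mean by the number of employees in that subset
overall_mean = sum(g["n"] * g["mean"] for g in groups.values()) / total_n

print(total_n, round(overall_mean, 3))
```

The result agrees with the overall JOBGRADE mean of 6.600 reported earlier for all 40 employees.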
Follow the same series of steps, only this time select Gender for the By variables box. Your output should look like this:
Descriptive Statistics: JOBGRADE, PRDCTY, SOCREL, YRONJOB, JOBSAT

Variable  Gender    N    Mean  Median
JOBGRADE  Female   20   6.300   6.000
          Male     20   6.900   7.000
PRDCTY    Female   20   83.97   84.90
          Male     20   85.19   84.81
SOCREL    Female   20   6.000   6.000
          Male     20   5.000   5.000
YRONJOB   Female   20    8.18    8.50
          Male     20   8.395   8.350
JOBSAT    Female   20   6.980   6.700
          Male     20   6.720   6.400

As a rule, we do not use the mode as a Measurement of Central Tendency with quantitative data. If the data is qualitative
– Plant, Gender, Position – the mode is the ONLY Measurement of Central Tendency available.
We can determine the mode of variables such as these by choosing the Stat menu and within that menu selecting Tables
and Tally Individual Variables as shown here:
You will see the window below. Select the variables Plant, Gender and Position as shown:
Click OK. This output appears in the Session Window:
Tally for Discrete Variables: Plant, Gender, POSITION

Plant   Count  Percent
D          23    57.50
N          17    42.50
N=         40

Gender  Count  Percent
F          20    50.00
M          20    50.00
N=         40

POSITION  Count  Percent
HRLY         20    50.00
MGT          20    50.00
N=           40

It is easy to see that Dallas is the modal value for the Plant variable. It is also easy to see that there is no single mode for the other
two variables in this example, because both values of each variable occur equally often (20 times).
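The tally logic itself is simple enough to sketch in a few lines of Python. The Plant column below is rebuilt from the counts in the tally (23 D and 17 N) rather than read from the worksheet, so treat it as an illustration of the arithmetic and not as a replacement for the Minitab command.

```python
from collections import Counter

# Rebuild the Plant column from the counts in the tally above (23 D, 17 N)
plant = ["D"] * 23 + ["N"] * 17

counts = Counter(plant)
n = len(plant)
percents = {value: 100 * count / n for value, count in counts.items()}

# A value is a mode if no other value occurs more often
top = max(counts.values())
modes = [value for value, count in counts.items() if count == top]

print(dict(counts), percents, modes)
```

For a variable such as Gender, where both values occur 20 times, `modes` would contain both values, which is the tie described above.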
Concept 2: Using Minitab for Measurements of Dispersion
To represent the dispersion of a quantitative variable (Measurements of Dispersion are not relevant for qualitative
variables), we typically report the standard deviation. To do this in Minitab, we return to the Stat menu. Again we choose
Basic Statistics and Display Descriptive Statistics. Select the 5 quantitative variables as before. Do not select any variables
in the By variables box. Click on Statistics and select Standard deviation:
Click OK and then OK.
Here is the output you should see in the Session Window:
Descriptive Statistics: JOBGRADE, PRDCTY, SOCREL, YRONJOB, JOBSAT

Variable    N  StDev
JOBGRADE   40  1.549
PRDCTY     40   7.26
SOCREL     40  1.468
YRONJOB    40  4.257
JOBSAT     40  1.021

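As a reminder of what a standard deviation measures, the sketch below computes it in Python for a small made-up sample (not the WidgeOne data). It shows both the sample version, which divides by n - 1 and is the one normally reported for sample data, and the population version, which divides by n.

```python
from statistics import pstdev, stdev

# Hypothetical values (illustration only, not the WidgeOne data)
sample = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = stdev(sample)       # sample standard deviation, divides by n - 1
population_sd = pstdev(sample)  # population standard deviation, divides by n

print(round(sample_sd, 3), round(population_sd, 3))
```

The two values differ noticeably here because the sample is tiny; with 40 observations the gap is much smaller.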
We could have obviously included lots of statistics in our analysis simply by choosing the ones we want from the
Statistics screen.
The second Measurement of Dispersion discussed in The Basic Concepts Manual was the frequency table. In that manual
and the Excel manual, when we created a frequency table for the job tenure variable, we created three categories: less than
5 years, 5-10 years and more than 10 years. To create these same categories in Minitab, we need to recode our YRONJOB
variable into a new variable called JOBTEN.
Go to the Data menu and choose Code>Numeric to Text:
We have selected Numeric to Text because we are changing the numerical variable YRONJOB to a qualitative variable
that we will call JOBTEN.
You will see a screen like the one below:
Select the YRONJOB variable as it is the one to be recoded. Type the name of the new variable JOBTEN in the Into
Columns box (It is a new name, so it is not a choice to be selected from the left-hand menu). Then fill in the old and new
values as we have them above. Note that values of YRONJOB between 0 and 4.9 will be coded as New. Values between 5
and 10 are coded as Experienced, and values over 10 are coded as Mature. Click OK.
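The recode rule can also be written out explicitly. This is a minimal Python sketch with hypothetical YRONJOB values; the function name `tenure_category` is ours, and treating every value below 5 as New is an assumption about how values between 4.9 and 5 should be handled.

```python
def tenure_category(years_on_job: float) -> str:
    """Map years on the job to the three JOBTEN labels used in this manual."""
    if years_on_job < 5:      # assumption: anything under 5 years counts as New
        return "New"
    if years_on_job <= 10:    # 5 through 10 years inclusive
        return "Experienced"
    return "Mature"           # more than 10 years


# Hypothetical YRONJOB values (illustration only)
yronjob = [0.5, 4.9, 5.0, 8.3, 10.0, 10.1, 17.0]
jobten = [tenure_category(y) for y in yronjob]

print(jobten)
```

Checking the boundary values (4.9, 5.0, 10.0, 10.1) in this way is a useful habit before trusting any recode.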
Your Data Window should now have the new text variable JOBTEN in it:
Now we can easily generate a frequency table for the new variable JOBTEN. Once more go to the Stat menu, select
Tables>Tally Individual Variables. Select the variable JOBTEN. Click OK. You should see output like this in your Session
Window:
Tally for Discrete Variables: JOBTEN

JOBTEN       Count  Percent
Experienced     16    40.00
Mature          15    37.50
New              9    22.50
N=              40

Well Done!
Concept 3: Using Minitab for Visualization of Univariate Data
As stated previously, for professional presentation or for formal documents, we recommend the use of a graphics package
(e.g. Microsoft PowerPoint). However, Minitab (like SPSS) has some nice graphs available in the Graph menu, which can
be used less formally.
To replicate the pie chart developed in The Basic Concepts Manual, we go to the Graph menu and select Pie Chart. Our
first choice is whether we are charting counts of unique values (raw data) or values from a table. Our data are unique
values (meaning that they come straight from the dataset), so we check this box. We then must click inside the
Categorical Variables box. Once this is done, the box on the far left will fill with variable choices from our data set (yes,
we know…this is one of the more annoying aspects of Minitab). Select JOBTEN as your variable for this graph.
Here is the screen just before we click OK to draw our pie chart for JOBTEN:
Select the Labels button. Under the Titles tab, give your pie chart a meaningful title.
Then select the Slice Labels tab:
Identify that you want the slices to be labeled with the Category name and the Percent (remember that the reason that we
create Pie Charts is to visually represent the proportions). Select OK and OK.
You should see the following chart:
[Pie chart: Job Tenure of Employees at the Dallas & Norcross Plants. Slices: Experienced 40.0%, Mature 37.5%, New 22.5%]
Nice…except you will notice that the tenure categories are in alphabetical order, not inherent order. We would prefer to
see the order New, Experienced, Mature.
To reorder the values, go back to the Plant_Survey sheet. Click on any value in the JOBTEN column. Now, right-click.
Select Column>Value Order:
You should now see this screen:
Reorder these values manually in your preferred order.
Click OK.
Now your graph looks like this:
[Pie chart: Job Tenure of Employees at the Dallas & Norcross Plants. Slices: New 22.5%, Experienced 40.0%, Mature 37.5%]
Much better!
You will find that of all of the packages, Minitab has some of the strongest graphics. This graphic can be transported into
a Word document by right-clicking with your mouse and selecting “Copy Graph.”
(If you are saying “Hey…how do I get back to my datasheet!??”, just go to Window>Plant_Survey.)
If you need to create a pie chart to understand a quantitative variable (e.g. productivity) relative to a qualitative variable
(e.g. Plant), in Minitab you must begin by getting some summary statistics for the quantitative variable with the
qualitative one used to subset it. You can do this by selecting Stat>Basic Statistics>Store Descriptive Statistics. This process
will store (save) our descriptive statistics in a table in our Data Window, so the Pie Chart command can use the results to
make a chart. Replicate the window below. We are asking for statistics on PRDCTY by Plant.
Click on the Statistics button and choose the statistic Sum.
Click OK.
You will see the following in your Data Window:
You can think of this information as a little table within your datasheet that tells you the total amount of (summation of)
productivity attributed to each plant.
Now, you are ready to make the pie chart for this data.
Choose Graph>Pie Chart. You should indicate this time that your chart values are in a table. The categorical variable for
your table was named ByVar1 (change this in the worksheet if you need to). The Summary variable was named Sum1
(again – change it if you need to).
Fill out the screen as below:
Click on Labels and title the chart “Chart of Productivity by Plant”. Warning Warning Warning Will Robinson: If you do
not do this then labels made for other charts will likely display on this new one. It happens to the best of us (well…not
us…but other people)! Select the Slice Labels as necessary. Click OK.
You should see a pie chart similar to the following:
[Pie chart: Chart of Productivity by Plant. Slices: Dallas 60.1%, Norcross 39.9%]
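The slice percentages can be cross-checked against the by-plant descriptive statistics reported earlier, since a sum is just N times the mean. The Python sketch below uses the N and mean PRDCTY values from that output; because those means are rounded, the sums are approximate, but the shares still round to the percentages on the chart.

```python
# N and mean PRDCTY by plant, copied from the by-plant descriptive statistics
by_plant = {"Dallas": (23, 88.34), "Norcross": (17, 79.49)}

# Sum = N * mean; the means are rounded, so these sums are approximate
sums = {plant: n * mean for plant, (n, mean) in by_plant.items()}
total = sum(sums.values())

# Each slice of the pie is that plant's share of total productivity
shares = {plant: round(100 * s / total, 1) for plant, s in sums.items()}

print(shares)
```

Note that Dallas has the larger slice partly because it has more employees (23 versus 17), not only because its average productivity is higher.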
Next, we wish to replicate the bar chart from The Basic Concepts Manual, which displayed the frequency count for each
value of the variable JOBTEN.
Again, go to the Graph menu. This time choose Bar Chart. This action will produce:
Select the Simple chart as above and click on OK.
Select JOBTEN as the categorical variable. Provide a title by clicking on the Labels button. Then click OK.
You should see something like this:
[Bar chart: Bar Chart of Employee Tenure. X-axis: JOBTEN (New, Experienced, Mature); Y-axis: Count, scaled from 0 to 18]
If you need to generate a different style of bar chart such as the one with horizontal bars, you can play with some of the
options in the Bar Chart panel. For example, to obtain horizontal bars, click on the Scale button before clicking OK. On the
Axes and Tick screen select Transpose values and category scales as shown below:
Click OK and OK.
You should see something like this:
[Horizontal bar chart: Bar Chart of Employee Tenure. Vertical axis: JOBTEN (New, Experienced, Mature); horizontal axis: Count, scaled from 0 to 18]
To create a stem and leaf display for the variable YRONJOB, go to the Graph menu and select Stem and Leaf (Notice that
only the quantitative variables are available for this graphic). Select YRONJOB as your variable. Click OK.
You will get a Stem and Leaf Display like the one below in your Session Window:
Stem-and-Leaf Display: YRONJOB

Stem-and-leaf of YRONJOB  N = 40
Leaf Unit = 1.0

  2   0  01
  7   0  22233
 12   0  44555
 16   0  6777
 (8)  0  88888999
 16   1  0000111
  9   1  2333
  5   1  4445
  1   1  7

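A quick note on interpretation of this messy output: the first column is a running count of observations, accumulated from the
nearest end of the display. The (8) is in parentheses to signify that this row contains the median, and the 8 is the number of
observations in that row. Here, those values are 8.x, 8.x, 8.x, 8.x, 8.x, 9.x, 9.x, 9.x years of service.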
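To see where the numbers in the first column come from, the Python sketch below rebuilds that column from the number of leaves on each row. The function name `depth_column` is ours, and the rule it encodes (count from the nearer end, with the median row shown in parentheses) is our reading of the display rather than Minitab's own code.

```python
import math

# Number of leaves on each row of the display, top to bottom
leaf_counts = [2, 5, 5, 4, 8, 7, 4, 4, 1]


def depth_column(counts):
    """Return the left-hand column of a stem-and-leaf display as strings."""
    total = sum(counts)
    middle = (total + 1) / 2  # position of the median in the sorted data
    lo, hi = math.floor(middle), math.ceil(middle)
    depths, seen = [], 0
    for count in counts:
        before, seen = seen, seen + count
        if before < lo and hi <= seen:
            # The median lies inside this row: show the row's own count
            depths.append(f"({count})")
        else:
            # Otherwise show the cumulative count from the nearer end
            from_top = seen
            from_bottom = total - before
            depths.append(str(min(from_top, from_bottom)))
    return depths


print(depth_column(leaf_counts))
```

With 40 observations the median sits between the 20th and 21st values, both of which fall in the row with eight leaves, which is why that row is the one in parentheses.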
To get a Boxplot for YRONJOB, go to the Graph menu and select Boxplot. You will see a screen like the one below where you
can choose the style of Boxplot you need. Choose Simple for this first one.
Click OK. Choose YRONJOB for your variable. Click OK again.
You will see the following:
[Boxplot: Boxplot of YRONJOB. Y-axis: YRONJOB, scaled from 0 to 18]
Recall that in a box plot, the “box” represents the middle 50% of the dataset (it runs from Q1 at the bottom to Q3 at the top),
and the line inside the box represents Q2, or the median.
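To make the quartiles behind the box concrete, here is a minimal Python sketch. The seven values are hypothetical, and `statistics.quantiles` with its default settings is used as a stand-in; Minitab's own quartile calculation may differ in detail.

```python
from statistics import quantiles

# Hypothetical years-on-job values (illustration only, not the WidgeOne data)
sample = [1, 2, 3, 4, 5, 6, 7]

# Three cut points that split the sorted data into four equal-sized groups
q1, q2, q3 = quantiles(sample, n=4)

# Height of the box: the spread of the middle 50% of the data
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

Here `q1` and `q3` are the bottom and top edges of the box, and `q2` is the line drawn inside it.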
Concept 4: Using Minitab for Visualization/Organization of Multivariate Data
Contingency tables, stacked bar charts, 100% stacked bar charts and scatter plots can be easily generated in Minitab.
In order to use Minitab to reproduce the contingency table examining plant and gender from The Basic Concepts Manual,
simply go to the Stat menu. Choose Tables then choose Cross Tabulation and Chi-Square (although we are not actually
calculating the Chi-Square stats, some of the information that we need is under this option). Choose Plant for your rows
and Gender for your columns. The Cross Tabulations function in Minitab is quite flexible. If you wish to include more
than just the frequency counts in the cells of your table, place a check next to row percents, column percents and total
percents, as we have below:
Place the Plant variable in the Row position.
Place the Gender variable in the Column position.
Click OK. The contingency table below should appear in your Session Window:
Tabulated statistics: Plant, Gender

Rows: Plant   Columns: Gender

           Female    Male     All
Dallas         13      10      23
            56.52   43.48  100.00
            65.00   50.00   57.50
            32.50   25.00   57.50

Norcross        7      10      17
            41.18   58.82  100.00
            35.00   50.00   42.50
            17.50   25.00   42.50

All            20      20      40
            50.00   50.00  100.00
           100.00  100.00  100.00
            50.00   50.00  100.00

Cell Contents:      Count
                    % of Row
                    % of Column
                    % of Total

Notice the key at the bottom indicating that the cell contents have the count (frequency) on the top, followed by row
percents, column percents and total percents.
Wow…look how much output was created in a single table! That was so much easier than Excel! The output table
contains the conditional probabilities described in The Basic Concepts Manual. In the first “cell” – the intersection of
Female and Dallas – we have four pieces of information. We know that there are 13 women who work in Dallas. We
know that of all of the Dallas employees, 56.5% are female. We know that of all of the women, 65% are in Dallas. Finally,
we know that of all employees, 32.50% are females in Dallas.
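The three percentages in each cell are all the same count divided by a different total. The Python sketch below recomputes them from the four counts in the table; the function name `cell_percents` is ours, introduced only for this illustration.

```python
# Counts from the Plant-by-Gender contingency table
counts = {
    "Dallas": {"Female": 13, "Male": 10},
    "Norcross": {"Female": 7, "Male": 10},
}

grand_total = sum(sum(row.values()) for row in counts.values())
row_totals = {plant: sum(row.values()) for plant, row in counts.items()}
col_totals = {}
for row in counts.values():
    for gender, count in row.items():
        col_totals[gender] = col_totals.get(gender, 0) + count


def cell_percents(plant, gender):
    """Return (% of row, % of column, % of total) for one cell, to 2 decimals."""
    count = counts[plant][gender]
    return (
        round(100 * count / row_totals[plant], 2),   # share of that plant
        round(100 * count / col_totals[gender], 2),  # share of that gender
        round(100 * count / grand_total, 2),         # share of all employees
    )


print(cell_percents("Dallas", "Female"))
```

The row and column percentages are the conditional probabilities mentioned above: the first conditions on the plant, the second on the gender.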
If you need to subset this information further (e.g., by Job Tenure), there is an easy way to do that. Go back to the
Stat>Tables>Cross Tabulation and Chi-Square screen. This table will be a little busy, so let’s just choose the Counts this
time. Make your selections of the three variables as follows:
Click OK.
The table below will appear in the Session Window:
Tabulated statistics: Plant, Gender, JOBTEN

Results for JOBTEN = Experienced

Rows: Plant   Columns: Gender

          Female  Male  All
Dallas         3     5    8
Norcross       3     5    8
All            6    10   16

Cell Contents:      Count

Results for JOBTEN = Mature

Rows: Plant   Columns: Gender

          Female  Male  All
Dallas         6     3    9
Norcross       2     4    6
All            8     7   15

Cell Contents:      Count

Results for JOBTEN = New

Rows: Plant   Columns: Gender

          Female  Male  All
Dallas         4     2    6
Norcross       2     1    3
All            6     3    9

Cell Contents:      Count

Notice that the same information on Plant and Gender counts has now been provided by each level of Job Tenure –
Experienced, Mature and New (the levels are reported in alphabetical order rather than by order of magnitude).
The stacked bar charts developed in The Basic Concepts Manual can be easily developed in Minitab. Start in the Graph
menu. Choose the option Bar Chart. You will see:
Make sure you choose the Stacked option as shown and click OK.
Then select the variables so that the category axis is Plant and the bars are stacked by Gender. This is done by selecting
Gender last and making sure the “stack categories of last categorical variable” box is checked. See below:
Add a title if you like. Then, as always, click OK.
You should see this:
[Stacked bar chart: Bar Chart of Gender by Plant. X-axis: Plant (Dallas, Norcross); Y-axis: Count, scaled from 0 to 25; bars stacked by Gender (Male, Female)]
The 100% Stacked Bar Chart is a little less straightforward to generate. Select Graph>Bar Chart>Stacked>OK as before.
Assign the variables Plant and Gender as before. Then select Chart Options. You will see the following screen:
To generate the 100% calibration of the bars within each
plant value, set the Y-axis to be shown as a % value and
accumulate the values within each category.
Select OK.
You should see the following screen:
[Stacked bar chart: 100% Bar Chart of Gender by Plant. X-axis: Plant (Dallas, Norcross); Y-axis: Percent, scaled from 0 to 100; bars stacked by Gender (Male, Female). Percent within levels of Plant.]
Another multivariate visualization technique is the scatter plot. Again, Minitab provides us with flexibility to subset our
analysis if needed. Consider the relationship between Job Satisfaction and Productivity as we did with SPSS in Chapter 5.
This plot can be replicated in Minitab by going into the Graph menu and choosing Scatterplot. A choice of types of
Scatterplots follows.
Choose the Simple scatter plot:
Then click on OK.
Next, choose PRDCTY for the Y-axis variable and JOBSAT for the X-axis variable as shown below:
Click on the Labels button and add an appropriate title. Click OK and OK.
Here is the associated output:
[Scatterplot: Productivity versus Job Satisfaction. Y-axis: PRDCTY, scaled from 70 to 100; X-axis: JOBSAT, scaled from 5 to 9]
There is a slightly positive relationship between these two variables – it appears as if Job Satisfaction and Productivity are
related.
The side-by-side histograms from the Basic Concepts manual can be created much like the univariate histogram. Go to
Graph>Histogram, select the Simple histogram and click OK. Highlight the quantitative variable, YRONJOB, and click
Select. You should see this:
Click on the Multiple Graphs button.
The following dialogue box will open up.
Be sure to click “In separate panels of the same graph” because we want side-by-side histograms of YRONJOB.
After that, click the “By Variables” tab to select the qualitative variable by which we want to stratify the histograms.
At this screen we can select which variable to group by. Highlight Plant and click Select, then OK.
You should get the following graph of histograms of job tenure stratified by plant:
[Paneled histogram: Histogram of Widge One Employee Job Tenure by Plant. Panels: Dallas, Norcross; X-axis: Years on Job, scaled from 0 to 16; Y-axis: Frequency, scaled from 0 to 5. Panel variable: Plant]
Great Job!
The side-by-side box plots can be generated much the same way. Go to Graph>Boxplot, select Simple and then click OK.
Once again, highlight YRONJOB, click Select and then click the Multiple Graphs button. You should see this:
Just as we did with the side-by-side histograms, choose “In separate panels of the same graph” and click the “By Variables”
tab to select Plant. Click OK and OK.
You should see this:
[Figure: Boxplots of Widge One Employee Job Tenure by Plant. Panels: Dallas and Norcross. Vertical axis: Years on Job. Panel variable: Plant.]
Fantastic! Side-by-side graphics are a great way to visualize multivariate relationships. A word of warning: This is only a
multivariate analysis if the graphics are of one variable grouped by another. A histogram with one panel showing years
on job and another showing productivity scores is NOT a multivariate visualization.
Concept 5: Using Minitab for Random Number Generation and Simple Random Sampling
Like the other software applications, Minitab will generate random numbers using the internal clock in the computer. As
a result, every time a command is given to Minitab to generate some set of random numbers, a different set of random
numbers will be generated. The software normally chooses its own starting point for the generation process by using the
time of day to choose a random starting point in the string.
Sometimes, however, you may wish to control where Minitab starts its string. For example, you may wish to repeat a
sequence by generating the same set of random data. In this case, the BASE command tells the random number generator
where to start. The generator will use this base until you set a new BASE or exit Minitab.
If you need to set the “base” number so you can replicate your results, simply go to the Calc menu. Choose the Set Base
option. You should see the following screen:
Here, we have not chosen a base. We could have chosen a positive integer as our base. In doing so, we could replicate our
results anytime we wish to do so by going back in and resetting the base to that value.
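The effect of a fixed base can also be illustrated outside Minitab. The following is a minimal Python sketch of the idea, not Minitab's own generator: the base value 12345 and the function name draw_uniform are assumptions chosen purely for illustration.

```python
import random

def draw_uniform(n, base=None):
    """Return n values uniformly distributed between 0 and 1.

    base plays the role of Minitab's base: the same base gives the same
    values, while base=None lets the generator pick its own starting point.
    """
    rng = random.Random(base)
    return [rng.random() for _ in range(n)]

first = draw_uniform(40, base=12345)
second = draw_uniform(40, base=12345)
print(first == second)  # True: the same base reproduces the same 40 values
```

Leaving the base unset would most likely give a different set of values on every run, which mirrors Minitab's default behavior described above.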
To create a string of random numbers that are uniformly distributed between 0 and 1, go to Calc > Random
Data>Uniform. You may note here that Minitab, like SPSS, has a lengthy list of distributions that can be used to
generate random samples. Indeed, this sort of procedure is quite easy and versatile with this software.
We will generate 40 values from this uniform distribution, with one value for every observation. We will name our new
data column “Group”.
Every distribution has parameters that must be specified. For the uniform distribution, the only parameters are the two
values between which we want our random numbers to fall. We choose these values to be 0 and 1. Fill in your window
like the one below:
Click OK. The new variable Group should appear in your Data Window.
Here is what a typical result would look like:
Remember that your results will vary since this variable was randomly generated.
One of the primary reasons for generating random numbers is to assign observations into statistically independent
groups. Using the random numbers, let’s assign the 40 observations into 3 groups. Here is one way to accomplish this:
Choose Data>Code>Numeric to Text.
The Code – Numeric to Text screen should look like this:
Note that since the distribution of the random numbers is uniform, each random value has an equal probability of
occurrence. This is very useful information for assignment of groups. If you are interested in assigning groups of
approximately equal size, then you should allocate the values of 0 through .33 to one group, .34 to .66 to another, etc. If
you want the first group to have approximately 25% of the population, then allocate the random values of 0 through .25 to
the first group, etc. Cool.
You should see something like the following in your Data Window:
Again, remember results will vary due to the randomness (pun intended) of this procedure.
This procedure has taken the 40 observations and assigned them into 3 groups based upon the random numbers created
in the previous procedure. Each of the 40 employees is now in one of these randomly assigned, independent groups.
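The same assignment logic can be sketched in a few lines of Python. This is only an illustration of the idea, not what Minitab does internally: the cutoffs of 1/3 and 2/3, the labels “Group 1” to “Group 3”, the seed 2024 and the function name assign_group are all assumptions made for the example.

```python
import random
from bisect import bisect_right
from collections import Counter

def assign_group(u, cutoffs=(1/3, 2/3),
                 labels=("Group 1", "Group 2", "Group 3")):
    """Map a uniform value u in [0, 1) to a label using the cutoffs.

    Values below the first cutoff get the first label, values between
    the first and second cutoff get the second label, and so on.
    """
    return labels[bisect_right(cutoffs, u)]

rng = random.Random(2024)                    # fixed seed so the run can be repeated
values = [rng.random() for _ in range(40)]   # one uniform value per observation
groups = [assign_group(u) for u in values]
print(Counter(groups))                       # group sizes, roughly equal
```

To give the first group about 25% of the observations instead, the cutoffs could be changed, for example to (0.25, 0.625), which splits the remaining 75% evenly between the other two groups.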
Because this process of selecting a random sample from a set of data is so common, there is a very straightforward way
to accomplish this in Minitab. Because we will be subsetting this dataset, go ahead NOW and save the Minitab file that
you are working in – File>Save Current Worksheet As (this will allow you to save it as an Excel spreadsheet again).
Now, suppose we wish to select a simple random sample of 30 individuals from this dataset. Go to Calc>Random
Data>Sample from Columns. You should see a screen like the following:
Specify that you wish to select a random sample of 30 cases. In the From columns box, identify all of the variables. Then,
identify all of the variables again for the Store samples in box. This will effectively take a random sample of size 30 from
our dataset and discard the observations that were not selected (did you save your file?). Select OK. You should now be
left with 30 observations.
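For readers who want to see the idea behind this step, here is a minimal Python sketch of a simple random sample, assuming sampling without replacement. The row numbers 1 to 40 stand in for the 40 observations, and the seed 7 is an arbitrary choice for illustration.

```python
import random

rows = list(range(1, 41))             # stand-in row numbers for the 40 observations
rng = random.Random(7)                # fixed seed so the same sample can be redrawn
kept = sorted(rng.sample(rows, 30))   # 30 distinct rows, each equally likely
dropped = sorted(set(rows) - set(kept))
print(len(kept), len(dropped))        # 30 10
```

Every row has the same chance of being kept, and the 10 rows that were not selected are the ones that would be discarded from the worksheet.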
Concept 6: Using Minitab for Confidence Intervals
Generating confidence intervals in Minitab is very easy. For example, if we wish to compute a 95% confidence interval for
the mean Job Satisfaction rating of all employees, we would go to Stat>Basic Statistics>One-Sample T (see footnote 2). We would see the
following screen on which we could choose the variable(s) we want to include in our analysis. This time we choose
JOBSAT as the Test Variable:
Click on the Options button. The default setup is the following. This selection will produce a complete 95% confidence
interval.
Footnote 2: T-tests are very common tests used to determine if two sample means differ significantly or if one sample mean differs from some established
value. For more detailed information on T-tests, we suggest Statistical Methods and Data Analysis by Ott and Longnecker.
Click OK and then OK. You will see the following output in your Session Window:
One-Sample T: JOBSAT
Variable   N    Mean   StDev  SE Mean  95% CI
JOBSAT     40  6.850   1.021    0.161  (6.524, 7.176)
As stated in previous manuals, these results would be reported as:
“Based on a representative sample of 40 employees, we are 95% confident that job satisfaction among all employees is
estimated to be between 6.52 and 7.18”.
Strictly speaking, the 95% describes the method rather than this one interval: if we repeatedly drew samples of 40
employees and computed an interval from each, about 95% of those intervals would contain the “true” mean job satisfaction
of all employees, which is unknown. About 5% of them would miss it, lying entirely above or entirely below the true mean.
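The interval in the Session Window can be checked by hand from the summary statistics, using mean ± t × (StDev / √N). The Python sketch below does this under one assumption: the critical value 2.0227 (the two-sided 95% value of t with 39 degrees of freedom) is taken from a standard t table and is not part of the Minitab output.

```python
import math

# Summary statistics copied from the One-Sample T output
n = 40
mean = 6.850
stdev = 1.021

# Assumption: two-sided 95% critical value of t with n - 1 = 39 degrees
# of freedom, looked up in a standard t table
t_crit = 2.0227

se_mean = stdev / math.sqrt(n)   # standard error of the mean
margin = t_crit * se_mean        # half-width of the interval
lower, upper = mean - margin, mean + margin

print(f"SE Mean = {se_mean:.3f}")
print(f"95% CI  = ({lower:.3f}, {upper:.3f})")
```

Because the mean and standard deviation above are already rounded to three decimals, the endpoints may differ from Minitab's in the last digit.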
Another option here, which is only available for the 95% Interval (the most common), is the Interval Chart. Let’s look at
the 95% Interval graphic for Job Satisfaction by Plant. To do this, go to Graph>Interval Plot. Since we have one
quantitative variable (Job Satisfaction) that we want to evaluate by two groups within a qualitative variable (Plant), select
One Y With Groups:
Select OK. Assign JOBSAT to the Graph variables box and Plant to the Categorical variables for grouping box:
Add a Title to your graphic as appropriate through the Labels button. Select OK.
You should see the following Plot:
[Figure: Interval Plot of JOBSAT, 95% CI for the Mean. Vertical axis: JOBSAT. Horizontal axis: Plant (Dallas, Norcross).]
Now that’s what I’m talking about! 
Minitab Lagniappe
What is a “Lagniappe”? This word derives from New World Spanish la ñapa, “the gift”. The word came into the Creole
dialect of New Orleans and there acquired a French spelling. It is still used in the Gulf States, especially southern
Louisiana, to denote a little bonus that a friendly shopkeeper might add to a purchase.
Our lagniappe for our readers includes the extra and interesting things that we have learned to do with these software
packages that might not be easily found or well known. A little extra information at no extra cost!
We’ve mentioned before that Minitab is very strong in graphics. The Lagniappe we’re giving you is about those graphics.
Recall the 95% Confidence Interval plot of job satisfaction by Plant – what if we wanted to create the same graph but
stratify by gender instead of plant? Well, we could go through the whole process of creating the graph from scratch again.
In fact, Minitab is very helpful in that the previous entries are retained for the next time you open the graph builder. But
we would still have to go in and select gender and change the title to say gender and, well, we (and now you) are more
consummate Minitab users than that. We can do better.
Minitab has a very useful “similar graph” option. Go to Editor>Make Similar Graph.
The following dialogue box will appear:
Now make your changes by clicking on the box where you want your new variable, highlighting it from the list on the left and
then clicking select.
The new graph looks like this:
[Figure: Interval Plot for Job Satisfaction, 95% CI for the Mean. Vertical axis: JOBSAT. Horizontal axis: Gender (Female, Male).]
Another great attribute of graphics in Minitab is how easy it is to customize the colors, patterns, fonts, etc. Simply
double click on the area of the graph you want to change and an interactive figure editor will open up. From there you
have numerous options for customizing your graphs (you can even change the titles and axis labels without having to
recreate the entire graph).
Congratulations – you are well on your way to becoming a STAT geek! Be proud.