Uploaded by hilaryhoang

Interpret all statistics and graphs for Descriptive Statistics - Minitab Express
5/22/2017
Minitab Express Support
Find definitions and interpretation guidance for every statistic and graph that is provided with
Descriptive Statistics.
In This Topic
Boxplot
CoefVar
Q1
Histogram, with normal curve
Individual value plot
IQR
Kurtosis
Maximum
Mean
Median
Minimum
Mode
N
N*
Total Count
Range
SE mean
Skewness
StDev
Sum
Q3
Variance
http://support.minitab.com/en-us/minitab-express/1/help-and-how-to/basic-statistics/summary-statistics/descriptive-statistics/interpret-the-results/all-statistics-and-graphs/
Boxplot
A boxplot provides a graphical summary of the distribution of a sample. The boxplot shows the shape,
central tendency, and variability of the data.
Interpretation
Use a boxplot to examine the spread of the data and to identify any potential outliers. Boxplots are
best when the sample size is greater than 20.
Skewed data
Examine the shape of your data to determine whether your data appear to be skewed. When data
are skewed, the majority of the data are located on the high or low side of the graph. Often,
skewness is easiest to detect with a histogram or boxplot.
Right-skewed
Left-skewed
The boxplot with right-skewed data shows wait times. Most of the wait times are relatively short, and only a
few wait times are long. The boxplot with left-skewed data shows failure time data. A few items fail
immediately, and many more items fail later.
Outliers
Outliers, which are data values that are far away from other data values, can strongly affect the
results of your analysis. Often, outliers are easiest to identify on a boxplot.
On a boxplot, asterisks (*) denote outliers.
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors.
Consider removing data values for abnormal, one-time events (also called special causes). Then,
repeat the analysis. For more information, go to Identifying outliers.
CoefVar
The coefficient of variation (CoefVar) is a measure of spread that describes the variation in the data
relative to the mean. The coefficient of variation is adjusted so that the values are on a unitless scale.
Because of this adjustment, you can use the coefficient of variation instead of the standard deviation
to compare the variation in data that have different units or that have very different means.
Interpretation
The larger the coefficient of variation, the greater the spread in the data.
For example, you are the quality control inspector at a milk bottling plant that bottles small and large
containers of milk. You take a sample of each product and observe that the mean volume of the small
containers is 1 cup with a standard deviation of 0.08 cup, and the mean volume of the large containers
is 1 gallon (16 cups) with a standard deviation of 0.4 cup. Although the standard deviation of the
gallon container is five times greater than the standard deviation of the small container, their
coefficients of variation support a different conclusion.
Large container: CoefVar = 100 * 0.4 cups / 16 cups = 2.5
Small container: CoefVar = 100 * 0.08 cups / 1 cup = 8
The coefficient of variation of the small container is more than three times greater than that of the
large container. In other words, although the large container has a greater standard deviation, the
small container has much more variability relative to its mean.
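The milk example can be reproduced in a few lines. This is a minimal sketch in Python, not Minitab's implementation; the function name coef_var is hypothetical, and the inputs are the means and standard deviations quoted above, all expressed in cups.

```python
def coef_var(stdev: float, mean: float) -> float:
    """Coefficient of variation as a percentage: 100 * stdev / mean."""
    if mean == 0:
        raise ValueError("CoefVar is undefined when the mean is 0")
    return 100 * stdev / mean


# Values from the milk bottling example (all in cups).
large = coef_var(stdev=0.4, mean=16)   # 1 gallon = 16 cups
small = coef_var(stdev=0.08, mean=1)

print(f"Large container CoefVar: {large:.1f}")
print(f"Small container CoefVar: {small:.1f}")
print(f"Small relative to large: {small / large:.1f} times")
```

Because both inputs to each call share the same unit, the unit cancels, which is why the two results can be compared directly.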
Q1
Quartiles are the three values that divide a sample of ordered data into four equal parts: the first
quartile at 25% (Q1), the second quartile at 50% (Q2 or median), and the third quartile at 75% (Q3).
The first quartile is the 25th percentile and indicates that 25% of the data are less than or equal to this
value.
For this ordered data (7 9 10 12 13 14 17 18 19), the first quartile (Q1) is 9.5. That is, 25% of the data
are less than or equal to 9.5.
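The quartile values quoted in this topic can be checked with Python's standard library. This sketch assumes that the ordered data are the nine values listed in the Maximum and Minimum examples, and it uses the "exclusive" interpolation rule of statistics.quantiles, which places each quartile at position p * (N + 1) in the ranked data. That rule is chosen here only because it reproduces 9.5; this page does not state which rule Minitab Express applies.

```python
import statistics

# Assumption: the ordered data are the nine values shown in the
# Maximum and Minimum examples of this topic.
data = sorted([13, 17, 18, 19, 12, 10, 7, 9, 14])

# method="exclusive" interpolates at position p * (N + 1) in the ranked data.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

print("Q1:", q1)
print("Q2 (median):", q2)
print("Q3:", q3)
```

Other interpolation rules (for example method="inclusive") can give slightly different quartiles for small samples, so results from different tools may not match exactly.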
Histogram, with normal curve
A histogram divides sample values into many intervals and represents the frequency of data values in
each interval with a bar.
Interpretation
Use a histogram to assess the shape and spread of the data. Histograms are best when the sample
size is greater than 20.
You can use a histogram of the data overlaid with a normal curve to examine the normality of your
data. A normal distribution is symmetric and bell-shaped, as indicated by the curve. It is often difficult
to evaluate normality with small samples. A probability plot is best for determining the distribution fit.
Good fit
Poor fit
Individual value plot
An individual value plot displays the individual values in the sample. Each circle represents one
observation. An individual value plot is especially useful when you have relatively few observations and
when you also need to assess the effect of each observation.
Interpretation
Use an individual value plot to examine the spread of the data and to identify any potential outliers.
Individual value plots are best when the sample size is less than 50.
Skewed data
Examine the shape of your data to determine whether your data appear to be skewed. When data
are skewed, the majority of the data are located on the high or low side of the graph. Often,
skewness is easiest to detect with a histogram or boxplot.
Right-skewed
Left-skewed
The individual value plot with right-skewed data shows wait times. Most of the wait times are relatively short,
and only a few wait times are long. The individual value plot with left-skewed data shows failure time data. A
few items fail immediately, and many more items fail later.
Outliers
Outliers, which are data values that are far away from other data values, can strongly affect the
results of your analysis. Often, outliers are easiest to identify on a boxplot.
On an individual value plot, unusually low or high data values indicate possible outliers.
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors.
Consider removing data values for abnormal, one-time events (also called special causes). Then,
repeat the analysis. For more information, go to Identifying outliers.
IQR
The interquartile range (IQR) is the distance between the first quartile (Q1) and the third quartile (Q3).
The middle 50% of the data are within this range.
For this ordered data (7 9 10 12 13 14 17 18 19), the interquartile range is 8 (17.5 - 9.5 = 8). That is,
the middle 50% of the data are between 9.5 and 17.5.
Interpretation
Use the interquartile range to describe the spread of the data. As the spread of the data increases, the
IQR becomes larger.
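As a rough illustration of how the IQR tracks spread, the sketch below computes it for the nine example values and for a hypothetical second sample in which every value sits twice as far from 13. The helper name iqr and the stretched sample are assumptions made for this example, and the quartile rule (position p * (N + 1)) is the one that reproduces the 9.5 and 17.5 quoted in this topic.

```python
import statistics


def iqr(values):
    """Interquartile range Q3 - Q1, with quartiles at position p * (N + 1)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q3 - q1


data = [13, 17, 18, 19, 12, 10, 7, 9, 14]

# Hypothetical second sample: each value moved twice as far from 13.
wider = [13 + 2 * (x - 13) for x in data]

print("IQR of example data:", iqr(data))
print("IQR of wider data:  ", iqr(wider))
```

Doubling the distance of every value from the center doubles the IQR, which matches the interpretation above.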
Kurtosis
Kurtosis indicates how the peak and tails of a distribution differ from the normal distribution.
Interpretation
Use kurtosis to initially understand general characteristics about the distribution of your data.
Baseline: Kurtosis value of 0
Normally distributed data establish the baseline for kurtosis. A kurtosis value of 0 indicates that the peak and
tails of the data match those of the normal distribution, although a value of 0 alone does not establish
normality. A kurtosis value that significantly deviates from 0 may indicate that the data are not normally
distributed.
Positive kurtosis
A distribution that has a positive kurtosis value indicates that the distribution has heavier tails and a sharper
peak than the normal distribution. For example, data that follow a t-distribution have a positive kurtosis value.
The solid line shows the normal distribution, and the dotted line shows a distribution that has a positive kurtosis
value.
Negative kurtosis
A distribution with a negative kurtosis value indicates that the distribution has lighter tails and a flatter peak
than the normal distribution. For example, data that follow a beta distribution with first and second shape
parameters equal to 2 have a negative kurtosis value. The solid line shows the normal distribution and the
dotted line shows a distribution that has a negative kurtosis value.
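To make the sign of kurtosis concrete, the following sketch computes a bias-adjusted sample excess kurtosis, for which normal data score about 0. The formula is a common textbook choice and is an assumption here, since this page does not state the formula that Minitab Express uses. The function name and both samples are hypothetical.

```python
import statistics


def excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (about 0 for normal data)."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    z4 = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction


flat = list(range(1, 11))       # evenly spread values: light tails
heavy = [0] * 8 + [-10, 10]     # mostly central values plus two extremes

print("flat sample: ", round(excess_kurtosis(flat), 3))
print("heavy sample:", round(excess_kurtosis(heavy), 3))
```

The evenly spread sample gives a negative value and the sample with two extreme observations gives a positive value, in line with the descriptions of negative and positive kurtosis above.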
Maximum
The maximum is the largest data value.
In these data, the maximum is 19.
13 17 18 19 12 10 7 9 14
Interpretation
Use the maximum to identify a possible outlier or a data-entry error. One of the simplest ways to
assess the spread of your data is to compare the minimum and maximum. If the maximum value is
very high, even when you consider the center, the spread, and the shape of the data, investigate the
cause of the extreme value.
Mean
The mean is the average of the data, which is the sum of all the observations divided by the number of
observations.
For example, the wait times (in minutes) of five customers in a bank are: 3, 2, 4, 1, and 2. The mean
waiting time is calculated as follows:
Mean = (3 + 2 + 4 + 1 + 2) / 5 = 12 / 5 = 2.4
On average, a customer waits 2.4 minutes for service at the bank.
Interpretation
Use the mean to describe the sample with a single value that represents the center of the data. Many
statistical analyses use the mean as a standard measure of the center of the distribution of the data.
The median and the mean both measure central tendency. But unusual values, called outliers, affect
the median less than they affect the mean. When you have unusual values, you can compare the mean
and the median to decide which is the better measure to use. If your data are symmetric, the mean
and median are similar.
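The different sensitivity of the mean and the median to outliers can be seen with the bank wait times. In this sketch the sixth value of 30 minutes is a hypothetical outlier added for illustration; it is not part of the original example.

```python
import statistics

wait_times = [3, 2, 4, 1, 2]        # minutes, from the bank example
with_outlier = wait_times + [30]    # hypothetical unusually long wait

for label, values in [("original", wait_times), ("with outlier", with_outlier)]:
    mean = statistics.mean(values)
    median = statistics.median(values)
    print(f"{label:>12}: mean = {mean}, median = {median}")
```

One extreme value moves the mean from 2.4 to 7 minutes, while the median only moves from 2 to 2.5 minutes.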
Median
The median is the midpoint of the data set. This midpoint value is the point at which half the
observations are above the value and half the observations are below the value. The median is
determined by ranking the observations and finding the observation at rank (N + 1) / 2 in the ranked
order. If the number of observations is even, then the median is the average of the observations that
are ranked at numbers N / 2 and (N / 2) + 1.
For this ordered data (7 9 10 12 13 14 17 18 19), the median is 13. That is, half the values are less than
or equal to 13, and half the values are greater than or equal to 13. If you add another observation equal
to 20, the median is 13.5, which is the average of the 5th observation (13) and the 6th observation (14).
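The ranking rule can be written out directly. This is an illustrative sketch with a hypothetical function name, assuming the data are the nine values listed in the Maximum and Minimum examples; ranks are 1-based in the text and are converted to 0-based list indexes in the code.

```python
def median_by_rank(values):
    """Median via the ranking rule: rank (N + 1) / 2, or the average of
    ranks N / 2 and N / 2 + 1 when N is even (ranks are 1-based)."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("no observations")
    if n % 2 == 1:
        return ranked[(n + 1) // 2 - 1]   # rank (N + 1) / 2
    lower = ranked[n // 2 - 1]            # rank N / 2
    upper = ranked[n // 2]                # rank N / 2 + 1
    return (lower + upper) / 2


data = [13, 17, 18, 19, 12, 10, 7, 9, 14]
print(median_by_rank(data))           # 13
print(median_by_rank(data + [20]))    # 13.5
```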
Interpretation
The median and the mean both measure central tendency. But unusual values, called outliers, affect
the median less than they affect the mean. When you have unusual values, you can compare the mean
and the median to decide which is the better measure to use. If your data are symmetric, the mean
and median are similar.
Minimum
The minimum is the smallest data value.
In these data, the minimum is 7.
13 17 18 19 12 10 7 9 14
Interpretation
Use the minimum to identify a possible outlier or a data-entry error. One of the simplest ways to
assess the spread of your data is to compare the minimum and maximum. If the minimum value is
very low, even when you consider the center, the spread, and the shape of the data, investigate the
cause of the extreme value.
Mode
The mode is the value that occurs most frequently in a set of observations. Minitab also displays how
many data points equal the mode.
The mean and median require a calculation, but the mode is determined by counting the number of
times each value occurs in a data set.
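Counting is all that the mode requires, as the following sketch shows. The function name and both samples are hypothetical. The function returns every value that ties for the highest count, together with that count, which mirrors the description of the mode and the number of data points equal to it.

```python
from collections import Counter


def modes_with_count(values):
    """Return (sorted list of modes, number of data points equal to each mode)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top), top


# Hypothetical samples.
unimodal = [6, 7, 8, 8, 8, 9, 10]
bimodal = [3, 4, 4, 4, 5, 15, 16, 16, 16, 17]

print(modes_with_count(unimodal))
print(modes_with_count(bimodal))
```

An exact tie in counts is a strict test. With real measurements, two clear peaks in a histogram are usually treated as two modes even when the peak heights differ.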
Interpretation
The mode can be used with mean and median to provide an overall characterization of your data
distribution. The mode can also be used to identify problems in your data.
For example, a distribution that has more than one mode may indicate that your sample includes data
from two populations. If the data contain two modes, the distribution is bimodal. If the data contain
more than two modes, the distribution is multi-modal.
For example, a bank manager collects wait time data for customers who are cashing checks and for
customers who are applying for home equity loans. Because these are two very different services, the
wait time data include two modes. The data for each service should be collected and analyzed
separately.
Unimodal
There is only one mode, 8, that occurs most frequently.
Bimodal
There are two modes, 4 and 16. The data seem to represent two different populations.
N
The number of non-missing values in the sample.
In this example, there are 141 recorded observations.
Total count    N      N*
149            141    8
N*
The number of missing values in the sample. The number of missing values refers to cells that contain
the missing value symbol *.
In this example, 8 errors occurred during data collection and are recorded as missing values.
Total count    N      N*
149            141    8
Total Count
The total number of observations in the column. Use the total count to represent the sum of the
number of missing values (N*) and the number of nonmissing values (N).
In this example, there are 141 valid observations and 8 missing values. The total count is 149.
Total count    N      N*
149            141    8
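The relationship between the three counts can be sketched as follows. The column is hypothetical, and None is used here as a stand-in for Minitab's missing value symbol *.

```python
# Hypothetical column; None stands in for the missing value symbol *.
column = [4.2, None, 3.9, 5.1, None, 4.7]

n = sum(value is not None for value in column)       # N: nonmissing values
n_missing = sum(value is None for value in column)   # N*: missing values
total_count = len(column)                            # Total Count

print(f"Total count = {total_count}, N = {n}, N* = {n_missing}")
```

The same identity holds in the example above: 141 + 8 = 149.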
Range
The range is the difference between the largest and smallest data values in the sample. The range
represents the interval that contains all the data values.
Interpretation
Use the range to understand the amount of dispersion in the data. A large range value indicates
greater dispersion in the data. A small range value indicates that there is less dispersion in the data.
Because the range is calculated using only two data values, it is more useful with small data sets.
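Because the range depends on only the two most extreme values, a single observation can change it substantially. The sketch below uses the nine values from the Maximum and Minimum examples; the added value of 40 is hypothetical.

```python
data = [13, 17, 18, 19, 12, 10, 7, 9, 14]

minimum, maximum = min(data), max(data)
data_range = maximum - minimum
print(f"min = {minimum}, max = {maximum}, range = {data_range}")

# Hypothetical extra observation far from the rest of the data.
with_extreme = data + [40]
extreme_range = max(with_extreme) - min(with_extreme)
print(f"range after adding 40 = {extreme_range}")
```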
SE mean
The standard error of the mean (SE Mean) estimates the variability between sample means that you
would obtain if you took repeated samples from the same population. Whereas the standard error of
the mean estimates the variability between samples, the standard deviation measures the variability
within a single sample.
For example, you have a mean delivery time of 3.80 days, with a standard deviation of 1.43 days, from
a random sample of 312 delivery times. These numbers yield a standard error of the mean of 0.08
days (1.43 divided by the square root of 312). If you took multiple random samples of the same size,
from the same population, the standard deviation of those different sample means would be around
0.08 days.
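The delivery-time calculation is the standard deviation divided by the square root of the sample size. The following sketch reproduces it; the function name se_mean is hypothetical, and the second call uses a hypothetical sample four times as large to show the effect of sample size.

```python
import math


def se_mean(stdev: float, n: int) -> float:
    """Standard error of the mean: stdev / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return stdev / math.sqrt(n)


se = se_mean(stdev=1.43, n=312)            # delivery-time example
se_larger = se_mean(stdev=1.43, n=4 * 312)  # hypothetical larger sample

print(f"SE mean with N = 312:  {se:.2f} days")
print(f"SE mean with N = 1248: {se_larger:.2f} days")
```

Quadrupling the sample size halves the standard error, because the sample size enters through its square root.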
Interpretation
Use the standard error of the mean to determine how precisely the sample mean estimates the
population mean.
A smaller value of the standard error of the mean indicates a more precise estimate of the population
mean. Usually, a larger standard deviation results in a larger standard error of the mean and a less
precise estimate of the population mean. A larger sample size results in a smaller standard error of the
mean and a more precise estimate of the population mean.
Minitab uses the standard error of the mean to calculate the confidence interval.
Skewness
Skewness is the extent to which the data are not symmetrical.
Interpretation
Use skewness to help you establish an initial understanding of your data.
Figure A
Figure B
Symmetrical or non-skewed distributions
As data becomes more symmetrical, its skewness value approaches zero. Figure A shows normally distributed
data, which by definition exhibits relatively little skewness. By drawing a line down the middle of this histogram
of normal data it's easy to see that the two sides mirror one another. But lack of skewness alone doesn't imply
normality. Figure B shows a distribution where the two sides still mirror one another, though the data is far from
normally distributed.
Positive or right skewed distributions
Positive skewed or right skewed data is so named because the "tail" of the distribution points to the right, and
because its skewness value will be greater than 0 (or positive). Salary data is often skewed in this manner: many
employees in a company make relatively little, while increasingly few people make very high salaries.
Negative or left skewed distributions
Left skewed or negative skewed data is so named because the "tail" of the distribution points to the left, and
because it produces a negative skewness value. Failure rate data is often left skewed. Consider light bulbs: very
few will burn out right away, the vast majority lasting for quite a long time.
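The sign conventions for skewness can be checked numerically. This sketch uses a bias-adjusted sample skewness formula, which is a common textbook choice and an assumption here, since this page does not state the formula that Minitab Express uses. The function name and all three samples are hypothetical.

```python
import statistics


def skewness(values):
    """Bias-adjusted sample skewness (about 0 for symmetric data)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    z3 = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * z3


symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 2, 2, 3, 4, 12]         # long tail to the right
left_skewed = [-x for x in right_skewed]      # mirror image: tail to the left

for label, sample in [("symmetric", symmetric),
                      ("right skewed", right_skewed),
                      ("left skewed", left_skewed)]:
    print(f"{label:>12}: {skewness(sample):+.3f}")
```

Mirroring a sample flips the sign of its skewness without changing the magnitude.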
StDev
The standard deviation is the most common measure of dispersion, or how spread out the data are
about the mean. The symbol σ (sigma) is often used to represent the standard deviation of a
population, while s is used to represent the standard deviation of a sample. Variation that is random or
natural to a process is often referred to as noise.
Because the standard deviation is in the same units as the data, it is usually easier to interpret than the
variance.
Interpretation
Use the standard deviation to determine how spread out the data are from the mean. A higher
standard deviation value indicates greater spread in the data. A good rule of thumb for a normal
distribution is that approximately 68% of the values fall within one standard deviation of the mean,
95% of the values fall within two standard deviations, and 99.7% of the values fall within three
standard deviations.
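The rule of thumb can be verified against the normal distribution itself. This sketch uses NormalDist from Python's standard library with a standard normal distribution (mean 0, standard deviation 1); the same shares apply to any normal distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of a normal distribution within k standard deviations of the mean.
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The rule applies only to data that are approximately normal; strongly skewed or heavy-tailed data can have quite different shares.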
The standard deviation can also be used to establish a benchmark for estimating the overall variation
of a process.
Hospital 1
Hospital 2
Hospital discharge times
Administrators track the discharge time for patients who are treated in the emergency departments of two
hospitals. Although the average discharge times are about the same (35 minutes), the standard deviations are
significantly different. The standard deviation for hospital 1 is about 6. On average, a patient's discharge time
deviates from the mean (dashed line) by about 6 minutes. The standard deviation for hospital 2 is about 20. On
average, a patient's discharge time deviates from the mean (dashed line) by about 20 minutes.
Sum
The sum is the total of all the data values. The sum is also used in statistical calculations, such as the
mean and standard deviation.
Q3
Quartiles are the three values that divide a sample of ordered data into four equal parts: the first
quartile at 25% (Q1), the second quartile at 50% (Q2 or median), and the third quartile at 75% (Q3).
The third quartile is the 75th percentile and indicates that 75% of the data are less than or equal to this
value.
For this ordered data (7 9 10 12 13 14 17 18 19), the third quartile (Q3) is 17.5. That is, 75% of the data
are less than or equal to 17.5.
Variance
The variance measures how spread out the data are about their mean. The variance is equal to the
standard deviation squared.
Interpretation
The greater the variance, the greater the spread in the data.
Because variance (σ²) is a squared quantity, its units are also squared, which may make the variance
difficult to use in practice. The standard deviation can be easier to use because it is a more intuitive
measurement. For example, a sample of waiting times at a bus stop may have a mean of 15 minutes
and a variance of 9 minutes². Because the variance is not in the same units as the data, the variance is
often displayed with its square root, the standard deviation. A variance of 9 minutes² is equivalent to a
standard deviation of 3 minutes.
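The relationship between the variance and the standard deviation can be sketched with a small sample. The five waiting times below are hypothetical values chosen so that the sample mean is 15 minutes and the sample variance is 9 minutes², matching the bus stop example; they are not data from the source.

```python
import math
import statistics

# Hypothetical waiting times in minutes (mean 15, sample variance 9).
waits = [12, 12, 15, 18, 18]

mean = statistics.mean(waits)
variance = statistics.variance(waits)   # sample variance, divides by n - 1
stdev = statistics.stdev(waits)         # square root of the sample variance

print(f"mean = {mean} minutes")
print(f"variance = {variance} minutes squared")
print(f"standard deviation = {stdev} minutes")
print("stdev squared equals variance:", math.isclose(stdev ** 2, variance))
```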
Copyright © 2016 Minitab Inc. All rights Reserved.