STA 5126: Applied Statistics
Section 1: Doug Zahn
Fall 2003
to hear is to forget
to see is to remember
to do is to understand
(old proverb, sometimes attributed to Confucius)
Table of Contents
A. General instructions
B. An overview of the term project
C. What is Six Sigma about?
D. Project Assignment 1: Goals and Computer Information
Due: September 8, 2003 by 1:25 p.m.
E. Project Assignment 2: Define
Due: September 22, 2003 by 1:25 p.m.
F. Project Assignment 3: Measure
Due: October 13, 2003 by 1:25 p.m.
G. Project Assignment 4: Analyze
Due: October 29, 2003 by 1:25 p.m.
H. Project Assignment 5: Improve
Due: November 12, 2003 by 1:25 p.m.
I. Project Assignment 6: Control
Due: December 1, 2003 by 1:25 p.m.
J. Project Assignment 7: Oral Report
Due: December 1, 2003. In class.
Acknowledgments and a Request
I gratefully acknowledge the many contributions that previous students and Teaching Assistants have made to the
Term Project Instructions. Contributions by Karen Kinard and Nancy Davis have been especially valuable.
In the spirit of systematic improvement, please tell me of any places in these instructions where you see room for
improvement: typos to fix, parts to clarify, instructions that could be shortened without loss of clarity, new topics
to address, etc.
A. General instructions
Following are general instructions for all work submitted in response to the Term Project Instructions:
1. All work is to be typed, double spaced on 8.5” x 11” paper using at least 1.0” margins and 12 point or larger
type.
2. Put your name, the date and the title of the assignment on the cover page of each project assignment you
submit. Do not put any additional information on this cover page.
3. All figures and graphs are to be computer-drawn. Label each as “Figure 1”, “Figure 2”, etc. and refer to it in
the text as “See Figure 1”, “See Figure 2”, etc. Each figure is to have
 a complete title that stands alone and describes what is in the figure,
 its axes labeled, and
 the source of the data identified.
Treat all tables in this way also, referring to them in the text as “See Table 1”, “See Table 2”, etc.
Either put your figures and tables in the text, with the text wrapped around them, or put “Insert Table 1 about
here” in the text, on its own line, and present all the tables and figures at the end of the document.
4. Organize your answers in numbered and lettered paragraphs corresponding to the number and letter of the
questions being answered. It is critical that you do this so that your work can be accurately graded.
5. Staple or bind your pages together. Paper-clipped, folded, or loose papers run the risk of being lost or
misplaced. Do not submit your work in those ways.
6. Number all your pages after the cover page. Put the number in the upper right corner of the page. The cover
page is page 1.
7. Use complete sentences and questions. Check your spelling and grammar. Errors in these areas will cost you
points in this course and will compromise your credibility in the working world.
8. Thank you for submitting all your work in the above format. I will be able to give you higher quality feedback
on this work than if you do not follow these instructions.
9. Late work is not professional. A penalty of 5% per business hour or fraction thereof late will be assessed on all
project work and on all homework. Use your resources to get all work in on time. Computer problems do not
count as emergencies.
10. Consequences of not following instructions:
Not typed
Not double-spaced
English errors: -1 per 2 English errors (-10 maximum on each project part)
Answers not numbered corresponding to questions
Pages not all numbered
Incorrect cover page
Tables, graphs not completely labeled
Not following other aspects of instructions: comparable consequences
Free Review Opportunity: I will give you coaching with a 24-hour turn-around time
during weekdays on project components (all or part of what is due) if you give me a
draft at least 48 weekday hours before the due date. I will do my best to catch all
errors, but I do not guarantee that I will catch all of them.
B. An overview of the term project
The SCANS report (1991) and business leaders repeatedly emphasize how important it is that college graduates
be able to write well, present clearly, work on teams to solve problems, and show initiative.
Reich (1991) describes three broad occupational classifications:
production worker, service worker, and symbolic-analyst.
Symbolic analysts must possess four basic skills:
abstraction, systems thinking, experimentation, and collaboration.
Brain researchers and learning theory researchers emphasize the importance of individually meaningful
experiences in helping students to construct their own meaning for topics encountered in school. These
experiences also enhance long term memory of the topics.
The Term Project is designed to help you develop the SCANS and symbolic-analyst skills and have a personally
meaningful experience in this course.
Identify a question of interest to you. This question may explore any aspect of your professional or personal life
that you are interested in becoming more aware of and improving. Make sure this topic is one that you are able
and willing to discuss in front of class. Past topics have included:
 find out where my time (or money) goes by keeping a detailed record of everything I do in each twenty-four
hour period in order to better utilize time (money),
 determine how many times per day I eat without being truly hungry and relate this to weight change,
 monitor daily my fat grams consumed, exercise time, caffeine intake, water intake, hours of sleep, the percent
of appointments and meetings I am on time for, fitness, and relate one or more of these to performance goals
of interest.
This project has five phases:
In the first phase (Define) you will define the question of interest.
In the second (Measure) you will identify how you will measure quantities related to your question and collect
baseline data.
In the third phase (Analyze) you will analyze your baseline data to see if your process is stable or unstable and
begin to search for appropriate ways to improve your process.
In the fourth phase (Improve) you will do experimentation to develop process improvements that you think will
improve the process outcome.
In the fifth phase (Control) you will collect data on your improved process to see if your process outcome has
improved, and to see if you can keep the improvement in control.
C. What is Six Sigma about?
The course will be organized around a semester-long Six Sigma project.
A Six Sigma project is “a highly disciplined and statistically-based approach for removing
defects from products, processes, and transactions, involving everybody in the
corporation…” (Hahn, Hill, Hoerl, and Zinkgraf, 1999). A “defect” is an undesired outcome
of the current process.
The initiative focuses on reducing the number of defects by first identifying opportunities
for defects. The next step is to reduce the number of defects occurring. The goal of Six
Sigma initiatives is fewer than 3.4 defects per million opportunities.
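As a rough illustration of the arithmetic behind the "defects per million opportunities" figure, here is a minimal Python sketch. The function name `dpmo` and the example counts are made up for illustration and are not part of the course materials.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities (DPMO)."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one opportunity for a defect")
    return defects / total_opportunities * 1_000_000


# Hypothetical example: 30 days observed, 4 opportunities for a defect per day,
# 6 defects recorded in total.
example = dpmo(defects=6, units=30, opportunities_per_unit=4)
print(f"DPMO = {example:.0f}")
```

The same function can be applied to your own process once you have decided what counts as a defect and how many opportunities for a defect each cycle contains.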
Six Sigma improvement projects follow a five-step process: Define, Measure, Analyze,
Improve, and Control (DMAIC). Following is a brief description of the steps involved in
the process.
Define – Select the area that you want to address. Identify the ultimate goal of the
improvement, as distinct from interim measures of it. Identify the problem that you
intend to address. Identify the process that the problem relates to. Identify what
results you want to achieve in the process. Assess the process to see if it has the
potential to produce these results. Assess the resources available to you to modify the
process, and determine if they are sufficient to produce the desired results.
Measure – Select the appropriate responses (the “Y’s”) to be improved, based on
customer inputs and other considerations (such as product yield), ensure that they are
quantifiable, and that we can accurately measure them. Determine what is
unacceptable performance (i.e., a “defect”). Gather data to gauge the capability of your
measurement process and the utility of the data it produces. Also measure the X’s.
Analyze – Gather preliminary data to assess current performance. Analyze the
preliminary data to document current performance (baseline process capability), and
also to begin identifying root causes of defects (i.e., the “X’s”, or independent variables),
and their impact and act accordingly.
Improve – Determine how to intervene in the process to significantly reduce the defect
levels. Several rounds of improvements may be required.
Control – Once the desired improvements have been made, put a system in place to
ensure the improvements are sustained, even though significant resources may no
longer be focused on the problem.
(Based on Hahn, G.J., Hill, W.J., Hoerl, R.W., and Zinkgraf, S.A. (1999), The Impact of
Six Sigma Improvement: A Glimpse into the Future of Statistics, The American
Statistician, 53 (3), 208-215.)
http://www.kwtunnell.com/IMAGES/ronsnee_3.doc is the location of another discussion of the big picture by Ron
Snee.
Case studies of Six Sigma on the web sites of consulting firms:
http://www.gemedicalsystems.com/prod_sol/hcare/resources/library/, are 3 of approximately 12,600 sites found
by Google when I searched on six sigma case studies applications.
D. Project Assignment 1: Goals and Computer Information
Due by 1:25 p.m. on September 8, 2003.
30 points
Students with clear goals get more from this course than students without clear goals.
Assignment 1 is an opportunity for you to clarify your goals for yourself and for me so that
• You can move toward them during this course.
• I can learn who you are, what you are up to, what you plan to use the course for,
and how I can support you in doing this.
To earn 20 points on Project Assignment 1 do the following:
 Submit your responses typed, double-spaced on 8.5" x 11" paper using at least 1"
margins on all sides and 12 point or larger type.
 Answer the questions in two pages or less. Staple if two pages.
 Submit a response for each question that demonstrates reflection on the question.
 Turn in the questionnaire on the next page along with your responses to this
assignment.
 Answer my follow-up questions on your responses to Project Assignment 1.
 Follow the above instructions to get full points on this assignment.
1. What are two contributions you hope to make in your professional life? What scores
on what specific and measurable criteria will indicate that you have made your
contributions?
2. A. What is the highest degree you are currently planning to get?
B. When do you plan to get it?
C. What is your best prediction right now of what you will be doing in your career
five years after the date you plan to get your highest degree?
3. How can this course be of help to you as you work to make your responses to
Questions 1 and 2 happen?
4. Delight: What measurable course outcomes must occur by December 12, 2003 in
order for you to be delighted with the return on the resources you have spent on
STA 5126? (Be specific. Be sure to name all your conditions for satisfaction.)
5. Relationship: My commitment is to do my part to have a working relationship with
you. (See page 2 of the syllabus for information about working relationships.) I
invite you to do your part to have a working relationship with me. Please respond
to my invitation by letting me know if you accept it, decline it, or have a counteroffer.
6. What other comments, questions, concerns, or suggestions do you have regarding
this course?
7. Please type the statement in quotes below as your response to Question 7 and sign it. This signed
statement must be received before any course points will be awarded.
"My signature below indicates that I have read the Syllabus and Getting the Most from
This Course handouts, have had all my questions about them answered to my
satisfaction, and agree to be in the course under these terms."
Name _________________ Major _______________ DUE: 1:25 p.m., September 8, 2003
What level of mastery do you have on the following computer applications? (Check
                                                     Heard of it   Can use it   Can explain it
Word processing software (Word, WordPerfect, etc.)
Spreadsheet programs (Excel, Lotus, etc.)
Statistical operations software (SAS, SPSS, etc.)
Use of the
Academic information:
A. What level mathematics have you completed?
B. Have you had any previous statistics course(s) in high school or college?
If so, please name the course(s). ____________________________________
What statistical topics do you recall from this work well enough to be able to
do or to explain?
What is the e-mail address you prefer to use? __________________________
Phone number: ___________________________
Days/hours when it is OK to call you at this number: ______________________
What level of access do you have to a PC with e-mail? ____________________
E. Project Assignment 2: Define
DUE: September 22, 2003 by 1:25 p.m. 70 points
The point value of each question is in parentheses in front of it.
(5) 1. What process have you chosen to investigate? What opportunities for improvement do you see in this
process? Why is it important to address these opportunities for improvement?
(20) 2. Voice of the customer: (4 points each)
A. Who is one customer of this process (other than yourself)?
B. Identify at least one measurable process output variable (Y-variable) of particular
interest to you and your customer. This is a variable that can be measured on each cycle of
your process.
C. Get a baseline measurement of this Y-variable by measuring it at least 5 times.
D. What is your customer’s specification for this Y-variable?
E. Do a Stakeholder Analysis Commitment Chart for your project. Discuss what you have learned from
the construction of this chart and what it shows. How is this information of use in guiding your
actions on this project?
(20) 3. (2) A. Specify the beginning and ending of one cycle of the process you are investigating. Define your
cycle short enough (e.g., 1 day) so that you can observe at least 5 cycles for PA 2 (Define), 15 for
PA 3 (Measure), 10 for PA 5 (Improve), and 10 for PA 6 (Control).
(4) B. Describe in words what is done in one cycle of your process. Describe the activities in the
process, not the measurements you will be making.
(10) C. Draw a flowchart of one cycle of your process, starting with your beginning point and stopping
with your ending point. Make sure your flowchart contains all paths from the beginning to the
end of the process and that there are no dead ends in the middle of the flowchart.
(4) D. Do a SIPOC analysis of your process.
4. (5) A. From the flowchart, identify at least two measurable input or process variables (X-variables)
under your personal control. These are two variables that can be measured on each cycle of your
process. Measure these variables also while you are getting your baseline measurement on your
Y-variable.
(10) B. Explain the relationships that you hypothesize to exist between your X- and Y-variables.
(5) C. Submit a spreadsheet containing all data collected to date (the “Define data”).
5. (5) Determine the resources that you have available and are willing to use for the project. What
limitations have been placed (by you or by other parties) on this project? (There are limits here,
given all the rest that is going on in your life. What are they?)
F. Project Assignment 3: Measure
DUE: October 13, 2003 by 1:25 p.m. 40 points
The questions are worth 10 points each.
1. Operationally define how you will measure the Y-variable and X-variables you named in Project Assignment
2 in a way that assures that “all data collectors measure a characteristic in the same way.” Develop and
present the data collection forms that you will use. Hoerl-Snee Section 5.3 (pp. 141-152) may be helpful here.
2. Validate your measurement system. Describe any steps you have taken here in a way that addresses concerns
that your customer(s) may have about the
precision and
of your measurement system.
3. Begin the data collection. Collect data for at least 15 cycles (the “Measure data”). Keep a data collection
diary (field diary) of special or unusual events that occurred while you were collecting your data.
A. Submit your diary as an appendix to this assignment.
B. Submit a spreadsheet containing all data collected to date.
4. “Continue improving measurement consistency.” Describe your activities in this area, indicating how your
measurement process was refined during the collection of the Measure data.
(Quotes indicate material quoted from Rath & Strong’s Six Sigma Pocket Guide.)
G. Project Assignment 4: Analyze
DUE: October 29, 2003 by 1:25 p.m.
120 points
Questions 1-4 are worth 15 points each; question 5 is worth 60 points.
Focus of Analyze step: “What vital few process and input variables (X’s) affect critical-to-quality (CTQ) process performance or output variables?”
Y = f(X1, X2, X3, X4, …) + ε
1. Assess the variation in your Y-variable using a control chart to see if it is common or special cause variation.
Present and interpret your control chart.
2. Depending on Answer 1, use either the Problem Solving or Process Improvement strategy to identify next
steps to do in the Analyze and Improve steps. Describe your plans.
3. In light of what you learned about your process in the Measure step, revise your process flowchart from the
one you presented in the Define step. Are there any non-value-added steps in your process? Scrap?
Rework? A hidden plant? Unnecessary complexity? Are there any opportunities for improvement here,
looking through what Rath & Strong call the “process door?”
4. Based on the Measure data, construct a cause-and-effect diagram for one aspect of your process that you wish
to improve. Construct two hypotheses about relationships that are suggested by your cause-and-effect
diagram and are suggested by the graphs in Answer 3. Here we are looking through Rath & Strong’s “data
door.”
5. Construct a multiple regression model relating Y to X1 and X2. Present all your data in a spreadsheet form.
Use the worksheet on the next page (that is based on p. 239 of H-S) to construct this model. Report on your
analyses and what you have learned from them. Include key graphs and comment on what you have learned
from each.
(Quotes indicate material quoted from Rath & Strong’s Six Sigma Pocket Guide.)
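To make the control chart in Question 1 concrete, here is a minimal Python sketch of the limits for an individuals chart, assuming the common moving-range method (sigma estimated as the average moving range divided by the constant 1.128). The function name `individuals_chart` and the data are made up for illustration. The sketch flags only points outside the 3-sigma limits; it does not apply run rules or other tests for special causes.

```python
from statistics import mean

D2_FOR_N2 = 1.128  # control-chart constant d2 for moving ranges of span 2


def individuals_chart(values):
    """Return (center, lcl, ucl, flagged_indices) for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma_hat = mean(moving_ranges) / D2_FOR_N2
    lcl = center - 3 * sigma_hat
    ucl = center + 3 * sigma_hat
    flagged = [i for i, v in enumerate(values) if v < lcl or v > ucl]
    return center, lcl, ucl, flagged


# Hypothetical Y-variable: hours of sleep over 15 cycles (made-up numbers).
sleep_hours = [7.0, 6.5, 7.2, 6.8, 7.1, 6.9, 7.3, 6.7, 7.0, 3.5,
               7.1, 6.8, 7.2, 6.9, 7.0]
center, lcl, ucl, flagged = individuals_chart(sleep_hours)
print(f"center={center:.2f}  LCL={lcl:.2f}  UCL={ucl:.2f}  flagged cycles={flagged}")
```

A flagged cycle is a candidate special cause; your field diary is the place to look for what happened on that cycle.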
Regression Analysis Worksheet for Question 5 of Project Assignment 4: Analyze
Do a multiple regression analysis using all the data you have collected to date on your
response variable (y) and at least two predictor variables (x’s). Do the following steps:
a. Get to know your data.
i. Plot the histogram for all data to date for each variable. Do a box plot
for each variable. Are there any outliers in your histograms? If so,
what will you do with them?
ii. If your data are collected sequentially in time, do an individual control
chart plot for each variable. What do you see in these plots? (e.g.,
process in control or not? Special cause variation? Assignable?
Structural special cause variation?)
iii. Construct scatter plots of all pairs of variables using the Graph >
Matrix Plot command in Minitab. Are these plots linear? Curvilinear?
Do you see any outliers in these plots? If so, what will you do with
them?
b. Formulate the model
i. Write out in symbols the theoretical multiple regression model that
you will ask Minitab to fit to your data. (e.g., a model like the ones on
p. 237, using your variable names, rather than y or x1 or x2)
c. Fit the model to the data
i. Compute the regression coefficients using Minitab. Write out this first
fitted multiple regression model.
d. Check the fit of the model
i. Plot the residuals to identify any abnormal patterns or atypical data
points. Do at least the four plots named on p. 239:
1. Residuals versus predicted values
2. Residuals versus predictor variables (x’s)
3. Residuals versus time or observation sequence
4. Normal probability plot of residuals
ii. Deal with any abnormalities detected and refit the model until the
residuals look OK.
e. Refit the model to the data
i. Compute the regression coefficients using Minitab and the residuals
and fitted values for the fitted model using the cleaned up data from
Part d. (This will be the same as your original data if you did not have
to do any clean up in Part d.)
ii. If one or more of your predictor variables have t-statistics less than
2.0 in absolute value, eliminate the one with the smallest absolute t-statistic and refit the model.
Continue doing this until all remaining predictors have t-statistics
greater than 2.0 in absolute value.
f. Interpret the results of your regression analysis.
i. What is your R²? What does this mean?
ii. Write out your final fitted multiple regression model. Interpret the
coefficients in your regression model.
iii. What do you know now that you did not know before doing this
regression analysis of your Measure data?
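Minitab does the fitting in this worksheet. Purely as an illustration of what steps c through f compute, here is a minimal Python sketch of a least-squares fit with two predictors, using only the standard library. The function names `invert` and `fit_ols`, the variable names, and all data values are made up for this example and are not from the course materials.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_ols(y, predictors):
    """Fit y on an intercept plus the predictor columns by least squares.

    Returns (coefficients, t_statistics, r_squared); index 0 is the intercept.
    """
    n, k = len(y), len(predictors) + 1
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    mse = sse / (n - k)  # residual mean square
    t_stats = [b / math.sqrt(mse * inv[j][j]) for j, b in enumerate(beta)]
    return beta, t_stats, 1 - sse / sst


# Hypothetical Measure data (made-up numbers): y = hours of sleep,
# x1 = minutes of exercise, x2 = cups of coffee.
sleep = [7.0, 6.5, 7.8, 6.0, 7.4, 6.9, 8.0, 6.2, 7.1, 6.6, 7.6, 6.4, 7.3, 6.8, 7.9]
exercise = [30, 10, 45, 0, 40, 25, 60, 5, 30, 20, 50, 10, 35, 25, 55]
coffee = [2, 3, 1, 4, 1, 2, 0, 4, 2, 3, 1, 3, 2, 2, 1]

beta, t_stats, r2 = fit_ols(sleep, [exercise, coffee])
for name, b, t in zip(["intercept", "exercise", "coffee"], beta, t_stats):
    # Step e.ii of the worksheet: predictors with |t| < 2 are candidates to drop.
    weak = name != "intercept" and abs(t) < 2.0
    print(f"{name:>9}: coef={b:8.4f}  t={t:6.2f}{'  (|t| < 2)' if weak else ''}")
print(f"R-squared = {r2:.3f}")
```

The sketch covers the fit itself only. The residual plots in step d still need to be examined before the coefficients are interpreted.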
H. Project Assignment 5: Improve
DUE: November 12, 2003 by 1:25 p.m. 90 points
The questions are worth 20, 50, 10, 10 points, respectively.
1. Design an experiment (at least a 2x2 with 2 replications, requiring at least 8 cycles of your process) to help
identify the vital few X’s that affect key output measures. Present the data gathered in your experiment in a
spreadsheet.
2. Do a 2x2 analysis of variance of the data you gathered in your experiment. Interpret the results of this
analysis. The results for you to interpret include graphs of the relationships among your variables, a graph of
the two-factor interaction, and graphs of the residuals. Include these graphs
in your report.
3. What process improvement action are you planning to implement on the basis of your answer to the preceding
question and to what you learned in the Analyze stage?
4. Pilot test this improvement for 2 cycles of your process. Present the data you collected in the pilot test. What
did you learn from the pilot test?
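As an illustration of the arithmetic behind the 2x2 analysis of variance in Question 2, here is a minimal Python sketch for a balanced design with replication. The function name `anova_2x2`, the factor labels, and the data are made up for this example. It returns F statistics only. With 2 replications per cell the error term has 4 degrees of freedom, and the 5% critical value of F(1, 4) is about 7.71.

```python
from statistics import mean


def anova_2x2(cells):
    """F statistics for a balanced 2x2 factorial design with replication.

    `cells[(a, b)]` holds the replicate responses at level a of factor A and
    level b of factor B, with levels coded 0 and 1.
    """
    levels = (0, 1)
    r = len(cells[(0, 0)])
    if r < 2 or any(len(cells[(a, b)]) != r for a in levels for b in levels):
        raise ValueError("need the same number (>= 2) of replicates in every cell")
    grand = mean(v for obs in cells.values() for v in obs)
    cell_mean = {key: mean(obs) for key, obs in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in levels) for a in levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in levels) for b in levels}
    ss_a = 2 * r * sum((a_mean[a] - grand) ** 2 for a in levels)
    ss_b = 2 * r * sum((b_mean[b] - grand) ** 2 for b in levels)
    ss_ab = r * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in levels for b in levels)
    ss_error = sum((v - cell_mean[key]) ** 2
                   for key, obs in cells.items() for v in obs)
    error_df = 4 * (r - 1)
    ms_error = ss_error / error_df
    # Each effect has 1 degree of freedom, so its mean square equals its sum of squares.
    return {"A": ss_a / ms_error, "B": ss_b / ms_error,
            "AxB": ss_ab / ms_error, "error_df": error_df}


# Hypothetical experiment (made-up numbers): response = hours of sleep,
# factor A = exercised that day (0 = no, 1 = yes),
# factor B = caffeine after noon (0 = no, 1 = yes), 2 replications per cell.
cells = {
    (0, 0): [6.0, 6.4],
    (0, 1): [5.5, 5.9],
    (1, 0): [7.4, 7.0],
    (1, 1): [6.1, 6.5],
}
result = anova_2x2(cells)
print(result)
```

The interaction graph asked for in Question 2 plots the four cell means; the `AxB` statistic summarizes how far those lines are from parallel.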
I. Project Assignment 6: Control
DUE: December 1, 2003 by 1:25 p.m. 70 points
The questions are worth 5, 10, 30, 10, 5, 5, and 5 points, respectively.
1. Present a new flowchart of your process that includes the improvement(s) you have made on the basis of PA
4 and PA 5.
2. Develop and monitor a plan for controlling the modified process. Gather at least 10 observations on the
modified process (the “Control data”). Present your Measure and Control data in a spreadsheet.
3. A. Do a control chart for your Y-variable using your Measure and Control data. What does this chart tell
you?
B. Do separate control charts for your Measure and Control data sets. What do these charts tell you?
C. Compare the population mean of your process during the Measure phase to the population mean of your
process during the Control phase using a hypothesis test (state your null and alternative hypotheses) and a
confidence interval.
D. What do these two inferential statistical procedures tell you about the changes you made in your process
during your project?
4. Document the benefits and costs of the improvements you made.
5. How will you continue to monitor the process and hold the gains you have made?
6. What recommendations do you have for future improvements in the process you studied?
7. What are the key points you have learned from this Six Sigma project?
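As one possible way to carry out the comparison in Question 3C, here is a minimal Python sketch of a pooled two-sample t statistic and confidence interval. It assumes the two phases have roughly equal variances, which is an assumption of this sketch and not something the assignment states. The function name `pooled_t` and the data are made up, and the critical value is passed in from a t table; it is not computed by the code.

```python
import math
from statistics import mean, variance


def pooled_t(measure, control, t_crit):
    """Pooled two-sample t statistic, df, and CI for mean(control) - mean(measure).

    `t_crit` is the two-sided critical value of Student's t for
    len(measure) + len(control) - 2 degrees of freedom, taken from a table.
    """
    n1, n2 = len(measure), len(control)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(measure) + (n2 - 1) * variance(control)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean(control) - mean(measure)
    return diff / se, df, (diff - t_crit * se, diff + t_crit * se)


# Hypothetical data (made-up numbers): hours of sleep per cycle.
measure_data = [6.5, 6.8, 6.2, 7.0, 6.4, 6.6, 6.9, 6.3, 6.7, 6.5,
                6.1, 6.8, 6.6, 6.4, 6.7]                              # 15 Measure cycles
control_data = [7.2, 7.0, 7.4, 6.9, 7.3, 7.1, 7.5, 7.0, 7.2, 7.4]    # 10 Control cycles

# 2.069 is the two-sided 95% critical value of t with 15 + 10 - 2 = 23 df.
t_stat, df, ci = pooled_t(measure_data, control_data, t_crit=2.069)
print(f"t = {t_stat:.2f} on {df} df; 95% CI for the change: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The null hypothesis here is that the two population means are equal. An interval that excludes zero leads to the same conclusion as rejecting that hypothesis at the 5% level with a two-sided test.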
J. Project Assignment 7: Oral Report
DUE: December 1, 2003 in class
20 points
Present a ten-minute oral report to the class summarizing your entire project.
You will have ten minutes to present an oral summary of your project. Answer the following questions:
1. What question did you study? (Keep in mind that you are much more familiar with your study than we are.
Give us enough background so that we understand your question and its importance to you.)
2. How did you study your question?
3. What did you learn from your project?
A laptop computer (with PowerPoint) and a projector will be available in the classroom.
General purpose coaching on giving a short presentation
1. Dress appropriately.
2. Speak clearly and loudly enough to be heard in the back row.
3. Be with your audience and not with your notes.
4. Practice your presentation and time it. Be clear in your own mind before you start what is the one main point
that you intend to communicate to us. I will give a “1 minute remaining” warning and will stop you at ten
minutes.
5. To get maximum value from your experience, do the presentation as though it were an
integral part of your interview for your first job after completing your highest degree.