Scale-Up of a High Technology Manufacturing Startup:
Failure Tracking, Analysis, and Resolution through a
Multi-Method Approach
By
Derek S Straub
B.E. Aero-Mechanical Engineering
Stevens Institute of Technology, 2011
SUBMITTED TO THE DEPARTMENT OF MECHANICAL ENGINEERING IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF ENGINEERING IN MANUFACTURING
AT THE
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2015
© 2015 Derek S Straub. All rights reserved.
The author hereby grants to MIT permission to reproduce
and to distribute publicly paper and electronic
copies of this thesis document in whole or in part
in any medium now known or hereafter created.
Author: Signature redacted
Derek S Straub
Department of Mechanical Engineering
August 7, 2015
Certified by: Signature redacted
David E. Hardt
Ralph E. & Eloise F. Cross Professor of Mechanical Engineering
Thesis Supervisor
Accepted by: Signature redacted
David E. Hardt
Ralph E. & Eloise F. Cross Professor of Mechanical Engineering
Chairman, Department Committee on Graduate Students
Scale-Up of a High Technology Manufacturing Startup:
Failure Tracking, Analysis, and Resolution through a
Multi-Method Approach
By
Derek S Straub
Submitted to the Department of Mechanical Engineering
on August 7, 2015, in partial fulfillment of the
requirements for the degree of
Master of Engineering in Manufacturing
Abstract
Product reliability, quality, and performance are essential for all companies, especially
high technology manufacturing startups looking to scale up successfully. Company image and
reputation can be heavily impacted by product failures. The cost of failures in-house and at the
customer will only increase as a company scales up. Failure mitigation is critical to the success
of a product and its company throughout the entire product lifecycle. This thesis proposes an
ideal Failure Mitigation Strategy (FMS) that provides a methodology and framework with a linear
process workflow and easy-to-follow steps that lead to the reduction of cost from failures.
Establishing a strong FMS will assist the company in learning from its failures while reducing
the total number and average cost of failure events. The ideal FMS was tailored to and
implemented at New Valence Robotics Corporation (NVBOTS) in Boston, Massachusetts, as a
case study.
The ideal FMS consists of failure tracking, failure analysis, and multi-method failure
resolution. Failure events are first observed and properly documented via the failure tracking
system. Failure tracking data is then processed during failure analysis using a total cost model to
automatically prioritize and down-select the most impactful failure event types. Root cause
analysis is then performed on the top priority failure event types. Finally, a robust multi-method
failure resolution methodology uses an economical combination of design and process changes
along with testing to eliminate or reduce the cost of those failures.
Over 200 failure events were tracked, including 50 unique failure event types, accounting
for over $75,000 in costs at NVBOTS. A unified and improved tracking system was
implemented at NVBOTS along with a powerful analysis framework. Failure analysis was
performed, prioritizing the failures by total cost, and a failure resolution framework was designed
to implement the solutions to the top priority failure event types. The ideal Failure Mitigation
Strategy offered in this thesis provides NVBOTS and other entities a framework that allows for
full understanding of the current failure landscape as well as a systematic method to reduce the
impact from failures through elimination and mitigation.
Thesis Supervisor: David E. Hardt
Title: Ralph E. and Eloise F. Cross Professor of Mechanical Engineering
About the Author
Derek Straub is an aero-mechanical engineer currently
pursuing his Master of Engineering in Manufacturing at the
Massachusetts Institute of Technology (MIT). Born and
raised in Perkasie, Pennsylvania, USA, he excelled in
academics and athletics. He later received his Bachelor of
Engineering in Aero-Mechanical Engineering with high
honors from Stevens Institute of Technology in 2011. He is
currently employed as a mechanical engineer at a Federally
Funded Research and Development Center, MIT Lincoln Laboratory. With eight years of
industry experience, he brings professional experience and knowledge to this thesis along with
his solid academic foundation. Past employers include the National Aeronautics and Space
Administration (NASA), Lutron Electronics, Hamilton Sundstrand, and KEY Handling Systems.
His strengths include mechanical design, rapid prototyping and fabrication, additive
manufacturing, design for manufacturing and assembly, design for additive, and mechanical
testing. He has a passion for rapid prototyping and additive manufacturing and has consulted for
multiple companies on these topics. This thesis draws heavily on years
of industry experience and his work performed for MIT and New Valence Robotics Corporation.
Acknowledgments
Completion of this thesis was generously supported by many people to whom I am truly
grateful. I would like to thank and recognize the following people for all that they have done.
My family and friends, especially my parents for their unwavering support and love and for
raising me into the man I am today. My advisor, professor, and friend, David Hardt for his
wisdom and guidance throughout my studies and thesis work. My colleagues and great friends,
Rahul Chawla and Ali Shabbir, for all your help and assistance while making this past year
enjoyable.
NVBOTS for sponsoring our team's research and specifically AJ Perez and Mateo
Peña Doll for their openness and timely support of our research needs. MIT Lincoln Laboratory
Lincoln Scholars Program for their financial and logistical support of my continued education
and for allowing me to fully focus on my graduate work. Lastly, Jim Ingraham, my Lincoln
supervisor and friend for his continual support and investment in the advancement of my
education and career. Thank you.
Contents
Chapter 1 Introduction
1.1 Scale-up of High Technology Manufacturing Startups
1.1.1 Risks Associated with Manufacturing Scale-up
1.2 Research Motivation and Overall Problem Statement
1.2.1 Overall Problem Statement
1.2.2 Overview of Subprojects
1.3 Thesis Overview
1.3.1 Thesis Objective and Scope
1.3.2 Thesis Structure
Chapter 2 Additive Manufacturing Overview
2.1 General Additive Manufacturing Process Work Flow
2.2 Additive Manufacturing Technologies
2.2.1 Binder Jetting
2.2.2 Directed Energy Deposition
2.2.3 Material Extrusion
2.2.4 Material Jetting
2.2.5 Powder Bed Fusion
2.2.6 Sheet Lamination
2.2.7 Vat Photopolymerization
2.3 Additive Manufacturing Applications
2.4 Industry and Market
Chapter 3 Company Background
3.1 The Product
3.2 The Market
3.3 Company Analysis
Chapter 4 Failure Mitigation Introduction
4.1 Overview of Failure Mitigation and Importance
4.2 Overview of Ideal Failure Mitigation Strategy
4.2.1 Failure Tracking
4.2.2 Failure Analysis
4.2.3 Failure Resolution
4.3 Initial Status of Company
Chapter 5 Failure Tracking
5.1 Detailed Methodology
5.2 NVBOTS Case Study
5.2.1 Original Tracking Systems
5.2.2 New Tracking System
Chapter 6 Failure Analysis
6.1 Detailed Methodology
6.2 Failure Economics: Total Cost Model
6.3 NVBOTS Case Study
Chapter 7 Multi-Method Failure Resolution
7.1 Detailed Methodology
7.2 NVBOTS Path Forward
Chapter 8 Summary, Conclusions, and Recommendations
8.1 Summary
8.2 Conclusions
8.3 Recommendations for Future Work
8.3.1 Automate Failure Tracking System
8.3.2 Automate Failure Analysis System
8.3.3 Establish Failure Review Process
8.3.4 Establish Design Change Review Process
Appendix A Total Cost Example Calculations
Appendix B Failure Analysis Spreadsheet
Bibliography
List of Figures
Figure 2-1: AM process work flow steps
Figure 2-2: AM process categories
Figure 3-1: NVBOTS logo
Figure 3-2: NVPro printer
Figure 3-3: Print preview feature
Figure 3-4: Printer dashboard
Figure 4-1: Failure Mitigation Strategy Framework
Figure 5-1: FMS failure tracking
Figure 5-2: Old tracking systems
Figure 5-3: Django welcome screen
Figure 5-4: Django failure event list
Figure 5-5: Django event submission page
Figure 6-1: FMS failure analysis
Figure 6-2: Subassembly Failure Histogram
Figure 6-3: Failure Event Cause Type Histogram
Figure 6-4: When Failures are Found
Figure 6-5: Total cost of failure events
Figure 7-1: FMS failure resolution
List of Tables
Table 3.1: NVBOTS evaluation
Table 5.1: Failure tracking items
Table 5.2: Tracking systems comparison
Chapter 1
Introduction
There are two types of startup businesses in the world: those that scale up and those that
do not. The businesses that do not scale up either fail or settle into a truly small business with
little or no growth potential.¹ These so-called "lifestyle businesses" have their place within the
economy and among entrepreneurs who are perfectly at peace with running a small business.
However, the businesses that do scale up are the ones looking to change the world, impact
customers' lives in a profound way, and, of course, make significant financial gains along the
way [1].
¹ According to data collected from the U.S. Department of Commerce, Bureau of the
Census, and U.S. Department of Labor, 592,410 businesses closed and 28,322 declared
bankruptcy in 2007 [1].
Scale-up in entrepreneurial business refers to the process of rapid growth and expansion
of a company to adapt to a larger workload without compromising performance, revenues, and
operational controls [2]. Scaling up can only occur once a startup has validated its business
model through repeat revenue generation [2]. Once the foundation is in place, rapid growth in
market access, employees, operations, and revenues can occur.
Scaling up is an absolute necessity for those startup businesses funded by external
investors such as angel investors and venture capital (VC) firms. Venture capitalists (VCs) invest
in early-stage startups when the risk is high, the technology is unproven, and the market is
uncertain, but the potential upside is also very high. In return, they own equity in the company
and demand a significant return on their investment within a short period of time, achieved
through either sale to or merger with another company (merger and acquisition, or M&A) or, less
commonly, registering as a publicly traded company via an initial public offering (IPO). This
significant investor pressure makes it necessary to scale up the business as soon as feasible.
However, scale-up requirements significantly depend on the type of business. Software
by nature is very scalable. The initial investment is spent on developing the back-end software
and user experience. Once completed and released, subsequent iterations only cost a fraction of
the initial development cost and time. Software also does not require significant capital
investment, has almost instant global market reach via the Internet, and has a very rapid lifecycle
of only a few years [3]. By far, software startups are the easiest to scale up and thus have
commanded the largest amount of VC attention and funding [4].
Startups involving a physical product, such as in consumer goods, manufacturing, and
high technology industries, or startups involving strict regulatory requirements, such as in the
biopharmaceuticals industry, do not have the same luxuries as software or service-based startups.
Significant up-front capital costs, a high burn rate,² and a longer horizon before a sizeable return
on investment is realized are additional barriers to scale-up that cause VCs to shy away from
funding startups in these industries and instead focus their funds on less risky, though sometimes
less profitable in the long run, software startups [3].
² Burn rate is the amount of cash a company spends per month.
Scale-up is absolutely critical even for companies without the added pressure from VCs.
A company is only solvent and in business as long as it has sufficient liquid assets to meet
current liabilities. Without a plan in place to rapidly increase company revenues, and the
fortitude to execute the plan, the business will quickly be unable to meet its liabilities and
become insolvent.
However, the financial risks discussed, such as raising investor funding or generating
revenues, are only one type of risk faced by startups during scale-up. A risk, by definition, is any
situation where there is a possibility of an outcome resulting in the loss of something of value
[1]. Unforeseen circumstances and their negative consequences in startup businesses manifest
themselves within the following types of risks, adapted from Hirai [1]:
Market: The possibility of insufficient demand for the offering at the chosen price. True
market demand is only realized once the company tries to sell; everything up until
then is speculation.
Competitive: The possibility of competitors having a better product, being first-to-market,
deliberately underselling your offering, filing intellectual property
disputes, poaching employees, et cetera.
Technology and Operational: Any variety of risks associated with product design,
functionality as intended, manufacturability, product quality and reliability,
production and distribution logistics, supplier management, et cetera.
Financial: Aside from raising investor funding and generating revenues, there are risks
associated with customer credit (defaulting on payments), commodity prices,
currency exchange rates, interest rates, price of assets used as collateral, et cetera.
People: Any number of risks associated with employees of the company, their fit with
corporate culture and vision, their productivity, the necessary combination of
experience, contacts, and skill, et cetera.
Legal and Regulatory: Any number of risks associated with corporate governance,
taxation, intellectual property, liability claims, and regulatory approval.
Systemic: Risks that threaten the viability of an entire market and not just one firm, such as
fuel costs affecting the entire airline industry.
All these risks can be systematically identified, monitored, and mitigated through
appropriate risk management, which begins with driving a culture of risk management
throughout the organization. The technology and operational risks associated with a high
technology manufacturing startup are further explored in the subsequent section.
1.1 Scale-up of High Technology Manufacturing Startups
Manufacturing is well regarded as the engine that drives innovation. The U.S. Bureau of
Economic Analysis has determined that every dollar spent on manufacturing generates
$1.48 in economic activity [5]. Manufacturing represents only 12% of the U.S. Gross Domestic
Product (GDP) and 9% of U.S. jobs, but it accounts for two-thirds of all private research and
development funding and employs one-third of all engineers [5].
Increasingly, the innovation behind manufacturing, whether it is new products or
processes, is found in smaller startups rather than larger corporations [3]. Often these innovations
come out of research laboratories at universities across the nation, or through companies founded
by employees of larger corporations [3]. VC funding allows these startups to prove their
technology; however, when it comes time to scale up, VCs prefer to exit via an M&A with a
large corporation and let it scale up in-house. An example of this is DuPont's acquisition of
Uniax in 2000, a startup spun out of Professor Alan Heeger's laboratory at the University of
California Santa Barbara [3]. Uniax developed organic light-emitting diodes (OLEDs). It was
only in late 2011, after 11 years of in-house development and scale-up, that DuPont announced
the first commercialization project, under license to a major display manufacturer.
High-technology manufacturing startups face a significant barrier to scale-up due to
upfront capital investments required before production of their physical product can begin. This
poses a financial risk well known within the entrepreneurial ecosystem of VCs and startups.
However, there are significant risks associated with the technology and operational side of the
business as well, specifically associated with manufacturing of the product.
1.1.1 Risks Associated with Manufacturing Scale-up
High technology startups often mistake a successful prototype or the first iteration of the
product for the scaling product, and the first customers for scaling users [6]. This is hardly ever the
case. The first customers are typically lead users or early adopters who provide input for
improvement. In fact, these early sales should be thought of as market research input [6]. Early
customers are also willing to put up with the product design and manufacturing quality shortfalls
always present in the first product iteration, something that the mass market would reject. Startups often
try to include as many features as possible in their initial offering in order to attract as many
customers within their target market as possible. In doing so, they lose focus on their most basic
features, the competitive advantage that would win over their customers in the first place. The
scaling customers prefer a simple, robust product with the basic differentiating feature [7].
Quality and reliability are the most important features of any product. Marc Barros, serial
entrepreneur and former CEO of Contour LLC, an action sports camera company, reflected on
his experience with the following: "Shipping quality devices is by far the hardest part of building
a hardware company. Customers don't care about how small you are or the difficulties you face.
They expect you deliver on and surpass on your promise not just once, but multiple times, over
thousands of units" [7].
Achieving this level of quality and reliability is a tremendous effort that requires the
entire company to focus on documenting and fixing problems during both initial product
development and production scale-up. Having the right talent driving the manufacturing scale-up
is critical. They must have a combination of skills, industry knowledge and experience, and a
network of contacts to ensure the product is manufactured at the highest level of quality [3].
Barros also recommends having at least one person solely dedicated to product testing and to
quality and reliability improvement, and working with an experienced production engineer
from the beginning of the design phase to ensure a quality, manufacturable product with high
yield rates. The company must always be willing to compromise on materials, methods, and location
of production to ensure the highest level of quality and reliability is achieved at the lowest cost.
It is very common for a startup to outsource production to contract manufacturers and
suppliers. Contract manufacturers, both domestic and foreign, are an invaluable source of in-depth
volume manufacturing knowledge. However, it is critical to select suppliers that have the
right specialized skills required for the startup's product, prioritize speed and quality over cutthroat
cost reduction, and are willing to work with the company to improve the entire production
process [3]. There is significant tacit knowledge during the initial pilot production runs that is
very complex and not easily reduced to simple instruction [8]. Therefore, face-to-face time with
suppliers on-site is required during these stages to qualify their process, continuously
improve, and take the learnings back to the company.
Finally, scaling up production requires a diligent effort in tracking the company's cash
cycle. Payment for production is usually due upfront for a startup that is not yet well established in
the industry, but revenues from sales are not expected for months [9]. There are also large
cash implications associated with sustaining and customer service if the product quality suffers
and customers require repairs or replacements. This could leave the company in a cash-flow
insolvency situation. Careful planning in terms of supply contracts, payment terms, and product
sustaining must be carried out before the scale-up begins.
1.2 Research Motivation and Overall Problem Statement
New Valence Robotics Corporation (NVBOTS), founded in March 2013, is a Boston,
Massachusetts-based robotics startup company that has developed the world's first fully
automated cloud 3D printing management suite [10]. The 3D printing hardware, called the
NVPro, is based on the material extrusion additive manufacturing process. NVBOTS is in the
process of completing its in-house pilot production run and is faced with the problem of scaling up
its production to meet customer demand. The scale-up project is the result of collaboration
between the Massachusetts Institute of Technology (MIT) and NVBOTS. The project was a team
effort conducted by the author, Derek Straub, together with Rahul Chawla [11] and Ali Shabbir [12], all
students in the Master of Engineering in Manufacturing (MEngM) program at MIT, between
February and August of 2015.
1.2.1 Overall Problem Statement
The MEngM team consulted on the overall scale-up project and specifically focused on
integrating NVBOTS' business model into its operations. NVBOTS does not directly sell its
printers to customers but rather leases them on five-year terms. They provide software and
hardware upgrades free of charge during the lease period, as well as any required repair services
or maintenance. This unique business model requires careful consideration by engineering and
production as the company scales up.
Product reliability and quality were identified to be the most important factors to focus on
during the scale-up process. As NVBOTS transitions from producing a few units per month
entirely in-house to producing hundreds of units per month in partnership with contract
manufacturers in the near future, a significant shift in current engineering and production
operations would need to occur. The costs associated with an unreliable or sub-par quality product
are unsustainable with rapid growth.
Analyzing the complete product value chain, from design, to incoming supplier parts,
assembly, and the complete product, identified opportunities for process improvement. These
opportunities form the basis of each MEngM team member's individual subproject and thesis,
further discussed in Section 1.2.2.
In addition to specific improvement opportunities, the research and work completed by
the MEngM team also included:
* Establishing a framework and foundation of critical processes for future implementation;
* Inculcating discipline, structure, and industry best practices in engineering and
production operations through learnings from industry experience; and
* Providing case studies of process improvement implementation at NVBOTS as reference
for future use.
1.2.2 Overview of Subprojects
Analyzing the complete product value chain identified specific opportunities for process
improvement with regards to product reliability and quality. The first subproject focused on early
stages of the value chain by analyzing incoming part quality. As hardware startups initiate
operation, their main focus is on product development efforts. When they scale up, they need to
give more importance to suppliers, quality control and inspection procedures. This project
focused on developing a framework for and analyzing these attributes. Analyzing the outcomes
of using this framework, key recommendations were made in this project for tolerancing
techniques, data acquisition and inspection procedure. Also, suggestions were made to streamline
strategy and operations and make full use of network effects. Chawla conducted this project and
the reader is referred to his thesis for all details [11].
The second subproject was conducted by the author and is the focus of this thesis. It
focused on establishing a proper Failure Mitigation Strategy at NVBOTS consisting of failure
tracking, failure analysis, and failure resolution. The aim of this project was to create a foundation,
framework, and methodology for NVBOTS to use in order to mitigate costly failures throughout
the product life cycle. Failures, especially those that occur in the hands of the customer, can
have devastating consequences to any company and even more so to a startup. This project
details a structured plan to capture all failure data, how to analyze it statistically and objectively
based on its cost to the company, and how to best resolve the failure for future units. As for-profit
companies exist to produce profits, the Failure Mitigation Strategy is based on a least-cost
model. The goal is to minimize the cost impact of failures by preventing them from occurring or
by lessening their impact through multiple methods. This project details how to learn from
failures and how to use that knowledge to create a product with increased reliability, quality, and
performance, while reducing manufacturing and service costs. This is critical for NVBOTS as
the cost of failures will only increase as they begin to scale up production. Establishing a proper
Failure Mitigation Strategy will allow them to continually reduce the cost and impact of failures,
allowing them to successfully scale up and providing them a commanding competitive
advantage for the future. Details of this project can be found in the remainder of this thesis,
starting in Chapter 4.
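The cost-based prioritization described above can be illustrated with a minimal sketch. It is not the total cost model developed later in the thesis: the cost components, the labor rate, the customer penalty, and the event type labels are all assumptions chosen only to show how tracked events might be rolled up by type and ranked by cost.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed constants for illustration only; not values from the thesis.
LABOR_RATE = 50.0          # dollars per labor hour
CUSTOMER_PENALTY = 200.0   # extra cost when a failure reaches the customer


@dataclass
class FailureEvent:
    """One tracked failure event. All fields are hypothetical."""
    event_type: str          # label for the failure event type
    labor_hours: float       # time spent diagnosing and repairing
    parts_cost: float        # replacement parts, in dollars
    found_at_customer: bool  # True if the failure occurred in the field


def event_cost(event: FailureEvent) -> float:
    """Cost of a single event under the assumed cost components."""
    cost = event.labor_hours * LABOR_RATE + event.parts_cost
    if event.found_at_customer:
        cost += CUSTOMER_PENALTY
    return cost


def prioritize(events: list[FailureEvent], top_n: int = 3) -> list[tuple[str, float]]:
    """Sum cost per failure event type and return the costliest types first."""
    totals: dict[str, float] = defaultdict(float)
    for event in events:
        totals[event.event_type] += event_cost(event)
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    log = [
        FailureEvent("nozzle clog", 2.0, 10.0, False),
        FailureEvent("nozzle clog", 1.0, 0.0, True),
        FailureEvent("bed misalignment", 4.0, 100.0, False),
    ]
    for event_type, total in prioritize(log):
        print(f"{event_type}: ${total:.2f}")
```

Ranking by summed cost rather than by event count reflects the least-cost framing: a rare failure that reaches the customer can outrank a frequent but cheap in-house one.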
The third subproject focused on product reliability and life. A reliable product is
absolutely critical to NVBOTS because of their leasing business model. The costs associated
with repeatedly servicing an unreliable product are unsustainable as the business scales up, and
there is the potential to lose the customer entirely if unreliability is persistent. However, currently
there is no estimate of the life of the NVPro product. Furthermore, there are no processes in place to predict, analyze, and test potential in-service failures and mitigate these risks during
product development or production. This poses a serious risk to NVBOTS and the potential for
catastrophic business failure if the future costs of service are unmanageable.
Therefore, this subproject focused on two key process improvements. The first was to
implement a structured approach to predict future in-service failure modes and understand their
impact. This was accomplished through Failure Modes and Effects Analysis (FMEA). The
second was to establish a methodology of actually testing the product to determine its reliability
and predict its life. This was accomplished through Design of Experiments, accelerated life
testing, and statistical analysis. The life of the printer was then estimated by determining the life
of the top risk component identified by the FMEA. Finally, the subproject served to establish a
culture of reliability through systematic testing and analysis. Shabbir conducted this project and
the reader is referred to his thesis for all details [12].
The opportunities identified and covered in detail between the three theses provide
recommendations for near term implementation, as this would be of immediate benefit to
NVBOTS. However, each subproject is also a process improvement that should be adopted by
operations to ensure long-term success during the entire scale-up process.
1.3 Thesis Overview
As previously mentioned, the overall problem statement for the project was to scale-up
production at NVBOTS as the company grows. This thesis specifically focuses on ensuring
product reliability, by continually reducing the number and impact of failures, as production
scales up to match increased demand. This was imperative to NVBOTS as their business model
is based on leasing the printer to customers, and any costs associated with servicing unreliable
printers are the responsibility of the company.
1.3.1 Thesis Objective and Scope
The objective of this thesis is to provide companies a structured framework for a Failure
Mitigation Strategy that is effective, efficient, and easy to use. It is intended to be universally
applicable to many industries, while the case study for NVBOTS details how it can easily be
tailored to fit a specific company. The details for failure tracking, failure analysis, and failure
resolution are provided to clearly convey the methodology behind the process so that all users can
better understand how the results are calculated and why each step is needed. This thesis intends
to provide a comprehensive foundation for companies to utilize in establishing a Failure
Mitigation Strategy to reduce the number and impact of failures, resulting in higher reliability
and a more successful company.
The intended scope of this thesis was limited to manufacturing industries; yet most of the
FMS framework is universal and may be beneficial to many entities with limited modification.
The scope of the NVBOTS case study was limited to their NVPro 3D printer.
Data was
collected between February and July 2015, with historical data dating back to August 2014.
Failure tracking and analysis were performed with all available data on hand, preparing NVBOTS
for root cause analysis and failure resolution. A proper failure tracking system was implemented
and NVBOTS is strongly encouraged to fully implement the failure analysis system and failure
resolution methodology presented in order to truly realize all of the benefits from the FMS.
1.3.2 Thesis Structure
This thesis is structured into multiple chapters that provide the necessary background
information on the overall project and detail the FMS step by step. Case study details, when
applicable, are provided at the end of their respective chapters. Chapter 1 covers various risks
involved in scaling up a manufacturing startup and provides the motivation for the overall
project; it was written largely by Shabbir [12] but tailored to this thesis. Chapter 2 presents a brief
overview of additive manufacturing including the general process, current technologies, and
industry market. Chapter 3 presents background on NVBOTS and an analysis of its competitive
strategy, written by Chawla [11] and included here verbatim. Chapter 4 presents an overview of
the Failure Mitigation Strategy, provides motivation, and introduces the case study. Chapters 5,
6, and 7 respectively detail the failure tracking, failure analysis, and failure resolution
components of the FMS.
Finally, Chapter 8 summarizes the work performed and provides
recommendations for future work to automate the FMS and supplement its effectiveness with
complementary structured processes.
Chapter 2
Additive Manufacturing Overview
Additive Manufacturing (AM) is a field of manufacturing processes that creates objects
through successive addition of layers of material. Generally, the parts are built from digital
three-dimensional (3D) computer aided design (CAD) data, but this need not always be the case.
AM has been referred to by many different names, 3D Printing, Rapid Prototyping, and Freeform
Fabrication, just to name a few; but the term Additive Manufacturing best differentiates this field
of manufacturing processes from conventional manufacturing techniques, which usually involve
subtraction, deformation, or formation of material as well as changes to material properties. AM
has been around commercially since the late 1980s, but the industry really gained traction and
momentum in the 2000s and it has continued to increase rapidly ever since, with a compound
annual growth rate of 33.8% over the last three years [13]. In 1995, AM was only a $295 million
industry; as of 2014 the AM industry has grown to $4.1 billion and is expected to exceed $12.7
billion by 2018 and $21.1 billion by 2020 [13].
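As a rough cross-check on these figures, the compound annual growth rate implied by two market-size estimates can be computed with the standard CAGR formula. The sketch below is illustrative only: the function name is hypothetical, the inputs are the rounded market sizes quoted above, and the result is a long-run average rather than the three-year rate reported in [13].

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes quoted in the text, in millions of USD.
market_1995 = 295.0
market_2014 = 4100.0

long_run_rate = cagr(market_1995, market_2014, 2014 - 1995)
print(f"Implied 1995-2014 CAGR: {long_run_rate:.1%}")
```

With these inputs the implied long-run rate is roughly 15% per year, well below the recent three-year figure, which is consistent with the observation that growth accelerated after 2000.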
AM has opened up the design space to engineers, designers, and artists allowing them to
produce complex geometry that was once impossible or restricted by cost and/or time.
Geometrical freedom is just one of the many benefits offered by AM.
Time to first part,
customization, increased part performance, flexibility, material and energy efficiency, in-house
manufacturing, and reduction of the design cycle are some of the many benefits realized through
use of AM. AM is a potentially game changing tool for part production, but it is not the solution
to all manufacturing needs as there are some drawbacks. Cost, speed, and time are unfavorable
compared to conventional manufacturing when dealing with parts of simple geometry. Surface
finish, limited materials, material properties, and lack of standards are some of the other
drawbacks to AM. The key for users is to understand the capabilities and limitations and to
know when it is best to use AM or rather choose a conventional manufacturing process instead.
Depending on the desired object(s) and machine to be used there is a certain work flow
process to go from CAD data to having a physical part. This process can vary slightly for each
job and machine but in general all AM processes follow the same seven generic steps adapted
from Gibson [14]. The seven steps, in order are: CAD, Conversion to STL, STL Slicing and
Transfer to AM Machine, Machine Setup, Build, Removal, and Post Processing. There are many
factors such as geometry, material, intended use, cost, speed, et cetera that factor into which AM
machine to use for any given build. The American Society for Testing and Materials, now known
as ASTM International, has categorized all of the current machines by their AM process. There
are currently seven process methodology or technology categories defined by ASTM
International: Binder Jetting, Directed Energy Deposition, Material Extrusion, Material Jetting,
Powder Bed Fusion, Sheet Lamination, and Vat Photopolymerization [15]. The seven generic
steps and seven AM technology categories will be described in more detail in the immediately
following sections.
2.1 General Additive Manufacturing Process Work Flow
The detailed AM process will vary slightly from machine to machine and from build to
build but these seven generic steps cover the majority of all AM process work flows. Depending
on the machine, part(s), orientation of part, material, build quality, support material required, et
cetera, certain steps will be more extensive than others, while some may be skipped altogether.
Regardless, the following steps derived from Gibson [14] portray the typical work flow required
to transform CAD data into a physical object via AM (see Figure 2-1 for a visual representation
of the seven steps with a cup as an example part).
Step 1: CAD
All AM parts must start from a software model that fully describes the geometry.
This can involve the use of almost any CAD solid modeling software, but the
output must be a 3D solid or surface representation. Reverse engineering
equipment (e.g., laser and optical scanning) can also be used to create this
representation.
Step 2: Conversion to STL
Nearly every AM machine accepts the STL file format, which has become a de
facto standard, and nowadays nearly every CAD system can output such a file
format. This file describes the external closed surfaces of the original CAD model
and forms the basis for calculation of the slices.
Step 3: STL File Manipulation/Slicing/Transfer to AM Machine
The order of these three sub-steps may vary, but the STL file describing the part
must be transferred to the AM machine. There may be some general manipulation
of the file so that it is the correct size, position, and orientation for building. The
STL file is sliced into build layers and support material and corresponding support
layers are generated, if need be. These slices or layers represent the physical
build layers of material during the build.
STL manipulation and slicing may
occur on the AM machine or at a computer before transfer.
Step 4: Machine Setup
The AM machine must be properly set up prior to the build process. Such settings
would relate to the build parameters like the material constraints, energy source,
layer thickness, timings, et cetera. Setup usually involves cleaning, clearing, and
resetting of the build area altered from previous builds.
Step 5: Build
The part is built out of the given material(s) layer by layer according to the slice
data.
Building the part is mainly an automated process and the machine can
largely carry on without supervision. Only superficial monitoring of the machine
needs to take place at this time to ensure no errors have taken place like running
out of material, power or software glitches, et cetera. Newer and more industrial
machines are beginning to monitor for errors and anomalies in order to notify the
operator.
Step 6: Removal
Once the AM machine has completed the build, the parts must be removed. This
may require interaction with the part, raw material, and machine, which may have
safety interlocks to ensure for example that the operating temperatures are
sufficiently low or that there are no actively moving parts.
Removal must be
performed carefully and by experienced operators as many parts are damaged
during this step.
Step 7: Post-processing
Once removed from the machine, parts may require an amount of additional work
before they are ready for use. Parts may be weak at this stage or they may have
supporting features that must be removed. This therefore often requires time and
careful, experienced manual manipulation.
Post processing is usually the most
laborious step and yet the most commonly unknown step for those outside of the
industry.
1: CAD
2: STL Conversion
3: Slicing and Transfer
4: Machine Setup
5: Build
6: Removal
7: Post Processing
Figure 2-1: AM process work flow steps [14]
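The slicing in Step 3 can be illustrated with a minimal sketch that computes only the heights of the slice planes for a part of a given height and layer thickness. This is not how any particular slicer is implemented: a real slicer also intersects the STL triangles with each plane and generates support layers. The function name and the example dimensions are assumptions chosen for illustration.

```python
import math


def layer_heights(part_height_mm: float, layer_thickness_mm: float) -> list[float]:
    """Return the top z-height (in mm) of every build layer, bottom to top."""
    if part_height_mm <= 0 or layer_thickness_mm <= 0:
        raise ValueError("height and layer thickness must be positive")
    # Small tolerance so floating-point noise does not add a spurious extra layer.
    n_layers = math.ceil(part_height_mm / layer_thickness_mm - 1e-9)
    # The final layer is clipped to the part height when the height is not
    # an exact multiple of the layer thickness.
    return [
        round(min((i + 1) * layer_thickness_mm, part_height_mm), 6)
        for i in range(n_layers)
    ]


# Hypothetical example: a 20 mm tall part sliced at 0.1 mm (100 micron) layers.
heights = layer_heights(20.0, 0.1)
print(len(heights), "layers; first at", heights[0], "mm, last at", heights[-1], "mm")
```

Halving the layer thickness doubles the number of layers, which is one reason finer resolution generally lengthens the build step.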
2.2 Additive Manufacturing Technologies
In 2014 there were 49 industrial grade AM machine manufacturers, many selling multiple
models [13]. In the same year there were hundreds of mostly smaller companies selling desktop
grade³ machines as well [13]. All machine models are similar in that they build sequentially,
layer by layer, defined by the slice data of the 3D CAD model.
Yet all these AM machine
models are different from one another in many ways, each with the technology and features the
manufacturer believes their customers want.
Still, they all fall into one of the seven AM
process/technology categories defined by ASTM International (see Figure 2-2). Below are the
definitions of the seven standard AM process categories according to ASTM International [15]:
³ Industrial grade and desktop grade machines are defined in Section 2.4.
2.2.1 Binder Jetting
An additive manufacturing process in which a liquid bonding agent
is selectively deposited to join powder materials.
2.2.2 Directed Energy Deposition
An additive manufacturing process in which focused thermal energy
is used to fuse materials by melting as they are being deposited.
2.2.3 Material Extrusion
An additive manufacturing process in which material is selectively
dispensed through a nozzle or orifice.
2.2.4 Material Jetting
An additive manufacturing process in which droplets of build
material are selectively deposited.
2.2.5 Powder Bed Fusion
An additive manufacturing process in which thermal energy
selectively fuses regions of a powder bed.
2.2.6 Sheet Lamination
An additive manufacturing process in which sheets of material are
bonded to form an object.
2.2.7 Vat Photopolymerization
An additive manufacturing process in which liquid photopolymer in
a vat is selectively cured by light-activated polymerization.
Figure 2-2: AM process categories
Within the seven categories there are many machine models employing multiple variants
of the general process, yet they can all be summarized by the ASTM International categories.
More advanced AM machines are beginning to incorporate conventional manufacturing
processes in parallel to the additive processes. These machines still fit into one of the seven
categories, but are now being referred to as hybrid machines that are capable of both additive and
subtractive processes. Some future AM machines currently in the research and design phase
may not fit into one of these seven categories or actually blend two or more of the categories, but
for now these seven categories will suffice.
2.3 Additive Manufacturing Applications
Additive manufacturing has many applications and uses and more are being continually
thought of and put into use every year. As the machines and processes evolve and improve the
application space continues to grow. Originally AM parts were used solely as visual models to
better convey a conceptual design. Currently, AM part applications can fit into one or many of
the following categories: visual models, fit-check models, functional models, end-use parts,
tooling and molds, assembly guides and fixtures, education, and research. In recent years the
percentage of parts built for end-use has continued to climb and in 2014 end-use parts accounted
for [value illegible in source] of all parts built, now the most popular application. This can
be attributed to the
steady increase in performance and quality of the AM machines as well as the increased adoption
and confidence from engineers, designers, and other users. Fit-check models were the second
most popular category in 2014, accounting for 17.8%, while the least popular use was that of
tooling [13]. The year 2015 is expected to see a rise in both end-use parts and tooling for several
reasons: AM machines are more capable than ever, material options have increased, these two
categories have the most untapped potential, and leading manufacturers have been heavily
spotlighting these applications in their advertisements as well as at trade shows and conferences.
Many industries have realized the benefits of AM and its use has increasingly become
more and more widespread in industries such as automotive, aerospace, industrial/business
machines, consumer products and electronics, medical and dental, academic, government and
military, architectural and others. The automotive and aerospace sectors were early adopters of
AM and still represent a combined 30.9% of the total AM user base, while consumer products
and electronics are catching up with 16.6% [13]. The vast range of uses can be attributed to the
widespread adoption throughout all the major sectors as they begin to truly realize the many
benefits of AM. One of the most popular benefits that most sectors look to capture is that of
reducing the development cycle time for new products. AM can speed up rounds of design,
prototyping, and testing through quick or even parallel production of multiple iterations of a
design. Typically after the development cycle, there is a manufacturing cycle required to tailor
the final development design to the manufacturing equipment for quality and efficiency in mass
production.
When AM is used for production of end-use parts the development cycle and
manufacturing cycle are reduced even further, as the manufacturing cycle is no longer needed.
The last iteration of parts that were built for the development cycle now become the
manufacturing design and require no further work, as they are already being manufactured on the
final manufacturing equipment. Because of this reduction in time and cost, among many other
benefits, many sectors are looking to increase their use of AM.
2.4 Industry and Market
The AM industry consists of two major classes of machines: desktop grade and industrial
grade. For the intent of this publication any AM machine that retails for more than $5,000 USD
is considered industrial grade.
Any AM machine retailing for less than $5,000 USD is
considered desktop grade. This provides a clear cut line between the two but their differences
are quite obvious and extend well past their price tags.
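The price-based definition above can be written as a simple rule. The sketch below is illustrative: the function and constant names are hypothetical, and because the text does not say how a machine retailing at exactly $5,000 is classed, treating that boundary case as industrial grade is an assumption.

```python
# Price threshold (USD) separating the two machine grades, as defined in the text.
INDUSTRIAL_PRICE_THRESHOLD_USD = 5_000


def machine_grade(retail_price_usd: float) -> str:
    """Classify an AM machine as 'industrial' or 'desktop' by retail price."""
    if retail_price_usd < 0:
        raise ValueError("retail price cannot be negative")
    # Assumption: a machine priced at exactly the threshold counts as industrial.
    if retail_price_usd >= INDUSTRIAL_PRICE_THRESHOLD_USD:
        return "industrial"
    return "desktop"


# Hypothetical prices, for illustration only.
for price in (400, 4_999, 250_000):
    print(price, machine_grade(price))
```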
Industrial grade machines are just that; they are built for industrial use and are intended to
be operated in an industrial setting by trained operators. The machines range from $5,000 to
over $2,000,000 USD and are capable of very fine layer resolutions and large build volumes.
Industrial machines can process the widest range of materials including, but not limited to,
polymers, metals, ceramics, composites, and bio-matter. In general they have higher reliability,
quality, resolution, layer thickness options, advanced build control, speed, efficiency, and
robustness when compared to desktop models. Industrial grade machines are usually much more
complex, yet easier to work with than desktop machines, due to better software and a more
automated process. Typically they have larger build volumes and are able to build multiple parts
in parallel.
Industrial grade machines span all seven process categories and are starting to
include hybrid machines that are capable of both additive and subtractive processes.
Uses
include all of the previously noted uses but in contrast to the desktop models, industrial grade
machines are used more frequently for end-use parts, tooling, and fit-checks, owing to their
material selection and build quality/resolution. In 2014 Wohlers estimates that nearly 13,000
industrial grade machines were produced and sold. In total last year, industrial AM machine
sales accounted for 86.6% of revenue from sales worldwide [13].
Desktop grade machines are designed for low cost and are able to fit on a desktop at work
or at home. They range in price from about $400 to $5,000 USD. These machines have not been
around as long as their industrial counterparts, first breaking into the commercial market in 2007
and only truly being sold in large quantities beginning in 2011 [13].
Desktop models are
notorious for being difficult to work with and for lacking in quality, resolution, and speed.
The software, user interfaces, and calibration settings are weak points and cause most of the
issues associated with this class of machines. Due mainly to their low cost, desktop grade
machines have a very good price to performance ratio and are much less expensive to operate.
They are limited to only a few simple material choices but usually have many build color
options. These machines are more tailored to home, educational, artistic, and recreational uses.
Currently desktop models are only available in one of two process/technology categories:
extrusion and vat photopolymerization.
Last year nearly 140,000 desktop machines were sold
worldwide accounting for 13.4% of revenues from all AM machines, up from 9% the previous
year [13].
The AM market is dominated by industrial and desktop machines but there is very little
in-between. Recently hundreds of companies have started to produce thousands of desktop AM
machines to satisfy the general public's craving for access to 3D printing. There is truly an
untapped market sitting directly between the two current machine grades.
A very accurate
analogy can be made to conventional printers: Industrial grade AM machines are similar to large
printing presses and desktop grade AM machines resemble conventional desktop inkjet and laser
printers, but there is currently nothing similar to that of the networked office printer. Xerox,
Canon, HP, and others have truly excelled in the networked office printer market, yet not a single
AM machine has been designed for a similar 3D market. NVBOTS aims to tackle this untapped
market with their NVPro 3D printer. The NVPro is networked and designed for speed and
autonomy. This should be a good fit for this open market but it's safe to say that many of the
existing industrial and desktop manufacturers are looking to fill this void as well. The classroom
and office space may well be the next battleground for AM machine manufacturers.
Chapter 3
Company Background
New Valence Robotics, or NVBOTS, is a 3D Printer manufacturing startup founded in
March 2013 by four MIT students. At present, NVBOTS has all of its operations in Boston, MA.
The vision of NVBOTS is to build a globally distributed network of on-demand intelligent
automated 3D printers in order to deliver high quality printed parts. The team here believes that
the current additive manufacturing process is full of hassles and this acts as an encumbrance
against increasing the user base of the technology. There's a steep learning curve involved in
designing for 3D printing, part removal is cumbersome and there is a lack of queuing which
makes 3D printers difficult to share. To tackle these problems, NVBOTS has developed the
world's only 3D printer with automated part removal, which through their cloud-based interface
can run continuously by itself and be controlled by any device [10].
See Figure 3-1 for
NVBOTS corporate logo.
Figure 3-1: NVBOTS logo [10]
Their current business model is to lease out printers for five-year terms at different
pricing and packages to their educational and industrial customers with full service offered as a
part of every package [16]. The company recently closed a successful $2M seed round of
funding.
3.1 The Product
The NVPro is a dual extrusion based printer with a resolution of 100 microns and an
accuracy of 25.4 microns. The build volume is a cube of 8 inches and achievable printing speed
is as high as 180 mm/s. Figure 3-2 shows how visually open the NVPro machine is, allowing for
educational opportunities, while proudly displaying the complex internal mechanisms.
Figure 3-2: NVPro printer [17]
Other features of the NVPro include automated part removal which obviates manual
presence to clear the build area for subsequent prints, and a built-in camera that allows real time
viewing of the printing process from any device. All the printer management is through the cloud
so no extra software is required [17].
The NVPro caters to the education market as their target audience. Additional offerings
in the package include 3D printable curricula. These modules encourage project based and
applicative learning and lessons include life sciences, earth-space sciences, engineering and
many more [18]. The user interface (see Figure 3-3) is intuitive and easy to navigate. Its features
include print preview with size, shape, and quality adjustments, administrative control for queue
management, and a printer dashboard (see Figure 3-4) with a live video feed and other real time
monitoring add-ons [19].
Figure 3-3: Print preview feature [19]
Figure 3-4: Printer dashboard [19]
3.2 The Market
NVBOTS leases out printers on a five-year contract that ensures recurring consumables
revenue (plastic filament) and cloud services fees. Their beachhead market is the education space
in an attempt to capture the future designers and scientists early and also, through their data,
learn what is desired from 3D printing. They currently have 16 printers rented by educational
customers, 10 printers working internally and 12 printers currently on the assembly line. Once
they successfully penetrate the education marketplace, they will approach the industrial
marketplace with improved technology offerings in an attempt to make a stronghold there.
3.3 Company Analysis
Professor Michael Cusumano of the Sloan School of Management identifies eight key
points of successful startup ventures [20]. The following section looks closely at how NVBOTS
is currently positioned on the basis of these metrics, while Table 3.1 summarizes NVBOTS'
status.
Management Team
The founding team of NVBOTS includes CEO AJ Perez, CTO Forrest Pieper,
COO Chris Haid and VP of Engineering Mateo Peña Doll, all of them MIT
mechanical engineers. NVBOTS also has an esteemed board of advisors made up of
former manufacturing and 3D printing industry experts and experienced professors at
MIT. With new hires, including experienced people in key areas of supply chain, sales,
and production, this metric seems well cleared for NVBOTS.
Attractive Market
The McKinsey Global Institute estimates the total economic impact of 3D
printing by 2025 to be up to $0.6 trillion [21]. Hence, an attractive market certainly
exists. With NVBOTS's beachhead market being largely untapped and their value
proposition being specifically advantageous to capture it, they are well on track. They
are also targeting industrial markets with improved technologies.
Compelling New Product
The NVPro is a compelling product in itself but it must be put in reference to
the competition that they face. It is the only product that offers all of the
following features: 24/7 printing without human intervention, ease of sharing among
consumers, and use of data for future improvements.
Strong Evidence of Customer Interest
NVBOTS already has customers using the product and a high anticipated
demand for financial year 2015. Besides, NVBOTS is currently catering to its
beachhead market and at the same time working on product innovations that will serve
them well in the industrial marketplace. The true litmus test will come when they
attempt to pitch to industrial customers and compete with other well-established
players in that field.
Overcoming the "Credibility Gap"
Professor Cusumano describes this as "the fear among customers that the
venture will fail, leaving the buyer without technical support or a future stream of
product upgrades." In order to avoid this, startups must use present customers as
references for new customers [20]. This requires exceptional customer service and also
a reliable product that does not face critical issues in the field. This is one area
of concern.
Demonstrating Early Growth and Profit Potential
NVBOTS has a well charted financial plan for growth and an existing
customer base. Successful seed funding rounds are indicative of the company's merit.
Flexibility in Strategy and Technology
This metric cannot be assessed through plans made in advance but only after
the company has been running for a certain period of time and proves to be responsive
to market needs and technological changes in such disruption prone markets.
Potential for Large Investor Pay-Off
A startup that has established sources of funding beyond angel investors and
family, friends, et cetera, shows promise for large pay-off and looks better to potential
investors [20]. This is an area of opportunity for NVBOTS.
Elements for evaluation                             NVBOTS
Management Team
Attractive Market
Compelling New Product
Strong Evidence of Customer Interest                Opportunity
Overcoming the Credibility Gap
Demonstrating Early Growth and Profit Potential
Flexibility in Strategy and Technology              Opportunity
Potential for Large Investor Pay-Off                Opportunity
Table 3.1: NVBOTS evaluation
In conclusion, the company has a bright future ahead with their strong performance in
almost all of the above mentioned metrics. NVBOTS should have a strong focus on capitalizing
on opportunities by generating customer interest and also staying nimble and flexible in their
strategy. By refining their product design and manufacturing process, they can eliminate the
initial concerns that their product faces in the field and ward off their single potential problem.
Chapter 4
Failure Mitigation Introduction
Failure mitigation is extremely important to any company, but for a high technology
startup it is of the utmost concern. As a high technology hardware startup, customer satisfaction
and company reputation are critical to market acceptance and the ability to increase demand.
Product failures, especially in the hands of the customer, can cause long lasting and devastating
impact. During the startup phase, companies are most vulnerable to this impact as they have little to
no reputation and are trying to establish themselves in the marketplace as a trusted
company.
The following sections give a brief overview of the importance of
failure mitigation and the ideal failure tracking, analysis, and resolution methodology that should
be implemented to minimize any possible negative impact during startup, as well as throughout
the life of the company. The initial failure mitigation status of NVBOTS will also be discussed.
4.1 Overview of Failure Mitigation and Importance
Some of the benefits a manufacturing company can expect to gain from implementing a
Failure Mitigation Strategy (FMS) comprised of failure tracking, analysis, and resolution are:
- increased product reliability
- reduced severity of failures
- increased product performance
- reduced service costs
- increased manufacturing efficiency
- increased customer satisfaction
As the Failure Mitigation Strategy is a continual process it will provide reliability improvements
over the complete lifecycle of the product [22].
Failure tracking can allow a company to
categorically and statistically view all their failures over time. It gives an objective overview of
the quality and reliability of the product and manufacturing system.
Failure analysis is then
performed on the collected failure tracking data; ideally the analysis is continually updated in
real time as new failure tracking data is added. This can lead to important insights regarding
which failures are important and need to be prioritized for mitigation. Root cause is investigated
during analysis to help direct the failure resolution. In failure resolution, multiple methods are
used to combat failures.
Sometimes changes to product design, assembly, or handling are
required while in other cases testing may be more economical for failure mitigation. A failure
mitigation process that includes tracking, analysis, and resolution will enlighten management,
engineering, manufacturing, and sales about issues that they might have never even noticed as
well as ways to improve the reliability and maintainability [22]. It will clearly bring to the
surface which failures have the most impact and what is needed to mitigate them. If properly set
up and ingrained into the company culture, a strong FMS can propel a manufacturing company
to increased product quality, performance, and reliability.
Continually improving quality,
performance, and reliability provides a forceful competitive advantage that leads to a more
successful and reputable company.
Despite all these benefits, failure tracking systems are not common practice in many
industries [23]. Without a proper failure tracking system, companies are left to address product
failures with much less efficient methods. Some typical methods for prioritizing failures employed
by companies without failure tracking and analysis include first-in first-out, most frequent first,
most expensive first, and even management intuition. Obviously all of these methods are blind
to the true impact of the failures and address them without regard to which will have the most
benefit to the product and company. Without failure tracking many failures can go unnoticed
and valuable data and information are lost [23].
Without the proper tools to identify, track,
analyze, and resolve failures, companies run the risk of costly product failures and most of the
time they will occur in the worst possible location, at the customer.
High technology
manufacturing startups, like most startups, lack the structure and calculated processes that keep
larger and more established companies running smoothly. Because of this startups are less likely
to have a well-established and complete Failure Mitigation Strategy. Startups tend to fight fires
(failures) day to day and are usually just trying to keep up with the daily needs of starting a
company. Instead of collecting all failure data, analyzing and addressing the most costly and
impactful failures first, startups tend to take on the larger fire first. The only problem is that the
apparent size of the fire is not always aligned with the most impact to the company. A failure
may appear very large or important because one customer called to complain, while in actuality
it may be a very rare failure that does not warrant top priority. Proper failure mitigation is easier
to implement than most think, and once set up it can run with very minimal impact on daily
operations, while providing immense benefits. A true FMS comprised of failure
tracking, analysis, and resolution will ensure the company is aware of all failures and addresses
them in the most effective and efficient manner in order to mitigate the negative impact of
failures to the company as soon as possible.
4.2
Overview of Ideal Failure Mitigation Strategy
An ideal Failure Mitigation Strategy is one that works best for a given company at a
given time. However, most are structured in a similar manner and follow the same process flow.
The Failure Mitigation Strategy detailed in this thesis was designed by the author for NVBOTS
but was kept in a generic form so that it may be applicable to most manufacturing companies and
possibly even appeal to other sectors. The ideal FMS should consist of three major components:
failure tracking, failure analysis, and failure resolution. There is a set workflow that, if followed,
will result in a systematic reduction of costs incurred from failures through reduction in the
number and impact of failures. Figure 4-1 diagrams the ideal FMS framework, visually detailing
the structure and workflow. Each component of the FMS framework is introduced in this section
and later expanded upon in Chapters 5, 6, and 7, in much greater detail.
[Figure content: FAILURE MITIGATION STRATEGY FRAMEWORK (2015 Derek S Straub). FAILURE EVENTS lead to
OBSERVE (observe symptoms and conditions; avoid jumping to conclusions on cause), then
FAILURE TRACKING / DOCUMENT (document every issue/failure event; easy, quick, accessible data
entry form; standardized and detailed event records; record as soon as possible; single unified
tracking system), then FAILURE ANALYSIS / DECIDE (frequency/total occurrences; cost: hardware,
labor/time, travel, diagnosis, repair, customer impact; down select by using total cost model;
investigate root cause), then FAILURE RESOLUTION / ACT (known cause: change design, assembly,
etc., or test if economical; unknown cause: screening test as early as possible).]
Figure 4-1: Failure Mitigation Strategy Framework
Visually detailing the structure and workflow
The failure mitigation workflow starts with a failure event. This event can be either a
failure or an issue. A failure is defined as an event that renders the 3D printer unusable and
cannot be easily fixed by the customer. It will no longer operate as grossly expected by the
customer. An issue is defined as an event that degrades the performance of the 3D printer but
the operation is similar to the customers' expectations or an event that renders the printer
unusable but can be easily fixed by the customer. Failures cause more impact to the customer
and company but both event types are important and must be recorded. When a failure event
takes place the first action should be to observe. The company should observe the symptoms and
conditions during the failure, if possible, and gather as much information as possible about the
state of the 3D printer. If the failure event is being relayed from a customer, the company should
extract as much detail as possible about the symptoms and condition of the printer. At this stage
in the workflow the company should avoid jumping to conclusions on the cause of the failure
event. This process should be performed in an objective manner. Failure tracking, failure
analysis, and failure resolution will be described in further detail below.
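The failure and issue definitions above can be written as a small decision rule. The following is a minimal sketch in Python; the names EventClass and classify_event and the two boolean inputs are hypothetical and are not part of the NVBOTS system.

```python
from enum import Enum


class EventClass(Enum):
    FAILURE = "failure"
    ISSUE = "issue"


def classify_event(printer_unusable: bool, customer_can_easily_fix: bool) -> EventClass:
    """Classify a failure event using the definitions in this section.

    Failure: the printer is unusable and the customer cannot easily fix it.
    Issue: performance is degraded but operation is close to expectations,
    or the printer is unusable but the customer can easily fix it.
    """
    if printer_unusable and not customer_can_easily_fix:
        return EventClass.FAILURE
    return EventClass.ISSUE


# Both classes must be recorded; the classification only labels the event.
print(classify_event(printer_unusable=True, customer_can_easily_fix=False).value)
print(classify_event(printer_unusable=True, customer_can_easily_fix=True).value)
print(classify_event(printer_unusable=False, customer_can_easily_fix=False).value)
```

The rule only labels the event. It does not decide whether the event is logged, since both event types are recorded.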
4.2.1
Failure Tracking
Once a failure event is observed or relayed the first step in the ideal FMS is failure
tracking. This step is critical as it collects all the information and data that will be used in the
remaining steps.
Failure tracking is all about organized and complete documentation of the
failure event. Every failure and issue must be documented. The data entry form must be quick,
easy to use, and easily accessible in order to encourage employees to completely fill out all fields
and provide a high level of detail. High quality entries allow for high quality results later on.
The event record should use standardized fields and language so that subsequent data processing
is more easily handled. Failure events should be logged as soon as possible in order to avoid any
loss of data. The quicker the data is logged the more complete and accurate the record is [24].
At this step possible causes can be added to the tracking system but only after symptoms and
conditions have been objectively recorded. There should be only one unified failure tracking
system to eliminate the possibility of duplicates or mismatched data sets. More information on
the failure tracking process and ideal failure tracking system is detailed in Chapter 5.
4.2.2
Failure Analysis
The second step in the ideal FMS is failure analysis. The basis of this step is to decide
which failures to address first by determining which ones have the greatest impact on the
company. This analysis is performed using a total cost model. The total cost model takes into
account both the frequency of a failure event type and the cost incurred to the company. The
cost to the company can be further broken down into sub-costs:
hardware, labor/time, and
customer impact. For NVBOTS, the last sub-cost, customer impact, was determined to be very
high with respect to the hardware and labor costs and thus the total cost values for NVBOTS are
dominated by the customer impact.
This will vary from company to company but customer
impact must be accounted for as it is usually quite important and sometimes the driving cost
factor. Once all of the data has been processed, the analysis will show which failure event types
create the greatest cost to the company. Event types can now be down selected by the total cost
model, only advancing the top cost contributors to root cause investigation. Ideally this step can
be automated and continually updated in real time as new data is logged into the failure tracking
system. Finally root cause analysis is performed on the top cost contributing failure event types
to determine how to resolve the failure in the future.
Ideally the company will be able to
determine the cause of each failure type but this is not always the case.
Luckily the failure
resolution step can handle both known cause failure types and unknown cause failure types
through the use of multiple resolution methods. More information on the failure analysis process
is detailed in Chapter 6.
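As an illustration of the down selection described above, the sketch below sums per-event costs by failure event type and keeps the top contributors. It assumes the total cost of a type is simply the sum of the hardware, labor, and customer impact costs over all of its recorded events, so frequency enters through the number of events. The model detailed in Chapter 6 may differ. The names EventCost and rank_by_total_cost and all dollar values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EventCost:
    failure_type: str        # standardized failure type name or code
    hardware: float          # replacement parts, in dollars (illustrative)
    labor: float             # labor/time cost, in dollars (illustrative)
    customer_impact: float   # estimated cost of customer impact (illustrative)

    @property
    def total(self) -> float:
        return self.hardware + self.labor + self.customer_impact


def rank_by_total_cost(events, top_n):
    """Sum per-event costs by failure type and return the top contributors."""
    totals = defaultdict(float)
    for event in events:
        totals[event.failure_type] += event.total
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Illustrative numbers only, not NVBOTS data.
events = [
    EventCost("Extruder Jam", hardware=5.0, labor=30.0, customer_impact=100.0),
    EventCost("Extruder Jam", hardware=5.0, labor=30.0, customer_impact=100.0),
    EventCost("Extruder Jam", hardware=5.0, labor=30.0, customer_impact=100.0),
    EventCost("Bed Won't Heat", hardware=80.0, labor=60.0, customer_impact=200.0),
    EventCost("Torn Buildtak", hardware=10.0, labor=10.0, customer_impact=20.0),
]

for failure_type, cost in rank_by_total_cost(events, top_n=2):
    print(f"{failure_type}: {cost:.2f}")
```

In this made-up data set the frequent but individually cheap event type outranks the rare but expensive one, which is the kind of result that frequency-only or cost-only prioritization would miss.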
4.2.3
Failure Resolution
Failure resolution follows the failure analysis step and is the last step in the ideal FMS.
Failure resolution is initiated by the completion of a root cause analysis of a top cost contributing
failure event type. If a root cause is found, the solution to that cause, be it a design change,
assembly process change, new handling instructions, et cetera, will be estimated for cost. The
cost of the solution will then be compared to the cost of the failures, extrapolated out in time and
the change will be made as long as the cost is less than the predicted cost of future failures. This
will be the case a majority of the time. In some rarer instances the cost to fix the failures may be
higher than the cost of the failures themselves. If this is found to be true, the engineering team
should look into whether a test can be used to screen the particular failure depending on the
nature of the cause. Most times a simple test can be very economical at reducing the chance of
the product experiencing a failure later on especially when related to varying incoming part
quality.
If the root cause cannot be determined the engineering team should determine a
screening or stress test that could weed out possible failures. This test should be implemented as
early in assembly as possible in order to remove the failure before more value is added to the 3D
printer. The earlier the failure is caught, the more time and money is spared by the manufacturer.
Failures with unknown causes should be periodically revisited with a fresh root cause
investigation as new information may lead to a definitive cause. More information on the failure
resolution process is detailed in Chapter 7.
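The resolution logic of this section can be summarized as a short decision function. This is a sketch under stated assumptions, not the procedure used at NVBOTS: the function name, the cost inputs, and the fallback outcome when neither a change nor a test is economical are hypothetical, since the text does not specify that last case.

```python
from typing import Optional

CHANGE = "change design, assembly, or handling"
TEST = "add screening test"
SCREEN_EARLY = "add screening test as early in assembly as possible; revisit root cause periodically"
NO_ACTION = "no economical action found; keep tracking"  # assumption, not from the text


def choose_resolution(
    root_cause_known: bool,
    predicted_failure_cost: float,
    fix_cost: Optional[float] = None,
    test_cost: Optional[float] = None,
) -> str:
    """Pick a resolution method for one top cost contributing failure type."""
    if not root_cause_known:
        # Unknown cause: screen or stress test early, then revisit the root cause.
        return SCREEN_EARLY
    if fix_cost is not None and fix_cost < predicted_failure_cost:
        # Known cause and the change costs less than the predicted future failures.
        return CHANGE
    if test_cost is not None and test_cost < predicted_failure_cost:
        # Change is too expensive, but a test is economical.
        return TEST
    return NO_ACTION


# Illustrative numbers: failures cost $500 per month, extrapolated over 12 months.
predicted = 500.0 * 12
print(choose_resolution(True, predicted, fix_cost=2000.0))
print(choose_resolution(True, predicted, fix_cost=9000.0, test_cost=1500.0))
print(choose_resolution(False, predicted))
```

The extrapolation horizon (12 months here) is a free parameter. A longer horizon makes more changes look economical.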
4.3
Initial Status of Company
NVBOTS, as of March 2015, operated like most startups, fighting fires day to day. They
were attempting to collect failure data but they had multiple failure tracking spreadsheets with
inconsistent and missing data. Tracking a failure was not a top priority and many times failures
went undocumented.
One major concern was that repeat failures were rarely recorded. The
engineering team at NVBOTS figured that they were aware of the problem so recording it a
second, third, or 20th time was not needed.
This leads to lost data that could provide critical
insights to true cost and impact of the failures as well as information needed to properly
determine the root cause and resolution method. When looking at one failure event, confounding
factors could mask the true cause, but over time with multiple similar failures the true cause is
much easier to distinguish from the noise. Failure analysis was practically non-existent except
for a root cause analysis performed with little structure. The engineering team would usually
diagnose the individual failure on the spot and jump to logical causes without review of
historical data. Changes would be made to the design or process and they would run a quick test
to see if the change addressed the failure; sometimes it did, other times multiple iterations would
be needed before a solution was found. This process might be quicker for one failure but it is a
much slower process over the course of many failures, due to its inefficiencies. NVBOTS knew
that they needed to collect data on failures in order for their company to succeed, but multiple
factors, such as inexperience, workload, and perceived effort, prohibited them from establishing
a proper Failure Mitigation Strategy.
NVBOTS performs a burn-in test of every printer once complete, before shipping it off to
a customer. This burn-in period lasts one work week and mainly consists of multiple part builds
under standard operation conditions.
Some failures, such as infant mortality failures in electronics and assembly errors, are
caught at this time. This is both good and bad for
the company. It is positive in that they have caught many failures in-house before shipping to
the customer, allowing fewer failures in the customers' hands, yet negative in that NVBOTS is
finding failures in their product after 100% of the value has been added to the 3D printer.
Because of this, any failure found requires extensive teardown and rebuild, costing the company
valuable time and money that could be avoided. Again, some of these failures that occurred
in-house are not recorded or documented properly, leaving their failure data to show an incomplete
picture.
The research and work performed for this thesis is in support of establishing and
ingraining a proper Failure Mitigation Strategy at NVBOTS to increase their likelihood of
successful scale-up.
Chapter 5
Failure Tracking
The first step in the ideal FMS deals with documenting the failure events (see Figure 5-1).
Failure tracking converts failure events into valuable data. As failure tracking creates all of the
data that the FMS runs on, it is critical to have complete and accurate records. Without a proper
failure tracking system valuable data is lost along with the possible cost savings and product
improvements that could be highlighted by the failures. A strong FMS begins with a strong
failure tracking system. Below, the concepts mentioned in Section 4.2.1, along with others, will
be expanded upon detailing the ideal failure tracking methodology.
[Figure content: FAILURE TRACKING / DOCUMENT: document every issue/failure event; easy, quick,
accessible data entry form; standardized and detailed event records; record as soon as possible;
single unified tracking system.]
Figure 5-1: FMS failure tracking
The first step in the FMS framework
5.1
Detailed Methodology
Document Every Failure Event
Every failure event must be recorded, no matter how big or small, significant or
insignificant, unique or common, issue or failure; every individual event must be logged. This is
very important as this data will later be statistically processed and missing data will only mislead
the company to fictional results. The process is robust enough that one or two missed entries
should not skew the data significantly but companies should take care to log every event in order
to obtain results that reflect the true status of their failure position.
Be sure to encourage the reporting of issues. Naturally, failures will be reported as the
product is no longer usable, but issues may go unreported as they may be easily fixed or only
degrade the performance a little and the customer is still satisfied with the product. This can lead
to many issues never being reported and thus not being tracked. Establish an incentive program
or other motivational system to increase the number of issues reported.
Depending on the
product, reporting features can be built into the design to assist the company in tracking failures
and issues. In the case of NVBOTS, the NVPro is always connected to the main company server
via the internet so NVBOTS is able to track issues and failures remotely.
For example, if a
standard filament jam occurs and the customer is able to fix it, the printer can send a message
that a jam has occurred even though the customer never reported it. Documenting as many
failure events as possible is vital to a successful FMS.
Detailed, Complete, and Accurate
As the saying goes, garbage in equals garbage out. Care must be taken to ensure that
failure events are logged in a detailed, complete, and accurate manner. Many times the user who
entered the failure event details will not be on the team conducting the root cause analysis. Thus,
the failure event record should be detailed enough that anyone reading it could reconstruct the
situation without input from the original user who filed the failure event. The results from the
FMS are only as good as the input data generated during failure tracking.
Record Events As Soon As Possible
Failure events should be recorded as soon as possible to alleviate the risk of loss of data.
The longer the time between observing a failure event and recording, the less accurate details
become. Accuracy is crucial so it must become second nature to record right away. When a
technician is out on a service call it makes sense to repair the unit first, allowing for the customer
to begin using the product again, but directly afterwards the technician should record the event to
ensure the event gets documented and the details are as accurate as possible.
Be Objective
The symptoms, conditions, and details of the failure event should be recorded first and be
strictly objective.
Failure tracking is not the ideal time to make subjective decisions on the
definitive cause of the failure event. The data from one single failure event is less conclusive
than that from many similar failure types and thus decisions will be made later in the failure
analysis step.
The technician must remember failure tracking is all about objective
documentation of the failure event in order to create a detailed and accurate historical record of
what happened. Failure tracking attempts to create a virtual snapshot of the failure event so that
it can be viewed again in the future. At the very end of the entry form possible causes may be
listed if deemed appropriate.
Standardized Form and Language
The failure tracking system must use a standardized form and standardized language or
codes.
The standardization is needed for statistical processing in the failure analysis step.
Processing is much easier when the same failure type appears as the same code or name. This
will streamline the remainder of the FMS as well as eliminate confusion while entering data into
the failure tracking system. Standard language ensures that data recorded by one technician on
failure type X appears the same as data entered by another technician documenting a failure
event of the same failure type. This also allows for quicker and clearer communication between
different members of the company.
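One way to enforce standardized language at entry time is to map free-text input onto a fixed vocabulary and reject anything outside it, much as a drop down list would. The sketch below is illustrative only; the vocabulary, the accepted spellings, and the function name standardize_symptom are hypothetical.

```python
# Hypothetical vocabulary: accepted spellings mapped to one standard name each.
STANDARD_SYMPTOMS = {
    "extruder jam": "Extruder Jam",
    "extruder jammed": "Extruder Jam",
    "bed won't heat": "Bed Won't Heat",
    "bed not heating": "Bed Won't Heat",
}


def standardize_symptom(raw: str) -> str:
    """Return the standard symptom name, or raise if the text is not recognized."""
    key = " ".join(raw.lower().split())  # ignore case and extra whitespace
    if key not in STANDARD_SYMPTOMS:
        raise ValueError(f"unknown symptom {raw!r}; choose from the standard list")
    return STANDARD_SYMPTOMS[key]


# Two technicians describing the same failure type end up with the same name.
print(standardize_symptom("Extruder jammed"))
print(standardize_symptom("  extruder   JAM "))
```

Rejecting unknown text, rather than storing it as-is, keeps the same failure type from appearing under several names during later statistical processing.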
Easy, Quick, Integrated, and Accessible Data Entry Form
The failure tracking data entry form should be easy, quick, integrated, and accessible.
Ease of use is important as this will encourage more frequent usage and more complete entries as
the technicians will be happy to use it. Similarly the form should be quick and to the point. The
form must be as efficient as possible and capture all the required data without asking for
extraneous items that are not needed. To increase the ease of use and speed of filing, the form
should be integrated with the system so that, where applicable, data can be automatically populated.
Items such as the failure event number, username, date and time, location, firmware revision,
software revision, et cetera can be automatically filled saving the technician valuable time while
also alleviating the possibility of typos and incorrectly entered data. The form must be highly
accessible to allow timely entry of failure events. When technicians are out on a service call they
must be able to remotely enter data via a mobile device or laptop. Timely entry of failure
event details reduces the chance that details lose accuracy over time. Failure tracking systems with
forms that are easy to use, quick, integrated and accessible will capture better data and be better
accepted by the technicians.
Single Unified System
Only one unified tracking system should be used to track all failures. Multiple tracking
systems can allow for a number of issues such as duplicate entries, non-standard language, and
confusion. Combining data from multiple tracking systems can become difficult as the data may
not be compatible or may be formatted incorrectly. A single unified system promotes standard
language and codes, accurate data entry and smooth transition to data analysis. Duplicate entries
are deterred but easily recognized and discarded if found. A single tracking system also promotes
simplicity allowing for better technician comprehension and adoption. Multiple tracking systems
generate more work, are less efficient and should be avoided.
Proper Training and Company Culture
Technicians, engineers, and anyone else who will be using the tracking system should be
properly trained. They should receive training on the whole FMS system as well as proper data
entry. Training on the complete FMS system will provide all users a thorough understanding of
the entire process which will allow them to understand the importance they play to the system.
Each member of the team needs to fully comprehend his role and how proper execution will
benefit himself, the system, and the company as a whole. Properly trained users will generate
better data and be more efficient and effective at documenting failure events. The system is only
as good as the users who use it. The concept of failure tracking should be ingrained into the
company culture and heavily supported by management. Technicians should not be motivated
by other factors to fill out the form as quickly as possible or even worse skip an entry altogether.
On the contrary, they should be encouraged and motivated to submit an accurate, detailed, complete,
and timely entry for every failure event. Failure tracking should become second nature to the
users of the system.
Tracking Items
Depending on the company and product, the items tracked can be tailored to best fit the
given company.
In general, the items should provide enough information to enable the full
virtual reconstruction of the failure event. When developing a failure tracking system one should
think about all of the possible ways to describe the state of the product and its environment, as
these are the items that should be tracked. Items to be tracked should be clearly defined and a
standardized language or coding system should be used to standardize responses. Note fields for
extra details can help a technician report an event that requires special reference beyond that of
the standard language or codes. When applicable, items should be automatically populated by
the failure tracking system to reduce technician manual entry. All fields are required except for
those listed as optional. Fields with standardized language or codes should have drop down lists
or searchable lists to choose from. Table 5.1 details the tracking items for NVBOTS. This list
has been lightly tailored to fit the NVBOTS NVPro 3D printer, but it could easily be altered to fit
another product or company.
Tracking Item | Description | Entry Type
--- | --- | ---
Failure Event # | Automatically generated, serialized number unique to each failure event | Autofill
User Name | Unique identifier for the user who submitted the tracking form | Autofill
Date/Time | Date and timestamp | Autofill
Firmware Rev # | Current firmware revision number on the hardware involved with the failure event | Autofill
Software Rev # | Current software revision number on the hardware involved with the failure event | Autofill
Location of Printer | Macro and micro location of the 3D printer | Autofill
Current Build | File currently being built when the failure occurred | Autofill
Current Material | Material of filament spool currently loaded while the failure occurred | Autofill
Software Data Dump | Download of the log files currently on the machine while failure occurred | Autofill
Environmental Conditions | Temperature, humidity, atmospheric pressure, et cetera in local vicinity to the printer | Autofill
Travel Time | Round trip travel time automatically calculated from online route planning service | Autofill
Travel Distance | Round trip travel distance automatically calculated from online route planning service | Autofill
Hardware # | Serialized hardware number unique to each NVPro frame | Manual
Failure or Issue | Failure event classification of failure or issue | Manual
Symptoms | Standardized symptoms observed that describe the failure | Manual
Details on Failure Event | Open details field to note extra specific details about the failure beyond that of the standard language | Manual
Part #s | Part numbers of the machine parts that are damaged and need to be replaced or repaired, if applicable | Optional
Fixed? | Was the failure or issue fixed? | Manual
Fix Details | Details describing the fix, if applicable | Manual
Time Spent on Diagnosis | Time in minutes spent diagnosing the failure event and deciding how to fix the failure event | Manual
Time Spent on Repair | Time spent in minutes on teardown and repair | Manual
Primary Cause | Possible primary cause of failure event, if applicable | Optional
Secondary Causes | Possible secondary causes of failure event, if applicable | Optional
Photos/Videos | Photos or videos that could assist in the investigation of the root cause or that document the status or condition of the printer or environment | Optional
Notes | Open notes field for any extra information that is important but does not fit elsewhere | Optional

Table 5.1: Failure tracking items
List of recommended tracking items for NVBOTS with brief descriptions
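The split between autofilled, manual, and optional tracking items maps naturally onto a record type with default values. The sketch below models a subset of the Table 5.1 items as a Python dataclass. It is not the NVBOTS Django implementation; the class name, field names, and the in-memory counter standing in for a database sequence are all hypothetical.

```python
import itertools
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

_event_numbers = itertools.count(1)  # stand-in for a database sequence


@dataclass
class FailureEventRecord:
    # Manual (required) items, entered by the technician.
    hardware_number: str
    failure_or_issue: str            # "failure" or "issue"
    symptoms: List[str]              # standardized symptom names
    details: str
    fixed: bool
    # Autofilled items, populated by the system rather than typed in.
    event_number: int = field(default_factory=lambda: next(_event_numbers))
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    # Optional items.
    part_numbers: List[str] = field(default_factory=list)
    primary_cause: Optional[str] = None
    notes: str = ""

    def __post_init__(self):
        # Reject records that break the standardized form.
        if self.failure_or_issue not in ("failure", "issue"):
            raise ValueError("failure_or_issue must be 'failure' or 'issue'")
        if not self.symptoms:
            raise ValueError("at least one standardized symptom is required")


record = FailureEventRecord(
    hardware_number="H17",
    failure_or_issue="failure",
    symptoms=["Extruder Jam"],
    details="Jam occurred mid-print; filament stopped feeding.",
    fixed=True,
)
print(record.event_number, record.timestamp.isoformat(), record.symptoms)
```

The technician supplies only the manual items. The event number and timestamp are filled in automatically, which removes two sources of typos.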
5.2
NVBOTS Case Study
This thesis and subproject were conducted in support of NVBOTS and the work was
intended to establish a framework for an ideal Failure Mitigation Strategy for the NVPro.
Initially NVBOTS was tracking some of their failures but their tracking systems were inefficient
and lacking in many areas. Table 5.2 details the tracking status of the old systems versus the
new system with regard to the ideal tracking items. Improvement can be clearly seen as the new
system has been designed to comply with the ideal FMS tracking methodology. Details on the
old tracking systems and the new and improved tracking system can be found in the immediately
following sections.
Tracking Item | Entry Type
--- | ---
Failure Event # | Auto
User Name | Auto
Date/Time | Auto
Firmware Rev # | Auto
Software Rev # | Auto
Location of Printer | Auto
Current Build | Auto
Current Material | Auto
Software Data Dump | Auto
Environmental Conditions | Auto
Travel Time | Auto
Travel Distance | Auto
Hardware # | Manual
Failure or Issue | Manual
Symptoms | Manual
Details on Failure Event | Manual
Part #s | Optional
Fixed? | Manual
Fix Details | Manual
Time Spent on Diagnosis | Manual
Time Spent on Repair | Manual
Primary Cause | Optional
Secondary Causes | Optional
Photos/Videos | Optional
Notes | Optional

[The per-row status entries of the comparison columns for the old systems and the new system
(values such as Some, Soon, and Future) are not recoverable from the extracted layout.]
Table 5.2: Tracking systems comparison
Note the tracking improvement over the multiple old tracking systems
5.2.1
Original Tracking Systems
Initially there were six different tracking systems being used at NVBOTS. The number
of tracking systems grew organically as the company expanded. NVBOTS knew that tracking
was important but new systems were created over time and they had separate systems for
different failure locations. Each system had its own layout and there was no communication or
ability to merge data between the multiple systems. Each tracked a different set of items and
there was very little standardized language or codes. The data collected from event to event was
very inconsistent which made it difficult to fully understand each failure event and even tougher
to merge all of the data into one system. The six tracking systems being used were:
- Google Docs In-house Operation Failures spreadsheet
- Google Docs Field Operation Failures spreadsheet
- Google Docs Assembly Notes spreadsheet
- GitHub tracking system
- Django Staging server
- Django My server
The three Google Docs spreadsheets all had a different format and tracked different
items. Some of these spreadsheets were not recording who entered the data and they were not
being detailed and objective in documenting the events. The GitHub tracking system was being
used more as a work list than a proper tracking system. The major drawback to GitHub was that
a majority of the time the users would only record a unique failure type once. Subsequent failure
events of the same type would likely not be recorded as the item was already on the GitHub list
for future work. This method throws away valuable failure data that could be used to track how
often a specific event type occurs as well as the total cost to the company from those untracked
failures. Moreover, root cause analysis is more successful with multiple data sets for the same
failure type. Determining the root cause from one event is much more difficult and uncertain
than concluding on a cause of a failure event type with multiple event data sets. Lastly the two
Django servers were better at automatically capturing some tracking items but the systems were
heavily oriented towards failure cause instead of objectively documenting the symptoms and
conditions.
Overall the systems did track over 200 failure events with 50 unique failure types but the
data transfer and processing for analysis was laborious and largely manual. Standardization of
data was applied retroactively, while missing items were filled in with best estimates by the
engineering team and the author. Many missing items were not able to be estimated and thus the
data set is still incomplete.
The initial situation of NVBOTS tracking systems highlights the
need for a single proper failure tracking system. Figure 5-2 shows a collage of the multiple
tracking systems originally employed by NVBOTS.
Figure 5-2: Old tracking systems
Promoted non-standard language and near impossible data integration
5.2.2
New Tracking System
The new FMS tracking system developed in this project and now in use by NVBOTS is
based on the ideal failure tracking methodology.
The tracking system still uses the Django
server architecture, but has been completely revamped. Most of the tracking items are already
added to the system, and many have been automated, with plans to include all of the
recommended tracking items as soon as possible.
A few of the more complex automation
features will be added once developed. The new system is easy to use, has a clean layout, and
documents the symptoms and conditions objectively using standardized language. The Django
system is accessible anywhere with connection to the internet to allow for easy access and timely
recording of failure events.
The system is only as good as its users. Because of this the team at NVBOTS has been
instructed on the complete FMS and how to properly use the tracking system to its full potential.
One big change from before on the human side is that the team now understands why it is
important to track every single failure event. Hopefully this leads to a culture of extremely high
failure event tracking percentages.
NVBOTS is also now fully aware of the concept of
objectively documenting first, before thinking about causes.
This will ensure that data is
collected in a straightforward manner with little need for interpretation.
The failure tracking
knowledge base at NVBOTS has increased in the past few months, which will only further
improve the already bolstered tracking system via increased and improved user input.
Although this was advised against, NVBOTS is currently using two tracking systems due to
technical issues with how the 3D printers communicate with the main servers. Essentially they
have two identically revamped Django tracking systems: one tracks failure events of in-house
research and development 3D printers, while the other tracks all other 3D printers. One
specific issue to watch out for is non-standardized language between the two systems. When
NVBOTS is able to integrate the two systems into one unified system in the future, caution must
be taken to avoid language mismatch.
Also, failures found in the research and development
printers may not be indicative of those found in the production models.
This will force
NVBOTS to include another tracking item when merging the two systems, in order to distinguish
between research and development units and production units during analysis.
Large
improvements have been made to the failure tracking system at NVBOTS, but there is still more
work to be done with respect to achieving the complete FMS tracking system detailed earlier.
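The two points above, a shared standard vocabulary and an extra tracking item that separates research and development units from production units, can be pictured with a minimal Python sketch. This is not the NVBOTS Django code. The class names, field names, and the short vocabulary list (borrowed from failure types that appear in the event list) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class UnitType(Enum):
    """Hypothetical tracking item separating R&D printers from production printers."""
    RESEARCH_AND_DEVELOPMENT = "research_and_development"
    PRODUCTION = "production"


class FailureType(Enum):
    """Hypothetical controlled vocabulary; a real list would be much longer."""
    EXTRUDER_JAM = "Extruder Jam"
    REMOVAL_MOTOR_STALL = "Removal Motor Stall"
    SHIFT_IN_XY_PLANE = "Shift in XY Plane During Print"


@dataclass(frozen=True)
class FailureEvent:
    printer_id: str
    failure_type: FailureType
    unit_type: UnitType
    created: datetime


def parse_failure_type(label: str) -> FailureType:
    """Accept only labels from the shared vocabulary, ignoring case and outer spaces."""
    cleaned = label.strip().lower()
    for failure_type in FailureType:
        if failure_type.value.lower() == cleaned:
            return failure_type
    raise ValueError(f"Non-standard failure type: {label!r}")


def merge_systems(rnd_events, production_events):
    """Combine events from two tracking systems into one list, oldest first."""
    return sorted([*rnd_events, *production_events], key=lambda e: e.created)
```

Rejecting any label outside the shared vocabulary at entry time is one way to keep the two systems from drifting apart before they are merged. Because each record carries its unit type, the merged data can still be filtered by research and development versus production during analysis.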
Figures 5-3 and 5-4 show the current welcome screen and failure event list, which are designed to be
simple and easy to use. Figure 5-5 details the event submission page that has been designed to
minimize the amount of manual entry while ensuring a common standard language is used.
Figure 5-3: Django welcome screen
Note the simple layout with intuitive and easy to find
"Add" and "Change" failure event buttons
[Screenshot of the Django failure event list: 26 failure events, each labeled by printer and
failure type (for example, "H51 Shift in XY Plane During Print failure"), with the date created
and columns for determined cause, service required, and fixed.]
Figure 5-4: Django failure event list
Figure 5-5: Django event submission page
Reduces manual entry required via autofill and promotes use of standard language
Chapter 6
Failure Analysis
The failure analysis component of the ideal FMS processes all of the data collected from the
failure tracking system to determine which failure types have the largest impact on the product
and company (see Figure 6-1). This allows a company to focus on the top contributing failure
types first, efficiently reducing failure costs while improving their product. The failure analysis
process results in the automatic prioritizing of failure types, via the total cost model, for root
cause investigation. Root cause analysis is then performed on the top priority failure types to
determine the appropriate failure resolution method. Failure analysis also allows for
visualization, statistical evaluation, and a clear and broad understanding of the failure landscape
of a given product or company. Below, the concepts mentioned in Section 4.3.2, along with others,
will be expanded upon detailing the ideal failure analysis methodology. This methodology is
intended to provide a foundational framework that can be applied to any company interested in
establishing a strong FMS.
Figure 6-1: FMS failure analysis
The second step in the FMS framework
6.1
Detailed Methodology
Prioritizing Failure Types
The main objective of the failure analysis step is to calculate which failure types create
the greatest cost or impact to the company. This is a large concern for companies as they want to
focus on solutions that will have the largest impact first. The failure analysis prioritizes failure
types by using a total cost model. The total cost model objectively calculates the total cost of all
failure types to date by summing up the costs of individual failure events based on hardware
costs, labor costs, and customer impact costs. This allows the company to see which failure
types need to be addressed first in order to eliminate the largest impacts. All data for total cost
should be included in the failure event form allowing for the calculations and prioritization to be
automatically performed in real time after each event submission to the failure tracking system.
This eliminates the need for manual processing or making decisions based on intuition. Which
failure type to address first is now based on facts, not subjective feel. More details explaining
the total cost model can be found in Section 6.2.
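The prioritization step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NVBOTS implementation: the `EventCost` record and its field names are hypothetical, and each event is assumed to already carry its hardware, labor, and customer impact costs from the failure event form.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EventCost:
    """Cost record for one failure event (field names are illustrative)."""
    failure_type: str
    hardware: float
    labor: float
    customer_impact: float

    @property
    def total(self) -> float:
        # Total cost of a single event is the sum of its three sub-costs.
        return self.hardware + self.labor + self.customer_impact


def prioritize(events):
    """Return (failure_type, total_cost) pairs, most costly failure type first."""
    totals = defaultdict(float)
    for event in events:
        totals[event.failure_type] += event.total
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling `prioritize` again after every new submission is one simple way to obtain the real-time ranking the text calls for, since the ordering depends only on the recorded costs.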
Root Cause Investigation
After prioritization, the top cost contributing failure types are investigated for root causes.
Root cause analysis can be performed using one of the many popular methods, including but not
limited to:
- Cause and Effect (Fishbone or Ishikawa Diagram)
- Apollo Root Cause Analysis
- Tree Diagram
- Kepner-Tregoe Approach
- Five Whys
The Cause and Effect Diagram method involves drawing a fishbone diagram with branching
lines, signifying each of the possible cause categories such as materials, environment, machines,
measurement, et cetera.
Further lines are drawn sprouting off the category lines, marking
possible causes within each category.
The Apollo Root Cause Analysis involves asking the
question "why?" to a problem or effect. The causal answers are twofold, always consisting of
an action and a condition. This question is asked again and again until there are no more causal
answer pairings. The Tree diagram method is primarily used to narrow down to a specific cause
when the general issue is already known. It involves drawing a tree diagram with two or more
branches spurring off the trunk and subsequent branches forming off the larger branches, with
each level becoming more and more specific. The Kepner-Tregoe Approach is a method based
on selecting the most probable or best known cause. It involves rating a list of potential causes
and then testing the most probable cause. The Five Whys method is self-explanatory in that it is
performed by asking the question "why?" five successive times, resulting in the root cause being
found in five levels or fewer. These methods can be performed by a single individual but they are
usually more effective when performed by a team [25]. For more information on how to perform
a root cause analysis and details on each method please see Otegui [26], Gano [27], Williams
[28], and Sarkar [29]. The root cause analysis will either find a cause that when removed will
remove the failure or the cause will remain undetermined. Thankfully the failure resolution step
is able to handle both failure types, those with known root causes and those with unknown root
causes. Failure resolution will be further detailed in Chapter 7.
Statistical Insight
Failure analysis provides statistical insight into the failure landscape of a given product or
company. The data collected in failure tracking is organized and processed so that trends and
important connections become easily visible. It allows for quick sorting and filtering of data by
attributes such as:
- total cost
- average cost per failure event
- number of events
- failure type
- issue versus failure
- cause type
- when found
- part cost
- repair time
Graphs and visuals can easily be generated to better display the statistical findings from the data.
Some popular charts that allow for a deeper comprehension of the failure tracking data include:
- number of failures over time
- total cost of failure event types
- failure subassembly histogram
- failure cause type histogram
- value added when found
The statistical insight provided by failure analysis allows engineering and management to fully
understand where they currently stand with product failures and enlightens them to potential
areas to focus on for improvement.
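As a rough illustration of the sorting and filtering described in this subsection, the sketch below counts events by any attribute and computes the average cost per event for each failure type. The dictionary keys (`failure_type`, `cause_type`, `cost`) are assumed names for illustration and are not taken from the NVBOTS tracking system.

```python
from collections import Counter, defaultdict


def count_by(events, attribute):
    """Count failure events per value of one attribute (e.g. 'cause_type')."""
    return Counter(event[attribute] for event in events)


def average_cost_by_type(events):
    """Average cost per failure event for each failure type."""
    sums = defaultdict(float)
    counts = Counter()
    for event in events:
        sums[event["failure_type"]] += event["cost"]
        counts[event["failure_type"]] += 1
    return {name: sums[name] / counts[name] for name in sums}
```

The counts returned by `count_by` are the raw bar heights of a histogram such as the cause type or subassembly charts listed above, so the same function can feed several of the suggested visuals.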
6.2
Failure Economics: Total Cost Model
The metric used to prioritize and down select failure types for root cause analysis, and
later failure resolution, was total cost. As most businesses exist to generate profit, cost is the
vehicle that transmits the impact of failures to a company. Cost is also very advantageous to use,
as the value or impact of most anything can be converted into a cost or a credit for universal
comparison and mathematical manipulation. The total cost is the summation of all the costs of
individual failure events. Thus the total cost of a specific failure type is the summation of the
cost of all the failure events of that failure type. Cost can be broken down into three main
sub-costs: hardware cost, labor and time, and customer impact.
Labor and time can be further
divided into travel time and cost, diagnosis time, and repair time. These cost factors have been
developed for NVBOTS but are generic enough to apply to most companies with minimal
alteration. The total cost model should be designed to automatically update in real time as new
failure events are logged in the failure tracking system. All costing data will be provided in the
new failure tracking system, allowing the company to always be informed as to which failure
types are of top priority. A total cost calculation example is provided in Appendix A for better
understanding.
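The cost breakdown described in this section can be written compactly as follows. The symbols are notation introduced here for convenience and do not appear in the original analysis.

```latex
C_{\text{event}} = C_{\text{hardware}}
  + \underbrace{C_{\text{travel}} + C_{\text{diagnosis}} + C_{\text{repair}}}_{\text{labor and time}}
  + C_{\text{customer impact}},
\qquad
C_{\text{type}} = \sum_{e \,\in\, \text{events of that type}} C_{\text{event}}(e)
```

Failure types are then ranked by $C_{\text{type}}$, largest first.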
Hardware
The hardware cost is the cost of the parts and consumables needed to fix the given
failure. NVBOTS leases and services all of their 3D printers, thus the hardware cost is the cost
required to source or manufacture each part.
Costs for parts and consumables were taken
directly from the NVBOTS parts cost list.
Labor and Time
As noted earlier labor and time can be broken down into travel time and cost, diagnosis
time, and repair time. Travel time and cost was estimated at a flat rate for all historical failure
events as this data was not recorded in the previous failure tracking systems. The flat rate for
travel was calculated assuming a round trip time of two hours and a round trip distance of 80
miles. The hourly skilled technician labor rate used in calculations was $20 per hour and the
mileage cost rate was $0.575 per mile. These conservative estimates were based on the fact that
currently all NVBOTS customers are local to the Greater Boston area and the service is based
out of NVBOTS headquarters in the Seaport district of Boston.
After surveying the service
technicians, it was determined that the average trip length was approximately one hour each way.
The average mileage was calculated using the distances between NVBOTS headquarters and
their customer locations via Google Maps [30]. The assumed skilled technician labor rate was
averaged from labor rates found on PayScale [31]. The mileage cost rate was derived from the
United States Internal Revenue Service standard deductible cost rate for using an automobile for
business in 2015 [32]. Using the ideal FMS failure tracking system, exact round trip mileage and
travel time will be recorded, allowing for more accurate travel costing in the future.
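The flat travel rate can be reproduced from the figures above. The short sketch below assumes the flat rate is simply technician time plus mileage, which the text implies but does not state explicitly; the function and constant names are illustrative.

```python
LABOR_RATE_PER_HOUR = 20.00    # skilled technician rate used in the text
MILEAGE_RATE_PER_MILE = 0.575  # 2015 IRS standard mileage rate used in the text


def travel_cost(round_trip_hours: float, round_trip_miles: float) -> float:
    """Travel cost of one service run: technician time plus mileage."""
    return (round_trip_hours * LABOR_RATE_PER_HOUR
            + round_trip_miles * MILEAGE_RATE_PER_MILE)


# Assumed historical trip: two hours and 80 miles round trip.
flat_rate = travel_cost(round_trip_hours=2, round_trip_miles=80)
print(f"${flat_rate:.2f}")  # $86.00
```

Under that assumption, each historical service run carries $40 of travel labor and $46 of mileage. Once exact mileage and travel time are recorded, the same function can be called with the actual values for each event.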
Diagnosis time is the time incurred assessing the failure and determining the appropriate
fix. Repair time is the time spent by the technician during repair, including teardown and
rebuild. Diagnosis and repair times for all historical failures were estimated by the engineering
team and field technicians at NVBOTS during a meeting with the author as they were not
recorded in the past. The same hourly labor rate of $20 was used for calculating diagnosis and
repair costs. Moving forward, estimates will no longer be needed as the diagnosis and repair
times will be tracked by the new tracking system.
Customer Impact
Customer impact is the most vague and intangible cost associated with the total cost
model. Careful thought and time should be spent considering the cost of customer impact as it
usually has a significant impact on the total cost. Customer impact is the value of lost trust and
satisfaction of a customer in the product converted to a cost. For a startup company with a
recurring lease business model, customer impact is of the utmost importance as customer
satisfaction and trust are needed to ensure renewal of the lease. For NVBOTS the customer
impact was calculated for the two failure event categories: issues and failures. The director of
supply chain and finance along with the engineering team and the author determined that the
average customer was likely to not renew their lease if they had more than one issue every other
week or more than two failures within a year.
It was agreed upon that these were the best
estimates for now but the values should be updated as more historical data is generated.
A customer survey could also shed light on more accurate customer impact costs.
The value of an NVBOTS customer over the intended five year long-term contract was
estimated to total $25,000 taking into account production and service costs [33]. To calculate the
cost per issue, the value of a lost customer ($25,000) was divided by the number of issues over
five years estimated to cause a lost customer (130 issues). Thus, an issue was calculated to cost
$192 per event in customer impact. Similarly, to calculate the cost per failure, the value of a lost
customer ($25,000) was divided by the number of failures over five years estimated to cause a
lost customer (10 failures). Thus, a failure was calculated to cost $2,500 per event in customer
impact. It can be clearly seen that customer impact carries an immense cost for NVBOTS due to
their business strategy and customer base. This highlights the importance of customer impact
and the great need to reduce the number and severity of failure events occurring in the hands of
the customer. There is no customer impact if the failure event is caught in-house.
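The per-event customer impact figures follow directly from the renewal thresholds given earlier. The sketch below retraces the arithmetic. The 52-week year is an assumption made here to arrive at the 130-issue figure, and the variable names are illustrative.

```python
CUSTOMER_VALUE = 25_000  # estimated five-year value of one customer, in dollars
CONTRACT_YEARS = 5
WEEKS_PER_YEAR = 52      # assumption used to reproduce the 130-issue figure

# Thresholds beyond which the average customer was judged unlikely to renew.
issues_per_year = WEEKS_PER_YEAR / 2  # one issue every other week
failures_per_year = 2

issues_to_lose_customer = issues_per_year * CONTRACT_YEARS      # 130
failures_to_lose_customer = failures_per_year * CONTRACT_YEARS  # 10

cost_per_issue = CUSTOMER_VALUE / issues_to_lose_customer
cost_per_failure = CUSTOMER_VALUE / failures_to_lose_customer

print(round(cost_per_issue), round(cost_per_failure))  # 192 2500
```

Because both values scale linearly with the estimated customer value and inversely with the thresholds, updating any of the three inputs as more historical data or survey results become available immediately updates the per-event costs.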
6.3
NVBOTS Case Study
Failure analysis was performed on historical data collected from the six failure tracking
systems at NVBOTS. Substantial manual work was done to standardize the data language,
merge the data sets, and fill in missing items where possible. A spreadsheet was developed to
organize and analyze the data. See Appendix B for a screenshot of a representative portion of
the failure analysis spreadsheet. As of July 10th, 2015, 204 failure events were tracked, including
50 unique failure types, accounting for over $75,000 in tracked costs.
The true cost of all
failures is most likely higher as many historical failures were not documented.
The data was first manipulated to categorize the failure events by subassembly.
Figure 6-2 shows the resulting histogram of failure events by subassemblies.
The gantry
subassembly shows the highest number of failure events, most likely because of the heavy
scrutiny applied to this subassembly at NVBOTS. This is the only subassembly that is measured
on the coordinate measuring machine (CMM) before and after assembly. Luckily this
subassembly is early in the final assembly order. This allows for failures to be caught and
addressed early in the assembly process, before most of the value has been added to the 3D
printer. The earlier a failure is found in the manufacturing process, the less backtracking and
rebuild is required, thus avoiding greater costs. Of more concern are the values for the extruder
and electronics subassemblies. These two subassemblies are added to the printer much later in
the final assembly process and incur greater costs if found after final assembly. A key takeaway
from the chart is that NVBOTS should implement subassembly testing of the gantry, extruder,
and electronics before the final assembly procedure. This will limit the number of failures that
make their way into the final assembled product allowing them to be caught much earlier in the
value added chain, thus reducing impact and cost. These three subassemblies account for a
majority of the failures and thus they require focused attention.
[Histogram of issues and failures by subassembly. X-axis: Subassembly (left to right, in order
of final assembly); legend: Issues, Failures.]
Figure 6-2: Subassembly Failure Histogram
Failure analysis can also show which departments of a company are responsible for
failure occurrences. The NVBOTS data was processed manually and coded for probable cause
type. Each failure event was listed as most likely being caused by design, assembly, part quality,
extreme use, unknown, or other. A histogram was generated to display these results visually and
is shown in Figure 6-3. One benefit of this graph is that it is able to show which departments
within a company are responsible for past failures and allows management to direct support to
those departments in need. Design failure events most likely come from engineering and design,
while production and assembly is most likely responsible for assembly cause type events. Part
quality failure events should be handled by incoming inspection and supply chain, while extreme
use events are most likely caused by miscommunication between sales and the customer or a
result of poor customer training. These department-to-cause type relationships will not apply to
every failure event but in general they hold true.
The histogram shows that engineering,
production, and supply chain will need focus and support moving forward in order to decrease
the number of failure events in their respective areas.
[Histogram of issues and failures by cause type. X-axis: Cause Type (Design, Assembly, Part
Quality, Extreme Use, Unknown, Other); legend: Issues, Failures.]
Figure 6-3: Failure Event Cause Type Histogram
Another valuable graph resulting from failure analysis diagrams when failure events
occur along the value chain. As mentioned before, the earlier a failure is caught the less it will
cost the company. Figure 6-4 shows when failure events are found with respect to the percent of
value added to the product at the time of discovery. The percent value added is based on the
approximation that the total cost of the NVPro is $2000 and 25 hours of labor are needed to
complete one build. At the time of analysis, these were the best estimates provided by the
director of supply chain and finance, operations manager, and CEO.
This graph is an eye opener to the company. An overwhelming majority of the failure
events occur after 100 percent of the value has been added to the 3D printer. These failures are
found during the one-week burn-in test prior to shipping the product off to the customer. On a
positive note, it is good that they are catching these failures before the product arrives at the
customer, but since 100 percent of the value has been added, costly teardown and rebuild are
needed to fix the failure. Another issue with failures occurring once the printer is complete and
in testing is that it impacts the delivery date to the customer. The failure will delay the testing as
it needs to be assessed and repaired, inevitably pushing back the ship date to the customer. This
is not the first impression any company wants to make.
The spike at 40 percent is easily
explained by NVBOTS' meticulous CMM inspection of the gantry and Z axis subassemblies.
This is a positive step in that these failures are caught early in the value chain.
The most
troubling detail of this histogram is the fact that over one quarter of the failure events have
occurred in the hands of the customer. Customer impact is a dominant factor in the total cost
model for the NVPro causing these failure events to have the greatest cost and impact. As a
startup looking to grow in popularity and sales, product reliability and company reputation are
essential. A takeaway from this histogram is that more subassembly testing is needed to catch
failures earlier in the value chain, preferably before final assembly. Another takeaway is that too
many failures are occurring at the customer.
NVBOTS needs to use the FMS to first and
foremost lower the values on this histogram and secondly shift the values towards the left.
Through the use of the ideal FMS they will be able to lower the number of failures as well as
reduce the average cost per failure.
[Histogram of issues and failures by percent value added at the time of discovery. X-axis:
Percent Value Added; legend: Issues, Failures.]
Figure 6-4: When Failures are Found with Respect to Value Added
From the failure analysis spreadsheet individual costs and total costs were calculated to
prioritize the failure types. This allowed NVBOTS to view which failure types had the greatest
impact on their company and needed to be addressed first. Figure 6-5 shows the complete chart
as well as a magnified section highlighting the most costly failure types. The long thin tail to the
right is made up of failures and issues that were caught in-house or issues that rarely occurred
out in the field. It shows that more than 50 percent of the failure types can safely be
ignored for now. These failure events are both rare and inexpensive and do not currently need
attention. The failure event types on the left are those that are most important to the company as
they historically have the greatest total costs.
These events are either very expensive to fix,
occur often, or both. Most of these event types have occurred at least once at the customer due
to the high customer impact costs of NVBOTS.
The super jam⁴ failure type is by far the most costly event, accounting for over $18,000.
It occurs quite often and it is always a full failure. When at the customer this automatically
requires a service run by a trained technician, but more importantly it will incur a $2,500
customer impact cost for each event. The super jam is the top priority for NVBOTS. The next
greatest total cost event is the printer board failure. This failure is quite different from the super
jam in that it rarely occurs at the customer but the hardware is expensive and there have been
many failure events. This can be seen as the average cost per event is relatively low. Frame
alignment also falls into the same high number of occurrences category as printer board failure.
After the first 15 to 20 failure types the total costs really drop off. Root cause analysis
should be performed on the top 15 to 20 failure types to determine the root cause, if possible.
Another item to note is that seven of the top 20 event types are in the electronics subassembly.
Until the root cause of each can be determined, this is an easy subassembly to test before final
assembly takes place. This can help catch some of the failures within this costly subassembly
before it is integrated into the whole 3D printer and hopefully catches some failures that may
have occurred later at the customer.
This graph is very important to the FMS and should be
automatically updated in real time as new failure events are logged.
This graph should be
brought up in a companywide meeting at least once a month to inform everyone of the top
priorities and update management of progress on root cause analysis and resolution actions
underway.
⁴ A super jam is a filament jam in the extruder head that cannot be fully unjammed by
simply pulling the filament back out, usually due to the filament breaking, ending, or melting
prematurely.
[Figure 6-5: Total Cost of Failure Events. Full chart of total cost and average cost per event
by failure type, with a magnified detail of the most costly failure types.]
Chapter 7
Multi-Method Failure Resolution
The last step in the linear workflow of the ideal FMS is failure resolution (see Figure 7-1).
Failure event types are fed into the failure resolution step from the root cause analyses
performed during the failure analysis step. Failure resolution is the action step. This is where
failure types are finally mitigated by complete elimination or reduction of impact. Naturally,
elimination sounds like the ideal resolution method but that is not always true. The FMS is
mainly concerned with reducing the impact of the cost of failures to a product or company.
Failure resolution must be based on economics and thus the ideal resolution method is the most
economical one. Failure types can come out of root cause analysis with either a known or
unknown cause. Multi-method failure resolution is robust enough to handle both cases
and provides a framework for selecting and acting upon the most economical failure mitigation
solution. The concepts mentioned in Section 4.3.3, along with others, will be expanded upon
below detailing the ideal failure resolution methodology.
Figure 7-1: FMS failure resolution
The third step in the FMS framework
7.1
Detailed Methodology
Solutions for the Future
Solutions devised in failure resolution should be aimed at reducing failure events and
severity for the foreseeable future. Although quick fixes and temporary stopgaps can help
reduce near-term failures, this framework is designed to develop robust and long-lasting
solutions. Failure resolutions should be well vetted with a proper failure review board meeting.
If the resolution requires a design change, a proper design change review should be conducted to
ensure the change does not negatively impact other aspects of the product.
Like root cause
analysis, these solutions can be devised by a single individual but generally higher quality results
are achieved when devised by a team [25]. It is recommended that an interdisciplinary team be
tasked with addressing all failures that have been down selected by the failure analysis
prioritization.
Economical Solutions
The solutions implemented during failure resolution must be economical.
The main
purpose of the ideal FMS is to limit the cost of failures. Most solutions are not free as they
require engineering support, management oversight, production implementation, et cetera, along
with any physical costs such as new hardware, test equipment, or tooling. Historical cost and
expected future cost of the failure should be compared to the estimated solution cost. In all cases
the solution cost must be less than the expected future cost of the failure for a resolution action to
occur. Otherwise it is not an economically sound solution and further thought should go into
other possible solutions.
Known Cause
When root cause analysis is able to determine the cause of a failure event type the
resolution should eliminate the event type if economical.
This is accomplished via design
change, assembly procedure change, proper handling requirements, supplier change, et cetera.
The estimated cost of the solution should be compared to the expected future cost of the failure
type. If economical, the solution should be implemented to eliminate future failures.
If the solution cost outweighs the expected failure cost, a test may be a more economical resolution.
Screening, burn-in, or stress tests can be implemented early in the value chain to reduce the
number of failures post-test. This allows for potential failures to be brought to the surface as
soon as possible to limit the impact they cause. Catching a failure before it reaches the customer
can be very cost effective.
A majority of the time, elimination of a known cause failure is
economically viable; otherwise a simple test as early as possible is recommended.
Proper
documentation of the resolution method should be kept for future employees to study as lessons
learned from past failures.
Unknown Cause
Sometimes root cause analysis is unable to determine a definitive cause with the
information currently available. When this occurs, an economical test should be devised to arrest
failures early in the value chain. The test will increase the chance of finding the failure earlier,
thus reducing the cost from failure timing. When possible, these tests should be implemented at
or before the subassembly process in order to reduce the amount of backtracking. As in the
known cause case, these tests can have a large impact on cost by reducing the number of failures
that reach the customer.
Even though an effective test may already be implemented, a root cause
investigation should be revisited periodically as new data may help determine the root cause and
possibly lead to the elimination of that failure event type.
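The known cause and unknown cause paths described in this section can be summarized as a small decision function. This is a simplified sketch rather than a procedure taken from the NVBOTS workflow. The function name, parameters, and return labels are hypothetical, and it treats an action as economical whenever its estimated cost is below the expected future cost of the failure type, ignoring the fact that a test usually reduces that cost rather than removing it entirely.

```python
from typing import Optional


def choose_resolution(
    cause_known: bool,
    expected_future_cost: float,
    elimination_cost: Optional[float] = None,
    test_cost: Optional[float] = None,
) -> str:
    """Pick a resolution path; an action is taken only if it costs less than the failure."""

    def economical(cost: Optional[float]) -> bool:
        return cost is not None and cost < expected_future_cost

    if cause_known and economical(elimination_cost):
        return "eliminate"       # design, assembly, handling, or supplier change
    if economical(test_cost):
        return "screening test"  # catch the failure as early as possible
    return "no action yet"       # keep tracking and revisit as data accumulates
```

For example, a failure type with a known cause but an elimination cost above its expected future cost falls through to the screening test branch, which matches the fallback recommended for known cause failures.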
7.2
NVBOTS Path Forward
After root cause analysis is complete on the top 15 or 20 failure event types, down
selected in failure analysis, the failure resolution methodology detailed above should be used to
take action and eliminate or mitigate each failure type. Two of the top 20 failures, idler bearing
seizures and super jams, are already being addressed. For the idler bearing failure, the root cause
was determined to be metal shavings from the old assembly procedure that were fouling the
bearings and leading to premature failure.
This failure has been virtually eliminated with a
simple assembly procedure change to prevent the shavings from entering the needle bearings.
On the other hand, super jams have been investigated for a while, but no single definitive root
cause has been determined. NVBOTS has discovered that most super jams begin as a regular
jam. This has allowed them to design a test, built into the 3D printer, which continuously checks
for a jam. Once a jam is detected, the printer is paused to prevent the jam from developing
further into a super jam. Although this does not prevent every super jam, it has proven quite
effective at reducing the number of super jams and notifies the customer that a regular jam has
occurred. Regular jams are only an issue and can be fixed by the customer, resulting in reduced
cost and impact to NVBOTS.
The company should continue to use the structured failure
resolution approach to address the remaining top ranked failures. Once complete, they should
continue to move down the prioritized list, improving the NVPro one failure type at a time, until
the failures have so little impact that the solutions are no longer economical.
Chapter 8
Summary, Conclusions, and
Recommendations
8.1
Summary
Product reliability, quality, and performance are essential for all companies, especially
high technology manufacturing startups looking to scale-up successfully. Company image and
reputation can be heavily impacted by product failures. The cost of failures in-house and at the
customer will only increase as a company scales up. Failure mitigation is critical to the success
of a product and its company throughout the entire product lifecycle. This thesis proposes an
ideal Failure Mitigation Strategy that provides a methodology and framework with linear process
workflow and easy to follow steps that lead to the reduction of cost from failures. Establishing a
strong FMS will assist the company in learning from their failures while reducing the total
number and average cost of failure events. The ideal FMS was tailored to and implemented at
NVBOTS in Boston, Massachusetts, as a case study.
The ideal FMS consists of failure tracking, failure analysis, and multi-method failure
resolution. Failure events are first observed and properly documented via the failure tracking
system. Failure tracking data is then processed during failure analysis using a total cost model to
automatically prioritize and down select the most impactful failure event types.
Root cause
analysis is then performed on the top priority failure event types. Finally, a robust multi-method
failure resolution methodology uses an economical combination of design and process changes
along with testing to eliminate or reduce the cost of those failures.
Over 200 failure events were tracked, including 50 unique failure event types, accounting
for over $75,000 in costs at NVBOTS.
A unified and improved tracking system was
implemented at NVBOTS along with a powerful analysis framework.
Failure analysis was
performed, prioritizing the failures by total cost, and a failure resolution framework was designed
to implement the solutions to the top priority failure event types. The ideal Failure Mitigation
Strategy offered in this thesis provides NVBOTS and other entities a framework that allows for
full understanding of the current failure landscape as well as a systematic method to reduce the
impact from failures through elimination and mitigation.
8.2
Conclusions
The purpose of the overall MEngM-NVBOTS collaborative project was to introduce
process improvements to match production scale-up at NVBOTS. This was accomplished
through three subprojects completed by the MEngM team. Chawla [11] focused on developing a
framework for incoming part quality control and inspection procedures, the details of which can
be found in his thesis. This thesis focused on implementing a Failure Mitigation Strategy to
reduce the cost of failures while improving the product reliability, quality, and performance.
Finally, Shabbir [12] focused on improving product reliability through systematic failure analysis
and a quantitative understanding of product life through accelerated life testing, the details of
which can be found in his thesis. The following conclusions are drawn from the work performed
for this thesis focused on the FMS.
Importance of Product Reliability
Product reliability is extremely important to the success of a company. Product and
company reputation are heavily impacted by product reliability and performance. This is highly
magnified for high technology manufacturing startups looking to scale-up, as they have little
track record and their reputation is more fragile. Even more so, as NVBOTS leases their printers
they are solely responsible for all service and costs to keep the 3D printers running. Product
failures must be properly addressed to ensure the reliability of the product meets or exceeds the
customers' expectations and limits the cost to NVBOTS.
The FMS presented provides a
structured method for reducing the number and impact of failures and thus increases product
reliability.
Failure Mitigation Strategy
The ideal FMS consists of three steps: failure tracking, failure analysis, and multi-method
failure resolution. This framework provides a structured foundation that can be adapted to fit
many companies.
The presented case study details how the FMS was easily tailored to
NVBOTS and it is now a valuable tool in their daily operations. The idea is not entirely
novel, but it is less widely used than expected and even less often documented. This work intends to
serve as a guide to establishing an FMS in hopes that many companies will benefit from its ability
to methodically reduce the impact of failures. FMS is an effective and efficient cost reduction
tool that pays for itself and more.
Cost of Failure Timing
The impact of when failures occur was highlighted throughout this work.
Identical
failures can have drastically different costs, solely due to when they occur. A failure that is
caught early in the value added chain is easy to fix, and has limited impact past its subassembly.
A failure occurring at the customer is more expensive and labor intensive to fix and the impact is
much further reaching, as the product failed to meet the customers' expectations.
It is
economically advantageous to catch failures as soon as possible in the value chain if they cannot
be eliminated altogether.
Cost of Customer Impact
Customer impact can carry significant costs, as seen in the NVBOTS case
study. These costs are not always so dominant, but in most cases customer impact costs are
highly significant and should not be overlooked. Like most startups, NVBOTS was surprised to
find out how much customer impact was truly costing them. This cost is hard to quantify and
sometimes difficult to see, warranting effort and time spent to estimate it as best the company
can.
As the cost of customer impact can be significant, management should take it into
consideration when making decisions for the future.
Document Everything, Objectively
The importance of documenting every failure was highlighted by the NVBOTS case
study.
Failures that are untracked are a loss of potential savings and product improvement.
Failure events generate a lot of valuable data that is relatively easy to capture if a proper failure
tracking system is in place. Every failure event must be recorded no matter how big or small,
significant or insignificant, unique or common, issue or failure; every individual event must be
logged.
The events must be documented objectively, taking note to record symptoms and
conditions before contemplating causes. This will ensure the company is able to capitalize on
the valuable data hidden within failures.
Single Tracking System with Standardized Language
A single unified tracking system with standardized language is critical to successful
failure tracking and analysis.
The standardization allows for efficient and automated data
processing. Without it, tedious manual work is required to process the data and data integrity
can be compromised in translation. This was exemplified by NVBOTS's six separate failure
tracking systems, all tracking different items and lacking a common language. One system and
one language will allow for efficient documentation, communication, and data analysis.
Total Cost Model
The total cost model was developed as an objective method to prioritize failure event
types for mitigation and enlighten the company to the true cost of their failure events. This
model is widely applicable as almost anything can be converted into a cost for comparison and
mathematical manipulation. Care must be exercised in developing the cost rates and other inputs
to the model to ensure accurate results. Along with a proper failure tracking system, the total
cost model can be automated to run after every new failure event submission. This automation
will allow for the top priority list to be updated in real time, allowing for a deeper and more
complete understanding of the current failure position of the company.
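The prioritization step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each tracked event has already been reduced to a failure event type and a total cost; the function name `prioritize` and the sample events are hypothetical and are not NVBOTS data.

```python
from collections import defaultdict


def prioritize(events, top_n=20):
    """Rank failure event types by summed total cost, highest first.

    events: iterable of (event_type, total_cost_in_dollars) pairs.
    Returns a list of (event_type, summed_cost, occurrences).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event_type, total_cost in events:
        totals[event_type] += total_cost
        counts[event_type] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, cost, counts[name]) for name, cost in ranked[:top_n]]


# Illustrative events only (hypothetical values, not taken from the case study).
events = [
    ("super jam", 2600.00),
    ("idler bearing seizure", 110.00),
    ("super jam", 2550.00),
    ("printerboard failure", 110.00),
    ("idler bearing seizure", 95.00),
]

for name, cost, n in prioritize(events, top_n=3):
    print(f"{name:<24} {n:>3} events  ${cost:>10,.2f}")
```

Re-running a function of this kind after every new submission is one way the top priority list could stay current, since both the number of occurrences and the cost per event feed the ranking.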
Super Jams and Electronics
Specifically for NVBOTS, super jams were calculated as the most costly failure type to
date, totaling over $18,000. The total cost of super jam events was more than double that of the
next most expensive failure event type. The immense total cost is being driven by the number of
occurrences and customer impact as many super jams have occurred at the customer. NVBOTS
is aware of this and is working towards reducing the impact via continuously monitoring for
known symptoms and powering down the extruder before a super jam can occur. Among the
rest of the top 20 prioritized failure types, seven involve the electronics subassembly.
Root
cause analysis should be performed to determine the best resolution methods. Until root cause
and resolution are complete, a simple subassembly test can be implemented to screen for failures
early in the value added chain.
Multi-Method Failure Resolution
Failure resolution provides a robust and capable framework that can handle known cause
and unknown cause event types.
Failures with an indeterminate cause are treated just as
importantly as their counterparts with known causes, as large cost savings are still possible. The
framework details how to economically resolve failures through multiple methods, including design
changes, new assembly procedures, and testing. Finally, solutions are implemented, reducing or
eliminating the cost of failures and realizing the goal of the complete FMS.
8.3
Recommendations for Future Work
The following recommendations are tailored for NVBOTS but are generally applicable to
all companies. Automating the failure tracking and failure analysis systems should be completed
as soon as possible in order to fully realize all of the benefits of implementing the Failure
Mitigation Strategy. Establishing this system early on and ingraining it into the company culture
will allow for fewer growing pains while scaling up and prevent the cost of failures from scaling
up as well. The last two recommendations complement the FMS and will bring about proper
engineering rigor and structure, commonly found in successful and well established companies,
to NVBOTS.
8.3.1
Automate Failure Tracking System
The failure tracking system should be automated where possible to minimize manual
technician input and reduce the risk of entry error. All of the recommended tracking items in
Table 5.2 should be included in the new system in order to capture a complete snapshot of the
failure event and provide the required data for failure analysis.
The travel time and travel
distance should be automatically populated from real time data requested from Google Maps at
the time of the service. The macro location of the printer, along with the location of NVBOTS
headquarters should allow for this to be sourced, once the hardware number is manually entered.
A complete and searchable parts list with part numbers and names should be integrated into the
tracking system for technicians to quickly document which parts need repair or replacement. All
of these recommendations will allow for more efficient and accurate data entry, and accurate
data leads to accurate results.
8.3.2
Automate Failure Analysis System
Automation of the total cost model within the failure analysis system is made possible by
the new tracking system, as all data required for total cost calculations is now being captured.
Automatic transfer of data from the tracking system to the analysis system, or a merger of the
two systems, should be implemented in order to realize the capability of real time updates to the
failure type top priority list. Once a new data entry is submitted, the data should automatically
transfer to the analysis system where total cost is calculated, automatically adjusting the top
priority graph in real time. Data over 12 months old should be automatically removed from the
displayed failure priority graph to keep the data set fresh and show a more representative
depiction of the current failure landscape. All data should be archived but only the most recent
12 months should be displayed.
Along with the new failure tracking system and items, this
automation will drastically reduce the amount of manual labor required for failure analysis,
allowing the engineering team to focus on other tasks at hand.
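The rolling display window recommended above might look like the following Python sketch. It assumes each event record carries a date, approximates 12 months as 365 days, and uses hypothetical field names (`type`, `date`, `total_cost`) and sample records; none of these details come from the NVBOTS systems.

```python
from datetime import date, timedelta


def split_by_age(events, today, window_days=365):
    """Separate events shown on the priority graph from archive-only events.

    Nothing is deleted: the caller keeps the full list as the archive.
    The 365-day window is an assumed stand-in for "12 months".
    """
    cutoff = today - timedelta(days=window_days)
    displayed = [e for e in events if e["date"] >= cutoff]
    archived_only = [e for e in events if e["date"] < cutoff]
    return displayed, archived_only


# Hypothetical records for illustration only.
events = [
    {"type": "super jam", "date": date(2014, 6, 15), "total_cost": 2600.00},
    {"type": "super jam", "date": date(2015, 3, 2), "total_cost": 2550.00},
    {"type": "printerboard failure", "date": date(2015, 7, 20), "total_cost": 110.00},
]

displayed, archived_only = split_by_age(events, today=date(2015, 8, 7))
print(len(displayed), "displayed;", len(archived_only), "archive only")
```

Filtering at display time, instead of deleting old rows, keeps the complete history available for later root cause investigations.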
8.3.3
Establish Failure Review Process
Establish a failure review board comprised of an interdisciplinary team with
representatives from each department of the company.
This team will oversee all root cause
investigations and failure resolution actions. The team will review the resolution solutions to
ensure that they are economical and do not negatively impact any other aspect of the product.
This interdepartmental team will own the FMS and create a culture of strict adherence to the
rigor and structure required to operate an effective FMS.
8.3.4
Establish Design Change Review Process
As the company grows, structured processes are vital to maintaining smooth operations
and workflow.
For large design changes, design change review meetings should be held to
ensure the change is beneficial and has been properly vetted against negative impact to other
aspects of the product. This is a great time to get a second set of eyes on a new design to ensure
it will work as intended and provide value to the product.
The same interdisciplinary team
described above should oversee this process, helping to ensure new failure methods are
not introduced into a product via a design change.
Appendix A
Total Cost Example Calculations
The following examples show the total cost calculations of two specific failure events.
Every event's total cost was calculated in a similar manner, via the analysis spreadsheet in
Appendix B.
Failure Event: Broken Removal Blade Mount
Failure/Issue: Failure
Where: Field (at the customer)
Item                   Quantity   Cost/Rate    Calculation        Cost         Total
Hardware                                                                       $5.00
  Blade Mount          1          $5.00        1 x $5.00          $5.00
Time/Labor                                                                     $94.33
  Travel Mileage (mi)  80         $0.575       80 x $0.575        $46.00
  Travel Time (hr)     2          $20.00       2 x $20.00         $40.00
  Diagnosis Time (hr)  0.083      $20.00       0.083 x $20.00     $1.67
  Repair Time (hr)     0.333      $20.00       0.333 x $20.00     $6.67
Customer Impact                                                                $2,500.00
  Failure              1          $2,500.00    1 x $2,500.00      $2,500.00
Total Cost                                                                     $2,599.33
Failure Event: Printerboard Electronics Failure
Failure/Issue: Failure
Where: In-House Testing at NVBOTS
Item                   Quantity   Cost/Rate    Calculation        Cost         Total
Hardware                                                                       $85.00
  Printerboard         1          $85.00       1 x $85.00         $85.00
Time/Labor                                                                     $25.00
  Travel Mileage (mi)  0          $0.575       0 x $0.575         $0.00
  Travel Time (hr)     0          $20.00       0 x $20.00         $0.00
  Diagnosis Time (hr)  0.5        $20.00       0.5 x $20.00       $10.00
  Repair Time (hr)     0.75       $20.00       0.75 x $20.00      $15.00
Customer Impact                                                                $0.00
  Failure              0          $2,500.00    0 x $2,500.00      $0.00
Total Cost                                                                     $110.00
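The two calculations above can be reproduced with a short Python sketch. It is not the analysis spreadsheet itself; the `LineItem` class and `total_cost` function are hypothetical names, and the diagnosis and repair quantities of 0.083 hr and 0.333 hr are read as 5 and 20 minutes, an assumption chosen because it matches the listed costs of $1.67 and $6.67.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    category: str    # "Hardware", "Time/Labor", or "Customer Impact"
    name: str
    quantity: float
    rate: float      # dollars per unit of quantity

    @property
    def cost(self):
        return self.quantity * self.rate


def total_cost(items):
    """Return (subtotals by category, grand total), both rounded to cents."""
    subtotals = {}
    for item in items:
        subtotals[item.category] = subtotals.get(item.category, 0.0) + item.cost
    rounded = {k: round(v, 2) for k, v in subtotals.items()}
    return rounded, round(sum(subtotals.values()), 2)


# Broken removal blade mount, failure in the field (at the customer).
blade_mount = [
    LineItem("Hardware", "Blade Mount", 1, 5.00),
    LineItem("Time/Labor", "Travel Mileage (mi)", 80, 0.575),
    LineItem("Time/Labor", "Travel Time (hr)", 2, 20.00),
    LineItem("Time/Labor", "Diagnosis Time (hr)", 5 / 60, 20.00),  # assumed 5 min
    LineItem("Time/Labor", "Repair Time (hr)", 20 / 60, 20.00),    # assumed 20 min
    LineItem("Customer Impact", "Failure", 1, 2500.00),
]

# Printerboard electronics failure, caught during in-house testing.
printerboard = [
    LineItem("Hardware", "Printerboard", 1, 85.00),
    LineItem("Time/Labor", "Travel Mileage (mi)", 0, 0.575),
    LineItem("Time/Labor", "Travel Time (hr)", 0, 20.00),
    LineItem("Time/Labor", "Diagnosis Time (hr)", 0.5, 20.00),
    LineItem("Time/Labor", "Repair Time (hr)", 0.75, 20.00),
    LineItem("Customer Impact", "Failure", 0, 2500.00),
]

for label, items in [("Blade mount", blade_mount), ("Printerboard", printerboard)]:
    subtotals, total = total_cost(items)
    print(label, subtotals, f"total=${total:,.2f}")
```

Keeping the customer impact as an ordinary line item with a quantity of 0 or 1 lets in-house and field events pass through the same calculation.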
Appendix B
Failure Analysis Spreadsheet
The image on the following page is a screenshot of a representative portion of the master
failure analysis spreadsheet that was used to analyze all 204 failure events.
The data was
manually standardized one event at a time and categorized for statistical processing. Total cost
was calculated on the right half of the spreadsheet with color coding signifying the impact of
each total cost. All graphs and statistics were derived from this spreadsheet.
[Image: screenshot of a representative portion of the master failure analysis spreadsheet]
Bibliography
[1]
A. Hirai, "What Kills Startups?," Cayenne Consulting, 2010.
[2]
M. Less, "Startupedia: What Does Scaleup Mean?," Startup Institute. [Online]. Available:
http://blog.startupinstitute.com/2015-3-24-what-is-a-scaleup/. [Accessed: 01-Jul-2015].
[3]
S. Berger, "Scaling up Startups to Market," in Making in America: From Innovation to
Market, Cambridge, MA: MIT Press, 2013, pp. 65-89.
[4]
National Venture Capital Association, "Annual Venture Capital Investment Tops $48
Billion in 2014," 2015. [Online]. Available: http://nvca.org/pressreleases/annual-venturecapital-investment-tops-48-billion-2014-reaching-highest-level-decade-accordingmoneytree-report/. [Accessed: 02-Jul-2015].
[5]
T. Kurfess, "Why Manufacturing Matters," American Society of Mechanical Engineers,
no. November, 2013.
[6]
C. Banden-Fuller and I. MacMillan, "3 Mistakes Made in Scaling up New Ventures,"
Harvard Business Review, Aug-2010.
[7]
M. Barros, "Poor Quality Will Kill You," One Entrepreneur's Perspective. [Online].
Available: http://marcbarros.com/poor-quality-will-kill-you/. [Accessed: 03-Jul-2015].
[8]
E. R. Reynolds and H. Samel, "Invented in America, Scaled Up Overseas," ASME
Mechanical Engineering Magazine, no. November, Nov-2013.
[9]
K. Weisul, "Everything Your Startup Needs to Know About Manufacturing," Inc.com,
2015. [Online]. Available: http://www.inc.com/kimberly-weisul/five-lessons-from-themanufacturing-trenches.html. [Accessed: 02-Jul-2015].
[10]
NVBOTS, "About Us." [Online]. Available: http://nvbots.com/about/. [Accessed: 03-Jul-2015].
[11]
R. Chawla, "Scale-up of a High Technology Manufacturing Startup: Framework for
Analysis of Incoming Parts, Inspection Procedure and Supplier Capability,"
Massachusetts Institute of Technology, 2015.
[12]
A. Shabbir, "Scale-up of a High-Technology Manufacturing Startup: Improving Product
Reliability Through Systematic Failure Analysis and Accelerated Life Testing,"
Massachusetts Institute of Technology, 2015.
[13]
T. Wohlers and T. Caffery, Wohlers Report 2015: Additive Manufacturing and 3D
Printing State of the Industry: Annual Worldwide Progress Report. Fort Collins: Wohlers
Associates, Inc. 2015.
[14]
I. Gibson, D. W. Rosen, and B. Stucker, Additive Manufacturing Technologies, Second
Edition. New York: Springer, 2015.
[15]
ASTM International, "F2792-12a - Standard Terminology for Additive Manufacturing
Technologies," pp. 10-12, 2013.
[16]
NVBOTS, "Pricing." [Online]. Available: http://nvbots.com/pricing/. [Accessed: 12-Jul-2015].
[17]
NVBOTS, "NVPRO." [Online]. Available: http://nvbots.com/nvpro/. [Accessed: 12-Jul-2015].
[18]
NVBOTS, "NVLIBRARY." [Online]. Available: http://nvbots.com/nvlibrary/. [Accessed:
12-Jul-2015].
[19]
NVBOTS, "MYNVBOTS." [Online]. Available: http://nvbots.com/mynvbots/. [Accessed:
12-Jul-2015].
[20]
M. A. Cusumano, "Evaluating a startup venture," Commun. ACM, vol. 56, no. 10, p. 26,
2013.
[21]
R. Dobbs, J. Manyika Yougang Chen, M. Chui, and S. Lund, "McKinsey Global Institute
The McKinsey Global Institute," no. May, 2013.
[22]
M. Villacourt and P. Govil, "Failure Reporting, Analysis, and Corrective Action System,"
Austin, 1994.
[23]
W. Goble, "Value of failure data," Hydrocarb. Process., vol. 87, no. 10, p. 138, Oct. 2008.
[24]
M. R. Tuckey and N. Brewer, "The influence of schemas, stimulus ambiguity, and
interview schedule on eyewitness memory over time.," J. Exp. Psychol. Appl., vol. 9, no.
2, pp. 101-118, 2003.
[25]
"Root Cause Analysis Processes & Methods," ASQ, 2015. [Online]. Available:
http://asq.org/learn-about-quality/root-cause-analysis/overview/conducting-rootcause.html. [Accessed: 17-Jun-2015].
[26]
J. L. Otegui, Failure Analysis, vol. 3, no. 2. New York: Springer, 2014.
[27]
D. L. Gano, Apollo Root Cause Analysis - A New Way of Thinking. 2007.
[28]
P. M. Williams, "Techniques for root cause analysis," Baylor Univ. Med. Cent. Proc., vol.
14, no. 2, pp. 154-7, 2001.
[29]
A. Sarkar, A. R. Mukhopadhyay, and S. K. Ghosh, "Root Cause Analysis, Lean Six Sigma
and Test of Hypothesis," TQM J., vol. 25, no. 2, p. 26, 2013.
[30]
"Boston," Google Maps, 2015. [Online]. Available:
https://www.google.com/maps/place/Boston,+MA/@42.3133735,71.0571571,12z/data=!3ml!4bl!4m2!3ml!1s0x89e3652d0d3d311b:0x787cbf240162e8a0
. [Accessed: 30-May-2015].
[31]
"Hourly Rate for Skill: Computer Hardware Technician," PayScale, 2015. [Online].
Available:
http://www.payscale.com/research/US/Skill=ComputerHardwareTechnician/HourlyRate. [Accessed: 30-May-2015].
[32]
"New Standard Mileage Rates Now Available; Business Rate to Rise in 2015," IRS, 2015.
[Online]. Available: http://www.irs.gov/uac/Newsroom/New-Standard-Mileage-RatesNow-Available;-Business-Rate-to-Rise-in-2015. [Accessed: 30-May-2015].
[33]
"Interview with Edward Brady, NVBOTS Director of Supply Chain and Finance."
Boston, MA, 2015.