Scale-Up of a High Technology Manufacturing Startup: Failure Tracking, Analysis, and Resolution through a Multi-Method Approach

By
Derek S Straub

B.E. Aero-Mechanical Engineering
Stevens Institute of Technology, 2011

SUBMITTED TO THE DEPARTMENT OF MECHANICAL ENGINEERING IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF ENGINEERING IN MANUFACTURING
AT THE
MASSACHUSETTS INSTITUTE OF TECHNOLOGY

September 2015

© 2015 Derek S Straub, All rights reserved

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter crafted.

Author: Derek S Straub, Department of Mechanical Engineering, August 7, 2015

Certified by: David E. Hardt, Ralph E. & Eloise F. Cross Professor of Mechanical Engineering, Thesis Supervisor

Accepted by: David E. Hardt, Ralph E. & Eloise F. Cross Professor of Mechanical Engineering, Chairman, Department Committee on Graduate Students

Submitted to the Department of Mechanical Engineering on August 7, 2015, in partial fulfillment of the requirements for the degree of Master of Engineering in Manufacturing

Abstract

Product reliability, quality, and performance are essential for all companies, especially high technology manufacturing startups looking to scale up successfully. Company image and reputation can be heavily impacted by product failures. The cost of failures in-house and at the customer will only increase as a company scales up.
Failure mitigation is critical to the success of a product and its company throughout the entire product lifecycle. This thesis proposes an ideal Failure Mitigation Strategy (FMS) that provides a methodology and framework with a linear process workflow and easy-to-follow steps that lead to the reduction of cost from failures. Establishing a strong FMS will assist the company in learning from its failures while reducing the total number and average cost of failure events. The ideal FMS was tailored to and implemented at New Valence Robotics Corporation (NVBOTS) in Boston, Massachusetts, as a case study.

The ideal FMS consists of failure tracking, failure analysis, and multi-method failure resolution. Failure events are first observed and properly documented via the failure tracking system. Failure tracking data is then processed during failure analysis using a total cost model to automatically prioritize and down-select the most impactful failure event types. Root cause analysis is then performed on the top priority failure event types. Finally, a robust multi-method failure resolution methodology uses an economical combination of design and process changes along with testing to eliminate or reduce the cost of those failures.

Over 200 failure events were tracked, including 50 unique failure event types, accounting for over $75,000 in costs at NVBOTS. A unified and improved tracking system was implemented at NVBOTS along with a powerful analysis framework. Failure analysis was performed, prioritizing the failures by total cost, and a failure resolution framework was designed to implement the solutions to the top priority failure event types. The ideal Failure Mitigation Strategy offered in this thesis provides NVBOTS and other entities a framework that allows for full understanding of the current failure landscape as well as a systematic method to reduce the impact from failures through elimination and mitigation.

Thesis Supervisor: David E. Hardt
Title: Ralph E.
and Eloise F. Cross Professor of Mechanical Engineering

About the Author

Derek Straub is an aero-mechanical engineer currently pursuing his Master of Engineering in Manufacturing at Massachusetts Institute of Technology (MIT). Born and raised in Perkasie, Pennsylvania, USA, he excelled in academics and athletics. He later received his Bachelor of Engineering in Aero-Mechanical Engineering with high honors from Stevens Institute of Technology in 2011. He is currently employed as a mechanical engineer at MIT Lincoln Laboratory, a Federally Funded Research and Development Center. With eight years of industry experience, he brings professional experience and knowledge to this thesis along with his solid academic foundation. Past employers include the National Aeronautics and Space Administration (NASA), Lutron Electronics, Hamilton Sundstrand, and KEY Handling Systems. His strengths include mechanical design, rapid prototyping and fabrication, additive manufacturing, design for manufacturing and assembly, design for additive, and mechanical testing. He has a passion for rapid prototyping and additive manufacturing and has consulted for multiple companies on these topics. This thesis draws heavily on years of industry experience and on his work performed for MIT and New Valence Robotics Corporation.

Acknowledgments

Completion of this thesis was generously supported by many people to whom I am truly grateful. I would like to thank and recognize the following people for all that they have done.

My family and friends, especially my parents, for their unwavering support and love and for raising me into the man I am today.

My advisor, professor, and friend, David Hardt, for his wisdom and guidance throughout my studies and thesis work.

My colleagues and great friends, Rahul Chawla and Ali Shabbir, for all your help and assistance while making this past year enjoyable.
NVBOTS for sponsoring our team's research and specifically AJ Perez and Mateo Peña Doll for their openness and timely support of our research needs.

MIT Lincoln Laboratory Lincoln Scholars Program for their financial and logistical support of my continued education and for allowing me to fully focus on my graduate work.

Lastly, Jim Ingraham, my Lincoln supervisor and friend, for his continual support and investment in the advancement of my education and career.

Thank you.

Contents

Chapter 1 Introduction
1.1 Scale-up of High Technology Manufacturing Startups
1.1.1 Risks Associated with Manufacturing Scale-up
1.2 Research Motivation and Overall Problem Statement
1.2.1 Overall Problem Statement
1.2.2 Overview of Subprojects
1.3 Thesis Overview
1.3.1 Thesis Objective and Scope
1.3.2 Thesis Structure
Chapter 2 Additive Manufacturing Overview
2.1 General Additive Manufacturing Process Work Flow
2.2 Additive Manufacturing Technologies
2.2.1 Binder Jetting
2.2.2 Directed Energy Deposition
2.2.3 Material Extrusion
2.2.4 Material Jetting
2.2.5 Powder Bed Fusion
2.2.6 Sheet Lamination
2.2.7 Vat Photopolymerization
2.3 Additive Manufacturing Applications
2.4 Industry and Market
Chapter 3 Company Background
3.1 The Product
3.2 The Market
3.3 Company Analysis
Chapter 4 Failure Mitigation Introduction
4.1 Overview of Failure Mitigation and Importance
4.2 Overview of Ideal Failure Mitigation Strategy
4.2.1 Failure Tracking
4.2.2 Failure Analysis
4.2.3 Failure Resolution
4.3 Initial Status of Company
Chapter 5 Failure Tracking
5.1 Detailed Methodology
5.2 NVBOTS Case Study
5.2.1 Original Tracking Systems
5.2.2 New Tracking System
Chapter 6 Failure Analysis
6.1 Detailed Methodology
6.2 Failure Economics: Total Cost Model
6.3 NVBOTS Case Study
Chapter 7 Multi-Method Failure Resolution
7.1 Detailed Methodology
7.2 NVBOTS Path Forward
Chapter 8 Summary, Conclusions, and Recommendations
8.1 Summary
8.2 Conclusions
8.3 Recommendations for Future Work
8.3.1 Automate Failure Tracking System
8.3.2 Automate Failure Analysis System
8.3.3 Establish Failure Review Process
8.3.4 Establish Design Change Review Process
Appendix A Total Cost Example Calculations
Appendix B Failure Analysis Spreadsheet
Bibliography

List of Figures

Figure 2-1: AM process work flow steps
Figure 2-2: AM process categories
Figure 3-1: NVBOTS logo
Figure 3-2: NVPro printer
Figure 3-3: Print preview feature
Figure 3-4: Printer dashboard
Figure 4-1: Failure Mitigation Strategy Framework
Figure 5-1: FMS failure tracking
Figure 5-2: Old tracking systems
Figure 5-3: Django welcome screen
Figure 5-4: Django failure event list
Figure 5-5: Django event submission page
Figure 6-1: FMS failure analysis
Figure 6-2: Subassembly Failure Histogram
Figure 6-3: Failure Event Cause Type Histogram
Figure 6-4: When Failures are Found
Figure 6-5: Total cost of failure events
Figure 7-1: FMS failure resolution

List of Tables

Table 3.1: NVBOTS evaluation
Table 5.1: Failure tracking items
Table 5.2: Tracking systems comparison

Chapter 1
Introduction

There are two types of startup businesses in the world: those that scale up and those that do not. The businesses that do not scale up either fail or settle into a truly small business with little or no growth potential.¹ These so-called "lifestyle businesses" have their place within the economy and among entrepreneurs who are perfectly at peace with running a small business.
However, the businesses that do scale up are the ones looking to change the world, impact customers' lives in a profound way, and, obviously, make significant financial gains along the way [1].

¹ According to data collected from the U.S. Department of Commerce, Bureau of the Census, and U.S. Department of Labor, 592,410 businesses closed and 28,322 declared bankruptcy in 2007 [1].

Scale-up in entrepreneurial business refers to the process of rapid growth and expansion of a company to adapt to a larger workload without compromising performance, revenues, or operational controls [2]. Scaling up can only occur once a startup has validated its business model through repeat revenue generation [2]. Once the foundation is in place, rapid growth in market access, employees, operations, and revenues can occur.

Scaling up is an absolute necessity for those startup businesses funded by external investors such as angel investors and venture capital (VC) firms. Venture capitalists (VCs) invest in early-stage startups when the risk is high, the technology is unproven, and the market is uncertain, but the potential upside is also very high. In return, they own equity in the company and demand a significant return on their investment within a short period of time, achieved through either sale to or merger with another company (merger and acquisition, or M&A) or, less commonly, registering as a publicly traded company via an initial public offering (IPO). This investor pressure makes it necessary to scale up the business as soon as feasible.

However, scale-up requirements depend significantly on the type of business. Software by nature is very scalable. The initial investment is spent on developing the back-end software and user experience. Once completed and released, subsequent iterations only cost a fraction of the initial development cost and time.
Software also does not require significant capital investment, has almost instant global market reach via the Internet, and has a very rapid lifecycle of only a few years [3]. By far, software startups are the easiest to scale up and thus have commanded the largest amount of VC attention and funding [4].

Startups involving a physical product, such as in consumer goods, manufacturing, and high technology industries, or startups involving strict regulatory requirements, such as in the biopharmaceuticals industry, do not have the same luxuries as software or service-based startups. Significant up-front capital costs, a high burn rate², and a longer horizon before a sizeable return on investment is realized are additional barriers to scale-up that cause VCs to shy away from funding startups in these industries and instead focus their funds on less risky, though sometimes less profitable in the long run, software startups [3].

Scale-up is absolutely critical even for companies without the added pressure from VCs. A company is only solvent and in business as long as it has sufficient liquid assets to meet current liabilities. Without a plan in place to rapidly increase company revenues, and the fortitude to execute the plan, the business will quickly be unable to meet its liabilities and become insolvent.

However, the financial risks discussed, such as raising investor funding or generating revenues, are only one type of risk faced by startups during scale-up. A risk by definition is any situation where there is a possibility of an outcome resulting in the loss of something of value [1]. Unforeseen circumstances and their negative consequences in startup businesses manifest themselves within the following types of risks, adapted from Hirai [1]:

Market: The possibility of insufficient demand for the offering at the chosen price. True market demand is only realized once the company tries to sell; everything up until then is speculation.
Competitive: The possibility of competitors having a better product, being first-to-market, deliberately underselling the company's offering, filing intellectual property disputes, poaching employees, et cetera.

Technology and Operational: Any variety of risks associated with product design, functionality as intended, manufacturability, product quality and reliability, production and distribution logistics, supplier management, et cetera.

Financial: Aside from raising investor funding and generating revenues, there are risks associated with customer credit (defaulting on payments), commodity prices, currency exchange rates, interest rates, price of assets used as collateral, et cetera.

People: Any number of risks associated with employees of the company, their fit with corporate culture and vision, their productivity, the necessary combination of experience, contacts, and skill, et cetera.

Legal and Regulatory: Any number of risks associated with corporate governance, taxation, intellectual property, liability claims, and regulatory approval.

Systemic: Risks that threaten the viability of an entire market and not just one firm, such as fuel costs affecting the entire airline industry.

All these risks can be systematically identified, monitored, and mitigated through appropriate risk management, which begins with driving a culture of risk management throughout the organization. The technology and operational risks associated with a high technology manufacturing startup are further explored in the subsequent section.

² Burn rate is the amount of cash a company spends per month.

1.1 Scale-up of High Technology Manufacturing Startups

Manufacturing is well regarded as the engine that drives innovation. The U.S. Bureau of Economic Analysis has determined that every dollar spent on manufacturing generates $1.48 in economic activity [5]. Manufacturing only represents 12% of the U.S. Gross Domestic Product (GDP) and 9% of U.S.
jobs, but accounts for two-thirds of all private research and development funding and employs one-third of all engineers [5]. Increasingly, the innovation behind manufacturing, whether it is new products or processes, is found in smaller startups rather than larger corporations [3]. Often these innovations come out of research laboratories at universities across the nation, or through companies founded by employees of larger corporations [3]. VC funding allows these startups to prove their technology; however, when it comes time to scale up, VCs prefer to exit via an M&A with a large corporation and let it scale up in-house. An example of this is DuPont's acquisition of Uniax in 2000, a startup spun out of Professor Alan Heeger's laboratory at the University of California Santa Barbara [3]. Uniax developed organic light-emitting diodes (OLEDs). It was only in late 2011, after 11 years of in-house development and scale-up, that DuPont announced the first commercialization project, under license to a major display manufacturer.

High-technology manufacturing startups face a significant barrier to scale-up due to the upfront capital investments required before production of their physical product can begin. This poses a financial risk well known within the entrepreneurial ecosystem of VCs and startups. However, there are significant risks associated with the technology and operational side of the business as well, specifically associated with manufacturing of the product.

1.1.1 Risks Associated with Manufacturing Scale-up

High technology startups often mistake a successful prototype or the first iteration of the product as the scaling product, and the first customers as scaling users [6]. This is hardly ever the case. The first customers are typically lead users or early adopters that provide input for improvement. In fact, these early sales should be thought of as market research input [6].
Early customers are also willing to put up with the product design and manufacturing quality shortfalls that are always present in the first product iteration, something that the mass market would reject.

Startups often try to include as many features as possible in their initial offering in order to attract as many customers within their target market as possible. In doing so they lose focus on their most basic features, the competitive advantage that would win over their customers in the first place. The scaling customers prefer a simple, robust product with the basic differentiating feature [7].

Quality and reliability are the most important features of any product. Marc Barros, serial entrepreneur and former CEO of Contour LLC, an action sports camera company, reflected on his experience with the following: "Shipping quality devices is by far the hardest part of building a hardware company. Customers don't care about how small you are or the difficulties you face. They expect you deliver on and surpass on your promise not just once, but multiple times, over thousands of units" [7].

Achieving this level of quality and reliability is a tremendous effort that requires the entire company to focus on documenting and fixing problems during both initial product development and production scale-up. Having the right talent driving the manufacturing scale-up is critical. They must have a combination of skills, industry knowledge and experience, and a network of contacts to ensure the product is manufactured at the highest level of quality [3]. Barros also recommends having at least one person solely dedicated to product testing and to quality and reliability improvement, and working with an experienced production engineer from the beginning of the design phase to ensure a quality, manufacturable product with high yield rates.
The company must always be willing to compromise on materials, methods, and location of production to ensure the highest level of quality and reliability is achieved at the lowest cost.

It is very common for a startup to outsource production to contract manufacturers and suppliers. Contract manufacturers, both domestic and foreign, are an invaluable source of in-depth volume manufacturing knowledge. However, it is critical to select suppliers that have the right specialized skills required for the startup's product, prioritize speed and quality over cutthroat cost reduction, and are willing to work with the company to improve the entire production process [3]. There is significant tacit knowledge during the initial pilot production runs that is very complex and not easily reduced to simple instruction [8]. Therefore, face-to-face time with suppliers on-site is required during these stages to qualify their process, continuously improve, and take the learnings back to the company.

Finally, scaling up production requires a diligent effort in tracking the company's cash cycle. Payment for production is usually due upfront for a startup that is not yet well established in the industry, but revenues from sales are not expected for months [9]. There are also large cash implications associated with product sustaining and customer service if the product quality suffers and customers require repairs or replacements. This could leave the company in a cash-flow insolvency situation. Careful planning in terms of supply contracts, payment terms, and product sustaining must be carried out before the scale-up begins.

1.2 Research Motivation and Overall Problem Statement

New Valence Robotics Corporation (NVBOTS), founded in March 2013, is a Boston, Massachusetts-based robotics startup company that has developed the world's first fully automated cloud 3D printing management suite [10].
The 3D printing hardware, called the NVPro, is based on the material extrusion additive manufacturing process. NVBOTS is in the process of completing its in-house pilot production run and is faced with the problem of scaling up its production to meet customer demand. The scale-up project is the result of collaboration between the Massachusetts Institute of Technology (MIT) and NVBOTS. The project was a team effort conducted by the author, Derek Straub, Rahul Chawla [11], and Ali Shabbir [12], all students in the Master of Engineering in Manufacturing (MEngM) program at MIT, between February and August of 2015.

1.2.1 Overall Problem Statement

The MEngM team consulted on the overall scale-up project and specifically focused on integrating NVBOTS' business model into its operations. NVBOTS does not directly sell its printers to customers but rather leases them on five-year terms. The company provides software and hardware upgrades free of charge during the lease period, as well as any required repair services or maintenance. This unique business model requires careful consideration by engineering and production as the company scales up.

Product reliability and quality were identified as the most important factors to focus on during the scale-up process. As NVBOTS transitions from producing a few units per month entirely in-house to producing hundreds of units per month in partnership with contract manufacturers in the near future, a significant shift in current engineering and production operations will need to occur. The costs associated with an unreliable or sub-par quality product are unsustainable with rapid growth.

Analyzing the complete product value chain, from design to incoming supplier parts, assembly, and the complete product, identified opportunities for process improvement. These opportunities form the basis of each MEngM team member's individual subproject and thesis, further discussed in Section 1.2.2.
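The scaling concern in this section can be illustrated with a short calculation. The sketch below is a minimal illustration and is not a model taken from this thesis. It assumes that every shipped unit remains in the leased, serviced fleet over the simulated horizon (which is shorter than the five-year lease term) and that failures occur at a constant rate per unit per month. The function name `leased_fleet_service_cost` and all numeric inputs (production rates, failure rate, cost per service event) are hypothetical placeholders chosen only to show the trend; they are not NVBOTS data.

```python
# Hypothetical illustration only: none of these numbers come from NVBOTS.

def leased_fleet_service_cost(units_per_month, months,
                              failures_per_unit_month, cost_per_service):
    """Return (fleet_sizes, monthly_costs) for a leased, company-serviced fleet.

    Assumption: every shipped unit stays in the serviced fleet for the whole
    horizon, because the lease term is longer than the period simulated.
    """
    fleet = 0
    fleet_sizes = []
    monthly_costs = []
    for _ in range(months):
        fleet += units_per_month          # new leases added this month
        fleet_sizes.append(fleet)
        # Expected service cost this month = units in field
        # x failures per unit per month x cost per service event.
        monthly_costs.append(fleet * failures_per_unit_month * cost_per_service)
    return fleet_sizes, monthly_costs


# Same assumed per-unit failure rate and service cost, two production rates.
pilot_sizes, pilot_costs = leased_fleet_service_cost(
    units_per_month=5, months=12,
    failures_per_unit_month=0.1, cost_per_service=200.0)
scaled_sizes, scaled_costs = leased_fleet_service_cost(
    units_per_month=200, months=12,
    failures_per_unit_month=0.1, cost_per_service=200.0)

print("Pilot fleet after 12 months:", pilot_sizes[-1],
      "units; monthly service cost:", round(pilot_costs[-1], 2))
print("Scaled fleet after 12 months:", scaled_sizes[-1],
      "units; monthly service cost:", round(scaled_costs[-1], 2))
```

Under these assumptions the monthly service cost grows in proportion to the installed fleet, so a per-unit failure rate that is tolerable at pilot volumes becomes a much larger recurring expense at hundreds of units per month.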
In addition to specific improvement opportunities, the research and work completed by the MEngM team also included:

* Establishing a framework and foundation of critical processes for future implementation;
* Inculcating discipline, structure, and industry best practices in engineering and production operations through learnings from industry experience; and
* Providing case studies of process improvement implementation at NVBOTS as reference for future use.

1.2.2 Overview of Subprojects

Analyzing the complete product value chain identified specific opportunities for process improvement with regard to product reliability and quality.

The first subproject focused on early stages of the value chain by analyzing incoming part quality. As hardware startups initiate operation, their main focus is on product development efforts. When they scale up, they need to give more importance to suppliers, quality control, and inspection procedures. This project focused on developing a framework for and analyzing these attributes. Based on the outcomes of using this framework, key recommendations were made for tolerancing techniques, data acquisition, and inspection procedures. Suggestions were also made to streamline strategy and operations and make full use of network effects. Chawla conducted this project and the reader is referred to his thesis for all details [11].

The second subproject was conducted by the author and is the focus of this thesis. It focused on establishing a proper Failure Mitigation Strategy at NVBOTS consisting of failure tracking, failure analysis, and failure resolution. The aim of this project was to create a foundation, framework, and methodology for NVBOTS to use in order to mitigate costly failures throughout the product life cycle. Failures, especially those that occur in the hands of the customer, can have devastating consequences to any company and even more so to a startup.
This project details a structured plan for capturing all failure data, analyzing it statistically and objectively based on its cost to the company, and resolving each failure for future units. As for-profit companies exist to produce profits, the Failure Mitigation Strategy is based on a least-cost model. The goal is to minimize the cost impact of failures by preventing them from occurring or by lessening their impact through multiple methods. This project details how to learn from failures and how to use that knowledge to create a product with increased reliability, quality, and performance, while reducing manufacturing and service costs. This is critical for NVBOTS as the cost of failures will only increase as the company begins to scale up production. Establishing a proper Failure Mitigation Strategy will allow NVBOTS to continually reduce the cost and impact of failures, allowing it to scale up successfully and providing it a commanding competitive advantage for the future. Details of this project can be found in the remainder of this thesis, starting in Chapter 4.

The third subproject focused on product reliability and life. A reliable product is absolutely critical to NVBOTS because of its leasing business model. The costs associated with repeatedly servicing an unreliable product are unsustainable as the business scales up, and there is the potential to lose the customer entirely if unreliability is persistent. However, there is currently no estimate of the life of the NVPro product. Furthermore, there are no processes in place to predict, analyze, and test potential in-service failures and mitigate these risks during product development or production. This poses a serious risk to NVBOTS and the potential for catastrophic business failure if the future costs of service are unmanageable. Therefore, this subproject focused on two key process improvements.
The first was to implement a structured approach to predict future in-service failure modes and understand their impact. This was accomplished through Failure Modes and Effects Analysis (FMEA). The second was to establish a methodology for actually testing the product to determine its reliability and predict its life. This was accomplished through Design of Experiments, accelerated life testing, and statistical analysis. The life of the printer was then estimated by determining the life of the top risk component identified by the FMEA. Finally, the subproject served to establish a culture of reliability through systematic testing and analysis. Shabbir conducted this project and the reader is referred to his thesis for all details [12].

The opportunities identified and covered in detail across the three theses provide recommendations for near-term implementation, as this would be of immediate benefit to NVBOTS. However, each subproject is also a process improvement that should be adopted by operations to ensure long-term success during the entire scale-up process.

1.3 Thesis Overview

As previously mentioned, the overall problem statement for the project was to scale up production at NVBOTS as the company grows. This thesis specifically focuses on ensuring product reliability, by continually reducing the number and impact of failures, as production scales up to match increased demand. This was imperative to NVBOTS as their business model is based on leasing the printer to customers, and any costs associated with servicing unreliable printers are the responsibility of the company.

1.3.1 Thesis Objective and Scope

The objective of this thesis is to provide companies a structured framework for a Failure Mitigation Strategy that is effective, efficient, and easy to use. It is intended to be broadly applicable to many industries, while the case study for NVBOTS details how it can easily be tailored to fit a specific company.
The details of failure tracking, failure analysis, and failure resolution are provided to clearly convey the methodology behind the process so that all users can better understand how the results are calculated and why each step is needed. This thesis intends to provide a comprehensive foundation for companies to use in establishing a Failure Mitigation Strategy that reduces the number and impact of failures, resulting in higher reliability and a more successful company.

The intended scope of this thesis was limited to manufacturing industries, yet most of the FMS framework is universal and may be beneficial to many entities with limited modification. The scope of the NVBOTS case study was limited to their NVPro 3D printer. Data were collected between February and July 2015, with historical data dating back to August 2014. Failure tracking and analysis were performed with all available data on hand, preparing NVBOTS for root cause analysis and failure resolution. A proper failure tracking system was implemented, and NVBOTS is strongly encouraged to fully implement the failure analysis system and failure resolution methodology presented in order to realize all of the benefits of the FMS.

1.3.2 Thesis Structure

This thesis is structured into multiple chapters that provide the necessary background information on the overall project and detail the FMS step by step. Case study details, when applicable, are provided at the end of their respective chapters. Chapter 1 covers various risks involved in scaling up a manufacturing startup and provides the motivation for the overall project; it was written mainly by Shabbir [12] but tailored to this thesis. Chapter 2 presents a brief overview of additive manufacturing including the general process, current technologies, and industry market. Chapter 3 presents background on NVBOTS and an analysis of its competitive strategy, written by Chawla [11] and included here verbatim.
Chapter 4 presents an overview of the Failure Mitigation Strategy, provides motivation, and introduces the case study. Chapters 5, 6, and 7 respectively detail the failure tracking, failure analysis, and failure resolution components of the FMS. Finally, Chapter 8 summarizes the work performed and provides recommendations for future work to automate the FMS and supplement its effectiveness with complementary structured processes.

Chapter 2 Additive Manufacturing Overview

Additive Manufacturing (AM) is a field of manufacturing processes that creates objects through successive addition of layers of material. Generally, the parts are built from digital three-dimensional (3D) computer aided design (CAD) data, but this need not always be the case. AM has been referred to by many different names, such as 3D Printing, Rapid Prototyping, and Freeform Fabrication, to name a few; but the term Additive Manufacturing best differentiates this field of manufacturing processes from conventional manufacturing techniques, which usually involve subtraction, deformation, or formation of material as well as changes to material properties.

AM has been around commercially since the late 1980s, but the industry really gained traction and momentum in the 2000s and it has continued to grow rapidly ever since, with a compound annual growth rate of 33.8% over the last three years [13]. In 1995, AM was only a $295 million industry; as of 2014 the AM industry has grown to $4.1 billion and is expected to exceed $12.7 billion by 2018 and $21.1 billion by 2020 [13]. AM has opened up the design space to engineers, designers, and artists, allowing them to produce complex geometry that was once impossible or restricted by cost and/or time. Geometrical freedom is just one of the many benefits offered by AM.
Time to first part, customization, increased part performance, flexibility, material and energy efficiency, in-house manufacturing, and reduction of the design cycle are some of the many benefits realized through use of AM. AM is a potentially game changing tool for part production, but it is not the solution to all manufacturing needs as there are some drawbacks. Cost, speed, and time are unfavorable compared to conventional manufacturing when dealing with parts of simple geometry. Surface finish, limited materials, material properties, and lack of standards are some of the other drawbacks to AM. The key for users is to understand the capabilities and limitations and to know when it is best to use AM and when to choose a conventional manufacturing process instead.

Depending on the desired object(s) and machine to be used, there is a certain work flow process to go from CAD data to having a physical part. This process can vary slightly for each job and machine, but in general all AM processes follow the same seven generic steps adapted from Gibson [14]. The seven steps, in order, are: CAD, Conversion to STL, STL Slicing and Transfer to AM Machine, Machine Setup, Build, Removal, and Post Processing.

There are many factors such as geometry, material, intended use, cost, speed, et cetera that factor into which AM machine to use for any given build. The American Society for Testing and Materials, now known as ASTM International, has categorized all of the current machines by their AM process. There are currently seven process methodology or technology categories defined by ASTM International: Binder Jetting, Directed Energy Deposition, Material Extrusion, Material Jetting, Powder Bed Fusion, Sheet Lamination, and Vat Photopolymerization [15]. The seven generic steps and seven AM technology categories will be described in more detail in the immediately following sections.
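The seven generic steps listed above form an ordered pipeline. As a minimal sketch, the ordering can be captured with an enumeration. The names `AMStep` and `remaining_steps`, and the `skipped` argument, are illustrative assumptions introduced here; they are not taken from Gibson [14] or from any AM software.

```python
from enum import IntEnum


class AMStep(IntEnum):
    """The seven generic AM work flow steps, numbered in process order."""
    CAD = 1
    STL_CONVERSION = 2
    SLICING_AND_TRANSFER = 3
    MACHINE_SETUP = 4
    BUILD = 5
    REMOVAL = 6
    POST_PROCESSING = 7


def remaining_steps(current, skipped=frozenset()):
    """Return the steps that follow `current`, in process order.

    `skipped` is a hypothetical way to model a job that omits a step.
    """
    return [step for step in AMStep if step > current and step not in skipped]


if __name__ == "__main__":
    # List what is still to be done once the machine has been set up.
    for step in remaining_steps(AMStep.MACHINE_SETUP):
        print(step.value, step.name)
```

Using an integer-valued enumeration keeps the step order explicit, so a comparison such as `AMStep.BUILD < AMStep.REMOVAL` follows directly from the step numbering.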
2.1 General Additive Manufacturing Process Work Flow

The detailed AM process will vary slightly from machine to machine and from build to build, but these seven generic steps cover the majority of all AM process work flows. Depending on the machine, part(s), orientation of part, material, build quality, support material required, et cetera, certain steps will be more extensive than others, while some may be skipped altogether. Regardless, the following steps derived from Gibson [14] portray the typical work flow required to transform CAD data into a physical object via AM (see Figure 2-1 for a visual representation of the seven steps with a cup as an example part).

Step 1: CAD

All AM parts must start from a software model that fully describes the geometry. This can involve the use of almost any CAD solid modeling software, but the output must be a 3D solid or surface representation. Reverse engineering equipment (e.g., laser and optical scanning) can also be used to create this representation.

Step 2: Conversion to STL

Nearly every AM machine accepts the STL file format, which has become a de facto standard, and nowadays nearly every CAD system can output such a file format. This file describes the external closed surfaces of the original CAD model and forms the basis for calculation of the slices.

Step 3: STL File Manipulation/Slicing/Transfer to AM Machine

The order of these three sub-steps may vary, but the STL file describing the part must be transferred to the AM machine. There may be some general manipulation of the file so that it is the correct size, position, and orientation for building. The STL file is sliced into build layers, and support material and corresponding support layers are generated if need be. These slices or layers represent the physical build layers of material during the build. STL manipulation and slicing may occur on the AM machine or at a computer before transfer.
Step 4: Machine Setup

The AM machine must be properly set up prior to the build process. Such settings would relate to the build parameters like the material constraints, energy source, layer thickness, timings, et cetera. Setup usually involves cleaning, clearing, and resetting of the build area altered from previous builds.

Step 5: Build

The part is built out of the given material(s) layer by layer according to the slice data. Building the part is mainly an automated process and the machine can largely carry on without supervision. Only superficial monitoring of the machine needs to take place at this time to ensure no errors have taken place, such as running out of material, power or software glitches, et cetera. Newer and more industrial machines are beginning to monitor for errors and anomalies in order to notify the operator.

Step 6: Removal

Once the AM machine has completed the build, the parts must be removed. This may require interaction with the part, raw material, and machine, which may have safety interlocks to ensure, for example, that the operating temperatures are sufficiently low or that there are no actively moving parts. Removal must be performed carefully and by experienced operators, as many parts are damaged during this step.

Step 7: Post-processing

Once removed from the machine, parts may require an amount of additional work before they are ready for use. Parts may be weak at this stage or they may have supporting features that must be removed. This therefore often requires time and careful, experienced manual manipulation. Post processing is usually the most laborious step and yet the least known step for those outside of the industry.

Figure 2-1: AM process work flow steps [14]. Panels: 1: CAD; 2: STL Conversion; 3: Slicing and Transfer; 4: Machine Setup; 5: Build; 6: Removal; 7: Post Processing.

2.2 Additive Manufacturing Technologies

In 2014 there were 49 industrial grade AM machine manufacturers, many selling multiple models [13].
In the same year there were hundreds of mostly smaller companies selling desktop grade machines as well [13] (industrial grade and desktop grade machines are defined in Section 2.4). All machine models are similar in that they build sequentially, layer by layer, defined by the slice data of the 3D CAD model. Yet all these AM machine models are different from one another in many ways, each with the technology and features the manufacturer believes their customers want. Still, they all fall into one of the seven AM process/technology categories defined by ASTM International (see Figure 2-2). Below are the definitions of the seven standard AM process categories according to ASTM International [15]:

2.2.1 Binder Jetting
An additive manufacturing process in which a liquid bonding agent is selectively deposited to join powder materials.

2.2.2 Directed Energy Deposition
An additive manufacturing process in which focused thermal energy is used to fuse materials by melting as they are being deposited.

2.2.3 Material Extrusion
An additive manufacturing process in which material is selectively dispensed through a nozzle or orifice.

2.2.4 Material Jetting
An additive manufacturing process in which droplets of build material are selectively deposited.

2.2.5 Powder Bed Fusion
An additive manufacturing process in which thermal energy selectively fuses regions of a powder bed.

2.2.6 Sheet Lamination
An additive manufacturing process in which sheets of material are bonded to form an object.

2.2.7 Vat Photopolymerization
An additive manufacturing process in which liquid photopolymer in a vat is selectively cured by light-activated polymerization.

Figure 2-2: AM process categories

Within the seven categories there are many machine models employing multiple variants of the general process, yet they can all be summarized by the ASTM International categories.
More advanced AM machines are beginning to incorporate conventional manufacturing processes in parallel with the additive processes. These machines still fit into one of the seven categories, but are now being referred to as hybrid machines that are capable of both additive and subtractive processes. Some future AM machines currently in the research and design phase may not fit into one of these seven categories, or may blend two or more of them, but for now these seven categories will suffice.

2.3 Additive Manufacturing Applications

Additive manufacturing has many applications and uses, and more are conceived and put into use every year. As the machines and processes evolve and improve, the application space continues to grow. Originally, AM parts were used solely as visual models to better convey a conceptual design. Currently, AM part applications can fit into one or more of the following categories: visual models, fit-check models, functional models, end-use parts, tooling and molds, assembly guides and fixtures, education, and research. In recent years the percentage of parts built for end-use has continued to climb, and in 2014 end-use parts accounted for [remainder of sentence illegible in source]. This can be attributed to the steady increase in performance and quality of the AM machines as well as the increased adoption and confidence from engineers, designers, and other users. Fit-check models were the second most popular category in 2014, accounting for 17.8%, while the least popular use was tooling [13].
The year 2015 is expected to see a rise in both end-use parts and tooling for several reasons: the AM machines are as capable as ever, material options have increased, these two categories have the most untapped potential, and leading manufacturers have been heavily spotlighting these applications in their advertisements as well as at trade shows and conferences.

Many industries have realized the benefits of AM, and its use has become increasingly widespread in industries such as automotive, aerospace, industrial/business machines, consumer products and electronics, medical and dental, academic, government and military, architectural, and others. The automotive and aerospace sectors were early adopters of AM and still represent a combined 30.9% of the total AM user base, while consumer products and electronics are catching up with 16.6% [13]. The vast range of uses can be attributed to the widespread adoption throughout all the major sectors as they begin to truly realize the many benefits of AM.

One of the most popular benefits that most sectors look to capture is a reduction in the development cycle time for new products. AM can speed up rounds of design, prototyping, and testing through quick or even parallel production of multiple iterations of a design. Typically, after the development cycle, there is a manufacturing cycle required to tailor the final development design to the manufacturing equipment for quality and efficiency in mass production. When AM is used for production of end-use parts, the combined development and manufacturing time is reduced even further, as the manufacturing cycle is no longer needed. The last iteration of parts built for the development cycle becomes the manufacturing design and requires no further work, as the parts are already being manufactured on the final manufacturing equipment. Because of this reduction in time and cost, among many other benefits, many sectors are looking to increase their use of AM.
2.4 Industry and Market

The AM industry consists of two major classes of machines: desktop grade and industrial grade. For the purposes of this publication, any AM machine that retails for more than $5,000 USD is considered industrial grade. Any AM machine retailing for less than $5,000 USD is considered desktop grade. This provides a clear-cut line between the two, but their differences are quite obvious and extend well past their price tags.

Industrial grade machines are just that; they are built for industrial use and are intended to be operated in an industrial setting by trained operators. The machines range from $5,000 to over $2,000,000 USD and are capable of very fine layer resolutions and large build volumes. Industrial machines can process the widest range of materials including, but not limited to, polymers, metals, ceramics, composites, and bio-matter. In general they have higher reliability, quality, resolution, speed, efficiency, and robustness than desktop models, along with more layer thickness options and more advanced build control. Industrial grade machines are usually much more complex, yet easier to work with than desktop machines, due to better software and a more automated process. Typically they have larger build volumes and are able to build multiple parts in parallel. Industrial grade machines span all seven process categories and are starting to include hybrid machines that are capable of both additive and subtractive processes. Uses include all of the previously noted uses, but in contrast to the desktop models, industrial grade machines are used more frequently for end-use parts, tooling, and fit-checks, owing to their material selection and build quality/resolution. Wohlers estimates that nearly 13,000 industrial grade machines were produced and sold in 2014. In total last year, industrial AM machine sales accounted for 86.6% of revenue from sales worldwide [13].
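The price-based split defined at the start of this section can be written as a small rule. The sketch below is illustrative only: the function name `machine_grade` and the constant name are assumptions, and because the definition above covers prices above and below $5,000 USD but not a price of exactly $5,000, the boundary case is reported as unspecified rather than guessed.

```python
INDUSTRIAL_THRESHOLD_USD = 5_000  # dividing line stated in this section


def machine_grade(retail_price_usd: float) -> str:
    """Classify an AM machine by retail price, per the definition above."""
    if retail_price_usd <= 0:
        raise ValueError("retail price must be positive")
    if retail_price_usd > INDUSTRIAL_THRESHOLD_USD:
        return "industrial"
    if retail_price_usd < INDUSTRIAL_THRESHOLD_USD:
        return "desktop"
    # Exactly $5,000 is not assigned to either class by the definition.
    return "unspecified"


if __name__ == "__main__":
    for price in (400, 5_000, 2_000_000):
        print(price, machine_grade(price))
```

As the section notes, price is only a convenient dividing line; the two classes also differ in materials, build volume, software, and intended setting, none of which this rule captures.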
Desktop grade machines are designed for low cost and are able to fit on a desktop at work or at home. They range in price from about $400 to $5,000 USD. These machines have not been around as long as their industrial counterparts, first breaking into the commercial market in 2007 and only truly being sold in large quantities beginning in 2011 [13]. Desktop models are notorious for being difficult to work with and are lacking in quality, resolution, and speed. The software, user interfaces, and calibration settings are weak points and cause most of the issues associated with this class of machines. Due mainly to their low cost, desktop grade machines have a very good price to performance ratio and are much less expensive to operate. They are limited to only a few simple material choices but usually have many build color options. These machines are more tailored to home, educational, artistic, and recreational uses. Currently desktop models are only available in one of two process/technology categories: material extrusion and vat photopolymerization. Last year nearly 140,000 desktop machines were sold worldwide, accounting for 13.4% of revenues from all AM machines, up from 9% the previous year [13].

The AM market is dominated by industrial and desktop machines, but there is very little in between. Recently hundreds of companies have started to produce thousands of desktop AM machines to satisfy the general public's craving for access to 3D printing. There is truly an untapped market sitting directly between the two current machine grades. An analogy can be made to conventional printers: industrial grade AM machines are similar to large printing presses and desktop grade AM machines resemble conventional desktop inkjet and laser printers, but there is currently nothing similar to the networked office printer.
Xerox, Canon, HP, and others have truly excelled in the networked office printer market, yet not a single AM machine has been designed for a similar 3D market. NVBOTS aims to tackle this untapped market with their NVPro 3D printer. The NVPro is networked and designed for speed and autonomy. This should be a good fit for this open market, but it is safe to say that many of the existing industrial and desktop manufacturers are looking to fill this void as well. The classroom and office space may well be the next battleground for AM machine manufacturers.

Chapter 3 Company Background

New Valence Robotics, or NVBOTS, is a 3D printer manufacturing startup founded in March 2013 by four MIT students. At present, NVBOTS has all of its operations in Boston, MA. The vision of NVBOTS is to build a globally distributed network of on-demand, intelligent, automated 3D printers in order to deliver high quality printed parts. The team believes that the current additive manufacturing process is full of hassles and that this hinders growth in the user base of the technology. There is a steep learning curve involved in designing for 3D printing, part removal is cumbersome, and there is a lack of queuing, which makes 3D printers difficult to share. To tackle these problems, NVBOTS has developed the world's only 3D printer with automated part removal, which through their cloud-based interface can run continuously by itself and be controlled from any device [10]. See Figure 3-1 for the NVBOTS corporate logo.

Figure 3-1: NVBOTS logo [10]

Their current business model is to lease out printers for five-year terms at different pricing and packages to their educational and industrial customers, with full service offered as a part of every package [16]. The company recently closed a successful $2M seed round of funding.

3.1 The Product

The NVPro is a dual extrusion based printer with a resolution of 100 microns and an accuracy of 25.4 microns.
The build volume is a cube of 8 inches and the achievable printing speed is as high as 180 mm/s. Figure 3-2 shows how visually open the NVPro machine is, allowing for educational opportunities while proudly displaying the complex internal mechanisms.

Figure 3-2: NVPro printer [17]

Other features of the NVPro include automated part removal, which removes the need for someone to be present to clear the build area for subsequent prints, and a built-in camera that allows real time viewing of the printing process from any device. All printer management is through the cloud, so no extra software is required [17].

The NVPro caters to the education market as its target audience. Additional offerings in the package include 3D printable curricula. These modules encourage project based and applied learning, and lessons include life sciences, earth-space sciences, engineering, and many more [18].

The user interface (see Figure 3-3) is intuitive and easy to navigate. Its features include print preview with size, shape, and quality adjustments, administrative control for queue management, and a printer dashboard (see Figure 3-4) with a live video feed and other real time monitoring add-ons [19].

Figure 3-3: Print preview feature [19]

Figure 3-4: Printer dashboard [19]

3.2 The Market

NVBOTS leases out printers on a five-year contract that ensures recurring consumables revenue (plastic filament) and cloud services fees. Their beachhead market is the education space, in an attempt to capture the future designers and scientists early and also, through their data, learn what is desired from 3D printing. They currently have 16 printers rented by educational customers, 10 printers working internally, and 12 printers currently on the assembly line. Once they successfully penetrate the education marketplace, they will approach the industrial marketplace with improved technology offerings in an attempt to establish a stronghold there.
3.3 Company Analysis

Professor Michael Cusumano of the Sloan School of Management identifies eight key points of successful startup ventures [20]. The following section looks closely at how NVBOTS is currently positioned on the basis of these metrics, while Table 3.1 summarizes NVBOTS' status.

Management Team

The founding team of NVBOTS includes CEO AJ Perez, CTO Forrest Pieper, COO Chris Haid, and VP of Engineering Mateo Peña Doll, all of them MIT mechanical engineers. NVBOTS also has an esteemed board of advisors made up of experts formerly of the manufacturing and 3D printing industries and experienced professors at MIT. With new hires, including experienced people in key areas of supply chain, sales, and production, this metric seems well cleared for NVBOTS.

Attractive Market

The McKinsey Global Institute estimates the total economic impact of 3D printing by 2025 to be up to $0.6 trillion [21]. Hence, an attractive market certainly exists. With NVBOTS's beachhead market being largely untapped and their value proposition being specifically suited to capturing it, they are well on track. They are also targeting industrial markets with improved technologies.

Compelling New Product

The NVPro is a compelling product in itself, but it must be judged against the competition that it faces. In terms of feature offerings such as 24/7 printing without human intervention, ease of sharing among consumers, and use of data for future improvements, it is the only product that achieves all of them.

Strong Evidence of Customer Interest

NVBOTS already has customers using the product and a high anticipated demand for financial year 2015. In addition, NVBOTS is currently catering to its beachhead market and at the same time working on product innovations that will serve them well in the industrial marketplace. The true litmus test will come when they attempt to pitch to industrial customers and compete with other well-established players in that field.
Overcoming the "Credibility Gap"

Professor Cusumano describes this as "the fear among customers that the venture will fail, leaving the buyer without technical support or a future stream of product upgrades." In order to avoid this, startups must use present customers as references for new customers [20]. This requires exceptional customer service and also a reliable product that does not face critical issues in the field. This is one area of concern.

Demonstrating Early Growth and Profit Potential

NVBOTS has a well charted financial plan for growth and an existing customer base. Successful seed funding rounds are indicative of the company's merit.

Flexibility in Strategy and Technology

This metric cannot be assessed through plans made in advance, but only after the company has been running for a certain period of time and has proved to be responsive to market needs and technological changes in such disruption prone markets.

Potential for Large Investor Pay-Off

A startup that has established sources of funding beyond angel investors, family, friends, et cetera, shows promise for large pay-off and looks better to potential investors [20]. This is an area of opportunity for NVBOTS.

Table 3.1: NVBOTS evaluation

Elements for evaluation | NVBOTS
Management Team |
Attractive Market |
Compelling New Product |
Strong Evidence of Customer Interest | Opportunity
Overcoming the Credibility Gap |
Demonstrating Early Growth and Profit Potential |
Flexibility in Strategy and Technology | Opportunity
Potential for Large Investor Pay-Off | Opportunity

In conclusion, the company has a bright future ahead with their strong performance in almost all of the above mentioned metrics. NVBOTS should have a strong focus on capitalizing on opportunities by generating customer interest and also staying nimble and flexible in their strategy.
By refining their product design and manufacturing process, they can eliminate the initial concerns that their product faces in the field and ward off their single potential problem.

Chapter 4 Failure Mitigation Introduction

Failure mitigation is extremely important to any company, but for a high technology startup it is of the utmost concern. For a high technology hardware startup, customer satisfaction and company reputation are critical to market acceptance and the ability to increase demand. Product failures, especially in the hands of the customer, can cause long lasting and devastating impact. During the startup phase, companies are most vulnerable to this impact, as they have little to no reputation and are trying to establish themselves in the marketplace as trusted companies. The following sections give a brief overview of the importance of failure mitigation and of the ideal failure tracking, analysis, and resolution methodology that should be implemented to minimize any possible negative impact during startup, as well as throughout the life of the company. The initial failure mitigation status of NVBOTS will also be discussed.

4.1 Overview of Failure Mitigation and Importance

Some of the benefits a manufacturing company can expect to gain from implementing a Failure Mitigation Strategy (FMS) comprised of failure tracking, analysis, and resolution are:

* increased product reliability
* reduced severity of failures
* increased product performance
* reduced service costs
* increased manufacturing efficiency
* increased customer satisfaction

As the Failure Mitigation Strategy is a continual process, it will provide reliability improvements over the complete lifecycle of the product [22]. Failure tracking can allow a company to categorically and statistically view all their failures over time. It gives an objective overview of the quality and reliability of the product and manufacturing system.
Failure analysis is then performed on the collected failure tracking data; ideally the analysis is continually updated in real time as new failure tracking data is added. This can lead to important insights regarding which failures are important and need to be prioritized for mitigation. Root cause is investigated during analysis to help direct the failure resolution. In failure resolution, multiple methods are used to combat failures. Sometimes changes to product design, assembly, or handling are required, while in other cases testing may be more economical for failure mitigation.

A failure mitigation process that includes tracking, analysis, and resolution will enlighten management, engineering, manufacturing, and sales about issues that they might never have noticed, as well as ways to improve reliability and maintainability [22]. It will clearly bring to the surface which failures have the most impact and what is needed to mitigate them. If properly set up and ingrained into the company culture, a strong FMS can propel a manufacturing company to increased product quality, performance, and reliability. Continually improving quality, performance, and reliability provides a forceful competitive advantage that leads to a more successful and reputable company.

Despite all these benefits, failure tracking systems are not common practice in many industries [23]. Without a proper failure tracking system, companies are left to address product failures with much less efficient methods. Some typical methods for prioritizing failures employed by companies without failure tracking and analysis include first-in first-out, most frequent first, most expensive first, and even management intuition. All of these methods are blind to the true impact of the failures and address them without regard to which will have the most benefit to the product and company. Without failure tracking, many failures can go unnoticed and valuable data and information are lost [23].
Without the proper tools to identify, track, analyze, and resolve failures, companies run the risk of costly product failures, and most of the time these will occur in the worst possible location: at the customer.

High technology manufacturing startups, like most startups, lack the structure and calculated processes that keep larger and more established companies running smoothly. Because of this, startups are less likely to have a well-established and complete Failure Mitigation Strategy. Startups tend to fight fires (failures) day to day and are usually just trying to keep up with the daily needs of starting a company. Instead of collecting all failure data and then analyzing and addressing the most costly and impactful failures first, startups tend to take on the largest-looking fire first. The problem is that the apparent size of the fire is not always aligned with its impact on the company. A failure may appear very large or important because one customer called to complain, while in actuality it may be a very rare failure that does not warrant top priority. Proper failure mitigation is easier to implement than most think, and once set up it can run with minimal impact on daily operations while providing substantial benefits. A true FMS composed of failure tracking, analysis, and resolution will ensure that the company is aware of all failures and addresses them in the most effective and efficient manner, in order to mitigate the negative impact of failures on the company as soon as possible.

4.2 Overview of Ideal Failure Mitigation Strategy

An ideal Failure Mitigation Strategy is one that works best for a given company at a given time. However, most are structured in a similar manner and follow the same process flow. The Failure Mitigation Strategy detailed in this thesis was designed by the author for NVBOTS but was kept in a generic form so that it may be applicable to most manufacturing companies and possibly even appeal to other sectors.
The ideal FMS should consist of three major components: failure tracking, failure analysis, and failure resolution. There is a set workflow that, if followed, will result in a systematic reduction of the costs incurred from failures through a reduction in the number and impact of failures. Figure 4-1 diagrams the ideal FMS framework, visually detailing the structure and workflow. Each component of the FMS framework is introduced in this section and later expanded upon in much greater detail in Chapters 5, 6, and 7.

[Figure 4-1 content, in workflow order]
FAILURE EVENTS
OBSERVE
- Observe symptoms and conditions
- Avoid jumping to conclusions on cause
FAILURE TRACKING: DOCUMENT
- Document every issue/failure event
- Easy, quick, accessible data entry form
- Standardized and detailed event records
- Record as soon as possible
- Single unified tracking system
FAILURE ANALYSIS: DECIDE
- Frequency/total occurrences
- Cost: hardware; labor/time (travel, diagnosis, repair); customer impact
- Down select by using total cost model
- Investigate root cause
FAILURE RESOLUTION: ACT
- Known cause: change design, assembly, etc.; test if economical
- Unknown cause: screening test as early as possible

Figure 4-1: Failure Mitigation Strategy Framework. Visually detailing the structure and workflow.

The failure mitigation workflow starts with a failure event. This event can be either a failure or an issue. A failure is defined as an event that renders the 3D printer unusable and cannot be easily fixed by the customer; the printer no longer operates as broadly expected by the customer. An issue is defined as an event that degrades the performance of the 3D printer while its operation remains similar to the customer's expectations, or an event that renders the printer unusable but can be easily fixed by the customer. Failures cause more impact to the customer and company, but both event types are important and must be recorded.
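The distinction between a failure and an issue can be written as a simple rule. The following Python sketch is illustrative only: the function name and its two boolean inputs are assumptions made for this example and are not taken from the NVBOTS tracking system.

```python
def classify_event(printer_unusable: bool, customer_can_easily_fix: bool) -> str:
    """Classify a logged event as a "failure" or an "issue".

    Hypothetical helper following the definitions in the text:
    a failure leaves the printer unusable AND is not easily fixed by
    the customer; every other logged event is treated as an issue.
    """
    if printer_unusable and not customer_can_easily_fix:
        return "failure"
    return "issue"
```

Under this rule, a filament jam that stops the printer but is cleared by the customer is an issue, while the same jam requiring a service visit is a failure. Both would still be recorded.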
When a failure event takes place, the first action should be to observe. The company should observe the symptoms and conditions during the failure, if possible, and gather as much information as possible about the state of the 3D printer. If the failure event is being relayed from a customer, the company should extract as much detail as possible about the symptoms and condition of the printer. At this stage in the workflow the company should avoid jumping to conclusions on the cause of the failure event. This process should be performed in an objective manner. Failure tracking, failure analysis, and failure resolution are described in further detail below.

4.2.1 Failure Tracking

Once a failure event is observed or relayed, the first step in the ideal FMS is failure tracking. This step is critical as it collects all the information and data that will be used in the remaining steps. Failure tracking is all about organized and complete documentation of the failure event. Every failure and issue must be documented. The data entry form must be quick, easy to use, and easily accessible in order to encourage employees to completely fill out all fields and provide a high level of detail. High quality entries allow for high quality results later on. The event record should use standardized fields and language so that subsequent data processing is more easily handled. Failure events should be logged as soon as possible in order to avoid any loss of data. The sooner the data is logged, the more complete and accurate the record is [24]. At this step possible causes can be added to the tracking system, but only after symptoms and conditions have been objectively recorded. There should be only one unified failure tracking system to eliminate the possibility of duplicates or mismatched data sets. More information on the failure tracking process and ideal failure tracking system is detailed in Chapter 5.
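As a rough illustration of a standardized event record, the sketch below stamps each entry at creation time and rejects symptoms that are not in a standard vocabulary. It is a minimal sketch under stated assumptions, not the NVBOTS implementation: the class name, field names, and the short symptom list are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

# Hypothetical subset of a standardized symptom vocabulary (illustrative names).
STANDARD_SYMPTOMS = {"Extruder Jam", "Removal Motor Stall", "Torn Buildtak"}


@dataclass
class FailureEventRecord:
    """One logged failure event; all field names are assumptions for this sketch."""
    hardware_id: str                 # serialized hardware number of the printer
    classification: str              # "failure" or "issue"
    symptoms: List[str]              # objective symptoms, standardized language only
    details: str = ""                # free-text detail beyond the standard language
    possible_causes: List[str] = field(default_factory=list)  # optional, added last
    logged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        # Reject free-form symptom names so later statistical processing stays simple.
        unknown = [s for s in self.symptoms if s not in STANDARD_SYMPTOMS]
        if unknown:
            raise ValueError(f"non-standard symptom(s): {unknown}")
        if self.classification not in ("failure", "issue"):
            raise ValueError(f"unknown classification: {self.classification}")
```

Keeping possible causes as an optional field with an empty default mirrors the guidance that causes are recorded only after the symptoms and conditions have been documented objectively.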
4.2.2 Failure Analysis

The second step in the ideal FMS is failure analysis. The basis of this step is to decide which failures to address first by determining which ones have the greatest impact on the company. This analysis is performed using a total cost model. The total cost model takes into account both the frequency of a failure event type and the cost incurred to the company. The cost to the company can be further broken down into sub-costs: hardware, labor/time, and customer impact. For NVBOTS, the last sub-cost, customer impact, was determined to be very high with respect to the hardware and labor costs, and thus the total cost values for NVBOTS are dominated by the customer impact. This will vary from company to company, but customer impact must be accounted for as it is usually quite important and sometimes the driving cost factor.

Once all of the data has been processed, the analysis will show which failure event types create the greatest cost to the company. Event types can now be down selected by the total cost model, only advancing the top cost contributors to root cause investigation. Ideally this step can be automated and continually updated in real time as new data is logged into the failure tracking system. Finally, root cause analysis is performed on the top cost contributing failure event types to determine how to resolve the failure in the future. Ideally the company will be able to determine the cause of each failure type, but this is not always the case. Fortunately, the failure resolution step can handle both known cause failure types and unknown cause failure types through the use of multiple resolution methods. More information on the failure analysis process is detailed in Chapter 6.

4.2.3 Failure Resolution

Failure resolution follows the failure analysis step and is the last step in the ideal FMS. Failure resolution is initiated by the completion of a root cause analysis of a top cost contributing failure event type.
If a root cause is found, the solution to that cause, be it a design change, assembly process change, new handling instructions, et cetera, will be estimated for cost. The cost of the solution will then be compared to the cost of the failures extrapolated out in time, and the change will be made as long as its cost is less than the predicted cost of future failures. This will be the case a majority of the time. In some rarer instances the cost to fix the failures may be higher than the cost of the failures themselves. If this is found to be true, the engineering team should look into whether a test can be used to screen for the particular failure, depending on the nature of the cause. Often a simple test can be very economical at reducing the chance of the product experiencing a failure later on, especially when the failure is related to varying incoming part quality.

If the root cause cannot be determined, the engineering team should determine a screening or stress test that could weed out possible failures. This test should be implemented as early in assembly as possible in order to remove the failure before more value is added to the 3D printer. The earlier the failure is caught, the more time and money is spared by the manufacturer. Failures with unknown causes should be periodically revisited with a fresh root cause investigation, as new information may lead to a definitive cause. More information on the failure resolution process is detailed in Chapter 7.

4.3 Initial Status of Company

NVBOTS, as of March 2015, operated like most startups, fighting fires day to day. They were attempting to collect failure data, but they had multiple failure tracking spreadsheets with inconsistent and missing data. Tracking a failure was not a top priority and many times failures went undocumented. One major concern was that repeat failures were rarely recorded.
The engineering team at NVBOTS figured that they were aware of the problem, so recording it a second, third, or 20th time was not needed. This leads to lost data that could provide critical insights into the true cost and impact of the failures, as well as information needed to properly determine the root cause and resolution method. When looking at one failure event, confounding factors could mask the true cause, but over time with multiple similar failures the true cause is much easier to distinguish from the noise.

Failure analysis was practically non-existent except for a root cause analysis performed with little structure. The engineering team would usually diagnose the individual failure on the spot and jump to logical causes without review of historical data. Changes would be made to the design or process and they would run a quick test to see if the change addressed the failure; sometimes it did, other times multiple iterations would be needed before a solution was found. This process might be quicker for one failure, but it is a much slower process over the course of many failures due to its inefficiencies. NVBOTS knew that they needed to collect data on failures in order for their company to succeed, but multiple factors, such as inexperience, workload, and perceived effort, prevented them from establishing a proper Failure Mitigation Strategy.

NVBOTS performs a burn-in test of every printer once complete, before shipping it off to a customer. This burn-in period lasts one work week and mainly consists of multiple part builds under standard operating conditions. Some failures, such as infant mortality failures in electronics and assembly errors, are caught at this time. This is both good and bad for the company.
It is positive in that they have caught many failures in-house before shipping to the customer, allowing fewer failures in the customers' hands, yet negative in that NVBOTS is finding failures in their product after 100% of the value has been added to the 3D printer. Because of this, any failure found requires extensive teardown and rebuild, costing the company valuable time and money that could be avoided. Again, some of these failures that occurred in-house are not recorded or documented properly, leaving their failure data to show an incomplete picture.

The research and work performed for this thesis is in support of establishing and ingraining a proper Failure Mitigation Strategy at NVBOTS to increase their likelihood of successful scale-up.

Chapter 5 Failure Tracking

The first step in the ideal FMS deals with documenting the failure events (see Figure 5-1). Failure tracking converts failure events into valuable data. As failure tracking creates all of the data that the FMS runs on, it is critical to have complete and accurate records. Without a proper failure tracking system, valuable data is lost along with the possible cost savings and product improvements that could be highlighted by the failures. A strong FMS begins with a strong failure tracking system. Below, the concepts mentioned in Section 4.2.1, along with others, will be expanded upon, detailing the ideal failure tracking methodology.

[Figure 5-1 content] FAILURE TRACKING: DOCUMENT
- Document every issue/failure event
- Easy, quick, accessible data entry form
- Standardized and detailed event records
- Record as soon as possible
- Single unified tracking system

Figure 5-1: FMS failure tracking. The first step in the FMS framework.

5.1 Detailed Methodology

Document Every Failure Event

Every failure event must be recorded. No matter how big or small, significant or insignificant, unique or common, issue or failure, every individual event must be logged.
This is very important, as this data will later be statistically processed and missing data will only lead the company to misleading results. The process is robust enough that one or two missed entries should not skew the data significantly, but companies should take care to log every event in order to obtain results that reflect the true status of their failure position.

Be sure to encourage the reporting of issues. Naturally, failures will be reported as the product is no longer usable, but issues may go unreported as they may be easily fixed or only degrade the performance a little while the customer is still satisfied with the product. This can lead to many issues never being reported and thus not being tracked. Establish an incentive program or other motivational system to increase the number of issues reported. Depending on the product, reporting features can be built into the design to assist the company in tracking failures and issues. In the case of NVBOTS, the NVPro is always connected to the main company server via the internet, so NVBOTS is able to track issues and failures remotely. For example, if a standard filament jam occurs and the customer is able to fix it, the printer can send a message that a jam has occurred even though the customer never reported it. Documenting as many failure events as possible is vital to a successful FMS.

Detailed, Complete, and Accurate

As the saying goes, garbage in, garbage out. Care must be taken to ensure that failure events are logged in a detailed, complete, and accurate manner. Many times the user who entered the failure event details will not be on the team conducting the root cause analysis. Thus, the failure event record should be detailed enough that anyone reading it could reconstruct the situation without input from the original user who filed the failure event. The results from the FMS are only as good as the input data generated during failure tracking.
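One way a tracking form could enforce completeness is to refuse, or at least flag, entries with empty required fields. The Python sketch below is a hedged illustration: the field names and the choice of required fields are assumptions made for this example, not the actual NVBOTS form definition.

```python
# Hypothetical subset of required (non-optional) tracking fields.
REQUIRED_FIELDS = ("hardware_id", "classification", "symptoms", "details", "fixed")


def missing_required_fields(record: dict) -> list:
    """Return the names of required fields that are absent or left empty."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None:
            missing.append(name)
        elif isinstance(value, str) and not value.strip():
            missing.append(name)   # blank or whitespace-only text
        elif isinstance(value, list) and not value:
            missing.append(name)   # empty selection list
    return missing


# Example entry with made-up values; the details field was left blank.
entry = {
    "hardware_id": "H17",
    "classification": "failure",
    "symptoms": ["Extruder Jam"],
    "details": "   ",
    "fixed": False,   # a legitimate answer, so it must not count as missing
}
incomplete = missing_required_fields(entry)   # ["details"]
```

Note that a boolean answer of False is treated as a valid entry; only absent or blank values are flagged.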
Record Events As Soon As Possible

Failure events should be recorded as soon as possible to reduce the risk of loss of data. The longer the time between observing a failure event and recording it, the less accurate the details become. Accuracy is crucial, so it must become second nature to record right away. When a technician is out on a service call it makes sense to repair the unit first, allowing the customer to begin using the product again, but directly afterwards the technician should record the event to ensure that it gets documented and that the details are as accurate as possible.

Be Objective

The symptoms, conditions, and details of the failure event should be recorded first and be strictly objective. Failure tracking is not the ideal time to make subjective decisions on the definitive cause of the failure event. The data from one single failure event is less conclusive than that from many similar failure types, and thus decisions will be made later in the failure analysis step. The technician must remember that failure tracking is all about objective documentation of the failure event in order to create a detailed and accurate historical record of what happened. Failure tracking attempts to create a virtual snapshot of the failure event so that it can be viewed again in the future. At the very end of the entry form, possible causes may be listed if deemed appropriate.

Standardized Form and Language

The failure tracking system must use a standardized form and standardized language or codes. The standardization is needed for statistical processing in the failure analysis step. Processing is much easier when the same failure type appears as the same code or name. This will streamline the remainder of the FMS as well as eliminate confusion while entering data into the failure tracking system.
Standard language ensures that data recorded by one technician on failure type X appears the same as data entered by another technician documenting a failure event of the same failure type. This also allows for quicker and clearer communication between different members of the company.

Easy, Quick, Integrated, and Accessible Data Entry Form

The failure tracking data entry form should be easy, quick, integrated, and accessible. Ease of use is important as it will encourage more frequent usage and more complete entries, since the technicians will be happy to use it. Similarly, the form should be quick and to the point. The form must be as efficient as possible and capture all the required data without asking for extraneous items. To increase the ease of use and speed of filing, the form should be integrated into the system so that, where applicable, data can be automatically populated. Items such as the failure event number, username, date and time, location, firmware revision, software revision, et cetera can be automatically filled, saving the technician valuable time while also reducing the possibility of typos and incorrectly entered data. The form must be highly accessible to allow timely entry of failure events. When technicians are out on a service call they must be able to remotely enter data via a mobile device or laptop. Timely entry of failure event details reduces the chance that detail accuracy is lost over time. Failure tracking systems with forms that are easy to use, quick, integrated, and accessible will capture better data and be better accepted by the technicians.

Single Unified System

Only one unified tracking system should be used to track all failures. Multiple tracking systems can allow for a number of issues such as duplicate entries, non-standard language, and confusion. Combining data from multiple tracking systems can become difficult as the data may not be compatible or may be formatted incorrectly.
A single unified system promotes standard language and codes, accurate data entry, and a smooth transition to data analysis. Duplicate entries are deterred, and are easily recognized and discarded if found. A single tracking system also promotes simplicity, allowing for better technician comprehension and adoption. Multiple tracking systems generate more work, are less efficient, and should be avoided.

Proper Training and Company Culture

Technicians, engineers, and anyone else who will be using the tracking system should be properly trained. They should receive training on the whole FMS as well as on proper data entry. Training on the complete FMS will provide all users a thorough understanding of the entire process, which will allow them to understand the importance of their part in the system. Each member of the team needs to fully comprehend their role and how proper execution will benefit themselves, the system, and the company as a whole. Properly trained users will generate better data and be more efficient and effective at documenting failure events. The system is only as good as the users who use it.

The concept of failure tracking should be ingrained into the company culture and heavily supported by management. Technicians should not be motivated by other factors to fill out the form as quickly as possible or, even worse, skip an entry altogether. On the contrary, they should be encouraged and motivated to submit an accurate, detailed, complete, and timely entry for every failure event. Failure tracking should become second nature to the users of the system.

Tracking Items

Depending on the company and product, the items tracked can be tailored to best fit the given company. In general, the items should provide enough information to enable the full virtual reconstruction of the failure event.
When developing a failure tracking system, one should think about all of the possible ways to describe the state of the product and its environment, as these are the items that should be tracked. Items to be tracked should be clearly defined, and a standardized language or coding system should be used to standardize responses. Note fields for extra details can help a technician report an event that requires special reference beyond that of the standard language or codes. When applicable, items should be automatically populated by the failure tracking system to reduce manual entry by the technician. All fields are required except for those listed as optional. Fields with standardized language or codes should have drop-down lists or searchable lists to choose from. Table 5.1 details the tracking items for NVBOTS. This list has been lightly tailored to fit the NVBOTS NVPro 3D printer, but it could easily be altered to fit another product or company.

| Tracking Item | Description | Entry Type |
|---|---|---|
| Failure Event # | Automatically generated, serialized number unique to each failure event | Autofill |
| User Name | Unique identifier for the user who submitted the tracking form | Autofill |
| Date/Time | Date and timestamp | Autofill |
| Firmware Rev # | Current firmware revision number on the hardware involved with the failure event | Autofill |
| Software Rev # | Current software revision number on the hardware involved with the failure event | Autofill |
| Location of Printer | Macro and micro location of the 3D printer | Autofill |
| Current Build | File currently being built when the failure occurred | Autofill |
| Current Material | Material of filament spool currently loaded while the failure occurred | Autofill |
| Software Data Dump | Download of the log files currently on the machine while failure occurred | Autofill |
| Environmental Conditions | Temperature, humidity, atmospheric pressure, et cetera in local vicinity to the printer | Autofill |
| Travel Time | Round trip travel time automatically calculated from online route planning service | Autofill |
| Travel Distance | Round trip travel distance automatically calculated from online route planning service | Autofill |
| Hardware # | Serialized hardware number unique to each NVPro frame | Manual |
| Failure or Issue | Failure event classification of failure or issue | Manual |
| Symptoms | Standardized symptoms observed that describe the failure | Manual |
| Details on Failure Event | Open details field to note extra specific details about the failure beyond that of the standard language | Manual |
| Part #s | Part numbers of the machine parts that are damaged and need to be replaced or repaired, if applicable | Optional |
| Fixed? | Was the failure or issue fixed? | Manual |
| Fix Details | Details describing the fix, if applicable | Manual |
| Time Spent on Diagnosis | Time in minutes spent diagnosing the failure event and deciding how to fix the failure event | Manual |
| Time Spent on Repair | Time spent in minutes on teardown and repair | Manual |
| Primary Cause | Possible primary cause of failure event, if applicable | Optional |
| Secondary Causes | Possible secondary causes of failure event, if applicable | Optional |
| Photos/Videos | Photos or videos that could assist in the investigation of the root cause or that document the status or condition of the printer or environment | Optional |
| Notes | Open notes field for any extra information that is important but does not fit elsewhere | Optional |

Table 5.1: Failure tracking items. List of recommended tracking items for NVBOTS with brief descriptions.

5.2 NVBOTS Case Study

This thesis and subproject was conducted in support of NVBOTS, and the work was intended to establish a framework for an ideal Failure Mitigation Strategy for the NVPro. Initially NVBOTS was tracking some of their failures, but their tracking systems were inefficient and lacking in many areas. Table 5.2 details the tracking status of the old systems versus the new system with regard to the ideal tracking items. Improvement can be clearly seen, as the new system has been designed to comply with the ideal FMS tracking methodology.
Details on the old tracking systems and the new and improved tracking system can be found in the immediately following sections.

[Table 5.2 lists each tracking item from Table 5.1 with its entry type (Auto, Manual, or Optional) and its tracking status in the old systems and in the new system. The per-item status cells are not recoverable here; surviving status values include "Some", "Manual", "Soon", and "Future".]

Table 5.2: Tracking systems comparison. Note the tracking improvement over the multiple old tracking systems.

5.2.1 Original Tracking Systems

Initially there were six different tracking systems being used at NVBOTS. The number of tracking systems grew organically as the company expanded. NVBOTS knew that tracking was important, but new systems were created over time and they had separate systems for different failure locations. Each system had its own layout and there was no communication or ability to merge data between the multiple systems. Each tracked a different set of items and there was very little standardized language or codes. The data collected from event to event was very inconsistent, which made it difficult to fully understand each failure event and even tougher to merge all of the data into one system. The six tracking systems being used were:

- Google Docs In-house Operation Failures spreadsheet
- Google Docs Field Operation Failures spreadsheet
- Google Docs Assembly Notes spreadsheet
- GitHub tracking system
- Django Staging server
- Django My server

The three Google Docs spreadsheets all had a different format and tracked different items.
Some of these spreadsheets were not recording who entered the data, and they were not detailed and objective in documenting the events. The GitHub tracking system was being used more as a work list than a proper tracking system. The major drawback to GitHub was that a majority of the time the users would only record a unique failure type once. Subsequent failure events of the same type would likely not be recorded as the item was already on the GitHub list for future work. This method throws away valuable failure data that could be used to track how often a specific event type occurs as well as the total cost to the company from those untracked failures. Moreover, root cause analysis is more successful with multiple data sets for the same failure type. Determining the root cause from one event is much more difficult and uncertain than concluding on a cause of a failure event type with multiple event data sets. Lastly, the two Django servers were better at automatically capturing some tracking items, but the systems were heavily oriented towards failure cause instead of objectively documenting the symptoms and conditions.

Overall the systems did track over 200 failure events with 50 unique failure types, but the data transfer and processing for analysis was laborious and largely manual. Standardization of data was applied retroactively, while missing items were filled in with best estimates by the engineering team and the author. Many missing items could not be estimated and thus the data set is still incomplete. The initial situation of the NVBOTS tracking systems highlights the need for a single proper failure tracking system. Figure 5-2 shows a collage of the multiple tracking systems originally employed by NVBOTS.

[Figure: collage of screenshots of the original tracking systems]

Figure 5-2: Old tracking systems. Promoted non-standard language and near impossible data integration.

5.2.2 New Tracking System

The new FMS tracking system developed in this project and now in use by NVBOTS is based on the ideal failure tracking methodology. The tracking system still uses the Django server architecture, but has been completely revamped. Most of the tracking items are already added to the system, and many have been automated, with plans to include all of the recommended tracking items as soon as possible. A few of the more complex automation features will be added once developed. The new system is easy to use, has a clean layout, and documents the symptoms and conditions objectively using standardized language. The Django system is accessible anywhere with a connection to the internet to allow for easy access and timely recording of failure events.

The system is only as good as its users. Because of this, the team at NVBOTS has been instructed on the complete FMS and how to properly use the tracking system to its full potential. One big change from before on the human side is that the team now understands why it is important to track every single failure event. Hopefully this leads to a culture of extremely high failure event tracking percentages. NVBOTS is also now fully aware of the concept of objectively documenting first, before thinking about causes. This will ensure that data is collected in a straightforward manner with little need for interpretation. The failure tracking knowledge base at NVBOTS has increased in the past few months, which will only further improve the already bolstered tracking system via increased and improved user input.

Although this is advised against, NVBOTS is currently using two tracking systems due to technical issues with how the 3D printers communicate with the main servers.
Essentially they have two identically revamped Django tracking systems: one tracks failure events of in-house research and development 3D printers, while the other tracks all other 3D printers. One specific issue to watch out for is non-standardized language between the two systems. When NVBOTS is able to integrate the two systems into one unified system in the future, caution must be taken to avoid language mismatch. Also, failures found in the research and development printers may not be indicative of those found in the production models. This will require NVBOTS to include another tracking item when merging the two systems, in order to distinguish between research and development units and production units during analysis. Large improvements have been made to the failure tracking system at NVBOTS, but there is still more work to be done to achieve the complete FMS tracking system detailed earlier.

Figures 5-3 and 5-4 show the current welcome screen and failure event list, which are designed to be simple and easy to use. Figure 5-5 details the event submission page, which has been designed to minimize the amount of manual entry while ensuring a common standard language is used.

Figure 5-3: Django welcome screen. Note the simple layout with intuitive and easy to find "Add" and "Change" failure event buttons

Figure 5-4: Django failure event list (26 failure events shown, each listing the printer, failure event, creation date, determined cause, and fixed status)

Figure 5-5: Django event submission page. Reduces manual entry required via autofill and promotes use of standard language

Chapter 6 Failure Analysis

The failure analysis component of the ideal FMS processes all of the data collected from the failure tracking system to determine which failure types have the largest impact on the product and company (see Figure 6-1). This allows a company to focus on the top contributing failure types first, efficiently reducing failure costs while improving their product. The failure analysis process results in the automatic prioritizing of failure types, via the total cost model, for root cause investigation. Root cause analysis is then performed on the top priority failure types to determine the appropriate failure resolution method. Failure analysis also allows for visualization, statistical evaluation, and a clear and broad understanding of the failure landscape of a given product or company.

Figure 6-1: FMS failure analysis. The second step in the FMS framework

Below, the concepts mentioned in Section 4.3.2, along with others, will be expanded upon, detailing the ideal failure analysis methodology. This methodology is intended to provide a foundational framework that can be applied to any company interested in establishing a strong FMS.
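The automatic prioritization described above can be illustrated with a minimal sketch. It assumes each logged failure event carries the three subcosts of the total cost model (hardware, labor, and customer impact); the class name, field names, and sample events are hypothetical and are not taken from the NVBOTS tracking system or its data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FailureEvent:
    # Hypothetical record layout; the three-way cost split follows the total cost model.
    failure_type: str
    hardware_cost: float
    labor_cost: float
    customer_impact_cost: float

    @property
    def total_cost(self) -> float:
        return self.hardware_cost + self.labor_cost + self.customer_impact_cost


def prioritize(events):
    """Return (failure_type, total_cost, event_count, average_cost) rows,
    sorted so that the most costly failure type comes first."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event in events:
        totals[event.failure_type] += event.total_cost
        counts[event.failure_type] += 1
    rows = [(t, totals[t], counts[t], totals[t] / counts[t]) for t in totals]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up events for illustration only, not NVBOTS data.
events = [
    FailureEvent("Extruder Jam", 5.0, 20.0, 0.0),
    FailureEvent("Extruder Jam", 5.0, 20.0, 0.0),
    FailureEvent("Printer Disconnected", 0.0, 10.0, 0.0),
    FailureEvent("Bed Won't Heat", 40.0, 60.0, 100.0),
]
ranking = prioritize(events)
for failure_type, total, count, average in ranking:
    print(f"{failure_type}: total ${total:.2f}, {count} events, average ${average:.2f}")
```

Re-running a ranking of this kind after every event submission is one way the prioritized list could stay current without manual processing.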
6.1 Detailed Methodology

Prioritizing Failure Types

The main objective of the failure analysis step is to calculate which failure types create the greatest cost or impact to the company. This is a large concern for companies, as they want to focus first on the solutions that will have the largest impact. The failure analysis prioritizes failure types by using a total cost model. The total cost model objectively calculates the total cost of all failure types to date by summing up the costs of individual failure events based on hardware costs, labor costs, and customer impact costs. This allows the company to see which failure types need to be addressed first in order to eliminate the largest impacts. All data for total cost should be included in the failure event form, allowing the calculations and prioritization to be performed automatically in real time after each event submission to the failure tracking system. This eliminates the need for manual processing or making decisions based on intuition. Which failure type to address first is now based on facts, not subjective feel. More details explaining the total cost model can be found in Section 6.2.

Root Cause Investigation

After prioritization, the top cost contributing failure types are investigated for root causes. Root cause analysis can be performed using one of the many popular methods, including but not limited to:

- Cause and Effect (Fishbone or Ishikawa Diagram)
- Apollo Root Cause Analysis
- Tree Diagram
- Kepner-Tregoe Approach
- Five Whys

The Cause and Effect Diagram method involves drawing a fishbone diagram with branching lines, signifying each of the possible cause categories such as materials, environment, machines, measurement, et cetera. Further lines are drawn sprouting off the category lines, marking possible causes within each category. The Apollo Root Cause Analysis involves asking the question "why?" of a problem or effect.
The causal answers are twofold, always consisting of an action and a condition. This question is asked again and again until there are no more causal answer pairings. The Tree Diagram method is primarily used to narrow down to a specific cause when the general issue is already known. It involves drawing a tree diagram with two or more branches growing off the trunk and subsequent branches forming off the larger branches, with each level becoming more and more specific. The Kepner-Tregoe Approach is a method based on selecting the most probable or best known cause. It involves rating a list of potential causes and then testing the most probable cause. The Five Whys method is self-explanatory in that it is performed by asking the question "why?" five successive times, resulting in the root cause being found in five levels or fewer. These methods can be performed by a single individual but they are usually more effective when performed by a team [25]. For more information on how to perform a root cause analysis and details on each method please see Otegui [26], Gano [27], Williams [28], and Sarkar [29].

The root cause analysis will either find a cause that, when removed, will remove the failure, or the cause will remain undetermined. The failure resolution step is able to handle both cases: failure types with known root causes and those with unknown root causes. Failure resolution will be further detailed in Chapter 7.

Statistical Insight

Failure analysis provides statistical insight into the failure landscape of a given product or company. The data collected in failure tracking is organized and processed so that trends and important connections become easily visible.
It allows for quick sorting and filtering of data by attributes such as:

- total cost
- average cost per failure event
- number of events
- failure type
- issue versus failure
- cause type
- when found
- part cost
- repair time

Graphs and visuals can easily be generated to better display the statistical findings from the data. Some popular charts that allow for a deeper comprehension of the failure tracking data include:

- number of failures over time
- total cost of failure event types
- failure subassembly histogram
- failure cause type histogram
- value added when found

The statistical insight provided by failure analysis allows engineering and management to fully understand where they currently stand with product failures and points them to potential areas to focus on for improvement.

6.2 Failure Economics: Total Cost Model

The metric used to prioritize and down select failure types for root cause analysis, and later failure resolution, was total cost. As most businesses exist to generate profit, cost is the vehicle that transmits the impact of failures to a company. Cost is also very advantageous to use, as the value or impact of almost anything can be converted into a cost or a credit for universal comparison and mathematical manipulation. The total cost is the summation of all the costs of individual failure events. Thus the total cost of a specific failure type is the summation of the cost of all the failure events of that failure type. Cost can be broken down into three main subcosts: hardware cost, labor and time, and customer impact. Labor and time can be further divided into travel time and cost, diagnosis time, and repair time. These cost factors have been developed for NVBOTS but are generic enough to apply to most companies with minimal alteration. The total cost model should be designed to automatically update in real time as new failure events are logged in the failure tracking system.
All costing data will be provided in the new failure tracking system, allowing the company to always be informed as to which failure types are of top priority. A total cost calculation example is provided in Appendix A for better understanding.

Hardware

The hardware cost is the cost of the parts and consumables needed to fix the given failure. NVBOTS leases and services all of their 3D printers, thus the hardware cost is the cost required to source or manufacture each part. Costs for parts and consumables were taken directly from the NVBOTS parts cost list.

Labor and Time

As noted earlier, labor and time can be broken down into travel time and cost, diagnosis time, and repair time. Travel time and cost was estimated at a flat rate for all historical failure events, as this data was not recorded in the previous failure tracking systems. The flat rate for travel was calculated assuming a round trip time of two hours and a round trip distance of 80 miles. The hourly skilled technician labor rate used in calculations was $20 per hour and the mileage cost rate was $0.575 per mile. These conservative estimates were based on the fact that currently all NVBOTS customers are local to the Greater Boston area and the service is based out of NVBOTS headquarters in the Seaport district of Boston. After surveying the service technicians, it was determined that the average trip length was approximately one hour each way. The average mileage was calculated using the distances between NVBOTS headquarters and their customer locations via Google Maps [30]. The assumed skilled technician labor rate was averaged from labor rates found on PayScale [31]. The mileage cost rate was derived from the United States Internal Revenue Service standard deductible cost rate for using an automobile for business in 2015 [32]. Using the ideal FMS failure tracking system, exact round trip mileage and travel time will be recorded, allowing for more accurate travel costing in the future.
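As a worked illustration of the flat travel rate, the sketch below combines the stated round trip time, distance, labor rate, and mileage rate. The function and constant names are hypothetical, and the resulting dollar figure is not quoted in the text; it simply follows from the stated inputs.

```python
# Inputs stated in the text for historical NVBOTS failure events.
ROUND_TRIP_HOURS = 2.0
ROUND_TRIP_MILES = 80.0
LABOR_RATE_PER_HOUR = 20.0     # skilled technician, dollars per hour
MILEAGE_RATE_PER_MILE = 0.575  # 2015 IRS standard business mileage rate


def travel_cost(hours, miles,
                labor_rate=LABOR_RATE_PER_HOUR,
                mileage_rate=MILEAGE_RATE_PER_MILE):
    """Travel cost = technician time plus mileage, rounded to cents."""
    return round(hours * labor_rate + miles * mileage_rate, 2)


flat_rate = travel_cost(ROUND_TRIP_HOURS, ROUND_TRIP_MILES)
print(f"Flat travel rate per service trip: ${flat_rate:.2f}")
```

Under these inputs the flat rate works out to $40 of technician time plus $46 of mileage. Once exact mileage and travel time are recorded per event, the same function could be called with the recorded values instead of the flat-rate assumptions.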
Diagnosis time is the time incurred assessing the failure and determining the appropriate fix. Repair time is the time spent by the technician during repair, including teardown and rebuild. Diagnosis and repair times for all historical failures were estimated by the engineering team and field technicians at NVBOTS during a meeting with the author, as they were not recorded in the past. The same hourly labor rate of $20 was used for calculating diagnosis and repair costs. Moving forward, estimates will no longer be needed, as the diagnosis and repair times will be tracked by the new tracking system.

Customer Impact

Customer impact is the most vague and intangible cost associated with the total cost model. Careful thought and time should be spent considering the cost of customer impact, as it usually has a significant effect on the total cost. Customer impact is the value of lost trust and satisfaction of a customer in the product, converted to a cost. For a startup company with a recurring lease business model, customer impact is of the utmost importance, as customer satisfaction and trust are needed to ensure renewal of the lease.

For NVBOTS the customer impact was calculated for the two failure event categories: issues and failures. The director of supply chain and finance, along with the engineering team and the author, determined that the average customer was likely to not renew their lease if they had more than one issue every other week or more than two failures within a year. It was agreed that these were the best estimates for now, but the values should be updated as more historical data is generated. A customer survey could also shed light on more accurate customer impact costs. The value of an NVBOTS customer over the intended five year long-term contract was estimated to total $25,000, taking into account production and service costs [33].
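The renewal thresholds and customer value above imply a per-event customer impact cost. The following is a minimal sketch of that arithmetic, assuming a 52-week year and assuming the threshold counts are simply accumulated over the five year contract; the variable names are hypothetical.

```python
CUSTOMER_VALUE = 25_000   # estimated value of a customer over the contract, dollars
CONTRACT_YEARS = 5
WEEKS_PER_YEAR = 52       # assumption used to turn "every other week" into a count

# Threshold counts over the whole contract: one issue every other week,
# and two failures per year.
issues_to_lose_customer = (WEEKS_PER_YEAR // 2) * CONTRACT_YEARS
failures_to_lose_customer = 2 * CONTRACT_YEARS

# Spread the value of a lost customer evenly over the events that cause the loss.
cost_per_issue = CUSTOMER_VALUE / issues_to_lose_customer
cost_per_failure = CUSTOMER_VALUE / failures_to_lose_customer

print(f"Issues over contract: {issues_to_lose_customer}, cost per issue: ${cost_per_issue:.0f}")
print(f"Failures over contract: {failures_to_lose_customer}, cost per failure: ${cost_per_failure:.0f}")
```

Spreading the lost customer value evenly over the threshold count is a simplification; a model that weights later events more heavily would give different per-event costs.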
To calculate the cost per issue, the value of a lost customer ($25,000) was divided by the number of issues over five years estimated to cause a lost customer (130 issues). Thus, an issue was calculated to cost $192 per event in customer impact. Similarly, to calculate the cost per failure, the value of a lost customer ($25,000) was divided by the number of failures over five years estimated to cause a lost customer (10 failures). Thus, a failure was calculated to cost $2,500 per event in customer impact. It can be clearly seen that customer impact carries an immense cost for NVBOTS due to their business strategy and customer base. This highlights the importance of customer impact and the great need to reduce the number and severity of failure events occurring in the hands of the customer. There is no customer impact if the failure event is caught in-house.

6.3 NVBOTS Case Study

Failure analysis was performed on historical data collected from the six failure tracking systems at NVBOTS. Substantial manual work was done to standardize the data language, merge the data sets, and fill in missing items where possible. A spreadsheet was developed to organize and analyze the data. See Appendix B for a screenshot of a representative portion of the failure analysis spreadsheet. As of July 10th, 2015, 204 failure events were tracked, including 50 unique failure types, accounting for over $75,000 in tracked costs. The true cost of all failures is most likely higher, as many historical failures were not documented.

The data was first manipulated to categorize the failure events by subassembly. Figure 6-2 shows the resulting histogram of failure events by subassembly. The gantry subassembly shows the highest number of failure events, most likely because of the heavy scrutiny applied to this subassembly at NVBOTS. This is the only subassembly that is measured on the coordinate measuring machine (CMM) before and after assembly. Luckily this subassembly is early in the final assembly order. This allows for failures to be caught and addressed early in the assembly process, before most of the value has been added to the 3D printer. The earlier a failure is found in the manufacturing process, the less backtracking and rebuild is required, thus avoiding greater costs. Of more concern are the values for the extruder and electronics subassemblies. These two subassemblies are added to the printer much later in the final assembly process, and their failures incur greater costs if found after final assembly. A key takeaway from the chart is that NVBOTS should implement subassembly testing of the gantry, extruder, and electronics before the final assembly procedure. This will limit the number of failures that make their way into the final assembled product, allowing them to be caught much earlier in the value added chain, thus reducing impact and cost. These three subassemblies account for a majority of the failures and thus they require focused attention.

Figure 6-2: Subassembly Failure Histogram. Issues and failures by subassembly (left to right, in order of final assembly)

Failure analysis can also show which departments of a company are responsible for failure occurrences. The NVBOTS data was processed manually and coded for probable cause type. Each failure event was listed as most likely being caused by design, assembly, part quality, extreme use, unknown, or other. A histogram was generated to display these results visually and is shown in Figure 6-3. One benefit of this graph is that it shows which departments within a company are responsible for past failures and allows management to direct support to those departments in need. Design failure events most likely come from engineering and design, while production and assembly is most likely responsible for assembly cause type events.
Part quality failure events should be handled by incoming inspection and supply chain, while extreme use events are most likely caused by miscommunication between sales and the customer or are a result of poor customer training. These department-to-cause type relationships will not apply to every failure event, but in general they hold true. The histogram shows that engineering, production, and supply chain will need focus and support moving forward in order to decrease the number of failure events in their respective areas.

Figure 6-3: Failure Event Cause Type Histogram. Issues and failures by cause type (Design, Assembly, Part Quality, Extreme Use, Unknown, Other)

Another valuable graph resulting from failure analysis diagrams when failure events occur along the value chain. As mentioned before, the earlier a failure is caught, the less it will cost the company. Figure 6-4 shows when failure events are found with respect to the percent of value added to the product at the time of discovery. The percent value added is based on the approximation that the total cost of the NVPro is $2000 and 25 hours of labor are needed to complete one build. At the time of analysis, these were the best estimates provided by the director of supply chain and finance, operations manager, and CEO. This graph is an eye opener to the company. An overwhelming majority of the failure events occur after 100 percent of the value has been added to the 3D printer. These failures are found during the one week burn-in test prior to shipping the product off to the customer. On a positive note, it is good that they are catching these failures before the product arrives at the customer, but since 100 percent of the value has been added, costly teardown and rebuild are needed to fix the failure. Another issue with failures occurring once the printer is complete and in testing is that it impacts the delivery date to the customer.
The failure will delay the testing, as it needs to be assessed and repaired, inevitably pushing back the ship date to the customer. This is not the first impression any company wants to make. The spike at 40 percent is easily explained by NVBOTS' meticulous CMM inspection of the gantry and Z axis subassemblies. This is a positive step in that these failures are caught early in the value chain. The most troubling detail of this histogram is the fact that over one quarter of the failure events have occurred in the hands of the customer. Customer impact is a dominant factor in the total cost model for the NVPro, causing these failure events to have the greatest cost and impact. As a startup looking to grow in popularity and sales, product reliability and company reputation are essential. A takeaway from this histogram is that more subassembly testing is needed to catch failures earlier in the value chain, preferably before final assembly. Another takeaway is that too many failures are occurring at the customer. NVBOTS needs to use the FMS to first and foremost lower the values on this histogram and secondly shift the values towards the left. Through the use of the ideal FMS they will be able to lower the number of failures as well as reduce the average cost per failure.

Figure 6-4: When Failures are Found with Respect to Value Added. Issues and failures by percent value added

From the failure analysis spreadsheet, individual costs and total costs were calculated to prioritize the failure types. This allowed NVBOTS to view which failure types had the greatest impact on their company and needed to be addressed first. Figure 6-5 shows the complete chart as well as a magnified section highlighting the most costly failure types.
The long thin tail to the right is made up of failures and issues that were caught in-house or issues that rarely occurred out in the field. This shows that more than 50 percent of the failure types can safely be ignored for now. These failure events are both rare and inexpensive and do not currently need attention. The failure event types on the left are those that are most important to the company, as they historically have the greatest total costs. These events are either very expensive to fix, occur often, or both. Because of the high customer impact costs at NVBOTS, most of these event types are ones that have occurred at least once at the customer.

The super jam⁴ failure type is by far the most costly event, accounting for over $18,000. It occurs quite often and it is always a full failure. When at the customer, this automatically requires a service run by a trained technician, but more importantly it will incur a $2,500 customer impact cost for each event. The super jam is the top priority for NVBOTS. The next greatest total cost event is the printer board failure. This failure is quite different from the super jam in that it rarely occurs at the customer, but the hardware is expensive and there have been many failure events. This is reflected in its relatively low average cost per event. Frame alignment also falls into the same high number of occurrences category as printer board failure. After the first 15 to 20 failure types the total costs drop off sharply. Root cause analysis should be performed on the top 15 to 20 failure types to determine the root cause, if possible. Another item to note is that seven of the top 20 event types are in the electronics subassembly. Until the root cause of each can be determined, this is an easy subassembly to test before final assembly takes place.
This can help catch some of the failures within this costly subassembly before it is integrated into the whole 3D printer, and hopefully catches some failures that might otherwise have occurred later at the customer. This graph is very important to the FMS and should be automatically updated in real time as new failure events are logged. This graph should be brought up in a companywide meeting at least once a month to inform everyone of the top priorities and update management on the progress of root cause analysis and resolution actions underway.

⁴ A super jam is a filament jam in the extruder head that cannot be fully unjammed by simply pulling the filament back out, usually due to the filament breaking, ending, or melting prematurely.

Figure 6-5: Total Cost of Failure Events. Total cost and average cost per event by failure event type, with a magnified detail of the most costly types

Chapter 7 Multi-Method Failure Resolution

The last step in the linear workflow of the ideal FMS is failure resolution (see Figure 7-1). Failure event types are fed into the failure resolution step from the root cause analyses performed during the failure analysis step. Failure resolution is the action step. This is where failure types are finally mitigated by complete elimination or reduction of impact. Naturally, elimination sounds like the ideal resolution method, but that is not always true. The FMS is mainly concerned with reducing the impact of the cost of failures to a product or company. Failure resolution must be based on economics, and thus the ideal resolution method is the most economical one. Failure types can come out of root cause analysis with either a known or unknown cause. Multi-method failure resolution is robust enough to handle both cases and provides a framework for selecting and acting upon the most economical failure mitigation solution. The concepts mentioned in Section 4.3.3, along with others, will be expanded upon below, detailing the ideal failure resolution methodology.

Figure 7-1: FMS failure resolution. The third step in the FMS framework

7.1 Detailed Methodology

Solutions for the Future

Solutions devised in failure resolution should be aimed at reducing failure events and severity for the foreseeable future. Although quick fixes and temporary stopgaps can help reduce near-term failures, this framework is designed to develop robust and long-lasting solutions. Failure resolutions should be well vetted with a proper failure review board meeting. If the resolution requires a design change, a proper design change review should be conducted to ensure the change does not negatively impact other aspects of the product. Like root cause analysis, these solutions can be devised by a single individual, but generally higher quality results are achieved when they are devised by a team [25]. It is recommended that an interdisciplinary team be tasked with addressing all failures that have been down selected by the failure analysis prioritization.

Economical Solutions

The solutions implemented during failure resolution must be economical. The main purpose of the ideal FMS is to limit the cost of failures. Most solutions are not free, as they require engineering support, management oversight, production implementation, et cetera, along with any physical costs such as new hardware, test equipment, or tooling. Historical cost and expected future cost of the failure should be compared to the estimated solution cost. In all cases the solution cost must be less than the expected future cost of the failure for a resolution action to occur.
Otherwise it is not an economically sound solution and further thought should go into other possible solutions.

Known Cause

When root cause analysis is able to determine the cause of a failure event type, the resolution should eliminate the event type if economical. This is accomplished via design change, assembly procedure change, proper handling requirements, supplier change, et cetera. The estimated cost of the solution should be compared to the expected future cost of the failure type. If economical, the solution should be implemented to eliminate future failures. If the solution cost outweighs the expected failure cost, a test may be a more economical resolution. Screening, burn-in, or stress tests can be implemented early in the value chain to reduce the number of failures post-test. This allows potential failures to be brought to the surface as soon as possible to limit the impact they cause. Catching a failure before it reaches the customer can be very cost effective. A majority of the time, elimination of a known cause failure is economically viable; otherwise a simple test as early as possible is recommended. Proper documentation of the resolution method should be kept for future employees to study as lessons learned from past failures.

Unknown Cause

Sometimes root cause analysis is unable to determine a definitive cause with the information currently available. When this occurs, an economical test should be devised to arrest failures early in the value chain. The test will increase the chance of finding the failure earlier, thus reducing the cost from failure timing. When possible, these tests should be implemented at or before the subassembly process in order to reduce the amount of backtracking. As in the known cause case, these tests can have a large impact on cost by reducing the number of failures that reach the customer.
Even though an effective test may already be implemented, a root cause investigation should be revisited periodically, as new data may help determine the root cause and possibly lead to the elimination of that failure event type.

7.2 NVBOTS Path Forward

After root cause analysis is complete on the top 15 or 20 failure event types, down selected in failure analysis, the failure resolution methodology detailed above should be used to take action and eliminate or mitigate each failure type. Two of the top 20 failures, idler bearing seizures and super jams, are already being addressed. For the idler bearing failure, the root cause was determined to be metal shavings from the old assembly procedure that were fouling the bearings and leading to premature failure. This failure has been virtually eliminated with a simple assembly procedure change to prevent the shavings from entering the needle bearings. On the other hand, super jams have been investigated for a while, but no single definitive root cause has been determined. NVBOTS has discovered that most super jams begin as a regular jam. This has allowed them to design a test, built into the 3D printer, which continuously checks for a jam. Once a jam is detected, the printer is paused to prevent the jam from developing further into a super jam. Although this does not prevent every super jam, it has proven quite effective at reducing the number of super jams, and it notifies the customer that a regular jam has occurred. Regular jams are only an issue and can be fixed by the customer, resulting in reduced cost and impact to NVBOTS. The company should continue to use the structured failure resolution approach to address the remaining top ranked failures. Once complete they should continue to move down the prioritized list, improving the NVPro one failure type at a time, until the failures have so little impact that the solutions are no longer economical.
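The economic selection logic of Section 7.1 can be summarized in a short sketch. This is an illustrative simplification and not a procedure stated in this form in the text: the function name, parameter names, and return labels are hypothetical, and each candidate solution is compared directly against the expected future cost of the failure type, even though a test typically reduces rather than removes that cost.

```python
def choose_resolution(cause_known, expected_future_cost,
                      elimination_cost=None, test_cost=None):
    """Pick a resolution method for one failure type.

    Returns "eliminate", "test", or "reconsider" (no economical option yet).
    A cost of None means that option has not been devised or estimated.
    """
    def economical(cost):
        # A solution qualifies only if it costs less than the failures it addresses.
        return cost is not None and cost < expected_future_cost

    # Known cause: prefer eliminating the failure type when that is economical.
    if cause_known and economical(elimination_cost):
        return "eliminate"
    # Unknown cause, or elimination too costly: fall back to an early screening test.
    if economical(test_cost):
        return "test"
    # Nothing economical so far; more thought should go into other solutions.
    return "reconsider"


# Hypothetical dollar figures for illustration only.
print(choose_resolution(True, 1000, elimination_cost=200))
print(choose_resolution(True, 1000, elimination_cost=5000, test_cost=300))
print(choose_resolution(False, 1000, test_cost=300))
print(choose_resolution(False, 100, test_cost=300))
```

In the NVBOTS examples, the idler bearing seizure corresponds to the known cause branch (an inexpensive assembly procedure change), while the super jam corresponds to the unknown cause branch (a built-in jam detection test).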
Chapter 8 Summary, Conclusions, and Recommendations

8.1 Summary

Product reliability, quality, and performance are essential for all companies, especially high technology manufacturing startups looking to scale-up successfully. Company image and reputation can be heavily impacted by product failures. The cost of failures in-house and at the customer will only increase as a company scales up. Failure mitigation is critical to the success of a product and its company throughout the entire product lifecycle. This thesis proposes an ideal Failure Mitigation Strategy that provides a methodology and framework with linear process workflow and easy to follow steps that lead to the reduction of cost from failures. Establishing a strong FMS will assist the company in learning from their failures while reducing the total number and average cost of failure events. The ideal FMS was tailored to and implemented at NVBOTS in Boston, Massachusetts, as a case study.

The ideal FMS consists of failure tracking, failure analysis, and multi-method failure resolution. Failure events are first observed and properly documented via the failure tracking system. Failure tracking data is then processed during failure analysis using a total cost model to automatically prioritize and down select the most impactful failure event types. Root cause analysis is then performed on the top priority failure event types. Finally, a robust multi-method failure resolution methodology uses an economical combination of design and process changes along with testing to eliminate or reduce the cost of those failures.

Over 200 failure events were tracked, including 50 unique failure event types, accounting for over $75,000 in costs at NVBOTS. A unified and improved tracking system was implemented at NVBOTS along with a powerful analysis framework.
Failure analysis was performed, prioritizing the failures by total cost, and a failure resolution framework was designed to implement solutions to the top priority failure event types. The ideal Failure Mitigation Strategy offered in this thesis provides NVBOTS and other entities a framework that allows for full understanding of the current failure landscape as well as a systematic method to reduce the impact from failures through elimination and mitigation.

8.2 Conclusions

The purpose of the overall MEngM-NVBOTS collaborative project was to introduce process improvements to match production scale-up at NVBOTS. This was accomplished through three subprojects completed by the MEngM team. Chawla [11] focused on developing a framework for incoming part quality control and inspection procedures, the details of which can be found in his thesis. This thesis focused on implementing a Failure Mitigation Strategy to reduce the cost of failures while improving product reliability, quality, and performance. Finally, Shabbir [12] focused on improving product reliability through systematic failure analysis and a quantitative understanding of product life through accelerated life testing, the details of which can be found in his thesis. The following conclusions are drawn from the work performed for this thesis, focused on the FMS.

Importance of Product Reliability

Product reliability is extremely important to the success of a company. Product and company reputation are heavily impacted by product reliability and performance. This is highly magnified for high technology manufacturing startups looking to scale up, as they have little track record and their reputation is more fragile. Even more so, as NVBOTS leases their printers, they are solely responsible for all service and costs to keep the 3D printers running.
Product failures must be properly addressed to ensure the reliability of the product meets or exceeds the customers' expectations and limits the cost to NVBOTS. The FMS presented provides a structured method for reducing the number and impact of failures and thus increases product reliability.

Failure Mitigation Strategy

The ideal FMS consists of three steps: failure tracking, failure analysis, and multi-method failure resolution. This framework provides a structured foundation that can be adapted to fit many companies. The presented case study details how the FMS was easily tailored to NVBOTS, where it is now a valuable tool in their daily operations. This is not an entirely novel idea, but it is not as widely used as expected and is even less often documented. This work intends to serve as a guide to establishing an FMS, in hopes that many companies will benefit from its ability to methodically reduce the impact of failures. The FMS is an effective and efficient cost reduction tool that pays for itself and more.

Cost of Failure Timing

The impact of when failures occur was highlighted throughout this work. Identical failures can have drastically different costs, solely due to when they occur. A failure that is caught early in the value added chain is easy to fix and has limited impact past its subassembly. A failure occurring at the customer is more expensive and labor intensive to fix, and the impact is much further reaching, as the product failed to meet the customers' expectations. It is economically advantageous to catch failures as soon as possible in the value chain if they cannot be eliminated altogether.

Cost of Customer Impact

Customer impact can carry significant costs, as seen in the NVBOTS case study. These costs are not always so dominant, but in most cases customer impact costs are highly significant and should not be overlooked. Like most startups, NVBOTS was surprised to find out how much customer impact was truly costing them.
This cost is hard to quantify and sometimes difficult to see, which warrants spending effort and time to estimate it as best the company can. As the cost of customer impact can be significant, management should take it into consideration when making decisions for the future.

Document Everything, Objectively

The importance of documenting every failure was highlighted by the NVBOTS case study. Failures that are untracked are a loss of potential savings and product improvement. Failure events generate a lot of valuable data that is relatively easy to capture if a proper failure tracking system is in place. Every failure event must be recorded, no matter how big or small, significant or insignificant, unique or common, issue or failure; every individual event must be logged. The events must be documented objectively, taking care to record symptoms and conditions before contemplating causes. This will ensure the company is able to capitalize on the valuable data hidden within failures.

Single Tracking System with Standardized Language

A single unified tracking system with standardized language is critical to successful failure tracking and analysis. The standardization allows for efficient and automated data processing. Without it, tedious manual work is required to process the data, and data integrity can be compromised in translation. This was exemplified by NVBOTS's six separate failure tracking systems, all tracking different items and lacking a common language. One system and one language will allow for efficient documentation, communication, and data analysis.

Total Cost Model

The total cost model was developed as an objective method to prioritize failure event types for mitigation and to show the company the true cost of their failure events. This model is widely applicable, as almost anything can be converted into a cost for comparison and mathematical manipulation.
Care must be exercised in developing the cost rates and other inputs to the model to ensure accurate results. Along with a proper failure tracking system, the total cost model can be automated to run after every new failure event submission. This automation will allow the top priority list to be updated in real time, allowing for a deeper and more complete understanding of the current failure position of the company.

Super Jams and Electronics

Specifically for NVBOTS, super jams were calculated as the most costly failure type to date, totaling over $18,000. The total cost of super jam events was more than double that of the next most expensive failure event type. The immense total cost is being driven by the number of occurrences and by customer impact, as many super jams have occurred at the customer. NVBOTS is aware of this and is working towards reducing the impact by continuously monitoring for known symptoms and powering down the extruder before a super jam can occur. Among the rest of the top 20 prioritized failure types, seven involve the electronics subassembly. Root cause analysis should be performed to determine the best resolution methods. Until root cause and resolution are complete, a simple subassembly test can be implemented to screen for failures early in the value added chain.

Multi-Method Failure Resolution

Failure resolution provides a robust and capable framework that can handle both known-cause and unknown-cause event types. Failures with an indeterminate cause are treated just as importantly as their counterparts with known causes, as large cost savings are still possible. The framework details how to economically resolve failures through multiple methods, including design changes, new assembly procedures, and testing. Finally, solutions are implemented, reducing or eliminating the cost of failures and realizing the goal of the complete FMS.
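To make the Total Cost Model conclusion above more concrete, the following is a minimal sketch, not the NVBOTS implementation, of how a prioritized list of failure event types could be recomputed each time a new event is logged. The function name `prioritize`, the tuple layout of an event, the optional `since` cutoff date, and every sample value in `log` are assumptions made purely for illustration; the failure type names are placeholders and the dollar amounts are not thesis data.

```python
from collections import defaultdict
from datetime import date


def prioritize(events, top_n=20, since=None):
    """Rank failure event types by summed total cost, highest first.

    events: iterable of (event_date, failure_type, total_cost) tuples
            (hypothetical layout, not the NVBOTS schema).
    since:  optional cutoff date; events before it are skipped here but
            are assumed to remain archived in the tracking system.
    Returns a list of (failure_type, summed_cost, occurrences).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event_date, failure_type, total_cost in events:
        if since is not None and event_date < since:
            continue
        totals[failure_type] += total_cost
        counts[failure_type] += 1
    # Sort by cost descending; break ties by name so the order is stable.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [(ftype, cost, counts[ftype]) for ftype, cost in ranked[:top_n]]


# Illustrative log only: placeholder types and made-up costs.
log = [
    (date(2015, 3, 2), "type A", 2600.00),
    (date(2015, 4, 9), "type B", 110.00),
    (date(2015, 5, 20), "type A", 2550.00),
    (date(2014, 1, 15), "type C", 9000.00),
    (date(2015, 6, 1), "type B", 95.00),
]

top = prioritize(log, top_n=2, since=date(2014, 7, 1))
for failure_type, cost, occurrences in top:
    print(f"{failure_type}: ${cost:,.2f} over {occurrences} events")
```

Calling `prioritize` again after appending a new event to `log` gives the updated ranking, which is one simple way to approximate the real-time behavior described above. Summing cost per type, rather than ranking individual events, reflects the point that both the number of occurrences and the cost per occurrence drive a type's priority.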
8.3 Recommendations for Future Work

The following recommendations are tailored for NVBOTS but are generally applicable to all companies. Automating the failure tracking and failure analysis systems should be completed as soon as possible in order to fully realize all of the benefits of implementing the Failure Mitigation Strategy. Establishing this system early on and ingraining it into the company culture will allow for fewer growing pains while scaling up and prevent the cost of failures from scaling up as well. The last two recommendations complement the FMS and will bring to NVBOTS the proper engineering rigor and structure commonly found in successful and well-established companies.

8.3.1 Automate Failure Tracking System

The failure tracking system should be automated where possible to minimize manual technician input and reduce the risk of entry error. All of the recommended tracking items in Table 5.2 should be included in the new system in order to capture a complete snapshot of the failure event and provide the required data for failure analysis. The travel time and travel distance should be automatically populated from real time data requested from Google Maps at the time of the service. Once the hardware number is manually entered, the macro location of the printer, along with the location of NVBOTS headquarters, should allow these values to be sourced. A complete and searchable parts list with part numbers and names should be integrated into the tracking system so that technicians can quickly document which parts need repair or replacement. All of these recommendations will allow for more efficient and accurate data entry, and accurate data leads to accurate results.

8.3.2 Automate Failure Analysis System

Automation of the total cost model within the failure analysis system is made possible by the new tracking system, as all data required for total cost calculations is now being captured.
Automatic transfer of data from the tracking system to the analysis system, or a merger of the two systems, should be implemented in order to realize the capability of real time updates to the failure type top priority list. Once a new data entry is submitted, the data should automatically transfer to the analysis system, where total cost is calculated, automatically adjusting the top priority graph in real time. Data over 12 months old should be automatically removed from the displayed failure priority graph to keep the data set fresh and show a more representative depiction of the current failure landscape. All data should be archived, but only the most recent 12 months should be displayed. Along with the new failure tracking system and items, this automation will drastically reduce the amount of manual labor required for failure analysis, allowing the engineering team to focus on other tasks.

8.3.3 Establish Failure Review Process

Establish a failure review board composed of an interdisciplinary team with representatives from each department of the company. This team will oversee all root cause investigations and failure resolution actions. The team will review the proposed solutions to ensure that they are economical and do not negatively impact any other aspect of the product. This interdepartmental team will own the FMS and create a culture of strict adherence to the rigor and structure required to operate an effective FMS.

8.3.4 Establish Design Change Review Process

As the company grows, structured processes are vital to maintaining smooth operations and workflow. For large design changes, design change review meetings should be held to ensure the change is beneficial and has been properly vetted against negative impact to other aspects of the product. This is a great time to get a second set of eyes on a new design to ensure it will work as intended and provide value to the product.
The same interdisciplinary team mentioned above should oversee this process, helping to ensure new failure types are not introduced into a product via a design change.

Appendix A

Total Cost Example Calculations

The following examples show the total cost calculations of two specific failure events. Every event's total cost was calculated in a similar manner, via the analysis spreadsheet in Appendix B.

Failure Event: Broken Removal Blade Mount
Failure/Issue: Failure
Where: Field (at the customer)

Item                     Quantity   Cost/Rate    Calculation         Cost         Total
Hardware
  Blade Mount            1          $5.00        1 x $5.00           $5.00        $5.00
Time/Labor
  Travel Mileage (mi)    80         $0.575       80 x $0.575         $46.00
  Travel Time (hr)       2          $20.00       2 x $20.00          $40.00
  Diagnosis Time (hr)    0.083      $20.00       0.083 x $20.00      $1.67
  Repair Time (hr)       0.333      $20.00       0.333 x $20.00      $6.67        $94.33
Customer Impact
  Failure                1          $2,500.00    1 x $2,500.00       $2,500.00    $2,500.00
Total Cost                                                                        $2,599.33

Failure Event: Printerboard Electronics Failure
Failure/Issue: Failure
Where: In-House Testing at NVBOTS

Item                     Quantity   Cost/Rate    Calculation         Cost         Total
Hardware
  Printerboard           1          $85.00       1 x $85.00          $85.00       $85.00
Time/Labor
  Travel Mileage (mi)    0          $0.575       0 x $0.575          $0.00
  Travel Time (hr)       0          $20.00       0 x $20.00          $0.00
  Diagnosis Time (hr)    0.5        $20.00       0.5 x $20.00        $10.00
  Repair Time (hr)       0.75       $20.00       0.75 x $20.00       $15.00       $25.00
Customer Impact
  Failure                0          $2,500.00    0 x $2,500.00       $0.00        $0.00
Total Cost                                                                        $110.00

Appendix B

Failure Analysis Spreadsheet

The image on the following page is a screenshot of a representative portion of the master failure analysis spreadsheet that was used to analyze all 204 failure events. The data was manually standardized one event at a time and categorized for statistical processing. Total cost was calculated on the right half of the spreadsheet, with color coding signifying the impact of each total cost. All graphs and statistics were derived from this spreadsheet.
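The two example calculations in Appendix A can be reproduced with a short script, shown here as a minimal sketch rather than the spreadsheet logic actually used. The three rates are taken from the Appendix A tables; the `FailureEvent` class, its field names, and the function names are hypothetical. Diagnosis and repair times for the blade mount event are entered as 5/60 and 20/60 hours, on the assumption that the tabulated 0.083 and 0.333 are rounded values, which is what makes the labor subtotal come out to $94.33.

```python
from dataclasses import dataclass

# Rates as they appear in the Appendix A tables.
MILEAGE_RATE = 0.575            # dollars per mile traveled
LABOR_RATE = 20.00              # dollars per technician hour
CUSTOMER_IMPACT_RATE = 2500.00  # dollars per failure at the customer


@dataclass
class FailureEvent:
    """Hypothetical record of the inputs used in the Appendix A examples."""
    name: str
    hardware_cost: float     # summed cost of replaced parts, dollars
    travel_miles: float
    travel_hr: float
    diagnosis_hr: float
    repair_hr: float
    customer_failures: int   # 1 if the failure occurred at the customer, else 0


def labor_cost(event: FailureEvent) -> float:
    """Mileage plus travel, diagnosis, and repair time at the labor rate."""
    hours = event.travel_hr + event.diagnosis_hr + event.repair_hr
    return event.travel_miles * MILEAGE_RATE + hours * LABOR_RATE


def customer_impact_cost(event: FailureEvent) -> float:
    return event.customer_failures * CUSTOMER_IMPACT_RATE


def total_cost(event: FailureEvent) -> float:
    """Hardware + time/labor + customer impact, as in the tables above."""
    return event.hardware_cost + labor_cost(event) + customer_impact_cost(event)


blade_mount = FailureEvent(
    name="Broken Removal Blade Mount",
    hardware_cost=5.00,
    travel_miles=80, travel_hr=2,
    diagnosis_hr=5 / 60,   # assumed source of the tabulated 0.083 hr
    repair_hr=20 / 60,     # assumed source of the tabulated 0.333 hr
    customer_failures=1,
)
printerboard = FailureEvent(
    name="Printerboard Electronics Failure",
    hardware_cost=85.00,
    travel_miles=0, travel_hr=0,
    diagnosis_hr=0.5, repair_hr=0.75,
    customer_failures=0,
)

for event in (blade_mount, printerboard):
    print(f"{event.name}: ${total_cost(event):,.2f}")
# Broken Removal Blade Mount: $2,599.33
# Printerboard Electronics Failure: $110.00
```

Comparing the two events shows how strongly the location of a failure drives its cost: the field failure of a $5 part costs far more than the in-house failure of an $85 board, almost entirely because of the customer impact term.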
[Screenshot: representative portion of the master failure analysis spreadsheet]

Bibliography

[1] A. Hirai, "What Kills Startups?," Cayenne Consulting, 2010.
[2] M. Less, "Startupedia: What Does Scaleup Mean?," Startup Institute. [Online]. Available: http://blog.startupinstitute.com/2015-3-24-what-is-a-scaleup/. [Accessed: 01-Jul-2015].
[3] S. Berger, "Scaling up Startups to Market," in Making in America: From Innovation to Market, Cambridge, MA: MIT Press, 2013, pp. 65-89.
[4] National Venture Capital Association, "Annual Venture Capital Investment Tops $48 Billion in 2014," 2015. [Online]. Available: http://nvca.org/pressreleases/annual-venturecapital-investment-tops-48-billion-2014-reaching-highest-level-decade-accordingmoneytree-report/. [Accessed: 02-Jul-2015].
[5] T. Kurfess, "Why Manufacturing Matters," American Society of Mechanical Engineers, no. November, 2013.
[6] C. Baden-Fuller and I. MacMillan, "3 Mistakes Made in Scaling up New Ventures," Harvard Business Review, Aug-2010.
[7] M. Barros, "Poor Quality Will Kill You," One Entrepreneur's Perspective. [Online]. Available: http://marcbarros.com/poor-quality-will-kill-you/. [Accessed: 03-Jul-2015].
[8] E. R. Reynolds and H. Samel, "Invented in America, Scaled Up Overseas," ASME Mechanical Engineering Magazine, no. November, Nov-2013.
[9] K. Weisul, "Everything Your Startup Needs to Know About Manufacturing," Inc.com, 2015. [Online]. Available: http://www.inc.com/kimberly-weisul/five-lessons-from-themanufacturing-trenches.html. [Accessed: 02-Jul-2015].
[10] NVBOTS, "About Us." [Online]. Available: http://nvbots.com/about/. [Accessed: 03-Jul-2015].
[11] R. Chawla, "Scale-up of a High Technology Manufacturing Startup: Framework for Analysis of Incoming Parts, Inspection Procedure and Supplier Capability," Massachusetts Institute of Technology, 2015.
[12] A. Shabbir, "Scale-up of a High-Technology Manufacturing Startup: Improving Product Reliability Through Systematic Failure Analysis and Accelerated Life Testing," Massachusetts Institute of Technology, 2015.
[13] T. Wohlers and T. Caffrey, Wohlers Report 2015: Additive Manufacturing and 3D Printing State of the Industry: Annual Worldwide Progress Report. Fort Collins: Wohlers Associates, Inc., 2015.
[14] I. Gibson, D. W. Rosen, and B. Stucker, Additive Manufacturing Technologies, Second Edition. New York: Springer, 2015.
[15] ASTM International, "F2792-12a - Standard Terminology for Additive Manufacturing Technologies," pp. 10-12, 2013.
[16] NVBOTS, "Pricing." [Online]. Available: http://nvbots.com/pricing/. [Accessed: 12-Jul-2015].
[17] NVBOTS, "NVPRO." [Online]. Available: http://nvbots.com/nvpro/. [Accessed: 12-Jul-2015].
[18] NVBOTS, "NVLIBRARY." [Online]. Available: http://nvbots.com/nvlibrary/. [Accessed: 12-Jul-2015].
[19] NVBOTS, "MYNVBOTS." [Online]. Available: http://nvbots.com/mynvbots/. [Accessed: 12-Jul-2015].
[20] M. A. Cusumano, "Evaluating a startup venture," Commun. ACM, vol. 56, no. 10, p. 26, 2013.
[21] R. Dobbs, J. Manyika Yougang Chen, M. Chui, and S. Lund, "McKinsey Global Institute The McKinsey Global Institute," no. May, 2013.
[22] M. Villacourt and P. Govil, "Failure Reporting, Analysis, and Corrective Action System," Austin, 1994.
[23] W. Goble, "Value of failure data," Hydrocarb. Process., vol. 87, no. 10, p. 138, Oct. 2008.
[24] M. R. Tuckey and N. Brewer, "The influence of schemas, stimulus ambiguity, and interview schedule on eyewitness memory over time," J. Exp. Psychol. Appl., vol. 9, no. 2, pp. 101-118, 2003.
[25] "Root Cause Analysis Processes & Methods," ASQ, 2015. [Online]. Available: http://asq.org/learn-about-quality/root-cause-analysis/overview/conducting-rootcause.html. [Accessed: 17-Jun-2015].
[26] J. L. Otegui, Failure Analysis, vol. 3, no. 2. New York: Springer, 2014.
[27] D. L. Gano, Apollo Root Cause Analysis - A New Way of Thinking. 2007.
[28] P. M. Williams, "Techniques for root cause analysis," Baylor Univ. Med. Cent. Proc., vol. 14, no. 2, pp. 154-7, 2001.
[29] A. Sarkar, A. R. Mukhopadhyay, and S. K. Ghosh, "Root Cause Analysis, Lean Six Sigma and Test of Hypothesis," TQM J., vol. 25, no. 2, p. 26, 2013.
[30] "Boston," Google Maps, 2015. [Online]. Available: https://www.google.com/maps/place/Boston,+MA/@42.3133735,71.0571571,12z/data=!3ml!4bl!4m2!3ml!1s0x89e3652d0d3d311b:0x787cbf240162e8a0. [Accessed: 30-May-2015].
[31] "Hourly Rate for Skill: Computer Hardware Technician," PayScale, 2015. [Online]. Available: http://www.payscale.com/research/US/Skill=ComputerHardwareTechnician/HourlyRate. [Accessed: 30-May-2015].
[32] "New Standard Mileage Rates Now Available; Business Rate to Rise in 2015," IRS, 2015. [Online]. Available: http://www.irs.gov/uac/Newsroom/New-Standard-Mileage-RatesNow-Available;-Business-Rate-to-Rise-in-2015. [Accessed: 30-May-2015].
[33] "Interview with Edward Brady, NVBOTS Director of Supply Chain and Finance." Boston, MA, 2015.