IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, VOL. 3, NO. 4, OCTOBER 2006

Fault Region Localization: Product and Process Improvement Based on Field Performance and Manufacturing Measurements

Kamal Mannar, Darek Ceglarek, Member, IEEE, Feng Niu, Senior Member, IEEE, and Bassam Abifaraj

Abstract—Customer feedback in the form of warranty/field performance is an important direct indicator of quality and robustness of a product. Linking warranty information to manufacturing measurements can identify key design and process variables that are related to warranty failures. Warranty data have been traditionally used in reliability studies to determine failure distributions and warranty cost. This paper proposes a novel fault region localization methodology to link warranty failures to manufacturing measurements (hence, to design and process parameters) for diagnosing warranty failures and to perform tolerance revaluation. The methodology consists of identifying relations between warranty failures and design/process variables using rough sets-based analysis on training data consisting of warranty information and manufacturing measurements. The methodology expands the rough set-based analysis by introducing parameters for inclusion of noise and uncertainty of warranty data classes. Based on the identified parameters related to the failure, a revaluation of the original tolerances can be performed to improve product robustness. The proposed methodology is illustrated using case studies of two warranty failures from the electronics industry.

Index Terms—Diagnostics, manufacturing, product field performance, product life-cycle, quality, tolerance analysis, warranty.

Note to Practitioners—Warranty failures are indicative of the performance and robustness of the product. Warranty failures, especially those that occur early (e.g., within six months after sale), can be caused by interactions between various design and process characteristics of the individual components of the product. Due to the large number of components and the interactions between them, it is difficult to identify all of these relations during design. Furthermore, it is difficult to replicate actual product usage in the field during the design stage. The methodology proposed in this paper integrates a product's warranty failure information with measurement data collected during manufacturing to identify relevant design and process variables related to the failures. It also identifies the warranty fault region within the original design tolerance window for the parameters. This can help in avoiding warranty failure(s) through design changes and/or tolerance revaluation. The methodology was applied in the electronics and semiconductor industries.

Manuscript received June 16, 2005; revised May 3, 2006. This work was supported in part by the National Science Foundation under Grant DMI-0218208 and in part by Motorola Corp. This paper was recommended for publication by Associate Editor M. Lawley and Editor P. Ferreira upon evaluation of the reviewers' comments.
K. Mannar and D. Ceglarek are with the Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI 53706-1572 (e-mail: darek@engr.wisc.edu).
F. Niu is with Motorola Labs, Plantation, FL 33322 USA.
B. Abifaraj is with Motorola Integrated Supply Chain, Nogales, AZ 85621 USA.
Color versions of Figs. 2, 3, 7, and 8 and Table II are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TASE.2006.880526
1545-5955/$20.00 © 2006 IEEE

NOMENCLATURE

DP: Design parameters.
PV: Process variables.
PLM: Product life-cycle management.
WIS: Warranty information system.
WFR: Warranty region in the parameters identified by FRL.
NR: Normal region in the parameters identified by FRL.
BND: Boundary region in the parameters identified by FRL.
GRS: Generalized rough sets (RS).
Axiomatic design matrix: Matrix representing the relationship between DP and PV to functional requirements.
U: Training data for the warranty failure; the set of all samples in the training data.
w: Field performance/warranty failure characteristic.
C: Nonempty set of manufacturing measurements for each sample.
Decision class: Binary decision class for each sample, where 1 represents warranty failure and 0 the normal product.
Value set: The value set of any measurement is the set of distinct values that the particular measurement has in U.
Uncertainty parameter: The uncertainty associated with the decision class value of each sample j, j = 1 to n.
Noise classification factors: Factors for estimating noise in the warranty and normal decision classes.
E: Equivalent set; a set of product samples which are not distinguishable from each other based on measurements.
Family of equivalent classes: Family of equivalent classes based on the measurements.
Certainty (warranty): Certainty to classify E to the warranty region based on the measurements.
Certainty (normal): Certainty to classify E to the normal region based on the measurements.
Dependency degree: The fraction of samples in U that can be classified into warranty or normal products relative to the total number of objects in U.
Reduct: The reduct generated for U is a minimum subset of C which approximately preserves the dependency degree.

TABLE I. Related research on warranty analysis.

I. INTRODUCTION

A. Motivation

In today's intensely competitive market, manufacturers need to continuously reduce cost and improve the quality of their products to successfully attract customers. Current quality improvement efforts concentrate on the design and manufacturing phases of the product life-cycle management (PLM) [1]. These efforts can help to significantly reduce product variation caused by manufacturing [2]. However, warranty or field performance data represent crucial information related to the actual quality and robustness of the product as perceived by customers.
Furthermore, warranty failures add significant costs in terms of additional service/rework, replacement of faulty products, and customer perception of the product. Therefore, a methodology for efficient diagnosis and prevention of warranty failures can provide an important competitive advantage. While warranty failures due to physical damage or misuse by customers are easy to detect during service operations, many failures cannot be diagnosed visually or by other current inspection procedures. Such failures may be related to process and product discrepancies unknown in the design phase. This observation is further supported by the fact that although quality control in manufacturing ensures that measured parameters are within their respective tolerances, most products still record warranty failures. These warranty failures are often due to the complex interactions between various design parameters (DPs) and process variables (PVs) not anticipated during the design stage. This makes the prevention of warranty failures during design and their detection in manufacturing a challenging task. Engelhardt [3] emphasizes that one of the major difficulties faced by designers is to determine the interactions between various DPs and their effect on a product’s functional requirements, which result in high warranty/field failures. In addition to the aforementioned challenges, customer complaints are not expressed in terms of design parameters and process variables. Hence, the diagnosis of warranty failures requires the integration of warranty information and manufacturing data to determine which process/product parameters are related to the failures and the nature of interaction causing the failures. Current advances in computers and database management allow for easy storage and retrieval of large amounts of data. 
This has enabled companies to maintain large databases that store the DPs and PVs measurements during manufacturing along with warranty information of all products manufactured for an extended period. This ensures traceability of the product (i.e., for any product in the field its corresponding manufacturing measurements of DPs and PVs and the warranty status can be determined). This provides an opportunity to develop a methodology which integrates the information from various PLM phases (design, manufacturing, and service) to enhance the robustness of the product and reduce warranty cost. B. Related Work Warranty has been recognized as an important factor in PLM and a great deal of research has focused on warranty analysis [4]. Current literature on warranty investigates a vast array of topics, such as the prediction of warranty failure (reliability analysis), analysis of warranty policies and cost, and the early detection of major warranty failures using the change point detection of failure rates. Table I provides a brief summary of the state of the art in warranty analysis with respect to the methodology proposed in this paper. Warranty analysis from a reliability standpoint is a well-researched area addressing prediction of component failures for various products ([5]–[7]). These methodologies use warranty data to determine the reliability of the subcomponents by estimating their lifetime distributions. The lifetime distributions can be further used to quantify the benefits of any design changes made for the product [6]. An alternative approach to using percentile life instead of average life time was proposed by Kim and Kuo [8]. Thomas and Rao [9] focus on analyzing the actual cost of warranty and formulation of the warranty policy to determine the period of coverage and remuneration to the customer. They also emphasize the importance of warranty analysis as feedback for product development. 
For future research purposes, they advocate the development of a closed-loop system with warranty failure analysis as feedback to product design. Pham and Zhang [10] developed a cost model with warranty and risk costs for software systems. Karim et al. [11] and Wu and Meeker [12] developed a methodology for the early detection of major reliability failures using warranty data based on statistical monitoring techniques. Their method detects change points in the rate of warranty failures for a given product.

While all of these methods extract important information inherent in warranty data and related to the lifetime and failure rates of product components, they do not offer any direct and explicit feedback to manufacturing or design by identifying interactions among the DP- and PV-related measurements that cause the observed warranty failures. The mapping of the relation between warranty failures and DPs/PVs is essential for the development of a diagnostic methodology and for the revaluation of design tolerances in order to improve product robustness to warranty failures.

Ge et al. [13] focus on developing an interactive negotiation methodology between various design stakeholders for large-scale product development at an early design stage. The method assumes that the relationship between design space and performance space is known or can be represented by a mechanistic model. It then uses a set-based zoom-in process to provide the mappings between performance space and design space by conducting simulations based on a mechanistic model of the product. These mappings are then used to negotiate both performance target values and design parameter values to achieve optimal design.
However, warranty failures are often caused by interactions between DPs and PVs, which are unknown in the design stage, and, therefore, cannot be identified based on a mechanistic model in the early design stage. Further, mechanistic models of the product may not be representative of the product’s actual usage in the field. Yang and Cekecek [14] extend warranty analysis by integrating the failure rate of each component with the axiomatic design approach. This integration was used to develop a design vulnerability index that identifies critical components of the products that need to be improved based on their functionality and failure rates. However, the model assumes that sufficient design knowledge represented in the form of axiomatic design matrices exists to identify the interaction causing product field failure. The methodology is based on the assumption that the design knowledge is complete and, thus, if the interactions causing the failure are unknown in the design stage, it is difficult to prevent or diagnose the field failure. The FRL methodology proposed in this paper helps to determine relations between warranty failures and DPs and PVs. These relations allow isolation of critical DPs and PVs causing warranty failures and determine the corresponding fault region within their tolerances. This helps in the diagnosis of warranty failure and revaluation of DPs’ and PVs’ tolerances to avoid the failure. The proposed FRL methodology extends the current analysis of product field failures by providing modeling capabilities to: 1) integrate warranty information with manufacturing measurements to determine interactions between DPs and PVs causing warranty failures; 2) determine the warranty fault region within the tolerance windows of the PVs and DPs, which can be used for tolerance revaluation to improve product robustness to field failures; and 3) include noise factors related to the decision classes for warranty data analysis. 
Separate noise factors are determined for normal and warranty classes to improve the performance of the FRL methodology.

The rest of the paper is organized as follows. Section II describes the data and information flow in different phases of PLM which are of interest for warranty analysis. Section III outlines the proposed FRL methodology in the context of the nature of warranty failure data and requirements of the analysis. Section IV describes the FRL methodology based on a generalized rough set to extract the relationship between manufacturing measurements and warranty failures; this is then used in Section V to perform tolerance revaluation. Section VI illustrates the methodology with two industrial case studies. Finally, Section VII lists the conclusions and discusses potential future work.

Fig. 1. Illustration of product life-cycle information for a multistation manufacturing system.

II. INFORMATION FLOW IN PRODUCT LIFE-CYCLE MANAGEMENT

Information flow between various phases of product life-cycle management (PLM) can be regarded as a production system realization network with three major parts: design, manufacturing, and product field performance, as described below. Fig. 1 provides an illustration of the production system information flow in generic multistation manufacturing with distributed sensing.

A. Design Phase

The design phase (I1) involves both product and process design. The product design phase initially determines all functional requirements (FRs) that a product must satisfy. Based on these functional requirements, the product architecture and relationship of DPs and PVs are determined. Tolerances are assigned to all of the selected DPs and PVs in order to satisfy the FRs. Here, we consider the subset of the functional requirements for which warranty failure is monitored. The proposed methodology determines whether there are any interactions among the DP- and PV-related measurements that cause the observed warranty failures.

B. Manufacturing Phase

The manufacturing phase (I2) includes all necessary operations to manufacture the product. The information about product quality and process performance is obtained from end-of-line or distributed sensing systems ([15], [16]). The DPs and PVs are measured by various sensors in the manufacturing phase (I2). These measurements, represented as C, are used for product inspection and process control to ensure that the DPs/PVs are within their tolerances and the process is in control [18]–[20]. The measurements C may include all or a subset of DPs/PVs as well as additional measurements derived from DPs/PVs. The tolerances for these additional measurements are determined based on the tolerances and requirements for the original DPs/PVs ([21]–[23]). Therefore, for the sake of simplicity in this paper, we consider all manufacturing measurements (C) as equivalent to DPs/PVs. While the manufacturing measurements are obtained during the manufacture of the product, they are stored in the manufacturer's database for an extended period after the product is sold. This ensures traceability of manufacturing measurements for any product in the field.

C. Field Performance Phase

The field performance phase (I3) provides information about the product's performance in the field when used by the customer. It consists of service and warranty data collected during the warranty period after sale. For example, field performance can be measured by: 1) warranty failures and 2) degradation of product performance during usage obtained by in-situ monitoring of the product in the field. Field performance in this paper is primarily represented as warranty failures. This is because in-situ monitoring in the field of all or a representative sample of products is difficult and expensive for products that are mass produced.
For example, in the consumer electronics industry, such as cell-phone original equipment manufacturers (OEMs), it is easier to monitor the warranty status of all products after sales rather than in-situ measurements of the products in the field. However, it should be noted that warranty information is limited to product status (faulty or normal) and the corresponding nature of the warranty failure as reported by the customer. A variety of factors, such as the operating environment and nature of usage which affects product performance, are not measured in warranty data. These unmeasured factors increase the noise level and imprecision of the warranty data, adding further challenges in the development of the warranty failure diagnostic method. The specific characteristics of the warranty data and the corresponding advantages of FRL methodology are discussed in detail in Section III. Figs. 2 and 3 illustrate the information flow in electronics and automotive assembly processes, respectively. Fig. 2, depicts the process of printed-circuit boards (PCBs) assemblies that form radio products, such as cell phones. The product measurements are obtained at various manufacturing stages, such as autotesting at the end-of-line as well as after components placement and soldering operations. Similarly, Fig. 3 shows an example of automotive body assembly with measurements obtained in distributed measurement stations. The measurements made during manufacturing could be both categorical and continuous. For example, dimensional measurements of subassemblies in automotive body assembly and measurements of electrical properties (current, power, and voltage) in electronic assembly are recorded as continuous variables. On the other hand, pallets and fixtures IDs are recorded as categorical parameters. Figs. 2 and 3 also illustrate the warranty information available for both products. 
Warranty information typically consists of the status of a product in the field (normal or faulty) and the customer's description of the fault. Products that have failed during the monitoring period are classified according to warranty failure categories as shown in Figs. 2 and 3.

Fig. 2. Example of manufacturing measurements and warranty information for radio product (cell phones).

Fig. 3. Example of manufacturing measurements and warranty information for automotive assembly with distributed measurement.

III. METHODOLOGY OUTLINE

The overall objective of the FRL methodology is to develop a model that links product field performance to design and manufacturing based on training data that consists of both manufacturing parameters and field information for each product sample. First, the objective of the FRL methodology is illustrated based on the assumption that a continuous measure of the product performance in the field is available. Then, we discuss the actual scenario where only warranty information consisting of the product's binary status, faulty or normal, is available.

Let $FR$ be the set of functional requirements related to the product, and let $FR_w$ be a continuous measure for a particular field performance characteristic "w" (Figs. 2 and 3 provide a few examples of $FR_w$). The objective of field performance analysis is to identify the relationship between the field performance $FR_w$ and the DPs and PVs. This relationship is often represented in the design stage in the form of the design matrix $[A]$ following the axiomatic design approach, as shown in (1). The designer uses this relationship to determine the goodness of design in terms of the ability of the selected DPs and PVs to satisfy the functional requirements

$$\{FR\} = [A]\,\{DP,\ PV\} \tag{1}$$

As mentioned earlier, the relationship shown in (1) is often incomplete during the product and process design stages due to a lack of information regarding interactions causing field failures.
However, these relationships can be determined by analyzing a representative sample of products for which both field performance and manufacturing measurements are available. The sample of products and the associated measurements can be represented as follows:

$$U = \{u_j\}, \qquad u_j = (FR_{w,j},\ c_{1j},\ \ldots,\ c_{pj})$$

where $u_j$ represents product sample j, with j = 1 to n. The training data include the measurement $FR_{w,j}$ corresponding to warranty failure type "w" for each product sample and the corresponding manufacturing measurements $c_{1j}, \ldots, c_{pj}$ for each sample $u_j$, where n is the sample size and p is the number of variables measured in manufacturing for each sample.

The training data described above can be used to generate a model linking the manufacturing measurements $C$ to the field performance characteristic $FR_w$, shown in generic form as

$$FR_w = [A_w]\,C + \varepsilon \tag{2}$$

where $FR_w$ is a random variable associated with the field performance characteristic (warranty failure) "w" and $[A_w]$ is the matrix which describes the interactions between $C$ and $FR_w$. The noise $\varepsilon$ represents an unmodeled relationship between $C$ and $FR_w$. Equations (1) and (2) represent two generic models which integrate field performance with manufacturing measurements by identifying existing causality relationships between the DPs/PVs and product field failures based on a representative training sample. Therefore, although the actual interactions between the DPs and PVs are not deterministic (as shown by the unmodeled noise in (2)), there should exist certain persistent interactions between the DPs/PVs which cause the failure and can be extracted from the training data. The effect of noise is an important factor in selecting the modeling framework for the data analysis methodology, which is discussed later in this section.

Availability of Warranty Information: While the complete information from the field should include the continuous measure of each warranty failure $FR_w$ as shown in Fig. 2, the collection of actual product performance data from the field in the form of continuous measurements is very difficult and expensive, especially for products that are mass produced. Therefore, warranty information is provided as a binary variable corresponding to the status of each product. Hence, the training data used to analyze a particular warranty failure include a corresponding binary random decision variable $d_j$ which provides information about the status of warranty failure "w" of the jth sampled product

$$d_j = \begin{cases} 1 & \text{if the product is faulty} \\ 0 & \text{if the product is normal.} \end{cases} \tag{3}$$

Therefore, the field performance measure for failure "w" for all product samples in the training data can be expressed as $d_j$, for all j = 1 to n. This reduces the model between $FR_w$ and $C$ in (2) to a model between the corresponding binary outcome $d_j$ and the measurements $C$. Thus, this can be considered as a supervised classification model which differentiates between samples having $d_j = 1$ and $d_j = 0$ based on the measurements $C$.

The two steps of the proposed FRL methodology can be described as follows:

Step 1) Data-driven fault localization: It identifies a subset of manufacturing measurements that explains warranty failures of a given type "w". The identified subset is used to define the corresponding warranty fault region (WFR), normal region (NR), and boundary region (BND). In this paper, a supervised classification approach based on the RS method is developed for performing this step.

Step 2) Tolerance revaluation based on FRL: Tolerance design evaluation is conducted to eliminate warranty failures by redesigning the tolerances of each measured DP/PV to avoid the WFR.

A. Data-Driven Fault Localization

The objective of the first step is twofold: 1) to identify the manufacturing measurements related to the specific warranty failure and 2) to identify the corresponding WFR, NR, and BND regions.
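As a minimal sketch of how the binary decision variable in (3) could be assembled in practice, the following Python fragment joins manufacturing records with warranty claims by serial number. All names here (`Sample`, `build_training_data`, the serial numbers, measurement names, and failure labels) are hypothetical and chosen only for illustration. It also assumes that a product claimed under a different failure type is treated as normal with respect to failure "w", which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    serial: str          # traceability key shared by both databases
    measurements: dict   # measurement name -> value (continuous or categorical)
    d: int               # decision variable: 1 = warranty failure "w", 0 = normal


def build_training_data(manufacturing, claims, failure_type):
    """Label every manufactured unit with d = 1 if it has a claim of failure_type."""
    failed = {serial for serial, ftype in claims if ftype == failure_type}
    return [
        Sample(serial, dict(meas), 1 if serial in failed else 0)
        for serial, meas in sorted(manufacturing.items())
    ]


# Hypothetical manufacturing database and warranty claims.
manufacturing = {
    "SN001": {"tx_power_dbm": 27.1, "fixture_id": "F2"},
    "SN002": {"tx_power_dbm": 26.4, "fixture_id": "F1"},
    "SN003": {"tx_power_dbm": 27.3, "fixture_id": "F2"},
}
claims = [("SN003", "no_audio"), ("SN002", "cracked_display")]

training = build_training_data(manufacturing, claims, "no_audio")
labels = [s.d for s in training]
print(labels)
```

The resulting list of samples pairs each unit's measurements with its binary status, which is the form of training data a supervised classifier for failure "w" would consume.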
Although the data-driven fault localization in FRL analysis can be considered as a supervised classification problem, the methodology must consider the nature of warranty data, which can be described as follows.

Easy and Intuitive Interpretation of Classifier Structure: The primary objective of the FRL analysis is to provide the capability for easy interpretation of the classifier structure for the diagnosis of warranty failures rather than to have a model with only strong failure predictive ability. Therefore, the model should be able to identify important manufacturing measurements related to warranty failures, and find a warranty fault region for each identified measurement which helps to define the operation windows. While traditional classifiers, such as discriminant analysis, support vector machines, and neural networks, provide good classification accuracy, the form of the classifier and the classification function are difficult to interpret. Similarly, classifiers which use nonlinear models have the same limitation and do not provide an explicit relationship between tolerances of individual measurements and warranty failures.

Characteristics of the Warranty Data: Warranty data have specific characteristics which must be taken into consideration by the classification methodology. These characteristics can be summarized as follows:

Noise and error in warranty classification: Since the only information available from the warranty is based on the nature of customer claims, warranty failure information is imprecise in nature and depends on customer perception. Furthermore, no information is available about the nature of usage.

Multivariate non-normality and different covariance structure of warranty and normal data: It is difficult to satisfy the multivariate normality assumption due to the large number of measurements made in manufacturing and the presence of categorical variables.
Furthermore, some electronic characteristic measurements can be defined as adjustable parameters.

Multiple and disjoint fault regions caused by multiple root causes: Warranty failures may have multiple root causes and disjoint failure regions, making it difficult to develop a single model to explain all of the failures.

Small sample size of warranty failure available in the training data: In general, one of the major challenges in the analysis of warranty failure data is the large number of warranty failure types (on the order of 150–200) and the small sample of failures available for each failure type. Furthermore, this also requires that the method be able to handle measurements of mixed type (i.e., both categorical and continuous measures), since the possibility of converting categorical variables into dummy variables is limited by the small warranty failure sample size.

Table II provides a comparative analysis and review of traditional classification methods for warranty failure analysis based on the aforementioned criteria. Since the FRL methodology is based on the basic principles of the RS approach, Section IV discusses in depth the specific drawbacks of the traditional RS application to warranty data and compares them with the advantages of the developed FRL methodology. Appendix A provides the comparison of the traditional RS approach with other statistical classification methods, such as discriminant analysis and logistic regression, for simulated data with non-normal distributions. Furthermore, Appendix A also includes a comparative analysis of error rates of the FRL methodology, discriminant analysis, and logistic regression based on the analysis of actual warranty data obtained from the electronics industry. In summary, the RS approach shows lower error rates than traditional classification approaches (LDA, QDA, and logit regression).

TABLE II. Comparative analysis of various supervised classification methods versus proposed FRL based on characteristics of warranty data.

B. Tolerance Revaluation

The FRL methodology provides a direct relationship between the warranty failure and the identified subset of the measured DPs and PVs. Additionally, it defines the WFR, BND, and NR regions within the tolerances of the identified DPs and PVs. For example, since in cell-phone manufacturing all products in the field used by customers have passed all of the required tests during manufacture and were found to be within the design tolerances of all the measured parameters, the WFR and BND regions most likely intersect with the tolerance region of the corresponding DPs and PVs identified as related to the analyzed warranty failure. To improve product robustness to warranty failures, the tolerances of the identified DPs and PVs need to be redesigned to eliminate or reduce the overlap between the WFR and BND regions and the tolerance windows of all the identified DPs and PVs.

IV. FAULT REGION LOCALIZATION

This section describes the FRL methodology and then compares it to the traditional RS method. Traditional RS were developed as a classification method which is robust to noise and imprecision of data [24]–[26]. A short review of the RS methodology is provided in Appendix B.

While the RS methodology is robust to noise and distribution of data, it assumes that the decision class values (i.e., the warranty failure status of each product sample "j") are without uncertainty. However, in the case of warranty data analysis, the warranty failure status is determined based on customer feedback and is affected by numerous unknown factors in the field, including modes of product usage and environment, among others. Therefore, there is a considerable amount of uncertainty associated with the decision class. Furthermore, the performance of
Furthermore, the performance of MANNAR et al.: FAULT REGION LOCALIZATION: PRODUCT AND PROCESS IMPROVEMENT 429 the product in the field may be affected by the infant mortality period with a decreasing failure rate, or a product may be in transition state to a normal life period with a low, relatively constant failure rate. The product performance in the field is affected by such degradation; however, it may or may not be reported by a customer as warranty failure. Therefore, the traditional RS approach has a relatively high error rate as it does not consider . uncertainty present in the decision class The FRL methodology uses the concept of generalized RS based on Han et al. [27]. We incorporate parameters quantifying noise in the warranty and normal decision class. This is done by and , which represent utilizing two classification factors the noise for and , respectively. Additionally, an uncertainty parameter is assigned to decision variable for each product sample “j”. When , the product , the product “j” is cer“j” is certainly faulty; and when tainly normal. All steps of the FRL methodology and the relationships between them are shown in Fig. 4 and described as follows: 1) Generation of WIS: This step uses the training data obtained for a particular warranty failure to generate an information system for the FRL analysis. 2) Discretization using Boolean reasoning algorithm: Since the manufacturing measurements could be both continuous and categorical, continuous measurements are discretized into intervals using Boolean reasoning-based discretization. 3) Determination of family of equivalent classes: After discretization, we determine equivalent classes present in , each equivalent class is defined as a set of all product samples which cannot be distinguished from each other based on the discretized measurements (i.e., all of the for all of the product measurement parameters samples are equal). 
The set of equivalent classes determined in the data forms the family of equivalent classes. 4) Determination of dependency degree based on equivalent classes: The dependency degree determines the ability of measurements to differentiate between the and . The dedecision classes pendency degree is calculated based on the concept of membership function for each equivalent class. 5) Generation of reduct using genetic algorithm: Reduct is defined as the minimum subset of the measurements for which the dependency degree is approxfor the imately the same as the dependency degree whole measurement set . The reduct generation is based on a fitness function which simultaneously maximizes the to approach and reduces the cardinality value of of . The procedure is based on genetic algorithms. 6) Generation of NR, WFR, and BND regions based on the reduct : Based on the identified reduct , the following : (1) WFR for faulty prodregions are determined ucts; (2) NR for normal products; and (3) BND region representing the transition from the normal product to warranty failures representing the impreciseness or noise in dimensional the data. The identified regions are in the space. Fig. 4. Generalized RS mapping of warranty to manufacturing measurements (Section IV). The details of the steps described above and also shown in Fig. 4 are as follows: A. Generation of Warranty Information System (WIS) The WIS representing training data for a particular failure is defined by the following relation: (4) where is the training data set for where is the product sample size. Each analysis of failure product sample consists of both manufacturing and warranty status . measurements is a nonempty set of manufacturing measurements corresponding to each sample in the training data. can be represented as shown below where to n; n is the product sample size; and to , where is the number of measured parameters during manufacturing. 430 IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, VOL. 
The decision class has a binary element for each of the n product samples. One binary value indicates a product with warranty failure, and the other indicates the normal status of the product. An uncertainty value is associated with the decision class value of each sample, where 0 indicates the normal status of the product and 1 indicates the product warranty failure. Two noise classification factors estimate the noise in the normal and warranty decision classes, respectively. This allows for separate representation of the noise level for the faulty and normal classes of products. This distinction is especially important for the analysis of warranty data since, generally, customer feedback about faulty products has larger ambiguity than for normal products.

We use a simple case, Example 1, to illustrate steps A–D of the FRL methodology. Table III represents the WIS for Example 1.

TABLE III
WIS FOR EXAMPLE 1 WITH 5 SAMPLES, n = 5, AND 3 MEASUREMENTS, p = 3

B. Discretization Using Boolean Reasoning Algorithm

Since the manufacturing measurements could be both continuous and categorical in nature, the measurements are discretized to provide the ability to analyze them together. The data are discretized using a Boolean reasoning-based discretization procedure. The procedure partitions each measurement into intervals while minimizing the loss of discernibility between normal and warranty decision classes. The concept of discernibility represents the capability of the measurements to distinguish between normal and faulty products. The discretization procedure is based on the Boolean reasoning algorithm developed by [28] and implemented by [25]. The algorithm consists of the following two steps:

1) Naïve cut generation: This step partitions individual measurements into intervals. The value set of each measurement (i.e., the set of all distinct values of that measurement found in the training data) is sorted. Based on the sorted value set of m distinct values, cuts are generated midway between consecutive values (m − 1 candidate positions) based on the conditions defined in (5) and (6). Equation (5) defines the collection of decision class values observed for each value of the measurement; this collection can contain a single decision class or both. Equation (6) defines intervals based on midway cuts between two consecutive values when their collections of decision class values are not singletons or are not equal.

2) Boolean reasoning-based cut evaluation: This step measures the goodness of the ensemble of all cuts generated in (6) for all measurements, in terms of partitioning the measurement space and distinguishing between decision classes. This is then used to remove redundant cuts across different measurements. The goodness of the ensemble of all cuts is measured by the Boolean product-of-sums function defined by (7). The prime implicant of this function is the minimal subset of cuts that preserves the original discernibility with respect to the decision classes. The implementation by [25] uses a greedy search-based approach to compute the prime implicant.

The details of the Boolean reasoning methodology can be found in [25] and [28]. The discretized measurements for Example 1 are shown in Table IV.

TABLE IV
WIS OF EXAMPLE 1 WITH DISCRETIZATION OF TWO MEASUREMENTS

C. Determination of Family of Equivalent Classes

An equivalent class is a set of product samples that cannot be distinguished from each other based on the measurements C (i.e., samples in an equivalent class have the same value for each measurement). That is, two samples are members of the same equivalence class if and only if their values agree on every measurement. Many such equivalent classes may exist in the training data for a given measurement set; together they form the family of equivalent classes. An example of equivalent classes is provided in Table V for the illustrative example.

TABLE V
EQUIVALENT CLASSES FOR EXAMPLE 1 BASED ON DISCRETIZED C

TABLE VI
DEPENDENCY DEGREE CALCULATION FOR EXAMPLE 1

D. Determination of Dependency Degree Based on Family of Equivalent Classes

The dependency degree determines the effectiveness of a set of measurements C, or any of its subsets, in differentiating between the warranty and normal classes. The dependency degree is determined based on the membership of each equivalent class in the normal or warranty decision class. It is calculated in two steps:

1) Membership Functions for the Equivalent Classes: Each equivalent class consists of product samples that cannot be discerned from each other based on the measurements C. The membership of any equivalent class is calculated based on the uncertainty values of each sample “j” in the equivalent class; the uncertainty value determines the uncertainty associated with the decision class of the sample. The membership of an equivalent class is measured by the membership functions for the warranty class and for the normal class defined in (8) and (9), which involve the cardinality of the equivalent class. The membership function values are compared with the noise thresholds defined for the warranty and normal product regions to determine the membership of the equivalent class. Based on the specified noise thresholds, an equivalent class is classified into the warranty class if its warranty membership meets the warranty threshold, into the normal class if its normal membership meets the normal threshold, and into the boundary region otherwise (i.e., it cannot be classified into either decision class based on the members of the equivalent class). The membership functions can be similarly used for all equivalent classes to determine the membership of all the equivalent classes in the training data set based on C. The memberships of the equivalent classes can then be combined to determine the membership of all samples in the warranty, normal, and boundary regions, as given in (10)–(12).

2) Calculation of Dependency Degree: The dependency degree is the fraction of samples in the training data that is classified into the warranty or normal classes based on the measurements C.
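Steps B through D can be sketched in a few lines of Python. The block below is a minimal illustration under stated assumptions, not the authors' implementation. The names `naive_cuts`, `discretize`, `dependency_degree`, `alpha`, `beta_w`, and `beta_n` are hypothetical. The warranty membership of an equivalent class is assumed to be the mean of the per-sample uncertainty values (with 1 meaning certainly faulty), the normal membership is assumed to be its complement, and a class is assumed to join a region when its membership is at least the corresponding threshold. The Boolean-reasoning pruning of redundant cuts is omitted, and the data are made up.

```python
from collections import defaultdict


def naive_cuts(values, decisions):
    """Candidate cuts midway between consecutive distinct values of one measurement.

    A cut is skipped only when both neighbouring values were observed with the
    same single decision class, since such a cut cannot separate the classes.
    """
    seen = defaultdict(set)
    for v, d in zip(values, decisions):
        seen[v].add(d)
    ordered = sorted(seen)
    return [(lo + hi) / 2 for lo, hi in zip(ordered, ordered[1:])
            if not (len(seen[lo]) == 1 and seen[lo] == seen[hi])]


def discretize(value, cuts):
    """Interval index of value: the number of cuts lying below it."""
    return sum(value > c for c in cuts)


def dependency_degree(rows, alpha, beta_w=0.75, beta_n=0.75):
    """Return (dependency degree, regions) for discretized rows.

    ASSUMPTION: alpha[j] in [0, 1] is the uncertainty of sample j (1 = certainly
    faulty); class membership is the mean of alpha, compared with >= thresholds.
    """
    classes = defaultdict(list)          # equivalent classes: identical rows
    for j, row in enumerate(rows):
        classes[tuple(row)].append(j)
    regions = {"WFR": [], "NR": [], "BND": []}
    for members in classes.values():
        mu_w = sum(alpha[j] for j in members) / len(members)
        mu_n = 1.0 - mu_w
        if mu_w >= beta_w:
            regions["WFR"] += members
        elif mu_n >= beta_n:
            regions["NR"] += members
        else:
            regions["BND"] += members
    gamma = (len(regions["WFR"]) + len(regions["NR"])) / len(rows)
    return gamma, regions


# Hypothetical data: one continuous measurement, five samples.
c1 = [1.0, 1.5, 2.0, 2.5, 3.0]
d = [0, 0, 1, 1, 0]                      # observed warranty status
alpha = [0.0, 0.0, 1.0, 1.0, 0.0]        # uncertainty of each status
cuts = naive_cuts(c1, d)
rows = [(discretize(v, cuts),) for v in c1]
gamma, regions = dependency_degree(rows, alpha)
print(cuts, rows, gamma, regions)
```

In this toy data set every equivalent class is pure, so no sample falls in the boundary region. Mixing a normal sample into a mostly faulty class pushes that class into BND and lowers the dependency degree.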
The dependency degree is defined in (13). The membership function values for each equivalent class in Example 1 are shown in Table VI. For the assumed values of the classification factors, we can see in Table VI that one equivalent class belongs to the warranty class and the others belong to the normal class, while no equivalent class is in the boundary region. Therefore, based on the number of samples in each equivalent class and its membership, all samples are classified and the dependency degree equals 1. An increase in the dependency degree provides a better ability to classify product samples into warranty or normal regions rather than the boundary region. The maximum attainable dependency degree is obtained when all of the measurements C are used.

E. Generation of Reduct Using GA

A reduct is defined as the minimum subset of the measurements that approximately preserves the same dependency degree as the whole measurement set C. The reduct generation is based on a fitness function which simultaneously maximizes the dependency degree of the subset, so that it approaches that of C, and reduces the cardinality of the subset. The procedure is based on genetic-algorithm optimization to search for the minimum subset. The GA optimization is formulated based on the defined fitness function F, which is then maximized to determine the reduct. Fig. 5 illustrates the genetic-algorithm implementation for the search of the reduct based on [25]. The method for generation of the reduct is described in two steps:

1) Fitness function F: The fitness function F used to search for the optimum reduct combines two criteria:

Approximate Reduct Determination: For any subset B of C, the dependency degree as in (13) can be calculated based on its family of equivalence classes and corresponding membership functions using (10)–(12). If a subset B preserves the same dependency degree as C, then the remaining measurements are redundant and, thus, B is defined as the exact reduct. However, the calculation of exact reducts in warranty data analysis is computationally intractable and unstable [29] due to the large number of possible combinations of measurement variables and the presence of noise. Therefore, it is necessary to find a reduct that approximately preserves the same dependency degree as C. The approximation is defined by the threshold parameter “r.” The approximate reduct calculation is represented by the first term of F in (14), where the maximum value of the first term is restricted by the threshold “r.”

Penalty Function Favoring Shorter Reducts: In addition to generating the approximate reduct, it is desirable to have a reduct of the smallest cardinality (i.e., the smallest possible subset of C). The second term in (14) corresponds to the penalty function that penalizes longer reducts. Based on the two aforementioned criteria, the fitness function is represented in (14).

2) Reduct identification based on genetic algorithm (GA): As shown by [29], the calculation of the optimal solution for reduct identification is computationally intractable. Additionally, the fitness function used for reduct identification could be discrete and multimodal. The optimization of the reduct based on the fitness function therefore requires heuristic-based optimization, and we use a genetic-algorithm-based approach to identify the reduct that maximizes the fitness function defined by (14). Fig. 5 shows the details of the GA-based reduct calculation procedure. The stopping criterion for the maximization of F is a lack of improvement in the average fitness of the population over a specified number of generations (50 generations in the conducted case studies). The output of the GA-based reduct identification is shown in (15).

Fig. 5. Genetic-algorithm-based reduct generation.

F. Generation of NR, WFR, and BND Regions Based on the Reduct

The approximate reduct identifies parameters related to the warranty failure. The reduct is then used to identify the NR, WFR, and BND regions. The membership of product samples belonging to the WFR, NR, and BND regions can be calculated for the reduct based on the family of equivalent classes and the classification factors, as shown by (16)–(18). These are similar to (10)–(12) except that they are determined based on the reduct rather than on the whole measurement set C.

V. TOLERANCE REVALUATION

Tolerance revaluation is performed based on the WFR, NR, and BND regions. Fig. 6 provides the steps involved in tolerance revaluation, which are also described below.

Fig. 6. Tolerance revaluation based on WFR, BND, and NR by GRS (Section V).

A. Identify Ranges for Regions Defining WFR, NR, and BND

The WFR, NR, and BND regions are defined by (16)–(18). For ease of interpretation of warranty failures, it is important to relate the WFR, NR, and BND regions to the design tolerances of the measurements in the reduct. Thus, it is necessary to express them in terms of the range of each measurement for each region. The ranges of the measurements then have a direct relationship with the respective tolerances of the DPs and PVs they represent. The ranges are determined based on the values of the samples in the WFR, NR, and BND regions for each measurement (i.e., the maximum and minimum values of the parameter for the samples in each region).

Let B be the identified reduct. Equation (16) identifies the samples in the training data set that are in WFR based on all of the equivalent classes. Each sample Uj in WFR has a corresponding value for each measurement in the reduct. Therefore, the ranges of the reduct measurements representing WFR can be defined as in (19). Similarly, the ranges of all reduct measurements that define the NR and BND regions can be determined as in (20) and (21).

The ranges from (19)–(21) can be visualized easily up to three dimensions, as shown in Fig. 6. However, for higher dimensions of the WFR, NR, and BND regions, they are easier to represent and visualize in terms of the if-then rules explained below.

B. Rule-Based Representation of WFR, NR, and BND Regions

Equations (19)–(21) identify ranges for the WFR, NR, and BND regions for each measurement in the reduct. These can be combined into if-then rules for the WFR, NR, and BND regions. The if-then rule for the WFR region can be represented as a Boolean summation of the individual ranges for all measurements identified based on the approximate reducts, as defined in (22). Similarly, rules for the NR and BND regions can be represented as in (23) and (24).

C. Tolerance Revaluation Based on WFR, NR, and BND Regions Identified for Each Measurement

For each measurement (representing DPs/PVs), a tolerance is assigned during design. As seen from Fig. 6 using a 3-D example, the original tolerance includes the WFR, NR, and BND regions. Hence, the tolerance of the parameters identified in the reducts can be expressed as a Boolean sum of the WFR, NR, and BND regions, as shown in (25).

Since the WFR and BND regions could overlap with the tolerances of the measurements, it is possible to reduce the occurrence of the corresponding warranty failure by re-evaluating the tolerances to reduce the overlap. The tolerances could be redesigned to avoid the WFR. The BND region represents the uncertainty in the data and, thus, it can be avoided or retained in the tolerance based on a comparison of the cost of warranty failures and the cost of tolerance adjustment. Equation (26) shows an example of redesigned tolerances which include only the NR region.

VI. INDUSTRIAL CASE STUDIES

Two case studies from cell-phone manufacturing illustrate the FRL methodology. The outline of the cell-phone manufacturing process is shown in Fig. 2. The case studies analyze two warranty failures for one product model: 1) power/battery performance failure and 2) audio signal failure. Both failures were classified as “no defect found” (NDF) (i.e., no root causes of the failures were identified during repair processes at the service center). The product model considered in the analysis is part of the iDEN phone family introduced in 2001 and is a high-end consumer model. Each of the failures is analyzed independently since they represent failures occurring in different functional subsystems (audio and battery).

The FRL methodology was used to identify the reducts (subsets of measured DPs and PVs) which explain the two warranty failures. The corresponding WFR, NR, and BND regions are identified based on the reducts. The samples used for the analysis in both cases include products that failed within six months of their sale. The names of DPs and PVs have been changed to protect the confidentiality of the OEM. The data used in the analysis of the two warranty failures consist of the following.

1) Warranty failure samples: These were products classified under the warranty failure being studied. Their corresponding manufacturing measurements (C) were also collected. Since these samples represent warranty failures, they are given the warranty-failure decision class.

2) Normal samples: The products that have not failed in the field within six months after sale form part of the training data along with their manufacturing measurements. Since these samples represent normal or nonfaulty products, they are given a common (normal) decision class.

A. Case Study 1: Analysis for Warranty Failure: Power/Battery Performance Failure

This warranty failure was one of the major failures in the “No Defect Found” category, and the battery life of the model was a major concern. The analysis of the warranty failure is conducted following the main steps of the FRL methodology presented in Sections IV and V.
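As a compact reminder of what the Section V steps compute, the sketch below derives per-region ranges for the reduct parameters, applies them as an if-then rule, and reports where a region overlaps a tolerance window. It is an illustration under stated assumptions rather than the authors' code. The function names, parameter names (`c1`, `c2`), sample values, and tolerance limits are all hypothetical. The rule is taken as a conjunction across the reduct parameters, which is how the case-study rules read; that reading is an assumption.

```python
def region_ranges(samples, regions, reduct):
    """Per-region (min, max) of each reduct parameter, in the spirit of (19)-(21)."""
    ranges = {}
    for sample, region in zip(samples, regions):
        box = ranges.setdefault(region, {})
        for name in reduct:
            value = sample[name]
            lo, hi = box.get(name, (value, value))
            box[name] = (min(lo, value), max(hi, value))
    return ranges


def in_region(sample, box):
    """If-then rule (ASSUMED conjunction): every parameter lies inside its range."""
    return all(lo <= sample[name] <= hi for name, (lo, hi) in box.items())


def overlap_with_tolerance(box, tolerance):
    """Part of each parameter's tolerance window covered by the region, or None."""
    result = {}
    for name, (lo, hi) in box.items():
        t_lo, t_hi = tolerance[name]
        a, b = max(lo, t_lo), min(hi, t_hi)
        result[name] = (a, b) if a <= b else None
    return result


# Hypothetical training samples with their region labels from the FRL analysis.
samples = [
    {"c1": 1.0, "c2": 5.0},
    {"c1": 1.5, "c2": 6.0},
    {"c1": 3.0, "c2": 9.0},
    {"c1": 3.5, "c2": 9.5},
]
regions = ["NR", "NR", "WFR", "WFR"]
ranges = region_ranges(samples, regions, ["c1", "c2"])

# Hypothetical original design tolerances for the two parameters.
tolerance = {"c1": (0.5, 4.0), "c2": (4.0, 10.0)}
print(ranges["WFR"])
print(in_region({"c1": 3.2, "c2": 9.1}, ranges["WFR"]))
print(overlap_with_tolerance(ranges["WFR"], tolerance))
```

A nonempty overlap for every parameter means products can pass all manufacturing tests and still sit inside the fault region, which is the situation tolerance revaluation is meant to remove.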
1) FRL-Generalized Rough Set (GRS) Analysis:

Step A) WIS: The analysis was conducted with a data set that had 450 normal operating samples and 23 failures. The 23 warranty failures were collected over a period of five months, and all of these failures occurred within six months after sale. A total of 170 parameters were measured for this product during testing in manufacturing, which forms the measurement set C (i.e., each sample consists of 170 measurements, p = 170). Further, a value of 0.75 was set for each of the two classification factors representing the noise in the warranty and normal decision classes. These values were determined based on the analysis of historical data for the warranty failure and feedback from service regarding the failure.

Step B) Discretization of continuous measurements: Since the measurements (C) consist of both categorical (65) and continuous (105) variables, the measurements are discretized based on the Boolean reasoning algorithm.

Step C) Based on the discretized measurements, the family of equivalent sets was determined. The family of equivalent sets represents all sets whose elements cannot be distinguished from each other based on the available manufacturing measurements.

Step D) The dependency degree is determined based on the membership of the equivalent classes using (13).

Step E) Genetic-algorithm-based reduct generation: The approximate reduct was generated in GRS. The value of the approximation threshold r was chosen as 0.9 based on convergence and stability of the reducts. The identified reduct for the failure was the following.

1) A continuous variable which is a current measurement.
2) A continuous variable which is a power measurement.
3) Number of retests: A categorical variable measuring the number of times the product is tested in a testing station.

Fig. 7. Scatter plot of both normal data (500 samples) and failures (8 samples) with C-D-E-J indicating the WFR, A-B-I-F-G-H is the NR and B-C-J-E-F-I is the boundary region.
Therefore, the generated reduct provides a significantly smaller subset of parameters: 3 out of the total of 170 measurement parameters.

Step F) Warranty, normal, and boundary regions: The calculation of the WFR, NR, and BND regions is performed based on the identified approximate reducts using (16)–(18). The calculated WFR consisted of 20 samples, the NR of 444 samples, and the BND region of 9 samples.

2) Tolerance Revaluation: The range (WFR), range (BND), and range (NR) are calculated based on (19)–(21) and are shown in Fig. 7: WFR by C-D-E-J, BND by B-C-J-E-F-I, and NR by A-B-I-F-G-H. Fig. 7 also shows that the failures cannot be identified during testing in manufacturing, as they are clearly within the original tolerances for both variables. Prevention of the failure requires a change in design or tolerance revaluation to avoid the fault region. The samples in the boundary region could signify products that have been functionally affected by the values of the identified parameters but were not reported as warranty failures due to noise factors such as customer usage.

The analysis results can also be expressed in the form of rules that identify the DPs/PVs related to the failures and the corresponding WFR and NR regions, as shown in Table VII. The first rule is a combination of a categorical variable and a continuous variable. The categorical variable is a measure of the number of times the testing is repeated for the product. If a product fails a particular test, it is reworked and tested again. The first rule identifies as warranty failures those products that are tested multiple times and that are within the identified WFR range of the continuous variable.

The second rule is a combination of the two continuous variables. The plot of the data points using these two variables, shown in Fig. 7, clearly identifies the ability of these two factors to discriminate between the faulty and normal products. Based on the rules in Table VII and the original tolerances in Fig. 7, the occurrence of the failure can be reduced by revaluation of the tolerances of the two continuous variables to avoid or reduce overlap with the WFR and BND regions.

TABLE VII
RULES GENERATED FOR FAILURE W

3) Discussion for the Power/Battery Performance Failure:

Fault interpretation: Based on the identified factors and their fault regions, the results were presented to design engineers for physical interpretation. The two continuous parameters determine the current drawn from different subcomponents in the cell phone. When both are in the higher end of their tolerances, the result is a large load on the battery when the cell phone is switched on, affecting battery performance. Therefore, although the parameters are within their individual tolerances, they interact during operation of the product, resulting in failure.

Impact of the Analysis on New Product Design: Based on the identified parameters related to the warranty failure and the interpretation of the interaction, design changes were made for the new products introduced in 2004, which eliminated the “NDF” failure for the new products.

4) Impact of Analysis on Process Control for Current Product: In addition to changes in the new products, the analysis can be extended to process control for current products to reduce the overlap of the measurements of the two parameters with their corresponding WFR and BND regions. The case study was analyzed based on normal data for a single plant or manufacturing location (Plant 1). The normal products from another manufacturing location (Plant 2) were obtained, and the measurements for the two parameters were superimposed on the WFR, NR, and BND regions obtained from the analysis performed above. Fig. 8 shows normal data from both manufacturing locations with respect to the WFR and NR regions. From Fig. 8, we can see that there is a clear difference between the distributions of the two parameters for the two manufacturing locations. Plant 2 is more sensitive to the warranty failure due to its overlap with the WFR, although it is better centered with respect to the original tolerance window. Further, it should be noted that the WFR, NR, and BND regions were obtained based on the analysis for Plant 1 only; therefore, there is an overlap of 15 normal products from Plant 2 in the WFR region (a misclassification of 0.93%).

Fig. 8. Scatter plot of both normal data (1500) and failures (20) with boundaries A-B-C-D indicating the fault region.

B. Case Study 2: Analysis for Warranty Failure: Audio Failure

The second warranty failure, audio, was analyzed similarly using warranty failures for a period of six months and the corresponding manufacturing measurements for the failures. The warranty failure was a significant failure category related to the communication performance of the radio product. The failure information was collected for a period of 6 months, consisting of 12 faults, and compared with 500 normal operating samples that had not failed in the field during the period of 6 months. The same classification factor values of 0.75 as in the power/battery failure analysis are used for analyzing the audio failure.

1) FRL Analysis: Steps A to D, which correspond to all steps beginning with the WIS generation until determination of the dependency degree, are conducted in the same way as shown for the power/battery warranty failure.

Step E: Reduct generation: Two possible reducts (Reduct 1 and Reduct 2), consisting of three variables each, are identified for two possible failure relationships. The two reducts generated may indicate multiple root causes for the failures and, therefore, multiple fault regions.
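The reduct search of Step E can be illustrated with a small fitness evaluation in the spirit of (14). The sketch below is hedged throughout: the function names are hypothetical, the two terms are given equal weight as an assumption, the membership rule is the same assumed mean-of-uncertainty rule with thresholds of 0.75, and exhaustive enumeration stands in for the genetic algorithm, which is only feasible for a handful of measurements. The toy data are constructed so that two different single measurements separate the classes equally well, which yields two equally fit reducts.

```python
from collections import defaultdict
from itertools import combinations


def dependency(rows, alpha, cols, beta_w=0.75, beta_n=0.75):
    """Fraction of samples whose equivalent class (on cols) is WFR or NR."""
    groups = defaultdict(list)
    for j, row in enumerate(rows):
        groups[tuple(row[c] for c in cols)].append(j)
    classified = 0
    for members in groups.values():
        mu_w = sum(alpha[j] for j in members) / len(members)  # ASSUMED membership
        if mu_w >= beta_w or (1.0 - mu_w) >= beta_n:
            classified += len(members)
    return classified / len(rows)


def fitness(rows, alpha, cols, r=0.9):
    """ASSUMED form: capped dependency ratio plus a reward for short subsets."""
    p = len(rows[0])
    gamma_all = dependency(rows, alpha, range(p))
    ratio = dependency(rows, alpha, cols) / gamma_all if gamma_all else 0.0
    return min(r, ratio) + (p - len(cols)) / p


def best_subsets(rows, alpha, r=0.9):
    """Exhaustive stand-in for the GA: all subsets that reach the top fitness."""
    p = len(rows[0])
    scored = [(fitness(rows, alpha, cols, r), cols)
              for k in range(1, p + 1)
              for cols in combinations(range(p), k)]
    top = max(score for score, _ in scored)
    return [cols for score, cols in scored if abs(score - top) < 1e-12]


# Hypothetical discretized data: columns 0 and 1 each separate the classes,
# column 2 carries no information about the failure.
rows = [(0, 0, 1), (0, 0, 0), (1, 1, 1), (1, 1, 0)]
alpha = [0.0, 0.0, 1.0, 1.0]
print(best_subsets(rows, alpha))
```

Capping the first term at r means a subset gains nothing by pushing its dependency ratio above the threshold, so the length term decides among subsets that are already good enough.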
Step F: Generation of WFR, NR, and BND regions: The calculation of the WFR, NR, and BND regions is performed separately for each reduct based on the identified approximate reducts using (16)–(18). Also, since the NR and WFR were well separated relative to the classification factor values of 0.75, no samples are obtained in the BND region. The analysis identified two different sets of parameters related to different sets of test parameters.

2) Tolerance Revaluation: The ranges identified for WFR and NR using (19)–(21) are shown in Figs. 9 and 10 for the two reducts. Since there are no samples in the BND region, no range is generated for it. As seen from Figs. 9 and 10, there is a good separation between the NR and WFR regions, which results in no identification of the BND region based on the training data. The different reducts indicate two different possibilities for the failure. Similar to the power/battery failure, there is an overlap of the WFR with the tolerance although, in this case, the tolerances of three parameters need to be considered for revaluation to eliminate or reduce warranty failures.

Fig. 9. Scatter plot of both normal data (500) and warranty failures (5) for the first rule in Table VIII with the identified fault region.

Fig. 10. Scatter plot of both normal data (500) and warranty failures (7) for the second rule in Table VIII with the identified warranty fault region.

TABLE VIII
RULES GENERATED FOR FAILURES W

C. Discussion Regarding Future Work Based on the Analysis of Case Studies

The case study analyzes each warranty failure separately; this is done because the analyzed failures are from different functions and/or subsystems of the product, making them independent. For example, the two failures studied (battery and audio failures) for cell phones were mutually exclusive (i.e., there was no instance of one product having both warranty failures). However, this may not be true for failures related to the same function or subsystem. This poses an important question regarding whether the customer complaint is sufficient to separate such failure types or whether they need to be analyzed together. One future research area will be to study such failure types which are not mutually exclusive.

VII. CONCLUSIONS AND FUTURE WORK

Evaluation of the product field performance is a critical factor in product life-cycle management (PLM). Field performance information, such as warranty failure data, is an important measure of product quality and robustness. The developed fault region localization (FRL) methodology provides a general framework for integrating two traditionally separate areas: design and manufacturing on one side, and warranty failures on the other. This integrated FRL methodology simultaneously addresses both: 1) identification and diagnosis of design and manufacturing root causes leading to product warranty failures and 2) analytical feedback to design to prevent or reduce the occurrence of field failures in new product models. Recent research and development related to storing and tracking warranty failure data and key product and process parameter data provide a much needed opportunity to analyze and diagnose warranty failures. The FRL methodology is based on two steps.

Step 1) Supervised classification of each warranty failure using the proposed generalized RS approach, which identifies the minimum number of key DPs and PVs and their operation windows needed to classify a given warranty failure.

Step 2) Tolerance revaluation of the identified DPs and PVs to avoid a given warranty failure.
The tolerance revaluation approach provides an intuitive interpretation of the results in the form of a warranty fault region, normal region, and boundary region, represented graphically or as a set of designed “tolerance rules.” The presented methodology provides a general framework for the analysis of field failures based on manufacturing measurements. Two presented case studies from cell-phone manufacturing illustrate and validate the developed FRL methodology. The initial application of the methodology in cell-phone manufacturing completely eliminated one of the top 5 warranty failures classified in the No-Defect-Found category.

APPENDIX A
COMPARATIVE ANALYSIS OF GENERALIZED ROUGH SET WITH TRADITIONAL MULTIVARIATE STATISTICAL CLASSIFICATION TECHNIQUES

Section III describes the motivation for using the GRS methodology. Table II provides a summary of the comparison between the different classification methods for warranty data analysis. This section describes the comparison in detail and provides a numerical comparison of the error rate of GRS with that of other classification methods. The use of the proposed GRS methodology offers the following advantages based on the nature of the warranty data to be analyzed:

1) Imprecise Nature of Warranty Data: Warranty failures are characteristically imprecise in nature, as they depend on customer perception and usage (i.e., two samples having the same manufacturing characteristics could have different field performance). This may be due to different operating conditions based on customer usage, period of usage, environment, and other factors which are difficult to capture in warranty data and, therefore, form part of the noise in the data. Rough set theory was developed as an analysis method for imprecise data [24] and determines the boundary region between classes, which takes into account the noise in the data.
Further modifications are made in the GRS by introducing classification factors, which are used to incorporate the noise in the warranty and normal decision classes. In addition, a certainty factor is introduced to represent uncertainty or confidence in a sample. These two additions to the traditional rough set theory 1) incorporate the uncertainty in the classification of individual samples into warranty and normal and 2) allow the representation of noise present in warranty and normal data through classification factors.

2) Non-Normal or Asymmetric Distribution Due to Aggregation Based on Warranty Performance: The manufacturing measurements for the normal and warranty decision classes could have non-normal or asymmetric distributions. Field failure data could have non-normal and asymmetric distributions because they are aggregated into two classes based on warranty information, and some measurements could be inherently non-normal. Rough-set-based methodology is significantly more robust than traditional classification methods with respect to sample distribution, covariance structure, and sample size, which are critical for warranty analysis and are illustrated by the following quantitative comparison.

Comparison of LDA, QDA, and Logit Regression With Traditional RS and GRS: Doumpos and Zopounidis [26] conducted a study to compare the performance of statistical classification methods (linear and quadratic discriminant analysis, logit analysis) to RS for continuous variables. The analysis conducted in [26] consists of an extensive Monte Carlo simulation to examine the performance of these methods under different data conditions. The simulation in [26] was based on seven factors identified by its authors as important for classification analysis. The first factor encompasses the various methodologies compared in the simulation (i.e., LDA, QDA, logit analysis, and RS).
The other factors are the statistical distribution of the sample data (exponential, uniform, log-normal, normal), the number of groups in the data, the size of the training sample, the correlation coefficient between attributes, the homogeneity of the covariance matrices, and the degree of overlap between the groups. A validation sample is created for each combination of the above factors. The analysis of the results obtained from the experimental design is based on the classification error in the validation sample. A seven-way analysis of variance is performed on the transformed misclassification rates of the methods, using the error measure defined below:

$$\text{Error measure} = 2\arcsin\sqrt{\text{error rate}} \tag{A.1}$$

Description of error rate: The error measure in (A.1) follows the analysis performed in [26], which uses this transformation to stabilize the variance of the misclassification rates; higher values of the error measure indicate lower performance. In addition to the displayed error values, the effect of the interaction between the methodology used (LDA, QDA, or RS) and the distribution was found to be statistically significant, and the corresponding ANOVA analysis is shown in [26]. Other significant interactions include sample size, covariance matrix structure, and the size of the training sample. Rough sets are shown to provide consistently lower error rates, with a significant difference from statistical classification for small and medium sample sizes.

TABLE IX
COMPARISON OF GENERALIZED ROUGH SETS WITH TRADITIONAL CLASSIFICATION METHODS FOR NON-NORMAL DISTRIBUTIONS AND WARRANTY CASE DATA (VALUES IN THE CELLS ARE ERROR MEASURE (A.1); HIGHER VALUES INDICATE LOWER PERFORMANCE)

In addition to this simulation for traditional RS, we performed a comparison of the proposed GRS with the same classification methods for warranty data analysis. The results of the analysis are shown in Table IX.
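As a minimal sketch of the error measure (A.1), the Python snippet below applies the arcsine square-root transformation, 2 arcsin(sqrt(error rate)), a standard variance-stabilizing transformation for proportions. That (A.1) takes exactly this form is an assumption here, and the function name `error_measure` and the error rates used are hypothetical, chosen only for illustration; they are not results from the paper.

```python
import math


def error_measure(error_rate: float) -> float:
    """Arcsine square-root transform of a misclassification rate in [0, 1].

    Assumed form of (A.1); larger values indicate lower performance.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(error_rate))


# Hypothetical misclassification rates, for illustration only.
for rate in (0.05, 0.20, 0.50):
    print(f"error rate {rate:.2f} -> error measure {error_measure(rate):.4f}")
```

The transform is monotonically increasing on [0, 1], so the ranking of methods by error measure matches the ranking by raw error rate; only the spacing between values changes.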
3) Multiple Root Causes and Disjoint Fault Region: The methodology should also consider the possibility of multiple root causes for the failures. Thus, unlike regression and other model-based techniques (i.e., a single model for the whole population), the method should be object based [i.e., it should be able to generate multiple models (if necessary) to describe the data]. Multiple root causes could mean different combinations of process/test variables related to the failure or different failure regions for the same combination of variables. The GRS methodology can create more than one reduct to explain the warranty failure and thereby detect the existence of multiple root causes or disjoint regions.

4) Interpretation of Results and Mapping: The method should provide not only the parameters related to the faults but also the nature of the relationship, or the fault region, within the tolerance of the identified parameters. As expressed in [30], although neural networks are robust, as they do not require any distribution assumptions (similar to RS) and handle nonlinear data, they require a larger training sample size and are not easy to explain, especially in the presence of hidden layers. The ability to understand the interactions between the process/product parameters and warranty failures is critical for warranty data analysis.

APPENDIX B
RS THEORY BACKGROUND

Rough set theory was developed by Pawlak [24] and has been used for the classification of imprecise or uncertain data. RS theory is a tool to describe dependencies between the various measurements characterizing samples in the dataset. It is used to evaluate the significance and relationship of measurements to different sample types (in our case, normal and failure samples) and to deal with inconsistent data. As an approach to handling imperfect data, it complements other theories that deal with data uncertainty, such as fuzzy set theory, and has been found to be a very useful tool in the study of classification problems in various applications, such as business failure prediction, medical decision making, and diagnosis of assembly failures [31]. RS can also handle mixed-type data that contain both continuous and categorical information, which are difficult to analyze using standard statistical techniques.

RS uses training data to determine patterns of interest in the dataset. The training data used to develop the classifier consist of samples, or objects, and each object has information associated with it. The information associated with each object consists of conditional and decision attributes. Conditional attributes are those which describe the characteristics of the object (e.g., manufacturing measurements in this paper). The decision attributes partition the objects into classes (e.g., warranty and normal classes in this paper).

The principal characteristic of classification using RS is the measurement of impreciseness in data. Objects that are characterized by the same conditional attribute values are considered to be indiscernible. This indiscernibility relation constitutes one of the mathematical bases of RS theory. A set of mutually indiscernible objects (called an equivalent set in this paper) forms a basic granule of knowledge about the dataset. The union of these equivalent sets forms RS with lower and upper approximations. A rough set can be described as a collection of objects that, in general, cannot be precisely characterized in terms of the values of their sets of attributes but which can be characterized in terms of lower and upper approximations.

For classification, RS are generated corresponding to each decision class. These RS determine the space in the conditional attributes that is related to specific decision classes. The RS consists of two boundaries determining two regions in the conditional attribute space.
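The indiscernibility relation and the two approximations can be illustrated with a minimal sketch. The Python code below groups objects into equivalent sets by their conditional attribute values, then collects the sets that lie entirely inside a decision class (lower approximation) and those that merely intersect it (upper approximation). The sample identifiers, the attribute names `gain` and `offset`, and their discretized values are hypothetical and serve only as an illustration; they are not data from the case studies.

```python
from collections import defaultdict


def equivalence_classes(objects, attributes):
    """Group object ids whose values agree on every listed conditional attribute."""
    groups = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[a] for a in attributes)
        groups[key].add(obj_id)
    return list(groups.values())


def approximations(objects, attributes, target):
    """Return the (lower, upper) approximations of the set `target` of object ids."""
    lower, upper = set(), set()
    for granule in equivalence_classes(objects, attributes):
        if granule <= target:   # granule lies entirely inside the class
            lower |= granule
        if granule & target:    # granule overlaps the class
            upper |= granule
    return lower, upper


# Hypothetical, already discretized measurements (illustrative only).
samples = {
    "u1": {"gain": "low", "offset": "high"},
    "u2": {"gain": "low", "offset": "high"},
    "u3": {"gain": "high", "offset": "high"},
    "u4": {"gain": "high", "offset": "low"},
    "u5": {"gain": "low", "offset": "low"},
}
warranty = {"u1", "u3"}  # ids assumed to belong to the warranty decision class

lower, upper = approximations(samples, ["gain", "offset"], warranty)
accuracy = len(lower) / len(upper)  # Pawlak's accuracy of approximation
print(sorted(lower))  # ['u3']
print(sorted(upper))  # ['u1', 'u2', 'u3']
```

Here u1 and u2 are indiscernible but only u1 is a warranty sample, so both fall outside the lower approximation and inside the upper one; this is the kind of inconsistency that the two approximations are meant to capture.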
The primary region, defined by the lower approximation boundary, consists of objects that definitely belong to the particular decision class, and the boundary region, defined by the upper approximation boundary, consists of objects that could possibly belong to the decision class. On the basis of the lower and upper approximation regions, the ability of the conditional attributes to differentiate between the decision classes can be determined. This approximation accuracy is defined by the ratio of the cardinality of the primary region to the cardinality of the boundary region. The accuracy, therefore, represents the ability of the conditional attributes to determine the decision class for objects in the dataset [26].

On the basis of the determined approximation accuracy, we can reduce the required information so as to retain only those conditional attributes that are absolutely essential for the classification of the objects being studied. This is achieved by discovering the subsets that can provide the same quality of classification as the whole set of conditional attributes. Such subsets are called reducts. Based on the reducts, a set of rules can be developed for the classification of objects into the various decision classes.

In summary, the RS-based classification approach identifies the minimum subset of conditional attributes that can be used to differentiate between the various decision classes in the dataset. Further, RS are defined by the lower and upper approximation boundaries for sets corresponding to different decision classes. This allows the imprecision in the data to be accurately determined and represented.

REFERENCES
[1] D. Ceglarek, W. Huang, S. Zhou, Y. Ding, R. Kumar, and Y. Zhou, “Time-based competition in manufacturing: Stream-of-variation analysis (SOVA) methodology-review,” Int. J. Flex. Manuf. Syst., vol. 16, no. 1, pp. 11–44, 2004.
[2] D. Ceglarek and J. Shi, “Dimensional variation reduction for automotive body assembly,” Manuf. Rev., vol. 8, no. 2, pp. 139–154, 1995.
[3] F.
Engelhardt, “Improving systems by combining axiomatic design, quality control tools and designed experiments,” Res. Eng. Design, vol. 12, no. 4, pp. 204–219, 2000.
[4] D. N. P. Murthy and W. R. Blischke, “Strategic warranty management: A life-cycle approach,” IEEE Trans. Eng. Manage., vol. 47, no. 1, pp. 40–54, Feb. 2000.
[5] J. F. Lawless, “Statistical analysis of product warranty data,” Int. Stat. Rev., vol. 66, no. 1, pp. 41–60, 1998.
[6] K. D. Majeske and G. D. Herrin, “Determining warranty benefits for automobile design changes,” in Proc. Annu. Reliability and Maintainability Symp., 1998, pp. 94–99.
[7] W. R. Blischke and D. N. P. Murthy, Warranty Cost Analysis. New York: Marcel Dekker, 1994.
[8] O. M. Kim and W. Kuo, “Percentile life and reliability as performance measures in optimal system design,” IIE Trans., vol. 35, no. 12, pp. 1133–1142, 2003.
[9] M. U. Thomas and S. S. Rao, “Warranty economic decision models: A summary and some suggested directions for future research,” Oper. Res., vol. 47, no. 6, pp. 807–820, 1999.
[10] H. Pham and X. M. Zhang, “A software cost model with warranty and risk costs,” IEEE Trans. Comput., vol. 48, no. 1, pp. 71–75, Jan. 1999.
[11] M. R. Karim, W. Yamamoto, and K. Suzuki, “Change point detection from marginal count failure data,” J. Jpn. Soc. Quality Contr., vol. 31, pp. 318–338, 2001.
[12] H. Wu and W. Q. Meeker, “Early detection of reliability problems using information from warranty databases,” Technometrics, vol. 44, no. 2, pp. 120–133, 2002.
[13] P. Ge, S. C.-Y. Lu, and S. T. S. Bukkapatnam, “Supporting negotiations in the early stage of large-scale mechanical system design,” Trans. ASME, J. Mechan. Design, vol. 127, no. 6, pp. 1056–1067, 2005.
[14] K. Yang and E. Cekecek, “Design vulnerability analysis and design improvement using warranty data,” Quality Reliab. Eng. Int., vol. 20, no. 2, pp. 121–133, 2004.
[15] A. Khan and D.
Ceglarek, “Sensor optimization for fault diagnosis in multi-fixture assembly systems with distributed sensing,” Trans. ASME, J. Manuf. Sci. Eng., vol. 122, no. 1, pp. 215–226, 2000.
[16] Y. Ding, P. Kim, D. Ceglarek, and J. Jin, “Optimal sensor distribution for variation diagnosis in multi-station assembly processes,” IEEE Trans. Robot. Autom., vol. 19, no. 4, pp. 543–556, Aug. 2003.
[17] J. Lawless, J. Hu, and J. Cao, “Methods for the estimation of failure distributions and rates from automobile warranty data,” Lifetime Data Anal., vol. 1, no. 3, pp. 227–240, 1995.
[18] D. C. Montgomery, Introduction to Statistical Quality Control. New York: Wiley, 1997.
[19] D. Ceglarek, J. Shi, and S. M. Wu, “A knowledge-based diagnosis approach for the launch of the auto-body assembly process,” Trans. ASME, J. Eng. Ind., vol. 116, no. 4, pp. 491–499, 1994.
[20] Y. Ding, J. Jin, D. Ceglarek, and J. Shi, “Process-oriented tolerancing for multi-station assembly systems,” IIE Trans. Design Manuf., vol. 37, no. 6, pp. 493–508, 2005.
[21] Q. Rong, J. Shi, and D. Ceglarek, “Adjusted least squares approach for diagnosis of compliant assemblies in the presence of ill-conditioned problems,” Trans. ASME, J. Manuf. Sci. Eng., vol. 123, no. 3, pp. 453–461, 2001.
[22] Y. Ding, D. Ceglarek, and J. Shi, “Design evaluation of multi-station assembly processes by using state space approach,” Trans. ASME, J. Mechan. Design, vol. 124, no. 3, pp. 408–418, 2002.
[23] B. W. Shiu, D. Apley, D. Ceglarek, and J. Shi, “Tolerance allocation for sheet metal assembly using beam-based model,” Trans. IIE, Design Manuf., vol. 35, no. 4, pp. 329–342, 2003.
[24] Z. Pawlak, “Rough sets,” Int. J. Comput. Inform. Sci., vol. 11, pp. 341–356, 1982.
[25] A. Ohrn, “Discernibility and Rough Sets in Medicine: Tools and Applications,” Ph.D. dissertation, Dept. Comput. Inform. Sci., Norwegian Univ. Sci. Technol., Trondheim, Norway, 1999.
[26] M. Doumpos and C.
Zopounidis, “Rough sets and multivariate statistical classification: A simulation study,” Comput. Econom., vol. 19, no. 3, pp. 287–301, 2002.
[27] J. Han, X. Hu, and N. Cercone, “Supervised learning: A generalized rough set approach,” RSCTC, pp. 322–329, 2001.
[28] H. S. Nguyen and A. Skowron, “Quantization of real-valued attributes,” in Proc. 2nd Int. Joint Conf. Inf. Sci., 1995, pp. 34–37.
[29] S. Vinterbo and A. Ohrn, “Minimal approximate hitting sets and rule templates,” Int. J. Approx. Reasoning, vol. 25, no. 2, pp. 123–143, 2000.
[30] B. J. Nelson, G. C. Runger, and J. Si, “An error rate comparison of classification methods with continuous explanatory variables,” IIE Trans., vol. 35, no. 6, pp. 557–566, 2003.
[31] K. Mannar and D. Ceglarek, “Continuous failure diagnosis for assembly systems using rough set approach,” Ann. CIRP, vol. 53, no. 1, pp. 39–42, 2004.

Kamal Mannar received the B.S. degree in mechanical engineering from the National Institute of Technology Karnataka, Surathkal, India, in 2001, and the M.S. degree in manufacturing systems engineering from the University of Wisconsin (UW)-Madison in 2005, where he is currently pursuing the Ph.D. degree in industrial and systems engineering.
He was a Manufacturing Engineer with Delphi Automotive Systems, Bangalore, India, from 2001 to 2002. Currently, he is a Graduate Research Assistant in the Manufacturing Systems Realization and Synthesis (MARS) Lab at UW-Madison. His research interests include methodologies for the diagnosis and prediction of warranty and field failures and the use of field performance as feedback to design and manufacturing to improve product robustness. The research is being performed in collaboration with General Electric Health Care and Motorola Labs.

Dariusz Ceglarek (M’03) received the Ph.D. degree in mechanical engineering from the University of Michigan, Ann Arbor, in 1994.
He was with the research faculty at the University of Michigan from 1995 to 2000. In 2000, he became Assistant Professor in the Department of Industrial and Systems Engineering at the University of Wisconsin-Madison, where he rose to the rank of Associate Professor and Professor in 2003 and 2005, respectively. His research interests include design and manufacturing with an emphasis on multistage production systems convertibility, scalability, and diagnosability, ramp-up time, and variation reductions; integration of statistical methods and engineering models for root cause identification of quality/variation faults; sensing systems/networks in manufacturing; and reconfigurable/reusable assembly systems.
Dr. Ceglarek was elected as a corresponding member of CIRP and is a member of ASME, SME, NAMRI, IIE, and INFORMS. He was Chair of the Quality, Statistics and Reliability Section of the Institute of Operations Research and Management Sciences (INFORMS). Currently, he is a Program Chair of the ASME Design-for-Manufacturing Life Cycle (DFM-LC) Conferences and is an Associate Editor of the ASME Transactions, Journal of Manufacturing Science and Engineering. He also serves on the program review panel for the State of Louisiana Board of Regents R&D Program. He has received a number of awards for his work including the 2003 CAREER Award from the National Science Foundation, 1998 Dell K. Allen Outstanding Young Manufacturing Engineer Award from the Society of Manufacturing Engineers (SME), and two Best Paper Awards by ASME MED and DED divisions in 2000 and 2001, respectively.

Feng Niu (M’88–SM’99) received the B.S. degree in physics from Zhongshan University, Guangzhou, China, in 1982, the M.S. degree in engineering from the Institute of Electronics, the Chinese Academy of Sciences (CAS), Beijing, China, in 1985, and the M.S. and Ph.D. degrees in electrical engineering from the Polytechnic University, Brooklyn, NY, in 1990 and 1992, respectively.
He began his career with the Institute of Electronics, CAS, Beijing, China, and is now a Distinguished Member of the Technical Staff with Motorola Labs, Plantation, FL. Before joining Motorola, he was a Research Scientist with the Center for Advanced Technologies in Telecommunications (CATT), Brooklyn, NY. His research interests include cognitive radio, location technology, antenna, propagation, distributed sensing, and microelectromechanical systems (MEMS).
Dr. Niu has been a technical reviewer for IEEE journals in the areas of antennas, propagation, and communications. He has served on international program committees and technical committees, and as a session chair and reviewer for international conferences in the areas of systems, communications, antennas, and RF technologies. He has given invited technical presentations and keynote speeches at conferences, universities, and professional societies.

Bassam Abifaraj received the B.Sc. degree in electrical and computer engineering from Florida Atlantic University, Boca Raton, in 1985.
He began his career in 1987 with Motorola, Plantation, FL, and held several positions including software development, test system development, technical operations, new product introduction, manufacturing production management, program management, and worldwide quality management. Currently, he is Quality Director for a Motorola integrated supply chain in Nogales, Mexico.
Mr. Abifaraj holds several patents, publications, and engineering awards on inventions that were developed and implemented, one of which is an early type of belt clip for cellular phones that almost everyone uses. He represented Motorola with the South Florida Manufacturing Association and served on the board of directors of the Florida Sterling Council.