Quality Alert ATA5781 / ATA5833
07-Feb-2017

1. Team Members

2

1. Team members

Name | Title | Department | Location
Grammling, Uwe | Product Engineer | PE-RFA | Heilbronn
Klaus Gratzke | Product Management | RFA | Ulm
Detruz, Glenn | Product Quality Engineer & Team Lead | PQE | MPhil
Dann Emilio Evasco | Product Engineer | PE-RFA | MPhil
Volkhard Flassnöcker | Senior Analog Design Engineer (Memory Expert) | RnD | Heilbronn
Sascha Wagner | Applications Engineer | Application | Heilbronn
Gerhard Willbold | Software Development Engineer | RnD | Ulm
Tobias Frankenhauser | Hardware Development Engineer | RnD | Ulm
Josef Seiler | Director Customer Quality Manager | Quality | Heilbronn
George Argos | Customer Quality Manager | Quality | Colorado Springs
Pascal Lecuyer | Customer Quality Manager | Quality | Nantes
Dietmar Sigmann | Customer Quality Manager | Quality | Heilbronn

3

1. Team members

Delphi
• Single-interface communication preferred.
• Delphi is asked to define the key contact person for bi-weekly calls.
• All other involved Delphi contacts will be informed via email.

Name | Title | Department | Location

4

2. Describe the Problem

5

Top Level Summary

• Increased customer reject rate observed at the end of Dec. 2016:
  - 12 production line rejects during customer EEPROM programming and verification: Customer 1
  - 1 0-km reject with a different failure mode: Customer 1
  - 3 confirmed production line rejects: Customer 2
  - 31 further production line rejects reported: Customer 1. These should be sent to MPhil for verification.
• Impacted product family: ATA5781 / ATA5833
• Root cause: As part of a standard production improvement, the EEPROM Read Timing Calibration was moved from Hot Final Test to Cold Final Test. Shipment started 18-Jul-2016.
• Some lots are impacted: mainly lot DA14S.1, and to a lesser extent lot DA14T.1 and lot D9THF.1.
• EEPROM Read Timing Calibration decentering in combination with process marginality of the above three lots caused the issue.

Updated: 03-Feb 2017

6

Block Diagram

• Microchip FW (firmware) is in ROM
• FW defines which bytes are read and how
• The issue is caused by a "not optimized EEPROM Read Time Calibration" and affects the reading of the EEPROM content

7

Root Cause Analysis 1/2

[Figure: fishbone diagram]

8

Root Cause Analysis 2/2

Root cause of production line fails
• Not a physical defect
• Not a design weakness
• The issue is caused by the production test
  - The production test includes a 'time tuning'
  - The tuning values are stored in EEPROM fuses
  - This 'time tuning' process is called 'calibration'
  - In test programs before July 2016, 'calibration' was done at Final Test Hot
  - In test program V3.006 (July 2016), 'calibration' is done at Final Test Cold
  - The 'calibration' target was kept unchanged, but the value is temperature dependent
  - As a result, the EEPROM Read Timing Calibration is decentered
• During production test the EEPROM is checked via Burst Read mode. Single Byte Read mode is covered only indirectly.

New: 06-Feb 2017

9

Calibration of EEPROM read timing

[Figures only; five slides share this title]

New: 06-Feb 2017

11 12 13 14 15

ATA5781 Shipments & complaints overview

[Table: per fab lot, quantity shipped, production fails, and 0-km & field fails for Customer 1 (Location-1, Location-2, Location-3), with impacted lots marked]

Note: A former production fail was tested; NTF confirmed.
Fab-Lot# | Total QTY shipped
D9CFK.1 | 12,000
D9CSF.1 | 24,000
D9F1N.1 | 24,000
D9FAP.1 | 24,000
D9FAQ.1 | 12,000
D9FJ7.1 | 18,000
D9FJ8.1 | 24,000
D9QHK.1 | 12,000
D9QHM.1 | 12,000
D9QHP.1 | 6,000
D9QN8.1 | 12,000
D9QN9.1 | 24,000
D9QNA.1 | 24,000
D9QPN.1 | 18,000
D9QPP.1 | 12,000
D9QPT.1 | 6,000
D9THF.1 | 18,000
D9THG.1 | 18,000
D9THJ.1 | 18,000
D9WS6.1 | 6,000
D9YAJ.1 | 6,000
D9YAK.1 | 12,000
DA14Q.1 | 36,000
DA14R.1 | 6,000
DA14S.1 | 48,000
DA14T.1 | 24,000
DA3H9.1 | 12,000
DA3HA.1 | 18,000
DA3HF.1 | 42,000
DA3LL.1 | 12,000
DA3LN.1 | 36,000
DA3PW.1 | 36,000
DA3Q0.1 | 30,000
DA5N9.1 | 6,000
DA5NC.1 | 12,000
Total | 660,000

[Further table columns for Customer 2 (Location-2): QTY shipped, Prod-Fails, 0-km & Field-fails]

Customer 1, Location-2 EOL reject rate: ~190 ppm (incl. the 31 reported rejects of Customer 1)

Updated: 03-Feb. 2017

16

Status 0-km fail

• The 0-km fail shows a different failure mode than the production line rejects of Customer 1; it is not linked to the Single Byte Read EEPROM issue
• Retest with the production program: device fails
  - Open connection on PC5 / PB5 pin
  - Short on AVCC pin
  - Fuse settings are not correct; major device settings are wrong
  - Test of the digital part shows some fails in SCAN mode
• Conclusion:
  - Different failure mode than the production fails
  - Electrical overstress detected, see "PFA-Status"

Updated: 6-Feb 2017

17

PFA-Status

Non-Destructive Analysis (performed in HNO)
• Incoming inspection (optical, X-ray & SAM)
  - No package anomalies visible by optical inspection
  - No internal anomalies concerning bond wire geometry or die attach with X-ray microfocus
  - No delamination visible
• Curve trace
  - The characteristic measurement shows a short circuit at pin 12 and an open pin 27
  - Pin #12 (AVCC): shorted vs GND
  - Pin #27 (PB5): open vs GND

Decapsulation and optical inspection
• Optical microscope inspection of the chip surface after decapsulation (PO) of the package showed EOS in the area of pin 27.
• The open pin 27 is probably a result of electrical overstress.
• Most of the damage is probably under the metallization.

New: 6-Feb 2017

18

3. Verification, Risk Assessment and Containment Actions

19

Application Effect of Failure

EEPROM read impacted
• EEPROM cell content is not impacted (cell margin o.k.)
• EEPROM read is needed for each customer application
• Application fail due to bit errors of special EEPROM content
• Internal EEPROM read is possible via two modes:
  - Single Byte Read (SBR): affected, as only indirectly tested by the production test
  - Burst Read (BR): not affected, as directly tested by the production test

Impact on Application
• If the EEPROM is not read correctly, it can cause several failure modes, but the available ECC (Error Correction Code) will correct a one-bit-per-byte fail

Updated: 06-Feb-2017

20

Impact Factors to Fail/Pass

From the current point of view:
• Therefore, for a final risk assessment the following aspects need to be considered

Status

21

Risk Assessment

• 3.3 million pcs ATA5781 delivered since July 2016
• 15 production line rejects; 3 lots are affected: 10 rejects from lot DA14S.1, 4 rejects from lot DA14T.1 and 1 reject from lot D9THF.1
• Detection at customer End Of Line (time 0)
  - Estimation: customer segregated critical ICs
• Important: no timing degradation over lifetime
  - 48h results: no drift
  - 168h results: no drift
  - 504h: target 16. Feb.
• Risk assessment estimation: no/low ppm field rejects expected, as several factors are necessary to bring an IC to fail. See the "Impact Factors to Fail/Pass" slide.

Updated: 02-Feb-2017

22

Risk-Assessment

Check of EEPROM Read Time (pre-charge timing) over lifetime
• Lifetime dependency of calibrated pre-charge timing
  - With 4-insertion flow (initial qualification data): no drift between 0h and 1000h
  - With 3-insertion flow (lot# D8GJS.1, used for release): no drift between 0h and 168h
• Single-IC drift, 0h & 168h read-out
  - Timing slightly longer after 168h. Comparable to the initial qualification data.
  - Single-IC drift between 0h, 48h & 168h is <1 ns (max: 800 ps), equivalent to ~1.6% of target values: no drift.
[Figures: absolute value per IC at 0h, 48h, 168h (cold, room, hot); relative drift between 168h and 0h]

Updated: 06-Feb 2017

23

Test Program Overview

Test Program | Comment
V3.006 | Production program, new 3-insertion flow. EEPROM read timing calibration decentered (final calibration moved from hot to cold temperature). Active between Jul-2016 and Jan-2017.
V3.007 | Production program, 3-insertion flow. EEPROM read timing calibration corrected. Active since 24-Jan-17.
V3.008 | Engineering screening test for risk assessment. The Delphi setting is programmed into the EEPROM and read out via single byte read mode; if all bytes are read correctly, the part passes (similar to the Delphi production check, but more critical than the application because all EEPROM bytes are read via single byte mode).

Updated 6-Feb-17

24

Results screening test

• Special engineering test program V3.008 was created to screen material tested with test program V3.006 (between July 2016 and Jan 2017) for risk assessment
• Test checks: the example Customer 1 EEPROM setting is read in single byte mode; the test result is "fail" if one byte out of ~600 bytes fails
• Results of 1k + 6k + 6k ICs from three randomly picked lots from that time frame are shown on the next slide
• To confirm the effectiveness of the containment action, 1k + 6k pcs are tested and pass the application-relevant test

Updated 6-Feb-17

25

Screening Summary

V3.008 test result: application risk, failed devices
• Lots 1 to 4 (risk assessment): material randomly selected from stock, tested with V3.006
• Lots 5 and 6 (check of containment action): material from stock, tested with V3.007

Temperature | Lot 1 | Lot 2 | Lot 3 | Lot 4 | Lot 5 | Lot 6
30°C | 0 | 2 | 2 | 1 | 0 | 0
-40°C | 0 | 0 | 0 | 9-Feb-17 | 0 | 0
110°C | 1 (0.1%) | 50 (0.8%) | 9 (0.15%) | 9-Feb-17 | 0 | 0
Overall | 1 / 1000 | 52 / 6000 | 11 / 6000 | 9-Feb-17 | 0 / 1000 | 0 / 6000

• 0; 2; 2 ICs found at 30°C and would be
  detected at customer EOL testing
• 1; 50; 9 ICs (out of 13,000) show a single byte EEPROM read fail at 110°C and would not be screened out at customer EOL testing
• Even if one byte is not read correctly via single byte read in the production test, it can be read in operation in a different mode and not create an application fail (see slide 5)
• Lot-to-lot variation confirmed
• High temperature is the worst case
• Important note: the 1k + 6k screened ICs that were tested with TP 3.007 do not show any fail in this test
• Containment action is effective and is the corrective action. As all "old" material has already been used, no customer containment is needed.

Updated 6-Feb-17

26

Review Customer EEPROM settings

Customer settings reviewed:
• Delphi Application 1 "S101"
• Delphi Application 2 "CH072"
• Delphi Application 3 "PSA_BSI_433"
• Delphi Application 4 "PSA_BSI_315"
• Customer 2
• Customer 3
• Customer 3

Critical content in customer settings (e.g. 1x: one byte is critical) / SBR: single byte read by FW / BR: burst read by FW

Critical EEPROM content, table entries in slide order:
"0x1C" | "0x3C" | "0x18" | "0x38"
1x Event handling (SBR), uncritical function | 3x RX Setting (BR) | 2x Debug Output (BR) | 0x
1x Event handling (SBR), uncritical function | 2x RX Setting (BR) | 0x | 0x
0x | 1x RX Setting path A (SBR / BR), critical function | 1x SFID length (BR) | 1x Demod Setting (BR)
0x | 0x | 2x SFID length (BR) | 0x
0x | 0x | 2x SFID length (BR) | 0x

• Most critical EEPROM contents are read out via the uncritical Burst Read (BR)
• Fails in Single Byte Read (SBR) depend on the content of the byte
• The four EEPROM contents shown above are the most critical and were reviewed for the available customer settings

New 03-Feb-17

27

Application failure model 1 / 4

Field risk is calculated by this formula:

Field Risk (per lot) = fnon * fSBR * fimp * ftemp * ffield

New 3-Feb-17

28

Application failure model 2 / 4

For the failure model these factors are considered:
• fnon: the slip factor (not detected) at customer EOL room temp.
  testing:
  - Data extracted from the screening of 3 lots (13k ICs in total)
  - 1; 50; 9 ICs do not fail in the room temperature test but fail at 110°C = 0.1%; 0.8%; 0.15%
  - Room temperature fails are detected by the customer during end-of-line programming and verification
  - This figure will be refined with a further 6k screening by 9th Feb 17
• fSBR: the Single Byte Read (SBR) factor
  - Firmware accesses of the ATA5781 / ATA5833 to the EEPROM must be separated into Single Byte Read (SBR) and Burst Read (BR)
  - Firmware analysis done: 38x SBR / 335x BR used = 38 / (38+335) = 0.102
  - ~10% of the EEPROM content is read via the critical SBR
  - fSBR = 10%
• fimp: the impact factor
  - Not every EEPROM entry is needed for main functions. Some are for debug functions, event signals, ...
  - As shown on slide "Review Customer EEPROM settings", there are customer configurations where all critical content is read out via burst read
  - Based on the review, only ~30% of the bytes can cause an application fail (fimp)
  - Each customer configuration has a different factor (the quantity of critical content differs: 30% for Delphi settings 2, 3 and 4, and 17% for setting 1)

Updated 6-Feb-17

29

Application failure model 3 / 4

• ftemp: the use temperature factor, based on a typical mission profile:

Ambient temperature in °C | -40 | 23 | 40 | 75 | 105
Mission profile in % | 10 | 20 | 53 | 15 | 2
% of range | 83% (-40°C to 40°C) | 17% (75°C and 105°C)

  - 17% of the time the device operates at critical temperature (high temperature is the worst case)
  - Critical temperature factor (ftemp) = 17% (remember: weak parts are already screened at customer end-of-line programming)
• ffield: the factor for how the end customer could be impacted
  - If the EEPROM is not read correctly and this impacts the function, it will not in all cases lead to a field return:
  - Impact on performance only, not on the function (e.g.
    less range, not detected by end user)
  - Impact on function, but operable after repetition
  - Assumption: only 30% of fails will be recognized by the end user and claimed (ffield)
• Proposal: review the potential impact on the application of a failed read of critical EEPROM content (see table "Review Customer EEPROM settings")

New 3-Feb-17

30

Application failure model 4 / 4

Room temp results

Delphi EOL programming:
• DA14T.1: 26 of 12k pcs = 2167 ppm (24 pcs unconfirmed)
• DA14S.1: 12 of 42k pcs = 285 ppm (1 pc unconfirmed)
• D9THF.1: 1 of 18k pcs = 56 ppm

Microchip 30°C screening:
• Lot 1: 0 of 1k pcs = 0 ppm
• Lot 2: 2 of 6k pcs = 333 ppm
• Lot 3: 2 of 6k pcs = 333 ppm
• Lot 4: 1 of 6k pcs = 167 ppm

New 7-Feb-17

31

Application failure model 4 / 4

Field risk is calculated by this formula:

Field Risk (per lot) = fnon * fSBR * fimp * ftemp * ffield

Application | Lot | fnon | fSBR | fimp | ftemp | ffield | ppm rate
Application 1 | Lot 1 | 0.10% | 10% | 17% | 17% | 30% | 1
Application 1 | Lot 2 | 0.80% | 10% | 17% | 17% | 30% | 7
Application 1 | Lot 3 | 0.15% | 10% | 17% | 17% | 30% | 1
Application 2 / 3 / 4 | Lot 1 | 0.10% | 10% | 30% | 17% | 30% | 2
Application 2 / 3 / 4 | Lot 2 | 0.80% | 10% | 30% | 17% | 30% | 12
Application 2 / 3 / 4 | Lot 3 | 0.15% | 10% | 30% | 17% | 30% | 2

Customer Return Lots
Application | Lot | fnon | fSBR | fimp | ftemp | ffield | ppm rate
Application 1 | DA14T.1 | 8.00% | 10% | 17% | 17% | 30% | 69
Application 1 | DA14S.1 | 8.00% | 10% | 17% | 17% | 30% | 69
Application 1 | D9THF.1 | 8.00% | 10% | 17% | 17% | 30% | 69
Application 2 / 3 / 4 | DA14T.1 | 8.00% | 10% | 30% | 17% | 30% | 122
Application 2 / 3 / 4 | DA14S.1 | 8.00% | 10% | 30% | 17% | 30% | 122
Application 2 / 3 / 4 | D9THF.1 | 8.00% | 10% | 30% | 17% | 30% | 122

Assumption: lots with complaints are considered to have an 8% slip factor (a factor of 10 higher than the worst screened lot; pessimistic model)

New 7-Feb-17

32

Executive Summary

• Containment / Corrective Action
  - Adjusted EEPROM Read Time Calibration to cold temperature: done, 24th Jan. 17.
• Shipments
  - All shipments starting 26th Jan. 17 are tested with the adjusted EEPROM Read Time Calibration
• Exchange of customer parts
  - Customer to ship back remaining parts of lots DA14S.1, DA14T.1 and D9THF.1.
  - Stock exchange needs to be coordinated with Microchip Inside Sales

33

Next Steps

Continue the Risk Assessment
• Reliability: lifetime test (HTOL) to verify possible impact on Single Byte Read
  - 48h results: no drift
  - 168h results: no drift
  - 504h: target 16. Feb.
• Failure rate extrapolation
  - 13k ICs from 3 different fab lots tested with the engineering screening test program to verify the reject rate at room, hot and cold temperature
  - Additional 6k (of a 4th fab lot) planned. Target date: 9. Feb.
• Lot / process variation influence
  - Further correlation of PCM (E-Test) data versus wafer probe data on lots DA14S.1, DA14T.1 and D9THF.1 and on the screened lots is ongoing. Target date: 8. Feb.
• 5-Why analysis for lessons learned and systemic improvement
  - Started by Product Engineering

Updated: 06-Feb-2017

34

© 2016 Atmel Corporation. Atmel®, Atmel logo and combinations thereof, Enabling Unlimited Possibilities®, and others are registered trademarks or trademarks of Atmel Corporation or its subsidiaries. ARM®, ARM Connected® logo and others are the registered trademarks or trademarks of ARM Ltd. Other terms and product names may be the trademarks of others.

Disclaimer: The information in this document is provided in connection with Atmel products. No license, express or implied, by estoppel or otherwise, to any intellectual property right is granted by this document or in connection with the sale of Atmel products. EXCEPT AS SET FORTH IN THE ATMEL TERMS AND CONDITIONS OF SALES LOCATED ON THE ATMEL WEBSITE, ATMEL ASSUMES NO LIABILITY WHATSOEVER AND DISCLAIMS ANY EXPRESS, IMPLIED OR STATUTORY WARRANTY RELATING TO ITS PRODUCTS INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
IN NO EVENT SHALL ATMEL BE LIABLE FOR ANY DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE, SPECIAL OR INCIDENTAL DAMAGES (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS AND PROFITS, BUSINESS INTERRUPTION, OR LOSS OF INFORMATION) ARISING OUT OF THE USE OR INABILITY TO USE THIS DOCUMENT, EVEN IF ATMEL HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Atmel makes no representations or warranties with respect to the accuracy or completeness of the contents of this document and reserves the right to make changes to specifications and products descriptions at any time without notice. Atmel does not make any commitment to update the information contained herein. Unless specifically provided otherwise, Atmel products are not suitable for, and shall not be used in, automotive applications. Atmel products are not intended, authorized, or warranted for use as components in applications intended to support or sustain life.
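
Backup: worked example of the field-risk formula

The field-risk formula from the "Application failure model" slides multiplies five factors per lot. The following is a minimal sketch of that arithmetic, assuming every factor is expressed as a fraction and the product is scaled to ppm. The names RiskFactors, field_risk_ppm and risk_table are illustrative only and are not part of the original material; the factor values are the ones listed on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskFactors:
    """Factors of the per-lot field-risk model, each as a fraction (0..1)."""
    f_non: float    # slip factor: fails not detected at customer EOL room-temp test
    f_sbr: float    # share of EEPROM content read via Single Byte Read
    f_imp: float    # share of bytes that can cause an application fail
    f_temp: float   # share of operating time at critical (high) temperature
    f_field: float  # share of fails recognized and claimed by the end user


def field_risk_ppm(f: RiskFactors) -> float:
    """Field risk per lot in ppm: product of all five factors, scaled by 1e6."""
    return f.f_non * f.f_sbr * f.f_imp * f.f_temp * f.f_field * 1e6


# Factor values as stated on the slides (rounded slide values, not re-derived).
F_SBR = 0.10
F_TEMP = 0.17
F_FIELD = 0.30
F_IMP = {"Application 1": 0.17, "Application 2/3/4": 0.30}
F_NON = {
    "Lot 1": 0.0010,
    "Lot 2": 0.0080,
    "Lot 3": 0.0015,
    "Customer return lots": 0.08,  # pessimistic 8% assumption from the slides
}


def risk_table() -> dict:
    """Return {(application, lot): field risk rounded to whole ppm}."""
    table = {}
    for app, f_imp in F_IMP.items():
        for lot, f_non in F_NON.items():
            factors = RiskFactors(f_non, F_SBR, f_imp, F_TEMP, F_FIELD)
            table[(app, lot)] = round(field_risk_ppm(factors))
    return table


for (app, lot), ppm in risk_table().items():
    print(f"{app:18s} {lot:21s} {ppm:4d} ppm")
```

With the slide values this reproduces the ppm column of the field-risk table (for example 7 ppm for Lot 2 with Application 1, and 122 ppm for the customer return lots with Application 2/3/4). Because the model is a plain product, each factor scales the result linearly, so a refined slip factor from the additional 6k screening can be substituted directly.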