PLATFORM READINESS TEST PLAN

Pinak Raval
B.E., C.U.Shah College of Engineering & Technology, India, 2006

PROJECT

Submitted in partial satisfaction of
the requirements for the degree of

MASTER OF SCIENCE
in
ELECTRICAL AND ELECTRONIC ENGINEERING
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO

FALL 2009

PLATFORM READINESS TEST PLAN

A Project
by
Pinak Raval

Approved by:

__________________________________, Committee Chair
Dr. Thomas Matthews

__________________________________
Ashok Singh

__________________
Date

Student: Pinak Raval

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and that credit is to be awarded for the project.

__________________________________, Graduate Coordinator
Dr. Preetham Kumar

__________________
Date

Department of Electrical and Electronic Engineering

Abstract
of
PLATFORM READINESS TEST PLAN
by
Pinak Raval

This report presents the steps taken to develop a “Platform Readiness Test Plan”. The platform readiness test plan helps original equipment manufacturers (OEMs) test and validate the robustness of their motherboards prior to production. The test procedure listed in this report assures customers that their platforms are robust in key areas of design. The report is written mainly for the customer’s product development team, so that they follow a correct test procedure and validate their platform according to the standards. The test serves as a gauge of the readiness of the customer platform for launch in the marketplace. Key customers such as Hewlett Packard, Dell, and ASUS use a customer reference board (CRB) design provided by chip manufacturing companies such as Intel and AMD to keep their data consistent and in agreement with the data provided by the vendor.
This project presents a test plan whose purpose is to uncover potential issues, such as circuit marginality and signal integrity problems, that slipped through other pre-silicon and post-silicon validation plans. In the future, when a customer identifies a failure through cursory analysis using the platform readiness test plan, the vendor will provide recommendations on ways to correct it.

__________________________________, Committee Chair
Dr. Thomas Matthews

__________________
Date

TABLE OF CONTENTS

List of Figures
Chapter
1. INTRODUCTION
   Intel Architecture – Platform Overview
   1.1 Front Side Bus (FSB)
   1.2 Quick Path Interconnect (QPI)
   1.3 PCI-Express
   1.4 USB
   1.5 SATA
2. TESTING METHODOLOGY
   2.1 SATA Test Setup
   2.2 SATA Test Procedure
   2.3 Test Result
3. TOOLS
   3.1 UVMC8 Automargin Card
   3.2 Mars Software
4. CONCLUSION
References

LIST OF FIGURES

Figure 1.1. 2008 Platform Diagram
Figure 1.2. Maximum Acceptable Overshoot/Undershoot Waveform
Figure 1.3. Low-to-High FSB Receiver Ringback Tolerance
Figure 1.4. High-to-Low FSB Receiver Ringback Tolerance
Figure 2.1. Connecting the SMA Cables to the SATA test fixture
Figure 2.2. Figure setup to export CSV example
Figure 2.3. Signal quality eye rendering program main menu
Figure 2.4. Import the CSV data file
Figure 2.5. Verify the data file button
Figure 2.6. Data file error window example
Figure 2.7. SATA generation-1 template selection
Figure 2.8. SATA gen2 template selection
Figure 2.9. TEST button
Figure 2.10. Test results screen
Figure 2.11. SATA generation-1 non-transition passing results example
Figure 2.12. SATA generation-1 transition passing results example
Figure 2.13. SATA generation-2 non-transition passing results example
Figure 2.14. SATA generation-2 transition passing results example
Figure 3.1a. 2008 Platform Diagram
Figure 3.1b. 2008 Platform Diagram
Figure 3.2. Proper USB Cable
Figure 3.3. Channel’s potentiometer configuration for the UVMC8 automargin card

Chapter 1
INTRODUCTION

Leading chip manufacturing companies such as Intel and AMD follow a detailed chip validation process before launching a product to market. This process includes functional validation, data integrity validation, circuit marginality validation, and other tests to ensure the chip is fully functional in all possible customer environments.
The author of this report was part of the team that developed and wrote the test plan for testing various high-traffic buses. The author’s responsibilities included carrying out the procedure as specified in the test plan, analyzing the results, and publishing an internal proprietary report as well as this report.

The test plan described in this report covers platform validation. Platform validation takes place after chip validation, which includes functional validation and other tests. For platform validation, the chip is mounted on a customer reference board (CRB) and tested with real-time applications. One of the key test plans during this phase is the Platform Readiness Test Plan. In this phase, corner cases such as circuit marginality, setup time violations, and other issues that unintentionally slipped through the earlier validation flow are discovered.

The testing is performed mainly to stress different buses. A bus is an interconnection between two functional blocks that transfers data in a unidirectional or bidirectional mode. Buses are most stressed when a real-time application is running, because of the high data transfer rate. To test the buses, a test fixture has been developed that creates conditions very similar to a real-time application but allows more control over the transfer rate and, hence, the stress. Once the buses have been stressed enough and validated, the chip manufacturing company can notify the original equipment manufacturers (OEMs) to move their platform to the high-volume manufacturing phase. The purpose of this testing is to validate the buses to a level at which a company can decide whether the system is healthy and ready to go into high-volume production.

Intel Architecture – Platform Overview

Figure 1.1 shows a diagram of a typical 2008 desktop platform.
The platform supports an Intel® LGA775 processor and features dual memory channels (a maximum of four dual in-line memory modules (DIMMs)) and a peripheral component interconnect (PCI) Express graphics port. Different high-speed buses use different protocols to communicate with different devices. Some of the key buses that remain stressed during different types of transactions are as follows:

FSB (Front Side Bus): This bus connects the north bridge of the Intel architecture (IA) platform to the CPU.
QPI (Quick Path Interconnect): This bus connects core to core and core to chipset.
PCI-Express: This bus connects the graphics card and the local area network (LAN) card to the north bridge.
SATA (Serial ATA): This bus connects an external or internal hard disk to the south bridge.
USB (Universal Serial Bus): This bus connects different USB devices to the south bridge.
Power Delivery: This block performs voltage regulation to deliver a variable voltage to the central processing unit.

Figure 1.1. 2008 Platform Diagram
(Block diagram: processor; 800/1066/1333 MHz FSB; MCH with system memory channels A and B (DDR2/3), analog VGA display, and PCI Express* x16 graphics; DMI interface and controller link; Intel® ICH10 with 10 USB 2.0 ports, 6 Serial ATA ports, SPI flash BIOS, Intel® High Definition Audio, Intel® 82566 Gigabit Platform LAN Connect, PCI Express* bus, PCI bus, LPC interface, TPM, SMBus 2.0/I2C, GPIO, power management, and fan speed control.)

The test platform performs tests for each of the following buses. A high-level overview of each bus is provided to explain why its electrical verification is needed.

1.1 Front Side Bus (FSB)

The front side bus connects the processor and the memory controller hub. This bus always remains stressed because it is the only path between the chipset and the processor.
Even a small task performed over PCI Express, which provides the only graphics port, normally requires a large number of memory transactions. The data involved in such a transaction is processed by the CPU, and the front side bus (FSB) is the only path that carries this data to the CPU for further processing. The FSB is therefore one of the key areas that must be tested.

Multi-chip package processors have a different FSB architecture, so they need different test methods. To test the FSB of a multi-chip package processor, core-to-core traffic is recommended. One of the easiest ways to generate core-to-core FSB traffic is to boot to the operating system (Windows XP) with Service Pack-2 or any prior edition. It is therefore recommended that Windows be booted while margining the Gunning transceiver logic reference voltages (GTLREFs) of the FSB. This voltage is swept during testing to measure the maximum and minimum margins. The margining stress (MARS) software can also be used to generate stress patterns that expose faults at different corners. The USB-controlled voltage margining card (UVMC) is used to vary GTLREF on the high side as well as the low side until a failure is found. In this way, the voltage reference (VREF) values at which the bus fails on the high and low sides are obtained. These VREF values provide a good indication of whether the bus will function correctly without any performance issues.

1.1(a) Signaling Specifications

Most processor FSB signals use Gunning transceiver logic (GTL+) signaling technology. This technology provides improved noise margins and reduced “ringing” (the reader is referred to section 2 for an explanation of “ringing”) through low voltage swings and controlled edge rates. Intel platforms implement a termination voltage level for GTL+ signals, designated VTT.
Because platforms implement separate power planes for each processor (and for chipsets), separate VCC and VTT supplies are necessary. This configuration improves noise tolerance as the processor frequency increases. Speed enhancements to the data and address buses have made signal integrity considerations and platform design methods even more critical than with previous processor families.

The GTL+ inputs require a reference voltage (GTLREF), which the receivers use to determine whether a signal is a logical 0 or a logical 1. GTLREF must be generated on the motherboard. Termination resistors (RTT) for GTL+ signals are provided on the processor silicon and are terminated to VTT. Intel chipsets also provide on-die termination, eliminating the need to terminate the bus on the motherboard for most GTL+ signals.

1.1(b) Overshoot and Undershoot

Overshoot and undershoot are among the key parameters that should be known when validating the FSB. Excessive signal swings on the high side are called overshoot, and excessive signal swings on the low side are called undershoot, as shown in Figure 1.2. Overshoot and undershoot are detrimental to silicon gate oxide integrity and can cause device failure if absolute voltage limits are exceeded. Additionally, overshoot and undershoot can degrade timing through the buildup of inter-symbol interference (ISI) effects. For example, ringback is the result of an oscillation following an overshoot (or undershoot) with an amplitude high enough to cross the threshold, as shown in Figures 1.3 and 1.4. Hence, it is important that the designer work toward a solution that provides acceptable signal quality across all systematic variations encountered in volume manufacturing.

Figure 1.2. Maximum Acceptable Overshoot/Undershoot Waveform

Figure 1.3. Low-to-High FSB Receiver Ringback Tolerance

Figure 1.4. High-to-Low FSB Receiver Ringback Tolerance

1.2 Quick Path Interconnect (QPI)

Quick path interconnect (QPI) is the technology that will replace the FSB in Intel’s next-generation architecture. QPI will be responsible for data transfer between cores and between cores and the input/output hub (IOH). Using the Intel® interconnect built-in self test (BIST) test and debug feature, both the send and receive portions of the QPI interface can be checked for functional validation. The BIST feature is built into both the IOH and the CPU, which allows the entire send and receive data path to be tested.

A variety of tests can be performed with BIST. BIST testing is done by sending predefined or customized data patterns across the interface under test, using the actual drivers and receivers for that interface, and then comparing the patterns received with the patterns sent. Different combinations of test patterns can be sent to simulate combinations of worst-case data patterns. An error is detected when the pattern received does not match the pattern sent.

BIST is one of many tools available for testing the functionality of specific serial buses on the platform. The test capability is built into the silicon itself and requires no test equipment that physically contacts the link in question. This removes the need to evaluate the impact of a testing device on the link or of a probing point that is less than optimal.

1.2(a) QPI Overshoot & Undershoot Specification

Overshoot (or undershoot) is the absolute value of the maximum voltage above or below VSS. The overshoot/undershoot specifications limit signal transitions with fast edge rates so that they do not rise too far above the positive supply (VCC) or fall too far below the negative supply (VSS or ground).
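As a rough illustration of this limit, overshoot and undershoot can be estimated from sampled waveform data. The sketch below is in Python and is not part of the original test plan. The function name, the sample values, and the choice of VCC and VSS as the upper and lower references are assumptions made for illustration only, and no specification limits are implied.

```python
from typing import Sequence, Tuple


def overshoot_undershoot(samples: Sequence[float],
                         v_high: float,
                         v_low: float = 0.0) -> Tuple[float, float]:
    """Return (overshoot, undershoot) magnitudes in volts.

    v_high is the upper reference (assumed here to be VCC) and v_low is
    the lower reference (assumed here to be VSS, i.e. ground). A value of
    0.0 means the waveform never crossed that reference.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    # Overshoot: how far the highest sample rises above the upper reference.
    overshoot = max(0.0, max(samples) - v_high)
    # Undershoot: how far the lowest sample falls below the lower reference.
    undershoot = max(0.0, v_low - min(samples))
    return overshoot, undershoot


if __name__ == "__main__":
    # Hypothetical waveform samples in volts, for illustration only.
    waveform = [0.0, 1.3, 1.1, -0.2, 0.05]
    over, under = overshoot_undershoot(waveform, v_high=1.1)
    print(f"overshoot = {over:.3f} V, undershoot = {under:.3f} V")
```

The two magnitudes are computed separately because overshoot and undershoot are distinct conditions that must be judged independently against their own limits.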
The processor can be damaged by single and/or repeated overshoot or undershoot events on any input, output, or I/O buffer if the resulting voltage is large enough (i.e., if the over/undershoot is great enough). Baseboard designs that meet signal integrity and timing requirements and do not exceed the maximum overshoot or undershoot limits listed will ensure reliable input/output (IO) performance for the lifetime of the processor.

1.2(b) Overshoot/Undershoot Magnitude

The maximum difference between a signal and its voltage reference level is called its magnitude. The maximum positive difference between a signal and VTT, as shown in Figure 1.2, is called overshoot, and the maximum negative difference between a signal and ground, as shown in Figure 1.2, is called undershoot. It is important to note that overshoot and undershoot conditions are separate and their impact must be determined independently. The pulse magnitude and duration must both be used to determine whether an overshoot/undershoot pulse is within specifications.

1.2(c) Overshoot/Undershoot Pulse Duration

The overshoot or undershoot pulse duration is the total amount of time for which an overshoot/undershoot event exceeds the overshoot/undershoot reference voltage. The total time could encompass several oscillations above the reference voltage. Multiple overshoot/undershoot pulses within a single overshoot/undershoot event may need to be measured to determine the total overshoot or undershoot pulse duration.

1.3 PCI-Express

PCI-Express is the protocol that connects different devices to the north bridge as well as to the south bridge. The north bridge is the functional block to which the memory controller, all display ports, and the memory modules are connected. It communicates with the CPU via the FSB and with the south bridge via the PCI Express bus. The south bridge is the functional block to which interconnect peripherals such as USB, SATA, and the parallel and serial ports are connected.
At a high level, the function of the PCI Express protocol is to connect graphics devices to the graphics and memory controller hub (GMCH). The protocol operates under two different specifications: GENERATION 1 (GEN 1) and GENERATION 2 (GEN 2). GEN 1 operates at 2.5 GT/s (gigatransfers per second) and GEN 2 operates at 5.0 GT/s. Different types of add-in cards are available on the market, such as x1, x4, x8, and x16, and a platform using the PCI Express protocol should support them. PCI Express is a high-speed bus, and data transfer is point-to-point using differential signaling. The bus always remains stressed because it is the only path between the chipset and the graphics device. It is therefore one of the key areas that must be tested.

The PCI Express Gen 2 signal quality tools were developed to help verify product compliance with the PCI Express Gen 1 and Gen 2 specification(s). A series of tests will be used to evaluate PCI Express systems, motherboards, and PCI Express add-on card products with the PCI Express signal quality tools.

PCI Express signal quality is gauged by measuring transmissions at the connector on the motherboard. The data is captured in an eye diagram for analysis and compared against an eye diagram template that reflects the PCI Express Gen 2 CEM specification. The CEM specification limits are set to ensure that passing motherboards will work with worst-case, spec-compliant add-in cards. Only transmitted signals are tested. The transmitted signal is measured using a PCI Express Compliance card. This measurement will be described in detail in section 2.

1.4 USB

USB is a universal serial interface that connects USB devices to the platform. It is also a point-to-point bus and uses a differential signaling scheme for data transfer. This bus also remains stressed due to continuous polling, so it should be tested prior to the high-volume production phase.
The vendor recommends that manufacturers perform three key tests to ensure proper operation of the USB interface on the platform. The key tests are the following:

1.4(a) Voltage Drop Test: The voltage drop test uses a static load and ensures that the output voltage does not violate the specification under full-load and no-load conditions.

1.4(b) Voltage Droop Test: The voltage droop test uses a switching load and measures transient current handling capability to ensure that the board design does not violate the USB voltage specification.

1.4(c) Signal Quality Test: The signal quality test examines the signal quality at low speed, full speed, and high speed to ensure that the rise/fall times, over/undershoot voltage, jitter, amplitude, and crossover voltage do not violate the specification.

If the platform passes all three tests, the USB ports can be considered robust enough to go into the high-volume production phase.

1.5 SATA

Serial ATA (SATA) is the bus that connects a hard drive to the south bridge. It also uses a differential signaling scheme for transmitting and receiving data. It is the only bus that shares the contents of secondary memory with the random access memory and the different bus masters. Hence, this bus also remains stressed and requires electrical testing before the platform moves to high-volume production.

The ICH (I/O controller hub) Serial ATA motherboard signal quality test procedures were developed to enable motherboard testing of the SATA interface and to gauge the signal quality of Intel chipset-based motherboard designs. The tests are designed to be run on the ICH SATA ports and focus on the motherboard, not on controller validation. Intel strongly recommends running the motherboard signal quality test (MSQT) procedure since the SATA interface is a high-speed interface.
Because the electrical characteristics differ, the MSQT procedure should be performed at both GEN 1 (1.5 Gb/s) and GEN 2 (3.0 Gb/s) speeds on each motherboard SATA port.

SATA motherboard signal quality is gauged by measuring transmissions as close as possible to the SATA connector on the motherboard. The data is captured in an eye diagram for analysis and compared against an eye diagram template that reflects the expected ICH SATA interface characteristics for the given speed. With this approach, worst-case device and cable effects are built into the eye diagram template, which removes the characteristics of the actual cable and device, since these may not be worst case per the SATA Specification. Total jitter can also be used to analyze the characteristics of the SATA interface. The tests performed are transmit-only and are measured using a SATA Test Fixture. This measurement will be described in detail in chapter 2.

Chapter 2
TESTING METHODOLOGY

Before any system enters the high-volume production phase, it has to be tested thoroughly. This is achieved through the following three phases in sequence:

1. Setup
2. Procedure
3. Evaluation

Setup

A setup is the environment selected by the engineer to conduct a specific test. The setup documentation contains all the information about that environment. For example, an engineer testing SATA provides all the information about the motherboard, processor, BIOS version, OS, oscilloscope, connectors, etc. This setup information is important because it specifies the exact environment under which the results were obtained, so that the test produces consistent results. All of the environmental variables can be understood by referring to the setup information.

Procedure

The procedure is the set of sequential steps followed by the engineer to perform a specific test.
This information is helpful because the same procedural steps must be followed to replicate the test or to obtain the same results. The procedure is therefore significant information that is provided along with the results.

Evaluation

The result provides information about the behavior of the system being tested. It also acts as a quick check of the functionality of the system and can show the engineer the errors encountered. When test results are evaluated against standards of performance, they give a measure of the health of the system.

The following detailed example of a high-speed bus test illustrates these three phases.

2.1 SATA Test Setup

The following test setup and procedure were used by the author to test the SATA interface.

Board Tested: Intel® Chipset Family
BIOS (Manufacturer and Rev #): XXXXX
Operating System: Microsoft MS-DOS
CPU: Intel® Processor, 2.92GHz, 1333 MHz
Memory (Manufacturer, capacity, and device type):
   Slot 0: DDR3 1024MB
   Slot 1: -
   Slot 2: -
   Slot 3: -
Video Card: Internal Graphics
Oscilloscope Probe: Digital storage oscilloscope (DSO)
Test Fixture: SATA receptacle test cable, 80mm length
Digital Oscilloscope setup: If SATA setup files are available, load them. If setup files are not available, then create a math function for each terminal of the connector (i.e., an individual math file for the P terminal and for the N terminal).

2.2 SATA Test Procedure

Connecting the SATA Test Fixture to the DSO

There are several pairs of sub-miniature version A (SMA) connectors on the SATA test fixture, shown in Figure 2.1. Connect the transmit differential pairs only; the receive differential pairs are not used for this test. Connect the DSO SMA cables to the SATA test fixture as described below:

1. Connect the lower channel number SMA cable on the DSO to SATA transmit positive (TxP) on the SATA test fixture.
2. Connect the higher channel number SMA cable on the DSO to SATA transmit negative (TxN) on the SATA test fixture.
3. Connect the SATA test fixture to the SATA connector.

Figure 2.1. Connecting the SMA Cables to the SATA test fixture

SATA controller and test mode set-up

The SATA controller can be operated in several different modes:

AHCI: advanced host controller interface.
RAID: redundant array of independent disks.
IDE: integrated drive electronics.

AHCI and RAID modes need software support in order to be tested. The test procedure here sets up the environment in IDE mode. The following section provides a detailed set-up procedure for testing a SATA port in IDE mode.

1) Check that the SATA controllers are enabled

The number of SATA controllers needed depends on the number of SATA ports. Normally, there are two SATA controllers in the south bridge. Whether a SATA controller is enabled can be verified using the function disable register bits. Depending on the programming of the function register, 2 bits are allocated to enable or disable the SATA controllers. Hence, the value at that address indicates the status of the SATA controller. If the controller is disabled, then it needs to be enabled.

2) Disable the SATA ports

The SATA port address is given in the PCI configuration space. The SATA ports are disabled according to the specification of the PCI configuration space. For example:

Set PCI device 23, function 4, offset 94h, bits 3:0 to 0h = xxxx 0000b.

First, disable all the SATA ports using this method.

3) Enable SATA port for configuration

Now enable the SATA port being tested. To enable the port, use an I/O space editor together with the SATA index/data pair base address (SIDPBA), which is provided in the PCI configuration space of each controller.
For example: SIDPBA + Offset xxh = 01h

4) Designate the SATA port for configuration

Use an I/O space editor to access the SATA index/data pair base address, located at some offset (which depends on the I/O space programming), and configure the designated port. For example:

SIDPBA + Offset XXh

5) Set the SATA port generation

SATA is available in two generations. Generation 1 operates at 1.5 Gb/s and generation 2 operates at 3.0 Gb/s. Each port must be tested for both generations. Using an I/O space editor, access the SIDPBA located at some offset in the PCI configuration space. Then select the generation to be tested. For example:

Gen 1 (1.5 Gb/s): SIDPBA + Offset XXh = YYh
Gen 2 (3.0 Gb/s): SIDPBA + Offset XXh = YZh

6) Place south bridge in SATA test mode

Using the PCI configuration space editor, enable the designated port to send test packets. For example:

Set PCI device 23, function X, offset XXh = xxxY ZZxxxb
Set PCI device 23, function X, offset YYh = xxxx xZxxxb

7) Enable ports for the test

Use the PCI configuration space editor to select different SATA ports. For example:

Port 0: Set device 23, function X, offset xxh, bits [x:y] = xxxx YYYYb
Port 1: Set device 23, function X, offset xxh, bits [x:y] = xxxx NNNNb
Port 2: Set device 23, function X, offset xxh, bits [x:y] = xxxx ZZZZb
Port 3: Set device 23, function X, offset xxh, bits [x:y] = xxxx FFFFb
Port 4: Set device 23, function X, offset xxh, bits [x:y] = xxxx PPPPb
Port 5: Set device 23, function Y, offset xxh, bits [m:n] = xxxx QQQQb
Port 6: Set device 23, function Y, offset xxh, bits [m:n] = xxxx GGGGb

8) Prepare DSO for data capture

Keep the DSO in armed mode. Take advantage of the full range of the scope by adjusting the vertical settings. The waveform should remain within the vertical display boundary.

1. To set up the vertical scale, use the vertical set-up menu. Choose a mV per division setting that uses the full range of the scope without clipping the waveform.
2. Use the same settings for both channel 1 and channel 3.

9) Data capture

Once the steps above have been followed, data can be acquired on the scope. To analyze the results, the continuous acquisition should be converted into a single acquisition. A single acquisition of the transmitted differential signal is captured by pressing the Single button on the scope. Save the captured sample as a comma separated value (CSV) file, as shown in Figure 2.2.

Figure 2.2. Figure setup to export CSV example

10) Data processing and result analysis

1. Launch SigTest (SATA post-analysis software).
2. Open the CSV file to process it, as shown in Figure 2.3.

Figure 2.3. Signal quality eye rendering program main menu

11) Import CSV data file

Import the CSV data file from the proper location and verify it. If the desired selection is not made, a data file error window will appear, as shown in Figures 2.4, 2.5, and 2.6.

Figure 2.4. Import the CSV data file

Figure 2.5. Verify the data file button

Figure 2.6. Data file error window example

12) SATA template selection

Select the generation 1 or generation 2 template, as shown in Figures 2.7 and 2.8. Generation 1 operates at 1.5 Gb/s and generation 2 operates at 3.0 Gb/s. The receiver or transmitter option is selected according to the test being performed.

Figure 2.7. SATA generation-1 template selection

Figure 2.8. SATA gen2 template selection

13) Test button selection

Select the Test option, as shown in Figure 2.9.

Figure 2.9. TEST button

2.3 Test Result

The test is performed on the selected data file, and results are generated as shown in Figure 2.10. The tool provides complete information about the overall result, including an analysis of data rate, median-to-peak jitter, peak-to-peak jitter, and eye violation points.

Figure 2.10. Test results screen

Test Results
Required Tests:
Overall Result: Pass!
Data Rate: 2.998271 Gb/s
Data Rate: Pass!
Median to Peak Jitter: 35.776702 ps
Median to Peak Jitter: Pass!
Peak to Peak Jitter: 49.413793 ps
Peak to Peak Jitter: Pass!
Eye Violations: 0 points
Eye Test: Pass!

Test Result Analysis

Examine the shape of the eye diagram waveform to check for measurement methodology-induced errors or impedance discontinuities. Eye diagram distortions (e.g., oval shapes) indicate poor measurement technique or a potential board issue. The eye diagram is distorted if probes are used to probe the SATA signals instead of SMA cables and a test fixture.

It is possible to pass the eye diagram test and still have potential board issues (e.g., impedance discontinuities) that could be caught by other testing (outside the scope of this project).

The SATA transmit connector (TX connector) signal quality eye diagrams are a good indicator of proper signaling for the chipset and motherboard. Passing results using the chipset-specific eye diagram templates indicate that the motherboard design is performing within the expected chipset SATA transmit characteristics.

Some non-transition and transition passing eye diagrams are shown in Figures 2.11 to 2.14. The eye diagrams are very healthy, with proper rise time, fall time, and jitter for both generation 1 and generation 2. Non-passing results are not shown due to intellectual property conflicts.

Figure 2.11. SATA generation-1 non-transition passing results example

Figure 2.12. SATA generation-1 transition passing results example

Figure 2.13. SATA generation-2 non-transition passing results example

Figure 2.14. SATA generation-2 transition passing results example

Chapter 3
TOOLS

Stress Testing Tools

Stress testing uses software and hardware tools that control a system’s electrical parameters.
Together, the tools generate specific signal patterns while controlling the platform's reference voltage (VREF) target to detect data corruption and eventually push a system to failure. I used two tools, margining stress (MARS) and the universal voltage margining card (UVMC), to perform the aforementioned tests.

The margining stress (MARS) software is an electrical stress tool used to test the robustness of a target's interface. It is designed to stress the system memory bus and the FSB to ensure that data integrity is maintained. The test program generates patterns that provide worst-case noise on the host bus by modifying the patterns over frequency and victim bit mode. Modifying the patterns over varying frequencies may demand more power than the platform can deliver, resulting in data corruption. Alternatively, victim bit mode leads to corruption due to crosstalk. Another possible cause of failure is insufficient current being provided to the target undergoing the stress test. The MARS software is used to test for errors due to all of the above mechanisms.

The universal voltage margining card (UVMC) is a hardware tool used to override a platform's reference voltage. The automargin UVMC is adjusted to the nominal voltage of the target platform, and is then used to increment or decrement the VREF voltage from this nominal level. Together, the MARS software provides the data integrity stress while the UVMC8 provides the voltage marginality stress.

3.1 UVMC8 Automargin Card
The UVMC8 is a multi-channel automargin hardware tool capable of generating eight independent VREF_UVMC outputs to voltage margin subsystems on a target platform. A channel's VREF_UVMC is produced by a resistance ratio consisting of 3 digital potentiometers (POTs), referred to as top, middle, and bottom (see Figure 3.3). Each of these POTs is capable of 256 levels of analog output. The UVMC tool is designed to override the reference voltages on the target.
The process requires the user to substitute the VREF target on the motherboard with an external VREF_UVMC supplied by the UVMC automargining card. Solder work is required. Details on how to install the UVMC are provided in the Setup section.

The following are a few of the UVMC8 features:
1. 8 independent VREF margining circuits
2. Ability to change the target VREF without the need to remove resistors on the target platform
3. USB interface eliminates compatibility issues
4. Supports VTT voltages from 0.8V to 3.3V
5. Output capable of driving 25 loads
6. +/- 50% minimum VREF adjustment range
7. 256-position resolution, +/- 1% of VREF step sizes
8. VREF output levels can be calibrated manually without a host present.

3.1.2 Connectors

Target Connector (J6)
The target connector consists of a set of input/output pins connecting the UVMC board to the test platform. The voltage input (VIN), VREF, and target connect ground (TC_GND) pins on the UVMC board correspond to VTT, VREF, and GND on the target, respectively. Refer to Figures 3.1a and 3.1b for signal definitions on the target connector. Note that the figures represent the top view of connector J6 on the automargin card.

Figure 3.1a. 2008 Platform Diagram
Figure 3.1b. 2008 Platform Diagram

USB Connector (P2)
The USB interface is the primary means of communication between the host and the automargining card. Both the stress software and the utility use USB for sending commands to the UVMC and retrieving its status. See Figure 3.2 for the proper USB cable.

Figure 3.2. Proper USB Cable

Calibrating for VREF
The VREF_UVMC is generated by a voltage divider circuit consisting of three digital potentiometers, referred to as top, middle, and bottom (see Figure 3.3). The wiper of the middle potentiometer is reserved primarily for VREF margining. Each of these POTs consists of 256 steps. Use only the top and bottom POTs to calibrate a channel's default voltage.
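As a rough illustration of the divider arithmetic, the Python sketch below assumes a simple series topology (top POT from VIN, middle POT with its wiper as the output, bottom POT to ground), ignores wiper resistance, and borrows the 40 ohms/step conversion and the nominal middle step of 127 that this section mentions. The function names, the assumed end-to-end resistance of 256 steps, and the example voltages are hypothetical and are not taken from the UVMC utility.

```python
OHMS_PER_STEP = 40.0  # ohms/step conversion quoted in this section
STEPS = 256           # positions per digital potentiometer


def vref_uvmc(vin, top_step, mid_step, bottom_step):
    """Output voltage of the assumed series divider (wiper resistance ignored)."""
    r_top = top_step * OHMS_PER_STEP
    r_mid_total = STEPS * OHMS_PER_STEP      # assumed end-to-end resistance
    r_mid_below = mid_step * OHMS_PER_STEP   # middle POT portion below the wiper
    r_bottom = bottom_step * OHMS_PER_STEP
    return vin * (r_bottom + r_mid_below) / (r_top + r_mid_total + r_bottom)


def bottom_step_for(vin, vref, top_step, mid_step):
    """Solve the divider equation for the bottom POT step nearest the target."""
    if not 0.0 < vref < vin:
        raise ValueError("vref must lie strictly between 0 and vin")
    ratio = vref / vin
    r_top = top_step * OHMS_PER_STEP
    r_mid_total = STEPS * OHMS_PER_STEP
    r_mid_below = mid_step * OHMS_PER_STEP
    # From ratio = (Rb + Rmid_below) / (Rtop + Rmid_total + Rb), solved for Rb.
    r_bottom = (ratio * (r_top + r_mid_total) - r_mid_below) / (1.0 - ratio)
    step = round(r_bottom / OHMS_PER_STEP)
    if not 0 <= step < STEPS:
        raise ValueError("target not reachable with this top POT setting")
    return step


if __name__ == "__main__":
    # Hypothetical example: VTT = 1.2 V, target VREF = 0.8 V, top POT at step 20.
    bottom = bottom_step_for(1.2, 0.8, top_step=20, mid_step=127)
    print(bottom, round(vref_uvmc(1.2, 20, 127, bottom), 4))
```

On real hardware the computed step is only a starting point, since wiper resistance and part tolerance are not modeled here; the output would still need to be measured and the bottom POT nudged up or down as the procedure describes.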
The middle POT should be left at the nominal step value of 127 to allow a nearly equal margining range on both sides of VREF (about 128 steps up and 128 steps down).

To achieve the desired VREF_UVMC, the following steps should be employed:
1. Choose the step setting for the top POT.
2. Set the middle POT to step=XX (intentionally left out for intellectual property issues).
3. Calculate the step setting for the bottom POT.

The preferred method for setting the POTs is with the UVMC utility. As an alternative, the POTs can also be adjusted with the push buttons on the margining card. The utility is used to set output voltages and program power-on default voltages on the automargin card. The term "power-on default voltage" refers to the default potentiometer settings that generate an output voltage matching the test platform's VREF target. The primary method of communication between the host and the margining card is via USB.

To derive the appropriate step settings for the desired nominal VREF, first determine the resistance necessary to achieve the desired VREF, then convert the resistance to a step setting using the 40 ohms/step conversion.

When the bottom POT is set to the derived value, the channel may not generate the exact VREF_UVMC. If this discrepancy is observed, simply increment or decrement the bottom POT accordingly (with either the push buttons or the utility) until the desired VREF_UVMC is attained. Once the power-on default values have been established, program the steps into the POT's memory register (MEM). Executing the WRITEEEMEM command using the UVMC utility stores the default value in the MEM register. This procedure ensures that, upon boot up, a channel's nominal VREF_UVMC output initializes to the desired VREF voltage level on the target.

Figure 3.3. Channel's potentiometer configuration for the UVMC8 automargin card

3.2 MARS Software
The MARgining stress/scope (MARS) tool is a family of target-based application programs that significantly decrease the amount of time needed to perform VREF margining on a given platform. The software provides valuable failure data to isolate the most marginal signaling cases. These programs have two primary goals:
1) Identify the worst-case pattern for a particular bus in a short amount of time.
2) Provide an oscilloscope mode in which a pattern can be studied in isolation on a particular bus.

The MARS tool was developed to be used at the beginning of a project, to give maximum time to address issues through design changes and to allow validation to a desired VREF margin target. Use of the MARS program can reduce the time needed to determine a platform's margin from days, weeks, or months to minutes, which improves the likelihood of meeting a launch target.

The MARS program transfers data containing analog stress data patterns back and forth across specifically targeted buses, comprehensively sweeping worst-case I/O patterns in the frequency domain. The main source of crosstalk is the aggressor lines. Switching all data lines except one as aggressor lines creates the frequency domain stimulus. The remaining line is considered the victim line and is switched in various relationships to the aggressor lines to look for worst-case I/O events.

There are five modes in which the patterns run. The victim line is checked for validity in the presence of the crosstalk from all of the aggressor lines. The modes create scenarios in which a victim line switches in various relationships to as many aggressors as possible. The five operating modes are as follows:
Even mode: Victim and aggressors switch together.
Odd mode: Victim and aggressors switch in opposite directions.
Max Frequency: Victim switches at the maximum I/O rate regardless of the aggressors.
Victim High: Victim stays high.
Victim Low: Victim stays low.

In addition, the MARS programs have built-in support for automating VREF margin testing. They interface with Intel's VREF margining hardware tool and are designed to sweep up and down through discrete VREF steps until an error is detected. The program controls programmable potentiometers that replace the platform's VREF generation circuits and are used to push the system to failure.

The basic flow of the program is to margin the selected agents with VREF starting at the nominal setting. All patterns are run on all data bits. If there are errors, they are recorded and the program exits. If there are no errors, the VREF is stepped up or down again and the patterns are repeated. Because the exact pattern that generated the failure is recorded, the scope mode of the tool can then be used to examine the failing case in isolation to understand what the problem is. A third mode of the tool loops all the patterns indefinitely without margining VREF. The loop mode can be used for a prolonged period of stress and for collecting worst-case signal integrity eye diagrams.

A no-host system is required to drive the margining card, and the current version of the MARS tool can be run from a DOS bootable floppy diskette. The MARS program's execution is highly configurable via a standard .ini text format configuration file. The configuration parameter values may also be entered on the command line when invoking the MARS program.

Chapter 4
CONCLUSION

The platform readiness test plan is an indicator that provides proof of robustness for a given platform, so that original equipment manufacturers (OEMs) may move their platform into the high-volume manufacturing phase. This test plan will uncover potential issues that slipped through other pre-silicon and post-silicon validation plans.
In the future, when a customer identifies a failure through cursory analysis using the platform readiness test plan, the vendor will provide recommendations on ways to correct it. The platform readiness test plan has been described in this report. It has been used at Intel and has proven its usefulness.

REFERENCES
1) http://edc.intel.com/Platforms/Tool-Loaner-Program/USB-Controlled-VRefMargining-Card/
2) http://edc.intel.com/Platforms/Xeon-5500/#hardware
3) http://edc.intel.com/Training/Library.aspx?ttag=ttipt
4) www.intel.com