What’s A Codeword Or Two Among Friends?
Ron Hranac

Codeword Errors © 2011 Cisco Systems, Inc. All rights reserved. Cisco Public

The World of Codewords

Questions often crop up about codewords, codeword errors, and acceptable percentages of DOCSIS upstream codeword errors. What the heck is this codeword stuff anyway?
A good place to start is to define what a codeword is and how it applies to the world of data transmission over cable networks. We’ll get to that shortly.
All of this falls under the umbrella of something known as forward error correction (FEC), a combination of techniques and algorithms used to identify and fix data transmission errors.

Ensuring Data Transmission Integrity

In an ideal world, all bits would be received exactly as transmitted.
Bit errors can and do occur, but there are ways to detect, and sometimes correct, those errors.
The simplest form of error detection is parity checking.
Appending a parity bit to a row of bits allows for simple error checking, but it doesn’t tell where in that row of bits the error occurred.
Applying parity in two dimensions (that is, in rows and columns) not only tells us that an error occurred, but also where it occurred! This is known as block coding.
Examples of coding in DOCSIS include Reed Solomon (RS) and Trellis coding.

Parity

One method of identifying errors is parity detection.
For example, a parity bit can be appended to a group of seven bits.
Assume that even parity is used. The appended bit will be a 1 if there is an odd number of 1s in the original seven bits, and a 0 if there is an even number of 1s.
Parity allows the detection of an odd number of errors, but doesn’t tell us which bit (or bits) is in error.

    7-bit word: 0110100    Parity bit: 1
    Transmitted word (including parity bit): 01101001
    Total number of 1s is now even, hence the name “even parity.”

    Error occurs: 01101001 becomes 01001001

    Received 7-bit word: 0100100    Parity bit: 1
    Receiver detects error because parity bit is wrong for number of 1s now in 7-bit word.
    Receiver cannot determine which bit is in error, only that an error occurred.

Block Coding

As mentioned previously, applying parity in two dimensions (that is, in rows and columns) not only tells us that an error occurred, but also where it occurred! This is known as block coding.

Original data stream:
    0110100111000100110011100110

Groups of 7-bit words shown for clarity:
    0110100 1110001 0011001 1100110

Data arranged into a tabular form, and parity bits appended to rows and columns:
    0110100  1
    1110001  0
    0011001  1
    1100110  0
    0111010  0    (bottom row and right-hand column are parity bits)

Transmitted data stream with parity bits added:
    0110100111100010001100111100110001110100

Block Coding (cont’d)

Transmitted data stream with parity bits and a bit error. The fourth bit of the third word is now a 0 but was originally a 1:
    0110100111100010001000111100110001110100

Data arranged into a tabular form in the receiver, and parity bits in rows and columns are checked:
    0110100  1
    1110001  0
    0010001  1    (parity error in this row)
    1100110  0
    0111010  0    (parity error in the fourth column)

The row and the column that fail their parity checks intersect at the errored bit.

Received data stream with parity bits removed and errored bit corrected:
    0110100111000100110011100110

Adapted from example in Modern Cable Television Technology, 2nd Ed.

Ensuring Data Transmission Integrity

ITU-T Recommendation J.83, Annex B, defines downstream data transmission in cable networks. We’re interested in the FEC part of this.
(Figure source: ITU-T Recommendation J.83)
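Returning to the two-dimensional parity example from the Block Coding slides, the encode and correct steps can be sketched in Python. This is only an illustration of the row-and-column parity idea using the bit patterns shown in the slides; the function names are hypothetical, and the sketch assumes at most one errored bit per block.

```python
def parity(bits):
    """Even-parity bit for a string of '0'/'1' characters."""
    return str(bits.count("1") % 2)


def encode(words):
    """Append a parity bit to each row, then add a row of column parity bits."""
    rows = [w + parity(w) for w in words]
    width = len(rows[0])
    column_parity = "".join(
        parity("".join(row[i] for row in rows)) for i in range(width)
    )
    return rows + [column_parity]


def correct(block):
    """Flip the bit where a failing row and a failing column intersect.

    Assumes at most one errored bit; returns the data words without parity.
    """
    bad_rows = [r for r, row in enumerate(block) if row.count("1") % 2]
    bad_cols = [
        c for c in range(len(block[0]))
        if sum(row[c] == "1" for row in block) % 2
    ]
    grid = [list(row) for row in block]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        r, c = bad_rows[0], bad_cols[0]
        grid[r][c] = "1" if grid[r][c] == "0" else "0"
    # Drop the parity column and the parity row.
    return ["".join(row[:-1]) for row in grid[:-1]]


words = ["0110100", "1110001", "0011001", "1100110"]
transmitted = "".join(encode(words))
print(transmitted)

# Received stream from the slides, with one bit in error.
received = "0110100111100010001000111100110001110100"
rx_block = [received[i:i + 8] for i in range(0, len(received), 8)]
print("".join(correct(rx_block)))
```

With the slide's data, the encoder reproduces the 40-bit transmitted stream, and the receiver-side check recovers the original 28-bit data stream from the errored one.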
Ensuring Data Transmission Integrity (cont’d)

In J.83, FEC has four processing layers comprising Reed Solomon coding, interleaving, randomization, and Trellis coding.
(Figure source: ITU-T Recommendation J.83)

DOCSIS Downstream Codewords

As data enters a RS encoder, it is grouped into chunks called blocks or codewords, or more specifically, RS blocks or RS codewords (I’ll use the latter terminology throughout the remainder of this discussion).
Each downstream RS codeword is made up of 128 RS symbols, and each RS symbol is 7 bits.
Note that RS “symbols” are not the same kind of symbols that make up each symbol point in a QAM constellation.

    0 1 1 0 0 1 0  (7 bits = 1 RS symbol)  ≠  QAM constellation symbol point

DOCSIS Downstream Codewords (cont’d)

122 of each RS codeword’s 128 symbols are data symbols, and the remaining six are parity symbols used for error correction.
ITU-T J.83, Annex B states that the data is “…encoded using a (128,122) code over GF(128)…” which shows that each RS codeword consists of 128 RS symbols (the first number in the first set of parentheses) and that the number of data symbols per RS codeword is 122 (the second number in the first set of parentheses), leaving six symbols per RS codeword for error correction.
DOCSIS downstream RS FEC is configured for what is known as “t = 3,” which means that the FEC can fix up to any three errored RS symbols in a RS codeword.

DOCSIS Downstream Codewords (cont’d)

In DOCSIS downstream Reed Solomon FEC, 7 bits = 1 RS symbol, and 128 RS symbols = 1 RS codeword.

    0 1 1 0 0 1 0  (7 bits = 1 RS symbol)
    RS symbol #1 | RS symbol #2 | RS symbol #3 | RS symbol #4 | ... | RS symbol #127 | RS symbol #128  (128 RS symbols = 1 RS codeword)

In each RS codeword: 122 RS symbols = data symbols, 6 RS symbols = parity symbols.
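The numbers on the downstream codeword slides fit together with a little arithmetic, sketched below in Python. The helper name `rs_parameters` is hypothetical; the only relationship added beyond the slides is the standard Reed Solomon property that a code with n - k parity symbols can correct t = (n - k) / 2 errored symbols.

```python
def rs_parameters(n_symbols, k_symbols, bits_per_symbol):
    """Derive basic figures for an RS(n, k) code with the given symbol size."""
    parity_symbols = n_symbols - k_symbols
    return {
        "parity_symbols": parity_symbols,
        "t": parity_symbols // 2,                      # correctable symbols per codeword
        "codeword_bits": n_symbols * bits_per_symbol,
        "data_bits": k_symbols * bits_per_symbol,
        "fec_overhead": parity_symbols / n_symbols,    # fraction of codeword used for parity
    }


# DOCSIS downstream: (128,122) code, 7-bit RS symbols.
downstream = rs_parameters(128, 122, 7)
print(downstream)
```

For the (128,122) code this gives six parity symbols and t = 3, matching the slides, with 896 bits per codeword of which 854 carry data.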
Codeword Errors

What happens when there is, say, a burst of noise that causes a bit error or errors in one RS symbol?
It doesn’t matter to the RS decoder if one bit in that RS symbol is errored or all seven bits are errored: the entire symbol is considered broken.
(Figure: one good RS symbol and three errored RS symbols.)

Codeword Errors (cont’d)

With a RS FEC configuration of “t = 3” the FEC decoder can fix up to any 3 errored symbols in a RS codeword (128 RS symbols = 1 RS codeword). This is a correctable codeword error.
When there are more than 3 errored symbols in a codeword, the entire codeword is errored. This is an uncorrectable codeword error.

DOCSIS Downstream Codewords (cont’d)

There can be anywhere from three to 21 bit errors in a RS codeword that has three errored RS symbols.
For a t = 3 configuration, as long as there are no more than three errored RS symbols in a given codeword, the FEC decoder in the cable modem can fix all of them.
That would result in what’s known as a correctable FEC error or correctable codeword error.

DOCSIS Downstream Codewords (cont’d)

As soon as the number of errored RS symbols in a given RS codeword is four or more, the entire RS codeword is toast.
FEC can’t fix three RS symbols and leave the remaining ones broken; if more than three RS symbols are errored, the entire RS codeword is errored.
That errored codeword is now an uncorrectable codeword, and we have an uncorrectable FEC error or uncorrectable codeword error!

DOCSIS Upstream Codewords

DOCSIS downstream FEC is fairly straightforward, at least from the perspective of a fixed RS codeword size (128 RS symbols) and t = 3 configuration.
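That fixed downstream configuration makes the correctable versus uncorrectable rule easy to sketch in Python. This is not an RS decoder: it simply compares a transmitted codeword with a received one (something a real receiver cannot do) to count errored symbols and apply the t = 3 rule from the slides. The function names and the example bit patterns are hypothetical.

```python
def errored_symbols(sent, received):
    """Count RS symbols that differ, regardless of how many bits differ in each."""
    return sum(1 for s, r in zip(sent, received) if s != r)


def classify(sent, received, t=3):
    """Label a codeword by comparing its errored-symbol count against t."""
    count = errored_symbols(sent, received)
    if count == 0:
        return "unerrored"
    return "correctable" if count <= t else "uncorrectable"


# One downstream codeword: 128 symbols of 7 bits each (arbitrary contents).
sent = [0b0110010] * 128

received = list(sent)
received[0] ^= 0b0000001   # one bit errored in this symbol
received[1] ^= 0b1111111   # all seven bits errored in this symbol
received[2] ^= 0b0010100   # two bits errored in this symbol
print(classify(sent, received))   # three errored symbols

received[3] ^= 0b1000000   # a fourth errored symbol
print(classify(sent, received))
```

Ten errored bits spread over three symbols still count as only three errored symbols, so the first codeword is correctable; a single additional bit error in a fourth symbol makes the whole codeword uncorrectable.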
When DOCSIS 1.0 was introduced, upstream FEC supported codeword sizes ranging from 18 bytes (16 data or “k” bytes plus two parity bytes) up to 255 bytes (k bytes + parity bytes), and t = 0 to t = 10.
DOCSIS 2.0 brought improved FEC, supporting t = 0 up to t = 16.

DOCSIS Upstream Codewords (cont’d)

The same basic principles apply in the upstream as in the downstream: RS FEC can fix up to “t” errored symbols (bytes) per RS codeword, but if the number of errored symbols per codeword exceeds “t”, the entire codeword is considered uncorrectable.

What’s Acceptable?

Which brings me to a common question: What’s an acceptable percentage or number of upstream codeword errors?
That’s definitely an “it depends” question, but let’s start with an ideal goal: No correctable or uncorrectable codeword errors at all in either the downstream or upstream!
One could argue that in a network performing so well that there are no codeword errors of any kind in the QAM signals, it would be possible to take advantage of eliminating FEC overhead altogether and picking up a little more throughput per channel.
Of course, this goal is unrealistic. There is no perfect cable network that is completely free of data transmission errors. Still, the fewer codeword errors, the better.

Upstream Packet Loss

Let’s look at another metric for a moment.
I have for several years suggested that upstream packet loss (that is, packet loss in just the outside plant’s upstream transmission path) be limited to no more than about 1 percent when the traffic is plain old high-speed data.
If voice is thrown into the mix, upstream packet loss should not exceed about 0.1 to 0.5 percent, with the lower end of that range the better place to be.
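The suggested upstream packet loss targets can be written down as a simple check, sketched here in Python. The thresholds come from the guidance above ("about" 1 percent for data, about 0.1 to 0.5 percent with voice); the function names and the wording of the returned labels are hypothetical, and the hard cutoffs are a simplification of what are really approximate targets.

```python
DATA_MAX_LOSS_PCT = 1.0        # high-speed data only
VOICE_MAX_LOSS_PCT = 0.5       # upper end of the suggested voice range
VOICE_PREFERRED_LOSS_PCT = 0.1 # lower end, "the better place to be"


def loss_percent(lost_packets, total_packets):
    """Packet loss as a percentage of packets sent."""
    return 100.0 * lost_packets / total_packets


def assess_upstream_loss(loss_pct, carries_voice):
    """Compare a measured upstream packet loss percentage with the suggested targets."""
    if carries_voice:
        if loss_pct <= VOICE_PREFERRED_LOSS_PCT:
            return "within preferred voice target"
        if loss_pct <= VOICE_MAX_LOSS_PCT:
            return "within voice range"
        return "exceeds voice target"
    if loss_pct <= DATA_MAX_LOSS_PCT:
        return "within data target"
    return "exceeds data target"


print(assess_upstream_loss(loss_percent(8, 1000), carries_voice=False))
print(assess_upstream_loss(loss_percent(8, 1000), carries_voice=True))
```

The same 0.8 percent loss is acceptable for data-only traffic but exceeds the suggested range once voice is in the mix.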
Codeword Errors vs. Packet Loss

How does upstream packet loss relate to codeword errors? That’s another “it depends.”
First, if uncorrectable codeword errors exist, that means that packet loss is happening.
However, uncorrectable codeword errors do not necessarily track lost packets on a one-for-one basis. For instance, one can configure upstream operation such that packet length equals the number of information bytes in a codeword. One also could have a configuration in which the packet length is longer than the number of information bytes in one codeword, but less than in two codewords.

Codeword Errors vs. Packet Loss (cont’d)

It might be worthwhile to work with your network operations center folks to come up with a packet loss estimate versus uncorrectable codeword errors in relation to the specific configuration used in your system.
From there, target maximum packet loss goals similar to my previously suggested numbers.
Once that’s done, start looking at trends over time.

Packet Loss and Trends

Let’s say you find upstream packet loss to be less than 0.1 percent fairly consistently, but then you notice that it starts to trend upwards to 0.1 percent, 0.2 percent, 0.25 percent, and so on over some period of time.
That upward movement in packet loss suggests that something is degrading upstream performance and it’s getting worse.
Here the trend itself is arguably more important than the actual packet loss value, assuming the original value was reasonable to begin with.

Other Troubleshooting Ideas

From a practical perspective, tracking uncorrectable codeword errors should be considered one of many tools used to troubleshoot problems.
Others include modulation error ratio (MER), a CMTS’s Flap List metrics, and perhaps a third-party spectrum monitoring tool.
Here are a couple of examples of how evaluating more than one metric can better help the troubleshooting process.
Let’s say the CMTS’s reported upstream MER (also called “upstream SNR”) is good, but uncorrectable codeword errors are out of whack. That may indicate impulse noise or laser clipping.
Low MER and excessive uncorrectable codeword errors could indicate low carrier-to-noise ratio, or maybe the presence of some significant linear distortions (micro-reflections, amplitude ripple, group delay).

Tracking Trends

One can certainly track raw correctable and uncorrectable codeword error numbers, and many cable operators do, but another option is to track codeword error ratio (CER).
The following example is excerpted from one of my Communications Technology magazine columns written back in 2003:

According to Motorola’s Marc Belland, using upstream uncorrectable codewords is one of the best ways to complement the CMTS’s [MER] estimate. Says Belland, “I know of several operators who use the following three management information base (MIB) values to calculate, using a script, what can be called codeword error ratio (CER) on their CMTS upstream ports. They are:

    docsIfSigQUnerroreds
    docsIfSigQCorrecteds
    docsIfSigQUncorrectables

Tracking Trends (cont’d)

“A total of all three divided into the uncorrectables number will give you a CER at that point in time. Subsequent polls at user specified intervals, for example, every five minutes, can give you a trend as to whether the upstream is getting better or worse.
Obviously if it’s getting worse you’ll need to pro-actively get someone to attend to the issue, depending on the rate of upstream degradation.”

Belland provided the following example, which could be ported to an Excel spreadsheet:

    unerroreds:       1,562,456
    correctables:       803,867
    uncorrectables:     209,134
    total:            2,575,457
    loss %:              8.120%
    CER = 8.12E-02

Additional Resources

You’ll find some other useful background information and ideas in an older white paper authored by Cisco’s John Downey: “Upstream FEC Errors and SNR as Ways to Ensure Data Quality and Throughput,” available at http://www.cisco.com/en/US/tech/tk86/tk319/technologies_white_paper09186a0080231a71.shtml
An updated version of the paper is available but not online. Contact John directly at jdowney@cisco.com to get a copy of the newer version.

Q and A
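As a follow-up to the Tracking Trends slides, Belland's CER calculation can be sketched in Python. The function names are hypothetical, and the sketch takes the three counter values as plain integers rather than polling a CMTS. The `interval_cer` helper is an added assumption: it computes CER from the difference between two polls, which is one way to build the trend described above, and it ignores counter wrap.

```python
def codeword_error_ratio(unerroreds, correcteds, uncorrectables):
    """Uncorrectable codewords divided by the total of all three counters."""
    total = unerroreds + correcteds + uncorrectables
    return uncorrectables / total if total else 0.0


def interval_cer(previous, current):
    """CER for one polling interval.

    Each argument is an (unerroreds, correcteds, uncorrectables) snapshot of
    cumulative counters; counter wrap is not handled in this sketch.
    """
    deltas = [c - p for p, c in zip(previous, current)]
    return codeword_error_ratio(*deltas)


# Belland's example values.
cer = codeword_error_ratio(1_562_456, 803_867, 209_134)
print(f"loss %: {cer * 100:.3f}%")
print(f"CER = {cer:.2E}")
```

Run against Belland's numbers, this reproduces the 8.120% loss and CER of 8.12E-02 shown in the example.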