1698 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 52, NO. 10, OCTOBER 2004

Achievable Information Rates and Coding for MIMO Systems Over ISI Channels and Frequency-Selective Fading Channels

Zheng Zhang, Student Member, IEEE, Tolga M. Duman, Senior Member, IEEE, and Erozan M. Kurtas, Member, IEEE

Abstract—We propose a simulation-based method to compute the achievable information rates for general multiple-input multiple-output (MIMO) intersymbol interference (ISI) channels with inputs chosen from a finite alphabet. This method is applicable to both deterministic and stochastic channels. As an example of the stochastic MIMO ISI channels, we consider the multiantenna systems over frequency-selective fading channels, and quantify the improvement in the achievable information rates provided by the additional frequency diversity (for both ergodic and nonergodic cases). In addition, we consider the multiaccess multiantenna system and present some results on the achievable information-rate region. As for the deterministic MIMO ISI channels, we use the binary-input multitrack magnetic recording system as an example, which employs multiple write and read heads for data storage. Our results show that the multitrack recording channels have significant advantages over the single-track channels, in terms of the achievable information rates when the intertrack interference is considered. We further consider practical coding schemes over both stochastic and deterministic MIMO ISI channels, and compare their performance with the information-theoretical limits. Specifically, we demonstrate that the performance of the turbo coding/decoding scheme is only about 1.0 dB away from the information-theoretical limits at a bit-error rate of $10^{-5}$ for large interleaver lengths.
Index Terms—Channel capacity, frequency-selective fading, information rates, intersymbol interference (ISI), iterative decoding, multiantenna systems, multiple-input multiple-output (MIMO) systems, multitrack recording, turbo coding.

Paper approved by R. A. Valenzuela, the Editor for Transmission Systems of the IEEE Communications Society. Manuscript received August 15, 2003; revised November 3, 2003. This work was supported in part by a grant from Seagate Technology, and in part by the National Science Foundation under CAREER Award CCR-9984237. This paper was presented in part in the IEEE International Symposium on Information Theory (ISIT), Lausanne, Switzerland, July 2002, and in part at ISIT, Yokohama, Japan, June–July 2003. Z. Zhang and T. M. Duman are with the Electrical Engineering Department, Arizona State University, Tempe, AZ 85287-5706 USA (e-mail: zheng.zhang@asu.edu; duman@asu.edu). E. M. Kurtas is with Seagate Technology, Pittsburgh, PA 15222-4215 USA (e-mail: Erozan.M.Kurtas@seagate.com). Digital Object Identifier 10.1109/TCOMM.2004.836449

I. INTRODUCTION

Recently, the multiantenna systems have received a lot of attention due to their capability in taking advantage of the spatial diversity. In [1] and [2], the authors have independently proved that the use of the multiple-transmit/multiple-receive antennas significantly improves the capacity for a Rayleigh flat-fading channel, compared with the single-transmit/single-receive antenna systems. More specifically, when the subchannels experience independent Rayleigh fading, and the same number of antennas are used at both the transmitter and the receiver, the capacity increases linearly with the number of the transmit or receive antennas. For high-data-rate wireless communications, since the symbol duration is small, compared with the multipath spread of the channel, the channel experiences intersymbol interference (ISI), resulting in a frequency-selective fading channel. 
Although the frequency selectivity may complicate the signaling algorithms, it provides additional (multipath) diversity. In this paper, we consider the multiantenna systems over a frequency-selective fading channel, and refer to these systems as stochastic multiple-input/multiple-output (MIMO) ISI systems. MIMO systems are also considered in many other areas. For example, in magnetic recording channels, multitrack recording is proposed, where data are written to a group of adjacent tracks simultaneously and read back by multiple heads in parallel, in order to reduce the effect of the intertrack interference (ITI) [3]. For narrowtrack systems of the future, the multitrack approach is an efficient solution to achieving high recording densities in the presence of severe ITI. Note that the magnetic recording channel is a typical ISI channel, therefore, the multitrack recording systems can be modeled as deterministic MIMO ISI channels. With this motivation, we also consider deterministic MIMO ISI channels throughout this paper. The capacity of the MIMO Gaussian channel with both deterministic and stochastic channel-transfer functions is derived in [1] and [2], and related work for MIMO frequency-selective fading channels is reported in [4] and [5]. For the single-input single-output (SISO) ISI channel with additive white Gaussian noise (AWGN), the capacity is derived in [6]. It is shown that the capacity-achieving signals are Gaussian random variables for these cases. On the other hand, in practice, signals are selected from a finite alphabet. For example, signals used in the magnetic recording channels are binary, and in the wireless communications, specific constellations, such as the phase-shift keying (PSK) and the quadrature amplitude modulation (QAM), are employed. 
Therefore, in general, one cannot achieve the “unconstrained” capacity, especially in the high signal-to-noise ratio (SNR) region, and achievable information rates under suitable input constraints should also be considered. When the inputs are selected from specific signal constellations (instead of the Gaussian inputs), there are usually no closed-form solutions to the achievable information rates, even for the simple ISI channels with AWGN. Instead, upper and lower bounds on the information rates are provided in [7] and [8].

0090-6778/04$20.00 © 2004 IEEE
ZHANG et al.: ACHIEVABLE INFORMATION RATES AND CODING FOR MIMO SYSTEMS

More recently, a simulation-based technique has been proposed to compute the information rates of SISO ISI channels with inputs selected from a finite alphabet in [9] (see also [10] and [11]). The main idea is to use a simulation of the channel excited by inputs with a specific distribution and employ the Bahl–Cocke–Jelinek–Raviv (BCJR) algorithm [12] to estimate the joint probability of the output sequence. Then this joint probability is used to estimate the differential entropy of the output sequence and the mutual information between the input and the output, thus the achievable information rate under the specific input constraint. In the first part of the paper, we extend the technique of [9] to the deterministic MIMO ISI channels and the multiantenna systems over frequency-selective fading channels. We take both the ISI and interchannel interference (ICI) into account and compute the mutual information between multiple inputs and multiple outputs for a given input–output simulation. In [13] and [14], some preliminary results on the simulation-based algorithm for computing the achievable information rates of MIMO ISI channels were presented. Here, we discuss the method in detail and extend it further. We consider both the ergodic and nonergodic multiantenna systems over frequency-selective fading channels. 
Our results quantify the improvement provided by the frequency diversity and show the difference between the achievable information rates using specific constellations and the unconstrained capacity. We further note that there are some important factors that affect the achievable information rates of the multiantenna systems, including the existence of the line-of-sight (LOS) signals and the presence of the spatial correlation in the channel-transfer function. With the proposed method, we can easily compute the achievable information rates by taking these considerations into account as well. Moreover, in addition to the case with independent identically distributed (i.i.d.) inputs, we also consider the channels with Markov inputs and perform the maximization of the information rates over such inputs. The techniques developed are quite general, and they are applicable to other important scenarios. As an example, we consider the multiaccess multiantenna system with inputs from specific constellations. We apply the simulation-based algorithms to find the corresponding information-rate regions and compare the results with the capacity region [15]. We will see that the information-rate region for the two-user case is shown as a pentagon, which asymptotically becomes a rectangle (or square) when the SNR is increased. In the second part of the paper, we adopt a joint turbo coding and iterative decoding scheme to the MIMO ISI channels, where a soft-input soft-output channel detector (or equalizer) and an outer decoder, corresponding to the outer code used, work cooperatively in an iterative manner. We design the maximum a posteriori (MAP) detector, as the channel equalizer, for both the deterministic and stochastic channels, and use a turbo code or a single convolutional code as the outer code. 
The results show that, to achieve a bit-error rate (BER) of $10^{-5}$, which can be considered as reliable transmission, the required SNR for the proposed turbo-coding scheme with a large block size is only about 1 dB away from the information-theoretical limits for a variety of MIMO ISI channels.

The paper is organized as follows. In the next section, we present the general MIMO ISI channel model. As examples for the stochastic and deterministic channels, we describe the multiantenna system over a frequency-selective fading channel and the multitrack magnetic recording system. In Section III, we give the details of the simulation-based methods used to compute the achievable information rates for the general MIMO ISI channels when the inputs are chosen from a finite alphabet. We also discuss some special cases in detail, including the multiantenna system over frequency-flat fading channels, the multiaccess multiantenna systems, and the single-track recording systems with the ITI. In Section IV, we study several examples of the information rates, compare them with the unconstrained capacity (if available), and discuss the results. In Section V, we describe the MAP detector and the joint turbo coding/decoding scheme for MIMO ISI channels, and present several examples. Finally, we conclude the paper in Section VI.

II. CHANNEL MODEL

We consider a general discrete-time MIMO ISI channel with $t$ inputs and $r$ outputs (denoted by a $(t, r)$ system), and memory $L$. The $j$th received signal at the time instance $k$ is given by

$$y_j(k) = \sum_{i=1}^{t} \sum_{l=0}^{L} h_{j,i}^{(l)}(k)\, x_i(k-l) + n_j(k) \qquad (1)$$

where $x_i(k)$ is the $i$th input, $h_{j,i}^{(l)}(k)$ is the channel gain of the $l$th tap for the subchannel from the $i$th input to the $j$th output, and $n_j(k)$ is the AWGN in the $j$th receiver output. When the channel is time invariant, $h_{j,i}^{(l)}(k) = h_{j,i}^{(l)}$. Also, we assume that the noise terms are temporally and spatially independent. In vector notation

$$\mathbf{y}(k) = \sum_{l=0}^{L} \mathbf{H}_l(k)\, \mathbf{x}(k-l) + \mathbf{n}(k) \qquad (2)$$

where the output vector $\mathbf{y}(k) = [y_1(k), \ldots, y_r(k)]^T$, the input vector $\mathbf{x}(k) = [x_1(k), \ldots, x_t(k)]^T$, the noise vector $\mathbf{n}(k) = [n_1(k), \ldots, n_r(k)]^T$ (where $T$ denotes the transpose), and the channel-transfer function $\mathbf{H}_l(k)$ is the $r \times t$ matrix with entries $h_{j,i}^{(l)}(k)$.

We define the average SNR at each receiver as (3) where $E[\cdot]$ denotes expectation, $E_s$ is the energy per symbol at each transmitter, and $\sigma^2$ is the variance of the AWGN if the noise is real, and it is the variance per dimension if it is complex. We also note that the channel is assumed to be stationary, therefore, the expectation of $|h_{j,i}^{(l)}(k)|^2$ is independent of $k$, and if the channel is deterministic, the expectation operation can be omitted altogether.

A. Multiantenna Systems Over Frequency-Selective Fading Channels

For the frequency-selective fading case, the channel in (2) is nothing but a tapped-delay-line model where the number of the taps or resolvable paths is $L + 1$. The channel coefficients $h_{j,i}^{(l)}(k)$ are complex Gaussian random variables with equal variance. Unless otherwise stated, we assume that the channel is Rayleigh fading, i.e., the channel coefficients have zero mean, and all the elements in $\mathbf{H}_l(k)$ are independent, except when we consider the effects of the spatial correlation on the achievable information rates. In this case, the noise term in (2) is also zero-mean complex Gaussian distributed.

For the multiantenna wireless systems, we consider both ergodic channels (e.g., independent fading from one symbol to the next, or block fading), and nonergodic channels [16] (e.g., quasi-static fading, where the fading coefficients are fixed during the transmission of an entire block). We note that although both the i.i.d. fading and block fading can be considered as ergodic frequency-selective fading channels, they do not refer to the exact same channel model, unlike the flat-fading scenario, and the capacities and information rates may be different. We assume that the channel state information (CSI) is available at the receiver, but not at the transmitter. Therefore, no waterfilling is performed at the transmitter, that is, the signal energy is evenly allocated among all the transmitters.

B. Multitrack Recording Systems

The multitrack recording channel is an example of deterministic MIMO ISI channels, where the channel-transfer function is constant. For clarity, let us first consider the discrete-time channel model of the single-track system without ITI. The $k$th received symbol can be expressed by

$$y(k) = \sum_{l=0}^{L} f_l\, x(k-l) + n(k) \qquad (4)$$

where $\{f_l\}$ is the set of ISI coefficients of the channel due to the channel pulse response [17]. For practical systems, we also need to incorporate the ITI among the adjacent tracks into the channel model. In a $(t, r)$ multitrack system, where there are $r$ heads reading $t$ tracks simultaneously, the $k$th received symbol for the $j$th read head is then given by

$$y_j(k) = \sum_{i=1}^{t} a_{j,i} \sum_{l=0}^{L} f_l\, x_i(k-l) + n_j(k) \qquad (5)$$

where we assume that the pulse responses from different tracks are identical, i.e., the set of ISI coefficients is the same, except for the amplitude varying with the distance between the track and the read head, which is reflected by the coefficients $a_{j,i}$. Clearly, a more general case of different pulse responses on different tracks can also be considered in a similar fashion. Comparing with (2), we see that $h_{j,i}^{(l)} = a_{j,i} f_l$. In the examples shown later, we consider the ideal, normalized Class IV partial-response (PR4) channel [17], whose response is proportional to $1 - D^2$ (so that $L = 2$), even though the technique for information-rate computation is very general.

III. COMPUTATION OF THE ACHIEVABLE INFORMATION RATES

In this section, we describe the algorithm that is used for computing the achievable information rates of the MIMO ISI channels in detail, for both the deterministic and stochastic cases. We also focus on several specific cases, including the multiantenna systems over the frequency-flat fading, the multiaccess multiantenna systems, and the single-track systems with ITI.

A. Information Rates of the Deterministic and Stochastic MIMO ISI Channels

Let us first consider the deterministic channel and ergodic wireless channel, and compute the Shannon type capacity, or more precisely, the achievable information rate with the input constraint, which is given by

$$I(\mathbf{X}; \mathbf{Y}) = \lim_{N \to \infty} \frac{1}{N}\, I(\mathbf{x}_1^N; \mathbf{y}_1^N) \qquad (6)$$

where $\mathbf{x}_1^N$ denotes the sequence $(\mathbf{x}(1), \ldots, \mathbf{x}(N))$, $\mathbf{y}_1^N$ is defined likewise, and we omit the conditioning on the channel-transfer function for simplicity. In this paper, we mainly consider the achievable information rates with independent and uniformly distributed (i.u.d.) inputs, called symmetric information rates. Since the capacity is the achievable information rate maximized over the distribution of the constrained inputs, the symmetric information rate can be viewed as a lower bound. This lower bound should be very close to the capacity in the high SNR region, if there is no ambiguity about the channel inputs, given the noiseless channel outputs. This is justified by the fact that, in the limiting case when the SNR approaches infinity, the achievable information rate can be expressed by the entropy of the input vectors, since the entropy of the input vectors conditioned on the noiseless output vectors is zero [18], and this entropy is maximized by i.u.d. inputs. Furthermore, for the ergodic stochastic channels, without the CSI known to the transmitter, using i.u.d. inputs is a reasonable assumption that will result in a meaningful lower bound on the constrained capacity. However, the method we will present is more general. For example, it can be used to estimate the information rates for any given Markov input, and even maximization can be performed over all possible Markov inputs. This would generalize the algorithms given in [9] and [19], which deal with the SISO systems over ISI channels, to the case of MIMO systems. At the end of this section, we will discuss these extensions.

Since the additive noise is independent of the input signals, the information rate can be expressed as

$$I(\mathbf{X}; \mathbf{Y}) = \lim_{N \to \infty} \frac{1}{N} \left[ h(\mathbf{y}_1^N) - h(\mathbf{n}_1^N) \right] \qquad (7)$$

where the entropy of the noise sequence can be easily computed when the covariance matrix of the noise is known [18]. Therefore, the problem reduces to the computation or estimation of the entropy of the output sequence.
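As a brief worked instance of the noise-entropy term, consider temporally and spatially white Gaussian noise. The symbols below are labels assumed here for illustration: $r$ is the number of receiver outputs and $\sigma^2$ is the noise variance per real dimension. Under that assumption, the standard Gaussian differential-entropy formula gives the entropy per channel use as:

```latex
% Assumed notation: r outputs, noise variance \sigma^2 per real dimension.
\frac{1}{N}\, h(\mathbf{n}_1^N) =
\begin{cases}
\dfrac{r}{2}\,\log_2\!\left(2\pi e\,\sigma^2\right) & \text{real-valued noise,}\\[2ex]
r\,\log_2\!\left(2\pi e\,\sigma^2\right) & \text{complex-valued noise.}
\end{cases}
```

Both values are in bits per channel use, and the complex case is simply twice the real case because each complex sample carries two independent real dimensions.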
Clearly

$$h(\mathbf{y}_1^N) = -E\left[ \log p(\mathbf{y}_1^N) \right] \qquad (8)$$

that is, the entropy can be expressed through the expectation of the logarithm of the joint probability of the output sequence. Since the inputs are selected from a finite alphabet, we can generate the channel output sequence by simulation, set up a trellis based on the memory of the channel and the multiple inputs, then use the forward recursion of the BCJR algorithm [12] to compute the joint probability of the output sequences. We can estimate the entropy by conducting many simulations of the channel and averaging the logarithm of the joint probability estimates. However, for our model, the channel outputs are stationary ergodic hidden-Markov processes, so the Shannon–McMillan–Breiman theorem holds [18], [20]. Thus, we can estimate the entropy by conducting a single simulation with a very large block length, which simplifies the computations considerably.

Let us explain in detail the estimation of the joint probability of the output sequence obtained from a channel simulation. Assume that the size of the finite input alphabet is $M$. Since there are $t$ inputs and the memory of the channel is $L$, there are $M^{tL}$ states in the trellis describing the MIMO ISI channel. We denote the trellis state at the time instance $k$ as $S_k$, and define

$$\alpha_k(b) = p(\mathbf{y}_1^k, S_k = b) \qquad (9)$$

$$\gamma_k(a, b) = p(\mathbf{y}(k), S_k = b \mid S_{k-1} = a) \qquad (10)$$

where $\mathbf{y}_1^k$ is short for $(\mathbf{y}(1), \ldots, \mathbf{y}(k))$ and $a, b = 0, 1, \ldots, M^{tL} - 1$. Then we can compute $\alpha_k(b)$ for every trellis state by using the forward recursion of the BCJR algorithm, as follows:

$$\alpha_k(b) = \sum_{a} \alpha_{k-1}(a)\, \gamma_k(a, b) \qquad (11)$$

There are $M^t$ possible transitions or branches for every state, and $P(S_k = b \mid S_{k-1} = a) = 1/M^t$ for these branches when i.u.d. inputs are used. For other transitions, $P(S_k = b \mid S_{k-1} = a) = 0$. The branch metric is $\gamma_k(a, b) = P(S_k = b \mid S_{k-1} = a)\, p(\mathbf{y}(k) \mid S_{k-1} = a, S_k = b)$. For a given state transition, the elements of $\mathbf{y}(k)$ are independent Gaussian random variables, and the real and imaginary parts of any output sample are also independent if they are complex. Therefore, for the complex outputs, we have

$$p(\mathbf{y}(k) \mid S_{k-1} = a, S_k = b) = \prod_{j=1}^{r} \frac{1}{2\pi\sigma^2} \exp\left( -\frac{\left(y_{j,R}(k) - z_{j,R}(k)\right)^2 + \left(y_{j,I}(k) - z_{j,I}(k)\right)^2}{2\sigma^2} \right) \qquad (12)$$

where $z_j(k)$ denotes the noiseless $j$th output determined by the transition $(a, b)$ through (1), and the subscripts $R$ and $I$ represent the real and imaginary parts, respectively. Otherwise

$$p(\mathbf{y}(k) \mid S_{k-1} = a, S_k = b) = \prod_{j=1}^{r} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{\left(y_j(k) - z_j(k)\right)^2}{2\sigma^2} \right) \qquad (13)$$

Initially, we set $\alpha_0(0) = 1$ and $\alpha_0(a) = 0$ for $a \neq 0$. With this recursive computation, we update the metrics with the forward processing of the trellis. At the final stage, the joint probability of the outputs, $p(\mathbf{y}_1^N)$, is given by the sum of the metrics over all trellis states

$$p(\mathbf{y}_1^N) = \sum_{a} \alpha_N(a) \qquad (14)$$

The procedure for computing the information rates of the MIMO ISI channels is summarized as follows.

1) Generate the output sequences with a large length $N$ through the simulated MIMO ISI channel, where the inputs are chosen i.i.d. from a finite alphabet of size $M$.
2) Set up the MIMO trellis with $M^{tL}$ states and valid state transitions.
3) Set the initial values of the metrics ($\alpha_0(0) = 1$ and $\alpha_0(a) = 0$ for $a \neq 0$) and set the time instance $k = 1$.
4) Compute the probabilities of the output samples conditioned on the trellis transition using (12) or (13).
5) Update the metrics using (11).
6) Repeat steps 4) and 5) with $k$ increased by one if $k < N$.
7) Compute $p(\mathbf{y}_1^N)$ and $h(\mathbf{y}_1^N)$ by (14) and (8), respectively.
8) Finally, estimate the achievable information rate by (7).

We emphasize again that, for stochastic channels, although we did not express the probabilities as conditional ones (conditioned on the channel-transfer function) for simplicity in the above computations, it is important to note that the channel coefficients are known to the receiver and we can treat them as constants at every stage of the trellis processing.

For the nonergodic channels, the fading coefficients are chosen randomly and independently, but, in contrast to the ergodic case, they are fixed during the transmission of the entire block. Thus, the Shannon type capacity is zero, as no matter how small the transmission rate is, there is still a nonzero probability that the instantaneous information rate supported by the channel will be smaller. In this case, the capacity can be considered as a random variable corresponding to the instantaneous channel fading coefficients, and the cumulative distribution function (CDF) of the capacity can be obtained by a large number of channel realizations. 
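The eight-step procedure summarized above can be sketched in a few lines for one fixed (deterministic or quasi-static) channel realization; running it over many random realizations would give the samples from which the CDF is built. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes real-valued BPSK inputs, a known initial state (all past inputs equal to -1), and a per-step normalization of the metrics to avoid numerical underflow. The function name `info_rate_bpsk`, the tap layout `H[l]` (an r-by-t matrix per tap), and the argument `sigma` (noise standard deviation) are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def info_rate_bpsk(H, sigma, N=20000, seed=0):
    """Estimate the i.u.d. information rate (bits per channel use) of the
    real-valued channel y(k) = sum_l H[l] @ x(k-l) + n(k) with BPSK inputs.
    H has shape (L+1, r, t); sigma is the noise standard deviation."""
    rng = np.random.default_rng(seed)
    H = np.asarray(H, dtype=float)
    L, r, t = H.shape[0] - 1, H.shape[1], H.shape[2]
    X = np.array(list(itertools.product([-1.0, 1.0], repeat=t)))  # all input vectors
    nsym = len(X)                      # M^t branches leave every state
    nstates = nsym ** L                # M^(tL) trellis states
    states = np.arange(nstates)
    # digits[s, m]: index of the input vector sent m+1 steps ago when in state s
    digits = states[:, None] // nsym ** np.arange(L)[None, :] % nsym
    # Step 2: noiseless output of every branch (state, input) and its next state
    means = np.tile(X @ H[0].T, (nstates, 1, 1))          # (nstates, nsym, r)
    for m in range(L):
        means += (X[digits[:, m]] @ H[m + 1].T)[:, None, :]
    nxt = (np.arange(nsym)[None, :] + nsym * states[:, None]) % nstates
    # Step 1: simulate the channel with i.u.d. inputs, starting in state 0
    u = rng.integers(nsym, size=N)
    noise = sigma * rng.standard_normal((N, r))
    # Step 3: initial metrics for a known initial state
    alpha = np.zeros(nstates)
    alpha[0] = 1.0
    s, log2_p = 0, 0.0
    for k in range(N):
        y = means[s, u[k]] + noise[k]
        s = nxt[s, u[k]]
        # Step 4: branch likelihoods, with exp(-dmin / (2 sigma^2)) factored out
        d2 = ((y - means) ** 2).sum(axis=-1)
        dmin = d2.min()
        branch = alpha[:, None] * np.exp(-(d2 - dmin) / (2 * sigma**2)) / nsym
        # Step 5: forward recursion, followed by normalization of the metrics
        alpha = np.bincount(nxt.ravel(), weights=branch.ravel(), minlength=nstates)
        c = alpha.sum()
        alpha /= c
        log2_p += (np.log(c) - dmin / (2 * sigma**2)
                   - 0.5 * r * np.log(2 * np.pi * sigma**2)) / np.log(2.0)
    # Steps 7 and 8: output entropy rate minus noise entropy rate
    h_y = -log2_p / N
    h_n = 0.5 * r * np.log2(2 * np.pi * np.e * sigma**2)
    return h_y - h_n
```

For example, `info_rate_bpsk([[[1.0]]], 0.1)` models a single memoryless BPSK link at high SNR and should return a value close to 1 bit per channel use. Because the result is a finite-length estimate, it fluctuates slightly around the true rate and can marginally exceed the input entropy.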
Then we can compute the information rate versus outage probability, or the outage information rate, as compared with the outage capacity defined in [2] and [16] for the case without constrained signaling.

1) Information Rates With Markov Inputs: In the above discussion, we have assumed that i.i.d. inputs are employed at the transmitter. But sometimes, the constrained inputs may be correlated. For example, the run-length limited (RLL) modulation codes are generally used in recording channels in order to reduce the ISI and to provide easier synchronization [21]. With these codes, the channel inputs are, in fact, Markov inputs with the memory determined by the parameters of the RLL codes. As stated before, the information rates for the MIMO ISI channels with any Markov input can also be computed by using the simulation-based algorithm. To accomplish this, we need to use a joint trellis that incorporates both the Markov inputs and the MIMO ISI channel. Suppose that the maximum memory of the Markov inputs is $L_x$, then the memory of the new trellis becomes $\max(L, L_x)$, where $\max$ stands for the maximization. In addition, the transition probability $P(S_k = b \mid S_{k-1} = a)$ depends directly on the distribution of the Markov inputs.

Furthermore, we can perform waterfilling (that is, the maximization of the mutual information over the distribution of the Markov inputs) if the channel is known at the transmitter (as in the multitrack recording systems). By taking the correlated inputs into consideration, we may achieve a higher information rate than the case with i.i.d. inputs, especially in the low SNR region. In [19], the information rates for the SISO ISI channels are maximized over all Markov inputs with a certain memory by optimizing the transition probabilities of the input source via an iterative technique. This simulation-based algorithm is performed by using the forward and backward recursions of the BCJR algorithm that operate on the trellis of the ISI channel.

We can extend this method to the MIMO ISI channels as well by considering the new trellis of the MIMO ISI channel describing the channel and the Markov inputs jointly, with number of states $S$. Similar to the SISO case [19], the information rate for the MIMO ISI channel with constrained (Markov) inputs can be equivalently expressed by

$$I(\mathbf{X}; \mathbf{Y}) = \sum_{(a, b)\ \text{valid}} \mu_a P_{ab} \left[ \log_2 \frac{1}{P_{ab}} + T_{ab} \right] \qquad (15)$$

where $P_{ab} = P(S_k = b \mid S_{k-1} = a)$ and $\mu_a$ represents the steady-state probability satisfying $\mu_b = \sum_a \mu_a P_{ab}$. Denoting $P(S_{k-1} = a, S_k = b \mid \mathbf{y}_1^N)$ and $P(S_{k-1} = a \mid \mathbf{y}_1^N)$ by $P_k(a, b \mid \mathbf{y})$ and $P_k(a \mid \mathbf{y})$, respectively, the term $T_{ab}$ for a valid transition from state $a$ to $b$ can be written as

$$T_{ab} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} E\left[ \frac{P_k(a, b \mid \mathbf{y})}{\mu_a P_{ab}} \log_2 P_k(a, b \mid \mathbf{y}) - \frac{P_k(a \mid \mathbf{y})}{\mu_a} \log_2 P_k(a \mid \mathbf{y}) \right] \qquad (16)$$

To compute $T_{ab}$, in addition to the forward recursion of the BCJR algorithm, we need the following backward recursion [12]:

$$\beta_{k-1}(a) = \sum_{b} \beta_k(b)\, \gamma_k(a, b) \qquad (17)$$

where we define

$$\beta_k(b) = p(\mathbf{y}_{k+1}^N \mid S_k = b) \qquad (18)$$

for all trellis states $b$, and initialize $\beta_N(b) = 1$. Then, we obtain [12] $P_k(a, b \mid \mathbf{y})$ as

$$P_k(a, b \mid \mathbf{y}) = \frac{\alpha_{k-1}(a)\, \gamma_k(a, b)\, \beta_k(b)}{p(\mathbf{y}_1^N)} \quad \text{for } k = 1, \ldots, N \qquad (19)$$

where $p(\mathbf{y}_1^N)$ can be computed by (14). Similarly

$$P_k(a \mid \mathbf{y}) = \sum_{b} P_k(a, b \mid \mathbf{y}) \qquad (20)$$

or

$$P_k(a \mid \mathbf{y}) = \frac{\alpha_{k-1}(a)\, \beta_{k-1}(a)}{p(\mathbf{y}_1^N)} \qquad (21)$$

Then, $T_{ab}$ can be computed by substituting these quantities into (16).

To maximize $I(\mathbf{X}; \mathbf{Y})$ in (15) for fixed $T_{ab}$ ($a, b = 0, \ldots, S-1$), we define the noisy adjacency matrix $\mathbf{A}$ with size of $S \times S$, where $A_{ab} = 2^{T_{ab}}$ for all the valid transitions, and $A_{ab} = 0$ for the others [19]. Suppose that the maximum real eigenvalue of $\mathbf{A}$ is $W_{\max}$ and the corresponding eigenvector is $\mathbf{v} = [v_0, \ldots, v_{S-1}]^T$, then the maximization of $I(\mathbf{X}; \mathbf{Y})$ in (15) is achieved by

$$P_{ab} = \frac{v_b}{v_a} \cdot \frac{A_{ab}}{W_{\max}} \qquad (22)$$

The iterative algorithm works as follows. In the initialization, we arbitrarily choose a valid set of transition probabilities $P_{ab}$ ($a, b = 0, \ldots, S-1$). Then, we compute $T_{ab}$ for the given set of $P_{ab}$. In the second half of the iteration, we update the transition probabilities using (22). We repeat these steps until the transition probabilities converge (or the information rate converges). By relating this with the Arimoto–Blahut algorithm [22], [23], it is conjectured in [19] that this iterative method will lead to a convergence to the maximized information rate, which is also supported by empirical evidence.

B. Special Case: Multiantenna Systems Over Frequency-Flat Fading Channels

We now consider the information rates of the MIMO systems over frequency-flat fading channels with inputs chosen from a finite alphabet, which can be considered as a special case of the MIMO ISI channels with $L = 0$. The mutual information between the input vector and output vector conditioned on the channel-transfer function is expressed by

$$I(\mathbf{x}; \mathbf{y} \mid \mathbf{H}) = h(\mathbf{y} \mid \mathbf{H}) - h(\mathbf{n}) \qquad (23)$$

where we use $\mathbf{x}$, $\mathbf{y}$, $\mathbf{n}$ to denote the input, output, and noise vectors, respectively (instead of using the general denotations in (1) for simplicity). Since $h(\mathbf{n})$ can be easily obtained for a given noise covariance matrix, we only need to compute $h(\mathbf{y} \mid \mathbf{H})$. In [24] and [25], numerical integration is used in the calculations of the information rates over SISO flat-fading channels. We may apply this approach to the MIMO channels, but the resulting complexity is very high. For example, for a (2, 1) system, where we ignore the subscripts for the receiver and express the output as $y = h_1 x_1 + h_2 x_2 + n$, we need to average over $y_R$, $y_I$, $h_{1,R}$, $h_{1,I}$, $h_{2,R}$, and $h_{2,I}$ (the subscripts $R$ and $I$ represent real and imaginary parts, respectively). It is clear that a six-fold integration is necessary for such a small-sized multiantenna system. We can easily see that we need a $2r(t+1)$-fold integration for a general $(t, r)$ system with this approach. Although in the special case when $t = 1$, we can reduce this computation to a $3r$-fold integration by dephasing, or even to a $2r$-fold one with the binary PSK (BPSK) modulation by ignoring the imaginary parts, the computational burden to obtain an accurate result is still very high. Instead, we can also adopt the simulation-based method and estimate the entropy of the output sequence conditioned on the channel-transfer function in a much more efficient way. 
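A minimal sketch of such a simulation-based estimate for the frequency-flat case is given below. It assumes i.u.d. BPSK inputs, i.i.d. Rayleigh coefficients with unit variance, and perfect CSI at the receiver. The name `flat_fading_rate` and the direct use of the per-dimension noise standard deviation `sigma` as the SNR parameter are illustrative assumptions and do not reproduce the paper's SNR normalization.

```python
import itertools
import numpy as np

def flat_fading_rate(t, r, sigma, n_real=20000, seed=0):
    """Monte Carlo estimate (bits per channel use) of the i.u.d. BPSK
    information rate of a (t, r) Rayleigh flat-fading channel with receiver
    CSI. sigma is the noise standard deviation per real dimension."""
    rng = np.random.default_rng(seed)
    X = np.array(list(itertools.product([-1.0, 1.0], repeat=t)))   # (2^t, t)
    # One independent channel matrix, input vector, and noise vector per realization
    H = (rng.standard_normal((n_real, r, t))
         + 1j * rng.standard_normal((n_real, r, t))) / np.sqrt(2.0)
    HX = H @ X.T                                   # noiseless outputs, (n_real, r, 2^t)
    idx = rng.integers(len(X), size=n_real)        # transmitted vector per realization
    noise = sigma * (rng.standard_normal((n_real, r))
                     + 1j * rng.standard_normal((n_real, r)))
    y = HX[np.arange(n_real), :, idx] + noise      # received vectors, (n_real, r)
    # Squared distance from y to every candidate noiseless output
    d2 = (np.abs(y[:, :, None] - HX) ** 2).sum(axis=1)        # (n_real, 2^t)
    dmin = d2.min(axis=1, keepdims=True)
    # ln p(y | H): average of the complex Gaussian densities over all inputs
    log_p = (np.log(np.exp(-(d2 - dmin) / (2 * sigma**2)).mean(axis=1))
             - dmin[:, 0] / (2 * sigma**2) - r * np.log(2 * np.pi * sigma**2))
    h_y = -log_p.mean() / np.log(2.0)              # estimate of h(y | H) in bits
    h_n = r * np.log2(2 * np.pi * np.e * sigma**2) # entropy of the complex noise
    return h_y - h_n
```

The only averaging is over simulated realizations, so the cost grows linearly with `n_real` and with the number of candidate input vectors rather than with the dimension of a numerical integral.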
In this case, we generate $N$ channel realizations, obtain $N$ output vectors, and estimate the expectation directly by

$$h(\mathbf{y} \mid \mathbf{H}) \approx -\frac{1}{N} \sum_{i=1}^{N} \log p\left(\mathbf{y}^{(i)} \mid \mathbf{H}^{(i)}\right) \qquad (24)$$

with large $N$, where $p(\mathbf{y}^{(i)} \mid \mathbf{H}^{(i)})$ is the computed joint probability of the output vector for the $i$th realization, given by

$$p\left(\mathbf{y}^{(i)} \mid \mathbf{H}^{(i)}\right) = \sum_{\mathbf{x} \in \mathcal{X}} p\left(\mathbf{y}^{(i)} \mid \mathbf{x}, \mathbf{H}^{(i)}\right) P(\mathbf{x}) \qquad (25)$$

where $\mathcal{X}$ is the value set of the input vector with size $M^t$. Given the inputs and fading coefficients, the probability of the output vector, $p(\mathbf{y}^{(i)} \mid \mathbf{x}, \mathbf{H}^{(i)})$, can be computed by (12), and $P(\mathbf{x}) = 1/M^t$ assuming the i.u.d. inputs are used.

C. Multiaccess Multiantenna Systems

We can also apply the Monte Carlo simulation technique to the case of multiantenna systems with multiple users, which is, in fact, a more general setting including the single-user MIMO system as a special case. We consider a multiple-access channel consisting of $K$ users, each equipped with $t$ transmit antennas, and a common receiver with $r$ antennas. We denote this setting as a system. The output vector of the multiaccess system over a frequency-selective fading channel with $L + 1$ resolvable paths is given by

$$\mathbf{y}(k) = \sum_{u=1}^{K} \sum_{l=0}^{L} \mathbf{H}_{u,l}(k)\, \mathbf{x}_u(k-l) + \mathbf{n}(k) \qquad (26)$$

where the subscript $u$ denotes the $u$th user. Similar to (2), we can rewrite (26) as

$$\mathbf{y}(k) = \sum_{l=0}^{L} \tilde{\mathbf{H}}_l(k)\, \tilde{\mathbf{x}}(k-l) + \mathbf{n}(k) \qquad (27)$$

where $\tilde{\mathbf{x}}(k)$ is a $Kt \times 1$ vector, and $\tilde{\mathbf{H}}_l(k)$ is an $r \times Kt$ matrix. Assume that all the transmitted signals go through independent and identical channels, then the multiaccess system takes the same channel representation as the $(Kt, r)$ one. Therefore, we can apply the algorithm described above for the multiaccess multiantenna systems in a straightforward manner.

Fig. 1. Information rates of the (2, 2) systems over ergodic channels.

We assume that the CSI is known only at the receivers. In this case, the ergodic capacity region is obtained in [15], where it is reported that the ergodic capacity region for a two-user system is a pentagon, and all the boundary points are achievable by i.i.d. Gaussian input signals with evenly allocated transmit power. Here we consider the inputs chosen from a finite alphabet and compute the information-rate region for such channels with two users. 
The information-rate region under the specific input constraint is the closure of the convex hull of all $(R_1, R_2)$ satisfying $R_1 \le I(\mathbf{X}_1; \mathbf{Y} \mid \mathbf{X}_2)$, $R_2 \le I(\mathbf{X}_2; \mathbf{Y} \mid \mathbf{X}_1)$, and $R_1 + R_2 \le I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y})$, where $\mathbf{X}_1$ and $\mathbf{X}_2$ represent the transmitted signals for users 1 and 2, respectively, and $\mathbf{Y}$ represents the received signals. We assume that there is a power constraint for each user (which can be identical for both). The boundaries $I(\mathbf{X}_1; \mathbf{Y} \mid \mathbf{X}_2)$ and $I(\mathbf{X}_2; \mathbf{Y} \mid \mathbf{X}_1)$ can be computed using the same method as in the case of single-user multiantenna system, since the interference from the other user is known. We can compute $I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y})$ by using the achievable information rate for the $(2t, r)$ system when the subchannels are i.i.d.

D. Single-Track Systems With ITI

For comparison, we also consider the single-track channels, which take the interference from adjacent tracks into account, but have only one read head. The receiver detects the desired signal from the corresponding track and treats the others as pure interference, thus does not perform joint decoding. For this case, without loss of generality, we assume that the signal on the first track is the desired signal. To compute the information rates for this single-track system, we cannot use (7), since there is interference in the received signal. However, the information rate can be written as

$$I(X_1; Y) = \lim_{N \to \infty} \frac{1}{N} \left[ h(y_1^N) - h(v_1^N) \right] \qquad (28)$$

where we drop the subscript for the output sequence since there is only one read head, and $v(k)$ stands for the interference-plus-noise term given by

$$v(k) = \sum_{i=2}^{t} a_i \sum_{l=0}^{L} f_l\, x_i(k-l) + n(k) \qquad (29)$$

Therefore, we can consider an imaginary channel model with output $v(k)$ and use the BCJR algorithm to estimate both terms on the right-hand side (RHS) of (28) at the same time. There is another approach obtained by applying the chain rule of the mutual information [18]. Denoting the desired input $x_1$ and the interfering inputs $(x_2, \ldots, x_t)$ by $X_1$ and $X_2$, respectively, we rewrite (28) by

$$I(X_1; Y) = I(X_1, X_2; Y) - I(X_2; Y \mid X_1) \qquad (30)$$

where the two terms on the RHS can be computed by the information rates for the $(t, 1)$ and $(t-1, 1)$ systems, respectively, with the assumption that the signals transmitted through different tracks are independent.

IV. EXAMPLES OF ACHIEVABLE INFORMATION RATES

A. Information Rates and Outage Information Rates of the Ergodic and Nonergodic Multiantenna Systems

We first consider the multiantenna wireless systems with BPSK modulation. Fig. 1 shows the information rates of a (2, 2) system over the ergodic frequency-selective fading channels with two and three equal-energy taps ($L = 1$ and $L = 2$), respectively. The channel fading is assumed to be independent from one symbol to the next. Also shown are the unconstrained capacity and information rates of the frequency-flat fading channels ($L = 0$). We observe that, as expected, the frequency-selective fading channel has a higher information rate than the frequency-flat one. This is in contrast to the deterministic ISI channels, where ISI typically degrades the system performance. We can further increase the number of independent taps to obtain higher frequency diversity. However, we know from the simulation results that the additional improvement in terms of achievable information rates is only marginal with increasing $L$.

Fig. 2. Outage capacity and information rates of the (2, 2) systems over nonergodic channels (outage probability = 0.1).

In Fig. 2, we present the outage capacities and information rates for the (2, 2) system over both frequency-flat and frequency-selective fading channels with two equal-energy taps, when the outage probability is 0.1. The outage-capacity result is obtained by using the method in [5], where i.i.d. Gaussian inputs are used. Similarly, we observe that both the capacity and information rates for the frequency-selective fading channel are considerably higher than the ones for frequency-flat fading. In addition, we observe that there is almost no difference between the capacity and information rates in the low SNR region, while there is a significant difference in the high SNR region. 
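The outage quantities discussed above are quantiles of the instantaneous rate taken over channel realizations. The sketch below illustrates only this quantile step. In place of the constrained-input, frequency-selective rates plotted in Fig. 2, it substitutes the standard Gaussian-input, flat-fading log-det expression with equal power per transmit antenna, so its numbers are not comparable with the figure. The function names and the parameter `snr` (a linear-scale SNR) are hypothetical.

```python
import numpy as np

def gaussian_inst_capacity(H, snr):
    """Instantaneous Gaussian-input rate (bits per channel use) for each
    flat-fading realization in H, shape (n, r, t), equal power per antenna."""
    n, r, t = H.shape
    G = np.eye(r) + (snr / t) * H @ H.conj().swapaxes(-1, -2)
    sign, logdet = np.linalg.slogdet(G)   # G is Hermitian positive definite
    return logdet / np.log(2.0)

def outage_rate(inst_rates, p_out):
    """Rate R such that P(instantaneous rate < R) is approximately p_out."""
    return float(np.quantile(inst_rates, p_out))

# Quasi-static (2, 2) Rayleigh fading: one channel matrix per transmitted block
rng = np.random.default_rng(0)
H = (rng.standard_normal((20000, 2, 2))
     + 1j * rng.standard_normal((20000, 2, 2))) / np.sqrt(2.0)
rates = gaussian_inst_capacity(H, snr=10.0)
print("10% outage rate:", outage_rate(rates, 0.1), "mean rate:", rates.mean())
```

To obtain an outage information rate with constrained inputs, the array `rates` would instead hold one simulation-based information-rate estimate per channel realization.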
B. Effects of the Number of Antennas

With the simulation-based algorithm, we can easily compute the information rates with any number of transmit/receive antennas. In Fig. 3, we consider the case when $t = r$, and show the achievable information rates as a function of the number of antennas for the ergodic frequency-selective fading channel with two balanced taps. The SNR shown is from $-10$ to 10 dB in 4-dB increments. Again, we assume that the fading is independent from one symbol to the next. We observe that there is a linear increase in the achievable information rates, just as in the case of the unconstrained capacity over the frequency-flat fading channels [2]. The major difference is that, for the constrained case, there is a rate limit (in bits per channel use), set by the finite input alphabet, that cannot be exceeded by increasing the SNR. We can easily obtain similar results for the frequency-flat fading channel, which show the linear increase of the information rates as well. These results (which are specifically for the case with constrained signaling schemes) are new and analogous to the (Gaussian input) capacity results known for MIMO flat-fading channels.

Fig. 3. Information rates of the (t, t) system over frequency-selective fading.

C. Effects of the Spatial Correlation

In the above simulations, we assumed that the fading coefficients from different antennas are independent, which is a necessary condition to achieve full space diversity. In reality, however, the antennas may not be separated far enough, and thus there may be significant correlations among different subchannels. Here we use one simple example to illustrate the effect of the spatial correlation on the achievable information rates.

We consider a (2, 1) system with BPSK over the ergodic frequency-flat Rayleigh fading channels. For every symbol instance, there are two fading coefficients, both of which are zero-mean complex Gaussian random variables with independent real and imaginary parts. We denote them by $h_1$ and $h_2$ and, under a normalization of their variances, define the correlation coefficient $\rho$ through the expectation of $h_1 h_2^*$, with $^*$ denoting the complex conjugate. In Fig. 4, we give an example of the information rates for different values of $\rho$, together with the information rates for the SISO system. We see that, even for the highly correlated case shown, the SNR loss due to the spatial correlation is less than 2 dB for all the transmission rates, and the highly correlated (2, 1) system can still offer a large capacity improvement over a SISO system.

Note that when $\rho = 1$, the unconstrained capacity for the fully correlated (2, 1) system is exactly the same as that of the SISO system when the transmission power is normalized according to the number of transmit antennas. However, the achievable information rates are different. In particular, the asymptotic information rate with infinite SNR is 1.5 bits per channel use when $\rho = 1$. Intuitively, this is because we can detect the two input bits when they are the same, both $+1$ or both $-1$; otherwise, we can only tell that one is $+1$ and the other is $-1$. Mathematically, the achievable information rate in this case is just the entropy of the channel output, which takes three values for a fixed fading coefficient (with probabilities 1/4, 1/2, and 1/4) and therefore has an entropy of 1.5 bits if i.u.d. binary inputs are used.

Fig. 4. Information rates of the (2, 1) system with spatial correlation.

D. Information Rates Over Ricean Fading Channels

We now consider the channels with a dominant LOS component, and thus use the Ricean fading channel model instead. In this case, the channel-transfer function is defined as the sum of two terms: the random transfer function for the MIMO Rayleigh fading channel as described before, and a deterministic matrix that represents the LOS component. That is, the elements of the channel matrix are complex Gaussian random variables with independent real and imaginary parts that have common means for all elements and a common variance. We define the Ricean factor $K$ as the ratio of the LOS power to the scattered power, under a normalization of the total power. When $K = 0$, there is no LOS and the channel exhibits Rayleigh fading, and when $K \to \infty$, there is no fading and we can consider the channel as a Gaussian channel.

It is well known that the LOS is helpful for the system performance in a SISO system, but this is generally not the case in MIMO systems. In Fig. 5, the information rates of the (2, 2) system over ergodic Ricean fading are shown, where the LOS component is set identically for all the elements. We observe that, in this case, the LOS is not desirable in terms of achievable information rates, i.e., rich scattering provides more diversity for MIMO systems.

Fig. 5. Information rates of the (2, 2) Ricean fading channels.

E. Information-Rate Region of the Multiaccess Multiantenna Systems

In this subsection, we present an example for a two-user multiaccess channel, where each user is equipped with a single transmit antenna and there are two antennas at the receiver. In Fig. 6, the ergodic frequency-flat fading channel is considered, and both the information-rate region and capacity region [15] are shown with a total SNR of 0 and 10 dB, respectively, where the two users have the same power constraint.

Fig. 6. Information-rate and capacity regions of the (2, 1, 2) system.

We observe that the information-rate region becomes almost a rectangle (or even a square, in the case of two symmetric users) for high SNR values, which is not true for the capacity region. The reason behind this result is that, when the SNR is large, the information rate of a user without knowledge of the other user's signal almost equals the rate with that knowledge, i.e., the knowledge of the interference cannot help much to increase the information about the desired signal. In the limiting case with infinite SNR, the detection of the signals can be done with zero error probability only if the signals from the two users experience different fading.
But for the capacity with Gaussian inputs, the knowledge of the interference always helps significantly, even with a very large SNR. Similar observations can be made for the frequency-selective fading channels.

F. Information Rates of the Multitrack and Single-Track Systems

Now we turn our attention to the deterministic MIMO ISI channels. We consider the (2, 2) multitrack PR4 system specified by (31). In order to obtain a fair comparison with the single-track system with ITI, we consider the information rates per track; that is, we normalize the information rate defined by (6) by the number of tracks (in bits per track use). On the left-hand side (LHS) of Fig. 7, we show the unconstrained capacity and the achievable information rates for the (2, 2) system with ITI levels of 0.2 and 0.5, respectively. The capacity is obtained by using i.i.d. Gaussian inputs; thus it is, in fact, a lower bound. Following a similar approach as in [5], the capacity bound is given by (32), where the identity matrix is of size $2 \times 2$ and the remaining term is given by (33).

Fig. 7. Achievable information rates for the multitrack system (LHS: information rates and capacity bound. RHS: comparison with the single-track system).

This technique is an extension of the one introduced in [6] for the SISO ISI channel. First, a hypothetical channel model is set up, where the linear convolution of the ISI channel is replaced by the circular convolution. Then the discrete Fourier transform (DFT) is applied to convert the channel into parallel and independent memoryless subchannels. The capacity for this hypothetical channel is thus obtained, which is shown to be the same as the capacity for the original channel with linear convolution.

In the same figure, we also show the maximized information rates over all the Markov inputs with a given memory. We observe that in the low-to-medium SNR region, the information rate achieved by the optimized Markov inputs is significantly larger than the one with i.i.d.
inputs, and it is even larger than the capacity lower bound. In particular, to achieve a rate of 0.4 bits per track use, there is a gain of 2.6 dB by using optimized Markov inputs over the i.i.d. inputs for the corresponding ITI level in Fig. 7. However, in the high SNR region, the i.i.d. inputs are almost optimum, as we expect. In addition, there is a significant SNR loss in the high-rate region when the ITI level is increased from 0.2 to 0.5.

For comparison purposes, we consider the corresponding single-track PR4 system with ITI. The information rates for the multitrack and single-track PR4 systems are shown on the RHS of Fig. 7 with i.u.d. inputs and different ITI levels. We observe that the improvement in achievable information rates by adopting the multitrack system is large, especially for high ITI levels. For example, to achieve 16/17 bits per track use at one of the ITI levels shown, about 11.6 dB is needed for the single-track system, but only 6.0 dB is necessary for the multitrack system.

V. TURBO CODING AND ITERATIVE DECODING FOR THE MIMO ISI CHANNELS

In this section, we describe a turbo coding/decoding scheme for the MIMO ISI channels, in order to assess how close the practical coding schemes may be to the information-theoretical limits. In the first part, we describe the setting in detail. Then we derive the MAP detector for the MIMO ISI channels, using a (2, 2) system as an example. Finally, we present several examples for the multiantenna systems over ergodic frequency-selective fading channels, and for multitrack and single-track recording systems.

A. Concatenated Coding for MIMO ISI Channels

Both serial and parallel concatenation schemes have been used for the recording systems [26], [27]. In parallel concatenation, the information bits are first encoded using a turbo code, i.e., two parallel concatenated convolutional codes (PCCC) separated by an interleaver; then the coded bits are interleaved and transmitted over the ISI channel.
In serial concatenation, instead of a turbo code, a single convolutional code (SCC) is employed. The decoder in both cases is an iterative decoder composed of MAP decoders for the component codes and a MAP detector for the ISI channel.

The turbo coding and decoding algorithms are also employed in the multiantenna systems over frequency-flat fading channels (e.g., [28]–[30]). In [30], the turbo-coded bits are transmitted through a number of antennas, and the signals obtained from several receive antennas are sent to the turbo decoder for joint decoding. The log-likelihood ratios (LLRs) of the coded bits are first computed based on the channel observations, and then they are used in the turbo decoder. Turbo equalization by using an iterative demodulation-decoding algorithm is also performed.

We now extend these techniques to the case of MIMO ISI channels by designing the corresponding channel MAP detectors for several scenarios, including stochastic channels (multiantenna wireless channels) and deterministic channels (multitrack and single-track recording channels).

Fig. 8. Block diagram of the turbo coding scheme for MIMO ISI systems.

The block diagrams of the transmitter and receiver are shown in Fig. 8. At the transmitter, we first encode the message bit sequence by using an outer encoder, such as the SCC or turbo code. After being passed through a random interleaver, the coded bits are divided evenly into groups, one per transmitter or track, and each group is sent through its transmitter or track. In the simulations, we will use the (2, 2) system as an example, where we choose the odd-indexed bits for one group and the even-indexed ones for the other. The output sequences of the (deterministic or stochastic) MIMO ISI channels, corrupted by the AWGN, constitute the received signal.
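The transmitter-side bookkeeping for the (2, 2) example can be sketched as follows: the coded bits are passed through a random interleaver and then split into odd-indexed and even-indexed groups, one per transmitter or track. This is only a minimal sketch; the function name `demux_and_map`, the seeded permutation used as the interleaver, and the BPSK mapping 0 to +1 and 1 to -1 are assumptions made here and are not taken from the paper.

```python
import numpy as np

def demux_and_map(coded_bits, seed=0):
    """Interleave coded bits and split them into two BPSK streams.

    Returns the interleaver permutation and the two symbol streams.
    The BPSK mapping (0 -> +1, 1 -> -1) is an assumed convention.
    """
    bits = np.asarray(coded_bits, dtype=int)
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(bits))     # random interleaver (hypothetical choice)
    interleaved = bits[perm]
    # "Odd-indexed" bits in 1-based counting are Python indices 0, 2, 4, ...
    group1 = interleaved[0::2]
    group2 = interleaved[1::2]
    return perm, 1 - 2 * group1, 1 - 2 * group2

if __name__ == "__main__":
    perm, s1, s2 = demux_and_map([0, 1, 1, 0, 1, 0, 0, 1])
    print("stream for transmitter/track 1:", s1)
    print("stream for transmitter/track 2:", s2)
```

At the receiver, the soft outputs for the two streams would be merged in the same odd/even order and the permutation inverted before they are passed to the outer decoder.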
At the receiver, the turbo equalization [31] is used, where a modified channel MAP detector takes the channel outputs and the extrinsic information fed back from the outer decoder as its inputs, and generates the soft information about the coded bits. This soft information is deinterleaved and passed to the outer decoder. The outer decoder, i.e., a turbo decoder or a MAP decoder for an SCC, generates the extrinsic information, which is then fed back to the channel MAP detector after appropriate processing for the next iteration step. The LLRs of the message bits are used to make hard decisions after a number of iterations.

B. MAP Detector for the MIMO ISI Channels

We use a binary-input (2, 2) MIMO system with a given ISI memory as an example to illustrate the necessary modifications in the MAP algorithm. First of all, the channel trellis is set up according to the multiple inputs and multiple outputs. With two binary inputs, the number of trellis states is four raised to the power of the ISI memory, and, in fact, the trellis is the same as the one used in the computation of the information rates. Suppose that two sequences of coded bits with equal block lengths are transmitted through the (2, 2) ISI channel, producing the output vector sequence. Then, the LLRs of the two coded bits at a given time instance can be computed as shown in (34) and (35), which are expressed in terms of the trellis state at that time instance, as defined in Section III, and the sets of valid state transitions corresponding to each pair of values of the two coded bits. We observe that there are four common terms in (34) and (35), which can be computed by the BCJR algorithm operating on the trellis of the MIMO ISI channel. Similar to the algorithm for computing the information rates, we have (36), where the three quantities on the RHS are defined as in (9), (10), and (18), respectively.
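The structure of this computation can be illustrated with a small forward-backward (BCJR) sketch for a binary-input (2, 2) channel with ISI memory one, so that the trellis has four states. Everything specific here is an assumption made for illustration: the real-valued channel model $y_k = H_0 x_k + H_1 x_{k-1} + n_k$ with inputs in $\{+1, -1\}$, the unknown (uniform) initial state, the function name `bcjr_llrs`, and the sign convention $\mathrm{LLR} = \log P(x = +1 \mid y) / P(x = -1 \mid y)$. The branch metric is taken as the likelihood of the observed output vector times the product of the a priori probabilities of the two input bits.

```python
import itertools
import numpy as np

def bcjr_llrs(y, H0, H1, sigma2, prior=None):
    """Per-bit LLRs for y_k = H0 x_k + H1 x_{k-1} + n_k, x_k in {+1,-1}^2.

    y: (N, 2) outputs; prior: (N, 2) array of P(x = +1), default 0.5.
    The trellis state is the previous input vector (4 states).
    """
    N = y.shape[0]
    symbols = np.array(list(itertools.product([1.0, -1.0], repeat=2)))
    S = len(symbols)
    if prior is None:
        prior = np.full((N, 2), 0.5)
    # Branch metrics gamma[k, s_prev, s_next]: likelihood times input priors.
    gamma = np.zeros((N, S, S))
    for k in range(N):
        for sp in range(S):
            for sn in range(S):
                x = symbols[sn]
                mean = H0 @ x + H1 @ symbols[sp]
                lik = np.exp(-np.sum((y[k] - mean) ** 2) / (2.0 * sigma2))
                p_x = np.prod(np.where(x > 0, prior[k], 1.0 - prior[k]))
                gamma[k, sp, sn] = lik * p_x
    # Forward recursion (normalized at every step for numerical stability).
    alpha = np.zeros((N + 1, S))
    alpha[0] = 1.0 / S
    for k in range(N):
        a = alpha[k] @ gamma[k]
        alpha[k + 1] = a / a.sum()
    # Backward recursion.
    beta = np.zeros((N + 1, S))
    beta[N] = 1.0
    for k in range(N - 1, -1, -1):
        b = gamma[k] @ beta[k + 1]
        beta[k] = b / b.sum()
    # Sum alpha * gamma * beta over the transitions consistent with each bit value.
    llr = np.zeros((N, 2))
    for k in range(N):
        joint = alpha[k][:, None] * gamma[k] * beta[k + 1][None, :]
        per_input = joint.sum(axis=0)          # indexed by the current input vector
        for i in range(2):
            plus = per_input[symbols[:, i] > 0].sum()
            minus = per_input[symbols[:, i] < 0].sum()
            llr[k, i] = np.log(plus + 1e-300) - np.log(minus + 1e-300)
    return llr
```

In an iterative receiver, the `prior` argument would be refreshed from the extrinsic information of the outer decoder at every iteration, and the a priori contribution would be subtracted from the returned LLRs before they are passed on; both steps are omitted here for brevity.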
In particular, for a specific trellis state transition corresponding to a given pair of input bits, we rewrite the branch metric as in (37) and (38), where (37) holds because the two observations are independent for a given state transition, and the approximation (38) follows due to the use of the interleaver. Then, the a priori information fed back from the outer decoder can be used to update the a priori probabilities of the two coded bits in the iterative decoding.

We also use this MAP detector and the coding/decoding scheme for the single-track system, where there is only one received signal and one desired transmitted signal, and no detection is performed for other transmitted signals, which are considered as pure interference. In this case, we set up a multiple-input single-output (MISO) trellis and follow a similar procedure as the one described above. Since we have only one output and there is no a priori information about the other input bit in any iteration, we modify (38) to obtain (39), where the a priori term for the interfering input is a constant. We usually assume that the interfering input is equally likely to take either value, and therefore this constant can be omitted in the computation. We only compute the LLRs of the desired information bits by (34).

When the channel response is random and known to the receiver, we just use the channel coefficients in the decoding, as they are constant during every symbol duration. That is, in the computation of the branch metric, the probability of the output samples for a given trellis transition is conditioned on the random channel response, just as in the case of the computation of the information rates.

Fig. 9. Performance of the SCC scheme for the (2, 2) multiantenna system.

Fig. 10. Performance of the SCC scheme for the multitrack and single-track PR4 channels.

C.
Examples

We employ this coding/decoding scheme in the multiantenna system over the frequency-selective fading channels, where we use a fixed-rate SCC with generators of (33/31) in octal form as the outer encoder. The block length of the input sequence to the outer encoder is 10 016 and 15 iterations are used. In Fig. 9, we consider the ergodic frequency-selective channels with two balanced taps and three balanced taps, respectively. We assume that the channel fades are i.i.d. from one symbol to the next. We employ a precoder defined in terms of the modulo-2 addition $\oplus$ and the delay operator $D$. The performance of the uncoded system is obtained by passing the uncoded and nonprecoded information bits through the channel and using the channel MAP detector to obtain the decisions. The BER for the coded system is plotted in terms of the SNR per bit, which is derived from the SNR given in (3).

As expected, the performance over the channel with larger frequency diversity is better. We also see that if we do not use the precoder in this case, we cannot observe the waterfall property of the BER curves. For the precoded system, the required SNR per bit to achieve a BER of $10^{-5}$ for one of the two tap settings is about 6.8 dB, while the theoretical limit shown in Fig. 1 is about 5.9 dB per bit (where the rate loss is considered). Therefore, the gap between the theoretical limit and the performance achieved by this practical coding/decoding scheme is about 0.9 dB, if we consider a BER of $10^{-5}$ as reliable transmission. Similarly, the gap between these two values for the other tap setting is also about 0.9 dB. We observe that the SNR gain of the three-tap system over the two-tap one is about 0.6 dB, which is consistent with the information-theoretical results shown in Fig. 1. Furthermore, compared with the uncoded system, a coding gain of about 8–9 dB is achieved.
Since we are using an ergodic channel, the diversity order is mostly dependent on the minimum distance of the turbo code or the convolutional code, and that is the reason we observe a very sharp slope for the BER curves. We conducted extensive simulations with other codes of different rates, and observed that a performance of about 1.0 dB away from the theoretical limit is typical.

In Fig. 10, we show the BERs of this coding/decoding scheme over the multitrack and single-track systems with different ITI levels. We employ the SCC as the outer code as well, and the same parameters are used as in the previous example. A precoder is also employed. Compared with Fig. 7, we are only about 0.7–1.2 dB (the rate loss is considered as well) away from the theoretical limits for all the cases. We also observe a great improvement of the multitrack systems over the single-track systems with the same level of ITI, which is again in line with the information-theoretical results.

Although no results are shown here, we have also used the PCCC as the outer encoder, and found that the SCC achieves a better performance than the PCCC for both the stochastic and deterministic MIMO ISI channels.

VI. CONCLUSIONS

In this paper, we have computed the achievable information rates for the general MIMO ISI channels with the constraint that the inputs are chosen from a finite alphabet. The methods developed are applicable to both deterministic and stochastic channels. As an example of the stochastic MIMO ISI channels, we have considered the multiantenna system over frequency-selective fading channels and have computed the information rates when the CSI is known to the receiver only. We have quantified the improvement in the achievable information rates provided by the additional diversity over the flat-fading channels for both ergodic and nonergodic cases.
In addition, we have discussed the effects of the number of antennas, the spatial correlation of the fading coefficients, and the existence of the LOS signals on the achievable information rates with the signaling constraints. This setting also includes as special cases the single-antenna systems over frequency-selective fading channels, and the multiantenna systems over flat-fading channels. We have also applied this technique to compute the information-rate region of the multiaccess multiantenna systems with inputs from a specific constellation.

As for the deterministic MIMO ISI channels, we have used the multitrack magnetic recording systems, which employ multiple write and read heads, as examples. Both information rates with i.i.d. inputs and optimized Markov inputs have been considered. Our results show that the multitrack recording channels have significant advantages over the single-track channels, in terms of the achievable information rates when the ITI is considered.

The “constrained” capacity results are important for practical communication systems, since they can be used to compare the performance of specific coding schemes with the information-theoretical limits. In this paper, we have also described a turbo coding/decoding scheme for the MIMO ISI channels with a MAP detector developed for such channels, and we have demonstrated that a performance of about 1.0 dB away from the information-theoretical limits can be achieved.

REFERENCES

[1] G. J. Foschini and M. J. Gans, “On limits of wireless communication in a fading environment when using multiple antennas,” Wireless Pers. Commun., vol. 6, no. 3, pp. 311–335, Mar. 1998.
[2] I. E. Telatar, “Capacity of multiantenna Gaussian channels,” Eur. Trans. Telecommun., vol. 10, pp. 585–595, Nov./Dec. 1999.
[3] L. C. Barbosa, “Simultaneous detection of readback signals from interfering magnetic recording tracks using array heads,” IEEE Trans. Magn., vol. 26, pp. 2163–2165, Sept. 1990.
[4] A. F. Molisch, M.
Steinbauer, M. Toeltsch, E. Bonek, and R. S. Thoma, “Measurement of the capacity of MIMO systems in frequency-selective channels,” in Proc. IEEE Vehicular Technology Conf., vol. 1, May 2001, pp. 204–208.
[5] H. El Gamal, A. R. Hammons, Y. Liu, M. P. Fitz, and O. Y. Takeshita, “On the design of space-time and space-frequency codes for MIMO frequency-selective fading channels,” IEEE Trans. Inform. Theory, vol. 49, pp. 2277–2292, Sept. 2003.
[6] W. Hirt and J. L. Massey, “Capacity of the discrete-time Gaussian channel with intersymbol interference,” IEEE Trans. Inform. Theory, vol. 34, pp. 380–388, May 1988.
[7] S. Shamai, L. H. Ozarow, and A. D. Wyner, “Information rates for a discrete-time Gaussian channel with intersymbol interference and stationary inputs,” IEEE Trans. Inform. Theory, vol. 37, pp. 1527–1539, Nov. 1991.
[8] S. Shamai and R. Laroia, “The intersymbol interference channel: Lower bounds on capacity and channel precoding loss,” IEEE Trans. Inform. Theory, vol. 42, pp. 1388–1404, Sept. 1996.
[9] D. Arnold and H.-A. Loeliger, “On the information rate of binary-input channels with memory,” in Proc. IEEE Int. Conf. Communications, vol. 9, June 2001, pp. 2692–2695.
[10] H. D. Pfister, J. B. Soriaga, and P. H. Siegel, “On the achievable information rates of finite state ISI channels,” in Proc. IEEE Global Communications Conf., vol. 5, Nov. 2001, pp. 2992–2996.
[11] V. Sharma and S. K. Singh, “Entropy and channel capacity in the regenerative setup with applications to Markov channels,” in Proc. IEEE Int. Symp. Information Theory, June 2001, p. 283.
[12] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inform. Theory, vol. IT-20, pp. 284–287, Mar. 1974.
[13] Z. Zhang, T. M. Duman, and E. M. Kurtas, “On information rates of single-track and multitrack recording channels with intertrack interference,” in Proc. IEEE Int. Symp. Information Theory, June–July 2002, p. 163.
[14] Z.
Zhang and T. M. Duman, “Achievable information rates of multiantenna systems over frequency-selective fading channels with constrained inputs,” IEEE Commun. Lett., vol. 7, pp. 260–262, June 2003.
[15] W. Rhee and J. M. Cioffi, “Ergodic capacity of multiantenna Gaussian multiple-access channels,” in Proc. 35th Asilomar Conf. Signals, Systems, Computers, vol. 1, Nov. 2001, pp. 507–512.
[16] E. Biglieri, J. Proakis, and S. Shamai, “Fading channels: information-theoretic and communications aspects,” IEEE Trans. Inform. Theory, vol. 44, pp. 2619–2692, Oct. 1998.
[17] J. G. Proakis, “Equalization techniques for high-density magnetic recording,” IEEE Signal Processing Mag., vol. 15, pp. 73–82, July 1998.
[18] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[19] A. Kavčić, “On the capacity of Markov sources over noisy channels,” in Proc. IEEE Global Communications Conf., vol. 5, Nov. 2001, pp. 2997–3001.
[20] B. G. Leroux, “Maximum-likelihood estimation for hidden Markov models,” Stochastic Processes Applicat., vol. 40, pp. 127–143, 1992.
[21] J. G. Proakis, Digital Communications. New York: McGraw-Hill, 2001.
[22] S. Arimoto, “An algorithm for computing the capacity of arbitrary discrete memoryless channels,” IEEE Trans. Inform. Theory, vol. IT-18, pp. 14–20, Jan. 1972.
[23] R. Blahut, “Computation of channel capacity and rate-distortion functions,” IEEE Trans. Inform. Theory, vol. IT-18, pp. 460–473, July 1972.
[24] E. K. Hall and S. G. Wilson, “Design and analysis of turbo codes on Rayleigh fading channels,” IEEE J. Select. Areas Commun., vol. 16, pp. 160–174, Feb. 1998.
[25] E. Baccarelli and A. Fasano, “Some simple bounds on the symmetric capacity and outage probability for QAM wireless channels with Rice and Nakagami fadings,” IEEE J. Select. Areas Commun., vol. 18, pp. 361–368, Mar. 2000.
[26] W. E. Ryan, L. L. McPheters, and S. W.
McLaughlin, “Combined turbo coding and turbo equalization for PR4-equalized Lorentzian channels,” in Proc. Conf. Information Sciences and Systems, Mar. 1998, pp. 489–493.
[27] T. V. Souvignier, M. Öberg, P. H. Siegel, R. E. Swanson, and J. K. Wolf, “Turbo decoding for partial response channels,” IEEE Trans. Commun., vol. 48, pp. 1297–1308, Aug. 2000.
[28] H. Su and E. Geraniotis, “Space–time turbo codes with full antenna diversity,” IEEE Trans. Commun., vol. 49, pp. 47–57, Jan. 2001.
[29] Y. Liu, M. P. Fitz, and O. Y. Takeshita, “Full rate space–time turbo codes,” IEEE J. Select. Areas Commun., vol. 19, pp. 969–980, May 2001.
[30] A. Stefanov and T. M. Duman, “Turbo-coded modulation for systems with transmit and receive antenna diversity over block fading channels: System model, decoding approaches, and practical considerations,” IEEE J. Select. Areas Commun., vol. 19, pp. 958–968, May 2001.
[31] D. Raphaeli and Y. Zarai, “Combined turbo equalization and turbo decoding,” in Proc. IEEE Global Communications Conf., vol. 2, Nov. 1997, pp. 639–643.

Zheng Zhang (S’00) received the B.E. degree with honors from Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 1997, and the M.S. degree from Tsinghua University, Beijing, China, in 2000, both in electronic engineering. Currently, he is working toward the Ph.D. degree in electrical engineering at Arizona State University, Tempe. His current research interests are in digital communications, wireless and mobile communications, information theory, channel capacity, channel coding, turbo codes, LDPC codes, MIMO systems, and relay channels.

Tolga M. Duman (S’96–M’98–SM’03) received the B.S. degree from Bilkent University, Ankara, Turkey, in 1993, and the M.S. and Ph.D. degrees from Northeastern University, Boston, MA, in 1995 and 1998, respectively, all in electrical engineering.
Since August 1998, he has been with the Electrical Engineering Department, Arizona State University, Tempe, first as an Assistant Professor (1998–2004), and currently as an Associate Professor. His current research interests are in digital communications, wireless and mobile communications, channel coding, turbo codes, coding for recording channels, and coding for wireless communications. Dr. Duman is the recipient of the National Science Foundation CAREER Award, the IEEE Third Millennium medal, and the IEEE Benelux Joint Chapter best paper award (1999). He is an editor for the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS. Erozan M. Kurtas (M’98) received the B.Sc. degree from Bilkent University, Ankara, Turkey, in 1991, and the M.Sc. and Ph.D. degrees from Northeastern University, Boston, MA, in 1993 and 1997, respectively. He is currently the Research Director of the Channels Department in the Research Division of Seagate Technology, Pittsburgh, PA. His research interests cover the general field of digital communication and information theory with special emphasis on coding and detection for intersymbol interference channels. He has published over 75 book chapters, and journal and conference papers on the general fields of information theory, digital communications, and data storage. He is the co-editor of the book Coding and Signal Processing for Magnetic Recording System (Boca Raton, FL: CRC Press, 2004). He has 11 pending patent applications.