A New Approach to Layered Space-Time Coding and Signal Processing

The information-theoretic capacity of multiple antenna systems has been shown to be significantly higher than that of single antenna systems in Rayleigh-fading channels. In an attempt to realize this capacity, Foschini (1996) proposed the layered space-time architecture. This scheme was argued to asymptotically achieve a lower bound on the capacity. Another line of work has focused on the design of channel codes that exploit the spatial diversity provided by multiple transmit antennas (Tarokh et al. 1998, Hammons and El Gamal 2000). In this paper, we take a fresh look at the problem of designing multiple-input-multiple-output (MIMO) wireless systems. First, we develop a generalized framework for the design of layered space-time systems. Then, we present a novel layered architecture that combines efficient algebraic code design with iterative signal processing techniques. This novel layered system is referred to as the threaded space-time (TST) architecture. The TST architecture provides more flexibility in the tradeoff between power efficiency, bandwidth efficiency, and receiver complexity. It also allows for exploiting the temporal diversity provided by time-varying fading channels. Simulation results are provided for the various techniques that demonstrate the superiority of the proposed TST architecture over both the diagonal layered space-time architecture in Foschini (1996) and the multilayering approach (Tarokh et al. 1999).
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 6, SEPTEMBER 2001 2321
A New Approach to Layered Space–Time Coding
and Signal Processing
Hesham El Gamal, Member, IEEE, and A. Roger Hammons, Jr., Member, IEEE
Abstract—The information-theoretic capacity of multiple an-
tenna systems was shown to be significantly higher than that of
single antenna systems in Rayleigh-fading channels. In an attempt
to realize this capacity, Foschini proposed the layered space–time
architecture. This scheme was argued to asymptotically achieve a
lower bound on the capacity. Another line of work has focused on
the design of channel codes that exploit the spatial diversity pro-
vided by multiple transmit antennas [2], [3].
In this paper, we take a fresh look at the problem of designing
multiple-input–multiple-output (MIMO) wireless systems. First,
we develop a generalized framework for the design of layered
space–time systems. Then, we present a novel layered architecture
that combines efficient algebraic code design with iterative signal
processing techniques. This novel layered system is referred to as
the threaded space–time (TST) architecture. The TST architecture
provides more flexibility in the tradeoff between power efficiency,
bandwidth efficiency, and receiver complexity. It also allows
for exploiting the temporal diversity provided by time-varying
fading channels. Simulation results are provided for the various
techniques that demonstrate the superiority of the proposed TST
architecture over both the diagonal layered space–time architec-
ture in [1] and the recently proposed multilayering approach [4].
Index Terms—Array processing, fading channels, multiple
transmit and receive antennas, multiuser detection, space–time
coding.
I. INTRODUCTION
RECENTLY, information-theoretic studies have shown that
spatial diversity provided by multiple transmit and/or re-
ceive antennas allows for a significant increase in the capacity
of coherent wireless communication systems operated in a flat
Rayleigh-fading environment [5]–[7]. Following this discovery,
two approaches for exploiting this spatial diversity have been
proposed [2], [8], [3], [1]. In the first approach [2], [3], channel
coding is performed across the spatial dimension as well as time
to benefit from the spatial diversity provided by using multiple
transmit antennas. Tarokh et al. coined the term “space–time
coding” for this coding scheme. One potential drawback of this
scheme is that the complexity of the maximum-likelihood (ML)
decoder is exponential in the number of transmit antennas.
Manuscript received November 13, 1999; revised December 1, 2000. The material
in this paper was presented in part at the Thirty-Seventh Annual Allerton
Conference on Communication, Control, and Computing, University of Illinois
at Urbana-Champaign, Urbana, IL, September 22–24, 1999.
H. El Gamal was with the Advanced Development Group, Hughes Network
Systems, Germantown, MD 20876 USA. He is now with the Electrical Engi-
neering Department, Ohio State University, Columbus, OH 43210 USA (e-mail:
helgamal@ee.eng.ohio-state.edu).
A. R. Hammons, Jr. was with the Advanced Development Group, Hughes
Network Systems, Germantown, MD 20876 USA. He is now with Corvis Corporation,
Columbia, MD 21046 USA (e-mail: rhammons@corvis.com).
Communicated by M. L. Honig, Associate Editor for Communications.
Publisher Item Identifier S 0018-9448(01)06214-9.
The second approach, proposed by Foschini [1], relies
upon suboptimal signal processing techniques at the receiver
to achieve performance asymptotically close to the outage
capacity with reasonable complexity. In this approach, no effort
is made to optimize the channel coding scheme. The preferred
approach in [1] is referred to as the Diagonal Bell Laboratories
Layered Space–Time (D-BLAST) architecture. One of the
contributions of this paper is a new scheme that combines ideas
from these two approaches. Specifically, we present a new
layered space–time transmission architecture—the threaded
space–time (TST) architecture—that benefits from the advan-
tages provided by efficient algebraic code design and advanced
iterative signal processing [9]–[11].
Recently, Tarokh et al. proposed a new scheme for combined
array processing and space–time coding [4] that likewise ad-
dresses some of the problems encountered with D-BLAST. This
approach relies upon a zero-forcing group interference suppres-
sion technique and shows performance that is 6–9 dB from the
outage capacity at 10% frame error rate [4]. The threaded ar-
chitecture and signal processing proposed in this paper, how-
ever, close the gap to less than 3 dB from the outage capacity
with the same frame length, error rate, and receiver complexity.
It also provides greater flexibility in terms of the tradeoff between
power efficiency, bandwidth efficiency, and receiver complexity.
The rest of this paper is organized as follows. The system
description and a brief review of previous work on the de-
sign of space–time modems are presented in Section II. In
Section III, we present a novel approach for the design of
layered space–time systems. This approach combines iterative
multiuser detection and decoding with algebraic space–time
coding. Algebraic space–time code constructions for the new
architecture are given in Section III-A1). In Section III-A2),
the turbo processing principle is utilized to develop an iterative
minimum mean-square error (MMSE) receiver. Comparisons
of the various layered architectures in terms of efficiency
and achievable diversity order are presented in Section IV,
while simulation results are compared in Section V. Finally,
Section VI presents our conclusions.
II. OVERVIEW OF SPACE–TIME CONCEPTS
In this section, we lay out the basic concepts for space–time
code design and signal processing. The key ideas involved in
space–time coding for coherent channels [2], [8], [3], layered
space–time processing [1], and a recently proposed hybrid
multilayered approach [4] are briefly explained. This overview
serves to establish our perspective and notation in the context
of the prior body of work.
Fig. 1. Multiple antenna communication system.
A. Signal Model
We consider a multiple-antenna communication system with $n$ transmit and $m$ receive antennas as shown in Fig. 1. In this paper, we are interested in the scenario where the fading channel is frequency nonselective and channel state information is only available at the receiver [3], [2], [8]. In Fig. 1, the channel encoder accepts input from the information source and outputs a coded stream of higher redundancy suitable for error correction processing at the receiver. The encoded output stream is modulated and distributed among the $n$ antennas. The transmissions from each of the $n$ transmit antennas are simultaneous and synchronous. The signal received at each antenna is therefore a superposition of the $n$ transmitted signals corrupted by additive white Gaussian noise (AWGN) and multiplicative fading. At the receiver end, the signal $y_t^j$ received by antenna $j$ at time $t$ is given by

$$y_t^j = \sqrt{E_s}\,\sum_{i=1}^{n} \alpha_t^{ij}\, c_t^i + n_t^j \tag{1}$$

where $E_s$ is the energy per transmitted symbol, $\alpha_t^{ij}$ is the complex path gain from transmit antenna $i$ to receive antenna $j$ at time $t$, $c_t^i$ is the symbol transmitted from antenna $i$ at time $t$, and $n_t^j$ is the AWGN sample for receive antenna $j$ at time $t$. The noise samples are independent samples of circularly symmetric zero-mean complex Gaussian random variables with variance $N_0/2$ per dimension. At each time $t$, the different path gains $\alpha_t^{ij}$ are assumed to be statistically independent. The fading model of primary interest is that of a block flat Rayleigh-fading process in which the codeword encompasses $B$ fading blocks. The complex fading gains are constant over one fading block but are independent from block to block. The quasi-static fading model studied extensively in [8], [2], [3], [7] is a special case of the block fading model in which $B = 1$.

The received signal can be expressed in vector notation as

$$\mathbf{y}_t = \sqrt{E_s}\,\mathbf{H}_t \mathbf{c}_t + \mathbf{n}_t \tag{2}$$

where $\mathbf{y}_t$ is the $m \times 1$ received vector at time $t$, $\mathbf{H}_t$ is the $m \times n$ complex channel matrix whose $i$th column corresponds to the path gains for the $i$th transmit antenna, $\mathbf{c}_t$ is the $n \times 1$ transmitted vector at time $t$, and $\mathbf{n}_t$ is the $m \times 1$ white Gaussian noise vector.
B. Space–Time Channel Codes
In the concept of a space–time code, the channel encoding, modulation, and distribution of symbols across antennas are intrinsically connected, i.e., a two-dimensional (2-D) coded modulation technique. Given a set $\mathcal{X}$, the space of length-$N$ row vectors and the space of $M \times N$ matrices taking values in $\mathcal{X}$ will be denoted by $\mathcal{X}^N$ and $\mathcal{X}^{M \times N}$, respectively. Then, a block code $C$ of length $N$ over the discrete symbol alphabet $\mathcal{Y}$ is a subset of the $N$-dimensional space $\mathcal{Y}^N$. Usually, the number of codewords in $C$ is a power of the alphabet size, $|C| = |\mathcal{Y}|^k$, so that there is a one-to-one mapping, $\gamma$: $\mathcal{Y}^k \to C$, of information $k$-tuples onto codewords. The mapping $\gamma$ is an encoder for $C$. In this paper, we will be primarily interested in the case in which $C$ is a binary linear code, i.e., $\mathcal{Y}$ is the elementary binary field $\mathbb{F} = \mathrm{GF}(2)$ and $\gamma$ is linear.

The baseband modulation mapping $\mu$: $\mathcal{Y}^b \to \Omega$ assigns to each $b$-tuple of alphabet symbols a unique point in the discrete, complex-valued signaling constellation $\Omega$, which is assumed not to contain the point zero. Conversely, the inverse map $\mu^{-1}$ provides a $b$-symbol labeling of the constellation points. By extension, $\mu(\mathbf{v})$ denotes the modulated version of the vector $\mathbf{v} \in \mathcal{Y}^N$. In this case, it is understood that $N$ must be a multiple of $b$ and that the blocking of symbols into $b$-tuples for the modulator is performed left to right.

Let $\Omega^* = \Omega \cup \{0\}$ denote the expanded constellation. Then, the spatial modulator is a mapping $\mathsf{f}$: $\Omega^{N/b} \to (\Omega^*)^{n \times \ell}$ that sends the vector $\mathbf{u} = \mu(\mathbf{v})$ to an $n \times \ell$ complex-valued matrix $\mathbf{c} = \mathsf{f}(\mathbf{u})$, whose nonzero entries are a rearrangement of the entries of $\mathbf{u}$. Specifically, $\mathbf{c}$ is the baseband version of the codeword as transmitted across the channel. Thus, in the notation of (1), the matrix $\mathbf{c}$ has $(i, t)$th entry equal to $c_t^i$. Note that, in this formulation, it is expressly allowed that a complex zero (i.e., no transmission) be assigned to a given antenna at a given signaling interval; thus, $N/b \le n\ell$. This provision is intended to simplify the generalized layering framework outlined in Section III. We will refer to $n$ and $\ell$, respectively, as the spatial span and temporal span of $\mathsf{f}$.

Finally, for convenience, we also associate with $\mathbf{c}$ the $n \times b\ell$ matrix in which each constellation point is replaced by its $b$-symbol label and any zero entry is replaced by a $b$-tuple of special blank symbols.
Fig. 2. Layering and signal processing for D-BLAST.
Definition 1: A space–time code $\mathcal{C}$ consists of an underlying
channel code $C$ together with the spatial modulator function $\mathsf{f}$.
The fundamental performance parameters [8], [2] for space–
time codes are the following: 1) diversity advantage, which
describes the exponential decrease of decoded error rate versus
signal-to-noise ratio (SNR) (asymptotic slope of the perfor-
mance curve on a log–log scale); and 2) coding advantage
which does not affect the asymptotic slope but results in a shift
in the performance curve. The diversity advantage is the more
critical of the two performance metrics as it determines the
asymptotic slope of the performance curve. Ideally, the coding
advantage should be optimized after the diversity advantage is
maximized [2], [8], [3].
For quasi-static fading channels, it has been shown [2], [8]
that the spatial diversity advantage of the code, assuming ML
decoding, is the product of the number of receive antennas and
the minimum rank among the set of complex-valued matrices
associated with differences between baseband-modulated codewords.
It is clear that full spatial diversity $nm$ will be achieved
if and only if all the difference matrices have full rank.
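The rank criterion above can be checked numerically for small codebooks. The sketch below is a minimal illustration, assuming codewords are given directly as baseband matrices (one row per transmit antenna, one column per symbol interval); the helper names and the two toy BPSK codebooks are hypothetical and chosen only to show a full-rank and a rank-deficient case.

```python
from itertools import combinations


def complex_rank(rows, tol=1e-9):
    """Rank of a complex matrix via Gaussian elimination with partial pivoting."""
    a = [[complex(x) for x in r] for r in rows]
    rank = 0
    ncols = len(a[0]) if a else 0
    for col in range(ncols):
        pivot = max(range(rank, len(a)), key=lambda r: abs(a[r][col]), default=None)
        if pivot is None or abs(a[pivot][col]) <= tol:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, len(a)):
            factor = a[r][col] / a[rank][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[rank])]
        rank += 1
    return rank


def spatial_diversity(codewords, m):
    """m times the minimum rank over all pairwise codeword differences."""
    min_rank = min(
        complex_rank([[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)])
        for A, B in combinations(codewords, 2)
    )
    return m * min_rank


bits = [(x1, x2) for x1 in (1, -1) for x2 in (1, -1)]
# Toy delay-style code: antenna 2 repeats antenna 1 one interval later.
delay_code = [[[x1, x2, 0], [0, x1, x2]] for x1, x2 in bits]
# Toy repetition code: both antennas send the same symbols simultaneously.
repetition_code = [[[x1, x2], [x1, x2]] for x1, x2 in bits]
```

With two receive antennas, the delay-style codebook reaches the full value $nm = 4$, whereas the repetition codebook has rank-one differences and stops at 2.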
In [3], we developed an algebraic framework for systematic
design of binary phase-shift keying (BPSK) and quaternary
phase-shift keying (QPSK) space–time codes that achieve
full spatial diversity. This framework will be utilized in
Section III-A1) to design algebraic space–time codes for the
layered scenario.
C. Layered and Multilayered Space–Time Architectures
In the layered space–time architecture, the channel encoder of
Fig. 1 is composite and the multiple, independent coded streams
are distributed throughout the transmission resource array in
so-called layers. The primary design objective is to design the
layering architecture and associated signal processing so that
the receiver can efficiently separate the individual layers from
one another and can decode each of the layers effectively. Fos-
chini [1] discusses different layering schemes for the proposed
BLAST architecture. In the simplest variation, the code words
are transmitted in horizontal layers (H-BLAST). The preferred
scheme, however, involves the transmission of code words in diagonal
layers (D-BLAST).
The BLAST receiver uses a multiuser detection strategy
based on a combination of interference cancellation and
suppression. In D-BLAST, each diagonal layer constitutes a
complete codeword, so decoding is performed layer-by-layer.
Consider the codeword matrix shown in Fig. 2, in which the entries
below the first diagonal layer are zeros. To decode the first
diagonal, the receiver generates a soft-decision statistic for
each entry in that diagonal. In doing so, the interference from
the upper diagonals is suppressed by projecting the received
signal onto the null space of the upper interference. The soft
statistics are then used by the corresponding channel decoder
to decode this diagonal. The decoder output is then fed back
to cancel the first diagonal contribution in the interference
while decoding the next diagonal. The receiver then proceeds
to decode the next diagonal in the same manner. It is worth
noting that the zero-forcing suppression strategy requires that
$m \ge n$; however, this requirement can be relaxed by using
MMSE filtering instead of the zero-forcing strategy.
The multilayered space–time architecture, as introduced
by Tarokh et al. [4], is a hybrid approach involving use of
both space–time channel codes and layered processing. In this
scheme, the input stream is divided, for example, into
sub-
streams. The different substreams are encoded using
-level
diversity component space–time trellis codes
.
Each component code is then transmitted from
antennas
(horizontal
-layering). At the receiver, each component code
is decoded separately while suppressing signals from other
component codes. The group interference suppression strategy
[4] is based on the zero-forcing principle and requires that
. In quasi-static fading channel, the spatial
diversity gain achieved by
is . The
decoded output from
is subtracted from signals at different
receive antennas. This gives a communication system with
transmit and receive antennas. Hence, assuming
correct decoding of
, the space–time code affords a
diversity gain of
, and so on. Using the
fact that the diversity gain increases with each decoding stage,
unequal power levels are allocated to the different component
codes. Because all the space–time codes proposed in [2] were
two-level diversity codes, except for the delay diversity, the
design examples in [4] were limited to
.
III. GENERALIZED SPACE–TIME LAYERING
The different layering and multilayering approaches available
in the literature were partly inspired by the signal processing
techniques employed at the receiver. For example, in the
D-BLAST approach, each layer is constrained to occupy a diag-
onal in the 2-D transmission resources array. It is easy to see that
this constraint is imposed by the interference cancellation/sup-
pression technique proposed in [1]. In this paper, we follow a
different path. First, we generalize the notion of space–time layering
independent of the signal processing employed at the receiver.
Based on this generalized notion, we recognize a certain
type of layering, TST layering, that efficiently exploits the diversity
available in the system. Then, we consider the design of
algebraic space–time codes and iterative signal processing techniques
that optimize the performance of TST systems.
In our framework, a layer is defined as a section of the trans-
mission resources array having the property that each symbol
interval within the section is allocated to at most one antenna.
This property ensures that all spatial interference experienced
by the layer comes from outside the layer.
Formally, a layer in an $n \times \ell$ transmission resource array may
be identified by an indexing set $L$ of antenna-time pairs having the property
that the $t$th symbol interval on antenna $a$ belongs to the layer
if and only if $(a, t) \in L$. Then, our formal notion of a layer
requires that, if $(a, t) \in L$ and $(a', t') \in L$, then either
$t \ne t'$ or $a = a'$ (i.e., that $a$ is a function of $t$). The pair of spatial and
temporal spans of a layer is defined as
where . This pair represents the ability of the layer to
exploit the available spatial and temporal diversity, and hence,
it is desirable to develop a layering approach in which all layers
have full spatial and temporal spans $(n, \ell)$.
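The layer property is easy to test mechanically. The sketch below assumes a layer is given as a set of (antenna, time) pairs and, as a working assumption for illustration only, measures the spatial and temporal spans as the numbers of distinct antennas and distinct symbol intervals that the layer touches. Function names are hypothetical.

```python
def is_layer(index_set):
    """True if no symbol interval t is assigned to more than one antenna."""
    antenna_at = {}
    for a, t in index_set:
        if antenna_at.setdefault(t, a) != a:
            return False
    return True


def spans(index_set):
    """(spatial span, temporal span), counted as distinct antennas and intervals.

    Assumption: this counting interpretation is used for illustration only.
    """
    antennas = {a for a, _ in index_set}
    intervals = {t for _, t in index_set}
    return len(antennas), len(intervals)


diagonal_piece = {(1, 0), (2, 1), (1, 2)}   # one antenna per interval: a layer
two_at_once = {(1, 0), (2, 0)}              # two antennas at t = 0: not a layer
```

A set such as `two_at_once` would experience spatial interference from within itself, which is exactly what the definition rules out.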
Consider a composite channel encoder
consisting of con-
stituent encoders
operating on independent in-
formation streams. Let
: , so that
and
Then, there is a partitioning of the com-
posite information vector
into a set of disjoint compo-
nent vectors
, of length , and a corresponding partitioning
of the composite codeword into a set of constituent code-
words
, of length . In the generalized layering archi-
tecture approach, the space–time transmitter assigns each of the
constituent codewords
to one of the set of disjoint
layers. For simplicity, we consider the case in which the con-
stituent codes are all of the same rate and have the same code-
word length:
and for all .
There is a corresponding decomposition of the spatial modu-
lating function that is induced by the layering. Let
:
denote the component spatial modulating function, as-
sociated with layer
, which agrees with the composite spa-
tial modulator
regarding the modulation and formatting of the
layer elements but which sets all off-layer elements to complex
zero. Then
It is straightforward to see that the layered architectures proposed
by Foschini in [1] are special cases of this generalized layering.
For example, in the D-BLAST architecture [1], the output
of each encoder is distributed among the
antennas along the
diagonal layers such that
(3)
where
is the width of the diagonal,
is the temporal span, and denotes the function returning the
integer part of a real-valued input reduced modulo
.
A. TST Layering
In this section, we present a new space–time layering de-
sign that efficiently exploits the diversity available in the mul-
tiple-input–multiple-output (MIMO) channel. In the proposed
approach, the encoding, interleaving, and distribution of each
layer's symbols among different antennas are optimized to maximize
spatial and temporal diversity for a given transmission
rate, assuming no interference from the other layers. Mean-
while, interleaving is also optimized to maximize the efficiency
of the iterative signal processing techniques necessary to sup-
press other layers’ interference as described in Section III-A2).
It is worth noting that the threaded approach is applicable to ar-
bitrary constellations with binary (or nonbinary) codes.
As in the generalized layering architecture, the transmitter has
available a disjoint set of layers
and
transmits the composite codeword
by sending in layer .
The layer set
is designed so that each layer is active during
all of the available symbol transmission intervals and, over time,
uses each of the
antennas equally often. Thus, during each
symbol transmission interval, each layer transmits a symbol
using a different antenna; and, in terms of antenna usage, all
of the layers are equivalent. Unlike the layered architectures
of [1], the new design approach treats the coded transmission
in each layer as a bona fide space–time code, constructions for
which are given in the next section. Looking at the space–time
coding performed on a single layer in isolation, one notes that
the main limitation of this construction is the reduction in
throughput resulting from the silence periods imposed on the
different antennas. But, in the overall transmission scheme, the
silent periods on antennas not used by a given layer are filled
with the transmissions from the other component space–time
codes. Iterative signal processing at the receiver, necessary to
remove or suppress spatial interference among the layers, is
discussed in Section III-A2). One innovation of the new archi-
tecture is that, under the assumption of error-free interference
cancellation, the component space–time codes can be designed
to achieve full spatial diversity without degradation in overall
system throughput.
The new space–time architecture is not a multilayered ap-
proach since the transmit positions occupied by the modulated
code symbols for a particular codeword constitute a single layer.
Yet, the new architecture is not a layered architecture in the
same sense as the BLAST architecture, since the layering is
more general, well-suited for iterative multiuser techniques, and
the channel coding design in the new approach is 2-D based on
Fig. 3. A simple example for threaded layering (each shade represents a
thread).
space–time coding principles designed to exploit both the spa-
tial and temporal diversity. To distinguish this new approach,
we refer to it as the threaded space–time (TST) architecture, and
each layer in the new architecture is referred to as a thread. A
thread can be defined as a layer with full spatial span $n$ and
full temporal span $\ell$. The simplest example of threaded layering
is the set
shown in Fig. 3 in which
(4)
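One simple way to build such threads is a cyclic shift: thread $i$ uses antenna $((t + i - 1) \bmod n) + 1$ during symbol interval $t$. The Python sketch below uses this rule as an assumption consistent with the description above (each thread active in every interval, all antennas used equally often); the function names and the 1-based antenna numbering are choices made for this example.

```python
def thread(i, n, length):
    """Index set of thread i (1-based): the antenna cycles with the interval t."""
    return {((t + i - 1) % n + 1, t) for t in range(length)}


def threaded_layering(n, length):
    """All n threads of an n x length transmission resource array."""
    return [thread(i, n, length) for i in range(1, n + 1)]


threads = threaded_layering(3, 6)

# The threads are pairwise disjoint and together fill the whole array.
cells = set().union(*threads)
array_is_tiled = (len(cells) == 3 * 6 and sum(len(th) for th in threads) == 3 * 6)

# Within a thread, every antenna is used the same number of times.
usage = [sum(1 for a, _ in threads[0] if a == antenna) for antenna in (1, 2, 3)]
```

Because the threads tile the array, the silent periods that one thread leaves on an antenna are exactly the positions used by the other threads.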
1) Design of TST Codes: Now, we look at the design of the
component space–time codes used in the threaded architecture.
The design of these codes follows the algebraic approach intro-
duced in [3]. The layering provided by the threaded architecture
allows the algebraic formulation to be extended to arbitrary sig-
naling constellations. Importantly, the requirement for indepen-
dent interleaving in the iterative multiuser receiver, discussed
in the following section, is easily accommodated in these code
designs. Our results are first developed for quasi-static fading
channels, then we outline the extension to time-varying block
fading channels.
Consider a single threaded layer
and the corresponding
component space–time code
associated with encoder . The
spatially modulated codewords of
are the complex matrices
. To simplify notation, we will drop the indexes,
letting
, , and . We will let denote the
component spatial modulator function associated with layer
.
Unsubscripted vectors such as
or will be used to refer to the
information stream.
For the design of the space–time code
associated with
thread
, we have the following stacking construction using
binary matrices for the quasi-static fading channel.
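Theorem 2 (Threaded Stacking Construction): Let $L$ be a threaded layer of spatial span $n$. Given binary matrices $M_1, M_2, \ldots, M_n$ of dimension $k \times b\ell/n$, let $C$ be the binary code of dimension $k$ consisting of all codewords of the form $g(\mathbf{x}) = \mathbf{x}M_1 \,|\, \mathbf{x}M_2 \,|\, \cdots \,|\, \mathbf{x}M_n$, where $\mathbf{x}$ denotes an arbitrary $k$-tuple of information bits. Let $\mathsf{f}_L$ denote the spatial modulator having the property that $\mu(\mathbf{x}M_j)$ is transmitted in the $\ell/n$ symbol intervals of $L$ that are assigned to antenna $j$. Then, as the space–time code in a communication system with $n$ transmit antennas and $m$ receive antennas, the space–time code $\mathcal{C}$ consisting of $C$ and $\mathsf{f}_L$ achieves spatial diversity $dm$ in a quasi-static fading channel if and only if $d$ is the largest integer such that $M_1, M_2, \ldots, M_n$ have the property that

$$\forall\, a_1, a_2, \ldots, a_n \in \mathbb{F},\quad a_1 + a_2 + \cdots + a_n = n - d + 1:\qquad M = [\,a_1 M_1 \;\; a_2 M_2 \;\; \cdots \;\; a_n M_n\,] \text{ is of rank } k \text{ over the binary field.}$$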
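Proof: Due to the lack of spatial interference within a layer, the baseband rank criterion [2], [8] is straightforward to apply. In particular, note that the baseband difference $\mathsf{f}_L(\mu(g(\mathbf{x}))) - \mathsf{f}_L(\mu(g(\mathbf{x}')))$ has rank $d$ if and only if it has precisely $d$ nonzero rows.

Now suppose that, for some $a_1, a_2, \ldots, a_n \in \mathbb{F}$ satisfying $a_1 + a_2 + \cdots + a_n = n - d + 1$, we have that $M = [\,a_1 M_1 \;\; a_2 M_2 \;\; \cdots \;\; a_n M_n\,]$ is singular. Then, there exist $\mathbf{x}, \mathbf{x}' \in \mathbb{F}^k$, $\mathbf{x} \ne \mathbf{x}'$, such that $\mathbf{x}M = \mathbf{x}'M$. In this case, $\mathsf{f}_L(\mu(g(\mathbf{x}))) - \mathsf{f}_L(\mu(g(\mathbf{x}')))$ has an all-zero row for every nonzero coefficient $a_i$. Since there are $n - d + 1$ nonzero coefficients, the difference has rank less than $d$. Thus, $\mathcal{C}$ does not achieve $d$-level diversity.

Conversely, suppose $\mathcal{C}$ does not achieve $d$-level diversity. Then, there exist $\mathbf{x}, \mathbf{x}' \in \mathbb{F}^k$, $\mathbf{x} \ne \mathbf{x}'$, such that the baseband difference $\mathsf{f}_L(\mu(g(\mathbf{x}))) - \mathsf{f}_L(\mu(g(\mathbf{x}')))$ has rank less than $d$. It must, therefore, have at least $n - d + 1$ all-zero rows. Let $J$ denote a set of indices for $n - d + 1$ such rows, and set $a_i = 1$ for $i \in J$ and $a_i = 0$ otherwise. Then, the matrix $M = [\,a_1 M_1 \;\; a_2 M_2 \;\; \cdots \;\; a_n M_n\,]$ is singular since $\mathbf{x}M = \mathbf{x}'M$.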
Corollary 3: Full spatial diversity $nm$ is achieved if and only if $M_1, M_2, \ldots, M_n$ are of rank $k$ over the binary field.

A space–time code that achieves $d$-level spatial diversity in a communication system with $n$ transmit and $m$ receive antennas over the quasi-static fading channel is called a $d$-space–time code.
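The stacking condition lends itself to a brute-force check for small parameters. The sketch below is one possible reading of it: for each candidate $d$, every selection of $n - d + 1$ of the binary matrices is placed side by side and its rank over GF(2) is compared with $k$. The function names and the list-of-rows matrix format are assumptions made for this illustration, and the exhaustive search over selections is only practical for small $n$.

```python
from itertools import combinations


def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(str(bit) for bit in row), 2) for row in rows if row]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank


def stacking_diversity(matrices):
    """Largest d such that every choice of n - d + 1 matrices, placed side by
    side, has rank k over GF(2). Returns 0 if even the full stack is singular."""
    n, k = len(matrices), len(matrices[0])
    for d in range(n, 0, -1):
        ok = all(
            gf2_rank([sum((matrices[j][r] for j in chosen), []) for r in range(k)]) == k
            for chosen in combinations(range(n), n - d + 1)
        )
        if ok:
            return d
    return 0


identity = [[1, 0], [0, 1]]
shear = [[1, 1], [0, 1]]
full = stacking_diversity([identity, shear])                       # both nonsingular
partial = stacking_diversity([[[1, 0], [0, 0]], [[0, 0], [0, 1]]])  # each rank 1
```

In the first example each matrix alone has rank $k = 2$, which is the full-diversity condition of Corollary 3; in the second, only the concatenation of both matrices reaches rank 2, so the construction yields $d = 1$.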
Corollary 4: The maximum transmission rate for a communication system using the threaded layering architecture with $n$ transmit antennas, a signaling constellation of size $2^b$, and component codes achieving $d$-level transmit spatial diversity is $b(n - d + 1)$ bits per second per hertz.

Proof: By Theorem 2, in order for the code to achieve $d$-level spatial diversity, the number of columns in $M$ must satisfy $(n - d + 1)\,b\ell/n \ge k$. Then the code rate for $C$ is $k/(b\ell) \le (n - d + 1)/n$. Therefore, the maximum transmission rate of each thread is $b(n - d + 1)/n$ bits per signaling interval. Then, the total transmission rate of the $n$ threads is $b(n - d + 1)$. A different proof can be obtained using the argument in [12] on the maximum lossless compression transmission rate.
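The rate and diversity tradeoff stated in Corollary 4 can be tabulated directly. The helper below is a small sketch of the bound $b(n - d + 1)$; the function name and the argument check are additions for this example.

```python
def max_threaded_rate(n, b, d):
    """Upper limit b * (n - d + 1), in bits/s/Hz, for n transmit antennas,
    a constellation of 2**b points, and d-level transmit diversity."""
    if not 1 <= d <= n:
        raise ValueError("diversity level d must satisfy 1 <= d <= n")
    return b * (n - d + 1)


# Four transmit antennas with QPSK (b = 2): rate limit for each diversity level.
tradeoff = {d: max_threaded_rate(4, 2, d) for d in range(1, 5)}
```

For this configuration, insisting on full transmit diversity ($d = 4$) limits the rate to $b = 2$ bits/s/Hz, and each level of diversity given up buys another $b$ bits/s/Hz.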
The following result is straightforward but quite important
for the design of space-time threaded codes that allow for max-
imizing the efficiency of the iterative multiuser detector as dis-
cussed in the next section.
Theorem 5: Let $\mathcal{C}$ be a $d$-space–time code consisting of the binary code $C$ whose codewords are of the form $g(\mathbf{x}) = \mathbf{x}M_1 \,|\, \mathbf{x}M_2 \,|\, \cdots \,|\, \mathbf{x}M_n$, where $\mathbf{x}$ denotes an arbitrary $k$-tuple of information bits, and the spatial modulator $\mathsf{f}_L$ in which $\mu(\mathbf{x}M_i)$ is assigned to antenna $i$ along threaded layer $L$. Given the linear vector-space transformations $T_i$: $\mathbb{F}^{b\ell/n} \to \mathbb{F}^{b\ell/n}$, we construct a new space–time code by assigning $\mu(T_i(\mathbf{x}M_i))$ to antenna $i$ along threaded layer $L$. Then, the new space–time code achieves the same spatial diversity order $d$ if $T_1, T_2, \ldots, T_n$ are nonsingular.

In particular, we can take the linear transformation $T_i$ of the previous theorem to be an arbitrary permutation $\pi_i$. Then, the interleaved space–time code resulting from assigning $\mu(\pi_i(\mathbf{x}M_i))$ to antenna $i$ along threaded layer $L$ achieves the same level of spatial diversity as the noninterleaved space–time code $\mathcal{C}$.
We now look at the special case of designing space–time
trellis codes for the threaded architecture. The main advantage
of such codes is the availability of computationally efficient,
soft-input/soft-output (SISO) decoding algorithms. The natural
space–time codes [3] associated with binary, rate $1/n$, convolutional
codes with periodic bit interleaving are attractive candidates
for the threaded space–time architecture as they can be
easily formatted to satisfy the threaded stacking construction.
Each output arm from the encoder is transmitted from a separate
antenna. There is no restriction on the interleaving employed
by each antenna (i.e., different interleaving can be used by the
different antennas without violating the threaded stacking con-
dition). As discussed earlier, this feature allows for the design
of efficient iterative multiuser receivers. These convolutional
codes were considered for a similar application (the block-era-
sure channel) in [12].
The prior literature on space–time trellis codes treats only the
case in which the underlying code has rate $1/n$ matched to the
number of transmit antennas. In our development of threaded
space–time code design, we consider the more general case in
which the convolutional code has rate greater than $1/n$. The
treatment includes the case of rate $k/n$ convolutional codes
constructed by puncturing an underlying rate $1/n$ convolutional
code.
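Let $C$ be a binary convolutional code of rate $k/n$. The encoder processes $k$ binary input sequences $x_1(t), x_2(t), \ldots, x_k(t)$ and produces $n$ coded output sequences $y_1(t), y_2(t), \ldots, y_n(t)$ which are multiplexed together to form the output codeword. For quasi-static fading channels, the input and output sequences of interest are of fixed finite length; in the more general case, however, the sequences are semi-infinite, indexed by $t \ge 0$. A sequence $\{x(t)\}$ in the space of all such binary sequences is often represented by the formal series

$$X(D) = x(0) + x(1)D + x(2)D^2 + \cdots$$

We refer to $\{x(t)\} \leftrightarrow X(D)$ as a $D$-transform pair. The space $\mathbb{F}[[D]]$ of all formal series is an integral domain whose invertible elements are those that are not multiples of $D$.

The action of the binary convolutional encoder is linear and is characterized by the so-called impulse responses $g_{i,j}(t) \leftrightarrow G_{i,j}(D)$ associating output $y_j(t)$ with input $x_i(t)$. Thus, the encoder action is summarized by the matrix equation

$$\mathbf{Y}(D) = \mathbf{X}(D)\,\mathbf{G}(D)$$

where $\mathbf{Y}(D) = [\,Y_1(D) \;\; Y_2(D) \;\; \cdots \;\; Y_n(D)\,]$, $\mathbf{X}(D) = [\,X_1(D) \;\; X_2(D) \;\; \cdots \;\; X_k(D)\,]$, and

$$\mathbf{G}(D) = \begin{pmatrix} G_{1,1}(D) & G_{1,2}(D) & \cdots & G_{1,n}(D) \\ G_{2,1}(D) & G_{2,2}(D) & \cdots & G_{2,n}(D) \\ \vdots & \vdots & \ddots & \vdots \\ G_{k,1}(D) & G_{k,2}(D) & \cdots & G_{k,n}(D) \end{pmatrix}.$$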
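We consider the natural space–time formatting of $C$ in which the output sequence corresponding to $Y_j(D)$ is assigned to the $j$th transmit antenna and wish to characterize the spatial diversity that can be achieved by this scheme. Our algebraic analysis technique considers the rank of matrices formed by concatenating the column vectors

$$\mathbf{F}_j(D) = \begin{pmatrix} G_{1,j}(D) \\ G_{2,j}(D) \\ \vdots \\ G_{k,j}(D) \end{pmatrix}.$$

Specifically, for $a_1, a_2, \ldots, a_n \in \mathbb{F}$, let

$$\mathbf{F}(a_1, a_2, \ldots, a_n) = [\,a_1\mathbf{F}_1 \;\; a_2\mathbf{F}_2 \;\; \cdots \;\; a_n\mathbf{F}_n\,].$$

Then we have the following theorem relating the spatial diversity of the space–time code $\mathcal{C}$ in the quasi-static fading channel to the rank of these matrices over $\mathbb{F}[[D]]$.

Theorem 6: Let $\mathcal{C}$ denote the threaded space–time code consisting of the binary convolutional code $C$, whose $k \times n$ transfer function matrix is $\mathbf{G}(D) = [\,\mathbf{F}_1(D) \;\; \mathbf{F}_2(D) \;\; \cdots \;\; \mathbf{F}_n(D)\,]$, and the spatial modulator $\mathsf{f}_L$ in which the output $Y_j(D) = \mathbf{X}(D)\,\mathbf{F}_j(D)$ is assigned to antenna $j$ along threaded layer $L$. Let $v$ be the smallest integer having the property that, whenever $a_1 + a_2 + \cdots + a_n = v$, the $k \times n$ matrix $\mathbf{F}(a_1, a_2, \ldots, a_n)$ has full rank $k$ over $\mathbb{F}[[D]]$. Then the space–time code $\mathcal{C}$ achieves $d$-level spatial transmit diversity over the quasi-static fading channel where $d = n - v + 1$ and $v \ge k$.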
Proof: All of the codewords of $\mathcal{C}$ are of the form $\mathbf{Y}(D) = \mathbf{X}(D)\,\mathbf{G}(D)$. Under the stipulated conditions of the theorem and following the argument of Theorem 2 (threaded stacking construction), only the all-zero codeword has $v$ or more all-zero rows, so the spatial transmit diversity of $\mathcal{C}$ is at least $n - v + 1$. On the other hand, since $v$ is the smallest integer having the stated property, there is some nonzero information sequence $\mathbf{X}(D)$ resulting in a codeword with $v - 1$ all-zero rows. Hence, the spatial transmit diversity of $\mathcal{C}$ is precisely $n - v + 1$.
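Rate $1/\nu$ convolutional codes with $\nu < n$ can also be put into this framework. Let $C$ be a binary convolutional code with transfer function matrix $\mathbf{G}(D) = [G_1(D)\ G_2(D)\ \cdots\ G_\nu(D)]$. The coded bits are to be distributed among $n$ transmit antennas. For simplicity, we consider the case in which $\mu = n/\nu$ is an integer and the coded bits are assigned to the antennas periodically. Thus, for each of the coded bit streams $Y_j(D) \leftrightarrow \{y_j(t)\}$, the subsequence $y_j(0), y_j(\mu), y_j(2\mu), \ldots$ is assigned to antenna $\mu(j-1)+1$; the subsequence $y_j(1), y_j(\mu+1), y_j(2\mu+1), \ldots$ is assigned to antenna $\mu(j-1)+2$; and so on. Alternate assignments such as symbol-based demultiplexing would also be possible and can be analyzed using the same framework.

In general, we partition the series $Y_j(D)$ corresponding to $\{y_j(t)\}$ into its modulo-$\mu$ components $Y_{j,i}(D)$, $i = 0, 1, \ldots, \mu-1$, corresponding to the subsequences $y_j(i), y_j(\mu+i), y_j(2\mu+i), \ldots$ Then
$$Y_j(D) = \sum_{i=0}^{\mu-1} D^i\, Y_{j,i}(D^\mu).$$
Similarly, we partition $X(D)$ into components $X_i(D)$ and $G_j(D)$ into components $G_{j,i}(D)$. The space–time code under consideration therefore consists of the binary code $C$ together with a spatial modulator function in which $Y_{j,i}(D)$ is assigned to antenna $\mu(j-1)+i+1$.

By multiplying the expansions for $X(D)$ and $G_j(D)$ and collecting terms, one may show that the coded bit stream assigned to antenna $\mu(j-1)+i+1$ is given by
$$Y_{j,i}(D) = \sum_{a=0}^{\mu-1} X_a(D)\, F_{j,i,a}(D)$$
where
$$F_{j,i,a}(D) = \begin{cases} G_{j,\,i-a}(D), & a \le i \\ D\, G_{j,\,\mu+i-a}(D), & a > i. \end{cases}$$
In matrix form, we have
$$Y_{j,i}(D) = [X_0(D)\ X_1(D)\ \cdots\ X_{\mu-1}(D)]\; F_{j,i}(D)$$
which is the dot product of row vector $[X_0(D)\ X_1(D)\ \cdots\ X_{\mu-1}(D)]$ and column vector
$$F_{j,i}(D) = [F_{j,i,0}(D),\ F_{j,i,1}(D),\ \ldots,\ F_{j,i,\mu-1}(D)]^T.$$

The theorem now applies directly, with the $n$ column vectors $F_{j,i}(D)$ playing the role of $F_1(D), \ldots, F_n(D)$. The spatial transmit diversity achieved by $\mathcal{C}$ is given by $d = n - v + 1$, where $v$ is the smallest integer having the property that, whenever $a_1 + a_2 + \cdots + a_n \ge v$, the $\mu \times n$ matrix $F(a_1, \ldots, a_n)(D)$ has full rank $\mu$. In particular, we note that the best possible spatial transmit diversity is $n - \mu + 1$. When $\mu = 1$, the bound becomes $d \le n$, so that full spatial transmit diversity is possible, as expected.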
Example: Consider the four-state convolutional code with optimal $d_{\mathrm{free}} = 5$ and generators $G_1(D) = 1 + D^2$ and $G_2(D) = 1 + D + D^2$. In the case of two transmit antennas, it is clear that the natural threaded space–time code achieves $d = 2$ level diversity.

In the case of four transmit antennas, we note that the rate-$1/2$ code can be written as a rate-$2/4$ convolutional code with generator matrix
$$\mathbf{G}(D) = \begin{bmatrix} 1+D & 0 & 1+D & 1 \\ 0 & 1+D & D & 1+D \end{bmatrix}.$$
By inspection, every pair of columns is linearly independent over $\mathbb{F}[[D]]$. Hence, the natural periodic distribution of the code across four transmit antennas produces a threaded space–time code achieving the maximum $d = 3$ transmit spatial diversity.

For six transmit antennas, we express the code as a rate-$3/6$ code with generator matrix
$$\mathbf{G}(D) = \begin{bmatrix} 1 & 0 & 1 & 1 & 1 & 1 \\ D & 1 & 0 & D & 1 & 1 \\ 0 & D & 1 & D & D & 1 \end{bmatrix}.$$
Every set of three columns in the generator matrix has full rank over $\mathbb{F}[[D]]$, so the natural space–time code achieves maximum $d = 4$ transmit diversity.
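The column-independence claims of the example can be checked mechanically. The following is a minimal sketch, not the authors' procedure: it assumes the standard four-state generators $1 + D^2$ and $1 + D + D^2$ (octal 5 and 7), stores binary polynomials as Python integers (bit $i$ is the coefficient of $D^i$), and orders the columns generator by generator, phase by phase. The helper names `phase`, `polyphase_columns`, and `det_gf2` are hypothetical.

```python
from itertools import combinations, permutations


def gf2_mul(a, b):
    """Multiply two GF(2) polynomials stored as integer bitmasks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def phase(g, b, mu):
    """Modulo-mu component of g: coefficients g_b, g_{b+mu}, g_{b+2mu}, ..."""
    out, i = 0, 0
    g >>= b
    while g:
        out |= (g & 1) << i
        g >>= mu
        i += 1
    return out


def polyphase_columns(generators, mu):
    """Columns of the rate-mu/(mu*nu) matrix for a rate-1/nu code.

    Entry a of the column for output phase c is G_{c-a} when a <= c and
    D * G_{c-a+mu} otherwise (the wrap-around picks up one delay).
    """
    cols = []
    for g in generators:
        comps = [phase(g, b, mu) for b in range(mu)]
        for c in range(mu):
            cols.append(tuple(
                comps[c - a] if c >= a else comps[c - a + mu] << 1
                for a in range(mu)
            ))
    return cols


def det_gf2(cols):
    """Determinant over GF(2)[D] by Leibniz expansion (signs vanish mod 2)."""
    total = 0
    for perm in permutations(range(len(cols))):
        term = 1
        for row, col in enumerate(perm):
            term = gf2_mul(term, cols[col][row])
        total ^= term
    return total


def all_subsets_full_rank(cols):
    """True if every choice of mu columns is linearly independent."""
    k = len(cols[0])
    return all(det_gf2(sub) != 0 for sub in combinations(cols, k))


if __name__ == "__main__":
    for mu in (1, 2, 3):
        cols = polyphase_columns([0b101, 0b111], mu)
        n = len(cols)
        if all_subsets_full_rank(cols):
            print(f"n = {n} antennas: v = {mu}, d = n - v + 1 = {n - mu + 1}")
```

When every set of $\mu$ columns is independent, $v = \mu$ and the diversity $n - v + 1$ meets the best possible value for that number of antennas.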
Thus far, we have considered the design of threaded space–time codes that exploit the spatial diversity over quasi-static fading channels. However, one of the advantages of the threaded architecture is its ability to jointly exploit the spatial diversity provided by the multiple transmit and receive antennas, and the temporal diversity provided by the time variations in the block fading channel. In fact, the results obtained for threaded space–time code design for the quasi-static fading channel can be easily extended to the more general block fading channel.

In the absence of interference from other threads, the quasi-static fading channel under consideration may be viewed as a block fading channel with receive diversity, where each fading block is represented by a different antenna. For the threaded architecture with $n$ transmit antennas and a quasi-static fading channel, there are $n$ independent and noninterfering fading links per codeword that can be exploited for transmit diversity by proper code design. In the case of the block fading channel, there is a total of $nB$ such links, where $B$ is the number of independent fading blocks per codeword per antenna. Thus, the problem of block fading code design for the threaded architecture is addressed by simply replacing the parameter $n$ by $nB$.
For example, the following “multistacking construction” is a
direct generalization of Theorem 2 to the case of a block fading
channel.
Fig. 4. Iterative multiuser detector for space–time signals.

Theorem 7 (Threaded Multistacking Construction): Let $L$ be a threaded layer of spatial span $n$. Given binary matrices $M_{1,1}, M_{2,1}, \ldots, M_{n,B}$ of dimension $k \times \ell/(nB)$, let $C$ be the binary code of dimension $k$ consisting of all codewords of the form
$$g(\mathbf{x}) = \mathbf{x}M_{1,1} \,|\, \mathbf{x}M_{2,1} \,|\, \cdots \,|\, \mathbf{x}M_{n,B}$$
where $\mathbf{x}$ denotes an arbitrary $k$-tuple of information bits, and $B$ is the number of independent fading blocks spanning one codeword. Let $f_L$ denote the spatial modulator having the property that $\mathbf{x}M_{j,b}$ is transmitted in the symbol intervals of $L$ that are assigned to antenna $j$ in the fading block $b$.

Then, as the space–time code in a communication system with $n$ transmit antennas and $m$ receive antennas, the space–time code $\mathcal{C}$ consisting of $C$ and $f_L$ achieves diversity $dm$ in a $B$-block fading channel if and only if $d$ is the largest integer such that $M_{1,1}, M_{2,1}, \ldots, M_{n,B}$ have the property that
$$\forall\, a_{1,1}, \ldots, a_{n,B} \in \{0,1\},\quad a_{1,1} + \cdots + a_{n,B} = nB - d + 1:$$
$$M = [a_{1,1}M_{1,1}\ \ a_{2,1}M_{2,1}\ \cdots\ a_{n,B}M_{n,B}]\ \text{is of rank } k \text{ over the binary field.}$$
Proof: This result is immediate from the equivalent quasi-static model with $nB$ transmit antennas.
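The multistacking rank condition lends itself to a direct numerical check. The sketch below is illustrative only. It assumes the criterion reads "every selection of $q - d + 1$ of the $q = nB$ matrices, placed side by side, has rank $k$ over the binary field", and it compares that against a brute-force count of the (antenna, block) segments on which a nonzero codeword is nonzero. The function names and the small test matrices are hypothetical, not taken from the paper.

```python
from itertools import combinations, product


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit used as the pivot column
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def concat_rows(mats):
    """Place k-row binary matrices side by side; return rows as bitmasks."""
    k = len(mats[0])
    return [
        int("".join(str(bit) for m in mats for bit in m[r]), 2)
        for r in range(k)
    ]


def diversity_by_rank(mats):
    """Largest d such that any q - d + 1 matrices, concatenated, have rank k."""
    k, q = len(mats[0]), len(mats)
    for s in range(1, q + 1):
        if all(gf2_rank(concat_rows(sub)) == k
               for sub in combinations(mats, s)):
            return q - s + 1
    return 0  # the encoder is not even one-to-one


def diversity_by_search(mats):
    """Minimum number of nonzero segments x*M over all nonzero inputs x."""
    k = len(mats[0])
    best = len(mats)
    for x in product((0, 1), repeat=k):
        if any(x):
            nonzero = sum(
                any(sum(x[r] * m[r][c] for r in range(k)) % 2
                    for c in range(len(m[0])))
                for m in mats
            )
            best = min(best, nonzero)
    return best


if __name__ == "__main__":
    I2 = [[1, 0], [0, 1]]
    A = [[0, 1], [1, 1]]
    A2 = [[1, 1], [1, 0]]
    print(diversity_by_rank([I2, A, A2]), diversity_by_search([I2, A, A2]))
```

With $k = 2$ and three nonsingular matrices whose nonzero sums are also nonsingular, both functions report the full value $d = 3$; replacing one matrix by a singular one lowers $d$.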
2) Iterative Signal Processing for TST Layering: In the previous section, we have considered the problem of designing TST systems assuming that a genie were to cancel the other layers’ interference at the receiver. Ultimately, the performance of threaded systems will hinge upon the efficiency of the signal processing at the receiver in separating the signals from different threads. The problem of space–time signal processing can be formulated as a joint multiuser detection and decoding problem. Hence, the turbo processing principle [13] can be efficiently used to develop a set of iterative multiuser detection algorithms that allow tradeoffs between performance and complexity. A block diagram of the iterative receiver is shown in Fig. 4. In this block diagram, a SISO multiuser detector module provides soft-decision estimates of the $n$ streams of data. The detected streams are decoded by the separate SISO channel decoders associated with the component channel codes. After each decoding iteration, the soft outputs from the channel decoders are used to refine the processing performed by the SISO multiuser detector. Note that, in the iterative receiver, each of the streams is independently interleaved to facilitate convergence. This key feature of the receiver is instrumental in ensuring good convergence characteristics for the iterative algorithm [10], [11]. We explicitly allowed for this random interleaving option in our algebraic code constructions in the previous section.
The complexity of the SISO multiuser detector constitutes
a major part of the overall complexity of the iterative receiver.
Three SISO multiuser detection algorithms that provide a
tradeoff between performance and complexity have been
proposed in the literature. The first is based on the maximum a
posteriori (MAP) probability rule [9], [14]; the second is based
on the MMSE criterion [10], [11]; and the third is the soft
interference canceler, which can be viewed as a suboptimal approximation of the iterative MMSE receiver [15]. In this paper,
we will focus our attention on the iterative MMSE receiver
because it provides an efficient tradeoff between performance
and complexity among the three iterative approaches [11], [10].
The iterative MMSE receiver is adapted from the first
author’s work [10] on iterative MMSE multiuser detectors
for code-division multiple access (CDMA) systems. (Such
receivers for CDMA applications were also investigated
independently by Wang and Poor [11].) For simplicity of
presentation, binary channel codes and BPSK modulation are
assumed.
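In this scheme, the soft outputs are used after each iteration to update the a priori probabilities of the transmitted symbols. These updated probabilities are then used to calculate the conditional MMSE filter feed-forward and feedback weights. The feedback connection represents the subtractive interference cancellation part of the receiver, while the feed-forward weights serve to suppress any residual interference. Let $\hat{y}^{(i)}$ be the estimate of the $i$th-antenna symbol at time $t$ given by (the subscript $t$ will be omitted for convenience)
$$\hat{y}^{(i)} = \mathbf{w}_f^{(i)T}\,\mathbf{r} + w_b^{(i)} \qquad (5)$$
where $\mathbf{w}_f^{(i)}$ is the $m \times 1$ optimized feed-forward coefficients vector and $w_b^{(i)}$ is a single coefficient that represents the soft cancellation part. The coefficients $\mathbf{w}_f^{(i)}$, $w_b^{(i)}$ are obtained through minimizing the conditional mean-square value of the error between the data symbol and its estimate. Now, let $\mathbf{h}^{(i)}$ be the $m \times 1$ complex channel vector of the $i$th transmit antenna; $\mathbf{H}^{(n\setminus i)}$ be the $m \times (n-1)$ matrix composed of the complex channel vectors of the other $n-1$ transmit antennas; and $\mathbf{c}^{(n\setminus i)}$ be the $(n-1) \times 1$ transmitted data vector from the other $n-1$ transmit antennas. Assuming statistically independent a priori information and using standard minimization techniques, it is easily shown that the conditional MMSE solutions for $\mathbf{w}_f^{(i)}$ and $w_b^{(i)}$ are given by [10]
$$\mathbf{w}_f^{(i)T} = \mathbf{h}^{(i)H}\left[\mathbf{A} + \mathbf{B} + \mathbf{R}_n - \mathbf{F}\mathbf{F}^H\right]^{-1} \qquad (6)$$
$$w_b^{(i)} = -\mathbf{w}_f^{(i)T}\,\mathbf{F} \qquad (7)$$
where
$$\mathbf{A} = \mathbf{h}^{(i)}\mathbf{h}^{(i)H} \qquad (8)$$
$$\mathbf{B} = \mathbf{H}^{(n\setminus i)}\left[\mathbf{I}_{n-1} - \mathrm{Diag}\!\left(\tilde{\mathbf{c}}^{(n\setminus i)}\tilde{\mathbf{c}}^{(n\setminus i)T}\right) + \tilde{\mathbf{c}}^{(n\setminus i)}\tilde{\mathbf{c}}^{(n\setminus i)T}\right]\mathbf{H}^{(n\setminus i)H} \qquad (9)$$
$$\mathbf{F} = \mathbf{H}^{(n\setminus i)}\,\tilde{\mathbf{c}}^{(n\setminus i)} \qquad (10)$$
$$\mathbf{R}_n = \sigma^2\,\mathbf{I}_m \qquad (11)$$
$\sigma^2$ is the noise variance, $\mathbf{I}_m$ is the identity matrix of order $m$, and $\tilde{\mathbf{c}}^{(n\setminus i)}$ is the $(n-1) \times 1$ vector of the conditional expected values of the transmitted symbols from the other $n-1$ antennas. The a priori probabilities used to evaluate these expected values are obtained from the previous decoding iteration soft outputs through the component-wise relation
$$\tilde{c}^{(j)} = \tanh\!\left(\lambda^{(j)}/2\right) \qquad (12)$$
where $\lambda^{(j)}$ is the extrinsic information corresponding to the symbol transmitted from the $j$th antenna at time $t$ [16]. Note that in the first iteration, one takes $\tilde{\mathbf{c}}^{(n\setminus i)} = \mathbf{0}$.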
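One detector pass of the soft-cancellation MMSE receiver described above can be sketched in a few lines of NumPy. This is a sketch under stated assumptions, not the authors' implementation: it assumes unit-energy BPSK symbols, white noise of variance `noise_var`, extrinsic log-likelihood ratios defined as $\log P(+1)/P(-1)$, and the standard conditional-MMSE solution in which the residual interference covariance is $\mathbf{H}(\mathbf{I} - \mathrm{Diag}(\tilde{c}_j^2))\mathbf{H}^H$. The function names are hypothetical.

```python
import numpy as np


def soft_symbol_means(llr):
    """Conditional means E[c] of BPSK symbols from extrinsic LLRs."""
    return np.tanh(np.asarray(llr, dtype=float) / 2.0)


def mmse_soft_ic(r, H, i, c_tilde, noise_var):
    """Soft estimate of the symbol sent from antenna i.

    r         : received vector, shape (m,)
    H         : channel matrix, shape (m, n), column j for antenna j
    c_tilde   : length-n vector of symbol means (entry i is ignored)
    noise_var : noise variance per receive antenna
    Returns (estimate, feed-forward weights).
    """
    m = H.shape[0]
    h = H[:, i]
    H_o = np.delete(H, i, axis=1)                      # other antennas
    c_o = np.delete(np.asarray(c_tilde, dtype=float), i)

    # Conditional covariance of r: desired signal + residual interference
    # + noise. Perfectly known interferers (|c_tilde| = 1) contribute nothing.
    residual = np.diag(1.0 - c_o ** 2)
    cov = (np.outer(h, h.conj())
           + H_o @ residual @ H_o.conj().T
           + noise_var * np.eye(m))

    w_f = np.linalg.solve(cov, h).conj()               # w_f^T = h^H cov^{-1}
    w_b = -w_f @ (H_o @ c_o)                           # soft cancellation term
    return w_f @ r + w_b, w_f


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
    c = rng.choice([-1.0, 1.0], size=4)
    r = H @ c + 0.1 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
    first_pass = [mmse_soft_ic(r, H, i, np.zeros(4), 0.02)[0] for i in range(4)]
    print(np.round(first_pass, 3))
```

In the first iteration the means are zero and the filter reduces to the linear MMSE detector; as the decoders return more reliable extrinsic information, the residual term shrinks and the feedback term removes more of the interference.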
3) Performance Bound: In this section, we investigate the spatial diversity advantage achieved by the threaded architecture over the quasi-static fading channel when the iterative MMSE algorithm is used.
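Proposition 8: Let $C$ be a $d$-diversity code used in each thread in a setting with $n$ transmit and $m$ receive antennas in quasi-static fading channels; then the zero-forcing receiver achieves spatial diversity $d(m-n+1)$.

Proof: To detect the signal transmitted from the $i$th antenna, the zero-forcing receiver projects the received signal on the null space of $\mathbf{H}^{(n\setminus i)}$, the matrix of channel vectors of the other $n-1$ antennas. Let $\mathcal{N}^{(i)}$ be the null space of $\mathbf{H}^{(n\setminus i)}$, and $\mathbf{P}^{(i)}$ be an $(m-n+1) \times m$ matrix whose rows are orthonormal vectors of $\mathcal{N}^{(i)}$. Then the $(m-n+1) \times 1$ output vector corresponding to the $i$th antenna is computed as
$$\mathbf{y}^{(i)} = \mathbf{P}^{(i)}\mathbf{r} = \mathbf{P}^{(i)}\mathbf{h}^{(i)}c^{(i)} + \mathbf{P}^{(i)}\mathbf{n}. \qquad (13)$$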
The elements of
, are Gaussian random variables with
. Note that, in general, .
Hence, at the output of the zero-forcing filter, the channel is equivalent to an interference-free correlated block fading channel with $n$ blocks and $m-n+1$ receive antennas. Since the different equivalent Gaussian fading gains are linearly independent, the equivalent channel correlation matrix is of full rank [17]. Thus, by the argument in [2], the diversity order is $d(m-n+1)$.
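The projection step in the proof is easy to make concrete. The sketch below is one possible construction, assuming the interfering channel matrix has full column rank: it builds the projection matrix from the singular value decomposition of the channel with the desired antenna's column removed. The function name `zf_projector` is hypothetical.

```python
import numpy as np


def zf_projector(H, i):
    """Matrix whose rows are an orthonormal basis of the space orthogonal to
    the channel vectors of every antenna except antenna i.

    H has shape (m, n) with m >= n; the result has shape (m - n + 1, m).
    Assumes the other n - 1 channel vectors are linearly independent.
    """
    H_o = np.delete(H, i, axis=1)                      # m x (n - 1)
    U, _, _ = np.linalg.svd(H_o, full_matrices=True)   # U is m x m unitary
    return U[:, H_o.shape[1]:].conj().T                # left null space basis


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n = 5, 3
    H = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    P = zf_projector(H, 0)
    print(P.shape)                                     # (m - n + 1, m)
    print(np.allclose(P @ np.delete(H, 0, axis=1), 0)) # interference removed
```

After projection the desired symbol sees $m - n + 1$ interference-free branches, which is where the factor $m - n + 1$ in the diversity order comes from; the orthonormal rows keep the projected noise white.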
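Let $\mathrm{SIR}_i^{(k)}$ denote the signal-to-interference-plus-noise ratio (SIR) for a symbol transmitted from the $i$th antenna after the $k$th iteration of the iterative MMSE algorithm. Then, conditioning on the set of path gains, we have
$$\mathrm{SIR}_i^{(k)} = \frac{\left|\mathbf{w}_f^{(k)T}\mathbf{h}^{(i)}\right|^2}{E\!\left[\left|\mathbf{w}_f^{(k)T}\left(\mathbf{r} - \mathbf{h}^{(i)}c^{(i)}\right) + w_b^{(k)}\right|^2\right]} \qquad (14)$$
where $\mathbf{w}_f^{(k)}$ is the vector of feed-forward filter coefficients used in the $k$th iteration, and the denominator is the conditional power of the residual interference plus noise at the filter output.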
Proposition 9: Let $C$ be a $d$-diversity code used in each thread in a setting with $n$ transmit and $m$ receive antennas. The SIR at the output of the iterative MMSE detector after $k$ iterations is at least as large as the SIR after one iteration. Furthermore, the output SIR is at least as large as that produced by the zero-forcing detector.

Proof: If $\mathrm{SNR}_{\mathrm{ZF}}$ denotes the SIR at the output of the zero-forcing detector, then it follows from the definition of the MMSE receiver that $\mathrm{SIR}^{(1)} \ge \mathrm{SNR}_{\mathrm{ZF}}$. Also, from the definition of the MMSE filter, it follows that
$$\mathrm{SIR}^{(k)} \ge \mathrm{SIR}^{(1)}$$
as was to be shown.
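The two inequalities can be observed numerically on a random channel draw. The sketch below is illustrative and rests on assumptions not spelled out in the text: unit-energy BPSK symbols, white noise of variance `noise_var`, and the closed-form output SIR of the conditional MMSE filter, $\mathbf{h}^H\mathbf{R}^{-1}\mathbf{h}$, with $\mathbf{R}$ the residual interference plus noise covariance. The function names are hypothetical.

```python
import numpy as np


def sinr_mmse(H, i, c_tilde, noise_var):
    """Output SIR of the conditional MMSE filter for antenna i."""
    m = H.shape[0]
    h = H[:, i]
    H_o = np.delete(H, i, axis=1)
    c_o = np.delete(np.asarray(c_tilde, dtype=float), i)
    # Residual interference shrinks as the symbol means approach +/-1.
    R = H_o @ np.diag(1.0 - c_o ** 2) @ H_o.conj().T + noise_var * np.eye(m)
    return float(np.real(h.conj() @ np.linalg.solve(R, h)))


def snr_zf(H, i, noise_var):
    """Output SNR of the zero-forcing (null-space projection) receiver."""
    H_o = np.delete(H, i, axis=1)
    U = np.linalg.svd(H_o, full_matrices=True)[0]
    P = U[:, H_o.shape[1]:].conj().T
    return float(np.linalg.norm(P @ H[:, i]) ** 2 / noise_var)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = m = 4
    H = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    for i in range(n):
        zf = snr_zf(H, i, 0.1)
        first = sinr_mmse(H, i, np.zeros(n), 0.1)       # no prior information
        later = sinr_mmse(H, i, np.full(n, 0.9), 0.1)   # reliable soft feedback
        print(f"antenna {i}: ZF {zf:.3f} <= MMSE(1) {first:.3f} <= MMSE(k) {later:.3f}")
```

The ordering holds for any channel realization because the zero-forcing receiver discards the signal energy lying in the interference subspace, and better priors can only reduce the residual interference covariance.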
In [18], Poor and Verdú have shown that the output of the MMSE receiver in AWGN channels can be tightly approximated by a Gaussian random variable. In the space–time code setting, the channel is AWGN when conditioned on the path gains. Thus, the two propositions imply that the diversity advantage achieved by the iterative MMSE receiver for the threaded architecture is approximately lower-bounded by the performance achieved by the zero-forcing receiver. Consequently, in a threaded architecture using $d$-diversity space–time codes, the iterative MMSE receiver should achieve diversity $d_{\mathrm{MMSE}}$ satisfying
$$d(m - n + 1) \le d_{\mathrm{MMSE}} \le dm. \qquad (15)$$
We note that this lower bound justifies our approach to code design for the threaded architecture. In particular, the design criteria developed in Theorems 2 and 7 for optimizing the channel coding for each thread in the absence of interference also serve to maximize a lower bound on the diversity advantage when the iterative MMSE detector is used to mitigate the interference from other threads. The simulation results of Section V suggest that the lower bound is, in fact, a pessimistic estimate of the performance of the threaded architecture with iterative MMSE multiuser detection.
TABLE I
COMPARISON OF DIFFERENT LAYERED ARCHITECTURES
IV. SYSTEM COMPARISONS
A high-level comparison of the various architectures is shown in Table I. As shown in the table, all of the transmission formats achieve comparable efficiency. Here, efficiency refers to the number of information symbols per vector channel use. For example, in the horizontal layering scheme, there are $n$ layers each containing a codeword of length $\ell$ and rate $R$. Thus, successful use of all transmission resources provides a total of $n\ell R$ information symbols. Normalizing by the total number of symbol transmission intervals $\ell$ gives an efficiency of $nR$ information symbols per transmitted symbol interval. For the diagonal-layering approach, the efficiency is somewhat less since the diagonal layers cannot utilize a portion of the transmission resources (the result in the table assumes the width of the diagonal
).
We also report the diversity orders achieved by the various architectures in both quasi-static and block fading channels. In the different approaches, the channel-coding schemes are assumed to achieve the maximum possible diversity level for rate $R$ codes. Since no attempt was made in [1] to optimize the coding for the diagonal layering architecture, the results reported in the table are on a per-symbol basis. The diversity order achieved by the previous layered and multilayered architectures is variable. For these approaches, Table I shows the range of values (minimum: maximum) and notes whether the variation is from layer to layer or from symbol to symbol. In the case of the proposed threaded architecture, the diversity order is not variable, but the exact value is difficult to determine. In this case, the upper and lower bounds from (15) are used in Table I. For the block fading channel, the parameter $B$ denotes the number of fading blocks per codeword.
The threaded layering is similar to H-BLAST in that each transmitted symbol in a thread is subject to interference from other layers, but better spatial diversity is achieved through more efficient transmit diversity and multiuser detection signal processing. The threaded layering is similar to D-BLAST in that all of the transmit antennas are used equally by each component coded transmission, but it more fully exploits the available temporal diversity since temporal interleaving is allowed across each transmit antenna. Furthermore, unlike D-BLAST, the threaded layering with space–time code design and iterative multiuser detection algorithms provides uniform spatial diversity from symbol to symbol. Finally, unlike the horizontal multilayering approach with group interference suppression, the threaded architecture provides uniform performance from one component space–time code to the next; and each component space–time code can, under the ideal interference cancellation assumption, achieve the maximum possible spatial and temporal diversity.
V. PERFORMANCE COMPARISONS
In this section, the different schemes are compared via simulation. In the study, we have used convolutional codes, the main advantage of which is the availability of computationally efficient SISO decoders. Periodic bit demultiplexers are used to distribute the encoder outputs across the different antennas. In addition, inner random interleavers are used to aid the convergence of the iterative MMSE receiver as discussed in Section III-A2). The error statistics are obtained by averaging the frame error rates of all the component codes. The channel decoder is based on the soft-output Viterbi algorithm (SOVA). Unless otherwise stated, the channel is assumed to follow the quasi-static fading model. The number of iterations for the iterative MMSE receiver is four. The code rate of the component codes is $1/2$.
Fig. 5 compares the performance of the iterative MMSE receiver with horizontal layering versus interference-free performance. In the case of the iterative MMSE, there are four transmit and four receive antennas, and the bandwidth efficiency is 2 b/s/Hz (i.e., BPSK modulation). The frame length corresponds to 100 transmissions. For the interference-free reference, there are four receive antennas but only one transmit antenna. The bandwidth efficiency in this case is 0.5 b/s/Hz. In general,
the relation between the energy per bit to noise ratio
and
the total transmitted SNR is
SNR
(16)
The same SNR per transmit antenna is used in both scenarios, and the SNR reported in the figure is the total SNR for the four transmit antennas in the iterative MMSE case. The interference-free scenario represents a lower bound on the performance achieved by the optimum receiver. It is shown that the iterative MMSE receiver performs within a fraction of a decibel from the lower bound.

Fig. 5. Performance of the iterative MMSE receiver.
Now, we are interested in comparing the performance of the TST architecture presented in Section III with the D-BLAST architecture, and with the multilayering architecture proposed by Tarokh, Naguib, Seshadri, and Calderbank (TNSC) in [4]. Fig. 6 compares the TST architecture with a lower bound on the frame error rate achieved by D-BLAST in quasi-static fading channels. This lower bound assumes error-free decision feedback. In practice, the performance of D-BLAST is expected to be close to the lower bound at high SNRs (where the bound is tight) but much worse than the bound at lower SNRs (where the bound is loose). The same four-state convolutional code with generator polynomials (5, 7) is used for both schemes. The iterative MMSE receiver is shown to provide a 3-dB gain over the D-BLAST lower bound under these conditions. Since the same code was used in both approaches, we can attribute the performance gain to the superiority of the iterative MMSE receiver over the signal processing algorithm used in the D-BLAST.
To further highlight the advantages of the threaded architec-
ture, we report in Fig. 7 the same performance comparison for
a block fading channel with three independent blocks per code-
word. Due to the diagonal restriction imposed on each layer,
the performance of the D-BLAST in this scenario is the same
as that in the quasi-static fading channel. On the other hand, it
is shown that the performance of the threaded architecture is
improved by about 1 dB at 1% frame error rate without any ad-
ditional complexity. This improvement is due to the increased
diversity advantage achieved by efficient code design that ex-
ploits the additional temporal diversity.
Figs. 8 and 9 compare the performance of the TST and TNSC architectures for the cases of four transmit/four receive and eight transmit/eight receive antennas, respectively. QPSK modulation with Gray mapping is used to map the binary input at each antenna to a complex constellation. Hence, the spectral efficiency is 4 and 8 b/s/Hz, respectively. The frame length corresponds to 130 transmissions. The results of the TNSC scheme are obtained from [4, Figs. 4 and 6], respectively. The same four-state encoders are used for the TST architecture as in the previous case. Therefore, the overall complexity of the TST receiver, including the iterations and soft-output decoding, is of the same order as the 32-state decoders used in the TNSC [4]. From the figures, the significant gain provided by the TST over the TNSC scheme is clear. Indeed, the TST approach shows a gain of 4–8 dB over the TNSC scheme. The TST results are within 2–3 dB of the outage capacity. The gain in diversity advantage achieved by the TST architecture can be seen in the steeper asymptotic slope of the performance curve. It is also shown that the gain provided by the TST increases with the number of antennas. This can be attributed to the better exploitation of the diversity in the TST. Finally, we note that by replacing the four-state code with a more powerful 64-state code we can close the gap between the TST frame error rate performance and the 10% outage capacity to within a fraction of a decibel with the same system parameters.
VI. CONCLUSION
In this paper, we took a fresh look at the design problem for multiple-antenna systems operating over the fading channel. The problem was addressed from both a signal processing and a space–time coding perspective. From the space–time coding perspective, we presented a new generic approach, the TST architecture, that allows for exploiting the spatial and temporal diversity available in the system. From the signal processing side, we proposed to utilize the turbo processing principle to develop iterative algorithms for joint decoding and detection which offer several advantages over previously proposed techniques [1], [4]. Simulation results were provided for the iterative MMSE receiver establishing its ability to approach the interference-free performance lower bound within a fraction of a decibel. The threaded architecture with efficient code design and iterative signal processing was shown, through simulation, to achieve significant gains over the D-BLAST and the combined array processing and space–time coding recently proposed by Tarokh et al. [4].

Fig. 6. Performance of the TST and BLAST architectures in quasi-static fading channels.
Fig. 7. Performance of the TST and BLAST architectures in block fading channels.
Fig. 8. Performance of the TST and TNSC architectures.
Fig. 9. Performance of the TST and TNSC architectures.
As a final remark, we note that, in the absence of interfer-
ence from other threads, the fading channel is equivalent to the
block fading channel with receive diversity, where the number
of independent blocks in the equivalent model is equal to the
product of the number of transmit antennas and the number
of fading blocks. The algebraic framework that we developed
for TST code design is, therefore, also useful in the study of
code design for block fading channels and is applicable to both
block and trellis-based codes. Conversely, optimization of the
TST channel coding and interleaving schemes would also ben-
efit from prior work on code design for such channels (see, for
example, Lapidoth [12] or Wesel and Cioffi [19]).
REFERENCES
[1] G. J. Foschini, “Layered space-time architecture for wireless communication in fading environments when using multiple antennas,” Bell Labs. Tech. J., vol. 2, Autumn 1996.
[2] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communication: Performance criterion and code construction,” IEEE Trans. Inform. Theory, vol. 44, pp. 744–765, Mar. 1998.
[3] A. R. Hammons, Jr. and H. El Gamal, “On the theory of space–time
codes for PSK modulation,” IEEE Trans. Inform. Theory, vol. 46, pp.
524–542, Mar. 2000.
[4] V. Tarokh, A. Naguib, N. Seshadri, and A. R. Calderbank, “Combined
array processing and space-time coding,” IEEE Trans. Inform. Theory,
vol. 45, pp. 1121–1128, May 1999.
[5] E. Telatar, “Capacity of multi-antenna Gaussian channels,” AT&T-Bell Labs, Tech. Rep., June 1995.
[6] G. J. Foschini and M. Gans, “On the limits of wireless communication in a fading environment when using multiple antennas,” Wireless Personal Commun., vol. 6, pp. 311–335, Mar. 1998.
[7] T. Marzetta and B. Hochwald, “Capacity of a mobile multiple antenna
communication link in Rayleigh flat fading,” IEEE Trans. Inform.
Theory, vol. 45, pp. 139–158, Jan. 1999.
[8] J.-C. Guey, M. R. Bell, M. P. Fitz, and W.-Y. Kuo, “Signal design for
transmitter diversity wireless communication systems over Rayleigh
fading channels,” in Proc. IEEE Vehicular Technology Conf., Atlanta,
GA, 1996, pp. 136–140.
[9] M. C. Reed, C. B. Schlegel, P. D. Alexander, and J. A. Asenstorfer,
“Iterative multiuser detection for DS-CDMA with FEC,” in Proc. Int.
Symp. Turbo Codes and Related Topics, Brest, France, Sept. 1997, pp.
162–165.
[10] H. El Gamal and E. Geraniotis, “Iterative multiuser detection for coded
CDMA signals in AWGN and fading channels,” IEEE J. Select. Areas
Commun., vol. 18, pp. 30–41, Jan. 2000. Special issue on Spread Spec-
trum for Global Communications.
[11] X. Wang and H. V. Poor, “Iterative (turbo) soft interference cancellation
and decoding for coded CDMA,” IEEE Trans. Commun., vol. 47, pp.
1046–1061, July 1999.
[12] A. Lapidoth, “The performance of convolutional codes on the block era-
sure channel using various finite interleaving techniques,” IEEE Trans.
Inform. Theory, vol. 43, pp. 1459–1473, Sept. 1997.
[13] J. Hagenauer, “The turbo principle: Tutorial introduction and state of the art,” in Proc. Int. Symp. Turbo Codes and Related Topics, Brest, France, Sept. 1997, pp. 1–9.
[14] M. Moher, “An iterative multiuser decoder for near-capacity communi-
cations,” IEEE Trans. Commun., vol. 46, pp. 870–880, July 1998.
[15] P. D. Alexander, A. J. Grant, and M. C. Reed, “Iterative detection in
code division multiple access with error control coding,” Europ. Trans.
Telecommun., vol. 9, pp. 419–425, Sept./Oct. 1998.
[16] S. Benedetto, G. Montorsi, D. Divsalar, and F. Pollara, “A soft-input soft-output maximum a posteriori module to decode parallel and serially concatenated codes,” Telecom. and Data Acquisition Progr. Reps., Jet Propulsion Lab., no. 42-127, Nov. 1996.
[17] A. Papoulis, Probability, Random Variables, and Stochastic Pro-
cesses. New York: McGraw-Hill, 1984.
[18] H. V. Poor and S. Verdú, “Probability of error in MMSE multiuser de-
tection,” IEEE Trans. Inform. Theory, vol. 43, pp. 858–871, May 1997.
[19] R. D. Wesel and J. M. Cioffi, “Joint trellis code and interleaver design,”
in Proc. IEEE GLOBECOM’97, Nov. 1997.
... It was shown in [2] that a Diagonal Bell Laboratories Layered Space-Time (DBLAST) MIMO system using a combination of forward error control (FEC) codes can exploit spatial diversity to asymptotically achieve outage capacity. El Gamal et al. [3] proposed a threaded layered space-time code (TLSTC) structure, which has an improved bandwidth efficiency compared to the DBLAST structure. In layered space-time coded (LSTC) systems, co-channel interference from adjacent layers limits the system performance. ...
... In layered space-time coded (LSTC) systems, co-channel interference from adjacent layers limits the system performance. To reduce co-channel interference, two iterative receivers with combined detection and decoding are proposed in [3] and [4], based on the turbo principle. The first scheme implements minimum mean square error (MMSE) detection with soft-output Viterbi algorithm (SOVA) decoding in the iterative receiver. ...
... In this paper, a new adaptive iterative TLSTC receiver is proposed based on a joint adaptive iterative detection and decoding algorithm. The proposed receiver does not require channel state information as the non-adaptive iterative receivers in [3] and [4]. Therefore, the proposed receiver does not require a matrix inversion process in the system. ...
Article
Full-text available
An adaptive iterative receiver for layered space-time coded (LSTC) systems is proposed. The proposed receiver, based on a joint adaptive iterative detection and decoding algorithm, adaptively suppresses and cancels co-channel interference. The LMS algorithm and maximum a posteriori (MAP) algorithm are utilized in the receiver structure. A partially filtered gradient LMS (PFGLMS) algorithm is also applied to improve the convergence speed and tracking ability of the adaptive detector with a slight increase in complexity. The proposed receiver is analysed in a slow and fast Rayleigh fading channels in multiple input multiple output (MIMO) systems.
... First of all, the multiplexing/diversity tradeoff constitutes the most salient MIMO design tradeoff since the inventions of Foschini's Bell Laboratories Layered Space-Time (BLAST) [208] in 1996 and Alamout's Space-Time Block Code (STBC) [209] in 1998. On one hand, the family of BLAST schemes [208], [210], [211] effectively use M Transmit Antennas (TAs) to transmit M indepdent data Lawton [35], [36] DPSK Proposed that the constant phase rotation in AWGN channels may be mitigated by the simple DPSK transceiver. 1959 Cahn [37] DPSK Demonstrated that the CDD aided DPSK scheme suffers from a 3 dB performance penalty in AWGN channels compared to its coherent PSK assuming the idealistic perfect estimation of the frequency offset phase. ...
... Bandwidth-efficiency is directly measured by the effective system throughput, which is the data rate that can be delivered over a given bandwidth. For example, the multiplexing-oriented V-BLAST scheme [208], [210], [211] has a high throughput of R = M log 2 L that grows linearly both with the number of TAs M and with the number of bits per symbol log 2 L, hence V-BLAST may be classified as a bandwidth-efficient MIMO design. Power-efficiency refers to the signal transmission power required for achieving specific target performance. ...
Article
Full-text available
Sixty years of coherent versus non-coherent tradeoff as well as the twenty years of coherent versus non-coherent tradeoff in Multiple-Input Multiple-Output (MIMO) systems are surveyed. Furthermore, the advantages of adaptivity are discussed. More explicitly, in order to support the diverse communication requirements of different applications in a unified platform, the 5G New Radio (NR) offers unprecendented adaptivity, abeit at the cost of a substantial amount of signalling overhead that consumes both power and the valuable spectral resources. Striking a beneficial coherent versus non-coherent tradeoff is capable of reducing the pilot overheads of channel estimation, whilst relying on low-complexity detectors, especially in high-mobility scenarios. Furthermore, since energy-efficiency is of salient importance both in the operational and future networks, following the powerful Index Modulation (IM) pholosophy, we conceive a holistic adaptive pholosophy striking the most appropriate coherent/non-coherent, single-/multiple-antenna and diversity/multiplexing tradeoffs, where the number of RF chains, the Peak-to-Average Power Ratio (PAPR) of signal transmission and the maximum amount of interference tolerated by signal detection are all taken into account. We demonstrate that this intelligent tripple-fold adaptivity offers significant benefits in next-generation applications of mmWave and Terahertz solutions, in space-air-ground integrated networks, in full-duplex techniques and in other sophisticated channel coding assisted system designs, where powerful machine learning algorithms are expected to make autonomous decisions concerning the best mode of operation with minimal human intervention.
... Such error correcting techniques were utilized in many applications specially in mutli-user detection, forward error control, data estimation and control signals. The reader is referred to [5][6][7][8][9] for more reading and application dependent techniques. ...
Article
Full-text available
The prohibitive computational complexity of optimal coded multiuser detection necessitates using suboptimal detectors in practical implementations. The filter is very computationally simple and is also demonstrated to provide faster convergence and superior bit error rate (BER) performance. Further investigation of the weighted delay filter concept produces a second filter—derived via the joint likelihood function. It is analytically demonstrated that extrinsic feedback systems will not benefit from weighted delay filtering. A system model is provided that introduces the notion of feedback ‘residue’, which is shown to be the key difference between a-posterior probability (APP) and extrinsic systems when determining the parallel interference cancellation (PIC) output statistics. It is analytically shown that the weighted delay filter derived via a maximum signal-to-noise ratio (SNR) approach is identical to a weighted delay filter derived via the joint likelihood function. It is analytically shown that when extrinsic feedback is used in a coded-code division multiple access (C-CDMA) system, no benefit will be realised by weighted delay filtering, as soft outputs from previous cycles are a merely scaled, noisy version of the most recent data. The notion of a ‘feedback residue’ for systems with APP feedback is introduced, and it is empirically shown that this residue term is a key consideration when determining the PIC output statistics. Using the ‘residual feedback’ model, it is shown that when APP feedback is utilised, data from previous cycles is not simply “a scaled, noisy version” of the current data. For this reason, benefits may be realised by APP feedback use. The simulation results shows that the residue may be trivial at small loads, the residue builds to the substantial value of nearly 0.4 at a reasonably modest load of K/N=15/10, and continues to grow as the load increases.
... general there exist three MIMO design trade-offs that arise throughout the conception of MIMO techniques. The use of MIMO codes to increase the data rate through multiplexing was introduced mainly by Foschini's work on the Bell Laboratories Layered Space-Time (BLAST) techniques [8] in 1996 and later in [9,10]. These techniques use M transmit antennas to transmit M independent data streams, which leads to a linear rather than a logarithmic increase in capacity with the number of antennas. ...
Thesis
Today the multi-antenna techniques MIMO (Multiple Input Multiple Output) and Massive MIMO are widely used in wireless communication systems. However, these schemes require the receiver to estimate the response of each channel between each transmit and receive antenna, which, in many cases, can greatly reduce the final spectral efficiency of these systems. The purpose of this thesis is to explore an alternative solution based on differential space-time modulation (DSTM) schemes for non-coherent MIMO systems, which do not require an estimate of the channel response at the receiver. First, schemes based on the Weyl multiplicative group of 2 × 2 unitary matrices are studied for building DSTM MIMO systems with 2 transmitting antennas. Then, using the Kronecker product, these schemes are extended to 4 × 4 and 8 × 8 matrices. In order to improve the spectral efficiency of these schemes, single and double extensions of the Weyl group are proposed. An information matrix selection algorithm maximizing the distance between the selected matrices, as well as an optimized mapping, are then developed. Finally, an analytical study of the performance of the proposed DSTM schemes, based on expressions of the pairwise error probability (PEP), is carried out. In particular, a new optimal algorithm for selecting the information matrices, which uses the exact value of the PEP between pairs of matrices as its performance measure, is developed.
... Although the optimum detectors can harness the full channel capacity, their complexity increases exponentially with the number of transmit antennas [3]. As a result, improving the performance of sub-optimum detectors is a demanding research topic [4]-[7]. A few implementations of sub-optimum detectors based on the Markov Chain Monte Carlo (MCMC) algorithm are reported in [2], [8], [9]. ...
Preprint
Full-text available
This preprint was later published at IEEE VLSI-SOC 2020: https://doi.org/10.1109/VLSI-SOC46417.2020.9344098
Chapter
A hybrid analog/digital signal processor has been proposed to implement energy-efficient multi-input-multi-output (MIMO) detectors. A sub-optimum MIMO detector based on the Markov Chain Monte Carlo (MCMC) algorithm for a 4 × 4 MIMO system is presented. A careful partitioning between the analog and digital domains has been made to reduce system power consumption. The outputs of the proposed analog signal processing unit are converted to digital using a low-resolution analog-to-digital converter (ADC) and delivered to the digital portion of the detector system. The proposed 4 × 4 MCMC MIMO detector is designed in a standard 45 nm CMOS technology and consumes 29.3 mW from a 1.0 V supply. A throughput of 235.3 Mbps is achieved while operating at a 1.0 GHz clock frequency. The design occupies a silicon area of 0.11 mm².
Article
We consider a multi-antenna wireless system consisting of a source transmitting to its destination with limited feedback, where the channel state information (CSI) is considered to be quantized and fed back from the destination to the source in the face of channel quantization errors (CQE). We propose a multi-antenna precoding (MAP) scheme to mitigate an adverse effect of the CQE, which is called the CQE oriented MAP and denoted CQE-MAP for short. Typically, using more channel quantization bits enhances the accuracy of quantized CSI acquired at the source and improves the data rate of the source-destination transmission, which, however, results in an increase of the CSI feedback overhead. We define an effective throughput as the difference of the data transmission rate and the CSI feedback rate that is used for characterizing the system overhead of sending the quantized CSI. An optimization analysis of our CQE-MAP scheme is carried out in terms of maximizing the effective throughput with regard to the number of quantization bits per channel. It is shown that the conventional maximal ratio transmission (MRT) based MAP method denoted by MRT-MAP is a special case of the proposed CQE-MAP scheme for certain scenarios. Simulation results demonstrate that the proposed CQE-MAP achieves a higher effective throughput than the MRT-MAP for a given number of channel quantization bits, especially in the high signal-to-noise ratio (SNR) region. It is also illustrated that the effective throughput of our CQE-MAP scheme can be further maximized through an optimization of the number of quantization bits per channel. Moreover, with an increase of the SNR or a decrease of the terminal moving speed, an increased number of quantization bits per channel is needed for the sake of maximizing the effective throughput.
Article
Full-text available
Concatenated coding schemes with interleavers consist of a combination of two simple constituent encoders and an interleaver. The parallel concatenation known as "turbo code" has been shown to yield remarkable coding gains close to theoretical limits, yet admitting a relatively simple iterative decoding technique. The recently proposed serial concatenation of interleaved codes may offer performance superior to that of turbo codes. In both coding schemes, the core of the iterative decoding structure is a soft-input soft-output (SISO) module. In this article, we describe the SISO module in a form that continuously updates the maximum a posteriori (MAP) probabilities of input and output code symbols and show how to embed it into iterative decoders for parallel and serially concatenated codes. Results are focused on codes yielding very high coding gain for space applications. The recent proposal of "turbo codes" (2), with their astonishing performance close to the theoretical Shannon capacity limits, has once again shown the great potential of coding schemes formed by two or more codes working in a concurrent way. Turbo codes are parallel concatenated convolutional codes (PCCCs) in which the information bits are first encoded by a recursive systematic convolutional code and then, after passing through an interleaver, are encoded by a second systematic convolutional encoder. The code sequences are formed by the information bits, followed by the parity check bits generated by both encoders. Using the same ingredients, namely convolutional encoders and interleavers, serially concatenated convolutional codes (SCCCs) have been shown to yield performance comparable, and in some cases superior, to turbo codes (5).
Article
This paper addresses digital communication in a Rayleigh fading environment when the channel characteristic is unknown at the transmitter but is known (tracked) at the receiver. Inventing a codec architecture that can realize a significant portion of the great capacity promised by information theory is essential to a standout long-term position in highly competitive arenas like fixed and indoor wireless. Use (nT, nR) to express the number of antenna elements at the transmitter and receiver. An (n, n) analysis shows that despite the n received waves interfering randomly, capacity grows linearly with n and is enormous. With n = 8 at 1% outage and 21-dB average SNR at each receiving element, 42 b/s/Hz is achieved. The capacity is more than 40 times that of a (1, 1) system at the same total radiated transmitter power and bandwidth. Moreover, in some applications, n could be much larger than 8. In striving for significant fractions of such huge capacities, the question arises: Can one construct an (n, n) system whose capacity scales linearly with n, using as building blocks n separately coded one-dimensional (1-D) subsystems of equal capacity? With the aim of leveraging the already highly developed 1-D codec technology, this paper reports just such an invention. In this new architecture, signals are layered in space and time as suggested by a tight capacity bound.
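The capacity claim in this abstract can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's own method: it assumes i.i.d. Rayleigh fading with unit-variance complex Gaussian channel entries, equal power on each transmit antenna, and the standard capacity expression log2 det(I + (ρ/n) H Hᴴ) for an (n, n) system with average SNR ρ per receive element. The function names (`complex_gaussian`, `log2_det`, `capacity_sample`, `outage_capacity`), the trial count, and the seed are illustrative choices, not taken from the source.

```python
import math
import random


def complex_gaussian(rng):
    """Draw one CN(0, 1) sample (real and imaginary parts have variance 1/2)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def log2_det(matrix):
    """Return log2|det(matrix)| using Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    total = 0.0
    for col in range(n):
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log2(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return total


def capacity_sample(n, snr_db, rng):
    """One random capacity value (b/s/Hz) for an (n, n) Rayleigh channel."""
    rho = 10.0 ** (snr_db / 10.0)
    h = [[complex_gaussian(rng) for _ in range(n)] for _ in range(n)]
    # G = I + (rho / n) * H * H^H, which is Hermitian positive definite.
    g = [
        [
            (1.0 if i == j else 0.0)
            + (rho / n) * sum(h[i][k] * h[j][k].conjugate() for k in range(n))
            for j in range(n)
        ]
        for i in range(n)
    ]
    return log2_det(g)


def outage_capacity(n, snr_db, outage=0.01, trials=2000, seed=0):
    """Empirical capacity that is exceeded in all but a fraction `outage` of trials."""
    rng = random.Random(seed)
    samples = sorted(capacity_sample(n, snr_db, rng) for _ in range(trials))
    return samples[int(outage * trials)]


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(outage_capacity(n, 21.0), 1))
```

Running the script prints an empirical 1%-outage capacity for several values of n at 21 dB, which can be compared with the figures quoted in the abstract. The estimate of a 1% quantile is noisy at a few thousand trials, so the printed values vary with the seed and trial count.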
Article
This paper is motivated by the need for fundamental understanding of ultimate limits of bandwidth efficient delivery of higher bit-rates in digital wireless communications and to also begin to look into how these limits might be approached. We examine exploitation of multi-element array (MEA) technology, that is processing the spatial dimension (not just the time dimension) to improve wireless capacities in certain applications. Specifically, we present some basic information theory results that promise great advantages of using MEAs in wireless LANs and building to building wireless communication links. We explore the important case when the channel characteristic is not available at the transmitter but the receiver knows (tracks) the characteristic which is subject to Rayleigh fading. Fixing the overall transmitted power, we express the capacity offered by MEA technology and we see how the capacity scales with increasing SNR for a large but practical number, n, of antenna elements at both transmitter and receiver. We investigate the case of independent Rayleigh faded paths between antenna elements and find that with high probability extraordinary capacity is available. Compared to the baseline n = 1 case, which by Shannon’s classical formula scales as one more bit/cycle for every 3 dB of signal-to-noise ratio (SNR) increase, remarkably with MEAs, the scaling is almost like n more bits/cycle for each 3 dB increase in SNR. To illustrate how great this capacity is, even for small n, take the cases n = 2, 4 and 16 at an average received SNR of 21 dB. For over 99%