Conference Paper

Algorithm and VLSI architecture for linear MMSE detection in MIMO-OFDM systems

Integrated Syst. Lab., ETH, Zurich
DOI: 10.1109/ISCAS.2006.1693531 Conference: International Symposium on Circuits and Systems (ISCAS 2006), 21-24 May 2006, Island of Kos, Greece
Source: DBLP


The paper describes an algorithm and a corresponding VLSI architecture for the implementation of linear MMSE detection in packet-based MIMO-OFDM communication systems. The advantages of the presented receiver architecture are low latency, high throughput, and efficient resource utilization, since the hardware required for the computation of the MMSE estimators is reused for the detection. The algorithm also supports the extraction of soft information for channel decoding.
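For readers unfamiliar with linear MMSE detection, the following is a minimal sketch of the textbook formulation for a single OFDM tone, not the algorithm or architecture of the paper. It assumes the standard estimator G = (H^H H + (σ²/E_s) I)^{-1} H^H and hard QPSK slicing; all function names, the channel matrix, and the noise values are hypothetical and chosen only for illustration.

```python
# Textbook linear MMSE MIMO detection for one OFDM tone (illustrative sketch,
# pure Python; names and values are assumptions, not taken from the paper).

def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[r][c].conjugate() for r in range(len(a))]
            for c in range(len(a[0]))]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def solve(a, b):
    """Solve a @ x = b for matrix x by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest magnitude entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mmse_estimator(h, noise_var, symbol_energy=1.0):
    """Return G = (H^H H + (noise_var / symbol_energy) I)^-1 H^H."""
    hh = hermitian(h)
    gram = matmul(hh, h)
    for i in range(len(gram)):
        gram[i][i] += noise_var / symbol_energy  # regularize the Gram matrix
    return solve(gram, hh)

def detect(g, y):
    """Apply the estimator matrix G to the received vector y."""
    return [sum(gij * yj for gij, yj in zip(row, y)) for row in g]

def slice_qpsk(z):
    """Hard decision onto the unnormalized QPSK points (+/-1 +/-1j)."""
    return complex(1 if z.real >= 0 else -1, 1 if z.imag >= 0 else -1)

if __name__ == "__main__":
    # Hypothetical 2x2 channel for one tone and a noise-free received vector.
    h = [[1 + 0.5j, 0.2 - 0.1j], [-0.3 + 0.2j, 0.9 + 0.1j]]
    s = [1 + 1j, -1 - 1j]
    y = [sum(hij * sj for hij, sj in zip(row, s)) for row in h]
    g = mmse_estimator(h, noise_var=0.01, symbol_energy=2.0)
    print([slice_qpsk(z) for z in detect(g, y)])
```

In a packet-based system, the estimator matrix would typically be computed once per tone from the channel estimate and then applied to every received vector on that tone, which is why the split into `mmse_estimator` and `detect` is used here.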



Available from: Andreas Peter Burg,
    • "The SSFE forms a class of tree search algorithms that provides a feasible implementation complexity for moderately correlated channels [3]. The LMMSE and SSFE hardware implementations are at a mature stage and different implementations can be found in [4] and [5]. A unified hardware solution for both LMMSE and SSFE is difficult to implement. "
    ABSTRACT: Cognitive radio (CR) systems require flexible and adaptive implementations of signal processing algorithms. An adaptive symbol detector is needed in the baseband receiver chain to achieve the desired flexibility of a CR system. This paper presents a novel design of an adaptive detector as an application-specific instruction-set processor (ASIP). The ASIP template is based on transport triggered architecture (TTA). The processor architecture is designed in such a manner that it can be programmed to support different suboptimal multiple-input multiple-output (MIMO) detection algorithms in a single TTA processor. The linear minimum mean-square error (LMMSE) and three variants of the selective spanning for fast enumeration (SSFE) detection algorithms are considered. The detection algorithm can be switched between the LMMSE and SSFE according to the bit error rate (BER) performance requirement in the TTA processor. The design can be scaled for different antenna configurations and different modulations. Some of the algorithm-architecture co-optimization techniques used here are also presented. Unlike most other detector ASIPs, a high-level language is used to program the processor to meet the time-to-market requirements. The adaptive detector delivers 4.88 to 49.48 Mbps throughput at a clock frequency of 200 MHz on 90 nm technology.
    9th International Conference on Cognitive Radio Oriented Wireless Networks (CrownCom); 06/2014
    • "As for the hardware implementations, systolic array is the most common architecture to support massive computing parallelism. Designs can be a 2-D tri-array [2] or a 1-D linear array [6],[7]. In [3], a complicated function unit in charge of all iterations is used. "
    ABSTRACT: Complex-valued QR factorization is a fundamental but computationally intensive operation commonly used in various MIMO signal detection algorithms. In this paper, a novel factorization scheme based on Givens rotations and symmetrical nullification is devised. The proposed scheme integrates the MMSE criterion into the factorization and can achieve better BER performance. Instead of working in the complex-valued domain, the scheme starts with a block-wise symmetric real-valued matrix counterpart. By exploiting the symmetry, the proposed scheme reduces the computing complexity by almost half. Based on the presented scheme, a novel systolic array design featuring fully parallel and deeply pipelined processing was developed subject to the EWC 802.11n recommendation. Architecture optimization measures such as look-up table (LUT) free CORDIC implementations and hardware sharing among scaling operations were employed to minimize the hardware design complexity. Post-layout simulation results using a TSMC 0.18 μm process indicate that the proposed design, with a gate count of 233K and a maximum clock rate of 120 MHz, can admit a new 4×4 complex matrix for MMSE-based factorization every 8 clock cycles (66.7 ns).
    Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on; 07/2010
    • "The latency introduced by the PHY layer varies with the frame length due to the preprocessing latency [8]. As the detection and decoding of an OFDM symbol takes less time than the OFDM symbol duration t_s, the preprocessing latency can be caught up. "
    ABSTRACT: This paper describes a modular hardware platform of a multi-user (MU) multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) testbed. The hardware platform is based on multiple field programmable gate arrays (FPGAs), provides four integrated radio-frequency (RF) chains, and has capabilities for extension boards. The performance and modularity of the testbed enables real-time MU-MIMO-OFDM experiments as well as offline processing experiments. To this end, the MIMO physical (PHY) layer of Haene et al., IEEE J-SAC, 2008, has been adapted to the new hardware platform and extended with bi-directional communication facilities and a basic media access control (MAC) layer equipped with Ethernet connectivity.
    Circuits and Systems, 2009. ISCAS 2009. IEEE International Symposium on; 06/2009