Performance Assessment of MIMO-BICM Demodulators based on System Capacity

Peter Fertl, Student Member, IEEE, Joakim Jaldén, Member, IEEE, and Gerald Matz*, Senior Member, IEEE

Abstract

We provide a comprehensive performance comparison of soft-output and hard-output demodulators in the context of non-iterative multiple-input multiple-output bit-interleaved coded modulation (MIMO-BICM). Coded bit error rate (BER), widely used in the literature for demodulator comparison, has the drawback of depending strongly on the error-correcting code being used. This motivates us to propose a code-independent performance measure in terms of system capacity, i.e., the mutual information of the equivalent modulation channel that comprises modulator, wireless channel, and demodulator. We present extensive numerical results for ergodic and quasi-static fading channels under perfect and imperfect channel state information. These results reveal that the performance ranking of MIMO demodulators is rate-dependent. Furthermore, they provide new insights regarding MIMO-BICM system design, i.e., the choice of antenna configuration, symbol constellation, and demodulator for a given target rate.

Index Terms

MIMO, BICM, performance limits, soft demodulation, system capacity, log-likelihood ratio

EDICS Category: MSP-CAPC, MSP-CODR

P. Fertl is with BMW Forschung und Technik GmbH, Hanauer Str. 46, 80992 Munich, Germany (phone: +49 89 382 53408, email: peter.fertl@bmw.de). J. Jaldén is with the Signal Processing Laboratory, ACCESS Linnaeus Center, KTH Royal Institute of Technology, Osquldas väg 10, SE-100 44 Stockholm, Sweden (phone: +46 8 790 77 88, email: jalden@kth.se). G. Matz is with the Institute of Communications and Radio-Frequency Engineering, Vienna University of Technology, Gusshausstrasse 25/389, A-1040 Vienna, Austria (phone: +43 1 58801 38942, email: gmatz@nt.tuwien.ac.at).

Part of this work has been previously presented at the IEEE 9th Workshop on Signal Processing Advances in Wireless Communications (SPAWC 2008) [1]. This work was supported by the STREP project MASCOT (IST-026905) within the Sixth Framework Programme of the European Commission and by FWF Grant N10606 “Information Networks.”

October 29, 2018 DRAFT

I. INTRODUCTION

A. Background

Bit-interleaved coded modulation (BICM) [2], [3] has been conceived as a pragmatic approach to coded modulation. It has received a lot of attention in wireless communications due to its bandwidth and power efficiency and its robustness against fading. For single-antenna systems, BICM with Gray labeling can approach channel capacity [2], [4]. These advantages have motivated extensions of BICM to multiple-input multiple-output (MIMO) systems [5]–[7].

In MIMO-BICM systems, the optimum demodulator is the soft-output maximum a posteriori (MAP) demodulator, which provides the channel decoder with log-likelihood ratios (LLRs) for the code bits. Due to its high computational complexity, numerous alternative demodulators have been proposed in the literature. Applying the max-log approximation [7] to the MAP demodulator reduces complexity without significant performance loss and leads to a search for data vectors minimizing a Euclidean norm. Exact implementations of the max-log MAP detector based on sphere decoding have been presented in [8]–[10]; sphere decoder variants in which the Euclidean norm is replaced with the $\ell_\infty$ norm have been proposed in [11], [12]. However, the complexity of sphere decoding grows exponentially with the number of transmit antennas [9].
An alternative demodulator that yields approximations to the true LLRs is based on semidefinite relaxation (SDR) and has polynomial worst-case complexity [13], [14].

Several demodulation schemes use a list of candidate data vectors to obtain approximate LLRs. The size of the candidate list offers a trade-off between performance and complexity. The candidate list can be generated using i) tree search techniques as with the list sphere decoder (LSD) [15], ii) lattice reduction (LR) techniques [16]–[20], or iii) bit flipping techniques, i.e., flipping some of the bits in the label of a data vector obtained by hard detection, e.g., [21].

MIMO demodulators with still smaller complexity consist of a linear equalizer followed by per-layer scalar soft demodulators. This approach has been studied using zero-forcing (ZF) equalization [22], [23] and minimum mean-square error (MMSE) equalization [24], [25]. The soft interference canceler (SoftIC) proposed in [26] iteratively performs parallel MIMO interference cancelation by subtracting an interference estimate which is computed using soft symbols from the preceding iteration.

Hard-output MIMO demodulators are alternatives to soft demodulators that provide tentative decisions for the code bits but no associated reliability information. Among the best-known schemes here are maximum likelihood (ML), ZF, and MMSE demodulation [27] and successive interference cancelation (SIC) [28]–[30].

B. Contributions

In the context of MIMO-BICM, the performance of the MIMO demodulators listed above has mostly been assessed in terms of coded bit error rate (BER) using a specific channel code. These BER results depend strongly on the channel code and hence render an impartial demodulator comparison difficult.
In this paper, we advocate an information-theoretic approach for assessing the performance of (soft and hard) MIMO demodulators in the context of non-iterative¹ (single-shot) BICM receivers (see also [1]). Inspired by [5], we propose the mutual information between the modulator input bits and the associated MIMO demodulator output as a code-independent performance measure. This quantity can be interpreted as system capacity (maximum rate allowing for error-free information recovery) of an equivalent “modulation” channel that comprises modulator and demodulator in addition to the physical channel. This approach establishes a systematic framework for the assessment of MIMO demodulators. We note that ZF-based and max-log demodulation have been compared in a similar spirit in [23].

Using Monte Carlo simulations, this paper provides extensive performance evaluations and comparisons for the above-mentioned MIMO demodulators in terms of system capacity, considering different system configurations in fast and quasi-static fading. We also investigate the performance loss of the various demodulation schemes under imperfect channel state information (CSI). Due to lack of space, only a part of our numerical results is shown here. Further results for other antenna configurations, symbol constellations, and bit mappings can be found in a supporting document [31].

Our results allow for several conclusions. Most importantly, we found that no universal performance ranking of MIMO demodulators exists, i.e., the ranking depends on the information rate or, equivalently, on the signal-to-noise ratio (SNR). As an example, soft MMSE outperforms hard ML at low rates while at high rates it is the other way around. We also verify this surprising observation in terms of BER simulations using low-density parity-check (LDPC) codes.
Finally, we use our numerical results to develop practical guidelines for the design of MIMO-BICM systems, i.e., which antenna configuration, symbol constellation, and demodulator to choose in order to achieve a certain rate with minimum SNR.

C. Paper Organization

The rest of this paper is organized as follows. Section II discusses the MIMO-BICM system model and Section III proposes system capacity as performance measure. In Sections IV and V, we assess the system capacity achievable with the MIMO-BICM demodulators referred to above for the case of fast fading. Section VI analyzes the impact of imperfect channel state information (CSI) on the demodulator performance, and Section VII investigates the rate-versus-outage tradeoff of selected demodulators in quasi-static environments. In Section VIII, we summarize key observations and infer practical system design guidelines. Finally, conclusions are provided in Section IX.

¹ A performance assessment of soft-in soft-out demodulators in iterative BICM receivers requires a completely different approach and is thus beyond the scope of this paper.

Fig. 1. Block diagram of a MIMO-BICM system (encoder, interleaver $\Pi$, demultiplexer and per-layer mappers, MIMO channel, (suboptimal) demodulator, multiplexer, deinterleaver $\Pi^{-1}$, decoder; the equivalent “modulation” channel has inputs $c_l[n]$ and outputs $\Lambda_l[n]$ or $\tilde{\Lambda}_l[n]$).

II. SYSTEM MODEL

A. MIMO-BICM Transmission Model

A block diagram of our MIMO-BICM model is shown in Fig. 1. The information bits $b[q]$ are encoded using an error-correcting code and are then passed through a bitwise interleaver $\Pi$. The interleaved code bits are demultiplexed into $M_T$ antenna streams (“layers”). In each layer $k = 1, \dots, M_T$, groups of $Q$ code bits $c_k^{(i)}[n]$, $i = 1, \dots, Q$ ($n$ denotes symbol time), are mapped via a one-to-one function $\mu(\cdot)$ to (complex) data symbols $x_k[n]$ from a symbol alphabet $\mathcal{A}$ of size $|\mathcal{A}| = 2^Q$. Specifically, $x_k[n] = \mu\big(c_k^{(1)}[n], \dots, c_k^{(Q)}[n]\big)$, where $\big(c_k^{(1)}[n], \dots, c_k^{(Q)}[n]\big) = \mu^{-1}(x_k[n])$ is referred to as the bit label of $x_k[n]$. The transmit vector is given by² $\mathbf{x}[n] \triangleq (x_1[n] \, \dots \, x_{M_T}[n])^T \in \mathcal{A}^{M_T}$ and satisfies the power constraint $E\{\|\mathbf{x}[n]\|^2\} = E_s$. It carries $R_0 = Q M_T$ interleaved code bits $c_l[n]$, $l = 1, \dots, R_0$, with $c_k^{(i)}[n] = c_{(k-1)Q+i}[n]$. We will for simplicity write $\mathbf{x}[n] = \mu(c_1[n], \dots, c_{R_0}[n])$ and $\mathbf{c}[n] \triangleq (c_1[n] \, \dots \, c_{R_0}[n])^T = \mu^{-1}(\mathbf{x}[n])$ as shorthand for the mapping $\mathbf{x}[n] = \big(\mu(c_1^{(1)}[n], \dots, c_1^{(Q)}[n]) \, \dots \, \mu(c_{M_T}^{(1)}[n], \dots, c_{M_T}^{(Q)}[n])\big)^T$ and its inverse.

Assuming flat fading, the receive vector $\mathbf{y}[n] \triangleq (y_1[n] \, \dots \, y_{M_R}[n])^T$ ($M_R$ denotes the number of receive antennas) is given by

$$\mathbf{y}[n] = \mathbf{H}[n]\,\mathbf{x}[n] + \mathbf{v}[n], \qquad n = 1, \dots, N, \tag{1}$$

where $\mathbf{H}[n]$ is the $M_R \times M_T$ channel matrix, and $\mathbf{v}[n] \triangleq (v_1[n] \, \dots \, v_{M_R}[n])^T$ is a noise vector with independent identically distributed (i.i.d.) circularly symmetric complex Gaussian elements with zero mean and variance $\sigma_v^2$. In most of what follows, we will omit the time index $n$ for convenience.

At the receiver, the optimum demodulator uses the received vector $\mathbf{y}$ and the channel matrix $\mathbf{H}$ to calculate LLRs $\Lambda_l$ for all code bits $c_l$, $l = 1, \dots, R_0$, carried by $\mathbf{x}$. In practice, the use of suboptimal demodulators or of a channel estimate $\hat{\mathbf{H}}$ will result in approximate LLRs $\tilde{\Lambda}_l$. The LLRs are passed through the deinterleaver $\Pi^{-1}$ and then on to the channel decoder that delivers the detected bits $\hat{b}[q]$.
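The transmission model above can be illustrated with a minimal sketch, assuming Gray-labeled 4-QAM ($Q = 2$) in which the first bit of each label sets the sign of the real part and the second bit the sign of the imaginary part. The function names (`qam4_symbol`, `map_bits`, `receive`) and this particular labeling are illustrative assumptions and are not taken from the paper.

```python
import math

def qam4_symbol(bit_pair, symbol_energy):
    """Hypothetical Gray-labeled 4-QAM map mu(.): bit 1 -> sign of real part, bit 2 -> sign of imaginary part."""
    amp = math.sqrt(symbol_energy / 2.0)
    return complex(amp if bit_pair[0] == 0 else -amp,
                   amp if bit_pair[1] == 0 else -amp)

def qam4_label(symbol):
    """Inverse map mu^{-1}(.) for a noise-free 4-QAM symbol."""
    return (0 if symbol.real > 0 else 1, 0 if symbol.imag > 0 else 1)

def map_bits(code_bits, m_t, e_s=1.0):
    """Map R0 = Q * M_T code bits to a transmit vector x with E{||x||^2} = E_s."""
    q = 2
    if len(code_bits) != q * m_t:
        raise ValueError("expected R0 = Q * M_T code bits")
    # Layer k (0-based here) carries the bits c_{kQ+1}, ..., c_{kQ+Q}.
    return [qam4_symbol(code_bits[k * q:(k + 1) * q], e_s / m_t) for k in range(m_t)]

def demap_bits(x):
    """Recover the bit label c = mu^{-1}(x) of a noise-free transmit vector."""
    return [bit for symbol in x for bit in qam4_label(symbol)]

def receive(h, x, v):
    """Eq. (1): y = H x + v, with H stored as a list of M_R rows of length M_T."""
    return [sum(h_rk * x_k for h_rk, x_k in zip(row, x)) + v_r
            for row, v_r in zip(h, v)]
```

With $M_T = 4$ this gives $R_0 = 8$ code bits per channel use, the 4-QAM configuration considered later in Fig. 2(a).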
² The superscripts $T$ and $H$ denote transposition and Hermitian transposition, respectively. Furthermore, $\mathcal{A}^{M_T} \triangleq \mathcal{A} \times \dots \times \mathcal{A}$ is the $M_T$-fold Cartesian product of $\mathcal{A}$, $E\{\cdot\}$ denotes expectation, and $\|\cdot\|$ is the $\ell_2$ (Euclidean) norm.

B. Optimum Soft MAP Demodulation

Assuming i.i.d. uniform code bits (as guaranteed, e.g., by an ideal interleaver), the optimum soft MAP demodulator calculates the exact LLR for $c_l$ based on $(\mathbf{y}, \mathbf{H})$ according to [7]

$$\Lambda_l \triangleq \log \frac{p(c_l = 1 \mid \mathbf{y}, \mathbf{H})}{p(c_l = 0 \mid \mathbf{y}, \mathbf{H})} = \log \frac{\sum_{\mathbf{x} \in \mathcal{X}_l^1} \exp\!\big(-\frac{\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2}{\sigma_v^2}\big)}{\sum_{\mathbf{x} \in \mathcal{X}_l^0} \exp\!\big(-\frac{\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2}{\sigma_v^2}\big)}. \tag{2}$$

Here, $p(c_l \mid \mathbf{y}, \mathbf{H})$ is the probability mass function (pmf) of the code bits conditioned on $\mathbf{y}$ and $\mathbf{H}$, and $\mathcal{X}_l^1$ and $\mathcal{X}_l^0$ denote the complementary sets of transmit vectors for which $c_l = 1$ and $c_l = 0$, respectively (note that $\mathcal{A}^{M_T} = \mathcal{X}_l^1 \cup \mathcal{X}_l^0$). Unfortunately, computation of (2) has complexity $O(|\mathcal{A}|^{M_T}) = O(2^{R_0})$, i.e., exponential in the number of transmit antennas. For this reason, several suboptimal demodulators have been proposed which promise near-optimal performance while requiring a lower computational complexity. The aim of this work is to provide a fair performance comparison of these demodulators.

III. SYSTEM CAPACITY

In order for the information rates discussed below to have interpretations as ergodic capacities, we consider a fast fading scenario where the channel $\mathbf{H}[n]$ is a stationary, finite-memory process. We recall that the ergodic capacity with Gaussian inputs is given by [32]

$$C_G = E_{\mathbf{H}}\Big\{ \log_2 \det\Big( \mathbf{I} + \frac{E_s}{M_T \sigma_v^2}\, \mathbf{H}\mathbf{H}^H \Big) \Big\} \tag{3}$$

(here, $\mathbf{I}$ denotes the identity matrix). The non-ergodic regime (slow fading) is discussed in Section III-D.

A. Capacity of MIMO Coded Modulation

In a coded modulation (CM) system with equally likely transmit vectors $\mathbf{x} \in \mathcal{A}^{M_T}$ and no CSI at the transmitter, the average mutual information in bits per channel use (bpcu) is given by (cf. [5])

$$C_{\mathrm{CM}} \triangleq I(\mathbf{x}; \mathbf{y} \mid \mathbf{H}) = R_0 - E_{\mathbf{x}, \mathbf{y}, \mathbf{H}}\Big\{ \log_2 \frac{\sum_{\mathbf{x}' \in \mathcal{A}^{M_T}} f(\mathbf{y} \mid \mathbf{x}', \mathbf{H})}{f(\mathbf{y} \mid \mathbf{x}, \mathbf{H})} \Big\}. \tag{4}$$

Here, we used the conditional probability density function (pdf) (cf. (1))

$$f(\mathbf{y} \mid \mathbf{x}, \mathbf{H}) = \frac{1}{(\pi \sigma_v^2)^{M_R}} \exp\!\Big( -\frac{\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2}{\sigma_v^2} \Big). \tag{5}$$

In the following, we will refer to $C_{\mathrm{CM}}$ as CM capacity [2] (sometimes, $C_{\mathrm{CM}}$ is alternatively termed constellation-constrained capacity). It is seen from (4) that $C_{\mathrm{CM}} \le R_0$; in fact, the last term in (4) may be interpreted as a penalty term resulting from the noise and MIMO interference.

Using the fact that the mapping between the symbol vector $\mathbf{x}$ and the associated bit label $\{c_1, \dots, c_{R_0}\}$ is one-to-one and applying the chain rule for mutual information [33, page 24] to (4) leads to

$$C_{\mathrm{CM}} = I(c_1, \dots, c_{R_0}; \mathbf{y} \mid \mathbf{H}) = \sum_{l=1}^{R_0} I(c_l; \mathbf{y} \mid c_1, \dots, c_{l-1}, \mathbf{H}) = \sum_{l=1}^{R_0} \big[ H(c_l \mid \mathbf{H}) - H(c_l \mid \mathbf{y}, c_1, \dots, c_{l-1}, \mathbf{H}) \big]; \tag{6}$$

here, $H(\cdot)$ denotes the entropy function. The single-antenna equivalent of (6) served as a motivation for multilevel coding and multistage decoding, which can indeed achieve CM capacity [4]. Multilevel coding for multiple antenna systems has been considered in [34].

B. Capacity of MIMO-BICM

In the following, we assume an ideal, infinite-length bit interleaver³ which allows us to treat the BICM system as a set of $R_0$ independent parallel memoryless binary-input channels as in [2, Section III.A]. Using the assumption of i.i.d. uniform code bits, the maximum rate achievable with BICM is given by (cf. [5])

$$C_{\mathrm{BICM}} \triangleq \sum_{l=1}^{R_0} I(c_l; \mathbf{y} \mid \mathbf{H}) = \sum_{l=1}^{R_0} \big[ H(c_l \mid \mathbf{H}) - H(c_l \mid \mathbf{y}, \mathbf{H}) \big] = R_0 - \sum_{l=1}^{R_0} E_{\mathbf{x}, \mathbf{y}, \mathbf{H}}\Big\{ \log_2 \frac{\sum_{\mathbf{x}' \in \mathcal{A}^{M_T}} f(\mathbf{y} \mid \mathbf{x}', \mathbf{H})}{\sum_{\mathbf{x}' \in \mathcal{X}_l^{c_l}} f(\mathbf{y} \mid \mathbf{x}', \mathbf{H})} \Big\}, \tag{7}$$

where⁴ $c_l = (\mathbf{c})_l = (\mu^{-1}(\mathbf{x}))_l$ denotes the $l$th bit in the label of $\mathbf{x}$. Since conditioning reduces entropy [33, page 29], a comparison of (6) and (7) reveals that [34]

$$C_{\mathrm{BICM}} \le C_{\mathrm{CM}}.$$

The gap $C_{\mathrm{CM}} - C_{\mathrm{BICM}}$ increases with $|\mathcal{A}|$ and $M_T$ and depends strongly on the symbol labeling [7]. For single-antenna BICM systems with Gray labeling, this gap has been shown to be negligible [2], [4]; however, for MIMO systems (see Section IV) and at low SNRs in the wideband regime [3] it can be significant. The capacity loss can be attributed to the fact that the BICM receiver neglects the dependencies between the transmitted code bits. Under the unrealistic assumption of perfectly known channel SNR, multilevel coding with multistage decoding can in principle avoid such a capacity loss but suffers from error propagation [4], [34]. BICM does not require the channel SNR at the transmitter and can be considered more robust. A hybrid version of CM and BICM whose complexity and performance is between the two was presented in [34]. Furthermore, augmenting BICM with space-time mappings can be beneficial (cf. [34], [35]) but is not considered here due to space limitations.

³ In practice, this means that the interleaver needs to be much longer than the codewords transmitted over the channel.

⁴ By $(\mathbf{x})_k$ and $(\mathbf{X})_{k,l}$ we respectively denote the $k$th element of the vector $\mathbf{x}$ and the element in row $k$ and column $l$ of the matrix $\mathbf{X}$.

It can be shown that the log-likelihood ratio $\Lambda_l$ in (2) is a sufficient statistic [36] for $c_l$ given $\mathbf{y}$ and $\mathbf{H}$. Therefore, (7) can be rewritten as

$$C_{\mathrm{BICM}} = \sum_{l=1}^{R_0} I(c_l; \Lambda_l). \tag{8}$$

Hence, $C_{\mathrm{BICM}}$ can be interpreted as the capacity of an equivalent channel with inputs $c_l$ and outputs $\Lambda_l$ (cf. Fig. 1). This channel is characterized by the conditional pdf $\prod_l f(\Lambda_l \mid c_l)$, which usually is hard to obtain analytically, however.

C. System Capacity and Demodulator Performance

Motivated by the interpretation of $C_{\mathrm{BICM}}$ as the system capacity of BICM using the optimum MAP demodulator, we propose to measure the performance of suboptimal MIMO-BICM demodulators via the system capacity of the associated equivalent “modulation” channel with binary inputs $c_l$ and the approximate LLRs $\tilde{\Lambda}_l$ as continuous outputs (cf. Fig. 1). This channel is described by the conditional pdf $\prod_l f(\tilde{\Lambda}_l \mid c_l)$. Its system capacity is defined as the mutual information between $c_l$ and $\tilde{\Lambda}_l$, which can be shown to equal

$$C \triangleq \sum_{l=1}^{R_0} I(c_l; \tilde{\Lambda}_l) = R_0 - \sum_{l=1}^{R_0} \sum_{b=0}^{1} \int_{-\infty}^{\infty} \frac{1}{2} f(\tilde{\Lambda}_l \mid c_l = b) \log_2 \frac{f(\tilde{\Lambda}_l)}{\frac{1}{2} f(\tilde{\Lambda}_l \mid c_l = b)} \, d\tilde{\Lambda}_l, \tag{9}$$

where $f(\tilde{\Lambda}_l) = \big[ f(\tilde{\Lambda}_l \mid c_l = 0) + f(\tilde{\Lambda}_l \mid c_l = 1) \big] / 2$. We emphasize that the system capacity $C$ provides a performance measure for MIMO (soft) demodulators that is independent of the outer channel code. In fact, it has an intuitive operational interpretation as the highest rate achievable (in the sense of asymptotically vanishing error probability) in a BICM system with independent parallel channels (assumption of an ideal infinite-length interleaver, cf. [2, Section III.A]), using the specific demodulator which produces $\tilde{\Lambda}_l$. Since $\tilde{\Lambda}_l$ is derived from $\mathbf{y}$ and $\mathbf{H}$, the data processing inequality [33, page 34] implies that $C \le C_{\mathrm{BICM}}$ with equality if $\tilde{\Lambda}_l$ is a one-to-one function of $\Lambda_l$. The performance of a soft demodulator can thus be measured in terms of the gap $C_{\mathrm{BICM}} - C$.
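To make (9) concrete, the following is a minimal sketch of a histogram-based estimate of $I(c_l; \tilde{\Lambda}_l)$ from labeled LLR samples. The uniform binning, the clipping range, and the function names are illustrative assumptions; the binning actually used for the numerical results is optimized following [48], which this sketch does not attempt to reproduce.

```python
import math

def estimate_bit_capacity(llrs, bits, num_bins=64, clip=40.0):
    """Histogram estimate of I(c; llr) in bits for an equiprobable code bit c.

    llrs[i] is the (approximate) LLR observed when bits[i] was transmitted.
    Uniform bins on [-clip, clip] are an assumption of this sketch.
    """
    width = 2.0 * clip / num_bins

    def bin_of(value):
        clipped = max(-clip, min(clip, value))
        return min(int((clipped + clip) / width), num_bins - 1)

    # One histogram per transmitted bit value b in {0, 1}.
    hist = [[0] * num_bins, [0] * num_bins]
    for llr, bit in zip(llrs, bits):
        hist[bit][bin_of(llr)] += 1
    totals = [sum(hist[0]), sum(hist[1])]
    if 0 in totals:
        raise ValueError("need LLR samples for both bit values")

    info = 0.0
    for j in range(num_bins):
        p_given = [hist[b][j] / totals[b] for b in (0, 1)]  # discretized f(llr | c = b)
        p_mix = 0.5 * (p_given[0] + p_given[1])              # discretized f(llr)
        for p in p_given:
            if p > 0.0:
                # Equivalent to the per-bit term of (9): 1 - sum 0.5 p log2(p_mix / (0.5 p)).
                info += 0.5 * p * math.log2(p / p_mix)
    return info

def estimate_system_capacity(llrs_per_position, bits_per_position):
    """Sum the per-position estimates over the R0 bit positions, as in (9)."""
    return sum(estimate_bit_capacity(llrs, bits)
               for llrs, bits in zip(llrs_per_position, bits_per_position))
```

Keeping one histogram per bit position $l$ reflects the fact that the sum in (9) runs over position-dependent pdfs.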
Of course, the information-theoretic performance measure in (9) does not take into account complexity issues, and it has to be expected that a reduction of the gap $C_{\mathrm{BICM}} - C$ in general can only be achieved at the expense of increasing computational complexity.

We caution the reader that the rates in (8) and (9) are sums of mutual informations for the individual code bits $c_1, \dots, c_{R_0}$ carried by one symbol vector. Indeed, the pdfs $f(\Lambda_l \mid c_l)$ and $f(\tilde{\Lambda}_l \mid c_l)$ in general depend on the code bit position $l$, even though for certain systems (e.g., 4-QAM modulation) the code bit protection and LLR statistics are independent of the bit position $l$ for reasons of symmetry. Achieving (8) and (9) thus requires channel encoders and decoders that take the bit position into account. When the channel code fails to use this information, the rate loss is small provided that the mapping protects different code bits $c_l$ roughly equally against noise and interference.

D. Non-Ergodic Channels

In the case of quasi-static or slow fading [37], the channel $\mathbf{H}$ is random but constant over time, i.e., each codeword can extend over only one channel realization. Here, the ergodic capacity of the modulation channel is no longer operationally meaningful [37], [38]. Instead we consider the outage probability

$$P_{\mathrm{out}}(R) \triangleq P\{ R_{\mathbf{H}} < R \}, \tag{10}$$

where $R_{\mathbf{H}}$ is a random variable defined as

$$R_{\mathbf{H}} \triangleq \sum_{l=1}^{R_0} I_{\mathbf{H}}(c_l; \tilde{\Lambda}_l).$$

Here, $I_{\mathbf{H}}(c_l; \tilde{\Lambda}_l)$ denotes the conditional mutual information, which is evaluated with $f(\tilde{\Lambda}_l \mid c_l, \mathbf{H})$ in place of $f(\tilde{\Lambda}_l \mid c_l)$ (cf. (9)). Note that the ergodic system capacity $C$ in (9) equals $C = E_{\mathbf{H}}\{R_{\mathbf{H}}\}$. The outage probability $P_{\mathrm{out}}(R)$ can be interpreted as the smallest codeword error probability achievable at rate $R$ [38].
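As a minimal sketch, and assuming that samples of $R_{\mathbf{H}}$ (one per channel realization) are already available, the outage probability (10) and the ergodic average $C = E_{\mathbf{H}}\{R_{\mathbf{H}}\}$ can be estimated empirically as follows. The sketch also includes an empirical version of the related $\epsilon$-capacity, $\sup\{R \mid P\{R_{\mathbf{H}} < R\} < \epsilon\}$. The function names are hypothetical.

```python
def ergodic_capacity(rate_samples):
    """Sample mean of R_H, an estimate of C = E_H{R_H}."""
    return sum(rate_samples) / len(rate_samples)

def outage_probability(rate_samples, rate):
    """Empirical version of (10): fraction of channel realizations with R_H < rate."""
    return sum(1 for r in rate_samples if r < rate) / len(rate_samples)

def epsilon_capacity(rate_samples, eps):
    """Empirical sup{R : P{R_H < R} < eps} based on the sorted samples."""
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    ordered = sorted(rate_samples)
    n = len(ordered)
    # At R = ordered[k] at most k samples lie strictly below R, so the
    # empirical outage probability is at most k / n; take the largest such k.
    k_max = max(k for k in range(n) if k / n < eps)
    return ordered[k_max]
```

For example, with the eight samples 1, 2, ..., 8 bpcu and $\epsilon = 0.25$, the empirical $\epsilon$-capacity is 2 bpcu: only one of eight samples lies strictly below 2, whereas any larger rate leaves at least two samples in outage.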
A closely related concept is the $\epsilon$-capacity of the equivalent modulation channel, defined as

$$C_\epsilon \triangleq \sup\{ R \mid P\{R_{\mathbf{H}} < R\} < \epsilon \}. \tag{11}$$

The $\epsilon$-capacity may be interpreted as the maximum rate for which a codeword error probability less than $\epsilon$ can be achieved. Rates smaller than $C_\epsilon$ are referred to as $\epsilon$-achievable rates [38]. If $P_{\mathrm{out}}(R)$ is a continuous and increasing function of $R$ (which is usually the case in practice), it holds that $P_{\mathrm{out}}(C_\epsilon) = \epsilon$.

E. Generalized Mutual Information

The operational interpretation of our performance measure as the largest achievable rate for a BICM system using a given demodulator requires the assumption of an ideal infinite-length interleaver. With a finite-length interleaver, the parallel channels (i.e., the different bits in a given symbol vector) are not independent in general; here, achievable rates can be characterized in terms of the generalized mutual information (GMI), which is obtained by treating the BICM receiver as a mismatched decoder [3], [39], [40]. For the case of optimum soft MAP demodulation (cf. (2)), the BICM capacity using the independent parallel-channel model coincides with the GMI [40]. We recently provided a non-straightforward extension of this result by showing that the GMI of a BICM system with suboptimal demodulators augmented with scalar LLR correction (see Section IV-D) coincides with the system capacity in (9) obtained for the parallel-channel model [41].⁵ Scalar LLR correction has been used previously to provide the binary decoder with accurate reliability information [36], [42]–[45]. The GMI of a BICM system with finite interleaver and LLR-corrected suboptimal demodulators can thus efficiently be computed by evaluating (9) [41]; this provides additional justification for the use of (9) as a code-independent performance measure for approximate demodulators.
We note that a GMI-based analysis of BICM with mismatched decoding metrics that generalizes our work in [41] has recently been presented in [46].

⁵ We note that the LLR correction leaves the mutual information which underlies system capacity unchanged.

IV. BASELINE MIMO-BICM DEMODULATORS

In this section, we first review max-log and hard ML demodulation as well as linear MIMO demodulators and then provide results illustrating their performance in terms of system capacities. These demodulators serve as baseline systems for later demodulator performance comparisons in Section V. We note that max-log and hard ML MIMO demodulators have the highest complexity among all soft and hard demodulation schemes, respectively, whereas linear MIMO demodulators are most efficient computationally. Due to space limitations, we only state the complexity order of each demodulator in the following and give references that provide more detailed complexity analyses.

A. Max-Log and Hard ML Demodulator

Applying the max-log approximation to (2) simplifies the LLR computation to a minimum distance problem and results in the approximate LLRs [7]

$$\tilde{\Lambda}_l = \frac{1}{\sigma_v^2} \Big[ \min_{\mathbf{x} \in \mathcal{X}_l^0} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 - \min_{\mathbf{x} \in \mathcal{X}_l^1} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 \Big]. \tag{12}$$

This expression can be implemented more easily than (2) since it avoids the logarithm and exponential functions. However, computation of $\tilde{\Lambda}_l$ in (12) still requires two searches over sets of size $|\mathcal{A}|^{M_T}/2 = 2^{R_0 - 1}$. Sphere decoder implementations of (12) are presented in [8], [10].

Hard vector ML demodulation amounts to the minimum distance problem

$$\hat{\mathbf{x}}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \mathcal{A}^{M_T}} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2. \tag{13}$$

This optimization problem can be solved by exhaustive search or using a sphere decoder; in both cases, the computational complexity scales exponentially with the number of transmit antennas.
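For very small systems, (12) and (13) can be evaluated by exhaustive search. The sketch below assumes an unnormalized Gray-labeled 4-QAM alphabet (the dictionary `QAM4` is an illustrative choice, not the paper's normalization) and visits all $|\mathcal{A}|^{M_T}$ candidate vectors once, so it only illustrates the definitions and is not a substitute for the sphere decoders of [8]–[10].

```python
import itertools

# Hypothetical unnormalized Gray-labeled 4-QAM alphabet: bit label -> symbol.
QAM4 = {(0, 0): 1 + 1j, (0, 1): 1 - 1j, (1, 0): -1 + 1j, (1, 1): -1 - 1j}

def squared_distance(y, h, x):
    """||y - H x||^2 for H stored as a list of rows."""
    return sum(abs(y_r - sum(h_rk * x_k for h_rk, x_k in zip(row, x))) ** 2
               for y_r, row in zip(y, h))

def max_log_and_ml(y, h, noise_var, labels=QAM4):
    """Return the max-log LLRs of (12) and the bit label of the hard ML solution of (13)."""
    m_t = len(h[0])
    q = len(next(iter(labels)))
    # best[l][b] tracks the minimum of ||y - H x||^2 over vectors with c_l = b.
    best = [[float("inf"), float("inf")] for _ in range(q * m_t)]
    ml_bits, ml_dist = None, float("inf")
    for combo in itertools.product(labels.items(), repeat=m_t):
        bits = [b for label, _ in combo for b in label]
        x = [symbol for _, symbol in combo]
        dist = squared_distance(y, h, x)
        if dist < ml_dist:
            ml_dist, ml_bits = dist, bits
        for l, b in enumerate(bits):
            if dist < best[l][b]:
                best[l][b] = dist
    llrs = [(d0 - d1) / noise_var for d0, d1 in best]
    return llrs, ml_bits
```

Because the ML vector attains the overall minimum distance, it also attains the minimum over whichever of the two sets contains it, so each ML bit agrees with the sign of the corresponding max-log LLR whenever that LLR is nonzero.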
The detected code bits $\hat{c}_l$ corresponding to (13) are obtained via the one-to-one mapping between code bits and symbol vectors, i.e., $\hat{\mathbf{c}} = (\hat{c}_1 \, \dots \, \hat{c}_{R_0})^T = \mu^{-1}(\hat{\mathbf{x}}_{\mathrm{ML}})$. It can be shown that the code bits $\hat{c}_l$ obtained by the hard ML detector correspond to the sign of the corresponding max-log LLRs in (12), i.e., $\hat{c}_l = u(\tilde{\Lambda}_l)$ where $u(\cdot)$ denotes the unit step function.

When it comes to computing the system capacity with hard-output demodulators, the only difference to soft-output demodulation is the discrete nature of the outputs $\hat{c}_l$ of the equivalent “modulation” channel, which here becomes a binary channel. Consequently, the integral over $\tilde{\Lambda}_l$ in (9) is replaced with a summation over $\hat{c}_l \in \{0, 1\}$.

B. Linear Demodulators

In the following, $\Lambda_k^{(i)}$ is the LLR corresponding to $c_k^{(i)}$ (the $i$th bit in the bit label of the $k$th symbol $x_k$). Soft demodulators with extremely low complexity can be obtained by using a linear (ZF or MMSE) equalizer followed by per-layer max-log LLR calculation according to

$$\tilde{\Lambda}_k^{(i)} = \frac{1}{\sigma_k^2} \Big[ \min_{x \in \mathcal{A}_i^0} |\hat{x}_k - x|^2 - \min_{x \in \mathcal{A}_i^1} |\hat{x}_k - x|^2 \Big], \qquad i = 1, \dots, Q, \quad k = 1, \dots, M_T. \tag{14}$$

Here, $\mathcal{A}_i^b \subset \mathcal{A}$ denotes the set of (scalar) symbols whose bit label at position $i$ equals $b$, $\hat{x}_k$ is an estimate of the symbol in layer $k$ provided by the equalizer, and $\sigma_k^2$ is an equalizer-specific weight (see below). We emphasize that calculating LLRs separately for each layer results in a significant complexity reduction. In fact, calculating the symbol estimates $\hat{x}_k$ using a ZF or MMSE equalizer requires $O(M_R M_T^2)$ operations; furthermore, the complexity of evaluating (14) for all code bits scales as $O(M_R M_T 2^Q) = O(M_R M_T |\mathcal{A}|)$, i.e., linearly in the number of antennas [24], [25].
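The per-layer calculation in (14) can be sketched as follows, again assuming an unnormalized Gray-labeled 4-QAM alphabet (`QAM4`) purely for illustration. The equalizer output `x_hat` and the weight `sigma2` are taken as given inputs here.

```python
# Hypothetical unnormalized Gray-labeled 4-QAM alphabet: bit label -> symbol.
QAM4 = {(0, 0): 1 + 1j, (0, 1): 1 - 1j, (1, 0): -1 + 1j, (1, 1): -1 - 1j}

def per_layer_llrs(x_hat, sigma2, labels=QAM4):
    """Eq. (14): max-log LLRs for the Q bits of one layer from a scalar symbol estimate."""
    q = len(next(iter(labels)))
    llrs = []
    for i in range(q):
        # Minimum squared distance to the symbols whose i-th label bit is 0 or 1.
        d0 = min(abs(x_hat - s) ** 2 for label, s in labels.items() if label[i] == 0)
        d1 = min(abs(x_hat - s) ** 2 for label, s in labels.items() if label[i] == 1)
        llrs.append((d0 - d1) / sigma2)
    return llrs

def hard_bits(llrs):
    """Hard decisions as the unit step of the LLRs."""
    return [1 if value > 0 else 0 for value in llrs]
```

Each layer needs only $2^Q$ scalar distance evaluations, compared with the $2^{R_0}$ vector candidates of the exhaustive max-log search.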
Equalization-based hard bit decisions $\hat{c}_k^{(i)}$ can be obtained by quantization of the equalizer output $\hat{x}_k$ with respect to $\mathcal{A}$ (denoted by $\mathcal{Q}(\cdot)$), followed by the demapping, i.e., $\big(\hat{c}_k^{(1)} \, \dots \, \hat{c}_k^{(Q)}\big)^T = \mu^{-1}\big(\mathcal{Q}(\hat{x}_k)\big)$. Again, the detected code bits correspond to the sign of the LLRs, i.e., $\hat{c}_k^{(i)} = u(\tilde{\Lambda}_k^{(i)})$.

1) ZF-based Demodulator [22], [23]: Here, the first stage consists of ZF equalization, i.e.,

$$\hat{\mathbf{x}}_{\mathrm{ZF}} = (\mathbf{H}^H \mathbf{H})^{-1} \mathbf{H}^H \mathbf{y} = \mathbf{x} + \tilde{\mathbf{v}}, \tag{15}$$

where the post-equalization noise vector $\tilde{\mathbf{v}}$ has correlation matrix

$$\mathbf{R}_{\tilde{\mathbf{v}}} = E\{\tilde{\mathbf{v}} \tilde{\mathbf{v}}^H\} = \sigma_v^2 (\mathbf{H}^H \mathbf{H})^{-1}. \tag{16}$$

Subsequently, approximate bit LLRs are obtained according to (14) with symbol estimate $\hat{x}_k = (\hat{\mathbf{x}}_{\mathrm{ZF}})_k$ and weight factor $\sigma_k^2 = (\mathbf{R}_{\tilde{\mathbf{v}}})_{k,k}$.

2) MMSE-based Demodulator [24], [25]: Here, the first stage is an MMSE equalizer that can be written as (cf. (15) and (16))

$$\hat{\mathbf{x}}_{\mathrm{MMSE}} = \mathbf{W} \hat{\mathbf{x}}_{\mathrm{ZF}}, \qquad \text{with} \quad \mathbf{W} = \Big( \mathbf{I} + \frac{M_T}{E_s} \mathbf{R}_{\tilde{\mathbf{v}}} \Big)^{-1}. \tag{17}$$

Approximate LLRs are then calculated according to (14) with $\hat{x}_k = \frac{(\hat{\mathbf{x}}_{\mathrm{MMSE}})_k}{W_k}$ and $\sigma_k^2 = \frac{E_s}{M_T} \frac{1 - W_k}{W_k}$, where $W_k = (\mathbf{W})_{k,k}$. Here, $\hat{x}_k$ denotes the output of the unbiased MMSE equalizer, which is preferable to a biased MMSE equalizer for non-constant modulus modulation schemes such as 16-QAM and 64-QAM [47]; in the remainder we will thus restrict to unbiased soft and hard MMSE demodulators.

Fig. 2. Numerical capacity results (maximum achievable rate in bpcu versus SNR in dB) for (a) a 4 × 4 MIMO system with 4-QAM, and (b) a 2 × 4 MIMO system with 16-QAM (in both cases with Gray labeling). Curves: Gauss, CM, BICM, max-log, hard-ML, soft-MMSE, hard-MMSE, soft-ZF, hard-ZF; each panel contains a zoomed inset.

C. Capacity Results

We next compare the performance of the above baseline demodulators (i.e., max-log, hard ML, MMSE, and ZF) in terms of their maximum achievable rate $C$ (ergodic system capacity, see (9)). In addition, the CM capacity $C_{\mathrm{CM}}$ in (4), the MIMO-BICM capacity $C_{\mathrm{BICM}}$ in (7) (corresponding to the system capacity of the optimum soft MAP demodulator in (2)), and the Gaussian input channel capacity $C_G$ in (3) are shown as benchmarks. Throughout the paper, all capacity results have been obtained for spatially i.i.d. Rayleigh fading, with all fading coefficients normalized to unit variance. The pdfs required for evaluating (9) are generally hard to obtain in closed form. Thus, we measured these pdfs using Monte Carlo simulations and then evaluated all integrals numerically. Based on the results in [48], we numerically optimized the binning (used to measure the pdfs) in order to reduce the bias and variance of the mutual information estimates (see Appendix A for more details).

The capacity results (obtained with $10^5$ fading realizations) are shown in Fig. 2 in bits per channel use versus SNR $\rho \triangleq E_s / \sigma_v^2$. In the following, in some of the plots we show insets that provide zooms of the capacity curves around a rate of $R_0/2$ bpcu.

Fig. 2(a) pertains to the case of $M_T \times M_R = 4 \times 4$ MIMO with Gray-labeled 4-QAM (here, $R_0 = 8$). At a target rate of 4 bpcu, the SNR required for CM and Gaussian capacity is virtually the same, whereas that for BICM is larger by about 1.3 dB. The SNR penalty of using max-log demodulation instead of soft MAP is about 0.3 dB. Furthermore, hard ML demodulation requires a 2.1 dB higher SNR to achieve this rate than max-log demodulation; for soft and hard MMSE demodulation the SNR gaps to max-log are 0.2 dB and 3.1 dB, respectively, while for soft and hard ZF demodulation they respectively equal 5.1 dB and 8.1 dB.
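The ZF and unbiased MMSE front ends compared above (cf. (15)–(17)) can be sketched as follows. To stay self-contained without a linear algebra library, this sketch is restricted to $M_T = 2$ transmit antennas so that a closed-form 2 × 2 inverse suffices; that restriction and the function names are assumptions made for illustration only.

```python
def inv2(a):
    """Inverse of a 2x2 (complex) matrix given as nested lists."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def zf_and_unbiased_mmse(y, h, noise_var, e_s=1.0):
    """Return (x_zf, var_zf, x_mmse, var_mmse) for an M_R x 2 channel H (list of rows).

    x_zf, var_zf follow (15), (16); x_mmse, var_mmse are the unbiased MMSE
    estimates and weights derived from (17).
    """
    m_t = 2
    # Gram matrix G = H^H H and matched-filter output H^H y.
    g = [[sum(row[i].conjugate() * row[j] for row in h) for j in range(m_t)]
         for i in range(m_t)]
    mf = [sum(row[i].conjugate() * y_r for row, y_r in zip(h, y)) for i in range(m_t)]
    g_inv = inv2(g)
    x_zf = [sum(g_inv[i][j] * mf[j] for j in range(m_t)) for i in range(m_t)]
    # R_v = noise_var * (H^H H)^{-1}; its diagonal gives the ZF weights.
    r_v = [[noise_var * g_inv[i][j] for j in range(m_t)] for i in range(m_t)]
    var_zf = [r_v[k][k].real for k in range(m_t)]
    # W = (I + (M_T / E_s) R_v)^{-1}.
    w = inv2([[(1 if i == j else 0) + (m_t / e_s) * r_v[i][j] for j in range(m_t)]
              for i in range(m_t)])
    w_diag = [w[k][k].real for k in range(m_t)]
    x_biased = [sum(w[i][j] * x_zf[j] for j in range(m_t)) for i in range(m_t)]
    x_mmse = [x_biased[k] / w_diag[k] for k in range(m_t)]
    var_mmse = [(e_s / m_t) * (1.0 - w_diag[k]) / w_diag[k] for k in range(m_t)]
    return x_zf, var_zf, x_mmse, var_mmse
```

For a channel with orthonormal columns the two equalizers coincide, while for a general channel the unbiased MMSE weight is never larger than the ZF weight, which is in line with the gaps between the soft ZF and soft MMSE curves reported for Fig. 2(a).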
An interesting observation in this scenario is the fact that at low rates, soft and hard MMSE demodulation slightly outperform max-log and hard ML demodulation, respectively, whereas at high rates MMSE demodulation degrades to ZF performance. Hard MMSE demodulation can outperform hard ML demodulation since the latter minimizes the vector symbol error probability whereas our system capacity is defined on the bit level. Surprisingly, at low rates soft MMSE essentially coincides with BICM capacity. Moreover, soft MMSE demodulation outperforms hard ML demodulation at low-to-medium rates whereas at high rates it is the other way around (the cross-over can be seen at about 5.8 bpcu). These observations reveal the somewhat unexpected fact that the demodulator performance ranking is not universal but depends on the target rate (or equivalently, the target SNR), even if the number of antennas, the symbol constellation, and the labeling are fixed.

Similar observations apply to 16-QAM instead of 4-QAM and to set-partitioning labeling instead of Gray labeling (see [31]). Apart from a general shift of all curves to higher SNRs, the larger constellation and/or the different labeling strategy causes an increase of the gap between CM capacity and BICM capacity. The gaps between hard ML, hard MMSE, and soft ZF demodulation are significantly smaller, though, in this case (soft ZF outperforms hard MMSE for rates above 6.2 bpcu and approaches hard ML for rates around 6 bpcu). When decreasing the antenna configuration to a 2 × 2 system, we observed that soft ZF outperforms hard ML demodulation for low-to-medium rates, e.g., by about 1.7 dB at 4 bpcu with 16-QAM [31].

The situation changes for the case of a 2 × 4 MIMO system with Gray-labeled 16-QAM (again $R_0 = 8$), shown in Fig. 2(b).
The increased SNR gap between CM and BICM capacity implied by the larger constellation is compensated by having more receive than transmit antennas (this agrees with observations in [7]). In addition, the performance differences between the individual demodulators are significantly reduced, revealing that the essential distinction is between soft and hard demodulators. Having $M_R > M_T$ helps the linear demodulators approach their non-linear counterparts even at larger rates, i.e., soft ZF/MMSE perform close to max-log and hard ZF/MMSE perform close to hard ML, with an SNR gap of about 2.3 dB between hard and soft demodulators. Note that in this scenario soft MMSE and soft ZF both outperform hard ML demodulation at all rates.

D. BER Performance

Even though we advocate a demodulator comparison in terms of system capacity, the cross-over of some of the capacity curves prompts a verification in terms of the BER of soft and hard MMSE demodulation as well as max-log and hard ML demodulation. We consider a $4 \times 4$ MIMO-BICM system with Gray-labeled 4-QAM in conjunction with irregular LDPC codes$^6$ [49] of block length 64000. For the case of soft demodulation, the LDPC codes were designed for an additive white Gaussian noise (AWGN) channel whereas for the case of hard demodulators the design was for a binary symmetric channel. At the receiver, message-passing LDPC decoding [49] was performed. In the case of hard demodulation, the message-passing decoder was provided with the LLRs

$$\tilde{\Lambda}_l = (2\hat{c}_l - 1)\,\log\frac{1 - p_0}{p_0}, \qquad (18)$$

where $p_0 = P\{\hat{c}_l \neq c_l \mid c_l\}$, the cross-over probability of the equivalent binary channel, was determined via Monte Carlo simulations. With the soft demodulators, we performed an LLR correction via a lookup table as in [45].
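As a small illustration of (18), the following sketch maps a hard bit decision to an LLR and estimates the cross-over probability from a known bit sequence. The function names are hypothetical, and the empirical error rate is only one straightforward way to carry out the Monte Carlo measurement mentioned above.

```python
import math


def estimate_crossover(tx_bits, detected_bits):
    """Empirical cross-over probability p0 of the equivalent binary channel."""
    errors = sum(t != d for t, d in zip(tx_bits, detected_bits))
    return errors / len(tx_bits)


def hard_decision_llr(bit_hat, p0):
    """LLR of (18): (2*bit_hat - 1) * log((1 - p0) / p0)."""
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    return (2 * bit_hat - 1) * math.log((1.0 - p0) / p0)
```

For example, `hard_decision_llr(1, 0.1)` equals `log(9)`, and the sign flips for a detected bit of 0.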
Using LLR correction for soft demodulators and (18) for hard-output demodulators is critical in order to provide the channel decoder with accurate reliability information [36], [41]–[45].

The BERs obtained for code rates of 1/4 (2 bpcu) and 3/4 (6 bpcu) are shown in Fig. 3(a) and (b), respectively. Vertical lines indicate the respective capacity limits, i.e., the minimum SNR required for the target rate according to Fig. 2(a). It is seen that the LDPC code designs are less than 1 dB away from the capacity limits. At low rates soft MMSE performs best and hard ML performs worst whereas at high rates max-log and hard MMSE give the best and worst results, respectively. More specifically, at rate 1/4 soft MMSE outperforms max-log and hard ML demodulation by 0.3 dB and 2.9 dB, respectively (cf. Fig. 3(a)); at rate 3/4 soft MMSE performs 0.5 dB worse than hard ML and 2.1 dB worse than max-log (cf. Fig. 3(b)). These BER results confirm the capacity-based observation that there is no universal (i.e., rate- and SNR-independent) demodulator performance ranking. We note that the block error rate results in [50] imply similar conclusions, even though this is not explicitly mentioned in that paper.

$^6$ The LDPC code design was performed using the web-tool at http://lthcwww.epfl.ch/research/ldpcopt.

[Figure: BER versus SNR [dB]; curves: max-log, hard-ML, soft-MMSE, hard-MMSE; vertical capacity limits at −2.53 dB, −2.17 dB, 0.17 dB, and 0.46 dB in panel (a) and at 6.34 dB, 8 dB, 8.27 dB, and 11.51 dB in panel (b).]
Fig. 3. BER vs. SNR for a $4 \times 4$ MIMO system with Gray-labeled 4-QAM and LDPC codes of (a) rate 1/4 and (b) rate 3/4.

V. OTHER DEMODULATORS

In the following, we study the system capacity of several other MIMO-BICM demodulators that differ in their underlying principle and their computational complexity. Unless stated otherwise, capacity results shown in this section pertain to a $4 \times 4$ MIMO system with 4-QAM using Gray labeling ($R_0 = 8$). The results for asymmetric $2 \times 4$ MIMO systems with 16-QAM (shown in [31] but not here) essentially confirm the general distinction between hard and soft demodulators observed in connection with Fig. 2(b).

A. List-based Demodulators

In order to save computational complexity, (12) can be approximated by decreasing the size of the search set, i.e., replacing $\mathcal{A}^{M_T}$ with a smaller set. Usually, this is achieved by generating a (non-empty) candidate list $\mathcal{L} \subseteq \mathcal{A}^{M_T}$ and restricting the search in (12) to this list, i.e.,

$$\tilde{\Lambda}_l = \frac{1}{\sigma_v^2}\Big[\min_{\mathbf{x} \in \mathcal{L} \cap \mathcal{X}_l^0} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 - \min_{\mathbf{x} \in \mathcal{L} \cap \mathcal{X}_l^1} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2\Big]. \qquad (19)$$

As the number of operations required to compute the metric for each candidate of the list is $O(M_T M_R)$, the overall computational complexity of the metric evaluations and minima searches in (19) scales as $O(M_T M_R |\mathcal{L}|)$. Thus, the list size $|\mathcal{L}|$ allows trading off performance against complexity. A larger list size generally incurs higher complexity but yields more accurate approximations of the max-log LLRs. For a fixed list size, the performance further depends on how the list $\mathcal{L}$ is generated. In the following, we consider two types of list generation, one based on sphere decoding and the other on bit flipping.
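The list-restricted LLR computation in (19) can be sketched as follows. The candidate list is passed in as pairs of symbol vector and bit label, so the sketch is independent of how the list is generated. The function names and the default clipping value are illustrative assumptions; assigning a fixed value when one of the two subsets is empty follows the convention used for the LSD.

```python
def squared_distance(y, H, x):
    """Squared Euclidean distance ||y - H x||^2 for H given as a list of rows."""
    return sum(abs(y_r - sum(h * s for h, s in zip(row, x))) ** 2
               for y_r, row in zip(y, H))


def list_max_log_llrs(y, H, noise_var, candidates, clip=10.0):
    """Approximate max-log LLRs as in (19).

    candidates: list of (symbol_vector, bit_label) pairs forming the list L.
    clip: value assigned (with the appropriate sign) if no candidate has
          bit l equal to 0 or to 1; an illustrative choice.
    """
    metrics = [(squared_distance(y, H, x), label) for x, label in candidates]
    num_bits = len(candidates[0][1])
    llrs = []
    for l in range(num_bits):
        d0 = [m for m, label in metrics if label[l] == 0]
        d1 = [m for m, label in metrics if label[l] == 1]
        if not d0:        # every candidate has bit l equal to 1
            llrs.append(clip)
        elif not d1:      # every candidate has bit l equal to 0
            llrs.append(-clip)
        else:
            llrs.append((min(d0) - min(d1)) / noise_var)
    return llrs
```

Each metric costs on the order of $M_T M_R$ operations, so the loop over the list reproduces the $O(M_T M_R |\mathcal{L}|)$ scaling stated above.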
1) List Sphere Decoder (LSD): The LSD proposed in [15] uses a simple modification of the hard-decision sphere decoder [51] to generate the candidate list $\mathcal{L}$ such that it contains the $|\mathcal{L}|$ symbol vectors $\mathbf{x}$ with the smallest ML metric $\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2$ (thus, by definition $\mathcal{L}$ contains the hard ML solution $\hat{\mathbf{x}}_{\mathrm{ML}}$ in (13)). If the $l$th bit in the labels of all $\mathbf{x} \in \mathcal{L}$ equals 1, the set $\mathcal{L} \cap \mathcal{X}_l^0$ is empty and (19) cannot be evaluated. Since in this case there is strong evidence for $c_l = 1$ (at least if $|\mathcal{L}|$ is not too small), the LLR $\tilde{\Lambda}_l$ is set to a prescribed positive value $\check{\Lambda} \gg 0$. Analogously, $\tilde{\Lambda}_l = -\check{\Lambda}$ in case $\mathcal{L} \cap \mathcal{X}_l^1$ is empty. While the LSD may offer significant complexity savings compared to max-log demodulation, statements about its computational complexity are difficult and depend strongly on the actual implementation of the sphere decoder as well as the choice of the list size (for details we refer to [15]).

[Figure: maximum achievable rate [bpcu] versus SNR [dB] with zoom inset; curves: BICM, soft-MMSE, LSD-256/max-log, LSD-8, LSD-4, LSD-2, LSD-1/hard-ML.]
Fig. 4. System capacity of LSD with list size $|\mathcal{L}| \in \{1, 2, 4, 8, 256\}$ ($4 \times 4$ MIMO, 4-QAM, Gray labeling).

We note that the case $|\mathcal{L}| = 2^{R_0} = |\mathcal{A}^{M_T}|$ implies $\mathcal{L} = \mathcal{A}^{M_T}$; thus, $\mathcal{L} \cap \mathcal{X}_l^b = \mathcal{X}_l^b$ such that (19) equals the max-log demodulator in (12). The other extreme is a list size of one, i.e., $\mathcal{L} = \{\hat{\mathbf{x}}_{\mathrm{ML}}\}$ (cf. (13)), in which case either $\mathcal{L} \cap \mathcal{X}_l^0$ or $\mathcal{L} \cap \mathcal{X}_l^1$ is empty (depending on the bit label of $\hat{\mathbf{x}}_{\mathrm{ML}}$); here, $\tilde{\Lambda}_l = (2\hat{c}_l - 1)\check{\Lambda}$ where $\hat{\mathbf{c}} = (\hat{c}_1 \ldots \hat{c}_{R_0})^T = \mu^{-1}(\hat{\mathbf{x}}_{\mathrm{ML}})$ and thus the LSD output is equivalent to hard ML demodulation (except for the choice of $\check{\Lambda}$, which is irrelevant for capacity, however).

Capacity Results. Fig. 4 shows the maximum rates achievable with an LSD for various list sizes. BICM and soft MMSE capacity are shown for comparison.
Note that with 4-QAM and $M_T = 4$, $|\mathcal{L}| = 256$ and $|\mathcal{L}| = 1$ correspond to max-log and hard ML demodulation, respectively. It is seen that with increasing list size the gap between LSD and max-log decreases rapidly, specifically at high rates. In particular, the LSD with list sizes of $|\mathcal{L}| \geq 8$ is already quite close to max-log performance. However, at low rates LSD (even with large list sizes) is outperformed by soft MMSE: below 5.3 dB, 3.7 dB, and 2.8 dB the system capacity of soft MMSE is higher than that of LSD with list size 2, 4, and 8, respectively. Similar observations apply to other antenna configurations and symbol constellations (see [31]).

2) Bit Flipping Demodulators: Another way of generating the candidate list $\mathcal{L}$, proposed in [21], is to flip some of the bits in the label of the hard ML symbol vector estimate $\hat{\mathbf{x}}_{\mathrm{ML}}$ in (13). More generally, the ML solution $\hat{\mathbf{x}}_{\mathrm{ML}}$ can be replaced by a symbol vector $\hat{\mathbf{x}} \in \mathcal{A}^{M_T}$ obtained with an arbitrary hard-output demodulator (e.g., hard ZF and MMSE demodulation). Let $\hat{\mathbf{c}} = \mu^{-1}(\hat{\mathbf{x}})$ denote the bit label of $\hat{\mathbf{x}}$. The candidate list then consists of all symbol vectors whose bit label has Hamming distance at most $D \leq R_0$ from $\hat{\mathbf{c}}$, i.e., $\mathcal{L} = \{\mathbf{x} : d_H(\mu^{-1}(\mathbf{x}), \hat{\mathbf{c}}) \leq D\}$. Here, $d_H(\mathbf{c}_1, \mathbf{c}_2)$ denotes the Hamming distance between two bit labels $\mathbf{c}_1$ and $\mathbf{c}_2$. This list can be generated by systematically flipping up to $D$ bits in $\hat{\mathbf{c}}$ and mapping the results to symbol vectors. The resulting list size is given by $|\mathcal{L}| = \sum_{d=0}^{D} \binom{R_0}{d}$. Here, the structure of the list generated with bit flipping makes it possible to reduce the complexity per candidate to $O(M_R)$,$^7$ giving an overall complexity of $O(M_R |\mathcal{L}|)$ (plus the operations required for the initial estimate).
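The list construction by bit flipping and the resulting list size can be checked with a short sketch. It enumerates all bit labels within Hamming distance `max_flips` of an initial label; mapping the labels to symbol vectors via $\mu$ is omitted, and the function names are illustrative.

```python
from itertools import combinations
from math import comb


def bit_flip_labels(c_hat, max_flips):
    """All bit labels with Hamming distance at most max_flips from c_hat."""
    labels = []
    for d in range(max_flips + 1):
        for positions in combinations(range(len(c_hat)), d):
            label = list(c_hat)
            for p in positions:
                label[p] ^= 1  # flip the selected bit
            labels.append(tuple(label))
    return labels


def list_size(num_bits, max_flips):
    """Closed-form list size: sum of binomial coefficients C(num_bits, d)."""
    return sum(comb(num_bits, d) for d in range(max_flips + 1))
```

For $R_0 = 8$ this gives list sizes of 9 for $D = 1$ and 37 for $D = 2$, and $D = R_0$ recovers all 256 labels.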
For $D = R_0$, $\mathcal{L} = \mathcal{A}^{M_T}$ and (19) reduces to max-log demodulation; furthermore, with $\hat{\mathbf{x}} = \hat{\mathbf{x}}_{\mathrm{ML}}$ and $D = 0$ we have $\mathcal{L} = \{\hat{\mathbf{x}}_{\mathrm{ML}}\}$ and (19) becomes equivalent to hard ML demodulation. In contrast to the LSD, bit flipping with $D > 0$ ensures that $\mathcal{L} \cap \mathcal{X}_l^0$ and $\mathcal{L} \cap \mathcal{X}_l^1$ are nonempty so that (19) can always be evaluated.

Capacity Results. Fig. 5 shows the maximum rates achievable with bit flipping demodulation where the initial symbol vector estimate is chosen either as the hard ML solution $\hat{\mathbf{x}}_{\mathrm{ML}}$ in (13) or the hard MMSE estimate $\mathcal{Q}(\hat{\mathbf{x}}_{\mathrm{MMSE}})$ (cf. (17)). For $D = 1$ ($|\mathcal{L}| = 9$), Fig. 5(a) reveals that flipping 1 bit (labeled 'flip-1') allows for significant performance improvements over the respective initial hard demodulator (about 2.1 dB at 2 bpcu). For rates below 5 bpcu, hard ML and hard MMSE initialization yield effectively identical results, with a maximum loss of 0.9 dB (at 3.5 bpcu) compared to soft MMSE. At higher rates, MMSE-based bit flipping outperforms soft MMSE demodulation slightly. For $D = 2$ ($|\mathcal{L}| = 37$), it can be seen from Fig. 5(b) that bit flipping demodulation performs close to max-log below 4 bpcu and that hard ML and hard MMSE initialization are very close to each other for almost all rates and SNRs; in fact, below 6.7 bpcu hard MMSE initialization performs slightly better than hard ML initialization while at higher rates ML initialization gives slightly better results. To maintain this behavior for larger constellations and more antennas, the maximum Hamming distance $D$ may have to increase with increasing $R_0$ (see [31]).

B. Lattice-Reduction-Aided Demodulation

Lattice reduction (LR) is an important technique for improving the performance or complexity of MIMO demodulators [16], [17] for the case of QAM constellations. The basic underlying idea is to view the columns of the channel matrix $\mathbf{H}$ as basis vectors of a point lattice.
LR then yields an alternative basis which amounts to a transformation of the system model (1) prior to demodulation; the advantage of such an approach is that the transformed channel matrix (i.e., the reduced basis) has improved properties (e.g., smaller condition number). An efficient algorithm to obtain a reduced basis was proposed by Lenstra, Lenstra, and Lovász (LLL) [52]. The overall computational complexity of LR-aided demodulation depends on the complexity of the LLL algorithm, which is currently an active research topic. Bounds on the average computational complexity of the LLL algorithm have been provided in [53]. A comparison of different LR methods in the context of MIMO hard demodulation was provided in [54].

$^7$ Changing the value of a particular bit changes only one symbol in the symbol vector. Thus, the residual $\mathbf{y} - \mathbf{H}\mathbf{x}$ in (19) can be easily updated by adding an appropriately scaled column of $\mathbf{H}$. This requires only $O(M_R)$ instead of $O(M_T M_R)$ operations.

[Figure: maximum achievable rate [bpcu] versus SNR [dB] with zoom insets; curves in panel (a): max-log, flip-1,ML, hard-ML, soft-MMSE, flip-1,MMSE, hard-MMSE; curves in panel (b): max-log, flip-2,ML, hard-ML, soft-MMSE, flip-2,MMSE, hard-MMSE.]
Fig. 5. System capacity of bit flipping demodulator with (a) $D = 1$ and (b) $D = 2$ ($4 \times 4$ MIMO, 4-QAM, Gray labeling).

Since LR algorithms are often formulated for equivalent real-valued models, we assume for now that all quantities are real-valued. Any lattice basis transformation is described by a unimodular transformation matrix $\mathbf{T}$, i.e., a matrix with integer entries and $\det(\mathbf{T}) = \pm 1$. Denoting the "reduced channel" by $\tilde{\mathbf{H}} = \mathbf{H}\mathbf{T}$ and defining $\mathbf{z} = \mathbf{T}^{-1}\mathbf{x}$, the system model (1) can be rewritten as

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{v} = \tilde{\mathbf{H}}\mathbf{z} + \mathbf{v}. \qquad (20)$$

Under the assumption $\mathbf{x} \in \mathbb{Z}^{M_T}$ (which for QAM can be ensured by an appropriate offset and scaling), the unimodularity of $\mathbf{T}$ guarantees $\mathbf{z} \in \mathbb{Z}^{M_T}$ and hence any demodulator can be applied to the better-behaved transformed system model on the right-hand side of (20). LR-aided soft demodulators (cf. [18]) are essentially list-based [19], [20], and often apply bit flipping (cf. Section V-A2) to LR-aided hard-output demodulators. Here, we restrict ourselves to LR-aided hard and soft output MMSE demodulation [17].

Capacity Results. Fig. 6 shows the capacity results for hard and soft LR-aided MMSE demodulation. Soft outputs are obtained by applying bit flipping with $D = 1$ and $D = 2$ to the LR-aided hard MMSE demodulator output (cf. Section V-A2). It is seen for $4 \times 4$ MIMO with 4-QAM ($R_0 = 8$) in Fig. 6 that LR with hard MMSE demodulation shows a significant performance advantage over hard MMSE demodulation for SNRs above 7.2 dB (rates higher than 4.5 bpcu). At rates higher than about 7.1 bpcu, LR-aided hard demodulation even outperforms soft MMSE demodulation. Bit flipping is helpful particularly at low-to-medium rates. Thus, for SNRs below 6.8 dB (rates lower than 5.2 bpcu) LR-aided soft demodulation with $D = 1$ essentially performs better than hard ML. When flipping up to $D = 2$ bits, LR-aided soft demodulation closely approaches max-log performance and reveals a significant performance advantage over soft MMSE demodulation without LR in the high-rate regime.

[Figure: maximum achievable rate [bpcu] versus SNR [dB] with zoom inset; curves: max-log, hard-ML, soft-MMSE, flip-2,LR-MMSE, flip-1,LR-MMSE, hard-LR-MMSE, hard-MMSE.]
Fig. 6. System capacity of LR-aided hard and soft MMSE demodulation ($4 \times 4$ MIMO, 4-QAM, Gray labeling).

C.
Semidefinite Relaxation Demodulation

Based on convex optimization techniques, semidefinite relaxation (SDR) is an approach to approximately solve the hard ML problem (13) with polynomial worst-case complexity [13], [55]. We specifically consider hard-output and soft-output versions of an SDR demodulator that approximates max-log demodulation and has an overall worst-case complexity of $O(R_0^{4.5})$ (see [14]). We note that this approach applies only to BPSK or 4-QAM alphabets and employs a randomization procedure described in detail in [13].

Capacity Results. In Fig. 7 we show the system capacity for a $4 \times 4$ MIMO system with 4-QAM ($R_0 = 8$) using hard and soft SDR demodulation (as described in [14]) and randomization with 25 trials. Surprisingly, hard and soft SDR demodulation here exactly match the performance of hard ML and max-log demodulation, respectively.

[Figure: maximum achievable rate [bpcu] versus SNR [dB]; curves: max-log, hard-ML, soft-MMSE, soft-SDR, hard-SDR.]
Fig. 7. System capacity of hard and soft SDR demodulation ($4 \times 4$ MIMO, 4-QAM, Gray labeling). The curves for hard and soft SDR are identical to those for hard ML and max-log, respectively.

D. Infinity-Norm Demodulator

The VLSI implementation complexity of the sphere decoder for hard ML demodulation is significantly reduced by replacing the $\ell^2$ norm in (13) with the $\ell^\infty$ norm, i.e.,

$$\hat{\mathbf{x}}_\infty = \arg\min_{\mathbf{x} \in \mathcal{A}^{M_T}} \|\mathbf{Q}^H\mathbf{y} - \mathbf{R}\mathbf{x}\|_\infty. \qquad (21)$$

Here, the $\ell^\infty$ norm is defined as $\|\mathbf{a}\|_\infty \triangleq \max\{|\mathrm{Re}\{a_1\}|, \ldots, |\mathrm{Re}\{a_M\}|, |\mathrm{Im}\{a_1\}|, \ldots, |\mathrm{Im}\{a_M\}|\}$ and $\mathbf{Q}$ and $\mathbf{R}$ are the $M_R \times M_T$ unitary and $M_T \times M_T$ upper triangular factors in the QR decomposition $\mathbf{H} = \mathbf{Q}\mathbf{R}$ of the channel matrix. The advantage of (21) is that expensive squaring operations are avoided and fewer nodes are visited during the tree search underlying the sphere decoder [11], [12].
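To make (21) concrete, the following sketch evaluates the $\ell^\infty$ norm over real and imaginary parts and finds the minimizer by exhaustive search. Exhaustive search is used only for illustration and is not the sphere decoder of [11], [12]. The inputs `y_tilde` (standing for $\mathbf{Q}^H\mathbf{y}$) and `R` are assumed to be computed elsewhere, and all function names are hypothetical.

```python
from itertools import product


def linf_norm(a):
    """Largest absolute real or imaginary part over all entries of a."""
    return max(max(abs(v.real), abs(v.imag)) for v in a)


def residual(y_tilde, R, x):
    """Entries of y_tilde - R x for R given as a list of rows."""
    return [y_r - sum(r * s for r, s in zip(row, x))
            for y_r, row in zip(y_tilde, R)]


def hard_linf_demod(y_tilde, R, alphabet):
    """Brute-force minimizer of ||y_tilde - R x||_inf over alphabet^M_T.

    Ties are resolved by enumeration order here, whereas the text selects
    one of the minimizers at random.
    """
    num_tx = len(R[0])
    return min(product(alphabet, repeat=num_tx),
               key=lambda x: linf_norm(residual(y_tilde, R, x)))
```

No squaring is needed to evaluate the metric, which is the property the text refers to.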
If (21) has no unique solution, one solution is selected at random. Soft outputs can be generated by using the $\ell^\infty$-norm sphere decoder to determine

$$\tilde{\mathbf{x}}_l^b = \arg\min_{\mathbf{x} \in \mathcal{X}_l^b} \|\mathbf{Q}^H\mathbf{y} - \mathbf{R}\mathbf{x}\|_\infty$$

for $b \in \{0, 1\}$ and then evaluating the approximate LLRs using the $\ell^2$ norm:

$$\tilde{\Lambda}_l = \frac{1}{\sigma_v^2}\Big[\|\mathbf{y} - \mathbf{H}\tilde{\mathbf{x}}_l^0\|^2 - \|\mathbf{y} - \mathbf{H}\tilde{\mathbf{x}}_l^1\|^2\Big].$$

Capacity Results. Fig. 8 shows the system capacity for hard and soft $\ell^\infty$-norm demodulation. For the $4 \times 4$ case with 4-QAM in Fig. 8(a), hard and soft $\ell^\infty$-norm demodulation perform within 1 dB of hard ML and max-log, respectively. At rates below about 4 bpcu, $\ell^\infty$-norm demodulation is outperformed by MMSE demodulation, though. For the $2 \times 4$ case with 16-QAM depicted in Fig. 8(b), all soft-output baseline demodulators perform almost identically and the same is true for all hard-output baseline demodulators, i.e., there is only a distinction between soft and hard demodulation (cf. Fig. 2(b)). However, soft and hard $\ell^\infty$-norm demodulation perform significantly worse in this asymmetric setup, specifically at low-to-medium rates. At 2 bpcu, soft $\ell^\infty$-norm demodulation requires 1.75 dB higher SNR than max-log and soft MMSE, and hard $\ell^\infty$-norm demodulation requires 2.3 dB higher SNR than hard ML/MMSE.

[Figure: maximum achievable rate [bpcu] versus SNR [dB] with zoom insets; curves in both panels: max-log, hard-ML, soft-MMSE, soft-Linf, hard-Linf, hard-MMSE.]
Fig. 8. System capacity of hard and soft $\ell^\infty$-norm demodulation for (a) $4 \times 4$ MIMO with 4-QAM and (b) $2 \times 4$ MIMO with 16-QAM (Gray labeling in both cases).

E. Successive and Soft Interference Cancelation

Successive interference cancelation (SIC) is a hard-output demodulation approach that became popular with the V-BLAST (Vertical Bell Labs Layered Space-Time) system [28]. Within one SIC iteration, only the layer with the largest post-equalization SNR is detected and its contribution to the receive signal is subtracted (canceled). A SIC implementation that replaces the ZF detector from [28] with an MMSE detector and orders the layers efficiently according to signal-to-interference-plus-noise ratio (SINR) was presented in [29]. Note that this approach has a complexity order of $O(M_R M_T^2)$, which is the same as for linear MIMO demodulation. Suboptimal but more efficient SIC schemes are discussed in [30].

A parallel soft interference cancelation (SoftIC) scheme with reduced error propagation was proposed in [26]. SoftIC is an iterative method that alternates between (i) parallel MIMO interference cancelation based on soft symbols and (ii) computation of improved soft symbols using the output of the interference cancelation stage. The complexity of one SoftIC iteration depends linearly on the number of antennas. Here, we use a modification that builds upon bit-LLRs. Let $\tilde{\Lambda}_k^{(i)}[j]$ denote the LLR for the $i$th bit in layer $k$ obtained in the $j$th iteration. Symbol probabilities can then be obtained as

$$P_k^{(j)}(x) = \prod_{i=1}^{Q} \frac{\exp\big(b_i(x)\,\tilde{\Lambda}_k^{(i)}[j]\big)}{1 + \exp\big(\tilde{\Lambda}_k^{(i)}[j]\big)},$$

with $b_i(x)$ denoting the $i$th bit in the label of $x \in \mathcal{A}$, leading to the soft symbol estimate

$$\tilde{x}_k^{(j)} = \sum_{x \in \mathcal{A}} x\, P_k^{(j)}(x).$$

Soft interference cancelation for each layer then yields

$$\mathbf{y}_k^{(j)} = \mathbf{y} - \sum_{k' \neq k} \mathbf{h}_{k'}\,\tilde{x}_{k'}^{(j)} = \mathbf{h}_k x_k + \sum_{k' \neq k} \mathbf{h}_{k'}\big(x_{k'} - \tilde{x}_{k'}^{(j)}\big) + \mathbf{v}, \qquad (22)$$

where $\mathbf{h}_k$ denotes the $k$th column of $\mathbf{H}$. Finally, updated LLRs $\tilde{\Lambda}_k^{(i)}[j+1]$ are calculated from (22) based on a Gaussian assumption for the residual interference plus noise (for details we refer to [26]).
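The soft-symbol and cancelation steps of SoftIC can be sketched as follows. The sketch covers only the symbol probabilities, the soft symbol estimate, and the subtraction in (22); the LLR update of [26] is not included. The 4-QAM bit-to-symbol mapping `QAM4` and all function names are assumptions for illustration, and the exponentials are not protected against overflow for very large LLR magnitudes.

```python
import math

# Hypothetical Gray-labeled, unit-energy 4-QAM mapping (an assumption).
QAM4 = {(b1, b2): complex(2 * b1 - 1, 2 * b2 - 1) / math.sqrt(2)
        for b1 in (0, 1) for b2 in (0, 1)}


def symbol_probability(label_bits, llrs):
    """Product over bits of exp(b * llr) / (1 + exp(llr))."""
    p = 1.0
    for b, llr in zip(label_bits, llrs):
        p *= math.exp(b * llr) / (1.0 + math.exp(llr))
    return p


def soft_symbol(alphabet, llrs):
    """Soft symbol estimate: sum of x * P(x) over the alphabet."""
    return sum(x * symbol_probability(bits, llrs)
               for bits, x in alphabet.items())


def cancel_interference(y, H, soft_symbols, k):
    """Left-hand side of (22): y minus the soft contributions of layers k' != k."""
    return [y_r - sum(row[kp] * soft_symbols[kp]
                      for kp in range(len(soft_symbols)) if kp != k)
            for y_r, row in zip(y, H)]
```

With this mapping the soft symbol reduces to $(\tanh(\Lambda_1/2) + j\tanh(\Lambda_2/2))/\sqrt{2}$, so large LLR magnitudes drive the soft symbols toward hard decisions.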
In contrast to [26], we suggest initializing the scheme with the LLRs obtained by a low-complexity soft demodulator, e.g., the soft ZF demodulator in Section IV-B. By carefully counting operations it can be shown that the complexity per iteration of the above SoftIC algorithm scales as $O(2^Q M_T (Q + M_R))$.

[Figure: maximum achievable rate [bpcu] versus SNR [dB] with zoom insets; curves in panel (a): BICM, SoftIC, max-log, hard-ML, soft-MMSE, MMSE-SIC, soft-ZF; curves in panel (b): BICM, SoftIC-1, SoftIC-3, SoftIC-4, SoftIC-8, soft-ZF.]
Fig. 9. System capacity of MMSE-SIC and SoftIC ($4 \times 4$ MIMO, 4-QAM, Gray labeling).

Capacity Results. In Fig. 9(a), we display capacity results for (hard) MMSE-SIC with detection ordering as in [30] (therein referred to as 'MMSE-BLAST') and for SoftIC demodulation with 3 iterations (initialized using a soft ZF demodulator whose performance is shown for reference). Hard MMSE-SIC demodulation is seen to perform similarly to hard ML demodulation at low rates and even outperforms it slightly at very low rates. At high rates, MMSE-SIC shows a noticeable gap to hard ML but outperforms soft MMSE and SoftIC. SoftIC is superior to MMSE-SIC for rates of up to 7 bpcu (SNRs below 11 dB). At low rates, SoftIC even performs slightly better than max-log demodulation and essentially coincides with BICM capacity and soft MMSE. For the chosen system parameters, SoftIC closely matches soft MMSE at low rates and even outperforms it at high rates. This statement is not generally valid, however. For example, with 16-QAM SoftIC performs worse than soft MMSE even at high rates (see [31]). At high SNRs, we observed that SoftIC performance degrades if iterated too long (see Fig. 9(b), showing SoftIC with 1, 3, 4, and 8 iterations).
This can be explained by the fact that at high SNRs the residual interference-plus-noise becomes very small and hence the LLR magnitudes grow unreasonably large. Our simulations showed that SoftIC performs best when terminated after 2 or 3 iterations.

VI. IMPERFECT CHANNEL STATE INFORMATION

We next investigate the ergodic system capacity (9) for the case of imperfect channel state information (CSI). In particular, we consider training-based estimation of the channel matrix $\mathbf{H}$ and the noise variance $\sigma_v^2$ and assess how the amount of training influences the performance of the various demodulators.

Training-based Channel Estimation. To estimate the channel, the transmitter sends $N_p > M_T$ training vectors$^8$ which are arranged into an $M_T \times N_p$ training matrix $\mathbf{X}_p$. We assume that $\mathbf{X}_p$ has full rank and has Frobenius norm [56] $\|\mathbf{X}_p\|_F^2 = N_p E_s$ such that the power per channel use for training is the same as for the data. Assuming that the channel stays constant for the duration of one block (which contains training and actual data), the $M_R \times N_p$ receive matrix $\mathbf{Y}_p$ induced by the training is given by

$$\mathbf{Y}_p = \mathbf{H}\mathbf{X}_p + \mathbf{V}. \qquad (23)$$

Here, $\mathbf{V}$ is an $M_R \times N_p$ i.i.d. Gaussian noise matrix. Using (23), the least-squares (ML) channel estimate is computed as [57]

$$\hat{\mathbf{H}} = \mathbf{Y}_p \mathbf{X}_p^H (\mathbf{X}_p \mathbf{X}_p^H)^{-1}. \qquad (24)$$

This estimate is unbiased and its mean square error equals

$$E\big\{\|\hat{\mathbf{H}} - \mathbf{H}\|_F^2\big\} = M_R\,\sigma_v^2\,\mathrm{tr}\big\{(\mathbf{X}_p \mathbf{X}_p^H)^{-1}\big\} \geq \frac{M_R M_T^2}{N_p}\,\frac{1}{\rho},$$

where the lower bound is attained with orthogonal training sequences, i.e., $\mathbf{X}_p \mathbf{X}_p^H = \frac{N_p E_s}{M_T}\mathbf{I}$ (we recall that $\rho = E_s/\sigma_v^2$ denotes the SNR). The noise variance is then estimated as the mean power of $\mathbf{Y}_p$ in the $(N_p - M_T)$-dimensional orthogonal complement of the range space of $\mathbf{X}_p^H$, i.e.,

$$\hat{\sigma}_v^2 = \frac{1}{M_R (N_p - M_T)}\,\|\mathbf{Y}_p - \hat{\mathbf{H}}\mathbf{X}_p\|_F^2. \qquad (25)$$

The noise variance estimate is unbiased and its MSE is independent of the transmit power:

$$E\big\{|\hat{\sigma}_v^2 - \sigma_v^2|^2\big\} = \frac{\sigma_v^4}{M_R (N_p - M_T)}.$$

Capacity Results. We show results for the ergodic system capacity of mismatched$^9$ max-log, hard ML, and soft MMSE demodulation where the true channel matrix and noise variance are replaced by $\hat{\mathbf{H}}$ in (24) and $\hat{\sigma}_v^2$ in (25), respectively. Throughout, a $4 \times 4$ MIMO system with 4-QAM and Gray labeling is considered ($R_0 = 8$). Results for other demodulators with imperfect CSI are provided in [31].

Fig. 10(a) shows the maximum achievable rates versus SNR for a fixed orthogonal training sequence of length $N_p = 5$ (the worst case with minimum amount of training). It is seen that imperfect CSI causes a significant performance degradation of all three demodulators, e.g., at 4 bpcu the SNR losses are 3.9 dB (max-log), 3.2 dB (hard ML), and 4 dB (soft MMSE). In this worst-case setup (minimum training length), the performance advantage of soft MMSE over hard ML at low rates is slightly less pronounced; the intersection of hard ML and soft MMSE performance shifts from 5.8 bpcu (at an SNR of about 7.7 dB) for perfect CSI to 5 bpcu (at 9.4 dB) for imperfect CSI. However, the gap between soft MMSE and max-log is slightly larger at low rates, e.g., 0.7 dB at 2 bpcu. The performance losses for all demodulators tend to be smaller at high rates, which may be partly attributed to the fact that the CSI becomes more accurate with increasing SNR. In general it can be observed that the performance loss of hard ML is the smallest while soft MMSE and max-log performance deteriorates more strongly; unlike max-log and soft MMSE, hard ML does not use the noise variance and hence is more robust to estimation errors in $\sigma_v^2$.

$^8$ While $N_p = M_T$ is sufficient to estimate $\mathbf{H}$, extra training is required for estimation of $\sigma_v^2$.

$^9$ One could also modify these demodulators in order to take into account the fact that the CSI is imperfect as, e.g., in [58]; however, this is beyond the scope of this paper.

[Figure: panel (a) maximum achievable rate [bpcu] versus SNR [dB] for perfect and imperfect CSI; panels (b) and (c) required SNR [dB] versus training length at 2 bpcu and 6 bpcu; curves: max-log, hard-ML, soft-MMSE.]
Fig. 10. Impact of imperfect CSI on baseline demodulators: (a) capacity versus SNR for $N_p = 5$; (b), (c) required SNR versus training length $N_p$ for a target rate of (b) $C = 2$ bpcu and (c) $C = 6$ bpcu ($4 \times 4$ MIMO, 4-QAM, Gray labeling).

To investigate the impact of the amount of training, Fig. 10(b) and (c) depict the minimum SNRs required by the individual demodulators to achieve target rates of 2 bpcu and 6 bpcu, respectively, versus the training length $N_p$. It is seen that for all demodulators, the required SNR decreases rapidly with increasing amount of training. Yet, even for $N_p = 20$ there is a significant gap of 1 to 2 dB to perfect CSI performance (indicated by horizontal gray lines with corresponding line style). Here, soft MMSE consistently performs better than max-log and hard ML at 2 bpcu. In contrast, hard ML outperforms soft MMSE at 6 bpcu, especially for very small training durations.

The results shown in Fig. 10 correspond to a worst-case scenario where both channel and noise variance are imperfectly known.
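The training-based estimates (24) and (25) that underlie these results can be sketched for the special case of orthogonal training, where $\mathbf{X}_p \mathbf{X}_p^H = (N_p E_s / M_T)\mathbf{I}$ and the matrix inverse in (24) reduces to a scalar factor. The helper names are hypothetical and matrices are represented as lists of rows; the sketch does not handle non-orthogonal training.

```python
def matmul(A, B):
    """Matrix product of A and B, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def hermitian(A):
    """Conjugate transpose of A."""
    return [[complex(v).conjugate() for v in col] for col in zip(*A)]


def ls_channel_estimate(Y_p, X_p, E_s):
    """LS estimate (24), assuming X_p X_p^H = (N_p * E_s / M_T) * I."""
    M_T, N_p = len(X_p), len(X_p[0])
    scale = M_T / (N_p * E_s)
    return [[scale * v for v in row] for row in matmul(Y_p, hermitian(X_p))]


def noise_variance_estimate(Y_p, X_p, H_hat):
    """Estimate (25): residual power divided by M_R * (N_p - M_T)."""
    M_R, M_T, N_p = len(Y_p), len(X_p), len(X_p[0])
    fit = matmul(H_hat, X_p)
    power = sum(abs(y - f) ** 2
                for row_y, row_f in zip(Y_p, fit)
                for y, f in zip(row_y, row_f))
    return power / (M_R * (N_p - M_T))
```

The denominator in `noise_variance_estimate` vanishes for $N_p = M_T$, which is why extra training beyond $M_T$ vectors is needed to estimate the noise variance.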
Further capacity results, specifically for the case of imperfect channel knowledge but perfect noise variance and for other demodulators discussed in Section V, are provided in [31]. These results generally show that imperfect receiver CSI degrades the performance of all investigated demodulation schemes. An interesting observation is that, in the MIMO setup considered, the LSD with list size $|\mathcal{L}| \geq 8$ consistently outperforms max-log for training duration $N_p = 5$ [31]. For larger training durations, LSD with $|\mathcal{L}| = 8$ performs slightly worse than but still very close to max-log.

[Figure: panel (a) outage probability versus SNR [dB] at 2 bpcu and 6 bpcu; panel (b) maximum $\epsilon$-achievable rate versus SNR [dB]; curves: BICM, max-log, hard-ML, soft-MMSE, hard-MMSE, soft-ZF, hard-ZF.]
Fig. 11. Demodulator performance in quasi-static fading: (a) outage probability versus SNR for $R = 2$ bpcu and $R = 6$ bpcu, and (b) $\epsilon$-capacity versus SNR for $\epsilon = 10^{-2}$ ($4 \times 4$ MIMO, 4-QAM, Gray labeling).

VII. QUASI-STATIC FADING

In this section we provide a demodulator performance comparison for quasi-static fading MIMO channels based on the outage probability $P_{\mathrm{out}}(R)$ in (10) and the $\epsilon$-capacity $C_\epsilon$ in (11). The setup considered ($4 \times 4$ MIMO with Gray-labeled 4-QAM) is the same as before apart from the spatially i.i.d. Rayleigh fading channel, which now is assumed to be quasi-static. The outage probability $P_{\mathrm{out}}(R)$ was measured using $10^5$ blocks (affected by independent fading realizations), each consisting of $10^5$ symbol vectors. To keep the presentation concise, we restrict ourselves to the baseline demodulators from Section IV.

Fig. 11(a) shows the outage probability versus SNR $\rho$ for target rates of $R = 2$ bpcu and $R = 6$ bpcu.
For $R = 2$ bpcu, soft MAP demodulation (labeled 'BICM' for consistency with previous sections) and soft MMSE demodulation exactly coincide and outperform max-log demodulation by about 0.5 dB. In this low-rate regime, max-log performs about 2.5 dB better than hard ML. While max-log, hard ML, and soft MMSE demodulation all achieve full diversity (cf. the slope of the corresponding outage probability curves), soft and hard ZF only have diversity order one, resulting in a huge performance loss (almost 19 dB and 20.5 dB at $P_{\mathrm{out}}(R) = 10^{-2}$, respectively). At $R = 6$ bpcu the situation is quite different: here, max-log coincides with soft MAP and hard ML loses only 1.4 dB (again, those three demodulators achieve full diversity). Hard and soft MMSE deteriorate at this rate and lose all diversity. At $P_{\mathrm{out}}(R) = 10^{-2}$, the SNR loss of soft MMSE and soft ZF relative to max-log equals about 4.4 dB and 19 dB, respectively.

The degradation of soft MMSE with increasing rate is also visible in Fig. 11(b), which shows $\epsilon$-capacity versus SNR for an outage probability of $\epsilon = 10^{-2}$. The $\epsilon$-capacity behaves qualitatively similarly to the ergodic capacity (cf. Fig. 2(a)): at low rates soft MMSE outperforms hard ML (by up to 2.8 dB for rates less than 4.7 bpcu) while at high rates it is the other way around. Furthermore, for low rates soft MMSE essentially coincides with soft MAP whereas at high rates it approaches soft ZF performance. We note that a similar rate-dependent performance of MMSE demodulation has been observed in [59], [60]. There it was shown that with coding across the antennas, the diversity order of MMSE equalization equals $M_T M_R$ at low rates and $M_R - M_T + 1$ at high rates; in contrast, ZF equalization always achieves a diversity of only $M_R - M_T + 1$. These results, which are interpreted in detail in [60, Section IV], match well our observations that the MMSE demodulator loses diversity for rates larger than 5 bpcu (see Fig. 11 and [31]).

[Figure: maximum achievable rate [bpcu] versus SNR [dB] with zoom insets; curves in panel (a): max-log, hard-ML, and soft-MMSE for 4x4 with 4-QAM, hard-ML and soft-MMSE for 2x4 with 16-QAM; curves in panel (b): max-log and soft-MMSE for 4x4 with 4-QAM and for 4x4 with 16-QAM.]
Fig. 12. System capacity of baseline demodulators for (a) a $4 \times 4$ MIMO system using 4-QAM and a $2 \times 4$ MIMO system using 16-QAM, and (b) a $4 \times 4$ MIMO system using 4-QAM and 16-QAM (Gray labeling in all cases).

A comparison of Fig. 11(b) and Fig. 2(a) suggests that there is a connection between the diversity of a demodulator in the quasi-static scenario and its SNR loss relative to optimum demodulation in the ergodic scenario. For all rates (SNRs), the max-log and hard ML demodulators both achieve constant (full) diversity in the quasi-static regime and maintain a roughly constant gap to soft MAP in the ergodic scenario. In contrast, with MMSE demodulation the diversity order in the quasi-static case and the SNR gap to soft MAP in the ergodic scenario both deteriorate with increasing rate/SNR.

VIII. KEY OBSERVATIONS AND DESIGN GUIDELINES

Based on the previous results, we summarize key observations and provide system design guidelines. Soft MMSE demodulation approaches BICM capacity and outperforms max-log at low rates, both in the ergodic and the quasi-static regime and for various system configurations (see also [31]). Moreover, soft MMSE is very attractive since it has the lowest complexity among all soft demodulators that we discussed.
Therefore, soft MMSE demodulation is arguably the demodulator of choice when designing MIMO-BICM systems with outer codes of low to medium rate. Since soft ZF performs consistently worse than soft MMSE at the same computational cost, there appears to be no reason to prefer soft ZF in practical implementations.

The case for soft MMSE is particularly strong for asymmetric MIMO systems (i.e., $M_T < M_R$), where it performs close to BICM capacity for all rates. Fig. 12(a) compares a 4×4 MIMO system using 4-QAM (system I) and a 2×4 system using 16-QAM (system II), both using Gray labeling and achieving $R_0 = 8$. Whereas at low rates soft MMSE demodulation performs better with system I than with system II, it is the other way around at high rates. For example, at 7 bpcu system II requires 1.1 dB less SNR than system I, in spite of using fewer active transmit antennas. This observation is of interest when designing MIMO-BICM systems with adaptive modulation and coding. Specifically, with soft MMSE demodulation system I is preferable below 6 bpcu, whereas above 6 bpcu it is advantageous to deactivate two transmit antennas and switch to the 16-QAM constellation (system II). We note that with max-log and hard ML demodulation, system II performs consistently worse than system I.

The only regime where soft MMSE suffers from a noticeable performance loss is symmetric systems at high rates (both in the ergodic and in the quasi-static scenario). In the high-rate regime, hard and soft SDR are the only low-complexity demodulation schemes that are able to achieve hard ML and max-log performance, respectively. These observations also apply in the case of imperfect CSI. This suggests that hard and soft SDR demodulation are preferable to hard ML and max-log demodulation.
With perfect CSI, LSD, bit-flipping demodulation, and soft $\ell^\infty$-norm demodulation also come reasonably close to max-log. The LSD has the additional advantage of being able to trade off performance for complexity reduction. Furthermore, note that for system I hard SDR demodulation (which approaches hard ML performance) outperforms most suboptimal soft demodulators for rates larger than 6 bpcu.

The above discussion suggests that in order to achieve a given target information rate, it may be preferable to adapt the number of antennas and the symbol constellation in a system with a low-rate code and a low-complexity MMSE demodulator instead of using a high-rate code and a computationally expensive non-linear demodulator. Such a design approach has recently been advocated also in [60]. While RF complexity may be a limiting factor with respect to the number of antennas, increasing the constellation size is inexpensive. Fig. 12(b) compares soft MMSE and max-log for a 4×4 MIMO system with Gray-labeled 4-QAM ($R_0 = 8$) and 16-QAM ($R_0 = 16$). Below 3.5 bpcu, soft MMSE demodulation with 4-QAM is optimal; at higher rates, switching to 16-QAM allows the MMSE demodulator to perform within about 0.7 dB of max-log with 4-QAM while increasing the soft-MMSE complexity only slightly.

In case of imperfect receiver CSI, the performance of all demodulators deteriorates significantly (see also [31]), i.e., all capacity curves are shifted to higher SNRs (depending on the amount of training). Demodulators that take the noise variance into account require somewhat more training. An exception is the LSD, which can outperform max-log in case of poor channel and noise variance estimates [31].

We conclude that at low rates linear soft demodulation is generally preferable due to its very low computational complexity.
At high rates non-linear demodulators perform better, even when they deliver hard rather than soft outputs. If complexity is not an issue, soft SDR demodulation is advantageous since it approaches max-log performance and is largely superior to all other demodulators over a wide range of system parameters and SNRs. A notable exception is the low-complexity SoftIC demodulator, which for some system configurations has the potential to outperform soft SDR (and max-log) at low rates.

IX. CONCLUSION

We provided a comprehensive performance assessment and comparison of soft and hard demodulators for non-iterative MIMO-BICM systems. Our comparison is based on the information-theoretic notion of system capacity, which can be interpreted as the maximum achievable rate of the equivalent “modulation” channel that comprises modulator, physical channel, and demodulator. As a performance measure, system capacity has the main advantage of being independent of any outer code. Extensive simulation results show that a universal demodulator performance ranking is not possible and that the demodulator performance can depend strongly on the rate (or equivalently the SNR) at which a system operates. In addition to ergodic capacity results, we investigated the non-ergodic fading scenario in terms of outage probability and $\epsilon$-capacity and analyzed the robustness of certain demodulators under imperfect channel state information. Our observations provide new insights into the design of MIMO-BICM systems (i.e., choice of demodulator, number of antennas, and symbol constellation). Moreover, our approach sheds light on issues that have not been apparent in the previously prevailing BER performance comparisons for specific outer codes.
For example, a key observation is that with low-rate outer codes soft MMSE is preferable to other demodulators since it has low complexity but close-to-optimal performance.

APPENDIX A
MEASURING MUTUAL INFORMATION

Evaluating the mutual information in (9) involves the conditional LLR distributions $f(\tilde{\Lambda}_l \,|\, c_l)$. We approximate these pdfs by histograms obtained via Monte Carlo simulations. To achieve a small bias and variance of the mutual information estimate, the number and the size of the histogram bins as well as the sample size need to be carefully balanced [48]. Instead of LLRs, we use the bit probabilities
\[ \phi_l = \frac{1}{1 + e^{-\tilde{\Lambda}_l}} \in [0, 1]. \]
Since the LLRs $\tilde{\Lambda}_l$ and $\phi_l$ are in one-to-one correspondence (cf. (2)), the mutual information of the equivalent modulation channel equals that of the channel characterized by the conditional pdf $f(\phi_l \,|\, c_l)$; the latter has the advantage of being easier to approximate by a histogram with uniform bins. By performing Monte Carlo simulations in which $N$ code bits, the noise, and the channel are randomly generated, we obtain a histogram with $K$ bins which is characterized by the uniform bins $\left[\frac{k-1}{K}, \frac{k}{K}\right]$, $k = 1, \ldots, K$, and the associated conditional relative frequencies $\Xi^{b}_{l,k}$ (i.e., the normalized number of probabilities $\phi_l$ lying in the $k$th bin conditioned on $c_l = b$). The mutual information in (9) is then approximated as
\[ C \approx \hat{C} = R_0 - \sum_{l=1}^{R_0} \sum_{b=0}^{1} \sum_{k=1}^{K} \frac{1}{2}\, \Xi^{b}_{l,k} \log_2 \frac{\sum_{b'=0}^{1} \Xi^{b'}_{l,k}}{\Xi^{b}_{l,k}}. \tag{26} \]
The accuracy of this approximation depends i) on the number $K$ of histogram bins (this determines the discretization error) and ii) on the number $N$ of samples (code bits) used to estimate the histogram (this determines the bias and variance of $\Xi^{b}_{l,k}$ and hence of $\hat{C}$).
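The histogram estimate in (26) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the code bits are taken to be equally likely (which is what the factor 1/2 in (26) reflects), the bins are treated as half-open with $\phi_l = 1$ assigned to the last bin, empty bins contribute zero, and the function names and the list-of-lists input layout are hypothetical.

```python
import math
from typing import List, Sequence


def bit_probability(llr: float) -> float:
    """Map an LLR to phi = 1 / (1 + exp(-llr)) without overflow for large |llr|."""
    if llr >= 0:
        return 1.0 / (1.0 + math.exp(-llr))
    e = math.exp(llr)
    return e / (1.0 + e)


def relative_frequencies(llrs: Sequence[float], bits: Sequence[int],
                         num_bins: int) -> List[List[float]]:
    """Return xi[b][k]: fraction of phi values in bin k among samples with bit b."""
    counts = [[0] * num_bins for _ in range(2)]
    for llr, bit in zip(llrs, bits):
        # Uniform bins on [0, 1]; phi == 1.0 is clamped into the last bin.
        k = min(int(bit_probability(llr) * num_bins), num_bins - 1)
        counts[bit][k] += 1
    xi = []
    for b in range(2):
        total = sum(counts[b])
        if total == 0:
            raise ValueError("each bit value must occur at least once")
        xi.append([c / total for c in counts[b]])
    return xi


def system_capacity_estimate(llrs_per_position: Sequence[Sequence[float]],
                             bits_per_position: Sequence[Sequence[int]],
                             num_bins: int) -> float:
    """Evaluate (26): R0 minus the summed conditional-entropy terms."""
    r0 = len(llrs_per_position)  # number of bit positions per symbol vector
    loss = 0.0
    for llrs, bits in zip(llrs_per_position, bits_per_position):
        xi = relative_frequencies(llrs, bits, num_bins)
        for k in range(num_bins):
            bin_total = xi[0][k] + xi[1][k]
            for b in range(2):
                if xi[b][k] > 0.0:  # convention: 0 * log(.) = 0
                    loss += 0.5 * xi[b][k] * math.log2(bin_total / xi[b][k])
    return r0 - loss
```

As a sanity check, LLRs that always identify the transmitted bit with high confidence give an estimate of one bit per position, whereas LLRs that are always zero give an estimate of zero.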
Specifically, the bias and the variance of $\hat{C}$ can be bounded as (see [48])
\[ 0 \le \mathrm{E}\{\hat{C}\} - C_Q \le \log_2\!\left(1 + \frac{K-1}{N}\right), \qquad \mathrm{E}\!\left\{ \big(\hat{C} - \mathrm{E}\{\hat{C}\}\big)^2 \right\} \le \frac{(\log_2 N)^2}{N}. \tag{27} \]
Here, $C_Q$ is the mutual information of the discretized channel, i.e., equal to (26) but with $\Xi^{b}_{l,k}$ replaced by $P\!\left\{\phi_l \in \left[\frac{k-1}{K}, \frac{k}{K}\right] \,\big|\, c_l = b\right\}$. Hence, the bias in (27) quantifies the systematic error resulting from the empirical estimation of the histograms. We conclude that a large number $K$ of bins is advantageous in order to keep the discretization error small; in view of (27), this necessitates a significantly larger number $N$ of samples ($N \gg K$) in order to achieve a small estimation bias. Large $N$ simultaneously ensures a small estimation variance. The price paid for accurate capacity estimates is computational complexity.

To choose $K$ and $N$, we first evaluated the BICM capacity in (7) by direct numerical integration using the known pdfs in (5); then we estimated the same capacity via Monte Carlo simulations as described above, using the optimum soft MAP demodulator and increasingly larger $K$ and $N$ until the result was close enough to the capacity obtained by direct evaluation. Specifically, with $K = 256$ and $N = 10^5$ the estimation error was on the order of $10^{-4}$ over a large SNR range. These numbers were then used to estimate the mutual information for all other demodulators.

ACKNOWLEDGMENT

The authors thank Gottfried Lechner for kindly providing his LDPC decoder implementation.

REFERENCES

[1] P. Fertl, J. Jaldén, and G. Matz, “Capacity-based performance comparison of MIMO-BICM demodulators,” in Proc. IEEE SPAWC-2008, Recife, Brazil, July 2008, pp. 166–170.
[2] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation,” IEEE Trans. Inf. Theory, vol. 44, no. 5, pp. 927–945, May 1998.
[3] A. Guillén i Fàbregas, A. Martinez, and G. Caire, “Bit-Interleaved Coded Modulation,” Foundations and Trends in Communications and Information Theory, vol. 5, no. 1–2, pp. 1–153, 2008.
[4] U. Wachsmann, R. F. H. Fischer, and J. B. Huber, “Multilevel codes: Theoretical concepts and practical design rules,” IEEE Trans. Inf. Theory, vol. 45, no. 5, pp. 1361–1391, July 1999.
[5] E. Biglieri, G. Taricco, and E. Viterbo, “Bit-interleaved time-space codes for fading channels,” in Proc. Conf. on Information Sciences and Systems, Princeton, NJ, Mar. 2000, pp. WA 4.1–4.6.
[6] A. Stefanov and T. M. Duman, “Turbo-coded modulation for systems with transmit and receive antenna diversity over block fading channels: System model, decoding approaches, and practical considerations,” IEEE J. Sel. Areas Comm., vol. 19, no. 5, pp. 958–968, May 2001.
[7] S. H. Müller-Weinfurtner, “Coding approaches for multiple antenna transmission in fast fading and OFDM,” IEEE Trans. Signal Processing, vol. 50, no. 10, pp. 2442–2450, Oct. 2002.
[8] J. Jaldén and B. Ottersten, “Parallel implementation of a soft output sphere decoder,” in Proc. 39th Asilomar Conf. Signals, Systems, Computers, Pacific Grove, CA, Oct./Nov. 2005, pp. 581–585.
[9] J. Jaldén and B. Ottersten, “On the complexity of sphere decoding in digital communications,” IEEE Trans. Signal Processing, vol. 53, no. 4, pp. 1474–1484, Apr. 2005.
[10] C. Studer, A. Burg, and H. Bölcskei, “Soft-output sphere decoding: Algorithms and VLSI implementation,” IEEE J. Sel. Areas Comm., vol. 26, no. 2, pp. 290–300, Feb. 2008.
[11] A. Burg, M. Borgmann, M. Wenk, M. Zellweger, W. Fichtner, and H. Bölcskei, “VLSI implementation of MIMO detection using sphere decoding algorithm,” IEEE J. Solid-State Circuits, vol. 40, no. 7, pp. 1566–1577, July 2005.
[12] D. Seethaler and H. Bölcskei, “Performance and complexity analysis of infinity-norm sphere-decoding,” IEEE Trans. Inf. Theory, vol. 56, no. 3, pp. 1085–1105, Mar. 2010.
[13] W. K. Ma, T. N. Davidson, K. M. Wong, Z. Q. Luo, and P. C. Ching, “Quasi-maximum-likelihood multiuser detection using semidefinite relaxation with application to synchronous CDMA,” IEEE Trans. Signal Processing, vol. 50, no. 4, pp. 912–922, Apr. 2002.
[14] B. Steingrimsson, Z.-Q. Luo, and K. M. Wong, “Soft quasi-maximum-likelihood detection for multiple-antenna wireless channels,” IEEE Trans. Signal Processing, vol. 51, no. 11, pp. 2710–2719, Nov. 2003.
[15] B. M. Hochwald and S. ten Brink, “Achieving near-capacity on a multiple-antenna channel,” IEEE Trans. Inf. Theory, vol. 51, no. 3, pp. 389–399, Mar. 2003.
[16] H. Yao and G. W. Wornell, “Lattice-reduction-aided detectors for MIMO communication systems,” in Proc. IEEE GLOBECOM-2002, vol. 1, Taipei, Taiwan, Nov. 2002, pp. 424–428.
[17] D. Wübben, R. Böhnke, V. Kühn, and K. Kammeyer, “MMSE-based lattice-reduction for near-ML detection of MIMO systems,” in Proc. ITG Workshop on Smart Antennas 2004, Munich, Germany, Mar. 2004, pp. 106–113.
[18] C. Windpassinger, L. H.-J. Lampe, and R. F. H. Fischer, “From lattice-reduction-aided detection towards maximum-likelihood detection in MIMO systems,” in Proc. Wireless, Optical Commun. Conf., Banff, AB, Canada, July 2003.
[19] P. Silvola, K. Hooli, and M. Juntti, “Suboptimal soft-output MAP detector with lattice reduction,” IEEE Signal Processing Letters, vol. 13, no. 6, pp. 321–324, June 2006.
[20] V. Ponnampalam, D. McNamara, A. Lillie, and M. Sandell, “On generating soft outputs for lattice-reduction-aided MIMO detection,” in Proc. IEEE ICC-07, Glasgow, UK, June 2007, pp. 4144–4149.
[21] R. Wang and G. B. Giannakis, “Approaching MIMO channel capacity with soft detection based on hard sphere decoding,” IEEE Trans. Comm., vol. 54, no. 4, pp. 587–590, Apr. 2006.
[22] M. Butler and I. Collings, “A zero-forcing approximate log-likelihood receiver for MIMO bit-interleaved coded modulation,” IEEE Commun. Letters, vol. 8, no. 2, pp. 105–107, Feb. 2004.
[23] M. R. McKay and I. B. Collings, “Capacity and performance of MIMO-BICM with zero-forcing receivers,” IEEE Trans. Comm., vol. 53, no. 1, pp. 74–83, Jan. 2005.
[24] I. B. Collings, M. R. G. Butler, and M. R. McKay, “Low complexity receiver design for MIMO bit-interleaved coded modulation,” in IEEE ISSTA-2004, Sydney, Australia, Aug.-Sep. 2004, pp. 12–16.
[25] D. Seethaler, G. Matz, and F. Hlawatsch, “An efficient MMSE-based demodulator for MIMO bit-interleaved coded modulation,” in Proc. IEEE GLOBECOM-2004, vol. 4, Dallas, Texas, Dec. 2004, pp. 2455–2459.
[26] W.-J. Choi, K.-W. Cheong, and J. Cioffi, “Iterative soft interference cancellation for multiple antenna systems,” in Proc. IEEE WCNC-2000, vol. 1, Chicago, IL, Sep. 2000, pp. 304–309.
[27] A. Paulraj, R. U. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications. Cambridge (UK): Cambridge Univ. Press, 2003.
[28] P. W. Wolniansky, G. J. Foschini, G. D. Golden, and R. A. Valenzuela, “V-BLAST: An architecture for realizing very high data rates over the rich-scattering wireless channel,” in Proc. URSI Int. Symp. on Signals, Systems and Electronics, Pisa, Italy, Sep. 1998, pp. 295–300.
[29] B. Hassibi, “A fast square-root implementation for BLAST,” in Proc. 34th Asilomar Conf. Signals, Systems, Computers, Pacific Grove, CA, Nov./Dec. 2000, pp. 1255–1259.
[30] R. Böhnke, D. Wübben, V. Kühn, and K. D. Kammeyer, “Reduced complexity MMSE detection for BLAST architectures,” in Proc. IEEE GLOBECOM-2003, vol. 4, San Francisco, CA, Dec. 2003, pp. 2258–2262.
[31] P. Fertl, J. Jaldén, and G. Matz, “Performance assessment of MIMO-BICM demodulators based on system capacity: Further results,” Vienna University of Technology, Austria, Technical Report #09-1, Oct. 2009. [Online]. Available: http://publik.tuwien.ac.at/files/PubDat_174303.pdf
[32] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Trans. Telecomm., vol. 10, no. 6, pp. 585–596, Nov. 1999.
[33] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[34] L. H.-J. Lampe, R. Schober, and R. F. H. Fischer, “Multilevel coding for multiple-antenna transmission,” IEEE Trans. Wireless Comm., vol. 3, no. 1, pp. 203–208, Jan. 2004.
[35] Z. Hong and B. L. Hughes, “Robust space-time trellis codes based on bit-interleaved coded modulation,” in Proc. CISS-01, vol. 2, Mar. 2001, pp. 665–670.
[36] M. van Dijk, A. J. E. M. Janssen, and A. G. C. Koppelaar, “Correcting systematic mismatches in computed log-likelihood ratios,” Europ. Trans. Telecomm., vol. 14, no. 3, pp. 227–244, July 2003.
[37] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Boston (MA): Cambridge University Press, 2005.
[38] A. Lapidoth and P. Narayan, “Reliable communication under channel uncertainty,” IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2148–2177, Oct. 1998.
[39] N. Merhav, G. Kaplan, A. Lapidoth, and S. Shamai, “On information rates for mismatched decoders,” IEEE Trans. Inf. Theory, vol. 40, no. 6, pp. 1953–1967, Nov. 1994.
[40] A. Martinez, A. Guillén i Fàbregas, G. Caire, and F. Willems, “Bit-interleaved coded modulation revisited: A mismatched decoding perspective,” in Proc. ISIT-2008, Toronto, Canada, July 2008, pp. 2337–2341.
[41] J. Jaldén, P. Fertl, and G. Matz, “On the generalized mutual information of BICM systems with approximate demodulation,” in Proc. IEEE Information Theory Workshop, Cairo, Egypt, Jan. 2010, pp. 1–5.
[42] A. Burg, M. Wenk, and W. Fichtner, “VLSI implementation of pipelined sphere decoding with early termination,” in Proc. EUSIPCO-2006, Florence, Italy, Sept. 2006.
[43] S. Schwandter, P. Fertl, C. Novak, and G. Matz, “Log-likelihood ratio clipping in MIMO-BICM systems: Information geometric analysis and impact on system capacity,” in Proc. IEEE ICASSP-2009, Taipei, Taiwan, Apr. 2009, pp. 2433–2436.
[44] C. Novak, P. Fertl, and G. Matz, “Quantization for soft-output demodulators in bit-interleaved coded modulation systems,” in Proc. IEEE ISIT-2009, Seoul, Korea, Jun./Jul. 2009, pp. 1070–1074.
[45] C. Studer and H. Bölcskei, “Soft-input soft-output single tree-search sphere decoding,” IEEE Trans. Inf. Theory, vol. 56, no. 10, pp. 4827–4842, Oct. 2010.
[46] T. T. Nguyen and L. Lampe, “Bit-interleaved coded modulation with mismatched decoding metrics,” IEEE Trans. Communications, to appear.
[47] J. M. Cioffi, G. P. Dudevoir, M. V. Eyuboglu, and G. D. Forney, “MMSE decision-feedback equalizers and coding – Part I: Equalization results,” IEEE Trans. Commun., vol. 43, no. 10, pp. 2582–2594, Oct. 1995.
[48] L. Paninski, “Estimation of entropy and mutual information,” Neural Comput., vol. 15, no. 6, pp. 1191–1253, June 2003.
[49] T. J. Richardson and R. L. Urbanke, “The capacity of low-density parity check codes under message-passing decoding,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 599–618, Feb. 2001.
[50] C. Michalke, E. Zimmermann, and G. Fettweis, “Linear MIMO receivers vs. tree search detection: A performance comparison overview,” in Proc. IEEE PIMRC-06, Helsinki, Finland, Sept. 2006, pp. 1–7.
[51] M. Damen, H. El Gamal, and G. Caire, “On maximum-likelihood detection and the search for the closest lattice point,” IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2389–2402, Oct. 2003.
[52] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász, “Factoring polynomials with rational coefficients,” Math. Ann., vol. 261, no. 4, pp. 515–534, Dec. 1982.
[53] J. Jaldén, D. Seethaler, and G. Matz, “Worst- and average-case complexity of LLL lattice reduction in MIMO wireless systems,” in Proc. IEEE ICASSP-2008, Las Vegas, NV, Apr. 2008, pp. 2685–2688.
[54] D. Wübben and D. Seethaler, “On the performance of lattice reduction schemes for MIMO data detection,” in Proc. Asilomar Conf. Signals, Systems, Computers, Pacific Grove, CA, USA, Nov. 2007, pp. 1534–1538.
[55] P. H. Tan and L. K. Rasmussen, “The application of semidefinite programming for detection in CDMA,” IEEE J. Sel. Areas Comm., vol. 19, no. 8, pp. 1442–1449, Aug. 2001.
[56] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge (UK): Cambridge Univ. Press, 1999.
[57] M. Biguesh and A. B. Gershman, “Training-based MIMO channel estimation: A study of estimator tradeoffs and optimal training signals,” IEEE Trans. Signal Processing, vol. 54, no. 3, pp. 884–893, Mar. 2006.
[58] G. Taricco and E. Biglieri, “Space-time decoding with imperfect channel estimation,” IEEE Trans. Wireless Comm., vol. 4, no. 4, pp. 1874–1888, July 2005.
[59] A. Hedayat and A. Nosratinia, “Outage and diversity of linear receivers in flat-fading MIMO channels,” IEEE Trans. Signal Processing, vol. 55, no. 12, pp. 5868–5873, Dec. 2007.
[60] K. R. Kumar, G. Caire, and A. L. Moustakas, “Asymptotic performance of linear receivers in MIMO fading channels,” IEEE Trans. Inf. Theory, vol. 55, no. 10, pp. 4398–4418, Oct. 2009.
