Low-Complexity Forward Error Correction and Modulation for Optical Communication
by
Masoud Barakatain
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Electrical and Computer Engineering
University of Toronto
© Copyright 2021 by Masoud Barakatain
Abstract
Low-Complexity Forward Error Correction and Modulation for Optical Communication
Masoud Barakatain
Doctor of Philosophy
Graduate Department of Electrical and Computer Engineering
University of Toronto
2021
A novel low-complexity architecture for forward error correction (FEC) in optical com-
munication is proposed. The architecture consists of an inner soft-decision low-density
parity check (LDPC) code concatenated with an outer hard-decision staircase or zipper
code. The inner code is tasked with reducing the bit error probability to a level at which the outer code can deliver the stringent output bit error rate required in
optical communication. A hardware-friendly quasi-cyclic construction is adopted for the
inner codes.
The concatenated code is optimized by minimizing the estimated data-flow at the
decoder. A method is developed to obtain complexity-optimized inner-code ensembles.
A key feature emerging from this optimization is that it pays to leave some inner codeword
bits completely uncoded, thereby greatly reducing the decoding complexity. The trade-off
between performance and complexity of the designed codes is characterized by a Pareto
frontier. In binary modulation, up to 71% reduction in complexity is achieved compared
to previously existing designs.
Higher-order modulation via multilevel coding (MLC) is compared with bit-interleaved
coded modulation (BICM) from a performance-versus-complexity standpoint. In both
approaches, complexity-optimized error-reducing LDPC inner codes are designed for con-
catenation with an outer hard-decision code, for various modulation orders. Code designs
for MLC are shown to provide significant advantages relative to designs for BICM over the
entire performance-complexity tradeoff space, for a range of modulation orders. Codes
designed for MLC can operate with 78% less complexity, or provide up to 1.2 dB coding
gain compared to designs for BICM.
A multi-rate and channel-adaptive inner-code architecture is also proposed. A tool is
developed to optimize low-complexity rate- and channel-configurable concatenated FEC
schemes via an MLC architecture. Compared to previously existing FEC schemes, up to
63% reduction in decoding complexity, or up to 0.6 dB coding gain is obtained.
Code designs for MLC in combination with four-dimensional signal constellations are
also considered. The design method is generalized to obtain complexity-optimized non-
binary LDPC codes to concatenate with outer zipper codes. Gains of up to 1 dB over
the conventional schemes are reported.
The possibility of using a novel class of nonlinear codes in FEC design is also inves-
tigated.
To my wonderful family, Amir, Ziba, and Maryam,
and to the love of my life, Zhino.
In memory of Arash and all the precious lives lost
in the downing of flight PS752.
Acknowledgements
I would like to express my sincere gratitude to my supervisor, Prof. Frank R. Kschischang
for his support and guidance throughout my studies. Frank is an amazing scientist with
an in-depth knowledge in various fields of research. Without his brilliant insights and
ideas, his encouraging words, and his patience with me, this work would have never been
possible. He is an excellent teacher, a great mentor, and a wonderful human being. He
has been and will always be a source of inspiration in my life and in my scientific work.
It has been an honour to be his student.
I would like to acknowledge the following colleagues for their contribution to this
work. I thank Georg Bocherer and Diego Lentner with whom Frank and I collaborated
in a fruitful project, some results of which are presented in Chapter 4 of this thesis.
I appreciate Alvin Sukmadji’s work on zipper codes and his insights in designing the
concatenation schemes presented in this work. I thank Felix Frey and Sebastian Stern
for the collaborative work that resulted in Chapter 6 of this thesis. I also thank Yury
Polyanskiy and Hajir Roozbehani for the discussions on non-linear codes that helped us
in development of the work presented in Chapter 7 of this thesis.
I would also like to thank Prof. Laurent Schmalen and Dr. Vahid Aref who were
my supervisors when I was an intern at Nokia Bell Labs in Stuttgart, Germany. Their
knowledge and insight helped me a lot in broadening my understanding of this field.
I would like to thank the following friends and colleagues. I am grateful to my seniors,
Christian, Lei, Chunpo, Chris and Siddarth, for welcoming me to this group and helping
me settle and get started with my research. I thank Amir, Reza, Foad, and Kaveh, my
colleagues during most of my studies, for their friendship and the interesting scientific
discussions. I also thank Susanna, Bo, Qun, Saber, and Mohannad, my newer colleagues,
for keeping the office a warm and welcoming place to work. Thank you all for creating a
very positive and collaborative work environment over the years.
I have been blessed in life with having many wonderful friends. I would like to thank
Atena and Saman, two of my oldest friends, who were my support system when I came to
Canada. To them and to all my great friends, including Soheil, Amirreza, Alborz, Rozhin,
Peter, Sajjad, Amin, and Shiva: you should know that your support and encouragement
was worth more than I can express on paper.
Finally, I would like to thank my amazing family. I am grateful to my parents, Amir and Ziba, for providing me with the best possible care and education every step of the way. I thank my sister, Maryam, for her kindness and support. And I wholeheartedly
thank the love of my life, my best friend, Zhino, for her patience, encouragement, and
unconditional love; I could not have asked for a better partner on this journey.
Contents

1 Introduction
1.1 Motivation
1.2 Assumptions and Figures of Merit
1.2.1 Assumptions
1.2.2 Figures of Merit
2 Background
2.1 Staircase Codes
2.2 Zipper Codes
2.3 LDPC Codes
2.3.1 EXIT Functions
2.3.2 Quasi-Cyclic LDPC Codes
2.4 Code Concatenation
2.5 Coded Modulation
2.5.1 Multi-Level Coding
2.5.2 Bit-Interleaved Coded Modulation
3 Low-Complexity Concatenated LDPC-Staircase Codes
3.1 Introduction
3.2 The Inner-Code Structure
3.2.1 Code Description
3.2.2 Ensemble Parameterization
3.2.3 Complexity Measure
3.3 Complexity-Optimized Design
3.3.1 EXIT Chart Analysis
3.3.2 Code Optimization
3.3.3 Practical Considerations
3.4 Results
3.4.1 Pareto Frontier
3.4.2 Two Design Examples
3.4.3 Comparison to Other Works
3.4.4 Quasi-Cyclic-Structured Inner Codes
3.4.5 Concatenated LDPC-Zipper Structure
3.5 Conclusion
4 Low-Complexity Concatenated FEC for Higher-Order Modulation
4.1 Introduction
4.2 Concatenated Code Description
4.3 MLC Scheme
4.3.1 Coded-Modulation Description
4.3.2 Inner-Code Description
4.3.3 Ensemble Optimization
4.4 BICM Scheme
4.4.1 Coded-Modulation Description
4.4.2 Inner-Code Description
4.4.3 Ensemble Sampling
4.4.4 Ensemble Optimization
4.5 Results
4.5.1 Design for 28% OH
4.5.2 Design for 25% OH
4.5.3 Design Example
4.6 Conclusion
5 Low-Complexity Rate- and Channel-Configurable Concatenated Codes
5.1 Introduction
5.2 Concatenated Code Description
5.3 Inner-Code Description
5.4 Ensemble Optimization and Code Construction
5.4.1 Reference Complexities
5.4.2 Configurable Inner-Code Optimization
5.4.3 Code Optimization Via Differential Evolution
5.4.4 Code Construction
5.5 Results
5.6 Conclusion
6 Complexity-Optimized Non-Binary Coded Modulation for Four-Dimensional Constellations
6.1 Introduction
6.2 Four-Dimensional Signal Constellations
6.3 Four-Dimensional Set-Partitioning
6.4 Concatenated Non-Binary FEC Architecture
6.5 Ensemble Optimization
6.5.1 Empirical Density Evolution
6.5.2 BER Analysis
6.5.3 Differential Evolution
6.6 Results
6.7 Conclusion
7 Low-Density Nonlinear-Check Codes
7.1 Introduction
7.2 ERC Limit and Nonlinear Codes
7.3 LDNC Codes
7.3.1 Code Description
7.3.2 Encoding
7.3.3 Message-Passing Decoding
7.3.4 Efficient Message Computation
7.4 Error-Reducing Performance Results
7.5 Conclusion
8 Conclusion and Topics of Future Research
Bibliography
List of Tables

3.1 Quantifying Finite Interleaving Loss
4.1 An example of degree distributions of various types, for m = 3.
4.2 Statistics of the simulation results shown in Fig. 4.5.3 (14 inner iterations)
4.3 Statistics of the simulation results shown in Fig. 4.5.3 (12 inner iterations)
6.1 Bit-level capacities of the 4D constellations at their respective operating points.
7.1 List of all distinct check functions for dc = 7.
List of Figures

2.1 Staircase code structure. Information bits fill the white part of the blocks and the parity bits fill the rest. The block B0 is initialized with all-zeros.
2.2 A typical zipper-code framework (left) and its representation of a staircase code (right).
2.3 The two common approaches to coded modulation.
3.1 Tanner graph of an LDPC inner code, consisting of some degree-zero variable nodes (uncoded components) and a coded component. The rectangle labeled by Π represents an edge permutation. The VN and CN degree distributions are to be designed.
3.2 The elementary EXIT functions used for designing a rate-8/9 code ensemble in Section 3.4.2, Example 1. The EXIT function of the resulting optimized ensemble is compared with that of the (3, 27)-regular LDPC ensemble.
3.3 The (Es/N0, ηin) Pareto frontiers of the inner code in the proposed design, compared with the benchmark design of [1], at 15%, 20%, and 25% OHs.
3.4 Simulated inner-code BERs on bits passed to the outer code, sampled from the complexity-optimized ensembles, for designs at 20% OH. The mid-point on each BER curve (highlighted by an ‘o’) is the code operational point, i.e., the SNR for which the inner code is designed to achieve Pout ≤ psc.
3.5 The BER on information nodes of different degrees in the ensemble of Example 1 and the BER on bits passed to the outer code, denoted by Pout. The degree distribution on the information nodes is 0.1665 + 0.0223x + 0.3919x^3 + 0.4193x^4.
3.6 NCG and η comparisons of the proposed concatenated design and other soft-decision FEC schemes, at 20% OH. Decoders using a flooding (resp., layered) decoding schedule are denoted with Fl (resp., La). For the proposed codes (denoted as “prop.”), the inner decoding algorithm (MS or SP) is specified. Block length 30000 is considered for the designs with QC-structured inner codes. The following abbreviations are used in describing the referenced codes. BCH: Bose-Ray-Chaudhuri-Hocquenghem, UEP: Unequal Error Protection, RS: Reed-Solomon, CC: Convolutional Code, SpC: Spatially Coupled.
3.7 NCG and η comparisons of the QC constructions of the designed concatenated FEC, at 20% OH, under layered (La) and flooding (Fl) schedules.
3.8 The (Es/N0, ηin) Pareto frontiers of the designed concatenated LDPC-zipper FEC, at 20% OH, compared with the LDPC-staircase design and the benchmark design of [1].
3.9 NCG and η comparisons of the proposed concatenated design and other soft-decision FEC schemes, at 20% OH. The concatenated design with an outer zipper code, the NCG-η Pareto frontier of which is the top-left curve, outperforms other designs by a wide margin.
4.1 The encoder and the decoder in the MLC scheme. Here, m = log2 M denotes the number of bits per PAM symbol.
4.2 The encoder and the decoder in the BICM scheme. Here, m = log2 M denotes the number of bits per PAM symbol.
4.3 Inner-code ensemble considered for the BICM scheme.
4.4 Performance-complexity comparisons of optimized codes for MLC and BICM using 64-QAM, compared with the design in [2]. The number of decoding iterations required by each designed code is indicated. At the overall 28% OH of these schemes, CSL = 15.0 dB.
4.5 Simulated decoder outputs of inner codes for designs at 28% OH with 64-QAM. The mid-point on each BER curve (highlighted by an ‘o’) is the code operational point, i.e., the SNR for which the inner code is designed to achieve Pout ≤ P^t_out.
4.6 Performance-complexity comparisons of the obtained optimized codes for MLC and BICM of various orders at 25% overall OH, compared with the designs in [3]. The number of decoding iterations required by each designed code is indicated.
4.7 Achievable information rate for 16- and 64-QAM modulations compared to the unconstrained Shannon capacity. The operational point of the designed concatenated code is also shown and compared to that of [4].
4.8 The interleaving and placement of bits into the real buffer of the outer decoder per chunk, for the FEC parameters discussed in Sec. 4.5.3.
4.9 BER simulations for the designed concatenated LDPC-zipper FEC scheme.
5.1 The encoder and the decoder in the configurable FEC scheme. Here, m = log2 M denotes the number of bits per PAM symbol.
5.2 Designed configurable FEC schemes, denoted by the connected marks. Each mark is an operating point and its complexity score is indicated on its label.
5.3 Performance-versus-complexity comparison between the designed configurable FEC schemes and those of [5]. The FEC rate, in bits per symbol, of each operating point is indicated on its label.
6.1 Constellation-constrained capacities in bit/symbol versus the SNR. The inset shows the 2D projection of the corresponding signal constellations.
6.2 Illustration of the first (left) and second (right) partitioning steps of the D4-based constellations in one polarization.
6.3 The proposed concatenated FEC architecture for DP transmission over the AWGN channel. The alphabet field sizes are denoted below their corresponding stages.
6.4 The BER on bits passed to the outer code. The constellation capacities are indicated by the vertical lines. The horizontal line denotes the outer-code threshold. Here, the solid curves denote the non-binary designs and the dotted and dashed curves denote their binary counterparts: TS-BICM indicates the performance of the two-stage BICM-based scheme of [6] and 1D-MLC indicates the performance of the scheme of Sec. 4.3.
7.1 The block diagram of a scheme that achieves the ERC limit.
7.2 Factor graph representation of an LDNC ensemble. Information- and check-node degrees are denoted by dv and dc, respectively. Here, a degree-dc CN is connected to dc information nodes. The rectangle labelled Π represents an edge permutation.
7.3 Encoding operation at a nonlinear check node.
7.4 A typical CN of degree dc. Node y is set to denote the function the CN performs on the information nodes.
7.5 Binary computation tree for obtaining q(x) when dc = 7. The messages are passed from the bottom up.
7.6 BER curves of the regular ensemble with dv = 4 and dc = 7 with various check functions, plotted versus the number of decoding iterations. The codes are simulated at 0.5 dB above their (error-free) constrained Shannon limit.
7.7 BER curves of regular dv = 4, dc = 7 LDNC ensembles with three check functions, plotted over a wide range of SNRs. All decoders perform 4 decoding iterations. The error-free constrained Shannon limit is at 1.92 dB SNR.
List of acronyms
2D two-dimensional
4D four-dimensional
AIR achievable information rate
APP a posteriori probability
AWGN additive white Gaussian noise
BCH Bose-Chaudhuri-Hocquenghem
BER bit error rate
BICM bit-interleaved coded modulation
BRGC binary reflected Gray code
CN check node
CSL constrained Shannon limit
DP-QAM dual-polarization quadrature amplitude modulation
DP dual-polarization
DPS degree partition and sort
ERC error-reducing code
EXIT extrinsic information transfer
FEC forward error correction
HD hard-decision
LDGM low-density generator-matrix
LDMC low-density majority-check
LDNC low-density nonlinear-check
LDPC low-density parity-check
LLR log-likelihood ratio
LSB least significant bit
MET multi-edge type
MLC multi-level coding
MS min-sum
MSB most significant bit
NCG net coding gain
OH overhead
OTN optical transport network
PAM pulse amplitude modulation
QAM quadrature amplitude modulation
QC quasi-cyclic
RS Reed-Solomon
SD soft-decision
SNR signal-to-noise ratio
SP sum-product
SQP sequential quadratic programming
VN variable node
Chapter 1
Introduction
1.1 Motivation
This thesis is about the design of low-complexity forward error correction (FEC) ar-
chitectures for applications with high throughput, as needed, for example, in optical
communication systems. We develop methods to obtain FEC and modulation schemes
in which the primary focus is on minimizing the complexity of decoding information bits.
We also aim at understanding the performance-complexity trade-offs of the FEC schemes
with various modulation formats.
Optical communication systems mandate a bit error rate (BER) of less than 10^−15
on bits delivered to the customer. To meet this stringent requirement, and with the ever
increasing demand for throughput, channel coding has become an essential component
of the optical transport networks (OTNs), and the study of efficient and low-complexity
FEC schemes for optical communication is an active area of research; see [7–11] and
references therein.
Early FEC scheme proposals for OTNs (ITU-T G.975.1 [12] for example) used Reed-
Solomon (RS) codes and Bose-Chaudhuri-Hocquenghem (BCH) codes, both algebraic
codes, as the FEC components. These codes achieve very good performance, and their algebraic structure and syndrome-based decoding allow the decoding procedure to be performed in a single pass with low complexity. More recently, algebraic-based
product-like codes, such as staircase codes [13] and zipper codes [14], have been adopted
in OTNs. These codes are low-complexity hard-decision (HD) FEC solutions that can
operate at a gap of only ∼0.5 dB from their information-theoretic limits.
As the demand for throughput in OTNs increases, researchers increasingly specify
the use of soft-decision (SD) codes, i.e., codes that can make use of probabilistic symbol
reliabilities. The difference between an SD and an HD code lies in the number of decoder
inputs per received symbol, required for decoding. For example, for a binary input
channel, an HD decoder requires a single-bit quantization of the channel output, while
an SD decoder requires a softer quantization of the channel output, as a measure of the
reliability of the received symbol. Hence, HD codes are fundamentally weaker than SD
codes. Examples of modern SD codes are turbo codes, low-density parity-check (LDPC)
codes, and spatially-coupled codes [15]. At a similar overhead (OH) and signal-to-noise
ratio (SNR), SD codes can achieve coding gains of ∼1–2 dB, or more, relative to the HD
codes used in earlier OTN proposals [16].
The excellent performance of SD codes comes, however, at the expense of a signifi-
cantly increased decoding complexity. A comparison of the implementations of soft- and
hard-decision decoders shows that SD decoders typically consume an order of magni-
tude more power than HD decoders [17–20] operating at comparable throughputs. As
estimated in [21, 22], with a pure SD FEC approach, the decoder component would
be responsible for about 16%–35% of the total power consumption in coherent optical
transmission systems, higher power consumption than any other component of the net-
work. In short-reach optical networks, where throughputs as high as 1 Tb/s are being
considered [23], the FEC power consumption is an even bigger concern [24]. If the en-
ergy consumption per decoded bit does not decrease for future OTN designs, the FEC
component becomes increasingly energy hungry and difficult to cool.
In this work, we study various aspects of code design for optical communication,
with a particular focus on obtaining architectures that attain low decoding complexity,
the specific measure of which is defined in Sec. 1.2.2. We consider an FEC architecture
consisting of an SD LDPC code concatenated with an HD staircase or zipper code. We
aim to take advantage of the best of both worlds: the superior performance of SD FEC
schemes and the low-complexity decoding of the HD FEC schemes.
In our designs, the inner LDPC code is tasked with matching the channel and the
rest of the FEC components to the outer code, reducing the BER on bits delivered to it
below its threshold. That, in turn, enables the outer code to take the BER further down,
below 10−15, as required by OTNs. For example, a typical pre-FEC BER is ∼10−2. We
then task the inner code with reducing the BER to ∼10−3, after which the outer code
takes over and brings the BER to below 10−15. With this approach, the bulk of the error-
correction is carried by the outer HD code with very low decoding complexity. The overall
FEC complexity is then dominated by that of the SD decoder. While not considered in
this work, coarsely-quantized LDPC decoding algorithms [25–28] or soft-aided decoding
algorithms for staircase or zipper codes [11,29–31] can be adopted to further reduce the
complexity or improve the performance of the proposed scheme, respectively.
In Chapter 3 we study the design of low-complexity LDPC code architectures. A key
feature that emerges from our design is that it pays to have some of the channel symbols
bypass the inner code, thereby eliminating a significant portion of the decoding complexity. We then develop an optimization routine and obtain low-complexity LDPC
codes for various system specifications that, under the considered measures, significantly
outperform all previously existing SD FEC schemes and do so with up to 71% reduction in
decoding complexity. In this work, a hardware-friendly quasi-cyclic (QC) construction is
adopted for the inner codes, which can realize an energy-efficient decoder implementation,
and even further complexity reductions via a layered message-passing decoder schedule.
The explosive growth in demand to achieve increased transmission rates in optical
communication (currently at a compound annual rate of 48% [23]) has given impetus
to the study of FEC schemes in combination with higher order modulation to increase
the spectral-efficiency; see [8, 32, 33] and references therein. In Chapter 4 we study the
design of low-complexity concatenated codes both in a bit-interleaved coded modulation
(BICM) scheme and a multi-level coding (MLC) scheme. We obtain code designs that
handily outperform the previously existing schemes, with up to 60% reduction in decoding
complexity. More importantly, by a clever choice of FEC architecture, we obtain code
designs via the MLC scheme that provide significant advantages relative to designs via the BICM scheme over the entire performance-complexity tradeoff space. We also provide examples in which the designed FEC schemes are described in great detail.
In Chapter 5 we develop tools to design low-complexity rate- and channel-configurable
FEC schemes. We propose a concatenated coded modulation scheme that can operate
at multiple transmission overheads and channel qualities, and with various mod-
ulation orders via an MLC scheme. In this design, the transmission rate is configurable
by signalling with various modulation formats and by the configurable inner-code rate,
and operation at various channel qualities is realized by the configurable inner-decoding
complexity. Such flexibility is mandated in a variety of applications including designing
multi-vendor interoperable modules and software defined optical networks. The obtained
configurable codes achieve up to 60% reduction in complexity compared to previously
existing designs.
Non-binary LDPC codes have also been considered for FEC in optical communication
[34] because they can outperform their binary counterparts at short to medium block lengths. Conventional non-binary LDPC codes, however, also have higher decoding complexity. In
Chapter 6 we adapt our design tools to obtain complexity-optimized concatenated non-
binary LDPC-zipper codes. In particular, we consider dense signal constellations based
on four-dimensional (4D) lattices and show that with the obtained non-binary FEC
schemes, based on an MLC architecture, we achieve up to 1 dB gain over conventional
FEC schemes, yet with reasonable complexity.
In Chapter 7 we first derive a theoretical limit for the error-reducing inner codes designed in our schemes, and then explore the possibility of using a novel class of nonlinear codes in their place. While the obtained codes are not well-suited for optical communication, they are nevertheless very interesting mathematically,
and may be useful in code designs for other applications.
Chapter 2 reviews background concepts that are not new to this work but are useful preliminaries to the contributions that follow. Finally, Chapter 8 provides concluding remarks and pointers to possible future research based on this work.
1.2 Assumptions and Figures of Merit
1.2.1 Assumptions
Throughout this work we design codes assuming a memoryless, additive white Gaussian
noise (AWGN) channel. Although the optical channel is in fact nonlinear, the vari-
ous signal processing units that perform filtering, chromatic-dispersion and nonlinearity
compensation, and various other tasks prior to decoding, typically leave the FEC with a
residual AWGN channel [7, 35,36].
While the actual performance of the codes we obtain might be different over an optical
channel, the ordering of their performance is very unlikely to change. In other words, if
code A outperforms code B on an AWGN channel, code A is also very likely to outperform
code B on a typical (equalized, nonlinearity-compensated) optical link. By assuming an
AWGN channel, we can see the potential in the proposed FEC schemes, which makes
it worthwhile to implement them for an optical channel. This is a common practice in
evaluating design proposals for optical communication [3, 8, 37].
Throughout this thesis, we consider signalling with a uniform distribution over the constellation points. We assume unit energy per signal dimension at the receiver. Further, we assume an AWGN channel with σ² denoting the noise variance per dimension. In this setting, the SNR, in decibels, is given by Es/N0 ≜ −10 log10 σ².
1.2.2 Figures of Merit
Performance
A fundamental figure of merit of an FEC scheme is determined by the SNR at which it
operates, relative to a reference SNR. When the reference SNR is that of the theoretical
limit, the performance metric is the gap to the constrained Shannon limit (CSL). When
the reference SNR is that of an uncoded scheme, the performance metric is the net coding
gain (NCG). For example, consider an FEC scheme that operates at rate R (bit/symbol), uses the 2^{2m}-ary signal constellation Ω, and requires SNRc to achieve a target BER of 10^−15. Now let CΩ(SNR) denote the mutual information achieved by a uniform input distribution over the constellation Ω as a function of SNR. The gap to the CSL is given by

gap to CSL (dB) = 10 log10 (SNRc / CΩ^{−1}(R)),   (1.1)

where CΩ^{−1} is the inverse function of CΩ. The NCG provided by this FEC scheme is given by

NCG (dB) = 10 log10 ((R · SNRu) / (2m · SNRc)),   (1.2)

where SNRu is the SNR required in an uncoded transmission, using Ω, to achieve a BER of 10^−15.
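Both figures of merit are easy to evaluate once the relevant SNRs are known. The following is a minimal sketch; all numeric values are hypothetical, and the limit SNR CΩ^{−1}(R) is supplied as a precomputed number rather than derived from the constellation:

```python
import math

def gap_to_csl_db(snr_c, snr_limit):
    """Gap to the constrained Shannon limit, as in (1.1); SNRs in linear scale.
    snr_limit plays the role of C_inv(R), the SNR at which the constellation's
    mutual information equals the rate R (bit/symbol)."""
    return 10 * math.log10(snr_c / snr_limit)

def ncg_db(rate_bits_per_symbol, bits_per_symbol, snr_u, snr_c):
    """Net coding gain, as in (1.2): R (bit/symbol), 2m uncoded bits/symbol,
    and the linear SNRs required for the target BER with and without coding."""
    return 10 * math.log10(rate_bits_per_symbol * snr_u
                           / (bits_per_symbol * snr_c))

# Hypothetical numbers for illustration only.
snr_c = 10 ** (5.0 / 10)      # coded scheme operates at 5.0 dB
snr_limit = 10 ** (4.2 / 10)  # C_inv(R) sits at 4.2 dB
snr_u = 10 ** (12.6 / 10)     # uncoded transmission needs 12.6 dB
print(round(gap_to_csl_db(snr_c, snr_limit), 2))   # 0.8 dB gap to CSL
print(round(ncg_db(1.6, 2.0, snr_u, snr_c), 2))    # NCG in dB
```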
Complexity
Power consumption and heat dissipation at the decoder have become increasingly limiting
factors in FEC design for optical communication. It has been shown that with a given
code and decoding algorithm, the power consumption at the decoder chip scales roughly
linearly with the system throughput [21, 22]. Therefore, it is customary to consider the
energy consumed per decoded information bit as a measure of FEC energy efficiency.
Realistic efficiency measurements for FEC schemes, however, are very hard to formulate, since they are highly dependent on the implementation technology, the architecture, and other factors, and the only reliable way of estimating them is to design and simulate an application-specific integrated circuit for the FEC scheme [38]. Therefore, often a measure of decoding
complexity is considered as a proxy.
In this work, in order to quantify, and eventually minimize, the FEC complexity,
we consider the decoder data-flow and count the number of messages required to pass
among the various nodes for successful decoding. As the measure of FEC complexity,
we normalize the number of messages passed in decoding by the number of decoded
information bits and denote it by η. A similar complexity measure has been used in a
number of prior works including [1, 39].
We note that this measure is only a proxy for the actual decoder complexity (which,
as stated above, might be measured in energy efficiency terms, or area, memory require-
ments, I/O requirements, or in combinations of these factors). While it is true that η is
only a proxy, we believe that it gives insight into code design, since the actual complexity
is very likely to scale with it. For example, the vast majority of power dissipated in decoder hardware is the dynamic power resulting from the signal toggling corresponding
to messages passed between the nodes. Also, in most codes and decoding algorithms, the
number of arithmetic operations required in decoding is linearly related to the number of
messages passed in decoding. Therefore, if the FEC scheme were to be implemented using
compute-in-memory technology [40] (where data can be processed in memory, reducing
the number of times it is moved), η would still be a relevant complexity measure.
Interestingly, as will be shown in Chapters 3–6, minimizing η has desirable hardware implications such as the use of degree-one variable nodes, or even leaving some bits uncoded (degree zero) at the inner code, the presence of which has obvious complexity benefits by any measure.
Chapter 2
Background
2.1 Staircase Codes
A staircase code [13] is a binary code that consists of (possibly infinitely many) m × m blocks B0, B1, B2, B3, . . ., as shown in Fig. 2.1. Associated with a staircase code is a binary constituent code, C(nc, nc − rc), where nc is the code length and rc is the number of parity bits per constituent codeword. The constituent-code rate is Rc = 1 − rc/nc.

At the staircase encoder, we initially let B0 be an all-zero block. For block Bi, where i ≥ 1, we first fill the white part of the block (see Fig. 2.1) with information bits. We then use the constituent encoder to fill the rest of the block such that the rows of [B_{i−1} B_i^T] are codewords in C. Note that we must have nc = 2m. Per block, we then have m(m − rc) information bits and m·rc parity bits. The rate of the unterminated staircase code is obtained as

Rsc = (m − rc)/m = 2Rc − 1.

In this structure, any two information bits are jointly contained in at most one constituent codeword.
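The encoding rule can be illustrated with a toy sketch. A single-parity-check code stands in for the constituent code purely for illustration (practical staircase codes use much stronger algebraic constituents, e.g. shortened BCH codes); the check at the end verifies the staircase constraint on each pair of consecutive blocks:

```python
import numpy as np

rng = np.random.default_rng(0)
m, rc = 4, 1
nc = 2 * m  # constituent code length must equal 2m

def spc_encode(info_bits):
    """Toy systematic constituent encoder: an (nc, nc - 1) single parity
    check (rc = 1), used here only to make the structure concrete."""
    parity = info_bits.sum() % 2
    return np.append(info_bits, parity)

blocks = [np.zeros((m, m), dtype=int)]  # B0 is initialized to all-zeros
for _ in range(3):
    prev = blocks[-1]
    Bi_T = np.empty((m, m), dtype=int)  # build B_i^T row by row
    for r in range(m):
        # First m info bits are the row of B_{i-1}; m - rc bits are new data.
        info = np.concatenate([prev[r], rng.integers(0, 2, m - rc)])
        Bi_T[r] = spc_encode(info)[m:]  # last m bits of the codeword
    blocks.append(Bi_T.T)               # store B_i itself

# Staircase constraint: every row of [B_{i-1}  B_i^T] is a codeword of C.
for prev, cur in zip(blocks, blocks[1:]):
    rows = np.hstack([prev, cur.T])
    assert (rows.sum(axis=1) % 2 == 0).all()
print("staircase constraint holds")
```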
Staircase codes can be decoded by iterative window decoding. The window size W is defined as the number of blocks in the decoder at each time. For i ∈ {i2, i3, . . . , iW}, at each iteration, we use the constituent decoder to decode the rows of [B_{i−1} B_i^T]. Decoding continues until a maximum number of iterations is reached. The staircase decoder then outputs block B_{i1}, the oldest block of bits in the decoding window, and brings a newly received block into it.

The parameters we pick to construct the staircase code and its decoder determine its threshold, psc. The threshold is the maximum cross-over probability of a binary symmetric channel for which the decoder can achieve a target output BER, set at 10^−15
Figure 2.1: Staircase code structure. Information bits fill the white part of the blocks and the parity bits fill the rest. The block B0 is initialized with all-zeros.
in optical communication. See, e.g., [1, Table I] and [41, Tables I, II] for various staircase
code constructions and their thresholds.
Staircase codes have excellent error-correcting performance and can operate within
a gap of only 0.56 dB from the binary symmetric channel Shannon limit. Their al-
gebraic syndrome-based decoding means these codes also have extremely low decoding
complexity. Hence, staircase codes are very attractive choices in FEC design in optical
communication. In fact, in the recent 400ZR standard [4] a staircase code, concatenated
with a Hamming code, is used in the FEC scheme. At the expense of higher decod-
ing complexity, the performance of staircase codes, and product codes in general, can
be improved by modifying the decoding algorithm [42] and aiding the decoder by soft
information from the channel [11,29–31].
As shown in [43, Figure 1.1], however, the staircase decoder memory size grows ex-
plosively at higher rates. Therefore, in FEC designs where concatenation with codes of very high rate is desirable (see, e.g., Chapter 3), a staircase code may not be a practical choice.
2.2 Zipper Codes
Zipper codes [14] are a newly proposed framework for describing spatially-coupled product-
like codes such as staircase codes and braided block codes. Similar to staircase codes, a
constituent code C(nc, nc − rc) is associated with a zipper code. The zipper code buffer
is divided into a pair of virtual and real buffers, shown in Fig. 2.2 as the left and right
halves of the codes, respectively. The virtual buffer contains copies of the bits in the real
Figure 2.2: A typical zipper-code framework (left) and its representation of a staircase code (right).
buffer, possibly in a different arrangement.
At the encoder, we first fill the virtual buffer by a permutation of previously encoded
bits. We then fill the white part of the real buffer (see Fig. 2.2) by information bits.
Finally, we use the constituent encoder to fill the rest of the real block such that the rows
of the buffer are codewords in C.
Zipper codes can also be decoded by iterative window decoding. The decoding window
consists of W chunks, where a chunk is defined as a collection of µ rows. At each iteration,
we decode each chunk by using the constituent decoder to decode the rows. If we correct
(or much less often, miscorrect) any bits in the chunk, we update their copies accordingly.
Decoding continues until a maximum number of iterations is reached. The zipper decoder
then outputs the oldest chunk of bits in the decoding window, and brings a newly received
chunk of bits into it.
Similar to staircase codes, the parameters we pick to construct the zipper code and
its decoder determine its threshold, below which the decoder can achieve a target output
BER, set at 10−15 in optical communication.
Note that with zipper codes we have more flexibility in choosing the chunk size and
decoding-window size compared to staircase codes. Hence, we can keep the decoder
memory size in check when engineering practical high-rate codes. In fact, high-rate and
practical zipper codes are reported in [14, Table I] that can operate within a gap of only
0.49 dB from the binary symmetric channel Shannon limit. We believe zipper codes are
the natural improvement over staircase codes to be used in the next standards.
2.3 LDPC Codes
Invented by Gallager [44], an (N,K) LDPC code is defined as the null space of a sparse
(N −K)×N parity-check matrix H . Here, N and K denote the code length and code
dimension, respectively, and sparse means that the number of non-zero elements in H is
much smaller than the number of zero elements.
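As a toy illustration of this definition (far too small for H to be "sparse" in any meaningful sense), the null space of a small parity-check matrix over GF(2) can be enumerated by brute force:

```python
import itertools
import numpy as np

# A toy parity-check matrix: N = 6 columns, M = N - K = 3 rows, so K = 3.
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])

# The code is the null space of H over GF(2): all x with H x = 0 (mod 2).
code = [np.array(x) for x in itertools.product([0, 1], repeat=6)
        if not (H @ x % 2).any()]
print(len(code))  # 2**K = 8 codewords, since H has full rank over GF(2)
```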
An LDPC code also has a Tanner graph representation [45] as shown in, e.g., Fig. 3.1. Here, the N columns of H are represented by variable nodes (VNs) and the M = N − K rows are represented by check nodes (CNs). Wherever there is a non-zero element in the H matrix, the corresponding VN and CN are connected by an edge in the Tanner graph.
The Tanner graph of an LDPC code can be described by its VN and CN degree distributions. Let Dv denote the maximum VN degree. The node-perspective variable-degree distribution is defined as L(x) = ∑_{i=0}^{Dv} Li x^i, where Li is the fraction of degree-i VNs. The edge-perspective degree distribution is defined as λ(x) ≜ L′(x)/L′(1) = ∑_{i=1}^{Dv} λi x^{i−1}, where L′(x) = dL(x)/dx.
For the ease of decoder implementation, it is often assumed that the CN degree distribution is concentrated on one or two consecutive degrees, dc and dc + 1, where dc = ⌊d̄c⌋ and d̄c denotes the average CN degree. The node-perspective check-degree distribution is defined as R(x) = ∑_{d=dc}^{dc+1} Rd x^d, where Rd is the fraction of degree-d CNs. The edge-perspective degree distribution is defined as ρ(x) ≜ R′(x)/R′(1), and can also be obtained as

ρ(x) = [dc(dc + 1 − d̄c)/d̄c] x^{dc−1} + [(d̄c − dc(dc + 1 − d̄c))/d̄c] x^{dc}.
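A quick numerical sketch of this construction (the average degree d̄c = 6.4 is an arbitrary illustrative value):

```python
import math

def rho_coeffs(dc_avg):
    """Edge-perspective CN degree distribution concentrated on
    dc = floor(dc_avg) and dc + 1; a sketch of the formula above."""
    dc = math.floor(dc_avg)
    rho_lo = dc * (dc + 1 - dc_avg) / dc_avg             # coefficient of x^(dc-1)
    rho_hi = (dc_avg - dc * (dc + 1 - dc_avg)) / dc_avg  # coefficient of x^dc
    return dc, rho_lo, rho_hi

dc, rho_lo, rho_hi = rho_coeffs(6.4)
assert abs(rho_lo + rho_hi - 1) < 1e-9   # edge fractions sum to 1

# Node fractions R_dc = dc + 1 - dc_avg and R_{dc+1} = dc_avg - dc
# recover the node-average degree d̄c.
R_lo, R_hi = dc + 1 - 6.4, 6.4 - dc
assert abs(R_lo * dc + R_hi * (dc + 1) - 6.4) < 1e-9
print(dc, round(rho_lo, 4), round(rho_hi, 4))  # 6 0.5625 0.4375
```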
The VN node-perspective and edge-perspective distribution parameters are related by

Li = d̄c(1 − R)λi/i,

where R is the LDPC code rate. This identity can be obtained by first setting Li = Ni/N, where Ni is the number of degree-i VNs, then by using the relation 1 − R = M/N, afterwards by using M·d̄c = E, where E is the number of edges in the Tanner graph, and finally by using Ni = Eλi/i.
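The identity can be verified numerically; the degree distribution and rate below are hypothetical:

```python
# Numerical check of L_i = dc_avg * (1 - R) * lambda_i / i for a toy
# node-perspective degree distribution (values are illustrative only).
L = {0: 0.0, 2: 0.5, 3: 0.3, 6: 0.2}
Lp1 = sum(i * Li for i, Li in L.items())                 # L'(1) = E/N
lam = {i: i * Li / Lp1 for i, Li in L.items() if i > 0}  # edge perspective

R = 0.8                  # LDPC code rate, so M/N = 1 - R = 0.2
dc_avg = Lp1 / (1 - R)   # from M * dc_avg = E, i.e. dc_avg = (E/N)/(M/N)
for i, Li in L.items():
    if i > 0:
        assert abs(Li - dc_avg * (1 - R) * lam[i] / i) < 1e-9
print("identity verified")
```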
An LDPC code can be decoded by iterative message-passing decoding on its factor
graph. A factor graph [46] is a type of graphical model well-suited for describing codes
and iterative decoding algorithms via the sum-product (SP) algorithm. In the SP algorithm, the messages passed on the edges of the graph are “beliefs” about the symbols associated
with the VNs. These beliefs are typically represented by extrinsic posterior probability
vectors. Hence, message-passing and decoding via SP algorithm is sometimes referred to
as belief propagation. With the SP decoding algorithm, LDPC codes can approach the
Shannon limit [47]. Other, sub-optimal, message passing schemes also exist, including
max-product [45], min-sum [48], and offset min-sum [49] schemes, that may offer practical advantages compared to SP message-passing, although typically at a performance loss.
The excellent performance of LDPC codes, however, comes with several practical
challenges, some of which are listed below.
• In their general form, LDPC codes have an encoding complexity that scales quadratically with the code length, both in terms of processing and memory requirements.
There are, however, structured LDPC codes that allow for a low-complexity en-
coding. Examples of such constructions include repeat-accumulate codes [50] and
low-density generator matrix (LDGM) codes [51].
• The exchange of soft information for every VN and CN in message-passing decoding
of LDPC codes leads to very high power consumption at the decoder. Obtaining
updates to those messages also is computationally intensive. Several ideas have been
explored to address the problem of high power consumption in LDPC decoders,
including chip-voltage reduction, switching activity reduction, use of simpler and
quantized, but sub-optimal, message-passing algorithms, and early termination [26–
28,52–55]. Also, introducing structure to LDPC codes reduces their implementation complexity and helps realize energy-efficient decoders [56, 57], as will be discussed
in Sec. 2.3.2. Nevertheless, LDPC decoders typically still consume about an order
of magnitude more power than hard-decision decoders [17–20].
• Due to their random structure, LDPC codes that are to operate close to the Shan-
non limit typically suffer from an error floor [58]. Therefore, in applications such
as optical communications where a very low BER is required, a clean-up (usually
algebraic) outer code has to be concatenated with an LDPC code to mitigate the
remaining errors.
2.3.1 EXIT Functions
The standard analysis and design tool for LDPC codes is the density evolution algorithm.
Density evolution, proposed in [47], can accurately track the convergence behaviour of
very long LDPC codes under various decoding schemes. For example, density evolution can be used to track the pdf of LLR messages at each decoding iteration, or to test whether an LDPC code can operate under a given channel condition.
An extrinsic information transfer (EXIT) function analysis can also be used to track
the decoding behaviour of an LDPC code [59]. While not as accurate as density evolution,
EXIT function analysis has much lower computational complexity. It provides a fast and sufficiently accurate tool for designing LDPC codes.
In [60], an accurate one-dimensional EXIT function analysis was proposed and used
for LDPC code design. The key idea is to assume a symmetric Gaussian distribution for
the messages that come from the VNs, but not from the CNs. In the log-likelihood ratio
(LLR) domain, the SP update rule requires the VNs to send the sum of the extrinsic
messages that they receive from the CNs, plus their channel message. The Gaussian
distribution of these messages then can be explained by the central limit theorem.
The EXIT function then tracks a measure of progression throughout decoding itera-
tions. The measure can be the mean of the Gaussian LLRs, the probability of error in
messages, or the average mutual information between the values of the VNs and the extrinsic LLR messages. Each of these measures proves more accurate than the others in certain settings. The error-probability EXIT function, for example, gives the
probability of error in messages coming from the VNs after one decoding iteration as a
function of the message error probability at the current iteration.
Similarly, in [60] the authors define elementary EXIT functions that track a certain
measure of decoding progression for a VN of particular degree. Moreover, they show
that the EXIT function of the decoder can be closely approximated as a linear function
(corresponding to the VN degree distribution) of the elementary EXIT functions. As will
be shown in Chapter 3, this approximation is key to LDPC code design.
In [60, Fig. 3] the authors provide a visualization of how the EXIT function tracks the
progress of decoding in various iterations. Furthermore, in [61] the authors give a formula
that, given the EXIT function, estimates the number of decoding iterations required to
take the message BER down to a target BER. As shown in Chapter 3, this formula can
be used to design LDPC codes with minimized decoding complexity.
2.3.2 Quasi-Cyclic LDPC Codes
In their general form, LDPC codes have random composition and imposing any struc-
ture on them negatively affects their asymptotic performance [62]. However, a random
composition is not suitable for hardware implementation. In contrast, a code structure can be used to reduce the wiring interconnect complexity and routing in the hardware,
thus reducing the power dissipation on the decoder chip. Structured codes also allow for
hardware reusability.
Most practical LDPC codes have a quasi-cyclic (QC) structure [63], characterized as follows. For some positive integer q, let n = N/q and m = M/q. Also let P(s), for s ∈ {0, 1, . . . , q − 1}, denote the circular shift of a q × q identity matrix by s columns to the right. For example, P(0) is the identity matrix and

P(1) =
  [ 0 1 0 . . . 0 ]
  [ 0 0 1 . . . 0 ]
  [ :  :  :    :  ]
  [ 0 0 0 . . . 1 ]
  [ 1 0 0 . . . 0 ].
We also remark that P(s) = (P(1))^s, for s ∈ {1, 2, . . . , q − 1}. The M × N parity-check matrix of a QC-structured LDPC code is of the form

H =
  [ P(s11) P(s12) . . . P(s1n) ]
  [ P(s21) P(s22) . . . P(s2n) ]
  [   :      :            :    ]
  [ P(sm1) P(sm2) . . . P(smn) ],

where for i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , n}, we have s_{i,j} ∈ {0, 1, . . . , q − 1, ∞}, and we define P(∞) as the all-zero matrix.
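This construction can be sketched in a few lines; q = 5 and the shift table below are arbitrary, and ∞ is represented by None:

```python
import numpy as np

def P(s, q):
    """q x q identity matrix circularly shifted s columns to the right;
    P(inf), the all-zero matrix, is represented here by s = None."""
    if s is None:
        return np.zeros((q, q), dtype=int)
    return np.roll(np.eye(q, dtype=int), s, axis=1)

q = 5
# Check the remark above: P(s) equals the s-th power of P(1).
for s in range(q):
    assert (P(s, q) == np.linalg.matrix_power(P(1, q), s)).all()

# Assemble a small QC parity-check matrix from a 2 x 3 table of shifts.
shifts = [[0, 1, None],
          [2, None, 4]]
H = np.block([[P(s, q) for s in row] for row in shifts])
print(H.shape)  # (2q, 3q) = (10, 15)
```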
The QC structure has several advantages, some of which are listed below.
• The QC structure is well known to be hardware-friendly, leading to energy-efficient
implementations [64].
• With a QC structure, LDPC codes of moderate lengths can be obtained that have a
large girth. The girth of a graph is the length of its shortest cycle. A Tanner graph
that has a large girth more closely resembles a tree and therefore belief propagation
on such a graph is a better approximation for maximum a posteriori decoding.
• QC-structured LDPC codes can be encoded in linear time with shift registers [65].
• The QC structure enables a layered message-passing decoding schedule [66]. In a layered decoding schedule, the CNs are divided into layers. In each iteration, the decoder sequentially updates the messages corresponding to each layer, always using the latest available extrinsic information, which in turn results in faster decoding convergence. As will be shown in Chapter 3, layered decoding of QC-structured LDPC codes can reduce the decoding complexity by up to 50%.
2.4 Code Concatenation
The concept of code concatenation was introduced by Forney [67] and extensively studied
in [68]. In Forney’s scheme, an inner block code of short length is concatenated to an
outer algebraic code. The inner code is decoded using maximum-likelihood decoding
which, from the outer-code perspective, reduces the channel to a burst channel. The outer (usually Reed–Solomon) decoder is then tasked with cleaning up the burst errors.
Code concatenation is frequently used to improve performance of FEC schemes. In
recent OTN proposals, often an inner, iteratively decoded, SD code concatenated to
an outer algebraic code is used [4]. In such schemes, as is the case in this work, the inner code reduces the channel seen by the outer code, possibly through an interleaver, to a binary symmetric channel. The outer decoder is then tasked with mitigating the remaining errors made by the inner code and bringing the BER below 10^−15, as required by OTNs.
The code concatenation principle has been extended to generalized concatenated
codes, first in [69]. In this generalization, the inner code is taken as a sequence of
nested codes and multiple outer codes are used to provide unequal protection over the
inner code symbols, resulting in great flexibility in code design. Reed-Muller codes, with decoding as described in [70], are an example of generalized concatenated codes.
2.5 Coded Modulation
Digital modulation can be represented by a signal constellation and its labelling, i.e., a
bijective mapping of the (possibly coded) bit patterns to the constellation points. Coded
modulation, introduced by J.L. Massey [71], is the joint design of coding and modulation
in FEC schemes. Coded modulation can optimize for the performance, complexity, ro-
bustness, etc. of the FEC scheme, or for any combination of these metrics. We describe
the two common approaches to coded modulation below.
2.5.1 Multi-Level Coding
A constellation labelling produces various bit-levels, corresponding to the signal-point
address-bits. The idea of MLC is to protect (possibly groups of) bit levels by individual
codes, as shown in Fig. 2.3(a). At the receiver, multi-stage decoding is carried out, during which the decoding starts with the lowest bit level and continues to higher levels while taking into account the decisions of the previous levels.
Figure 2.3: The two common approaches to coded modulation: (a) MLC; (b) BICM.
For example, the coded modulation schemes proposed by Ungerboeck [72] and by Imai and Hirakawa [73] aim to improve performance by increasing the minimum Euclidean distance, instead of the Hamming distance, among the symbols that represent codewords. Ungerboeck's labelling (also known as set-partitioning or natural labelling) maximizes the minimum intra-subset Euclidean distance when assigning address-bits. See Fig. 6.2 for an example of set-partitioning labelling. Note that the minimum squared distance among the adjacent sub-constellation points doubles at each step. This effectively means a 3 dB gain in the SNR of the corresponding bit-channel. Trellis coded modulation [74, 75], for example, divides the bit-levels into two groups: the least significant ones (the first assigned bits) are protected by a convolutional code and the rest (if any) remain uncoded.
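The distance-doubling property can be checked numerically for a two-step set partitioning of a 16-QAM constellation; the subset-selection rules below are one concrete way of writing the standard checkerboard-style partition:

```python
import itertools

# 16-QAM points with coordinates in {-3, -1, 1, 3}.
pts = [(2 * i - 3, 2 * j - 3) for i, j in itertools.product(range(4), repeat=2)]

def min_sq_dist(points):
    """Minimum squared Euclidean distance over all point pairs."""
    return min((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
               for a, b in itertools.combinations(points, 2))

full = pts                                                   # whole constellation
lvl1 = [(x, y) for x, y in pts if ((x + y) // 2) % 2 == 0]   # checkerboard subset
lvl2 = [(x, y) for x, y in lvl1 if ((x + 3) // 2) % 2 == 0]  # split once more

# Minimum squared distance doubles at each partitioning step (3 dB per level).
print(min_sq_dist(full), min_sq_dist(lvl1), min_sq_dist(lvl2))  # 4 8 16
```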
A generalization of trellis coded modulation is lattice coding in which algebraically-
structured constellations (usually in higher dimensions) are used for signalling [76]. More-
over, constellation labelling via bit assignment can be generalized to non-binary parti-
tioning of the constellation. An example of coded modulation using such partitioning is given by coset codes [77, 78].
The MLC scheme is optimal from an information-theoretic point of view [73, 79], as
it is capable of approaching channel capacity with multi-stage decoding by appropriate
choice of the codes for the different bit levels. Although used in digital subscriber line
applications [80], MLC schemes have often been avoided in practice because of the poten-
tially high complexity induced by using separate bit-level codes and the negative impact
that multi-stage decoding has on latency and error propagation. Nevertheless, it has
been shown in [2,3,81,82] and also in Chapter 4, that with a clever choice of the bit-level
codes these issues can be largely resolved and MLC schemes can be designed that have
decoding complexity or performance advantages over BICM.
2.5.2 Bit-Interleaved Coded Modulation
The BICM scheme [83, 84] uses one channel code. The encoded bits are passed through
an interleaver and mapped to the (usually Gray-labeled) constellation. At the decoder, parallel independent decoding is used, where the bits of all levels are decoded simultaneously and independently.
The BICM scheme relaxes the constraints between constellation size, labelling, and
the choice of code. It is known that BICM with a Gray constellation labelling can operate
within fractions of a dB from the Shannon limit [84, 85]. Because of its simplicity and
flexibility, BICM is usually considered to be a pragmatic approach to coded modulation
[86].
Another perceived advantage of the BICM scheme is that for a fixed frame length it
allows for the use of codes with longer block lengths, compared to the MLC approach,
thereby potentially unlocking higher coding gains. This advantage, however, diminishes
in applications with higher throughput, as in optical communication. For example, at a throughput of 400 Gb/s over a 16-ary constellation [4], for an additional delay of only 1 µs for the FEC, the bit-channel codes of an MLC scheme can have a block length on the order of 10^5, which still allows for very powerful coding.
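The block-length estimate follows from simple arithmetic on the figures quoted above:

```python
# Rough sizing of MLC bit-channel block lengths for the example above:
# 400 Gb/s throughput, 16-ary constellation (4 bit levels), 1 us extra delay.
throughput_bps = 400e9
delay_s = 1e-6
bit_levels = 4  # log2(16)

bits_buffered = throughput_bps * delay_s  # bits in flight during the delay
per_level = bits_buffered / bit_levels    # split evenly across bit levels
print(f"{per_level:.0f} bits per bit-level code")  # 100000, i.e. ~1e5
```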
Chapter 3
Low-Complexity Concatenated
LDPC-Staircase Codes
3.1 Introduction
In this chapter, we build on the work of Zhang and Kschischang [1] on designing low-
complexity concatenated FEC schemes for applications with high throughput. Their
design consists of an inner soft-decision LDGM code concatenated with an outer hard-
decision staircase code. The degree distribution of the inner LDGM code ensemble is
obtained by solving an optimization problem, minimizing the estimated data-flow of the
inner-code decoder, while searching a table of staircase codes to find the optimal inner
and outer code pair. At 20% overhead (OH), the codes proposed in [1] achieve up to 46% reduction in complexity when compared with other low-complexity designs.
We adopt the concatenated FEC structure of [1], but we consider a different ensemble
of inner codes. The task of the inner code, similar to that of [1], is to reduce the BER
of the bits transferred to the outer staircase code to below its threshold, which enables
the outer code to take the BER further down, below 10−15, as required by OTNs. We re-
design the inner code to further reduce its data-flow, thereby achieving an FEC solution
with even lower complexity than the codes reported in [1].
Throughout this chapter, we consider signalling using a Gray-labeled quadrature
phase-shift keying constellation, with unit energy per dimension. We assume a mem-
oryless AWGN channel.
This chapter includes and expands on the work in [87].

A key characteristic that emerges from the re-designed inner-code optimization is
that some inner codeword bits remain uncoded! These bits bypass the inner code, and
are protected only by the outer code. We propose a method to analyze and optimize
the inner-code ensemble, and show that the resulting codes can reduce the inner-code
data-flow by up to 71%, when compared to [1]. We show that, when the block length is
sufficiently large, codes generated according to the obtained inner-code ensembles perform
as expected, verifying the design approach.
To realize a pragmatic decoder implementation, we construct QC codes of practical
length, generated according to the obtained inner-code ensembles. We show that the
performance of randomly-generated inner codes of large block-length can be achieved
by QC codes of practical length in the order of 6000 to 15000. A QC-structured inner
code allows for decoder hardware implementations that are very energy efficient [64].
The QC structure also enables a layered message-passing decoding schedule. We show
that, compared with the flooding schedule, layered decoding of the QC-structured codes
reduces the complexity by up to 50%.
The rest of this chapter is organized as follows. In Sec. 3.2 we describe the inner-
code structure, code parameters, and complexity measure. In Sec. 3.3 we describe how
EXIT functions can be used to predict the inner-code performance, and we describe the
inner-code optimization procedure. In Sec. 3.4 we present simulation results and give
a characterization of the trade-off between the required SNR and decoding complexity
for the concatenated code designs. Designs with QC-structured codes are also discussed
in Sec. 3.4, and a comparison with existing soft-decision FEC solutions is presented. In
Sec. 3.5 we provide concluding remarks.
3.2 The Inner-Code Structure
3.2.1 Code Description
We use LDPC codes as inner codes. A significant feature of the inner-code ensemble is
that we allow for both degree-zero and degree-one variable nodes. Degree-zero variable
nodes are uncoded, and thus incur zero inner decoding complexity. Also, as will be
discussed in Sec. 3.2.3, degree-one variable nodes do not add to the data-flow throughout
the decoding procedure, thus they also incur no inner decoding complexity.
A Tanner graph for a member of the inner-code ensemble is sketched in Fig. 3.1. We
denote the inner-code rate by Rin.
In this work we only consider designing ensembles of systematic LDPC codes. In the
encoder of a systematic code, an information set is designated for the message symbols
Figure 3.1: Tanner graph of an LDPC inner code, consisting of some degree-zero variable nodes (uncoded components) and a coded component. The rectangle labeled by Π represents an edge permutation. The VN and CN degree distributions are to be designed.
such that once the message symbols are realized, the codeword is uniquely obtained.
Formally, an information set of an (N,K) code is a set of K positions, the projection of
the code onto which results in a code of dimension K, i.e., the (K,K) code.
Note that the LDGM inner code of [1] is an instance of the ensemble defined above.
However, in an LDGM code, CNs are associated randomly with variable nodes, inducing
a Poisson distribution on variable-node degrees. In this work, the variable-node degree
distribution is carefully optimized to achieve small decoding complexity.
3.2.2 Ensemble Parameterization
The inner code ensemble is described by its VN and CN degree distributions. We denote
the maximum VN degree by Dv. We consider a CN degree distribution that is concentrated on one or two consecutive degrees, dc and dc + 1, where dc = ⌊d̄c⌋ and d̄c denotes the average CN degree.
Let N denote the number of VNs in a particular Tanner graph drawn from the
ensemble, and let Ni be the number of degree-i VNs. We designate a particular subset
of the VNs to be the information set, while the remaining VNs form the parity set. We
let K denote the number of information nodes and let Ki be the number of information
nodes of degree i. The code rate therefore is Rin = K/N .
We denote the VN-perspective degree distribution by L(x) = ∑_{i=0}^{Dv} Li x^i, where Li = Ni/N is the fraction of VNs that have degree i. The portion of uncoded bits therefore is given by L0. We define the edge-perspective VN degree distribution as λ(x) ≜ L′(x)/L′(1) = ∑_{i=1}^{Dv} λi x^{i−1}, where L′(x) = dL(x)/dx. The inner-code rate is related to the edge-perspective VN degree distribution by

∑_{i=1}^{Dv} λi/i = (1 − L0)/(d̄c(1 − Rin)),   (3.1)
and for i ∈ {1, . . . , Dv}, the edge-perspective and node-perspective VN degree distribution parameters are related by

Li = d̄c(1 − Rin)λi/i.   (3.2)

Let Ui = Ki/N be the share of degree-i information nodes among all VNs. Since all degree-zero VNs must be among the information nodes, we have U0 = L0. Also Ui ≤ Li for i ∈ {1, . . . , Dv}, and

∑_{i=0}^{Dv} Ui = Rin.
Let Λ = (λ1, λ2, . . . , λDv) and let U = (U0, U1, . . . , UDv). We refer to the pair (Λ,U)
as the design parameters. The design parameters will be used in the inner-code optimiza-
tion program.
For reasons described in Sec. 3.2.3 and Sec. 3.3.1, degree-one VNs receive special
treatment in our design. We define ν to be the average number of degree-one VNs
connected to each check node. In terms of the code parameters, ν can be expressed as

ν = d̄c λ1.   (3.3)
3.2.3 Complexity Measure
We use the complexity measure described in Sec. 1.2.2 to quantify, and eventually minimize, the required data-flow at the decoder. The concatenated-code decoder complexity is defined as

η = ηin/Rsc + P,   (3.4)
where ηin is the inner code complexity score, Rsc is the outer staircase code rate, and
P is the number of post-processing operations per information bit at the outer-code
decoder. The η score is a normalized measure of the number of messages passed in
iterative decoding of the inner code. In this thesis, we have set P = 0, since the decoding
complexity, per bit, of the staircase code is typically two to three orders of magnitude
smaller than that of the inner code. This can be estimated as follows for the rate 15/16
staircase code with a (1408,1364) constituent code. Typically, each constituent code
is “visited” by the iterative decoder about four times during the decoding (where the
decoding, i.e., processing of a syndrome, is performed using a small table-lookup-based
circuit). Since each information bit is protected by two constituent codes, the average
number of bits recovered per decoding attempt is 170.5, giving a complexity of P ≈
0.006 decoding attempts per decoded bit, which is negligible compared to the complexity
incurred by the inner code, obtained next.
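The arithmetic behind this estimate can be sketched as follows (values as stated above):

```python
# Estimate of P for the rate-15/16 staircase code with a (1408, 1364)
# constituent code: roughly 4 decoder visits per constituent code, and each
# information bit is protected by two constituent codes.
visits_per_constituent = 4
codes_per_bit = 2
info_bits_per_constituent = 1364
bits_per_attempt = info_bits_per_constituent / (visits_per_constituent * codes_per_bit)
P = 1.0 / bits_per_attempt   # about 0.006 decoding attempts per decoded bit
```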
Let E denote the number of edges in the ensemble that are not connected to a degree-
one VN. The complexity score of the inner-code, ηin, can be computed as
\eta_{in} = \frac{EI}{K} = \frac{(1 - R_{in})(d_c - \nu)I}{R_{in}},    (3.5)
where I is the maximum number of decoding iterations allowed for the inner-code decoder.
Note that, similar to [1], the complexity score in (3.5) does not account for messages of
degree-one VNs, as they remain constant throughout the decoding procedure.
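A direct transcription of (3.4) and (3.5) into code (an illustrative sketch):

```python
def inner_complexity(Rin, dc, nu, I):
    """eta_in of (3.5): the number of edges not attached to degree-one VNs,
    times the iteration count, normalized per information bit."""
    return (1.0 - Rin) * (dc - nu) * I / Rin

def total_complexity(Rin, dc, nu, I, Rsc, P=0.0):
    """eta of (3.4): the concatenated-code complexity score."""
    return inner_complexity(Rin, dc, nu, I) / Rsc + P

# With the Example 1 parameters from Section 3.4.2 (Rin = 8/9, dc = 24,
# nu = 1.18, I = 9), inner_complexity returns roughly 25.7, matching the
# complexity score reported there.
```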
3.3 Complexity-Optimized Design

3.3.1 EXIT Chart Analysis
We analyze the inner code using a version of EXIT functions [60, 61]. Under the
assumption that the all-zero codeword is transmitted, we define the error-probability
EXIT function fΛ, that takes pin, the probability of error in messages coming from the
VNs, as input, and outputs pout, the probability of error in messages coming from the
VNs, after one round of SP message-passing, i.e.,
p_{out} = f_\Lambda(p_{in}).    (3.6)
Using the law of total probability, we can write p_{out} as

p_{out} = \sum_{i=1}^{D_v} \lambda_i\, p_{out,i},    (3.7)

where p_{out,i} is the probability of error in messages coming from a degree-i VN. From (3.6) and (3.7) we get

p_{out} = f_\Lambda(p_{in}) = \sum_{i=1}^{D_v} \lambda_i f_{i,\Lambda}(p_{in}),    (3.8)
where the functions f_{i,\Lambda} are called elementary EXIT functions. The function f_{i,\Lambda} takes p_{in} as an argument and produces p_{out,i}, the probability of error in messages coming from the degree-i VNs after one round of SP message-passing. As shown in [60], in practice the
elementary EXIT charts’ dependence on Λ can be neglected. Therefore, (3.8) can be
written as
p_{out} = f(p_{in}) = \sum_{i=1}^{D_v} \lambda_i f_i(p_{in}).    (3.9)
In [60], a method is proposed that, for an LDPC code ensemble without degree-zero and degree-one VNs, approximates the elementary EXIT charts using Monte-Carlo simulation. Assuming that the messages coming from the VNs have a symmetric Gaussian distribution with mean m = (2\,\mathrm{erfc}^{-1}(p_{in}))^2 and variance \sigma^2 = 2m, an empirical distribution for CN messages is generated by performing the CN computation on samples of VN messages. A degree-i VN then adds its channel message and i - 1 independent samples
of CN messages, to generate a sample of fi(pin). It is shown that the elementary EXIT
charts generated by interpolating the average of a large number of fi(pin) samples closely
replicate the actual elementary EXIT charts.
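The Monte-Carlo procedure described above can be illustrated as follows. This is a simplified sketch, not the authors' implementation: it assumes BPSK over AWGN, the tanh-rule CN computation, no degree-one VNs at the CNs (i.e., \nu = 0), and substitutes the standard-library normal quantile for erfc^{-1}.

```python
import math
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)

def erfcinv(p):
    # erfc(x) = 2*(1 - Phi(x*sqrt(2)))  =>  erfcinv(p) = Phi^{-1}(1 - p/2)/sqrt(2)
    return NormalDist().inv_cdf(1.0 - p / 2.0) / math.sqrt(2.0)

def cn_samples(p_in, dc, n):
    """Samples of CN-to-VN messages when the dc - 1 incoming VN messages are
    i.i.d. symmetric Gaussians with mean m = (2*erfcinv(p_in))**2, var 2m."""
    m = (2.0 * erfcinv(p_in)) ** 2
    v = rng.normal(m, math.sqrt(2.0 * m), size=(n, dc - 1))
    t = np.clip(np.tanh(v / 2.0), -1 + 1e-12, 1 - 1e-12)
    return 2.0 * np.arctanh(np.prod(t, axis=1))

def f_i(p_in, i, dc, snr_db, n=100_000):
    """Monte-Carlo estimate of the elementary EXIT value f_i(p_in): a degree-i
    VN adds its channel LLR to i - 1 independent CN samples; the outgoing
    error probability is the fraction of negative sums (all-zero codeword)."""
    sigma2 = 1.0 / (2.0 * 10 ** (snr_db / 10.0))  # unit-energy BPSK convention
    ch = rng.normal(2.0 / sigma2, 2.0 / math.sqrt(sigma2), size=n)
    total = ch + sum(cn_samples(p_in, dc, n) for _ in range(i - 1))
    return float(np.mean(total < 0.0))
```

Interpolating many such estimates over a grid of p_in values yields the elementary EXIT chart f_i; the \nu-dependent variant described next replaces some of the d_c - 1 sampled CN inputs with channel-only messages from degree-one VNs.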
In our design, however, we must take into account the presence of degree-one VNs
in obtaining the elementary EXIT charts with the method of [60], as the messages from
such nodes significantly affect the CN operation. To this end, we generate the elementary
EXIT charts for a pre-set value of ν, the average number of degree-one VNs connected to
each CN, as defined in (3.3). In the Monte-Carlo simulation described above, we modify
the CN operation to account for the fact that each CN is connected to, on average, ν
degree-one VNs, and therefore receives only their channel observation.
In particular, given \nu, let \theta \in [0, 1) satisfy \theta\lfloor\nu\rfloor + (1 - \theta)\lceil\nu\rceil = \nu. We then assume that a fraction \theta of the CNs are connected to \lfloor\nu\rfloor degree-one VNs and the remainder are connected to \lceil\nu\rceil degree-one VNs. Therefore, in obtaining the samples of degree-d_c CN messages, a fraction \theta_e of the CN message computations are performed assuming \lfloor\nu\rfloor messages from degree-one VNs, and the remainder are performed assuming \lceil\nu\rceil messages from degree-one VNs, where

\theta_e = \theta \frac{d_c - \lfloor\nu\rfloor}{d_c - \nu}.    (3.10)
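For concreteness, the two fractions can be computed as follows (an illustrative sketch; the helper name is ours):

```python
import math

def split_fractions(nu, dc):
    """theta: fraction of CNs connected to floor(nu) degree-one VNs (the rest
    see ceil(nu)); theta_e: fraction of CN message computations performed with
    floor(nu) degree-one inputs, per (3.10)."""
    lo, hi = math.floor(nu), math.ceil(nu)
    theta = 1.0 if lo == hi else (hi - nu) / (hi - lo)  # theta*lo + (1-theta)*hi = nu
    theta_e = theta * (dc - lo) / (dc - nu)
    return theta, theta_e

# For the rate-8/9 design of Section 3.4.2 (nu = 1.18, dc = 24): theta = 0.82.
theta, theta_e = split_fractions(nu=1.18, dc=24)
```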
Note that the SNR, dc, and ν are the only parameters needed to compute the elemen-
tary EXIT charts. Since they do not depend on inner-code design parameters, elementary
EXIT charts can be pre-computed. Therefore, when SNR, dc, and ν are given, the prob-
lem of inner-code design reduces to the problem of appropriately shaping an EXIT chart
out of its elementary EXIT charts.
In Fig. 3.2 we plot the elementary EXIT functions used in Section 3.4.2, Example 1,
for designing a rate-8/9 code ensemble. Here we have SNR = 5.85 dB, dc = 24, and
ν = 1.18. We also plot the EXIT function of the (3, 27)-regular LDPC ensemble (ν = 0).
[Plot: p_{out} = f(p_{in}) versus p_{in}, showing the elementary EXIT functions f_2, \ldots, f_6, the line p_{out} = p_{in}, and the EXIT functions of the (3, 27)-regular and designed ensembles.]
Figure 3.2: The elementary EXIT functions used for designing a rate-8/9 code ensemblein Section 3.4.2, Example 1. The EXIT function of the resulting optimized ensemble iscompared with that of the (3, 27)-regular LDPC ensemble.
We can observe the effect of allowing degree-zero and degree-one VNs in the ensemble
by comparing f3 to the EXIT function of the regular ensemble: for a given code rate,
having such VNs in the ensemble allows for having CNs of lower degree which in turn
can provide the VNs with stronger and more reliable messages at large values of pin.
3.3.2 Code Optimization
Similar to [1], we view the problem of designing the concatenated FEC scheme as a multi-
objective optimization with the objectives (Es/N0, ηin). In both parameters, smaller is
better, i.e., we wish to minimize the SNR needed to achieve the target error rate and
we wish to minimize the estimated complexity needed to do so. Given a concatenated
code rate, Rcat, we characterize the trade-off between the objectives by finding their
Pareto frontier. For any SNR, we find a pair (if it exists), consisting of an outer staircase
code and an inner-code ensemble, with minimum complexity, that together, bring the
BER below 10^{-15}. The Pareto frontier then provides the various choices of FEC schemes
available to be used in the OTN.
Our proposed concatenated code optimization procedure is as follows. When the
concatenated FEC rate, Rcat, is specified, we loop over a table of staircase codes such as [1,
Table 1]. Recall that each staircase code specifies Rsc and psc, the rate and threshold of
the outer code, respectively. For each staircase code, we perform the inner-code ensemble
complexity optimization.
It is shown in [61] that, given the EXIT function, the number of iterations, I, required by the inner code to take the VN message error probability from p_0, the channel BER, down to p_t, a target message error probability, can be closely approximated as

I \approx \int_{p_t}^{p_0} \frac{dp}{p \log\!\left(\frac{p}{f(p)}\right)}.    (3.11)
Based on the EXIT function analysis described in Sec. 3.3.1, for i ≥ 2, the probability
of error at a degree-i VN is equal to the probability of error at a message coming from a
degree-(i+ 1) VN, i.e., fi+1(pt). However, the probability of error at a degree-one VN is
not equal to f2(pt) because to obtain f2(pt) we add a channel message to a check message.
In computing that check message, there may be a contribution from a degree-1 VN.
Such a message significantly affects the CN message as it remains constant throughout
the decoding procedure. Therefore, we must obtain CN messages specifically targeted
at degree-one VNs and use them, along with the channel observations, to obtain the
probability of error at degree-one VNs. We denote the probability of error at a degree-
one VN by f1(pt). Note that f1(pt) can also be pre-computed and stored, given a fixed
SNR, dc, and ν.
We let Pout denote the BER on bits passed to the outer decoder. Since only the
information bits of the inner code are passed to the outer code, Pout can be obtained as
P_{out} = \frac{1}{R_{in}} \left( U_0 p_0 + U_1 f_1(p_t) + \sum_{i=2}^{D_v} U_i f_{i+1}(p_t) \right).    (3.12)
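Expressed in code, (3.12) is a weighted average (an illustrative sketch; err[i] collects the residual error probabilities defined above):

```python
def outer_input_ber(U, err, Rin):
    """P_out of (3.12): err[0] = p0 for the uncoded bits, err[1] = f_1(p_t)
    for degree-one information bits, and err[i] = f_{i+1}(p_t) for i >= 2."""
    return sum(u * e for u, e in zip(U, err)) / Rin
```

As a sanity check, if all information bits were uncoded (U_0 = R_in, all other U_i = 0), P_out reduces to p_0, as expected.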
From (3.5) and (3.11), the complexity-optimized inner-code ensemble is obtained by
searching over a discrete set of values for dc, ν, and pt, and, for each choice, solving the
following optimization problem:
\underset{(\Lambda, U)}{\text{minimize}} \quad \eta_{in} = \frac{(1 - R_{in})(d_c - \nu)}{R_{in}} \int_{p_t}^{p_0} \frac{dp}{p \log\!\left(\frac{p}{f(p)}\right)},    (3.13)

subject to

\sum_{i=1}^{D_v} \frac{\lambda_i}{i} \ge \frac{1 - L_0}{d_c(1 - R_{in})},    (3.14)

\sum_{i=1}^{D_v} \lambda_i = 1, \quad \lambda_1 d_c = \nu,    (3.15)

0 \le \lambda_i \quad \forall i \in \{1, \ldots, D_v\},    (3.16)

\sum_{i=0}^{D_v} U_i = R_{in}, \quad U_0 = L_0,    (3.17)

0 \le U_i \le L_i \quad \forall i \in \{1, \ldots, D_v\},    (3.18)

f(p) < p \quad \forall p \in [p_t, p_0],    (3.19)

P_{out} \le p_{sc}.    (3.20)
In this optimization problem formulation, constraint (3.14) ensures that the obtained
complexity-optimized code has the desired rate (see (3.1)). Constraints (3.15)–(3.16)
ensure the validity of the obtained ensemble. Constraints (3.17)–(3.18) ensure the validity
of the designated information set. Constraint (3.19) ensures that the obtained EXIT-
curve remains open throughout the decoding procedure, i.e., for all p ∈ [pt, p0]. Finally,
(3.20) ensures that the BER on bits passed to the outer code is at or slightly below its
threshold. Unsurprisingly, it turns out that the highest degree VNs are always chosen
by the optimization routine as information nodes. We call constraints (3.14)–(3.19) the
validity constraints and refer to them in the next chapters.
Note that, in terms of the optimization parameters, constraints (3.14)–(3.20) are
linear (see (3.2), (3.8), and (3.12) for how (3.18), (3.19), and (3.20) are related to the
design parameters, respectively). Also, as shown in [61], under mild conditions, I, as
approximated in (3.11), is a convex function of Λ. Therefore, given an SNR, we can
compute the elementary EXIT functions and once the values of dc, ν, and pt are picked,
the problem of designing complexity-optimized inner-code becomes convex, and can be
solved by the method described in Sec. 3.3.3.
Once the search over the library of staircase codes and the values of dc, ν, and pt
is complete, the ensemble with lowest complexity, according to (3.5), is chosen as the
inner-code ensemble. The obtained ensemble and the corresponding staircase code that
achieves the minimum overall complexity then give the optimized concatenated code.
The inner-code optimization procedure described here, in effect, synthesizes an open
EXIT function out of the elementary EXIT functions to obtain a valid ensemble that
achieves the target BER with minimum complexity. In Fig. 3.2 we plot the EXIT function
of the rate-8/9 optimized ensemble we obtain in Section 3.4.2, Example 1, and compare
it with that of the (3, 27)-regular LDPC ensemble. While the regular ensemble has a
fixed point above the target BER, the optimization procedure described here obtains a
valid ensemble with an open EXIT function.
We remark that, while not considered in this work, the inner-code optimization can
be reformulated to obtain, for a given complexity, the inner-code ensemble and the cor-
responding outer code with the maximum overall rate, thereby maximizing the system
throughput. Similarly, a Pareto frontier between the SNR and the concatenated code
rate can be established.
3.3.3 Practical Considerations
Discretization
In practice, the integral in (3.11) is estimated by a sum over a quantized version of the
[p_t, p_0] interval. Let Q be the number of quantization points. Define \Delta \triangleq (p_0 - p_t)/Q and let

q_i = p_t + i\Delta, \quad i \in \{0, 1, \ldots, Q - 1\}.

We define a discrete approximation of the integral in (3.11) as

I_Q = \sum_{i=0}^{Q-1} \frac{\Delta}{q_i \ln\!\left(\frac{q_i}{f(q_i)}\right)},

which we use in the objective function in (3.13), instead of the integral. The constraint f(q_i) < q_i, i \in \{0, 1, \ldots, Q - 1\}, then ensures the openness of the EXIT-curve throughout the decoding procedure.
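The discretized objective can be transcribed directly (an illustrative sketch; the toy EXIT function f(p) = p/2 halves the error probability each iteration, so the sum should come out near log2(p_0/p_t)):

```python
import math
import numpy as np

def iteration_count(f, p0, pt, Q=400):
    """Left-endpoint Riemann sum I_Q approximating the integral in (3.11)."""
    delta = (p0 - pt) / Q
    q = pt + delta * np.arange(Q)
    return float(np.sum(delta / (q * np.log(q / f(q)))))

# Toy check: f(p) = p/2 halves the error probability per iteration, so the
# count should be close to log2(p0/pt) = log2(100), i.e., about 6.6.
I_Q = iteration_count(lambda p: p / 2.0, p0=1e-2, pt=1e-4)
```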
Similarly, the intervals [d_c^{min}, d_c^{max}], [0, \nu^{max}], and [0, p_t^{max}] are quantized with Q_{d_c}, Q_\nu, and Q_{p_t} points when searching over the values of d_c, \nu, and p_t, respectively, in the inner-code ensemble optimization. Here, d_c^{min}, d_c^{max}, \nu^{max}, and p_t^{max} delimit the intervals searched for the optimal values of d_c, \nu, and p_t. The values of Q, Q_{d_c}, Q_\nu, and Q_{p_t} allow the designer to trade off between accuracy and computational complexity of the design process.
Optimization Algorithm
Even when dc, ν, and pt are fixed, the objective function is non-linear and is not easily
differentiable. To solve the optimization problem, we use the sequential quadratic pro-
gramming (SQP) method [88]. This method is an iterative procedure, at each iteration
of which a quadratic model of the objective function is optimized (subject to the con-
straints), and the solution is used to construct a new quadratic model of the objective
function.
An issue with using the SQP algorithm is that it needs to be initialized with a fea-
sible point. In our design procedure, we first substitute the objective function of the
Table 3.1: Quantifying Finite Interleaving Loss

Packing Ratio \varphi   |  2     |  4     |  8      |  \ge 16
Performance Loss (dB)   |  0.02  |  0.01  |  0.007  |  \approx 0
optimization by a quadratic function, such as
\sum_{i=1}^{D_v} \lambda_i^2 + \sum_{i=0}^{D_v} U_i^2.
A feasible set of values to initialize the design parameters is then found by solving the
optimization problem using any standard quadratic programming method.
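As a concrete illustration of this initialization step, the following sketch (assuming SciPy's SLSQP solver and illustrative values of D_v, d_c, \nu, and R_in; none of this is the thesis implementation) finds a feasible point by minimizing the quadratic surrogate subject to the validity constraints (3.15)-(3.18), with L_0 derived from \lambda so that the rate relation holds by construction:

```python
import numpy as np
from scipy.optimize import minimize

Dv, dc, nu, Rin = 5, 24, 1.18, 8 / 9         # illustrative design parameters

def unpack(x):
    return x[:Dv], x[Dv:]                     # (lambda_1..lambda_Dv, U_0..U_Dv)

def L_from_lam(lam):
    """Node-perspective L_0..L_Dv implied by lambda via (3.2); L_0 absorbs
    the remaining probability mass, so the rate relation holds automatically."""
    degs = np.arange(1, Dv + 1)
    L = dc * (1.0 - Rin) * lam / degs
    return np.concatenate(([1.0 - L.sum()], L))

def surrogate(x):                             # quadratic stand-in objective
    lam, U = unpack(x)
    return float(np.sum(lam ** 2) + np.sum(U ** 2))

constraints = [
    {"type": "eq",   "fun": lambda x: np.sum(unpack(x)[0]) - 1.0},   # sum lambda = 1
    {"type": "eq",   "fun": lambda x: unpack(x)[0][0] * dc - nu},    # lambda_1 dc = nu
    {"type": "eq",   "fun": lambda x: np.sum(unpack(x)[1]) - Rin},   # sum U = Rin
    {"type": "eq",   "fun": lambda x: unpack(x)[1][0] - L_from_lam(unpack(x)[0])[0]},
    {"type": "ineq", "fun": lambda x: L_from_lam(unpack(x)[0]) - unpack(x)[1]},
]
res = minimize(surrogate, np.full(2 * Dv + 1, 0.1), method="SLSQP",
               bounds=[(0.0, 1.0)] * (2 * Dv + 1), constraints=constraints)
lam0, U0 = unpack(res.x)                      # feasible initialization point
```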
Interleaving Between Inner and Outer Code
The outer staircase code threshold psc is computed assuming that the outer code sees a
binary symmetric channel, i.e., a channel with independent and identically distributed bit
errors occurring with probability psc. The inner decoder, however, produces correlated
errors. To mitigate the error correlation, we use a diagonal interleaver as in [1]. We
suppose that each staircase block is of size \Phi^2, and we choose the inner code dimension K to divide \Phi^2. We define the packing ratio, \varphi, as the number of inner codewords associated with a staircase block, i.e., \varphi = \Phi^2/K.
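For instance (a sketch; the inner-code dimension below is hypothetical, chosen only to divide \Phi^2):

```python
Phi = 704                  # staircase block side for the rate-15/16 code
K = 61952                  # hypothetical inner-code dimension dividing Phi**2
assert (Phi ** 2) % K == 0
phi = (Phi ** 2) // K      # packing ratio: inner codewords per staircase block
# phi = 8; per Table 3.1 the residual interleaving loss is about 0.007 dB.
```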
Table 3.1 shows the performance loss, relative to ideal interleaving, obtained for
different packing ratios via simulation, assuming an outer staircase code of rate 15/16
with Φ = 704 and using an inner code sampled from an optimized ensemble. Here, the
loss is measured as the extra SNR needed at the receiver to achieve 10^{-5} BER, relative
to the ideal interleaving threshold. The ideal interleaving threshold was estimated by
interleaving inner codewords over multiple staircase blocks. At packing ratios exceeding
8, the performance degradation becomes negligible, justifying the use of the simple binary
symmetric channel BER analysis of staircase codes. A more detailed discussion of the
interleaving between inner and outer codes is provided with an example in Sec. 4.5.3.
3.4 Results
3.4.1 Pareto Frontier
We searched staircase codes of [1, Table 1] for the optimal outer code. We refer the
reader to [41] to see how these codes are obtained. The reader should note that there
is a slight difference between two of the entries in the earlier table [41, Table 1] (which
[Plot: \eta_{in} versus E_s/N_0 (dB); Pareto frontiers of the benchmark and proposed designs at 15%, 20%, and 25% OH.]
Figure 3.3: The (Es/N0, ηin) Pareto frontiers of the inner code in the proposed design,compared with the benchmark design of [1], at 15%, 20%, and 25% OHs.
included t = 5-error-correcting constituent codes) and the later table [1, Table 1] (which
includes only results corresponding to the more practical t = 4 constituent codes).
We used the following parameters in designing inner-code ensembles: D_v = 12, \nu^{max} = 4, p_t^{max} = p_0/2, and Q = 400. The pair (d_c^{min}, d_c^{max}) is chosen according to the inner-code rate while ensuring a large enough interval to search for the optimal d_c. We used the SP algorithm in generating the elementary EXIT charts, and 10^6 samples were produced at each pass of the Monte-Carlo simulation.
Fig. 3.3 shows the (Es/N0, ηin) Pareto frontier for the designed inner-codes, at 15%,
20%, and 25% OHs. The Pareto frontiers are also compared with those of [1]. Similar
to [1], all our concatenated code designs picked the highest-rate staircase code available,
with R_{sc} = 15/16 and p_{sc} = 5.02 \times 10^{-3}. As can be seen from Fig. 3.3, the proposed
design outperforms the design in [1]. The obtained inner codes achieve the performance
of the inner codes of [1], with up to 71%, 50%, and 19% reduction in complexity, at 15%,
20%, and 25% OHs, respectively. Also, compared to [1], the designed concatenated codes
operate at up to 0.23 dB, 0.14 dB, and 0.06 dB closer to the CSL, at 15%, 20%, and 25%
OHs, respectively.
To study the performance of the designed inner codes at an overall OH of 20%,
we sampled parity-check matrices for codes of length up to 100,000 from each of the
complexity-optimized inner-code ensembles. Since the code-lengths we consider here are
very large, with high probability we obtain a full-rank sub-matrix corresponding to the
designated information set of each parity-check matrix. We simulated the transmission
[Plot: P_{out} versus E_s/N_0 (dB) for the sampled inner codes, with the uncoded curve and the p_{sc} threshold line.]
Figure 3.4: Simulated inner-code BERs on bits passed to the outer code, sampled from the complexity-optimized ensembles, for designs at 20% OH. The mid-point on each BER curve (highlighted by an ‘o’) is the code operational point, i.e., the SNR for which the inner code is designed to achieve P_{out} \le p_{sc}.
of codewords over an AWGN channel. Codewords were decoded using the SP algorithm
with floating-point message-passing, and the code performance was obtained by averaging
the codeword BERs. Note that we only care about the BER of the information nodes of
an inner codeword.
In Fig. 3.4, obtained BERs are plotted versus SNR. The psc line shows the outer
staircase code threshold. The mid-point SNR on each curve (highlighted by an ‘o’) is the
code operational point, i.e., the SNR for which the code is designed. Note that BERs
of all the sampled codes hit at, or below, the outer-code threshold, at their operational
point, verifying our design approach.
3.4.2 Two Design Examples
Here we present two interesting examples of the complexity-optimized concatenated code
designs at 20% OH. In both of these examples, the outer code picked was the Rsc = 15/16
and p_{sc} = 5.02 \times 10^{-3} staircase code.
Example 1 : An FEC scheme operating at 1.27 dB from the CSL. The optimization
procedure yields the following ensemble for the inner code:
L(x) = 0.1480 + 0.1309x + 0.3484x^3 + 0.3727x^4,
R(x) = x^{24}.
[Plot: BER versus iteration number for information nodes of degrees 0, 1, 3, and 4, together with P_{out}.]
Figure 3.5: The BER on information nodes of different degrees in the ensemble of Example 1 and the BER on bits passed to the outer code, denoted by P_{out}. The degree distribution on the information nodes is 0.1665 + 0.0223x + 0.3919x^3 + 0.4193x^4.
This code requires a maximum of 9 iterations to bring the BER below the outer-code
threshold, which gives an inner-code complexity score of 25.67.
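These reported figures can be cross-checked from the ensemble itself (a consistency-check sketch):

```python
# Consistency check for Example 1: recover Rin, nu, and eta_in from L(x).
L = {0: 0.1480, 1: 0.1309, 3: 0.3484, 4: 0.3727}   # nonzero L_i coefficients
dc, I = 24, 9
Lp1 = sum(i * Li for i, Li in L.items())    # L'(1): average VN degree
Rin = 1.0 - Lp1 / dc                        # close to 8/9
nu = dc * L[1] / Lp1                        # nu = dc * lambda_1, close to 1.18
eta_in = (1.0 - Rin) * (dc - nu) * I / Rin  # close to the reported 25.67
```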
In Fig. 3.5 we plot the BER on information nodes of this ensemble over the decoding
iterations. We also plot Pout, the BER on bits passed to the outer code, obtained using
(3.12). In this example the information nodes include some of the degree-one VNs as
well. As can be seen from Fig. 3.5, the BER on VNs of higher degree decreases rapidly
with decoding iterations and therefore the BER of the uncoded bits dominates Pout at
the end. Similarly, in Fig. 3.4, for codes designed for lower SNRs and at low BERs, Pout
is dominated by the BER on the uncoded bits.
Example 2 : An FEC scheme operating at 1 dB from the CSL. The optimization
procedure yields the following ensemble for the inner code:
L(x) = 0.1480 + 0.1111x + 0.4539x^3 + 0.0911x^4 + 0.0973x^6 + 0.0985x^7,
R(x) = x^{28}.
This code requires a maximum of 18 iterations to bring the BER below the outer-code
threshold, which gives an inner-code complexity score of 60.24.
[Plot: NCG (dB) versus \eta for the proposed codes (SP or MS decoding, flooding or layered schedule, with and without QC structure) and the referenced LDPC-based designs.]
Figure 3.6: NCG and η comparisons of the proposed concatenated design and other soft-decision FEC schemes, at 20% OH. Decoders using a flooding (resp., layered) decodingschedule are denoted with Fl (resp. La). For the proposed codes (denoted as “prop.”),the inner decoding algorithm (MS or SP) is specified. Block length 30000 is consideredfor the designs with QC-structured inner codes. The following abbreviations are usedin describing the referenced codes. BCH: Bose—Ray-Chaudhuri—Hocquenghem, UEP:Unequal Error Protection, RS: Reed-Solomon, CC: Convolutional Code, SpC: SpatiallyCoupled.
3.4.3 Comparison to Other Works
To compare our work with the existing designs, in Fig. 3.6, we have plotted the NCG,
obtained from (1.2), versus complexity, at 20% OH, for our designed codes, and also
for several other existing FEC solutions. Since the referenced code designs are based on
min-sum (MS) or offset-MS decoding, we also simulated the obtained inner codes using
the offset-MS algorithm with unconditional correction [89].
Compared to code designs decoded under a flooding schedule, the obtained MS-based
codes achieve, at similar complexities, a 0.77 dB gain over the code in [90], a 0.57 dB
gain over the code in [91], and a 0.42 dB gain over the code in [92]. The designed codes
achieve the NCGs of codes in [92] and [93] with more than a 56% reduction in complexity,
and the excellent NCG of the code in [94] with 46% reduction in complexity.
Compared to code designs where the inner code is decoded under a layered schedule,
the obtained MS-based codes achieve the NCGs of codes in [95] with more than 57%
reduction in complexity, and achieve the NCGs of codes in [92] with 15% to 41% reduction
in complexity.
While some designs in [92], decoded under a layered schedule, come close to the proposed MS-based codes, the proposed SP-based codes, decoded under a flooding schedule,
strictly dominate the existing designs. The SP-based codes achieve the NCGs of the
existing designs with at least 62% and 24% reduction in complexity compared to code
designs decoded under a flooding schedule and layered schedule, respectively. The SP-
based codes achieve at least 0.45 dB and 0.11 dB greater NCG over the existing designs,
at nearly the same η, compared to code designs decoded under flooding schedule and
layered schedule, respectively.
The latency of the proposed concatenated code can be obtained by adding the laten-
cies of the inner and the outer codes. The latency is dominated by the staircase decoder.
For example, at 200 Gb/s, for a staircase block containing 4.65 \times 10^5 information bits and a staircase decoding window size W = 6, the decoding latency of the proposed concatenated code (including the inner code) is \approx 2.8 \times 10^6 bit periods, or 14 µs, which is
an acceptable latency in many OTN applications.
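The latency arithmetic behind these numbers is simply:

```python
# Latency estimate for the numbers above: a decoding window of W = 6 staircase
# blocks, ~4.65e5 information bits per block, at a 200 Gb/s line rate.
W = 6
bits_per_block = 4.65e5
line_rate_bps = 200e9
latency_bits = W * bits_per_block                 # about 2.8e6 bit periods
latency_us = latency_bits / line_rate_bps * 1e6   # about 14 microseconds
```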
3.4.4 Quasi-Cyclic-Structured Inner Codes
The inner codes considered so far have been randomly structured and have large block
lengths. Decoder architectures for such codes are often plagued with routing and message-
permutation complexities. In order to obtain a more pragmatic implementation of the
proposed FEC scheme, we adopt a quasi-cyclic (QC) structure for the inner codes. The
QC structure is well known to be hardware-friendly and to lead to energy-efficient implementations; see [64] and references therein.
To construct a QC inner code given an ensemble, we first sample a base matrix in
keeping with the ensemble. Should the sampled base matrix not have a full-rank sub-
matrix in the designated parity positions we discard it and sample another one. Once a
valid base matrix is obtained, we lift it to obtain a QC parity-check matrix of large girth
for the inner code.
We constructed girth-8 inner-codes of length 30000±1%, based on the obtained inner-
code ensembles, for the concatenated code at 20% OH. As can be seen from Fig. 3.6,
the concatenated FEC with QC-structured inner-codes performs as well as with ran-
domly structured inner-codes, with only a small loss in performance when operating at
a high NCG. Note, however, that we do not make any claim of optimality for the code
constructions with QC-structured inner-codes, as the optimization procedure used as-
sumes a random structure for the inner code. See [96] for a scaling law predicting the
finite-length performance loss of LDPC codes.
The structure of the QC codes also allows for layered decoding of the constructed inner
[Plot: NCG (dB) versus \eta for QC inner codes of lengths 30000 \pm 1%, 15000 \pm 1%, 10000 \pm 2%, and 6000 \pm 3%, under layered (La) and flooding (Fl) schedules.]
Figure 3.7: NCG and η comparisons of the QC constructions of the designed concatenatedFEC, at 20% OH, under layered (La) and flooding (Fl) schedules.
codes. As can be seen from Fig. 3.6, the concatenated scheme with inner-code length
30000 \pm 1%, decoded under a layered schedule, performs at up to 50% lower complexity
compared to the scheme with the inner code decoded under flooding schedule. Compared
to the existing code designs decoded under a layered schedule, the designed codes, with
QC inner-codes decoded under a layered schedule, achieve a similar NCG with at least
40% reduction in complexity.
While a length 30000 LDPC code can be considered practical for OTN applications
[92], we have also constructed QC-structured inner-codes of shorter lengths (6000± 3%,
10000±2%, and 15000±1%) and possibly lower girths, based on the obtained inner-code
ensembles, at 20% OH. Note that, according to (3.4) and (3.5), using a short inner code
does not change the complexity score of the overall code; however, having a short inner
code leads to a more practical implementation, as it greatly reduces wiring and routing
complexities. A comparison between the concatenated FEC schemes with inner codes of
various lengths is provided in Fig. 3.7.
As can be seen from Fig. 3.7, when shorter inner codes are used, the loss in NCG is
not significant, although the loss becomes bigger as the NCG increases or as the inner-
code length becomes shorter. Nevertheless, schemes with inner code of length 6000±3%,
decoded under a layered schedule, operate at up to 50% less complexity, compared to
schemes with an inner code of length 30000± 1%, decoded under a flooding schedule.
[Plot: \eta_{in} versus E_s/N_0 (dB); Pareto frontiers of the benchmark, LDPC-staircase (15/16), and LDPC-zipper (0.98) designs.]
Figure 3.8: The (E_s/N_0, \eta_{in}) Pareto frontiers of the designed concatenated LDPC-zipper FEC, at 20% OH, compared with the LDPC-staircase design and the benchmark design of [1].
3.4.5 Concatenated LDPC-Zipper Structure
As mentioned in the beginning of this section, in all code designs the optimization pro-
cedure picked the highest-rate staircase code available to us. This suggests that using
an outer staircase code with higher rate is likely to yield concatenated code designs with
even lower complexity. However, it is not trivial to design and implement staircase codes
with a very high rate, because the staircase block size becomes very large as the code
rate increases.
Here, instead of a staircase code, we consider a zipper code as the outer code in
our design. Zipper codes are a framework proposed in [14] for describing spatially-
coupled product-like codes such as staircase codes and braided block codes. In particular,
from [14, Table 1], we pick the highest-rate code, the rate-0.98 zipper code with threshold 1.1 \times 10^{-3}.
Fig. 3.8 shows the (Es/N0, ηin) Pareto frontier of the inner codes designed for concate-
nation with the outer zipper code described above, and a 20% overall OH. The Pareto
frontier is also compared with that of designs with an outer staircase code and also the
benchmark Pareto frontier of [1]. The obtained inner codes achieve the performance of
the codes in [1] and codes with an outer staircase code with up to 71% and 54% reduction
in complexity, respectively. Also, at a similar complexity, the obtained codes can operate
at up to 0.41 dB and 0.32 dB closer to the CSL compared to the codes in [1] and codes
with an outer staircase code, respectively.
[Plot: NCG (dB) versus \eta; the LDPC-zipper (0.98) frontier lies above and to the left of the LDPC-staircase frontier and the referenced designs.]
Figure 3.9: NCG and η comparisons of the proposed concatenated design and other soft-decision FEC schemes, at 20% OH. The concatenated design with an outer zipper code,the NCG-η Pareto frontier of which is the top left curve, outperforms other designs by awide margin.
In Fig. 3.9, we plot the NCG-η Pareto frontier of the obtained concatenated designs
with the rate-0.98 outer zipper code and compare it to the existing designs at 20% OH.
As can be seen from Fig. 3.9, the designed codes outperform the previously existing
designs by a wide margin. In particular, the designed codes can achieve the excellent
performance of [10] with 74% reduction in complexity.
3.5 Conclusion
In this chapter we have proposed a concatenated code design that improves significantly
upon the results of [1]. The complexity-optimized error-reducing inner code, concatenated
with an outer staircase code, forms a low-complexity FEC scheme suitable for high bit-
rate optical communication. An interesting feature that emerges from the inner-code
optimization is that a fraction of symbols are better left uncoded, and only protected by
the outer code. We showed that, compared to [1], with this modified design, the inner-
code complexity can be reduced by up to 71%. We showed that the concatenated code
designs have lower complexity than, to the best of our knowledge, any other existing SD
FEC scheme.
To realize a pragmatic and energy-efficient implementation for the proposed FEC
scheme, we constructed QC inner codes, based on the obtained ensembles. We showed
that QC-structured inner codes with practical lengths can achieve the performance of
the randomly constructed inner codes. We simulated layered decoding of the QC inner
codes and showed that with layered decoding the complexity score of the FEC scheme
can be reduced by up to 50%.
Chapter 4
Low-Complexity Concatenated FEC
for Higher-Order Modulation
4.1 Introduction
In this chapter, we consider the design of low-complexity FEC particularly in combination
with higher-order modulation. In designing such an FEC scheme, it is often unclear
whether it is better to design a coded modulation scheme via an MLC approach, or a
BICM approach. As described in Section 2.5, the MLC scheme, on the one hand, is
optimal from an information-theoretic point of view but has often been avoided because
of the potentially high implementation complexities, and the BICM scheme, on the other
hand, is usually considered to be a pragmatic approach to coded modulation.
Here we consider coded modulation design instances for various modulation orders
that are of practical relevance to optical communication. We design inner MLC and BICM schemes concatenated with an outer hard-decision code for application in optical transport networks, and we compare them from a performance-complexity standpoint. We consider
signalling with rectangular quadrature amplitude modulation with 16, 64, and 256 points
(16-QAM, 64-QAM, and 256-QAM, respectively) and design concatenated codes of 28%
and 25% OHs. We use similar code-design approaches as in Chapter 3 and in [99], to
obtain complexity-optimized MLC schemes and complexity-optimized BICM schemes,
so that we may make—via their respective Pareto frontiers—a fair comparison between
them.
(This chapter includes and expands on the work in [97] and [98].)

Simulation results of practical code designs, reported in Section 4.5, show that, for all considered modulation orders, MLC provides significant advantages relative to BICM
over the entire performance-complexity tradeoff space. For example, at 28% overall OH
the 64-QAM MLC design can operate with 60% less complexity, or provide up to 0.4 dB
coding gain, when compared with the BICM design. It also compares favorably with the
MLC scheme reported in [2]. Similar advantages are provided by MLC at 25% OH for a
range of modulation formats: the MLC design provides an NCG of up to 12.8 dB with
16-QAM (1.0 dB from the CSL), an NCG of up to 13.6 dB with 64-QAM (1.2 dB from
the CSL), and an NCG of up to 14 dB with 256-QAM (1.65 dB from the CSL), all with
reasonable decoding complexity.
The rest of this chapter is organized as follows. In Sec. 4.2 we describe the concate-
nated coded modulation structures and the setup for a fair comparison between the MLC
and BICM schemes. In Sec. 4.3 and 4.4 we describe the MLC and BICM schemes and
their inner code parameterization and design, respectively. In Sec. 4.5 we present simu-
lation results for the MLC and BICM schemes that we have designed, characterize their
trade-offs, and compare them to the existing designs. In Sec. 4.6 we provide concluding
remarks.
4.2 Concatenated Code Description
We adopt a similar concatenated FEC structure to that in Chapter 3. We consider an
inner SD LDPC code concatenated with a high-rate outer HD code. The outer code is
concatenated with the inner code through an interleaver, π (see the encoder and decoder
of Fig. 4.1 and Fig. 4.2). The purpose of the interleaver is to reduce correlation among
bit errors passed to the outer code.
The task of the inner code in the concatenated code design is to reduce the BER of
the bits transferred to the outer code to below its threshold, which enables the outer code
to take the BER further down, below 10−15, as required by optical transport networks.
We construct concatenated codes of 28% and 25% OHs. For the 28% OH design, we
use the staircase code of rate Rout = 239/255, proposed in [13], as the outer code. This
outer code has a nominal threshold of 4.8 × 10−3; however, for the inner code we set a lower BER target of P^t_out = 3 × 10−3, as this provides a practical margin that will also enable a reduced interleaver size between inner and outer codes.
For the 25% OH design, we use the diagonal zipper code of rate Rout = 0.96, proposed in [14], as the outer code. This outer code has a nominal threshold of 2.32 × 10−3; however, for similar practical reasons, we set a lower BER target of P^t_out = 2 × 10−3 for the inner code.
Figure 4.1: The encoder and the decoder in the MLC scheme. Here, m = log2 M denotes the number of bits per PAM symbol.
We study this concatenated FEC scheme in conjunction with modulation schemes
of various orders. In particular, we consider rectangular M²-QAM with M ∈ {4, 8, 16}. Each of these modulation schemes can be thought of as the Cartesian product of two separate M-point pulse amplitude modulations (PAMs), one in-phase and one in-quadrature.
It follows that the number of QAM symbols per frame is half that of the number of PAM
symbols. Throughout this chapter we let m = log2M denote the number of bits mapped
to each PAM symbol.
We aim at establishing a trade-off, in a Pareto sense, between performance and com-
plexity, for the considered coded modulation schemes. We use the performance and com-
plexity measure described in Sec. 1.2.2 in code design. As shown in Sec. 3.2.3, the overall
decoder complexity is dominated by that of the inner code and therefore is obtained as
η = ηin/Rout, where ηin is the inner-code complexity.
We have devised a coded modulation architecture to ensure a fair comparison between
the MLC and BICM schemes. Unlike most conventional MLC schemes, only the LSB
is inner-coded here; thus both schemes employ just a single binary code. Moreover, to
achieve the same latency, we choose the inner-code block length of the MLC scheme to
be shorter, by a factor of m, than that of the BICM scheme.
For the purposes of code design, we model the optical channel as an AWGN channel.
4.3 MLC Scheme
4.3.1 Coded-Modulation Description
Figure 4.2: The encoder and the decoder in the BICM scheme. Here, m = log2 M denotes the number of bits per PAM symbol.

We label the in-phase and the quadrature M-PAMs forming the M²-QAM constellations as follows. The least significant (right-most) bit (LSB) alternates between adjacent symbols. For symbols with LSB = 0 we use a binary reflected Gray code (BRGC) [100] to
label the most significant bits (MSBs). We use the same MSB labelling for symbols with
LSB = 1. For example, we use the following labelling for the in-phase 8-PAM of the 64-
QAM constellation: ΩMLC-8 = (000, 001, 010, 011, 110, 111, 100, 101). With this labelling,
we construct a channel with minimal reliability for the LSB, thereby maximizing the
reliability of the MSBs. Moreover, the Gray labelling makes the BERs of the MSBs, demapped given the LSB, as similar to one another as possible.
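The labelling described above is easy to generate programmatically. The sketch below is our illustration (the helper names are ours, not from the thesis): it alternates the LSB between adjacent symbols and reuses one BRGC for the MSBs of the LSB = 0 and LSB = 1 subconstellations, reproducing Ω_MLC-8.

```python
def mlc_labels(m):
    """Construct the Omega_MLC-M labelling of an M-PAM, M = 2**m: the LSB
    alternates between adjacent symbols, and the MSBs follow a BRGC that is
    reused for the LSB = 0 and LSB = 1 subconstellations."""
    def brgc(k):
        # Binary reflected Gray code on k bits
        if k == 0:
            return ['']
        prev = brgc(k - 1)
        return ['0' + c for c in prev] + ['1' + c for c in prev[::-1]]
    msbs = brgc(m - 1)
    return [msbs[i // 2] + str(i % 2) for i in range(2 ** m)]

print(mlc_labels(3))  # ['000', '001', '010', '011', '110', '111', '100', '101']
```

The m = 3 output matches the 8-PAM labelling Ω_MLC-8 given in the text.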
We consider MLC schemes in which only the LSB is encoded by the inner code, and
the MSBs are protected only by the outer code (see Fig. 4.1(a)). At the receiver, a
hard-decision on the MSBs, taking into account the hard-decision on the inner-decoder
output bits and the channel information, is passed through the de-interleaver to the outer
decoder (see Fig. 4.1(b)). We assume that the inner decoder passes only a hard-decision
on its bits to the outer decoder and the MSB demapper.
In Fig. 4.1 we also indicate the number of bits passed to (from) the constellation
(de)mapper, per PAM symbol. Note that while all inner-decoded bits are used for demap-
ping the MSBs, inner-decoded information bits are passed to the outer decoder as well.
With the chosen labelling, the LSB channel is output-symmetric (see [15, Defn. 4.8]).
It is known that for a binary-input, memoryless, and output-symmetric channel, the log-
likelihood ratio densities coming from variable nodes of an LDPC code remain symmetric
during decoding [47, Thm. 3]. Therefore, the EXIT function analysis of [60], which we
modified for code design in Chapter 3, remains valid. This symmetry also simplifies inner-
code design and simulation, as the all-zero codeword can be assumed to be transmitted
on the LSB channel without loss of generality.
4.3.2 Inner-Code Description
We adopt the design procedure of Chapter 3 to obtain complexity-optimized inner codes.
The inner code ensemble, members of which have a Tanner graph as shown in Fig. 3.1,
is described by its VN and CN degree distributions.
We consider a CN degree distribution that is concentrated on two consecutive degrees, dc and dc + 1, with dc denoting the average CN degree. We let R(x) = Σ_{d=dc}^{dc+1} R_d x^d denote the node-perspective check-degree distribution.
We use the same parameters as in Sec. 3.2.2 to describe the VN degree distribution
of the systematic inner-code ensemble where we designate a particular subset of the
VNs to be the information set and the remaining VNs form the parity set. The inner-
code rate is denoted by Rin,MLC. We therefore have Σ_{i=0}^{Dv} U_i = Rin,MLC, and also L_i = dc(1 − Rin,MLC)λ_i/i, for i ∈ {1, . . . , Dv}. We let Λ = (λ1, λ2, . . . , λDv) and U = (U0, U1, . . . , UDv). We refer to the pair (Λ, U) as the design parameters. The design parameters will be used in the inner-code optimization program.
Let E denote the number of edges in the ensemble that are not connected to a degree-
one VN. Also, let I denote the maximum number of decoding iterations allowed in the
inner decoder. The complexity score of the inner code in the MLC scheme, ηin,MLC, is then computed as

ηin,MLC = EI / (N(m − 1 + Rin,MLC)),   (4.1)

and measures the number of messages that are passed at the inner decoder per information bit transferred to the outer code. Note that in (4.1) we have accounted for the fact that there are m − 1 uncoded bit-levels per in-phase and in-quadrature M-PAM that incur zero inner decoding complexity. It is easy to see that E/N = (1 − Rin,MLC)(dc − ν), where ν is the average number of degree-one VNs connected to each CN. Therefore, ηin,MLC can be obtained as

ηin,MLC = (1 − Rin,MLC)(dc − ν)I / (m − 1 + Rin,MLC).   (4.2)
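As a concrete illustration, (4.2) can be evaluated directly from the ensemble parameters. The numbers below are hypothetical examples, not designs from this chapter:

```python
def eta_in_mlc(r_in, dc, nu, iters, m):
    """Complexity score (4.2): messages passed at the inner decoder per
    information bit delivered to the outer code."""
    return (1 - r_in) * (dc - nu) * iters / (m - 1 + r_in)

# Illustrative parameters: rate-1/2 inner code, average CN degree 9,
# one degree-one VN per CN, 10 iterations, 64-QAM (m = 3).
print(eta_in_mlc(r_in=0.5, dc=9.0, nu=1.0, iters=10, m=3))  # 16.0
```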
4.3.3 Ensemble Optimization
EXIT Chart Analysis
Since in our architecture the messages passed at the decoder are symmetric, the EXIT
function tracking the decoding procedure is uni-parametric. We use a similar uni-
parametric EXIT function, f(p), as in Sec. 3.3.1, with argument p, the error probability
on messages coming from the VNs, in analyzing the inner-code ensemble.
Similar to Sec. 3.3.2, given the fixed SNR, dc, and ν, we can pre-compute and store
the elementary EXIT functions fi(p), for i ∈ {2, 3, . . . , Dv}, and f1(p), and use them in
the BER analysis and the ensemble optimization. Given the elementary EXIT functions,
the number of iterations, I, required by the inner code to take the VN message error
probability from p0, the channel BER, down to pt, a target message error probability,
can be approximated as in (3.11).
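The iteration-count approximation can be evaluated numerically. The sketch below is our illustration, using a toy linear EXIT function f(p) = γp (and assuming the natural logarithm in the integrand, which also appears in the objective (4.7)); for this toy f the integral has a closed form against which the quadrature can be checked.

```python
import math

def iterations(f, p0, pt, n=100_000):
    """Trapezoidal evaluation of I = integral_{pt}^{p0} dp / (p * ln(p / f(p)))."""
    h = (p0 - pt) / n
    total = 0.0
    for k in range(n + 1):
        p = pt + k * h
        w = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += w * h / (p * math.log(p / f(p)))
    return total

gamma, p0, pt = 0.8, 1e-1, 1e-3
approx = iterations(lambda p: gamma * p, p0, pt)
exact = math.log(p0 / pt) / math.log(1.0 / gamma)  # closed form for f(p) = gamma*p
print(round(approx, 2), round(exact, 2))  # both ≈ 20.64
```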
BER Analysis
We let Pinfo and Pparity denote the information-set BER, and the parity-set BER, respec-
tively, after the target message error probability is achieved. In terms of the ensemble
parameters, Pinfo and Pparity can be obtained as

Pinfo = (1/Rin,MLC) (U0 p0 + U1 f1(pt) + Σ_{i=2}^{Dv} U_i f_{i+1}(pt)),

Pparity = (1/(1 − Rin,MLC)) ((L1 − U1) f1(pt) + Σ_{i=2}^{Dv} (L_i − U_i) f_{i+1}(pt)).
As shown in Fig. 4.1(b), the demapped MSBs along with the information bits of the
inner code are passed to the outer decoder. Let Pout denote the BER on bits passed to
the outer decoder. From the law of total probability, Pout can be obtained as

Pout = ((m − 1)/(m − 1 + Rin,MLC)) PMSB + (Rin,MLC/(m − 1 + Rin,MLC)) Pinfo,   (4.3)
where PMSB is the average BER in demapping the MSBs. Note that a hard decision on
all LSBs (both information bits and parity bits of the inner code) is used to demap the
MSBs. Therefore, PMSB can be obtained as
PMSB = Rin,MLC P^MSB_info + (1 − Rin,MLC) P^MSB_parity,   (4.4)
where P^MSB_info and P^MSB_parity denote the average BER in demapping the MSBs using information bits and parity bits of the inner code, respectively. Now let P^MSB_c be the average BER in demapping the MSBs when the LSB decision is correct, and let P^MSB_e be the average BER in demapping the MSBs when the LSB decision is in error. The values of P^MSB_c and P^MSB_e depend only on the SNR and the constellation labelling and can be obtained empirically by Monte-Carlo simulation. We can then obtain P^MSB_info and P^MSB_parity as

P^MSB_info = (1 − Pinfo) P^MSB_c + Pinfo P^MSB_e,
P^MSB_parity = (1 − Pparity) P^MSB_c + Pparity P^MSB_e.   (4.5)
From (4.3)–(4.5), Pout can be obtained by the affine relation

Pout = aPinfo + bPparity + c,   (4.6)

where

a = ((m − 1)Rin,MLC/(m − 1 + Rin,MLC)) (1/(m − 1) + P^MSB_e − P^MSB_c),
b = ((m − 1)(1 − Rin,MLC)/(m − 1 + Rin,MLC)) (P^MSB_e − P^MSB_c),
c = ((m − 1)/(m − 1 + Rin,MLC)) P^MSB_c

are independent of the inner-code design parameters.
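Since (4.6) follows from substituting (4.4) and (4.5) into (4.3), the affine form can be sanity-checked numerically. The values below are arbitrary illustrative inputs, not design values from the text:

```python
# Numerical check that the affine form (4.6) matches the direct computation
# via (4.3)-(4.5).
m, R = 3, 0.5                    # bits per PAM symbol, inner-code rate
Pc, Pe = 1e-4, 0.3               # MSB demapping BER given correct/erroneous LSB
Pinfo, Pparity = 2e-3, 5e-3      # post-decoding LSB BERs (illustrative)

Pmsb_info = (1 - Pinfo) * Pc + Pinfo * Pe            # (4.5)
Pmsb_parity = (1 - Pparity) * Pc + Pparity * Pe
Pmsb = R * Pmsb_info + (1 - R) * Pmsb_parity         # (4.4)
direct = ((m - 1) * Pmsb + R * Pinfo) / (m - 1 + R)  # (4.3)

a = (m - 1) * R / (m - 1 + R) * (1 / (m - 1) + Pe - Pc)
b = (m - 1) * (1 - R) / (m - 1 + R) * (Pe - Pc)
c = (m - 1) / (m - 1 + R) * Pc
affine = a * Pinfo + b * Pparity + c                 # (4.6)

print(abs(direct - affine) < 1e-12)  # True
```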
Optimization Routine
The complexity-optimized inner-code ensemble is obtained by searching over a discrete
set of values for dc, ν, and pt, and, for each choice, solving the following optimization
problem:

minimize over (Λ, U):

((1 − Rin,MLC)(dc − ν)/(m − 1 + Rin,MLC)) ∫_{pt}^{p0} dp / (p log(p/f(p))),   (4.7)

subject to

Σ_{i=1}^{Dv} λ_i/i ≥ (1 − L0)/(dc(1 − Rin,MLC)),   (4.8)

Σ_{i=1}^{Dv} λ_i = 1,  λ1 dc = ν,   (4.9)

0 ≤ λ_i  ∀i ∈ {1, . . . , Dv},   (4.10)

Σ_{i=0}^{Dv} U_i = Rin,MLC,  U0 = L0,   (4.11)

0 ≤ U_i ≤ L_i  ∀i ∈ {1, . . . , Dv},   (4.12)

f(p) < p  ∀p ∈ [pt, p0],   (4.13)

aPinfo + bPparity + c ≤ P^t_out.   (4.14)
In this optimization problem formulation, constraints (4.8)–(4.13), similar to constraints
(3.14)–(3.19), are the validity constraints. Constraint (4.14) then ensures that the BER
on bits passed to the staircase decoder is at or below the set target. Unsurprisingly, it
turns out that in all the MLC designs reported in this chapter, the highest degree VNs
are always chosen by the optimization routine as information nodes.
As described in Sec. 3.3.2, in terms of the optimization parameters, every constraint
in the optimization program is linear (see (4.6) for how (4.14) is related to the design
parameters). Therefore, one can solve this optimization program using the tools and
techniques described in Sec. 3.3.3.
4.4 BICM Scheme
4.4.1 Coded-Modulation Description
We label the M-PAM constellation of an M²-QAM constellation using a BRGC on m
bits. For example, we use the following labelling for the in-phase 8-PAM of the 64-QAM
constellation: ΩBICM-8 = (000, 001, 011, 010, 110, 111, 101, 100).
Unlike the MLC case, all m bit-levels are encoded jointly by the inner code in our
BICM scheme as shown in Fig. 4.2(a). At the receiver (see Fig. 4.2(b)), we use bitwise
demapping and perform SD bit-metric decoding of the inner LDPC code on all m bit-
levels. The inner decoder output is then passed to the outer decoder, which performs
HD decoding.
A bitwise ΩBICM demapper yields m BICM bit-levels of differing reliabilities. Fol-
lowing the approach of [99], we explicitly incorporate the different bit reliabilities into
the code design and consider a multi-edge type (MET) ensemble [101] as displayed in
Fig. 4.3. The m bit-levels are mapped to the m VN types.
The inner code rate is related to that of the MLC via
m − 1 + Rin,MLC = m Rin,BICM.
For instance, for Rin,MLC = 1/2 and m = 3, the corresponding BICM rate is Rin,BICM =
5/6 for the same spectral efficiency.
Unlike that of the MLC scheme, the inner-code design procedure for the BICM
scheme, described next, does not rely upon the channel being output-symmetric. In-
deed, this approach can also be adapted to the design of MLC schemes. While we have
not pursued this direction in this chapter, we will consider a similar code design approach
in the next chapter.
Figure 4.3: Inner-code ensemble considered for the BICM scheme.
4.4.2 Inner-Code Description
Consider the MET ensemble in Fig. 4.3. Unlike conventional protographs [102], where
each VN-type represents one specific VN degree, we associate with a type-j VN, where
j ∈ {1, . . . , m}, a VN-perspective degree distribution Lj(x) = Σ_{i=0}^{Dv} L_{i,j} x^i, where Dv is the maximum VN degree. The average type-j VN degree therefore is dv_j = Σ_{i=0}^{Dv} i L_{i,j}. We define the edge-perspective degree distribution of the type-j VN as λj(x) = Σ_{i=1}^{Dv} λ_{i,j} x^{i−1}.
At any CN, let the type-j degree denote the number of its edges that come from
type-j VNs. We consider a MET ensemble in which the type-j CN degree distribution is
concentrated on two consecutive degrees, dcj and dcj + 1. We denote by Γj(x) the type-j
CN degree distribution. The type-j average CN degree, dc_j, is then obtained as

dc_j = dv_j / (m(1 − Rin,BICM)).

Furthermore, similar to the MLC scheme, we consider an overall CN distribution that is concentrated on two consecutive degrees, dc and dc + 1, with dc denoting the average CN degree. We therefore have dc = Σ_{j=1}^{m} dc_j.

The average number of degree-one VNs connected to each CN in the MET ensemble is

ν = (Σ_{j=1}^{m} L_{1,j}) / (m(1 − Rin,BICM)).

Similar to (4.2), the complexity score of the inner code in the BICM scheme, ηin,BICM, is obtained by

ηin,BICM = (1 − Rin,BICM)(dc − ν)I / Rin,BICM.   (4.15)
By assigning a degree distribution to each of the m VN-types, we obtain an ensemble in which each VN-type sees the reliability of its assigned bit-level. Also, within each type, VNs of different degrees attain different reliabilities after decoding. Let N_j denote the number of non-zero coefficients of the degree distribution Lj(x) of the type-j VNs. Overall, after decoding, we then have N = Σ_{j=1}^{m} N_j VN classes of different reliabilities.

Table 4.1: An example of degree distributions of various types, for m = 3.

j | Lj(x)             | dv_j | Γj(x)              | dc_j
1 | (2/3)x + (1/3)x²  | 4/3  | (2/9)x + (7/9)x²   | 16/9
2 | x⁴                | 4    | (2/3)x⁵ + (1/3)x⁶  | 16/3
3 | x⁵                | 5    | (1/3)x⁶ + (2/3)x⁷  | 20/3
4.4.3 Ensemble Sampling
Note that for the MET ensemble we require not only, for all j ∈ {1, . . . , m}, a concentrated type-j CN degree distribution, but also a concentrated overall CN degree
distribution. We use an algorithm we call degree partition and sort (DPS) to obtain such
a CN configuration. Before we describe DPS we need the following subroutines that
operate on integer matrices. Here, by “row-weight” we mean the sum of the elements in
a given row of the matrix.
• Sort: The function sort≤(P, l) operates on a matrix P of r rows and a column
vector l of r corresponding integers. It returns a matrix P, in which the rows of P
have been sorted, top-down, by row-weight in ascending order, and it produces a
vector l preserving the original correspondence. The function sort≥(P, l) is defined
similarly, but it returns a P in which the rows of P have been sorted, top-down,
by row-weight in descending order.
• Expand: The function P = expand(P, l) operates on an s × t matrix P and an s × 1 vector l of positive integers summing to s_l. It returns an s_l × t matrix in which the k-th row of P is repeated with multiplicity l(k), where l(k) denotes the k-th element of l.
• Collapse: The function coll(P) operates on an s_l × t matrix P. It returns an s × t matrix P and an s × 1 multiplicity vector l of positive integers, where P = expand(P, l), and s is as small as possible.
Given the degree distributions of all types, DPS works as follows. Let CΓ be the
smallest positive integer such that, for j ∈ 1, . . . ,m, CΓΓj(x) is a polynomial with
integer coefficients. Let Pj be a column vector containing the CN degrees of type-j,
and let lj be the multiplicity vector containing the corresponding coefficients of CΓΓj(x).
Initially, we let P = P1 and l = l1. For j = 2, . . . ,m, we then update P and l iteratively
as
(P, l) = coll( expand(sort≥(P, l)) | expand(sort≤(Pj, lj)) ),
where ‘|’ denotes the concatenation operator. Note that a Γj(x) with irrational coefficients
can be approximated arbitrarily closely by a polynomial with rational coefficients. In
practice, therefore, DPS works for any set of degree distributions.
It is possible to show that the resulting matrix P and vector l then describe the CN-
side of the MET ensemble. The rows of P represent the CNs in the ensemble and the
columns represent the VN types they connect to. Matrix P has at most m + 1 rows, which correspond to the various CN kinds in the ensemble. The (i, j)-th element of P is the number of type-j edges at the i-th CN kind, and vector l gives the multiplicity of each CN kind in the ensemble.
As an example, we apply DPS to an ensemble for which the VN-type degree distri-
butions are given in Table 4.1. Here, we have CΓ = 9. We initialize the DPS algorithm
with
P = [1]    l = [2]
    [2]        [7]

corresponding to the first VN type. After one round of DPS, we get

P = [2 5]    l = [6]
    [2 6]        [1]
    [1 6]        [2]

and after the last round of DPS, we get

P = [2 6 6]    l = [1]
    [2 5 6]        [2]
    [2 5 7]        [4]
    [1 6 7]        [2]
In the resulting matrix P, 1) the elements of the j-th column, representing the type-
j CN degrees, are concentrated on integers dcj and dcj + 1, and 2) the row-weights,
representing the overall CN degrees, are concentrated on integers dc and dc + 1. These
properties hold true in general.
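The DPS algorithm can be sketched compactly. The implementation below is our illustration (the helper names are ours): coll here merges identical adjacent rows, which suffices for this example, and the worked example from Table 4.1 is reproduced.

```python
def sort_rows(P, l, ascending=True):
    """sort: order the rows of P (with multiplicities l) by row-weight."""
    order = sorted(range(len(P)), key=lambda k: sum(P[k]), reverse=not ascending)
    return [P[k] for k in order], [l[k] for k in order]

def expand(P, l):
    """expand: repeat the k-th row of P with multiplicity l[k]."""
    return [row for row, mult in zip(P, l) for _ in range(mult)]

def collapse(rows):
    """coll: merge identical adjacent rows back into (P, l)."""
    P, l = [], []
    for row in rows:
        if P and row == P[-1]:
            l[-1] += 1
        else:
            P.append(list(row))
            l.append(1)
    return P, l

def dps(types):
    """types[j] = list of (CN degree, integer multiplicity) pairs for type j,
    i.e., the coefficients of C_Gamma * Gamma_j(x)."""
    P = [[d] for d, _ in types[0]]
    l = [mult for _, mult in types[0]]
    for pairs in types[1:]:
        Pj = [[d] for d, _ in pairs]
        lj = [mult for _, mult in pairs]
        left = expand(*sort_rows(P, l, ascending=False))
        right = expand(*sort_rows(Pj, lj, ascending=True))
        P, l = collapse([a + b for a, b in zip(left, right)])
    return P, l

# Worked example from Table 4.1 (C_Gamma = 9):
types = [[(1, 2), (2, 7)],   # 9*Gamma_1 = 2x + 7x^2
         [(5, 6), (6, 3)],   # 9*Gamma_2 = 6x^5 + 3x^6
         [(6, 3), (7, 6)]]   # 9*Gamma_3 = 3x^6 + 6x^7
P, l = dps(types)
print(P)  # [[2, 6, 6], [2, 5, 6], [2, 5, 7], [1, 6, 7]]
print(l)  # [1, 2, 4, 2]
```

The output matches the final P and l of the worked example above.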
Let P and l be the outputs of DPS and let s_l = Σ_{k=1}^{m+1} l(k). Let nv and nc be positive integers such that

1. nv is divisible by m,

2. for j ∈ {1, . . . , m}, (nv/m)Lj(x) is a polynomial with integer coefficients,

3. nc is divisible by s_l, and

4. nc/nv = 1 − Rin,BICM.

We sample from the MET ensemble by creating a bipartite graph with (nv/m)L_{i,j} degree-i VNs of the j-th type, for i ∈ {1, . . . , Dv} and j ∈ {1, . . . , m}, and (nc/s_l)l(k) CNs of the k-th kind, for k ∈ {1, . . . , m + 1}. We then randomly place the edges in the graph according to the multiplicity of edge-types at each CN. We do not allow parallel edges in the graph.
Note that, similar to the MLC-scheme inner code, where the CN degrees concentrate on two consecutive degrees, the CN degrees of the BICM inner code also concentrate locally on two consecutive degrees for each VN-type. Thus, the BICM inner-code ensemble can
be interpreted as the BICM counterpart of the MLC inner-code ensemble with multiple
VN-types.
4.4.4 Ensemble Optimization
EXIT Function Analysis
The iterative decoding threshold of conventional protograph ensembles can be efficiently
computed using the protograph-based EXIT analysis [103]. Here, we carefully consider
the BICM inner code MET ensemble and the irregular VN and CN degree distributions
of each type and provide an analysis for it based on EXIT functions.
Let the function Υ(σ) be defined as

Υ(σ) = 1 − ∫_{−∞}^{∞} (1/√(2πσ²)) e^{−(z − σ²/2)²/(2σ²)} log2(1 + e^{−z}) dz.   (4.16)
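For reference, (4.16) can also be evaluated by direct numerical integration (the thesis instead uses the closed-form approximation of [104]). The sketch below is our illustration, using a numerically stable rewriting of log2(1 + e^−z) for large |z|:

```python
import math

def upsilon(sigma, n=4000):
    """Direct numerical evaluation of (4.16): the LLR is N(sigma^2/2, sigma^2)
    and log2(1 + e^-z) is averaged over its density."""
    if sigma <= 0:
        return 0.0
    mu = sigma ** 2 / 2
    lo, hi = mu - 10 * sigma, mu + 10 * sigma
    h = (hi - lo) / n
    ln2 = math.log(2)
    total = 0.0
    for k in range(n + 1):
        z = lo + k * h
        pdf = math.exp(-((z - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)
        # log2(1 + e^-z), rewritten for numerical stability at large |z|
        log_term = max(-z, 0.0) / ln2 + math.log1p(math.exp(-abs(z))) / ln2
        w = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += w * pdf * log_term * h
    return 1.0 - total

print(round(upsilon(2.0), 3))  # between 0.4 and 0.6; Upsilon rises from 0 toward 1
```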
Similar to [99, Sec. IV-B2], we let SNRj denote the equivalent binary-input AWGN
surrogate channel-SNR for the j-th bit channel. The corresponding channel log-likelihood
ratio has a distribution with variance σ²_j = 4 SNR_j. The message from a type-j VN in the ℓ-th iteration, for j ∈ {1, . . . , m}, is

I^ℓ_{vj→c} = Σ_{i=1}^{Dv} λ_{i,j} I^ℓ_{vj→c}(i),   (4.17)

where I^ℓ_{vj→c}(i) is given by

I^ℓ_{vj→c}(i) = Υ(√((i − 1)[Υ^{−1}(I^{ℓ−1}_{c→vj})]² + σ²_j)).   (4.18)

Initially, we let I⁰_{vj→c} = Υ(σ_j). We use the approximation of [104, Eqs. (9), (10)] in computing Υ(σ).
Note that (4.17) and (4.18) are identical to the equations for irregular LDPC code
ensembles given in [105, Chap. 3, Eq. (17)] and [105, Chap. 3, Eq. (19)], respectively.
The message from a CN to a type-j VN in the ℓ-th iteration is

I^ℓ_{c→vj} = Σ_{i=1}^{m+1} ρ_{i,j} I^ℓ_{c→vj}(i),   (4.19)

where I^ℓ_{c→vj}(i) is given by

I^ℓ_{c→vj}(i) = 1 − Υ( ( Σ_{j′=1, j′≠j}^{m} P_{ij′} [Υ^{−1}(1 − I^ℓ_{vj′→c})]² + (P_{ij} − 1)[Υ^{−1}(1 − I^ℓ_{vj→c})]² )^{1/2} ).   (4.20)

Here, P_{ij} is the (i, j)-th element of the matrix P, and ρ_{i,j} denotes the portion of type-j edges connected to the i-th CN kind.
Note that (4.19) is identical to the equation for irregular LDPC ensemble analysis
given in [105, Chap. 3, Eq. (16)] and (4.20) is identical to the equations for PEXIT
analysis given in [99, Eq. (17)], [103, Sec. III.C].
BER Analysis
The a posteriori probability (APP) mutual information at a type-j VN after ` decoding
iterations, I^{APP,ℓ}_j, can be computed as

I^{APP,ℓ}_j = Σ_{i=1}^{Dv} L_{i,j} I^{APP,ℓ}_j(i),

where

I^{APP,ℓ}_j(i) = Υ(√(i [Υ^{−1}(I^{ℓ−1}_{c→vj})]² + σ²_j)).   (4.21)
Note that the right-hand side of (4.21) is almost identical to the right-hand side of (4.18), except that we have a factor of i instead of (i − 1), as (4.18) computes extrinsic information.
Using (4.21), the BER at degree-i, type-j VNs after ℓ decoding iterations, ε^ℓ_{i,j}, therefore is

ε^ℓ_{i,j} = (1/2) erfc(σ_{i,j}/(2√2)),

where σ_{i,j} = Υ^{−1}(I^{APP,ℓ}_j(i)) and erfc(x) is the standard complementary error function. The contribution of degree-i, type-j VNs to the overall BER therefore is (1/m)L_{i,j} ε^ℓ_{i,j}.
Note, however, that unlike the MLC scheme, errors on the inner parity bits have no impact on the BER of the bits passed to the outer decoder. We therefore assign the VNs of highest reliability as the information bits. In particular, we let ε = (ε1, . . . , εN) be a vector whose elements are the ε^ℓ_{i,j} values, sorted in ascending order, and we let α = (α1, . . . , αN) be their corresponding contribution factors (the (1/m)L_{i,j} values). Here, N denotes the number of different reliabilities we get after decoding, as stated in Sec. 4.4.2. Let κ be the maximum index such that Σ_{i=1}^{κ} α_i < Rin,BICM. The BER on bits passed to the outer decoder, Pout, is therefore obtained as

Pout = (1/Rin,BICM) ( Σ_{i=1}^{κ} α_i ε_i + (Rin,BICM − Σ_{i=1}^{κ} α_i) ε_{κ+1} ).   (4.22)
Differential Evolution
We jointly optimize the VN-type degree distributions using a differential evolution algo-
rithm [106]. In particular, we follow the differential evolution procedure of [105, Ch. 3,
Sec. 3.3] for the inner-code ensemble optimization.
Given the inner code rate, Rin,BICM, and a fixed target complexity constraint, η^t_in,BICM, the differential evolution algorithm searches for a set of m degree distributions, {Lj(x)}_{j=1}^{m}, that minimizes a score function defined in the next paragraph.

For a given set of VN degree distributions {Lj(x)}_{j=1}^{m}, we first obtain the maximum number of decoding iterations allowed, I, that satisfies the data-flow constraint ηin,BICM ≤ η^t_in,BICM from (4.15). A non-integral number of iterations is obtained by time-sharing between decoding with ⌊I⌋ and with ⌈I⌉ iterations. The score of {Lj(x)}_{j=1}^{m} is then defined as the minimum channel SNR at which Pout, as defined in (4.22), falls below the target BER, P^t_out.
The differential evolution search is performed on an N-dimensional vector x containing the stacked coefficients of {Lj(x)}_{j=1}^{m}. Let g denote the number of generations the differential evolution is carried out for and let S denote the population size at each generation. Also, let β > 0 be an amplification factor and let 0 ≤ ξ ≤ 1 denote a cross-over probability. On the population of each generation, the differential evolution carries out the following three steps, for each s ∈ {1, . . . , S}:

1. Generate a mutant vector vs = x_{i1} + β(x_{i2} − x_{i3}), where i1, i2, i3 are chosen uniformly at random, without replacement, from the set {1, . . . , S} \ {s}.

2. Generate a competitor vector us whose i-th component, for i ∈ {1, . . . , N}, is found as

us,i = vs,i with probability ξ, and us,i = xs,i otherwise.

3. Vector us then replaces vector xs in the next generation if and only if it has a better (i.e., lower) score.
The algorithm is initialized with random vectors x1, . . . ,xS that satisfy the code-rate
constraint. After running differential evolution for g generations, the algorithm outputs the vector, x∗, with the best score in the last generation. The stacked vector x∗
then determines the optimal inner-code ensemble, for which we can sample a base-matrix
as described in Sec. 4.4.3.
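The three DE steps above can be sketched generically. The following is our illustration, with a toy score function standing in for the problem-specific SNR-threshold evaluation (all parameter names and values here are illustrative):

```python
import random

def differential_evolution(score, n_dim, pop_size=20, gens=200, beta=0.6, xi=0.6, seed=1):
    """The three DE steps described above, for a generic score (lower is better)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_dim)] for _ in range(pop_size)]
    for _ in range(gens):
        new_pop = []
        for s in range(pop_size):
            # Step 1: mutation from three distinct other population members
            i1, i2, i3 = rng.sample([i for i in range(pop_size) if i != s], 3)
            v = [pop[i1][n] + beta * (pop[i2][n] - pop[i3][n]) for n in range(n_dim)]
            # Step 2: component-wise crossover with probability xi
            u = [v[n] if rng.random() < xi else pop[s][n] for n in range(n_dim)]
            # Step 3: greedy selection
            new_pop.append(u if score(u) < score(pop[s]) else pop[s])
        pop = new_pop
    return min(pop, key=score)

# Toy usage: minimize a shifted sphere function (stand-in for the SNR score).
best = differential_evolution(lambda x: sum((c - 0.3) ** 2 for c in x), n_dim=4)
```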
4.5 Results
We design concatenated coded modulation schemes at an overall OH of 28% and 25%,
for various choices of modulation orders using MLC and BICM. We characterize the
performance-complexity trade-off in MLC and BICM by obtaining the Pareto frontier
between the SNR at which the coded modulation operates, and the decoding complexity.
In particular, at any given SNR, we obtain complexity-optimized concatenated MLC and
BICM inner-code ensembles, according to measures (4.2) and (4.15), using the methods
described in Sec. 4.3 and Sec. 4.4, respectively.
We consider a QC structure for the inner codes. We sample base codes of small
length from the obtained ensembles and then lift them to obtain QC inner codes of girth
at least 8. Note that should a sampled base matrix not have a full-rank sub-matrix
in the designated parity positions we discard it and sample another base matrix. As
mentioned in Sec. 3.4.4, the QC structure is well-known to be hardware-friendly, giving
rise to energy-efficient implementations [64].
Figure 4.4: Performance-complexity comparisons of optimized codes for MLC and BICM using 64-QAM, compared with the design in [2]. The number of decoding iterations required by each designed code is indicated. At the overall 28% OH of these schemes, CSL = 15.0 dB.
As suggested in [99, Sec. IV-E], for optimization of the inner code in the BICM scheme
we chose the initial population size S = 100 and g = 1000 for the differential evolution
algorithm. We also used β = 0.6 and ξ = 0.6.
Our codes induce a uniform distribution over the transmitted QAM symbols. In all
the results presented, we use the SP algorithm for inner-decoding, with floating-point
message-passing. For each scheme we measure the gap (in dB) to the CSL and the NCG
using (1.1) and (1.2).
4.5.1 Design for 28% OH
For coded modulation designs with 28% OH, we consider a 64-QAM and we use the
staircase code of rate-239/255 as the outer code. We consider a PAM frame length of
8000 (which amounts to 4000 QAM symbols). This requires the use of a rate-(1/2),
length-8000, inner code on the LSB for the MLC scheme, and the use of a rate-(5/6),
length-24000, inner code for the BICM scheme.
As can be seen in Fig. 4.4, compared to the BICM scheme, the MLC scheme provides
a superior performance-complexity trade-off. Compared to the BICM scheme at a similar
decoding complexity, the MLC scheme provides SNR gains of up to 0.4 dB. Also, at a
similar operating point, up to 60% reduction in decoding complexity can be achieved by
the MLC scheme. We believe that this advantage is obtained because the MLC scheme,
despite its shorter block length, leaves all but one bit level uncoded, unlike the BICM decoder, which processes all bit levels.
Figure 4.5: Simulated decoder outputs of inner codes for designs at 28% OH with 64-QAM. The mid-point on each BER curve (highlighted by an 'o') is the code operational point, i.e., the SNR for which the inner code is designed to achieve Pout ≤ P^t_out.
The proposed MLC scheme can operate within a 1.4 dB gap to the CSL, achieving an
NCG of up to 13.6 dB with a complexity score below 24. We also see that the obtained
MLC-based codes attain 0.24 dB better coding gain than the code of [2] at a similar
decoding complexity, and a 30% reduction in decoding complexity at a similar NCG.
In Fig. 4.5, we plot the average BER passed to the outer code versus SNR, for the
obtained codes in the MLC and BICM schemes. The P^t_out line shows the target we set
for the inner codes. The mid-point SNR on each curve (highlighted by an ‘o’) is the code
operational point, i.e., the SNR for which the inner code is designed. Note that all BERs
of the sampled codes hit very close to the target at their operational point, verifying our
design approach.
Figure 4.6: Performance-complexity comparisons of the obtained optimized codes for MLC and BICM of various orders at 25% overall OH, compared with the designs in [3]. The number of decoding iterations required by each designed code is indicated. (a) 16-QAM, CSL = 10.15 dB; (b) 64-QAM, CSL = 15.4 dB; (c) 256-QAM, CSL = 20.45 dB.
4.5.2 Design for 25% OH
For coded modulation designs at 25% OH, we consider three modulation orders, namely
16-QAM, 64-QAM and 256-QAM with PAM frame lengths of 12000, 8000, 6000 (which
amounts to 6000, 4000, and 3000 QAM symbols), respectively. These choices for the
PAM frame lengths would result in a constant bit throughput in all modulation schemes.
For the MLC scheme, we use a rate-(2/3) length-12000 inner code for 16-QAM, a
rate-(1/2) length-8000 inner code for 64-QAM, and a rate-(1/3) length-6000 inner code
for 256-QAM, on the LSB channel. For the BICM scheme, we use a rate-(5/6), length-24000 inner code for the 16-QAM, 64-QAM, and 256-QAM schemes.
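As a quick sanity check of these choices, the sketch below uses only the parameters stated above (PAM frame lengths and MLC inner-code rates) and confirms that every configuration carries the same 24000 bits per frame, and, as a derived consequence, hands the same 20000 bits per frame to the outer code:

```python
from fractions import Fraction as F

# Assumed parameters, taken from the text: PAM frame length and the MLC
# inner-code rate applied on the LSB level of each modulation format.
configs = {
    "16-QAM":  {"m": 2, "n_pam": 12000, "r_in": F(2, 3)},
    "64-QAM":  {"m": 3, "n_pam": 8000,  "r_in": F(1, 2)},
    "256-QAM": {"m": 4, "n_pam": 6000,  "r_in": F(1, 3)},
}

bits_per_frame = {}   # raw bits carried by one PAM frame
outer_input = {}      # bits handed to the outer code per frame

for name, c in configs.items():
    bits_per_frame[name] = c["m"] * c["n_pam"]
    # inner-code information bits on the coded (LSB) level
    # plus the m-1 bit levels that bypass the inner code
    outer_input[name] = c["r_in"] * c["n_pam"] + (c["m"] - 1) * c["n_pam"]

print(bits_per_frame)   # every format carries 24000 bits per frame
print(outer_input)      # and passes 20000 bits per frame to the outer code
```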
In Fig. 4.6, we plot the Pareto frontiers obtained by the MLC and BICM schemes at various modulation orders. Compared to the BICM scheme at a similar decoding complexity,
the MLC scheme provides SNR gains of up to 0.4 dB, 0.8 dB, and 1.2 dB for 16-QAM,
64-QAM and 256-QAM, respectively. At a similar operating point, up to 43%, 64%, and
78% reduction in decoding complexity can be achieved by the MLC scheme for 16-QAM,
64-QAM and 256-QAM, respectively. Also in that order, the MLC schemes can operate within 1 dB, 1.2 dB, and 1.65 dB gaps to the CSL, achieving NCGs of up to 12.8 dB, 13.6 dB, and 14 dB with complexity scores of just under 40, 22, and 12.
In Fig. 4.6, we also see that the obtained MLC and BICM codes provide a supe-
rior performance-complexity trade-off compared to the MLC and BICM codes of [3],
respectively. At a similar NCG, for the MLC-based schemes of 16-QAM, 64-QAM and
256-QAM, a 23%, 54%, and 55% reduction in decoding complexity was achieved by our
codes, compared to the codes of [3], respectively. Also at a similar NCG, for the BICM-
based schemes of 16-QAM, 64-QAM and 256-QAM, our codes achieve a 73%, 80%, and
81% reduction in decoding complexity, respectively, compared to the codes of [3].
4.5.3 Design Example¹
Here we describe a design example for the MLC scheme in detail. We explain the parameters we pick for the FEC scheme and carefully design the interleaver between the
inner and outer code. To validate the system operation, we then implement the encoder
and decoder of both the inner and the outer code and provide BER measurements down
to 10−7.
¹The FEC design and simulation presented in this section is joint work with Alvin Sukmadji.
Figure 4.7: Achievable information rate (AIR), in bit/symbol, for 16- and 64-QAM modulations compared to the unconstrained Shannon capacity. The operational point of the designed concatenated code is also shown and compared to that of 400ZR [4].
FEC Parameters
In Fig. 4.7 we plot the achievable information rate (AIR) of signalling with 16-QAM
alongside that of 64-QAM and the Shannon capacity. We pick signalling with 16-QAM and target an overall OH of 25% (3.2 bit/symbol, a high spectral efficiency). At
this OH, the loss due to signalling with 16-QAM compared to the unconstrained Shannon
limit is a mere 1 dB, and the loss compared to signalling 64-QAM is insignificant. By
contrast, at the rate-3.485 bit/symbol of the recent 400ZR implementation agreement [4],
the loss due to signalling with 16-QAM compared to the Shannon limit is around 1.5 dB.
For the outer zipper code, we picked a length-3960 (Galois field extension degree 12), 3-error-correcting constituent code, which gives an outer rate of

R_out = 1 − (3 · 12)/(3960/2) ≈ 0.9818.
We define a chunk to have 1210 constituent codes at the zipper decoder with a total
of 6 chunks (around 14.4 Mbits) per decoding window. This outer code has threshold
1.07 × 10−3. Setting the inner-code rate to Rin,MLC = 17/27 then results in 25% overall
OH. We target operation at 0.8 dB gap to the CSL. The inner-code optimization routine
of Sec. 4.3.3 then gives the degree distributions
L(x) = (10x + 8x^5 + 9x^6)/27,

R(x) = (6x^10 + 4x^11)/10,
with 14 decoding iterations. See Fig. 4.7 for the operational point of the obtained code. It also turns out that the optimization routine gives the same degree distributions, but with 12 decoding iterations, for operation at a 0.85 dB gap to the CSL. We sampled a base code of length 27 from this ensemble and lifted it by a factor of 45 × 55 = 2475 to obtain a girth-10 code.
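The lifting step can be illustrated with a toy quasi-cyclic construction. The base matrix, shift values, and lifting factor below are placeholders for illustration only, not the actual length-27 base code or its 45 × 55 = 2475 lifting:

```python
import numpy as np

# Toy quasi-cyclic lifting: each 1 in the base parity-check matrix becomes a
# Z x Z cyclically shifted identity block; each 0 becomes a Z x Z zero block.
def lift(base, shifts, Z):
    rows, cols = base.shape
    H = np.zeros((rows * Z, cols * Z), dtype=np.uint8)
    for r in range(rows):
        for c in range(cols):
            if base[r, c]:
                block = np.roll(np.eye(Z, dtype=np.uint8), shifts[r, c], axis=1)
                H[r * Z:(r + 1) * Z, c * Z:(c + 1) * Z] = block
    return H

base = np.array([[1, 1, 0, 1],
                 [0, 1, 1, 1]], dtype=np.uint8)   # placeholder base graph
shifts = np.array([[0, 2, 0, 1],
                   [0, 3, 4, 2]])                 # placeholder circulant shifts
H = lift(base, shifts, Z=5)
print(H.shape)   # (10, 20): node degrees are inherited from the base graph
```

A hardware-friendly decoder can then process the lifted code one circulant block at a time, which is the appeal of this construction.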
Interleaver Design
Note that the inner code only guarantees that the average BER on bits passed to the outer
code is at or below a target. However, VNs of different degrees have different reliabilities
and similarly, the bits carried on the uncoded level that are demapped conditioned on
those VNs also have different reliabilities. The task of the interleaver therefore is not
only to reduce, to the extent possible, the correlation among bits at each constituent
code, but also to ensure that each constituent code observes, on average, the same BER.
To design such an interleaver, we first classify the bits passed to the outer code. Note that each base inner frame produces 44 bits: 17 information bits from the coded level and 27 bits from the uncoded level. We group their corresponding lifted bits into 44 classes. Therefore, with each inner frame, a total of (27 + 17) · 2475 = 108,900 bits are passed to the outer code (17 · 2475 bits from the coded level and 27 · 2475 from the uncoded level), among which we have 44 classes of 2475 bits each. The bits of each class are then divided into 55 cards of 45 bits each, for a total of 2420 = 1210 · 2 cards per inner frame.
We have 22 inner frames per chunk. In Fig. 4.8 we depict the interleaving and placement
of the inner frames into the real buffer of the zipper decoder chunk. Here, we first stack
the cards of each inner frame vertically on top of each other in two decks (note that
we have 1210 constituent codes per chunk at the outer decoder and therefore each inner
frame will have 2 cards per constituent code). Then, we vertically shift the two decks of
each inner frame as follows: The first inner-frame decks are shifted by 0 cards (no shift),
the second inner-frame decks are shifted by 55 cards, the third inner-frame decks are
shifted by 2 · 55 cards, and so on, and finally, the 22nd inner-frame decks are shifted by
21 · 55 cards. The shifted decks then form the zipper decoder chunk. With this placement, we ensure that 1) each constituent code has an equal number of bits from the 22 inner frames, providing the maximum possible mitigation of correlation among its bits, and 2) each constituent code has an equal number of bits from the 44 classes of inner bits, providing it with the expected BER.
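A small sketch of this card/deck bookkeeping verifies property 1). It assumes an indexing in which the card at deck position r of frame k lands on constituent code (r + 55k) mod 1210; the exact ordering of cards within a deck is not specified above, so property 2) is not checked here:

```python
from collections import Counter

# Assumed indexing: frame k contributes two decks of 1210 cards each, and
# both decks are cyclically shifted by 55*k card positions, so the card at
# deck position r lands on constituent code (r + 55*k) mod 1210.
N_CC, N_FRAMES, N_DECKS, SHIFT = 1210, 22, 2, 55

cards_per_cc = {cc: [] for cc in range(N_CC)}   # frame index of each card
for k in range(N_FRAMES):
    for deck in range(N_DECKS):
        for r in range(N_CC):
            cards_per_cc[(r + SHIFT * k) % N_CC].append(k)

# Each constituent code should collect 44 cards (44 * 45 = 1980 bits, i.e.,
# half of the length-3960 constituent code), exactly 2 from each frame.
counts = Counter(cards_per_cc[0])
print(len(cards_per_cc[0]), counts[0])   # 44 cards in total, 2 from frame 0
```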
Figure 4.8: The interleaving and placement of bits into the real buffer of the outer decoder per chunk, for the FEC parameters discussed in Sec. 4.5.3.
Figure 4.9: BER simulations for the designed concatenated LDPC-zipper FEC scheme, with 14 and 12 inner decoding iterations.
BER Measurements
In Fig. 4.9 we plot the BER simulation results for the obtained concatenated LDPC-
zipper FEC scheme. Here we have used a diagonal interleaver between the real and
virtual buffer at the outer code. We plot the results for both 14 and 12 inner decoding
iterations.
For each SNR, we ran a few trials of the simulation. A trial is considered to be
complete when the decoder records a total of 50 bursts, where a burst is loosely defined
as a sequence of received erroneous chunks of length between 3 and 40, inclusive. The
tips of each error bar denote the maximum and minimum BER values we obtained in
these trials. See Tables 4.2 and 4.3 for more details on the statistics of the simulation
results.
4.6 Conclusion
In this chapter, we have compared performance-complexity tradeoffs achievable by MLC
and BICM in a concatenated coded modulation system at 28% and 25% OHs, using
various QAM modulation schemes. For both systems we have used state-of-the-art op-
timization strategies to obtain a complexity-optimized error-reducing LDPC inner code,
Table 4.2: Statistics of the simulation results shown in Fig. 4.9 (14 inner iterations).
SNR (dB)   # of trials   average   stdev   max   min
10.948 5 4.630× 10−4 2.543× 10−5 4.898× 10−4 4.304× 10−4
10.949 5 3.041× 10−4 2.988× 10−5 3.258× 10−4 2.539× 10−4
10.950 5 2.081× 10−4 4.365× 10−5 2.798× 10−4 1.708× 10−4
10.951 5 1.410× 10−4 1.574× 10−5 1.652× 10−4 1.232× 10−4
10.952 5 6.952× 10−5 7.969× 10−6 7.890× 10−5 6.262× 10−5
10.953 5 3.614× 10−5 7.306× 10−6 4.527× 10−5 2.482× 10−5
10.954 5 2.152× 10−5 5.490× 10−6 2.630× 10−5 1.483× 10−5
10.955 5 9.584× 10−6 2.096× 10−6 1.189× 10−5 6.857× 10−6
10.956 5 5.061× 10−6 2.988× 10−7 5.294× 10−6 4.567× 10−6
10.957 5 3.192× 10−6 5.699× 10−7 4.068× 10−6 2.510× 10−6
10.958 5 1.393× 10−6 3.863× 10−7 1.885× 10−6 9.007× 10−7
10.959 5 7.239× 10−7 2.923× 10−7 1.077× 10−6 4.722× 10−7
Table 4.3: Statistics of the simulation results shown in Fig. 4.9 (12 inner iterations).
SNR (dB)   # of trials   average   stdev   max   min
10.996 4 8.116× 10−4 4.010× 10−6 8.172× 10−4 8.086× 10−4
10.997 4 7.836× 10−4 5.076× 10−6 7.898× 10−4 7.774× 10−4
10.998 4 7.511× 10−4 3.834× 10−6 7.568× 10−4 7.487× 10−4
10.999 4 7.048× 10−4 1.154× 10−5 7.172× 10−4 6.915× 10−4
11.000 8 6.747× 10−4 1.234× 10−5 6.906× 10−4 6.518× 10−4
11.001 8 6.419× 10−4 1.991× 10−5 6.846× 10−4 6.243× 10−4
11.002 8 5.539× 10−4 1.875× 10−5 5.799× 10−4 5.248× 10−4
11.003 8 4.793× 10−4 2.075× 10−5 5.062× 10−4 4.464× 10−4
11.004 8 3.624× 10−4 3.549× 10−5 4.189× 10−4 3.168× 10−4
11.005 8 2.420× 10−4 2.040× 10−5 2.671× 10−4 2.133× 10−4
11.006 8 1.212× 10−4 1.043× 10−5 1.406× 10−4 1.074× 10−4
11.007 8 7.028× 10−5 1.627× 10−5 9.050× 10−5 4.617× 10−5
11.008 8 2.595× 10−5 4.584× 10−6 3.340× 10−5 2.073× 10−5
11.009 8 1.232× 10−5 2.030× 10−6 1.536× 10−5 1.068× 10−5
11.010 8 5.672× 10−6 9.819× 10−7 7.430× 10−6 4.155× 10−6
11.011 8 2.412× 10−6 3.972× 10−7 2.876× 10−6 1.691× 10−6
11.012 8 8.857× 10−7 1.396× 10−7 1.107× 10−6 7.002× 10−7
11.013 8 4.257× 10−7 8.815× 10−8 5.359× 10−7 2.991× 10−7
to concatenate with an outer hard-decision code. We characterize the trade-off between
performance and decoding complexity by a Pareto frontier. Our results show that the
MLC schemes, despite operating with a shorter block length than BICM, dominate the
BICM schemes from a performance-complexity standpoint. Our complexity-optimized
MLC and BICM schemes also provide a superior performance-complexity trade-off rel-
ative to existing proposals [2, 3], achieving net coding gains of up to 14 dB, yet with
manageable complexity.
We emphasize that our choice to inner-encode only one bit-level in the MLC scheme
is driven by complexity considerations. The fact that the remaining bit-levels can be left
uncoded is permitted because of the presence of an outer code, and then only in certain
settings. Nevertheless, for modulations with up to 256 points, we have shown that the
MLC scheme provides excellent performance-complexity tradeoffs with this architecture.
Furthermore, in Chapter 6 we obtain MLC schemes in which we protect more than one
bit-level by a complexity-optimized non-binary LDPC code, achieving even better FEC
performance.
Chapter 5

Low-Complexity Rate- and Channel-Configurable Concatenated Codes
5.1 Introduction
Conventionally, efficient and low-complexity FEC schemes have been designed for a spe-
cific system throughput and channel quality. The rapid adoption of OTNs that can
operate with various modulation formats at a variety of transmission rates and channel
qualities requires, however, that researchers rethink this convention and design config-
urable FEC schemes that can be deployed in multiple modes of operation. In this chapter,
we propose a design approach for low-complexity FEC schemes that can be configured
to operate at multiple transmission rates and channel qualities.
Code designs configurable to channel variations have been studied previously [108,
109]. In FEC design for optical communication, researchers have considered scalable
designs that trade coding gain for low-complexity operation. In the widely used FEC so-
lutions for OTNs—e.g., product-like codes, LDPC codes, and turbo codes—this trade-off
can often be realized by scaling the number of decoding iterations [110, 111]. An exper-
imental implementation of such a scalable FEC scheme with 20.5% OH was presented
in [112].
Rate-adaptive FEC schemes for optical communication have also been studied previ-
ously [113, 114]. Variable-rate FEC design for optical communication has been realized
This chapter includes and expands on the work in [107].
Chapter 5. Low-Comp. Rate- and Channel-Config. Concat. Codes 63
by various approaches, including shortening [115–117], puncturing [118], and selective
use of code concatenation [91]. More recently, a concatenated polar-staircase code struc-
ture was proposed in [5], providing rate adaptability with near-continuous granularity by
varying the size of the polar-code frozen set. Rate-adaptive coded modulation schemes
have also been considered for the optical channel [119–121]. In combination with shaping,
rate-adaptability has been realized by an adjustable distribution matcher that performs
probabilistic constellation shaping [122, 123]. Experimental validation of rate-adaptive
FEC schemes for optical communication has been widely reported, e.g., in [123,124].
In this chapter, we propose a design approach for attaining low-complexity, multi-
rate, and channel-adaptive FEC schemes that can provide excellent coded-modulation
performance and are of practical relevance to optical communication. In the designed
concatenated FEC schemes, the transmission rate is configurable by signalling with vari-
ous modulation formats and by shortening or lengthening the inner code, and functioning
at various channel qualities is realized by scaling the inner-decoding operation. We re-
formulate the design tools reported in Chapter 4 with the aim of obtaining a configurable
FEC scheme with near-optimal decoding complexity at its various operating points.
We design a number of configurable FEC schemes flexible to operate at various trans-
mission rates and with various modulation formats compatible with the recent propos-
als [4,5]. Compared to the configurable FEC scheme of [5], the designs reported here can
provide up to 63% reduction in complexity while delivering a similar performance and
provide up to 0.6 dB coding gain when operating at a similar decoding complexity.
The FEC design approach advocated in this chapter can also be used to address
the need for multi-vendor interoperable modules in current and future standards [4].
Moreover, FEC flexibility towards modulation format, data rate, and the delivered coding
gain is a key feature that is necessary in future coherent optical networks [22, 125]. The
approach presented in this chapter can also be applied to FEC design for time-domain
hybrid modulation formats.
The rest of this chapter is organized as follows. In Sec. 5.2 we describe the concate-
nated code structure, the modulation formats we work with, and the MLC architecture
we incorporate in our design. In Sec. 5.3 we describe the inner-code, its parameterization,
and its configurable design. In Sec. 5.4 the inner-code optimization and construction are
explained. In Sec. 5.5 we present simulation results for the various FEC schemes that we
have designed using these optimization tools, characterize their trade-offs, and compare
them to the existing state of the art. In Sec. 5.6 we provide concluding remarks.
Figure 5.1: The encoder (a) and the decoder (b) in the configurable FEC scheme. Here, m = log2 M denotes the number of bits per PAM symbol.
5.2 Concatenated Code Description
We adopt a similar concatenated FEC structure to that in Chapter 4, Sec. 4.3. We
consider an inner, SD, rate- and channel-configurable LDPC code concatenated with a
high-rate outer HD zipper code. We use the zipper codes of [14, Table 1] as outer codes.
The inner code is concatenated with the outer code through an interleaver, π (see the
encoder and decoder of Fig. 5.1).
Similar to the designs in previous chapters, the task of the inner LDPC code is to
reduce the BER of the bits transferred to the outer code to below its threshold. When
setting a target BER on bits passed to the outer code, P_out^t, we leave a margin (of 7 to 10%) relative to the outer-code thresholds reported in [14], to enable a reduced
interleaver size between inner and outer codes and a practical realization of our designs
(see Section 5.5).
The concatenated FEC scheme works in conjunction with QAM schemes of various orders. For concreteness, we consider uniform rectangular M²-QAM, with M ∈ {2, 4, 8}, in our FEC schemes. It follows that the number of QAM symbols per frame is half the number of PAM symbols. Throughout this chapter we let m = log2 M denote
the number of bits mapped to each PAM symbol. We note, however, that the proposed
configurable FEC scheme can be designed to incorporate essentially any modulation
format.
When a binary modulation (m = 1) is considered, a concatenated FEC scheme similar
to that of [87] is assumed. When the concatenated FEC scheme is to work with a higher-
order modulation (m > 1), we assume that a multi-level coding and multi-stage decoding
structure similar to Sec. 4.3 is deployed. In this architecture, as shown in Fig. 5.1, only
the LSB is encoded by the inner code, and the MSBs are protected only by the outer
code. A similar constellation labelling to that of Sec. 4.3.1 is considered. At the receiver,
a hard-decision on the MSBs, taking into account the hard-decision on the inner-decoder
output bits and the channel information, is passed through the de-interleaver to the outer
decoder (see Fig. 5.1(b)). We assume that the inner decoder passes only hard-decided
bits to the outer decoder and the MSB demapper.
Throughout this chapter, we consider only unshaped (i.e., uniformly-distributed) sig-
nalling schemes with square QAM constellations. Probabilistic- and geometric constella-
tion shaping can provide a power advantage over uniform signalling, and can also provide
rate-configurability by adjusting the entropy of the shaping distribution [123]. It is well
known that the optimal shaping parameters, in both the probabilistic- and geometric-
shaping variants, depend on launch power, target transmission rate, and constellation
size [126]. On the other hand, implementation of shaping schemes adds to the encod-
ing and decoding complexity. We have not attempted in this work to characterize the
tradeoffs between redundancy, reliability and complexity when shaping schemes are in-
corporated; instead we prefer to think of shaping as an independent operation aimed at
narrowing the performance gap to the unconstrained Shannon limit, with its own at-
tendant tradeoffs between performance and complexity. If desired though, as we briefly
sketch in Section 5.4.2, shaping can be incorporated by certain adjustments to our design
procedure.
For the purposes of code design, we model the channel as an additive white Gaussian
noise channel. We optimize the configurable FEC scheme to operate with minimal devi-
ation relative to its reference complexities, i.e., the complexity of the codes individually
designed for its operating points. For the complexity scores, we use the measure described
in Sec. 3.2.3, that is obtained as η = ηin/Rout, where ηin is the inner-code complexity
score, and Rout is the outer-code rate.
5.3 Inner-Code Description
We consider a FEC scheme with J operating points. Throughout this chapter, we use subscript j to denote parameters specific to the j-th operating point, with j ∈ {0, . . . , J − 1}. The j-th operating point specifies the pair (R_in,j, SNR_j), i.e., the inner-code rate and the SNR at that operating point. Without loss of generality we assume R_in,j ≤ R_in,j+1 for j ∈ {0, . . . , J − 2}.

Similar to code designs in Chapters 3 and 4, we design ensembles of systematic LDPC
codes where we designate a particular subset of the VNs to be the information set,
while the remaining VNs form the parity set. We let Nj denote the number of VNs
in a particular Tanner graph of the j-th operating point drawn from the ensemble and
let Ni,j be the number of degree-i VNs in the graph. We divide the VNs into two
groups: information nodes and parity nodes, representing information bits and parity
bits, respectively. We let Kj denote the number of information nodes and let Ki,j be
the number of degree-i information nodes in the graph. The code rate, therefore, is
Rin,j = Kj/Nj.
We denote the VN-perspective degree distribution for the j-th operating point by L_j(x) = ∑_{i=0}^{D_v} L_{i,j} x^i, where L_{i,j} = N_{i,j}/N_j is the fraction of VNs that have degree i, and D_v denotes the maximum VN degree allowed in the ensemble. Note that we permit uncoded bits in the ensemble, i.e., we allow L_{0,j} ≥ 0. We define the edge-perspective VN degree distribution as λ_j(x) ≜ L′_j(x)/L′_j(1) = ∑_{i=1}^{D_v} λ_{i,j} x^{i−1}, where L′_j(x) = dL_j(x)/dx.

We consider a CN degree distribution that is concentrated on two consecutive degrees, namely d_{c_j} and d_{c_j} + 1, with d_{c_j} denoting the average CN degree. It is easy to see that, for i ∈ {1, . . . , D_v}, L_{i,j} = d_{c_j}(1 − R_in,j) λ_{i,j}/i. We let R_j(x) = ∑_{d=d_{c_j}}^{d_{c_j}+1} R_{d,j} x^d denote the node-perspective check-degree distribution, where R_{d,j} is the fraction of CNs of degree d, and we let ρ_j(x) ≜ R′_j(x)/R′_j(1) denote the edge-perspective CN degree distribution.

Let U_{i,j} = K_{i,j}/N_j be the share of degree-i information nodes among all VNs. Since all degree-zero VNs must be among the information nodes, we have U_{0,j} = L_{0,j}. Furthermore, U_{i,j} ≤ L_{i,j} for i ∈ {1, . . . , D_v}, and ∑_{i=0}^{D_v} U_{i,j} = R_in,j.
For j ∈ {0, . . . , J − 2}, we design the configurable inner code such that where R_in,j < R_in,j+1, the inner code associated with the j-th operating point is shortened from that of the (j + 1)-th operating point. Since the number of CNs remains the same when shortening an LDPC code, we necessarily have N_j − K_j = N_{j+1} − K_{j+1}, and therefore N_j(1 − R_in,j) = N_{j+1}(1 − R_in,j+1). However, K_{i,j} ≤ K_{i,j+1} (due to the shortening); thus, dividing K_{i,j} and K_{i,j+1} by N_j(1 − R_in,j) and N_{j+1}(1 − R_in,j+1), respectively, we get

U_{i,j}/(1 − R_in,j) ≤ U_{i,j+1}/(1 − R_in,j+1).

Similarly, since shortening preserves the degree-i parity nodes, we must have N_{i,j} − K_{i,j} = N_{i,j+1} − K_{i,j+1}, and therefore

(L_{i,j} − U_{i,j})/(1 − R_in,j) = (L_{i,j+1} − U_{i,j+1})/(1 − R_in,j+1).
We allow the possibility that R_in,j = R_in,j+1, i.e., L_i,j = L_i,j+1 and U_i,j = U_i,j+1, meaning the code structure stays the same, but with the possibility of a different number of
decoding iterations.
We let Λ_j = (λ_{1,j}, λ_{2,j}, . . . , λ_{D_v,j}) and U_j = (U_{0,j}, U_{1,j}, . . . , U_{D_v,j}). Further, we let Λ = (Λ_0, Λ_1, . . . , Λ_{J−1}) and U = (U_0, U_1, . . . , U_{J−1}). We refer to the pair (Λ, U) as the design parameters. The design parameters are used in the complexity-optimization program.
Let E_j denote the number of edges in a particular inner-code Tanner graph of the j-th operating point that are not connected to a degree-1 VN. Also, let I_j denote the maximum number of inner decoding iterations performed at the j-th operating point. The complexity score of the inner code at the j-th operating point, η_in,j, is then computed as

η_in,j = E_j I_j / ( N_j (m_j − 1 + R_in,j) ),   (5.1)

and measures the number of messages that are passed at the inner decoder per information bit transferred to the outer code at the j-th operating point. Here, m_j − 1 denotes the number of bits per in-phase and in-quadrature PAM symbol that bypass the inner code; those bits incur zero inner-decoding complexity, as accounted for in (5.1). It is easy to see that E_j/N_j = (1 − R_in,j)(d_{c_j} − ν_j), where ν_j is the average number of degree-one VNs connected to each CN in the Tanner graph. Therefore, η_in,j can be obtained as

η_in,j = (1 − R_in,j)(d_{c_j} − ν_j) I_j / (m_j − 1 + R_in,j).   (5.2)
5.4 Ensemble Optimization and Code Construction
5.4.1 Reference Complexities
The coded modulation scheme at any operating point is structured such that the inner code observes an output-symmetric channel (see [15, Def. 4.8]). Therefore, the BER and EXIT-function analyses used for code design in Sec. 4.3.3 remain applicable.
Let the uni-parametric EXIT function corresponding to the j-th operating point be denoted by f_j(p). Similar to Sec. 3.3.1, f_j(p) can be expressed in terms of elementary EXIT functions f_{i,j}(p) as f_j(p) = ∑_{i=1}^{D_v} λ_{i,j} f_{i,j}(p). The function f_{i,j}(p) outputs the probability of error in messages emitted from the degree-i VNs after one round of message-passing. Note that computation of f_{i,j}(p) depends only on the SNR and modulation format of the j-th operating point. Given the j-th operating-point SNR, d_{c_j}, and ν_j, we can pre-compute and store the f_{i,j}(p) values and use them in the ensemble optimization.
Given the elementary EXIT functions, the number of iterations I_j required by the inner code to take the VN message error probability from p_0,j, the channel BER, down to a target message error probability p_t,j can be approximated as [61]

I_j ≅ ∫_{p_t,j}^{p_0,j} dp / ( p log( p / f_j(p) ) ).

Therefore, from (5.2), the inner-code complexity at the j-th operating point, η_in,j, can be approximated as

η_in,j ≅ ( (1 − R_in,j)(d_{c_j} − ν_j) / (m_j − 1 + R_in,j) ) ∫_{p_t,j}^{p_0,j} dp / ( p log( p / f_j(p) ) ).
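The iteration estimate can be evaluated numerically. The sketch below uses a toy EXIT function f(p) = αp (illustrative only, not a designed ensemble's EXIT function), chosen because the integral then has the closed form log(p_0/p_t)/log(1/α), which serves as a check on the quadrature:

```python
import math

# Numerical evaluation of the iteration estimate
#   I ~= integral from p_t to p_0 of dp / ( p * log(p / f(p)) )
# for the toy EXIT function f(p) = alpha * p with alpha < 1.
def iters_estimate(f, p0, pt, steps=100_000):
    total, dp = 0.0, (p0 - pt) / steps
    for k in range(steps):
        p = pt + (k + 0.5) * dp          # midpoint rule
        total += dp / (p * math.log(p / f(p)))
    return total

alpha = 0.7
est = iters_estimate(lambda p: alpha * p, p0=0.1, pt=1e-3)
closed = math.log(0.1 / 1e-3) / math.log(1 / alpha)
print(round(est, 2), round(closed, 2))   # both 12.91
```

The slower f(p) pulls the error probability below p (i.e., the closer f_j(p) is to p), the larger the integrand, matching the intuition that a narrow decoding tunnel costs iterations.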
Let P_info,j and P_parity,j denote the information-set BER and the parity-set BER, respectively, after the target message error probability p_t,j is achieved. As explained in Sec. 4.3.3, P_info,j and P_parity,j can be computed from p_t,j and the elementary EXIT functions. Let P_out,j denote the BER on bits passed to the outer decoder. In terms of the ensemble parameters, P_out,j can be obtained as

P_out,j = a_j P_info,j + b_j P_parity,j + c_j,   (5.3)

where a_j, b_j, and c_j are independent of the inner-code design parameters and can be pre-computed as described in Sec. 4.3.3.

The complexity-optimized inner code specifically designed for the j-th operating point is obtained as described in Sec. 4.3.3 and by searching over a discrete set of values for d_c,j, ν_j, and p_t,j. We refer to the minimum achievable inner-code complexity score as the j-th reference complexity and denote it by η*_in,j.
5.4.2 Configurable Inner-Code Optimization
We aim at designing a configurable FEC scheme that maintains a close-to-optimal com-
plexity score at its operating points. While there are many possible ways to give a
precise meaning to “close-to-optimal,” in this chapter we use the relative deviation with
respect to the reference complexity as the cost associated with a configurable scheme
at an operating point. Once all the η∗in,j’s are obtained, the optimized configurable
inner-code ensemble is obtained as follows. We search over a discrete set of values for (d_c,0, d_c,1, . . . , d_c,J−1), (ν_0, ν_1, . . . , ν_J−1), and (p_t,0, p_t,1, . . . , p_t,J−1), and for each choice we solve the following optimization problem:

minimize over (Λ, U):

γ = max( η_in,0/η*_in,0 , η_in,1/η*_in,1 , . . . , η_in,J−1/η*_in,J−1 ),   (5.4)

subject to

U_{i,j}/(1 − R_in,j) ≤ U_{i,j+1}/(1 − R_in,j+1),   (5.5)

(L_{i,j} − U_{i,j})/(1 − R_in,j) = (L_{i,j+1} − U_{i,j+1})/(1 − R_in,j+1),   (5.6)

∑_{i=1}^{D_v} λ_{i,j}/i ≥ (1 − L_{0,j}) / ( d_c,j (1 − R_in,j) ),   (5.7)

∑_{i=1}^{D_v} λ_{i,j} = 1,   λ_{1,j} d_c,j = ν_j,   (5.8)

0 ≤ λ_{i,j}   ∀ i ∈ {1, . . . , D_v},   (5.9)

∑_{i=0}^{D_v} U_{i,j} = R_in,j,   U_{0,j} = L_{0,j},   (5.10)

0 ≤ U_{i,j} ≤ L_{i,j}   ∀ i ∈ {1, . . . , D_v},   (5.11)

f_j(p) < p   ∀ p ∈ [p_t,j, p_0,j],   (5.12)

a_j P_info,j + b_j P_parity,j + c_j ≤ P_out^t,   (5.13)

where constraints (5.5)–(5.6) should hold true for all j ∈ {0, . . . , J − 2} and constraints (5.7)–(5.13) should hold true for all j ∈ {0, . . . , J − 1}. We call γ the maximum complexity deviation ratio of a given ensemble set. The objective is to find the configurable inner-code ensemble set which minimizes γ; the resulting minimum is called γ*.
Note that constraints (5.5)–(5.6) ensure that the obtained codes have a compatible
structure. Constraints (5.7)–(5.12), similar to constraints (3.14)–(3.19), are the validity
constraints. Constraint (5.13) then ensures that the BER on bits passed to the outer
decoder is at or below the set target at each operating point.
The code optimization problem can be solved using the methods described in Sec. 4.3.3;
however, as the number of operating points we optimize for increases, it becomes infeasi-
ble to search over a sufficiently dense subset of the dc,j’s, νj’s, and pt,j discrete parameter
spaces. A practical differential-evolution-based method for obtaining the optimal config-
urable inner-code ensemble is described in the next section.
While not considered in this work, the optimization problem can be modified to incor-
porate probabilistic amplitude shaping, for m > 1, in a reverse concatenated architecture
as in [127]. A first modification would be to use a constellation labelling in which the LSB alternates between adjacent symbols and in which, given the LSB, the MSBs are Gray-labelled and indicate the signal magnitude. For example, the labelling Ω8 = (110, 011, 100, 001, 000, 101, 010, 111) could be used for the in-phase 8-PAM of a
64-QAM constellation. A distribution matcher can then perform probabilistic amplitude
shaping by adjusting the distribution of the MSBs. Of course the additional redundancy
and complexity introduced by the distribution matcher must be accounted for in the
overall OH and complexity score. Note that with this labelling, the LSB channel remains
output-symmetric and the EXIT function analysis used for code design remains valid.
5.4.3 Code Optimization Via Differential Evolution
We use a similar method as in Sec. 4.4.4 to characterize and eventually optimize the configurable FEC scheme. Although less rigorous than the method described in Sec. 5.4.2, this method obtains very similar optimized ensembles at a lower computational complexity.
For notational simplicity, here we limit the method to codes in which ν_j = 1 for all j. In fact, in most of the optimal codes we found, it turns out that ν_j = 1. Note, however, that the method presented here can easily be extended to the general case.
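For readers unfamiliar with differential evolution, the following minimal rand/1/bin loop on a toy min-max objective illustrates the search strategy. It is not the actual code-design objective, and it omits some standard refinements such as a forced crossover dimension:

```python
import random

# Minimal differential evolution (rand/1/bin) sketch on a toy objective.
def de(obj, bounds, pop_size=20, F=0.7, CR=0.9, gens=200, seed=1):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(*bounds[d]) for d in range(dim)] for _ in range(pop_size)]
    cost = [obj(x) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # mutate three distinct members, then crossover with the target
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            trial = [
                min(max(pop[a][d] + F * (pop[b][d] - pop[c][d]), bounds[d][0]),
                    bounds[d][1]) if rng.random() < CR else pop[i][d]
                for d in range(dim)
            ]
            tc = obj(trial)
            if tc <= cost[i]:            # greedy selection
                pop[i], cost[i] = trial, tc
    best = min(range(pop_size), key=lambda i: cost[i])
    return pop[best], cost[best]

# Toy surrogate of the min-max form of (5.4): minimize the worst of two
# convex "deviation ratios"; the optimum value is 1.0 at x = (1, 2).
obj = lambda x: max((x[0] - 1) ** 2 + 1, (x[1] - 2) ** 2 + 1)
x_best, c_best = de(obj, [(-5, 5), (-5, 5)])
print(round(c_best, 3))   # close to 1.0
```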
EXIT Function Analysis
Let the function Υ(σ) be defined as in equation (4.16). For the j-th operating point,
we let SNRj denote the equivalent binary-input additive white Gaussian noise surrogate
channel-SNR [99, Sec. IV-B2]. The corresponding channel log-likelihood ratio has a
symmetric Gaussian distribution with variance σ2j = 4SNRj.
As suggested in [128], in the $\ell$-th iteration, the message from a degree-$i$ VN, $I^{\ell}_{v_j \to c_j}(i)$, and the message from a degree-$d$ CN, $I^{\ell}_{c_j \to v_j}(d)$, for $i \in \{2, \ldots, D_v\}$ and $d \in \{d_{c_j}, d_{c_j}+1\}$, are obtained as
\[
I^{\ell}_{v_j \to c_j}(i) = \Upsilon\left(\sqrt{(i-1)\,\Upsilon^{-1}\big(I^{\ell-1}_{c_j \to v_j}\big)^2 + \sigma_j^2}\right),
\]
\[
I^{\ell}_{c_j \to v_j}(d) = 1 - \Upsilon\left(\Big((d-2)\,\Upsilon^{-1}\big(1 - I^{\ell}_{v_j \to c_j}\big)^2 + \Upsilon^{-1}\big(1 - \Upsilon(\sigma_j)\big)^2\Big)^{1/2}\right),
\]
where
\[
I^{\ell}_{v_j \to c_j} = \sum_{i=2}^{D_v} \frac{\lambda_{i,j}}{1-\lambda_{1,j}}\, I^{\ell}_{v_j \to c_j}(i), \qquad
I^{\ell}_{c_j \to v_j} = \sum_{d=d_{c_j}}^{d_{c_j}+1} \frac{\rho_{d,j} - R_{d,j}\lambda_{1,j}}{1-\lambda_{1,j}}\, I^{\ell}_{c_j \to v_j}(d).
\]
Here, we have excluded the edges that connect to degree-one VNs. Initially, we let $I^{0}_{v_j \to c_j} = \Upsilon(\sigma_j)$ and $I^{0}_{c_j \to v_j} = 0$. Furthermore, we use the approximations in [104, Eqs. (9),(10)] in computing $\Upsilon(\sigma)$ and $\Upsilon^{-1}(\sigma)$.
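For readers implementing this recursion, the sketch below uses one common closed-form approximation of the mutual-information function for a symmetric Gaussian LLR (the "J-function", with constants from Brännström et al.) as a stand-in for $\Upsilon$ and $\Upsilon^{-1}$; whether it coincides with the approximations of [104, Eqs. (9),(10)] is an assumption. It also implements the degree-$i$ VN update from the first recursion above.

```python
import math

# One common closed-form approximation of the mutual-information function
# Upsilon(sigma) for a symmetric Gaussian LLR (the "J-function"). The
# constants follow Brannstrom et al. and are an assumption here; the thesis
# uses the approximations of [104, Eqs. (9),(10)].
H1, H2, H3 = 0.3073, 0.8935, 1.1064

def J(sigma):
    if sigma <= 0.0:
        return 0.0
    return (1.0 - 2.0 ** (-H1 * sigma ** (2.0 * H2))) ** H3

def J_inv(I):
    if I <= 0.0:
        return 0.0
    return (-(1.0 / H1) * math.log2(1.0 - I ** (1.0 / H3))) ** (1.0 / (2.0 * H2))

def vn_update(i, I_cv, sigma_ch):
    """Degree-i VN message, as in the first recursion above:
    Upsilon(sqrt((i-1) * Upsilon^{-1}(I_cv)^2 + sigma_ch^2))."""
    return J(math.sqrt((i - 1) * J_inv(I_cv) ** 2 + sigma_ch ** 2))

# Example: degree-3 VN with incoming CN mutual information 0.5 at sigma_j = 2
I_vc = vn_update(3, 0.5, 2.0)
```

The approximation is constructed so that `J_inv` is the exact algebraic inverse of `J`, which keeps the fixed-point analysis numerically consistent.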
After $\ell$ iterations, the APP mutual information at the degree-$i$ VNs, $I^{\mathrm{APP},\ell}_j(i)$, for $i \in \{2, \ldots, D_v\}$, is obtained as
\[
I^{\mathrm{APP},\ell}_j(i) = \Upsilon\left(\sqrt{i\,\Upsilon^{-1}\big(I^{\ell}_{c_j \to v_j}\big)^2 + \sigma_j^2}\right). \tag{5.14}
\]
For $i = 1$, first we obtain the message from a degree-$d$ CN to the degree-1 VNs, for $d \in \{d_{c_j}, d_{c_j}+1\}$, as
\[
I^{\ell,1}_{c_j \to v_j}(d) = 1 - \Upsilon\left(\sqrt{(d-1)\,\Upsilon^{-1}\big(1 - I^{\ell}_{v_j \to c_j}\big)^2}\right).
\]
Then, the APP mutual information at the degree-1 VNs is obtained from (5.14), but with $I^{\ell}_{c_j \to v_j}$ replaced by $\tilde{I}^{\ell}_{c_j \to v_j}$, obtained as
\[
\tilde{I}^{\ell}_{c_j \to v_j} = \sum_{d=d_{c_j}}^{d_{c_j}+1} \rho_{d,j}\, I^{\ell,1}_{c_j \to v_j}(d).
\]
The BER on the degree-$i$ VNs, $\varepsilon^{\ell}_{i,j}$, is then obtained as
\[
\varepsilon^{\ell}_{i,j} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\sigma_{i,j}}{2\sqrt{2}}\right),
\]
where $\sigma_{i,j} = \Upsilon^{-1}\big(I^{\mathrm{APP},\ell}_j(i)\big)$ is obtained using (5.14), and $\mathrm{erfc}(x)$ is the standard complementary error function. The BER on the inner-code information bits, $P_{\mathrm{info},j}$, and parity bits, $P_{\mathrm{parity},j}$, are therefore obtained as
\[
P_{\mathrm{info},j} = \sum_{i=0}^{D_v} U_{i,j}\,\varepsilon_{i,j}, \tag{5.15}
\]
\[
P_{\mathrm{parity},j} = \varepsilon_{1,j}, \tag{5.16}
\]
where $\varepsilon_{0,j}$ is the bit-level channel BER. Note that for $\nu_j = 1$ we must have $U_{1,j} = 0$.
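As a sketch, the BER computation can be wired up as follows; the J-function approximation again stands in for $\Upsilon^{-1}$ (an assumption, since the thesis uses the approximations of [104]), and `inner_code_bers` evaluates (5.15)–(5.16) for given degree fractions and per-degree BERs.

```python
import math

# Assumed J-function approximation constants (stand-in for Upsilon^{-1}):
H1, H2, H3 = 0.3073, 0.8935, 1.1064

def J_inv(I):
    if I <= 0.0:
        return 0.0
    return (-(1.0 / H1) * math.log2(1.0 - I ** (1.0 / H3))) ** (1.0 / (2.0 * H2))

def ber_from_app_mi(I_app):
    """eps = 0.5 * erfc(sigma / (2*sqrt(2))) with sigma = Upsilon^{-1}(I_app)."""
    sigma = J_inv(I_app)
    return 0.5 * math.erfc(sigma / (2.0 * math.sqrt(2.0)))

def inner_code_bers(U, eps):
    """(5.15)-(5.16): U[i] is the fraction of degree-i information nodes
    (U[1] = 0 when nu_j = 1), eps[i] the per-degree BER, and eps[0] the
    bit-level channel BER."""
    P_info = sum(u * e for u, e in zip(U, eps))
    P_parity = eps[1]
    return P_info, P_parity
```

Note that `ber_from_app_mi(0.0)` returns 0.5, as expected for zero mutual information, and the BER decreases monotonically as the APP mutual information grows.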
Differential Evolution
We obtain the optimized configurable inner-code degree distributions following a differential evolution algorithm similar to that in [105, Ch. 3, Sec. 3.3]. Given the $J$ operating points and their reference inner-code complexities, i.e., the values for $(R_{\mathrm{in},j}, \mathrm{SNR}_j)$ and $\eta^{*}_{\mathrm{in},j}$, the differential evolution algorithm searches for the $J$ degree distributions that minimize the maximum complexity deviation ratio from the reference complexities, as in (5.4).
Before we describe the differential evolution operation, we define the helper function $\mathrm{config}(\mathbf{C})$ that generates $\mathbf{U}_{\mathbf{C}}$, a collection of valid distributions for the information bits of a configurable code, based on its input matrix $\mathbf{C}$. Let $\mathbf{C}(:,j)$ denote the $j$-th column of matrix $\mathbf{C}$. The $j$-th column of $\mathbf{U}_{\mathbf{C}}$, denoted by $\mathbf{U}_{\mathbf{C}}(:,j)$, is obtained iteratively, starting from $j = 0$, as the solution to the following quadratic program:
\[
\begin{aligned}
\underset{\mathbf{U}_{\mathbf{C}}(:,j)}{\text{minimize}} \quad & \|\mathbf{U}_{\mathbf{C}}(:,j) - \mathbf{C}(:,j)\|^2, \\
\text{subject to} \quad & \sum_{i=0}^{D_v} \mathbf{U}_{\mathbf{C}}(i,j) = R_{\mathrm{in},j}, \\
& \mathbf{U}_{\mathbf{C}}(1,j) = 0, \\
& \frac{\mathbf{U}_{\mathbf{C}}(i,j)}{1 - R_{\mathrm{in},j}} \geq \frac{\mathbf{U}_{\mathbf{C}}(i,j-1)}{1 - R_{\mathrm{in},j-1}},
\end{aligned}
\]
where we initialize $\mathbf{U}_{\mathbf{C}}(:,-1)$ to be the all-zero vector and $R_{\mathrm{in},-1} = 0$.
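Solving the projection itself requires a quadratic-program solver, but the constraint set is easy to state in code. The helper below is hypothetical (not the thesis implementation) and only checks whether a candidate collection satisfies the three constraint families, with $\mathbf{U}(:,-1)$ taken as the all-zero vector and $R_{\mathrm{in},-1} = 0$ as above.

```python
def is_valid_config(U, R_in, tol=1e-9):
    """Check the constraint set of the config() quadratic program for a
    candidate collection U, where U[j][i] is the fraction of degree-i
    information nodes at operating point j. Hypothetical helper: the
    actual projection additionally needs a QP solver."""
    prev_col = [0.0] * len(U[0])   # U(:,-1) is the all-zero vector
    prev_R = 0.0                   # R_in,-1 = 0
    for col, R in zip(U, R_in):
        if abs(sum(col) - R) > tol:          # sum_i U(i, j) = R_in,j
            return False
        if abs(col[1]) > tol:                # U(1, j) = 0
            return False
        for u_cur, u_prev in zip(col, prev_col):
            # U(i, j)/(1 - R_in,j) >= U(i, j-1)/(1 - R_in,j-1)
            if u_cur / (1.0 - R) < u_prev / (1.0 - prev_R) - tol:
                return False
        prev_col, prev_R = col, R
    return True

# Two hypothetical operating points with D_v = 3 (degrees 0..3):
U = [[0.4, 0.0, 0.3, 0.1],      # sums to R_in,0 = 0.8
     [0.45, 0.0, 0.33, 0.12]]   # sums to R_in,1 = 0.9
print(is_valid_config(U, [0.8, 0.9]))  # prints True
```

The monotonicity constraint encodes the shortening structure: each higher-rate code must contain the (rescaled) degree profile of the lower-rate code sampled before it.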
For a given collection of $J$ information-node degree distributions, $\{\mathbf{U}_j\}_{j=0}^{J-1}$, and a deviation $\gamma$, we first obtain the number of decoding iterations allowed for each operating point, $I_j$, according to (5.2), solving for $\eta_{\mathrm{in},j} = \gamma\,\eta^{*}_{\mathrm{in},j}$. A non-integral number of iterations is obtained by time-sharing between decoding with $\lfloor I_j \rfloor$ and with $\lceil I_j \rceil$ iterations. The score of $\{\mathbf{U}_j\}_{j=0}^{J-1}$ is then defined as the minimum $\gamma$ at which $P_{\mathrm{out}}$, as defined in (5.3) and calculated using (5.15)–(5.16), is below the target BER, $P^{t}_{\mathrm{out}}$, at all operating points.
The differential evolution search is performed on a $(D_v+1) \times J$ matrix $\mathbf{A}$ containing the $\{\mathbf{U}_j\}_{j=0}^{J-1}$ column vectors. Let $g$ denote the number of generations the differential evolution is carried out for and let $S$ denote the population size at each generation. Also, let $\beta > 0$ be an amplification factor and let $0 \leq \xi \leq 1$ denote a cross-over probability. On the population of each generation, the differential evolution carries out the following three steps, for $s \in \{1, \ldots, S\}$:

1. Generate a mutation $\mathbf{B}_s = \mathbf{A}_{i_1} + \beta(\mathbf{A}_{i_2} - \mathbf{A}_{i_3})$, where $i_1, i_2, i_3$ are chosen uniformly at random, without replacement, from the set $\{1, \ldots, S\} \setminus \{s\}$.

2. Generate a competitor matrix $\mathbf{C}_s$ whose element in row $i$ and column $j$, $i \in \{0, \ldots, D_v\}$ and $j \in \{0, \ldots, J-1\}$, is found as
\[
\mathbf{C}_s(i,j) = \begin{cases} \mathbf{B}_s(i,j) & \text{with probability } \xi, \\ \mathbf{A}_s(i,j) & \text{otherwise.} \end{cases}
\]
We then let $\mathbf{U}_{\mathbf{C}_s} = \mathrm{config}(\mathbf{C}_s)$.

3. Matrix $\mathbf{U}_{\mathbf{C}_s}$ then replaces matrix $\mathbf{A}_s$ in the next generation if and only if it has a better (i.e., lower) score.
The differential evolution search is initialized with matrices $\mathrm{config}(\mathbf{X}_1), \ldots, \mathrm{config}(\mathbf{X}_S)$, where the matrices $\mathbf{X}_1, \ldots, \mathbf{X}_S$ are generated at random. After carrying out differential evolution over $g$ generations, the algorithm outputs the matrix with the best score in the last generation, which determines the ensembles of the optimal configurable inner code.
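The three steps can be sketched as follows. The score function here is a toy quadratic objective standing in for the EXIT-based score, and the projection defaults to the identity instead of config(·); both substitutions are assumptions made to keep the sketch self-contained and runnable.

```python
import random

random.seed(7)  # reproducibility of the sketch

def differential_evolution(score, dim, S=20, g=60, beta=0.6, xi=0.6,
                           project=lambda v: v):
    """Toy DE loop mirroring the three steps: mutation, crossover, selection."""
    pop = [project([random.uniform(0.0, 1.0) for _ in range(dim)])
           for _ in range(S)]
    for _ in range(g):
        for s in range(S):
            # Step 1: mutation from three distinct other population members
            i1, i2, i3 = random.sample([i for i in range(S) if i != s], 3)
            mutant = [pop[i1][k] + beta * (pop[i2][k] - pop[i3][k])
                      for k in range(dim)]
            # Step 2: element-wise crossover with probability xi
            comp = project([m if random.random() < xi else x
                            for m, x in zip(mutant, pop[s])])
            # Step 3: keep the competitor iff it scores better (lower)
            if score(comp) < score(pop[s]):
                pop[s] = comp
    return min(pop, key=score)

# Toy quadratic score standing in for the EXIT-based score
target = [0.2, 0.5, 0.3]
score = lambda v: sum((a - b) ** 2 for a, b in zip(v, target))
best = differential_evolution(score, dim=3)
```

Replacing `score` with the EXIT-based evaluation and `project` with the config(·) projection recovers the structure of the optimization described above.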
5.4.4 Code Construction
We consider a QC structure for the inner code. It is well known that the QC structure
enables a hardware-friendly and energy-efficient decoder implementation [64] as required
in OTNs. Note that in solving (5.4), we obtain J optimized inner-code ensembles, ordered
in ascending rates, that describe a configurable FEC scheme.
To construct the QC inner code, we first construct a base graph in keeping with the obtained ensembles. We let $N^{b}_j$ and $E^{b}_j$ denote the number of VNs and edges of the $j$-th base graph, respectively, and let $M^{b}$ denote the number of CNs of the base graph. We start by sampling a Tanner graph with $N^{b}_1$ VNs and $M^{b}$ CNs, corresponding to the ensemble of shortest length (lowest rate). Then, when $R_{\mathrm{in},j} < R_{\mathrm{in},j+1}$, the Tanner graph corresponding to the $(j+1)$-th operating point is obtained by: (a) adding $N^{b}_{j+1} - N^{b}_j$ VNs to the existing graph; (b) adding $E^{b}_{j+1} - E^{b}_j$ edges to the new VNs in accordance with the VN degree distribution of the $(j+1)$-th inner-code ensemble; and (c) adding $E^{b}_{j+1} - E^{b}_j$ sockets to the existing CNs of the graph and connecting them randomly to the new edges in accordance with the CN degree distribution of the $(j+1)$-th inner-code ensemble. Should the sampled base matrix not have a full-rank sub-matrix in the designated parity positions, we discard it and start over.
Once the base graph is obtained, we lift its corresponding matrix to obtain a QC
parity-check matrix of large girth for the inner code. Note that the obtained code can be
configured to work at multiple operating points: its rate can be configured by activating or
deactivating certain parts of the graph, and its operational complexity can be configured
by varying the number of decoding iterations (see the encoder and decoder of Fig. 5.1).
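Lifting a base matrix to a QC parity-check matrix amounts to replacing each base-graph edge by a circulant. A minimal sketch, with shift values chosen at random rather than optimized for girth (a simplification; a girth-8 design would search over the shifts):

```python
import random

random.seed(0)

def lift_base_matrix(B, Z):
    """Expand a binary base matrix B by a lifting factor Z: each 1 becomes
    a Z x Z cyclically shifted identity block, each 0 an all-zero block.
    Shifts are random here; a girth-aware design would select them carefully."""
    rows, cols = len(B), len(B[0])
    H = [[0] * (cols * Z) for _ in range(rows * Z)]
    for r in range(rows):
        for c in range(cols):
            if B[r][c]:
                shift = random.randrange(Z)
                for k in range(Z):
                    H[r * Z + k][c * Z + (k + shift) % Z] = 1
    return H

# Hypothetical 2 x 4 base matrix, lifted by a factor of 4
B = [[1, 1, 0, 1],
     [0, 1, 1, 1]]
H = lift_base_matrix(B, Z=4)
```

The lifted matrix inherits the base graph's degree distribution exactly: every row (column) of `H` has the same weight as its base row (column), which is why the ensemble optimization carries over to the QC construction.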
5.5 Results
In this section we apply the tools described above to design configurable codes that
can operate at various operational rates and complexities and with various modulation
formats. We characterize the performance of the designed concatenated FEC schemes
by their SNR gap, in dB, to the CSL, obtained from (1.1), at each operating point. We
obtain the complexity score of the operating points according to (5.2). For the designed
configurable FEC schemes, we also report γ∗, the optimal complexity deviation ratio
from the reference complexities.
We use the rate-0.97 and the rate-0.98 zipper codes from [14, Table 1] as the outer code and we set conservative BER targets of $1.7 \times 10^{-3}$ and $1 \times 10^{-3}$ for them, respectively. We obtain the configurable inner-code ensembles using the method described in Appendix A. In keeping with the suggestion of [99, Sec. IV-E], we chose the following parameter values for the differential evolution algorithm: $S = 150$, $g = 100$, $\beta = 0.6$, and $\xi = 0.6$. We then sample base codes of small length, according to the obtained ensembles, and lift them to obtain QC inner codes of girth at least 8. In all the results presented, we assume floating-point sum-product message passing at the inner decoder.
In Fig. 5.2 we plot the performance of the configurable codes versus the FEC rates.
Here, each mark denotes an operating point with a complexity score indicated by its
label; connected marks denote the operating points of a single configurable FEC scheme.
We have also reported γ∗ for each configurable FEC scheme. Two of these configurable
FEC schemes are described in detail in the examples below.
Example 1: an FEC scheme that uses a rate-0.97 outer zipper code and is configurable
to operate with 25% OH and 64-QAM at 1.18 dB gap to the CSL, with 25% OH and
16-QAM at 1 dB gap to the CSL, and with 20% OH and 4-QAM at 0.9 dB gap to the
CSL.

Figure 5.2: Designed configurable FEC schemes, denoted by the connected marks. Each mark is an operating point and its complexity score is indicated on its label.

We sampled base inner-codes of lengths 38, 57, and 142, respectively, from the
following optimized ensembles:
\[
\begin{aligned}
L_0(x) &= (20x + 5x^6 + 9x^7 + 4x^8)/38, \\
R_0(x) &= (15x^8 + 5x^9)/20, \\
L_1(x) &= (20x + 18x^4 + x^5 + 5x^6 + 9x^7 + 4x^8)/57, \\
R_1(x) &= (18x^{12} + 2x^{13})/20, \\
L_2(x) &= (20x + 67x^3 + 21x^4 + 5x^5 + 11x^6 + 12x^7 + 6x^8)/142, \\
R_2(x) &= (12x^{27} + 8x^{28})/20.
\end{aligned}
\]
Note that the code with VN degree distribution $L_0(x)$ is shortened from the code with VN degree distribution $L_1(x)$, which itself is shortened from the code with VN degree distribution $L_2(x)$. We then lifted the obtained base code by a factor of 493 to get a girth-8 QC code. At the designated operating points, the inner codes require 10, 11, and 12 iterations, respectively, to bring the BER on the bits passed to the outer code below $1.7 \times 10^{-3}$, which gives complexity scores of 15.1, 24.4, and 51.1, respectively. This configurable FEC scheme has $\gamma^* = 1.14$. It is worth noting from this example that the optimized ensembles do not always shorten the lowest-degree variable nodes, as is often the practice followed in conventional designs.
Figure 5.3: Performance-versus-complexity comparison between the designed configurable FEC schemes and those of [5]. The FEC rate, in bits per symbol, of each operating point is indicated on its label.

Example 2: an FEC scheme with 15% OH that uses a rate-0.98 outer zipper code and 16-QAM signalling, compatible with the 400ZR implementation agreement [4]. We
obtained an FEC scheme with configurable operational complexity to operate at various
gaps to the CSL. We sampled an inner code of length 95 from the following optimized
ensemble:
\[
\begin{aligned}
L(x) &= (21x + 5x^3 + 67x^4 + 2x^5)/95, \\
R(x) &= (x^{15} + 20x^{16})/21.
\end{aligned}
\]
Note that since the modulation format and the transmission rate are the same for all
operating points, the inner code structure also remains the same across the operating
points. We then lifted the obtained base code by a factor of 210 to get a girth-8 QC
code. With 8, 10, and 13 inner-decoding iterations, the FEC scheme operates at 1.1 dB,
1 dB, and 0.9 dB gap to the CSL, which gives complexity scores of 14.2, 17.7, and 23,
respectively. The inner code brings the BER on the bits passed to the outer code to
below 1× 10−3. This configurable FEC scheme has γ∗ = 1.05.
In Fig. 5.2 we also present a low-complexity configurable FEC scheme that uses 64-
QAM signalling and can operate with 20%, 15%, and 12.5% OH, all at 1.25 dB gap to
the CSL, with γ∗ = 1.12. Another scheme presented in Fig. 5.2 can use 64- and 16-QAM
constellations and can operate at 1 dB or 1.2 dB gap to the CSL, all with 20% OH, and
with γ∗ = 1.13.
When the concatenated FEC scheme is applied to a modulation format with $m$ bit-levels, $m-1$ levels are protected only by the outer code, thus incurring zero inner-decoding complexity. Therefore, as reported in Fig. 5.2, FEC schemes that work with a larger $m$ operate at a much lower complexity. Since, at the same symbol rate, the information throughput is larger with a larger $m$, keeping the per-information-bit complexity score in check is especially beneficial in realizing very large system throughputs.
In Fig. 5.3 we compare our design approach to the most recent results in the literature, i.e., those of [5]. Note that in this figure, for better visualization, we plot the FEC complexity score versus performance, and indicate the FEC rates on the labels. As can be seen in Fig. 5.3, a performance similar to that of the configurable codes of [5] can be obtained by our designed code at 63% less complexity. At a similar complexity, our designed code can provide up to 0.6 dB coding gain over the codes of [5].
5.6 Conclusion
In this chapter, we have proposed a design approach for configurable, complexity-optimized concatenated FEC schemes capable of operating at various transmission rates and channel conditions. We use an inner error-reducing LDPC code and an outer zipper code. We use a low-complexity MLC scheme in which we inner-encode only one bit-level. We minimize the estimated inner-decoding data-flow while realizing (a) rate-adaptivity by varying the modulation order and by varying the inner-code rate through shortening or lengthening, and (b) channel-adaptivity by varying the number of decoding iterations performed by the inner code. We design a number of configurable FEC schemes according to the most recent industry specifications and show that the designed codes have a superior performance-complexity trade-off relative to existing proposals.
We took the complexity of codes designed specifically for each operating point as the reference and, in designing the configurable codes, we minimized the relative deviation from the reference complexities. Alternatively, it might benefit the system throughput and its hardware implementation to minimize the complexity score of the scheme with the largest spectral efficiency, while also equalizing the relative complexity deviation, however defined, at the other operating points. While we acknowledge that there may be other viable objectives and formulations to consider in the configurable code design, we leave their investigation and their implementation implications as a topic of future study.
Chapter 6

Complexity-Optimized Non-Binary Coded Modulation for Four-Dimensional Constellations¹
6.1 Introduction
In coherent optical communication, we may use both polarizations and both quadratures of the electromagnetic field for data transmission. Therefore, it is sensible to consider all four degrees of freedom for the signal constellation. In fact, for dual-polarization (DP) optical communication with 4D modulation, an improved power efficiency was reported in [32, 129, 130].
Conventionally, two independent QAM constellations are used for DP optical communication. However, a denser arrangement of points can be realized in a 4D constellation. In particular, signalling with constellations drawn from the checkerboard lattice, $D_4$, has been shown to provide a packing gain over the conventional constellations [6, 131, 132]. Moreover, spherically-bounded $D_4$-based constellations [133], where the constellation comprises points within a 4D hypersphere instead of a hypercube, readily unlock an additional shaping gain.
Recently, a coded modulation scheme was proposed in [6, 132] that makes effective use of the $D_4$-based constellations, delivering a coding gain over the conventional DP quadrature amplitude modulation (DP-QAM) schemes. There, a concatenated MLC architecture is considered with two inner SD codes, each protecting a number of bit levels, and an outer staircase code. However, the two distinct SD codes used in the FEC scheme of [6] (a) are generic LDPC codes that incur high decoding complexity, (b) have to be successively decoded, incurring additional structural complexity, and (c) are not effective in delivering the shaping gain of the spherically-bounded 4D constellation.

¹This chapter is a joint work with Sebastian Stern and Felix Frey. It includes and expands on the work in [134].
In this chapter, we adopt the 4D signalling constellations of [6] and propose a concate-
nated coded modulation architecture carefully designed to deliver the inherent packing
and shaping gains of the D4-based constellations. We use a set partitioning approach to
construct a few bit-levels with minimal reliabilities, thereby maximizing the reliabilities
of the other bit-levels. We then use a low-complexity MLC scheme [79] in which we
inner-encode the less reliable bit-levels and protect the more reliable ones only by the
outer code. We modify the inner-code design approach of Sec. 4.3 to obtain non-binary SD LDPC codes with minimized decoding data-flow. Similar to Sec. 4.3, the inner code is tasked with reducing the BER of bits passed to the outer HD zipper code [14] to below its threshold, which enables it to take the BER further down, below $10^{-15}$ as required in optical communication.
We target a system designed for a transmission rate of 6.97 bits/symbol, compatible with the 400ZR implementation agreement [4], and assess the proposed scheme over the AWGN channel. Compared to the conventional BICM scheme and to the scheme of Sec. 4.3, gains of 0.75 dB and 0.62 dB, respectively, are reported for the proposed FEC architecture. Moreover, the proposed scheme can realize an additional shaping gain of 0.25 dB using the spherically-bounded 4D constellations.
The rest of this chapter is organized as follows. In Sec. 6.2 we describe the 4D
constellations considered in this chapter and show their capacity curves next to those of the
conventional QAM constellations. In Sec. 6.3 we describe the set-partitioning procedure
that we use in labelling the 4D constellation. In Sec. 6.4 we describe the coded modulation
structure. In Sec. 6.5 we describe the inner-code ensemble optimization procedure. In
Sec. 6.6 we present simulation results for the obtained scheme and compare them to the
existing designs. In Sec. 6.7 we provide concluding remarks.
6.2 Four-Dimensional Signal Constellations
The DP-QAM constellation induces a subset of the Lipschitz integers [131]. We denote this constellation by $\mathcal{L}_M$, where $M$ is the constellation cardinality. As described in [6, 132], with 4D signalling, the density of the constellation points packed in a hypersphere can be doubled, compared to the conventional DP-QAM constellations, without a decrease in minimum distance. The induced algebraic structure is isomorphic to the $D_4$ lattice and is known as the Hurwitz integers [131]. We denote the corresponding constellation by $\mathcal{H}_M$.

Figure 6.1: Constellation-constrained capacities in bit/symbol versus the SNR. The inset shows the 2D projection of the corresponding signal constellations.
We remark that the $D_4$ lattice is isomorphic to the lattice obtained by applying a single-parity-check code to the $\mathbb{Z}^4$ integer lattice. In this sense, the FEC scheme proposed in this chapter can be considered a triple concatenated code, deploying the $\mathbb{Z}^4$ signal constellation with a single-parity-check code closest to the channel.
In Fig. 6.1, we plot the constellation-constrained capacities, in bits per symbol, versus the SNR for the 4D 512-ary Hurwitz constellation ($\mathcal{H}_{512}$) and its DP-16QAM ($\mathcal{L}_{256}$) counterpart. The achievable packing gain of the Hurwitz constellation over DP-QAM is evident in this figure. Moreover, as proposed by G. Welti [133], the constellation capacity can be further increased by selecting a subset of Hurwitz integers that lie within a 4D hypersphere. The capacity of the resulting Welti constellation, denoted by $\mathcal{W}_{512}$, is also shown in Fig. 6.1. Note that the additional achievable shaping gain of the Welti constellation incurs no extra architectural complexity.
Also shown in Fig. 6.1 are the DP-32QAM and DP-64QAM constellation capacities. For the target transmission rate, there is virtually no difference between the constellation-constrained limit of DP-64QAM and that of $\mathcal{H}_{512}$. The $\mathcal{H}_{512}$ constellation, however, entails a simpler system implementation since it has fewer signalling points. Similarly, for the shaped constellations, while there is virtually no difference between the constellation-constrained limit of DP-32QAM and that of $\mathcal{W}_{512}$ at the target transmission rate, signalling with $\mathcal{W}_{512}$ is preferable because it has smaller cardinality.
Figure 6.2: Illustration of the first (left, $d^2_{\min} = 1$) and second (right, $d^2_{\min} = 2$) partitioning steps of the $D_4$-based constellations in one polarization.
We show a two-dimensional (2D) projection of the 4D signal constellations in the inset of Fig. 6.1 [6]. We assume the constellations are normalized to have $d^2_{\min} = 1$, where $d_{\min}$ is the minimum distance among the constellation points. The 2D projections of the $\mathcal{L}_{256}$, $\mathcal{H}_{512}$, and $\mathcal{W}_{512}$ constellation points are shown by blue crosses, red dots, and green circles, respectively, in the Fig. 6.1 inset.
6.3 Four-Dimensional Set-Partitioning
The $D_4$ lattice can be partitioned according to the following chain [77]:
\[
D_4 \to \mathbb{Z}^4 \to \sqrt{2}\,D_4 \to \sqrt{2}\,\mathbb{Z}^4 \to 2D_4 \to \cdots.
\]
This chain describes the lattice isomorphisms obtained by the set-partitioning of the $D_4$ lattice. Note that the Hurwitz and Welti constellations are subsets of the $D_4$ lattice.

Starting from a normalized $D_4$-based constellation, we illustrate the corresponding partitioning chain in Fig. 6.2. The first partitioning step decomposes the constellation into two 256-ary subsets, each a subset of the $\mathbb{Z}^4$ lattice. This is shown on the left of Fig. 6.2, with triangles and squares as constellation points. Note that we still have $d^2_{\min} = 1$ after the first step. The second step decomposes the sub-constellations into two 128-ary subsets, each a subset of the $\sqrt{2}\,D_4$ lattice. This is shown on the right of Fig. 6.2, with hollow and solid triangles as sub-constellation points. Note, however, that we get $d^2_{\min} = 2$ after the second step. It is easy to see that with every other step, the squared minimum distance among the resulting constellation points doubles.
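The distance-doubling pattern of the chain can be verified numerically. The scaled-rotation matrix R below, with R² = 2I, is one assumed way of realizing the chain up to isometry; the lattices are left unnormalized, so $D_4$ starts at $d^2_{\min} = 2$ and the normalized sequence 1, 1, 2, 2, 4 follows by scaling.

```python
from itertools import product

def apply(M, x):
    """Apply a 4x4 integer matrix to a 4-vector."""
    return tuple(sum(M[i][k] * x[k] for k in range(4)) for i in range(4))

def min_sq_dist(points):
    """Minimum nonzero squared norm = minimum squared distance of a lattice
    point set (lattices are closed under differences)."""
    return min(sum(c * c for c in p) for p in points if any(p))

# D4: integer 4-tuples with even coordinate sum (unnormalized, d^2_min = 2)
D4 = [p for p in product(range(-2, 3), repeat=4) if sum(p) % 2 == 0]

# One assumed scaled-rotation matrix R with R^2 = 2I realizing the chain:
R = [[1, 1, 0, 0], [1, -1, 0, 0], [0, 0, 1, 1], [0, 0, 1, -1]]

Z4_like = [apply(R, p) for p in product(range(-2, 3), repeat=4)]  # "Z4" step
sqrt2_D4 = [apply(R, p) for p in D4]
sqrt2_Z4 = [apply(R, p) for p in Z4_like]     # = 2 * Z^4
two_D4 = [tuple(2 * c for c in p) for p in D4]

dists = [min_sq_dist(S) for S in (D4, Z4_like, sqrt2_D4, sqrt2_Z4, two_D4)]
print(dists)  # [2, 2, 4, 4, 8]: doubles every other step
```

Each level of the chain halves the number of points while the squared minimum distance doubles only at every second step, which is exactly the behaviour exploited in the set-partitioned labelling.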
6.4 Concatenated Non-Binary FEC Architecture
The proposed concatenated FEC architecture is shown in Fig. 6.3. Here, the outer encoder (ENCout) bits are parallelized into two streams: one going straight to the 4D constellation mapper $\mathcal{M}$, and the other first going to the inner encoder (ENCin).

In one channel use, the constellation mapper $\mathcal{M}$ takes in $b_c$ inner-encoded bits and $b_u$ bypass bits to generate a symbol of the 4D constellation. The inner-encoded bits are assigned to the less reliable bit channels and the bypass bits are assigned to the more reliable bit channels. We model the linear regime of the optical fiber by transmission over a DP AWGN channel.

At the receiver, soft demapping is performed on the first $b_c$ levels, the result of which is passed to the inner decoder (DECin). An HD on the inner-decoded symbols is then used, along with an HD on the channel observations, to demap the other $b_u$ bit levels. The inner-code information bits and the demapped bypass bits are then passed to the outer HD decoder (DECout) via a parallel-to-serial converter.
We use a non-binary LDPC code as the inner code. Theoretically, with a non-binary
MLC approach the 4D constellation capacity is achievable. This, however, typically
comes at the expense of high decoding complexity of the non-binary codes [135, 136].
In this chapter, we modify the approach in Sec. 4.3 to obtain non-binary LDPC codes designed to minimize the decoding data flow. Furthermore, as in Sec. 4.3, we only inner-encode a few bit levels, hence keeping the inner-code alphabet size small. While a number of sub-optimal decoding algorithms aimed at lowering the decoding complexity of non-binary LDPC codes have been proposed [136–138], in this work we only consider the conventional message-passing algorithm at the inner LDPC decoder [135, 136].
We target an FEC scheme compatible with the 400ZR implementation agreement [4]. There, DP-16QAM is deployed with an inner rate-119/128 code, encoding all bit-levels, concatenated to an outer rate-239/255 code, for a combined transmission rate of 6.97 bit/symbol.

We consider signalling with the $\mathcal{H}_{512}$ and $\mathcal{W}_{512}$ constellations. With these 512-ary $D_4$-based constellations, we use a non-binary inner code of rate $R_{\mathrm{in}} = 2.187/4$, encoding $b_c = 4$ bit-levels, concatenated to an outer zipper code of rate $R_{\mathrm{out}} = 0.97$, providing the same overall target rate of 6.97 bit/symbol.

For comparison, we also consider a similar architecture to be used with the $\mathcal{L}_{256}$ constellation (DP-16QAM). Here, an inner code of rate $R_{\mathrm{in}} = 2.187/3$ encodes $b_c = 3$ bit levels and the other $b_u = 5$ bit levels are protected only by the outer code. We use the same outer-code rate with this scheme as before.
Figure 6.3: The proposed concatenated FEC architecture for DP transmission over the AWGN channel. The alphabet field sizes are denoted below their corresponding stages.
As shown in Fig. 6.1, at the target rate of 6.97 bit/symbol, a packing gain of 0.74 dB and an additional shaping gain of 0.24 dB are achievable by the $\mathcal{H}_{512}$ and $\mathcal{W}_{512}$ constellations, respectively, over the $\mathcal{L}_{256}$ constellation.
6.5 Ensemble Optimization
6.5.1 Empirical Density Evolution
We use a parameterization for the inner code similar to that in Sec. 3.2.2. Here, we consider ensembles with $\nu = 1$ that have no uncoded symbols. In such ensembles, all parity nodes are of degree 1 and each CN is connected to exactly one degree-1 VN. The edge-perspective VN degree distribution that includes only edges connected to the information nodes is then obtained as
\[
\lambda_{\mathrm{info}}(x) = \frac{\lambda(x) - \lambda_1}{1 - \lambda_1}.
\]
Similarly, the edge-perspective CN degree distribution that includes only edges connected to the information nodes is obtained as
\[
\rho_{\mathrm{info}}(x) = \frac{R'_{\mathrm{info}}(x)}{R'_{\mathrm{info}}(1)},
\]
where $R_{\mathrm{info}}(x) = R(x)/x$.
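As a sketch, both restricted distributions can be computed from coefficient lists; here `lam[k]` is the coefficient of $x^k$ in $\lambda(x)$ (so `lam[0]` $= \lambda_1$) and `Rcoef[d]` is the coefficient of $x^d$ in $R(x)$. The list-based representation and the example distributions are assumptions of this sketch.

```python
def lambda_info(lam):
    """lam[k] = coefficient of x^k in lambda(x), so lam[0] = lambda_1 is the
    fraction of edges incident to degree-1 (parity) VNs. Returns the
    coefficients of lambda_info(x) = (lambda(x) - lambda_1) / (1 - lambda_1)."""
    lam1 = lam[0]
    return [0.0] + [c / (1.0 - lam1) for c in lam[1:]]

def rho_info(Rcoef):
    """Rcoef[d] = coefficient of x^d in the node-perspective CN distribution
    R(x). With R_info(x) = R(x)/x, returns the coefficients of
    rho_info(x) = R'_info(x) / R'_info(1)."""
    Rinfo = Rcoef[1:]                                   # divide by x
    deriv = [k * c for k, c in enumerate(Rinfo)][1:]    # differentiate
    total = sum(deriv)                                  # R'_info(1)
    return [c / total for c in deriv]

# Hypothetical example: lambda(x) = 0.2 + 0.5 x^2 + 0.3 x^3,
# R(x) = 0.75 x^8 + 0.25 x^9
lam_i = lambda_info([0.2, 0.0, 0.5, 0.3])
rho_i = rho_info([0.0] * 8 + [0.75, 0.25])
```

Both outputs are valid edge-perspective distributions: their coefficients are non-negative and sum to one.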
Instead of an EXIT function analysis, here we use an empirical density evolution approach to analyze the ensemble. A similar approach has been used in [139, 140]. With this approach, we aim at understanding the belief propagation decoding of non-binary LDPC codes under the flooding schedule. Note that the non-binary LDPC inner code operates over a Galois field of size $2^{b_c}$. We use both $\mathbb{F}_{2^{b_c}}$ and the set $\{\alpha_0, \alpha_1, \ldots, \alpha_{2^{b_c}-1}\}$ to denote the code alphabet.
In describing the density evolution, we represent LLR messages as $2^{b_c}$-dimensional vectors. We consider the passing of messages on $n$ edges at each iteration of the empirical density evolution. Here, $n$ allows for a trade-off between accuracy and computational complexity of the density evolution. Note that we only consider messages to or from the information nodes. We assign a value $\alpha \in \{\alpha_0, \alpha_1, \ldots, \alpha_{2^{b_c}-1}\}$ to each edge and obtain its channel sample accordingly. The message on such an edge, $L_\alpha(m)$, is obtained as
\[
L_\alpha(m) = \left(\ln\frac{P(m = \alpha_0)}{P(m = \alpha)},\; \ln\frac{P(m = \alpha_1)}{P(m = \alpha)},\; \ldots,\; \ln\frac{P(m = \alpha_{2^{b_c}-1})}{P(m = \alpha)}\right). \tag{6.1}
\]
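A direct transcription of (6.1), for a hypothetical probability vector over a size-4 field:

```python
import math

def llr_vector(probs, assigned):
    """Eq. (6.1): the LLR vector of a message m over a field of size q,
    with each entry normalized by the probability of the assigned value."""
    p_ref = probs[assigned]
    return [math.log(p / p_ref) for p in probs]

# Hypothetical q = 4 example; the entry at the assigned index is always 0
L = llr_vector([0.1, 0.6, 0.2, 0.1], assigned=1)
```

Normalizing by the assigned value's probability (rather than a fixed reference symbol) is what keeps the reference probability large and the logarithms well-conditioned, as discussed below.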
We generate the $n$ CN messages at each iteration according to $\rho_{\mathrm{info}}(x)$. A degree-$d_c$ CN first samples $d_c - 2$ messages at random from those coming from the VNs. Each of those messages is then permuted by a random non-zero element of the field $\mathbb{F}_{2^{b_c}}$. This permutation imitates the function of a non-zero entry in the parity-check matrix. See [141, Sec. 2.2] for how the elements of a message are permuted by an element of the field. We also modify the assigned value of the message according to multiplication in the field arithmetic. The CN also samples, and permutes by a random non-zero element of the field, a channel observation as the message coming from its (degree-1) parity node.

The CN then transforms the sampled messages into the Fourier domain and takes the inverse Fourier transform of the dot-product of the resulting vectors to obtain the outgoing message vector. See [141, Sec. 2.2] for message computation details. The assigned value of the outgoing message is also computed according to the summation rule in the field arithmetic. Note that without loss of generality, we may assume that the outgoing message is not permuted by a field element. Initially, we set the messages from CNs to all-zero vectors, with assigned values chosen uniformly at random from the alphabet.
We generate $n$ VN messages at each iteration according to $\lambda_{\mathrm{info}}(x)$. A degree-$d_v$ VN first picks the value assigned to the outgoing message at random and samples a channel observation accordingly. Then, it samples $d_v - 1$ messages at random among those CN messages with the same assigned value. It adds up the sampled messages and the channel sample to generate the outgoing message vector.
We remark that by keeping track of the assigned values of the messages we also prevent numerical issues in the empirical density evolution. In particular, in (6.1) we compute the LLR values by normalizing the probabilities by that of the assigned value. This probability is expected to be sufficiently large so as to avert any numerical issues. In fact, in the process of obtaining the ensembles presented in the next section, we did not observe any numerical instability. This process is explained next.
6.5.2 BER Analysis
After the last iteration, we generate the a posteriori LLR vectors at the information nodes according to the degree distribution of the information nodes. These vectors are generated similarly to the VN messages, but here a degree-$d_v$ VN picks $d_v$ messages at random, among those CN messages with the same true value, to obtain its a posteriori vector. The maximum element of the vector then denotes the VN decoded value. For the a posteriori vectors of the parity nodes, the CN messages passed to the degree-1 VNs have to be considered. To this end, a degree-$d_c$ CN has to pick $d_c - 1$ messages from the information nodes to perform the CN operation. A parity node then uses a sample of these "special" CN messages along with a channel observation sample to obtain its a posteriori vector.
Let $P_{\mathrm{info}}(\alpha_i, \alpha_j)$ be the probability that an information node's true value is $\alpha_i$ while its decoded value is $\alpha_j$, $i, j \in \{0, 1, \ldots, 2^{b_c}-1\}$. Similarly, let $P_{\mathrm{parity}}(\alpha_i, \alpha_j)$ be the probability that a parity node's true value is $\alpha_i$ while its decoded value is $\alpha_j$. Further, we let $P^{\mathrm{MSB}}_{\mathrm{info}}(\alpha_i, \alpha_j)$ and $P^{\mathrm{MSB}}_{\mathrm{parity}}(\alpha_i, \alpha_j)$ be the BER in demapping the MSB using an inner-encoded information node and parity node, respectively, that has true value $\alpha_i$ and decoded value $\alpha_j$. We empirically estimate the values of $P_{\mathrm{info}}(\alpha_i, \alpha_j)$, $P_{\mathrm{parity}}(\alpha_i, \alpha_j)$, $P^{\mathrm{MSB}}_{\mathrm{info}}(\alpha_i, \alpha_j)$, and $P^{\mathrm{MSB}}_{\mathrm{parity}}(\alpha_i, \alpha_j)$ using the true values and maximum a posteriori decoding of the VNs as described above.
The average BER in demapping the MSBs, denoted by $P^{\mathrm{MSB}}$, is obtained as
\[
P^{\mathrm{MSB}} = R_{\mathrm{in}} \sum_{i,j} P_{\mathrm{info}}(\alpha_i, \alpha_j)\, P^{\mathrm{MSB}}_{\mathrm{info}}(\alpha_i, \alpha_j) + (1 - R_{\mathrm{in}}) \sum_{i,j} P_{\mathrm{parity}}(\alpha_i, \alpha_j)\, P^{\mathrm{MSB}}_{\mathrm{parity}}(\alpha_i, \alpha_j).
\]
The average BER on the inner-decoded bits that are passed to the outer code is obtained as
\[
P_{\mathrm{info}} = \frac{1}{b_c} \sum_{i,j} P_{\mathrm{info}}(\alpha_i, \alpha_j)\, B(\alpha_i, \alpha_j),
\]
where $B(\alpha_i, \alpha_j)$ is the Hamming distance between the $b_c$ LSBs of the labels assigned to $\alpha_i$ and $\alpha_j$ by the mapping $\mathcal{M}$. Finally, $P_{\mathrm{out}}$, the BER on the bits passed to the outer decoder, is obtained as
\[
P_{\mathrm{out}} = \frac{b_u}{b_u + R_{\mathrm{in}} b_c}\, P^{\mathrm{MSB}} + \frac{R_{\mathrm{in}} b_c}{b_u + R_{\mathrm{in}} b_c}\, P_{\mathrm{info}}. \tag{6.2}
\]
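Equation (6.2) is a straightforward weighted average. A minimal sketch, with the 512-ary parameters $b_c = 4$, $b_u = 5$, and $R_{\mathrm{in}} = 2.187/4$ from Sec. 6.4 and hypothetical component BERs:

```python
def p_out(P_msb, P_info, b_u, b_c, R_in):
    """Eq. (6.2): BER on the bits passed to the outer decoder, a weighted
    average of the bypass-bit BER (P_msb) and the inner-decoded-bit BER
    (P_info), weighted by the respective bit counts per channel use."""
    denom = b_u + R_in * b_c
    return (b_u / denom) * P_msb + (R_in * b_c / denom) * P_info

# 512-ary D4-based setup of Sec. 6.4 (b_c = 4, b_u = 5, R_in = 2.187/4),
# with hypothetical component BERs
example = p_out(P_msb=1e-4, P_info=5e-4, b_u=5, b_c=4, R_in=2.187 / 4)
```

The weights reflect that each channel use carries $b_u$ bypass bits and $R_{\mathrm{in}} b_c$ inner-code information bits toward the outer decoder.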
6.5.3 Differential Evolution
We use a method based on differential evolution, similar to those in Sec. 4.4.4 and
Sec. 5.4.3, to optimize the inner-code ensemble. Here, we fix a maximum number of
iterations and aim at finding an inner-code ensemble that can operate at the lowest
channel SNR. Note that we only search among ensembles with ν = 1 that have no uncoded
symbols.
Similar to Sec. 5.4.3, we define the vector Ux to be the valid information node degree
distribution obtained based on vector x. Vector Ux is obtained by the solution of the
following quadratic program:
\[
\begin{aligned}
\underset{U_x}{\text{minimize}} \quad & \| U_x - x \|^2, \\
\text{subject to} \quad & \sum_{i=0}^{D_v} U_x(i) = R_{\mathrm{in}}, \\
& U_x(1) = 0, \\
& U_x(i) \ge 0 \quad \forall i \in \{1, \ldots, D_v\}.
\end{aligned}
\]
Given an information-node degree distribution vector U, the score of U is defined as
the minimum SNR at which Pout, as defined in (6.2), is below the target BER, P^t_out.
The differential evolution search is performed on a size-Dv vector x. Let the differential-
evolution parameters g, S, β and ξ be defined as in Sec. 5.4.3. On the population
of each generation, the differential evolution carries out the following three steps, for
s ∈ {1, . . . , S}:
1. Generate a mutant vector ys = xi1 + β(xi2 − xi3), where i1, i2, i3 are chosen uniformly
at random, without replacement, from the set {1, . . . , S} \ {s}.

2. Generate a competitor vector zs whose i-th element, i ∈ {1, . . . , Dv}, is found as

\[
z_s(i) =
\begin{cases}
y_s(i) & \text{with probability } \xi, \\
x_s(i) & \text{otherwise.}
\end{cases}
\]

3. Vector U_{z_s} then replaces xs in the next generation if and only if it has a better
(i.e., lower) score.
The differential evolution search is initialized with vectors U_{x_1}, . . . , U_{x_S}, where vectors
x_1, . . . , x_S are generated at random. After carrying out differential evolution over g genera-
tions, the algorithm outputs the vector with the best score in the last generation,
which determines the ensemble of the optimized inner code.
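The three steps above can be sketched as a small differential-evolution loop. This is an illustrative stand-in rather than the thesis implementation: the quadratic-program projection onto valid degree distributions is replaced by a simple clip-and-normalize map, the score function is left abstract (in the text it requires a Monte Carlo BER estimate), and all names are ours.

```python
import random

def differential_evolution(score, Dv, S=20, g=50, beta=0.5, xi=0.9,
                           project=None, seed=0):
    """Minimize `score` over length-Dv vectors by differential evolution,
    following steps 1-3 of Sec. 6.5.3.  `project` maps a raw vector to a
    valid degree distribution; the default clip-and-normalize map is a
    simplified stand-in for the quadratic program of the text."""
    rng = random.Random(seed)
    if project is None:
        def project(x):
            u = [max(v, 0.0) for v in x]
            s = sum(u) or 1.0
            return [v / s for v in u]
    pop = [project([rng.random() for _ in range(Dv)]) for _ in range(S)]
    for _ in range(g):
        for s in range(S):
            # Step 1: mutant vector from three distinct other members.
            i1, i2, i3 = rng.sample([i for i in range(S) if i != s], 3)
            mutant = [pop[i1][k] + beta * (pop[i2][k] - pop[i3][k])
                      for k in range(Dv)]
            # Step 2: competitor mixes mutant and current member.
            trial = [mutant[k] if rng.random() < xi else pop[s][k]
                     for k in range(Dv)]
            trial = project(trial)
            # Step 3: replace only on a strictly better (lower) score.
            if score(trial) < score(pop[s]):
                pop[s] = trial
    return min(pop, key=score)
```

With a simple quadratic score the loop converges quickly; in the code-design setting each score evaluation is a full BER simulation, which dominates the runtime.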
Chapter 6. Comp.-Optimized Non-Binary Coded Modulation for 4D Constel.87
Table 6.1: Bit-level capacities of the 4D constellations at their respective operating points.

Level  Labelling         d²min  L256   H512   W512
8      Pseudo-Gray       4      0.999  0.998  0.998
7                               0.999  0.997  0.997
6                               0.999  0.995  0.996
5                               0.998  0.994  0.994
4                               0.998  0.992  0.992
3      Set Partitioning  2      0.904  0.817  0.830
2                        2      0.857  0.746  0.762
1                        1      0.433  0.279  0.359
0                               −      0.369  0.258

Rin (levels 4–8 uncoded): 2.187/bc
6.6 Results
As mentioned in Sec. 6.4, we consider a rate-0.97 zipper code as the outer code. We set
a conservative BER threshold of 1.7 × 10⁻³ for the outer code to enable a practical
realization of our designs. We consider a frame length of 6000 4D symbols in the various
code designs described below. This ensures the same number of information bits and noise
samples per frame in the FEC schemes.
We adopt a QC structure for the inner codes. We sample a binary base matrix
according to the obtained ensembles. We then lift the base matrix to obtain a binary
parity-check matrix of girth at least 8. Then, we replace the non-zero entries with
random non-zero elements of the field and obtain the parity-check matrix for the inner
code. We assume floating-point sum-product message passing. All non-binary codes
reported here are optimized for and simulated with 10 inner-decoding iterations.
We obtain the FEC schemes with complexity-optimized non-binary inner codes for the
H512, W512, and L256 constellations. The FEC architecture we use with these constella-
tions is detailed in Sec. 6.4. For these constellations, we report the corresponding bit-level
capacities at their eventual operating points in Table 6.1. We use the set-partitioning
chain described in Sec. 6.3 to label the first few bit channels. We see, for example, that
after 4 steps, the remaining bit-levels of the H512 constellation are highly reliable. We
use a pseudo-Gray labelling [142] on these remaining bit-levels. We use a similar method
in labelling the W512 and L256 constellations.
We compare our designs to other binary concatenated FEC schemes. We obtain
an MLC-based scheme with the L256 constellation, with the labelling and the optimized inner
LDPC code as described in Sec. 4.3. We also consider the BICM-based scheme with L256
constellation and generic LDPC inner code as described in [6], and the two-stage BICM-
based scheme with H512 constellation and two generic inner LDPC codes also described
Figure 6.4: The BER on bits passed to the outer code versus Es/N0 (dB), for the designed H512, W512, and L256 schemes, the H512 TS-BICM scheme, the L256 1D-MLC scheme, and the L256 BICM scheme. The constellation capacities are indicated by the vertical lines; the horizontal line denotes the outer-code threshold. Solid curves denote the non-binary designs, and the dotted and dashed curves denote their binary counterparts: TS-BICM indicates the performance of the two-stage BICM-based scheme of [6], and 1D-MLC indicates the performance of the scheme of Sec. 4.3.
in [6].
In Fig. 6.4 we plot the BER on bits passed to the outer code versus the SNR for the
obtained FEC schemes. With the L256 constellation, the designed non-binary scheme
performs as well as the binary scheme designed as described in Sec. 4.3, both having
a gain of about 0.15 dB over the BICM scheme of [6]. With the H512 constellation,
the designed non-binary scheme performs better than the two-stage BICM-based scheme
of [6] and provides a 0.62 dB gain over the designed non-binary scheme for the L256
constellation. With the W512 constellation, the designed scheme achieves an additional
gain of about 0.25 dB over the scheme with H512, and provides a total gain of around
1 dB over the conventional BICM scheme of [6].
6.7 Conclusion
In this chapter, we have proposed a concatenated FEC scheme, consisting of an inner
complexity-optimized non-binary LDPC code and an outer zipper code, specifically de-
signed to take advantage of the 4D signalling in optical communication. We consider
signalling with D4-based constellations, i.e., Hurwitz and Welti constellations, which can
pack a higher density of points per unit volume. We consider an MLC architecture
in the FEC design in which the non-binary inner code protects only a few bit
levels while the outer code cleans up the residual errors. We obtain concatenated codes
that can deliver on the packing and shaping gains of the 4D constellation, achieving a
total gain of 1 dB compared to the conventional designs.
The decoding complexity of non-binary LDPC codes grows with their alphabet size.
While in the codes we designed we have kept the inner-code alphabet size small, the
obtained non-binary codes incur a higher decoding complexity than their binary coun-
terparts. A complete performance-complexity tradeoff assessment of using non-binary
inner LDPC code in an MLC architecture, including the possible adoption of suboptimal
decoding methods, is a work in progress.
In this chapter, we utilized the four degrees of freedom provided by modulating the
signal on both the polarization and the quadratures of the electromagnetic field. We
have shown that with the effective use of non-binary codes and a dense 4D signal
constellation we can deliver a substantial gain over the conventional designs. While the D4
lattice (which we derived the constellations from) is the densest lattice in four dimensions,
there are denser lattices in higher dimensions [131] we may exploit by modulating the
signal in time as well. For example, by considering two 4D channel uses we may deploy
constellations based on the E8 lattice, the densest lattice in eight dimensions. Exploring
possible advantages of using constellations based on lattices in higher dimensions is a
topic of future work.
Chapter 7
Low-Density Nonlinear-Check Codes
7.1 Introduction
In previous chapters, one constant in our FEC designs has been the use of an error-
reducing code (ERC). An ERC is formally defined as a code that converts a word received
at the output of a noisy channel to a different word that is the output of a less noisy
channel. While an ERC does not guarantee the correction of any complete error pattern
in a received word, it provides a probabilistic guarantee of the correction of a fraction of
the bits in error contained in the received word [143, Def. 1]. The fractional reduction
of BER naturally leads to code constructions consisting of the concatenation of an ERC
with an outer clean-up code, such as those we have designed in the previous chapters.
Recently, Roozbehani and Polyanskiy have introduced low-density majority-check
(LDMC) codes [144, 145]. These codes are nonlinear sparse-graph codes that are struc-
turally similar to LDGM codes, but in which a CN, instead of checking the parity of the
VNs connected to it, indicates the value attained by the majority of them. Over an
erasure channel, the authors report a graceful degradation of BER on decoded informa-
tion bits for LDMC codes, as the channel quality degrades. The authors also propose the
use of some majority-check nodes in the ensemble to improve the performance of LDGM
codes.
In this chapter, we introduce low-density nonlinear check (LDNC) codes, a class
of binary sparse-graph codes that are structurally similar to LDGM codes but with a
generalized nonlinear operation at the CNs. We consider all possible CN functions that
are 1) symmetric with respect to their input, and 2) produce an entropy-1 check bit
when the inputs to the CN function (i.e., the information bits) each have entropy 1. Note
that LDMC codes are a member of the class of LDNC codes. We derive a universal
sum-product message passing update rule for the CN functions and obtain an efficient
Figure 7.1: The block diagram of a scheme that achieves the ERC Limit.
algorithm to compute those messages. We study LDNC codes over the AWGN channel
and show that LDNC codes can be very effective ERCs.
The rest of this chapter is organized as follows. In Sec. 7.2 we derive the equivalent
of a Shannon limit for an ERC. In Sec. 7.3 we set out the nonlinear CN operations that
we consider for LDNC codes, derive a universal update rule for them, and provide a
computationally efficient method for obtaining the CN outputs. In Sec. 7.4 we provide
examples of the error-reducing capabilities of LDNC codes. In Sec. 7.5 we provide further
discussions and concluding remarks.
7.2 ERC Limit and Nonlinear Codes
The information-theoretic limit associated with ERCs has been obtained in a number
of prior works, including [146–148]. Here, we derive this well-known theoretical limit in
order to motivate, through its achievability scheme, the use of LDNC codes. We consider
data transmission over an AWGN channel and derive the maximum possible rate of
transmission when we target a certain BER at the receiver.
Theorem. Consider data transmission over an AWGN channel with σ denoting the noise
standard deviation and where the signal average power is normalized to 1. Let p denote
the maximum BER that is tolerated in the reconstructed bits at the receiver. Then, the
maximum rate of transmission, R∗, is given by
\[
R^* = \frac{C(\sigma)}{D(p)}, \tag{7.1}
\]
where C(σ) denotes the channel capacity for error-free transmission, and D(p) denotes
the minimum rate required in compressing a binary source with a maximum expected
Hamming distortion p.
Proof.
(a) Achievability: Consider transmission of k information bits, where k is a large number,
encoded by two codes, as shown in Fig. 7.1. Let the source code be a rate-D(p) code
that achieves the rate-distortion limit in compression of a binary source with a maximum
expected Hamming distortion p. The source encoder then outputs kD(p) bits. Let the
channel code be a rate-C(σ) code that achieves the Shannon limit [149]. The channel
encoder then outputs kD(p)/C(σ) bits. At the receiver, the channel decoder produces
kD(p) error-free bits. The source decoder then produces k bits with maximum expected
Hamming distortion p. With this scheme, we have a rate-C(σ)/D(p) transmission scheme
that achieves the desired maximum distortion. Note that with Hamming distortion
measure we have D(p) = 1−Hb(p), where Hb(p) is the binary entropy function.
(b) Converse: Consider a rate-R binary ERC that provides a BER ≤ p at its decoder
output. Let Cp be a capacity-achieving code over the binary symmetric channel with
cross-over probability p. The code Cp then has rate 1−Hb(p) = D(p). Now consider pre-
coding the source with Cp. At the receiver, the Cp decoder then recovers the source bits
error free. The concatenation of Cp and the ERC then gives a code of rate D(p)R that
provides error-free transmission. By Shannon’s theorem we must have D(p)R ≤ C(σ),
and therefore R ≤ C(σ)/D(p).
Corollary. The maximum noise standard deviation, σ, that a rate-R binary ERC can
operate in, when targeting a BER ≤ p at its decoder output, is given by
\[
\sigma^* = C^{-1}\!\left((1 - H_b(p))R\right).
\]
Proof. From (7.1) we must have D(p)R ≤ C(σ). With a Hamming distortion metric
we have D(p) = 1 −Hb(p). Since C(σ) is a monotonically decreasing function, we have
σ ≤ C−1((1−Hb(p))R).
Note that while the binary ERC limit is derived for the AWGN channel, a similar
limit can be derived for any channel for which the capacity is known.
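As a numeric illustration of the corollary, the sketch below evaluates σ* under the simplifying assumption of Gaussian signalling on the real AWGN channel, so that C(σ) = ½ log₂(1 + 1/σ²) and C⁻¹ has a closed form; the constrained (binary-input) capacity used elsewhere in this thesis would require numerical inversion, so the numbers here are illustrative only. Function names are ours.

```python
from math import log2, sqrt

def hb(p):
    """Binary entropy function H_b(p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def erc_sigma_limit(rate, p):
    """sigma* = C^{-1}((1 - H_b(p)) R), assuming Gaussian signalling so that
    C(sigma) = 0.5 * log2(1 + 1/sigma^2) -- an assumption for illustration,
    not the binary-input constrained capacity used in the thesis."""
    c = (1.0 - hb(p)) * rate            # required capacity D(p) * R
    return 1.0 / sqrt(2.0 ** (2.0 * c) - 1.0)
```

Tolerating a larger output BER p, or lowering the rate R, both enlarge the admissible noise level, consistent with (7.1).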
The achievability scheme shown in Fig. 7.1 uses a lossy source code that operates at
the rate-distortion limit, concatenated with a channel code that operates at the Shannon
limit. It is argued in [150] that while linear channel encoders can achieve the Shannon
limit in discrete channels with additive noise, linear lossy source encoders cannot ap-
proach the rate-distortion limit. In fact, nonlinear encoders with sparse graph structures
have been proposed, e.g., in [151–154], that can approach the rate-distortion limit. Such
codes can have linear or nonlinear codebooks. Next, we study codes that are structurally
similar to LDGM codes but with generalized nonlinear encoding operation at their CNs.
Figure 7.2: Factor graph representation of an LDNC ensemble. Information- and check-node degrees are denoted by dv and dc, respectively. Here, a degree-dc CN is connectedto dc information nodes. The rectangle labelled Π represents an edge permutation.
7.3 LDNC Codes
7.3.1 Code Description
An LDNC code can be represented by a factor graph consisting of VNs, CNs, and edges.
Fig. 7.2 shows the factor graph of an LDNC code, in which circles and squares represent
VNs and (possibly nonlinear) CNs, respectively. The VNs are partitioned into k infor-
mation nodes (bottom) and m check bits (top). In an LDNC Tanner graph, a CN is
always connected to a degree-1 check bit, as shown in Fig. 7.2, so the number of CNs is
equal to m. Throughout this chapter, we do not count the check bits when denoting the
CN degrees.
Given k, m, and the degree of every node, an LDNC ensemble is uniquely determined
by the edge connections between CNs and information nodes. Let dv and dc denote the
average information-node and CN degree, respectively. Note that in a valid ensemble we
have m/k = dv/dc. The rate of the LDNC ensemble is given by
\[
R = \frac{k}{k + m} = \frac{d_c}{d_c + d_v}. \tag{7.2}
\]
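For example, the regular ensemble with dv = 4 and dc = 7 used in Sec. 7.4 has rate 7/11 ≈ 0.636. A one-line helper (name ours), following (7.2):

```python
def ldnc_rate(dv, dc):
    """Rate of an LDNC ensemble per (7.2): k/(k+m), with m/k = dv/dc."""
    return dc / (dc + dv)
```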
7.3.2 Encoding
Consider a CN of degree-dc. As shown in Fig. 7.3, since the check bit of the CN is of
degree-1, given the information bits, the check bit can be computed by the CN function
easily. The check function of a degree-dc CN can be any function with domain {0, 1}^{dc} and codomain {0, 1}. We call a check function g suitable if it satisfies the following two
Figure 7.3: Encoding operation at a nonlinear check node.
conditions:
1. Function g should be symmetric with respect to its input, meaning
g(x1, x2, . . . , xdc) = g(π(x1, x2, . . . , xdc)),
for any permutation π. It is easy to see that any symmetric check function
factors through the input weight, meaning
g(x1, x2, . . . , xdc) = g(wt(x)),
where wt(x) denotes the Hamming weight of vector x = (x1, x2, . . . , xdc).
2. The check bit has unit entropy, meaning that g(wt(x)) is equally likely to be 0 or
1 when the input bits are picked uniformly at random in {0, 1}. Note that we send
all information bits and check bits over the channel, and this condition ensures that
we make the best use of the channel use that carries the check bit.
Note that, for any dc > 1, the parity-check function is a linear suitable function. When
dc is odd, and also for some even values of dc,¹ there are nonlinear suitable functions that
we may consider as the CN function. The problem of finding suitable check functions is
similar to that of bisecting binomial coefficients studied in [155, 156]. In [155, Table 1],
the authors list the number of possible bisections for 1 ≤ dc ≤ 51.
In Table 7.1, we list all the distinct check functions that exist for dc = 7. Note
that in a check function, by flipping all the output bits we get an operationally identical
check function, since the channel is assumed to be symmetric. In Table 7.1, without loss of
generality, we have assumed that the first input bit is always 0. Here, g1(wt(x)) is in fact
the parity-check function, and g8(wt(x)) is the majority-check function.
¹Some small even values of dc for which there exist nonlinear check functions are 8, 14, 20, 24, 26.
wt(x)  0  1  2  3  4  5  6  7
g1     0  1  0  1  0  1  0  1
g2     0  0  1  0  1  0  1  1
g3     0  1  1  0  1  0  0  1
g4     0  0  0  1  0  1  1  1
g5     0  1  0  0  1  1  0  1
g6     0  0  1  1  0  0  1  1
g7     0  1  1  1  0  0  0  1
g8     0  0  0  0  1  1  1  1

Table 7.1: List of all distinct check functions for dc = 7.
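Since a suitable function depends only on the input weight, candidates can be enumerated by brute force: fix g(0) = 0 to remove the output-flip symmetry, and keep the weight maps whose 1-outputs cover exactly half of the 2^{dc} input patterns, weighting weight w by C(dc, w). A sketch (names ours):

```python
from itertools import product
from math import comb

def suitable_functions(dc):
    """Enumerate all suitable check functions g : {0,...,dc} -> {0,1}.
    Symmetry holds by construction (g depends only on the weight); the
    entropy-1 condition requires the weights mapped to 1 to account for
    exactly 2^(dc-1) of the 2^dc equiprobable inputs.  Fixing g(0) = 0
    removes the output-flip symmetry, as in Table 7.1."""
    funcs = []
    for bits in product((0, 1), repeat=dc):        # values g(1), ..., g(dc)
        g = (0,) + bits                            # fix g(0) = 0
        ones = sum(comb(dc, w) for w in range(dc + 1) if g[w] == 1)
        if ones == 2 ** (dc - 1):
            funcs.append(g)
    return funcs
```

Running this for dc = 7 recovers the eight functions of Table 7.1, including the parity check g1 and the majority check g8; for dc = 6 only the parity check survives, while nonlinear functions reappear at dc = 8, consistent with the footnote.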
7.3.3 Message-Passing Decoding
We use a factor graph model to describe decoding of an LDNC code. Fig. 7.4 shows a
factor graph fragment with a check node of degree dc. In a valid configuration, the top
node, denoted by y, is required to satisfy y = g(wt(x1, . . . , xdc)), where y, x1, . . . , xdc ∈ {0, 1}.
We derive here the sum-product update rule for the message to be sent to xdc , given the
messages µ1, . . . , µdc−1 and µy. A message is a function from {0, 1} to ℝ, or, equivalently,
a pair µi = (µi(0), µi(1)) of real numbers. The messages, µi(0) and µi(1), are assumed
to be equal to the probability of xi being 0 and 1, respectively. Accordingly µi(0) ≥ 0,
µi(1) ≥ 0, and µi(0) + µi(1) = 1. The same is true for y and µy. We aim to compute the
message νdc sent from the CN to node xdc ; messages to be sent to the other nodes are
defined in similar fashion.
The general sum-product update rule [46] for the message sent from a factor node
associated with a function f(x, x1, . . . , xm) to a variable node x, given received messages
µi(xi), i = 1, . . . ,m, is proportional to $\sum_{x_1,\ldots,x_m} \prod_{i=1}^{m} \mu_i(x_i)\, f(x, x_1, \ldots, x_m)$, meaning
Figure 7.4: A typical CN of degree dc. Node y is set to denote the function the CN performs on the information nodes.
that for some constant ζ,
\[
\nu(x) = \zeta \sum_{x_1, \ldots, x_m} \prod_{i=1}^{m} \mu_i(x_i)\, f(x, x_1, \ldots, x_m).
\]
Associated with a CN of degree dc + 1 is the indicator function

\[
f(x_1, \ldots, x_{d_c}, y) = \mathbb{I}_{g(\mathrm{wt}(x_1, \ldots, x_{d_c})) = y} =
\begin{cases}
1 & \text{if } y = g(\mathrm{wt}(x_1, \ldots, x_{d_c})), \\
0 & \text{otherwise.}
\end{cases}
\]
The message νdc can be determined from νdc(0) − νdc(1). For νdc(0) we have

\[
\begin{aligned}
\nu_{d_c}(0) &= \zeta_{d_c} \sum_{y, x_1, \ldots, x_{d_c-1}} f(x_1, \ldots, x_{d_c-1}, 0, y)\, \mu_y(y) \prod_{j=1}^{d_c-1} \mu_j(x_j) \\
&= \zeta_{d_c} \mu_y(0) \sum_{x_1, \ldots, x_{d_c-1}} f(x_1, \ldots, x_{d_c-1}, 0, 0) \prod_{j=1}^{d_c-1} \mu_j(x_j) \\
&\quad + \zeta_{d_c} \mu_y(1) \sum_{x_1, \ldots, x_{d_c-1}} f(x_1, \ldots, x_{d_c-1}, 0, 1) \prod_{j=1}^{d_c-1} \mu_j(x_j) \\
&= \zeta_{d_c} \mu_y(0) \sum_{\substack{x_1, \ldots, x_{d_c-1}:\\ g(\mathrm{wt}(x_1, \ldots, x_{d_c-1}, 0)) = 0}} \prod_{j=1}^{d_c-1} \mu_j(x_j) + \zeta_{d_c} \mu_y(1) \sum_{\substack{x_1, \ldots, x_{d_c-1}:\\ g(\mathrm{wt}(x_1, \ldots, x_{d_c-1}, 0)) = 1}} \prod_{j=1}^{d_c-1} \mu_j(x_j).
\end{aligned} \tag{7.3}
\]
Similarly,

\[
\nu_{d_c}(1) = \zeta_{d_c} \mu_y(0) \sum_{\substack{x_1, \ldots, x_{d_c-1}:\\ g(\mathrm{wt}(x_1, \ldots, x_{d_c-1}, 1)) = 0}} \prod_{j=1}^{d_c-1} \mu_j(x_j) + \zeta_{d_c} \mu_y(1) \sum_{\substack{x_1, \ldots, x_{d_c-1}:\\ g(\mathrm{wt}(x_1, \ldots, x_{d_c-1}, 1)) = 1}} \prod_{j=1}^{d_c-1} \mu_j(x_j). \tag{7.4}
\]
Note that ζdc can be computed from

\[
1 = \nu_{d_c}(0) + \nu_{d_c}(1).
\]

Since the update rule is symmetric among the xi's, the messages νi for i ∈ {1, 2, . . . , dc − 1} can be derived in a similar fashion.
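For small dc, the update rule can be evaluated directly from (7.3) and (7.4) by summing over all input configurations, which makes a useful correctness check for faster implementations. An exponential-time sketch (function name ours):

```python
from itertools import product

def cn_message_bruteforce(g, mus, mu_y):
    """Sum-product message nu_{dc} from a CN to information node x_{dc},
    computed directly from (7.3) and (7.4).  g maps a Hamming weight to the
    check-bit value; mus = [mu_1, ..., mu_{dc-1}] are the messages from the
    other information nodes, and mu_y comes from the degree-1 check bit.
    Each message is a pair (probability of 0, probability of 1)."""
    nu = [0.0, 0.0]
    for x_last in (0, 1):
        for xs in product((0, 1), repeat=len(mus)):
            y = g(sum(xs) + x_last)        # the only valid check-bit value
            p = mu_y[y]
            for mu, x in zip(mus, xs):
                p *= mu[x]
            nu[x_last] += p
    z = nu[0] + nu[1]                      # normalization constant zeta_{dc}
    return (nu[0] / z, nu[1] / z)
```

With the parity function and certain incoming messages this reproduces the familiar XOR constraint; with a majority function and a saturated check bit it leaves the target node uninformed whenever both values of x_{dc} are consistent with the majority.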
7.3.4 Efficient Message Computation
Let pi(x) = µi(0) + µi(1)x, for i ∈ {1, 2, . . . , dc}, and let

\[
q(x) = \prod_{i=1}^{d_c} p_i(x).
\]

Now let

\[
q_i(x) = \frac{q(x)}{p_i(x)} = q_{i,0} + q_{i,1} x + \ldots + q_{i,d_c-1} x^{d_c-1}.
\]

Then, νi(0) and νi(1) can be written from (7.3) and (7.4) as

\[
\begin{aligned}
\nu_i(0) &= \zeta_i \mu_y(0) \sum_{g(j)=0} q_{i,j} + \zeta_i \mu_y(1) \sum_{g(j)=1} q_{i,j}, \\
\nu_i(1) &= \zeta_i \mu_y(0) \sum_{g(j+1)=0} q_{i,j} + \zeta_i \mu_y(1) \sum_{g(j+1)=1} q_{i,j}.
\end{aligned} \tag{7.5}
\]
The key takeaway here is that by obtaining the coefficients of q(x) we are able to compute
the qi(x)’s and consequently the νi(x) messages.
Figure 7.5: Binary computation tree for obtaining q(x) when dc = 7. The messages are passed from the bottom up.
Figure 7.6: BER curves in a regular ensemble with dv = 4 and dc = 7 with the various check functions (g1–g8), plotted versus the number of decoding iterations. The codes are simulated at 0.5 dB above their (error-free) constrained Shannon limit.
We obtain q(x) using a binary computation tree, as illustrated for dc = 7 in Fig. 7.5.
The algorithm runs as a message-passing algorithm and the messages are polynomials.
The messages are passed upwards starting from the leaf nodes, the ith of which sends
pi(x). Interior nodes in the tree perform the polynomial multiplication of the incoming
messages, passing the result further up. Let m = ⌈log₂ dc⌉ be the depth of the tree.
At depth L in the tree, each node must multiply two polynomials of degree at most
2^{m−L−1}. Using fast Fourier transform techniques, this computation can be accomplished
with O((m − L − 1)2^{m−L−1}) operations. Since there are at most 2^L nodes at depth L,
the total number of operations needed at depth L is at most O(2^L(m − L − 1)2^{m−L−1}) =
O((m − L − 1)2^{m−1}). Summing over all levels L (from 0 to m − 1) where computation
is performed results in O(2^m m²) = O(dc log² dc) operations.
Each qi(x), in the worst case, can be obtained from q(x) by a polynomial division
with O(dc) operations. The update messages can therefore be obtained in parallel, each
with O(dc) operations, using (7.5).
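Equations (7.5) reduce each outgoing message to the coefficients of q_i(x). The sketch below uses plain convolutions in place of the FFT-based computation tree of Fig. 7.5, so building each q_i(x) costs O(dc²) rather than the O(dc log² dc) of the text, but the resulting messages are identical; function names are ours.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def cn_messages(g, mus, mu_y):
    """All dc outgoing CN messages via the coefficients of q_i(x), per (7.5).
    g maps a Hamming weight to the check-bit value; mus holds all dc incoming
    messages (pairs of probabilities); mu_y comes from the check bit."""
    dc = len(mus)
    nus = []
    for i in range(dc):
        qi = [1.0]                         # q_i(x) = prod_{j != i} p_j(x)
        for j, (m0, m1) in enumerate(mus):
            if j != i:
                qi = poly_mul(qi, [m0, m1])
        # q_{i,w} is the probability that the other dc-1 bits have weight w.
        n0 = sum(mu_y[g(w)] * c for w, c in enumerate(qi))       # x_i = 0
        n1 = sum(mu_y[g(w + 1)] * c for w, c in enumerate(qi))   # x_i = 1
        z = n0 + n1
        nus.append((n0 / z, n1 / z))
    return nus
```

For a parity check with two known-zero neighbours and a zero check bit, the message to the remaining node is fully decided, while the messages to the known nodes reflect the uncertainty of the third; this matches the brute-force evaluation of (7.3)–(7.4).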
7.4 Error-Reducing Performance Results
We study LDNC codes by examining the behaviour of their information BER curves. For
illustration purposes, we consider regular codes, i.e., codes where every information
node is of the same degree and every CN is of the same degree. In particular, we
consider a regular LDNC code with information degree dv = 4 and CN degree dc = 7. We
consider signalling using the binary alphabet {+1, −1} over an AWGN channel. Note
that we may not assume transmission of an all-zero codeword.
Figure 7.7: BER curves of regular dv = 4, dc = 7 LDNC ensembles with three check functions (g1, g5, g8), plotted over a wide range of SNRs. All decoders perform 4 decoding iterations. The error-free constrained Shannon limit is at 1.92 dB SNR.
In Fig. 7.6 we consider all suitable check functions for dc = 7 described in Table 7.1,
and plot the BER in various decoding iterations. Here, we have obtained QC codes of
girth-10 and length ∼100, 000 and simulated the codes at 0.5 dB gap to the (error-free)
constrained Shannon limit. We have observed the following behaviours in the LDNC
ensembles:
• The nonlinear check functions yield a relatively high fixed-point. A fixed-point is an
information-bit error-rate at which the iterative decoder stops and cannot reduce
the error-rate anymore.
• The nonlinear check functions exhibit most of their gains in the first few iterations;
subsequent iterations provide little improvement.
• Two of the nonlinear check functions yield the same BER curve (g5 and g6). This
means there are more symmetries that can be factored out of the number of distinct
suitable functions available for a particular CN degree.
• There are points, especially with fewer decoding iterations, where the nonlinear
check functions perform better than the linear (parity-check) function.
To understand the performance of the LDNC codes, we picked three check functions,
and simulated a regular dv = 4, dc = 7 ensemble in a wide SNR range. We fixed
the number of decoding iterations to 4 (low complexity). In Fig. 7.7 we plot the BER curves of these
LDNC ensembles versus the SNR. Note that the constrained Shannon limit is at 1.92 dB
SNR. As can be seen from Fig. 7.7, at low SNRs, the ensembles with nonlinear check
functions (g5 and g8) perform better than the one with the parity-check function, i.e., g1.
Another notable observation is that the ensemble with check function g5 performs better
than that with g8, the majority function proposed in [144,145], in a wide range of SNRs.
Due to the high fixed-point of ensembles with nonlinear check functions, they would
not be good candidates to consider in code design for optical communication. For other
applications, however, it is possible to optimize these codes using a differential evolution
method as described in previous chapters. We remark that in studying ensembles with
nonlinear check functions, we observe the EXIT functions are not uni-parametric and
therefore a method based on a Monte Carlo simulation, similar to Sec. 6.5, should be
deployed to optimize these codes.
7.5 Conclusion
In this chapter, we studied the performance of LDNC codes, a class of error-reducing
codes, for the binary-input AWGN channel. We derived the rules for the sum-product
message-passing decoder of LDNC codes and obtained an efficient algorithm for message
computation. We analyzed the error-reduction behaviour of LDNC decoders for regular
LDNC codes for various nonlinear check functions. We observed that in certain regimes,
codes with nonlinear check functions can perform better than codes with parity-check
functions.
While we observed an interesting error-reduction behaviour in LDNC ensembles, it
remains to be seen how a forward error-correction scheme consisting of an LDNC inner
code and an outer clean-up code compares with existing designs. It would be of interest
to compare the performance-complexity trade-off curve of such a scheme with the schemes of
Chapter 3.
Chapter 8
Conclusion and Topics of Future
Research
In this work we developed tools and methods to obtain low-complexity concatenated
FEC schemes for applications with high throughput, as needed, for example, in optical
communication. We characterized and compared the performance-complexity tradeoffs
in various FEC schemes and modulation formats.
We proposed a decoder architecture consisting of an inner, error-reducing LDPC
code concatenated with an outer staircase or zipper code. We showed that with this
scheme, we may rely on the outer, algebraic code for the bulk of the error-correction, and
task the inner SD code only with reducing the BER to below the outer code threshold.
The outer code then can bring the BER to below 10−15, as required in OTNs, with very
low complexity.
Accordingly, we developed methods to optimize the FEC scheme by minimizing the
estimated data-flow at the inner code, for various choices of the outer code. An interesting
feature that emerges from the inner-code optimization is that a fraction of symbols are
better left uncoded, and only protected by the outer code. We considered a QC structure
for the inner codes in our design to realize a pragmatic and energy-efficient hardware
implementation.
We extended the code design method to FEC schemes with higher order modula-
tion. We obtained complexity-optimized MLC schemes and complexity-optimized BICM
schemes and made a fair comparison between them via their respective Pareto frontiers.
We showed that by a clever design, the MLC scheme can provide significant advantages
relative to the BICM scheme over the entire performance-complexity tradeoff space.
For binary modulation, the obtained FEC schemes provided up to 71% reduction
in complexity or up to 0.4 dB gain compared to the existing designs. For higher order
modulation, via the designed MLC scheme, the obtained codes provided up to 60%
reduction in complexity or up to 0.7 dB gain compared to existing designs.
We also designed a multi-rate and channel-adaptive LDPC code architecture. We
then developed a tool to optimize a low-complexity rate- and channel-configurable FEC
scheme via an MLC approach. We reported up to 63% reduction in decoding complexity,
or up to 0.6 dB gain compared to existing flexible FEC schemes.
To achieve even further performance improvements, we adapted our tools to design
complexity-optimized non-binary LDPC codes to concatenate with outer zipper codes in
an FEC scheme via the MLC approach. We considered 4D signal constellations that are
denser than their 2D counterparts and obtained clever labellings for them. We obtained
gains of up to 1 dB over the conventional schemes.
Based on this work, there are various worthwhile topics that are left for future study.
We discuss some of these below.
• In Chapter 3 we showed that with a layered schedule in decoding the LDPC in-
ner codes, we can significantly reduce the decoding complexity. Note that those
codes were not designed to be decoded with a layered schedule. Therefore, we can-
not make any claim of optimality for those codes when decoded under a layered
schedule. We may, however, be able to obtain inner-code ensembles, designed to be
decoded with a layered schedule, using, for example, a differential evolution method
similar to that in Sec. 6.5.3.
• In the MLC structure we considered in Chapters 4–6, the demodulator demaps the
MSBs given the inner decoded LSBs. Only hard-decision feedback is provided, re-
sulting in a low-complexity implementation. The BICM schemes we considered, on
the other hand, are deprived of such feedback. If greater complexity were permis-
sible, it is known that soft-decision feedback from the decoder to the demodulator
can significantly improve the performance of a BICM scheme [157]. Code design in
presence of such feedback is a topic for future study.
• We also remark that in the MLC structure we considered in Chapters 4–6 we
have not considered the effects of a possible mismatch between the actual channel
parameters and those for which the codes are designed. It is known that MLC
schemes are generally more susceptible than BICM schemes to such mismatches.
We leave the investigation of channel parameter mismatch on MLC and BICM code
performance as a topic for future study.
• We pointed out in Chapter 6 that the D4 lattice we obtained the signal constellation
from, is in fact isomorphic to the lattice obtained by applying a single-parity-check
code to the Z4 integer lattice. The next logical lattice we can draw the signal
constellation from, the E8 lattice, is isomorphic to the lattice obtained by applying
the length-8 extended Hamming code to the Z8 integer lattice. Exploring possible
advantages of signal constellations obtained in this way is another topic of future
work.
• We observed that with a clever constellation labelling, we can obtain some highly
reliable bit-levels (see, e.g., Table 6.1). Such bit-levels may then bypass the inner
SD code and be protected only by the outer HD code with much lower decoding
complexity, without sacrificing too much performance. On the flip side, in multi-
dimensional signal constellations and with clever labelling, we may also be able to
obtain some bit-levels with very low reliability. Such bit-levels may then be left
completely uncoded (frozen) without sacrificing too much performance. Designing
such signal constellations and labellings is also another topic of future work.
• In Chapter 7 we considered only nonlinear check operations that generate an
entropy-1 check bit to be sent over the symmetric channel. This constraint can
be relaxed in order to explore a larger space of check functions and their
error-reducing performance. As suggested by the external examiner, Prof.
Alexandre Graell i Amat, by controlling the check-bit entropy, LDNC codes could
also be considered for probabilistic shaping of the signal constellation.
Investigating these ideas and other possible applications of LDNC codes is a
topic of future work.
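The entropy-1 constraint, and how much the search space grows when it is relaxed, can be made concrete with a small enumeration. The check functions below are illustrative examples only, not the LDNC constructions of Chapter 7.

```python
import itertools
import math

# Under uniform inputs, a check function f: {0,1}^k -> {0,1} produces an
# entropy-1 check bit iff it is balanced, i.e., it outputs 1 on exactly
# half of the 2^k input patterns.  Sketch for k = 3.
k = 3
inputs = list(itertools.product([0, 1], repeat=k))

def check_bit_entropy(f):
    """Entropy of the check bit f(X) for uniform X."""
    p = sum(f(x) for x in inputs) / len(inputs)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

xor3 = lambda x: x[0] ^ x[1] ^ x[2]   # linear check, balanced
maj3 = lambda x: int(sum(x) >= 2)     # nonlinear check, also balanced
and3 = lambda x: x[0] & x[1] & x[2]   # nonlinear, NOT balanced

print(check_bit_entropy(xor3), check_bit_entropy(maj3), check_bit_entropy(and3))

# Relaxing the entropy-1 constraint enlarges the design space: of the
# 2^(2^3) = 256 Boolean functions of 3 bits, only C(8,4) = 70 are balanced.
balanced = sum(
    1 for truth in itertools.product([0, 1], repeat=len(inputs))
    if sum(truth) == len(inputs) // 2
)
print(balanced)  # 70
```

Biasing the output probability away from 1/2 is also the degree of freedom that the shaping idea above would control.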
Bibliography
[1] L. M. Zhang and F. R. Kschischang, “Low-complexity soft-decision concatenated
LDGM-staircase FEC for high bit-rate fiber-optic communication,” J. Lightw.
Technol., vol. 35, no. 18, pp. 3991–3999, Sep. 2017.
[2] A. Bisplinghoff, S. Langenbach, and T. Kupfer, “Low-power, phase-slip tolerant,
multilevel coding for M-QAM,” J. Lightw. Technol., vol. 35, no. 4, pp. 1006–1014,
Feb. 2017.
[3] Y. Koganei, T. Oyama, K. Sugitani, H. Nakashima, and T. Hoshida, “Multilevel
coding with spatially coupled repeat-accumulate codes for high-order QAM optical
transmission,” J. Lightw. Technol., vol. 37, no. 2, pp. 486–492, Jan. 2019.
[4] Optical Internetworking Forum, “Implementation agreement 400ZR,” OIF-400ZR-
01.0, 2020.
[5] T. Mehmood, M. P. Yankov, S. Iqbal, and S. Forchhammer, “Flexible
multilevel coding with concatenated polar-staircase codes for M-QAM,”
IEEE Trans. Commun., 2020, early access version. [Online]. Available:
https://doi.org/10.1109/TCOMM.2020.3038185
[6] F. Frey, S. Stern, J. K. Fischer, and R. F. H. Fischer, “Two-stage coded modulation
for Hurwitz constellations in fiber-optical communications,” J. Lightw. Technol.,
vol. 38, no. 12, pp. 3135–3146, Jun. 2020.
[7] I. B. Djordjevic, “On advanced FEC and coded modulation for ultra-high-speed
optical transmission,” IEEE Commun. Surveys Tuts., vol. 18, no. 3, pp. 1920–1951,
Jul. 2016.
[8] A. Alvarado, G. Liga, T. Fehenberger, and L. Schmalen, “On the design of coded
modulation for fiber optical communications,” in Proc. Signal Proc. Photonic Com-
mun. (SPPCom), New Orleans, USA, Jul. 2017, pp. SpM4F–2.
[9] P. Larsson-Edefors, C. Fougstedt, and K. Cushon, “Implementation challenges for
energy-efficient error correction in optical communication systems,” in Proc. Adv.
Photon., Zurich, Switzerland, Jul. 2018, pp. SpTh4F–2.
[10] A. Graell i Amat, G. Liva, and F. Steiner, “Coding for optical communications–
Can we approach the Shannon limit with low complexity?” in Proc. Europ. Conf.
Optic. Commun., Dublin, Ireland, Sep. 2019, pp. (Tu.1.B.5)1–4.
[11] A. Sheikh, A. G. i Amat, and A. Alvarado, “Novel high-throughput decoding
algorithms for product and staircase codes based on error-and-erasure decoding,”
Aug. 2020. [Online]. Available: http://arxiv.org/abs/2008.02181v1
[12] International Telecommunication Union, Telecommunication Standardization Sec-
tor, “Forward error correction for high bit-rate DWDM submarine systems,” ITU-T
Rec. G.975.1, Feb. 2004.
[13] B. P. Smith, A. Farhood, A. Hunt, F. R. Kschischang, and J. Lodge, “Staircase
codes: FEC for 100 Gb/s OTN,” J. Lightw. Technol., vol. 30, no. 1, pp. 110–117,
Jan. 2012.
[14] A. Y. Sukmadji, U. Martínez-Peñas, and F. R. Kschischang, “Zipper codes:
Spatially-coupled product-like codes with iterative algebraic decoding,” in Proc.
Canadian Workshop Info. Theory, Hamilton, Canada, Jun. 2019, pp. 1–6.
[15] T. J. Richardson and R. L. Urbanke, Modern Coding Theory. Cambridge, U.K.:
Cambridge U. Press, 2008.
[16] G. Tzimpragos, C. Kachris, I. Djordjevic, M. Cvijetic, D. Soudris, and I. Tomkos,
“A survey on FEC codes for 100G and beyond optical networks,” IEEE Commun.
Surveys Tuts., vol. 18, no. 1, pp. 209–221, First Quarter 2016.
[17] M. Weiner, M. Blagojevic, S. Skotnikov, A. Burg, P. Flatresse, and B. Nikolic, “A
scalable 1.5-to-6Gb/s 6.2-to-38.1mW LDPC decoder for 60 GHz wireless networks
in 28nm UTBB FDSOI,” in IEEE Int. Solid-State Circuits Conf., Feb. 2014, pp.
464–465.
[18] T.-C. Ou, Z. Zhang, and M. Papaefthymiou, “An 821MHz 7.9Gb/s
7.3pJ/b/iteration charge-recovery LDPC decoder,” in IEEE Int. Solid-State Cir-
cuits Conf., Feb. 2014, pp. 462–463.
[19] Y. Lee, H. Yoo, J. Jung, J. Jo, and I.-C. Park, “A 2.74-pJ/bit, 17.7-Gb/s itera-
tive concatenated-BCH decoder in 65-nm CMOS for NAND flash memory,” IEEE
J. Solid-State Circuits, vol. 48, no. 10, pp. 2531–2540, Oct. 2013.
[20] H. Yoo, Y. Lee, and I.-C. Park, “7.3 Gb/s universal BCH encoder and decoder for
SSD controllers,” in Proc. Asia South Pacific Design Autom. Conf., Jan. 2014, pp.
37–38.
[21] B. S. G. Pillai, B. Sedighi, K. Guan, N. P. Anthapadmanabhan, W. Shieh, K. J.
Hinton, and R. S. Tucker, “End-to-end energy modeling and analysis of long-haul
coherent transmission systems,” J. Lightw. Technol., vol. 32, no. 18, pp. 3093–3111,
Jun. 2014.
[22] D. A. Morero, M. A. Castrillon, A. Aguirre, M. R. Hueda, and O. E. Agazzi, “Design
tradeoffs and challenges in practical coherent optical transceiver implementations,”
J. Lightw. Technol., vol. 34, no. 1, pp. 121–136, Jan. 2016.
[23] 800G Pluggable Multi-source Agreement, “Enabling the next generation of cloud
& AI using 800Gb/s optical modules,” White Paper, Mar. 2020.
[24] E. Maniloff, S. Gareau, and M. Moyer, “400G and beyond: Coherent evolution
to high-capacity inter data center links,” in Proc. Optical Fiber Commun. Conf.
(OFC), San Diego, USA, Mar. 2019, p. M3H.4.
[25] G. Lechner, T. Pedersen, and G. Kramer, “Analysis and design of binary message
passing decoders,” IEEE Trans. Commun., vol. 60, no. 3, pp. 601–607, Dec. 2011.
[26] F. Angarita, J. Valls, V. Almenar, and V. Torres, “Reduced-complexity min-sum
algorithm for decoding LDPC codes with low error-floor,” IEEE Trans. Circuits
Syst. I, vol. 61, no. 7, pp. 2150–2158, Feb. 2014.
[27] K. Cushon, P. Larsson-Edefors, and P. Andrekson, “Low-power 400-Gbps soft-
decision LDPC FEC for optical transport networks,” J. Lightw. Technol., vol. 34,
no. 18, pp. 4304–4311, Aug. 2016.
[28] F. Steiner, E. B. Yacoub, B. Matuz, G. Liva, and A. G. i Amat, “One and two
bit message passing for SC-LDPC codes with higher-order modulation,” J. Lightw.
Technol., vol. 37, no. 23, pp. 5914–5925, Sep. 2019.
[29] Y. Lei, A. Alvarado, B. Chen, X. Deng, Z. Cao, J. Li, and K. Xu, “Decoding
staircase codes with marked bits,” in Proc. 10th Int. Symp. Turbo Codes Iterative
Inf. Process. (ISTC), Hong Kong, Dec. 2018, pp. 1–5.
[30] A. Sheikh, A. G. i Amat, and G. Liva, “Binary message passing decoding of
product-like codes,” IEEE Trans. Commun., vol. 67, no. 12, pp. 8167–8178, Sep.
2019.
[31] Y. Lei, B. Chen, G. Liga, X. Deng, Z. Cao, J. Li, K. Xu, and A. Alvarado,
“Improved decoding of staircase codes: The soft-aided bit-marking (SABM) al-
gorithm,” IEEE Trans. Commun., vol. 67, no. 12, pp. 8220–8232, Oct. 2019.
[32] D. S. Millar, T. Koike-Akino, S. O. Arik, K. Kojima, K. Parsons, T. Yoshida, and
T. Sugihara, “High-dimensional modulation for coherent optical communications
systems,” Opt. Express, vol. 22, no. 7, pp. 8798–8812, Apr. 2014.
[33] D. S. Millar, T. Fehenberger, T. Koike-Akino, K. Kojima, and K. Parsons, “Coded
modulation for next-generation optical communications,” in Proc. Optical Fiber
Commun. Conf. (OFC), San Diego, USA, Mar. 2018, p. Tu3C.3.
[34] I. B. Djordjevic and B. Vasic, “Nonbinary LDPC codes for optical communication
systems,” IEEE Photon. Technol. Lett., vol. 17, no. 10, pp. 2224–2226, Sep. 2005.
[35] A. Leven and L. Schmalen, “Status and recent advances on forward error correction
technologies for lightwave systems,” J. Lightw. Technol., vol. 32, no. 16, pp. 2735–
2750, 2014.
[36] R.-J. Essiambre, G. Kramer, P. J. Winzer, G. J. Foschini, and B. Goebel, “Capacity
limits of optical fiber networks,” J. Lightw. Technol., vol. 28, no. 4, pp. 662–701,
2010.
[37] Y. Cai, W. Wang, W. Qian, J. Xing, K. Tao, J. Yin, S. Zhang, M. Lei, E. Sun,
H.-C. Chien, Q. Liao, K. Yang, and H. Chen, “FPGA investigation on error-flare
performance of a concatenated staircase and Hamming FEC code for 400G inter-
data center interconnect,” J. Lightw. Technol., vol. 37, no. 1, pp. 188–195, Jan.
2019.
[38] L. Lundberg, “Power consumption and joint signal processing in fiber-optical
communication,” Ph.D. dissertation, Dept. of Microtechnology & Nanoscience,
Chalmers University of Technology, 2019.
[39] T. Koike-Akino, D. S. Millar, K. Kojima, K. Parsons, Y. Miyata, K. Sugihara, and
W. Matsumoto, “Iteration-aware LDPC code design for low-power optical commu-
nications,” J. Lightw. Technol., vol. 34, no. 2, pp. 573–581, 2015.
[40] N. Verma, H. Jia, H. Valavi, Y. Tang, M. Ozatay, L.-Y. Chen, B. Zhang, and
P. Deaville, “In-memory computing: Advances and prospects,” IEEE Solid-State
Circuits Mag., vol. 11, no. 3, pp. 43–55, Aug. 2019.
[41] L. M. Zhang and F. R. Kschischang, “Staircase codes with 6% to 33% overhead,”
J. Lightw. Technol., vol. 32, no. 10, pp. 1999–2002, May 2014.
[42] C. Hager and H. D. Pfister, “Approaching miscorrection-free performance of prod-
uct codes with anchor decoding,” IEEE Trans. Commun., vol. 66, no. 7, pp. 2797–
2808, Mar. 2018.
[43] A. Y. Sukmadji, “Zipper codes: High-rate spatially-coupled codes with algebraic
component codes,” Master’s thesis, Dept. of Electrical & Computer Engineering,
University of Toronto, 2020.
[44] R. G. Gallager, “Low-density parity-check codes,” IEEE Trans. Inf. Theory, vol. 8,
no. 1, pp. 21–28, Jan. 1962.
[45] R. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Inf.
Theory, vol. 27, no. 5, pp. 533–547, Sep. 1981.
[46] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-
product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, Feb.
2001.
[47] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke, “Design of capacity-
approaching irregular low-density parity-check codes,” IEEE Trans. Inf. Theory,
vol. 47, no. 2, pp. 619–637, Feb. 2001.
[48] N. Wiberg, “Codes and decoding on general graphs,” Ph.D. dissertation, Dept. of
Electrical Engineering, Linkoping University, 1996.
[49] J. Chen and M. P. Fossorier, “Density evolution for two improved BP-based decod-
ing algorithms of LDPC codes,” IEEE Commun. Lett., vol. 6, no. 5, pp. 208–210,
Aug. 2002.
[50] D. Divsalar, H. Jin, and R. J. McEliece, “Coding theorems for “Turbo-Like” codes,”
in Proc. 36th Allerton Conf. on Commun., Control, and Comput., vol. 36, Allerton,
USA, Sep. 1998, pp. 201–210.
[51] J. Garcia-Frias and W. Zhong, “Approaching Shannon performance by iterative
decoding of linear codes with low-density generator matrix,” IEEE Commun. Lett.,
vol. 7, no. 6, pp. 266–268, Jun. 2003.
[52] A. Darabiha, A. C. Carusone, and F. R. Kschischang, “Power reduction techniques
for LDPC decoders,” IEEE J. Solid-State Circuits, vol. 43, no. 8, pp. 1835–1845,
Jul. 2008.
[53] E. Amador, R. Knopp, V. Rezard, and R. Pacalet, “Dynamic power management
on LDPC decoders,” in Proc. IEEE Comput. Society Annu. Symp. VLSI, Lixouri,
Greece, Jul. 2010, pp. 416–421.
[54] T. Mohsenin, D. N. Truong, and B. M. Baas, “A low-complexity message-passing
algorithm for reduced routing congestion in LDPC decoders,” IEEE Trans. Circuits
Syst. I, vol. 57, no. 5, pp. 1048–1061, May 2010.
[55] X. Zhang and P. H. Siegel, “Quantized iterative message passing decoders with low
error floor for LDPC codes,” IEEE Trans. Commun., vol. 62, no. 1, pp. 1–14, Dec.
2013.
[56] Z. Wang, Z. Cui, and J. Sha, “VLSI design for low-density parity-check code de-
coding,” IEEE Circuits Syst. Mag., vol. 11, no. 1, pp. 52–69, Feb. 2011.
[57] M. Milicevic, “Low-density parity-check decoder architectures for integrated cir-
cuits and quantum cryptography,” Ph.D. dissertation, Dept. of Electrical & Com-
puter Engineering, University of Toronto, 2017.
[58] T. Richardson, “Error floors of LDPC codes,” in Proc. 41st Allerton Conf. on
Commun., Control, and Comput., vol. 41, no. 3, Allerton, USA, Oct. 2003, pp.
1426–1435.
[59] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated
codes,” IEEE Trans. Commun., vol. 49, no. 10, pp. 1727–1737, Oct. 2001.
[60] M. Ardakani and F. R. Kschischang, “A more accurate one-dimensional analysis
and design of irregular LDPC codes,” IEEE Trans. Commun., vol. 52, no. 12, pp.
2106–2114, Dec. 2004.
[61] B. P. Smith, M. Ardakani, W. Yu, and F. R. Kschischang, “Design of irregu-
lar LDPC codes with optimized performance-complexity tradeoff,” IEEE Trans.
Commun., vol. 58, no. 2, pp. 489–499, Feb. 2010.
[62] L. Schmalen, S. ten Brink, G. Lechner, and A. Leven, “On threshold prediction
of low-density parity-check codes with structure,” in Proc. 46th Annu. Conf. on
Inform. Sci. and Syst. (CISS), Princeton, USA, Mar. 2012, pp. 1–5.
[63] M. P. Fossorier, “Quasicyclic low-density parity-check codes from circulant per-
mutation matrices,” IEEE Trans. Inf. Theory, vol. 50, no. 8, pp. 1788–1793, Jul.
2004.
[64] M. Milicevic and P. G. Gulak, “A multi-Gb/s frame-interleaved LDPC decoder
with path-unrolled message passing in 28-nm CMOS,” IEEE Trans. Very Large
Scale Integr. (VLSI) Syst., vol. 26, no. 10, pp. 1908–1921, Jun. 2018.
[65] Z. Li, L. Chen, L. Zeng, S. Lin, and W. H. Fong, “Efficient encoding of quasi-cyclic
low-density parity-check codes,” IEEE Trans. Commun., vol. 54, no. 1, pp. 71–81,
Jan. 2006.
[66] D. E. Hocevar, “A reduced complexity decoder architecture via layered decoding of
LDPC codes,” in Proc. IEEE Workshop Signal Processing and Systems (SIPS.04),
Austin, USA, Oct. 2004, pp. 107–112.
[67] G. D. Forney, “Concatenated codes,” Ph.D. dissertation, Research Laboratory of
Electronics, Massachusetts Institute of Technology, 1965.
[68] I. Dumer, “Low-density parity-check code constructions,” in Handbook of Coding
Theory, V. S. Pless and W. C. Huffman, Ed. Amsterdam, The Netherlands:
Elsevier Science, 1998, ch. 23, pp. 1911–1988.
[69] E. L. Blokh and V. V. Zyablov, “Coding of generalized concatenated codes,” Probl.
Pered. Inform., vol. 10, no. 3, pp. 45–50, 1974.
[70] M. Bossert, Channel coding for telecommunications. Hoboken, NJ: John Wiley &
Sons, 1999.
[71] J. L. Massey, “Coding and modulation in digital communications,” in International
Zurich Seminar on Digital Communications, Zurich, Switzerland, Mar. 1974.
[72] G. Ungerboeck, “Channel coding with multilevel/phase signals,” IEEE Trans. Inf.
Theory, vol. 28, no. 1, pp. 55–67, Jan. 1982.
[73] H. Imai and S. Hirakawa, “A new multilevel coding method using error-correcting
codes,” IEEE Trans. Inf. Theory, vol. 23, no. 3, pp. 371–377, May 1977.
[74] G. Ungerboeck, “Trellis-coded modulation with redundant signal sets Part I: In-
troduction,” IEEE Commun. Mag., vol. 25, no. 2, pp. 5–11, Feb. 1987.
[75] ——, “Trellis-coded modulation with redundant signal sets Part II: State of the
art,” IEEE Commun. Mag., vol. 25, no. 2, pp. 12–21, Feb. 1987.
[76] R. Zamir, Lattice Coding for Signals and Networks: A Structured Coding Approach
to Quantization, Modulation, and Multiuser Information Theory. Cambridge,
England: Cambridge University Press, 2014.
[77] G. D. Forney Jr, “Coset codes. I. Introduction and geometrical classification,” IEEE
Trans. Inf. Theory, vol. 34, no. 5, pp. 1123–1151, Sep. 1988.
[78] ——, “Coset codes. II. Binary lattices and related codes,” IEEE Trans. Inf. Theory,
vol. 34, no. 5, pp. 1152–1187, Sep. 1988.
[79] U. Wachsmann, R. F. Fischer, and J. B. Huber, “Multilevel codes: Theoretical
concepts and practical design rules,” IEEE Trans. Inf. Theory, vol. 45, no. 5, pp.
1361–1391, Jul. 1999.
[80] Asymmetric Digital Subscriber Line Transceivers 2 (ADSL2), Int. Telecommun.
Union (ITU) Std. Recommendation G.992.3, 2009.
[81] L. Beygi, E. Agrell, and M. Karlsson, “On the dimensionality of multilevel coded
modulation in the high SNR regime,” IEEE Commun. Lett., vol. 14, no. 11, pp.
1056–1058, 2010.
[82] F. Frey, S. Stern, R. Emmerich, C. Schubert, J. K. Fischer, and R. F. H. Fischer,
“Coded modulation using a 512-ary Hurwitz-integer constellation,” in Proc. 45th
Europ. Conf. Opt. Commun., Dublin, Ireland, Sep. 2019, pp. (W2D2)1–4.
[83] E. Zehavi, “8-PSK trellis codes for a Rayleigh channel,” IEEE Trans. Commun.,
vol. 40, no. 5, pp. 873–884, 1992.
[84] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation,” IEEE
Trans. Inf. Theory, vol. 44, no. 3, pp. 927–946, May 1998.
[85] L. Szczecinski and A. Alvarado, Bit-interleaved Coded Modulation: Fundamentals,
Analysis and Design. Hoboken, NJ: John Wiley & Sons, 2015.
[86] B. P. Smith and F. R. Kschischang, “A pragmatic coded modulation scheme for
high-spectral-efficiency fiber-optic communications,” J. Lightw. Technol., vol. 30,
no. 13, pp. 2047–2053, Jul. 2012.
[87] M. Barakatain and F. R. Kschischang, “Low-complexity concatenated LDPC-
staircase codes,” J. Lightw. Technol., vol. 36, no. 12, pp. 2443–2449, Jun. 2018,
(correction: vol. 37, no. 3, p. 1070, 2019).
[88] M. J. D. Powell, “A fast algorithm for nonlinearly constrained optimization calcu-
lations,” in Numerical Analysis. Springer, 1978, pp. 144–157.
[89] J. Zhao, F. Zarkeshvari, and A. H. Banihashemi, “On implementation of min-
sum algorithm and its modifications for decoding low-density parity-check (LDPC)
codes,” IEEE Trans. Commun., vol. 53, no. 4, pp. 549–554, Apr. 2005.
[90] K. Onohara, T. Sugihara, Y. Konishi, Y. Miyata, T. Inoue, S. Kametani, K. Sug-
ihara, K. Kubo, H. Yoshida, and T. Mizuochi, “Soft-decision-based forward error
correction for 100 Gb/s transport systems,” IEEE J. Sel. Topics Quantum Elec-
tron., vol. 16, no. 5, pp. 1258–1267, Sep. 2010.
[91] Y. Miyata, K. Kubo, K. Sugihara, T. Ichikawa, W. Matsumoto, H. Yoshida, and
T. Mizuochi, “Performance improvement of a triple-concatenated FEC by a UEP-
BCH product code for 100 Gb/s optical transport networks,” in Proc. OptoElec-
tronics Commun. Conf., Wuhan, China, May 2013, pp. (ThR2–2)1–3.
[92] D. Chang, F. Yu, Z. Xiao, N. Stojanovic, F. N. Hauske, Y. Cai, C. Xie, L. Li,
X. Xu, and Q. Xiong, “LDPC convolutional codes using layered decoding algorithm
for high speed coherent optical transmission,” in Proc. IEEE/OSA Optical Fiber
Commun. Conf., 2012, pp. (OW1H.4)1–3.
[93] D. Morero, M. Castrillon, F. Ramos, T. Goette, O. Agazzi, and M. Hueda, “Non-
concatenated FEC codes for ultra-high speed optical transport networks,” in Proc.
IEEE Global Telecommun. Conf., Dec. 2011, pp. 1–5.
[94] K. Sugihara, K. Ishii, K. Dohi, K. Kubo, T. Sugihara, and W. Matsumoto, “Scal-
able SD-FEC for efficient next-generation optical networks,” in Proc. Eur. Conf.
Exhibit. Opt. Commun., 2016, pp. 568–570.
[95] D. Chang, F. Yu, Z. Xiao, Y. Li, N. Stojanovic, C. Xie, X. Shi, X. Xu, and Q. Xiong,
“FPGA verification of a single QC-LDPC code for 100 Gb/s optical systems without
error floor down to BER of 10^−15,” in Proc. IEEE/OSA Optical Fiber Commun.
Conf., 2011, pp. (OTuN2)1–3.
[96] A. Amraoui, A. Montanari, T. Richardson, and R. Urbanke, “Finite-length scaling
for iteratively decoded LDPC ensembles,” IEEE Trans. Inf. Theory, vol. 55, no. 2,
pp. 473–498, Feb. 2009.
[97] M. Barakatain, D. Lentner, G. Bocherer, and F. R. Kschischang, “Performance-
complexity tradeoffs of concatenated FEC for 64-QAM MLC and BICM,” in Proc.
45th Europ. Conf. Optic. Commun. (ECOC), Dublin, Ireland, Sep. 2019, pp.
(Tu.1.B.4)1–4.
[98] ——, “Performance-complexity tradeoffs of concatenated FEC for higher-order
modulation,” J. Lightw. Technol., vol. 38, no. 11, pp. 2944–2953, Jun. 2020.
[99] F. Steiner, G. Bocherer, and G. Liva, “Protograph-based LDPC code design for
shaped bit-metric decoding,” J. Sel. Areas Commun., vol. 34, no. 2, pp. 397–407,
Feb. 2016.
[100] F. Gray, “Pulse code communication,” U.S. Patent 2 632 058, Mar. 17, 1953.
[101] T. Richardson and R. Urbanke, “Multi-edge type LDPC codes,” in Information,
Coding and Mathematics: Proceedings of Workshop Honoring Prof. Bob McEliece
on his 60th Birthday, M. Blaum, P. G. Farrell, and H. C. A. van Tilborg, Eds. New
York, NY: Springer, Apr. 2002, pp. 24–25.
[102] J. Thorpe, “Low-density parity-check (LDPC) codes constructed from pro-
tographs,” IPN Progr. Rep., vol. 42, no. 154, pp. 42–154, Aug. 2003.
[103] G. Liva and M. Chiani, “Protograph LDPC codes design based on EXIT analysis,”
in Proc. IEEE Global Telecommun. Conf. (GLOBECOM), Washington, DC, Nov.
2007, pp. 3250–3254.
[104] F. Brannstrom, L. K. Rasmussen, and A. J. Grant, “Convergence analysis and
optimal scheduling for multiple concatenated codes,” IEEE Trans. Inf. Theory,
vol. 51, no. 9, pp. 3354–3364, Sep. 2005.
[105] E. Paolini and M. Flanagan, “Low-density parity-check code constructions,” in
Channel Coding: Theory, Algorithms, and Applications, D. Declercq, M. Fossorier,
and E. Biglieri, Eds. Cambridge, MA: Academic Press, 2014, pp. 141–209.
[106] R. Storn and K. Price, “Differential evolution–a simple and efficient heuristic for
global optimization over continuous spaces,” J. Global Optim., vol. 11, no. 4, pp.
341–359, Dec. 1997.
[107] M. Barakatain and F. R. Kschischang, “Low-complexity rate- and channel-
configurable concatenated codes,” J. Lightw. Technol., 2020, early access version.
[Online]. Available: https://doi.org/10.1109/JLT.2020.3046473
[108] J. Shi and R. D. Wesel, “A study on universal codes with finite block lengths,”
IEEE Trans. Inf. Theory, vol. 53, no. 9, pp. 3066–3074, Sep. 2007.
[109] H. Esfahanizadeh, A. Hareedy, R. Wu, R. Galbraith, and L. Dolecek, “Spatially-
coupled codes for channels with SNR variation,” IEEE Trans. Magn., vol. 54,
no. 11, pp. 1–5, Nov. 2018.
[110] D. Chang, F. Yu, Z. Xiao, N. Stojanovic, F. N. Hauske, Y. Cai, C. Xie, L. Li, X. Xu,
and Q. Xiong, “LDPC convolutional codes using layered decoding algorithm for
high speed coherent optical transmission,” in Proc. Optical Fiber Commun. Conf.
(OFC), Los Angeles, USA, Mar. 2012, p. OW1H.4.
[111] J. D. Andersen, K. J. Larsen, C. Bering, S. Forchhammer, F. Da Ros, K. Dalgaard,
and S. Iqbal, “A configurable FPGA FEC unit for Tb/s optical communication,”
in Proc. IEEE Int. Conf. Commun. (ICC), Paris, France, May 2017, pp. 1–6.
[112] K. Ishii, K. Dohi, K. Kubo, K. Sugihara, Y. Miyata, and T. Sugihara, “A study on
power-scaling of triple-concatenated FEC for optical transport networks,” in Proc.
Europ. Conf. Optical Commun., Valencia, Spain, Sep. 2015, pp. (Tu.3.4.2)1–3.
[113] D. A. A. Mello, A. N. Barreto, T. C. de Lima, T. F. Portela, L. Beygi, and J. M.
Kahn, “Optical networking with variable-code-rate transceivers,” J. Lightw. Tech-
nol., vol. 32, no. 2, pp. 257–266, Jan. 2013.
[114] G. Bosco, “Advanced modulation techniques for flexible optical transceivers: The
rate/reach tradeoff,” J. Lightw. Technol., vol. 37, no. 1, pp. 36–49, Jan. 2018.
[115] L. Schmalen, L. M. Zhang, and U. Gebhard, “Distributed rate-adaptive staircase
codes for connectionless optical metro networks,” in Proc. Optical Fiber Commun.
Conf. (OFC), Los Angeles, USA, Mar. 2017, pp. W1J–2.
[116] T. Koike-Akino, D. S. Millar, K. Parsons, and K. Kojima, “Rate-adaptive LDPC
convolutional coding with joint layered scheduling and shortening design,” in Proc.
Optical Fiber Commun. Conf. (OFC), San Francisco, USA, Mar. 2018, p. Tu3C.1.
[117] V. Jain, C. Fougstedt, and P. Larsson-Edefors, “Variable-rate FEC decoder VLSI
architecture for 400G rate-adaptive optical communication,” in Proc. IEEE Int.
Conf. Electron., Circuits, Systems (ICECS), Genova, Italy, Nov. 2019, pp. 45–48.
[118] G.-H. Gho and J. M. Kahn, “Rate-adaptive modulation and coding for optical
fiber transmission systems,” J. Lightw. Technol., vol. 30, no. 12, pp. 1818–1828,
Jun. 2012.
[119] M. Arabaci, I. B. Djordjevic, L. Xu, and T. Wang, “Nonbinary LDPC-coded mod-
ulation for rate-adaptive optical fiber communication without bandwidth expan-
sion,” IEEE Photon. Technol. Lett., vol. 24, no. 16, pp. 1402–1404, Jun. 2012.
[120] L. Beygi, E. Agrell, J. M. Kahn, and M. Karlsson, “Rate-adaptive coded mod-
ulation for fiber-optic communications,” J. Lightw. Technol., vol. 32, no. 2, pp.
333–343, Jan. 2013.
[121] B. Chen, Y. Lei, D. Lavery, C. Okonkwo, and A. Alvarado, “Rate-adaptive coded
modulation with geometrically-shaped constellations,” in Proc. Asia Commun.
Photon. Conf. (ACP), Hangzhou, China, Oct. 2018, pp. 1–3.
[122] D. S. Millar, T. Fehenberger, T. Koike-Akino, K. Kojima, and K. Parsons, “Distri-
bution matching for high spectral efficiency optical communication with multiset
partitions,” J. Lightw. Technol., vol. 37, no. 2, pp. 517–523, Jan. 2019.
[123] F. Buchali, F. Steiner, G. Bocherer, L. Schmalen, P. Schulte, and W. Idler, “Rate
adaptation and reach increase by probabilistically shaped 64-QAM: An experimen-
tal demonstration,” J. Lightw. Technol., vol. 34, no. 7, pp. 1599–1609, Apr. 2016.
[124] D. A. Morero, M. A. Castrillon, T. A. Goette, M. S. Schnidrig, F. A. Ramos,
M. C. Asinari, D. E. Crivelli, and M. R. Hueda, “Experimental demonstration of
a variable-rate LDPC code with adaptive low-power decoding for next-generation
optical networks,” in Proc. IEEE Photon. Conf., Waikoloa, USA, Oct. 2016, pp.
307–308.
[125] A. S. Thyagaturu, A. Mercian, M. P. McGarry, M. Reisslein, and W. Kellerer, “Soft-
ware defined optical networks (SDONs): A comprehensive survey,” IEEE Commun.
Surveys Tuts., vol. 18, no. 4, pp. 2738–2786, Oct. 2016.
[126] C. Pan and F. R. Kschischang, “Probabilistic 16-QAM shaping in WDM systems,”
J. Lightw. Technol., vol. 34, no. 18, pp. 4285–4292, Sep. 2016.
[127] G. Bocherer, P. Schulte, and F. Steiner, “Probabilistic shaping and forward error
correction for fiber-optic communication systems,” J. Lightw. Technol., vol. 37,
no. 2, pp. 230–244, Jan. 2019.
[128] S. ten Brink, G. Kramer, and A. Ashikhmin, “Design of low-density parity-check
codes for modulation and detection,” IEEE Trans. Commun., vol. 52, no. 4, pp.
670–678, Apr. 2004.
[129] E. Agrell and M. Karlsson, “Power-efficient modulation formats in coherent trans-
mission systems,” J. Lightw. Technol., vol. 27, no. 22, pp. 5115–5126, Nov. 2009.
[130] M. Karlsson and E. Agrell, Multidimensional Optimized Optical Modulation For-
mats. Hoboken, USA: John Wiley & Sons, 2016, ch. 2, pp. 13–64.
[131] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, 3rd ed.
New York, USA: Springer-Verlag, 1999.
[132] S. Stern, F. Frey, J. K. Fischer, and R. F. H. Fischer, “Two-stage dimension-wise
coded modulation for four-dimensional Hurwitz-integer constellations,” in Proc.
12th Int. ITG Conf. on Systems, Commun. and Coding (SCC), Rostock, Germany,
Feb. 2019, pp. 197–202.
[133] G. Welti and J. Lee, “Digital transmission with coherent four-dimensional mod-
ulation,” IEEE Trans. Inf. Theory, vol. 20, no. 4, pp. 497–502, Jul. 1974.
[134] S. Stern, M. Barakatain, F. Frey, J. Pfeiffer, J. K. Fischer, and R. F. Fischer,
“Coded modulation for four-dimensional signal constellations with concatenated
non-binary forward error correction,” in Proc. 46th Europ. Conf. Optic. Commun.
(ECOC), Brussels, Belgium, Dec. 2020, pp. (We1F.4)1–4.
[135] H. Song and J. R. Cruz, “Reduced-complexity decoding of Q-ary LDPC codes for
magnetic recording,” IEEE Trans. Magn., vol. 39, no. 2, pp. 1081–1087, Mar. 2003.
[136] D. Declercq and M. Fossorier, “Decoding algorithms for nonbinary LDPC codes
over GF(q),” IEEE Trans. Commun., vol. 55, no. 4, pp. 633–643, Apr. 2007.
[137] A. Voicila, D. Declercq, F. Verdier, M. Fossorier, and P. Urard, “Low-complexity
decoding for non-binary LDPC codes in high order fields,” IEEE Trans. Commun.,
vol. 58, no. 5, pp. 1365–1375, May 2010.
[138] V. B. Wijekoon, E. Viterbo, and Y. Hong, “A low complexity decoding algorithm
for NB-LDPC codes over quadratic extension fields,” in Proc. IEEE Int. Symp. Inf.
Theory (ISIT), Los Angeles, USA, Jun. 2020, p. C.4.1.
[139] M. C. Davey and D. J. MacKay, “Monte Carlo simulations of infinite low den-
sity parity check codes over GF(q),” in Int. Workshop Optim. Codes Rel. Topics,
Bulgaria, Jun. 1998, pp. 9–15.
[140] M. Gorgoglione, V. Savin, and D. Declercq, “Optimized puncturing distributions
for irregular non-binary LDPC codes,” in Int. Symp. Inf. Theory Appl. (ISITA),
Taichung, Taiwan, Oct. 2010, pp. 400–405.
[141] M. Beermann, E. Monzo, L. Schmalen, and P. Vary, “GPU accelerated belief prop-
agation decoding of non-binary LDPC codes with parallel and sequential schedul-
ing,” J. Signal Process. Syst., vol. 78, no. 1, pp. 21–34, Jan. 2015.
[142] K. Zeger and A. Gersho, “Pseudo-Gray coding,” IEEE Trans. Commun., vol. 38,
no. 12, pp. 2147–2158, Dec. 1990.
[143] D. A. Spielman, “Linear-time encodable and decodable error-correcting codes,”
IEEE Trans. Inf. Theory, vol. 42, no. 6, pp. 1723–1731, Jun. 1996.
[144] H. Roozbehani and Y. Polyanskiy, “Triangulation codes: a family of non-linear
codes with graceful degradation,” in 2018 Conf. Inform. Sciences and Syst. (CISS),
Princeton, USA, Mar. 2018, pp. 1–6.
[145] ——, “Low density majority codes and the problem of graceful degradation,”
Nov. 2019. [Online]. Available: http://arxiv.org/abs/1911.12263v1
[146] R. G. Gallager, Information theory and reliable communication. Hoboken, NJ:
John Wiley & Sons, 1968.
[147] T. M. Cover, Elements of information theory. Hoboken, NJ: John Wiley & Sons,
1999.
[148] R. J. McEliece, The Theory of Information and Coding. Cambridge, U.K.:
Cambridge University Press, 2002.
[149] C. E. Shannon, “A mathematical theory of communication,” Bell Syst. Tech. J.,
vol. 27, no. 3, pp. 379–423, Jul. 1948.
[150] J. L. Massey, “Joint source and channel coding,” in Commun. Syst. Random Process
Theory, J. K. Skwirzynski, Ed. Alphen aan den Rijn, The Netherlands: Sijthoff
and Noordhoff, 1978, pp. 279–293.
[151] A. Gupta and S. Verdu, “Nonlinear sparse-graph codes for lossy compression,”
IEEE Trans. Inf. Theory, vol. 55, no. 5, pp. 1961–1975, Apr. 2009.
[152] Z. Sun, M. Shao, J. Chen, K. M. Wong, and X. Wu, “Achieving the rate-distortion
bound with low-density generator matrix codes,” IEEE Trans. Commun., vol. 58,
no. 6, pp. 1643–1653, Jun. 2010.
[153] R. Venkataramanan, T. Sarkar, and S. Tatikonda, “Lossy compression via sparse
linear regression: Computationally efficient encoding and decoding,” IEEE Trans.
Inf. Theory, vol. 60, no. 6, pp. 3265–3278, Apr. 2014.
[154] V. Aref, N. Macris, and M. Vuffray, “Approaching the rate-distortion limit with
spatial coupling, belief propagation, and decimation,” IEEE Trans. Inf. Theory,
vol. 61, no. 7, pp. 3954–3979, May 2015.
[155] E. J. Ionascu, T. Martinsen, and P. Stanica, “Bisecting binomial coefficients,”
Discrete Applied Mathematics, vol. 227, pp. 70–83, Aug. 2017.
[156] E. J. Ionascu, “A variation on bisecting the binomial coefficients,” Mar. 2018.
[Online]. Available: http://arxiv.org/abs/1712.01243
[157] H. Ma, W. K. Leung, X. Yan, K. Law, and M. Fossorier, “Delayed bit interleaved
coded modulation,” in Proc. 9th Int. Symp. Turbo Codes Iterative Inf. Process.
(ISTC), Brest, France, Aug. 2016, pp. 86–90.