Error Control Coding - CERN Indico
Coding
with emphasis on
Error Control Coding
A. Marchioro / PH-ESE-ME
“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at
another point”
Claude Shannon
A Mathematical Theory of Communication,
The Bell System Technical Journal, vol. 27, pp.379-423, July 1948
What type of coding?
Four main reasons for using coding:
Adapt to the electrical or accidental characteristics of the channel
Reduce the amount of data to be transmitted → Source Coding
Control errors → Error Coding
Conceal and hide data → Cryptography
Motivation: Adapt
Data may need to be coded because of:
Electrical reasons
• Clock recovery
• Line balancing
  • Transmit statistically the same number of 0's and 1's
  • Aid gain control in AGC amplifiers at the receiver's end
• Facilitate synchronization
• Spectrum limitation and modification
• Line adaptation
  • PAM
Logical reasons
• Simplification of frame recognition
  • E.g. add an easily recognizable start/stop condition
Protocol reasons
• Distinguish data from control information
Some digital waveforms
NRZ-L
NRZ-M
NRZ-S
Unipolar-RZ
Bipolar-RZ
Bi-Φ-L
Bi-Φ-M
Bi-Φ-S
Delay Mod
Di-code NRZ
Di-code RZ
[Waveform figure for the example bit sequence: 1 0 1 1 0 0 0 1 1 0 1]
Motivation: Error Control
“Two weekends in a row I came in and found that all my stuff had been
dumped and nothing was done. I was really aroused and annoyed
and I wanted those answers and two weekends had been lost. And
so I said, ‘Damn it, if the machine can detect an error, why can’t it
locate the position of the error and correct it?’”
from an interview with R. Hamming,
February 3-4, 1977, quoted in T. Thompson, p.17
“The purpose of this memorandum is to give some practical codes
which may detect and correct all errors of a given probability of
occurrence, and which detect errors of even a rarer occurrence."
from R. Hamming,
‘Self-Correcting Codes – Case 20878,
Memorandum 1130-RWH-MFW,
Bell Telephone Laboratories, July 27, 1947
Motivation: Error Control
Hardware is not perfect. Errors are introduced by:
• Signal reduction
  • Transmission from satellite
  • Trans-oceanic optical fibers
• Noise and interference
  • Radio communication
• People
  • Reading a credit card number
• Imperfect parts
  • Richard Hamming worked first to avoid catastrophic errors in early relay-based computers
• Radioactive particles causing perturbations in electronic circuits
• Wear-out and aging
  • e.g. fingerprints and scratches on CDs and DVDs
Basic idea: add redundant information so as to make up for lost or garbled information
Applications
Applications: Voyager 2 Flight System Performance (Dec 2008)
Down transmission speed: ~7 kbit/s
Up transmission speed: ~50 bit/s
Propellant/power consumables status as of this report:
  Spacecraft consumption, one week[*] (gm): 6.29
  Propellant remaining (kg): 28.63
  Output (watts): 281.5
Range, velocity and round-trip light time as of Nov 7, 2008:
  Distance from the Sun (km): 13,077,000,000
  Distance from the Earth (km): 13,132,000,000
  Total distance traveled (km): 19,409,000,000
  Velocity relative to Sun (km/s): 15.524
  Velocity relative to Earth (km/s): 41.592
[*] Power source is a Radioisotope Thermoelectric Generator
http://voyager.jpl.nasa.gov/mission/weekly-reports/
Applications
ADSL needs FEC
Parity
In B = {0,1}, start with a message word S = {s1 s2 s3 s4 s5 s6 s7}.
Compute a "parity" bit s8 defined as:
s8 = s1 ⊕ s2 ⊕ s3 ⊕ s4 ⊕ s5 ⊕ s6 ⊕ s7
where ⊕ is the exclusive-OR (the sum mod 2).
A parity check can detect all single errors (but cannot give the position).
A parity check cannot detect double errors.
Used:
- often in computer memories
- in data transmission to serial terminals
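The parity check above can be sketched in a few lines of Python (an illustration, not part of the slides):

```python
def parity_bit(bits):
    """Return s8 = s1 XOR s2 XOR ... XOR s7 (sum mod 2)."""
    p = 0
    for b in bits:
        p ^= b
    return p

def has_error(word):
    """A received word passes the check iff the XOR of all its bits is 0."""
    return parity_bit(word) != 0

msg = [1, 0, 1, 1, 0, 0, 1]
sent = msg + [parity_bit(msg)]     # append the parity bit
assert not has_error(sent)         # clean word passes the check
corrupted = sent[:]
corrupted[3] ^= 1                  # single error: detected
assert has_error(corrupted)
corrupted[5] ^= 1                  # second error: undetected
assert not has_error(corrupted)
```

The last two assertions reproduce the two properties stated above: single errors are detected, double errors are not.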
Two-Dimensional Parity
Each row carries a parity bit (parity X) and a final row holds the column parities (parity Y).
Transmitted array:
1 0 1 1 1 0 0 0
0 1 0 0 0 1 1 1
1 1 0 0 0 0 1 1
0 1 0 1 1 0 1 0
1 0 0 1 0 1 1 0
0 0 0 1 0 1 0 0
1 1 0 0 1 0 0 1
0 0 1 0 1 1 0 1
Received array (2 errors):
1 0 1 1 1 0 0 0
0 1 0 0 0 1 1 1
1 1 0 0 0 0 1 1
0 1 0 1 0 0 1 1
1 0 0 1 1 1 1 1
0 0 0 1 0 1 0 0
1 1 0 0 1 0 0 1
0 0 1 0 1 1 0 1
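The mechanism can be sketched in Python (an illustration, not from the slides): a single flipped bit is located at the intersection of the failing row check and the failing column check.

```python
def encode_2d(rows):
    """Append a parity bit to each row, then a column-parity row."""
    coded = [r + [sum(r) % 2] for r in rows]
    coded.append([sum(col) % 2 for col in zip(*coded)])
    return coded

def locate_single_error(coded):
    """Return (row, col) of a single flipped bit, or None if all checks pass."""
    bad_rows = [i for i, r in enumerate(coded) if sum(r) % 2]
    bad_cols = [j for j, c in enumerate(zip(*coded)) if sum(c) % 2]
    if bad_rows and bad_cols:
        return bad_rows[0], bad_cols[0]
    return None

data = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
block = encode_2d(data)
assert locate_single_error(block) is None   # clean block: all checks pass
block[1][2] ^= 1                            # flip one bit
assert locate_single_error(block) == (1, 2) # located and thus correctable
```

Unlike simple parity, the two-dimensional scheme can therefore correct any single error; with two errors (as in the arrays above) it can still detect that something went wrong, but in general it can no longer point at the culprits.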
Repetition Code
Take each symbol si in S and repeat it n times.
This is an (n, 1) code.
For example the word {s1s2s3} becomes the codeword
{s1s1s1s2s2s2s3s3s3}
The efficiency (= rate) of the code is 1/n.
The minimum distance (see later) is n, and the number of errors t that can
be corrected is:
t = ⌊(n – 1)/2⌋
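A minimal sketch of the (3, 1) repetition code in Python (illustrative, not from the slides): decoding is a majority vote over each group of n repeats, which fixes up to ⌊(n–1)/2⌋ errors per group.

```python
from collections import Counter

def rep_encode(bits, n=3):
    """Repeat each symbol n times: an (n, 1) code of rate 1/n."""
    return [b for b in bits for _ in range(n)]

def rep_decode(coded, n=3):
    """Majority vote over each group of n repeats."""
    out = []
    for i in range(0, len(coded), n):
        group = coded[i:i + n]
        out.append(Counter(group).most_common(1)[0][0])
    return out

word = [1, 0, 1]
sent = rep_encode(word)      # [1,1,1, 0,0,0, 1,1,1]
sent[4] ^= 1                 # one error per group is correctable for n = 3
assert rep_decode(sent) == word
```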
Repetition anywhere?
Redundancy in mainframes:
http://www.research.ibm.com/journal/rd/435/spainhower.html
http://www.fujitsu.com/downloads/PRMQST/documents/whitepaper/PRIMEQUEST-High-Availability.pdf
In RAID disk arrays:
RAID1 (Mirroring)
Space electronics:
“A Comparison of Fault-Tolerant State Machine Architectures for Space-Borne Electronics”, IEEE Transactions on Reliability, Vol. 45. No. 1. 1996 March
Triple Module Redundancy
Triple redundancy: three copies of the same user logic + state register.
Voting logic decides 2 out of 3 (majority).
Used in:
• High-reliability electronics
• Mainframes
Problems:
• 300% area and power
• corrects only 1 error
• can get very wrong with two errors
[Block diagram: FSM1, FSM2, FSM3 driven by the same Input and CLK; voting logic produces the Output]
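The 2-out-of-3 voter is a single boolean expression; a sketch in Python (illustrative, not from the slides):

```python
def majority(a, b, c):
    """2-out-of-3 voter: the output agrees with at least two inputs."""
    return (a & b) | (b & c) | (a & c)

assert majority(1, 1, 0) == 1   # one faulty module is outvoted
assert majority(0, 1, 0) == 0
assert majority(1, 0, 1) == 1
```

Note the failure mode mentioned above: if two modules are wrong in the same way, the voter faithfully outputs the wrong value.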
ISBN (1)
ISBN: d1 d2 d3 d4 d5 d6 d7 d8 d9 d10
T = (10·d1 + 9·d2 + 8·d3 + 7·d4 + 6·d5 + 5·d6 + 4·d7 + 3·d8 + 2·d9) mod 11
d10 = 11 – T
Example: the check digit in ISBN 038550420-9 is computed as:
11 – [(10·0 + 9·3 + 8·8 + 7·5 + 6·5 + 5·0 + 4·4 + 3·2 + 2·0) mod 11] =
11 – 2 = 9, q.e.d.
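A sketch of the computation in Python (illustrative, not from the slides; the 'X' convention for a check value of 10 is standard ISBN practice, not stated above):

```python
def isbn10_check_digit(d):
    """d: the first nine ISBN digits. Returns the check digit as a string
    ((11 - T) mod 11; the value 10 is printed as 'X' on real ISBNs)."""
    t = sum((10 - i) * di for i, di in enumerate(d)) % 11
    c = (11 - t) % 11
    return 'X' if c == 10 else str(c)

# Slide example: ISBN 038550420-9
assert isbn10_check_digit([0, 3, 8, 5, 5, 0, 4, 2, 0]) == '9'
```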
ISBN (2)
Statement:
All single errors are detected by the ISBN check digit.
Proof:
An error will not be detected if T is changed by a multiple of 11, as the check digit is computed mod 11.
Assume that the i-th digit is wrong, i.e. it was di and was changed into d'i with 0 ≤ d'i ≤ 9.
The difference of the two terms in the check-sum is:
i · di – i · d'i = i · (di – d'i)
and for the error to be undetected it must be that
gcd(11, i · (di – d'i)) = 11
which, by an elementary theorem in number theory, requires that gcd(11, i) = 11 or that gcd(11, (di – d'i)) = 11. But neither of these statements can be true, as i < 11 and –9 ≤ (di – d'i) ≤ 9.
ISBN (3)
Statement:
All exchanges of two consecutive digits can be detected by the ISBN code.
Proof:
Assume that two consecutive digits ni and ni+1 are exchanged in an ISBN
number. In order for the error to be undetected, the T sum must be changed
by a multiple of 11. But if ni is the digit in position i in the sum, the two
terms in the sums are:
i · ni + (i+1) · ni+1
i · ni+1 + (i+1) · ni
and the difference is simply:
ni+1 – ni
which for 0 ≤ ni ≤ 9 always lies between –9 and +9, i.e. it is never a non-zero
multiple of 11 (and if it is 0, the two digits were equal and no real exchange occurred), q.e.d.
ISBN-13 (4)
The new 13-digit ISBN-13 check digit is constructed as:
T = (d1 + 3·d2 + d3 + 3·d4 + d5 + 3·d6 + d7 + 3·d8 + d9 + 3·d10 + d11 + 3·d12) mod 10
d13 = (10 – T) mod 10
Now several digit exchanges can NOT be recognized, but more books can be cataloged…
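A sketch in Python (illustrative, not from the slides); the test value is the 978-prefixed ISBN-13 built from the slide's ISBN-10 example, whose check digit works out to 1:

```python
def isbn13_check_digit(d):
    """d: the first twelve ISBN-13 digits; weights alternate 1, 3, 1, 3, ...
    The generated digit makes the full weighted sum a multiple of 10."""
    t = sum(di * (3 if i % 2 else 1) for i, di in enumerate(d)) % 10
    return (10 - t) % 10

# 978 + 038550420 (the ISBN-10 example above, minus its old check digit)
assert isbn13_check_digit([9, 7, 8, 0, 3, 8, 5, 5, 0, 4, 2, 0]) == 1
```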
Families of Error Control Methods
Block codes: the codeword is built only on the current message word.
Non-block codes: the codeword depends on the current message word and
on some past words, e.g.:
Convolutional
Examples of codes:
Hamming
Bose-Chaudhuri-Hocquenghem (BCH)
Golay
Reed-Solomon (RS)
Reed-Muller
Low-Density Parity-Check codes
Turbo codes
…
Theory and practice
Coding and decoding schemes, especially for algebraic codes, are
perfectly well described by mathematical methods providing the
transformations between message words and codewords.
The real detection and correction of errors instead requires much
more ad-hoc, heuristic methods that allow the original message to be
recovered from a corrupted received message.
Hamming (intuitive version)
[Venn diagram: three overlapping circles containing the source bits s1 s2 s3 s4 and parity bits c5 c6 c7]
Definition:
cj = computed to give even parity in its circle
(s: source bits, c: parity bits)
Notice:
the 16 code words in Hamming(7,4) differ from each other by at least 3 bits.
Hamming (2)
[Venn diagram: the received bits r1 … r7 placed in the three circles]
During transmission the message word
s1 s2 s3 s4 c5 c6 c7
is (potentially) modified by an error in (the unknown) position j and is received as:
r1 r2 r3 r4 r5 r6 r7
for example, for j = 2:
r1 r2 r3 r4 r5 r6 r7 = s1 s2 s3 s4 c5 c6 c7 ⊕ 0100000
Hamming (3)
Example:
for the original word 1000101,
assume that e = 0100000 occurred,
resulting in r = 1100101.
[Venn diagram: the received bits in the circles; circles with odd (= wrong) parity are marked]
Decoding and correcting trick:
can we find a single bit (assuming that there was just one error) that lies inside
all the marked circles and outside of the unmarked one?
Hamming Codes (1)
Another simple construction of Hamming Code:
Given the four data bits (a0,a1,a2,a3), construct
three parity bits as follows:
p0 = a0 + a1 + a2
p1 = a1 + a2 + a3
p2 = a0 + a1 + a3
(here “+” is modulo 2 addition) and send the
codeword: (a0, a1, a2, a3, p0, p1, p2).
The valid codewords are therefore given in the table
on the right:
Notice that we use only 2^4 = 16 of the 2^7 = 128 possible 7-bit words
as codewords.
0 0 0 0 0 0 0
0 0 0 1 0 1 1
0 0 1 0 1 1 0
0 0 1 1 1 0 1
0 1 0 0 1 1 1
0 1 0 1 1 0 0
0 1 1 0 0 0 1
0 1 1 1 0 1 0
1 0 0 0 1 0 1
1 0 0 1 1 1 0
1 0 1 0 0 1 1
1 0 1 1 0 0 0
1 1 0 0 0 1 0
1 1 0 1 0 0 1
1 1 1 0 1 0 0
1 1 1 1 1 1 1
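The table can be regenerated in a few lines of Python (an illustrative sketch, not from the slides), which also confirms the minimum-distance-3 claim made earlier:

```python
from itertools import product

def hamming_encode(a):
    """a = (a0, a1, a2, a3); '+' is mod-2 addition, as on the slide."""
    a0, a1, a2, a3 = a
    p0 = (a0 + a1 + a2) % 2
    p1 = (a1 + a2 + a3) % 2
    p2 = (a0 + a1 + a3) % 2
    return [a0, a1, a2, a3, p0, p1, p2]

# Reproduce rows of the codeword table
assert hamming_encode((0, 0, 0, 1)) == [0, 0, 0, 1, 0, 1, 1]
assert hamming_encode((1, 1, 1, 1)) == [1, 1, 1, 1, 1, 1, 1]

# Any two distinct codewords differ in at least 3 positions
words = [hamming_encode(a) for a in product((0, 1), repeat=4)]
assert min(sum(x != y for x, y in zip(u, v))
           for u in words for v in words if u != v) == 3
```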
Hamming Codes (2)
The decoder receives: (a’0, a’1, a’2, a’3, p’0, p’1, p’2) and computes:
s0 = p’0 + a’0 + a’1 + a’2
s1 = p’1 + a’1 + a’2 + a’3
s2 = p’2 + a’0 + a’1 + a’3
called the "syndromes". If there has been no error, these are all zero; if there has been one
error, the syndrome is non-zero. The syndromes depend only on the error
pattern, as in the table below:

Syndrome (s0 s1 s2)   Error pattern (a0 a1 a2 a3 p0 p1 p2)
0 0 0                 0 0 0 0 0 0 0
0 0 1                 0 0 0 0 0 0 1
0 1 0                 0 0 0 0 0 1 0
0 1 1                 0 0 0 1 0 0 0
1 0 0                 0 0 0 0 1 0 0
1 0 1                 1 0 0 0 0 0 0
1 1 0                 0 0 1 0 0 0 0
1 1 1                 0 1 0 0 0 0 0
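A decoder sketch in Python using the table above (illustrative, not from the slides; it assumes at most one error, as the code guarantees correction of one only):

```python
# Map (s0, s1, s2) to the error position in (a0, a1, a2, a3, p0, p1, p2)
SYNDROME_TO_POS = {
    (0, 0, 1): 6, (0, 1, 0): 5, (1, 0, 0): 4,   # parity-bit errors
    (1, 0, 1): 0, (1, 1, 1): 1, (1, 1, 0): 2, (0, 1, 1): 3,
}

def hamming_decode(r):
    """Compute the syndromes, flip the offending bit, return the data bits."""
    s0 = (r[4] + r[0] + r[1] + r[2]) % 2
    s1 = (r[5] + r[1] + r[2] + r[3]) % 2
    s2 = (r[6] + r[0] + r[1] + r[3]) % 2
    r = list(r)
    if (s0, s1, s2) != (0, 0, 0):
        r[SYNDROME_TO_POS[(s0, s1, s2)]] ^= 1
    return r[:4]

received = [1, 1, 0, 0, 1, 1, 0]   # codeword 1100010 with p0 flipped
assert hamming_decode(received) == [1, 1, 0, 0]
```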
Hamming Codes (3)
[Encoder schematic: data bits a0 … a3 pass through unchanged; three XOR trees compute p0, p1, p2]
Hardware for encoder
Hamming Codes (4)
[Decoder schematic: received bits a'0 … a'3, p'0, p'1, p'2 feed the syndrome XOR trees and the
correction logic, which outputs the corrected a0 … a3]
Hardware for decoder
Hamming Codes (5)
A compact description of the encoding operation and of the syndrome
computation may be given using matrix notation.
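The matrices themselves were lost in transcription; a reconstruction consistent with the parity definitions p0 = a0+a1+a2, p1 = a1+a2+a3, p2 = a0+a1+a3 (all arithmetic mod 2) is:

```latex
% Generator matrix G = [ I_4 | P ]: codeword c = a G
G =
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}
% Parity-check matrix H = [ P^T | I_3 ]: syndrome s = H r^T
H =
\begin{pmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1
\end{pmatrix}
```

By construction H G^T = 0, so every valid codeword has zero syndrome.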
Hamming in use
Block coding as linear maps
Error coding always implies an increase in the amount of information to
be sent or stored, as the 'extra' information has to provide some means
to recover what gets corrupted or lost during transmission or storage
(retrieval) of data.
Block coding performs this on finite blocks of data, without
reference to previous blocks, and with all redundant information
contained in the block itself.
Unlike block coding, convolutional coding performs encoding based
on the current set of data to be coded and on the history of previous
blocks, i.e., a given data set is mapped on a number of different data
sets, depending on the content of the previously coded sets.
Coding as a map F^k → F^n
[Diagram: the message space F^k mapped into the larger space F^n]
Error Detection in F^n
[Diagram: codewords in F^n; degradation due to transmission or storage (retrieval) moves a word.
A word near a unique codeword is recoverable; a word landing on another codeword is an
undetected error; a word between codewords is confused, unrecoverable]
Linear Codes: Structure
Definition: Linear Block Code
A linear code C is a subspace of dimension k of the vector space B^n
for simple binary vectors, or in general of GF(q)^n.
In other words, C is a non-empty set of n-tuples over GF(q), called
codewords, respecting the structure of a vector space: the sum of two
codewords is always a codeword, and the multiplication of a codeword
by a scalar element of the field is also always a codeword. From the
definition, it follows that the zero word must always be part of a linear
code, as
∀ c ∈ C, (c) + (–c) = 0 ∈ C
Cyclic Codes: Basic definitions
Let c = (c0, c1, …, cn-2, cn-1) ∈ GF(q)^n.
The vector:
c' = (cn-1, c0, c1, …, cn-2)
is called a right cyclic shift of c.
Definition: Cyclic code
An (n, k) block code C is said to be a cyclic code if it is linear and if for
every codeword c its right shift c' is also a codeword.
Polynomials can be considered as 'support' for codewords:
one can conveniently represent a codeword c = (c0, c1, …, cn-2, cn-1) as follows:
c(x) = c0 + c1x + c2x^2 + … + cn-1x^(n-1)
Cyclic Codes: Simple Example
The code:
c0 : 0000000 c1 : 1011100
c2 : 0101110 c3 : 1110010
c4 : 0010111 c5 : 1001011
c6 : 0111001 c7 : 1100101
is cyclic; in fact it can be noticed that, using shifts and linearity, starting with
cg = (1011100):
c0 : 0000000    c1 : cg
c2 : cg>>1      c3 : c1+c2
c4 : cg>>2      c5 : c1+c4
c6 : c2+c4      c7 : c1+c2+c4
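Both properties, linearity and closure under cyclic shifts, can be checked mechanically (an illustrative sketch, not from the slides):

```python
# The eight codewords of the example above
CODE = {'0000000', '1011100', '0101110', '1110010',
        '0010111', '1001011', '0111001', '1100101'}

def xor(u, v):
    """Component-wise sum mod 2 of two codewords."""
    return ''.join(str(int(a) ^ int(b)) for a, b in zip(u, v))

def rshift(c):
    """Right cyclic shift: the last symbol wraps to the front."""
    return c[-1] + c[:-1]

assert all(rshift(c) in CODE for c in CODE)                 # cyclic
assert all(xor(u, v) in CODE for u in CODE for v in CODE)   # linear
```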
Reed-Solomon: very
elementary introduction
RS coding, very simplified version
Let's imagine we want to transmit just two numbers and that we are ready
to add some redundancy to 'repair' faulty data.
One method could be:
Compute the straight line that goes through the two numbers, taken as y
values at the fixed x-coordinates x = 1 and x = 2, i.e. the line going through
the points (1, y1) and (2, y2).
Instead of transmitting just the two numbers, transmit the five y values
on the same straight line corresponding to the x-coordinates x = 1, 2, 3, 4, 5.
Now assume that one of the yi gets corrupted.
Clearly the other 4 numbers are sufficient to establish the straight line, and the y
value corresponding to the corrupted one can be reconstructed from the other
points.
RS coding, simplified version(2)
If two numbers are corrupted during transmission, then there are still
three on a straight line (well…, in the real worst case there might be two
straight lines, can you find out why?)
Once the straight line is found again among the “good” three y-points,
the exact location of the two corrupted y-points will be reconstructed.
RS coding, simplified version(3)
To transmit three numbers with the same protection, a parabola is built first:
Let y1, y2, y3 be the three numbers to be transmitted.
Let x1 = 1, x2 = 2, x3 = 3 be the corresponding x-coordinates, and build the
parabola through (x1, y1), (x2, y2) and (x3, y3).
Introduce some new x-points, say at x4 = 4, x5 = 5, x6 = 6, and then
compute the extra y-coordinates y4, y5, y6 on the parabola.
Now repeat the method used for the straight line to correct one or two
corrupted y-points.
RS coding, simplified version(4)
The original points are
(1,3), (2,1), (3,2)
to these we add:
(4,6), (5,13), (6,23)
and we actually transmit:
(3,1,2,6,13,23)
(The polynomial in this case is 8 – 6.5x + 1.5x^2)
RS coding, simplified version(5)
So, the trick is:
Let the number of points y={y1, y2, …, yk} to be transmitted be k
Determine a polynomial of degree k – 1 fitting them all at standard ‘x’
coordinates x1, x2, …, xk
Calculate some redundant points on the same polynomial at coordinates
xk+1, xk+2,…,xk+r
When the y data are received, see if they all fit on a polynomial of degree k – 1:
• If yes, no error has occurred
• If not:
  • Try to see if one point is not on the curve and recover it
  • If the search for one fails, try to see if two points are not on the
    curve and recover them
  • …
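The whole trick, polynomial fit plus trial-and-error recovery, can be sketched in Python over the rationals (an illustration only: a real RS code does exactly this over a finite field). The numbers are the slide's parabola example:

```python
from fractions import Fraction
from itertools import combinations

def lagrange_eval(pts, x):
    """Evaluate the interpolating polynomial through pts at x."""
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(pts):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(pts):
            if i != j:
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total

# k = 3 message points -> a degree-2 polynomial (the slide's parabola)
msg = [(1, 3), (2, 1), (3, 2)]
sent = msg + [(x, lagrange_eval(msg, x)) for x in (4, 5, 6)]
assert [y for _, y in sent] == [3, 1, 2, 6, 13, 23]   # matches the slide

# Corrupt one point, then search for a degree-2 fit
# agreeing with at least 5 of the 6 received points
recv = [(x, y if x != 5 else 99) for x, y in sent]
for subset in combinations(recv, 3):
    fit = [(x, lagrange_eval(list(subset), x)) for x, _ in recv]
    agree = sum(yf == y for (_, yf), (_, y) in zip(fit, recv))
    if agree >= 5:
        recovered = [y for _, y in fit]
        break
assert recovered == [3, 1, 2, 6, 13, 23]   # corrupted point repaired
```

Practical decoders replace this brute-force subset search with efficient algebraic algorithms, but the idea is the same.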
Example
1. Music or Data CD ROM Coding
CD recording block diagram
[Block diagram: A/D (44.1 kHz/channel, 1.41 Mbps) → CIRC Encoder (1.88 Mbps) → Control and
meta info (1.94 Mbps) → EFM → Synchronization (4.32 Mbps)]
Cross-Interleaver
Encoding block in CD
[Block diagram: Din {24×8} → C2 Encoder RS(28,24) → cross-interleaver (delays 2 … 26, 27)
→ C1 Encoder RS(32,28) → Dout {32×8}]
C1,2 Encoders
Both work in GF(2^8) and are derived from (255, 251) codes:
C2 shortened to (28, 24)
C1 shortened to (32, 28)
Remember that for RS codes
dmin = n – k + 1
therefore each can correct two errors.
C1,2 Encoders (2)
C1 could correct two errors, but actually when more than one error is
detected, the decoder signals a failure to the logic and inserts all
zeros (i.e. it replaces errors with erasures). The de-interleaver then
spreads the zeros, and the C2 decoder fixes those errors.
EFM (8-to-14 Modulation)

CD-ROM Error Correction Summary
Longest completely correctable burst: ~4,000 data bits (2.5 mm of track)
Longest interpolatable burst: ~12,300 bits (7.7 mm of track)
Undetected error rate: < 1 every 750 hours @ BER = 10^-3; negligible @ BER < 10^-4
Storage necessary: 2048 bytes of RAM
Conclusion
Error Control Coding is a heavily mathematical theory which
has proved to be fundamental for modern communication technologies.
Error correction can be traded off against transmission power when the
coding scheme can make up for the losses caused by small S/N ratios.
In the case of the GBT project (see seminar in three weeks),
the objectives were:
• to be able to correct bursts of errors
  • generated by SEU in the PIN diode
  • generated by hits in the analog or digital circuitry of the transceiver
• to introduce FEC with minimal latency
Bibliography
Good books on Coding:
R. Blahut, Algebraic Codes for Data Transmission, Cambridge U.P., 2003
O. Pretzel, Error Correcting Codes and Finite Fields, Oxford U.P., 1992
S. Wicker, Error Control Systems, Prentice Hall, 1995
The mathematics underneath:
J. A. Gallian, Contemporary Abstract Algebra, Houghton Mifflin, 2006
R. McEliece, Finite Fields for Computer Scientists and Engineers, Kluwer, 1986
Extra material
CRC (1)
CRC codes are a class of binary codes built on cyclic codes, i.e. with codewords
built as multiples of a generator polynomial g(x) with coefficients in GF(2).
CRC codes allow error detection (see below), but no error correction, and are used
extensively in network protocols which allow for retransmission.
Let us denote by
R_g(x)[·]
the operation of computing the remainder of the argument after dividing by g(x).
If m(x) is the original message polynomial and r = deg g(x), a CRC encoding
operation can be written as:
c(x) = x^r m(x) + R_g(x)[x^r m(x)]
Notice that deg(c(x)) = r + deg(m(x)).
CRC (2)
If an error occurs during transmission, the received word is now:
r(x) = c(x) + e(x)
where e(x) is the polynomial representing the error pattern.
To find out if an error has occurred, we compute the syndrome:
s(x) = R_g(x)[r(x)]
that is:
s(x) = R_g(x)[c(x) + e(x)] = R_g(x)[c(x)] + R_g(x)[e(x)]
but by construction of the cyclic code R_g(x)[c(x)] = 0 (see the systematic encoding above), therefore:
s(x) = R_g(x)[e(x)]
Notice that if e(x) is a code polynomial, say e(x) = c1(x), then s(x) = 0 and the error
passes undetected.
CRC (3)
CRC: an example
g(x) = x^16 + x^15 + x^2 + 1
m = [0,1,1,0,1,1,0,1,0,0,1,0,0,1,1,1] ↔ x^14 + x^13 + x^11 + x^10 + x^8 + x^5 + x^2 + x + 1
We get: x^16 m(x) + R_g(x)[x^16 m(x)] =
x^30 + x^29 + x^27 + x^26 + x^24 + x^21 + x^18 + x^17 + x^16 + x^14 + x^13 + x^11 + x^10 + x^9 + x^7 + x^6 + x^4 + x^2
or in bit format:
c = [0,1,1,0,1,1,0,1,0,0,1,0,0,1,1,1,0,1,1,0,1,1,1,0,1,1,0,1,0,1,0,0]
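The encoding formula of the previous slides can be sketched at the bit level in Python (an illustration, not from the slides), with polynomials held as integers, MSB = highest power:

```python
# g(x) = x^16 + x^15 + x^2 + 1 (CRC-ANSI), as in the example above
G = (1 << 16) | (1 << 15) | (1 << 2) | 1
R = 16                          # r = deg g(x)

def mod_g(p):
    """Remainder of p(x) divided by g(x) over GF(2) (long division)."""
    while p.bit_length() >= G.bit_length():
        p ^= G << (p.bit_length() - G.bit_length())
    return p

def crc_encode(m):
    """c(x) = x^r m(x) + R_g[x^r m(x)] (systematic encoding)."""
    shifted = m << R
    return shifted ^ mod_g(shifted)

m = 0b0110110100100111           # the message of the example above
c = crc_encode(m)
assert mod_g(c) == 0             # every codeword is a multiple of g(x)
assert c >> R == m               # systematic: the message sits in the high bits
assert mod_g(c ^ (1 << 5)) != 0  # a single-bit error gives a non-zero syndrome
```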
CRC (4)
Table of commonly used CRC generator polynomials:
CRC-4      g(x) = x^4 + x^3 + x^2 + x + 1
CRC-7      g(x) = x^7 + x^6 + x^4 + 1
CRC-8      g(x) = x^8 + x^7 + x^6 + x^4 + x^2 + 1
CRC-12     g(x) = x^12 + x^11 + x^3 + x^2 + x + 1
CRC-ANSI   g(x) = x^16 + x^15 + x^2 + 1
CRC-CCITT  g(x) = x^16 + x^12 + x^5 + 1
CRC-24     g(x) = x^24 + x^23 + x^14 + x^12 + x^8 + 1
CRC-32b    g(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
CRC (5)
Statement
All single-bit errors can be detected by a CRC.
Proof:
The generator polynomial must have more than one non-zero term,
otherwise the code would be an (n, n) code (this can be seen by shifting and
linearly combining all codewords), and the minimum distance between
codewords would be 1 (i.e. a meaningless code).
Now, if the generator polynomial has more than one non-zero term, it
cannot divide a polynomial with a single x^i term evenly, so there is always a
non-zero remainder, i.e. all single-bit errors are detected.
Equivalently: for an error to be undetected it must itself be a codeword (seen
above), i.e. a multiple of a generator with more than one non-zero term, so it
cannot be a polynomial with a single x^i term set.
Fundamental questions on codes
1. How is the code described and represented?
2. How is encoding accomplished?
3. How is decoding accomplished?
4. How are these two operations achieved in a computationally tractable time?
5. What is the performance of the code?
   What are its properties: number of codewords, weight of codewords?
6. Are there other families of codes that can provide better coding gain?
7. How are these codes found and described?
8. Are there constraints on the allowable values of n, k and dmin?
9. Is there any limit on the amount of coding gain possible?
10. For a given available SNR, is there a lower limit on the probability of error
    that can be achieved?
RS combined with interleaving
UDP packets in the TCP/IP protocol do not have guaranteed delivery.
RS is used to replace lost packets ("erasures").
The data stream is framed into blocks of 249 bytes and encoded into RS(255, 249) blocks; this code has dmin = 7 and can correct 6 erasures.
Messages are interleaved in blocks of 255×N.
Blocks are sent column by column.
If a packet is lost, it is replaced by a "0" column.
The receiver knows that packet "j" is lost because it is missing in the sequence.
The RS code (organized in N rows) can recover up to 6 missing columns.
c1,1 c1,2 c1,3 … c1,255
c2,1 c2,2 c2,3 … c2,255
… … … … …
cN,1 cN,2 cN,3 … cN,255
EFM Coding
Data for EFM coding are considered in bytes. Each byte is translated according to a lookup table into a corresponding 14-bit codeword.

The 14-bit words are chosen such that binary ones are always separated by a minimum of two and a maximum of ten binary zeroes. This is because bits are encoded with NRZI encoding, or modulo-2 integration, so that a binary one is stored on the CD surface as a change from a zero to a one or from a one to a zero, while a binary zero is indicated by no change. A sequence 0011 would be changed into 1101 or its inverse 0010, depending on the previously written level. If there are 2 zeros between 2 consecutive ones, the written sequence will have 3 consecutive zeros (or ones): for example, 010010 will translate into 100011 (or its inverse 011100). The EFM sequence 100100010010000100 will translate into 111000011100000111 (or its inverse).

Because EFM ensures there are at least 2 zeroes between every 2 ones, it ensures that every written one and zero is at least three bit clock cycles long. This property is very useful since it reduces the demands on the optical pickup used in the playback mechanism. The ten-consecutive-zero maximum ensures worst-case clock recovery in the player.

EFM requires 3 merging bits between adjacent 14-bit codewords to ensure that consecutive codewords can be cascaded without violating the specified minimum and maximum runlength constraints. The 3 merging bits are also used to shape the spectrum of the encoded sequence. Thus, in the final analysis, 17 bits of disc space are needed to encode 8 bits of data.
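The NRZI step described above is a one-liner per bit; a sketch in Python (illustrative, not from the slides):

```python
def nrzi_encode(bits, level=0):
    """NRZI / modulo-2 integration: a 1 toggles the written level, a 0 keeps it."""
    out = []
    for b in bits:
        level ^= b
        out.append(level)
    return out

# A 1 in the input becomes a transition on the disc surface;
# the two results differ only by the previously written level.
assert nrzi_encode([0, 0, 1, 1], level=0) == [0, 0, 1, 0]
assert nrzi_encode([0, 0, 1, 1], level=1) == [1, 1, 0, 1]
```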
CC (1)
CC: d1 d2 d3 d4  d5 d6 d7 d8  d9 d10 d11 d12  d13 d14 d15 d16
T = (t(d1) + d2 + t(d3) + d4 + t(d5) + d6 + t(d7) + … + d14 + t(d15)) mod 10
where
t(di) = sum of the decimal digits of 2·di,
i.e.
t(di) = 2·di for 2·di < 10
      = (2·di – 10) + 1 for 2·di ≥ 10
d16 = (10 – T) mod 10
Example: the check digit in CC 1234 5678 9012 345 is d16 = 2.
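This check (the Luhn algorithm) can be sketched in Python (illustrative, not from the slides), reproducing the example above:

```python
def t(d):
    """Double a digit and add the decimal digits of the result."""
    return 2 * d if 2 * d < 10 else 2 * d - 9   # (2d - 10) + 1 = 2d - 9

def cc_check_digit(d):
    """d: the first 15 card digits; odd positions d1, d3, ... are doubled."""
    total = sum(t(di) if i % 2 == 0 else di for i, di in enumerate(d))
    return (10 - total % 10) % 10

# Slide example: CC 1234 5678 9012 345 -> d16 = 2
assert cc_check_digit([1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5]) == 2
```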
CC (2)
Statement:
All single digit errors are detected by the CC check digit.
Proof:
A change of digit di into digit d'i changes the sum T by either
di – d'i
or by
t(di) – t(d'i).
For the error to be undetected, T has to change by a multiple of 10.
But neither of these terms can be a non-zero multiple of 10: the first satisfies
–10 < (di – d'i) < 10
and the second lies in the same range (try the function t(x) on all x from
0 to 9; t is also injective, so t(di) – t(d'i) ≠ 0 when di ≠ d'i), q.e.d.
CC (3)
Statement:
Almost all exchanges of adjacent digits can be detected by the CC check digit system.
Proof:
An exchange of two adjacent digits di and di+1 can modify the check sum by:
Δ = (t(di) + di+1) – (di + t(di+1))
Convince yourself by looking at the table of possible results for Δ:
[Table of Δ for all pairs (di, di+1), continued on the next slide]
CC (4)
Proof (cont'd):
The values on the diagonal are irrelevant, as those digits are exchanged with
themselves. The only cases where Δ = 0 with distinct digits are 9↔0 and 0↔9. But these
are only 2 cases out of 90, i.e. the efficiency for detecting exchanges of adjacent
CC digits is 1 – 2/90 ≈ 97.8%.
Why 44.1 kHz?

Explanation of the 44.1 kHz CD sampling rate: the CD sampling rate has to be larger than about 40 kHz to fulfill the Nyquist criterion, which requires sampling at twice the maximum analog frequency, about 20 kHz for audio. The sampling frequency is chosen somewhat higher than the Nyquist rate since the practical filters needed to prevent aliasing have a finite slope. Digital audio tapes (DATs) use a sampling rate of 48 kHz. It has been claimed that their sampling rate differs from that of CDs to make digital copying from one to the other more difficult. 48 kHz is, in principle, a better rate since it is a multiple of the other standard sampling rates, namely 8 and 16 kHz for telephone-quality audio. Sampling rate conversion is simplified if rates are integer multiples of each other.

From John Watkinson, The Art of Digital Audio, 2nd edition, pg. 104: In the early days of digital audio research, the necessary bandwidth of about 1 Mbps per audio channel was difficult to store. Disk drives had the bandwidth but not the capacity for long recording time, so attention turned to video recorders. These were adapted to store audio samples by creating a pseudo-video waveform which would convey binary as black and white levels. The sampling rate of such a system is constrained to relate simply to the field rate and field structure of the television standard used, so that an integer number of samples can be stored on each usable TV line in the field. Such a recording can be made on a monochrome recorder, and these recordings are made in two standards: 525 lines at 60 Hz and 625 lines at 50 Hz. Thus it is possible to find a frequency which is a common multiple of the two and is also suitable for use as a sampling rate.

The allowable sampling rates in a pseudo-video system can be deduced by multiplying the field rate by the number of active lines in a field (blanking lines cannot be used) and again by the number of samples in a line. By careful choice of parameters it is possible to use either 525/60 or 625/50 video with a sampling rate of 44.1 kHz. In 60 Hz video, there are 35 blanked lines, leaving 490 lines per frame or 245 lines per field, so the sampling rate is given by: 60 × 245 × 3 = 44.1 kHz. In 50 Hz video, there are 37 lines of blanking, leaving 588 active lines per frame, or 294 per field, so the same sampling rate is given by: 50 × 294 × 3 = 44.1 kHz. The sampling rate of 44.1 kHz came to be that of the Compact Disc. Even though the CD has no video circuitry, the equipment used to make CD masters is video-based and determines the sampling rate. (Reference kindly provided by Kavitha Parthasarathy.)
from H. Schulzrinne, Professor and Chair in the Dept. of Computer Science; also with the Dept. of Electrical Engineering at Columbia University