Arithmetic coding
-
Upload
khangminh22 -
Category
Documents
-
view
2 -
download
0
Transcript of Arithmetic coding
Arithmetic CodingC.M. LiuPerceptual Signal Processing Lab College of Computer ScienceNational Chiao-Tung University
Office: EC538(03)5731877
http://www.csie.nctu.edu.tw/~cmliu/Courses/Compression/
Introduction2
Arithmetic codingLossless data compressionVariable-length entropy coding.Encodes the entire message into a single number, a fraction n where (0.0 ≤ n < 1.0).
As opposed to other entropy encoding techniques that separate the input message into its component symbols and replace each symbol with a code word,
Flashback: Extended Huffman Codes3
Consider the source:A = {a, b, c}, P(a) = 0.8, P(b) = 0.02, P(c) = 0.18
H = 0.816 bits/symbol
Huffman code:a 0
b 11
c 10
l = 1.2 bits/symbol
Redundancy = 0.384 b/sym (47%!)
Q: Could we do better?
Flashback: Extended Huffman Codes (2)
4
IdeaConsider encoding sequences of two letters as opposed to single letters
Letter Probability Code
aa 0.6400 0
ab 0.0160 10101
ac 0.1440 11
ba 0.0160 101000
bb 0.0004 10100101
bc 0.0036 1010011
ca 0.1440 100
cb 0.0036 10100100
cc 0.0324 1011
l = 1.7228/2 = 0.8614
Red. = 0.0045 bits/symbol
Flashback: Extended Huffman Codes (3)
5
The idea can be extended furtherConsider all possible mn sequences of length n (we did 32)
In theory: By considering more sequences we can improve the coding
In reality: The exponential growth of the alphabet makes this impractical
E.g., for length 3 ASCII seq.: 2563 = 224 = 16MNeed to generate codes for all sequences of length mMost sequences would have zero frequency.Decoder would be inefficient for memory, speed, and probability perturbation.
E.g.:A = {a, b, c}, P(a) = 0.95, P(b) = 0.02, P(c) = 0.03 H = 0.335 bits/symbol
l1 = 1.05, l2 = 0.611, …Performance becomes acceptable at length n = 8.But |alphabet| = 38 = 6561.
Arithmetic Coding6
Brief historyShannon mentioned the use of cdfPeter Elias (classmate of Huffman)— recursive algorithm
Not published until 1963Jelinek 1968Modern roots: Pasco/Rissanen 1976
Basic ideaGenerate a unique tag (code) for an entire sequenceWithout generating codes for all the other possible sequences (as Huffman)Tag is a number in [0,1)
Probability Notation Refresher 7
Random variable:A mapping between (sets of) outcomes of an experiment to real numbers.
To replace symbols with numbers we useX(ai) = i, where ai ∈ A (A = {ai}, i = 1..n)
Given a probability model P for the sourceProbability density function (pdf)
Cumulative density function (cdf)
Generating a Tag8
We have a one-to-one mapping:ak [FX(k-1), FX(k)], k = 1..n, FX(0) = 0Any real number in [FX(k-1), FX(k)] can represent ak
Encoding a 2-letter sequence: ak aj
Pick [FX(k-1), FX(k)] for ak
Then split the interval into the same proportions and pick the jth interval:
( ) ( )[ ] ( ) 00,..1,,1 ==− XXX FmiiFiF
( ) ( )( ) ( ) ( ) ( )
( ) ( )⎥⎦⎤
⎢⎣
⎡−−
−−−
−−
11,
111
kFkFjFkF
kFkFjFkF
XX
XX
XX
XX
Divide [0, 1) into m intervals:
Tag Generation Example9
Consider encoding a1a2a3:A = {a1, a2, a3}, P = {0.7, 0.1, 0.2)
Mapping: a1 1, a2 2, a3 3cdf: FX(1) = 0.7, FX(2) = 0.8, FX(3) = 1.0
Mapping to Real Numbers10
A = {a1, a2, …, an}
Fair dice-throwing example: {1, 2, 3, 4, 5, 6}
( ) 6..161
=== kforkXP
( ) ( ) ( ) ( ) ( )iXPiFiXPkXPaT Xi
kiX =+−==+==∑ −
= 211
211
1
( ) ( ) ( ) 25.022112 ==+== XPXPTX
( ) ( ) ( ) 75.05215 4
1==+==∑ =
XPkXPTkX
Lexicographic Order11
Lexicographic (dictionary) order of strings:
where means ‘y precedes x in alphabet ordering’m is the length of the sequence
( ) ( ) ( )ixyyim
X xPyPxTi 2
1:
)( ∑∀+=
p
xy p
Lexicographic Order Example12
Consider two consecutive rolls of the die:Outcomes = {11, 12, …, 16, 21, 22, …, 26, …, 61, 62, …, 66}
( ) ( ) ( ) ( ) 469.072513
21121113 ===+=+== xPxPxPTX
NotesTo generate tag for 13, we did not have to generate any other tagsProblem: to generate a tag for a string of sequence, we need to know
all the probabilities for sequences “less than” the sequence.Target: Try to have use only the probability of individual symbol.
Interval Construction13
ObservationAn interval containing a tag is disjoint from all other tag intervals
IdeaExpress lower/upper bounds for a sequence as a function of bounds for shorter sequences
Recall fair-die exampleConsider the sequence 3 2 2
Let u(m), l(m) be the upper/lower bound of length m. Then,
u(1) = FX(3), l(1) = FX(2)
u(2) = FX(2)(32), l(2) = FX
(2)(31)
( ) )32()31()26(...)21()16(...)11(32)2( =+=+=++=+=++== xPxPxPxPxPxPFX
Interval Construction (2)14
( ) [ ] [ ] )32()31()26(...)21()16(...)11(32)2( =+=+=++=+=++== xPxPxPxPxPxPFX
( ) ( ) ( ) ( ) ( ) 216
1 1216
1 216
1,, xxxwherekxPixPkxPixkxPkixP
iii========== ∑∑∑ ===
( ) )32()31()2()32()31()2()1(32)2( =+=+==+=+=+== xPxPFxPxPxPxPF XX
( ) )2()3()2()1()3()32()31( 1221 XFxPxPxPxPxPxP ===+====+=
)2()3()3( 1 XX FFxP −==
( ) ( ) )2()2()3()2(32)2(XXXXX FFFFF −+=
( ) )2()1()1()1()2(XFlulu −+=
Interval Construction (3)15
( ) ( ) )1()2()3()2(31)2(XXXXX FFFFF −+=
( ) )1()1()1()1()2(XFlull −+=
( ) ( )321,322 )2()3()2()3(XX FlFu ==
( ) ( ) )2()31()32()31(322)3(XXXXX FFFFF −+=
( ) ( ) )1()31()32()31(321)3(XXXXX FFFFF −+=
( ) )2()2()2()2()3(XFlulu −+=
( ) )1()2()2()2()3(XFlull −+=
( ) ( ) )2()2()3()2(32)2(XXXXX FFFFF −+=
( ) )2()1()1()1()2(XFlulu −+=
Generating a Tag16
In general, for any sequence x = x1x2…xn
( )( ) )1(
)()1()1()1()(
)1()1()1()(
−−+=
−+=−−−
−−−
kXkkkk
kXkkkk
xFlull
xFlulu
( )2
)()( nn
XluxT +
=
Tag Generation Example17
Consider random variable X(ai) = iEncode sequence 1 3 2 1, given the following
3,1)(,1)3(,82.0)2(,8.0)1(,0,0)( >====≤= kkFFFFkkF XXXXX
1,0 )0()0( == ul
( ) 772352.02
1321)4()4(
=+
=luTX
( )( ) 8.0)1(
0)0()0()0()0()1(
)0()0()0()1(
=−+=
=−+=
X
X
Flulu
Flull1
( )( ) 8.0)3(
656.0)2()1()1()1()2(
)1()1()1()2(
=−+=
=−+=
X
X
Flulu
Flull13
( )( ) 77408.0)2(
7712.0)1()2()2()2()3(
)2()2()2()3(
=−+=
=−+=
X
X
Flulu
Flull132
( )( ) 773504.0)1(
7712.0)2()3()3()3()4(
)3()3()1()4(
=−+=
=−+=
X
X
Flulu
Flull1321
Decoding a Tag18
AlgorithmInitialize l(0) = 0, u(0) = 1.
1. For each i, i = 1..n• Compute t*=(tag-l(k‐1))/(u(k‐1)-l(k‐1)).
2. Find the xk: FX(xk-1) ≤ t* ≤ FX(xk).3. Update u(n), l(n)
4. If done--exit, otherwise goto 1.
Decoding Example19
3,1)(,1)3(,82.0)2(,8.0)1(,0,0)(
>====≤=
kkFFFFkkF
XXX
XX
( ) 772352.01321 =XTAlgorithmInitialize l(0) = 0, u(0) = 1.
1. Compute t*=(tag-l(k‐1))/(u(k‐1)-l(k‐1)).2. Find the xk: FX(xk‐1) ≤ t* ≤ FX(xk).3. Update u(k), l(k)
4. If done‐‐exit, otherwise goto 1. ( )( ) )1(
)()1()1()1()(
)1()1()1()(
−−+=
−+=−−−
−−−
kXkkkk
kXkkkk
xFlull
xFlulu
( ) ( )
( )( ) 8.0)1(
0)0(
)1(8.00)0(772352.0010772352.0
)0()0()0()1(
)0()0()0()1(
*
*
=−+=
=−+=
=≤≤=
=−−=
X
X
XX
Flulu
Flull
FtFt
( ) ( )
( )( ) 8.0)3(
656.0)2(
)3(182.0)2(96544.008.00772352.0
)1()1()1()2(
)1()1()1()2(
*
*
=−+=
=−+=
=≤≤=
=−−=
X
X
XX
Flulu
Flull
FtFt
( )( ) 77408.0)2(
7712.0)1(
)2(82.08.0)1(
808.0656.08.0
656.0772352.0
)2()2()2()3(
)2()2()2()3(
*
*
=−+=
=−+=
=≤≤=
=−
−=
X
X
XX
Flulu
Flull
FtF
t
)1(8.00)0(
4.07712.077408.07712.0772352.0
*
*
XX FtF
t
=≤≤=
=−−
=
1
13
132
1321
Implementation20
So farWe have encoding/decoding algorithms that workThey assume real numbers (infinite precision)
Eventually l(n) and u(n) will convergeWe want to be able to encode a string incrementally
Observation: as interval narrows, either1. [l(n), u(n)] ⊂ [0, 0.5), or2. [l(n), u(n)] ⊂ [0.5, 1), or3. l(n) ∈ [0, 0.5), u(n) ∈ [0.5, 1).
Our plan: focus on 1. & 2. now, deal with 3. later
Implementation (2)21
PrincipleScale and shift simultaneously x, upper bound, and lower bound will bring the same relative location.
EncoderOnce we reach 1. or 2., we can ignore the other half of [0,1)We can also indicate to decoder which half the tag is confined to:
Send 0/1 bit to indicate lower/upperRescale tag interval to [0, 1):
E1: [0, 0.5) => [0, 1); E1(x) = 2xE2: [0.5, 1) => [0, 1); E2(x) = 2(x-0.5)
Note: we lost the most significant bit during rescalingNot a problem--we already sent it out the door
DecoderFollow the 0/1 bits and rescale accordingly
Stays in sync with encoder
( )( ) )1(
)()1()1()1()(
)1()1()1()(
−−+=
−+=−−−
−−−
kXkkkk
kXkkkk
xFlull
xFlulu
Tag Generation Example w/ Scaling22
Consider random variable X(ai) = iEncode sequence 1 3 2 1, given the following
3,1)(,1)3(,82.0)2(,8.0)1(,0,0)( >====≤= kkFFFFkkF XXXXX
1,0 )0()0( == ul
( )( )
symbolnextgetulul
Flulu
Flull
X
X
⇒⊄
⊄
=−+=
=−+=
)1,5.0[),[)5.0,0[),[
8.0)1(
0)0(
)1()1(
)1()1(
)0()0()0()1(
)0()0()0()1(
Input: 1321
Output:
( )( )
6.0)5.08.0(2312.0)5.0656.0(2
)1,5.0[]8.0,656.0[
8.0)3(
656.0)2(
)2(
)2(
)1()1()1()2(
)1()1()1()2(
=−×=
=−×=
⊂
=−+=
=−+=
ul
Flulu
Flull
X
X
Input: -321
Output: 1
Tag Generation Example w/ Scaling (2)23
Input: --21
Output: 11
( )( ) 54816.0)2(
5424.0)1()2()2()2()3(
)2()2()2()3(
=−+=
=−+=
X
X
Flulu
Flull
6.0,312.0 )2()2( == ul
( )( ) 09632.05.054816.02
0848.05.05424.02)3(
)3(
=−×=
=−×=
ulInput: ---1
Output: 110
19264.009632.021696.00848.02
)3(
)3(
=×=
=×=
ulInput: ---1
Output: 1100
38528.019264.023392.01696.02
)3(
)3(
=×=
=×=
ulInput: ---1
Output: 11000
Tag Generation Example w/ Scaling (3)24
EOT:Any (convenient) number in [l(n),u(n)]
We pick 0.510 = 0.12
77056.038528.026784.03392.02
)3(
)3(
=×=
=×=
ulInput: ---1
Output: 110001
54112.0)5.077056.0(23568.0)5.06784.0(2
)3(
)3(
=−×=
=−×=
ulInput: ---1
Output: 110001
504256.0)1()3568.054112.0(3568.0
3568.0)0()3568.054112.0(3568.0)4(
)4(
=−+=
=−+=
X
X
Fu
FlInput: ---1
Output: 110001
Output: 11000110…
Note: 0.1100011 = 2-1+2-2+2-6+2-7= 0.7734375 ∈ [0.7712,0.77408]
Incremental Tag Decoding25
We want incremental decodingE.g. network transmissions
IssuesHow to start?How to continue?How to end?
Continuation:Once we have unambiguous start, just mimic encoder
Tag Decoding Example26
Assume word length is set to 6Input: 110001100000
0.1100012 = 0.76562510
First bit is 1Output: 1
( ) ( ))3(182.0)2(
9570.008.00765625.0*
*
XX FtFt
=≤≤=
=−−=
( )( ) 8.0)3(
656.0)2()1()1()1()2(
)1()1()1()2(
=−+=
=−+=
X
X
Flulu
Flull
Input: 110001100000
Output: 13
Input:-10001100000 (shift left)
6.0)5.08.0(2312.0)5.0656.0(2
)2(
)2(
=−×=
=−×=
ul
( ) ( ))2(82.08.0)1(
8155.0312.06.0312.0546875.0*
*
XX FtFt
=≤≤=
=−−=Input: -10001100000 (0.546875)
Output: 132
( )( ) 54816.0)2(
5424.0)1()2()2()2()3(
)2()2()2()3(
=−+=
=−+=
X
X
Flulu
Flull
( )( ) 09632.05.054816.02
0848.05.05424.02)3(
)3(
=−×=
=−×=
ul
Input: ---001100000 (shift left)
Input: --0001100000 (shift left)
Tag Decoding Example (2)27
19264.009632.021696.00848.02
)3(
)3(
=×=
=×=
ul
Input: ----01100000 (shift left)
38528.019264.023392.01696.02
)3(
)3(
=×=
=×=
ul
Input: -----1100000 (shift left)
77056.038528.026784.03392.02
)3(
)3(
=×=
=×=
ul
Input: -----100000 (shift left)
54112.0)5.077056.0(23568.0)5.06784.0(2
)3(
)3(
=−×=
=−×=
ul
Input: -----100000
( ) ( ))1(8.00)0(
7769.03568.054112.03568.05.0*
*
XX FtFt
=≤≤=
=−−=
Output: 1321
Q .E. D.
Managing E328
Recall the three cases1. [l(n), u(n)] ⊂ [0, 0.5): E1: [0, 0.5) => [0, 1); E1(x) = 2x2. [l(n), u(n)] ⊂ [0.5, 1): E2: [0. 5, 1) => [0, 1); E2(x) = 2(x-0.5)3. l(n) ∈ [0, 0.5), u(n) ∈ [0.5, 1): E3(x) = ???E3 rescaling
E3: [0.25, 0.75) => [0, 1); E3(x) = 2(x-0.25)Encoding
E1 = 0, E2 = 1, E3 = ???Note that:
E3 … E3 E1 = E1 E2 … E2E3 … E3 E2 = E2 E1 … E1
Rule: keep count of consecutive E3 and issue that number of zeroes/ones after the next encounter of E2/E1.
Integer Implementation29
Assume word length of m. Then
nj = frequency of j in sequence of length TotalCountCumulative Count CC
[ )876KK
876K
timestimes
1110001,0mm
→}times1
0015.0−
=m
K
TCkCCkF
nkCC
X
k
i i
)()(
)(1
=
=∑ =
( )⎣ ⎦( )⎣ ⎦TCxCCluuu
TCxCClull
nnnnn
nnnnn
)(1
)1(1)1()1()1()(
)1()1()1()(
×+−+=
−×+−+=−−−
−−−
Integer Implementation (2)30
MSB(x) = Most Significant Bit of xLSB(x) = Least Significant Bit of xSB(x, i) = ith Significant Bit of x
MSB(x) = SB(x, 1); LSB(x) = SB(x, m)
E3(l, u) = (SB(l, 2) == 1 && SB(u, 2) == 0)
Integer Encoder
31
l=00…0, u=11…1, e3_count=0
repeat
x=get_symbol
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦ // lower bound update
u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 // upper bound updatewhile(MSB(u)==MSB(l) OR E3(u,l)) // MSB(u)=MSB(l)=0 E1 rescaling
if(MSB(u)==MSB(l)) // MSB(u)=MSB(l)=1 E2 rescaling
send(MSB(u))
l = (l<<1)+0 // shift left, set LSB to 0
u = (u<<1)+1 // shift left, set LSB to 1
while(e3_count>0)
send(!MSB(u)) // encode accumulated E3 rescalings
e3_count--
endwhile
endif
if(E3(u,l)) // perform E3 rescaling & remember
l = (l<<1)+0
u = (u<<1)+1
complement MSB(u) and MSB(l)
e3_count++
endif
endwhile
until done
Integer Encoding Example32
Sequence 1 3 2 1Count {1, 2, 3} = {40, 1, 9}Total count TC = 50Cumulative count
CC {0, 1, 2, 3} = {0, 40, 41, 50}
Recall that interval endpoint should never overlap
m = ?smallest [l(n),u(n)] = 1/4 of entire range 0..TC;
=> to maintain unique representation we need range of at least 4x50 = 200
=> minimum m = 8 (28 = 256)
l=00…0, u=11…1, e3_count=0
repeat
x=get_symbol
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))
if(MSB(u)==MSB(l))
send(MSB(u))
l = (l<<1)+0
u = (u<<1)+1
while(e3_count>0)
send(!MSB(u))
e3_count--
endwhile
endif
if(E3(u,l))
l = (l<<1)+0
u = (u<<1)+1
complement MSB(u) and MSB(l)
e3_count++
endif
endwhile
until done
Integer Encoding Example (2)33
l=00…0, u=11…1, e3_count=0
repeat
x=get_symbol
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))
if(MSB(u)==MSB(l))
send(MSB(u))
l = (l<<1)+0
u = (u<<1)+1
while(e3_count>0)
send(!MSB(u))
e3_count--
endwhile
endif
if(E3(u,l))
l = (l<<1)+0
u = (u<<1)+1
complement MSB(u) and MSB(l)
e3_count++
endif
endwhile
until done
2)0(
2)0(
)11111111(255
)00000000(0
==
==
u
l
⎣ ⎦⎣ ⎦
false=≠==−×+=
==×+=
3
2)1(
2)1(
),()()11001011(203150402560
)00000000(05002560
EuMSBlMSBu
lInput: 1321
Output:
Input: -321
Output: 1
⎣ ⎦⎣ ⎦
1)()()11001011(203150502040
)10100111(16750412040
2)2(
2)2(
====−×+=
==×+=
uMSBlMSBu
l
Integer Encoding Example (3)34
l=00…0, u=11…1, e3_count=0
repeat
x=get_symbol
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))
if(MSB(u)==MSB(l))
send(MSB(u))
l = (l<<1)+0
u = (u<<1)+1
while(e3_count>0)
send(!MSB(u))
e3_count--
endwhile
endif
if(E3(u,l))
l = (l<<1)+0
u = (u<<1)+1
complement MSB(u) and MSB(l)
e3_count++
endif
endwhile
until done
Input: --21
Output: 110
⎣ ⎦⎣ ⎦
1,1)()()10010100(1481504114828
)10010010(146504014828
2)3(
2)3(
=====−×+=
==×+=
e3_countuMSBlMSBu
l
( ) ( )( ) 175)10000000(11)10010111(
281000000001)01001110(
151)10010111(11)11001011(
78)01001110(01)10101011(
22)2(
2)2(
3
22)2(
22)2(
=+<<=
=+<<=
===+<<=
==+<<=
xor
xor
true
u
l
Eu
l
e3_count = 1
Integer Encoding Example (4)35
l=00…0, u=11…1, e3_count=0
repeat
x=get_symbol
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))
if(MSB(u)==MSB(l))
send(MSB(u))
l = (l<<1)+0
u = (u<<1)+1
while(e3_count>0)
send(!MSB(u))
e3_count--
endwhile
endif
if(E3(u,l))
l = (l<<1)+0
u = (u<<1)+1
complement MSB(u) and MSB(l)
e3_count++
endif
endwhile
until done
Input: ---1
Output: 1100
0)()(41)00101001(11)10010100(
36)00100100(1)10010010(
22)3(
22)3(
====+<<=
==<<=
uMSBlMSBu
l
Input: ---1
Output: 11000
0)()(83)01010011(11)00101001(
72)01001000(1)00100100(
22)3(
22)3(
====+<<=
==<<=
uMSBlMSBu
l
Input: ---1
Output: 110001
1)()(167)10100111(11)01010011(
144)10010000(1)01001000(
22)3(
22)3(
====+<<=
==<<=
uMSBlMSBu
l
Integer Encoding Example (5)36
l=00…0, u=11…1, e3_count=0
repeat
x=get_symbol
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))
if(MSB(u)==MSB(l))
send(MSB(u))
l = (l<<1)+0
u = (u<<1)+1
while(e3_count>0)
send(!MSB(u))
e3_count--
endwhile
endif
if(E3(u,l))
l = (l<<1)+0
u = (u<<1)+1
complement MSB(u) and MSB(l)
e3_count++
endif
endwhile
until done
Input: ---1
Output: 1100010
0)()(79)01001111(11)10100111(
32)00100000(1)10010000(
22)3(
22)3(
====+<<=
==<<=
uMSBlMSBu
l
Input: ---1
( ) ( )( ) 191)10000000(11)10011111(
01000000001)01000000(
),()(159)10011111(11)01001111(
64)01000000(1)00100000(
22)3(
2)3(
3
22)3(
22)3(
=+<<=
=+<<=
=≠==+<<=
==<<=
xor
xor
true
u
l
EuMSBlMSBu
l
e3_count = 1
Integer Encoding Example (5)37
l=00…0, u=11…1, e3_count=0
repeat
x=get_symbol
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))
if(MSB(u)==MSB(l))
send(MSB(u))
l = (l<<1)+0
u = (u<<1)+1
while(e3_count>0)
send(!MSB(u))
e3_count--
endwhile
endif
if(E3(u,l))
l = (l<<1)+0
u = (u<<1)+1
complement MSB(u) and MSB(l)
e3_count++
endif
endwhile
until done
Input: ---1
Output: 1100010
⎣ ⎦⎣ ⎦
false=≠==−×+=
==×+=
3
2)4(
2)4(
),()()10011000(152150401920
)00000000(05001920
EuMSBlMSBu
l
TerminationGenerally, send l(4): (00000000)2However e3_count = 1, sowe send ‘1’ after first ‘0’ from l(4) :
Final output: 1100010010000000
Integer Decoder38
Initialize l, u, t // t = first m bitsrepeatk=0
while( ⎣((u-l+1)×TC-1)/(u-l+1)⎦ ≥ CC(k)
k++x = decode_symbol(k)
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l)) if(MSB(u)==MSB(l)) // Perform E1/E2 rescaling of l,u,t
l = (l<<1)+0 u = (u<<1)+1 t = (u<<1)+next_bit
endifif(E3(u,l)) // Perform E3 rescaling of l,u,t
l = (l<<1)+0 u = (u<<1)+1t = (u<<1)+next_bitcomplement MSB(u), MSB(l), MSB(t)
endifendwhile
until done
Integer Decoding Example39
Initialize l, u, trepeat
k=0
while( ⎣(u-l+1)×CC(x-1)/TC⎦ ≥ CC(k)
k++x = decode_symbol(k)
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))if(MSB(u)==MSB(l))
l = (l<<1)+0 u = (u<<1)+1 t = (u<<1)+next_bit
endifif(E3(u,l))
l = (l<<1)+0 u = (u<<1)+1t = (u<<1)+next_bitcomplement MSB(u), MSB(l), MSB(t)
endifendwhile
until done
Input: 1100010010000000
2
2
2
)11000100(196)11111111(255
)00000000(0
====
==
tul
( )⎣ ⎦11
381256501970*
=⇒=⇒=−×+=
xkt
Output: 1
⎣ ⎦⎣ ⎦
false=≠==−×+=
==×+=
3
2
2
),()()11001011(203150402560
)00000000(05002560
EuMSBlMSBul
( )⎣ ⎦33
481203501970*
=⇒=⇒=−×+=
xkt
Output: 13
Integer Decoding Example (2)40
Initialize l, u, trepeat
k=0
while( ⎣(u-l+1)×CC(x-1)/TC⎦ ≥ CC(k)
k++x = decode_symbol(k)
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))if(MSB(u)==MSB(l))
l = (l<<1)+0 u = (u<<1)+1 t = (u<<1)+next_bit
endifif(E3(u,l))
l = (l<<1)+0 u = (u<<1)+1t = (u<<1)+next_bitcomplement MSB(u), MSB(l), MSB(t)
endifendwhile
until done
Input: 1100010010000000
⎣ ⎦⎣ ⎦
)()()11001011(203150502040
)10100111(16750412040)10001001(
2
2
2
uMSBlMSBult
===−×+=
==×+==
Input: 1100010010000000
true=≠=+<<=
=<<==
3
22
22
2
),()()10010111(11)11001011(
)01001110(1)10100111()00010010(
EuMSBlMSBult
Input: 1100010010000000
( )( ) ( )( ) ( )
false
xor
xor
xor
=≠==+<<=
==<<====
3
22
22
22
),()(175)10111001(1000000011)10010111(
28)00011100(100000001)01001110(146)10010010(10000000)00010010(
EuMSBlMSBult
Integer Decoding Example (3)41
Initialize l, u, trepeat
k=0
while( ⎣(u-l+1)×CC(x-1)/TC⎦ ≥ CC(k)
k++x = decode_symbol(k)
l=l+ ⎣(u-l+1)×CC(x-1)/TC⎦u=l+ ⎣(u-l+1)×CC(x)/TC⎦-1 while(MSB(u)==MSB(l) OR E3(u,l))if(MSB(u)==MSB(l))
l = (l<<1)+0 u = (u<<1)+1 t = (u<<1)+next_bit
endifif(E3(u,l))
l = (l<<1)+0 u = (u<<1)+1t = (u<<1)+next_bitcomplement MSB(u), MSB(l), MSB(t)
endifendwhile
until done
( )⎣ ⎦22
40)128175(150)128146*
=⇒=⇒=+−−×+−=
xkt
Output: 132
⎣ ⎦⎣ ⎦
)()()10010100(1481504114828
)10010010(146504014828
2
2
uMSBlMSBul
===−×+=
==×+=
CompletionFive more rounds of bit shifting(do it as an exercise)Eventually a ‘1’ should be decoded
TerminationAfter the advertised number of symbols have been decoded, orEOT is reached
Arithmetic vs. Huffman42
m = sequence lengthAverage code length:
Arithmetic
Extended Huffman (groups of m symbols)
ObservationsBoth show the same asymptotic behavior
Slight edge for HuffmanExtended Huffman requires potentially enormous amounts of storage & code generation mn
Arithmetic does notIt is practical to assume large m for arithmetic but not HuffmanWe can expect arithmetic to do better (except when prob. are powers of 2)
mxHlXH A 2)()( +≤≤
mxHlXH H 1)()( +≤≤
Arithmetic vs. Huffman (2)43
Gains are a function of the size & distribution of the alphabetSmall alphabet (generally) favors HuffmanSkewed distributions favor arithmetic
It is easier to adapt arithmetic to use multiple codes
It is easier to adapt arithmetic to changing input statisticsNo tree to rearrangeWe can separate modeling & coding
Adaptive Arithmetic Coding44
So far:We start with a known probability model
RealityThat information is usually unavailable or too time-consuming to obtainNeed a solution with no initial knowledge
Adaptive solutionStart with all counts set to 1
Or some other known (to the decoder) model
After a symbol is encoded, update countDecoder will be able to stay in syncQuirks
No total count => cannot pick m based on thatGiven a word length of m, we can accommodate max count of 2m-2To prevent count overflow, rescale all counts
This has the added benefit of reducing the importance of older data
Applications: Image Compression45
Image Bits/pixel RatioArithmetic
RatioHuffman
Sena 6.52 1.23 1.16
Sensin 7.12 1.12 1.27
Earth 4.67 1.71 1.67
Omaha 6.84 1.17 1.14
Image Bits/pixel RatioArithmetic
RatioHuffman
Sena 3.89 2.06 2.08
Sensin 4.56 1.75 1.73
Earth 3.92 2.04 2.04
Omaha 6.27 1.28 1.26
Adaptive arithmeticpixel values
Adaptive arithmeticpixel differences
Summary46
Introduced arithmetic codingDirect coding of sequences (not a concatenation of codes)Provably uniquely decodableAsymptotically approaches entropy boundMore efficient for skewed distributions than HuffmanOnly necessary codes are generatedSomewhat more complicated to implementEasy adaptive implementationAllows clean separation of model and coding