Q.2 Arithmetic Code
(i) Introduction
The descriptions and diagrams below illustrate the idea of Arithmetic code:
Let p(0) = p, p(1) = 1 - p,
Length of input | Set of all inputs | Probability Function | Distribution Function | Intervals
1 | {0, 1} | $p(0) = p$, $p(1) = 1-p$ | $F(0) = p$, $F(1) = 1$ | $[0, p)$, $[p, 1)$
2 | {00, 01, 10, 11} | $p^2$, $p(1-p)$, $(1-p)p$, $(1-p)^2$ | $p^2$, $p$, $p + (1-p)p$, $1$ | $[0, p^2)$, $[p^2, p)$, $[p, p+(1-p)p)$, $[p+(1-p)p, 1)$
3 | {000, 001, 010, 011, 100, 101, 110, 111} | $p(x) = p^{\#0\text{s}}(1-p)^{\#1\text{s}}$, e.g. $p(001) = p^2(1-p)$ | (left blank) | (left blank)
4 | {0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111} | $p(x) = p^{\#0\text{s}}(1-p)^{\#1\text{s}}$ | (left blank) | (left blank)
In this diagram, each row corresponds to one input length of the arithmetic code. The diagram also illustrates the code width of an input, i.e. the width $p(x)$ of the interval assigned to $x$.
Examples:
(a) code width of 01 and 10 is p(1-p)
(b) code width of 001, 010, 100 is $p^2(1-p)$
Furthermore, the sum of all code widths is 1.
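As a quick sanity check, here is a minimal Python sketch (my own, not part of the original page; p = 0.4 is just an illustrative choice) that computes the code width p(x) of every input of length 1, 2 and 3 and verifies that the widths sum to 1:

from itertools import product

p = 0.4  # illustrative value of p(0); p(1) = 1 - p

def code_width(x, p):
    # Width of the interval assigned to x = product of the per-symbol probabilities.
    w = 1.0
    for bit in x:
        w *= p if bit == "0" else (1 - p)
    return w

for n in (1, 2, 3):
    widths = [code_width("".join(bits), p) for bits in product("01", repeat=n)]
    print(n, sum(widths))   # each sum is 1.0 (up to floating-point rounding)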
This diagram illustrates the distribution function of the arithmetic code for input lengths up to 3. The red, purple and blue lines correspond to input lengths 1, 2 and 3 respectively. The labels on the x-axis are the inputs, and the vertical distance between successive values of the distribution function is the code width.
Now comes the question: what are the codes?
If $x$ is the input and its interval is $[a, b)$ (so that $b - a = p(x)$), then the arithmetic binary code of $x$ is a binary fraction of about $\log_2 \frac{1}{p(x)}$ bits whose value falls inside $[a, b)$.
Example: Let's say p = 0.4, n = 3
(a) Arithmetic code 0011: in binary, $0.0011_2 = 0.1875$.
As the input 010 has the interval $[0.16, 0.256)$ (its start is $p(000) + p(001) = 0.064 + 0.096 = 0.16$ and its width is $p(010) = 0.096$), and $0.1875 \in [0.16, 0.256)$, the arithmetic code of 010 is 0011.
(b) For input 001, we find that its interval is $[0.064, 0.16)$ and that $0.0010_2 = 0.125 \in [0.064, 0.16)$.
So, the arithmetic code for 001 is 0010.
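The intervals quoted in this example can be recomputed with a few lines of Python. This is a minimal sketch under the same assumptions as above (p = 0.4, length-3 inputs in lexicographic order):

from itertools import product

p, n = 0.4, 3   # p(0) = 0.4, p(1) = 0.6, blocklength 3

def prob(x):
    out = 1.0
    for bit in x:
        out *= p if bit == "0" else (1 - p)
    return out

start = 0.0
for bits in product("01", repeat=n):            # lexicographic order
    x = "".join(bits)
    end = start + prob(x)
    print(x, "interval: [{:.3f}, {:.3f})".format(start, end))
    start = end
# 001 gets [0.064, 0.160) and 010 gets [0.160, 0.256), so the codes
# 0010 (value 0.125) and 0011 (value 0.1875) fall in the right intervals.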
(ii) Arithmetic Code is Almost the Best Code
The arithmetic code for an input $x$ requires at most $\log_2 \frac{1}{p(x)} + 2$ bits: 1 bit is lost due to quantization (rounding to a binary fraction), and another 1 bit is lost due to the open interval at the end. Taking expectations, the average length is at most $H + 2$ bits.
Arithmetic code is therefore almost the best code, since the Huffman code achieves an average length of less than $H + 1$ bits and Huffman is the best (optimal) code.
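The bound can also be checked numerically. The sketch below uses the per-sequence codeword length ⌈log2 1/p(x)⌉ + 1, which is the standard textbook choice and an assumption on my part (the worked example above drops the extra bit); it shows the average length per symbol approaching the entropy as the blocklength grows:

import math
from itertools import product

p = 0.4                                                  # assumed value of p(0)
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))     # entropy per symbol

for n in (1, 2, 4, 8):
    avg_len = 0.0
    for bits in product("01", repeat=n):
        px = 1.0
        for b in bits:
            px *= p if b == "0" else (1 - p)
        avg_len += px * (math.ceil(math.log2(1 / px)) + 1)
    print(n, avg_len / n, H)   # per-symbol length is at most H + 2/n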
(iii) Encoding
Denote $x = x_1 x_2 \cdots x_n$ as the input. The width of its interval is $b - a = p(x) = \prod_{i=1}^{n} p(x_i)$,
and the start of its interval is $a = \sum_{y < x} p(y)$, the total probability of all length-$n$ strings that precede $x$ in lexicographic order.
1. Evaluate the interval for the input with the above formulae.
2. Find a codeword whose value as a binary fraction falls inside that interval.
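A minimal Python sketch of these two steps. The interval update follows the lexicographic ordering used above; the specific codeword rule (the smallest binary fraction of ⌈log2 1/p(x)⌉ bits that is ≥ a) is my own choice, made so that the sketch reproduces the codes in the example:

import math

def encode(x, p):
    # Step 1: evaluate the interval [a, a + width) for the input x.
    a, width = 0.0, 1.0
    for bit in x:
        if bit == "0":
            width *= p                # keep the left part of the current interval
        else:
            a += width * p            # skip the '0' part ...
            width *= (1 - p)          # ... and keep the right part
    # Step 2: find a binary fraction that falls inside [a, a + width).
    l = math.ceil(math.log2(1 / width))    # number of code bits
    k = math.ceil(a * 2 ** l)              # smallest l-bit binary fraction that is >= a
    return format(k, "0{}b".format(l))

print(encode("010", 0.4))   # -> 0011 (value 0.1875, inside [0.16, 0.256))
print(encode("001", 0.4))   # -> 0010 (value 0.125, inside [0.064, 0.16))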
(iv) Decoding
1. Evaluate the floating-point value of the code (read the codeword as a binary fraction).
2. Binary-search for that value on $[0, 1)$ for $n$ steps: at each step with current interval $[a, b)$, if the value lies in the left part, output 0 and pick $[a, (1-p)a + pb)$; otherwise output 1 and pick $[(1-p)a + pb, b)$. (The split point $(1-p)a + pb = a + p(b - a)$ divides the interval in the ratio $p : 1-p$.)
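A matching decoder sketch (same assumptions; the codeword is read as the binary fraction 0.c1c2...):

def decode(code, p, n):
    # Step 1: floating-point value of the codeword.
    value = sum(int(bit) * 2 ** -(i + 1) for i, bit in enumerate(code))
    # Step 2: binary-search for that value on [0, 1) for n steps.
    a, b, out = 0.0, 1.0, ""
    for _ in range(n):
        split = (1 - p) * a + p * b    # = a + p*(b - a)
        if value < split:              # value in the left part -> symbol 0
            out += "0"
            b = split
        else:                          # value in the right part -> symbol 1
            out += "1"
            a = split
    return out

print(decode("0011", 0.4, 3))   # -> 010
print(decode("0010", 0.4, 3))   # -> 001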
(v) Advantages of Arithmetic Code
1. Arithmetic code is an almost-optimal code, using at most 2 bits more than the best code.
2. It is easy for arithmetic coding to handle non-binary alphabets by adding more parameters (one probability per source symbol).
3. The design of arithmetic coding allows different parameters at different positions of the input (see the sketch after this list).
4. Coding procedure is incremental and can be used for any blocklength.
5. Arithmetic code is memoryless.
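To illustrate point 3, here is a hypothetical variant of the encoder sketch in which the parameter is allowed to differ at each position; the per-position probabilities are made-up values for illustration only:

import math

def encode_varying(x, p_at_position):
    # Same interval update as before, but p(0) may differ at each position.
    a, width = 0.0, 1.0
    for bit, p in zip(x, p_at_position):
        if bit == "0":
            width *= p
        else:
            a += width * p
            width *= (1 - p)
    l = math.ceil(math.log2(1 / width))
    return format(math.ceil(a * 2 ** l), "0{}b".format(l))

print(encode_varying("010", [0.4, 0.7, 0.5]))   # -> 01001, a code for the interval [0.28, 0.34)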
Q.4 Differential Entropy h(x)
(i) Definitions
Natural definition of differential entropy: $h(X) = -\int_S f(x) \log f(x)\, dx$, where $f$ is the density of $X$ and $S$ is its support set.
Joint differential entropy: $h(X, Y) = -\int f(x, y) \log f(x, y)\, dx\, dy$
Conditional differential entropy: $h(X \mid Y) = -\int f(x, y) \log f(x \mid y)\, dx\, dy = h(X, Y) - h(Y)$
Mutual information: $I(X; Y) = h(X) - h(X \mid Y) = h(X) + h(Y) - h(X, Y)$
Kullback-Leibler divergence (or relative differential entropy): $D(f \| g) = \int f(x) \log \frac{f(x)}{g(x)}\, dx$
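As a small numerical sanity check of these definitions, the identity I(X; Y) = h(X) + h(Y) - h(X, Y) can be verified with the closed-form entropies of a jointly Gaussian pair (the correlation and standard deviations below are arbitrary choices; entropies are in nats):

import math

rho, s1, s2 = 0.6, 1.0, 2.0                       # arbitrary correlation and standard deviations
hX  = 0.5 * math.log(2 * math.pi * math.e * s1 ** 2)
hY  = 0.5 * math.log(2 * math.pi * math.e * s2 ** 2)
det = (s1 ** 2) * (s2 ** 2) * (1 - rho ** 2)      # determinant of the covariance matrix
hXY = 0.5 * math.log((2 * math.pi * math.e) ** 2 * det)
I_closed_form = -0.5 * math.log(1 - rho ** 2)     # known closed form for Gaussian I(X; Y)
print(hX + hY - hXY, I_closed_form)               # the two numbers agree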
(ii) Properties of Differential Entropy (Highlight)
1. h(X) can be positive, zero, or even negative
Examples: $f(x) = 0.5$ on $[0, 2]$ gives $h(X) = \log 2 > 0$; $f(x) = 1$ on $[0, 1]$ gives $h(X) = 0$; $f(x) = 2$ on $[0, 0.5]$ gives $h(X) = -\log 2 < 0$ (a numerical check appears after this list).
2.
3. Chain rule for differential entropy: $h(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} h(X_i \mid X_1, \ldots, X_{i-1})$
4. Chain rule for mutual information: $I(X_1, X_2, \ldots, X_n; Y) = \sum_{i=1}^{n} I(X_i; Y \mid X_1, \ldots, X_{i-1})$
5. The Kullback-Leibler divergence and the mutual information of two random variables are non-negative: $D(f \| g) \ge 0$ and $I(X; Y) \ge 0$
6. Data processing inequality: if $X \to Y \to Z$ form a Markov chain, then $I(X; Z) \le I(X; Y)$
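Here is the numerical check of property 1 promised above: for a uniform density of height c = 1/(b - a) on [a, b], h(X) = -∫ f log2 f dx = log2(b - a), which is positive, zero or negative depending on the width:

import math

for a, b in [(0.0, 2.0), (0.0, 1.0), (0.0, 0.5)]:
    print("Uniform[{}, {}]: h = {:.3f} bits".format(a, b, math.log2(b - a)))
# -> 1.000, 0.000 and -1.000 bits: positive, zero and negative differential entropy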
(iii) Relation of Differential Entropy to Discrete Entropy
Show that: if $X$ has a Riemann-integrable density $f(x)$ and $X^{\Delta}$ denotes $X$ quantized into bins of width $\Delta$, then $H(X^{\Delta}) + \log \Delta \to h(f) = h(X)$ as $\Delta \to 0$.
Proof: by the mean value theorem, each bin $[i\Delta, (i+1)\Delta)$ contains a point $x_i$ with $f(x_i)\Delta = \int_{i\Delta}^{(i+1)\Delta} f(x)\, dx = p_i$, the probability of the bin. Then $H(X^{\Delta}) = -\sum_i p_i \log p_i = -\sum_i f(x_i)\Delta \log\big(f(x_i)\Delta\big) = -\sum_i \Delta\, f(x_i)\log f(x_i) - \log \Delta$.
So, as $\Delta \to 0$ the first sum converges to $-\int f(x)\log f(x)\, dx = h(X)$, and therefore $H(X^{\Delta}) + \log \Delta \to h(X)$; equivalently, an $n$-bit quantization ($\Delta = 2^{-n}$) has $H(X^{\Delta}) \approx h(X) + n$ bits.
Note: this relation is an informal argument for the property above that $h(X)$ can be positive, zero, or negative. If the density is concentrated on a width close to $0^{+}$, $h(X)$ is highly negative.
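This relation is easy to see numerically. The sketch below (my own, using a standard Gaussian truncated to [-10, 10] and a midpoint approximation of each bin probability) shows H(X^Δ) + log2 Δ approaching the closed-form h(X) = 0.5 log2(2πe) as Δ shrinks:

import math

def gaussian_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

h_closed = 0.5 * math.log2(2 * math.pi * math.e)
for delta in (0.5, 0.1, 0.01):
    H = 0.0
    n_bins = int(20 / delta)                       # bins covering [-10, 10]
    for i in range(n_bins):
        x_mid = -10 + (i + 0.5) * delta
        p = gaussian_pdf(x_mid) * delta            # midpoint approximation of the bin probability
        H -= p * math.log2(p)
    print(delta, H + math.log2(delta), h_closed)   # the first value approaches the second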
(iv) Maximum value of differential Entropy
Maximize h(X) over all probability density functions f(x) satisfying:
1. $f(x) \ge 0$, with equality outside the support set $S$
2. $\int_S f(x)\, dx = 1$
3. $\int_S f(x)\, r_i(x)\, dx = \alpha_i$ for $i = 1, 2$, where $r_1(x) = x$ and $r_2(x) = x^2$; these are called moment constraints (they fix the mean and the second moment)
Theorem: h(X) attains its maximum when $f(x) = e^{\lambda_0 + \lambda_1 x + \lambda_2 x^2}$, which, after completing the square in the exponent, is the Gaussian distribution $\mathcal{N}(\mu, \sigma^2)$.
Proof:
Step 1: construct the Lagrangian $J(f) = -\int_S f \ln f\, dx + \lambda_0 \int_S f\, dx + \lambda_1 \int_S x f\, dx + \lambda_2 \int_S x^2 f\, dx$ with Lagrange multipliers.
Step 2: differentiate $J(f)$ with respect to $f(x)$: $\frac{\partial J}{\partial f(x)} = -\ln f(x) - 1 + \lambda_0 + \lambda_1 x + \lambda_2 x^2$.
Step 3: set this derivative to 0 to obtain $f(x) = e^{\lambda_0 - 1 + \lambda_1 x + \lambda_2 x^2}$, with the $\lambda_i$ chosen to satisfy the constraints; furthermore, check (for instance using the concavity of $-f \ln f$) that this stationary point is the maximum of h(X).
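A small numerical illustration of the theorem (my own sketch): compare the closed-form differential entropies, in nats, of three densities that all have variance sigma^2. The Gaussian value is the largest:

import math

sigma = 1.0
h_gauss   = 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)   # Gaussian N(0, sigma^2)
h_uniform = math.log(sigma * math.sqrt(12))                     # Uniform of width sigma*sqrt(12)
h_laplace = 1 + math.log(sigma * math.sqrt(2))                  # Laplace with scale sigma/sqrt(2)
print(h_gauss, h_uniform, h_laplace)                            # the Gaussian entropy is the largest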
Comments (1)
sidjaggi said
at 12:48 am on Nov 15, 2011
Really excellent figures/explanations, Danny! Good job!