If you are citizen of an European Union member nation, you may not use this service unless you are at least 16 years old.
You already know Dokkio is an AI-powered assistant to organize & manage your digital files & messages. Very soon, Dokkio will support Outlook as well as One Drive. Check it out today!

Scribe Note 5

Page history last edited by Yip Kit Sang Danny 12 years, 5 months ago

Q.2 Arithmetic Code

(i) Introduction

The descriptions and diagrams below illustrate the idea of Arithmetic code:

Let p(0) = p, p(1) = 1 - p,

Length of input	Set of all input	Probability Function	Distribution Function	Intervals
1	{0, 1}
2	{00, 01, 10, 11}
3	{000, 001, 010, 011, 100, 101, 110, 111}
4	{0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111}		(Just leave it blank, too trouble)	(leave it blank)

In this diagram, each line represents corresponding length of input of arithmetic code. This diagram also illustrates the code width of a input.

Examples:

(a) code width of 01 and 10 is p(1-p)

(b) code width of 001, 010, 100 is Formula

Furthermore, the sum of all code width is 1.

This diagram illustrate the distribution function of arithmetic code up to length of input = 3. Red line, purple line and blue line represent length of input equal to 1, 2 and 3 respectively. The indices at x-axis are sets of inputs, while the vertical distance is the code width.

Here comes to the problem, what are the codes?

If Formula is the input, Formula is the arithmetic binary-code, then Formula

Example: Let's say p = 0.4, n = 3

(a) Arithmetic code 0011: in binary, Formula .

As input Formula , so, the arithmetic code of 010 is 0011.

(b) For input Formula , we find out that Formula ,

So, the arithmetic code for 001 is 0010.

(ii) Arithmetic Code is almost best code

Arithmetic code requires Formula bits, 1 bit is lost due to quantization, another 1 bit is lost due to the open interval at the end

Arithmetic code is almost best code, since Huffman achieves Formula bits and Huffman is best code.

(iii) Encoding

Denote Formula be the input, width of interval = Formula ,

and start of interval (a) = Formula

1. Evaluate the interval for the input with the above formulae.

2. Find a code that binary floating point value falling into the range of the interval.

(iv) Decoding

1. Evaluate the floating point value of code.

2. Binary search the floating point value on [0, 1) for n steps, if the floating point value is on the left, pick [a, (1-p)a+pb), otherwise pick [(1-p)a+pb, b).

(v) Advantages of Arithmetic Code

1. Arithmetic code is almost best code, with at most 2 bits more than the best code.

2. It is easy for Arithmetic code to handle non-binary cases by adding more parameters.

3. The design arithmetic code facilitates different parameters at different position of the code.

4. Coding procedure is incremental and can be used for any blocklength.

5. Arithmetic code is memoryless.

Q.4 Differential Entropy h(x)

(i) Definitions

Natural definition of differential entropy Formula

Joint differential entropy Formula

Conditional differential entropy Formula

Mutual information Formula

Kullback-leibler Divergence (or relative different entropy) Formula

(ii) Properties of Differential Entropy (Highlight)

1. h(X) can be positive, zero and even negative

Examples: f(x) = 0.5 on [0, 2] gives positive h(X), f(x) = 1 on [0, 1] gives zero h(X), f(x) = 2 on [0, 0.5] gives negative h(X).

2. Formula

3. Chain rules for differential entropy: Formula

4. Chain rules for mutual information: Formula

5. Kullback-leibler divergence and mutual information of 2 random variables are non-negative

6. Data Processing Inequality: Formula

(iii) Relation of Differential Entropy to Discrete Entropy

Show that: Formula

Proof:

Formula

So, Formula

Note: This equation is an informal proof for the property of h(p), can be positive, zero and negative. If Formula close to 0+, h(p) is highly negative.