Some notations in symbol-by-symbol source coding

$\mathcal{X}$: the alphabet, i.e. the set of source symbols

c(x): the codeword of x, i.e. the bit sequence assigned to x

l(x): the length of the codeword of x, i.e. the number of bits in c(x)

$l_{\max} = \max_{x} l(x)$: the maximum length among all codewords

L: the expected length of codewords, i.e. $L = \mathbb{E}[l(X)] = \sum_{x \in \mathcal{X}} p(x)\, l(x)$

If there are m symbols in $\mathcal{X}$, then $L = \sum_{i=1}^{m} p_i\, l_i$
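As a quick illustration (the probabilities and codewords below are made up for this example, not from the problem set), the quantities above can be computed directly:

```python
# Hypothetical source with m = 4 symbols and an example code.
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
c = {"a": "0", "b": "10", "c": "110", "d": "111"}

l = {x: len(cw) for x, cw in c.items()}   # l(x): codeword lengths
L = sum(p[x] * l[x] for x in p)           # L = sum_x p(x) l(x)
l_max = max(l.values())                   # maximum codeword length

print(L, l_max)  # 0.5*1 + 0.25*2 + 0.125*3 + 0.125*3 = 1.75, l_max = 3
```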

Definitions of different classes of code:

1. nonsingular code

Every element in $\mathcal{X}$ maps into a different codeword string.

i.e. $x \neq x' \Rightarrow c(x) \neq c(x')$

2. uniquely decodable code

Different source sequences map into different streams of codewords.

i.e. the extension $c^*(x_1 x_2 \cdots x_n) = c(x_1) c(x_2) \cdots c(x_n)$ is nonsingular.

(If $x_1 \cdots x_n \neq x_1' \cdots x_m'$, then the codeword stream $c(x_1) \cdots c(x_n) \neq c(x_1') \cdots c(x_m')$.)

3. Prefix code (a.k.a. prefix-free code, instantaneous code)

No codeword is a prefix of any other codeword.
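These definitions are easy to test on a finite code; a small sketch (the function names here are mine, not from the notes):

```python
def is_nonsingular(code):
    """Every symbol maps to a distinct codeword."""
    return len(set(code.values())) == len(code)

def is_prefix_free(code):
    """No codeword is a prefix of any other codeword."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words)) for j in range(len(words))
    )

# {0, 01} is nonsingular but not prefix-free; {0, 10, 110, 111} is both.
print(is_prefix_free({"a": "0", "b": "01"}))                          # False
print(is_prefix_free({"a": "0", "b": "10", "c": "110", "d": "111"}))  # True
```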

Relationship between these classes of code:

[Figure: nested classes of codes: all codes ⊇ nonsingular ⊇ uniquely decodable ⊇ prefix codes]

From the textbook, Cover & Thomas, page 106.

Questions in Problem Set 4

Q2

(b)

// $x^N$ can be any sequence consisting of N symbols; $l(x^N) = \sum_{k=1}^{N} l(x_k)$ is the length of its codeword stream

Comparing the two expressions in $\left(\sum_{x} 2^{-l(x)}\right)^{N} = \sum_{x^N} 2^{-l(x^N)} = \sum_{i=1}^{N l_{\max}} A(i)\, 2^{-i}$, we find that for each possible stream length $i$ ($1 \le i \le N l_{\max}$), $A(i)$ is the number of source sequences mapping into a length-$i$ codeword stream.

The number of distinct length-$i$ codeword streams is $2^i$ (binary case).

By the definition of a uniquely decodable code, at most one source sequence maps into each codeword stream.

Therefore, no more than $2^i$ source sequences can be mapped into length-$i$ codeword streams, i.e. $A(i) \le 2^i$, so $\left(\sum_{x} 2^{-l(x)}\right)^{N} = \sum_{i=1}^{N l_{\max}} A(i)\, 2^{-i} \le \sum_{i=1}^{N l_{\max}} 1 = N l_{\max}$.

Taking the Nth root on both sides, we get $\sum_{x} 2^{-l(x)} \le (N l_{\max})^{1/N}$.

(c)

The inequality we derived in (b) holds for any N.

In order to find the tightest upper bound for the LHS, we minimize the RHS over N.

By simple analysis, the RHS $(N l_{\max})^{1/N}$ tends to its infimum, 1, as $N \to \infty$.

So $\sum_{x} 2^{-l(x)} \le 1$. The Kraft inequality holds in this specific case.

Following the same procedure, we can prove that every uniquely decodable code satisfies the Kraft inequality.
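The argument in (b) and (c) can be checked numerically: the Kraft sum is bounded by $(N l_{\max})^{1/N}$ for every N, and this bound decreases toward 1. A sketch using the illustrative lengths {1, 2, 3, 3}:

```python
# Codeword lengths of a prefix (hence uniquely decodable) code.
lengths = [1, 2, 3, 3]
kraft_sum = sum(2 ** (-l) for l in lengths)   # sum_x 2^{-l(x)}
l_max = max(lengths)

# The bound (N * l_max)^{1/N} from part (b), for increasing N.
for N in (1, 10, 100, 1000):
    bound = (N * l_max) ** (1 / N)
    assert kraft_sum <= bound
    print(N, bound)        # the bound shrinks toward 1 as N grows

print(kraft_sum)           # 1.0 <= 1: the Kraft inequality holds
```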

Q3 is basically skipped in class.

Some questions to think about:

$L \ge H(X)$. Is H(X) a good lower bound for L? Is it possible to get equality?

The answers are affirmative (e.g. fair coin tossing).

Note that this bound is true for any uniquely decodable code.
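For the fair-coin example, the lower bound is met with equality: H(X) = 1 bit and the code {head → 0, tail → 1} has L = 1. A quick check:

```python
import math

p = {"head": 0.5, "tail": 0.5}
code = {"head": "0", "tail": "1"}

H = -sum(q * math.log2(q) for q in p.values())   # entropy H(X)
L = sum(p[x] * len(code[x]) for x in p)          # expected codeword length

print(H, L)   # 1.0 1.0 -- the bound H(X) <= L holds with equality
```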

Q4

(a)

Why is it hard to achieve the entropy bound?

l(i) is always an integer, while log(1/p(i)) is usually not an integer (often not even a rational number), so the two are generally unequal.

The easiest way to turn the logarithm into an integer is to apply the ceiling function.

(b)

Shannon code: choose l(i) such that $l(i) = \left\lceil \log_2 \frac{1}{p(i)} \right\rceil$.

First of all, make sure that the Shannon code can be realized as a prefix code: since $\sum_i 2^{-l(i)} \le \sum_i 2^{-\log_2(1/p(i))} = \sum_i p(i) = 1$, the Kraft inequality is satisfied.

By direct computation, we get the upper bound for L: $L = \sum_i p(i)\, l(i) < \sum_i p(i)\left(\log_2 \frac{1}{p(i)} + 1\right) = H(X) + 1$.

Consider again the bounds for L: $H(X) \le L < H(X) + 1$. The gap may be large sometimes.

e.g. biased coin tossing with Pr(head) = 0.99999 and Pr(tail) = 0.00001: H(X) is nearly 0 while L is 1.
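Both the Shannon construction and the biased-coin example can be verified directly (the helper name `shannon_lengths` is mine):

```python
import math

def shannon_lengths(p):
    """Shannon code lengths: l(i) = ceil(log2(1/p(i)))."""
    return {x: math.ceil(math.log2(1 / q)) for x, q in p.items()}

p = {"head": 0.99999, "tail": 0.00001}
l = shannon_lengths(p)                           # head -> 1 bit, tail -> 17 bits

kraft = sum(2 ** (-li) for li in l.values())     # <= 1, so a prefix code exists
H = -sum(q * math.log2(q) for q in p.values())   # nearly 0
L = sum(p[x] * l[x] for x in p)                  # expected Shannon-code length

assert kraft <= 1
print(H, L)   # H is tiny, yet L is at least 1 bit per symbol
```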

(c)

To improve the performance, we can encode a sequence of symbols instead of one symbol at a time.

Let $L_N$ denote the expected length of the codeword stream when the source sequence has N symbols.

By (b), applying the Shannon code to blocks of N symbols gives $H(X_1, \dots, X_N) \le L_N < H(X_1, \dots, X_N) + 1$, and for an i.i.d. source $H(X_1, \dots, X_N) = N H(X)$.

Then $H(X) \le \frac{L_N}{N} < H(X) + \frac{1}{N}$. The gap is reduced to 1/N, which tends to 0 as N tends to infinity.

However, when N is large, the space complexity is quite high, since the code table must cover all $|\mathcal{X}|^N$ blocks.
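The shrinking 1/N gap can be observed by Shannon-coding blocks of N flips of a biased coin (a sketch with an illustrative bias of 0.9; enumerating all 2^N blocks is exactly the space cost mentioned above):

```python
import math
from itertools import product

p_head = 0.9  # illustrative bias, not from the problem set

def per_symbol_length(N):
    """Expected Shannon-code length per symbol, coding blocks of N flips."""
    total = 0.0
    for block in product((p_head, 1 - p_head), repeat=N):
        q = math.prod(block)                       # probability of this block
        total += q * math.ceil(math.log2(1 / q))   # block codeword length
    return total / N

H = -(p_head * math.log2(p_head) + (1 - p_head) * math.log2(1 - p_head))
for N in (1, 2, 4, 8):
    print(N, per_symbol_length(N))  # approaches H (~0.469), gap < 1/N
```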

Q5 Concentrate on prefix code

(a)

i) Can easily be done using proof by contradiction.

(Be careful with the signs: swapping the codewords of symbols $i$ and $j$ changes the expected length by $(p_i - p_j)(l_j - l_i)$, so if $p_i > p_j$, optimality forces $l_i \le l_j$.)

ii) Consider a code tree.

In an optimal code, the two longest codewords have the same length and differ only in the last bit; that is equivalent to saying

the two codewords are siblings (having the same parent node) in the code tree.

Proof by contradiction:

Suppose one longest codeword doesn't have a sibling; then we can move it up to its parent. (No codeword can occupy the parent position originally, since the code is prefix-free.) The expected length is thereby shortened and a better code is formed, which contradicts the assumption that the code is optimal.

iii) Greedy algorithm: each time, combine the two least probable symbols and keep the others unchanged.

Illustrated by the following example from the textbook, Cover & Thomas, page 119.

[Table: Huffman coding example, Cover & Thomas, p. 119]

Let L denote the original expected length and L' denote the expected length after one iteration.

Minimizing L is equivalent to minimizing L'. (A formal proof is in the textbook.)

The code generated by this algorithm is also the best among all uniquely decodable codes. (Why?)

(Remark: This algorithm is one type of dynamic programming.)
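The combine-two-smallest procedure is Huffman's algorithm; a minimal heap-based sketch, run on a five-symbol source similar to the textbook example:

```python
import heapq

def huffman_code(p):
    """Build a Huffman code: repeatedly merge the two least probable nodes."""
    # Heap entries: (probability, tie-breaker, {symbol: partial codeword}).
    heap = [(q, i, {x: ""}) for i, (x, q) in enumerate(p.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        q0, _, c0 = heapq.heappop(heap)   # least probable subtree
        q1, _, c1 = heapq.heappop(heap)   # second least probable
        merged = {x: "0" + w for x, w in c0.items()}
        merged.update({x: "1" + w for x, w in c1.items()})
        heapq.heappush(heap, (q0 + q1, counter, merged))
        counter += 1
    return heap[0][2]

p = {1: 0.25, 2: 0.25, 3: 0.2, 4: 0.15, 5: 0.15}
code = huffman_code(p)
L = sum(p[x] * len(code[x]) for x in p)
print(code, L)   # expected length 2.3 bits for this source
```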

(e)

Huffman coding is not adopted in many practical applications because the sender has to gather the statistics of the source file, form the code tree, and send the tree before the contents of the file can be transmitted.
