## Arithmetic Codes

This is the last lossless source coding scheme covered in this course.

**(Encoding)**

We are going to illustrate the principle of arithmetic coding by an example:

Consider a binary source with symbol probabilities $p(0)$ and $p(1)$. Instead of using a sequence of bits to represent a certain outcome, we use a subinterval of $[0, 1)$ to denote the outcome.

As shown in the chart below, all $2^n$ possible outcomes of $n$ source symbols are represented by disjoint subintervals of $[0, 1)$ at level $n$. Each sequence $x_1 \ldots x_n$ is represented by a subinterval whose length equals its probability $p(x_1 \ldots x_n)$.

After we know the length $\ell$ of the interval, we can be sure that the point given by the first $\lceil \log_2(1/\ell) \rceil + 1$ bits of the binary representation of the midpoint of the interval will fall into this interval, so we choose this finite binary string to be the representation of this interval.

For example, an interval of length $2^{-k}$ can be represented by the first $k + 1$ bits of the binary expansion of its midpoint (the digit before the binary point and the point itself are omitted, because they are the same for every point in $[0, 1)$).
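The encoding step above can be sketched in a few lines of Python. This is a minimal floating-point illustration, not a production coder (real implementations use integer arithmetic to avoid rounding error); the function name `encode` and the probability table `p` are my own choices, not from the lecture.

```python
from math import ceil, log2

def encode(seq, p):
    """Map a binary sequence to its subinterval of [0, 1), then emit the
    first ceil(log2(1/length)) + 1 bits of the interval's midpoint."""
    low, width = 0.0, 1.0
    for s in seq:
        if s == 0:                      # symbol 0 takes the left part of the interval
            width *= p[0]
        else:                           # symbol 1 takes the right part
            low += width * p[0]
            width *= p[1]
    mid = low + width / 2
    k = ceil(log2(1.0 / width)) + 1     # enough bits to stay inside the interval
    bits = []
    for _ in range(k):                  # peel off the binary digits of the midpoint
        mid *= 2
        bits.append("1" if mid >= 1 else "0")
        mid -= int(mid)
    return "".join(bits), (low, low + width)
```

For instance, `encode([1, 0], {0: 0.5, 1: 0.5})` returns the bits `"101"` together with the interval `(0.5, 0.75)`, and indeed $0.101_2 = 0.625$ lies inside that interval.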

**(Decoding)**

After the code is sent to the decoder, the decoder constructs the same subintervals corresponding to the level the encoder is using and sees which subinterval the code falls into. From this the decoder can recover the original sequence.
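The decoding step can be sketched the same way, assuming the decoder also knows the block length `n` (or some termination convention) in addition to the distribution. Again a hedged floating-point illustration with hypothetical names:

```python
def decode(bits, p, n):
    """Recover n source symbols by locating the code point inside the
    nested subintervals the encoder would have built."""
    # value of the binary fraction "0.bits"
    x = sum(int(b) / 2 ** (i + 1) for i, b in enumerate(bits))
    low, width = 0.0, 1.0
    out = []
    for _ in range(n):
        split = low + width * p[0]   # boundary between the 0- and 1-subintervals
        if x < split:                # code point falls in the left part: symbol 0
            out.append(0)
            width *= p[0]
        else:                        # code point falls in the right part: symbol 1
            out.append(1)
            low = split
            width *= p[1]
    return out
```

Continuing the example above, `decode("101", {0: 0.5, 1: 0.5}, 2)` recovers `[1, 0]`.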

**Remarks:**

1. Adding an additional symbol to the source block takes only constant time to re-encode with arithmetic codes, which is a lot better than Huffman codes, where the codebook over all (exponentially many) blocks must be rebuilt.

2. The arithmetic coding procedure, given any length $n$ and probability mass function $p(x_1, \ldots, x_n)$, enables one to encode the sequence $x_1 \ldots x_n$ in a code of length $\lceil \log \frac{1}{p(x_1, \ldots, x_n)} \rceil + 1$ bits; this procedure achieves an average length for the block that is within 2 bits of the entropy.

3. The arithmetic coding scheme works even if the source symbols are not independent. For example, when the source symbols follow a Markov chain, then $p(x_1, \ldots, x_n) = p(x_1) \prod_{i=2}^{n} p(x_i \mid x_{i-1})$, but the principle does not need to change: each interval is simply split according to the conditional probabilities.

4. When the encoder and the decoder do not know the distribution of the source, they can use the symbols they have already seen to estimate the distribution; by the law of large numbers, this estimated distribution converges to the real distribution in probability.
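Remark 4 can be illustrated with a simple adaptive estimate. The add-one (Laplace) smoothing below is just one common choice, not necessarily the one used in the lecture; what matters is that the encoder and decoder update the estimate from the same already-processed symbols, so they stay synchronized.

```python
def adaptive_probs(history, alpha=1.0):
    """Estimate the binary symbol distribution from the symbols seen so far,
    with add-alpha (Laplace) smoothing so no probability is ever zero."""
    n0 = history.count(0)
    n = len(history)
    p0 = (n0 + alpha) / (n + 2 * alpha)   # smoothed relative frequency of 0
    return {0: p0, 1: 1.0 - p0}
```

With no history the estimate is uniform, `{0: 0.5, 1: 0.5}`, and as the history grows the estimate approaches the empirical (and hence, by the law of large numbers, the true) distribution.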

**Weak Typicality**

We define the $\epsilon$-weakly typical set $A_\epsilon^{(n)}$ as the set of all $(x_1, \ldots, x_n)$ such that:

$$\left| -\frac{1}{n} \log p(x_1, \ldots, x_n) - H(X) \right| \le \epsilon.$$

By manipulating the inequality above, we get:

$$2^{-n(H(X) + \epsilon)} \le p(x_1, \ldots, x_n) \le 2^{-n(H(X) - \epsilon)}.$$

By the weak law of large numbers, $-\frac{1}{n} \log p(X_1, \ldots, X_n) \to H(X)$ in probability, so $\Pr\{(X_1, \ldots, X_n) \in A_\epsilon^{(n)}\} \to 1$.

Now what if the $X_i$'s are not i.i.d.? Recall the definition of the entropy rate:

$$H = \lim_{n \to \infty} \frac{1}{n} H(X_1, \ldots, X_n),$$

so even if the $X_i$'s are not i.i.d., we can still define weak typicality in a similar way.

The $\epsilon$-weakly typical set $A_\epsilon^{(n)}$ is the set of all $(x_1, \ldots, x_n)$ such that:

$$\left| -\frac{1}{n} \log p(x_1, \ldots, x_n) - H \right| \le \epsilon.$$

Now let's calculate the number of elements in $A_\epsilon^{(n)}$. Every element has probability at least $2^{-n(H + \epsilon)}$, so

$$1 \ge \sum_{x^n \in A_\epsilon^{(n)}} p(x_1, \ldots, x_n) \ge |A_\epsilon^{(n)}| \, 2^{-n(H + \epsilon)}, \quad \text{i.e.} \quad |A_\epsilon^{(n)}| \le 2^{n(H + \epsilon)}.$$

Then we try to get a lower bound for $|A_\epsilon^{(n)}|$. For $n$ sufficiently large, $\Pr\{A_\epsilon^{(n)}\} \ge 1 - \epsilon$, so

$$1 - \epsilon \le \sum_{x^n \in A_\epsilon^{(n)}} p(x_1, \ldots, x_n) \le |A_\epsilon^{(n)}| \, 2^{-n(H - \epsilon)}.$$

Hence, we can get $|A_\epsilon^{(n)}| \ge (1 - \epsilon) \, 2^{n(H - \epsilon)}$.

So the weakly typical set contains roughly $2^{nH}$ sequences, each with probability roughly $2^{-nH}$, which makes weak typicality a good way of capturing typicality.
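The two bounds on $|A_\epsilon^{(n)}|$ can be checked by brute force for a small i.i.d. binary source; the parameters $n = 10$, $p = 0.3$, $\epsilon = 0.2$ below are illustrative choices of mine, not from the lecture.

```python
from itertools import product
from math import log2

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

def typical_set(n, p, eps):
    """All binary sequences x^n with |-(1/n) log2 p(x^n) - H(X)| <= eps."""
    h = binary_entropy(p)
    A = []
    for x in product([0, 1], repeat=n):
        logp = sum(log2(p) if s == 1 else log2(1 - p) for s in x)
        if abs(-logp / n - h) <= eps:
            A.append(x)
    return A

n, p, eps = 10, 0.3, 0.2
A = typical_set(n, p, eps)
h = binary_entropy(p)
assert len(A) <= 2 ** (n * (h + eps))               # upper bound always holds
assert len(A) >= (1 - eps) * 2 ** (n * (h - eps))   # lower bound: needs n large; holds here
```

For these parameters the typical set consists of the sequences with 2, 3, or 4 ones, so $|A_\epsilon^{(n)}| = \binom{10}{2} + \binom{10}{3} + \binom{10}{4} = 375$, comfortably between the two bounds.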

## Comments (1)

## Lingxiao XIA said

at 10:04 pm on Mar 20, 2009

Sorry for this late modification of the scribe note; I thought I had already finished it until I saw the *under construction* sign. Anyway, it is finished now; any comments are welcome.
