lecture notes 3


 

Source Coding Theorem (Achievability) (Scribe: LIANG Tong)

 

 

We now derive and use properties of the typical set of an i.i.d. Bernoulli($p$) binary source (the set of length-$n$ sequences whose fraction of ones is within $\epsilon$ of $p$) to obtain a source code.

 

As shown in the previous class/notes,

 

$\Pr\!\left(\left|\tfrac{1}{n}\sum_{i=1}^{n}X_i - p\right| > \epsilon\right) \;\le\; 2\cdot 2^{-2n\epsilon^2\log_2 e}$                             (1)

 

To remind ourselves, this bound arises by first proving that the probability that a sequence is atypical is at most $2 \cdot 2^{-n \min\{D(p+\epsilon\|p),\, D(p-\epsilon\|p)\}}$, where

$D(p\|q) = \sum_{x\in\mathcal{X}} p(x)\log_2\frac{p(x)}{q(x)}$ is defined as the relative entropy between two probability mass functions p(x) and q(x) (also called the Kullback-Leibler divergence), where $\mathcal{X}$ denotes the sample space of the random variable X (the set of values X can take).

 

Second, as asked to prove in PS2, if $|q - p| \ge \epsilon$ then $D(q\|p) \ge 2\epsilon^2 \log_2 e$ (for Bernoulli($q$) versus Bernoulli($p$)). Combining these two results gives (1).
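To make these two ingredients concrete, here is a small numerical sketch in Python (not part of the original notes; the parameters p, eps and n are arbitrary illustrative choices, and the constant $2\epsilon^2\log_2 e$ is the Pinsker-style form of the PS2 bound assumed in the reconstruction above). It checks the divergence lower bound and compares the resulting bound on the probability of atypicality with a Monte Carlo estimate.

```python
import math
import random

def D(q, p):
    """Binary relative entropy D(Bern(q) || Bern(p)) in bits."""
    total = 0.0
    for a, b in ((q, p), (1 - q, 1 - p)):
        if a > 0:
            total += a * math.log2(a / b)
    return total

p, eps, n = 0.3, 0.05, 500

# PS2-style lower bound: D(p +/- eps || p) >= 2 * eps^2 * log2(e)
lower = 2 * eps ** 2 * math.log2(math.e)
assert D(p + eps, p) >= lower and D(p - eps, p) >= lower

# First-step bound: Pr(atypical) <= 2 * 2^{-n * min(D(p+eps||p), D(p-eps||p))}
bound = 2 * 2 ** (-n * min(D(p + eps, p), D(p - eps, p)))

# Monte Carlo estimate of the true probability of atypicality, for comparison.
random.seed(0)
trials = 5000
atypical = sum(
    abs(sum(random.random() < p for _ in range(n)) / n - p) > eps
    for _ in range(trials)
)
print(f"bound = {bound:.3e}, empirical = {atypical / trials:.3e}")
```

The bound is loose, but, as (1) promises, it decays exponentially in $n\epsilon^2$.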

 

For a "reasonable" source code we would expect that Formula be very small, or in fact asymptotically negligible in n. This is possible if we let Formula be a function Formula of n, and set Formula. In all that follows, we condition on the assumption that Formula is indeed typical, and that

          $n(p - \epsilon) \;\le\; k \;\le\; n(p + \epsilon)$                                                                            (2)
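As a quick sanity check of this choice of $\epsilon(n)$ (a sketch; the exponent $-1/3$ is one convenient choice, and any $\epsilon(n)$ with $\epsilon(n) \to 0$ and $n\epsilon(n)^2 \to \infty$ would do), the right-hand side of (1) indeed vanishes as n grows:

```python
import math

# With eps(n) = n^{-1/3} we have eps -> 0 and n * eps^2 = n^{1/3} -> infinity,
# so the bound 2 * 2^{-2 n eps^2 log2(e)} from (1) still decays to zero.
for n in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    eps = n ** (-1 / 3)
    bound = 2 * 2 ** (-2 * n * eps ** 2 * math.log2(math.e))
    print(f"n = {n:>8}  eps(n) = {eps:.4f}  Pr(atypical) <= {bound:.2e}")
```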

  

 

Bound on the size of the typical set

 

The number of typical $x^n$ is

               $\sum_{k=\lceil n(p-\epsilon)\rceil}^{\lfloor n(p+\epsilon)\rfloor} \binom{n}{k} \;\le\; (2n\epsilon + 1) \max_{k:\,|k/n - p| \le \epsilon} \binom{n}{k}$

               $\le\; (2n\epsilon + 1)\, 2^{nH(p+\epsilon)}$ (using $\binom{n}{k} \le 2^{nH(k/n)}$ and the fact that H(.) is increasing on [0, 1/2]; here we assume that p < 1/2, so that for large enough n we also have $p + \epsilon < 1/2$; if p > 1/2, a similar argument yields an upper bound of $(2n\epsilon + 1)\, 2^{nH(p-\epsilon)}$)

Question :                How much larger than H(p) can $H(p+\epsilon)$ be?

               By the Taylor series expansion of H(p),   $H(p+\epsilon) = H(p) + \epsilon H'(p')$ for some p' in $(p, p+\epsilon)$. For fixed p distinct from 0 and 1, the maximum value of $|H'(p')|$ can be bounded by some c (that is a function of p, but not of $\epsilon$), so that $H(p+\epsilon) \le H(p) + c\epsilon$.
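A quick numerical check of this Taylor-expansion step (a sketch under the assumptions that p is bounded away from 0 and 1 and that $p + \epsilon < 1/2$; the closed form $H'(x) = \log_2\frac{1-x}{x}$ is the standard derivative of the binary entropy):

```python
import math

def H(x):
    """Binary entropy in bits."""
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def H_prime(x):
    """Derivative of the binary entropy: H'(x) = log2((1 - x) / x)."""
    return math.log2((1 - x) / x)

p, eps = 0.3, 0.01                      # illustrative values with p + eps < 1/2
# H' is decreasing, so |H'| on [p, p + eps] is maximized at an endpoint.
c = max(abs(H_prime(p)), abs(H_prime(p + eps)))
assert H(p + eps) <= H(p) + c * eps
print(f"H(p + eps) = {H(p + eps):.6f}  <=  H(p) + c*eps = {H(p) + c * eps:.6f}")
```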

 

 

Therefore, the size of the typical set is at most $(2n\epsilon + 1)\, 2^{n(H(p) + c\epsilon)}$.
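The following sketch (not from the notes; the parameters are arbitrary illustrative choices) compares the exact size of the typical set, i.e. the sum of $\binom{n}{k}$ over the typical values of k, against the bound derived above:

```python
import math

def H(x):
    """Binary entropy in bits."""
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

p, eps, n = 0.3, 0.05, 200              # illustrative values with p + eps < 1/2
lo, hi = math.ceil(n * (p - eps)), math.floor(n * (p + eps))

exact = sum(math.comb(n, k) for k in range(lo, hi + 1))    # |typical set|, exactly
bound = (2 * n * eps + 1) * 2 ** (n * H(p + eps))          # the bound derived above

assert exact <= bound
print(f"|typical set| = {exact:.3e}  <=  bound = {bound:.3e}")
```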

 

 

Source coding scheme

 

1. Use the first bit to describe whether the sequence is typical or not

2(a). If the sequence is atypical, describe the entire sequence uncompressed using n bits.

2(b). If the sequence is typical, describe the index number of the sequence in the set of typical sequences, using at most $\lceil \log_2\big((2n\epsilon + 1)\, 2^{n(H(p) + c\epsilon)}\big)\rceil \;\le\; n(H(p) + c\epsilon) + \log_2(2n\epsilon + 1) + 1$ bits.

 

Expected total # bits transmitted  (one bit to describe whether the sequence is typical or not, and then the bits needed to describe the sequence in the typical/atypical case):

 $\le\; 1 \;+\; \Pr(x^n \text{ atypical}) \cdot n \;+\; \Pr(x^n \text{ typical}) \cdot \big(n(H(p) + c\epsilon) + \log_2(2n\epsilon + 1) + 1\big) \;=\; n\big(H(p) + o(1)\big)$

for $\epsilon(n) = n^{-1/3}$, so the expected rate tends to H(p).
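To see the expected rate approach H(p), here is a sketch (not code from the notes; the index assignment for typical sequences is left implicit, since only its length matters here) that evaluates the expected length of the flag-plus-index scheme exactly by grouping sequences according to their number of ones k, so no enumeration over all $2^n$ sequences is needed:

```python
import math

def H(x):
    """Binary entropy in bits."""
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def expected_rate(n, p, eps):
    """Exact expected bits/symbol of the scheme: 1 flag bit + index or raw bits."""
    lo, hi = math.ceil(n * (p - eps)), math.floor(n * (p + eps))
    typical_size = sum(math.comb(n, k) for k in range(lo, hi + 1))
    index_bits = math.ceil(math.log2(typical_size))    # bits to index a typical sequence

    # Pr(exactly k ones) = C(n, k) p^k (1 - p)^(n - k), computed in the log domain.
    def pr(k):
        log2pk = math.log2(math.comb(n, k)) + k * math.log2(p) + (n - k) * math.log2(1 - p)
        return 2.0 ** log2pk

    p_typical = sum(pr(k) for k in range(lo, hi + 1))
    expected_bits = 1 + p_typical * index_bits + (1 - p_typical) * n
    return expected_bits / n

p = 0.3
print(f"H(p) = {H(p):.3f}")
for n in (100, 1000, 10000):
    eps = n ** (-1 / 3)                                # the choice of eps(n) made above
    print(f"n = {n:>5}  expected rate = {expected_rate(n, p, eps):.3f}")
```

For small n, $\epsilon(n)$ is still large, so the typical set contains almost all sequences and the scheme is no better than sending the raw bits; the printed rates then decrease toward H(p) as n grows, slowly, because $\epsilon(n) = n^{-1/3}$ shrinks slowly.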

 

Note that the size of the typical set decreases as $\epsilon$ decreases, but correspondingly the probability of atypicality increases.

 

 

Discussion of typicality

 

 

p < 1/2 and n = 10 in the following figures:

Fig. 1: the relation between the # of heads k and the probability of an individual sequence with k heads.

Fig. 2: the relation between the # of heads k and the size of T(k,n), the set of all length-n sequences with exactly k heads.

  

Note that the value of k for which the elements of T(k,n) have the largest (individual) probabilities is k = 0 (Figure 1), and the value of k for which the size of the set T(k,n) is largest is k = n/2 (Figure 2).

However, as we saw via the analysis before, the most likely T(k,n) (the type-class with the largest probability of occurring) is T(np, n), i.e., when k = np.
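These observations are easy to reproduce numerically. The sketch below uses n = 10 and p = 0.3 (the exact value of p used for the figures is not recoverable, so p = 0.3 is an assumption) and tabulates, for each k, the probability of a single sequence with k heads, the size of T(k,n), and the total probability of T(k,n):

```python
import math

n, p = 10, 0.3                                   # p < 1/2, as in the figures; p = 0.3 is assumed
rows = []
for k in range(n + 1):
    per_seq = p ** k * (1 - p) ** (n - k)        # probability of one sequence with k heads
    size = math.comb(n, k)                       # |T(k, n)|
    rows.append((k, per_seq, size, per_seq * size))

print(" k   Pr(one sequence)   |T(k,n)|   Pr(T(k,n))")
for k, per_seq, size, total in rows:
    print(f"{k:2d}   {per_seq:.6f}           {size:4d}      {total:.4f}")

print("per-sequence probability largest at k =", max(rows, key=lambda r: r[1])[0])   # k = 0
print("|T(k,n)| largest at k =", max(rows, key=lambda r: r[2])[0])                   # k = n/2
print("Pr(T(k,n)) largest at k =", max(rows, key=lambda r: r[3])[0])                 # k close to n*p
```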

 

 

 

Binary source coding theorem

 

(a) For every $\delta > 0$ there exists an N such that for all n > N, there exists a source code with expected rate at most $H(p) + \delta$ and 0-error.

(b) There does not exist a code with vanishing probability of error and expected rate less than H(p) (we shall prove this "converse" in a later class).

 

The above theorem can be generalized to general i.i.d. sources over finite alphabets, with the only difference being in how the corresponding entropy function is defined -- the achievability is left as an exercise for the reader.

Comments (10)

Cho Yiu Ng said

at 11:23 am on Jan 29, 2009

No comments?

sidjaggi said

at 12:37 am on Feb 2, 2009

Hmm, I really would like the class to join in and comment a bit on this, before I jump in and edit. Open for comments until Friday the 6th of Feb.

Mak said

at 2:45 pm on Feb 2, 2009

You wrote: "the typical set decrease, as the epsilon decrease." What does that mean? Thanks.

Silas said

at 10:13 am on Feb 4, 2009

I cannot follow the logical flow easily in this scribe note. May you add some connectives between statements?
Thanks a lot.

Silas said

at 10:18 am on Feb 4, 2009

Since H(p) is not an increasing function, H(p+\epsilon) >= H(k/n) does not hold in general. Therefore, I don't understand why it is true in the middle of the scribe note. Please correct me if I am wrong.

Tong said

at 11:38 am on Feb 4, 2009

"Since H(p) is not an increasing function, H(p+\epsilon) >= H(k/n) does not hold in general"
Here, p< 1/2. and p+\epsilon < 1/2 (i think it is because the epsilon is a small number)

Cho Yiu Ng said

at 11:14 am on Feb 5, 2009

What is the script X in your definition of divergence?

Please make sure all the symbols are defined and state the assumptions explicitly.

Tong said

at 11:51 am on Feb 5, 2009

the set of the value which x can take

Cho Yiu Ng said

at 5:03 pm on Feb 5, 2009

Please put this in the scribe note.

sidjaggi said

at 8:10 am on Feb 10, 2009

Finished my revision -- if there are any comments, speak now, or forever hold your peace :)
