Problem Set 3


Source Coding Theorem/Channel Coding Theorem

 

Q1. (Converse for source coding theorem:) A source generates n i.i.d. random variables $X^n = (X_1, \ldots, X_n)$, each with entropy $H(X)$. The encoder Alice encodes this into a string of $nR$ bits $Y^{nR} = (Y_1, \ldots, Y_{nR})$ ($R$ is the rate). From these bits the decoder Bob produces an estimate $\hat{X}^n$ of the source. For any (Alice, Bob) pair, justify each step in the following sequence of equalities and inequalities.
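
(A standard form of the intended chain, reconstructed from the setup above; the precise steps may differ from the original. Here $P_e = \Pr[\hat{X}^n \neq X^n]$ and $\mathcal{X}$ is the source alphabet.)

    nH(X) = H(X^n)
          = H(X^n \mid Y^{nR}) + I(X^n ; Y^{nR})
          \le H(X^n \mid Y^{nR}) + H(Y^{nR})
          \le H(X^n \mid \hat{X}^n) + nR
          \le 1 + P_e \, n \log|\mathcal{X}| + nR.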

Use this to prove a lower bound on the probability of error if R < H(X). Under what conditions are the inequalities tight? Is it crucial that we use fixed-length block coding here?

 

Key concepts in Q1: Converse to a coding theorem, chain of entropy equalities and inequalities, positive bound on probability of error.

 

 

Q2. (Achievability for source coding theorem:) Consider the following encoding scheme for the source in Q1. For each source sequence $x^n$, Alice assigns $nR$ bits $Y^{nR}(x^n)$ i.i.d. uniformly at random. (Note that for $R < 1$ this may be a many-to-one assignment.) This assignment is known a priori to both Alice and Bob, and is called the codebook. To decode, given a string $y^{nR}$, Bob examines the codebook to see if there is a unique typical $x^n$ that generates this $y^{nR}$. If so, he outputs that $x^n$. Else he declares a decoding error.
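
Before the analysis below, here is a minimal simulation sketch of this scheme, specialized to a Bernoulli(q) source. The parameters q, n, R, eps, and the trial count are illustrative assumptions, not part of the problem; n is tiny here, so expect a visibly nonzero error (the theorem is asymptotic in n).

    import itertools, math, random

    q, n, R, eps = 0.11, 12, 0.75, 0.1     # H(X) ~ 0.5, so R = 0.75 > H(X)
    H = -q * math.log2(q) - (1 - q) * math.log2(1 - q)
    nR = round(n * R)                      # codeword length in bits

    def is_typical(x):
        """Weak typicality: per-symbol surprisal of x within eps of H(X)."""
        k = sum(x)                         # number of ones in x
        surprisal = -(k * math.log2(q) + (n - k) * math.log2(1 - q))
        return abs(surprisal / n - H) <= eps

    # Codebook: every source sequence x^n gets nR i.i.d. uniform bits.
    # For R < 1 this is many-to-one, exactly as noted in the problem.
    codebook = {x: tuple(random.randint(0, 1) for _ in range(nR))
                for x in itertools.product((0, 1), repeat=n)}

    def decode(y):
        """Output the unique typical x^n mapped to y; else declare an error."""
        matches = [x for x, c in codebook.items() if c == y and is_typical(x)]
        return matches[0] if len(matches) == 1 else None

    trials = 1000
    errors = 0
    for _ in range(trials):
        x = tuple(1 if random.random() < q else 0 for _ in range(n))
        errors += decode(codebook[x]) != x
    print(f"empirical error probability ~ {errors / trials:.3f}")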

(a) Why is this a terrible scheme? As described, what is the space-complexity of encoding and decoding? How about the complexity of checking whether the code is "good" (has "small" error)? Also, is a non-zero probability of error necessary for this problem of compressing $X^n$?

(b) Why is this not such a terrible scheme? Prove that the probability of error is asymptotically negligible (in fact, that it decreases exponentially in the block-length n). Proceed as follows (the shape of the bound these steps build toward is sketched after (v)):

 (i) Prove that the probability of transmitting an atypical $x^n$ is "small". (How small? Can you use ideas from PS1 and Q4 of PS2 to precisely quantify this?)

  (ii) Prove that for any typical $x^n$, the probability that $Y^{nR}(x^n)$ is "bad", i.e., that another typical sequence $x'^n$ also gets encoded to $Y^{nR}(x^n)$, is "small". How small? Use a combinatorial argument to bound this. Can you again provide an exponentially decaying bound?

  (iii) How many typical $x^n$ are there (bounds)? Therefore, what are bounds on the probability of observing a typical $x^n$?

  (iv) Combining (i), (ii) and (iii) (via the union bound), can you come up with an upper bound (hopefully exponential) on the probability of the code being "bad", i.e., having a probability of error greater than $\epsilon$?

  (v) Using (iv), can you obtain a bound on the fraction of codes that are bad? Are we therefore done?
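
For orientation, the bound that steps (i)-(iv) drive at has the following shape (a sketch under the usual weak-typicality estimates; the exponents are indicative, and $A_\epsilon^{(n)}$ denotes the typical set):

    \Pr[\text{error}] \le \Pr[X^n \text{ atypical}] + |A_\epsilon^{(n)}| \cdot 2^{-nR}
                      \le 2^{-n\delta(\epsilon)} + 2^{n(H(X)+\epsilon)} \cdot 2^{-nR},

which decays exponentially in $n$ whenever $R > H(X) + \epsilon$.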

 

Remark: Admittedly this is a very convoluted way of proving a theorem that we can prove much more easily in several different ways (how? you should be able to think of at least one, having proven it in an earlier problem set...). Nonetheless, this is a teaser for other, more complicated theorems to come, where this technique is in fact the only one we know of to prove corresponding achievability theorems.

 

Key concepts in Q2: Achievability for a coding theorem, complexity of design and implementation, random coding, Sanov's theorem, bound on divergence, combinatorial probabilistic bound, bounds on typical sets and probabilities of typical sequences, probability of a random code being bad, fraction of bad random codes.

 

 

Q3. (Intuition for channel coding theorem:) A channel is discrete and memoryless, with per-symbol transition probabilities given by the conditional distribution p(y|x). A source generates m i.i.d. uniformly random bits $W^m = (W_1, \ldots, W_m)$. The encoder Alice encodes this into a string of n symbols $X^n$ ($m/n = R < 1$ is the rate). After passing through the channel, the decoder Bob receives $Y^n$, and decodes the source as the sequence of bits $\hat{W}^m$. An error is said to have been made if $\hat{W}^m \neq W^m$. The probability of this event is denoted by $P_e$.

(a) For each transmitted $x^n$, what is the size of the corresponding set of $y^n$ vectors conditionally typical given $x^n$?

(b) What is the maximal total number of possible $y^n$ vectors?

(c) What would you guess the capacity C (maximal rate) of the channel to be, therefore? Is there a reason why this guess may be somewhat optimistic? (A counting heuristic for (a)-(c) is sketched below.)
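
A back-of-the-envelope way to package (a)-(c), as a heuristic count rather than a proof:

    \frac{\#\{\text{typical } y^n\}}{\#\{\text{conditionally typical } y^n \text{ per } x^n\}} \approx \frac{2^{nH(Y)}}{2^{nH(Y \mid X)}} = 2^{nI(X;Y)},

suggesting roughly $2^{nI(X;Y)}$ distinguishable codewords, i.e., the guess $C = \max_{p(x)} I(X;Y)$.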

 

Key concepts in Q3: Channel, discrete, memoryless, transition probabilities, rate, error, capacity of a channel.

 

 

Q4. (Comparison of the source coding theorem and the channel coding theorem:) (a) (Source coding) What's the benefit of source coding? That is, what's the ratio between the number of bits required by a naive (uncoded) scheme and the number required by the best possible coding scheme?

(b) (Channel coding) Consider the following naive scheme for channel coding. To transmit a single bit over a BSC(p) (Binary Symmetric Channel with crossover probability p), Alice repeats it k times (repetition code). To decode, Bob uses the majority rule. Prove that to transmit a single bit over k channel uses, this is indeed the best possible scheme. (Hint: Consider the "Hamming distance" between the two possible codewords...) What's the probability of error of this scheme? What if this scheme were used to transmit n bits? What's the block probability of error, i.e., the probability of error of the block of n bits? What if this probability is required to be exponentially small in n -- how large should k be, and therefore what would the rate of the code be? How does this compare with the rate we hope to get, via Q3?
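
A quick sanity check of the per-bit error probability of this repetition scheme (a sketch; p, k, n, and the trial count below are illustrative choices, not part of the problem):

    import math, random

    # Repetition over a BSC(p) with majority decoding.
    p, k, trials = 0.1, 11, 100_000        # k odd, so the majority vote never ties

    def majority_decode_errs():
        """One bit, repeated k times: majority decoding errs exactly when the
        channel flips more than half of the k copies."""
        flips = sum(random.random() < p for _ in range(k))
        return flips > k / 2

    empirical = sum(majority_decode_errs() for _ in range(trials)) / trials

    # Exact per-bit error: the binomial tail sum_{j > k/2} C(k,j) p^j (1-p)^(k-j).
    exact = sum(math.comb(k, j) * p**j * (1 - p)**(k - j)
                for j in range(k // 2 + 1, k + 1))

    n = 1000                               # a block of n information bits
    block_error = 1 - (1 - exact)**n       # P(at least one of the n bits errs)

    print(f"per-bit error: empirical {empirical:.5f}, exact {exact:.5f}")
    print(f"rate 1/k = {1/k:.3f}; block error for n={n}: {block_error:.3f}")

Note, per the last part of (b): driving the block error down exponentially in n forces k to grow with n, so the rate 1/k of the repetition code tends to zero, in contrast with the constant rate suggested by Q3.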

 

Key concepts in Q4: Repetition code, Hamming distance, majority rule decoding = minimum distance decoding, probability of error computations.

 

 

Q5. (Converse for channel coding theorem:) As in Q1, use a chain of inequalities to come up with an upper bound on the channel capacity. Proceed as follows.

(a) First prove that $I(X^n ; Y^n) \le nC$, where $C = \max_{p(x)} I(X;Y)$ is the single-letter capacity. Hint -- the quantities in the inequalities, in no particular order, include

$I(X^n ; Y^n)$, $H(Y^n)$, $H(Y^n \mid X^n)$, $\sum_{i=1}^n H(Y_i \mid Y_1, \ldots, Y_{i-1}, X^n)$, $\sum_{i=1}^n H(Y_i \mid X_i)$, $\sum_{i=1}^n H(Y_i)$, $\sum_{i=1}^n I(X_i ; Y_i)$, and $nC$.

Justify each step!

(b) Now use (a) to prove that $R(1 - P_e) \le C + 1/n$, so that if $R > C$ then the probability of error is bounded away from zero. Hint -- the quantities in the inequalities, in no particular order, include

$m$, $H(W^m)$, $H(W^m \mid \hat{W}^m)$, $I(W^m ; \hat{W}^m)$, $I(X^n ; Y^n)$, $1 + P_e \, m$, and $nC$.

Justify each step!
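
For reference, the two standard lemmas both chains lean on (assembling the hinted quantities around them, with justifications, is the exercise):

    \text{(Fano)}\quad H(W^m \mid \hat{W}^m) \le 1 + P_e \, m, \qquad \text{(data processing)}\quad I(W^m ; \hat{W}^m) \le I(X^n ; Y^n).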

 

 

Q6. (Achievability for channel coding theorem:)  Consider the following encoding scheme for the channel coding setup of Q3. For each message $W^m$, Alice assigns $n$ channel-input symbols $X^n(W^m)$ i.i.d. uniformly at random. This assignment is known a priori to both Alice and Bob, and is called the codebook. To decode, given a string $Y^n$, Bob examines the codebook to see if there is a unique codeword $X^n$ that is jointly typical with this $Y^n$, and if there is a unique $W^m$ that encodes to this $X^n$. If so, he outputs that $W^m$. Else he declares a decoding error.
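
A minimal simulation sketch of this random-coding/joint-typicality scheme, specialized to a BSC(p) with uniform inputs. The simplified typicality test (the empirical crossover fraction being within eps of p) and all parameters below are illustrative assumptions:

    import random

    p, n, m, eps = 0.05, 40, 10, 0.08      # rate R = m/n = 0.25 < 1 - H(p) ~ 0.71

    codebook = [tuple(random.randint(0, 1) for _ in range(n))
                for _ in range(2 ** m)]    # one codeword per message

    def jointly_typical(x, y):
        d = sum(a != b for a, b in zip(x, y))      # Hamming distance
        return abs(d / n - p) <= eps

    def decode(y):
        """Return the unique message whose codeword is jointly typical with y."""
        hits = [w for w, x in enumerate(codebook) if jointly_typical(x, y)]
        return hits[0] if len(hits) == 1 else None  # None = declared error

    trials, errors = 200, 0
    for _ in range(trials):
        w = random.randrange(2 ** m)
        y = tuple(b ^ (random.random() < p) for b in codebook[w])  # BSC output
        errors += decode(y) != w
    print(f"empirical block error rate ~ {errors / trials:.3f}")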

(a) Prove that the probability of error is asymptotically negligible (in fact, that it decreases exponentially in the block-length n). Proceed as follows (the target shape of the bound is sketched after (v)):

 (i) Prove that the probability of the pair $(X^n, Y^n)$ being jointly atypical is "small". (How small?)

 (ii) Prove that for any message $W^m$, the probability that $X^n(W^m)$ is "bad", i.e., that another message $W'^m$ also gets encoded to $X^n(W^m)$, is "small". How small? Use a combinatorial argument to bound this. Can you again provide an exponentially decaying bound?

  (iii) Conditioned on (i) and (ii), what's the probability that at least two codewords $X^n$ in the codebook are both jointly typical with $Y^n$? How many codewords are there in the codebook? Therefore, what is the probability, over all codewords in all codes, that there's a decoding error?

  (iv) Combining (i), (ii) and (iii) (via the union bound), can you come up with an upper bound (hopefully exponential) on the probability of the code being "bad", i.e., having a probability of error greater than $\epsilon$?

  (v) Using (iv), can you obtain a bound on the fraction of codes that are bad? Are we therefore done?
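
As in Q2, the bound that steps (i)-(iv) aim for has the shape (exponents indicative; deriving each term is the exercise):

    \Pr[\text{error}] \le \Pr[(X^n, Y^n) \text{ jointly atypical}] + 2^{m} \cdot 2^{-n(I(X;Y) - 3\epsilon)},

which is exponentially small in $n$ whenever $R = m/n < I(X;Y) - 3\epsilon$.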

 

Q7. (Feedback does not improve the source coding rate:) Suppose noiseless feedback were allowed in a source coding system -- could a lower rate still guarantee "small" error? More formally, suppose the encoder Alice possesses a discrete memoryless source that generates $m$ i.i.d. symbols $U^m = (U_1, \ldots, U_m)$, each with entropy $H(U)$. Alice generates the first channel input $X_1$ (which may be a function of $U^m$), and transmits this to Bob over the noisy channel {p(y|x)}. Bob gets $Y_1$, processes this symbol, and transmits (noiselessly) a symbol $F_1$ from an arbitrary (but fixed) alphabet ($F_1$ must be a (possibly random) function of only $Y_1$) back to Alice. For instance, this $F_1$ may equal $Y_1$. Alice then generates the second channel input $X_2$ (which may be a function of $U^m$ and $F_1$), and transmits this to Bob over the noisy channel, who then processes this symbol and transmits (again noiselessly) a symbol $F_2$ (it may be a function of $Y_1$ and $Y_2$) back to Alice.

At the $i$th step, Alice's $i$th transmitted symbol $X_i$ may be a function of $U^m$ and $(F_1, \ldots, F_{i-1})$, and Bob's feedback symbol $F_i$ may be a function of $(Y_1, \ldots, Y_i)$.

This continues for $n$ steps ($R = m/n$ is the rate).

After $n$ steps, Bob decodes the source as the sequence $\hat{U}^m$, which may be a function of $(Y_1, \ldots, Y_n)$ and $(F_1, \ldots, F_n)$. Prove that if $mH(U) > nC$ (equivalently, if $R > C/H(U)$), the probability of error is bounded away from zero.
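
A skeleton of the interaction pattern just described, in code (channel, encode, and feedback are hypothetical placeholders, with a BSC standing in for {p(y|x)}):

    import random

    def channel(x, p=0.1):
        """A BSC(p) stand-in for the noisy forward channel {p(y|x)}."""
        return x ^ (random.random() < p)

    def run_protocol(u, n, encode, feedback):
        """n rounds of (Alice transmits, Bob feeds back noiselessly)."""
        ys, fs = [], []
        for i in range(n):
            x_i = encode(u, tuple(fs))      # X_i may depend on U^m and F_1..F_{i-1}
            y_i = channel(x_i)              # forward link is noisy
            ys.append(y_i)
            fs.append(feedback(tuple(ys)))  # F_i depends only on Y_1..Y_i
        return ys                           # Bob decodes from ys (he knows fs too)

    # Example: feedback that echoes the latest channel output, i.e. F_i = Y_i.
    # ys = run_protocol(u=(0, 1, 1), n=5,
    #                   encode=lambda u, fs: u[0], feedback=lambda ys: ys[-1])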

 

Comments (10)

MEI Yuchen said

at 12:15 am on Feb 6, 2009

In which chapter of Cover/Thomas can I find the source coding theorem?
I have no idea how to prove the achievability of the source coding theorem.

sidjaggi said

at 1:25 am on Feb 6, 2009

Both the achievability and the converse proofs we used were "novel". They are, however, both very similar to the proof of the channel coding theorem...

Leonard Wong said

at 3:49 am on Feb 7, 2009

I think there is a typo in (b)(iv): the probability of error should be < (...) rather than > (...).

Leonard Wong said

at 3:58 am on Feb 7, 2009

(in question 2)

Silas said

at 6:34 pm on Feb 9, 2009

Question 6 is almost identical to question 2. Is there an error?

Tong said

at 5:55 pm on Feb 12, 2009

Sorry, I missed the lecture on source coding.
I am confused by Q2 (a) and (b). Is there any hint?

Cho Yiu Ng said

at 6:04 pm on Feb 12, 2009

Basically, you are asking about the whole of question 2. First of all, do you understand how the codebook is generated?

Tong said

at 6:09 pm on Feb 12, 2009

Use the function Y(X^n) to generate the codebook, right?

Cho Yiu Ng said

at 7:26 pm on Feb 12, 2009

No. The codebook is generated as follows. For each sequence X^n, we randomly and independently assign a codeword and this codeword is denoted by Y(X^n). The collection of the pairs (X^n, Y(X^n)) is the codebook.
Next, given a codeword Y^nR, how is it decoded for a given codebook generated in the above manner?

Tong said

at 5:49 pm on Feb 19, 2009

Question 4(b): how do we judge that the scheme is the best?
