**Channel Coding**

**Q1: (Channel models)** A channel is defined by its *transition probabilities*. Find the capacities of the following channels.

(a) (Bijective) Binary input binary output, p(i|j) = 1 if i = j, 0 otherwise.

(b) (Injective) Binary input ternary output, p(0|0) = 1, p(1|1) = p(2|1) = 1/2, p(i|j) = 0 for all other (i,j).

(c) (Surjective) Ternary input binary output, p(0|0) = p(1|1) = p(1|2) = 1, p(i|j) = 0 for all other (i,j).

(d) (Circular calculator with tiny buttons) Decimal input, decimal output, p(i|i) = p((i+1)mod(10)|i) = 1/2.

(e) (BEC -- Binary Erasure Channel) Binary input ternary output, p(0|0) = p(1|1) = 1-p, p(2|0) = p(2|1) = p, p(i|j) = 0 for all other (i,j).

(f) (BSC -- Binary Symmetric Channel) Binary input binary output, p(0|0) = p(1|1) = 1-p, p(0|1) = p(1|0) = p.

(g) (Weakly symmetric channels) Each row of the transition matrix {p(y|x)} is a permutation of the first row, each column sum is equal.

**Q2: (Codes)** "Explicit" constructions of good codes are few and far between. Let's see a few. Notation used below -- q is the (input/output) alphabet size, n is the block-length, k is the number of information bits, d is the minimum-distance of the code, p is the noise parameter/e is the number of errors, and R is the rate of the code. Below, we'll be careful to distinguish between "random" noise pn and "worst-case" noise e.

**Codes for Binary channels**

(a) Repetition code. Based on the name, how would you define such a code? What would its parameters be?

(b) Binary (q=2) (n=7,k=4,d=3) Hamming code. Can you construct such a code? How many errors e could it correct? How many error can it detect? Can you derive a general relationship between error detection and error correction?

(c) (i) (Coding-theoretic upper bound) Using "sphere-packing" arguments, for given q, n, and d, can you obtain a bound (the "Hamming bound/sphere-packing bound/volume bound") on k? What rate R do such codes achieve? If, for large n, the number of errors e asymptotically approach np, what is a good approximation to this R? Is it possible to obtain a better upper bound for sphere-packing? In particular, consider Plotkin's Bound -- this shows that in fact there's a gap between coding theory and information theory!

(ii) (Coding-theoretic lower bound) Using a "greedy" approach, can you demonstrate that a rate of 1-H(2p) is obtainable? (These constructions result in Gilbert-Varshamov codes.) Can you obtain a better achievability for binary sphere packing?

**Codes for "large" alphabet Symmetric Channels.**

(d) For "large" q (i.e., n = o(log(q))), what does the Hamming bound reduce to? Can you suggest codes that asymptotically achieve this performance for "random noise" -- *i.e.*, each symbol is independently and uniformly changed to some other symbol?

(e) How about coding theory bounds for large alphabets?

(i) (Singleton bound) Can you think of a tighter upper bound than the Hamming bound, based on the minimum distance between codewords?

(ii) (Reed-Solomon code) Consider a code generated as follows. Let be the sequence of source symbols. The polynomial is of degree k-1 in the variable r, and the coefficient corresponding to is . The n output symbols are chosen by evaluating this polynomial for n distinct values of r. Prove that the minimum distance of this code is n-k+1. How does this compare with the Singleton bound? What are the conditions on q for this code to "work"? For efficient decoding, look up the Peterson algorithm.

**Binary channels again**

(f) (Concatenated codes) Hopefully the above has convinced you that obtaining low-complexity codes with positive rate (forget capacity-achieving!) is non-trivial! Now we sketch a two-stage code (proposed by Forney) called concatenated coding, that does this. The first stage involves coding sub-sequences of the source's message into chunks of length log(n) each. The second stage involves coding the resulting chunk using a Reed-Solomon code. Can you fill in the blanks now? What parameters for each code would you choose for a random noise channel? For an "coding-theory" type result? How would you efficiently decode?

(Here's a reasonable starting point for more on the rich and beautiful field of coding theory.)

**Q3.** (Feedback capacity) *Suppose each symbol transmitted over a channel can be a function not just of the source message , but also of the symbols previously received by the receiver, how does Q5 of PS3 change? Does the answer differ? (Hint: Replace with and see if you can still prove the same result) -- the answer should be surprising!*

**Q4.** (*Source-channel separation*) Suppose one wishes to transmit a source with entropy rate over a DMC with capacity C. Prove that "separation holds", i.e., a probability of error that decays with block-length n is possible if and only if . (One direction should be "obvious" -- the other direction requires you to show that no *joint* source-channel scheme can perform much better than doing the two separately -- to do this, separate the entropy rate term into a conditional entropy term and a mutual information term, and use the "appropriate" inequalities).

**Q5.** (*Gaussian channels*) (a) (i) If you were allowed to transmit a real number, and it could be reconstructed perfectly (without noise), what would the capacity of the corresponding channel be?

(ii) How about if there were noise with a finite variance, but there was no power constraint?

(b) How would you define an AWGN channel?

(c) What would the "sphere-packing bound" for an AWGN channel be like? Why is the naïve argument incomplete?

(d) How does the Channel Coding theorem change for AWGN channels? The achievability? The converse?

(e) What if there are multiple parallel AWGNs, each with different noise levels? Can you think of an example of such a situation in real-life? What would the capacity of such a channel be? How would you achieve it?

## Comments (6)

## MEI Yuchen said

at 4:43 pm on Mar 16, 2009

I don't know how to answer Q2(d).

How to estimate Hamming Bound when q is large?

## Silas said

at 10:08 am on Mar 27, 2009

I just write R<1 is achievable!

## sidjaggi said

at 11:19 am on Mar 27, 2009

Hmm, sorry -- should've seen this question before.

Hint -- the eventual answer should be linear in p. What approximations can you make, using the fact that n is much less than q, to get what you want?

## sidjaggi said

at 3:34 pm on Mar 27, 2009

Actually, just changed 2(d) a bit -- now n is o(log (q)) rather than o(q)

## Lingxiao XIA said

at 1:05 am on Apr 20, 2009

hi i have some confusion about symmetric channels

it's said in the book that symmetric channels are those with a transition matrix having rows as permutations of each other, and columns suml to a constant

but this is different from the definition here in 1.(g)

consider a transition matrix:

0.3 0.4 0.3

0.4 0.4 0.2

0.3 0.2 0.5

it satisfies p(i|j)=p(j|i) but the rows are not permutations of each other

## sidjaggi said

at 9:23 am on Apr 20, 2009

God catch -- my mistake -- corrected :)

You don't have permission to comment on this page.