Under Construction
Channel Coding
Having proved the source coding theorem, we proceed to the channel coding theorem, another landmark of information theory.
Formulation
In his 1948 paper, Claude Shannon said that "the fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point." A mathematical model of the communication problem is illustrated by the following figure:
[Figure: information source → encoder (Alice) → noisy channel → decoder (Bob) → destination]
The major ingredients are described as follows. All alphabets are assumed to be finite.
(1) The information source generates a message $W$ with values in the message set $\mathcal{W} = \{1, 2, \ldots, M\}$.
(2) Alice (the encoder) processes the message and generates a codeword $X^n = (X_1, \ldots, X_n)$ (the signal in the diagram) with values in $\mathcal{X}^n$.
(3) The channel is contaminated by noise. As a result, the received signal $Y^n$, with values in $\mathcal{Y}^n$, is a random variable. The channel is characterized by a transition probability matrix[1] $p(y|x)$, with the interpretation that $p(y|x)$ is the conditional probability that $y$ is the output of the channel given that $x$ is the input.
(4) Following page 195 of the textbook, the rate of the code is defined by $R = \frac{\log M}{n}$. In particular, if $M = 2^{nR}$, the rate is exactly $R$. $R$ can be interpreted as the number of bits transmitted per channel use.
(5) Bob (the decoder) processes the output $Y^n$ to get an estimate $\hat{W}$, with values in $\mathcal{W}$, of the original message $W$. An error is made if $\hat{W} \neq W$.
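The five ingredients above can be sketched end to end. This is a minimal illustration only, assuming a binary symmetric channel (BSC) with crossover probability `eps` and a length-`n` repetition code with majority decoding; the function names are my own, not from the notes.

```python
import random

def encode(w, n):
    """Alice: map the message bit w to the codeword (w, w, ..., w) of length n."""
    return [w] * n

def channel(xn, eps, rng):
    """BSC: flip each input symbol independently with probability eps."""
    return [x ^ (rng.random() < eps) for x in xn]

def decode(yn):
    """Bob: majority vote, the natural decoder for the repetition code."""
    return int(sum(yn) > len(yn) / 2)

rng = random.Random(0)
n, eps, trials = 11, 0.1, 10_000
errors = 0
for _ in range(trials):
    w = rng.randint(0, 1)                      # the information source
    yn = channel(encode(w, n), eps, rng)       # signal through the noisy channel
    if decode(yn) != w:                        # an error: W-hat != W
        errors += 1
print(f"empirical error probability at rate 1/{n}: {errors / trials:.4f}")
```

Note the trade-off this toy code exposes: its rate is only $1/n$ bit per channel use, so a low error probability here is bought with a vanishing rate. The channel coding theorem says we can do much better.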
Actually, $p(y|x)$ does not characterize the channel completely (why?). In our treatment, we shall make the important assumption that the channel is memoryless and without feedback. See Section 7.5 of the textbook for explanations of the notation.
Definition (i) The channel is a discrete memoryless channel (DMC) if $p(y_k \mid x^k, y^{k-1}) = p(y_k \mid x_k)$ for all $k$.
(ii) The channel is without feedback if $p(x_k \mid x^{k-1}, y^{k-1}) = p(x_k \mid x^{k-1})$ for all $k$.
It is an easy consequence of the two assumptions that
$p(y^n \mid x^n) = \prod_{i=1}^{n} p(y_i \mid x_i)$.
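The factorization above is easy to check numerically. A small sketch, assuming a BSC with crossover probability 0.2; the dictionary `p` below is an illustrative stand-in for the general transition matrix $p(y|x)$ of the notes.

```python
from itertools import product
from math import prod, isclose

p = {0: {0: 0.8, 1: 0.2},   # p(y | x=0)
     1: {0: 0.2, 1: 0.8}}   # p(y | x=1)

def p_block(yn, xn):
    """p(y^n | x^n) for a memoryless channel without feedback:
    a product of per-letter transition probabilities."""
    return prod(p[x][y] for x, y in zip(xn, yn))

xn = (0, 1, 1)
# Sanity check: for a fixed input block, the conditional probabilities of
# all 2^n output blocks must sum to 1.
total = sum(p_block(yn, xn) for yn in product([0, 1], repeat=3))
assert isclose(total, 1.0)
print(p_block((0, 1, 0), xn))   # 0.8 * 0.8 * 0.2
```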
It is desirable to have a coding scheme that has a "high rate" and a "low probability of error". Shannon's channel coding theorem says that if we want to transmit messages with asymptotically small probability of error, then no matter what we do, we cannot do so above a certain rate, called the capacity of the channel.
Intuition of Channel Coding Theorem (Q3)
Below we discuss the ideas of the theorem informally. Rigorous definitions will be given later.
We use the idea of random coding, which was introduced last time. Suppose that an input distribution $p(x)$ is fixed. For each message $w \in \{1, \ldots, 2^{nR}\}$, Alice randomly[2] assigns a codeword $x^n(w) \in \mathcal{X}^n$ according to the distribution $\prod_{i=1}^{n} p(x_i)$. This random assignment forms the codebook (which is agreed upon by Alice and Bob before transmission). Bob receives $Y^n$ according to the distribution $p(y^n \mid x^n)$. To decode, he examines the codebook to see if there exists a unique $w$ such that $x^n(w)$ is jointly typical[3] with $Y^n$. If so, Bob outputs $\hat{W} = w$. Otherwise he declares a decoding error. Joint typicality is defined by the joint distribution of $X$ and $Y$, which is given by $p(x, y) = p(x)\, p(y \mid x)$.
For each typical $x^n$, we can look at the set of $y^n$ which are conditionally typical given $x^n$. The size of this set[3] is about $2^{nH(Y|X)}$. Think of it as a ball in the space $\mathcal{Y}^n$. If the balls around $x^n(w)$ and $x^n(w')$ are disjoint, we will not mix up $w$ and $w'$ in the decoding process. Hence, we want these conditionally typical sets to be "as disjoint as possible"[4], and the number of "disjoint balls" gives the number of codewords that can be transmitted reliably. Since the number of typical $y^n$ is about $2^{nH(Y)}$ and each ball has about $2^{nH(Y|X)}$ elements on average, there are at most
$\frac{2^{nH(Y)}}{2^{nH(Y|X)}} = 2^{n(H(Y) - H(Y|X))} = 2^{nI(X;Y)}$
"disjoint balls". Hence, the rate is at most $I(X;Y)$. Taking the maximum over all possible input distributions $p(x)$, we have
$C = \max_{p(x)} I(X;Y)$.
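The ball-counting heuristic above can be made concrete with numbers. A sketch for a BSC, assuming crossover probability 0.1, block length 100, and a uniform input distribution (for which $H(Y) = 1$ bit); the figures are order-of-magnitude, as in the heuristic itself.

```python
from math import log2

def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

n, eps = 100, 0.1
H_Y = 1.0                  # uniform input into a BSC gives a uniform output
H_Y_given_X = h2(eps)      # noise entropy per channel use
# Number of "disjoint balls" ~ 2^{n(H(Y) - H(Y|X))} = 2^{n I(X;Y)}
exponent = n * (H_Y - H_Y_given_X)
balls = 2 ** exponent
print(f"H(Y|X) = {H_Y_given_X:.4f} bits, about 2^{exponent:.1f} balls")
```

So out of the roughly $2^{100}$ typical output sequences, only about $2^{53}$ codewords fit without their noise balls overlapping, matching the rate bound $I(X;Y) \approx 0.53$ bit per use.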
$C$ is called the capacity of the channel. If we do every step rigorously, it can be shown that $C$ is actually achievable (see page 195 of the textbook for the definition of achievability). This is the achievability part of the channel coding theorem. The converse of the theorem states that any rate above $C$ is not achievable. The proof is based on Fano's inequality (among other inequalities) and is the subject of Q5.
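The maximization defining $C$ can also be carried out numerically. A brute-force sketch, assuming a BSC with crossover probability 0.1, for which the closed form $C = 1 - H(\epsilon)$ is known and serves as a check; the grid-search approach is my own illustration, not a method from the notes.

```python
from math import log2

def mutual_information(px0, eps):
    """I(X;Y) in bits for input P(X=0) = px0 over a BSC(eps)."""
    joint = [[(1 - eps) * px0,       eps * px0],         # p(x=0, y)
             [eps * (1 - px0),       (1 - eps) * (1 - px0)]]  # p(x=1, y)
    py = [joint[0][0] + joint[1][0], joint[0][1] + joint[1][1]]
    px = [px0, 1 - px0]
    I = 0.0
    for x in range(2):
        for y in range(2):
            if joint[x][y] > 0:
                I += joint[x][y] * log2(joint[x][y] / (px[x] * py[y]))
    return I

eps = 0.1
# Scan input distributions on a grid; the maximum is attained at px0 = 0.5.
C = max(mutual_information(k / 1000, eps) for k in range(1001))
closed_form = 1 + eps * log2(eps) + (1 - eps) * log2(1 - eps)   # 1 - H(eps)
print(f"grid search: {C:.4f}, closed form: {closed_form:.4f}")
```

For channels without such symmetry the inner maximization is a genuine concave optimization, usually solved with the Blahut-Arimoto algorithm rather than a grid.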
Proof of Channel Coding Theorem (Q6)
This is not proved in class. If I have time, I will give a full proof here.
Relationship between source coding and channel coding (Q4)
Will add something later.
Footnotes
1. That is, $p(y|x) \geq 0$ for all $x$ and $y$, and $\sum_{y} p(y|x) = 1$ for each $x$.
2. Compare this with the random code in Scribe Note 2-Feb.
3. For ease of exposition, I have to invent a few notations.
4. We will not go very far if we insist on using mutually disjoint sets. We have to allow a little intersection.
Comments (2)
Mak said
at 10:56 am on Mar 16, 2009
Hi,
After so long, I guess you're not going to add anything to the part "Relationship between source coding and channel coding (Q4)"? :- )
By the way, I think the notes above give a very clear picture of what was discussed in class. I failed to attend that class (its time was changed due to pre-CNY special arrangement), and these notes really help. Thanks.
sidjaggi said
at 3:31 pm on Mar 16, 2009
I've been really lagging behind in my editing, because of my travels. I'll try to catch up with my own editorials of the scribe notes over the next two weeks -- will announce in class as I do.