**Channel Coding Theorem**

Overview

We analyze a communication system in which *U* is the message, *X*^{n} is the encoded input sequence, *Y*^{n} is the output sequence after transmission, and *Û* is the decoded message.

The information channel capacity of a discrete memoryless channel is defined as the maximum mutual information in any single use of the channel, where the maximization is over all possible input probability distributions {*p*(*x*_{i})} on *X*, i.e.

*C* = max_{p(x)} *I*(*X*;*Y*)

In channel coding, the converse states that any achievable rate satisfies *R* ≤ max *I*(*X*;*Y*) + *ε*, while achievability states that any rate *R* ≤ max *I*(*X*;*Y*) − *ε* can be achieved.


For each (typical) input *n*-sequence, there are approximately 2^{nH(Y|X)} possible *Y* sequences, all of them equally likely, as shown in Fig. 2. We wish to ensure that no two *X* sequences produce the same *Y* output sequence; otherwise, we will not be able to decide which *X* sequence was sent. The total number of possible (typical) *Y* sequences is ≈ 2^{nH(Y)}. This set has to be divided into sets of size 2^{nH(Y|X)} corresponding to the different input *X* sequences. The total number of disjoint sets is less than or equal to 2^{n(H(Y)−H(Y|X))} = 2^{nI(X;Y)}. Hence, we can send at most ≈ 2^{nI(X;Y)} distinguishable sequences of length *n*.
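This counting argument can be illustrated numerically. The sketch below assumes a BSC(*p*) with uniform input, so that *H*(*Y*) = 1 and *H*(*Y*|*X*) = *H*(*p*); the function names are our own:

```python
import math

def h2(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def distinguishable(n, p):
    """Approximate number of distinguishable input n-sequences for a
    BSC(p) with uniform input: 2^{n(H(Y) - H(Y|X))} = 2^{n(1 - H(p))}."""
    return 2 ** (n * (1 - h2(p)))

# A completely noisy channel (p = 0.5) supports only one distinguishable
# message; a noiseless channel (p = 0) supports all 2^n inputs.
```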

Prove the converse of the channel coding theorem (see the textbook *Elements of Information Theory*, p. 206).

For convenience, in the following proof we will use *W* to denote the input message *U* and *Ŵ* to denote the decoded message *Û*.

Firstly, we want to prove Lemma 7.9.2: *I*(*X*^{n};*Y*^{n}) ≤ *nC*.
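The claim *I*(*X*^{n};*Y*^{n}) ≤ *nC* can be derived as follows (a reconstructed sketch; the equation numbers follow the annotations in the next paragraph):

```latex
\begin{align}
I(X^n; Y^n) &= H(Y^n) - H(Y^n \mid X^n)
             = H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid X_i) \tag{7.94}\\
            &\le \sum_{i=1}^{n} H(Y_i) - \sum_{i=1}^{n} H(Y_i \mid X_i) \tag{7.95}\\
            &= \sum_{i=1}^{n} I(X_i; Y_i) \tag{7.96}\\
            &\le nC \tag{7.97}
\end{align}
```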

By the definition of a discrete memoryless channel, *Y*_{i} depends only on *X*_{i} and is conditionally independent of everything else. Then we have the derivation above, starting from (7.94), where (7.95) follows from the fact that the entropy of a collection of random variables is less than the sum of their individual entropies, and (7.97) follows from the definition of capacity. Thus, we have proved that using the channel many times does not increase the information capacity in bits per transmission.

Secondly, we want to prove that any sequence of (2^{nR}, *n*) codes with *λ*^{(n)}→0 must have *R* ≤ *C*. If the maximal probability of error tends to zero, the average probability of error for the sequence of codes also goes to zero (i.e., *λ*^{(n)}→0 implies *P*_{e}^{(n)}→0). For a fixed encoding rule *X*^{n}(·) and a fixed decoding rule *Ŵ* = *g*(*Y*^{n}), we have the Markov chain *W*→*X*^{n}(*W*)→*Y*^{n}→*Ŵ*. For each *n*, let *W* be drawn according to a uniform distribution over {1, 2, ..., 2^{nR}}. Since *W* has a uniform distribution, Pr(*Ŵ* ≠ *W*) = *P*_{e}^{(n)} = (1/2^{nR}) ∑_{i} *λ*_{i}. So, we have
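The chain of inequalities can be reconstructed from the step labels (a)–(e) explained in the next paragraph (a sketch consistent with the surrounding derivation):

```latex
\begin{align}
nR &\overset{(a)}{=} H(W)
    \overset{(b)}{=} H(W \mid \hat{W}) + I(W; \hat{W}) \\
   &\overset{(c)}{\le} 1 + P_e^{(n)} nR + I(W; \hat{W}) \\
   &\overset{(d)}{\le} 1 + P_e^{(n)} nR + I(X^n; Y^n) \\
   &\overset{(e)}{\le} 1 + P_e^{(n)} nR + nC
\end{align}
```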

where (a) follows from the assumption that *W* is uniformly distributed, (b) is an identity, (c) is Fano's inequality for *W* taking on at most 2^{nR} values, (d) is the data-processing inequality, and (e) is from Lemma 7.9.2 in the first part. Then dividing by *n*, we obtain

*R* ≤ 1/*n* + *P*_{e}^{(n)} *R* + *C*

When *n*→∞, the first two terms on the right-hand side tend to 0, and we get *R* ≤ *C*.

Rewriting the bound as *P*_{e}^{(n)} ≥ 1 − *C*/*R* − 1/(*nR*) shows that if *R* > *C*, the probability of error is bounded away from 0 for sufficiently large *n*; only for rates *R* ≤ *C* can we construct codes for large *n* with *P*_{e}^{(n)} → 0. Hence, we cannot achieve an arbitrarily low probability of error at rates above capacity.

In order to maximize the achievable rate, we construct a code *C* = (*f*, *g*) with *nR* = *H*(*X*) + *ε*, where *f* is the encoding function and *g* is the decoding function.

Probability of each codeword: since the message *U* is uniform over the 2^{nR} messages, each codeword is sent with probability 2^{−nR}.

Binary Symmetric Channel BSC(*p*) with crossover probability *p*
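A minimal sketch of a BSC(*p*) and its capacity *C* = 1 − *H*(*p*); the function names and the seeded RNG are our own choices:

```python
import math
import random

def h2(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of the binary symmetric channel: C = 1 - H(p)."""
    return 1 - h2(p)

def bsc(x_bits, p, rng=random.Random(0)):
    """Pass a bit sequence through a BSC(p): each bit is flipped
    independently with crossover probability p."""
    return [b ^ (rng.random() < p) for b in x_bits]
```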

Size of the message set: |{*U*}| = 2^{nR}

Decoding rule: if there exists a unique *u* such that *f*(*u*) = *x*^{n} and *y*^{n} ∈ *A*_{ε}^{(n)}(*Y*^{n}|*X*^{n} = *x*^{n}) (i.e., *x*^{n} and *y*^{n} are jointly typical), then *g*(*y*^{n}) = *u*; otherwise, declare an error.
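The decoding rule can be sketched as follows. This is a toy implementation: the typicality check shown is a simplified stand-in for a BSC (it tests whether the empirical flip fraction is close to *p*), not the full joint-typicality test:

```python
def bsc_typical(x, y, p=0.1, eps=0.05):
    """Toy typicality check for a BSC(p): the fraction of flipped
    positions should be within eps of the crossover probability p."""
    frac = sum(a != b for a, b in zip(x, y)) / len(x)
    return abs(frac - p) <= eps

def decode(y, codebook, is_typical=bsc_typical):
    """g(y^n): if there is a unique u whose codeword f(u) is (jointly)
    typical with y^n, return u; otherwise declare an error (None)."""
    matches = [u for u, x in codebook.items() if is_typical(x, y)]
    return matches[0] if len(matches) == 1 else None
```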

Error events:

1. The received sequence *y*^{n} is not jointly typical with the transmitted codeword *f*(*u*); by the AEP, the probability of this event tends to 0 as *n*→∞.

2. For a fixed particular *y*^{n}, the probability that any particular wrong codeword is jointly typical with *y*^{n} is ≈ 2^{−nI(X;Y)}; by the union bound over the 2^{nR} − 1 wrong codewords, the probability of this event is at most ≈ 2^{nR}·2^{−nI(X;Y)}, which tends to 0 provided *R* < *I*(*X*;*Y*).

The overall probability of error is at most the sum of the probabilities of these two events. For the BSC(*p*), the condition becomes *R* < 1 − *H*(*p*) = *C*.
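The union bound over wrong codewords can be illustrated numerically (a sketch; the function name is our own):

```python
def union_bound(n, R, I):
    """Union bound on the event that some wrong codeword looks jointly
    typical: each of the 2^{nR} - 1 wrong codewords is jointly typical
    with y^n with probability about 2^{-nI}, so the total is at most
    (2^{nR} - 1) * 2^{-nI}."""
    return (2 ** (n * R) - 1) * 2 ** (-n * I)

# When R < I, the bound decays exponentially in n.
```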

Let *p** be the capacity-achieving probability distribution over *X*. For every message *u*, we choose the codebook *f* at random:

∀*u*: choose *f*(*u*) = *x*^{n} ∈ *X*^{n} by drawing each *X*_{i} i.i.d. according to *p**(*x*)
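Random codebook generation as described above, sketched for a binary alphabet (parameterizing *p** as P(*X* = 1) and using a seeded RNG are our assumptions):

```python
import random

def random_codebook(n, R, p_star, rng=random.Random(0)):
    """For each of the 2^{nR} messages u, draw f(u) = x^n by choosing
    each symbol X_i i.i.d. with P(X_i = 1) = p_star."""
    M = 2 ** round(n * R)  # number of messages, 2^{nR}
    return {u: tuple(int(rng.random() < p_star) for _ in range(n))
            for u in range(M)}
```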

Channel with feedback

*X*_{i} = *X*_{i}(*U*, *Y*^{i−1}): with feedback, the *i*-th channel input may depend on the message *U* and on all previously received outputs.

Recall from the converse that *I*(*X*^{n};*Y*^{n}) ≤ *nC*. With feedback, we bound *I*(*U*;*Y*^{n}) instead:

*I*(*U*;*Y*^{n}) = *H*(*Y*^{n}) − *H*(*Y*^{n}|*U*)

= *H*(*Y*^{n}) − ∑ *H*(*Y*_{i}|*Y*^{i−1}, *U*)

= *H*(*Y*^{n}) − ∑ *H*(*Y*_{i}|*Y*^{i−1}, *X*_{i}, *U*) (since *X*_{i} is a function of (*U*, *Y*^{i−1}))

≤ ∑ *H*(*Y*_{i}) − ∑ *H*(*Y*_{i}|*X*_{i}) ≤ *nC* (since the channel is memoryless, *H*(*Y*_{i}|*Y*^{i−1}, *X*_{i}, *U*) = *H*(*Y*_{i}|*X*_{i}))

*nR* ≤ 1 + *P*_{e}^{(n)} *nR* + *I*(*U*;*Y*^{n}) ≤ 1 + *P*_{e}^{(n)} *nR* + *nC*. Dividing by *n* and letting *n*→∞ gives *R* ≤ *C*: feedback does not increase the capacity of a discrete memoryless channel.
