• If you are citizen of an European Union member nation, you may not use this service unless you are at least 16 years old.

• Dokkio Sidebar (from the makers of PBworks) is a Chrome extension that eliminates the need for endless browser tabs. You can search all your online stuff without any extra effort. And Sidebar was #1 on Product Hunt! Check out what people are saying by clicking here.

View

# Scribe Note 3_2

last edited by 10 years, 11 months ago

Channel Coding Theorem

Overview

We analyze a communication system, where U is the message, Xn is the encoded input sequence, Yn is the output sequence after transmission and Û is the decoded message. The information channel capacity of a discrete memoryless channel is defined as the maximum mutual information in any single use of the channel, where the maximization is over all possible input probability distributions {p(xi)} on X, i.e. In channel coding, the converse is R≤max I (X ;Y )+ε  while the achievability is R≥max I (X ;Y )-ε  For each (typical) input n-sequence, there are approximately 2nH(Y|X) possible Y sequences, all of them equally likely as Fig. 2. We wish to ensure that no two X sequences produce the same Y output sequence. Otherwise, we will not be able to decide which X sequence was sent. The total number of possible (typical) Y sequences is ≈ 2nH(Y) .This set has to be divided into sets of size 2nH(Y|X) corresponding to the different input X sequences. The total number of disjoint sets is less than or equal to 2n(H(Y)−H(Y|X)) = 2nI (X;Y). Hence, we can send at most ≈ 2nI (X;Y) distinguishable sequences of length n.

Prove the converse of channel coding (Ref. to the textbook Element of Information Theory from p.206)

For convenience, we will use W to denote U for input message and Ŵ to denote Û for output message in the following proof.

Firstly, we want to prove   By the dentition of a discrete memoryless channel, Yi depends only on Xi and is conditionally independent of everything else. Then, we have the above derivation from (7.94), where (7.95) follows from the fact that the entropy of a collection of random variables is less than the sum of their individual entropies, and (7.97) follows from the dentition of capacity. Thus, we have proved that using the channel many times does not increase the information capacity in bits per transmission.

Secondly, we want to prove that any sequence of (2nR,n) codes with λn0 must have RC. If the maximal probability of error tends to zero, the average probability of error for the sequence of codes also goes to zero [i.e., λn0 implies Pen0. For a fixed encoding rule Xn() and a ﬁxed decoding rule Ŵ = g(Yn),we have WXn(W)YnŴ. For each n, let W be drawn according to a uniform distribution over {1, 2,..., 2nR}.Since W has a uniform distribution, Pr(ŴW) = Pen = 1/2nR i λi. So, we have, where (a) follows from the assumption that W is uniform distribution, (b) is an identity, (c) is Fano’s inequality for W taking on at most 2nR values, (d) is the data-processing inequality, and (e) is from Lemma 7.9.2 in the first part. Then dividing by n, we obtain When n→∞, the ﬁrst two terms on the right-hand side tend to 0, --> RC and we get This equation shows that if R>C, the probability of error is bounded away from 0 for sufficiently large n, we can construct codes for large n with Pen = 0. Hence, we cannot achieve an arbitrarily low probability of error at rates above capacity.

In order to maximize the   C=(f,g), nR=H(X)+𝜀, where f  is the coding function and g is the decoding function.

Probability of each codeword,

1. 2. Binary Symmetric Channel BSC(p) with crossover probability p  Size = {U}=2nR

Decoding rule: Ifunique u s.t. f(u)=xn, yAεn (Yn|Xn=xn), then g(yn)=u, otherwise there is an error

Error events:

1. 2. For fixed particular yn, the probability of error event,

EeEchannel =   R<1-H(p)<E//E

Let p*be the probability distribution over X, for every message u, we choose the codebook f

u: choose f(u)=xn Xn by choosing each Xi in p*(x) Channel with feedback Xi(U, Z1i-1)

I(xn;yn)      nC

I(U;yn)      =H(yn)-H(yn|U)

= H(yn)- H(yi| yii-1, U)

= H(yn)- H(yi| yii-1, X, U)

H(yi)- ∑ H(yi| xi)nC

nR1+PenR+ I(U; yn) 1+PenR+nC