Scribe Note 6 week 10_2


1 Feedback Capacity

Figure 1. Discrete memoryless channel with feedback

 

A channel with feedback is illustrated in Figure 1. We assume that all the received symbols are sent back immediately and noiselessly to the transmitter, which can then use them to decide which symbol to send next. Can we do better with feedback? The surprising answer is no, which we shall now prove. We define a $(2^{nR}, n)$ feedback code as a sequence of mappings $x_i(W, Y^{i-1})$, where each $x_i$ is a function only of the message $W \in \{1, 2, \ldots, 2^{nR}\}$ and the previous received values $Y_1, Y_2, \ldots, Y_{i-1}$, and a sequence of decoding functions $g: \mathcal{Y}^n \to \{1, 2, \ldots, 2^{nR}\}$.

Thus,

 

$$P_e^{(n)} = \Pr\{g(Y^n) \neq W\},$$

 

when $W$ is uniformly distributed over $\{1, 2, \ldots, 2^{nR}\}$.

 

Theorem 1 (Feedback capacity) 

 

$$C_{FB} = C = \max_{p(x)} I(X; Y),$$

 

where $C_{FB}$ is the capacity with feedback.

 

Proof: Since a non-feedback code is a special case of a feedback code, any rate that can be achieved without feedback can be achieved with feedback and hence

 

$$C_{FB} \geq C.$$

 

Let $W$ be uniformly distributed over $\{1, 2, \ldots, 2^{nR}\}$. Then $\Pr\{W \neq \hat{W}\} = P_e^{(n)}$ and

 

$$\begin{aligned}
nR &= H(W) && \text{(a)} \\
&= H(W \mid \hat{W}) + I(W; \hat{W}) \\
&\leq 1 + P_e^{(n)} nR + I(W; \hat{W}) && \text{(b)} \\
&\leq 1 + P_e^{(n)} nR + I(W; Y^n), && \text{(c)}
\end{aligned}$$

 

where (a) follows from the fact that $W$ is uniformly distributed;

(b) follows from Fano's inequality;

(c) follows from the data-processing inequality.

 

$$\begin{aligned}
I(W; Y^n) &= H(Y^n) - H(Y^n \mid W) \\
&= H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid Y_1, Y_2, \ldots, Y_{i-1}, W) && \text{(a)} \\
&= H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid Y_1, Y_2, \ldots, Y_{i-1}, W, X_i) && \text{(b)} \\
&= H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid X_i) && \text{(c)} \\
&\leq \sum_{i=1}^{n} H(Y_i) - \sum_{i=1}^{n} H(Y_i \mid X_i) && \text{(d)} \\
&= \sum_{i=1}^{n} I(X_i; Y_i) \leq nC, && \text{(e)}
\end{aligned}$$

 

 

where (a) follows from the chain rule;

(b) follows from the fact that $X_i$ is a function of $Y_1, Y_2, \ldots, Y_{i-1}$ and $W$;

(c) follows from the fact that, conditional on $X_i$, $Y_i$ is independent of $W$ and of past samples of $Y$;

 

(d) follows from the chain rule and the fact that conditioning reduces entropy;

 

(e) follows from the definition of capacity for a discrete memoryless channel. Putting these together, we obtain

 

$$nR \leq P_e^{(n)} nR + 1 + nC,$$

 

and dividing by $n$ and letting $n \to \infty$, we conclude that $R \leq C$.

 

Thus, we cannot achieve any higher rates with feedback than we can without feedback, and $C_{FB} = C$.
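As a sanity check on $C = \max_{p(x)} I(X; Y)$, the following minimal sketch maximizes $I(X; Y)$ over input distributions for a binary symmetric channel by grid search; the crossover probability 0.1 is an illustrative assumption. By Theorem 1, the value it finds is also that channel's feedback capacity.

```python
import numpy as np

def mutual_information(px, channel):
    """I(X;Y) in bits for input distribution px and channel matrix p(y|x)."""
    pxy = px[:, None] * channel              # joint distribution p(x, y)
    py = pxy.sum(axis=0)                     # output marginal p(y)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = pxy * np.log2(pxy / (px[:, None] * py))
    return np.nansum(terms)

# Binary symmetric channel with crossover probability eps (illustrative choice)
eps = 0.1
channel = np.array([[1 - eps, eps],
                    [eps, 1 - eps]])

# Grid search over input distributions p(x) = (q, 1 - q)
qs = np.linspace(0.001, 0.999, 999)
capacity = max(mutual_information(np.array([q, 1 - q]), channel) for q in qs)

analytic = 1 + eps * np.log2(eps) + (1 - eps) * np.log2(1 - eps)  # 1 - H(eps)
print(f"numerical  C = {capacity:.4f} bits/transmission")
print(f"analytic   C = {analytic:.4f} bits/transmission")
```

The maximum is attained at the uniform input distribution, and feedback cannot raise it.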

 

 

 

2 Source-channel separation

 

Let us define the setup under consideration. We have a source $V$ that generates symbols from an alphabet $\mathcal{V}$. We will not make any assumptions about the stochastic process produced by $V$ other than that it takes values in a finite alphabet and satisfies the AEP. We want to send the sequence of symbols $V^n = V_1, V_2, \ldots, V_n$ over the channel so that the receiver can reconstruct the sequence. To do this, we map the sequence onto a codeword $X^n(V^n)$ and send the codeword over the channel. The receiver looks at his received sequence $Y^n$ and makes an estimate $\hat{V}^n$ of the sequence $V^n$ that was sent. The receiver makes an error if $\hat{V}^n \neq V^n$. We define the probability of error as

$$\Pr(\hat{V}^n \neq V^n) = \sum_{y^n} \sum_{v^n} p(v^n)\, p(y^n \mid x^n(v^n))\, I\big(g(y^n) \neq v^n\big),$$

where $I$ is the indicator function and $g(y^n)$ is the decoding function. The system is illustrated in Figure 2.

Figure 2. Joint source and channel coding

 

 

Theorem 2 (Source-channel coding theorem) If $V_1, V_2, \ldots, V_n$ is a finite-alphabet stochastic process that satisfies the AEP and $H(\mathcal{V}) < C$, then there exists a source-channel code with probability of error $\Pr(\hat{V}^n \neq V^n) \to 0$. Conversely, for any stationary stochastic process, if $H(\mathcal{V}) > C$, the probability of error is bounded away from zero, and it is not possible to send the process over the channel with arbitrarily low probability of error.

 

Proof: 

 

Achievability. Since we have assumed that the stochastic process satisfies the AEP, there exists a typical set $A_\epsilon^{(n)}$ of size at most $2^{n(H(\mathcal{V})+\epsilon)}$ which contains most of the probability. We will encode only the source sequences belonging to the typical set; all other sequences will result in an error. This will contribute at most $\epsilon$ to the probability of error.

 

We index all the sequences belonging to $A_\epsilon^{(n)}$. Since there are at most $2^{n(H+\epsilon)}$ such sequences, $n(H+\epsilon)$ bits suffice to index them. We can transmit the desired index to the receiver with probability of error less than $\epsilon$ if $H(\mathcal{V}) + \epsilon = R < C$.

 

The receiver can reconstruct $V^n$ by enumerating the typical set $A_\epsilon^{(n)}$ and choosing the sequence corresponding to the estimated index. This sequence will agree with the transmitted sequence with high probability. To be precise,

 

$$\Pr(\hat{V}^n \neq V^n) \leq \Pr\big(V^n \notin A_\epsilon^{(n)}\big) + \Pr\big(g(Y^n) \neq V^n \mid V^n \in A_\epsilon^{(n)}\big) \leq \epsilon + \epsilon = 2\epsilon$$

for $n$ sufficiently large. Hence, we can reconstruct the sequence with low probability of error for $n$ sufficiently large if $H(\mathcal{V}) < C$.

 

Converse: We wish to show that $\Pr(\hat{V}^n \neq V^n) \to 0$ implies that $H(\mathcal{V}) \leq C$ for any sequence of source-channel codes

 

$$X^n(V^n): \mathcal{V}^n \to \mathcal{X}^n, \qquad g_n(Y^n): \mathcal{Y}^n \to \mathcal{V}^n.$$

 

 

Thus $X^n(\cdot)$ is an arbitrary assignment of codewords to data sequences $V^n$, and $g_n(\cdot)$ is any decoding function (an assignment of estimates $\hat{V}^n$ to output sequences $Y^n$). By Fano's inequality, we must have

$$H(V^n \mid \hat{V}^n) \leq 1 + \Pr(\hat{V}^n \neq V^n)\, n \log|\mathcal{V}|.$$

 

Hence for the code,

 

$$\begin{aligned}
H(\mathcal{V}) &\leq \frac{1}{n} H(V_1, V_2, \ldots, V_n) && \text{(a)} \\
&= \frac{1}{n} H(V^n \mid \hat{V}^n) + \frac{1}{n} I(V^n; \hat{V}^n) \\
&\leq \frac{1}{n}\big(1 + \Pr(\hat{V}^n \neq V^n)\, n \log|\mathcal{V}|\big) + \frac{1}{n} I(V^n; \hat{V}^n) && \text{(b)} \\
&\leq \frac{1}{n}\big(1 + \Pr(\hat{V}^n \neq V^n)\, n \log|\mathcal{V}|\big) + \frac{1}{n} I(X^n; Y^n) && \text{(c)} \\
&\leq \frac{1}{n} + \Pr(\hat{V}^n \neq V^n) \log|\mathcal{V}| + C, && \text{(d)}
\end{aligned}$$

 

 

where (a) follows from the definition of the entropy rate of a stationary process;

 

(b) follows from Fano's inequality;

 

(c) follows from the data processing inequality;

 

(d) follows from the memorylessness of the channel.

 

Now letting $n \to \infty$, we have $\Pr(\hat{V}^n \neq V^n) \log|\mathcal{V}| \to 0$, and hence $H(\mathcal{V}) \leq C$.

 

The source-channel separation theorem enables us to consider the problem of source coding separately from the problem of channel coding. The source coder tries to find the most efficient representation of the source, and the channel coder encodes the message to combat the noise and errors introduced by the channel. The separation theorem says that the separate encoders can achieve the same rates as the joint encoder.
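As a quick illustration of the separation criterion $H(\mathcal{V}) < C$, the following minimal sketch compares the entropy rate of an i.i.d. binary source with the capacity of a binary symmetric channel; the particular parameters (a Bernoulli(0.11) source and crossover probability 0.02) are illustrative assumptions.

```python
import numpy as np

def binary_entropy(q):
    """H(q) in bits; the entropy rate of an i.i.d. Bernoulli(q) source."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

# Hypothetical numbers: Bernoulli(0.11) source sent over a BSC(0.02)
H_source = binary_entropy(0.11)        # entropy rate, ~0.500 bits/source symbol
C_channel = 1 - binary_entropy(0.02)   # BSC capacity, ~0.859 bits/transmission

# Separation criterion from Theorem 2: reliable transmission is possible iff H(V) < C
print(f"H(V) = {H_source:.3f} bits/symbol, C = {C_channel:.3f} bits/transmission")
print("reliable transmission possible" if H_source < C_channel else "reliable transmission not possible")
```

In this regime a separate design (an entropy-achieving source code followed by a capacity-approaching channel code) already suffices; no joint design is needed to approach these rates.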

 

3 Gaussian channels

 

The most important continuous-alphabet channel is the Gaussian channel. This is a time-discrete channel with output $Y_i$ at time $i$, where $Y_i$ is the sum of the input $X_i$ and the noise $Z_i$. The noise $Z_i$ is drawn i.i.d. from a Gaussian distribution with variance $N$. Thus,

$$Y_i = X_i + Z_i, \qquad Z_i \sim \mathcal{N}(0, N).$$

The noise $Z_i$ is assumed to be independent of the signal $X_i$.

 

• If the noise variance is zero, the receiver receives the transmitted symbol perfectly. Since X  can take on any real value, the channel can transmit an arbitrary real number with no error. Thus, the capacity is infinite.

 

• If the noise variance is nonzero and there is no constraint on the input, we can intuitively choose an infinite subset of inputs arbitrarily far apart (with distances much larger than the noise standard deviation), so that they are distinguishable at the output with arbitrarily small probability of error. The channel still has infinite capacity.

 

The most common limitation on the input is an average energy or power constraint: for any codeword $(x_1, x_2, \ldots, x_n)$, we require

$$\frac{1}{n} \sum_{i=1}^{n} x_i^2 \leq P.$$

 

This communication channel models many practical channels, including radio and satellite links. By the central limit theorem, the cumulative effect of a large number of small random effects is approximately normal, so the Gaussian model is both realistic and "simple" to analyze.

 

Theorem 3 The capacity of a Gaussian channel with power constraint $P$ and noise variance $N$ is

$$C = \frac{1}{2} \log\left(1 + \frac{P}{N}\right).$$

 

Consider any codeword of length $n$. The received vector is normally distributed with mean equal to the true codeword and variance equal to the noise variance. With high probability, the received vector is contained in a sphere of radius $\sqrt{n(N+\epsilon)}$ around the true codeword. If we assign everything within this sphere to the given codeword, then when this codeword is sent there will be an error only if the received vector falls outside the sphere, which has low probability. Similarly, we can choose other codewords and their corresponding decoding spheres. The volume of an $n$-dimensional sphere is of the form $C_n r^n$, where $r$ is the radius of the sphere. Each decoding sphere has radius approximately $\sqrt{nN}$. These spheres are scattered throughout the space of received vectors. The received vectors have energy no greater than $n(P+N)$, so they lie in a sphere of radius $\sqrt{n(P+N)}$. The maximum number of nonintersecting decoding spheres in this volume is no more than

$$\frac{C_n \big(n(P+N)\big)^{n/2}}{C_n (nN)^{n/2}} = \left(1 + \frac{P}{N}\right)^{n/2}.$$

Therefore, the rate of the code is at most $\frac{1}{n} \log\left(1 + \frac{P}{N}\right)^{n/2} = \frac{1}{2} \log\left(1 + \frac{P}{N}\right)$.

Figure 3. Sphere packing for the Gaussian channel

 

For the details of the proof (achievability and converse), please refer to Chapter 9 of the textbook.
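To see the numbers, here is a minimal sketch; the power constraint $P = 15$, noise variance $N = 1$, and block length $n = 100$ are illustrative assumptions. It evaluates the sphere-packing count and checks that the resulting per-symbol rate matches the capacity formula of Theorem 3.

```python
import numpy as np

# Illustrative assumptions: power constraint P, noise variance N, block length n
P, N = 15.0, 1.0
n = 100

# Sphere packing: at most (n(P+N))^{n/2} / (nN)^{n/2} = (1 + P/N)^{n/2}
# nonintersecting decoding spheres fit inside the sphere of received vectors.
log2_num_spheres = (n / 2) * np.log2(1 + P / N)
rate_bound = log2_num_spheres / n          # bits per transmission

capacity = 0.5 * np.log2(1 + P / N)        # Theorem 3
print(f"sphere-packing rate bound = {rate_bound:.3f} bits/transmission")
print(f"(1/2) log2(1 + P/N)       = {capacity:.3f} bits/transmission")
```

Both quantities equal 2 bits per transmission here, since $P/N = 15$ gives $\frac{1}{2}\log_2 16 = 2$.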

 

Assume that we have a set of Gaussian channels in parallel. The output of each channel is the sum of the input and Gaussian noise. For channel $j$,

$$Y_j = X_j + Z_j, \qquad j = 1, 2, \ldots, k,$$

with $Z_j \sim \mathcal{N}(0, N_j)$, and the noise is assumed to be independent from channel to channel. We assume that there is a common power constraint on the total power used, that is,

$$E\left[\sum_{j=1}^{k} X_j^2\right] \leq P.$$

 

The information capacity of the channel is

$$C = \max_{f(x_1, x_2, \ldots, x_k):\ \sum_j E[X_j^2] \leq P} I(X_1, X_2, \ldots, X_k; Y_1, Y_2, \ldots, Y_k).$$

 

We can reduce the problem to finding the power allotment that maximizes the total capacity

$$C = \sum_{j=1}^{k} \frac{1}{2} \log\left(1 + \frac{P_j}{N_j}\right)$$

subject to the constraint that $\sum_{j=1}^{k} P_j = P$. Using Lagrange multipliers, we get $P_j = \nu - N_j$. Since each $P_j$ must be nonnegative, we use the KKT conditions to verify that the solution is

$$P_j = (\nu - N_j)^+, \qquad \text{where } \nu \text{ is chosen so that } \sum_{j=1}^{k} (\nu - N_j)^+ = P.$$
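The resulting allocation assigns more power to the less noisy channels and is commonly known as water-filling. As a minimal numerical sketch (the three noise levels and the total power $P = 10$ below are illustrative assumptions), the water level $\nu$ can be found by bisection:

```python
import numpy as np

def water_filling(noise, P, tol=1e-9):
    """Return the power allocation P_j = (nu - N_j)^+ with sum_j P_j = P."""
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + P     # the water level nu lies in [lo, hi]
    while hi - lo > tol:
        nu = 0.5 * (lo + hi)
        if np.maximum(nu - noise, 0.0).sum() > P:
            hi = nu                           # too much water: lower the level
        else:
            lo = nu                           # too little water: raise the level
    power = np.maximum(lo - noise, 0.0)
    return power, lo

# Hypothetical example: three parallel channels, total power P = 10
noise = np.array([1.0, 4.0, 9.0])
power, nu = water_filling(noise, P=10.0)
capacity = 0.5 * np.log2(1 + power / noise).sum()

print("water level nu   =", np.round(nu, 3))
print("power allocation =", np.round(power, 3))   # quieter channels get more power
print("total capacity   =", round(capacity, 3), "bits/transmission")
```

For these numbers the noisiest channel receives no power at all, which is exactly the behavior the $(\nu - N_j)^+$ solution predicts.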