**1 Feedback Capacity**

Figure 1. Discrete memoryless channel with feedback

A channel with feedback is illustrated in Figure 1. We assume that all the received symbols are sent back immediately and noiselessly to the transmitter, which can then use them to decide which symbol to send next. Can we do better with feedback? The surprising answer is no, which we shall now prove. We define a $(2^{nR}, n)$ feedback code as a sequence of mappings $x_i(W, Y^{i-1})$, where each $x_i$ is a function only of the message $W \in \{1, 2, \ldots, 2^{nR}\}$ and the previous received values $Y_1, Y_2, \ldots, Y_{i-1}$, and a sequence of decoding functions $g : \mathcal{Y}^n \to \{1, 2, \ldots, 2^{nR}\}$.

Thus,

$$P_e^{(n)} = \Pr\{g(Y^n) \neq W\},$$

when $W$ is uniformly distributed over $\{1, 2, \ldots, 2^{nR}\}$.

**Theorem 1** (Feedback capacity)

$$C_{FB} = C = \max_{p(x)} I(X; Y),$$

where $C_{FB}$ is the capacity with feedback.

Proof: Since a non-feedback code is a special case of a feedback code, any rate that can be achieved without feedback can be achieved with feedback, and hence

$$C_{FB} \geq C.$$

Let $W$ be uniformly distributed over $\{1, 2, \ldots, 2^{nR}\}$. Then $\Pr\{W \neq \hat{W}\} = P_e^{(n)}$ and

$$
\begin{aligned}
nR &\overset{(a)}{=} H(W) = H(W \mid \hat{W}) + I(W; \hat{W}) \\
&\overset{(b)}{\leq} 1 + P_e^{(n)} nR + I(W; \hat{W}) \\
&\overset{(c)}{\leq} 1 + P_e^{(n)} nR + I(W; Y^n),
\end{aligned}
$$

where (a) follows from the fact that $W$ is uniformly distributed;

(b) follows from Fano's inequality;

(c) follows from the data-processing inequality.

Next, we bound $I(W; Y^n)$:

$$
\begin{aligned}
I(W; Y^n) &= H(Y^n) - H(Y^n \mid W) \\
&\overset{(a)}{=} H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid Y_1, \ldots, Y_{i-1}, W) \\
&\overset{(b)}{=} H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid Y_1, \ldots, Y_{i-1}, W, X_i) \\
&\overset{(c)}{=} H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid X_i) \\
&\overset{(d)}{\leq} \sum_{i=1}^{n} H(Y_i) - \sum_{i=1}^{n} H(Y_i \mid X_i) = \sum_{i=1}^{n} I(X_i; Y_i) \\
&\overset{(e)}{\leq} nC,
\end{aligned}
$$

where (a) follows from the chain rule;

(b) follows from the fact that $X_i$ is a function of $W$ and $Y^{i-1}$;

(c) follows from the fact that, conditional on $X_i$, $Y_i$ is independent of $W$ and past samples of $Y$;

(d) follows from the chain rule and the fact that conditioning reduces entropy;

(e) follows from the definition of capacity for a discrete memoryless channel. Putting these together, we obtain

$$nR \leq 1 + P_e^{(n)} nR + nC,$$

and dividing by $n$ and letting $n \to \infty$, we conclude that $R \leq C$.

Thus, we cannot achieve any higher rates with feedback than we can without feedback, and $C_{FB} = C$.
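As a small numerical illustration (the binary symmetric channel is not treated in these notes; it is used here purely as a hypothetical example), Theorem 1 says that the capacity of a BSC with crossover probability $\epsilon$, namely $1 - H(\epsilon)$, is the same with or without feedback:

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Capacity of a binary symmetric channel with crossover probability eps
# is C = 1 - h2(eps); by Theorem 1, feedback leaves it unchanged.
eps = 0.1
C = 1 - h2(eps)
print(f"BSC({eps}) capacity, with or without feedback: {C:.4f} bits/use")
```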

**2 Source-channel separation**

Let us define the setup under consideration. We have a source $V$ that generates symbols from an alphabet $\mathcal{V}$. We will not make any assumptions about the kind of stochastic process produced by $V$ other than that it has a finite alphabet and satisfies the AEP. We want to send the sequence of symbols $V^n = V_1, V_2, \ldots, V_n$ over the channel so that the receiver can reconstruct the sequence. To do this, we map the sequence onto a codeword $X^n(V^n)$ and send the codeword over the channel. The receiver looks at the received sequence $Y^n$ and makes an estimate $\hat{V}^n$ of the sequence that was sent. The receiver makes an error if $\hat{V}^n \neq V^n$. We define the probability of error as

$$P_e^{(n)} = \Pr(\hat{V}^n \neq V^n) = \sum_{y^n} \sum_{v^n} p(v^n)\, p(y^n \mid x^n(v^n))\, I(g(y^n) \neq v^n),$$

where $I$ is the indicator function and $g(y^n)$ is the decoding function. The system is illustrated in Figure 2.

Figure 2. Joint source and channel coding

**Theorem 2** (Source-channel coding theorem) If $V_1, V_2, \ldots, V_n$ is a finite alphabet stochastic process that satisfies the AEP and $H(\mathcal{V}) < C$, then there exists a source-channel code with probability of error $\Pr(\hat{V}^n \neq V^n) \to 0$. Conversely, for any stationary stochastic process, if $H(\mathcal{V}) > C$, the probability of error is bounded away from zero, and it is not possible to send the process over the channel with arbitrarily low probability of error.

Proof:

Achievability. Since we have assumed that the stochastic process satisfies the AEP, it implies that there exists a typical set $A_\epsilon^{(n)}$ of size at most $2^{n(H(\mathcal{V}) + \epsilon)}$ which contains most of the probability. We will encode only the source sequences belonging to the typical set; all other sequences will result in an error. This will contribute at most $\epsilon$ to the probability of error.

We index all the sequences belonging to $A_\epsilon^{(n)}$. Since there are at most $2^{n(H+\epsilon)}$ such sequences, $n(H + \epsilon)$ bits suffice to index them. We can transmit the desired index to the receiver with probability of error less than $\epsilon$ if $R = H(\mathcal{V}) + \epsilon < C$.

The receiver can reconstruct $V^n$ by enumerating the typical set $A_\epsilon^{(n)}$ and choosing the sequence corresponding to the estimated index. This sequence will agree with the transmitted sequence with high probability. To be precise,

$$\Pr(\hat{V}^n \neq V^n) \leq \Pr\left(V^n \notin A_\epsilon^{(n)}\right) + \Pr\left(g(Y^n) \neq V^n \mid V^n \in A_\epsilon^{(n)}\right) \leq \epsilon + \epsilon = 2\epsilon$$

for $n$ sufficiently large. Hence, we can reconstruct the sequence with low probability of error for $n$ sufficiently large if $H(\mathcal{V}) < C$.
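A quick numerical sketch of the bookkeeping in the achievability argument; the source distribution, block length, slack $\epsilon$, and channel capacity below are all hypothetical values chosen for illustration:

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Hypothetical numbers (not from the notes): a Bernoulli(0.2) i.i.d.
# source, block length n = 1000, slack eps = 0.05, and a channel of
# capacity C = 0.8 bits per use.
p, n, eps, C = 0.2, 1000, 0.05, 0.8
H = h2(p)                    # entropy rate of the i.i.d. source
index_bits = n * (H + eps)   # bits needed to index the typical set
print(f"H = {H:.4f} bits/symbol; about {index_bits:.0f} index bits")
print(f"transmissible: {H + eps < C}")  # True, since 0.7719 < 0.8
```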

Converse: We wish to show that $\Pr(\hat{V}^n \neq V^n) \to 0$ implies $H(\mathcal{V}) \leq C$ for any sequence of source-channel codes

$$X^n(V^n) : \mathcal{V}^n \to \mathcal{X}^n, \qquad g_n(Y^n) : \mathcal{Y}^n \to \mathcal{V}^n.$$

Here $X^n(V^n)$ is an arbitrary assignment of codewords to data sequences $v^n$, and $g_n(Y^n)$ is any decoding function (assignment of estimates $\hat{v}^n$ to output sequences $y^n$). By Fano's inequality, we must have

$$H(V^n \mid \hat{V}^n) \leq 1 + \Pr(\hat{V}^n \neq V^n)\, n \log |\mathcal{V}|.$$

Hence for the code,

$$
\begin{aligned}
H(\mathcal{V}) &\overset{(a)}{\leq} \frac{H(V_1, V_2, \ldots, V_n)}{n} \\
&= \frac{1}{n} H(V^n \mid \hat{V}^n) + \frac{1}{n} I(V^n; \hat{V}^n) \\
&\overset{(b)}{\leq} \frac{1}{n} \left(1 + \Pr(\hat{V}^n \neq V^n)\, n \log |\mathcal{V}|\right) + \frac{1}{n} I(V^n; \hat{V}^n) \\
&\overset{(c)}{\leq} \frac{1}{n} \left(1 + \Pr(\hat{V}^n \neq V^n)\, n \log |\mathcal{V}|\right) + \frac{1}{n} I(X^n; Y^n) \\
&\overset{(d)}{\leq} \frac{1}{n} + \Pr(\hat{V}^n \neq V^n) \log |\mathcal{V}| + C,
\end{aligned}
$$

where (a) follows from the definition of the entropy rate of a stationary process;

(b) follows from Fano's inequality;

(c) follows from the data-processing inequality;

(d) follows from the memorylessness of the channel.

Now letting $n \to \infty$, we have $\Pr(\hat{V}^n \neq V^n) \to 0$ and hence $H(\mathcal{V}) \leq C$.

The source-channel separation theorem enables us to consider the problem of source coding separately from the problem of channel coding. The source coder tries to find the most efficient representation of the source, and the channel coder encodes the message to combat the noise and errors introduced by the channel. The separation theorem says that the separate encoders can achieve the same rates as a joint encoder.
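As a sketch of how the separation criterion is applied (the pairing of a Bernoulli source with a binary symmetric channel is a standard textbook example, not something derived in these notes):

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def transmissible(p_source, eps_channel):
    """Separation criterion for this example: a Bernoulli(p) source can
    be sent over a BSC(eps) with vanishing error iff its entropy rate
    h2(p) is below the channel capacity 1 - h2(eps)."""
    return h2(p_source) < 1 - h2(eps_channel)

print(transmissible(0.10, 0.05))  # True:  0.4690 < 0.7136
print(transmissible(0.40, 0.20))  # False: 0.9710 > 0.2781
```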

**3 Gaussian channels**

The most important continuous alphabet channel is the Gaussian channel. This is a discrete-time channel with output $Y_i = X_i + Z_i$ at time $i$, where $Y_i$ is the sum of the input $X_i$ and the noise $Z_i$. The noise $Z_i$ is drawn i.i.d. from a Gaussian distribution with variance $N$. Thus, $Z_i \sim \mathcal{N}(0, N)$. The noise $Z_i$ is assumed to be independent of the signal $X_i$.

• If the noise variance is zero, the receiver receives the transmitted symbol perfectly. Since X can take on any real value, the channel can transmit an arbitrary real number with no error. Thus, the capacity is infinite.

• If the noise variance is nonzero and there is no constraint on the input, we can intuitively choose an infinite subset of inputs arbitrarily far apart (much farther apart than the noise standard deviation), so that they are distinguishable at the output with arbitrarily small probability of error. The channel again has infinite capacity.

The most common limitation on the input is an average energy or power constraint: for any codeword $(x_1, x_2, \ldots, x_n)$, we require

$$\frac{1}{n} \sum_{i=1}^{n} x_i^2 \leq P.$$

This communication channel models many practical channels, including radio and satellite links. Moreover, by the central limit theorem, the cumulative effect of a large number of small random effects is approximately normal, so the Gaussian model is widely applicable and "simple" to analyze.

**Theorem 3** The capacity of a Gaussian channel with power constraint $P$ and noise variance $N$ is

$$C = \max_{E X^2 \leq P} I(X; Y) = \frac{1}{2} \log\left(1 + \frac{P}{N}\right) \text{ bits per transmission.}$$

Consider any codeword of length $n$. The received vector is normally distributed with mean equal to the true codeword and variance equal to the noise variance. With high probability, the received vector is contained in a sphere of radius $\sqrt{nN}$ around the true codeword. If we assign everything within this sphere to the given codeword, then when this codeword is sent there will be an error only if the received vector falls outside the sphere, which has low probability. Similarly, we can choose other codewords and their corresponding decoding spheres. The volume of an $n$-dimensional sphere is of the form $C_n r^n$, where $r$ is the radius of the sphere. Each decoding sphere has radius $\sqrt{nN}$. These spheres are scattered throughout the space of received vectors. The received vectors have energy no greater than $n(P+N)$, so they lie in a sphere of radius $\sqrt{n(P+N)}$. The maximum number of nonintersecting decoding spheres in this volume is no more than

$$\frac{C_n \left(n(P+N)\right)^{n/2}}{C_n (nN)^{n/2}} = 2^{\frac{n}{2} \log\left(1 + \frac{P}{N}\right)}.$$

Therefore, the rate of the code is at most $\frac{1}{2} \log\left(1 + \frac{P}{N}\right)$.
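The volume-ratio count in the sphere-packing argument can be checked numerically; the values of $P$, $N$, and $n$ below are hypothetical:

```python
import math

# Sphere-packing count: the log2 of the volume ratio between the big
# sphere of radius sqrt(n(P+N)) and one decoding sphere of radius
# sqrt(nN) is (n/2) * log2(1 + P/N), so the rate per channel use is
# (1/2) * log2(1 + P/N).  P, N, n are hypothetical.
P, N, n = 10.0, 1.0, 100
log2_num_spheres = (n / 2) * math.log2(1 + P / N)  # log2 of volume ratio
rate = log2_num_spheres / n                        # bits per channel use
capacity = 0.5 * math.log2(1 + P / N)
print(f"rate = {rate:.4f}, capacity = {capacity:.4f}")  # both ~1.7297
```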

Figure 3. Sphere packing for the Gaussian channel

For the details of the proof (achievability and converse), please refer to Chapter 9 of the textbook.

Assume that we have a set of $k$ Gaussian channels in parallel. The output of each channel is the sum of the input and Gaussian noise. For channel $j$,

$$Y_j = X_j + Z_j, \qquad j = 1, 2, \ldots, k,$$

with $Z_j \sim \mathcal{N}(0, N_j)$, and the noise is assumed to be independent from channel to channel. We assume that there is a common power constraint on the total power used, that is,

$$E \sum_{j=1}^{k} X_j^2 \leq P.$$

The information capacity of the channel is

$$C = \max_{f(x_1, \ldots, x_k) :\, \sum_j E X_j^2 \leq P} I(X_1, \ldots, X_k; Y_1, \ldots, Y_k),$$

which is achieved by independent Gaussian inputs $X_j \sim \mathcal{N}(0, P_j)$ with $\sum_j P_j = P$, giving

$$C = \sum_{j=1}^{k} \frac{1}{2} \log\left(1 + \frac{P_j}{N_j}\right).$$

We can reduce the problem to finding the power allotment $(P_1, \ldots, P_k)$ that maximizes the capacity subject to the constraint $\sum_j P_j = P$. By Lagrange multipliers, we get $P_j + N_j = \nu$ for some constant $\nu$. Since each $P_j$ must be nonnegative, we use the KKT conditions to verify that the solution is

$$P_j = (\nu - N_j)^+,$$

where $(x)^+ = \max(x, 0)$ and the water level $\nu$ is chosen so that $\sum_j (\nu - N_j)^+ = P$. This solution is known as water-filling.
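A minimal water-filling sketch, assuming hypothetical noise variances $N_j$ and total power $P$; the water level $\nu$ is found by bisection:

```python
import math

def water_filling(noise_vars, total_power, tol=1e-9):
    """Find the water level nu with sum_j max(nu - N_j, 0) = total_power
    by bisection, then return the allocation P_j = (nu - N_j)^+."""
    lo, hi = min(noise_vars), max(noise_vars) + total_power
    while hi - lo > tol:
        nu = (lo + hi) / 2
        if sum(max(nu - N, 0.0) for N in noise_vars) > total_power:
            hi = nu
        else:
            lo = nu
    nu = (lo + hi) / 2
    return [max(nu - N, 0.0) for N in noise_vars]

# Hypothetical noise variances and total power (not from the notes).
noise_vars = [1.0, 2.0, 4.0]
P = 5.0
alloc = water_filling(noise_vars, P)   # water level nu = 4
C = sum(0.5 * math.log2(1 + Pj / Nj) for Pj, Nj in zip(alloc, noise_vars))
print(alloc)  # approximately [3.0, 2.0, 0.0]
print(C)      # approximately 1.5 bits per transmission
```

Note that the worst channel ($N_3 = 4$) receives no power at all: the water level sits below its noise floor, which is exactly the behavior the KKT conditions predict.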
