Homework 1

Page history last edited by sidjaggi 15 years, 1 month ago

Q1. (Sanov's Theorem:) Let $X_1, X_2, \ldots, X_n$ be drawn i.i.d. from the distribution $q(x)$, over alphabet $\mathcal{X}$. Let $E$ be a set of type-classes of strings of length $n$. Prove that the probability of observing a string in a type-class in the set $E$ satisfies

$q^n(E) \leq (n+1)^{|\mathcal{X}|} \, 2^{-n D(p^* \| q)},$

where $p^*$ is the p.m.f. in $E$ that is closest to $q$ in K.-L. divergence, i.e., $p^* = \arg\min_{p \in E} D(p \| q)$.
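
As a quick sanity check (not part of the problem), the bound can be compared numerically against the exact probability for a small binary example; the source distribution, the threshold, and the helper names below are arbitrary illustrative choices:

from math import comb, log2

def kl(p, q):
    # Binary K.-L. divergence D(p || q) in bits (with the 0 * log 0 = 0 convention).
    return sum(a * log2(a / b) for a, b in ((p, q), (1 - p, 1 - q)) if a > 0)

def sanov_check(q1, thresh, n):
    # Exact probability that the empirical fraction of ones is >= thresh,
    # under n i.i.d. Bernoulli(q1) draws, obtained by summing over type-classes.
    exact = sum(comb(n, k) * q1**k * (1 - q1)**(n - k)
                for k in range(n + 1) if k / n >= thresh)
    # p*: the type in the set that is closest to Bernoulli(q1) in K.-L. divergence.
    feasible = [k / n for k in range(n + 1) if k / n >= thresh]
    p_star = min(feasible, key=lambda p: kl(p, q1))
    bound = (n + 1) ** 2 * 2 ** (-n * kl(p_star, q1))  # (n+1)^|X| with |X| = 2
    return exact, bound

for n in (20, 50, 100):
    print(n, sanov_check(q1=0.3, thresh=0.5, n=n))  # exact stays below the bound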

Q2. (Estimating type:) The true p.m.f. $p(x)$ of a random source $X$ is unknown, and needs to be estimated by observing a sequence of $n$ symbols $X_1, \ldots, X_n$ drawn i.i.d. from $p(x)$. Let $\hat{p}(x)$ be the estimate of $p(x)$, obtained simply by taking the type of the string $X_1, \ldots, X_n$.

Obtain an upper bound on the expected variance of $\hat{p}(x)$ from $p(x)$ (with expectation over $X_1, \ldots, X_n$), i.e., $\mathbb{E}\big[\,|p(x) - \hat{p}(x)|^2\,\big]$? (An order-of-magnitude bound in terms of $n$ is all that is expected.)
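
For concreteness, the estimate $\hat{p}$ is just the empirical frequency of each symbol, i.e., the type of the observed string; a minimal sketch (the alphabet and sample below are made up):

from collections import Counter

def type_of(sequence, alphabet):
    # Empirical p.m.f. (the type) of an observed sequence over a given alphabet.
    counts = Counter(sequence)
    n = len(sequence)
    return {a: counts[a] / n for a in alphabet}

# Hypothetical example: a ternary source observed for n = 10 symbols.
print(type_of("aabacbbaca", alphabet="abc"))  # {'a': 0.5, 'b': 0.3, 'c': 0.2}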

Q3. (Universal source code:) A universal source code of rate $R$ is a code such that, for any memoryless source $X$ satisfying $H(X) < R$, the probability of decoding error is less than $\epsilon$. Prove that such codes exist.

Q4. (Non-positivity of ``triple'' mutual information:) As in class, for any three random variables $X$, $Y$, and $Z$, define $I(X;Y;Z)$ as $I(X;Y) - I(X;Y|Z)$ (see the short numerical sketch after part (b)).

(a). Find a joint distribution $p(x,y,z)$ such that $I(X;Y;Z)$ is positive.

(b). Find a joint distribution $p(x,y,z)$ such that $I(X;Y;Z)$ is negative.
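
If it helps to experiment with candidate distributions, $I(X;Y;Z)$ can be evaluated numerically straight from the definition $I(X;Y) - I(X;Y|Z)$; the function name and the uniform placeholder distribution below are ours, not part of the problem:

from math import log2

def mutual_info_xyz(p):
    # I(X;Y;Z) = I(X;Y) - I(X;Y|Z) in bits, for a joint p.m.f. given as
    # a dict {(x, y, z): probability}.
    def marg(keep):
        out = {}
        for xyz, pr in p.items():
            key = tuple(xyz[i] for i in keep)
            out[key] = out.get(key, 0.0) + pr
        return out
    px, py, pz = marg((0,)), marg((1,)), marg((2,))
    pxy, pxz, pyz = marg((0, 1)), marg((0, 2)), marg((1, 2))
    i_xy = sum(pr * log2(pr / (px[(x,)] * py[(y,)]))
               for (x, y), pr in pxy.items() if pr > 0)
    i_xy_given_z = sum(pr * log2(pr * pz[(z,)] / (pxz[(x, z)] * pyz[(y, z)]))
                       for (x, y, z), pr in p.items() if pr > 0)
    return i_xy - i_xy_given_z

# Placeholder: the uniform distribution on {0,1}^3 (an uninformative choice).
p = {(x, y, z): 0.125 for x in (0, 1) for y in (0, 1) for z in (0, 1)}
print(mutual_info_xyz(p))  # prints (approximately) 0.0 for this choice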

Q5. (Feedback does not improve the source coding rate:) Suppose feedback were allowed in a source coding system -- could a lower rate still guarantee ``small'' error? More formally, suppose the encoder Alice possesses a discrete memoryless source that generates $n$ i.i.d. symbols $X_1, \ldots, X_n$, each with entropy $H(X)$. Alice generates a bit $Y_1$ (which may be a function of $X_1, \ldots, X_n$), and transmits this to Bob. Bob processes this bit and transmits back a symbol $Z_1$ from an arbitrary alphabet ($Z_1$ may be a function only of $Y_1$) to Alice. Alice then generates a bit $Y_2$ (which may be a function of $X_1, \ldots, X_n$ and $Z_1$), and transmits this to Bob, who processes this bit and transmits back a symbol $Z_2$ from an arbitrary alphabet (it may be a function of $Y_1$ and $Y_2$) to Alice.

At the $i$th step, Alice's $i$th bit $Y_i$ may be a function of $X_1, \ldots, X_n$ and $Z_1, \ldots, Z_{i-1}$, and Bob's feedback symbol $Z_i$ may be a function of $Y_1, \ldots, Y_i$ and $Z_1, \ldots, Z_{i-1}$.

This continues for $nR$ steps ($R$ is the rate).

After $nR$ steps, Bob decodes the source as the sequence $\hat{X}_1, \ldots, \hat{X}_n$, which may be a function of $Y_1, \ldots, Y_{nR}$ and $Z_1, \ldots, Z_{nR}$. Prove that if $R < H(X)$, the probability of error is bounded away from zero.
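
In compact form (here $f_i$, $g_i$, and $\phi$ are just our labels for the arbitrary deterministic maps described above, not extra assumptions): $Y_i = f_i(X_1, \ldots, X_n, Z_1, \ldots, Z_{i-1})$ and $Z_i = g_i(Y_1, \ldots, Y_i, Z_1, \ldots, Z_{i-1})$ for $1 \leq i \leq nR$, and $(\hat{X}_1, \ldots, \hat{X}_n) = \phi(Y_1, \ldots, Y_{nR}, Z_1, \ldots, Z_{nR})$.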

Q6. (``Practical'' Source code:) Attempt Q3 of PS1. Hints:

(a) How many ``elementary'' operations does it take to compute Formula?

(b) When you are searching a dictionary (an English one :) for a word, do you use binary search?

(c) Think of the leaves of the encoding binary tree as the pages of a dictionary...
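
Purely to illustrate hint (b) -- locating an entry by binary search over a sorted list of cumulative values rather than by a linear scan -- here is a generic sketch; it is not the PS1 construction, and the list of "page boundaries" is made up:

import bisect

# Illustrative cumulative "page boundaries" (e.g., cumulative probabilities),
# assumed sorted in increasing order.
boundaries = [0.0, 0.1, 0.35, 0.6, 0.85, 1.0]

def locate(value, boundaries):
    # Index i of the interval [boundaries[i], boundaries[i+1]) containing value,
    # found with O(log n) comparisons.
    i = bisect.bisect_right(boundaries, value) - 1
    return min(max(i, 0), len(boundaries) - 2)

print(locate(0.4, boundaries))  # -> 2, i.e., the interval [0.35, 0.6)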

Q7. Attempt any three of the following questions from Cover/Thomas, Chapter 2.

1, 2, 40, 16, 8, 10.

Comments (9)

MEI Yuchen said

at 11:31 pm on Feb 1, 2009

what is the definition for "expected variance of p from q"? Thanks

sidjaggi said

at 12:21 am on Feb 2, 2009

good question -- added clarifying remarks...

Mak said

at 2:20 pm on Feb 2, 2009

How should we read Q4? What does "define I(X;Y;Z) as I(X;Y;Z)" mean?

The other problems look very hard too. Can anyone share some hints?

sidjaggi said

at 6:04 pm on Feb 2, 2009

Mak, thanks for pointing out the typo -- corrected.

Hints: Note that Q1 and Q3 are based on material from Cover/Thomas (look in the index). However, I'd prefer it if you tried yourself, or talked to friends, before looking at the book. For other problems, I can be persuaded to release hints on this wiki :) However, for that to happen, I need to see at least an "active discussion" (whatever that means), with some attempts on attacking the problem, before I give some hints.

MEI Yuchen said

at 12:10 am on Feb 3, 2009

I do not quite understand Q5.
What is the meaning of "Alice generates H(X) bit Y1"?
As far as I can see, after Alice processes the sequence of length n, a length nR binary sequence should be generated and passed to Bob.
How come only one bit is transmitted at one time?
Moreover, what is Z? How does Bob get Z from Y? How does Alice use Z together with X to generate Y?

I strongly agree with Mak. The problems are very difficult.

sidjaggi said

at 12:36 am on Feb 3, 2009

Hi Yuchen,

The idea is that each bit of Alice's encoded codeword can, in principle, depend on the previous bits that Bob feeds back to her. Similarly, each bit of Bob's feedback can depend on the previous bits that Alice has fed to him. The definitions in the question set up conditions on which bits can depend on which bits. Since this gives more freedom to possible encoding/decoding schemes, the question is whether Alice can get by without transmitting nR bits. Question -- do you really think it'll help substantially, or not?


It is true that some of the questions may seem hard. As I said earlier, I'm willing to give hints, if people use this wiki to discuss their attempts thus far.
Note that the homework has been online for two weeks -- some discussion of it earlier might have generated hints earlier :) Also many questions are from the standard text-book, or are slight variants thereof.

Also, you're fully expected to talk to your classmates -- two heads are better than one, and many heads are even better (though unlikely with unbiased coins :)

Lastly, the homeworks are intended to get you thinking about information theory -- I've tried to choose questions that are relevant to classwork, and have some teaching value -- giving you softball questions without a purpose helps no one learn. Even if you don't get a couple of questions, it's only a homework set -- it's not the end of the world :)

MEI Yuchen said

at 12:47 am on Feb 3, 2009

Thanks a lot.
At least I do understand the question now and can start attempting.

Tong said

at 8:17 pm on Feb 4, 2009

In Q2, is p(x) treated as a constant inside the E[ ], or not?

sidjaggi said

at 11:38 pm on Feb 4, 2009

Hmm, not quite sure what you mean by the question.

The observed \hat{p}(x) can take many values, and for each value that \hat{p}(x) takes, there is an associated probability of observing that \hat{p}(x) (say we call that probability \hat{q}(x)). Then the expected variance equals the sum (over all possible values of \hat{p}(x)) of the quantity \hat{q}(x)|p(x) - \hat{p}(x)|^2.

hope that makes it clearer?
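
For a binary source, the quantity described above can be computed exactly by summing over the types k/n; a small sketch (the Bernoulli parameter used here is just an example):

from math import comb

def expected_variance(p1, n):
    # E[ |p(x) - \hat{p}(x)|^2 ] for x = 1, for a Bernoulli(p1) source observed n
    # times: sum over all types k/n, weighted by the probability of observing each.
    total = 0.0
    for k in range(n + 1):
        q_hat = comb(n, k) * p1**k * (1 - p1)**(n - k)  # prob. of observing type k/n
        total += q_hat * (p1 - k / n)**2
    return total

for n in (10, 100, 1000):
    print(n, expected_variance(0.3, n))  # note how this scales with n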
