• If you are citizen of an European Union member nation, you may not use this service unless you are at least 16 years old.

  • You already know Dokkio is an AI-powered assistant to organize & manage your digital files & messages. Very soon, Dokkio will support Outlook as well as One Drive. Check it out today!


Scribe Notes 1

Page history last edited by Mak 15 years, 4 months ago

Introduction to Information/"Worst-case compression" (Scribe: XIA Lingxiao)


Claude Elwood Shannon, the father of Information Theory, once said[1]  "Information is the resolution of uncertainty". Things that cannot be predicted, i.e., what we don’t know about the world, usually draw most of our attention. In this class we shall not discuss the semantic meaning of information; rather, we formulate an operational definition. That is, we try to model information mathematically to model randomness, rather than pondering the philosophical implications of this randomness. More precisely, we examine the statistical and probabilistic properties of "information". Our basic example, one we shall return to many times, is, What happens when a (biased) coin is tossed many times? For instance, in a black-and-white picture, a dark pixel may be denoted by a 1 and a light pixel by a 0. Statistically, the probability of a 1 should be significantly higher than that of a 0 -- perhaps this observation can be used to compress the image (i.e., transfer fewer bits than the number of pixels in the original image). If we are willing to tolerate some distortion in the image, we may be able to compress the image even more. These examples correspond to two formulations of data compression problems, both of which we shall examine in this first course in information theory.

1.Lossless source coding (Exact reconstruction of the source's data is required).

2. Rate-distortion theory (Given a budget of maximal reconstruction distortion, the goal is to minimize the number of bits in the compressed data.)


Another problem of interest might be that of transmitting information over a noisy channel. For example, a sequence of bits is transmitted wirelessly, and if the receiver decodes a bit to the same value that the transmitter had transmitted, then a 0 is said to have been added to the signal, or else a 1 is.Depending on how noisy the ether is, careful design of coding schemes might allow an entire message to be transmitted with low error, even though individual bits might have a significant probability of decoding error. The set of problems corresponding to this problem formulations is called

3. Channel coding (Transmission of information over a noisy channel)


These are but three examples of a vast array of possible problems relating to information compression, storage, and tranmission, that one could conceivably be interested in. In this course we examine the theory and applications surrounding these three problems, and leave other problem formulations for more advanced classes.



Some knowledge of basic probability theory is assumed in this class -- this is summarized here.


Problem Set 1


A certain fair coin is tossed n times. Each time it is tossed, the result is independent of the previous toss. This is an example of an i.i.d. binary random variable.

Q1: (Worst-case "compression")

(a) If the coin described above is tossed n times, how many distinct length-n sequences of coin tosses are possible?


(b) Given N possible outcomes of a random variable, how many bits (elements from {0,1}) are required to assign a unique length-N bit-sequence to each possible outcome?

         Formula (the ceiling function of the binary logarithm of N)

(c) Describe a method based on binary trees, to assign a length-N bit-sequence to an outcome, based on a binary tree, and a corresponding method to reconstruct the coin-toss sequence from the bit-sequence.


Coin tossing


Bit sequence







        Property: easy to construct/recover, one-to-one

(d) If the encoding alphabet is ternary (eg: {0,1,2}) instead of binary, how does your answer to (b) change? How about your answer to (c), if ternary trees are allowed? What if you were tossing a six-sided dice instead of a coin? How do your answers change? What is the time-complexity of your assignment schemes?

        Time complexity: O(n), more details in the next class.....



"I don't get the names of things, I just know how often they happen. " [Lingxiao: I don't know where I heard this, but seems like this quote belongs somewhere in this note]



Homework: check out Stirling's Approximation.


  1. http://www.brainyquote.com/quotes/authors/c/claude_shannon.html

Comments (11)

Leonard Wong said

at 7:30 pm on Jan 11, 2009

I think the first sentence can be modified.

Original: There’re many ways of talking about information, but basically it’s more about new things that cannot be predicted.

In information theory, we try to model information mathematically. In doing so we neglect many features of information, such as meaning and language. Here, we focus on the statistical properties of information, and say that information (in information theory) is the decrease of uncertainty. Meaning and language are very basic and important. But we do not consider them here. Hence, I think that the word "basically" is not appropriate.

My recommendation: There are many ways of talking about information. Information theory attempts to model information mathematically by focusing on a particular aspect of information. Claude Shannon, the father of information theory, said that information is the resolution of uncertainty.

Lingxiao XIA said

at 9:26 pm on Jan 11, 2009

im amazed by how long you guys' comments can be, even when the thing is still unfinished, im dying trying to recall any new things to add when the thing was still three lines long, anyway it's a scribe note, which means im only writing down what is said in the class rather than creating new things, thanks for your comment though

sidjaggi said

at 11:28 pm on Jan 11, 2009


Don't feel restricted into saying just what was talked about in class -- if there are better ways of saying things, or even other things we didn't talk about that are relevant/interesting, feel free to dive in.

You're right, though, about "unfinished" business -- the goal of giving you a week to edit it before other people commented, was to allow you some time to think/collect your thoughts before people started breathing down your neck. Starting tomorrow, guys, feel free to jump in with comments, and Lingxia, over the next week you'll be the judge of which of those comments are relevant to include/edit, and in general how to handle them.

Lingxiao XIA said

at 11:38 pm on Jan 11, 2009

Lingxiao, but never mind

Lingxiao XIA said

at 12:20 am on Jan 12, 2009

"There are many ways of talking about information. Information theory attempts to model information mathematically by focusing on a particular aspect of information. Claude Shannon, the father of information theory, said that information is the resolution of uncertainty."

a little bit confused about "model information mathematically". doesn't information denote things that we already know? in my opinion, we dont have to model it anymore, we use it, to model the future, or uncertainty. so information theory should be about how to use what we already know to predict things, mathematically.

i'm sorry, i didnt mean to be so picky about choosing of words, just a little bit confused

Leonard Wong said

at 10:49 pm on Jan 13, 2009

Thanks a lot! Your notes helped me a lot to recall things mentioned in class.

The question is "What is (mathematical) modelling". I don't know a lot about the philosophy of science/math, but I can share a little about my understanding. My definition is "We model something mathematically by creating a mathematical system that intends to represent it in some ways." Once the mathematical system is established, it is independent of the phenomena that we wish to describe because it exists in the form of axioms, definitions and theorems. We derive consequences of the mathematical system, and wish to link them to the phenomena. This linkage is not automatic, and emprical experiments must be carried out to see whether the model's predictions are consistent with nature.

Consider probability theory. Probability is a branch of mathematics that models uncertainty. The (weak/strong) law of large number is a THEOREM derived from the axioms of probability. We know that the relative frequency of coin tosses tends to 1/2 as n goes to infinity. This is NOT a logical consequene of the law of large number, but only indicates that the binomial distribution is a reasonable approximation of the problem at hand. If the relative frequency does not stablize, we can no longer use the binomial distribution to model coin tosses. But the theorem of law of large number is STILL RIGHT in the mathematical system. See the introduction of William Feller's classic for a better discussion of the interplay of math and nature.

sidjaggi said

at 11:08 pm on Jan 13, 2009

Hey Lingxiao -- for the relatively small amount of math you have on the page, it shouldn't be too hard to use the LaTeX plugin (use the insert plugin tab on the top right) to enter the math more neatly. For example, the equation F(x) = P(X <= x) becomes F(x) = \Pr(X \leq x) in the plugin.

Also, some conventions -- a random variable will always be boldface and capitalized, a "normal" variable with be neither. With this notation, the equation becomes F(x) = \Pr(\mathbf{X} \leq x).

Hope that helped.

Mak said

at 2:14 am on Jan 16, 2009

I think the notes have mentioned all the main points discussed in the first lecture.

>> Similar rules also apply to the independence of multiple variables. It is interesting to note here that the independence
>> of each two of the multiple variables doesn’t lead to the independence of all variables.
I remember that a particular example that illustrates this fact has been mentioned in class. It would be great if you could include that.

sidjaggi said

at 7:13 pm on Jan 18, 2009

Lingxiao -- edited your scribe notes. Do tell me if you agree with the reworking of the material (I realize not everything is as I said it in class, but perhaps this is a better way of saying it, and perhaps I SHOULD have said it like this :)

Lingxiao XIA said

at 9:23 pm on Jan 18, 2009

wow, sorry i didnt modify this article to add the example before you did, i was lazy this weekend....sorry again.. holiday time...is close...
this looks like a brand new article now, everything looks more professional, great success!

Mak said

at 6:56 pm on Jan 19, 2009

I just turned the 宋体 into Times New Roman. I think it looks more comfortable :-)

You don't have permission to comment on this page.