Scribe Notes 1

Scribe: Lingxiao XIA, January 11, 2009

*UNFINISHED*

There are many ways to talk about information, but fundamentally it is about what cannot be predicted, i.e., what we do not know about the world. Claude Shannon, the father of information theory, said that information is the resolution of uncertainty. It is a statistical notion, built on probability. For example, what will happen when you toss a coin many times?

 

In this course, a lot of attention will be paid to biased-coin problems. For instance, consider a black-and-white picture of a dark room: the probability of a pixel being dark (denoted 1) is larger than the probability of it being light (denoted 0). Or consider a digitized signal over a noisy channel: if the channel is fairly clean, the probability that the next sample is noise-free (denoted 0) is larger than the probability that it is noise (denoted 1).
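To make the biased source concrete, here is a minimal Python sketch (my illustration, not from the lecture; the bias p = 0.9 is an arbitrary choice) that samples a biased coin and checks that the empirical frequency of 1s approaches the true bias:

import random

p = 0.9          # probability of a dark pixel / "1" (arbitrary choice)
n = 100_000      # number of samples
ones = sum(1 for _ in range(n) if random.random() < p)
print(f"true P(1) = {p}, empirical frequency = {ones / n:.4f}")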

 

Types of data compression:

• Lossless source coding

• Rate-distortion theory (lossy coding), e.g. images: compress as much as possible within some allowed error
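A small illustration of why bias matters for compression (my own sketch, not from the lecture; zlib merely stands in for a generic lossless coder): the more biased the bit sequence, the smaller its compressed size.

import random
import zlib

def biased_bits(n, p):
    """n bits as a '0'/'1' text string with P('1') = p."""
    return ''.join('1' if random.random() < p else '0' for _ in range(n))

# Compare compressed sizes across biases; smaller means more compressible.
# (Bits are stored one per character here, so absolute ratios are not
# meaningful; the downward trend as p grows is the point.)
for p in (0.5, 0.9, 0.99):
    data = biased_bits(100_000, p).encode('ascii')
    print(f"p = {p}: {len(zlib.compress(data))} compressed bytes")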

 

Problem Set 1

A certain fair coin is tossed n times. Each toss is independent of the previous tosses. The tosses form a sequence of i.i.d. binary random variables.

Q1: (Worst-case "compression")

Background Information:

• Cumulative distribution function (c.d.f.) of X:

F(x) = P(X <= x)

• Independence:

Two random variables X and Y are independent if and only if

P(X <= x, Y <= y) = P(X <= x) P(Y <= y)   for all x, y.

Similar rules apply to the independence of multiple variables. It is worth noting that pairwise independence of several variables does not imply their joint independence (a counterexample is sketched after this list).

• i.i.d.: independent and identically distributed random variables.
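Here is the promised counterexample as a small Python sketch (my example, not from the lecture): with X, Y fair bits and Z = X XOR Y, every pair of variables is independent, yet the three together are not, since Z is determined by X and Y.

from itertools import product

# Joint distribution of (X, Y, Z) with X, Y fair bits and Z = X XOR Y:
# each (x, y) pair has probability 1/4.
joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}

def p1(idx):
    """Marginal probability that component idx equals 1."""
    return sum(p for outcome, p in joint.items() if outcome[idx] == 1)

# Pairwise independence: P(X=1, Z=1) = P(X=1) * P(Z=1) = 0.25.
p_xz = sum(p for (x, y, z), p in joint.items() if x == 1 and z == 1)
print(p_xz, p1(0) * p1(2))        # 0.25 0.25

# But not mutual independence: P(X=1, Y=1, Z=1) = 0, while the
# product of the three marginals is 0.125.
print(joint.get((1, 1, 1), 0.0), p1(0) * p1(1) * p1(2))   # 0.0 0.125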

 

(a) If the coin described above is tossed n times, how many distinct length-n sequences of coin tosses are possible?

        2^n (for example, 2^10 = 1024 for n = 10)

(b) Given N possible outcomes of a random variable, how many bits (elements of {0,1}) are required to assign a unique fixed-length bit-sequence to each possible outcome?

        ceil(log2(N)), the ceiling of log N to base 2
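As a sanity check, this quantity is easy to compute exactly with integer arithmetic (a small sketch of mine, avoiding floating-point log2 errors for large N):

def min_bits(n_outcomes):
    """Smallest L with 2**L >= n_outcomes, i.e. ceil(log2(N))."""
    return (n_outcomes - 1).bit_length()

print(min_bits(2))       # 1 bit for one coin toss
print(min_bits(2**10))   # 10 bits for ten coin tosses
print(min_bits(6))       # 3 bits for one die roll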

(c) Describe a method, based on binary trees, to assign a length-n bit-sequence to an outcome, and a corresponding method to reconstruct the coin-toss sequence from the bit-sequence.

        Algorithm:

Coin toss    <->    Bit
Head         <->    0
Tail         <->    1

        Property: easy to construct/recover, one-to-one. Each length-n toss sequence is a root-to-leaf path in a depth-n binary tree: branch to 0 for Head and to 1 for Tail, so encoding and decoding are inverse tree walks.
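A minimal Python sketch of this mapping (assuming Head -> 0, Tail -> 1 as in the table above; tosses are written as a string of 'H' and 'T' characters for illustration):

ENCODE = {'H': '0', 'T': '1'}
DECODE = {'0': 'H', '1': 'T'}

def encode(tosses):
    """Map a string of tosses like 'HTTH' to a bit-string like '0110'."""
    return ''.join(ENCODE[t] for t in tosses)

def decode(bits):
    """Invert encode(): recover the toss sequence from the bit-string."""
    return ''.join(DECODE[b] for b in bits)

seq = 'HTTHHT'
assert decode(encode(seq)) == seq
print(encode(seq))   # 011001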

(d) If the encoding alphabet is ternary (e.g. {0,1,2}) instead of binary, how does your answer to (b) change? How about your answer to (c), if ternary trees are allowed? What if you were tossing a six-sided die instead of a coin? How do your answers change? What is the time-complexity of your assignment schemes?

        With a ternary alphabet, the count in (b) becomes ceil(log3(N)); a die tossed n times has 6^n possible sequences. Time complexity: O(n); more details in the next class.
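A sketch of the generalized count (my own, under the assumption of an alphabet of k symbols): the number of symbols needed to label N outcomes is the smallest L with k^L >= N, computed below with an integer loop to sidestep floating-point log errors. Each step is a constant-time multiply, so labelling the 2^n (or 6^n) sequences of n tosses costs O(n), matching the answer above.

def min_symbols(n_outcomes, k):
    """Smallest L with k**L >= n_outcomes, i.e. ceil(log_k(N))."""
    length, capacity = 0, 1
    while capacity < n_outcomes:
        capacity *= k
        length += 1
    return length

print(min_symbols(2**10, 2))  # 10 bits for n = 10 coin tosses
print(min_symbols(2**10, 3))  # 7 ternary symbols for the same tosses
print(min_symbols(6**10, 3))  # ternary symbols for 10 die rolls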

"I don't get the names of things, I just know how often they happen. " [I don't know where I heard this, but seems like it belongs to somewhere in this note]

Homework: look up Stirling's approximation.
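For reference, Stirling's approximation is n! ~ sqrt(2*pi*n) * (n/e)^n. A quick numerical check (my own sketch, not part of the homework):

import math

for n in (5, 10, 50):
    exact = math.factorial(n)
    approx = math.sqrt(2 * math.pi * n) * (n / math.e) ** n
    print(f"n = {n}: ratio approx/exact = {approx / exact:.5f}")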
