# Scribe Notes 1

Saved on January 11, 2009 at 2:50:17 pm

*UNFINISHED*

There are many ways of talking about information, but at its core it concerns new things that cannot be predicted, i.e., what we do not know about the world. Claude Shannon, the father of information theory, once said that information is the resolution of uncertainty. It involves statistical properties and probability. For example, what will happen when you toss a coin many times?

In this course, a lot of attention will be paid to biased-coin problems. For instance, consider a black-and-white picture of a dark room: the probability of a dot being dark, denoted by 1, is larger than the probability of the dot being light, denoted by 0. Or consider a digitized signal sent over a noisy channel: if the channel is fairly clean, the probability that the next symbol contains no noise, denoted by 0, is larger than the probability that it is noise, denoted by 1.

Types of data compression:

• Lossless source coding

• Rate-distortion theory, e.g. images (compress as much as possible within some allowed error)

Problem Set 1

A certain fair coin is tossed n times. Each time it is tossed, the result is independent of all previous tosses. The tosses are an example of a sequence of i.i.d. binary random variables.

Q1: (Worst-case "compression")

Background Information:

• Cumulative distribution function (c.d.f.) of a random variable X:

F(x) = Pr(X <= x)

• Independence:

Two random variables X and Y are independent if and only if, for all x and y,

Pr(X <= x, Y <= y) = Pr(X <= x) Pr(Y <= y)

A similar rule applies to more than two variables: the joint c.d.f. must factor into the product of the individual c.d.f.s. Note that pairwise independence does not imply that all the variables are mutually independent.
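A standard illustration of this gap, sketched here in Python: let X and Y be independent fair bits and let Z = X XOR Y. Every pair of variables is independent, but the three together are not. The example, the variable names and the helper `prob` are illustrative assumptions, not taken from the lecture. The check uses point probabilities rather than c.d.f.s, which is equivalent for discrete variables.

```python
from fractions import Fraction
from itertools import product

# Sample space: X and Y are independent fair bits, Z = X XOR Y.
# Each of the four (x, y) pairs has probability 1/4.
outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]
weight = Fraction(1, len(outcomes))

def prob(event):
    """Probability that the predicate `event` holds on an outcome (x, y, z)."""
    return sum(weight for o in outcomes if event(o))

def pairwise_independent():
    # Check P(A=a, B=b) == P(A=a) P(B=b) for every pair of coordinates.
    for i, j in ((0, 1), (0, 2), (1, 2)):
        for a, b in product((0, 1), repeat=2):
            joint = prob(lambda o: o[i] == a and o[j] == b)
            if joint != prob(lambda o: o[i] == a) * prob(lambda o: o[j] == b):
                return False
    return True

def mutually_independent():
    # Check P(X=x, Y=y, Z=z) == P(X=x) P(Y=y) P(Z=z) for all triples.
    for x, y, z in product((0, 1), repeat=3):
        joint = prob(lambda o: o == (x, y, z))
        marginals = (prob(lambda o: o[0] == x)
                     * prob(lambda o: o[1] == y)
                     * prob(lambda o: o[2] == z))
        if joint != marginals:
            return False
    return True

print(pairwise_independent())   # True
print(mutually_independent())   # False
```

For instance, Pr(X=0, Y=0, Z=0) = 1/4, whereas the product of the three marginals is 1/8.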

• i.i.d.: independent and identically distributed.

(a) If the coin described above is tossed n times, how many distinct length-n sequences of coin tosses are possible?

2^n (for example, 2^10 = 1024 when n = 10)
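Each toss doubles the number of possible sequences, so the count can be checked by brute force for small n. A minimal Python sketch (the helper name `toss_sequences` is illustrative):

```python
from itertools import product

def toss_sequences(n):
    """Return all distinct length-n sequences of heads ('H') and tails ('T')."""
    return ["".join(seq) for seq in product("HT", repeat=n)]

# The number of distinct sequences matches 2**n for small n.
for n in range(1, 5):
    seqs = toss_sequences(n)
    assert len(seqs) == len(set(seqs)) == 2 ** n

print(toss_sequences(2))        # ['HH', 'HT', 'TH', 'TT']
print(len(toss_sequences(10)))  # 1024
```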

(b) Given N possible outcomes of a random variable, how many bits (elements from {0,1}) are required to assign a unique length-N bit-sequence to each possible outcome?

ceil(log2(N)), i.e. the ceiling of the base-2 logarithm of N
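As a quick sanity check, the smallest number of bits k with 2^k >= N can be computed as follows. This is a sketch; the function name `bits_needed` is illustrative, and integer arithmetic is used to avoid floating-point rounding.

```python
import math

def bits_needed(num_outcomes):
    """Smallest k with 2**k >= num_outcomes, i.e. ceil(log2(num_outcomes))."""
    if num_outcomes < 1:
        raise ValueError("need at least one outcome")
    # (N - 1).bit_length() equals ceil(log2(N)) for every integer N >= 1.
    return (num_outcomes - 1).bit_length()

# Cross-check against the floating-point formula on small values.
for N in range(1, 1026):
    assert bits_needed(N) == math.ceil(math.log2(N))

print(bits_needed(5))     # 3
print(bits_needed(8))     # 3
print(bits_needed(1024))  # 10
```

With N = 2^n coin-toss sequences this gives exactly n bits, one per toss.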

(c) Describe a method, based on a binary tree, to assign a length-N bit-sequence to an outcome, and a corresponding method to reconstruct the coin-toss sequence from the bit-sequence.

Algorithm:

Coin tossing <-> Bit sequence
Head <-> 0
Tail <-> 1

Properties: easy to construct and to recover; the mapping is one-to-one.
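One way to read this mapping as a binary tree, sketched below in Python: each toss is one level of a depth-n tree, with a head taking the left branch (0) and a tail the right branch (1), so that every leaf corresponds to exactly one toss sequence. The symbols 'H' and 'T' and the helper names are illustrative assumptions, not part of the problem set.

```python
HEAD, TAIL = "H", "T"            # illustrative symbols for the two faces
TO_BIT = {HEAD: "0", TAIL: "1"}
TO_FACE = {bit: face for face, bit in TO_BIT.items()}

def encode(tosses):
    """Map a toss sequence such as 'HTTH' to its bit-sequence '0110'."""
    return "".join(TO_BIT[t] for t in tosses)

def decode(bits):
    """Recover the toss sequence from a bit-sequence."""
    return "".join(TO_FACE[b] for b in bits)

def leaf_index(bits):
    """Index (left to right, starting at 0) of the leaf reached in the tree."""
    index = 0
    for b in bits:
        index = 2 * index + int(b)   # left child on 0, right child on 1
    return index

tosses = "HTTH"
bits = encode(tosses)
print(bits)               # 0110
print(leaf_index(bits))   # 6
print(decode(bits))       # HTTH
```

Decoding inverts encoding symbol by symbol, which is why the mapping is one-to-one and uses exactly n bits for n tosses.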