*ALMOST FINISHED*
Claude Shannon, the father of Information Theory, once said that information is the resolution of uncertainty. Things that cannot be predicted usually draw the most of our attention, i.e. what we don’t know about the world, and information would be an answer. It involves statistic property & probability. For example, what will happen when you toss a coin many times?
In this course, a lot attention would be paid to biased coin problems. For instance, consider a black and white picture of a dark room, the probability of a dot being dark, denoted by 1, would be larger than the probability of the dot being light, denoted by 0. Or consider a digitalized signal for noise, the probability of next signal contains no noise, denoted by 0, would be larger than the probability of the next signal to be noise, denoted by 1, when the channel is rather clear.
Types of data compression:
l lossless source coding
l Rate-distortion theory. e.g. images (within some error, compress as much as one can)
Problem Set 1
A certain fair coin is tossed n times. Each time it is tossed, the result is independent of the previous toss. This is an example of an i.i.d. binary random variable.
Q1: (Worst-case "compression")
Background Information:
l Cumulative distribution function (c.d.f.) of x:
F(x) = P(x <= x)
l Independence:
Two random variables X and Y are independent if and only if
P(X <= x, Y <= y) = P(X <= x)P(Y <= y)
Similar rules also apply to the independence of multiple variables. It is interesting to note here that the independence of each two of the multiple variables doesn’t lead to the independence of all variables.
l i.i.d. identical & independently distributed random variables.
(a) If the coin described above is tossed n times, how many distinct length-n sequences of coin tosses are possible?
2^10
(b) Given N possible outcomes of a random variable, how many bits (elements from {0,1}) are required to assign a unique length-N bit-sequence to each possible outcome?
Ceiling function of log(N) with base 2
(c) Describe a method based on binary trees, to assign a length-N bit-sequence to an outcome, based on a binary tree, and a corresponding method to reconstruct the coin-toss sequence from the bit-sequence.
Algorithm:
Coin tossing
|
<->
|
Bit sequence
|
Head
|
<->
|
0
|
Tail
|
<->
|
1
|
Property: easy to construct/recover, one-to-one
(d) If the encoding alphabet is ternary (eg: {0,1,2}) instead of binary, how does your answer to (b) change? How about your answer to (c), if ternary trees are allowed? What if you were tossing a six-sided dice instead of a coin? How do your answers change? What is the time-complexity of your assignment schemes?
Time complexity: O(n), more details in the next class.....
"I don't get the names of things, I just know how often they happen. " [I don't know where I heard this, but seems like it belongs to somewhere in this note]
Homework: check out Stirling's Approx.
Comments (0)
You don't have permission to comment on this page.