If you are citizen of an European Union member nation, you may not use this service unless you are at least 16 years old.
You already know Dokkio is an AI-powered assistant to organize & manage your digital files & messages. Very soon, Dokkio will support Outlook as well as One Drive. Check it out today!

Scribe_Note_1

Page history last edited by WANG Limin 12 years, 6 months ago

Introduction to Coding Problem and Part of Problem Set 1(Scribe: WANG Limin)

I. Introduction to Coding Problem

The main problem of information theory can be described as follows: Imagine a stream of source data, say in the form of bits (0 or 1), is being transmitted over a communication channel. In the source, we want to encode the source message to make it more efficient and robust for transmission. Meanwhile, receiver will hope to decode the message and get meaning of the message(sometimes we may not recover the source message totally due to the noise in the channel).

See the next figure for an example, our source send a message denoted as Formula where the alphabet is Formula , then the encoder encodes the message as codeword Formula where the alphabet is Formula , finally the decoder receives the codeword and recovers the message as Formula where the alphabet is Formula . Encoder is usually a function that assigns a codeword to each source message and decoder attempts to correct any errors in transmission and then recovers the original message. Usually the random variable Formula form a first order Markov Chain, i.e. Formula

Ideally, we wish to decode correctly and get the source message, i.e. Formula . However, in fact there is always noise during the transmission (e.g. bit 0 has probability p to be turned into 1 and vice-versa) and we can not guarantee the receiver recovers the message totally correct. In information theory, we usually consider the error probability Formula less than some threshold Formula ,i.e. Formula . In addition to robustness, we need to take the efficiency into account and want the length of our code is smaller than the source message, i.e. Formula . We usually define the rate Formula . When Formula , the error Formula may converge to Formula and the rate Formula can reach to best rate Formula , see the next figure for illustration.

There is a common sense that encoding source information, by adding additional information, sometimes referred to as redundancy, that can be used to detect and perhaps correct errors in transmission. The more redundancy that we add to the original source code. the more reliably we can detect and correct errors, but the less efficient we become at transmitting the source data. Thus, we need to keep a balance between efficiency and robustness.

II. Part of Problem Set 1

A certain fair coin is tossed n times. Each time it is tossed, the result is independent of the previous toss. This is an example of an i.i.d. binary random variable.

Q1: (Worst-case "compression")

(a) If the coin described above is tossed n times, how many distinct length-n sequences of coin tosses are possible?

The answer is Formula

(b) Given N possible outcomes of a random variable, how many bits (elements from {0,1}) are required to assign a unique length-N bit-sequence to each possible outcome?

The answer is Formula ,see the next figure for an example, we have 8 possible out comes and we need 3 bit to assign a unique length-3 bit sequence to each possible outcome. Generally, for a full binary tree with N leaves, the height is Formula

(c) Describe a method based on binary trees, to assign a length-N bit-sequence to an outcome, based on a binary tree, and a corresponding method to reconstruct the coin-toss sequence from the bit-sequence.

Code: if coin is tossed head, the output bit is 0; if coin is tossed tail the output bit is 1.

Decode: if the bit is 0, the tossed result is head; if the bit is 1, the tossed result is tail.

Since the coin is fair, i.e the probability of head is 0.5, we can use a full binary tree for our coding and decoding scheme, see the next figure for a example. If we coin is head, we then go left and the bit is 0. If coin is tail, we then go right and the bit is 1. The same process is for decoding.

(d) If the encoding alphabet is ternary (eg: {0,1,2}) instead of binary, how does your answer to (b) change? How about your answer to (c), if ternary trees are allowed? What if you were tossing a six-sided dice instead of a coin? How do your answers change? What is the time-complexity of your assignment schemes?

The answer is Formula for problem (b). We can think of ternary trees with N leaves, then the average of tree height is Formula , see the next figure for a example with 9 leaves, we only need 2 ternary bits.

Coding idea:

Step1: We can according the coding scheme of problem (d) to map the tossed result into a n-length bit, the time complex is Formula , e.g. HTHHT coded as 01001.

Step2: We need translate the binary code into ternary code, the time complexity is Formula , e.g. (01001)₂=(100)₃

Thus the total time complexity is Formula .

Decoding idea:

The reverse order of coding scheme and the time complexity is Formula .

Note: n is the length of bit sequence.

If I toss a six-sided dice, the answer is similar to a coin.

A second coin is biased with probability p, i.e., it comes up Heads with probability p and Tails with probability 1-p. Each time it is tossed, the result is independent of the previous toss. This is an example of an i.i.d. binary random variable, also called a Bernoulli(p) random variable.

Q2: (Average-case compression)

(b) What is the probability of observing a particular coin-toss sequence with Formula Heads?

The answer is Formula

(c) What is the size of the type-class T(k,n) of all coin-toss sequences with exactly k Heads? What is therefore the probability of observing the type-class T(k,n)?

The size of the type-class T(k,n) is Formula and thus the probability of observing the type class T(k,n) is Formula

The other questions are in the next scribe and thanks for your reading.

Comments (8)

Hi, regarding the second figure showing the relationship between Rn and n, why the Rn decreases as n increases while Rn = n / m?
Jacky

I think the second figure show our coding goal. We hope that Rn and en decreases when n increase. The reason may be related to the coding scheme. I do not know the details either.

I think the left part and right part of figure2 illustrates difference scenes. For the right part, it's easier to understand that, when m is a constant number, if we increase code length, we could reach lower error rate. As when n goes infinite, we can repeat the message for infinite times, to guarantee it could be correctly received. For the left part, I think it illustrates the scene that, the CODING ALGORITHM is defined, and m appears to be a function of n, say f(n), thus the rate differs with different n(more precisely, n be a function of m, but this is invertible).

Here's an example: consider a parity checking scheme, which will append one more bit to the message. Thus, n = m + 1, Rn = n / (n - 1) . The series of Rn appears to be, 2/1, 3/2, ..., 1(at infinite, this is Rbest)

Of course, this scheme can't help to correct fault messages.

That's my understanding of the graph...

Actually, even though m is a function of n, you cannot guarantee that Rn is a decreasing function~ I mean, the figure may not be true for some CODING ALGORITHMs.

Good comments, all!

First of all, let's clarify the meaning of rate for compression. Before that, let's even clarify the meaning of "code".

Our encoding and decoding functions are denoted respectively by the functions f and g. That is, x^n = f(u^m), and u'^m = g(x^n) = g(f(u^m)). The sequence u^m is a random sequence due to the fact that U is a random variable. Because of this x^n and u'^m may also be random variables (includnig their lengths).
Also, in general, we allow randomization in our definitions of functions, so, for instance, x^n may not be a deterministic function of u^m.

Together, (f,g) denotes the "code".

In general, the code may be "good", or "bad". Let's call a code ``\epsilon_m-good" if the (average) probability of error (defined as \sum_{u^m} \Pr(u^m) l(u'^m \neq u^m) is less than some \epsilon_m.

Also, the "rate" R_m should be defined as 1/m \sum_{u^m} \Pr(u^m) l(x^n). (Note that I denote it as a function of m rather than n.)

Here l(x^n) is defined as the length of the codeword x^n (might be variable length, not necessarily exactly n), and \Pr(u^m) is the probability of observing message u^m. Hence \sum_{u^m} \Pr(u^m) l(x^n) denotes the average length of the codewords, and when we further divide by m, this gives the "amortized" average length, i.e., the average length of encoding per input symbol. This average length is, in general, of course a function of the encoding-decoding scheme.

Now let's see what the goals are.
We want that for increasing m, that there exist an infinite sequence of \epsilon_m-good codes (f,g) (one for each m) so that, simultaneously, R_m converges to R_best, and \epsilon_m converges to zero.

It is true that it is not clear that this is possible (for one thing, it depends on R_best), but that is one of the goals of this class.

Hope that clarifies definitions -- we'll also repeat this in class!

Good job with the scribe notes, btw! I like the figure!

Actually, for the answers you gave, would you mind explaining a little bit more, so that future readers will be able to understand a bit better? For instance, for 1(b), your answer is
"The answer is \lceil \log_2 N \rciel"

I would have explained, perhaps, with the help of a figure of a tree, showing why this is the case. Similarly for 1(c) the tree figure would help! For 1(d), doing a better job describing the algorithm for completeness sake (even if the next scribe also does so) would be nice. And so on.

Think of these notes as being for posterity -- future generations of information theorists should look at these notes and be like "WOWWWWW!!!!" ;)

Thanks for your advice for the scribe note and I will modify my scribe note and add some details

You don't have permission to comment on this page.

To join this workspace, request access.

Already have an account? Log in!

Loading…

Scribe_Note_1

Scribe_Note_1

Page Tools

Insert links

Comments (8)

ZhanLei_Jacky said

WANG Limin said

Pili HU said

ZhanLei_Jacky said

sidjaggi said

sidjaggi said

sidjaggi said

WANG Limin said

Join this workspace

Navigator

SideBar

Related Links:

Recent Activity