Scribe-Notes-1-2


Complexity notation/Stirling's Approximation and Strong Typicality (Scribe: MAK Yan Kei)

 

 

I. Complexity of an algorithm

We use the following notation to measure the efficiency of an algorithm. Let the problem input have n bits, and let the number of steps that the algorithm takes to solve the problem be denoted $f(n)$. By a “step” here we mean some simple operation such as adding two fixed-length (say, 8-bit) integers. Adding two arbitrarily long (say, 1000-bit) integers, therefore, would not count as a single step.

 

Usually, we only care about the order-of-magnitude behaviour of $f(n)$ when n is large. To talk about the asymptotic complexity of an algorithm, we introduce the following notations (see, for example, the Wikipedia page on Big O notation):

1. $f(n) = O(g(n))$ (Big O): there exist constants $c > 0$ and $N$ such that $f(n) \le c\,g(n)$ for all $n \ge N$, i.e., f grows no faster than g.

2. $f(n) = \Omega(g(n))$ (Big Omega): there exist constants $c > 0$ and $N$ such that $f(n) \ge c\,g(n)$ for all $n \ge N$, i.e., f grows no slower than g.

3. $f(n) = \Theta(g(n))$ (Theta): $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$, i.e., f grows roughly at the same rate as g.

4. $f(n) = o(g(n))$ (Little o): for all $c > 0$, there exists $N$ such that $f(n) \le c\,g(n)$ for all $n \ge N$, i.e., f grows strictly slower than g.

5. $f(n) = \omega(g(n))$ (Little omega): for all $c > 0$, there exists $N$ such that $f(n) \ge c\,g(n)$ for all $n \ge N$, i.e., f grows strictly faster than g.

 

Take the byte-wise addition of two n-bit numbers as an example. We need roughly $n/4$ steps (adding byte by byte takes $n/8$ steps, and handling the "carry" bits takes another $n/8$ steps). Let $f(n)$ be the number of steps required. Then $f(n) = O(n)$, $f(n) = \Omega(n)$, and hence $f(n) = \Theta(n)$. Look at the comments to the first scribe notes for more examples.
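As a quick illustration (not part of the original notes), the following Python sketch adds two n-bit integers byte by byte and counts the elementary byte additions; the count grows linearly in n, consistent with $f(n) = \Theta(n)$. The function name and the exact step-counting convention are our own (here the carry is folded into each byte addition, which only changes the constant factor).

```python
# Hypothetical illustration: add two n-bit integers byte by byte and
# count the number of single-byte additions ("steps") performed.

def bytewise_add(a: int, b: int, n_bits: int) -> tuple[int, int]:
    """Add two n_bits-long integers one byte at a time.

    Returns (sum, steps), where each step is one 8-bit addition
    (including the carry from the previous byte).
    """
    n_bytes = (n_bits + 7) // 8
    carry, result, steps = 0, 0, 0
    for i in range(n_bytes):
        byte_a = (a >> (8 * i)) & 0xFF
        byte_b = (b >> (8 * i)) & 0xFF
        s = byte_a + byte_b + carry          # one fixed-length addition
        carry = s >> 8
        result |= (s & 0xFF) << (8 * i)
        steps += 1
    return result | (carry << (8 * n_bytes)), steps

if __name__ == "__main__":
    import random
    for n in (64, 512, 4096):
        a, b = random.getrandbits(n), random.getrandbits(n)
        total, steps = bytewise_add(a, b, n)
        assert total == a + b                # sanity check against Python's own addition
        print(f"n = {n:5d} bits -> {steps} steps (n/8 = {n // 8})")
```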

 

The use of such notations is of course not restricted to complexity analysis. We can think of this notation as giving a way to compare the growth rates of two functions. Note that these estimates need not hold when n is small.

 

 

II. Average-case vs. Worst-case Compression

 

In Lecture 1, we derived an encoding algorithm with optimal worst-case behaviour (i.e., the number of bits in the longest encoded sequence is minimized). In what follows we aim to find an algorithm that performs better on average.

 

Intuition: Consider a biased coin that gives Heads much more frequently than Tails. In this case, coin-toss sequences like TTTTTT are less likely to occur than those with many Heads, like HTHHHT. It seems common-sensical to assign the more common coin-toss sequences to shorter encoded bit-sequences, so as to minimize the average number of bits. However, it is still not clear what the optimal relationship between the "commonness" of a coin-toss sequence and its encoding length should be.

 

Situation

 

We have a biased coin, which comes up Heads with probability p (0 < p < 1) and Tails with probability 1-p. The result of each toss is an example of a Bernoulli(p) random variable. If we toss it n times, the probability of observing a particular coin-toss sequence with k Heads ($0 \le k \le n$) is $p^{k}(1-p)^{n-k}$.

 

Now define the type-class T(k,n) as the set of all sequences of n tosses with exactly k Heads. Since all coin-toss sequences with k Heads are equally likely to occur, and there are $\binom{n}{k}$ of them in total, we conclude that the size of T(k,n) is $|T(k,n)| = \binom{n}{k}$.
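For small n one can verify these claims by brute force. The following Python sketch (our own illustration; the variable names are arbitrary) enumerates all $2^n$ coin-toss sequences, groups them by their number of Heads, and checks that $|T(k,n)| = \binom{n}{k}$ and that each sequence in T(k,n) has probability $p^{k}(1-p)^{n-k}$:

```python
# Hypothetical check of the type-class counts and probabilities for small n.
from itertools import product
from math import comb

p, n = 0.7, 10

# Group all 2^n coin-toss sequences ('H'/'T') by their number of Heads.
type_class = {k: [] for k in range(n + 1)}
for seq in product("HT", repeat=n):
    type_class[seq.count("H")].append(seq)

for k in range(n + 1):
    size = len(type_class[k])
    prob_one_seq = p**k * (1 - p)**(n - k)   # every sequence in T(k,n) has this probability
    assert size == comb(n, k)
    print(f"k = {k:2d}: |T(k,n)| = {size:4d} = C({n},{k}), "
          f"P(one sequence) = {prob_one_seq:.6f}, "
          f"P(T(k,n)) = {size * prob_one_seq:.6f}")

# The type-class probabilities sum to 1.
assert abs(sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)) - 1) < 1e-12
```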

 

Binary entropy function and Stirling’s approximation

 

We define the binary entropy function as follows: $H(q) = -q\log q - (1-q)\log(1-q)$ for $0 \le q \le 1$, with the convention $0\log 0 = 0$ (all logarithms here and henceforth are binary unless otherwise specified).

We also need Stirling’s approximation for factorials: $n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}$ (this approximation holds in the sense that the ratio of the left- and right-hand sides tends to 1 for large n).

With these, we deduce the approximation $\binom{n}{k} \approx \dfrac{2^{nH(k/n)}}{\sqrt{2\pi n \frac{k}{n}\left(1-\frac{k}{n}\right)}}$ as follows.

Note that the RHS satisfies

$\dfrac{2^{nH(k/n)}}{\sqrt{2\pi n \frac{k}{n}\left(1-\frac{k}{n}\right)}}$

$= \dfrac{1}{\sqrt{2\pi n \frac{k}{n}\left(1-\frac{k}{n}\right)}}\; 2^{\,n\left[-\frac{k}{n}\log\frac{k}{n} - \left(1-\frac{k}{n}\right)\log\left(1-\frac{k}{n}\right)\right]}$

$= \dfrac{1}{\sqrt{2\pi n \frac{k}{n}\left(1-\frac{k}{n}\right)}}\; 2^{-k\log\frac{k}{n} - (n-k)\log\frac{n-k}{n}}$

$= \dfrac{1}{\sqrt{2\pi n \frac{k}{n}\left(1-\frac{k}{n}\right)}} \left(\frac{n}{k}\right)^{k}\left(\frac{n}{n-k}\right)^{n-k}$

$= \dfrac{1}{\sqrt{2\pi n \frac{k}{n}\left(1-\frac{k}{n}\right)}}\; \dfrac{n^{n}}{k^{k}(n-k)^{n-k}}.$

Now, the LHS satisfies

$\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$

$\approx \dfrac{\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}}{\sqrt{2\pi k}\left(\frac{k}{e}\right)^{k}\,\sqrt{2\pi (n-k)}\left(\frac{n-k}{e}\right)^{n-k}}$

$= \sqrt{\dfrac{n}{2\pi k (n-k)}}\; \dfrac{n^{n}}{k^{k}(n-k)^{n-k}}$

$= \dfrac{1}{\sqrt{2\pi n \frac{k}{n}\left(1-\frac{k}{n}\right)}}\; \dfrac{n^{n}}{k^{k}(n-k)^{n-k}}.$

Hence, $\binom{n}{k} \approx \dfrac{2^{nH(k/n)}}{\sqrt{2\pi n \frac{k}{n}\left(1-\frac{k}{n}\right)}}$; in particular, up to a factor polynomial in n, $\binom{n}{k} \approx 2^{nH(k/n)}$. Again, the approximation holds in the sense that for large n the ratio of the LHS and RHS tends to 1 (provided k and n-k both grow with n).
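A quick numerical sanity check of this approximation (our own sketch, based on the formulas as written above; it is not part of the original notes) compares $\binom{n}{k}$ with the right-hand side. The computation is done in log space to avoid overflowing floating-point numbers:

```python
# Hypothetical numerical check: the ratio  C(n,k) / [ 2^{nH(k/n)} / sqrt(2*pi*n*(k/n)(1-k/n)) ]
# tends to 1 as n grows.
from math import lgamma, log2, log, pi

def H(q: float) -> float:
    """Binary entropy function (in bits)."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)

def log2_binom(n: int, k: int) -> float:
    """log2 of the binomial coefficient, via log-gamma to avoid huge integers."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)) / log(2)

p = 0.3
for n in (10, 100, 1000, 100000):
    k = round(n * p)
    q = k / n
    log2_rhs = n * H(q) - 0.5 * log2(2 * pi * n * q * (1 - q))
    ratio = 2 ** (log2_binom(n, k) - log2_rhs)   # safe: the exponent is close to 0
    print(f"n = {n:6d}: C(n,k) / approximation = {ratio:.6f}")
```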

 

 

 

 

 

 

When the probability is maximized

 

With the approximation above, we have:

$\Pr[T(k,n)] = \binom{n}{k}\, p^{k}(1-p)^{n-k}$

$\approx 2^{nH(k/n)}\, p^{k}(1-p)^{n-k}$ (dropping the polynomial factor, which does not affect the exponential behaviour)

$= 2^{nH(k/n)}\; 2^{\,k\log p + (n-k)\log(1-p)}$

$= 2^{\,n\left[H(k/n) + \frac{k}{n}\log p + \left(1-\frac{k}{n}\right)\log(1-p)\right]}$

$= 2^{\,-n\left[\frac{k}{n}\log\frac{k/n}{p} + \left(1-\frac{k}{n}\right)\log\frac{1-k/n}{1-p}\right]}.$

 

Denote $D\!\left(\frac{k}{n}\,\middle\|\,p\right) = \frac{k}{n}\log\frac{k/n}{p} + \left(1-\frac{k}{n}\right)\log\frac{1-k/n}{1-p}$ (this is the Kullback-Leibler divergence between a Bernoulli(k/n) and a Bernoulli(p) distribution).

Then, the above expression can be written simply as: $\Pr[T(k,n)] \approx 2^{-nD(k/n\|p)}$.

 

Recall that $\Pr[T(k,n)]$ is a probability, so it is at most 1. Note that $2^{-nD(k/n\|p)} = 1$ if $D\!\left(\frac{k}{n}\,\middle\|\,p\right) = 0$, and this happens if $\frac{k}{n} = p$, i.e. $k = np$ (we examine the properties of the $D(\cdot\|\cdot)$ function in future classes -- for now, suffice it to say that it equals 0 if and only if k = np). Hence, the probability is “maximized” if $k = np$ (assuming this is an integer; if not, we round to the nearest integer), which is the expected number of Heads. This is not surprising, since $np$ Heads is intuitively the most likely outcome. Of course, $\Pr[T(np,n)]$ is not exactly equal to 1 (since our calculations involved order-of-magnitude estimates); indeed it is very possible that we get slightly different results (e.g. somewhat more or fewer than np Heads) in our experiment.
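The following sketch (again our own illustration, not from the lecture) computes the exact type-class probabilities for a moderate n, checks that the most likely type class is the one with $k \approx np$, and compares the empirical exponent $-\frac{1}{n}\log \Pr[T(k,n)]$ with $D(k/n\|p)$:

```python
# Hypothetical illustration: exact type-class probabilities vs. the 2^{-n D(k/n || p)} estimate.
from math import comb, log2

def D(q: float, p: float) -> float:
    """Kullback-Leibler divergence D(Bernoulli(q) || Bernoulli(p)) in bits."""
    total = 0.0
    if q > 0:
        total += q * log2(q / p)
    if q < 1:
        total += (1 - q) * log2((1 - q) / (1 - p))
    return total

p, n = 0.7, 200
probs = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

k_star = max(probs, key=probs.get)
print(f"most likely type class: k = {k_star}, np = {n * p}")   # k_star should be near np

for k in (100, 120, 140, 160, 180):
    exact_exponent = -log2(probs[k]) / n
    print(f"k = {k}: -1/n log2 P(T(k,n)) = {exact_exponent:.4f}, D(k/n || p) = {D(k / n, p):.4f}")
```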

 

Probability of $\epsilon$-strongly atypical sequences

We say that a type-class T(k,n) is $\epsilon$-strongly atypical if $\left|\frac{k}{n} - p\right| > \epsilon$.

To proceed, we will need a result that will be proved later (PS2):

 

Bound on $D(\cdot\|\cdot)$: Suppose $\left|\frac{k}{n} - p\right| > \epsilon$; then $D\!\left(\frac{k}{n}\,\middle\|\,p\right) > c\,\epsilon^{2}$ for some constant $c > 0$ (independent of n and k).

From this, we get $\Pr[T(k,n)] \approx 2^{-nD(k/n\|p)} \le 2^{-nc\epsilon^{2}}$.

The last inequality holds since $D\!\left(\frac{k}{n}\,\middle\|\,p\right) > c\,\epsilon^{2}$ for all atypical type-classes.

 

Next, we use the union bound:

For events $A_1, A_2, \ldots, A_m$: $\Pr[A_1 \cup A_2 \cup \cdots \cup A_m] \le \Pr[A_1] + \Pr[A_2] + \cdots + \Pr[A_m]$.

 

The type classes range from T(0,n) to T(n,n), so there are n+1 of them in total. There are at least $2\epsilon n - 1$ integers (and at most $2\epsilon n + 1$) in the range $[n(p-\epsilon),\, n(p+\epsilon)]$, i.e., at least $2\epsilon n - 1$ of the type-classes are typical. Hence, applying the union bound over the atypical type-classes, we get the following upper bound:

$\Pr[\text{the observed sequence is } \epsilon\text{-strongly atypical}] \le \sum_{k:\ |k/n - p| > \epsilon} \Pr[T(k,n)]$

$\le \left(n + 1 - (2\epsilon n - 1)\right) 2^{-nc\epsilon^{2}}$

$\le (n+1)\, 2^{-nc\epsilon^{2}},$

which decays to 0 exponentially fast in n.
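As a numerical sanity check (our own illustration, not from the notes; it sidesteps the unspecified constant c by using the smallest divergence among the atypical type-classes directly), one can compare the exact probability of observing an $\epsilon$-strongly atypical sequence with a union-bound-style estimate of the form $(n+1)\,2^{-n \min D}$:

```python
# Hypothetical illustration: the exact probability of an epsilon-strongly atypical
# sequence vs. the (n+1) * 2^{-n * min D} style union bound used above.
from math import lgamma, log, log2, exp

def log_prob_type_class(n: int, k: int, p: float) -> float:
    """Natural log of P(T(k,n)) = C(n,k) p^k (1-p)^(n-k), computed in log space."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def D(q: float, p: float) -> float:
    """Kullback-Leibler divergence D(Bernoulli(q) || Bernoulli(p)) in bits."""
    total = 0.0
    if q > 0:
        total += q * log2(q / p)
    if q < 1:
        total += (1 - q) * log2((1 - q) / (1 - p))
    return total

p, eps = 0.7, 0.05
for n in (100, 500, 2000):
    atypical = [k for k in range(n + 1) if abs(k / n - p) > eps]
    exact = sum(exp(log_prob_type_class(n, k, p)) for k in atypical)
    d_min = min(D(k / n, p) for k in atypical)
    bound = (n + 1) * 2 ** (-n * d_min)          # union bound over at most n+1 type classes
    print(f"n = {n:5d}: P(atypical) = {exact:.3e}, (n+1)*2^(-n*min D) = {bound:.3e}")
```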

 

This should help us find an encoding scheme with better average-case performance (i.e., a shorter expected encoding length), as we shall discuss in the next class.

 

 

Comments (12)

Mak said

at 2:05 am on Jan 16, 2009

Whew... it is really troublesome typing all those formulae...... Let me have some sleep now, will proofread again in tomorrow morning. :- )

sidjaggi said

at 8:25 am on Jan 16, 2009

I did warn you to get an earlier start :)

Very good job for a first draft... let's see the comments as they drift in...

Lingxiao XIA said

at 9:53 am on Jan 16, 2009

that's really impressive, look at all the formulae...i feel sorry for you man

sidjaggi said

at 11:54 am on Jan 16, 2009

Heh -- and nobody believed me when I said that writing the first scribe notes would be easiest :)

sidjaggi said

at 8:43 pm on Jan 18, 2009

Hi Mak -- some comments.

1. Feel free to hyperlink to web resources that might help clarify terms/notation/etc (wikipedia is fine. see for example the edited version of Scribe notes 1).
2. You were right about the mistake in the definition of the K-L divergence. Look up the correct definition from a reference, and fix.
3. Your last statement is "note that what we write above is not very rigorous. Quite a number of the steps need more careful justification." In what senses are the steps not rigorous? If this statement (of non-rigour) is true, understanding the cases when the equations we derive are not correct gives more intuition -- on the other hand, if you can actually make them more rigorous, then you've made a valuable contribution. Either way, a bit more work would make the notes much more valuable...

sidjaggi said

at 8:44 pm on Jan 18, 2009

Oh, also, do try to make the formatting sort of consistent with the formatting of the previous scribe notes...

Silas said

at 4:06 pm on Jan 19, 2009

The statement "There are at most 2(\epsilon)n integers" should be replaced by "There are at least 2(\epsilon)n - 1 integers" when calculating an upper bound for atypical sequences because we need to calculate a lower bound for the number of typical sequences in order to find an upper bound for atypical sequences.

Mak said

at 5:53 pm on Jan 19, 2009

Hi,

By "not very rigorous", I just recall that you said in class that the computations are "not done very carefully".

I'm revising the notes now.

Mak said

at 6:50 pm on Jan 19, 2009

Hi Silas,

You're right. I meant to write "There are _approximately_ 2\epsilon n integers..." there, but....
Thanks for pointing out the mistake!

sidjaggi said

at 12:20 am on Jan 23, 2009

If you're done editing, Mak, and commenting, the rest of you guys, I'll edit now?
I'll wait for an hour or two for responses -- if none, I'll jump in.
(I'm traveling tomorrow morning, so either I edit tonight, or after Wed).

sidjaggi said

at 12:32 am on Feb 2, 2009

Alright Mak, done with my comments -- let me know if you disagree with anything.

sidjaggi said

at 12:34 am on Feb 2, 2009

excellent job, by the way :)
