Mathematics 191, Honors Seminar:
Information Theory.

MWF 1-1:50, MS 5127
Dimitri Shlyakhtenko, MS 7901,
Office hours: by appointment

Course description.

Textbook: Robert B. Ash, Information Theory, Dover Publ., New York, 1990 (available in the bookstore).

This course will be an introduction to classical information theory, which is generally regarded as having been founded by Shannon in 1948. We will deal with the notions of data encoding and decoding; transmission of data over a communication channel; error correction; capacity of a communication channel; and data compression. One of the principle ideas is that of a measure of information, also known as entropy. Thus we would cover, at the very least, the first four chapters of Information Theory.

Depending on the interests of the audience and available time, we may get into somewhat more advanced topics. Such possibilities include: further discussion of error correcting codes; Boltzmann's formula from statistical mechanics and entropy; dynamical entropy; and quantum information theory in the context of quantum communication.

At least for the bulk of the course, the prerequisites are rather minor: students would be expected to know some linear algebra (115A) and very basic probability theory, which will be reviewed at the beginning of the course (we will only deal with random variables having a finite number of possible values). Students are also expected to have some mathematical maturity and familiarity with proofs (on the level of ones encountered in the linear algebra 115A and analysis 131A).

Grading will be discussed at the first class meeting. The plan is to have regular homework, and a final exam.


Due dateAssignment
Mon, Oct. 6 pp. 24-26: 1.1; 1.2; 1.4; 1.6; 1.7*; 1.11*;
Mon, Oct. 13 p. 45: 2.3; 2.4.
Problem A: given the letter frequencies in the English language found on Wikipedia, construct a Huffman code. Note: a related code, but one in which also space is represented, can be found also on Wikipedia.
Problem B: Assume that your are given 20 coins, one of which is a fake (it weighs less than the others). You have a pair of balance scales, on which you can weigh two groups of coins and decide whether a given group is lighter, heavier, or of the same weight as another. What is the minimal number of times that coins must be weighted to find a fake coin? Prove your answer. Do the same if you start with N coins to begin with.
Solution. Number the coins 1 ... N, and let X = the number of the coin that is fake. Assume that you can determine the value of X in at most S steps. Then you can encode the value of X by means of recording answers to the weightings, producing a trinary number with at most S digits. Let us now choose the fake coin according to some probability distribution (so that X now becomes a random variable). Then our encoding scheme produces at most S bits all the time, and so the average wordlength for the code is at most S. From the theorem in the book (2.5.2), we conclude that H(X)/log(3) ≤ S. Now H(X) is maximal (=log N) if X is uniformly distributed, so we conclude that S ≥ log N/log(3) Conversely, you can just write down the scheme directly: split the coins into three groups, A1, A2, A3 so that A1 and A2 have equal number of coins, and each has approximately 1/3 of the total number of coins. Put A1 and A2 on the scales. With one weighting you can tell if the coin is in A1 (if A2 is heavier), A2 (if A1 is heavier) or A3 (if A1 and A2 are the same). But then you repeat the scheme using whichever group the coin is in; you get your answer when the final group has ≤ 3 coins. This clearly requires less than logN/log3 + 1 steps. Thus the answer is logN/log3.
Wed, Oct. 22nd pp. 84-85: 3.2, 3.3, 3.4, 3.5, 3.8;
Problem C: Assume that your are given 20 coins, two of which are fakes (a fake coin weighs less than the others). You have a pair of balance scales, on which you can weigh two groups of coins and decide whether a given group is lighter, heavier, or of the same weight as another. What is the minimal number of times that coins must be weighted to find both fake coins? Prove your answer. Do the same if you start with N coins to begin with.
Mon, Nov 3rd pp. 85-86: 3.7, 3.9, 3.12, 3.13.
Problem D: Assume that you are given 10 sacks of coins (each sack contains M coins, and M is very large). In each sack, either all coins are fake or all are genuine. You know that a fake coin has weight 9 grams and a real coin has weight 10 grams. You have a digital scale that tells you the actual weight in grams. (a) Show that you can determine which coin is fake by putting some coins on the scale and reading off the resulting weight once. (b) Explain why there is no contradiction with coding theory.
Solution. Weigh simultaneously: 1 coin from bag 1; 10 coins from bag 2; 100 coins from bag 3 and so on (10k coins from bag k). If all coins were genuine and you had n bags, you would get 10g + 100g + 1000g + ... + 10n+1, so the total weight would have been the number 11....110 (as many 1's as you have bags, n+1 digits total). But if the k-th bag has fake coins in it, these coins would contribute 9×10k rather than 10×10k, so that the k-th digit would be zero, indicating which bag has fake coins (this works for all bags, except the first; if the first bag has fake coins in it, the least significant digit would be 9 rather than 0; you could avoid this exception if you were to add a 1g weight to the scales together with all the coins). There is no contradition with information theory, becase the output of your experiment is a number with as many digits as you want, so there is no limit to the number of scenarios (locations of bag with fake coins) that the outcome of the experiment can represent.
Wed. Nov. 12th pp. 130-131: 4.1, 4.2, 4.3.
Wed. Nov. 12th pp. 130-131: 4.1, 4.2, 4.3.
Wed. Nov. 19th pp. 132-133: 4.4, 4.5, 4.6*, 4.7, 4.11*.
Problem E: Consider the vector space Pn of all polynomials of degree at most n and whose coefficients are in F2 (the field with two elements). Let A(p) be the remainder obtained after dividing p by (x-1). Thus A is a map from Pn to constant polynomials. (a) Identify Pn with F2n+1 using the basis consisting of monomials (i.e., powers of x). (b) Find the matrix and another interpretation of the map A.
Solution. We first claim that xn = 1 mod (x-1). This is easily seen by observing that xn-1 is divisible by x-1. It follows that if p=a0+a1 x + ... + an-1 xn-1, then p = a0 + ... + an-1 mod (x-1). In other words, the remainder after division by x-1 of a polynomial p is the same as the sum of its coefficients. In the basis given by monomials, the matrix of this map is [ 1 1 ... 1 ]; you can also think of this map as giving you the parity (=sum of the digits) of the vector made from the coefficients of the polynomial. this map as
Wed. Nov. 26th pp. 163-164: 5.1, 5.2.
Problem F: (CRC codes). Consider a file containing N binary digits (bits), a0...aN-1. We wish to detect (but not necessarily locate) changes in the file (you could think of them as errors when the file is transmitted, or alterations of the file). To do so we fix once and for all a polynomial g, which we'll call the generating polynomial. Then we correspond to the file a polynomial p given as the sum a0+...+ aN-1 xN-1. p is also called the data polynomial. Lastly, we compute the CRC checksum of the file as the remainder r after division of p by g, i.e., r=p mod g.
(a) Show that if g=x-1, then the CRC checksum is the same as the parity (=sum of all digits in the file);
(b) Assume that the file contains the digits 11100010100. Let g be the polynomial x3+x+1. Compute the CRC of the file for this generating polynomial.
(c) Let g be as in part (b). Show that, whatever the length of the file, if we change any 1 bit in the file, the CRC checksum will change. Hint: how does the data polynomial change when you alter a bit in the file? How does the CRC change?
(d) Show that g of problem (c) is irreducible. Find the period of g, i.e., the smallest n so that x2n-1 has the same remainder as 1 mod g.
(e) Let g be as in part (b). Show that if we change any two bits in the file in such a way that the CRC checksum does not change, then the two bits must be a mulitple of 7 digits from each other.
(f) Show that in general if g has period M and has nonzero constant coefficient, then for the CRC checksum to fail to detect two bit errors, the errors must be a multiple of 2M-1 digits apart.
(g) What is M from part (f) in the case that g= x5+x2+1 (CRC for this polynomial is used by certain packets on the USB bus).

(a) See problem E from last h/w.

(b) The data polynomial in this case is

            2   6   8
p = 1+ x + x + x + x .

The CRC is the remainder after dividing p by g = x3 + x + 1. Long division gives p 2x2 mod g; since we are working over F2, p 0 mod g and so the CRC is zero.

(c) Since g 1 mod (x - 1), it follows that if p r mod g, then p and r have the same remainder mod (x - 1). Indeed, if p r mod g, then p - r = hg, and so (p - r) h mod (x - 1) g mod (x - 1) = g mod (x - 1). Thus the parity (sum of coefficients) of p and the CRC checksum r are the same; thus if we modify one coefficients of p (thus changing its parity) we must change the parity of r as well.

(d) If g were not irreducible, we would have g = fh. Since deg g = 3, it must be that one of deg h, deg f is at most 1 (since deg f + deg h = 3). Thus if g were not irreducible, g would have to be divisible by either x or x - 1. The remainder is 1 in each case.

To find the period, we list the remainders of various monomials mod g. You can see that the period is 7 (i.e., x7 - 1 is divisible by g, but xk - 1 is not for any k < 7).

(e) If we change two bits, we alter p by xa + xb = xa - xb, where a,b are the locations of the changed bits. The CRC will be unchanged iff xa - xb is divisible by g. Let’s assume that a > b for definiteness. Then g must divide xa - xb = xb(xa-b - 1). Since g is irredicuble, it follows that g must divide either xb or (xa-b - 1). The former is clearly not the case. Thus g divides xa-b - 1, which means that a - b (=distance between errors) is a multiple of the period, 7 in this case.

(f) Same argument as in (e)

(g) The period is 31, M=5.

You may find the following Wikipedia article useful as a reference on CRC codes.
For the last two lectures, we were following the Notes on Expander Codes by D. Spielman.
There will be a take-home final exam handed out on Friday's lecture. The exam will be due at 11am on Wednesday Dec. 10th and should be handed to me personally before that deadline.
Final exam solutions
Note: * denotes optional problems; if you are a more advanced student you should try them.