
Category: Puzzle

A curious application of recurrence

If (\sqrt{2} + \sqrt{3})^{1980} is expressed as a decimal fraction, what are the digits immediately to the left and to the right of the decimal point?

For example, for (\sqrt{2} + \sqrt{3})^2 = 5 + 2\sqrt{6} \approx 9.89, the digit immediately to the left of the decimal point is 9, and the digit to the right is 8.

This problem and the following proof are from the book “Discrete Mathematics” by Martin Aigner.

The problem and the proof are both quite nice, because at first sight this problem seems to have nothing to do with recurrences.

Here is a sketch of the proof: first observe that (\sqrt{2}+\sqrt{3})^{2n}, n \in \mathbb{N}, is the sum of an integer and an integral multiple of \sqrt{6}. For example, for n = 1, (\sqrt{2}+\sqrt{3})^2 = 5 + 2\sqrt{6}. This follows directly from the binomial theorem:
(\sqrt{2}+\sqrt{3})^{2n} = \sum_{k=0}^{2n} {2n \choose k}(\sqrt{2})^{2n-k}(\sqrt{3})^k.
For odd k, the term (\sqrt{2})^{2n-k}(\sqrt{3})^k is an integral multiple of \sqrt{6} (both exponents are odd); for even k, the term is an integer. So the above expression is a sum of integers and integral multiples of \sqrt{6}; therefore, we have
(\sqrt{2}+\sqrt{3})^{2n} = a_n + b_n \sqrt{6}, for n = 0, 1, \cdots, where a_n, b_n are integers.

The recurrence formula for a_n and b_n is:
a_{n+1} = 5a_n + 12 b_n, b_{n+1} = 2a_n + 5 b_n, which follows from expanding a_{n+1} + b_{n+1}\sqrt{6} = (a_n + b_n\sqrt{6})(5 + 2\sqrt{6}). Using the machinery of generating functions, we get a closed form of a_n as
a_n = \frac{1}{2}\left((5+2\sqrt{6})^n+(5-2\sqrt{6})^n\right). As (\sqrt{2}+\sqrt{3})^{2n} = (5+2\sqrt{6})^n, we get a_n = \frac{1}{2}\left(a_n + b_n\sqrt{6}+(5-2\sqrt{6})^n\right), which implies that
a_n = b_n \sqrt{6} + (5-2\sqrt{6})^n. (*)

Note that the fractional part of (\sqrt{2}+\sqrt{3})^{2n} = a_n + b_n\sqrt{6} is the fractional part of b_n \sqrt{6}. From (*), what can we infer about b_n \sqrt{6}? Because a_n is an integer, the fractional parts of b_n \sqrt{6} and (5-2\sqrt{6})^n must sum to an integer. The crucial observation is that 0 < 5 - 2\sqrt{6} < 1; therefore, when it is raised to a large power, say 990, (5-2\sqrt{6})^{990} is very close to 0: it is of the form "0.000…". This implies that b_{990} \sqrt{6} must be of the form "0.999…". Thus, the digit immediately to the right of the decimal point in (\sqrt{2}+\sqrt{3})^{1980} is 9.

To determine the digit immediately to the left of the decimal point in (\sqrt{2}+\sqrt{3})^{1980}, note that it is determined by a_{990} and the integral part of b_{990}\sqrt{6}. By the same reasoning as above, the integral part of b_{990}\sqrt{6} is exactly a_{990} - 1. Therefore, the last digit of the integral part of (\sqrt{2}+\sqrt{3})^{1980} is completely determined by a_{990}: to be precise, it is 2a_{990} - 1 modulo 10. For n = 0, 1, \cdots, 5, the first six pairs of last digits of (a_n, b_n) are (1, 0), (5, 2), (9, 0), (5, 8), (1, 0), (5, 2). Therefore, the sequence of last digits is periodic with period 4. Since 990 \equiv 2 \pmod 4, the last digit of a_{990} is 9. Therefore, the digit immediately to the left of the decimal point of (\sqrt{2}+\sqrt{3})^{1980} is 2 \times 9 - 1 \equiv 7 \pmod{10}.
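As a sanity check, here is a minimal Python sketch (my own, not from the book) that iterates the recurrence modulo 10 and reads off the answer:

a, b = 1, 0  # last digits of a_0 = 1, b_0 = 0
for _ in range(990):
    a, b = (5 * a + 12 * b) % 10, (2 * a + 5 * b) % 10
print((2 * a - 1) % 10)  # 7, the digit to the left of the decimal point
# the digit to the right is 9, since (5 - 2*sqrt(6))**990 = 0.000...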

Slice of pizza with no crust

Find a way of cutting a pizza into finitely many congruent pieces such that at least one piece of pizza has no crust on it.
(source: http://math.stackexchange.com/questions/481527/slice-of-pizza-with-no-crust)

Allow rotation, translation, mirroring (solution)
Allow rotation and translation, but not mirroring (solution)

Shapes in a lattice

Let A be a shape in the plane with {\rm area}(A) < 1. Prove that it is possible to place A such that it contains no points from the integer grid, where the integer grid is the set of points whose x and y coordinates are both integers.

I first saw this problem in the book “思考的乐趣”.

This problem looks quite hard at first sight; if the shape A were compact and convex, it would look much more “manageable”. Indeed, we can show that by choosing a vector uniformly at random from [0,1) \times [0,1) and shifting the integer lattice by that vector, the expected number of lattice points contained in A is {\rm area}(A). If the area of A is strictly less than 1, then the probability that A contains no point of the randomly shifted lattice is positive; hence there must exist a shift of the lattice such that A contains no lattice point. However, there is no compactness/convexity assumption on A, so the shape can be quite exotic: for example, it might not even be connected and can be scattered across multiple lattice cells, as long as the sum of the areas of its parts is less than 1.
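As a quick sanity check of the expectation claim, here is a small Monte Carlo sketch (my own code, not from the book), taking A to be a disk of area \pi/4 < 1:

import math, random

def count_points(r, u, v):
    # number of points of the shifted lattice Z^2 + (u, v) inside the disk x^2 + y^2 <= r^2
    cnt = 0
    for i in range(-2, 3):
        for j in range(-2, 3):
            if (i + u) ** 2 + (j + v) ** 2 <= r * r:
                cnt += 1
    return cnt

r, trials = 0.5, 100000
avg = sum(count_points(r, random.random(), random.random()) for _ in range(trials)) / trials
print(avg, math.pi * r * r)  # both are about 0.785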

The proof is very interesting. Imagine that we cut the plane along the grid lines, so we get a set of unit squares. Now collect all the unit squares that contain part of A and stack them together (that is, align the sides of these unit squares). The key point is that we can pierce a needle through the stack of squares without hitting any part of A: the reason is that the area of A is less than 1, so after stacking the squares, the union of the parts of A still cannot cover the whole unit square. After we find such a point to pierce, place all the squares back into their original places, and shift the grid lines so that they align with the horizontal and vertical lines passing through the point pierced by the needle.

Here is an illustration of the above proof.

Of course, the proof can be formulated more formally in mathematical language.

The proof above is quite vivid and interesting; however, there are several points that I think are somewhat swept under the carpet because of the vagueness of the problem statement. For example, when we say “area of a shape”, we are implicitly assuming that A is measurable: otherwise, the area would not be defined. Another thing taken for granted is the stacking step: we get a collection of unit squares, each containing part of A, and after stacking them, the area of the union of the parts is less than 1. Although this is intuitively correct, we might be stacking infinitely many unit squares, so the step needs a more rigorous argument (essentially countable subadditivity of Lebesgue measure), which in turn depends on A being a measurable set.

A probability puzzle: absent-minded passengers

I came across an interesting puzzle today:

Absent-Minded Passengers: A fully-booked plane is about to be boarded by its hundred passengers. The first passenger loses his boarding card and sits down in a random place. From then on, a passenger, whenever finding his seat taken, very politely sits somewhere at random. What is the probability that the hundredth passenger sits in his own seat?

This problem is from the book “The Art of Mathematics” by Béla Bollobás.

This problem can be solved using induction (on n, the number of passengers), but a better approach is backward analysis.

Let’s try it out on some small instances first, say, when the number of passengers, n, is 2 or 3. For n = 2, the answer is 1/2: either the first passenger sits in his own seat, in which case the last passenger (here, the second passenger) also gets his own seat; or the first passenger occupies the second passenger’s seat, and the second passenger ends up in the first passenger’s seat. How about n = 3? The final seating arrangement is one of [1, 2, 3], [2, 1, 3], [2, 3, 1], [3, 2, 1], where [x_1, x_2, x_3] denotes that the i^{\rm th} passenger sits in seat x_i; these arrangements occur with probabilities 1/3, 1/6, 1/6, 1/3 respectively. So the probability that the last passenger sits in his own seat is again 1/3 + 1/6 = 1/2.

If we think about the seating process, it is somewhat analogous to randomly forming a cycle in [n] ending with 1 (because at some point, some passenger will pick seat 1, either by chance or because seat 1 is the only seat left). Formally, let us denote a cycle of length m by (x_0, x_1, \cdots, x_m). The seating process is then similar to filling up a random cycle: suppose the first passenger chooses seat x_1; this corresponds to a partially filled cycle (x_1, ?, \cdots, 1). We can ignore passengers 2 through x_1 - 1, as their seats are not occupied, so they sit in their own seats. The x_1^{\rm th} passenger’s seat is taken, so when he boards, he randomly picks a seat from \{1, x_1+1, \cdots, n\}. Suppose he picks seat x_2; then the cycle becomes (x_1, x_2, \cdots, 1). Again nothing interesting happens until the x_2^{\rm th} passenger boards the plane…

What will be the final state of the cycle? At some point, one of the following two things will happen: a passenger x_k in the cycle picks

(1) the n^{\rm th} (the last) seat. In this case, the cycle ends with the n^{\rm th} passenger taking the only remaining seat, seat 1, so the last passenger does not sit in his own seat. Note that in this case, after the x_k^{\rm th} passenger chooses the n^{\rm th} seat, all the passengers following him, except the last passenger, sit correctly.

(2) seat 1. In this case, again all passengers after x_k sit correctly, including the last passenger.

Whenever a displaced passenger picks a seat at random, seat 1 and the n^{\rm th} seat are both still free, so by symmetry the two cases are equally likely. Hence the probability that the last passenger sits in his own seat is 1/2.
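A quick Monte Carlo check of the answer (my own sketch, not from the book):

import random

def last_gets_own_seat(n=100):
    free = set(range(1, n + 1))
    free.remove(random.randint(1, n))  # the first passenger sits at random
    for p in range(2, n):  # passengers 2 .. n-1 board in order
        if p in free:
            free.remove(p)  # own seat is free: take it
        else:
            free.remove(random.choice(tuple(free)))  # otherwise pick a random free seat
    return free == {n}  # the last passenger gets seat n exactly when it is the seat left over

trials = 100000
print(sum(last_gets_own_seat() for _ in range(trials)) / trials)  # about 0.5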

A probability puzzle: loaded dice

I came across this question on InterviewStreet:

Random number generator
There is an ideal random number generator which, given a positive integer M, can generate any real number between 0 and M with equal probability.
Suppose we generate 2 numbers x and y via the generator by giving it 2 positive integers A and B. What is the probability that x + y is less than C, where C is a positive integer?

Input Format
The first line of the input is an integer N, the number of test cases.
N lines follow. Each line contains 3 positive integers A, B and C.
All the integers are no larger than 10000.

Output Format
For each test case, output a fraction that indicates the probability, in lowest terms (the numerator and denominator should be coprime).

Input
3
1 1 1
1 1 2
1 1 3

Output
1/2
1/1
1/1
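For the stated (continuous, uniform) problem, the answer is just the area of the region x + y < C inside the A \times B rectangle, divided by AB. Here is a minimal sketch with exact rational arithmetic (my own code, not a reference solution):

from fractions import Fraction

def prob_sum_less_than(A, B, C):
    # P(x + y < C) with x uniform on [0, A] and y uniform on [0, B]:
    # the area of {x + y < C} inside the A-by-B rectangle, divided by A*B
    A, B, C = Fraction(A), Fraction(B), Fraction(C)
    lo = min(max(C - B, 0), A)  # for x < lo, the whole strip 0 <= y <= B qualifies
    hi = min(max(C, 0), A)      # for x >= hi, no y qualifies
    area = B * lo + C * (hi - lo) - (hi * hi - lo * lo) / 2
    p = area / (A * B)
    return "%d/%d" % (p.numerator, p.denominator)

for A, B, C in [(1, 1, 1), (1, 1, 2), (1, 1, 3)]:
    print(prob_sum_less_than(A, B, C))  # 1/2, 1/1, 1/1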

One way to think about this problem is via generating functions. Suppose the distribution is not uniform: is there an efficient way to compute the probability that the sum of the two randomly generated numbers is no greater than C?

Consider the discrete variant, where the generators output integers. Suppose for the first random number generator, the probability that the randomly generated number X equals i is p_i, for 0 \leq i \leq A; similarly, for the second generator, the probability that the randomly generated number Y equals i is q_i, for 0 \leq i \leq B. Define two polynomials P(x) = \sum_{j=0}^A p_j x^j and Q(x) = \sum_{j = 0}^B q_j x^j, and consider the product P(x)Q(x):
P(x)Q(x) = \sum_{k = 0}^{A+B}\left(\sum_{i+j = k,\ 0 \leq i \leq A,\ 0\leq j \leq B}p_i q_j\right) x^k.
The k^{\rm th} coefficient on the right-hand side of the above equation is precisely the probability that X+Y=k. So the probability that X+Y \leq k is the sum of the first k+1 coefficients. In the InterviewStreet problem, since the distribution is uniform, each p_i is 1/(A+1) and each q_i is 1/(B+1), so we can easily obtain the probability that X+Y < C.
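In code, extracting these coefficients is just a convolution of the two coefficient vectors; a small sketch (my own, with a made-up uniform example):

def prob_sum_at_most(p, q, c):
    # p[i] = Pr(X = i), q[j] = Pr(Y = j); returns Pr(X + Y <= c)
    conv = [0.0] * (len(p) + len(q) - 1)  # coefficients of P(x) * Q(x)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            conv[i + j] += pi * qj
    return sum(conv[:c + 1])

# uniform case A = B = 1: p = q = [1/2, 1/2]
print(prob_sum_at_most([0.5, 0.5], [0.5, 0.5], 1))  # 0.75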

The probability puzzle (using generating functions) is the following:
Loaded dice: No matter what two loaded dice we have, it cannot happen that each of the sums 2,3, …, 12 comes up with the same probability.
(I first came across this puzzle from the book “The art of mathematics”).

We can reason as follows: let p_i denote the probability that the first die shows i, for 1 \leq i \leq 6, and let q_i denote the probability that the second die shows i. If all eleven sums were equally likely, the probability that the sum is 2 would be p_1 q_1 = 1/11 (both dice must show 1), and the probability that the sum is 12 would be p_6 q_6 = 1/11 (both dice must show 6). Now consider the probability that the sum is 7: it is at least p_1 q_6 + p_6 q_1 (the first die shows 1 and the second shows 6, or the reverse). But p_1 q_6 + p_6 q_1 = (1/11)\left(p_1/p_6 + p_6/p_1\right) \geq 2/11, since t + 1/t \geq 2 for any t > 0; this contradicts the assumption that the sum 7 comes up with probability 1/11.

This reasoning works, but the book provides a much slicker proof using generating functions. As before, let P(x) = \sum_{i = 1}^6 p_i x^{i-1} and Q(x) = \sum_{i=1}^6 q_i x^{i-1}. Now if the sums 2, 3, …, 12 all come up with the same probability 1/11, then we have
11 P(x)Q(x) = 1+x+x^2+\cdots+x^{10} = (x^{11}-1)/(x-1). Since p_6 q_6 = 1/11 > 0, both P(x) and Q(x) are polynomials of degree 5, and every real polynomial of odd degree has at least one real root, so 11 P(x)Q(x) has real roots. However, the right-hand side is the 11^{\rm th} cyclotomic polynomial, whose roots are the primitive 11^{\rm th} roots of unity, none of which are real. Thus the identity cannot hold.
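As a throwaway numerical check (not part of the proof), one can confirm that 1 + x + \cdots + x^{10} has no real roots:

import numpy as np

roots = np.roots([1] * 11)  # roots of x^10 + x^9 + ... + 1 (coefficients from highest degree)
print(any(abs(r.imag) < 1e-9 for r in roots))  # False: every root is strictly complex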

Here is another application of generating functions for computing probabilities.

Counting heads: Given integers n and k, along with p_1, ..., p_n \in [0,1], determine the probability of obtaining exactly k heads when n biased coins are tossed independently at random, where p_i is the probability that the i^{\rm th} coin comes up heads. Assume that you can multiply and add two numbers in [0,1] in O(1) time.

(This problem is from this book.)

This problem can be solved using dynamic programming. Let X_i be the indicator random variable for the event that tossing the i^{\rm th} coin results in heads, for 1 \leq i \leq n, and write q_i = 1 - p_i. Notice that
{\rm Pr}\left(\sum_{i=1}^n X_i = k\right) = {\rm Pr}\left(\sum_{i = 1}^{n-1} X_i = k-1 \text{ and } X_n = 1\right) + {\rm Pr}\left(\sum_{i=1}^{n-1} X_i = k \text{ and } X_n = 0\right) = p_n {\rm Pr}\left(\sum_{i=1}^{n-1} X_i = k-1\right) + q_n {\rm Pr}\left(\sum_{i=1}^{n-1} X_i = k\right).
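Here is a minimal sketch of the resulting dynamic program (my own code, not the book’s):

def prob_k_heads(p, k):
    # dp[j] = Pr(exactly j heads among the coins processed so far)
    dp = [1.0] + [0.0] * len(p)
    for pi in p:
        for j in range(len(dp) - 1, 0, -1):  # update right to left so dp[j-1] is still "old"
            dp[j] = pi * dp[j - 1] + (1 - pi) * dp[j]
        dp[0] *= 1 - pi
    return dp[k]

print(prob_k_heads([0.5, 0.5, 0.5], 2))  # 0.375 = C(3,2) / 2^3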

The time complexity is O(n^2). However, with the generating-function approach, we can get a faster algorithm using the fast Fourier transform:
Let P_i(x) = q_i + p_i x, for 1 \leq i \leq n, and consider the product of these n polynomials of degree 1: \prod_{i = 1}^n P_i(x) = \sum_{j = 0}^n c_j x^j. The probability of obtaining k heads is the coefficient c_k. Therefore, the problem is reduced to how fast we can multiply polynomials. Using the fast Fourier transform, we can multiply two polynomials of degree at most n in time O(n log n). Hence, we can use divide-and-conquer to multiply these n polynomials:
\prod_{i=1}^n P_i(x) = \left(\prod_{i=1}^{n/2} P_i(x)\right) \times \left(\prod_{i=n/2+1}^n P_i(x)\right).
Let T(n) denote the time to multiply n polynomials of degree 1. Then we have the following recursion:
T(n) = 2 T(n/2) + O(n log n),
as in the “conquer” step, we are multiplying two polynomials of degree n/2. Hence T(n) = O(n (log n)^2).
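Here is a hedged sketch of this divide-and-conquer scheme, using numpy’s FFT for the polynomial multiplications (my own code; it returns the whole distribution of the number of heads):

import numpy as np

def poly_mul(a, b):
    # multiply two coefficient vectors via the FFT
    m = len(a) + len(b) - 1
    size = 1 << (m - 1).bit_length()  # FFT length: next power of two >= m
    return np.fft.irfft(np.fft.rfft(a, size) * np.fft.rfft(b, size), size)[:m]

def heads_distribution(p):
    # multiply the degree-1 factors q_i + p_i x in a divide-and-conquer fashion
    polys = [np.array([1 - pi, pi]) for pi in p]
    while len(polys) > 1:
        merged = [poly_mul(polys[i], polys[i + 1]) for i in range(0, len(polys) - 1, 2)]
        if len(polys) % 2 == 1:
            merged.append(polys[-1])  # odd one out is carried to the next round
        polys = merged
    return polys[0]  # coefficient k is Pr(exactly k heads)

print(heads_distribution([0.5, 0.5, 0.5]))  # [0.125 0.375 0.375 0.125]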