December 22, 2022
Útulek Series, 2 | CF1761D Chapter, 2
Last time, I left off with the following probability problem:
A drawer contains red socks and black socks. When two socks are drawn at random (without replacement), the probably that both are red is 1/2. How many socks can the drawer contain?
It’s easy to see that b=1 gives a square determinant and a valid r=3. In fact, since 8b2+1 is necessarily odd and also grows faster than 2b+1, the solutions to the original problem form a bijection with the positive integer solutions to x2=8b2+1, which is a case of the general Pell equation
x2−Dy2=1(0.2)
with D=8. Should D be a square, it is trivial that Pell equations have no solution in the positive integers should D. Otherwise, it is well-known that they exhibit infinite solutions. In this post, I will provide my intuition behind a constructive proof of the latter claim, based loosely on the proof presented in Titu’s An Introduction to Diophantine Equations, before briefly discussing how this relates back to 1761D.
1 Integers are Dense in the Radicals
Equation (0.2) essentially says: “x2 and Dy2 are very close” (there is a similar variant of the Pell with RHS ±1). Knowing that D is irrational, one might suspect that some yD might be very very close to an integer x, which would get us close to the claim of the Pell equation. Thus forms our first lemma.
Lemma 1.1: For any integer E, we can find positive integer x1,y1 satisfying y1≤E and
∣x1−y1D∣<1/E.(1.1)
Proof of Lemma 1.1: I would use ϵ as the RHS, but it complicates later parts of the proof without first asserting that the rationals are dense in the reals, which I want to avoid here.
This lemma is almost certainly true, should one take an example D=7 and enumerate the D multiples. We thus enumerate the first E+1 multiples 0,D,…,ED and their fractional parts 0≤{0},{D},…,{ED}≤1. By pidgeonhole, then, there must be two multiples {iD},{jD}, i≤j which fall within 1/E of each other. Subtracting their indices, we have y1=j−i and x1 likewise as the difference between their integer portions. ■
Technically, we will had to have invoked the Archimedean principle, like in the proof of dense real numbers, but I’d like to stay away from group theory for now.
2 Dy2 Can Be Close, But Not Arbitrarily Close to x2
Lemma (1.1) tell us y1D can be arbitrarily close to an integer x1. In fact, an infinite number of pairs (x1,y1) may be found with repeated applications of the pidgeonhole principle.
Going from a bound on x1−y1D to a bound on x12−y12D now seems straighforward:
Thus, should we enumerate the integers k in (−2D−1,2D+1), at least one of them k0 must satisfy x12−Dy12=k0 for infinitely many pairs (xi,yi) for i≥1.
Lemma 2.1: An infinite number of positive integer pairs (xi,yi) satisfy
xi2−Dyi2=k0
for some integer k0.
3 Eyes on the Ball
Lemma (2.1) is remarkably close to the Pell equation (0.2). Squinting our eyes at (2.1), one may see
xi2−Dyi2=k02(3.1)
which will give a solution to the Pell equation should ∣k0∣ divide both xi,yi, which seems likely considering we have infinite pairs (xi,yi).
In fact, getting to (3.1) only requires a small amount of algebraic foresight. Take two pairs (xi,yi),(xj,yj) and multiply them:
at which point we recognize we want to add something to xi2yj2+xj2yi2 to make it a square, and to subtract the same thing from xi2xj2+D2yi2yj2 to also make it square:
Thus, let x=xixj+Dyiyj,y=xiyj−xjyi and we have
x2−Dy2=k02
as we wished for earlier. As suggested, to ensure x≡y≡0[∣k0∣], we may invoke the fact that there are infinite pairs (xi,yi) and hence there must be pairs (xi,yi),(xj,yj) with xi≡xj[∣k0∣] and yi≡yj[∣k0∣], which will give us the divisibility requirement.
Theorem 3.1: For non-square D, there exist at least one pair of positive integers x,y satisfying
x2−Dy2=1.
4 Infinite Solutions via Matrix Exponentiation
With our initial solution x,y (which we will now re-index as x1,y1), we may generate a second solution x2,y2 with similar algebraic manipulation as in (3.2):
We can, of course, express this more compactly in matrix form:
[xiyi]=[x1y1Dy1x1][xi−1yi−1].(4.2)
An interesting corollary which I shall state without proof is that should x1,y1 be the minimal positive solution, this method generates all solutions. Thus, we must have the following corollary.
Corollary 4.3: All solutions to Pell equation (0.2) may be generated by sequential matrix exponentiation:
[xiyi]=[x1y1Dy1x1]i−1[x1y1].
5 Tying it Together
Recalling (0.1) for the original probability question, we have D=8 and initial solution (x1,y1)=(3,1), and may subsequently generate (x2,y2)=(17,6) which is easily verifiable. The proof provided above is reasonably constructive, and may be used to find the initial solution for the general Pell equation (0.2)—however, the better-known method is to use the continued fraction representation of D. The proof for why that works is tedious and I will not present it presently.
It is also a short derivation to determine that the result in corollary (4.3) may alternatively be expressed as
with the eigenvalues of matrix [x1y1Dy1x1] in plain view. I was actually not aware of the general result that matrix exponentiation could always be expressed in terms of exponentiation of the eigenvalues of the matrix, but it certainly puts into context the relationship between Binet’s formula and the Fibonacci matrix solution (follows from equation (1.3) from the previous post):
[Si+1Si]=[1110]i[10],
Si=51((21+5)i−(21−5)i).
So there is some relationship between one-dimensional recurrences, such as the Fibonacci, and Pell equations—which leads to them both being solved by the general form
It is then surprising to me that when a recurrence is upgraded from one-dimensional to two-dimensional, such as with 1761D or Pascal’s triangle, we no longer see powers of radicals, but rather summations of (kn) (solution (2.5) for 1761D from the previous post):
AN,K=j=1∑2K3N−j(⌊j/2⌋N−K)(⌈j/2⌉−1K−1).
At this point, I believed I didn’t understand well enough the class of functions like corollary (4.3) which exhibit exponential behavior:
f(1)f(i)=f(i+1).(5.1)
In the next post, I investigate multiplicative functions, which I mistakenly thought were functions like (5.1), but are actually functions of the form
f(i)f(j)=f(ij).(5.2)
This will lead us on a convoluted journey through the linear sieve and Taylor’s series—and, eventually, back to 1761D.