New York Journal of Mathematics On Finite Sequences Satisfying Linear Recursions

advertisement
New York Journal of Mathematics
New York J. Math. 8 (2002) 85–97.
On Finite Sequences Satisfying Linear Recursions
Noam D. Elkies
Abstract. For any field k and any integers m, n with 0 2m n + 1, let
Wn be the k-vector space of sequences (x0 , . . . , xn ), and let Hm ⊆ Wn be the
subset of sequences satisfying a degree-m linear recursion — that is, for which
there exist a0 , . . . , am ∈ k, not all zero, such that
m
ai xi+j = 0
i=0
holds for each j = 0, 1, . . . , n − m. Equivalently, Hm is the set of (x0 , . . . , xn )
such that the (m + 1) × (n − m + 1) matrix with (i, j) entry xi+j (0 i m,
0 j n − m) has rank at most m. We use elementary linear and polynomial
algebra to study these sets Hm . In particular, when k is a finite field of q
elements, we write the characteristic function of Hm as a linear combination of
characteristic functions of linear subspaces of dimensions m and m + 1 in Wn .
We deduce a formula for the discrete Fourier transform of this characteristic
function, and obtain some consequences. For instance, if the 2m + 1 entries of
a square Hankel matrix of order m + 1 are chosen independently from a fixed
but not necessarily uniform distribution μ on k, then as m → ∞ the matrix is
singular with probability approaching 1/q provided μ1 < q 1/2 . This bound
q 1/2 is best possible if q is a square.
Contents
1. Introduction
2. The spaces Vn , Wn and some linear algebra
3. The characteristic function of Hm
4. Open questions
References
85
87
90
96
97
1. Introduction
Fix a field k. For any integers m, n with 0 2m n + 1, let Wn be the k-vector
space of sequences (x0 , . . . , xn ), and let Hm ⊆ Wn be the subset of sequences
Received May 12, 2001, and in revised form in April, 2002.
Mathematics Subject Classification. 47B35, 15A57.
Key words and phrases. Hankel matrix, linear recursion, finite field, discrete Fourier transform,
random Hankel matrix.
Supported in part by the Packard Foundation.
ISSN 1076-9803/02
85
Noam D. Elkies
86
satisfying a degree-m linear recursion, that is, for which there exist a0 , . . . , am ∈ k,
not all zero, such that
m
(1)
ai xi+j = 0
i=0
holds for each j = 0, 1, . . . , n − m. Equivalently, Hm is the set of (x0 , . . . , xn ) such
that the (m + 1) × (n − m + 1) Hankel matrix1
⎛
⎞
x0
x1
...
xn−m
⎜ x1
x2
. . . xn−m+1 ⎟
⎜
⎟
(2)
⎜ ..
⎟
..
..
⎝ .
⎠
.
.
xm
xm+1
...
xn
has rank at most m.2 Now linear recursions on infinite sequences {xi }i∈Z are
known to correspond to polynomials in the shift operators T ±1 : {xi } → {xi±1 },
modulo multiplication by powers of T . This approach does not work so nicely for
finite sequences, because T and T −1 push x0 and xn off the edge. We propose to
remedy this problem at T = 0, ∞ by homogenizing: instead of polynomials in T ±1 ,
use homogeneous polynomials in two variables Y and Z that act on Wn as the right
and left truncation maps to Wn−1 . We shall see that this approach yields a clean
account of linear recursions and the subsets Hm in the space Wn , which itself will
be identified with the dual of the space Vn of homogeneous polynomials of degree n
in Y and Z.3
In the present paper we develop this account using elementary linear and polynomial algebra. When k is a finite field of q elements, we also write the characteristic
function of Hm as a linear combination of characteristic functions of linear subspaces of dimensions m and m + 1 in Wn . We deduce a formula for the discrete
Fourier transform of this characteristic function, and obtain some consequences.
For instance we obtain a new proof that #Hm = q 2m . We further show that if the
2m + 1 entries x0 , . . . , x2m of a square Hankel matrix
⎛
⎞
x0
x1
...
xm
⎜ x1
x2
. . . xm+1 ⎟
⎜
⎟
(3)
⎜ ..
⎟
..
..
⎝ .
⎠
.
.
xm xm+1 . . . x2m
of order m + 1 are chosen independently from a fixed but not necessarily uniform
distribution μ on k, then as m → ∞ the the matrix is singular with probability
1 For more background on Hankel matrices (matrices with entries constant on NE-SW diagonals), and the closely related Toeplitz or “persymmetric” matrices (with entries constant on
NW-SE diagonals), see for instance [4]. These matrices arise in diverse mathematical contexts;
see for instance [1, 3] and the references in [4]. In our setting, Hankel matrices are more natural
than Toeplitz ones, but our results on rank distribution, culminating in Theorem 2, apply equally
well to matrices of either Hankel or Toeplitz type.
2 I thank Joe Harris for the geometric observation that H
m consists of the lines through the
origin coming from the points (x0 : x1 : · · · : xn ) lying on the m-th secant variety of the rational
normal curve (ξ n : ξ n−1 η : · · · : η n ) in n-dimensional projective space over k. We shall not need
this formulation here, but it arises naturally in an arithmetic application of Hm [3].
3 As an added benefit, the whole structure inherits a GL (k) structure from the action of
2
GL2 (k) by linear substitutions on Y, Z. But this, too, is not needed for the present paper.
On Finite Sequences Satisfying Linear Recursions
87
approaching 1/q provided the Fourier transform of μ has l1 norm less than q 1/2 .
This bound is best possible if q is a square: if μ is the uniform distribution on ck0 ,
μ1 = q 1/2
where k0 is a quadratic subfield of k and c ∈ k ∗ is arbitrary, then −1/2
but the probability is q
. It seems reasonable to conjecture that for any μ the
matrix (3) is singular with probability → 1/q as long as μ is supported on a set of
at least two elements not contained in ck0 for any proper subfield k0 of k.
2. The spaces Vn , Wn and some linear algebra
Basic notions and lemmas. Fix a field k. For each integer n ≥ −1, let Vn be
the vector space of dimension n + 1 over k consisting of bivariate homogeneous
polynomials
(4)
P (Y, Z) =
n
ai Y i Z n−i
i=0
of degree n. Let Wn be the dual of Vn . We identify Wn with the space of sequences
(x0 , . . . , xn ) by regarding such a sequence as the linear functional
(5)
n
ai Y i Z n−i →
i=0
n
ai xi
i=0
on Vn . Note that we allow V−1 and W−1 , each of which is the zero space, but not
Vn , Wn for n < −1.
If m 0 and n + 1 m, polynomial multiplication Vm × Vn−m → Vn gives for
each Q ∈ Vm a linear map Mn (Q) : Vn−m → Vn defined by
(6)
Mn (Q) : P → P Q (P ∈ Vn−m ).
Our reason for identifying the space Wn of sequences (x0 , . . . , xn ) with the dual of
Vn is the following observation:
m
Lemma 1. Suppose Q ∈ Vm is the polynomial i=0 ai Y i Z m−i . Then the adjoint
of Mn (Q) is the linear map Mn∗ (Q) : Wn → Wn−m taking any (x0 , . . . , xn ) to the
sequence of length n − m + 1 whose j-th term is
(7)
m
ai xi+j
i=0
for each j with 0 j n − m.
Proof. We show the equivalent dual statement: The linear map Mn (Q) takes any
polynomial
n−m
P (Y, Z) =
bj Y j Z n−m−j
j=0
in Vn−m to the polynomial P Q ∈ Vn whose Y r Z n−r coefficient is i+j=r ai bj for
each r with 0 r n. But this is immediate from the expansion of P Q.
Thus Hm is the union of ker Mn∗ (Q) over all nonzero Q ∈ Vm .
Of course that union is not disjoint, but as long as 2m n + 1 we shall describe
the intersection of ker Mn∗ (Q1 ) and ker Mn∗ (Q2 ) for any Q1 , Q2 of degree at most m
(see Lemma 4 below). We first establish some further basic properties:
Noam D. Elkies
88
Lemma 2.
(8)
i) For any Q, Q ∈ Vm and n such that n + 1 m 0 we have
Mn∗ (Q + Q ) = Mn∗ (Q) + Mn∗ (Q ).
ii) For any Q1 ∈ Vm1 , Q2 ∈ Vm2 , and n such that m1 , m2 > 0 and n + 1 m1 + m2 , we have
(9)
∗
∗
Mn∗ (Q1 Q2 ) = Mn−m
(Q1 ) ◦ Mn∗ (Q2 ) = Mn−m
(Q2 ) ◦ Mn∗ (Q1 ).
2
1
iii) For any nonzero Q ∈ Vm and any n ≥ m − 1, the map Mn∗ (Q) is surjective
and its kernel has dimension m.
Proof. (i) This is the dual of the identity Mn (Q+Q ) = Mn (Q)+Mn (Q ), which is
just the distributive law P (Q + Q ) = P Q + P Q for multiplication of homogeneous
polynomials. (Alternatively, apply Lemma 1.)
(ii) Likewise this is the dual of the fact that multiplying a polynomial of degree
n − m1 − m2 by Q1 Q2 is the same as multiplying it first by Q2 and then by Q1 or
vice versa.
(iii) Since k[Y, Z] has no zero divisors, Mn (Q) is injective; thus Mn∗ (Q) is surjective, and its kernel has dimension dim Wn − dim Wn−m = m.
The ideal Ix . For x = (x0 , . . . , xn ) ∈ Wn , define Ix ⊆ k[Y, Z] as follows: any
M
Q ∈ k[Y, Z] is uniquely m=0 Qm with each Qm ∈ Vm ; the subset Ix consists of
M
those m=0 Qm for which (Mn∗ (Qm ))(x) = 0 for each m n + 1.
Lemma 3. Ix is a homogeneous ideal in k[Y, Z] for all x ∈ Wn .
M
Proof. By definition m=0 Qm ∈ Ix if and only if each Qm ∈ Ix . So it is enough
to check that Ix ∩ Vm is closed under addition for each m, and that P Q ∈ Ix if
Q ∈ Ix ∩ Vm and P ∈ Vm for some m, m ≥ 0. Each of these is vacuously true
if m > n + 1 or m + m > n + 1 respectively, and follows from part (i) or (ii) of
Lemma 2 otherwise.
The main result of this section is the following partial description of Ix , stating in
effect that it is approximated by a principal ideal as well as dimension considerations
allow:
Proposition 1. Suppose for some x ∈ Wn that Ix contains a nonzero polynomial
of degree at most (n + 1)/2. Let m0 be the smallest degree of such a polynomial.
Then Ix ∩ Vm0 is 1-dimensional, say
(10)
Ix ∩ Vm0 = kQ0
for some nonzero Q0 ∈ Vm0 . For each m n + 1 − m0 ,
(11)
Ix ∩ Vm = (Mn (Q0 )) (Vm−m0 ).
Remark 1. In particular, it follows that Ix ∩ Vm has dimension m − m0 + 1 for
m0 m n + 1 − m0 . This cannot hold once m > n + 1 − m0 , except in the trivial
case x = 0, when m0 = 0 and Ix is all of k[Y, Z]. Indeed suppose that m0 > 0 and
m > n+1−m0 . If m > n+1 then Ix ∩Vm = Vm has dimension m+1 > m−m0 +1.
If m n + 1 then Ix ∩ Vm is the kernel of the linear map
(12)
Vm → Wn−m ,
Q → (Mn∗ (Q))(x);
On Finite Sequences Satisfying Linear Recursions
89
thus
(13)
dim(Ix ∩ Vm ) dim Vm − dim Wn−m = 2m − n,
which again exceeds m − m0 + 1 since m > n + 1 − m0 . This is what we mean
when we state that Ix approximates the principal ideal (Q0 ) as well as dimension
considerations allow.
To prove Proposition 1 we must first make good on our promise to describe intersections of the spaces ker Mn∗ (Q). We do this in the next lemma, whose statement
uses the greatest common divisor Q of two homogeneous polynomials Q1 , Q2 . This
is defined only up to multiplication by k ∗ , but such scaling does not affect the space
ker Mn∗ (Q), so the choice of g.c.d. will not affect the result.
Lemma 4. Let Q1 , Q2 be nonzero polynomials in Vm1 , Vm2 respectively, with greatest common divisor Q. Then, for each n max(m1 , m2 ) − 1,
(14)
ker Mn∗ (Q1 ) ∩ ker Mn∗ (Q2 ) ⊇ ker Mn∗ (Q),
with equality if and only if
(15)
n + 1 m1 + m2 − deg(Q).
Proof. If x ∈ ker Mn∗ (Q) then x is in the kernel of both Mn∗ (Q1 ) and Mn∗ (Q2 ),
because each of these linear maps factors through Mn∗ (Q) by part (ii) of Lemma 2.
Thus x is in the intersection of the two kernels, whence (14) follows. It remains to
establish the condition of equality.
Let m = deg Q, and m = m1 + m2 − m. By Lemma 2(iii), the codimensions
in Wn of ker Mn∗ (Q1 ) and ker Mn∗ (Q2 ) are n + 1 − m1 and n + 1 − m2 respectively.
Thus their intersection has codimension at most
(16)
(n + 1 − m1 ) + (n + 1 − m2 ) = (n + 1 − m) + (n + 1 − m ).
Hence if m > n + 1 then this codimension is strictly less than the codimension of
ker Mn∗ (Q). Thus the condition m n + 1 is necessary for equality in (14).
We conclude the proof by showing that this condition is also sufficient. Let Q
be the least common multiple
(17)
Q = Q1 Q2 /Q
of Q1 and Q2 ; this is a homogeneous polynomial of degree m . Assuming that
m n + 1, we may then consider Mn∗ (Q ). We claim that
(18)
ker Mn∗ (Q1 ) + ker Mn∗ (Q2 ) = ker Mn∗ (Q ).
By duality, this claim is equivalent to
(19)
im(Mn (Q1 )) ∩ im(Mn (Q2 )) = im(Mn (Q )).
But this is just the statement that a polynomial in Vn is divisible by both Q1 and
Q2 if and only if it is divisible by Q — which is true because Q is the least common
multiple of Q1 and Q2 . We thus have
(20) dim(ker Mn∗ (Q1 ) ∩ ker Mn∗ (Q2 ))
= dim(ker Mn∗ (Q1 )) + dim(ker Mn∗ (Q2 )) − dim(ker Mn∗ (Q1 ) + ker Mn∗ (Q2 ))
= dim(ker Mn∗ (Q1 )) + dim(ker Mn∗ (Q2 )) − dim(ker Mn∗ (Q )).
Noam D. Elkies
90
By Lemma 2(iii) again, this dimension equals
(21)
m1 + m2 − m = m = dim(ker Mn∗ (Q)).
Since we already know that ker Mn∗ (Q1 ) ∩ ker Mn∗ (Q2 ) contains ker Mn∗ (Q), we conclude that these two spaces are equal.
Corollary 1. Suppose x ∈ Wn . If Ix contains homogeneous polynomials Q1 , Q2
whose least common multiple has degree at most n+1, then Ix contains gcd(Q1 , Q2 ).
In particular, this conclusion holds if deg Q1 + deg Q2 n + 1.
Proof. Under our hypotheses, x is contained in both ker Mn∗ (Q1 ) and ker Mn∗ (Q2 ),
and the equality condition of Lemma 4 is satisfied. Therefore
(22)
x ∈ ker Mn∗ (Q1 ) ∩ ker Mn∗ (Q2 ) = ker Mn∗ (gcd(Q1 , Q2 )),
which is to say that Ix contains gcd(Q1 , Q2 ) as claimed.
We can now easily prove Proposition 1:
Proof of Proposition 1. Suppose Q1 , Q2 are nonzero polynomials in Ix ∩ Vm0 .
By the hypothesis of Proposition 1 we know 2m0 n + 1. Corollary 1 thus applies, and we find that Ix contains gcd(Q1 , Q2 ). Unless Q1 , Q2 are proportional,
deg(gcd(Q1 , Q2 )) < m0 , which is impossible by the definition of m0 . Thus Ix ∩ Vm0
has dimension 1 as claimed. By the same Corollary, if m n + 1 − m0 and
Q ∈ Ix ∩ Vm0 − {0} then Ix gcd(Q0 , Q). Since again gcd(Q0 , Q) must have degree
at least m0 , we conclude that Q is a multiple of Q0 . Since Ix is an ideal (Lemma 3),
we already know that Ix contains all multiples of Q0 ; thus Ix ∩ Vm0 consists of all
degree-m multiples of Q0 , and we are done.
It is thus natural to call Q0 the minimal linear recursion satisfied by x. (Again
Q0 is defined only up to multiplication by k ∗ .) From Proposition 1 we deduce the
following description of the degree m0 of this minimal recursion:
Corollary 2. If x ∈ Hm for some m (n + 1)/2 then the degree of the minimal
linear recursion satisfied by x equals the rank of the Hankel matrix (2) associated
to x.
Proof. Let m0 be this minimal degree. The rank of (2) is m + 1 − d, where d is the
dimension of the kernel of the action of this matrix on row vectors of length m + 1.
But Lemma 1 identifies this kernel with the space Ix ∩ Vm of degree-m recursions
satisfied by x. Since m0 m (n + 1)/2, we may apply Proposition 1 to find that
d = m − m0 + 1. Thus m0 is the rank of the Hankel matrix, as claimed.
3. The characteristic function of Hm
Decomposition into signed linear subspaces. We assume henceforth that k
is a finite field of q elements. For integers m, n satisfying our customary condition
2m n + 1, let Pm be the set of all subspaces of Wn of the form ker Mn∗ (Q) for
some nonzero Q ∈ Vm . (By Lemma 4 and part (iii) of Lemma 2, ker Mn∗ (Q1 ) =
ker Mn∗ (Q2 ) if and only if Q1 , Q2 are proportional; thus Pm consists of
(23)
q m+1 − 1
#(Vm − {0})
=
#(k ∗ )
q−1
On Finite Sequences Satisfying Linear Recursions
91
subspaces. We note for later use that this formula remains valid if we allow m = −1,
when Pm is empty.) Recall that we defined Hm as the set of x ∈ Wn satisfying a
recursion of degree m, and noted that Hm is thus the union of all the subspaces
in Pm . We further noted that this union is not disjoint, and thus that χHm , the
characteristic function of Hm , is not simply the sum of the characteristic functions of
the subspaces in Pm . However, by Lemma 4, the intersection of any two subspaces
in Pm is again the kernel of Mn∗ (Q) for some nonzero homogeneous Q of degree
m, and more generally if m1 , m2 m then the intersection of any subspace
in Pm1 with any subspace in Pm2 is itself in Pm for some m m. Thus we
can use inclusion-exclusion identities to write χHm as a linear combination of the
characteristic functions of subspaces in Pm for m m. Fortunately the resulting
formula is quite simple:
Proposition 2. The characteristic function of Hm equals
(24)
χK − q
χK ,
K∈Pm
K∈Pm−1
in which χK is the characteristic function of the set K, and the second sum is
interpreted as zero when m = 0.
Proof. Clearly (24) is an integer-valued function on Wn supported on Hm . Thus
we need only show that its value at x equals 1 for all x ∈ Hm . But this value is
(25)
#(Ix ∩ Vm−1 ) − 1
#(Ix ∩ Vm ) − 1
−q
q−1
q−1
#(Ix ∩ Vm ) − q #(Ix ∩ Vm−1 )
=1+
.
q−1
Let m0 be the degree of the minimal linear recursion satisfied by x. By Proposition 1, Ix ∩Vm and Ix ∩Vm−1 are vector spaces of dimensions m−m0 +1 and m−m0
respectively over k. (Note that this remains true if m0 = m, when Ix ∩ Vm−1 is
the zero space.) Thus #(Ix ∩ Vm ) = q #(Ix ∩ Vm−1 ), and (25) simplifies to 1 as
claimed.
We easily deduce the formula [2, Thm. 1] for the size of Hm :
Corollary 3. For all nonnegative m (n + 1)/2 we have
#(Hm ) = q 2m .
(26)
Proof. The size of Hm is the sum of χHm (x) over x ∈ Wn . By (24), this sum is
(27)
#(K) − q
#(K).
K∈Pm
K∈Pm−1
But by Lemma 2(iii), each K ∈ Pm has size q m , and each K ∈ Pm−1 has size
q m−1 . Using (23) — and this is where we use the validity of (23) also for m = −1
— we thus simplify (27) to
(28)
as claimed.
q m − 1 m−1
q m+1 − 1 m
q − q
q
= q 2m ,
q−1
q−1
92
Noam D. Elkies
In particular, if n = 2m − 1 then #Hm = #Wn , whence Hm = Wn — which
is clear because in this case the Hankel matrix (2) has only m rows, so must have
rank at most m. (This is essentially the special case n = 2m − 1 of the dimension
count we used earlier to deduce (13); in this case we find that Ix ∩ Vm has rank at
least 2m − n = 1, so must contain a nonzero vector.) Starting from this, one may
establish without too much difficulty a bijection from W2m−1 to the subset Hm
of Wn for any n 2m − 1, even without our k[Y, Z] framework. (This is in effect
how (26) is proved in [2].) But our approach also yields a formula for the Fourier
transform χ
Hm (P ) for all P ∈ Vn , whereas (26) only gives χ
Hm (0). We turn to
χ
Hm next.
Discrete Fourier transform. To define the Fourier transform on Wn , we first
define it on k. Fix a nontrivial character ψ0 of k, that is, a nontrivial homomorphism
from the additive group of k to the unit circle in C. [If k = Z/pZ for some prime p,
we may take ψ0 (x) = exp(2πix/p); in general k contains Z/pZ where p is the
characteristic of k, and we may take ψ0 (x) = exp(2πit(x)/p) where t : k → Z/pZ is
any nontrivial homomorphism of additive groups. One common choice for t is the
trace from k to Z/pZ. At any rate none of our results will depend on the choice
of ψ0 .] For any function f : k → C, we define the (discrete) Fourier transform f
of f to be the following function from k to C:
f(a) :=
(29)
f (x)ψ0 (ax).
x∈k
It is known that f → f is a linear bijection on the space Cq of complex-valued
functions on k, and that the inverse bijection is given by the Fourier inversion
formula:
1
(30)
f (a)ψ0 (−ax).
f (x) =
q
a∈k
The Fourier transform is defined more generally for finite-dimensional vector spaces
over k. Let V, W be a dual pair of such spaces, of dimension d. (We shall use
V = Vn , W = Wn , d = n + 1.) To each function F : W → C we associate its
discrete Fourier transform
(31)
F (x)ψ0 (a, x).
F(a) :=
x∈W
Again F → F is a linear bijection, and in this context the inversion formula reads
1 F (x) = d
(32)
F (a)ψ0 (−a, x).
q
a∈V
To recover χ
Hm from Proposition 2, we shall need one more fact about the discrete
Fourier transform:
Lemma 5. For any linear subspace K ⊆ W , the Fourier transform of its characteristic function χK is (#K) · χK ⊥ , where K ⊥ is the annihilator of K in V .
Proof. By definition, χ
K (a) is the sum over K of the character x → ψ0 (a, x);
thus χ
K (y) = #K or 0 according as this character is trivial or nontrivial on K,
that is, according as a ∈ K ⊥ or a ∈
/ K ⊥.
On Finite Sequences Satisfying Linear Recursions
93
We can now give our formula for χ
Hm . It will be convenient to introduce the
following notation: for P ∈ Vn and any integer d, define ωd (P ) to be 1/(q − 1)
times the number of nonzero Q ∈ Vd such that P is a multiple of Q. Equivalently,
ωd (P ) is the number of degree-d factors of P up to k ∗ scaling, and the number
of homogeneous principal ideals in k[Y, Z] that contain P and have a generator of
degree d. For instance, ω0 (P ) = 1, and for all d ≥ −1,
(33)
ωd (0) =
q d+1 − 1
[= #(Pd ) if 2d n + 1].
q−1
Moreover, for nonzero P we have the identity
(34)
ωd (P ) = ωn−d (P ),
due to the bijection Q ↔ P/Q between factors of P of degree d and n − d. (The
notation ωd is suggested by the omega function in elementary number theory, which
counts the positive divisors of a given positive integer.)
Theorem 1. For every m (n + 1)/2 and P ∈ Vn we have
(35)
χ
Hm (P ) = q m (ωm (P ) − ωm−1 (P )) .
Proof. By Proposition 2 and Lemma 5, this follows from the following observation:
for any homogeneous polynomial Q of degree at most n, the annihilator in Vn of
ker Mn∗ (Q) is the image of Mn (Q), which is the space of degree-n multiples of Q.
Thus when we use (24) to expand χ
Hm as a linear combination of characteristic
functions of annihilators, the number of subspaces in Pm or Pm−1 that contribute
a term to χ
Hm (P ) is the number of divisors of P of degree m or m − 1 up to k ∗
scaling. Each of these terms is q m or −q · q m−1 = −q m respectively, whence the
formula (35).
As promised, Proposition 2 is the special case P = 0 of this formula (cf. (33)).
Also, if n = 2m − 1, the identity (34) yields χ
Hm (P ) = 0 for all P = 0, consistent
with Hm = Wn in that case.
Hankel matrices with independently biased entries. The formula (26) can
be interpreted thus: if x0 , . . . , xn are chosen independently at random from the
uniform distribution on k, then the resulting vector (x0 , . . . , xn ) is in Hm with
probability q 2m−(n+1) . Using Theorem 1 we can also get at the probability that
(x0 , . . . , xn ) ∈ Hm if the xi are still chosen independently at random but from
distributions μi on k that are not necessarily uniform.
We regard the μi as functions from k to R satisfying the conditions: μi (x) 0
for all x ∈ k, and
(36)
μi (x) = 1.
[
μi (0) =]
x∈k
Then the probability that x := (x0 , . . . , xn ) is in Hm is
(37)
Πm (μ0 , . . . , μn ) =
x∈Wn
χHm (x)
n
μi (xi ).
i=0
By applying Fourier inversion to χHm we can express this as a linear combination
of the values of χ
Hm (P ). The resulting formula is:
Noam D. Elkies
94
Lemma 6. We have
Πm (μ0 , . . . , μn ) = q −(n+1)
(38)
n
χ
Hm (P )
P ∈Vn
μ
i (−ai ),
i=0
where ai is the Y i Z n−i coefficient of P as in (4).
Proof. By Fourier inversion (32),
(39)
Πm (μ0 , . . . , μn ) = q −(n+1)
Now P, x =
(40)
⎛
χ
Hm (P ) ⎝
P ∈Vn
n
i=0
ψ0 (−P, x)
x∈Wn
n
⎞
μi (xi )⎠ .
i=0
ai xi , so
ψ0 (−P, x)
n
μi (xi ) =
i=0
n
ψ0 (−ai xi )μi (xi ).
i=0
Thus the inner sum in (39) factors into
n
n
(41)
ψ0 (−ai xi )μi (xi ) =
μ
i (−ai ).
i=0
xi ∈k
i=0
Entering this into (39) yields the claimed formula (38).
The term P = 0 in (39) contributes
q −(n+1) χ
Hm (0)
(42)
n
μ
i (0) = q 2m−(n+1) ,
i=0
2m
because χ
Hm (0) = q
and each μ
i (0) = 1. The absolute value of the sum of the
remaining terms is at most
(43)
q −n+1
sup
P ∈Vn −{0}
=q
−n+1
|
μi (−ai )|
P ∈Vn −{0} i=0
sup
P ∈Vn −{0}
|
χHm (P )|
where μi 1 is the l1 norm
(44)
n
|
χHm (P )| ·
μi 1 :=
n
μi 1
−1 ,
i=0
|
μi (a)|.
a∈k
μi 1 1, with equality if and only if μ
i (a) = 0 for
Since μ
i (0) = 1, we have all a = 0. By Fourier inversion (30), this condition is equivalent to μi (x) = 1/q
for all x. Hence μi 1 = 1 if and ony if μi is the uniform distribution on k. We
n
μi 1 ) − 1 as a measure of how far the product distribution
may thus regard ( i=0 μ0 . . . μn departs from uniform distribution on Wn .
What of the other factor supP =0 |
χHm (P )| in the error estimate (43)? By TheoHm (P )
rem 1, each χ
Hm (P ) is a multiple of q m . Once n 2m, we cannot expect χ
to vanish for all P = 0, so supP =0 |
χHm (P )| must be at least q m . We next show
that it |
χHm (P )| is never much larger than q m for P = 0:
On Finite Sequences Satisfying Linear Recursions
95
Lemma 7. For every q and > 0, there exists an effective constant C such that
ωd (P ) < C(1 + )n
(45)
for every nonzero P ∈ Vn and every integer d.
(This is analogous to the standard fact that the number of factors of an n-digit
integer is subexponential in n, and will be proved in the same way.)
Proof. Define
ω(P ) :=
(46)
n
ωd (P ),
d=0
∗
the total number of divisors of P up to k scaling. Factor P into irreducibles over k:
r
(47)
Pses ,
P =
s=1
with Ps distinct irreducibles of degree fs . Comparing degrees in (47) we find
r
n=
(48)
es fs .
s=1
Now
ω(P ) =
(49)
r
(es + 1),
s=1
r
e
because the general divisor of P is s=1 Ps s with each es chosen from among the
es + 1 possibilities 0, 1, . . . , es . Fix m0 large enough that 21/m0 < 1 + , and factor
(49) as
(es + 1)
(50)
(es + 1).
ω(P ) =
fs <m0
fs m0
The second product is at most
es
(51)
2es = 2 fs m0 2n/m0 ,
fs m0
r
since m0 fs m0 es s=1 es fs = n by (48). The first product in (50) has at most
B factors, where B is the number of irreducible bivariate homogeneous polynomials
of degree < m0 up to k ∗ scaling. Each factor is at most n + 1, so the product is at
most (n + 1)B . Since log(n + 1)B = o(n) as n → ∞, and 21/m0 < 1 + , we conclude
that
(52)
ω(P ) 2n/m0 (n + 1)B (1 + )n .
Since ωd (P ) ω(P ), we deduce ωd (P ) (1 + )n .
Combining this estimate with Theorem 1 and Lemma 6, we obtain:
Theorem 2. For every q and > 0, there exists an effective constant C such that
n
(53)
μi 1 .
Πm (μ0 , . . . , μn ) − q 2m−(n+1) < C(1 + )n q m−n
i=0
for any n and any distributions μi on k.
Noam D. Elkies
96
In particular, suppose that n = 2m+α for some fixed nonnegative integer α, and
that all the μi are the same, so that each xi is chosen from the same distribution μ.
Then, as long as μ1 < q 1/2 , the error term in (53) approaches 0 as m → ∞,
and we conclude that if each of x0 , . . . , x2m+α is chosen independently from the
distribution μ then x ∈ Hm with probability approaching q −(α+1) , same as for the
uniform distribution. As noted in the Introduction, the bound on μ1 is best
possible, at least if q is a square: in that case k has a quadratic subfield k0 , and
if each xi is chosen uniformly from k0 (or from ck0 for some c ∈ k ∗ ) then x ∈ Hm
with probability q −(α+1)/2 , not q −(α+1) ; but for this distribution, μ1 = q 1/2 by
Lemma 5.
4. Open questions
Better bounds on Πm (μ0 , . . . , μn ) − q 2m−(n+1) ? We showed (Theorem 2)
that Πm (μ0 , . . . , μn ) is well approximated by q 2m−(n+1) under certain hypotheses
on the μi . Can these hypotheses by weakened by lowering the error bound in (53)?
Of course we must exclude some choices of μi . For instance we certainly cannot have
every μi supported on only one point; and we already gave the counterexample of
uniform distribution on a proper subfield of k. But it seems plausible that, except
for such pathological cases, (x0 , . . . , xn ) should be about as likely to be in Hm with
xi chosen from μi as it is with x chosen uniformly from Wn — whether or not the
μi 1 are small enough to deduce Πm (μ0 , . . . , μn ) ∼ q 2m−(n+1) from Theorem 2.
For instance we may surmise the following
Conjecture. Fix k and a closed set K of distributions μ : k → R. Assume that
no μ ∈ K is supported on a single point, nor on ck0 for any c ∈ k ∗ and any proper
subfield k0 of k. Then, for every real R 2, we have
(54)
Πm (μ0 , . . . , μn ) = (1 + o(1))q 2m−(n+1)
for any sequence of (n, m, μ0 , . . . , μn ) for which m → ∞, 2m n Rm, and
μi ∈ K for each i.
In particular, suppose q = R = 2. A distribution on k is then a pair (μ(0), μ(1))
of nonnegative numbers with μ(0) + μ(1) = 1. The conjecture then asserts that, for
each p > 0, if each entry xi of a square Hankel matrix of order m + 1 over Z/2Z is
chosen independently at random with probabilities μi (0), μi (1) both p, then the
matrix is singular with probability approaching 1/2 as m → ∞. Theorem 2 shows
this only for p > 1 − 2−1/2 ≈ 29.3%.
Higher dimensions. What happens to our theory in the context of arrays of
dimension 2 or greater, rather than finite sequences? One could start the analysis
in the same way, using for instance homogeneous polynomials in three variables
to treat triangular arrays, or bihomogeneous polynomials in two pairs of variables
for rectangular arrays. The resulting structures will surely be more complicated in
higher dimensions, but it may still be possible to find tractable descriptions.
Determinants of nonsingular Hankel matrices. In another direction, we return to the case n = 2m of square Hankel matrices (3) of order m + 1, for which
Hm consists in effect of such matrices whose determinant vanishes. We then ask: Is
there a formula analogous to (35), or even an estimate analogous to Lemma 7, for
On Finite Sequences Satisfying Linear Recursions
97
the discrete Fourier transform of the set of square Hankel matrices of order m + 1
with determinant c, for any given nonzero c ∈ k? This is easy when q = 2, in which
case that set is just the complement of Hm . But the problem seems to require new
techniques once q 3.
References
[1] D. G. Cantor, On the analogue of the division polynomials for hyperelliptic curves, J. Reine
Angew. Math. (Crelle’s J.), 447 (1994), 91–145, MR 94m:11071, Zbl 0788.14026.
[2] D. Daykin, Distribution of bordered persymmetric matrices in a finite field, J. Reine Angew.
Math. (Crelle’s J.), 203 (1960), 47–54, MR 22 #3734, Zbl 0104.01304.
[3] A. Dress, N. D. Elkies and F. Luca, A characterization of Mahler’s generalized Liouville
numbers by simultaneous rational approximation, preprint, 2001.
[4] I. S. Iohvidov, Hankel and Toeplitz Matrices and Forms: Algebraic Theory (trans. G.P.A.
Thijsse), Boston: Birkhäuser 1982, MR 83k:15021, Zbl 0493.15018.
Department of Mathematics, Harvard University, Cambridge, MA 02138
elkies@math.harvard.edu http://www.math.harvard.edu/˜elkies/
This paper is available via http://nyjm.albany.edu:8000/j/2002/8-5.html.
Download