Sums of Digits, Overlaps, and Palindromes Jean-Paul Allouche and Jeffrey Shallit

advertisement
Discrete Mathematics and Theoretical Computer Science 4, 2000, 001–010
Sums of Digits, Overlaps, and Palindromes
Jean-Paul Allouche and Jeffrey Shallit
CNRS, Laboratoire de Recherche en Informatique, Bâtiment 490, F-91405 Orsay Cedex, France
Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
received August 3, 1999, revised February 8, 2000, accepted February 16, 2000.
Let denote the sum of the digits in the base- representation of . In a celebrated paper, Thue showed that the
infinite word is overlap-free, i.e., contains no subword of the form , where is any finite
word and is a single symbol. Let "! be integers with $#% , !&#(' . In this paper, generalizing Thue’s result, we
prove that the infinite word )+* ,.- /.01
2!3 is overlap-free if and only if !4#5 . We also prove that
)6* , contains arbitrarily long squares (i.e., subwords of the form where is nonempty), and contains arbitrarily
long palindromes if and only if !87% .
Keywords: sum of digits, overlap-free sequence, palindrome
1
Introduction
At the beginning of the 20th century, the Norwegian mathematician Axel Thue initiated the study of what
is now called combinatorics on words with his results on repetitions in words [18, 19, 6, 8]. We say a
nonempty word 9 is a square if it can be written in the form :: for some word : . Examples include the
words chercher in French and murmur in English. We say that 9 is an overlap if it can be written in
the form ;<:;<:=; for some word : and single symbol ; . Examples include the words entente in French
and alfalfa in English. Thue explicitly constructed an infinite word on two symbols that is overlapfree, that is, contains no subword that is an overlap. He also constructed an infinite word on three symbols
that is square-free, that is, contains no subword that is a square.
Thue’s constructions are based on what is now called the Thue-Morse sequence
>@?5A+BDCFE1A+B EGA+B E3HIHH<?JC F C CKC K CFC C K CLH+HHM
There are many alternative ways to define this sequence (see, for example, [3]), one being as the fixed
C
>
BC<EO?PC
B E%?
C
point, starting with , of the morphism N
, N
. One can also define in terms of
S
BSGE
sums of digits. We define Q0R
A+BUSGEV?
BUSGEYX[Z\ SO]^C to be the sum of the
> digits in the base- T representation of . Then
QIW
for all
. Thue proved that is overlap-free.
Since Thue’s pioneering work, many other investigators have studied overlap-free words and their generalizations. For example, Thue’s construction was rediscovered by Morse, who used it in a construction
_
Research supported in part by a grant from NSERC.
`
1365–8050 c 2000 Maison de l’Informatique et des Mathématiques Discrètes (MIMD), Paris, France
2
Jean-Paul Allouche and Jeffrey Shallit
in differential geometry [14]. The Dutch chess master Max Euwe rediscovered Thue’s construction in
connection with a problem about infinite chess games [10].
Fife [12] described all infinite overlap-free binary sequences; also see [7]. Séébold proved the beautiful
>
and remarkable result that is essentially the only infinite overlap-free binary sequence which is generated
by iterating a morphism [17].
It is natural to wonder if Thue’s overlap-free construction is either unique in some sense, or a particular
case of a more general construction. In this note we show that t is a particular case of a more general
construction involving sums of digits. We will prove
]
]
Theorem 1 Let T
? C IM,MM alphabet >
be integers. Then the sequence R ]
T .
is overlap-free if and only if
?
B
Q R
BSGE X[Z\
E
over the
>
In contrast to Theorem 1 we also show that R always contains arbitarily long squares.
>
We also consider the occurrence of palindromes in R . A palindrome is a word (such as kayak or
>
radar) that is equal to its reversal. We prove that R contains arbitrarily long palindromes if and only
if .
>
Some combinatorial properties of R were previously studied by Morton and Mourant [15], who
>
B
E
proved among other things that R is ultimately periodic if and only if T .
We observe that overlaps, squares, and palindromes in sequences have several applications. For example, in number theory they aid in proving the transcendence of real numbers whose base expansion or
continued fraction expansion have “repetitions” [11, 4, 16, 2], while in statistical physics they are useful
for studying the spectrum of certain discrete Schrödinger operators [9, 13, 1, 5].
2
Some useful lemmas
In this section we introduce some notation and prove some useful lemmas.
] Lemma 2 For any T
R defined by R B ; E ?
]
;
B
;
>
E3HHH B
, the sequence R E B
;
C
is the fixed point, starting with , of the morphism
E
;(T where the sums are taken modulo .
Proof. Left to the reader.
>
? ?
Remark. Lemma 2 shows that R is a T -automatic sequence. For T
, we get the well-known
C
fact that the Thue-Morse infinite word is the fixed point, starting with , of the morphism defined by
C! C
C
,
.
] S%]
BUSGE
Let T
,
be integers. Then we define " R
to( be the exponent of the highest power of T which
S
BSGE ?
S
S
; if T$# but T#%'& .
divides . More precisely, " R
] Lemma 3 For all integers T
"
R
BS
and
S * E-,
S) S+* ]
we have
? /
X .10 B BSGE BUS2*E E
" R
" R
]
BSGE
"
R
BUSGE? 3
if " R
BUSGE ?
if "
R
R
"
"
R
BS2*E
;
B2
S *E
.
(1)
Sums of Digits, Overlaps, and Palindromes
3
Proof. Left to the reader.
BUS S2*E ?
Remark. Note that if T is a prime number, then we have "<R
necessarily true if T is not prime.
B
BUSGE E Lemma 4 Any block of T consecutive values of the sequence "<R
value .
"
R
&
BSGE
"
R
BS2* E
, but this is not
contains an occurrence of the
Proof. Any T consecutive numbers contains some multiple of T , say . We have "R
B
EV?
we are done. Otherwise "FR
(T
by Lemma 3.
B E ]
. If "KR
B EV?
,
UB SGE
BSGE
and Q R
is given by the following lemma.
] SO]
The link between "KR
Lemma 5 Let T
,
be integers. Then
Q R
BSGE B S Q R
BSGE ?
EV?
B
T E
"
R
BUSGE+M
#
C HHI H C B for
is 9 S
Proof. Let "
R
; . Then the base- T expansion
? 3 C of can be written in the formS 9 some word 9 and some single digit , where
. Then the base- T expansion of
E
B
#
E[HHI H
BUSGE BS EV?
E
BT Q R
; .
? We now turn to the properties of overlaps. If ;F:=;<:; is an overlap, then we call
.
?
HIHH
?C$ IMMIM B E .
? B ;<: its period
EYX[
Z\
If 9
over , then we define 9
; & ;W
; ? is a word
;
HHIH
&
. Given a word 9
; & ;W
; , we define its word of first differences
? B E HHH B E 9
;W ; &
;
; &
T B
E
T . It follows that Q0R
where the differences are taken mod
. We can also extend
to infinite words.
B E C BX[Z\ E .
? HHIH
Q Q & Q W
Proof. Suppose .
? HIHH
?
HHIH
: : &
: W
Q
Q % W
If contains an overlap of period beginning at position of . Let :
?
C ?
:
:
:
be this overlap. Then :
for
. Then : : for , so
!
? & . Furthermore,
1% % & B EV? : : @? C .
?.B 0E B % E3HIHH B E
: & :
: W : &
: : ! & , we get :
letting ] C
]
For the converse, suppose contains a square . Then there exist integers ,
, such that
?
U
B
[
X
Z
\
E
C
Q %" %+& Q %" Q %" %+&% Q %"1% for #%$ . By telescoping cancellation, it follows that
?
BX[Z\
E
Q %& %+& Q Q %& %+& % Q % (2)
C (')$
B EJ
C BUX[Z\
E
. If, as the hypothesis states, we have for
, then
J
C
B
[
X
Z
\
+
E
M
Q % Q (3)
BX Z\
E
C *' Together Eqs. (2) and (3) imply that Q for
Q %& % . But this implies that %&
contains at overlap of period , beginning at index .
C MIMM
Lemma 6 Let be a finite or infinite word over
if and only if the word
contains a square
such that
. The word contains an overlap ;<:;<:;
4
Jean-Paul Allouche and Jeffrey Shallit
3
Proof of the main theorem
We are now ready to prove Theorem 1.
] Proof. Fix integers T
C
for $
and
]
A+B S
T ]
T . If
? A+BDCFEGA+B E1A+B EHIHH
>
, and let R E
E
B S
Q R T Q R
BUSGE
. Note that
BX[Z\
E
(4)
T , it then follows that
Every symbol contained in
?5A+B G
S EGA+B S
T
T E[HHIH A+B S
T (T E
appears exactly once in .
(5)
>
C
We call such a subword (of length T , starting at a position in R which is congruent to , modulo
T ) a T -aligned subword. It follows from Eq. (4) that every symbol in a T -aligned subword is completely
determined once the value of a single such symbol is known.
?
: Assume
the subword
$
>
T . We will prove that the sequence R contains an overlap of period
A+B B
E E1A+B
T
T
C
HHIH B EGC
H HIH B E
is the overlap
]
Since T
, the base- T expansion of T
C
E3HHIHGA+B T E
, with ] ] , is of the form
B E HIH HI& B E B E6M
T
T
T
Then
B
EV? B EB E
BX[Z\ E
Q R T T
T ] ]
A+B B
E E[HIHH A+B EV?JC
HHH B EGC
for . Thus T
.
C T Similarly, the base- T expansion of T
,
, is of the form
C H HI H C M
HIHH B
C
A+B E3HIHH1A+B EV?
B EV? T Thus Q R T
for . Thus T
. In fact,
.
result follows.
EGC
. The
? : If > R has an overlap, then it has an overlap of shortest period . Let
A+B E A+B
be such an overlap. Note
C .
E
A+B
E ? A+B
# E
. By the definition of overlap, we have E3HIHHGA+B
]
$
?
Case 1: T . In this case, write
?
*
C T where
T .
write
T
EGA+B
E3HIHHGA+B
EGA+B
for
* , where * is a positive integer, and, using the division theorem,
Sums of Digits, Overlaps, and Palindromes
' EV? A+B E ' C #
'
(' ' A+B
5
' ' '
'
' '
$ !' ' ' (
$
$T Case $ 2: T T ; (c) . In this T .case there are three subcases to consider, based on the size of : (a) T ; (b)
? ?
C
Case 2(a): $ T . Let '
. Then T '
where $ T . There are two cases to consider,
R
(i) and (ii) .
? A+B
A+B E HIHH1A+B '
E
Case 2(a)(i): If , then 9
of T '
, and
) E A+B HIHH1A+B )E A+B E is a subword
T OT E
9 contains two identical symbols, namely and , which contradicts observation (5).
, then 9 ? A+B E A+B HIHE HYA+B A+ B E isE a subword of A+B T B !' E E HHHYA+B T ' E , and 9
Case 2(a)(ii): If contains two identical symbols, namely and , again contradicting observation (5).
In both cases we get a contradiction, so there cannot be an overlap with $ T .
?
T . As in Case 2(a), let '
Case 2(b): T $ $
R . Suppose the overlap is
A+B E HHIH A+B
E6M
?5A+B
E
C
Define :
for , and note
?
C
M
(6)
: : %" for # >
C
? ' Set subword of R and $ T . There are
T
, so that C & ? : HHIH : % R & is a T -aligned
two cases to consider: (i)
T ; and (ii) T .
C
Case 2(b)(i): T . If further T , then define
? : HHH : & : HHIH : % R & : % R HHIH : % HHIH : % W R & : % W R HHH : % R % HIHH : % R & : % R HHH :W (7)
Note that , @W , and are all T -aligned subwords.
&
$ T , define
Otherwise, if T
? : VHHIH : : HHIH : R : R HIHH : HIHH : W R : W R HHIH : R HIHH : W M (8)
for , we have, by considering only those that are multiples of
Since
A+B
* E ?JA+B
*E
C * (
A+B *
* E ? A+B *
*
*E
T , that
T
T
(
for
T E C (T *
T A+B * T * E ^T
C * *
A+B B *
*E
EV? A+B B * T .* Hence
*E
*
for
. Hence T
for . Hence
T
A+B *
*
*E
BUX[Z\
E
A+B *
* E &A+B *
*
*E BX[Z\
E
C * *
, and so
for
.
But
then
(
* ?
A+B * E%HHH2A+B * * E
is an overlap of period
, contradicting our assumption that was
T
minimal.
Note that
&
and
&
%
&
%
%
W are both T -aligned subwords, and
%
&
%
%
%
is a prefix of a T -aligned subword.
M
6
Jean-Paul Allouche and Jeffrey Shallit
5T & BUX[isZa\ T -aligned
subword, : R
%
&
E
. Since is a T -aligned subword or
BX[Z\
E+M
(9)
: % R % T
BUX[Z\
E
BUX[Z\
E
From : (6) we get : . Since W is a T -aligned subword, we get
%
(T BX[Z\ andE . From
: % R
(6) we get
T BUX[Z\ E6M
(10)
: % R % U
B
[
X
Z
\
E
8
C
U
B
[
X
Z
\
E
Now combining Eqs. (9) and (10) gives T
or , so . If
T ] ]
] ?
, then since T we get T , a contradiction. Hence .
In this case by examining we get
&
E
M
BUX[Z\
: for $ T
(11)
?
E
BX[Z\
Now :
for T #%$ T . But
: % , so by examining @W we get : ?
, so
M
E
BUX[Z\
for T #%$ T
(12)
: , so (11) and (12) together cover all residue classes mod , and so we get
Now T
M
E
C
BX[Z\
for # (13)
: W
? ? HIH H Now consider
. By Lemma 6, must be a square. From (13) we get
. But by
BUX[Z\
:
E
. Then, from the fact that
. Then from (6) we get :
R
T % %
&
prefix of one, we get
UB Suppose
X[Z\
E
Lemma 5
? B B T
E
"
B
R
E E[HHIH B B T
E
R
"
B
EE
where both sides are considered modulo . Now
by Lemma 4, there exists an index ,
T E , so
B
B
B E E X[Z\
?
BX Z\
E
B EV?
such that "
R
. Then T , so T
.
"KR
C[BUX[Z\
E
] ]
Hence
and hence
. Then since T
, we have T
and so T
.
T
T
?
But
and hence T
. This contradicts the assumption of case 2(b) that T
, and hence this case
cannot occur.
$ T , define
? : HHH : R HIHH : : HHH : H HH : R : R HHIH : HHIH : W R : W R HI HH :W T , we get T
Note that is a T -aligned subword, and from the inequality so W is also a T -aligned subword. Note that may be empty.
T , define Otherwise, if ? : HHIH : R HHIH : : HHIH : H HIH : R : R HIHH : HIHH : W
In this case is a T -aligned subword, and W is a prefix of a T -aligned subword.
Case 2(b)(ii):
T . If further %
&
%
&
%
&
&
%
%
&
%
(14)
,
%
&
%
&
%
%
(15)
Sums of Digits, Overlaps, and Palindromes
BUX[Z\
Suppose :
E
. Then since
:
^T
Then
:
?
%
is a T -aligned subword, we have
&
BX Z\
E+M
%
%
is the suffix of a T -aligned subword, so from Eq. (17) we get :
: , so
:
Combining the congruences (16) and (18), we get
] ]
and so . As before, if
, then since
we get
Now, by examining
BUX[Z\
T we get E6M
] OT %
%
] ? C : , and @W is a T -aligned subword or prefix of one, so : R , so : R lies in , and
?
(T BX Z\ E+M
: R
: R Now :
7
(16)
BUX[Z\
E
. Now
(17)
BUX[Z\
JC
. Hence T , a contradiction. Hence BUX[Z\
E
E
. Then
BX[Z\
?
(18)
E
,
.
BUX[Z\ E for $ T M
(19)
Similarly, by examining we get
E
C
M
BX[Z\
: for #%$ (20)
Combining (19) and (20), we get
E
C
M
BUX[Z\
: for #$ T
(21)
By assumption for this case, we have $ (T , so all residue classes mod , are covered, and we have
M
E
C
BX[Z\
for # (22)
: &
:
The rest of the proof proceeds as in Case 2(b)(i). The argument there shows this case cannot occur.
>
$ )
Case 2(c):
from Lemma 4 there must be an , T . Consider the word . Then
B EV?
>
that "KR
. Then by Lemma 6 we know
contains a square. Then by Lemma 5 we have
Hence
B
E
B
E
T T ]
B E
B
"
R
B
B
E
T "
R
E
E B
T " R EV?JC
B
E
BX[Z\
BUX[Z\
E+M
such
E6M
JC$BUX[Z\
E
]
It follows that "KR , for if "KR we would have T , and so T and T
, a contradiction.
(
B EV?
B
E ]
Now "
R
and "
R . It follows from Lemma 3 that T . But in Case 2 we assumed T ,
a contradiction.
The proof of Theorem 1 is now complete.
)
8
Jean-Paul Allouche and Jeffrey Shallit
4
Squares in the sequence
T
It is easy to show the following theorem about the existence of arbitrarily long squares in the sequence
>
R .
>
Theorem 7 The sequence R contains arbitrarily long squares. More precisely we have
>
(a) The sequence R contains the square of a single letter if and only if (b) For all integers T
] ]
,
>
\ B
T EV?
.
, the sequence R contains arbitrarily long squares.
Proof.
>
(a) By Lemma 6, there exists a square ;; with ; in the sequence R if and only if there exists
S ]
B E BSGE
BU[
X Z\
E
BSGE
an integer
T
"KR
. Since "
R
can take any integer value, this is
\ B such that
EV?
equivalent to T .
$
T , then in Theorem 1 we proved the existence of overlaps, hence> squares. Now the image of
(b) If
a square by R is a longer square. Iterating R and using the fact that R is a fixed point of R gives arbitrarily long squares.
]
>
T . Then the first T terms of the sequence R are
Suppose now that
C
HHH B
T E
HHIH B
T E
which contains a square of length T . The images of this square under iterates of R are arbitrarily
large squares.
Remark. It would be interesting to determine the largest (fractional) power that occurs in the sequence
>
]
R . For
T , we already know that is sharp.
5
Palindromes in
T
>
In this section we examine the occurrence of palindromes in R .
>
Theorem 8 The sequence R .
?
with T
] ,
]
contains arbitrarily long palindromes if and only if
>
Proof.
: Suppose that the sequence R contains some palindrome of even length larger than or
equal to . Then it must contain the word 6;; for some ; . If ;; is contained in the image by
?
R of some letter in , then
. Otherwise the first ; must be the last letter of the image by R of some letter, and the second ; must be the first letter of the image by R of some letter. It follows that
BX[Z\
E
BUX[Z\
E
C BX[Z\
E
; ; >
and . Hence
and this gives .
Now suppose that the sequence R contains a palindrome of odd length larger than or equal to ,
. If +; is a subword of the image by R of some letter
say 6; , with ; in , then
BUX[Z\
E
BUX[Z\
E
C BUX[Z\
E
;
'
; and , hence
. Hence again . If +; is not a
subword of the image by R of some letter in , then we have two possibilities according to whether
]
? T
or T
.
Sums of Digits, Overlaps, and Palindromes
]
9
, then either 6; is a suffix of the image by R of some letter, and a prefix of the image
If T
by R of some letter, or is a suffix of the image by R of some letter, and ; a prefix of the
BUX[Z\
E
BUX[Z\
E
image by R of some letter. In the first case we have
,
, and
; ; BX Z\ E
C BUX[Z\
E
B
X
Z
\
E
U
B
[
X
Z
\
E
, hence
. In the second case
,
,
2
;
BX[Z\
E
C BX[Z\
E
and
, hence
. This gives in both cases.
;
? If T
, then either the first is the last letter of the image by R of some letter, 6; is the image
by R of some letter, and is the image by R of some letter, or is the image by R of some
letter, ; is the image by R of some letter, and the second is the first letter of the image by R BX[Z\
E
BUX[Z\
E
of some letter. In the first case, we must have ;
and
, hence
? B
E B
E B
E
? 6; which is an overlap, hence
T
from Theorem 1 and this
BUX[Z\
E
BUX[Z\
E
is impossible. In the second case, we must have and ; , hence
BX[Z\
E
?
B
E B
E
;
. This gives 6; which is again an overlap, and we conclude as
just above.
? : Now let us suppose that . If ? , then > R ? CFCKC HIHH and hence trivially contains
arbitrarily large palindromes.
? >
? C C C C HHH
. If T is odd, then R and hence trivially contains arbitrarily
Now assume
long palindromes.
>
C$ If T is even, the sequence R is a fixed point of the morphism defined on
by
W
&
and an easy induction shows that R R R BDCFE
BDCFE
B E
$
? BC E R W
? B CFE R W
is a palindrome of length T
&
W .
Acknowledgements
We thank the referees for a careful reading of this paper.
References
[1] J.-P. Allouche. Schrödinger operators with Rudin-Shapiro potentials are not palindromic. J. Math.
Phys. 38 (1997), 1843–1848.
[2] J.-P. Allouche, L. Davison, and M. Queffélec. Transcendence of Sturmian and morphic continued
fractions. Preprint, 1999.
[3] J.-P. Allouche and J. Shallit. The ubiquitous Prouhet-Thue-Morse sequence. In C. Ding, T. Helleseth,
and H. Niederreiter, editors, Sequences and Their Applications, Proceedings of SETA ’98, pp. 1–16.
Springer-Verlag, 1999.
[4] J.-P. Allouche and L. Q. Zamboni. Algebraic irrational binary numbers cannot be fixed points of
non-trivial constant-length or primitive morphisms. J. Number Theory 69 (1998), 119–124.
[5] M. Baake. A note on palindromicity. Preprint, 1999.
10
Jean-Paul Allouche and Jeffrey Shallit
[6] J. Berstel. Axel Thue’s work on repetitions in words. In P. Leroux and C. Reutenauer, editors, Séries
Formelles et Combinatoire Algébrique, number 11 in Publications du LACIM, pp. 65–80. Université
du Québec à Montréal, June 1992.
[7] J. Berstel. A rewriting of Fife’s theorem about overlap-free words. In J. Karhumäki, H. Maurer,
and G. Rozenberg, editors, Results and Trends in Theoretical Computer Science, Vol. 812 of Lecture
Notes in Computer Science, pp. 19–29. Springer-Verlag, 1994.
[8] J. Berstel. Axel Thue’s Papers on Repetitions in Words: a Translation. Number 20 in Publications
du LACIM. Université du Québec à Montréal, February 1995.
[9] A. Bovier and J.-M. Ghez. Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Comm. Math. Phys. 158 (1993), 45–66. Erratum, 166 (1994),
431–432.
[10] M. Euwe. Mengentheoretische Betrachtungen über das Schachspiel. Proc. Konin. Akad. Wetenschappen, Amsterdam 32 (1929), 633–642.
[11] S. Ferenczi and C. Mauduit. Transcendence of numbers with a low complexity expansion. J. Number
Theory 67 (1997), 146–161.
[12] E. D. Fife. Binary sequences which contain no
. Trans. Amer. Math. Soc. 261 (1980), 115–136.
[13] A. Hof, O. Knill, and B. Simon. Singular continuous spectrum for palindromic Schrödinger operators. Comm. Math. Phys. 174 (1995), 149–159.
[14] M. Morse. Recurrent geodesics on a surface of negative curvature. Trans. Amer. Math. Soc. 22
(1921), 84–100.
[15] P. Morton and W. J. Mourant. Digit patterns and transcendental numbers. J. Austral. Math. Soc. Ser.
A 51 (1991), 216–236.
[16] M. Queffélec. Transcendance des fractions continue de Thue-Morse. J. Number Theory 73 (1998),
201–211.
[17] P. Séébold. Sequences generated by infinitely iterated morphisms. Disc. Appl. Math. 11 (1985),
255–264.
[18] A. Thue. Über unendliche Zeichenreihen. Norske vid. Selsk. Skr. Mat. Nat. Kl. 7 (1906), 1–22.
Reprinted in Selected Mathematical Papers of Axel Thue, T. Nagell, editor, Universitetsforlaget,
Oslo, 1977, pp. 139–158.
[19] A. Thue. Über die gegenseitige Lage gleicher Teile gewisser Zeichenreihen. Norske vid. Selsk. Skr.
Mat. Nat. Kl. 1 (1912), 1–67. Reprinted in Selected Mathematical Papers of Axel Thue, T. Nagell,
editor, Universitetsforlaget, Oslo, 1977, pp. 413–478.
Download