Discrete Mathematics and Theoretical Computer Science 4, 2000, 001–010 Sums of Digits, Overlaps, and Palindromes Jean-Paul Allouche and Jeffrey Shallit CNRS, Laboratoire de Recherche en Informatique, Bâtiment 490, F-91405 Orsay Cedex, France Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada received August 3, 1999, revised February 8, 2000, accepted February 16, 2000. Let denote the sum of the digits in the base- representation of . In a celebrated paper, Thue showed that the infinite word is overlap-free, i.e., contains no subword of the form , where is any finite word and is a single symbol. Let "! be integers with $#% , !&#(' . In this paper, generalizing Thue’s result, we prove that the infinite word )+* ,.- /.01 2!3 is overlap-free if and only if !4#5 . We also prove that )6* , contains arbitrarily long squares (i.e., subwords of the form where is nonempty), and contains arbitrarily long palindromes if and only if !87% . Keywords: sum of digits, overlap-free sequence, palindrome 1 Introduction At the beginning of the 20th century, the Norwegian mathematician Axel Thue initiated the study of what is now called combinatorics on words with his results on repetitions in words [18, 19, 6, 8]. We say a nonempty word 9 is a square if it can be written in the form :: for some word : . Examples include the words chercher in French and murmur in English. We say that 9 is an overlap if it can be written in the form ;<:;<:=; for some word : and single symbol ; . Examples include the words entente in French and alfalfa in English. Thue explicitly constructed an infinite word on two symbols that is overlapfree, that is, contains no subword that is an overlap. He also constructed an infinite word on three symbols that is square-free, that is, contains no subword that is a square. Thue’s constructions are based on what is now called the Thue-Morse sequence >@?5A+BDCFE1A+B EGA+B E3HIHH<?JC F C CKC K CFC C K CLH+HHM There are many alternative ways to define this sequence (see, for example, [3]), one being as the fixed C > BC<EO?PC B E%? C point, starting with , of the morphism N , N . One can also define in terms of S BSGE sums of digits. We define Q0R A+BUSGEV? BUSGEYX[Z\ SO]^C to be the sum of the > digits in the base- T representation of . Then QIW for all . Thue proved that is overlap-free. Since Thue’s pioneering work, many other investigators have studied overlap-free words and their generalizations. For example, Thue’s construction was rediscovered by Morse, who used it in a construction _ Research supported in part by a grant from NSERC. ` 1365–8050 c 2000 Maison de l’Informatique et des Mathématiques Discrètes (MIMD), Paris, France 2 Jean-Paul Allouche and Jeffrey Shallit in differential geometry [14]. The Dutch chess master Max Euwe rediscovered Thue’s construction in connection with a problem about infinite chess games [10]. Fife [12] described all infinite overlap-free binary sequences; also see [7]. Séébold proved the beautiful > and remarkable result that is essentially the only infinite overlap-free binary sequence which is generated by iterating a morphism [17]. It is natural to wonder if Thue’s overlap-free construction is either unique in some sense, or a particular case of a more general construction. In this note we show that t is a particular case of a more general construction involving sums of digits. We will prove ] ] Theorem 1 Let T ? C IM,MM alphabet > be integers. Then the sequence R ] T . is overlap-free if and only if ? B Q R BSGE X[Z\ E over the > In contrast to Theorem 1 we also show that R always contains arbitarily long squares. > We also consider the occurrence of palindromes in R . A palindrome is a word (such as kayak or > radar) that is equal to its reversal. We prove that R contains arbitrarily long palindromes if and only if . > Some combinatorial properties of R were previously studied by Morton and Mourant [15], who > B E proved among other things that R is ultimately periodic if and only if T . We observe that overlaps, squares, and palindromes in sequences have several applications. For example, in number theory they aid in proving the transcendence of real numbers whose base expansion or continued fraction expansion have “repetitions” [11, 4, 16, 2], while in statistical physics they are useful for studying the spectrum of certain discrete Schrödinger operators [9, 13, 1, 5]. 2 Some useful lemmas In this section we introduce some notation and prove some useful lemmas. ] Lemma 2 For any T R defined by R B ; E ? ] ; B ; > E3HHH B , the sequence R E B ; C is the fixed point, starting with , of the morphism E ;(T where the sums are taken modulo . Proof. Left to the reader. > ? ? Remark. Lemma 2 shows that R is a T -automatic sequence. For T , we get the well-known C fact that the Thue-Morse infinite word is the fixed point, starting with , of the morphism defined by C! C C , . ] S%] BUSGE Let T , be integers. Then we define " R to( be the exponent of the highest power of T which S BSGE ? S S ; if T$# but T#%'& . divides . More precisely, " R ] Lemma 3 For all integers T " R BS and S * E-, S) S+* ] we have ? / X .10 B BSGE BUS2*E E " R " R ] BSGE " R BUSGE? 3 if " R BUSGE ? if " R R " " R BS2*E ; B2 S *E . (1) Sums of Digits, Overlaps, and Palindromes 3 Proof. Left to the reader. BUS S2*E ? Remark. Note that if T is a prime number, then we have "<R necessarily true if T is not prime. B BUSGE E Lemma 4 Any block of T consecutive values of the sequence "<R value . " R & BSGE " R BS2* E , but this is not contains an occurrence of the Proof. Any T consecutive numbers contains some multiple of T , say . We have "R B EV? we are done. Otherwise "FR (T by Lemma 3. B E ] . If "KR B EV? , UB SGE BSGE and Q R is given by the following lemma. ] SO] The link between "KR Lemma 5 Let T , be integers. Then Q R BSGE B S Q R BSGE ? EV? B T E " R BUSGE+M # C HHI H C B for is 9 S Proof. Let " R ; . Then the base- T expansion ? 3 C of can be written in the formS 9 some word 9 and some single digit , where . Then the base- T expansion of E B # E[HHI H BUSGE BS EV? E BT Q R ; . ? We now turn to the properties of overlaps. If ;F:=;<:; is an overlap, then we call . ? HIHH ?C$ IMMIM B E . ? B ;<: its period EYX[ Z\ If 9 over , then we define 9 ; & ;W ; ? is a word ; HHIH & . Given a word 9 ; & ;W ; , we define its word of first differences ? B E HHH B E 9 ;W ; & ; ; & T B E T . It follows that Q0R where the differences are taken mod . We can also extend to infinite words. B E C BX[Z\ E . ? HHIH Q Q & Q W Proof. Suppose . ? HIHH ? HHIH : : & : W Q Q % W If contains an overlap of period beginning at position of . Let : ? C ? : : : be this overlap. Then : for . Then : : for , so ! ? & . Furthermore, 1% % & B EV? : : @? C . ?.B 0E B % E3HIHH B E : & : : W : & : : ! & , we get : letting ] C ] For the converse, suppose contains a square . Then there exist integers , , such that ? U B [ X Z \ E C Q %" %+& Q %" Q %" %+&% Q %"1% for #%$ . By telescoping cancellation, it follows that ? BX[Z\ E Q %& %+& Q Q %& %+& % Q % (2) C (')$ B EJ C BUX[Z\ E . If, as the hypothesis states, we have for , then J C B [ X Z \ + E M Q % Q (3) BX Z\ E C *' Together Eqs. (2) and (3) imply that Q for Q %& % . But this implies that %& contains at overlap of period , beginning at index . C MIMM Lemma 6 Let be a finite or infinite word over if and only if the word contains a square such that . The word contains an overlap ;<:;<:; 4 Jean-Paul Allouche and Jeffrey Shallit 3 Proof of the main theorem We are now ready to prove Theorem 1. ] Proof. Fix integers T C for $ and ] A+B S T ] T . If ? A+BDCFEGA+B E1A+B EHIHH > , and let R E E B S Q R T Q R BUSGE . Note that BX[Z\ E (4) T , it then follows that Every symbol contained in ?5A+B G S EGA+B S T T E[HHIH A+B S T (T E appears exactly once in . (5) > C We call such a subword (of length T , starting at a position in R which is congruent to , modulo T ) a T -aligned subword. It follows from Eq. (4) that every symbol in a T -aligned subword is completely determined once the value of a single such symbol is known. ? : Assume the subword $ > T . We will prove that the sequence R contains an overlap of period A+B B E E1A+B T T C HHIH B EGC H HIH B E is the overlap ] Since T , the base- T expansion of T C E3HHIHGA+B T E , with ] ] , is of the form B E HIH HI& B E B E6M T T T Then B EV? B EB E BX[Z\ E Q R T T T ] ] A+B B E E[HIHH A+B EV?JC HHH B EGC for . Thus T . C T Similarly, the base- T expansion of T , , is of the form C H HI H C M HIHH B C A+B E3HIHH1A+B EV? B EV? T Thus Q R T for . Thus T . In fact, . result follows. EGC . The ? : If > R has an overlap, then it has an overlap of shortest period . Let A+B E A+B be such an overlap. Note C . E A+B E ? A+B # E . By the definition of overlap, we have E3HIHHGA+B ] $ ? Case 1: T . In this case, write ? * C T where T . write T EGA+B E3HIHHGA+B EGA+B for * , where * is a positive integer, and, using the division theorem, Sums of Digits, Overlaps, and Palindromes ' EV? A+B E ' C # ' (' ' A+B 5 ' ' ' ' ' ' $ !' ' ' ( $ $T Case $ 2: T T ; (c) . In this T .case there are three subcases to consider, based on the size of : (a) T ; (b) ? ? C Case 2(a): $ T . Let ' . Then T ' where $ T . There are two cases to consider, R (i) and (ii) . ? A+B A+B E HIHH1A+B ' E Case 2(a)(i): If , then 9 of T ' , and ) E A+B HIHH1A+B )E A+B E is a subword T OT E 9 contains two identical symbols, namely and , which contradicts observation (5). , then 9 ? A+B E A+B HIHE HYA+B A+ B E isE a subword of A+B T B !' E E HHHYA+B T ' E , and 9 Case 2(a)(ii): If contains two identical symbols, namely and , again contradicting observation (5). In both cases we get a contradiction, so there cannot be an overlap with $ T . ? T . As in Case 2(a), let ' Case 2(b): T $ $ R . Suppose the overlap is A+B E HHIH A+B E6M ?5A+B E C Define : for , and note ? C M (6) : : %" for # > C ? ' Set subword of R and $ T . There are T , so that C & ? : HHIH : % R & is a T -aligned two cases to consider: (i) T ; and (ii) T . C Case 2(b)(i): T . If further T , then define ? : HHH : & : HHIH : % R & : % R HHIH : % HHIH : % W R & : % W R HHH : % R % HIHH : % R & : % R HHH :W (7) Note that , @W , and are all T -aligned subwords. & $ T , define Otherwise, if T ? : VHHIH : : HHIH : R : R HIHH : HIHH : W R : W R HHIH : R HIHH : W M (8) for , we have, by considering only those that are multiples of Since A+B * E ?JA+B *E C * ( A+B * * E ? A+B * * *E T , that T T ( for T E C (T * T A+B * T * E ^T C * * A+B B * *E EV? A+B B * T .* Hence *E * for . Hence T for . Hence T A+B * * *E BUX[Z\ E A+B * * E &A+B * * *E BX[Z\ E C * * , and so for . But then ( * ? A+B * E%HHH2A+B * * E is an overlap of period , contradicting our assumption that was T minimal. Note that & and & % & % % W are both T -aligned subwords, and % & % % % is a prefix of a T -aligned subword. M 6 Jean-Paul Allouche and Jeffrey Shallit 5T & BUX[isZa\ T -aligned subword, : R % & E . Since is a T -aligned subword or BX[Z\ E+M (9) : % R % T BUX[Z\ E BUX[Z\ E From : (6) we get : . Since W is a T -aligned subword, we get % (T BX[Z\ andE . From : % R (6) we get T BUX[Z\ E6M (10) : % R % U B [ X Z \ E 8 C U B [ X Z \ E Now combining Eqs. (9) and (10) gives T or , so . If T ] ] ] ? , then since T we get T , a contradiction. Hence . In this case by examining we get & E M BUX[Z\ : for $ T (11) ? E BX[Z\ Now : for T #%$ T . But : % , so by examining @W we get : ? , so M E BUX[Z\ for T #%$ T (12) : , so (11) and (12) together cover all residue classes mod , and so we get Now T M E C BX[Z\ for # (13) : W ? ? HIH H Now consider . By Lemma 6, must be a square. From (13) we get . But by BUX[Z\ : E . Then, from the fact that . Then from (6) we get : R T % % & prefix of one, we get UB Suppose X[Z\ E Lemma 5 ? B B T E " B R E E[HHIH B B T E R " B EE where both sides are considered modulo . Now by Lemma 4, there exists an index , T E , so B B B E E X[Z\ ? BX Z\ E B EV? such that " R . Then T , so T . "KR C[BUX[Z\ E ] ] Hence and hence . Then since T , we have T and so T . T T ? But and hence T . This contradicts the assumption of case 2(b) that T , and hence this case cannot occur. $ T , define ? : HHH : R HIHH : : HHH : H HH : R : R HHIH : HHIH : W R : W R HI HH :W T , we get T Note that is a T -aligned subword, and from the inequality so W is also a T -aligned subword. Note that may be empty. T , define Otherwise, if ? : HHIH : R HHIH : : HHIH : H HIH : R : R HIHH : HIHH : W In this case is a T -aligned subword, and W is a prefix of a T -aligned subword. Case 2(b)(ii): T . If further % & % & % & & % % & % (14) , % & % & % % (15) Sums of Digits, Overlaps, and Palindromes BUX[Z\ Suppose : E . Then since : ^T Then : ? % is a T -aligned subword, we have & BX Z\ E+M % % is the suffix of a T -aligned subword, so from Eq. (17) we get : : , so : Combining the congruences (16) and (18), we get ] ] and so . As before, if , then since we get Now, by examining BUX[Z\ T we get E6M ] OT % % ] ? C : , and @W is a T -aligned subword or prefix of one, so : R , so : R lies in , and ? (T BX Z\ E+M : R : R Now : 7 (16) BUX[Z\ E . Now (17) BUX[Z\ JC . Hence T , a contradiction. Hence BUX[Z\ E E . Then BX[Z\ ? (18) E , . BUX[Z\ E for $ T M (19) Similarly, by examining we get E C M BX[Z\ : for #%$ (20) Combining (19) and (20), we get E C M BUX[Z\ : for #$ T (21) By assumption for this case, we have $ (T , so all residue classes mod , are covered, and we have M E C BX[Z\ for # (22) : & : The rest of the proof proceeds as in Case 2(b)(i). The argument there shows this case cannot occur. > $ ) Case 2(c): from Lemma 4 there must be an , T . Consider the word . Then B EV? > that "KR . Then by Lemma 6 we know contains a square. Then by Lemma 5 we have Hence B E B E T T ] B E B " R B B E T " R E E B T " R EV?JC B E BX[Z\ BUX[Z\ E+M such E6M JC$BUX[Z\ E ] It follows that "KR , for if "KR we would have T , and so T and T , a contradiction. ( B EV? B E ] Now " R and " R . It follows from Lemma 3 that T . But in Case 2 we assumed T , a contradiction. The proof of Theorem 1 is now complete. ) 8 Jean-Paul Allouche and Jeffrey Shallit 4 Squares in the sequence T It is easy to show the following theorem about the existence of arbitrarily long squares in the sequence > R . > Theorem 7 The sequence R contains arbitrarily long squares. More precisely we have > (a) The sequence R contains the square of a single letter if and only if (b) For all integers T ] ] , > \ B T EV? . , the sequence R contains arbitrarily long squares. Proof. > (a) By Lemma 6, there exists a square ;; with ; in the sequence R if and only if there exists S ] B E BSGE BU[ X Z\ E BSGE an integer T "KR . Since " R can take any integer value, this is \ B such that EV? equivalent to T . $ T , then in Theorem 1 we proved the existence of overlaps, hence> squares. Now the image of (b) If a square by R is a longer square. Iterating R and using the fact that R is a fixed point of R gives arbitrarily long squares. ] > T . Then the first T terms of the sequence R are Suppose now that C HHH B T E HHIH B T E which contains a square of length T . The images of this square under iterates of R are arbitrarily large squares. Remark. It would be interesting to determine the largest (fractional) power that occurs in the sequence > ] R . For T , we already know that is sharp. 5 Palindromes in T > In this section we examine the occurrence of palindromes in R . > Theorem 8 The sequence R . ? with T ] , ] contains arbitrarily long palindromes if and only if > Proof. : Suppose that the sequence R contains some palindrome of even length larger than or equal to . Then it must contain the word 6;; for some ; . If ;; is contained in the image by ? R of some letter in , then . Otherwise the first ; must be the last letter of the image by R of some letter, and the second ; must be the first letter of the image by R of some letter. It follows that BX[Z\ E BUX[Z\ E C BX[Z\ E ; ; > and . Hence and this gives . Now suppose that the sequence R contains a palindrome of odd length larger than or equal to , . If +; is a subword of the image by R of some letter say 6; , with ; in , then BUX[Z\ E BUX[Z\ E C BUX[Z\ E ; ' ; and , hence . Hence again . If +; is not a subword of the image by R of some letter in , then we have two possibilities according to whether ] ? T or T . Sums of Digits, Overlaps, and Palindromes ] 9 , then either 6; is a suffix of the image by R of some letter, and a prefix of the image If T by R of some letter, or is a suffix of the image by R of some letter, and ; a prefix of the BUX[Z\ E BUX[Z\ E image by R of some letter. In the first case we have , , and ; ; BX Z\ E C BUX[Z\ E B X Z \ E U B [ X Z \ E , hence . In the second case , , 2 ; BX[Z\ E C BX[Z\ E and , hence . This gives in both cases. ; ? If T , then either the first is the last letter of the image by R of some letter, 6; is the image by R of some letter, and is the image by R of some letter, or is the image by R of some letter, ; is the image by R of some letter, and the second is the first letter of the image by R BX[Z\ E BUX[Z\ E of some letter. In the first case, we must have ; and , hence ? B E B E B E ? 6; which is an overlap, hence T from Theorem 1 and this BUX[Z\ E BUX[Z\ E is impossible. In the second case, we must have and ; , hence BX[Z\ E ? B E B E ; . This gives 6; which is again an overlap, and we conclude as just above. ? : Now let us suppose that . If ? , then > R ? CFCKC HIHH and hence trivially contains arbitrarily large palindromes. ? > ? C C C C HHH . If T is odd, then R and hence trivially contains arbitrarily Now assume long palindromes. > C$ If T is even, the sequence R is a fixed point of the morphism defined on by W & and an easy induction shows that R R R BDCFE BDCFE B E $ ? BC E R W ? B CFE R W is a palindrome of length T & W . Acknowledgements We thank the referees for a careful reading of this paper. References [1] J.-P. Allouche. Schrödinger operators with Rudin-Shapiro potentials are not palindromic. J. Math. Phys. 38 (1997), 1843–1848. [2] J.-P. Allouche, L. Davison, and M. Queffélec. Transcendence of Sturmian and morphic continued fractions. Preprint, 1999. [3] J.-P. Allouche and J. Shallit. The ubiquitous Prouhet-Thue-Morse sequence. In C. Ding, T. Helleseth, and H. Niederreiter, editors, Sequences and Their Applications, Proceedings of SETA ’98, pp. 1–16. Springer-Verlag, 1999. [4] J.-P. Allouche and L. Q. Zamboni. Algebraic irrational binary numbers cannot be fixed points of non-trivial constant-length or primitive morphisms. J. Number Theory 69 (1998), 119–124. [5] M. Baake. A note on palindromicity. Preprint, 1999. 10 Jean-Paul Allouche and Jeffrey Shallit [6] J. Berstel. Axel Thue’s work on repetitions in words. In P. Leroux and C. Reutenauer, editors, Séries Formelles et Combinatoire Algébrique, number 11 in Publications du LACIM, pp. 65–80. Université du Québec à Montréal, June 1992. [7] J. Berstel. A rewriting of Fife’s theorem about overlap-free words. In J. Karhumäki, H. Maurer, and G. Rozenberg, editors, Results and Trends in Theoretical Computer Science, Vol. 812 of Lecture Notes in Computer Science, pp. 19–29. Springer-Verlag, 1994. [8] J. Berstel. Axel Thue’s Papers on Repetitions in Words: a Translation. Number 20 in Publications du LACIM. Université du Québec à Montréal, February 1995. [9] A. Bovier and J.-M. Ghez. Spectral properties of one-dimensional Schrödinger operators with potentials generated by substitutions. Comm. Math. Phys. 158 (1993), 45–66. Erratum, 166 (1994), 431–432. [10] M. Euwe. Mengentheoretische Betrachtungen über das Schachspiel. Proc. Konin. Akad. Wetenschappen, Amsterdam 32 (1929), 633–642. [11] S. Ferenczi and C. Mauduit. Transcendence of numbers with a low complexity expansion. J. Number Theory 67 (1997), 146–161. [12] E. D. Fife. Binary sequences which contain no . Trans. Amer. Math. Soc. 261 (1980), 115–136. [13] A. Hof, O. Knill, and B. Simon. Singular continuous spectrum for palindromic Schrödinger operators. Comm. Math. Phys. 174 (1995), 149–159. [14] M. Morse. Recurrent geodesics on a surface of negative curvature. Trans. Amer. Math. Soc. 22 (1921), 84–100. [15] P. Morton and W. J. Mourant. Digit patterns and transcendental numbers. J. Austral. Math. Soc. Ser. A 51 (1991), 216–236. [16] M. Queffélec. Transcendance des fractions continue de Thue-Morse. J. Number Theory 73 (1998), 201–211. [17] P. Séébold. Sequences generated by infinitely iterated morphisms. Disc. Appl. Math. 11 (1985), 255–264. [18] A. Thue. Über unendliche Zeichenreihen. Norske vid. Selsk. Skr. Mat. Nat. Kl. 7 (1906), 1–22. Reprinted in Selected Mathematical Papers of Axel Thue, T. Nagell, editor, Universitetsforlaget, Oslo, 1977, pp. 139–158. [19] A. Thue. Über die gegenseitige Lage gleicher Teile gewisser Zeichenreihen. Norske vid. Selsk. Skr. Mat. Nat. Kl. 1 (1912), 1–67. Reprinted in Selected Mathematical Papers of Axel Thue, T. Nagell, editor, Universitetsforlaget, Oslo, 1977, pp. 413–478.