Centre de Recherches Mathématiques
CRM Proceedings and Lecture Notes
Volume 46, 2008

On the Distribution of $\omega(n)$

Rizwanur Khan

Abstract. We give yet another proof of the normal distribution of $\omega(n)$.

2000 Mathematics Subject Classification. Primary: ??; Secondary: ??
This is the final form of the paper.
© 2008 American Mathematical Society

1. Introduction

Let $\omega(n)$ denote the number of distinct prime factors of $n$. That is,
\[
\omega(n) = \sum_{p \mid n} 1.
\]
In this article we revisit the old question of how the values of $\omega(n)$ are distributed. The average value of this additive function is easily found to be
\[
\frac{1}{x}\sum_{n \le x} \omega(n)
= \frac{1}{x}\sum_{n \le x}\sum_{p \mid n} 1
= \frac{1}{x}\sum_{p \le x}\sum_{\substack{n \le x \\ p \mid n}} 1
= \frac{1}{x}\sum_{p \le x}\Big\lfloor \frac{x}{p} \Big\rfloor
= \log_2 x + C + O(1/\log x),
\tag{1.1}
\]
where we write $\log_2 x$ for $\log\log x$ (and similarly for $\log_j x$) and $C$ is an absolute constant.

Next let us follow Turán [11] in finding the variance of $\omega(n)$. We have
\[
\begin{aligned}
\frac{1}{x}\sum_{n \le x} (\omega(n) - \log_2 x)^2
&= \frac{1}{x}\sum_{n \le x} \omega(n)^2 - 2\log_2 x \cdot \frac{1}{x}\sum_{n \le x} \omega(n) + \frac{\lfloor x \rfloor}{x}(\log_2 x)^2 \\
&= \frac{1}{x}\sum_{n \le x} \omega(n)^2 - (\log_2 x)^2 - 2C\log_2 x + O(1).
\end{aligned}
\tag{1.2}
\]
For the first term we have
\[
\begin{aligned}
\frac{1}{x}\sum_{n \le x} \omega(n)^2
&= \frac{1}{x}\sum_{n \le x}\sum_{\substack{p \le x,\ q \le x \\ p \mid n,\ q \mid n}} 1
= \frac{1}{x}\sum_{\substack{pq \le x \\ p \ne q}} \Big\lfloor \frac{x}{pq} \Big\rfloor + \frac{1}{x}\sum_{p \le x} \Big\lfloor \frac{x}{p} \Big\rfloor \\
&= \Big(\sum_{p \le x} \frac{1}{p}\Big)^2 - \sum_{p \le \sqrt{x}} \frac{1}{p^2} - \sum_{\substack{p \le x,\ q \le x \\ pq > x}} \frac{1}{pq} + \log_2 x + O(1) \\
&= (\log_2 x)^2 + (2C+1)\log_2 x + O(\log_3 x).
\end{aligned}
\]
Thus (1.2) equals
\[
\frac{1}{x}\sum_{n \le x} (\omega(n) - \log_2 x)^2 = \log_2 x + O(\log_3 x),
\tag{1.3}
\]
after a delicate cancellation of the $(\log_2 x)^2$ terms! Note that (1.3) implies that $\omega(n) \sim \log_2 n$ for almost all $n$.

With this method, computation of the higher moments $\frac{1}{x}\sum_{n \le x} (\omega(n) - \log_2 x)^k$ for integers $k > 2$ would quickly become difficult, as cancellations and errors would become arduous to keep track of. It is in fact not a good idea to separate $\omega(n)$ and $\log_2 x$ as we did when we expanded $(\omega(n) - \log_2 x)^2$ to find the second moment above. Granville and Soundararajan [3] show how powerful this simple point is by computing very high moments: $\frac{1}{x}\sum_{n \le x} (\omega(n) - \log_2 x)^k$ uniformly for $k \le (\log_2 x)^{1/3}$.
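The mean value (1.1) and the second moment (1.3) can be explored numerically. The following is a minimal sketch, not part of the paper: the function names `omega_sieve` and `mean_and_second_moment` are hypothetical, and the cutoffs are arbitrary. It sieves $\omega(n)$ for $n \le x$ and reports the empirical mean and the second moment about $\log_2 x$. Since $\log_2 x$ grows extremely slowly, the lower-order terms remain clearly visible at any feasible $x$.

```python
import math


def omega_sieve(limit):
    """Return a list w with w[n] = number of distinct primes dividing n, 0 <= n <= limit."""
    w = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if w[p] == 0:  # no smaller prime divides p, hence p is prime
            for multiple in range(p, limit + 1, p):
                w[multiple] += 1
    return w


def mean_and_second_moment(limit):
    """Empirical versions of the left-hand sides of (1.1) and (1.3) at x = limit."""
    w = omega_sieve(limit)
    loglog = math.log(math.log(limit))  # log_2 x in the paper's notation
    mean = sum(w[1:]) / limit
    second_moment = sum((value - loglog) ** 2 for value in w[1:]) / limit
    return mean, second_moment, loglog


if __name__ == "__main__":
    limit = 10 ** 5  # arbitrary cutoff chosen for illustration
    mean, second_moment, loglog = mean_and_second_moment(limit)
    # The constant C in (1.1) is the Meissel-Mertens constant, about 0.2615.
    print(f"x = {limit}: mean = {mean:.4f}, log_2 x + C = {loglog + 0.2615:.4f}")
    print(f"second moment about log_2 x = {second_moment:.4f}, log_2 x = {loglog:.4f}")
```

The sieve also makes the interchange of summation in (1.1) concrete: the total `sum(w[1:])` equals the sum of $\lfloor x/p \rfloor$ over primes $p \le x$.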
Let us follow them in defining, for a prime $p$,
\[
f_p(n) =
\begin{cases}
1 - \dfrac{1}{p} & \text{if } p \mid n, \\[1ex]
-\dfrac{1}{p} & \text{if } p \nmid n.
\end{cases}
\]
Extend this definition multiplicatively: if $m = \prod_i p_i^{\alpha_i}$, define $f_m(n) = \prod_i f_{p_i}(n)^{\alpha_i}$. (Thus $f_1(n) = 1$.) Also, instead of $\log_2 x$ let us work with the more exact mean value $\sum_{p \le x} 1/p$. Then we have
\[
\omega(n) - \sum_{p \le x} \frac{1}{p} = \sum_{p \le x} f_p(n).
\]
What is the advantage of this? If we think of a prime $p$ dividing $n$ with probability $1/p$ independently of other primes, then we have $E(f_m) = 0$ for squarefree $m > 1$. So we have written $\omega(n) - \sum_{p \le x} 1/p$ as a sum of independent random variables of mean 0. When working with this expression, the delicate cancellations mentioned above take care of themselves in moment calculations. Further, this model would predict by the central limit theorem that $\omega(n) - \sum_{p \le x} 1/p$ is normally distributed with mean 0.

The prediction above is in fact true. Erdős and Kac [1] were the first to show that $\omega(n)$ is normally distributed with mean $\log_2 x$ and variance $\log_2 x$. In [6] Rényi and Turán proved this with a sharp error term, using results on $\pi_k(x)$, the number of integers up to $x$ with exactly $k$ prime factors (see [8, 9]). The following theorem can also be found in Tenenbaum's book [10].

Theorem 1.1. The number of integers $n \le x$ for which $\big|(\omega(n) - \log_2 x)/\sqrt{\log_2 x}\big| < W$ for some $W > 0$ is
\[
\frac{x}{\sqrt{2\pi}} \int_{-W}^{W} \exp(-u^2/2)\, du + O\Big(\frac{x}{\sqrt{\log_2 x}}\Big),
\]
where the implied constant depends on $W$.

We show how a smooth version of the theorem above, with an error term nearly as good, can be proven with little effort.

Theorem 1.2. Fix a smooth ($C^\infty$) and compactly supported real function $\psi$. We have
\[
\frac{1}{x}\sum_{n \le x} \psi\Big(\frac{\omega(n) - \log_2 x}{\sqrt{\log_2 x}}\Big)
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \psi(u) \exp(-u^2/2)\, du + O\Big(\frac{\log_3 x}{\sqrt{\log_2 x}}\Big),
\]
where the implied constant depends on $\psi$.

Let us adopt the convention that $\varepsilon$ always denotes an arbitrary small positive constant, but not necessarily the same one from one occurrence to the next, and that all implied constants may depend implicitly on $\varepsilon$ and $\psi$.
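The definition of $f_p(n)$ and the identity $\omega(n) - \sum_{p \le x} 1/p = \sum_{p \le x} f_p(n)$ can be checked in exact rational arithmetic. The sketch below is illustrative only; the helper names (`primes_up_to`, `f`, `f_m`, `period_mean`) are hypothetical and not from the paper. It also computes the exact average of $f_m(n)$ over one full period $n = 1, \dots, \prod_{p \mid m} p$, which is a finite stand-in for the expectation in the probabilistic model.

```python
from fractions import Fraction


def primes_up_to(x):
    """All primes p <= x, by trial division (adequate for small x)."""
    return [p for p in range(2, x + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def f(p, n):
    """f_p(n) = 1 - 1/p if p divides n, and -1/p otherwise."""
    if n % p == 0:
        return Fraction(1) - Fraction(1, p)
    return -Fraction(1, p)


def f_m(prime_powers, n):
    """f_m(n) for m = prod p**alpha, with m passed as a dict {p: alpha}."""
    result = Fraction(1)
    for p, alpha in prime_powers.items():
        result *= f(p, n) ** alpha
    return result


def omega(n, primes):
    """Number of primes from the given list that divide n."""
    return sum(1 for p in primes if n % p == 0)


def period_mean(prime_powers):
    """Exact mean of f_m(n) over n = 1, ..., product of the primes dividing m."""
    period = 1
    for p in prime_powers:
        period *= p
    total = sum(f_m(prime_powers, n) for n in range(1, period + 1))
    return total / period


if __name__ == "__main__":
    x = 50  # arbitrary small cutoff for illustration
    primes = primes_up_to(x)
    mean_value = sum(Fraction(1, p) for p in primes)
    identity_holds = all(
        omega(n, primes) - mean_value == sum(f(p, n) for p in primes)
        for n in range(1, x + 1)
    )
    print("identity holds for all n <= x:", identity_holds)
    print("mean of f_15 over a period:", period_mean({3: 1, 5: 1}))
    print("mean of f_9 over a period:", period_mean({3: 2}))
```

Over a full period the divisibility conditions for distinct primes are exactly independent (by the Chinese remainder theorem), so the period mean vanishes for squarefree $m > 1$, while for $m = p^2$ it equals $\frac{1}{p}(1 - \frac{1}{p})^2 + (1 - \frac{1}{p})\frac{1}{p^2}$.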
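The moments computed by Granville and Soundararajan match the moments of a normally distributed random variable, and so their work implies the result of Erdős and Kac. We do use the idea of Granville and Soundararajan of writing $\omega(n) - \sum_{p \le x} 1/p$ as a sum of random variables of mean 0, but instead of computing moments we work with the Fourier transform of $\omega(n) - \sum_{p \le x} 1/p$, also called the characteristic function in the language of probability theory.

2. Set Up

Define $y = y(x) = \log x$ and $z = z(x) = x^{1/\log_2 x}$ and
\[
\omega(n; y, z) = \sum_{\substack{p \mid n \\ y < p < z}} 1.
\]
It is more convenient to work with
\[
\frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}
\]
in place of $(\omega(n) - \log_2 x)/\sqrt{\log_2 x}$. The following lemma will be used to show that there is no significant loss in disregarding these small or large primes.

Lemma 2.1. We have
\[
\frac{1}{x}\sum_{n \le x} \bigg| \psi\Big(\frac{\omega(n) - \log_2 x}{\sqrt{\log_2 x}}\Big) - \psi\bigg(\frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg) \bigg| \ll \frac{\log_3 x}{\sqrt{\log_2 x}}.
\]

Proof. Let $E = E(x)$ denote the (exceptional) set of integers less than or equal to $x$ with more than $10\log_3 x$ distinct prime factors less than $y$ or more than $10\log_3 x$ distinct prime factors greater than $z$. The size of this set is
\[
|E| \le \frac{x}{\lfloor 10\log_3 x \rfloor!}\Big(\sum_{p \le y} \frac{1}{p}\Big)^{10\log_3 x} + \frac{x}{\lfloor 10\log_3 x \rfloor!}\Big(\sum_{z \le p \le x} \frac{1}{p}\Big)^{10\log_3 x} \ll \frac{x}{\log_2 x},
\]
using that $\sum_{p \le x} 1/p = \log_2 x + C + O(1/\log x)$ and Stirling's estimate $n! \sim \sqrt{2\pi}\, n^{n+1/2} e^{-n}$. Therefore we have
\[
\frac{1}{x}\sum_{n \in E} \bigg| \psi\Big(\frac{\omega(n) - \log_2 x}{\sqrt{\log_2 x}}\Big) - \psi\bigg(\frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg) \bigg| \ll \frac{\|\psi\|_\infty}{\log_2 x}.
\]
For $n \notin E$ we have $\omega(n) - \omega(n; y, z) \ll \log_3 x$, and so it follows that
\[
\begin{aligned}
\frac{\omega(n) - \log_2 x}{\sqrt{\log_2 x}} - \frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}
&= \frac{\omega(n) - \log_2 x - \omega(n; y, z) + \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}} + O\Big(\frac{|\omega(n) - \log_2 x| \log_3 x}{\log_2 x}\Big) \\
&\ll \frac{\log_3 x}{\sqrt{\log_2 x}} + \frac{|\omega(n) - \log_2 x| \log_3 x}{\log_2 x}.
\end{aligned}
\]
Thus
\[
\frac{1}{x}\sum_{n \notin E} \bigg| \psi\Big(\frac{\omega(n) - \log_2 x}{\sqrt{\log_2 x}}\Big) - \psi\bigg(\frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg) \bigg|
\ll \|\psi'\|_\infty \frac{\log_3 x}{\sqrt{\log_2 x}} + \frac{\|\psi'\|_\infty}{x} \sum_{n \le x} \frac{|\omega(n) - \log_2 x| \log_3 x}{\log_2 x}.
\]
By the Cauchy–Schwarz inequality and (1.3) the second term above is seen to be $\ll \log_3 x\, (\log_2 x)^{-1/2}$. □

Let
\[
\hat{\psi}(T) = \int_{-\infty}^{\infty} \psi(u) e^{-iuT}\, du
\]
denote the Fourier transform of $\psi$. We have $|\hat{\psi}(T)| \ll 1$, and by integrating by parts several times we have $|\hat{\psi}(T)| \ll_B T^{-B}$ for any $B > 0$. By Fourier inversion and these bounds we have
\[
\begin{aligned}
\frac{1}{x}\sum_{n \le x} \psi\bigg(\frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg)
&= \frac{1}{x}\sum_{n \le x} \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{\psi}(T) \exp\bigg(iT\, \frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg)\, dT \\
&= \frac{1}{2\pi} \int_{-(\log_2 x)^{\varepsilon}}^{(\log_2 x)^{\varepsilon}} \hat{\psi}(T) \Bigg( \frac{1}{x}\sum_{n \le x} \exp\bigg(iT\, \frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg) \Bigg)\, dT + O\Big(\frac{1}{(\log_2 x)^{15}}\Big),
\end{aligned}
\tag{2.1}
\]
for any $\varepsilon > 0$. In the next section we will prove the following theorem.

Theorem 2.2. Let $t = T \big(\sum_{y<p<z} 1/p\big)^{-1/2}$. For $|T| \le \big(\sum_{y<p<z} 1/p\big)^{1/2}/1000$, we have
\[
\frac{1}{x}\sum_{n \le x} \exp\bigg(iT\, \frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg) = \exp\Big((e^{it} - 1 - it) \sum_{y<p<z} \frac{1}{p}\Big) + O\Big(\frac{1}{\log x}\Big).
\]

Certainly the theorem is true with $10^{-3}$ replaced by a larger constant, but we have not tried to improve it. For $|T| \le (\log_2 x)^{\varepsilon}$ the main term above equals
\[
\exp\Big((e^{it} - 1 - it) \sum_{y<p<z} \frac{1}{p}\Big)
= \exp\Big(\Big(\frac{-t^2}{2} + O(t^3)\Big) \sum_{y<p<z} \frac{1}{p}\Big)
= \exp\Big(\frac{-T^2}{2}\Big)\Big(1 + O\Big(\frac{T^3}{\sqrt{\log_2 x}}\Big)\Big).
\tag{2.2}
\]
The function $\exp(-T^2/2)$ is the characteristic function of the normal distribution. Using Theorem 2.2 and the observation (2.2) we get that the main term of (2.1) equals
\[
\begin{aligned}
&\frac{1}{2\pi} \int_{-(\log_2 x)^{\varepsilon}}^{(\log_2 x)^{\varepsilon}} \hat{\psi}(T) \exp\Big(\frac{-T^2}{2}\Big)\, dT + O\Big(\frac{1}{\sqrt{\log_2 x}}\Big) \\
&\qquad = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{\psi}(T) \exp\Big(\frac{-T^2}{2}\Big)\, dT + O\Big(\frac{1}{\sqrt{\log_2 x}}\Big).
\end{aligned}
\tag{2.3}
\]
Recall that the Fourier transform of $(1/\sqrt{2\pi}) \exp(-u^2/2)$ is $\exp(-T^2/2)$. By the Plancherel formula the main term of (2.3) equals
\[
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \psi(u) \exp\Big(\frac{-u^2}{2}\Big)\, du.
\]
Thus Theorem 1.2 follows from Lemma 2.1 and Theorem 2.2.

3. Proof of Theorem 2.2

For the range of $T$ under consideration in Theorem 2.2 we have that $|t| \le 10^{-3}$. The left-hand side of Theorem 2.2 is
\[
\frac{1}{x}\sum_{n \le x} \exp\Big(it \sum_{y<p<z} f_p(n)\Big) = \frac{1}{x}\sum_{n \le x} \prod_{y<p<z} \exp\big(it f_p(n)\big).
\]
Expanding the exponential as a Taylor series this equals
\[
\frac{1}{x}\sum_{n \le x} \prod_{y<p<z} \Big(1 + it f_p(n) - \frac{1}{2!} t^2 f_p^2(n) - \frac{1}{3!} i t^3 f_p^3(n) + \cdots\Big)
= \sum_{\substack{a \ge 1 \\ p \mid a \Rightarrow y<p<z}} K_a t^{\Omega(a)}\, \frac{1}{x}\sum_{n \le x} f_a(n),
\tag{3.1}
\]
where $\Omega(a)$ is the number of prime factors of $a$ counted with multiplicity and
\[
K_a = \prod_{p^{\alpha} \| a} \frac{i^{\alpha}}{\alpha!}.
\]
Above $p^{\alpha} \| a$ means that $p^{\alpha} \mid a$ but $p^{\alpha+1} \nmid a$. To evaluate (3.1) we use the following result of Granville and Soundararajan.

Lemma 3.1. Let $A$ denote the square-free part of $a$ and $d(A)$ the number of divisors of $A$. We have
\[
\frac{1}{x}\sum_{n \le x} f_a(n) = \prod_{p^{\alpha} \| a} \bigg(\frac{1}{p}\Big(1 - \frac{1}{p}\Big)^{\alpha} + \Big(1 - \frac{1}{p}\Big)\Big(\frac{-1}{p}\Big)^{\alpha}\bigg) + O\Big(\frac{d(A)}{x}\Big).
\]

The main term is just what we would expect from the heuristic that $p$ divides $n$ with probability $1/p$, independently of other primes. Note that the main term is zero unless $a$ is 1 or square-full ($a$ is square-full if $p \mid a$ implies $p^2 \mid a$). Before using this lemma, we make a couple of observations which will be needed later.

Lemma 3.2. We have
\[
\sum_{\substack{a \ge 1 \\ p \mid a \Rightarrow y<p<z \\ \omega(a) \ge (\log_2 x)/2}} |K_a|\, |t|^{\Omega(a)}\, \frac{1}{x}\sum_{n \le x} |f_a(n)| \ll \frac{1}{\log x}.
\tag{3.2}
\]

Proof. Let $r \ge \frac{1}{2}\log_2 x$ be an integer. We first bound absolutely the contribution of the terms of (3.2) with $\omega(a) = r$. Note that $|K_a| \le 1$ and $|f_{p^{\alpha}}(n)| \le |f_p(n)|$. Therefore
\[
\sum_{\substack{a \ge 1 \\ \omega(a) = r \\ p \mid a \Rightarrow y<p<z}} |K_a|\, |t|^{\Omega(a)}\, \frac{1}{x}\sum_{n \le x} |f_a(n)|
\ll \sum_{\substack{a \ge 1 \\ \omega(a) = r \\ p \mid a \Rightarrow y<p<z}} |t|^{\Omega(a)}\, \frac{1}{x}\sum_{n \le x} |f_A(n)|,
\tag{3.3}
\]
where $A$ denotes the square-free part of $a$. Note that for a fixed square-free integer $A$ with $\omega(A) = r$ we have
\[
\sum_{\substack{a \ge 1 \\ A = \text{square-free part of } a}} |t|^{\Omega(a)} \ll |2t|^r,
\]
as $|t| \le 10^{-3}$. Thus (3.3) is bounded by
\[
\ll |2t|^r\, \frac{1}{x}\sum_{n \le x} \frac{1}{r!} \Big(\sum_{y<p<z} |f_p(n)|\Big)^r.
\tag{3.4}
\]
Since $|f_p(n)| \le 1$ if $p \mid n$ and $|f_p(n)| \le 1/p$ if $p \nmid n$, this is bounded by
\[
\ll |2t|^r\, \frac{1}{x}\sum_{n \le x} \frac{1}{r!} \big(\omega(n; y, z) + \log_2 x\big)^r
\ll |4t|^r \frac{(\log_2 x)^r}{r!} + |4t|^r\, \frac{1}{x}\sum_{n \le x} \frac{\omega(n; y, z)^r}{r!}.
\tag{3.5}
\]
Since $r \ge \frac{1}{2}\log_2 x$ and $|t| \le 10^{-3}$ we have that the first term on the right-hand side of (3.5) is $\ll 10^{-r}$. For the second term we have
\[
|4t|^r\, \frac{1}{x}\sum_{n \le x} \frac{\omega(n; y, z)^r}{r!}
= \frac{|4t|^r}{r!}\, \frac{1}{x}\sum_{n \le x} \Big(\sum_{\substack{p \mid n \\ y<p<z}} 1\Big)^r
\ll \frac{|4t|^r}{r!} \sum_{y < p_1, \ldots, p_r < z} \frac{1}{x} \sum_{\substack{n \le x \\ [p_1, \ldots, p_r] \mid n}} 1,
\tag{3.6}
\]
where $[p_1, \ldots, p_r]$ is the least common multiple of the primes $p_1, \ldots, p_r$. We have that (3.6) is less than or equal to
\[
\frac{|4t|^r}{r!} \sum_{y < p_1, \ldots, p_r < z} \frac{1}{[p_1, \ldots, p_r]}.
\tag{3.7}
\]
Now for $0 \le k \le r - 1$ we have
\[
\sum_{\substack{y < p_1, \ldots, p_r < z \\ \{p_1, \ldots, p_r\} \text{ is a set of } r-k \text{ distinct primes}}} \frac{1}{[p_1, \ldots, p_r]} \ll \binom{r}{k} \Big(\sum_{y<p<z} \frac{1}{p}\Big)^{r-k}.
\]
So by the binomial theorem we get that (3.7) is
\[
\ll \frac{|4t|^r \big(1 + \sum_{y<p<z} 1/p\big)^r}{r!} \ll \frac{|4t|^r (\log_2 x)^r}{r!}.
\]
Since $r \ge \frac{1}{2}\log_2 x$ and $|t| \le 10^{-3}$, this is $\ll 10^{-r}$. Finally, summing $10^{-r}$ over the integers $r \ge \frac{1}{2}\log_2 x$, we get the lemma. □

Lemma 3.3. We have
\[
\sum_{\substack{a \ge 1 \\ p \mid a \Rightarrow y<p<z \\ \omega(a) \ge (\log_2 x)/2}} |K_a|\, |t|^{\Omega(a)} \prod_{p^{\alpha} \| a} \bigg| \frac{1}{p}\Big(1 - \frac{1}{p}\Big)^{\alpha} + \Big(1 - \frac{1}{p}\Big)\Big(\frac{-1}{p}\Big)^{\alpha} \bigg| \ll \frac{1}{\log x}.
\tag{3.8}
\]

Proof. Let $r \ge \frac{1}{2}\log_2 x$ be an integer. We first bound absolutely the contribution of the terms of (3.8) with $\omega(a) = r$. We have
\[
\sum_{\substack{a \ge 1 \\ p \mid a \Rightarrow y<p<z \\ \omega(a) = r}} |K_a|\, |t|^{\Omega(a)} \prod_{p^{\alpha} \| a} \bigg| \frac{1}{p}\Big(1 - \frac{1}{p}\Big)^{\alpha} + \Big(1 - \frac{1}{p}\Big)\Big(\frac{-1}{p}\Big)^{\alpha} \bigg|
\ll |4t|^r \frac{1}{r!} \Big(\sum_{y<p<z} \frac{1}{p}\Big)^r \ll |4t|^r \frac{(\log_2 x)^r}{r!} \ll \frac{1}{10^r},
\tag{3.9}
\]
since $r \ge \frac{1}{2}\log_2 x$ and $|t| \le 10^{-3}$. Summing $10^{-r}$ over the integers $r \ge \frac{1}{2}\log_2 x$, we get the lemma. □

Now we are ready to find the main term of Theorem 2.2. By Lemma 3.2 we have that (3.1) equals, up to an error of $O(1/\log x)$, the sum
\[
\sum_{\substack{a \ge 1 \\ p \mid a \Rightarrow y<p<z \\ \omega(a) \le (\log_2 x)/2}} K_a t^{\Omega(a)}\, \frac{1}{x}\sum_{n \le x} f_a(n).
\tag{3.10}
\]
We use Lemma 3.1 to evaluate this. The error incurred from the use of this lemma is bounded by
\[
\sum_{\substack{a \ge 1 \\ p \mid a \Rightarrow y<p<z \\ \omega(a) \le (\log_2 x)/2}} |t|^{\Omega(a)} \frac{d(A)}{x}
\ll \frac{2^{\log_2 x}}{x} \sum_{\substack{a \text{ square-free} \\ p \mid a \Rightarrow p<z \\ \omega(a) \le (\log_2 x)/2}} 1
\ll \frac{2^{\log_2 x}}{x}\, z^{(\log_2 x)/2} \ll \frac{\log x}{\sqrt{x}}.
\]
The main term of (3.10) equals
\[
\sum_{\substack{a \ge 1 \\ p \mid a \Rightarrow y<p<z \\ \omega(a) \le (\log_2 x)/2}} \prod_{p^{\alpha} \| a} \frac{i^{\alpha}}{\alpha!}\, t^{\alpha} \bigg(\frac{1}{p}\Big(1 - \frac{1}{p}\Big)^{\alpha} + \Big(1 - \frac{1}{p}\Big)\Big(\frac{-1}{p}\Big)^{\alpha}\bigg).
\tag{3.11}
\]
The expression above is zero if $a$ is not 1 or square-full. Thus for $a \ne 1$ we may further impose the implication $p^{\alpha} \| a \Rightarrow \alpha \ge 2$. By Lemma 3.3, we may also extend, up to an error of $O(1/\log x)$, the sum in (3.11) to all $a \ge 1$ whose prime factors lie between $y$ and $z$. Thus (3.11) equals, up to this error,
\[
\begin{aligned}
&\prod_{y<p<z} \bigg(1 + \frac{1}{p} \sum_{\alpha \ge 2} \frac{i^{\alpha}}{\alpha!}\, t^{\alpha} \Big(1 - \frac{1}{p}\Big)^{\alpha} + \Big(1 - \frac{1}{p}\Big) \sum_{\alpha \ge 2} \frac{i^{\alpha}}{\alpha!}\, t^{\alpha} \Big(\frac{-1}{p}\Big)^{\alpha}\bigg) \\
&\qquad = \prod_{y<p<z} \bigg(1 + \frac{1}{p}(e^{it} - 1 - it) + O\Big(\frac{1}{p^2}\Big)\bigg) \\
&\qquad = \exp\bigg(\sum_{y<p<z} \log\Big(1 + \frac{1}{p}(e^{it} - 1 - it) + O\Big(\frac{1}{p^2}\Big)\Big)\bigg).
\end{aligned}
\tag{3.12}
\]
Now since
\[
\log\Big(1 + \frac{1}{p}(e^{it} - 1 - it) + O\Big(\frac{1}{p^2}\Big)\Big) = \frac{1}{p}(e^{it} - 1 - it) + O\Big(\frac{1}{p^2}\Big),
\]
we have that (3.12) equals
\[
\exp\bigg((e^{it} - 1 - it) \sum_{y<p<z} \frac{1}{p} + O\Big(\frac{1}{y}\Big)\bigg).
\tag{3.13}
\]
Since $y = \log x$ and the real part of $e^{it} - 1 - it$ is $\cos t - 1 \le 0$, the expression (3.13) equals the main term of Theorem 2.2 up to an error of $O(1/\log x)$, which completes the proof.

4. Concluding Remarks

In [4], the characteristic function of Theorem 2.2 is used to detect when
\[
\bigg| \frac{\omega(n) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}} \bigg| \le \frac{1}{(\log_2 x)^{1/2 - \varepsilon}},
\]
or when $|\omega(n) - \log_2 x| \le (\log_2 x)^{\varepsilon}$. These are integers with very nearly the average number of prime factors. Let us call them $\varepsilon$-normal numbers. How are these numbers distributed amongst the integers?

The number of $\varepsilon$-normal numbers up to $x$ is $\sim x/\big(\sqrt{\pi/2}\, (\log_2 x)^{1/2 - \varepsilon}\big)$, by Theorem 1.1. Let us write them in order as $N_1, N_2, \ldots$. The distance between consecutive elements is then $\sqrt{\pi/2}\, (\log_2 x)^{1/2 - \varepsilon}$ on average. How does the spacing distance vary from the average? Rescale the $\varepsilon$-normal numbers by defining $\tilde{N}_i = N_i/\big(\sqrt{\pi/2}\, (\log_2 x)^{1/2 - \varepsilon}\big)$, so that the consecutive spacing between the rescaled $\varepsilon$-normal numbers is 1 on average. In [4] we prove that given $0 < \alpha < \beta$, we have
\[
\#\{i \le x : \alpha < \tilde{N}_{i+1} - \tilde{N}_i < \beta\} \sim x \int_{\alpha}^{\beta} e^{-u}\, du,
\tag{4.1}
\]
as $x \to \infty$. The distribution function $e^{-u}$ is called the Poisson or exponential distribution. It is significant because it predicts how randomly scattered objects are spaced. More precisely, suppose that we have $\lfloor x \rfloor$ random variables independently and uniformly taking real values in the interval $(0, \lfloor x \rfloor)$ and written in order as $Y_1 < \cdots < Y_{\lfloor x \rfloor}$. Then for $\alpha > 0$ we have that $\mathrm{Prob}(Y_{i+1} - Y_i > \alpha) \sim e^{-\alpha}$ as $x \to \infty$ for any $i$. So (4.1) implies that $\varepsilon$-normal numbers are somewhat randomly dispersed amongst the integers.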
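The statement about uniformly scattered points can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `spacing_tail_fraction`, the fixed seed and the sample size are illustrative choices, not taken from [4]. It drops `count` independent uniform points into the interval $(0, \texttt{count})$, sorts them, and measures the fraction of consecutive gaps exceeding $\alpha$, which should be close to $e^{-\alpha}$ for large `count`.

```python
import math
import random


def spacing_tail_fraction(count, alpha, seed=0):
    """Fraction of consecutive gaps larger than alpha among `count`
    independent uniform points in (0, count), listed in increasing order."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    points = sorted(rng.uniform(0, count) for _ in range(count))
    gaps = [right - left for left, right in zip(points, points[1:])]
    return sum(1 for gap in gaps if gap > alpha) / len(gaps)


if __name__ == "__main__":
    count = 100_000  # plays the role of floor(x)
    for alpha in (0.5, 1.0, 2.0):
        observed = spacing_tail_fraction(count, alpha)
        print(f"alpha = {alpha}: observed {observed:.4f}, exp(-alpha) = {math.exp(-alpha):.4f}")
```

The same tail statistic could in principle be computed for the rescaled gaps $\tilde{N}_{i+1} - \tilde{N}_i$ in (4.1), although the slow growth of $\log_2 x$ means that the asymptotic regime is not reachable by direct computation.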
The study of spacings in arithmetic sequences is of great interest to number theorists. We name just a few examples. Gallagher [2] showed that conditional on the truth of the Hardy–Littlewood prime $k$-tuple conjectures, the primes less than $x$, as $x \to \infty$, form a Poisson process (that is, their spacings follow the Poisson distribution law as described above). Kurlberg and Rudnick [5] proved that the quadratic residues modulo $q$, as $\omega(q) \to \infty$, form a Poisson process. It is conjectured that the fractional parts of $n^2 \sqrt{2}$ for $n \le x$, as $x \to \infty$, form a Poisson process (see [7]). Of course, some arithmetic sequences do not behave randomly, such as the zeros of the Riemann zeta function.

We conjecture that integers with more or less exactly the average number of prime factors, that is, those with $|\omega(n) - \log_2 x| \le \frac{1}{2}$, form a Poisson process.

The main strategy of our proof is to show that additive shifts of $\omega(n)$ behave independently. For example we find the joint characteristic function
\[
\begin{aligned}
\frac{1}{x}\sum_{n \le x} &\exp\bigg(iT_1\, \frac{\omega(n; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg) \exp\bigg(iT_2\, \frac{\omega(n+2; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg) \\
&\times \exp\bigg(iT_3\, \frac{\omega(n+5; y, z) - \sum_{y<p<z} 1/p}{\sqrt{\sum_{y<p<z} 1/p}}\bigg),
\end{aligned}
\]
and show that it essentially equals $\exp(-T_1^2/2) \exp(-T_2^2/2) \exp(-T_3^2/2)$.

References

1. P. Erdős and M. Kac, The Gaussian law of errors in the theory of additive number theoretic functions, Amer. J. Math. 62 (1940), 738–742.
2. P. X. Gallagher, On the distribution of primes in short intervals, Mathematika 23 (1976), no. 1, 4–9.
3. A. Granville and K. Soundararajan, Sieving and the Erdős–Kac theorem, Equidistribution in Number Theory, an Introduction (Montréal, 2005) (A. Granville and Z. Rudnick, eds.), NATO Sci. Ser. II Math. Phys. Chem., vol. 237, Springer, Dordrecht, 2007, pp. 15–27.
4. R. Khan, On the distribution of normal numbers, preprint.
5. P. Kurlberg and Z. Rudnick, The distribution of spacings between quadratic residues, Duke Math. J. 100 (1999), no. 2, 211–242.
6. A. Rényi and P. Turán, On a theorem of Erdős–Kac, Acta Arith. 4 (1958), 71–84.
7. Z. Rudnick, P. Sarnak, and A. Zaharescu, The distribution of spacings between the fractional parts of $n^2 \alpha$, Invent. Math. 145 (2001), no. 1, 37–57.
8. L. G. Sathe, On a problem of Hardy on the distribution of integers having a given number of prime factors. IV, J. Indian Math. Soc. (N.S.) 18 (1954), 43–81.
9. A. Selberg, Note on a paper by L. G. Sathe, J. Indian Math. Soc. (N.S.) 18 (1954), 83–87.
10. G. Tenenbaum, Introduction à la théorie analytique et probabiliste des nombres, 2nd ed., Cours Spéc., vol. 1, Soc. Math. France, Paris, 1995; English transl. in Introduction to analytic and probabilistic number theory, translated by C. B. Thomas, Cambridge Stud. Adv. Math., vol. 46, Cambridge Univ. Press, Cambridge, 1995.
11. P. Turán, On a theorem of Hardy and Ramanujan, J. London Math. Soc. 9 (1934), 274–276.

Department of Mathematics, University of Michigan, 530 Church St., Ann Arbor, MI 48109, USA.
E-mail address: rrkhan@umich.edu