VIII. DETECTION AND ESTIMATION THEORY

Academic and Research Staff
Prof. H. L. Van Trees, Prof. A. B. Baggeroer, Prof. D. L. Snyder

Graduate Students
I. L. Bruyland, A. E. Eckberg, J. E. Evans, A. T. P. McGarty, J. R. Morrissey

[This work was supported in part by the Joint Services Electronics Programs (U.S. Army, U.S. Navy, and U.S. Air Force) under Contract DA 28-043-AMC-02536(E).]

LIKELIHOOD RATIOS AND MUTUAL INFORMATION FOR POISSON PROCESSES WITH STOCHASTIC INTENSITY FUNCTIONS

1. Introduction

In this report we formally obtain some results regarding likelihood ratios and mutual information for (conditional) Poisson processes with stochastic intensity functions. Our results for likelihood ratios extend results obtained recently by Snyder^{1,2} by a somewhat different approach: Snyder considers the case in which the stochastic intensity function is Markovian, while our results include the case of non-Markovian stochastic intensities. To the best of our knowledge, the results for mutual information are new.

2. Likelihood Ratios

We consider the following problem. Given the observations $\{N(u),\ 0 \le u \le t\}$ of a conditional Poisson counting process, determine which of the following hypotheses is true, so as to minimize an expected risk function:^3

    $H_1$: $\lambda_r(t) = \lambda_0 + \lambda(t)$    (1)

    $H_0$: $\lambda_r(t) = \lambda_0$,    (2)

where $\lambda_r(t)$ is the (stochastic) intensity function of $N(t)$. Here $\lambda_0$ might represent the rate of arrival of photons from background radiation. Following Snyder,^2 by a conditional Poisson counting process we mean that if $\{\lambda_r(t),\ t \ge 0\}$ is given, then $\{N(t),\ t \ge 0\}$ is an integer-valued process with independent increments, and for $t \ge 0$,

    $\Pr[N(t+\Delta t) - N(t) = 1 \mid \lambda_r(t)] = \lambda_r(t)\,\Delta t + o(\Delta t)$
    $\Pr[N(t+\Delta t) - N(t) = 0 \mid \lambda_r(t)] = 1 - \lambda_r(t)\,\Delta t + o(\Delta t)$.    (3)
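Definition (3) translates directly into a sampling procedure: on a fine grid, each increment of $N(t)$ is, to first order, a Bernoulli trial with success probability $\lambda_r(t)\,\Delta t$. The following sketch is not part of the original report; the random-telegraph form chosen for $\lambda(t)$ and all numerical values are arbitrary assumptions made only so the example is self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_counts(lam_path, dt, rng):
    """Draw increments of N(t) on a grid of spacing dt according to (3):
    Pr[dN = 1 | lam] = lam*dt + o(dt).  Valid when lam*dt << 1."""
    return (rng.random(lam_path.size) < lam_path * dt).astype(int)

# Hypothetical random-telegraph intensity lambda(t), switching between
# 0 and 5 at rate q -- an arbitrary illustrative choice.
T, dt, lam0, q = 10.0, 1e-3, 2.0, 0.5
n = int(T / dt)
state = np.cumsum(rng.random(n) < q * dt) % 2   # 0/1 telegraph path
lam = 5.0 * state                               # lambda(t) sample path
dN = simulate_counts(lam0 + lam, dt, rng)       # observations under H1
print("events in [0, T]:", dN.sum())
```

For rates this small relative to $1/\Delta t$ the Bernoulli approximation of (3) is adequate; an exact alternative is to draw Poisson-distributed increments in each bin.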
The method by which we obtain the likelihood ratio for the detection problem is quite similar to that used by Duncan^{4-6} to obtain results for additive white Gaussian noise channels. We first consider the likelihood ratio conditional on knowledge of $\lambda(t)$. This is equivalent to the detection problem with a known intensity function, and the form of the likelihood ratio is well known.^{7-13} The unconditional likelihood ratio is then obtained by averaging over $\lambda(t)$ and utilizing certain properties of stochastic integrals.

The likelihood ratio for the detection problem above, conditional on knowledge of $\lambda(t)$, has been shown by Reiffen and Sherman^7 to be

    $\Lambda_\lambda(t) = \dfrac{p(N_0^t \mid \lambda, H_1)}{p(N_0^t \mid H_0)} = \exp\left[ -\displaystyle\int_0^t \lambda(u)\,du + \int_0^t \ln\frac{\lambda_0 + \lambda(u)}{\lambda_0}\,dN(u) \right]$,    (4)

where

$p(N_0^t \mid \lambda, H_1)$ = conditional probability density of the observed counting process $N_0^t = \{N(u),\ 0 \le u \le t\}$, given $\lambda(t)$, under the assumption that $H_1$ is true;
$p(N_0^t \mid H_0)$ = conditional probability density of $N_0^t$, under the assumption that $H_0$ is true.

The second integral on the right-hand side of (4) (and all other integrals involving $dN(u)$) will be interpreted in the sense introduced by Ito (see Doob^{14}).

Before averaging (4) over possible paths of $\lambda(t)$, we shall put (4) into a more convenient form by use of the Ito differential rule. Snyder^2 has shown that if $z_t$ is a scalar Markov process satisfying the stochastic differential equation

    $dz_t = a_t\,dt + b_t\,dN_t$,    (5)

where $a_t$ and $b_t$ are nonanticipative functionals of the Poisson counting process $N_t$, and $\psi_t(z_t)$ is a differentiable function of $z_t$ and $t$, then

    $d\psi_t(z_t) = \left[\dfrac{\partial \psi_t(z_t)}{\partial t} + a_t \dfrac{\partial \psi_t(z_t)}{\partial z}\right] dt + \left[\psi_t(z_t + b_t) - \psi_t(z_t)\right] dN_t$.    (6)

Applying (6) to (4), we find that $\Lambda_\lambda(t)$ satisfies the following stochastic differential equation:

    $d\Lambda_\lambda(t) = -\lambda(t)\,\Lambda_\lambda(t)\,dt + \left[ e^{\ln[(\lambda_0+\lambda(t))/\lambda_0]} - 1 \right] \Lambda_\lambda(t)\,dN(t) = -\lambda(t)\,\Lambda_\lambda(t)\,dt + \dfrac{\lambda(t)}{\lambda_0}\,\Lambda_\lambda(t)\,dN(t)$,    (7)

or equivalently,

    $\Lambda_\lambda(t) = \Lambda_\lambda(0) - \displaystyle\int_0^t \lambda(u)\,\Lambda_\lambda(u)\,du + \frac{1}{\lambda_0}\int_0^t \lambda(u)\,\Lambda_\lambda(u)\,dN(u)$.    (8)
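The step from (4) to (7) is worth making explicit. Identify the exponent of (4) with $z_t$ in (5), so that $a_t = -\lambda(t)$ and $b_t = \ln[(\lambda_0+\lambda(t))/\lambda_0]$, and apply (6) with $\psi(z) = e^z$ (no explicit $t$ dependence):

    $d\Lambda_\lambda(t) = d\,e^{z_t} = a_t\,e^{z_t}\,dt + \left(e^{z_t + b_t} - e^{z_t}\right) dN_t$
    $\qquad\quad\ \, = -\lambda(t)\,\Lambda_\lambda(t)\,dt + \left(\dfrac{\lambda_0+\lambda(t)}{\lambda_0} - 1\right)\Lambda_\lambda(t)\,dN_t$
    $\qquad\quad\ \, = -\lambda(t)\,\Lambda_\lambda(t)\,dt + \dfrac{\lambda(t)}{\lambda_0}\,\Lambda_\lambda(t)\,dN_t$.

The $dt$ term is the ordinary differential of the exponent, and the jump term evaluates $\psi$ at $z_t + b_t$, that is, multiplies $\Lambda_\lambda$ by $e^{b_t} = (\lambda_0+\lambda(t))/\lambda_0$.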
We now take the expectation of both sides of (8) with respect to all possible paths of $\lambda(t)$. From (4), we see that $\Lambda_\lambda(t)$ is related to the unconditional likelihood ratio $\Lambda(t)$ by

    $\Lambda(t) = E_\lambda[\Lambda_\lambda(t)]$,    (9)

where $E_\lambda[\,\cdot\,]$ denotes averaging over all sample paths of $\lambda(t)$. Thus

    $\Lambda(t) = \Lambda(0) - \displaystyle\int_0^t E_\lambda[\lambda(u)\,\Lambda_\lambda(u)]\,du + \frac{1}{\lambda_0}\int_0^t E_\lambda[\lambda(u)\,\Lambda_\lambda(u)]\,dN(u)$.    (10)

In Eqs. 8-10, it is important to keep in mind that $N(u)$ is the observation and thus is fixed, as far as the expectations in (9) and (10) are concerned.

The stochastic differential equation corresponding to (10) is

    $d\Lambda(t) = -E_\lambda[\lambda(t)\,\Lambda_\lambda(t)]\,dt + \dfrac{1}{\lambda_0}\,E_\lambda[\lambda(t)\,\Lambda_\lambda(t)]\,dN(t) = -\dfrac{E_\lambda[\lambda(t)\,\Lambda_\lambda(t)]}{E_\lambda[\Lambda_\lambda(t)]}\,\Lambda(t)\,dt + \dfrac{1}{\lambda_0}\,\dfrac{E_\lambda[\lambda(t)\,\Lambda_\lambda(t)]}{E_\lambda[\Lambda_\lambda(t)]}\,\Lambda(t)\,dN(t)$.    (11)

But

    $\dfrac{E_\lambda[\lambda(t)\,\Lambda_\lambda(t)]}{E_\lambda[\Lambda_\lambda(t)]} = \dfrac{E_\lambda[\lambda(t)\,p(N_0^t \mid \lambda, H_1)]}{E_\lambda[p(N_0^t \mid \lambda, H_1)]} = E[\lambda(t) \mid N_0^t, H_1] \triangleq \hat\lambda(t)$,    (12)

the conditional expectation of $\lambda(t)$, given $N_0^t$, under the assumption that $H_1$ is true; that is, the least-squares realizable estimate of $\lambda(t)$ under $H_1$.

Thus, we find that the likelihood ratio $\Lambda(t)$ is the solution of the stochastic differential equation

    $d\Lambda(t) = -\hat\lambda(t)\,\Lambda(t)\,dt + \dfrac{1}{\lambda_0}\,\hat\lambda(t)\,\Lambda(t)\,dN(t)$.    (13)

The solution to (13) is

    $\Lambda(t) = \exp\left[ -\displaystyle\int_0^t \hat\lambda(u)\,du + \int_0^t \ln\frac{\lambda_0 + \hat\lambda(u)}{\lambda_0}\,dN(u) \right]$.    (14)

Our result (14) can be verified by applying the Ito differential rule for Poisson counting processes to (14) and noting that we obtain (13). Furthermore, (14) clearly has the proper value at $t = 0$.

Comparing (14) with (4), we see that the likelihood ratio for stochastic intensity functions is equivalent to that for known intensity functions, in the sense that the causal minimum mean-square estimate of the intensity is incorporated in the detector in the same way as if it were known. An analogous relationship has been obtained for stochastic signals on additive white Gaussian noise channels by Duncan.^{4,5} The form above for the likelihood ratio is identical to that obtained by Snyder^2 for Markovian $\lambda(t)$. Snyder^1 also has some results concerning means of obtaining causal minimum mean-square estimates of $\lambda(t)$.
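Because (14) has the same form as (4) with $\hat\lambda$ in place of $\lambda$, a detector can be implemented by feeding any causal intensity estimate into the known-signal receiver. The discrete-time sketch below is not from the original report: it treats the causal estimate as a supplied array (how it is produced, e.g. by Snyder's filter equations, is outside its scope), and the sinusoidal test intensity is an arbitrary assumption.

```python
import numpy as np

def log_likelihood_ratio(dN, lam_hat, lam0, dt):
    """Discretized form of (14):
    ln Lambda(t) = -int_0^t lam_hat du + int_0^t ln(1 + lam_hat/lam0) dN.
    dN      -- array of count increments on the time grid
    lam_hat -- causal least-squares estimate of lambda(u) on the same grid
               (assumed supplied by a separate filter)
    """
    return np.cumsum(-lam_hat * dt + np.log1p(lam_hat / lam0) * dN)

# Sanity check with a known (deterministic) intensity, where lam_hat = lam
# and (14) reduces to the known-signal form (4).
rng = np.random.default_rng(1)
dt = 1e-3
t = np.arange(0.0, 10.0, dt)
lam0 = 2.0
lam = 1.5 * (1.0 + np.sin(t))          # known intensity, so lam_hat = lam
dN = rng.poisson((lam0 + lam) * dt)    # counts generated under H1
print("ln Lambda(T) =", log_likelihood_ratio(dN, lam, lam0, dt)[-1])
```

With data generated under $H_1$, the terminal log-likelihood ratio tends to be positive, as expected.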
3. Mutual Information Results

We shall now apply our results for likelihood ratios to the problem of computing the mutual information between the message process $m$ and the conditional Poisson counting process $\{N(t),\ t \ge 0\}$ when the stochastic intensity function of $N(t)$ is $\lambda_0 + \lambda(t, m)$. Our approach is quite similar to that of Duncan^6 and Zakai,^{15} who obtained mutual information results for additive white Gaussian noise channels: we convert the problem of calculating certain probabilities into a detection problem by introducing a dummy hypothesis corresponding to $H_0$ of the previous section. By using appropriate likelihood-ratio results and some properties of stochastic integrals, we are able to obtain results in terms of certain causal filtering estimates.

Let us define

$N_0^t$ = channel output (counting process) in the time interval $[0, t]$
$\lambda_r(t) = \lambda(t, m) + \lambda_0$ = intensity function of $N(t)$
$m$ = message.

The mutual information between $N_0^t$ and $m$ is given (see Gelfand and Yaglom^{16}) by

    $I[N_0^t, m] = E\left[\log \dfrac{p(N_0^t \mid m)}{p(N_0^t)}\right]$.    (15)

The likelihood ratio appearing in the expression for the mutual information can be written

    $\dfrac{p(N_0^t \mid m)}{p(N_0^t)} = \dfrac{p(N_0^t \mid m)/p(N_0^t \mid H_0)}{p(N_0^t)/p(N_0^t \mid H_0)}$,    (16)

where

$p(N_0^t \mid m)/p(N_0^t \mid H_0)$ = likelihood ratio for detection between the hypothesis that $\lambda_r(t) = \lambda(t, m) + \lambda_0$, with the particular $m$, and the hypothesis that $\lambda_r(t) = \lambda_0$;
$p(N_0^t)/p(N_0^t \mid H_0)$ = likelihood ratio for detection between the hypothesis that $\lambda_r(t) = \lambda(t, m) + \lambda_0$, with $m$ random, and the hypothesis that $\lambda_r(t) = \lambda_0$.

Now, from Section 2,

    $\dfrac{p(N_0^t \mid m)}{p(N_0^t \mid H_0)} = \exp\left[ -\displaystyle\int_0^t \lambda(u, m)\,du + \int_0^t \ln\frac{\lambda_0 + \lambda(u, m)}{\lambda_0}\,dN(u) \right]$    (17)

    $\dfrac{p(N_0^t)}{p(N_0^t \mid H_0)} = \exp\left[ -\displaystyle\int_0^t \hat\lambda(u)\,du + \int_0^t \ln\frac{\lambda_0 + \hat\lambda(u)}{\lambda_0}\,dN(u) \right]$,    (18)

where $\hat\lambda(u)$ is the conditional mean of $\lambda(u, m)$, given $\{N(s),\ 0 \le s \le u\}$. That is,

    $\hat\lambda(t) = E[\lambda(t, m) \mid N_0^t,\ \lambda_r(t) = \lambda(t, m) + \lambda_0]$.    (19)

Thus we have

    $I[N_0^t, m] = E\left\{ \displaystyle\int_0^t [-\lambda(u, m) + \hat\lambda(u)]\,du + \int_0^t \ln\frac{\lambda_0 + \lambda(u, m)}{\lambda_0 + \hat\lambda(u)}\,dN(u) \right\}$.    (20)

The first integral can be shown to be zero by utilizing the properties of a conditional expectation:

    $E\left\{\displaystyle\int_0^t [\lambda(u, m) - \hat\lambda(u)]\,du\right\} = \int_0^t E\left\{E[\lambda(u, m) \mid N_0^u] - \hat\lambda(u)\right\} du = 0$.    (21)

To evaluate the second integral, we write

    $\displaystyle\int_0^t \ln\frac{\lambda_0+\lambda(u,m)}{\lambda_0+\hat\lambda(u)}\,dN(u) = \int_0^t \ln\frac{\lambda_0+\lambda(u,m)}{\lambda_0+\hat\lambda(u)}\,\{dN(u) - [\lambda_0+\lambda(u,m)]\,du\} + \int_0^t \ln\frac{\lambda_0+\lambda(u,m)}{\lambda_0+\hat\lambda(u)}\,[\lambda_0+\lambda(u,m)]\,du$.

Now

    $E\left\{\displaystyle\int_0^t \ln\frac{\lambda_0+\lambda(u,m)}{\lambda_0+\hat\lambda(u)}\,\{dN(u) - [\lambda_0+\lambda(u,m)]\,du\}\right\} = 0$    (22)

from the definition of the stochastic integral, since $N(u) - \int_0^u [\lambda_0 + \lambda(s, m)]\,ds$ is a martingale (Doob^{14}). Thus we have

    $I[N_0^t, m] = E\left\{\displaystyle\int_0^t \ln\frac{\lambda_0+\lambda(u,m)}{\lambda_0+\hat\lambda(u)}\,[\lambda_0+\lambda(u,m)]\,du\right\}$.    (23)

Next, we note that

    $I[N_0^t, m] = \displaystyle\int_0^t E\big\{[\lambda_0+\lambda(u,m)]\ln[\lambda_0+\lambda(u,m)] - [\lambda_0+\lambda(u,m)]\ln[\lambda_0+\hat\lambda(u)]\big\}\,du$.

Again taking expectations conditional on $N_0^u$, we obtain

    $I[N_0^t, m] = \displaystyle\int_0^t E\Big\{\widehat{[\lambda_0+\lambda(u)]\ln[\lambda_0+\lambda(u)]} - [\lambda_0+\hat\lambda(u)]\ln[\lambda_0+\hat\lambda(u)]\Big\}\,du$,    (24)

where $\widehat{[\lambda_0+\lambda(u)]\ln[\lambda_0+\lambda(u)]}$ = realizable least-squares estimate of $[\lambda_0+\lambda(u,m)]\ln[\lambda_0+\lambda(u,m)]$, given the observations $\{N(s),\ 0 \le s \le u\}$, under the assumption that $\lambda_r(t) = \lambda(t, m) + \lambda_0$. One can readily verify that $I[N_0^t, m]$ is a monotone increasing function of $t$.

In the case of large background intensity, that is, $\lambda_0 \gg \lambda(u, m)$ and $\lambda_0 \gg \hat\lambda(u)$, we can relate $I[N_0^t, m]$ to a realizable least-squares filtering error by using the expansion $\ln(\lambda_0 + \lambda) \approx \ln \lambda_0 + \lambda/\lambda_0$ to obtain

    $I[N_0^t, m] \approx \dfrac{1}{2\lambda_0}\displaystyle\int_0^t E\big\{\widehat{\lambda^2}(u) - \hat\lambda^2(u)\big\}\,du = \dfrac{1}{2\lambda_0}\int_0^t E\{p(u)\}\,du$,    (25)

where $p(u)$ is the conditional variance of the estimate $\hat\lambda(u)$. A somewhat more interesting result can be obtained by using this expansion in Eq. 23 to obtain

    $I[N_0^t, m] \approx \dfrac{1}{2\lambda_0}\displaystyle\int_0^t E\big\{[\lambda(u, m) - \hat\lambda(u)]\,\lambda(u, m)\big\}\,du$.    (26)

The error, $\lambda(u, m) - \hat\lambda(u)$, is seen to be orthogonal to $\hat\lambda(u)$ as follows:

    $E\big\{[\lambda(u, m) - \hat\lambda(u)]\,\hat\lambda(u)\big\} = E\Big(E\big\{\lambda(u, m) - \hat\lambda(u) \mid N_0^u\big\}\,\hat\lambda(u)\Big) = 0$,

so that (26) is equivalent to

    $I[N_0^t, m] \approx \dfrac{1}{2\lambda_0}\displaystyle\int_0^t E\big\{[\lambda(u, m) - \hat\lambda(u)]^2\big\}\,du$.    (27)

Equation 27 is our desired result for the relationship between the mutual information and the realizable filtering error for the estimation of $\lambda(t)$.

A result quite similar to (27) was obtained for the mutual information on the white Gaussian channel with (or without) feedback by Duncan^6 and Zakai.^{15} They consider the problem

    $y(t) = \displaystyle\int_0^t s(\sigma, y_0^\sigma, m)\,d\sigma + w(t)$,    (28)

where $m$ denotes the message (the channel input), $y(t)$ denotes the channel output at time $t$, $y_0^\sigma$ denotes the output path $\{y(u),\ 0 \le u \le \sigma\}$, and $w(t)$ is a Brownian motion. The result that they obtain is that the amount of information between the message $m$ and the output path $y_0^t$, $I[y_0^t, m]$, is related to a causal mean-square filtering error by

    $I[y_0^t, m] = \dfrac{1}{2}\,E \displaystyle\int_0^t \big[s(\sigma, y_0^\sigma, m) - \hat s(\sigma)\big]^2\,d\sigma$,    (29)

where $\hat s(\sigma) = E[s(\sigma, y_0^\sigma, m) \mid y_0^\sigma]$.
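Relation (27) can be checked numerically for any model in which the causal estimate $\hat\lambda(u)$ is computable in closed form. The sketch below is not part of the original report: it assumes, purely for illustration, that $\lambda(t, m) = m$ is a random constant taking the values $0$ or $a$ with equal probability, so that $\hat\lambda(t)$ is a simple posterior mean, and compares the filtering-error expression with the exact mutual information (for constant rates, $N(T)$ is a sufficient statistic).

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(2)

# Illustrative model (an assumption, not from the report): lambda(t, m) = m,
# with m = 0 or a equally likely, and lam0 >> a (large background).
lam0, a, T, dt, trials = 50.0, 2.0, 4.0, 1e-3, 2000
n = int(T / dt)

err = np.zeros(n)                          # E[(lambda - lam_hat)^2] vs time
for _ in range(trials):
    m = a if rng.random() < 0.5 else 0.0
    dN = rng.poisson((lam0 + m) * dt, n)   # observed count increments
    # causal posterior P(m = a | N_0^t) from the running log-LR, flat prior
    llr = np.cumsum(-a * dt + np.log((lam0 + a) / lam0) * dN)
    lam_hat = a / (1.0 + np.exp(-llr))     # = a * P(m = a | N_0^t)
    err += (m - lam_hat) ** 2 / trials

rhs_27 = err.sum() * dt / (2 * lam0)       # right-hand side of (27)

# Exact I(m; N_0^T) in nats, via the sufficient statistic N(T).
k = np.arange(0, 400)
p0, p1 = poisson.pmf(k, lam0 * T), poisson.pmf(k, (lam0 + a) * T)
mix = 0.5 * (p0 + p1)
mi = 0.5 * np.sum(p0 * np.log(p0 / mix) + p1 * np.log(p1 / mix))
print(f"filtering-error expression (27): {rhs_27:.4f}   exact MI: {mi:.4f}")
```

The two numbers should agree to within the $O(a/\lambda_0)$ error of the expansion and the Monte Carlo fluctuation.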
4. Summary

We have formally derived some results regarding likelihood ratios and mutual information for (conditional) Poisson processes with stochastic intensity functions. The results emphasize the role of realizable least-squares estimates of the stochastic intensity.

Several comments on extensions and utilization of the results presented here are appropriate.

1. The derivation here is formal; for example, we have assumed that several interchanges of integration and expectation are valid without giving sufficient conditions. It would be of interest to establish these results rigorously by using appropriate Radon-Nikodym derivative results,^{17,18} Borel fields, and so forth.

2. We can readily extend our results to cover the case of feedback channels by making a minor extension of the likelihood-ratio results. Vector channels can also be readily considered by minor modifications of the likelihood-ratio results.

3. These results may be of interest in optical communication,^{7-13} biomedical data processing,^1 and studies of auditory psychophysics.^{19}

J. E. Evans

[Mr. James E. Evans is an M.I.T. Lincoln Laboratory Staff Associate.]

References

1. D. L. Snyder, "Estimation of Stochastic Intensity Functions of Conditional Poisson Processes," Monograph No. 128, Washington University Biomedical Computer Laboratory, St. Louis, Missouri, April 1970.
2. D. L. Snyder, "Detection of Nonhomogeneous Poisson Processes Having Stochastic Intensity Functions," Monograph No. 129, Washington University Biomedical Computer Laboratory, St. Louis, Missouri, April 1970.
3. The likelihood ratio when $\lambda_r(t) = \lambda_0 + \lambda(t)$ on hypothesis $H_0$ can be obtained by using the chain rule for likelihood ratios. See Snyder^2 for details.
4. T. E. Duncan, "Evaluation of Likelihood Functions," Inform. Contr. 13, 62-74 (1968).
5. T. E. Duncan, "Likelihood Functions for Stochastic Signals in White Noise" (unpublished paper, dated 1969).
6. T. E. Duncan, "On the Calculation of Mutual Information" (submitted for publication).
7. B. Reiffen and H. Sherman, "An Optimum Demodulator for Poisson Processes: Photon Source Detectors," Proc. IEEE 51, 1316-1320 (1963).
8. J. W. Goodman, "Optimum Processing of Photoelectron Counts," IEEE J. Quantum Electronics, Vol. QE-1, pp. 180-181, July 1965.
9. R. M. Gagliardi and S. Karp, "N-ary Poisson Detection and Optical Communications," IEEE Trans., Vol. COM-17, No. 2, pp. 208-216, April 1969.
10. G. L. Fillmore and G. Lachs, "Information Rates for Photocount Detection Systems," IEEE Trans., Vol. IT-15, No. 4, pp. 465-468, July 1969.
11. I. Bar-David, "Communication under the Poisson Regime," IEEE Trans., Vol. IT-15, No. 1, pp. 31-37, January 1969.
12. R. W. Chang, "Photon Detection for an Optical Pulse Code Modulation System," IEEE Trans., Vol. IT-15, pp. 725-728, November 1969.
13. K. Abend, "Optimum Photon Detection," IEEE Trans., Vol. IT-12, pp. 64-65, January 1966.
14. J. L. Doob, Stochastic Processes (John Wiley and Sons, Inc., New York, 1953).
15. M. Zakai, "On the Mutual Information of the White Gaussian Channel with or without Feedback" (submitted for publication).
16. I. M. Gelfand and A. M. Yaglom, "Calculation of the Amount of Information about a Random Function Contained in Another Such Function," Am. Math. Soc. Translations (Series 2) 12, 199-246 (1959).
17. A. V. Skorokhod, "On the Differentiability of Measures Which Correspond to Stochastic Processes. I. Processes with Independent Increments," Theory of Probability and Its Applications, Vol. 2, No. 4, pp. 407-432, 1957.
18. A. V. Skorokhod, "Absolute Continuity of Infinitely Divisible Distributions under Translations," Theory of Probability and Its Applications, Vol. 10, No. 4, pp. 465-472, 1965.
19. W. M. Siebert, "Stimulus Transformations in the Peripheral Auditory System," in P. A. Kolers and M. Eden (eds.), Recognizing Patterns in Living and Automatic Systems (The M.I.T. Press, Cambridge, Mass., 1968), pp. 104-133.