Survey of Maneuvering Target Tracking. Part V: Multiple-Model Methods

X. RONG LI, Fellow, IEEE
VESSELIN P. JILKOV, Member, IEEE
University of New Orleans

This is the fifth part of a series of papers that provide a comprehensive survey of techniques for tracking maneuvering targets without addressing the so-called measurement-origin uncertainty. Part I and Part II deal with target motion models. Part III covers measurement models and associated techniques. Part IV is concerned with tracking techniques that are based on decisions regarding target maneuvers. This part surveys the multiple-model methods, that is, the use of multiple models (and filters) simultaneously, which is the prevailing approach to maneuvering target tracking in recent years. The survey is presented in a structured way, centered around three generations of algorithms: autonomous, cooperating, and variable structure. It emphasizes the underpinning of each algorithm and covers various issues in algorithm design, application, and performance.

CONTENTS
Nomenclature
Acronyms
I. Introduction
II. Hybrid Estimation
III. Overview of MM Approach
IV. First Generation: Autonomous MM
V. Second Generation: Cooperating MM
VI. Third Generation: Variable-Structure MM
VII. MM Algorithm Design Issues
VIII. Nonstatistical Techniques
IX. Concluding Remarks
References

Earlier parts of this survey:
Part I: Dynamic Models. IEEE Transactions on Aerospace and Electronic Systems, 39, 4 (Oct. 2003), 1333–1364.
Part II: Ballistic Target Models. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, 559–581.
Part III: Measurement Models. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, 423–446.
Part IV: Decision-Based Methods. In Proceedings of the 2002 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4728, 511–534.

Manuscript received December 8, 2003; revised August 5 and October 11, 2004; released for publication April 13, 2005.
IEEE Log No. T-AES/41/4/860796.
Refereeing of this contribution was handled by W. Koch.
This work was supported in part by ARO Grant W911NF-04-1-0274, NASA/LEQSF Grant (2001-4)-01 and NNSF Grants 60374025 and 60328306.
Authors' address: Dept. of Electrical Engineering, University of New Orleans, Lakefront, New Orleans, LA 70148, E-mail: (xli@uno.edu).
0018-9251/05/$17.00 © 2005 IEEE
IEEE Transactions on Aerospace and Electronic Systems, vol. 41, no. 4, October 2005.

NOMENCLATURE
$\arg\max_x g(x)$    Argument $x$ that maximizes $g(x)$
$E[y]$    Expectation of random variable $y$
$f(x)$    pdf of continuous random variable $x$
$(i)$    Quantity pertaining to model $m^{(i)}$
$(i_k)$    Quantity pertaining to mode/model sequence $m^{k}_{(i_k)}$
$k$    Time index (subscript for quantity at time $k$; superscript for sequence through time $k$)
$\mathcal{N}(x; \bar{x}, P)$    Gaussian (normal) density function of $x$ with mean $\bar{x}$ and covariance $P$
$m$    Model (mathematical model at certain accuracy level)
$\mathbb{M}$    Model set (set of models used)
$M$    Number of models in the model set $\mathbb{M}$
$m_k^{(i)}$    $= \{s_k = m^{(i)}\}$ = event that model $m^{(i)}$ matches the true mode at time $k$
$m^{k}_{(i_k)}$    $= \{m_1^{(i_1)}, \ldots, m_k^{(i_k)}\}$ = sequence of events that models match the true mode
$\mathrm{MSE}(\hat{x})$    $= E[(x - \hat{x})(x - \hat{x})']$
$\mathrm{MSE}(\hat{x} \mid z)$    $= E[(x - \hat{x})(x - \hat{x})' \mid z]$
$P\{A\}$    Probability of event $A$
$p(m)$    pmf of discrete random variable $m$
$P(x, A)$    $= f(x \mid A)P\{A\}$ = mixed pdf-probability of random variable $x$ and event $A$
$p(x, m)$    $= f(x \mid m)p(m)$ = mixed pdf-pmf of random variables $x$ and $m$
$s$    Mode (true behavior pattern, system structure, or exact mathematical model)
$\mathbb{S}$    Mode space (set of possible modes)
$x$    Base state (continuous valued)
$\hat{y}$    Estimate of $y$
$z$    Data (measurement)
$\|x\|_A^2$    $= x'Ax$

ACRONYMS
(The list includes acronyms used throughout the paper, but not those that are used locally within a subsection.)
2D    Two dimensional
3D    Three dimensional
AMM    Autonomous multiple model
ATC    Air traffic control
CA    (Nearly) constant acceleration
CMM    Cooperating multiple model
CT    (Nearly) constant turn
CV    (Nearly) constant velocity
EKF    Extended Kalman filter
EM    Expectation-maximization
FSMM    Fixed-structure multiple model
GPBn    Generalized pseudo-Bayesian of order n
HMM    Hidden Markov model
IMM    Interacting multiple model
LMMSE    Linear minimum mean-square error
MAP    Maximum a posteriori
MHT    Multiple hypothesis tracking
MJLS    Markov jump-linear system
ML    Maximum likelihood
MM    Multiple model
MMSE    Minimum mean-square error
MSE    Mean-square error (matrix)
PDA    Probabilistic data association
pdf    Probability density function
pmf    Probability mass function
PMHT    Probabilistic multiple hypothesis tracking
VS    Variable structure
VSMM    Variable-structure multiple model

I. INTRODUCTION

This is the fifth part of a series of papers that provide a comprehensive survey of techniques for tracking maneuvering targets without addressing the so-called measurement-origin uncertainty. Part I [209] and Part II [205] deal with general target motion models and ballistic target motion models, respectively. Part III [206] covers measurement models, including measurement model-based techniques. Part IV [208] surveys various decision-based methods. Part VI surveys various nonlinear filtering methods for target tracking, a part of which is [210].

In the absence of measurement-origin uncertainty, maneuvering target tracking faces two interrelated main challenges: target motion-mode uncertainty and nonlinearity.
Multiple-model (MM) methods have been generally considered the mainstream approach to maneuvering target tracking under motion-mode uncertainty. This part surveys such methods, that is, methods in which multiple models are used simultaneously for maneuvering target tracking. Nonlinearity is best handled by nonlinear filtering techniques, to be surveyed in subsequent parts. MM methods and nonlinear filters are clearly complementary to each other and their integration is certainly appealing.

Estimation of random quantities can be classified as point estimation and density estimation. Density estimation aims at approximating the entire density (distribution) of the estimatee (i.e., the quantity to be estimated, here the target state), while point estimation approximates the estimatee directly. MM methods are applicable to density estimation as well as point estimation. To be more focused, however, this part handles only MM methods for point estimation, leaving those for density estimation to subsequent parts. For the same reason, this part also focuses on the interplay of model-based filters, rather than individual model-based filtering. As such, it may be helpful for the reader to assume that each single-model-based filter is a Kalman filter. The more general case of MM nonlinear filtering is covered in subsequent parts.

This survey is structured. It emphasizes the underlying ideas, concepts, and assumptions of the methods, rather than particular implementations for specific applications. This should help the reader understand not only how these methods work but also their pros and cons. We hope that a distinctive feature of this survey is that it reveals the interrelationships among the various methods. The reader should keep in mind, however, that statements of this kind reflect our personal views and preferences and are not always accurate or unbiased, although a good deal of effort has been made toward this goal.
In addition to such discussions, a considerable amount of material included in this survey has not appeared elsewhere, including a significant number of open problems and ideas for further research. Also, more recent results are discussed in greater detail.

Within maneuvering target tracking, MM methods probably have the most extensive literature, as evidenced by the reference list and the length of this paper. Regrettably, many important issues associated with application of the MM methods, particularly those of implementation and tuning of the MM algorithms for specific applications, cannot be discussed in as much detail as many readers would hope. We hope the reader will accept our apology for omission or oversight of any work that deserves to be mentioned or discussed at greater length. As stated repeatedly in the previous parts, we appreciate receiving comments, corrections, and missing material that should be included in this part. While we may not be able to respond to each input, information received will be considered seriously for the refinement of this part for its final publication in a book.

This paper is organized as follows. Section II formulates the problem of hybrid estimation, which is what the MM methods are good for. Section III introduces and provides an overview of the MM approach, including its strengths, structures, underlying criteria, and three generations. The three generations (autonomous, cooperating, and variable-structure MM estimation) are surveyed in Sections IV, V, and VI, respectively, including their tracking applications. Section VII covers MM algorithm design issues, including model-set design and transition probability determination. Section VIII is dedicated to nonstatistical MM techniques. Concluding remarks are given in the final section.

II. HYBRID ESTIMATION

In simple terms, hybrid estimation is the estimation of a quantity (a parameter or a process) that has both continuous and discrete components [192, 326].
It is particularly well suited to process estimation with structural uncertainty.

Target tracking is a hybrid estimation problem [192]. In the prevailing approaches to target tracking, the modeling of the target motion/dynamics and the sensory system is essential. It is customary in these approaches to use the process/plant noise and measurement noise to cover the continuous-valued uncertainties in both the target trajectory and the measurement system. However, the major challenge of maneuvering target tracking arises from the target motion-mode uncertainty, which is discrete valued, not to mention the discrete-valued uncertainties in the measurement origins and the number of targets. The target motion-mode uncertainty manifests itself in situations where a target may undergo a known or unknown maneuver during an unknown time period. In general, a nonmaneuvering motion and different maneuvers can be described only by different motion models (see, e.g., Part I [209]). The use of an incorrect model often leads to unacceptable results. A primary approach to target tracking in the presence of motion-mode uncertainty is the so-called MM method, which is one of the most natural approaches to hybrid estimation.

While the MM method can be applied to hybrid estimation of a parameter, in this paper we are concerned with its application to estimation of the target state process as a hybrid process. Generally speaking, a hybrid process is one with both continuous and discrete components, such as the state of a hybrid system. A continuous-time (stochastic) hybrid system is described by the following dynamic and measurement equations
\[ \dot{x}(t) = f[x(t), s(t), w(t), t] \]
\[ z(t) = h[x(t), s(t), v(t), t] \]
along with a law that governs the evolution of $s$, which is often given in probabilistic terms.
Here $x$ is called the base state, which (usually) varies continuously, just like the state of a conventional system; $s$ is known as the system mode or modal state, which has a staircase-type trajectory, that is, it may either jump or stay unchanged; $z$ is the measurement; and $w$, $v$ are the process and measurement noise, respectively. We refer to the set of possible modes as the mode space and denote it by $\mathbb{S}$. In simple terms, we say that $x$ is continuous valued and $s$ is discrete valued (even though $\mathbb{S}$ may be a continuum in some cases). In this sense, the whole state $\xi = (x, s)$ of a hybrid system is a hybrid process. For such a system, hybrid estimation refers to the problem of estimating $x$ and $s$, or $\xi$.

Such a system (in continuous or discrete time) is a Markov jump system if $s$ is a Markov chain, that is, if $P\{s_{k+1}^{(j)} \mid s_k^{(i)}\} = p_{ij,k}$, $\forall i, j, k$ (for a discrete-time system with discrete-valued $s$), where $s_k^{(i)}$ signifies that mode $i$ is in effect at time $k$. Often, $s$ is assumed to be a homogeneous Markov chain, that is, the transition probabilities $p_{ij,k} = p_{ij}$, $\forall i, j$, are not a function of the time index $k$.

One of the simplest discrete-time hybrid systems is the so-called jump-linear system, given by
\[ x_{k+1} = F_k(s_{k+1}) x_k + G_k(s_{k+1}) w_k(s_{k+1}) \tag{1} \]
\[ z_k = H_k(s_k) x_k + v_k(s_k). \tag{2} \]
This system is nonlinear because, for example, $x$ or $z$ does not depend on the state $\xi$ of the system in a linear fashion. Were the system mode $s$ given, however, the system would be linear. In fact, $s$ may actually jump at unknown time instants, hence the name. This system is known as a Markov jump-linear system (MJLS) if $s$ is a Markov chain. If $s$ is unknown but time invariant, it can be viewed as, or argued to be, an unknown parameter of the system, and the system is linear from this perspective.

A variant of (1), also with first-order dependence on $\{s_k\}$, is $x_{k+1} = F_k(s_k) x_k + G_k(s_k) w_k(s_k)$. It has pros and cons compared with (1).
For a discrete-time model obtained by sampling a continuous-time system, the two forms rely on different underlying assumptions concerning how $s_k$ is obtained from $s(t)$. Caution should be taken when dealing with this subtle issue in some applications (see, e.g., [368]). More generally, the following model with second-order $\{s_k\}$-dependence
\[ x_{k+1} = F_k(s_{k+1}, s_k) x_k + G_k(s_{k+1}, s_k) w_k(s_{k+1}, s_k) \tag{3} \]
was proposed and advocated by Blom [57, 59]. It can describe jumps of $x$ that occur simultaneously with and due to jumps of $s$. In other words, this model is capable of characterizing the system's behavior at the instant of a jump in $s$ and thus eliminates the need for introducing such a model explicitly, as has been done in many designs for MM estimation. Important algorithms for the first-order dependence models can be generalized to this second-order dependence model, which has greater modeling power. For example, it has been shown [59] that a time-reversed version of (the autonomous part of) (3), $x_{k+1} = F_k(s_{k+1}, s_k) x_k$, has in general a second-order dependence on the time-reversed $\{s_k\}$, even if $F_k(s_{k+1}, s_k)$ is actually $\{s_k\}$-invariant. Such time-reversed models are useful in, for example, the well-known backward filters for smoothing.

III. OVERVIEW OF MULTIPLE-MODEL APPROACH

A. Basic Idea of MM Approach

Conventional solutions to hybrid estimation problems follow a strategy that can be characterized as "estimation after decision," "decision followed by estimation," or simply "decision-estimation." At any time, such a solution first decides on a (best) model and then runs a single filter based on that model as if it were the true one. The decision-based methods for maneuvering target tracking, surveyed in Part IV [208], belong to this class. This approach has several obvious drawbacks. First, possible errors in deciding on the model are not accounted for in the estimation.
Second, the decision is made irrevocably before estimation, although estimation results are often beneficial to decision making. Although these drawbacks have been well perceived, their remedies are hard to come by within this conventional strategy. For example, accounting for decision errors would require estimation in the presence of an unknown model-truth mismatch, which is very challenging and is still an open problem. Also, traditional model-based estimation cannot be done before the decision since it relies on the use of a single model. One possibility here is to use a model-free (i.e., nonparametric) estimation method, which appears to be overkill for maneuvering target tracking, where the true mode, although uncertain, ranges over only a fairly limited set. Rather, the semi-parametric methodology seems more attractive here in general (see, e.g., [33]). Another possible improvement is to have an iterated version with several decision-estimation cycles at each time to take advantage of the estimation results in the decision step. This actually amounts to a degenerate MM approach and its benefit is probably not commensurate with the increased complexity.

The MM approach gets around the difficulty due to the model uncertainty by using more than one model. Its basic idea is to assume a set $\mathbb{M}$ of models as possible candidates for the true mode in effect at the time; run a bank of elemental filters, each based on a unique model in the set; and generate the overall estimates by a process based on the results of these elemental filters. As such, the MM method provides an integrated approach to the joint decision and estimation problem of maneuvering target tracking. It can be classified as a semi-parametric approach since its model coverage lies between those of parametric and nonparametric approaches.
In optimization-theoretic parlance, the MM method has the potential of arriving at a globally optimal solution, which is inherently superior in performance to the two-stage optimization strategy of the conventional decision-estimation approach.

In this survey, we maintain that MM and non-MM estimation methods are separated as follows. At any single time, a non-MM method actually runs only one (model-based) filter, possibly out of a set of candidates, although filters at different times may differ, while an MM method runs multiple model-based filters at least at some time. One may think that a better name for the MM estimation approach is the "multiple-filter" approach, but the MM approach is not limited to estimation. For example, it has been applied to control, modeling, and identification.

For simplicity, we describe the MM method only for MJLSs, for two main reasons: almost all MM algorithms are theoretically valid only for this class of systems, and our description here can be extended to other hybrid systems in theory, although the development of the corresponding MM algorithms is not necessarily straightforward.

For a Markov jump-linear system, the $i$th model in the MM method obeys the following equations:
\[ x_{k+1} = F_k^{(i)} x_k + G_k^{(i)} w_k^{(i)} \tag{4} \]
\[ z_k = H_k^{(i)} x_k + v_k^{(i)} \tag{5} \]
where $E[w_k^{(i)}] = \bar{w}_k^{(i)}$, $\mathrm{cov}(w_k^{(i)}) = Q_k^{(i)}$, $E[v_k^{(i)}] = \bar{v}_k^{(i)}$, $\mathrm{cov}(v_k^{(i)}) = R_k^{(i)}$. Superscript $(i)$ denotes quantities pertinent to model $m^{(i)}$ in the model set $\mathbb{M}$, and the jumps, if any, of the system mode are assumed, $\forall m^{(i)}, m^{(j)}, k$, to have the following homogeneous transition probabilities (see footnote 2)
\[ P\{m_{k+1}^{(j)} \mid m_k^{(i)}\} = \pi_{ij} \tag{6} \]
where $m_k^{(i)}$ denotes the event that model $m^{(i)}$ matches the system mode $s$ in effect at time $k$:
\[ m_k^{(i)} \triangleq \{s_k = m^{(i)}\}. \tag{7} \]
Similarly, their finite sequences are denoted as $s^k = \{s_1^{(i_1)}, \ldots, s_k^{(i_k)}\}$ and $m^k = \{m_1^{(i_1)}, \ldots, m_k^{(i_k)}\}$, respectively.
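To make the model equations (4) to (7) concrete, the following minimal sketch simulates a two-model Markov jump-linear system. It is an illustration under stated assumptions, not an algorithm from this survey: the function name `simulate_mjls`, the nearly-constant-velocity matrices, the noise levels, and the transition matrix `Pi` are all hypothetical choices, the noises are taken to be zero-mean Gaussian, and the state transition follows the convention of (1), in which the mode in effect at time k+1 drives the step from k to k+1.

```python
import numpy as np


def simulate_mjls(F, G, H, Q, R, Pi, x0, m0, steps, rng):
    """Simulate a Markov jump-linear system with zero-mean Gaussian noises.

    F, G, H, Q, R are lists indexed by model number (hypothetical example
    data); Pi[i, j] is the probability of jumping from model i to model j,
    playing the role of pi_ij in (6).
    """
    x = np.asarray(x0, dtype=float)
    m = int(m0)
    modes, states, measurements = [], [], []
    for _ in range(steps):
        # Measurement equation (5) under the mode in effect at time k.
        v = rng.multivariate_normal(np.zeros(R[m].shape[0]), R[m])
        modes.append(m)
        states.append(x)
        measurements.append(H[m] @ x + v)
        # Draw the next mode from row m of the transition matrix.
        m = int(rng.choice(Pi.shape[0], p=Pi[m]))
        # State equation in the form of (1): the matrices of the mode in
        # effect at time k+1 drive the transition from k to k+1 (assumption).
        w = rng.multivariate_normal(np.zeros(Q[m].shape[0]), Q[m])
        x = F[m] @ x + G[m] @ w
    return np.array(modes), np.array(states), np.array(measurements)


# Two nearly-constant-velocity models that differ only in process noise level.
T = 1.0
F_cv = np.array([[1.0, T], [0.0, 1.0]])
G_cv = np.array([[0.5 * T**2], [T]])
F = [F_cv, F_cv]
G = [G_cv, G_cv]
H = [np.array([[1.0, 0.0]])] * 2
Q = [np.array([[0.01]]), np.array([[4.0]])]   # quiet vs. maneuvering
R = [np.array([[1.0]])] * 2
Pi = np.array([[0.95, 0.05], [0.10, 0.90]])   # rows sum to one

modes, states, measurements = simulate_mjls(
    F, G, H, Q, R, Pi, x0=[0.0, 1.0], m0=0, steps=100,
    rng=np.random.default_rng(0))
```

Given the mode sequence `modes`, the simulated system is linear, which is why each elemental filter in an MM algorithm can be a Kalman filter matched to one model.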
Mode versus Model: To be more precise, in this paper a mode refers to a pattern of behavior or a phenomenon, or a structure of a system, and a model is a mathematical representation or description of the phenomenon pattern (system structure) at a certain accuracy level. It is models, not modes, on which an estimator is based. For example, the behavior pattern of an aircraft during a specific turn motion is a mode of turning. Many mathematical models are available at different accuracy levels that describe such a mode [209]. Equivalently, we may think that a mode describes the truth precisely and a model is an approximation of the mode. Such a distinction is necessary whenever the mismatch between the model and mode is of concern. The model set $\mathbb{M}$ differs in general from the mode space $\mathbb{S}$ in two aspects: 1) they have different numbers of elements, with $\mathbb{M}$ usually having far fewer elements than $\mathbb{S}$; and 2) a model is usually a simplified description of a mode. For example, one may use a small set of models, such as a nonmaneuver model plus several constant-turn (CT) models, for tracking a target that may undergo various (complex) maneuvers (modes).

Footnote 2: However, homogeneous transition probabilities are not suitable for systems sampled at a nonuniform rate (see Section VIIB).

B. Underlying Structures of MM Algorithms

In general, four key components of MM estimation algorithms can be identified as follows.

1) Model-set determination: This includes both offline design and possibly online adaptation of the model set. An MM estimation algorithm distinguishes itself from non-MM estimators by the use of a set of models, instead of a single model. The performance of an MM estimator depends largely on the set of models used. The major task in the application of MM estimation is the design (and possibly adaptation) of the set of multiple models.

2) Cooperation strategy: This refers to all measures taken to deal with the discrete-valued uncertainties within the model set, particularly those for hypotheses about the model sequences. It includes not only pruning of unlikely model sequences, merging of "similar" model sequences, and selection of (most) likely model sequences, but also iterative strategies, such as those based on the expectation-maximization (EM) algorithm.

3) Conditional filtering: This is the recursive (or batch) estimation of the continuous-valued components of the hybrid process conditioned on some assumed mode sequence. It is conceptually the same as state estimation of a conventional system with only continuous-valued state.

4) Output processing: This is the process that generates overall estimates using results of all filters as well as measurements. It includes fusing/combining estimates from all filters and selecting the best ones.

The operation of MM estimation algorithms has a general structure, as depicted in Fig. 1 with only two models. In the figure, the outer loops between the filters and the cooperation strategy represent (possibly multiscan) recursions; the vertical arrows between the filters and the cooperation strategy represent their cooperation/interaction within one recursion. The other three components (i.e., all but conditional filtering) are essentially absent from a non-MM algorithm.

Fig. 1. General structure of MM estimation algorithms (with 2 model-based filters).

Output processing is covered in Section IVC. Cooperation strategies are the topic of Section VB. Design and adaptation of model sets are addressed in detail in Sections VIIA and VIB, respectively. In some MM algorithms conditional filtering and cooperation strategies are tightly coupled and can hardly be separated.
Output processing, cooperation strategies, and model-set adaptation, respectively, are the cornerstones of three generations of MM algorithms developed so far, which are discussed next.

C. Three Generations of MM Algorithms

Three generations of MM algorithms have been identified in [195]. This identification is very beneficial: Different generations have their fundamental differences in operations, structures, and limitations/capabilities/potential; the three generations came into existence sequentially; and more convincingly, later generations do inherit superior characteristics of the earlier generations. This identification also helps reveal possible directions for further development.

The first generation MM method was pioneered by Magill [235] and Lainiotis (see, e.g., [179], [180], [347]), and widely applied and promoted by Maybeck [242] and others. The second generation, represented unquestionably by Blom's interacting MM (IMM) algorithm [56, 55, 58], has earned an enviable reputation for MM estimation via a significant number of successful applications in target tracking. Its popularization and further development have been spearheaded by Bar-Shalom (see, e.g., [13], [14], [18], [19], [21]). Its practical value in tracking has been strongly advocated and well demonstrated by Bar-Shalom and others, notably Blom and Blair. The third generation, characterized by its variable structure, is gaining momentum rapidly and is becoming the state of the art of MM estimation. Its initiation [198, 191, 201] and advancement have been led by Li and his team (see, e.g., [196], [224], [215], [192], [195]).

There are two types of estimation problems for hybrid systems. The first one involves an unknown (random or nonrandom) but time-invariant mode. This is the case for estimating the state of a system with an unknown model that does not change over time or with a known model involving an unknown (time-invariant) parameter.
In contrast, the mode in the second type may jump from time to time.

The first generation is characterized by the fact that each of its elemental filters operates individually and independently of all the other elemental filters. Its advantage over many non-MM approaches stems from its superior output processing of results from elemental filters to generate the overall estimate. It would be optimal if the true mode were time invariant but unknown over a set that is identical to the model set used. These MM algorithms have been known under various names. A better name is autonomous MM (AMM) algorithms for several reasons, to become clear later.

The second generation inherits the first generation's superior output processing, and its elemental filters work together as a team via effective internal cooperation, rather than work independently as in the first generation. The cooperation includes all measures taken to achieve a better performance, such as individualized reconditioning of each filter (e.g., reinitialization as in the IMM algorithm, see Section VB1), performance enhancement via interactive iterations and competitions among filters (e.g., those based on the EM algorithm, see Section VB5), joint parameter adaptation (e.g., cohesive online identification of transition probabilities, see Section VIIB), and other hypothesis reduction strategies, discussed in Section VB. This generation has the potential of optimal performance when the true mode may jump among members of a set that is identical to the model set used. Many of the algorithms in this generation have been called "switching" or "dynamic" MM algorithms before, which is not quite accurate or representative. We refer to the second generation as cooperating MM (CMM) algorithms.

The model groups or teams in the first two generations have a fixed membership over time and thus have a fixed structure.
They are allowed to have a variable membership in the third generation, leading to a variable structure, that is, a variable set of models. Still under active development, this generation is potentially much more advanced in the sense of having an open architecture than its ancestors, which have a closed architecture. Not only does it inherit the second generation’s effective internal cooperation and the first generation’s superior output processing, but it also adapts to the outside world by producing new elemental filters if the existing ones are not good enough and by eliminating elemental filters that are harmful. This generation has been known as VSMM algorithms. It is most suitable in the case where there is a significant truth-model mismatch: the model set used does not match the set of possible true modes. Using a human-team analogy, a non-MM algorithm relies entirely on the performance of a single “best” individual decided prior to his performance. In contrast, an MM algorithm asks all individuals in a group to perform simultaneously and produces the overall estimate after their performance. In the first generation, these individuals work independently. Its superiority to non-MM algorithms stems from its flexibility in generating its output reports based on the individual results a posteriori. The second generation focuses on internal cooperation. Its individuals form a cooperative team. It outperforms the first generation by team work. The third generation explores the best team makeup. It determines an adaptive, cooperative team with a variable membership–it may recruit new members and fire bad or incompetent members. Each generation is more capable than its predecessors at the price of increased sophistication/complexity. 
It is interesting to note that the development of MM algorithms has been along the direction from the final product to the underlying structure through the internal mechanisms; that is, from the output report of the team, to the internal cooperation of the team, and then to the makeup of the team.

This paper is limited to the MM approach to point estimation for maneuvering target tracking. The MM approach has been applied successfully in many other areas, including control and identification, as well as to target tracking in the presence of measurement-origin uncertainty, and in recent years to density estimation for maneuvering target tracking, such as particle filtering. Outside of the target tracking area, almost all MM research and development have dealt only with the first generation so far; knowledge of the second generation is limited; and the third generation is hardly known.

D. Optimality Criteria

Consider a hybrid random variable $\xi = (x, m)$, where $x$ is continuous valued and $m$ is discrete valued with $M$ possible values. The complete Bayesian solution of estimating $(x, m)$ using data $z$ is the mixed (joint) probability density function/probability mass function (pdf-pmf) $p(x, m \mid z) = f(x \mid m, z) p(m \mid z)$. This clearly involves a density estimation problem. Since a density function requires in general infinitely many numbers to describe it completely, this solution generally has an infinite dimension. For a hybrid process $\{\xi_k\}$, this solution requires in general recursive estimation of the density function, known as nonlinear filtering. This is the topic of more than one subsequent part of this survey, which covers various exact and approximate nonlinear filtering methods. In this part, we deal only with point estimation; that is, estimators that have the same, finite dimension as the estimatee (i.e., the quantity to be estimated, which could be the base state, modal state, or hybrid state in our case).
Least squares, maximum likelihood (ML), minimum mean-square error (MMSE), maximum a posteriori (MAP), and the method of moments are probably the most widely used methods for point estimation. Virtually all point estimation algorithms using MMs developed so far are in essence based on either the MMSE or MAP criterion. This is understandable since MMSE and MAP are two primary Bayesian criteria for estimating a random quantity and the state of a hybrid system is more naturally treated as random than as deterministic.

For convenience, we adopt the following notation for estimators of continuous-valued $x$ and discrete-valued $m$:
\[ \hat{x}^{\mathrm{MMSE}} = E[x \mid z] \]
\[ \hat{m}^{\mathrm{MMSE}} = E[m \mid z] \]
\[ (\hat{x}^{\mathrm{JMAP}}, \hat{m}^{\mathrm{JMAP}}) = \arg\max_{(x,m)} p(x, m \mid z) \]
\[ \hat{x}^{\mathrm{MAP}} = \arg\max_{x} f(x \mid z) \]
\[ \hat{m}^{\mathrm{MAP}} = \arg\max_{m} p(m \mid z) \]
\[ \hat{x}^{\mathrm{MAP}}(\hat{m}) = \arg\max_{x} f(x \mid z, \hat{m}) \]
where $f(\cdot)$ is a pdf, $p(\cdot)$ is a pmf, $p(x, m) \triangleq f(x \mid m) p(m)$ is a mixed (joint) pdf-pmf, and $\arg\max_x g(x)$ stands for "the argument $x$ that maximizes $g(x)$," meaning the maximizer (i.e., the location of the largest peak) of $g(x)$. Note the following.

1) In $\hat{x}^{\mathrm{MAP}}(\hat{m})$, $\hat{m}$ can be any mode estimator in general, but is almost always taken to be $\hat{m}^{\mathrm{MAP}}$.

2) As defined above, $\hat{m}^{\mathrm{MMSE}}$ does not exist whenever the convex sum $\alpha_1 m_1 + \cdots + \alpha_M m_M$ is meaningless or does not exist, which is the case, for example, if $m^{(i)}$ and $m^{(j)}$ have different dimensions.

3) $f(x) = \sum_{i=1}^{M} f(x \mid m^{(i)}) p(m^{(i)})$ is $p(x, m)$ averaged over possible values of $m$.

Fig. 2. Illustration of MMSE and MAP estimators. (a) $(p_1, p_2, p_3) = (0.21, 0.41, 0.38)$. (b) $(p_1, p_2, p_3) = (0.25, 0.51, 0.24)$, $\hat{x} = \hat{x}_2 = \hat{x}_4$.

Fig. 2 illustrates the differences among $\hat{x}_1 = \hat{x}^{\mathrm{MMSE}}$, $\hat{x}_2 = \hat{x}^{\mathrm{JMAP}}$, $\hat{x}_3 = \hat{x}^{\mathrm{MAP}}$, $\hat{x}_4 = \hat{x}^{\mathrm{MAP}}(\hat{m}^{\mathrm{MAP}})$, and $\hat{x}_5 = \hat{x}^{\mathrm{MAP}}(\hat{m})$ for a Gaussian mixture density:
\[ f(x) = p_1 \mathcal{N}(x; \bar{x}_1, \sigma_1^2) + p_2 \mathcal{N}(x; \bar{x}_2, \sigma_2^2) + p_3 \mathcal{N}(x; \bar{x}_3, \sigma_3^2) \]
where $\mathcal{N}(x; \bar{x}_i, \sigma_i^2) = \exp[-(x - \bar{x}_i)^2/(2\sigma_i^2)]/(\sqrt{2\pi}\,\sigma_i)$ with $(\bar{x}_1, \bar{x}_2, \bar{x}_3) = (-3, 0, 5)$, $(\sigma_1^2, \sigma_2^2, \sigma_3^2) = (1.6^2, 2.25^2, 2^2)$, and $p_i = P\{m = m^{(i)}\}$. In the figure, the thicker line is the mixture density $f(x)$. Note the following.

1) $\hat{x}_1 = \hat{x}^{\mathrm{MMSE}} = p_1 \bar{x}_1 + p_2 \bar{x}_2 + p_3 \bar{x}_3$ is the center of probability mass, that is, the balance point at which the line with $f(x)$ as the mass density function would not tip to the left or right if a pivot is placed.

2) $\hat{x}_2 = \hat{x}^{\mathrm{JMAP}}$ is the location of the peak with the largest weighted peak value of all component densities: $p_i \times \max_x \mathcal{N}(x; \bar{x}_i, \sigma_i^2) = p_i/(\sqrt{2\pi}\,\sigma_i)$.

3) $\hat{x}_3 = \hat{x}^{\mathrm{MAP}}$ is the location of the largest peak of the mixture density $f(x)$. It is always within the interval between the leftmost and the rightmost peaks of all component densities.

4) $\hat{x}_4 = \hat{x}^{\mathrm{MAP}}(\hat{m}^{\mathrm{MAP}})$ is the location of the peak of the component density $\mathcal{N}(x; \bar{x}_i, \sigma_i^2)$ with the largest $p_i$.

5) $\hat{x}_5 = \hat{x}^{\mathrm{MAP}}(\hat{m} = m^{(i)})$ is the location of the peak of the component density $\mathcal{N}(x; \bar{x}_i, \sigma_i^2)$ (in the figures, $i = 1$).

6) Fig. 2(a) and (b) are for two mixture densities with the same component densities but different sets of weights: $\hat{x}^{\mathrm{MMSE}}$, $\hat{x}^{\mathrm{JMAP}}$, $\hat{x}^{\mathrm{MAP}}$, $\hat{x}^{\mathrm{MAP}}(\hat{m}^{\mathrm{MAP}})$ are relatively close to each other in (b), but quite different in (a), although the sets of weights do not differ much. For the two cases, $\hat{x}^{\mathrm{MMSE}}$ changes least and $\hat{x}^{\mathrm{JMAP}}$ and $\hat{x}^{\mathrm{MAP}}(\hat{m}^{\mathrm{MAP}})$ change most.

These estimators minimize expectations of Bayes cost functions $C_i$, $i = 1, 2, 3, 4, 5$, respectively (see, e.g., [94], [99]). For example, $C_1(x - \hat{x}) = (x - \hat{x})^2$ and $C_3(x - \hat{x}) = \lim_{\epsilon \to 0} 1(|x - \hat{x}| - \epsilon)$, where $1(x)$ is the unit-step function and
\[ 1(|x - \hat{x}| - \epsilon) = \begin{cases} 0, & |x - \hat{x}| < \epsilon \\ 1, & |x - \hat{x}| > \epsilon \end{cases} \]
describes a "golf hole" of radius $\epsilon$.
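The five point estimators of the Fig. 2 example can be evaluated numerically. The sketch below is only an illustration under stated assumptions: the function names `mixture_pdf` and `point_estimates` are hypothetical, the component means and standard deviations are those quoted in the text, and the MAP estimate of the mixture is found by a simple grid search rather than by any method from the cited literature.

```python
import numpy as np

# Component means and standard deviations of the Gaussian mixture in the text.
means = np.array([-3.0, 0.0, 5.0])
sigmas = np.array([1.6, 2.25, 2.0])


def mixture_pdf(x, p):
    """Mixture density f(x) = sum_i p_i N(x; mean_i, sigma_i^2) at points x."""
    x = np.atleast_1d(x)[:, None]
    comps = np.exp(-(x - means) ** 2 / (2 * sigmas**2)) / (np.sqrt(2 * np.pi) * sigmas)
    return comps @ p


def point_estimates(p, grid=np.linspace(-10.0, 12.0, 22001)):
    """Return (x1, ..., x5) as defined in the text for weights p."""
    p = np.asarray(p, dtype=float)
    x1 = float(p @ means)                      # MMSE: mean of the mixture
    x2 = float(means[np.argmax(p / sigmas)])   # JMAP: largest weighted component peak
    x3 = float(grid[np.argmax(mixture_pdf(grid, p))])  # MAP: grid search on f(x)
    x4 = float(means[np.argmax(p)])            # MAP given the most probable model
    x5 = float(means[0])                       # MAP given m = m(1), as in the figure
    return x1, x2, x3, x4, x5


est_a = point_estimates([0.21, 0.41, 0.38])   # weights of Fig. 2(a)
est_b = point_estimates([0.25, 0.51, 0.24])   # weights of Fig. 2(b)
print("(a):", np.round(est_a, 2))
print("(b):", np.round(est_b, 2))
```

The JMAP estimate compares $p_i/\sigma_i$ across components because the common factor $1/\sqrt{2\pi}$ does not affect the maximizer, which is why it can differ from the estimate conditioned on the most probable model.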
Parallel to MAP estimation, ML estimators can also be obtained with the involved posterior pdf or pmf replaced by the corresponding likelihood functions, although they have not been proposed systematically in MM estimation. It turns out, however, that the application of the likelihood principle to estimation of a nonrandom hybrid quantity differs significantly from the MAP case. Assume that $\xi = (x, m)$ is nonrandom with continuous-valued $x$ and $M$ possible values $\{m^{(1)}, \ldots, m^{(M)}\}$ for $m$. Then $f(z \mid x, m)$ represents a set of likelihood functions $\{f(z \mid x, m^{(1)}), \ldots, f(z \mid x, m^{(M)})\}$. The joint ML estimator of $x$ and $m$ and $\hat{x}^{\mathrm{ML}}(\hat{m})$ given $\hat{m}$ are well defined as

\[
\begin{aligned}
\hat{\xi}^{\mathrm{ML}} = (\hat{x}^{\mathrm{JML}}, \hat{m}^{\mathrm{JML}}) &= \arg\max_{(x,m)} f(z \mid x, m) \\
&= \arg\max_{(x, m^{(i)})} \{f(z \mid x, m^{(1)}), \ldots, f(z \mid x, m^{(M)})\} \qquad (8) \\
\hat{x}^{\mathrm{ML}}(\hat{m}) &= \arg\max_{x} f(z \mid x, m^{(i)}) \quad \text{if } \hat{m} = m^{(i)}
\end{aligned}
\]

but the ML estimators of $x$ and $m$ separately are not well defined, because neither $f(z \mid x)$ nor $f(z \mid m)$ has a generally accepted definition, although several definitions are available. One such definition of the likelihood functions $L(x \mid z)$ and $L(m \mid z)$ is based on the so-called generalized likelihood principle, leading to

\[
\begin{aligned}
L(x \mid z) &= \max_{m^{(i)}} \{f(z \mid x, m^{(1)}), \ldots, f(z \mid x, m^{(M)})\} \\
L(m \mid z)\big|_{m = m^{(i)}} &= \max_{x} f(z \mid x, m^{(i)}), \qquad i = 1, 2, \ldots, M
\end{aligned}
\]

and the corresponding maximizers are then taken to be the generalized ML (GML) estimators $\hat{x}^{\mathrm{GML}}$ and $\hat{m}^{\mathrm{GML}}$, respectively, which are however equal to $\hat{x}^{\mathrm{JML}}$ and $\hat{m}^{\mathrm{JML}}$ because $\max_x L(x \mid z) = \max_{(x, m^{(i)})} \{f(z \mid x, m^{(1)}), \ldots, f(z \mid x, m^{(M)})\}$, which is (8). We choose the continuous-valued part of $\hat{\xi}^{\mathrm{ML}}$ as $\hat{x}^{\mathrm{ML}}$, namely, $\hat{x}^{\mathrm{ML}} := \hat{x}^{\mathrm{JML}}$. As such, $\hat{x}^{\mathrm{ML}}$ is the ML counterpart of $\hat{x}^{\mathrm{JMAP}}$, not $\hat{x}^{\mathrm{MAP}}$.

The likelihood functions can also be defined by expectation or marginalization (see, e.g., [127])

\[
\begin{aligned}
L(x \mid z) = f(z \mid x) &= E[f(z \mid x, m) \mid x] = \sum_{i} f(z \mid x, m^{(i)})\,p(m^{(i)} \mid x) \\
L(m \mid z) = f(z \mid m) &= E[f(z \mid x, m) \mid m] = \int f(z \mid x, m)\,f(x \mid m)\,dx \qquad (9)
\end{aligned}
\]

which, however, requires treating the quantity being averaged out as random, in violation of the previous assumption that $x$ and $m$ are nonrandom. Nevertheless, for MM estimation $f(z \mid m) = E[f(z \mid x, m) \mid m]$ is a sensible way to go because it is natural to treat the base state $x$ as random and the model as nonrandom. Therefore, we use $\hat{m}^{\mathrm{ML}}$ based on (9). With this, $\hat{x}^{\mathrm{ML}}(\hat{m}^{\mathrm{ML}}) = \arg\max_x f(z \mid x, \hat{m}^{\mathrm{ML}})$ is well defined.

IV. THE FIRST GENERATION: AUTONOMOUS MM ESTIMATION

A. Optimal Autonomous MM Estimation

Fundamental Assumptions: The first generation, AMM algorithms were developed based on the following two fundamental assumptions.

A1. The true mode $s$ is time invariant (i.e., $s_k = s$, $\forall k$).

A2. The true mode $s$ at any time has a mode space $\mathbb{S}$ that is time invariant and identical to the time-invariant finite model set $\mathbb{M}$ used (i.e., $\mathbb{S}_k = \mathbb{M}$, $\forall k$).

A2 can be decomposed into three components: A2a: $\mathbb{M}_k = \mathbb{M}$, $\forall k$; A2b: $\mathbb{S}_k = \mathbb{S}$, $\forall k$; A2c: $\mathbb{S} = \mathbb{M}$. A2b can be viewed to be implied by A1, but A2 may be invoked without A1, as the second generation does. Assumption A2a is the defining assumption of both first and second generations.

A1 and A2 allow in principle the true mode $s$ to be deterministic or random. Almost all AMM algorithms developed so far assume a random $s$, although assuming an unknown but nonrandom $s$ appears, in our opinion, somewhat more natural for many applications. Furthermore, neither A2b nor A2c is in fact needed if $s$ is assumed nonrandom, such as for the ML estimation alluded to above and in Section IVD.

According to A2, there is no mode-model mismatch and thus we use $m$ to denote both the mode and the model and treat a model (which is deterministic) as a realization of a mode (which is random) in this section. Note that $\mathbb{S}_k = \mathbb{M}$ implies, but is not implied by, $s_k \in \mathbb{M}$ (i.e., $\mathbb{S}_k \subseteq \mathbb{M}$).

A1 implies that $m^{(i)}$ is the true model for all times if and only if it is so at some single time and thus conditioned on anything $A$, the mode probability and mode-sequence probability are equal:

\[
P\{m_k^{(i)} \mid A, \mathrm{A1}\} = P\{m_1^{(i)}, \ldots, m_k^{(i)} \mid A, \mathrm{A1}\}
\]

where $m_k^{(i)}$ denotes that model $m^{(i)}$ matches the true mode at time $k$. Denote by $M = |\mathbb{M}|$ the number of models used. Then, all possible model sequences (through time $k$) are constant (by A1) and there are exactly $M$ of them (by A2), given by

\[
m_{(i)}^{k} = \{m_1^{(i)}, \ldots, m_k^{(i)}\}, \qquad m^{(i)} \in \mathbb{M}. \qquad (10)
\]

Note that Assumptions A1 and A2 are embedded in this definition of $m_{(i)}^{k}$.

MMSE-AMM: As first proposed in [235], the MMSE-optimal AMM base-state estimator is given by the total expectation theorem as (see, e.g., [21])

\[
\hat{x}_{k|k} = E[x_k \mid z^k, \mathrm{A1}, \mathrm{A2}] = \sum_{i=1}^{M} E[x_k \mid z^k, m_{(i)}^{k}]\,P\{m_{(i)}^{k} \mid z^k, \mathrm{A1}, \mathrm{A2}\} = \sum_{i=1}^{M} \hat{x}_{k|k}^{(i)}\,\mu_k^{(i)} \qquad (11)
\]

where $z^k = (z_1, \ldots, z_k)$ is measurements through time $k$, $\mu_k^{(i)} = P\{m_k^{(i)} \mid z^k, \mathrm{A1}, \mathrm{A2}\}$ is the posterior mode probability under A1 and A2 that the mode in effect is constant and equal to one and only one but possibly any one of the models in $\mathbb{M}$, and $\hat{x}_{k|k}^{(i)} = E[x_k \mid z^k, m_{(i)}^{k}]$ is the MMSE estimate from the $i$th elemental filter assuming $m^{(i)}$ is true throughout time.
Under A1 and A2, $\hat{x}_{k|k}^{\mathrm{MMSE}}$ is unbiased in the sense $E[x_k - \hat{x}_{k|k}^{\mathrm{MMSE}}] = 0$; its conditional mean-square error (MSE) matrix (often called error covariance loosely) is minimum of all base-state estimators $\hat{x}_{k|k}$, given originally in [183] (see also [21]) by$^3$

\[
P_{k|k} = \mathrm{MSE}(\hat{x}_{k|k} \mid z^k, \mathrm{A1}, \mathrm{A2}) = \sum_{i=1}^{M} \left[P_{k|k}^{(i)} + (\hat{x}_{k|k} - \hat{x}_{k|k}^{(i)})(\hat{x}_{k|k} - \hat{x}_{k|k}^{(i)})'\right]\mu_k^{(i)} \qquad (12)
\]

where $\mathrm{MSE}(\hat{x} \mid z) = E[(x - \hat{x})(x - \hat{x})' \mid z]$, and $P_{k|k}^{(i)} = \mathrm{MSE}(\hat{x}_{k|k}^{(i)} \mid z^k, \mathrm{A1}, \mathrm{A2})$ is the conditional MSE matrix of the MMSE estimator $\hat{x}_{k|k}^{(i)}$ under A1 and A2.

$^3$ In fact, this equation holds true for any estimator $\hat{x}_{k|k}$, not just the optimal AMM estimator, provided $\hat{x}_{k|k}^{(i)} = E[x_k \mid z^k, m_{(i)}^{k}]$.

The corresponding mode estimator is given by

\[
\hat{m}_{k|k} = E[m_k \mid z^k, \mathrm{A1}, \mathrm{A2}] = \sum_{i=1}^{M} E[m_k \mid z^k, m_{(i)}^{k}]\,P\{m_{(i)}^{k} \mid z^k, \mathrm{A1}, \mathrm{A2}\} = \sum_{i=1}^{M} m^{(i)}\mu_k^{(i)} \qquad (13)
\]

\[
\mathrm{MSE}(\hat{m}_{k|k} \mid z^k) = E[(m_k - \hat{m}_{k|k})(m_k - \hat{m}_{k|k})' \mid z^k, \mathrm{A1}, \mathrm{A2}] = \sum_{i=1}^{M} (m^{(i)} - \hat{m}_{k|k})(m^{(i)} - \hat{m}_{k|k})'\mu_k^{(i)}. \qquad (14)
\]

This mode estimator exists and is meaningful only if the convex sum (13) is well defined and meaningful, which is not the case if $m^{(i)}$ is only an index of the system structure or behavior pattern, e.g., $\mathbb{M} = \{1, 2, \ldots, M\}$. For instance, if $m^{(1)} = 1$ is a booster model of a missile and $m^{(2)} = 2$ represents a climbing motion of an aircraft, their weighted sum is meaningless. Even if $m^{(2)} = 2$ is also for the missile (say, a reentry model), their weighted sum is still hard to interpret: what does it mean in this case if $\hat{m}_{k|k}^{\mathrm{MMSE}} = 1.63$? Also, $(m^{(1)}, m^{(2)}) = (1, 2)$ and $(m^{(1)}, m^{(2)}) = (2, 3)$ would lead to different $\hat{m}_{k|k}^{\mathrm{MMSE}}$, which renders interpretation difficult. This mode estimator is meaningful in general when all $m^{(i)}$ are points in a vector space.

Table I gives the MMSE-AMM algorithm (of the base state) under Assumptions A1 and A2 for a Gaussian system (4)-(5) where the Kalman filter is optimal given the mode.

TABLE I
One Cycle of AMM Algorithm

1. Model-conditioned filtering (for $i = 1, 2, \ldots, M$):
   Predicted state: $\hat{x}_{k|k-1}^{(i)} = F_{k-1}^{(i)}\hat{x}_{k-1|k-1}^{(i)} + G_{k-1}^{(i)}\bar{w}_{k-1}^{(i)}$
   Predicted covariance: $P_{k|k-1}^{(i)} = F_{k-1}^{(i)}P_{k-1|k-1}^{(i)}(F_{k-1}^{(i)})' + G_{k-1}^{(i)}Q_{k-1}^{(i)}(G_{k-1}^{(i)})'$
   Measurement residual: $\tilde{z}_k^{(i)} = z_k - H_k^{(i)}\hat{x}_{k|k-1}^{(i)} - \bar{v}_k^{(i)}$
   Residual covariance: $S_k^{(i)} = H_k^{(i)}P_{k|k-1}^{(i)}(H_k^{(i)})' + R_k^{(i)}$
   Filter gain: $K_k^{(i)} = P_{k|k-1}^{(i)}(H_k^{(i)})'(S_k^{(i)})^{-1}$
   Updated state: $\hat{x}_{k|k}^{(i)} = \hat{x}_{k|k-1}^{(i)} + K_k^{(i)}\tilde{z}_k^{(i)}$
   Updated covariance: $P_{k|k}^{(i)} = P_{k|k-1}^{(i)} - K_k^{(i)}S_k^{(i)}(K_k^{(i)})'$
2. Mode probability update (for $i = 1, 2, \ldots, M$):
   Model likelihood: $L_k^{(i)} \triangleq p[\tilde{z}_k^{(i)} \mid m_{(i)}^{k}, z^{k-1}] \overset{\text{assume}}{=} \mathcal{N}(\tilde{z}_k^{(i)}; 0, S_k^{(i)})$
   Mode probability: $\mu_k^{(i)} = \dfrac{\mu_{k-1}^{(i)}L_k^{(i)}}{\sum_j \mu_{k-1}^{(j)}L_k^{(j)}}$
3. Estimate fusion:
   Overall estimate: $\hat{x}_{k|k} = \sum_i \hat{x}_{k|k}^{(i)}\mu_k^{(i)}$
   Overall covariance: $P_{k|k} = \sum_i \left[P_{k|k}^{(i)} + (\hat{x}_{k|k} - \hat{x}_{k|k}^{(i)})(\hat{x}_{k|k} - \hat{x}_{k|k}^{(i)})'\right]\mu_k^{(i)}$

MAP-AMM: The mixed pdf-pmf of the base state and mode at time $k$ is

\[
p(x_k, m_k \mid z^k, \mathrm{A1}, \mathrm{A2}) = f(x_k \mid z^k, m_k, \mathrm{A1})\,p(m_k \mid z^k, \mathrm{A1}, \mathrm{A2}) = \{f^{(i)}(x_k \mid z^k)\mu_k^{(i)},\ i \le M\}
\]

where $f^{(i)}(x_k \mid z^k) = f(x_k \mid z^k, m_{(i)}^{k})$ is the density assuming the mode sequence is $m_1^{(i)}, \ldots, m_k^{(i)}$ (i.e., $m^{(i)}$ is the true model). It thus follows from the total probability theorem that the base state has the posterior mixture density

\[
f(x_k \mid z^k, \mathrm{A1}, \mathrm{A2}) = \sum_{i=1}^{M} f(x_k \mid z^k, m_{(i)}^{k})\,P\{m_k^{(i)} \mid z^k, \mathrm{A1}, \mathrm{A2}\} = \sum_{i=1}^{M} f^{(i)}(x_k \mid z^k)\mu_k^{(i)}. \qquad (15)
\]

The corresponding MAP-AMM estimators are given by

\[
\begin{aligned}
\hat{x}_{k|k}^{\mathrm{MAP}} &= \arg\max_{x_k} f(x_k \mid z^k, \mathrm{A1}, \mathrm{A2}) \\
\hat{\xi}_{k|k}^{\mathrm{MAP}} = (\hat{x}_{k|k}^{\mathrm{JMAP}}, \hat{m}_{k|k}^{\mathrm{JMAP}}) &= \arg\max_{(x_k, m^{(i)})} \{f^{(i)}(x_k \mid z^k)\mu_k^{(i)},\ i \le M\} \\
\hat{m}_{\mathrm{MAP}}^{k} = (\hat{m}_1, \ldots, \hat{m}_k), \quad \hat{m}_1 = \cdots = \hat{m}_k &= \hat{m}_k^{\mathrm{MAP}} = \arg\max_{m^{(i)}} \mu_k^{(i)} \qquad (16) \\
\hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}_k) &= \arg\max_{x_k} f^{(j)}(x_k \mid z^k) \quad \text{if } \hat{m}_k = m^{(j)} \\
\hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}_{\mathrm{MAP}}^{k}) &= \hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}_k)\big|_{\hat{m}_k = \hat{m}_k^{\mathrm{MAP}}}.
\end{aligned}
\]

In words, the MAP base-state estimator $\hat{x}_{k|k}^{\mathrm{MAP}}$ is the peak location of the mixture density $f(x_k \mid z^k, \mathrm{A1}, \mathrm{A2})$, which is a probabilistically weighted sum of $f^{(i)}(x_k \mid z^k)$; the joint MAP (JMAP) estimator $(\hat{x}_{k|k}^{\mathrm{JMAP}}, \hat{m}_{k|k}^{\mathrm{JMAP}})$ is the maximizer of the pdf-pmf $p(x_k, m_k \mid z^k, \mathrm{A1}, \mathrm{A2})$ or the set of $f^{(i)}(x_k \mid z^k)\mu_k^{(i)}$; the MAP mode estimator $\hat{m}_k^{\mathrm{MAP}}$ is the one with the largest posterior probability and thus the MAP mode-sequence estimator $\hat{m}_{\mathrm{MAP}}^{k}$ is the constant sequence of this mode throughout time ($\hat{m}_k^{\mathrm{MAP}}$ and $\hat{m}_{\mathrm{MAP}}^{k}$ can also be interpreted as outcomes of the corresponding MAP tests); the model-sequence conditioned MAP estimator $\hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}_k)$ is the peak location of the component density $f^{(j)}(x_k \mid z^k)$ corresponding to the model $\hat{m} = m^{(j)}$.

It should be clear that the special MAP estimators $\hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}_{\mathrm{MAP}}^{k})$ and $\hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}_k)$ can be obtained from the set of component MAP estimators $\hat{x}_{k|k}^{(i)\mathrm{MAP}} = \arg\max_{x_k} f^{(i)}(x_k \mid z^k)$, but the MAP estimator $\hat{x}_{k|k}^{\mathrm{MAP}}$ and the JMAP estimator $\hat{x}_{k|k}^{\mathrm{JMAP}}$ cannot in general, although $\hat{x}_{k|k}^{\mathrm{JMAP}}$ coincides with the $\hat{x}_{k|k}^{(i)\mathrm{MAP}}$ with the largest value of $f^{(i)}(x_k \mid z^k)\mu_k^{(i)}$ over $x_k$ and $m^{(i)}$. Some of these MAP-AMM estimators were presented in [99] in simpler terms and evaluated along with the MMSE estimator in terms of several performance measures via computer simulations for a simplified aircraft tracking example.
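One cycle of the MMSE-AMM algorithm of Table I can be sketched as follows for linear-Gaussian models. This is an illustrative sketch rather than the authors' implementation: the function name `amm_cycle`, the dictionary layout of a model, the scalar two-model demo, and the log-domain normalization of the mode probabilities (used only for numerical robustness) are assumptions made here.

```python
import numpy as np

def amm_cycle(models, states, covs, mu, z):
    """One AMM cycle: per-model Kalman filtering, mode probability update, fusion.

    models: list of dicts with keys F, G, Q, H, R (optional w_bar, v_bar);
    this layout is a hypothetical convention of the sketch.
    """
    new_states, new_covs, loglik = [], [], []
    for m, x, P in zip(models, states, covs):
        F, G, Q, H, R = m["F"], m["G"], m["Q"], m["H"], m["R"]
        w_bar = m.get("w_bar", np.zeros(Q.shape[0]))
        v_bar = m.get("v_bar", np.zeros(R.shape[0]))
        # Step 1: model-conditioned filtering (each filter runs autonomously).
        x_pred = F @ x + G @ w_bar
        P_pred = F @ P @ F.T + G @ Q @ G.T
        resid = z - H @ x_pred - v_bar
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        new_states.append(x_pred + K @ resid)
        new_covs.append(P_pred - K @ S @ K.T)
        # Log of the Gaussian model likelihood N(resid; 0, S).
        _, logdet = np.linalg.slogdet(S)
        maha = resid @ np.linalg.solve(S, resid)
        loglik.append(-0.5 * (maha + logdet + resid.size * np.log(2.0 * np.pi)))
    # Step 2: mode probability update (Bayes' rule, normalized in the log domain).
    with np.errstate(divide="ignore"):
        log_mu = np.log(mu) + np.array(loglik)
    mu_new = np.exp(log_mu - log_mu.max())
    mu_new /= mu_new.sum()
    # Step 3: estimate fusion, eqs. (11) and (12).
    x_fused = sum(w * x for w, x in zip(mu_new, new_states))
    P_fused = sum(w * (P + np.outer(x_fused - x, x_fused - x))
                  for w, x, P in zip(mu_new, new_states, new_covs))
    return new_states, new_covs, mu_new, x_fused, P_fused

# Demo (assumed scalar models x_k = a x_{k-1} + w_k, z_k = x_k + v_k; true a = 0.95).
def make_model(a):
    return dict(F=np.array([[a]]), G=np.eye(1), Q=np.eye(1),
                H=np.eye(1), R=0.1 * np.eye(1))

rng = np.random.default_rng(0)
models = [make_model(0.5), make_model(0.95)]
states = [np.zeros(1) for _ in models]
covs = [np.eye(1) for _ in models]
mu = np.array([0.5, 0.5])
x_true = np.zeros(1)
for _ in range(200):
    x_true = 0.95 * x_true + rng.normal(size=1)
    z = x_true + np.sqrt(0.1) * rng.normal(size=1)
    states, covs, mu, x_hat, P_hat = amm_cycle(models, states, covs, mu, z)
```

In this demo the true mode is constant and contained in the model set, so Assumptions A1 and A2 hold and the probability of the matching model is expected to approach one as data accumulate.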
The MSE matrices (error covariances) of these MAP estimators do not have a known, explicit, analytic form, but they are related to that of the MMSE estimator $\hat{x}_{k|k}^{\mathrm{MMSE}}$ via the following easily obtainable general relationship:

\[
P_{k|k} = \mathrm{MSE}(\hat{x}_{k|k} \mid z^k) = \mathrm{MSE}(\hat{x}_{k|k}^{\mathrm{MMSE}} \mid z^k) + (\hat{x}_{k|k}^{\mathrm{MMSE}} - \hat{x}_{k|k})(\hat{x}_{k|k}^{\mathrm{MMSE}} - \hat{x}_{k|k})' \qquad (17)
\]

where $\hat{x}_{k|k}$ can be any estimator, including the above MAP estimators.

B. Autonomous Operations of Conditional Filtering

It is clear from the above that there are two key functions in the operation of an AMM algorithm: model-based conditional filtering and output processing. They are discussed next.

Assumption A1 is a defining assumption of the AMM algorithms. Under A1, each elemental filter does conditional filtering based on a constant model sequence. In other words, each filter works individually and independently of other filters. As such, each conditional filtering operation is autonomous, hence the name "autonomous multiple-model algorithms."

For elemental filter $i$, the goal is to compute the pdf $f^{(i)}(x_k \mid z^k) = f(x_k \mid z^k, m_{(i)}^{k})$ for the MAP case and the estimate $(\hat{x}_{k|k}^{(i)}, P_{k|k}^{(i)})$ for the MMSE case, where $\hat{x}_{k|k}^{(i)} = E[x_k \mid z^k, m_{(i)}^{k}] = \int x_k f^{(i)}(x_k \mid z^k)\,dx_k$ and $P_{k|k}^{(i)} = \mathrm{MSE}(\hat{x}_{k|k}^{(i)} \mid z^k)$. If the base state $x_k$ and measurements $z^k$ are jointly Gaussian under $m_{(i)}^{k}$, then $f^{(i)}(x_k \mid z^k) = \mathcal{N}(x_k; \hat{x}_{k|k}^{(i)}, P_{k|k}^{(i)})$ and thus conditional filtering operations for MMSE and MAP estimation are theoretically the same. For the general, non-Gaussian case, neither class has an explicit, analytic form. However, if model $m^{(i)}$ is linear (i.e., the system conditioned on $m_{(i)}^{k}$ is linear) and satisfies the whiteness and uncorrelatedness assumption of the Kalman filter for the process and measurement noises, recursive linear MMSE estimation $(\hat{x}_{k-1|k-1}^{(i)}, P_{k-1|k-1}^{(i)}) \to (\hat{x}_{k|k}^{(i)}, P_{k|k}^{(i)})$ is given by the Kalman filter explicitly regardless of Gaussianity (see, e.g., [21]).
This is not the case for the conditional filtering for MAP estimation.

Adaptive estimation was studied by many (see, e.g., [184]), including the case of unknown time-invariant parameters with independent measurements. These results were extended by Magill [235] in a nonrecursive form to scalar, dependent measurements generated by state-space models with unknown discrete parameters. Magill's results were further extended by Lainiotis, et al. [315, 183, 179, 182, 180] to a recursive form, with an exact error covariance form, for vector measurements and arbitrary continuous and discrete parameters. Early works dealt only with estimation for systems with a time-invariant mode that is unknown (a nonrandom constant) or uncertain (a random variable, but not a random process), which led to the autonomous operations of conditional filtering [315, 308, 118, 185].

Many reinventions, extensions, and applications of this generation can be found in the literature under various names, including the "partition (partitioned or partitioning) filter" [179, 182, 180, 347], the "multiple model adaptive filter" [179, 182], the "parallel processing algorithm" [7], the "multiple model adaptive estimator" [242], the "static multiple-model algorithm" [18, 216], the "filter bank method" [65], the "self-tuning estimator" [65], the "operating regime approach" [156], and in the same spirit the "mixture of experts" [254, 73]. These names suggest the structure, features, and capability of the first generation, particularly in comparison with non-MM algorithms. For example, this MM algorithm was applied recently in [181] to state estimation of a nonlinear system without model uncertainty. It runs a bank of perturbed (linearized) Kalman filters, each with a nominal state trajectory that is a random realization of the true one, as proposed in [179]. A large number of nominal trajectories are generated first and then clustered to achieve better cost effectiveness [181].

C. Output Processing for MM Estimation

The output processor generates the overall estimate using information available from all elemental filters as well as data. It is in general an information fusion process. This type of fusion, however, differs from multisensor data fusion in at least two aspects [195]: 1) at most one (and at least one under Assumption A2) estimate is correct (but not necessarily precise) in the MM fusion (but which one is unknown), whereas more than one estimate may be correct in sensor fusion, and 2) different filters use the same data in the MM fusion but different data in sensor fusion. Note, however, that the second difference is not fundamental since the same data can be treated artificially as multiple pieces of data with perfect coupling, that is, sensor fusion with dependent data.

The principles proposed for output processing of AMM algorithms turn out to be general, applicable to later generations as well, although their specific implementations are tuned for AMM algorithms. In view of this, the following discussion is directed to the general MM, not just AMM, algorithms. These principles can be classified into two groups: hard decision and soft decision [192, 195].

Hard Decision: This approach identifies a "good" subset $B$ of model sequences by a hard decision procedure, and then generates the overall estimate from the estimates conditioned on these model sequences. The set $B$ may change with respect to time as more data are collected, although only constant model sequences are considered in the AMM algorithms. The most natural subset $B$ is the set of model sequences deemed most likely or "not too unlikely." Several different decision procedures for this purpose have been proposed, including what can be called B-best approach and the Viterbi algorithm. This hard decision amounts to pruning of unlikely model sequences. It is applied here to output processing.
As discussed later in Section VB3, the same procedure has also been applied to hypothesis reduction as a cooperation strategy. These two applications have appeared together so far, which is natural, although each actually stands alone without the other.

The Viterbi algorithm or forward dynamic programming (see, e.g., [115]) can identify the single best (i.e., most probable) model history for each model at any time. It always has $M$ (i.e., number of models) survived model histories at any time $k$. Each of them is the best model history for a model-based filter at a time. These $M$ histories definitely include the best one $\hat{m}_{\mathrm{MAP}}^{k}$ but they are not necessarily the $M$ best ones (for second and third generations) because, for example, the second best model history for a model may be better than the best one for another model. In short, this procedure selects "the fittest" for each model; some of these fittest can be quite unfit to the overall truth, but they are needed to guarantee "survival of the fittest" for the future.

In the B-best approach, the subset $B$ of the $B$ most probable model sequences at time $k$ can be found by MAP hypothesis tests (see, e.g., [129], [338]) using all data through $k$ in a batch format. However, recursive implementations are usually needed. Recursive B-best algorithms rely on MAP hypothesis tests using data available at the time, where the test statistic is usually a function of mode sequence probabilities and conditional measurement residuals. Many MM algorithms for adaptive control make decisions in this way. However, they actually do not guarantee to yield the $B$ most probable sequences over the entire time horizon because some or all of these best sequences may have less probable partial histories and been deleted earlier in the process. In contrast, the Viterbi algorithm can be implemented recursively if the model sequence is a (hidden) Markov chain with mode-history independent measurements.
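The recursive Viterbi selection described above, which keeps one best model history per model at each time, can be sketched as follows. This is a generic hidden-Markov-chain version under the stated condition of mode-history independent measurement likelihoods; the function name `viterbi_histories` and its arguments (log prior, log transition matrix, per-time log likelihoods) are assumptions of this sketch, not an interface from the surveyed literature.

```python
import numpy as np

def viterbi_histories(log_prior, log_trans, log_lik):
    """Best model history ending in each model (forward dynamic programming).

    log_prior[i]    : log prior probability of model i at the first time step
    log_trans[i, j] : log probability of a jump from model i to model j
    log_lik[k, j]   : log likelihood of the k-th measurement under model j,
                      assumed not to depend on the earlier mode history
    Returns (score, histories): score[j] is the unnormalized log probability of
    the best history ending in model j, and histories[j] is that history.
    """
    K, M = log_lik.shape
    score = log_prior + log_lik[0]
    back = np.zeros((K, M), dtype=int)
    for k in range(1, K):
        # cand[i, j]: best history ending in model i, followed by a jump i -> j.
        cand = score[:, None] + log_trans
        back[k] = np.argmax(cand, axis=0)
        score = cand[back[k], np.arange(M)] + log_lik[k]
    # Trace back one surviving history per model.
    histories = np.zeros((M, K), dtype=int)
    histories[:, -1] = np.arange(M)
    for k in range(K - 1, 0, -1):
        histories[:, k - 1] = back[k, histories[:, k]]
    return score, histories
```

The history `histories[np.argmax(score)]` is the single most probable one overall; the other surviving histories are only the best for their own final model, which is why they need not be the M most probable sequences.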
Another drawback of the B-best approach is that some or many of the $B$ most probable model sequences may be quite similar and had better be merged to reduce processing load. More details and an illustrative example are given in Section VB3.

Knowing the above pros and cons of these two approaches, it appears sensible to combine them as follows. Use the Viterbi algorithm for hypothesis reduction (i.e., to decide which model sequences should be maintained), which involves $M^2$ conditional filtering operations (see Section VB5); but use the B-best algorithm to select $B$ most probable ones out of the $M^2$ model sequences for output processing if we can only afford processing $B$ sequences at a time for output, which may be the case for the MAP base-state estimator $\hat{x}_{k|k}^{\mathrm{MAP}}$ as the peak location of a mixture density of $B$ components.

In the extreme case of the "survival of the fittest" where the set $B$ has only one sequence, the estimate based on the single most probable mode sequence $i$ is taken to be the overall estimate: $(\hat{x}_{k|k}, P_{k|k}) = (\hat{x}_{k|k}^{(i)}, P_{k|k}^{(i)})$. We emphasize that compared with the conventional "decision-estimation" non-MM approach, this "estimation-decision" MM approach is superior even in this extreme case because, as explained at the beginning of Section IIIA, it makes a more informed decision since the decision is made after the completion of conditional filtering (estimation).

Other hard decision procedures are possible, including heuristic rules, expert systems, neural networks, and so on (see Section VIII).

The overall estimate given a subset $B$ of more than one model sequence is usually taken to be a probabilistically weighted sum of estimates based on model sequences in $B$, but other ways as described next could also be used.

Soft Decision: The output processing does not need to involve hard decisions.
In fact, the most widely used MMSE-based estimators generate the overall estimate by a weighted sum of estimates $\hat{x}_{k|k}^{(i)}$ from all elemental filters with mode-sequence probabilities $\mu_k^{(i)}$ as the weights: $\hat{x}_{k|k}^{\mathrm{MMSE}} = \sum_i \hat{x}_{k|k}^{(i)}\mu_k^{(i)}$. Likewise for $\hat{m}_{k|k}^{\mathrm{MMSE}}$. This can be thought of as a soft decision for convenience.$^4$

$^4$ However, this is not to be confused with the soft decision in communication and decoding, which refers to a decision that is revocable.

This soft decision is often applied in practice to fuse non-MMSE estimates as well. This is particularly popular for a hybrid system with non-Gaussian linear subsystems to which the Kalman filters are applicable. In this case, $\hat{x}_{k|k}^{(i)}$ are linear MMSE (LMMSE) estimators (i.e., have minimum MSE of all linear estimators), not MMSE estimators. As a result, $\hat{x}_{k|k} = \sum_i \hat{x}_{k|k}^{(i)}\mu_k^{(i)}$ is neither MMSE nor LMMSE. It is a nonlinear estimator hopefully close to the MMSE estimator. Although this hope lacks strong theoretical support, this $\hat{x}_{k|k}$ can be expected to beat the overall LMMSE estimator $\hat{x}_{k|k}^{\mathrm{LMMSE}}$ (see the end of Section VA for evidence).

Like any discrete-valued quantity, for a mode sequence, MAP estimation $\hat{m}_{\mathrm{MAP}}^{k}$ is actually an alias of MAP decision (test). The special MAP estimator $\hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}_{\mathrm{MAP}}^{k})$ is hard decision based, equal to the component MAP estimator $\hat{x}_{k|k}^{(i)\mathrm{MAP}}$ corresponding to the most probable mode sequence $\hat{m}_{\mathrm{MAP}}^{k}$. Likewise for $\hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}^{k})$. The JMAP estimator $\hat{x}_{k|k}^{\mathrm{JMAP}}$ is also equal to one of the component MAP estimators $\hat{x}_{k|k}^{(j)\mathrm{MAP}}$ by a hard decision. However, the MAP estimator $\hat{x}_{k|k}^{\mathrm{MAP}}$ relies on a soft decision that requires in general all the component densities as well as the mode-sequence probabilities [99].
Still another class of soft decision procedures is nonprobabilistic, such as those based on Dempster-Shafer evidence theory, fuzzy logic, and neural networks, as discussed in Section VIII.

Other approaches to output processing are possible, such as combinations of the above approaches.

D. Convergence of AMM Estimates

Magill's original work [235] includes sufficient conditions on the convergence of the mode probabilities for a single-output linear system. This was extended to multiple outputs by others [315, 183, 179, 180]. It was shown in [130] and [25] that the correct (true) model has a probability that tends to unity (almost surely) as time increases under Assumptions A1 and A2 and that [25] the correct model and any of the incorrect models 1) cannot be perfectly distinguished in finite time by their likelihood functions $f(z^k \mid m_{(i)}^{k})$ and 2) do not have identical marginal likelihood functions $f(z_k \mid z^{k-1}, m_{(i)}^{k})$ as $k$ increases. As such, $\hat{m}_{k|k}^{\mathrm{ML}}$, $\hat{m}_{k|k}^{\mathrm{MAP}}$, and $\hat{m}_{k|k}^{\mathrm{MMSE}}$ are all consistent estimators in that they converge with probability one to the true model ($\hat{m}^{\mathrm{MMSE}}$ is also mean-square consistent) under the assumptions stated above [25]. Here the ML estimator $\hat{m}_{k|k}^{\mathrm{ML}}$ is the model with the largest model likelihood.

For a set $\mathbb{M}$ of linear time-invariant systems with white, uncorrelated, stationary Gaussian noise under Assumptions A1 and A2, the conditions 1 and 2 defined above can be replaced by the easily verifiable condition [25] $\rho_s^{(i)} \neq 0$, $\forall i \neq s$, $m^{(i)} \in \mathbb{M}$, with$^5$

\[
\begin{aligned}
\rho_s^{(i)} &= |\rho^{(i)} - \rho_s| \\
\rho^{(i)} &= \log\det(S^{(i)}) + \mathrm{tr}[(S^{(i)})^{-1}\tilde{S}^{(i)}] \\
\rho_s &= \rho^{(i)}\big|_{m^{(i)} = s} = \log\det(\tilde{S}) + \dim(z_k)
\end{aligned}
\qquad (18)
\]

or by the even more easily verifiable condition [22]: $\det(\tilde{S} - S^{(i)}) \neq 0$, $\forall i \neq s$, $m^{(i)} \in \mathbb{M}$, where $\tilde{S}$ and $S^{(i)}$ are the steady-state measurement prediction covariances (MSE matrices) of the correct model $s$ and an incorrect model $m^{(i)}$, respectively, as calculated in the corresponding Kalman filters, and $\tilde{S}^{(i)}$ is the true steady-state measurement prediction MSE matrix of the incorrect model $m^{(i)}$.

$^5$ Note that $\rho_s^{(i)}$ is always dimensionless regardless of what units are used for $S^{(i)}$, $\tilde{S}^{(i)}$, and $\tilde{S}$.

For AMM estimation of the base state, we then have

\[
\begin{aligned}
\hat{x}_{k|k}^{\mathrm{MMSE}} &\to \hat{x}_{k|k}^{s,\mathrm{MMSE}} \\
\hat{x}_{k|k}^{\mathrm{MAP}},\ \hat{x}_{k|k}^{\mathrm{JMAP}},\ \hat{x}_{k|k}^{\mathrm{MAP}}(\hat{m}_{k|k}^{\mathrm{MAP}}) &\to \hat{x}_{k|k}^{s,\mathrm{MAP}} \\
\hat{x}_{k|k}^{\mathrm{ML}},\ \hat{x}_{k|k}^{\mathrm{ML}}(\hat{m}_{k|k}^{\mathrm{ML}}),\ \hat{x}_{k|k}^{\mathrm{ML}}(\hat{m}_{k|k}^{\mathrm{ML}}) &\to \hat{x}_{k|k}^{s,\mathrm{ML}}
\end{aligned}
\]

as $k \to \infty$ with probability one under the stated assumptions. Here $\hat{x}_{k|k}^{s,\mathrm{MMSE}}$, $\hat{x}_{k|k}^{s,\mathrm{MAP}}$, and $\hat{x}_{k|k}^{s,\mathrm{ML}}$ are MMSE, MAP, and ML estimators based on the true model, and $\hat{x}_{k|k}^{\mathrm{ML}}$, $\hat{x}_{k|k}^{\mathrm{ML}}(\hat{m}_{k|k}^{\mathrm{ML}})$, $\hat{x}_{k|k}^{\mathrm{ML}}(\hat{m}_{k|k}^{\mathrm{ML}})$ were defined in Section IIID.

The above results hold when the true model $s$ is in $\mathbb{M}$ (Assumption A2). For any model pair $m^{(i)}$ and $m^{(j)}$ in $\mathbb{M}$, as shown in [24], the likelihood ratio $f(z^k \mid m_{(i)}^{k})/f(z^k \mid m_{(j)}^{k})$ goes to zero with probability one if $\rho^{(j)} < \rho^{(i)}$ and only if $\rho^{(j)} \le \rho^{(i)}$ under Assumption A1 and the assumption that the corresponding measurement residual sequences of the linear-Gaussian system are ergodic and have a finite and positive-definite steady-state mean-square matrix. It follows that regardless of whether the true model $s$ is in $\mathbb{M}$ or not, the probability of the model $m^{(i)}$ in $\mathbb{M}$ with the smallest "distance" $\rho_s^{(i)}$ to the true model $s$ tends to unity almost surely as time increases under the assumptions stated above (but without A2) if this smallest "distance" is unique [24]. Consequently, all the above convergence results for $\hat{m}_{k|k}$ and $\hat{x}_{k|k}$ of the linear-Gaussian systems hold true if the true model $s$ is replaced by the (assumed unique) model closest to it.$^6$ Closely related, but slightly less convenient results were obtained in [130] based on the Kullback-Leibler information. Note that in this case MMSE- and MAP-AMM estimators all (implicitly and incorrectly) assume that the true model is in $\mathbb{M}$, but ML-AMM estimators need not and would not benefit from this assumption.

$^6$ The "distance" $\rho_s^{(i)}$ was given the following Kullback-type information theoretic interpretation in [24]. Let

\[
\begin{aligned}
\lambda_k(i, j) &= \log[f(z^k \mid m_{(i)}^{k})/f(z^k \mid m_{(j)}^{k})] - \log[f(z^{k-1} \mid m_{(i)}^{k})/f(z^{k-1} \mid m_{(j)}^{k})], \\
d(i, j) &= \lim_{k \to \infty} |E[\lambda_k(i, j) \mid s^k]|, \qquad d^{(i)} = d(s, i)
\end{aligned}
\]

where $s$ is the true model and $s^k$ is the constant true model sequence through time $k$. Then $d^{(i)} \ge d^{(j)}$ if and only if $\rho_s^{(i)} \ge \rho_s^{(j)}$. As shown in [24], $d(i, j)$ is a pseudo distance (i.e., satisfies the triangle inequality and is symmetric and nonnegative definite, but not positive definite, hence the prefix pseudo and the need for the extra uniqueness assumption), although it is closely related to the Kullback-Leibler information measure $I(s, j) = E(\log[f(z_k \mid s^k)/f(z_k \mid m_{(j)}^{k})] \mid s^k)$, which does not satisfy the triangle inequality and is not symmetric. Despite this connection of $\rho_s^{(i)}$ to information "distance," no general results are available for nonlinear, non-Gaussian systems based on $d^{(i)}$ directly.

As argued in [234], when the true model $s$ is not in $\mathbb{M}$, the convergence of the probability of the closest-to-truth model to unity is not necessarily desirable because many models in $\mathbb{M}$ may capture distinct characteristics of the true one, and this is better reflected in the limit. An AMM algorithm with mode "probabilities" converging to constants other than zero and one was proposed in [234]. It modifies Bayes' rule for mode probability update by replacing the predicted mode probabilities with constant (e.g., uniform) prior probabilities.

E. Tracking Applications

The AMM (first generation) algorithms have numerous applications outside of the target tracking area. They are particularly popular for problems involving unknown parameters (see, e.g., [347]).
They form an important approach for dealing with systems subject to faults (see, e.g., [26]). However, several more recent publications [106, 253, 371, 297] demonstrated that the second- and third-generation MM algorithms still outperform the AMM algorithms for such applications, wherein the system mode does jump, in violation of the basic assumptions of the AMM estimation. This is easily understandable for people engaged in target tracking research and development, in view of the duality between fault detection and maneuvering target tracking [164]. For obvious reasons, here we only survey limited AMM applications in maneuvering target tracking.

Although not a straightforward implementation of the AMM estimator, the maneuvering target tracking algorithm proposed in [332] has some features of the AMM algorithms. It combines a nonmaneuver filter and a maneuver estimator chosen by a hard decision logic. The maneuver estimator itself is a two-model AMM procedure, which generates the overall estimate by a probabilistically weighted sum.

In [245], a two-model AMM algorithm was designed, which includes a (nearly) constant-acceleration (CA) model plus either a Singer model or a 3D (nearly) constant-turn (CT) model (see [209]), for line-of-sight angular tracking of a close-range, highly maneuverable, airborne target using forward looking infrared sensor measurements. Further enhancements of this image tracker were proposed and analyzed in [334]. Some of the elemental filters were allowed to have a rectangular field of view; the algorithm was tuned to harsher target dynamics by considering both Gauss-Markov acceleration and constant turn-rate models; and an initial target acquisition algorithm was devised to remove significant biases in the estimated target template to be used in a correlator within the tracker.
Further along these lines, a 3-model AMM configuration based on a second-order Markov acceleration model [209] together with the CA and Singer models was investigated in [357]. For a different application, in [178] a 3-model AMM algorithm with a first-order Gauss-Markov acceleration, first-order Gauss-Markov velocity, and a nearly constant position model of pilot’s head motion was applied as a predictor for a virtual environment flight simulator. Also, an AMM algorithm preceding the above applications can be found in [252]. These publications, via their demonstrations of the superiority of the MM approach to the single-model-based trackers (e.g., extended Kalman filter (EKF)), have also helped establish that using well-selected, possibly more sophisticated models for certain tracking applications can reduce the number of elemental filters significantly since they provide a better coverage of possible motion modes than simple models. Results of a comprehensive study were presented recently in [276] on the capabilities of an AMM algorithm for tracking and interception of a highly maneuverable fighter aircraft armed with electronic countermeasures (ECM) by an air-to-air missile equipped with a monopulse radar seeker. The scenario involves a sequence of periodical evasive maneuvers of the aircraft and electronic jinking7 generated by the aircraft ECM system. The system mode space in terms of the maneuver-jinking pairs of evasion strategies is unknown to the tracker (the missile homing system), and was approximated via “quantization” by a set of 45 strategies, serving as the “ground truth” in the simulations. Due to feasibility considerations, however, the design of the MM tracker included only a small fixed set of six most representative models, selected by a “trial-and-error” process to cover the mode space reasonably. Each of the six elemental EKFs with an 11-dimensional state was carefully tuned to a particular evasion strategy (motion-jinking model). 
While demonstrating a significant improvement over previous, non-MM filters (e.g., single EKF), the simulation results presented therein again exposed some typical deficiencies of the AMM algorithms. These include failed or delayed identification (the filter may lock on an erroneous model and fail to switch to the true one) and poor estimation, mainly due to the mismatch between the dominant model and the true mode. Overall, besides the important practical results as well as demonstrating the superiority of the MM approach to single filters, this study illustrates that AMM estimation is not well suited to situations with frequent modal changes, for which later generations are more suitable. Also, covering a large mode space by a small fixed set of models inevitably causes a large model-truth mismatch that normally leads to inadequate estimation performance. Instead, the third-generation, VSMM algorithms appear much more appealing for this challenging, practical problem, characterized by a large mode space and high motion dynamics. Another study of AMM implementation for a similar air-to-air missile guidance problem was reported in [269].

$^7$ It is a periodic switching of the aircraft's apparent radar reflection center from one wing tip to the other.

Fig. 3. Possible model sequences of CMM and AMM algorithms. (a) CMM. (b) AMM.

The AMM approach was employed for another important class of problems, ballistic target tracking [104, 103, 312]. Tracking of a ballistic target was considered in [104] and [103] for all flight phases: boost (including postboost), coast (free flight), and reentry (possibly maneuverable) [205]. The tracker designed consists of seven autonomous filters running in parallel, each corresponding to one of the three specific flight phases.
The output of the algorithm is selected from those of the elemental filters by a hard decision logic, e.g., from the one based on the most likely model: $\hat{x}(\hat{m}^{\mathrm{ML}})$. Using an autonomous system of filters appears more reasonable here than in such applications as aircraft tracking since the modal changes here are infrequent. The use of a hard decision for output was motivated by two considerations. First, models of the different flight phases have state vectors of different dimensions and fusion of estimates of these vectors by soft decision for output is nontrivial and lacks theoretical support. Second, such a fused estimate is statistically inferior to the single best estimate for interception and destruction applications where the hit rate is more important than average errors [220]. These arguments for hard decision, albeit sensible, are debatable. For example, it can be argued that a hard decision is in general more prone to false alarms than a soft decision, possibly resulting in much poorer estimates. For such reasons soft decision based schemes (e.g., IMM) have been adopted more often for similar problems [160, 257, 39, 141, 142, 80].

In [312], the focus is on tracking a tactical ballistic missile in its incoming phase capable of random (bang-bang) evasive maneuvers for the purposes of interception and destruction. From the estimation point of view, the main uncertainty facing the tracker in such a scenario, embedded in a differential game framework, reduces to the unknown maneuver onset time. Thus the models involved differ only in the maneuver onset time.⁸ The choice of the AMM configuration appears appropriate within this formulation as far as estimation of the target state after the maneuver is concerned since a mode here cannot jump to another.

⁸ This idea of using MMs for the maneuver onset time was not new: the AMM of [235] was mentioned in [64] as an option for the input estimation method; however, it was abandoned and a hard decision (generalized likelihood ratio test approach) was used instead.

V. THE SECOND GENERATION: COOPERATING MM ESTIMATION

A. Optimal Cooperating MM Estimation

Fundamental Assumptions: In the second generation, cooperating MM (CMM) algorithms, the fundamental Assumption A2 of the first generation is retained without a change, but A1 is relaxed to allow a time-varying mode sequence $\{s_k\}$. Similar to the first generation, this sequence in principle can be random or deterministic, but for tractability, it is almost always more conveniently assumed to be a random process, in particular a Markov (or semi-Markov) process, as stated formally below.

A1′. The true mode sequence $\{s_k\}$ is Markov (or semi-Markov).

If $\{s_k\}$ is indeed random, the above Markovian assumption is justifiable for many applications. A hybrid system with A1′ is known as a Markov jump system. Like the first generation, under A2 there is no mode-model mismatch and thus we again use $m$ to denote both the mode and the model and treat a model as a realization of a mode in this section.

Under A2, there are $M^k$ possible model sequences (or realizations) through time $k$, a number that increases exponentially with time, where $M = |\mathbb{M}|$ is the number of possible models at each time. This full tree (more precisely, trellis) is illustrated in Fig. 3(a) for a three-model case. Under A1, it reduces to the corresponding "tree" for the first generation, as shown in Fig. 3(b).
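The growth in the number of model sequences can be made concrete with a small enumeration. The following is only an illustrative sketch; the function names `model_sequences` and `constant_sequences` are hypothetical and models are represented simply by integer indices.

```python
from itertools import product


def model_sequences(num_models, k):
    """All index sequences (i_1, ..., i_k) with each i_n in 0..num_models-1."""
    return list(product(range(num_models), repeat=k))


def constant_sequences(num_models, k):
    """Sequences allowed when the mode never changes (the first-generation case)."""
    return [(i,) * k for i in range(num_models)]


if __name__ == "__main__":
    # Three models over four time steps: 3**4 sequences versus 3 constant ones.
    print(len(model_sequences(3, 4)))      # 81
    print(len(constant_sequences(3, 4)))   # 3
```

With three models the full trellis already has 81 sequences after four steps, whereas only three remain when mode jumps are excluded.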
We denote a generic event of a model sequence through time $k$ as

$$m^k_{(i^k)} = m^k_{(i_1,\ldots,i_k)} = \{m_1^{(i_1)}, \ldots, m_k^{(i_k)}\} = \{s^k = m^{(i^k)}\} \tag{19}$$

and the set of all such sequences as $\mathbb{M}^k$, where

$$s^k = (s_1, \ldots, s_k), \quad m^{(i^k)} = (m^{(i_1)}, \ldots, m^{(i_k)}), \quad m^{(i_n)} \in \mathbb{M}. \tag{20}$$

For simplicity we use $i^k \in \mathbb{M}^k$ to mean $m^k_{(i^k)} \in \mathbb{M}^k$ (i.e., use $\mathbb{M}$ to denote the index set of $\mathbb{M}$) and refer to $m^k_{(i^k)}$ as a sequence, rather than an event.

The discussion in the rest of this subsection is largely parallel to that of the optimal AMM estimation in Section IVA. It is nonetheless presented here because of its importance for understanding suboptimal CMM algorithms presented later.

MMSE-CMM: The MMSE-CMM base-state estimator at time $k$ is given by (see, e.g., [21])

$$\hat{x}_{k|k} = E[x_k \mid z^k, \mathrm{A1}', \mathrm{A2}] = \sum_{i^k \in \mathbb{M}^k} E[x_k \mid z^k, m^k_{(i^k)}]\, P\{m^k_{(i^k)} \mid z^k, \mathrm{A1}', \mathrm{A2}\} = \sum_{i^k \in \mathbb{M}^k} \hat{x}^{(i^k)}_{k|k} \mu_k^{(i^k)} \tag{21}$$

$$\phantom{\hat{x}_{k|k}} = \sum_{i \in \mathbb{M}} E[x_k \mid z^k, m_k^{(i)}]\, P\{m_k^{(i)} \mid z^k, \mathrm{A1}', \mathrm{A2}\} = \sum_{i \in \mathbb{M}} \hat{x}^{(i)}_{k|k} \mu_k^{(i)} \tag{22}$$

where $\mu_k^{(i^k)} = P\{m^k_{(i^k)} \mid z^k, \mathrm{A1}', \mathrm{A2}\}$ is the posterior mode-sequence probability assuming that the mode sequence in effect is one and only one but possibly any one in the set $\mathbb{M}^k$, $\hat{x}^{(i^k)}_{k|k} = E[x_k \mid z^k, m^k_{(i^k)}]$ is the conditional MMSE estimate assuming sequence $m^k_{(i^k)}$ is true, $\mu_k^{(i)} = P\{m_k^{(i)} \mid z^k, \mathrm{A1}', \mathrm{A2}\}$ is the posterior mode probability under A1′ and A2 that $m^{(i)}$ is in effect at time $k$, and $\hat{x}^{(i)}_{k|k} = E[x_k \mid z^k, m_k^{(i)}]$ is the conditional MMSE estimate assuming model $m^{(i)}$ is in effect at time $k$. The latter in effect lumps a lot of $\hat{x}^{(i^k)}_{k|k}$. Like the first generation, under A1′ and A2 $\hat{x}^{\mathrm{MMSE}}_{k|k}$ is unbiased and has a conditional MSE matrix that is the minimum over all base-state estimators $\hat{x}_{k|k}$, given by (see, e.g., [21])

$$P_{k|k} = \mathrm{MSE}(\hat{x}_{k|k} \mid z^k, \mathrm{A1}', \mathrm{A2}) = \sum_{i^k \in \mathbb{M}^k} \big[\mathrm{MSE}(\hat{x}^{(i^k)}_{k|k} \mid z^k, \mathrm{A1}', \mathrm{A2}) + (\hat{x}^{(i^k)}_{k|k} - \hat{x}_{k|k})(\hat{x}^{(i^k)}_{k|k} - \hat{x}_{k|k})'\big] \mu_k^{(i^k)} \tag{23}$$

$$\phantom{P_{k|k}} = \sum_{i \in \mathbb{M}} \big[\mathrm{MSE}(\hat{x}^{(i)}_{k|k} \mid z^k, \mathrm{A1}', \mathrm{A2}) + (\hat{x}^{(i)}_{k|k} - \hat{x}_{k|k})(\hat{x}^{(i)}_{k|k} - \hat{x}_{k|k})'\big] \mu_k^{(i)}. \tag{24}$$

Equations (23) and (24) hold true for any (optimal or nonoptimal) estimator $\hat{x}_{k|k}$ provided $\hat{x}^{(i^k)}_{k|k} = E[x_k \mid z^k, m^k_{(i^k)}]$ and $\hat{x}^{(i)}_{k|k} = E[x_k \mid z^k, m_k^{(i)}]$, respectively.

The corresponding mode estimator at time $k$ is given by

$$\hat{m}_{k|k} = E[m_k \mid z^k, \mathrm{A1}', \mathrm{A2}] = \sum_{i^k \in \mathbb{M}^k} E[m_k \mid z^k, m^k_{(i^k)}]\, P\{m^k_{(i^k)} \mid z^k, \mathrm{A1}', \mathrm{A2}\} = \sum_{i^k \in \mathbb{M}^k} m^{(i_k)} \mu_k^{(i^k)}$$

$$\phantom{\hat{m}_{k|k}} = \sum_{i \in \mathbb{M}} E[m_k \mid z^k, m_k^{(i)}]\, P\{m_k^{(i)} \mid z^k, \mathrm{A1}', \mathrm{A2}\} = \sum_{i \in \mathbb{M}} m^{(i)} \mu_k^{(i)}$$

$$\mathrm{MSE}(\hat{m}_{k|k} \mid z^k) = E[(m_k - \hat{m}_{k|k})(m_k - \hat{m}_{k|k})' \mid z^k, \mathrm{A1}', \mathrm{A2}] = \sum_{i^k \in \mathbb{M}^k} (m^{(i_k)} - \hat{m}_{k|k})(m^{(i_k)} - \hat{m}_{k|k})' \mu_k^{(i^k)} = \sum_{i \in \mathbb{M}} (m^{(i)} - \hat{m}_{k|k})(m^{(i)} - \hat{m}_{k|k})' \mu_k^{(i)}$$

where $m^{(i_k)}$ and $m^k_{(i^k)}$ are related by (19)–(20).

Although not seen in the literature, the estimators for the base-state sequence and mode sequence using a batch of data through $k$ can be obtained as

$$\hat{m}^{k|k} = E[m^k \mid z^k, \mathrm{A1}', \mathrm{A2}] = \sum_{i^k \in \mathbb{M}^k} E[m^k \mid z^k, m^k_{(i^k)}]\, \mu_k^{(i^k)} = \sum_{i^k \in \mathbb{M}^k} m^{(i^k)} \mu_k^{(i^k)} = (\hat{m}_{1|k}, \hat{m}_{2|k}, \ldots, \hat{m}_{k|k})$$

$$\hat{x}^{k|k} = E[x^k \mid z^k, \mathrm{A1}', \mathrm{A2}] = \sum_{i^k \in \mathbb{M}^k} E[x^k \mid z^k, m^k_{(i^k)}]\, \mu_k^{(i^k)} = \sum_{i^k \in \mathbb{M}^k} \hat{x}^{k|k}_{(i^k)} \mu_k^{(i^k)} = (\hat{x}_{1|k}, \hat{x}_{2|k}, \ldots, \hat{x}_{k|k})$$

where $m^{(i^k)}$, given by (20), stands for a possibly time-varying model sequence through $k$ with the index sequence $i^k = (i_1, \ldots, i_k)$, $\hat{x}^{k|k}_{(i^k)} = E[x^k \mid z^k, m^k_{(i^k)}]$ is the corresponding estimate of the base-state sequence, and $\hat{m}^{\mathrm{MMSE}}_{n|k}$ and $\hat{x}^{\mathrm{MMSE}}_{n|k}$ are smoothed MMSE estimates, given by

$$\hat{m}_{n|k} = E[m_n \mid z^k, \mathrm{A1}', \mathrm{A2}] = \sum_{i^k \in \mathbb{M}^k} E[m_n \mid z^k, m^k_{(i^k)}]\, \mu_k^{(i^k)} = \sum_{i^k \in \mathbb{M}^k} m^{(i_n)} \mu_k^{(i^k)}$$

$$\hat{x}_{n|k} = E[x_n \mid z^k, \mathrm{A1}', \mathrm{A2}] = \sum_{i^k \in \mathbb{M}^k} E[x_n \mid z^k, m^k_{(i^k)}]\, \mu_k^{(i^k)} = \sum_{i^k \in \mathbb{M}^k} \hat{x}^{(i^k)}_{n|k} \mu_k^{(i^k)}.$$

MAP-CMM: The mixed pdf-pmf of the base state at time $k$ and the mode sequence through time $k$ is

$$p(x_k, m^k \mid z^k, \mathrm{A1}', \mathrm{A2}) = f(x_k \mid z^k, m^k)\, p(m^k \mid z^k, \mathrm{A1}', \mathrm{A2}) = \{f_{(i^k)}(x_k \mid z^k)\, \mu_k^{(i^k)},\ i^k \in \mathbb{M}^k\}$$

$$\neq f(x_k \mid z^k, m_k)\, p(m_k \mid z^k, \mathrm{A1}', \mathrm{A2}) = \{f_{(i)}(x_k \mid z^k)\, \mu_k^{(i)},\ i \in \mathbb{M}\}$$

where $f_{(i^k)}(x_k \mid z^k) = f(x_k \mid z^k, m^k_{(i^k)})$ is the density assuming the mode sequence in effect is $m^k_{(i^k)}$. It follows that the base state has the posterior mixture density

$$f(x_k \mid z^k, \mathrm{A1}', \mathrm{A2}) = \sum_{i^k \in \mathbb{M}^k} f(x_k \mid z^k, m^k_{(i^k)})\, P\{m^k_{(i^k)} \mid z^k, \mathrm{A1}', \mathrm{A2}\} = \sum_{i^k \in \mathbb{M}^k} f_{(i^k)}(x_k \mid z^k)\, \mu_k^{(i^k)}$$

$$\neq \sum_{i \in \mathbb{M}} f(x_k \mid z^k, m_k^{(i)})\, P\{m_k^{(i)} \mid z^k, \mathrm{A1}', \mathrm{A2}\} = \sum_{i \in \mathbb{M}} f_{(i)}(x_k \mid z^k)\, \mu_k^{(i)}. \tag{25}$$

Note the difference: summation over the model sequences is the same as over the current models for MMSE estimation (see (21)–(22)), but not the same for MAP estimation (see above). Thus the MAP-CMM estimators are given by

$$\hat{x}^{\mathrm{MAP}}_{k|k} = \arg\max_{x_k} \sum_{i^k \in \mathbb{M}^k} f_{(i^k)}(x_k \mid z^k)\, \mu_k^{(i^k)} \tag{26}$$

$$(\hat{x}^{\mathrm{JMAP}}_{k|k}, \hat{m}^{k|k}_{\mathrm{JMAP}}) = \arg\max_{(x_k,\, m^{(i^k)})} \{f_{(i^k)}(x_k \mid z^k)\, \mu_k^{(i^k)},\ i^k \in \mathbb{M}^k\}$$

$$\hat{m}^{k|k}_{\mathrm{MAP}} = \arg\max_{m^{(i^k)}} \{\mu_k^{(i^k)},\ i^k \in \mathbb{M}^k\} = (\hat{m}_{1|k}, \hat{m}_{2|k}, \ldots, \hat{m}_{k|k})_{\mathrm{MAP}}$$

$$\hat{x}^{k|k}_{\mathrm{MAP}} = \arg\max_{x^k} \sum_{i^k \in \mathbb{M}^k} f_{(i^k)}(x^k \mid z^k)\, \mu_k^{(i^k)} = (\hat{x}_{1|k}, \hat{x}_{2|k}, \ldots, \hat{x}_{k|k})_{\mathrm{MAP}}$$

$$\hat{x}^{\mathrm{MAP}}_{k|k}(\hat{m}^k) = \arg\max_{x_k} f_{(i^k)}(x_k \mid z^k) \quad \text{if } \hat{m}^k = m^{(i^k)}$$

$$\hat{x}^{\mathrm{MAP}}_{k|k}(\hat{m}^{k|k}_{\mathrm{MAP}}) = \hat{x}^{\mathrm{MAP}}_{k|k}(\hat{m}^k)\big|_{\hat{m}^k = \hat{m}^{k|k}_{\mathrm{MAP}}}.$$

Unlike the MMSE estimation, the component $\hat{m}_{n|k}$ of a MAP sequence estimate $\hat{m}^{k|k}_{\mathrm{MAP}}$ is not equal to $\hat{m}^{\mathrm{MAP}}_{n|k}$, the MAP estimate of the component $m_n$ of the mode sequence $m^k$. Likewise for the base state.

Compared with Section IVA, these optimal CMM estimators clearly correspond to the respective optimal AMM estimators by replacing the constant mode sequence $m^k_{(i)}$ therein with a possibly time-varying mode sequence $m^k_{(i^k)}$.
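The soft-decision combination in (21)–(24) is a moment-matching operation on a weighted mixture. A minimal sketch follows, assuming the component estimates, their MSE matrices, and the posterior probabilities are already available; the function name `fuse_mmse` and its array layout are hypothetical choices made for illustration.

```python
import numpy as np


def fuse_mmse(estimates, covariances, weights):
    """Overall mean and MSE matrix of a weighted mixture, in the form of (21)-(24).

    estimates   : array of shape (N, n), one conditional estimate per component
    covariances : array of shape (N, n, n), the conditional MSE matrices
    weights     : array of shape (N,), the posterior component probabilities
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # guard against unnormalized weights
    xs = np.asarray(estimates, dtype=float)
    Ps = np.asarray(covariances, dtype=float)
    x = np.einsum("i,ij->j", w, xs)              # probability-weighted mean
    d = xs - x                                   # spread of each component about the mean
    P = np.einsum("i,ijk->jk", w, Ps) + np.einsum("i,ij,ik->jk", w, d, d)
    return x, P


if __name__ == "__main__":
    x, P = fuse_mmse([[0.0], [2.0]], [[[1.0]], [[1.0]]], [0.5, 0.5])
    print(x, P)   # mean 1.0; MSE 1.0 (within-component) + 1.0 (spread) = 2.0
```

The same routine applies whether the components are indexed by full model sequences or by the current model only, which is the point made by the equality of (21) and (22).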
As a result, all discussions of the AMM estimators regarding interpretations, dependence on component estimators, and MSE matrix apply to the CMM estimators accordingly.

MMSE versus MAP: As explained in Section IIID, the MMSE estimator $\hat{x}^{\mathrm{MMSE}}_{k|k}$ minimizes the MSE, while the MAP estimator $\hat{x}^{\mathrm{MAP}}_{k|k}$ maximizes the rate of hitting a tiny golf hole centered at $x_k$ in the base-state space. They are suitable for different applications, as elaborated in [220]. $(\hat{x}^{\mathrm{JMAP}}_{k|k}, \hat{m}^{k|k}_{\mathrm{JMAP}})$ is a joint estimator that maximizes the rate of hitting the "golf hole" and simultaneously choosing the correct model sequence.

As explained in Section IIID, $\hat{x}^{\mathrm{MMSE}}_{k|k}$ is the mean of the posterior mixture density; $\hat{x}^{\mathrm{MAP}}_{k|k}$ is the location of the highest peak of the mixture density; $\hat{x}^{\mathrm{JMAP}}_{k|k}$ and $\hat{x}^{\mathrm{MAP}}_{k|k}(\hat{m}^k)$ are locations of the highest peaks of the corresponding component densities. When the mixture density consists of many components, as is the case for CMM estimation with a random mode sequence having an exponentially increasing number of realizations, the use of $\hat{x}^{\mathrm{JMAP}}_{k|k}$ or $\hat{x}^{\mathrm{MAP}}_{k|k}(\hat{m}^k)$ for any $\hat{m}^k$ is on shaky ground: they are based on a component density which might have a very small (say, 1%), albeit largest of all components, probability of being the true density.⁹ The use of $\hat{x}^{\mathrm{MMSE}}_{k|k}$ and $\hat{x}^{\mathrm{MAP}}_{k|k}$ is on much firmer ground. If, however, the mode sequence is not random, the mixture density and hence $\hat{x}^{\mathrm{MMSE}}_{k|k}$ and $\hat{x}^{\mathrm{MAP}}_{k|k}$ are meaningless. In this case, the ML estimators $\hat{x}^{\mathrm{JML}}_{k|k}$ and $\hat{x}^{\mathrm{ML}}_{k|k}(\hat{m}^{k|k}_{\mathrm{ML}})$ appear to be reasonable choices.

The above discussion is based on Assumption A2 that the true mode is exactly one of the models in the set $\mathbb{M}$ used. While this assumption is exact or very reasonable for many communication or decoding problems, it is almost never true for maneuvering target tracking where the motion-mode uncertainty is almost never resolved exactly by the models used.
For instance, a target almost never takes an exact constant turn and even if it should, the turn rate would not be exactly equal to one of those used in the models. $\hat{x}^{\mathrm{JMAP}}_{k|k}$, $\hat{x}^{\mathrm{MAP}}_{k|k}(\hat{m}^k)$, $\hat{x}^{\mathrm{JML}}_{k|k}$, and $\hat{x}^{\mathrm{ML}}_{k|k}(\hat{m}^k)$ rely heavily on the impossibility of such a mismatch between modes and models. While assuming no such mismatch explicitly, $\hat{x}^{\mathrm{MMSE}}_{k|k}$ and $\hat{x}^{\mathrm{MAP}}_{k|k}$ are more robust against (less sensitive to) this mismatch because they rely on all component densities, rather than put all their eggs in one basket as the other MAP and ML estimators do. This in effect "covers" modes between any two models used (i.e., the convex set formed by the models used).

⁹ The MAP estimator $\hat{m}^{k|k}_{\mathrm{MAP}}$ of the mode sequence is still one of the most reasonable choices.

As explained in Sections IIID and IVB, given the probabilistic weights, the entire density function $f_{(i^k)}(x_k \mid z^k)$ of every component is needed to compute $\hat{x}^{\mathrm{MAP}}_{k|k}$, while $\hat{x}^{\mathrm{MMSE}}_{k|k}$ relies only on its first two moments $(\hat{x}^{(i^k)}_{k|k}, P^{(i^k)}_{k|k})$. Note, however, that calculation of the moments needed for $\hat{x}^{\mathrm{MMSE}}_{k|k}$ is an integration problem, as opposed to solving the maximization problem needed for $\hat{x}^{\mathrm{MAP}}_{k|k}$ given every component density, which can usually be reduced to differentiation and equation solving. Note also that finding the global maximizer of a mixture density numerically is much simpler than that of a general multivariate function. For instance, the global maximizer of a mixture density is bound to be in the convex set formed by the outmost peak locations of all component densities.

While the output processing of an MMSE-based MM algorithm is usually soft decision based in that its overall estimate is a weighted sum of results from each elemental filter, a hard decision based output processing could also be used.
Reference [98] proposed for certain applications to use the estimate from a single elemental filter that has the smallest MSE among all elemental filters. The best elemental filter $\hat{x}^{(i)}_{k|k}$ can be identified as the one with the smallest deviation $(\hat{x}^{(i)}_{k|k} - \hat{x}^{\mathrm{MMSE}}_{k|k})'(\hat{x}^{(i)}_{k|k} - \hat{x}^{\mathrm{MMSE}}_{k|k})$ from the optimal estimate because it can be easily shown [190] that

$$\|x_k - \hat{x}^{(i)}_{k|k}\|^2 = \|x_k - \hat{x}^{\mathrm{MMSE}}_{k|k}\|^2 + (\hat{x}^{(i)}_{k|k} - \hat{x}^{\mathrm{MMSE}}_{k|k})'(\hat{x}^{(i)}_{k|k} - \hat{x}^{\mathrm{MMSE}}_{k|k})$$

where $\|x_k - \hat{x}^{(i)}_{k|k}\|^2 = E[(x_k - \hat{x}^{(i)}_{k|k})'(x_k - \hat{x}^{(i)}_{k|k}) \mid z^k]$ is the conditional MSE. This in general requires knowledge of the MMSE-optimal estimate $\hat{x}^{\mathrm{MMSE}}_{k|k}$.

Linear MMSE: An optimal linear MMSE estimator was proposed in [81] for an MJLS, defined similarly by (1)–(2) except that $w$ and $v$ are replaced by mode-independent $w$ and $Dv$, with $D$ a matrix. The cornerstone of this LMMSE estimator is the introduction of the $(M \cdot n)$-dimensional stacked vector $y_k = [(y_k^{(1)})', \ldots, (y_k^{(M)})']'$ to represent the $n$-dimensional base state subject to the assumed model uncertainty among $M$ known models, where every $y_k^{(i)}$ is an $n$-dimensional zero vector except $y_k^{(j)} = x_k$ if model $m^{(j)}$ is true. The problem of estimating the hybrid state $(x_k, s_k)$ is thus reduced to the conventional problem of estimating $y_k$. As a result, the recursive optimal LMMSE estimator $\hat{y}_{k|k} = [(\hat{y}^{(1)}_{k|k})', \ldots, (\hat{y}^{(M)}_{k|k})']'$ of $y_k$ is available. The base-state estimator is simply $\hat{x}_{k|k} = \sum_i \hat{y}^{(i)}_{k|k}$ and $\mathrm{MSE}(\hat{x}_{k|k})$ is equal to the sum over all $n \times n$ blocks of $\mathrm{MSE}(\hat{y}_{k|k})$. A more informative but concise description of this LMMSE estimator was given in [192]. Simulation results given in [81] show that this optimal linear MMSE estimator performs in general not as well as the suboptimal nonlinear IMM estimator, and there was no single case considered in which the LMMSE estimator outperforms the IMM estimator significantly.
This demonstrates the high nonlinearity of hybrid estimation problems and the need for good nonlinear estimators. More recently, [83] showed that for a mean-square stable MJLS with an ergodic Markov chain, $\mathrm{MSE}(\hat{y}_{k|k})$ of this LMMSE estimator converges to a unique positive semidefinite solution of an algebraic Riccati equation. The corresponding matrix in the LMMSE estimator was replaced by this steady-state solution to arrive at a steady-state estimator, similar in spirit to the development of the steady-state Kalman filter. This steady-state estimator was generalized in [82] to the case where the system given a mode sequence involves some uncertain parameters.

Most Probable Trajectory Estimation: Conceptually similar to ML sequence estimation, the most probable trajectory (MPT) estimation is a nonlinear filtering approach that determines a state sequence fitting best to the data in some sense (see, e.g., [268]). References [370] and [227] considered state estimation of a continuous-time hybrid system with a fixed number $M$ of possible modes for the mode process $s(t)$, where $s(t)$ has $N$ known, possible distributions. A finite-dimensional hybrid filter in recursive form was presented in [370] that is optimal in the MPT sense, which includes optimal base-state sequence estimation and identification of the most probable distribution of $s(t)$. The MPT optimality here is with respect to a cost function defined for a system that is an average of the hybrid system over possible modes. This approach is quite robust with regard to noise characterization: it was applied in [227] to state estimation of a jump-linear system as a piecewise-linear approximation of a highly nonlinear system, including a satellite orbit determination example.

B. Cooperation Strategies for MM Estimation

Since the number of possible model sequences (hypotheses) increases exponentially with time (more precisely, geometrically with discrete time), brute-force implementations of the above optimal CMM estimators $\hat{x}^{\mathrm{MMSE}}_{k|k}$ and $\hat{x}^{\mathrm{MAP}}_{k|k}$ are infeasible. Consequently, strategies have been developed to cope with this difficulty. We refer to them as cooperation strategies. CMM algorithms are distinct from one another in the cooperation strategies used. Cooperation strategies developed so far can be classified into two general categories: hypothesis reduction strategies and iterative strategies.

Different hypothesis reduction strategies have been proposed to keep the number of hypotheses within a certain limit:
1) merge "similar" model sequences, resulting in a tree with combined branches, which can be achieved by soft decisions and in effect reinitialization of the filters with an (approximately) "equivalent" estimate and covariance;
2) prune "unlikely" model sequences, or select the best model sequences, including the globally best single sequence or the best sequence for each current model, both resulting in a truncated tree, which is necessarily hard-decision based;
3) randomly select a subset of the possible hypotheses;
4) others, such as decoupling weakly coupled mode sequences to form clusters, as briefly reviewed in [280].

The basic idea of merging and pruning is to replace the ever growing mixture tree with a simpler tree that approximates the original tree in some sense.
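As an illustration of the pruning strategy listed above, the following sketch keeps a fixed number of the most probable model sequences and renormalizes their probabilities. The function name `prune_hypotheses` and the `keep` parameter are hypothetical; this is one simple hard-decision rule, not a specific algorithm from the surveyed literature.

```python
import numpy as np


def prune_hypotheses(sequences, weights, keep):
    """Retain the `keep` most probable model sequences and renormalize.

    sequences : list of index tuples (i_1, ..., i_k)
    weights   : posterior probabilities of those sequences
    keep      : number of hypotheses to retain (assumed <= len(sequences))
    """
    w = np.asarray(weights, dtype=float)
    order = np.argsort(w)[::-1][:keep]       # indices sorted by decreasing probability
    kept = [sequences[i] for i in order]
    kept_w = w[order] / w[order].sum()       # probabilities of survivors sum to one again
    return kept, kept_w


if __name__ == "__main__":
    seqs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    kept, w = prune_hypotheses(seqs, [0.1, 0.5, 0.3, 0.1], keep=2)
    print(kept, w)   # [(0, 1), (1, 0)] with weights [0.625, 0.375]
```

Pruning discards the probability mass of the dropped branches, whereas merging would instead fold those branches into a moment-matched surrogate.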
In general, since the base state under A1′ and A2 has a mixture density of an ever growing number of components, the problem of hypothesis reduction is in essence that of mixture density reduction, which has been studied in other areas within target tracking (see, e.g., [299], [381]) as well as extensively in statistics (see, e.g., [333], [248]), and thus various mixture density reduction techniques can be applied here. Although closely related, hypothesis reduction is not to be confused with the output processing, discussed in Section IVC.

Random selection decides on a subset $C$ of all possible model sequences at random (not necessarily with an equal probability), performs the corresponding conditional filtering operations, and then generates the overall estimate from the conditional estimates. This approach can be regarded as a special, albeit extreme, selection strategy, especially when the selection is not equally likely. It has a straightforward batch implementation and several possible recursive implementations. For example, one natural recursive implementation is to select a set of model histories $m_*^{k-1}$ at random for each elemental filter at time $k$, where $m_*^{k-1} = (m_*^{k-2}, m_{k-1}^{(i)})$ is formed from some model history $m_*^{k-2}$ selected before and some model $m^{(i)}$ at time $k-1$. Another one is to select a set of model sequences $m_*^{k} = (m_*^{k-1}, m_k^{(i)})$ at random at time $k$. Note that the first implementation runs elemental filters based on every model at $k$, but this is not generally the case in the second implementation. An early publication of the random selection approach is [2]. More recently developed MM particle filtering algorithms also belong to this class. These results are surveyed in a subsequent part of this survey.

Another class is iterative strategies. Disengaged from the tree (mixture) structure of the distribution, they try to solve the estimation problem with recourse to the power of iteration.
More developed in this class are those based on the so-called EM algorithm, an elegant and powerful optimization method particularly suitable for ML and MAP estimation.

All results in Section VB rely on A1′ and A2, but we drop explicit indications of this dependence for simplicity.

1) GPB and IMM Merging Strategies: In these strategies, the ever growing hypothesis tree is approximated repeatedly by a simpler tree, each branch of which lumps "close" or "similar" branches of the original tree. In the following discussion, for simplicity we omit formulas for MSE matrices (error covariances).

a) GPB: A straightforward and probably the most natural implementation of this idea is the so-called generalized pseudo-Bayesian algorithms of order $n$ (GPBn) [148, 321, 18]. They reduce the hypothesis tree by having a fixed memory depth such that all the hypotheses that are the same in the latest $n$ time steps are merged and thus each of the $M$ filters runs $M^{n-1}$ times at each recursion. The GPB1 and GPB2 algorithms are the most popular ones in this class [1, 148, 74, 321, 18].

Although for simplicity our discussion below is based on the GPB2 algorithm explicitly, it can be extended to the general GPBn case straightforwardly.
Here the effects of different estimates $\hat{x}^{(i^k)}_{k|k} = E[x_k \mid z^k, m^k_{(i^k)}]$ with probabilities $\mu_k^{(i^k)} = P\{m^k_{(i^k)} \mid z^k\}$ based on the same model pair $m^{(i)}$ and $m^{(j)}$ at $k-1$ and $k$, respectively, are lumped (merged) by the single estimate $\hat{x}^{(i,j)}_{k|k}$ with probability $\mu^{(i,j)}_{k-1,k}$:

$$\hat{x}^{(i,j)}_{k|k} = E[x_k \mid z^k, m_{k-1}^{(i)}, m_k^{(j)}]$$

$$\mu^{(i,j)}_{k-1,k} = P\{m_{k-1}^{(i)}, m_k^{(j)} \mid z^k\}.$$

Since $m^k_{(i^k)} = (m^{k-2}_{(i^{k-2})}, m_{k-1}^{(i)}, m_k^{(j)})$, where $m^{k-2}_{(i^{k-2})} = (m_1^{(i_1)}, m_2^{(i_2)}, \ldots, m_{k-2}^{(i_{k-2})})$ is the mode history through time $k-2$, it follows from the total probability (expectation) theorem that $\hat{x}^{(i,j)}_{k|k}$ and $\mu^{(i,j)}_{k-1,k}$ are actually averages of $\hat{x}^{(i^{k-2},i,j)}_{k|k} = E[x_k \mid z^k, m^{k-2}_{(i^{k-2})}, m_{k-1}^{(i)}, m_k^{(j)}]$ and $\mu_k^{(i^{k-2},i,j)} = P\{m^{k-2}_{(i^{k-2})}, m_{k-1}^{(i)}, m_k^{(j)} \mid z^k\}$ over $m^{k-2}_{(i^{k-2})}$:

$$\hat{x}^{(i,j)}_{k|k} = \sum_{i^{k-2} \in \mathbb{M}^{k-2}} \hat{x}^{(i^{k-2},i,j)}_{k|k}\, P\{m^{k-2}_{(i^{k-2})} \mid z^k, m_{k-1}^{(i)}, m_k^{(j)}\}$$

$$\mu^{(i,j)}_{k-1,k} = \sum_{i^{k-2} \in \mathbb{M}^{k-2}} \mu_k^{(i^{k-2},i,j)}.$$

This lumping or merging is exact for MMSE output processing in that¹⁰

$$\hat{x}^{\mathrm{MMSE}}_{k|k} = \sum_{i^k \in \mathbb{M}^k} \hat{x}^{(i^k)}_{k|k} \mu_k^{(i^k)} = \sum_{(i,j) \in \mathbb{M}^2} \hat{x}^{(i,j)}_{k|k} \mu^{(i,j)}_{k-1,k} \tag{27}$$

which stems from the linearity of the conditional expectation. This lumping is, however, approximate for conditional filtering, which is a nonlinear operation in that the output $\hat{x}^{(j)}_{k|k}$ is nonlinear in the input $\hat{x}^{(j)}_{k-1|k-1}$. This is the case even for a linear-Gaussian system with a known mode sequence, where only one term of $\hat{x}^{(j)}_{k|k}$ is linear in $\hat{x}^{(j)}_{k-1|k-1}$ and the other term depends on the measurement at $k$. For recursive conditional filtering at $k$, this lumping approximates the recursion

$$\{\hat{x}^{(i^{k-1})}_{k-1|k-1}, \mu_{k-1}^{(i^{k-1})} : i^{k-1} \in \mathbb{M}^{k-1}\} \Longrightarrow \{\hat{x}^{(i^k)}_{k|k}, \mu_k^{(i^k)} : i^k \in \mathbb{M}^k\}$$

which consists of $M^k$ conditional filtering operations, by the simplified recursion

$$\{\hat{x}^{(i,j)}_{k-1|k-1}, \mu^{(i,j)}_{k-2,k-1} : (i,j) \in \mathbb{M}^2\} \Longrightarrow \{\hat{x}^{(i,j)}_{k|k}, \mu^{(i,j)}_{k-1,k} : (i,j) \in \mathbb{M}^2\}$$

which has only $M^2$ conditional filtering operations, $M$ for each model. In simple words, the lumping amounts to assuming

$$\hat{x}^{(i^{k-3},i,j)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}, m^{k-3}_{(i^{k-3})}, m_{k-2}^{(i)}, m_{k-1}^{(j)}] \approx \hat{x}^{(i,j)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}, m_{k-2}^{(i)}, m_{k-1}^{(j)}] \tag{28}$$

which can be called GPB2's fundamental assumption. The conditional estimates of a jump-linear system (4)–(5) are given explicitly, for $(i,j) \in \mathbb{M}^2$, by

$$\hat{x}^{(i,j)}_{k|k} = E[x_k \mid z^k, m_{k-1}^{(i)}, m_k^{(j)}] = F^{(j)}_{k-1} \hat{x}^{(i)}_{k-1|k-1} + K_k^{(j)} (z_k - H_k^{(j)} F^{(j)}_{k-1} \hat{x}^{(i)}_{k-1|k-1})$$

where $K_k^{(j)}$ is the Kalman filter gain at $k$, and $\hat{x}^{(i)}_{k-1|k-1}$, for $i \in \mathbb{M}$, is a lumped estimate:

$$\hat{x}^{(i)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}, m_{k-1}^{(i)}] = \sum_{l \in \mathbb{M}} \hat{x}^{(l,i)}_{k-1|k-1}\, P\{m_{k-2}^{(l)} \mid z^{k-1}, m_{k-1}^{(i)}\}. \tag{29}$$

¹⁰ This is the same as replacing bodies of a distributed mass by the component masses placed at their respective centroids for the calculation of the centroid of the total body system.

This GPB2 approximation can also be applied to CMM estimation based on non-MMSE criteria. For example,

$$f_{(i^{k-3},i,j)}(x_{k-1} \mid z^{k-1}) = f(x_{k-1} \mid z^{k-1}, m^{k-3}_{(i^{k-3})}, m_{k-2}^{(i)}, m_{k-1}^{(j)}) \approx f_{(i,j)}(x_{k-1} \mid z^{k-1}) = f(x_{k-1} \mid z^{k-1}, m_{k-2}^{(i)}, m_{k-1}^{(j)}).$$

Clearly, this approximation implies (28) but is not implied by (28) and thus is actually more fundamental than (28). Similarly as above, $f_{(i,j)}(x_{k-1} \mid z^{k-1})$ is actually the probabilistically weighted average of $f_{(i^{k-3},i,j)}(x_{k-1} \mid z^{k-1})$ over $m^{k-3}_{(i^{k-3})}$:

$$f_{(i,j)}(x_{k-1} \mid z^{k-1}) = \sum_{i^{k-3} \in \mathbb{M}^{k-3}} \big[f_{(i^{k-3},i,j)}(x_{k-1} \mid z^{k-1}) \cdot P\{m^{k-3}_{(i^{k-3})} \mid z^{k-1}, m_{k-2}^{(i)}, m_{k-1}^{(j)}\}\big].$$

The (MMSE-based) GPB1 algorithm [1] is based on approximately lumping (merging) the effects of all past model histories on $\hat{x}^{(i^{k-2},i)}_{k-1|k-1}$ with $\mu_{k-1}^{(i^{k-2},i)}$ to yield, for $i \in \mathbb{M}$:

$$\hat{x}^{(i^{k-2},i)}_{k-1|k-1} \approx \hat{x}^{(i)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}, m_{k-1}^{(i)}] \tag{30}$$

$$\mu_{k-1}^{(i)} = P\{m_{k-1}^{(i)} \mid z^{k-1}\} = \sum_{i^{k-2} \in \mathbb{M}^{k-2}} \mu_{k-1}^{(i^{k-2},i)} \tag{31}$$

which requires only $M$ conditional filtering operations, one for each model, at each recursion.

The standard GPBn strategy requires $M^n$ conditional filtering operations and storage of $M^{n-1}$ estimates $\hat{x}^{(i_{k-n},\ldots,i_{k-1})}_{k-1|k-1}$.
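The GPB2 lumping in (29) can be sketched numerically as follows, assuming the pair-conditioned estimates and their joint posterior probabilities have already been produced by the $M^2$ conditional filters. The function name `gpb2_merge` and the array layout are hypothetical, chosen only for this illustration.

```python
import numpy as np


def gpb2_merge(pair_estimates, pair_probs):
    """Merge pair-conditioned estimates over the older model index l, as in (29).

    pair_estimates[l, i] : estimate given model l at k-2 and model i at k-1, shape (M, M, n)
    pair_probs[l, i]     : joint posterior probability of that model pair, shape (M, M)
    Returns the M lumped estimates (shape (M, n)) and the M mode probabilities.
    """
    mu = np.asarray(pair_probs, dtype=float)
    xs = np.asarray(pair_estimates, dtype=float)
    mu_i = mu.sum(axis=0)                      # probability of model i at k-1
    cond = mu / mu_i                           # probability of model l at k-2 given model i at k-1
    x_i = np.einsum("li,lin->in", cond, xs)    # weighted average over l for each i
    return x_i, mu_i


if __name__ == "__main__":
    xs = np.array([[[0.0], [1.0]], [[2.0], [3.0]]])
    mu = np.array([[0.1, 0.2], [0.3, 0.4]])
    x_i, mu_i = gpb2_merge(xs, mu)
    # The overall MMSE output is unchanged by the merge, in line with (27).
    print(np.einsum("i,in->n", mu_i, x_i), np.einsum("li,lin->n", mu, xs))
```

The final line checks the exactness property for output processing: averaging the merged estimates gives the same overall estimate as averaging the unmerged ones.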
References [339] and [347] proposed to trade computation for storage by storing only the $nM$ estimates $\{\hat{x}^{(i)}_{k-n|k-n}, \ldots, \hat{x}^{(i)}_{k-1|k-1},\ i \in \mathbb{M}\}$, together with $\{z_{k-n}, z_{k-n+1}, \ldots, z_{k-1}\}$, and recomputing all $\hat{x}^{(i_{k-n},\ldots,i_{k-1})}_{k-1|k-1}$, which requires $(M^2 + M^3 + \cdots + M^n)$ conditional filtering operations.

b) Reinitialization: Hypothesis reduction for recursive single-scan CMM estimation amounts to reinitialization of each elemental filter since it is reflected in the inputs to elemental filters at each recursion [192]. Consider the recursion at time $k$ and denote by $\bar{X}^{(i)}_{k-1}$ the set of input quantities (omitting MSE matrices) to the elemental filter based on model $m^{(i)}$, as depicted in Fig. 4. Then $\bar{X}^{(i)}_{k-1}$ is as given by (32):

$$\bar{X}^{(i)}_{k-1} = \begin{cases} \{\hat{x}^{(i^{k-1})}_{k-1|k-1}, \mu_{k-1}^{(i^{k-1})} : i^{k-1} \in \mathbb{M}^{k-1}\} & \text{MMSE-CMM} \\ \{f_{(i^{k-1})}(x_{k-1} \mid z^{k-1}), \mu_{k-1}^{(i^{k-1})} : i^{k-1} \in \mathbb{M}^{k-1}\} & \text{MAP-CMM} \\ \{\hat{x}^{(i_{k-n},\ldots,i_{k-1})}_{k-1|k-1}, \mu^{(i_{k-n},\ldots,i_{k-1})}_{k-n,\ldots,k-1} : (i_{k-n}, \ldots, i_{k-1}) \in \mathbb{M}^{n-1}\} & \text{GPBn} \\ \{\hat{x}^{(j)}_{k-1|k-1}, \mu_{k-1}^{(j)},\ j \in \mathbb{M}\} & \text{GPB2} \end{cases} \tag{32}$$

Fig. 4. Structure of single-scan recursive MM estimation algorithms.

Fig. 5. GPB1, GPB2, and IMM reinitializations. Line on left of round node indicates filtering operation; a node outputs a weighted sum of its inputs to provide reinitialization. (a) GPB1. (b) IMM. (c) GPB2.
For GPB1 and MMSE-AMM (first generation), $\bar{X}^{(i)}_{k-1}$ has only one element $\bar{x}^{(i)}_{k-1|k-1}$, given by

$$\begin{cases} \bar{x}^{(i)}_{k-1|k-1} = \hat{x}^{(i)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}, m^{k-1}_{(i)}] & \text{AMM} \\ \bar{x}^{(i)}_{k-1|k-1} = \hat{x}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}] = \sum_{j \in \mathbb{M}} \hat{x}^{(j)}_{k-1|k-1}\, P\{m_{k-1}^{(j)} \mid z^{k-1}\} & \text{GPB1.} \end{cases}$$

c) IMM: A significantly more cost-effective reinitialization than those of GPB1 and GPB2 is

$$\bar{x}^{(i)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}, m_k^{(i)}] = E\{E[x_{k-1} \mid m_{k-1}^{(j)}, m_k^{(i)}, z^{k-1}] \mid z^{k-1}, m_k^{(i)}\} = \sum_{j \in \mathbb{M}} \hat{x}^{(j)}_{k-1|k-1}\, P\{m_{k-1}^{(j)} \mid z^{k-1}, m_k^{(i)}\}. \tag{33}$$

This leads to the IMM algorithm [56, 55, 58]. Like the GPB1 algorithm, the IMM algorithm also runs each of the $M$ filters only once at each recursion.

Compared with the GPB1 reinitialization $\bar{x}^{(i)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}]$, the extra conditioning on $m_k^{(i)}$ in the IMM reinitialization $\bar{x}^{(i)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}, m_k^{(i)}]$ is both legitimate and effective [190, 192]. It is legitimate because $m_k^{(i)}$ is assumed true anyway when calculating $\hat{x}^{(i)}_{k|k}$ in the conditional filtering; it is effective because $m_k^{(i)}$ carries valuable information about $m_{k-1}$, since the mode sequence is dependent, which in turn affects $x_{k-1}$.

The reinitialization in the GPB1, GPB2, and IMM algorithms is illustrated in Fig. 5 and explained as follows [192]. The GPB1 algorithm reinitializes each filter with the "best possible" common single quasi-sufficient statistic, the previous overall estimate; the elemental filters interact with one another only through the use of this common input at each recursion, which carries information from all elemental filters. In the IMM algorithm, each filter $i$ at time $k$ has its own, individualized reinitialization $\bar{x}^{(i)}_{k-1|k-1}$, which forms the "best possible" quasi-sufficient statistic of all old information and the knowledge/assumption that model $m^{(i)}$ is in effect at $k$.

Fig. 6. Structure of IMM estimation algorithm (with three models).
Such individualized reinitialization clearly is intuitively more appealing than GPB1's common reinitialization. The superiority of IMM to GPB1 stems from this smart extra conditioning or individualized reinitialization, known as mixing, as evidenced by numerous applications detailed later in Section VE. Note that GPB1 uses a single merging operation for both output and conditional filtering, whereas IMM and GPB2 both use two separate ones.

Note that the IMM reinitialization (33) differs from (29) of the GPB2 algorithm only in the time at which the model is assumed to be true. In this sense, the IMM algorithm does the mixing at a better time (before conditional filtering) than the GPB2 algorithm (after conditional filtering) [58]. This is clearly seen by comparing Fig. 5(b) and Fig. 5(c).

For the case where each model-based system dynamics is linear, state-prediction mixing,¹¹ with $\hat{x}^{(j,i)}_{k|k-1} = E[x_k \mid m_{k-1}^{(j)}, z^{k-1}, m_k^{(i)}]$,

$$\hat{x}^{(i)}_{k|k-1} = E[x_k \mid z^{k-1}, m_k^{(i)}] = E\{E[x_k \mid m_{k-1}^{(j)}, z^{k-1}, m_k^{(i)}] \mid z^{k-1}, m_k^{(i)}\} = \sum_{j \in \mathbb{M}} \hat{x}^{(j,i)}_{k|k-1}\, P\{m_{k-1}^{(j)} \mid z^{k-1}, m_k^{(i)}\}$$

is equivalent to the IMM reinitialization (33). If the model-based system dynamics is nonlinear, it is more accurate in general than reinitialization (33) but is computationally less efficient because it requires $M$ predictions in each conditional filtering operation, rather than a single one if (33) is used.

¹¹ The benefit of the extra conditioning is more evident for state-prediction mixing than for reinitialization.

Fig. 7. Semi-Markov process.

The architecture of the IMM algorithm is illustrated in Fig. 6 with three models. A complete recursion of the IMM algorithm with Kalman filters as its elemental filters is summarized in Table II for the Markov jump-linear system (4)–(5) with white Gaussian process and measurement noises. A straightforward implementation may have numerical problems for some applications, especially those with a wide model separation.
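One recursion of the IMM algorithm with Kalman filters as elemental filters, following the four steps of Table II, can be sketched as below. This is a minimal illustration under simplifying assumptions that are not part of the survey: zero-mean noises, an identity noise gain, time-invariant model matrices, and a direct (not numerically hardened) likelihood evaluation. The function name `imm_cycle` and its argument layout are hypothetical.

```python
import numpy as np


def imm_cycle(xs, Ps, mu, Pi, F, Q, H, R, z):
    """One IMM recursion: mixing, Kalman filtering, mode update, fusion.

    xs[i], Ps[i]           : previous estimate and covariance of filter i
    mu[i]                  : previous mode probabilities
    Pi[j, i]               : transition probability from model j to model i
    F[i], Q[i], H[i], R[i] : per-model matrices (zero-mean noises, identity noise gain assumed)
    z                      : current measurement
    """
    mu = np.asarray(mu, dtype=float)
    Pi = np.asarray(Pi, dtype=float)
    z = np.asarray(z, dtype=float)
    M = len(mu)
    # Step 1: model-conditioned reinitialization (mixing).
    mu_pred = Pi.T @ mu                                # predicted mode probabilities
    mix = (Pi * mu[:, None]) / mu_pred[None, :]        # mix[j, i]: weight of filter j for model i
    x_new, P_new, lik = [], [], np.empty(M)
    for i in range(M):
        x0 = sum(mix[j, i] * xs[j] for j in range(M))
        P0 = sum(mix[j, i] * (Ps[j] + np.outer(x0 - xs[j], x0 - xs[j])) for j in range(M))
        # Step 2: model-conditioned Kalman filtering.
        xp = F[i] @ x0
        Pp = F[i] @ P0 @ F[i].T + Q[i]
        r = z - H[i] @ xp                              # measurement residual
        S = H[i] @ Pp @ H[i].T + R[i]                  # residual covariance
        K = Pp @ H[i].T @ np.linalg.inv(S)
        x_new.append(xp + K @ r)
        P_new.append(Pp - K @ S @ K.T)
        # Step 3a: Gaussian model likelihood of the residual.
        lik[i] = np.exp(-0.5 * r @ np.linalg.solve(S, r)) / np.sqrt(np.linalg.det(2 * np.pi * S))
    # Step 3b: mode probability update.
    mu_new = mu_pred * lik
    mu_new = mu_new / mu_new.sum()
    # Step 4: estimate fusion.
    x = sum(mu_new[i] * x_new[i] for i in range(M))
    P = sum(mu_new[i] * (P_new[i] + np.outer(x - x_new[i], x - x_new[i])) for i in range(M))
    return x_new, P_new, mu_new, x, P
```

When all models are identical, the elemental filters produce the same output, the likelihoods cancel, and the updated mode probabilities reduce to the predicted ones, which gives a convenient sanity check. For widely separated models the direct likelihood evaluation can underflow, so a log-domain normalization would be a natural refinement.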
A numerically robust version of the IMM algorithm (as well as the AMM algorithm) was presented in [216].

TABLE II
One Cycle of IMM Estimator

1. Model-conditioned reinitialization (for $i = 1, 2, \ldots, M$):
   Predicted mode probability: $\mu^{(i)}_{k|k-1} \triangleq P\{m_k^{(i)} \mid z^{k-1}\} = \sum_j \pi_{ji}\,\mu^{(j)}_{k-1}$
   Mixing weight: $\mu^{j|i}_{k-1} \triangleq P\{m_{k-1}^{(j)} \mid m_k^{(i)}, z^{k-1}\} = \pi_{ji}\,\mu^{(j)}_{k-1} / \mu^{(i)}_{k|k-1}$
   Mixing estimate: $\bar{x}^{(i)}_{k-1|k-1} \triangleq E[x_{k-1} \mid m_k^{(i)}, z^{k-1}] = \sum_j \hat{x}^{(j)}_{k-1|k-1}\,\mu^{j|i}_{k-1}$
   Mixing covariance: $\bar{P}^{(i)}_{k-1|k-1} = \sum_j [P^{(j)}_{k-1|k-1} + (\bar{x}^{(i)}_{k-1|k-1} - \hat{x}^{(j)}_{k-1|k-1})(\bar{x}^{(i)}_{k-1|k-1} - \hat{x}^{(j)}_{k-1|k-1})']\,\mu^{j|i}_{k-1}$
2. Model-conditioned filtering (for $i = 1, 2, \ldots, M$):
   Predicted state: $\hat{x}^{(i)}_{k|k-1} = F^{(i)}_{k-1}\bar{x}^{(i)}_{k-1|k-1} + G^{(i)}_{k-1}\bar{w}^{(i)}_{k-1}$
   Predicted covariance: $P^{(i)}_{k|k-1} = F^{(i)}_{k-1}\bar{P}^{(i)}_{k-1|k-1}(F^{(i)}_{k-1})' + G^{(i)}_{k-1}Q^{(i)}_{k-1}(G^{(i)}_{k-1})'$
   Measurement residual: $\tilde{z}^{(i)}_k = z_k - H^{(i)}_k\hat{x}^{(i)}_{k|k-1} - \bar{v}^{(i)}_k$
   Residual covariance: $S^{(i)}_k = H^{(i)}_k P^{(i)}_{k|k-1}(H^{(i)}_k)' + R^{(i)}_k$
   Filter gain: $K^{(i)}_k = P^{(i)}_{k|k-1}(H^{(i)}_k)'(S^{(i)}_k)^{-1}$
   Updated state: $\hat{x}^{(i)}_{k|k} = \hat{x}^{(i)}_{k|k-1} + K^{(i)}_k\tilde{z}^{(i)}_k$
   Updated covariance: $P^{(i)}_{k|k} = P^{(i)}_{k|k-1} - K^{(i)}_k S^{(i)}_k (K^{(i)}_k)'$
3. Mode probability update (for $i = 1, 2, \ldots, M$):
   Model likelihood: $L^{(i)}_k \triangleq p[\tilde{z}^{(i)}_k \mid m_k^{(i)}, z^{k-1}] \overset{\text{assume}}{=} \mathcal{N}(\tilde{z}^{(i)}_k; 0, S^{(i)}_k)$
   Mode probability: $\mu^{(i)}_k = \mu^{(i)}_{k|k-1}L^{(i)}_k \big/ \sum_j \mu^{(j)}_{k|k-1}L^{(j)}_k$
4. Estimate fusion:
   Overall estimate: $\hat{x}_{k|k} = \sum_i \hat{x}^{(i)}_{k|k}\,\mu^{(i)}_k$
   Overall covariance: $P_{k|k} = \sum_i [P^{(i)}_{k|k} + (\hat{x}_{k|k} - \hat{x}^{(i)}_{k|k})(\hat{x}_{k|k} - \hat{x}^{(i)}_{k|k})']\,\mu^{(i)}_k$

d) IMM with semi-Markov models: As explained in Section III.I of Part I [209], a semi-Markov process model has greater modeling power and suits more real-world problems than a Markov model. The evolution of a semi-Markov process can be visualized as follows (Fig. 7). From any mode $m^{(i)}$, the next mode $m^{(j)}$ to take place is chosen at random according to the transition probability $\pi_{ij}$, and the time between $m^{(i)}$ and $m^{(j)}$ (i.e., the sojourn time $\tau^{(i)}$ in mode $m^{(i)}$) is chosen at random according to the sojourn-time pdf $f_{ij}(\tau)$. A class of semi-Markov models, highly relevant to MM tracking, is the sojourn-time dependent Markov (STDM) process [258, 69]. Here the process representing the modal state is characterized by the sojourn-time dependent transition probabilities, defined by

$$\pi_{ij}(\tau) = P\{s_k = m^{(j)} \mid s_{k-1} = \cdots = s_{k-\tau} = m^{(i)} \neq s_{k-\tau-1}\} = P\{m_k^{(j)} \mid m_{k-1}^{(i)}, \tau^{(i)}_{k-1} = \tau\}$$

where $\tau^{(i)}_{k-1}$ is the time already spent in mode $m^{(i)}$ at time $k-1$.

An extension of the IMM configuration to the case of an STDM process was proposed in [69]. It operates in the same manner as the standard IMM algorithm except that for the recursive cycle ($k-1 \to k$) each transition probability $\pi_{ij}$ is replaced by the average (expected value) of $\pi_{ij}(\tau)$ over all possible $\tau$ values:

$$\hat{\pi}_{ij,k-1} \triangleq P\{m_k^{(j)} \mid m_{k-1}^{(i)}, z^{k-1}\} = \sum_{\tau=1}^{k-1} \pi_{ij}(\tau)\,p^{(i)}_{k-1}(\tau) = E[\pi_{ij}(\tau) \mid m_{k-1}^{(i)}, z^{k-1}].$$

An "exact" recursion for $p^{(i)}_k(\tau) = P\{\tau^{(i)}_k = \tau \mid m_k^{(i)}, z^k\}$ was derived in [259]. Later [282] showed that this recursion is actually approximate, and that the IMM reinitialization $\bar{x}^{(i)}_{k-1|k-1} = E[x_{k-1} \mid z^{k-1}, m_k^{(i)}]$ loses its magic in the case of an STDM process and turns out to be similar to GPB2's reinitialization (29):

$$E[x_{k-1} \mid z^{k-1}, m_k^{(i)}] = E\{E[x_{k-1} \mid m_{k-1}^{(j)}, m_k^{(i)}, z^{k-1}] \mid z^{k-1}, m_k^{(i)}\} = \sum_j \hat{x}^{(j,i)}_{k-1|k-1} P\{m_{k-1}^{(j)} \mid z^{k-1}, m_k^{(i)}\}.$$

Use of the standard IMM reinitialization (33) here essentially amounts to ignoring the sojourn-time dependence because $\hat{x}^{(j,i)}_{k-1|k-1} \triangleq E[x_{k-1} \mid m_{k-1}^{(j)}, m_k^{(i)}, z^{k-1}] = E[x_{k-1} \mid m_{k-1}^{(j)}, z^{k-1}] \triangleq \hat{x}^{(j)}_{k-1|k-1}$ holds only if the Markov chain is not sojourn-time dependent.

From a practical point of view, the STDM model appears rather complicated to design, mainly because the required knowledge of the sojourn-time dependent transition probabilities is hard to come by.
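The averaging of the sojourn-time dependent transition probabilities used in the STDM extension of IMM can be sketched as follows. This is only an illustration of the expectation over $\tau$; the function name `expected_transition` and the numbers in the usage example are hypothetical, and the sojourn-time pmf is assumed to be supplied by some other recursion.

```python
def expected_transition(pi_tau, p_tau):
    """Average pi_ij(tau) over the sojourn-time pmf of the current mode i.

    pi_tau[t][j] : pi_ij(tau) for tau = t + 1 (row of transition probabilities)
    p_tau[t]     : P{tau = t + 1 | mode i, data}, assumed given
    Returns the averaged transition probabilities out of mode i.
    """
    n_modes = len(pi_tau[0])
    return [sum(p * row[j] for p, row in zip(p_tau, pi_tau))
            for j in range(n_modes)]

# Usage: the longer the target has stayed in mode i (index 0 here),
# the more likely a jump to the other mode becomes.
pi_tau = [[0.95, 0.05],   # tau = 1
          [0.80, 0.20],   # tau = 2
          [0.50, 0.50]]   # tau = 3
p_tau = [0.5, 0.3, 0.2]
pi_hat = expected_transition(pi_tau, p_tau)
```

The averaged row `pi_hat` would then take the place of the fixed transition probabilities of mode i in the reinitialization step for that cycle.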
The problem can be simplified slightly by considering a narrower class, namely homogeneous semi-Markov processes [337, 241], specified by an embedded Markov chain with given initial mode probabilities, transition probabilities, and sojourn-time pmfs of each mode $m^{(i)}$, defined by

$$\bar{\mu}^{(i)}_k(\tau) = P\{s_k \neq m^{(i)} \mid s_{k-1} = \cdots = s_{k-\tau} = m^{(i)} \neq s_{k-\tau-1}\} = P\{s_k \neq m^{(i)} \mid m_{k-1}^{(i)}, \tau^{(i)}_{k-1} = \tau\}.$$

That is, $\bar{\mu}^{(i)}_k(\tau)$ is the probability that a jump occurs at time $k$ given that the last jump was at time $k-\tau$ to mode $m^{(i)}$. It is time invariant and $\sum_{\tau=1}^{\infty} \bar{\mu}^{(i)}_k(\tau) = 1$ due to homogeneity of the process. In this case, the STDM transition probabilities can be determined by $\pi_{ij}(\tau) = \bar{\mu}^{(i)}_k(\tau)\,\pi_{ij}$ [337, 241], which can serve as a useful guideline to build an STDM process model.

2) Other Merging-Based Algorithms:

a) Bayesian filter bank: Generally speaking, the mode transition may be base-state dependent and thus neither Markov nor semi-Markov (but the base state and mode together are Markov). Reference [296] proposed a general Bayesian density-based scheme for hybrid estimation with nonlinear measurements and such non-Markovian mode jumps. The scheme is linear in the number of models, and a computational implementation based on Gaussian-sum approximation techniques was proposed. For the particular case of a homogeneous MJLS this scheme can be used for point estimation and can be implemented via standard techniques of merging close components and/or pruning unlikely components. A more detailed description will be given in a subsequent part of this survey. A systematic treatment of mixture component reduction techniques in a more general setting can be found in [299], [300], and [384].
b) Change of measure: By change of measure, a measure-theoretic technique based on the Radon-Nikodym theorem, [326, 325, 327] developed a Gaussian wavelet estimator (GWE)^12 based on hypothesis merging to limit the growth of the Gaussian mixture. Its merging strategy closely resembles the GPB2 strategy.^13 However, it differs fundamentally from the IMM (and GPB) algorithms in computing the mode probabilities. It accounts for the effect of the measurement residuals $\tilde{z}^{(i)}_k$ on the state estimate update directly in the state space:

$$P\{m_{k-1}^{(i)}, m_k^{(j)} \mid z^k\} \propto \exp\left[\tfrac{1}{2}\left(\|\hat{x}^{(i,j)}_{k|k}\|^2_{(P^{(i,j)}_{k|k})^{-1}} - \|\hat{x}^{(i,j)}_{k|k-1}\|^2_{(P^{(i,j)}_{k|k-1})^{-1}}\right)\right]$$

where $\|x\|^2_A = x'Ax$. This formula gives more weight to the models that have a larger normalized change in the magnitude of the updated mean, unlike the conventional, intuitively appealing IMM and GPB1 formulas, which give more weight to those with a smaller normalized measurement residual: $P\{m_k^{(i)} \mid z^k\} \propto \exp(-\tfrac{1}{2}\|\tilde{z}^{(i)}_k\|^2_{(S^{(i)}_k)^{-1}})$. Reference [328] included a comparative study between GWE and the IMM algorithm for scenarios with different data rates. While for high rates the differences were tiny, for low rates the GWE showed significant improvement.

^12 A Gaussian wavelet is simply a Gaussian mixture, where the mother wavelet is simply the Gaussian distribution function.
^13 In [326], [325], and [327], dynamic modes and measurement models are denoted by different indices, resulting in triple indices.

More generally, an exact estimator with all components of the mixture was presented in [109] by change of measure, along with simulation results illustrating its improvement over the IMM algorithm. Recently, [108] presented simulation results of an approximate implementation with a fixed complexity by "exact pruning" (without explanation), showing superior performance to the IMM algorithm for a passive tracking example.
c) Enhancement by mode observations: Tracking performance can be improved by using additional observations of target features, such as target attitude and image, provided by, e.g., an imaging sensor. These observations are related more directly to the target motion mode than the kinematic measurements. Many of them are actually used in target recognition and feature-aided tracking. As such, incorporation of such observations in tracking is closely related to joint tracking and recognition. Issues in modeling and utilization of such information for target tracking have been studied extensively (see, e.g., [163], [188], [8], [329], [330], [93]). Many of them are beyond the scope of this paper and their coverage is planned in a subsequent part. We present here only a brief review of recent results using mode observations within the MM context.

Loosely speaking, a modal sensor can be modeled as a classifier over a set $\mathbb{M}$ [327] (possibly augmented by a "no-decision" event [363, 364, 112]). Let $y_k$ and $y^k$ be the mode observations at and through time $k$, respectively, along with the kinematic measurements $z^k$. It follows from straightforward Bayesian calculus [110, 112] that for the MJLS (1)-(2), the joint posterior density $p(x_k, s_k \mid z^k, y^k)$ of the hybrid state is again a mixture density with an exponentially increasing number of components, similar to the case with kinematic measurements only. This fact was established earlier (see [109], [326]) in terms of unnormalized^14 joint posterior densities $q(x_k, s_k \mid z^k, y^k)$ by change of measure in discrete time [107]. Main efforts thereafter have been focused on developing tractable approximate estimators with fixed computation/memory.^15

^14 Using unnormalized density is advantageous in providing linearity of the fundamental Bayesian recursion [296].
^15 Fortunately, it turns out that for the special case of mode-observation-only tracking, optimal estimates for $P\{m_k^{(i)} \mid y^k\}$ and $f(x_k \mid y^k)$ can be obtained recursively with fixed (nonincreasing) computational requirements. The corresponding algorithms were proposed in [364] for $P\{m_k^{(i)} \mid y^k\}$ and in [100] for $E[x_k \mid y^k]$. Further results along this line can be found in [176] and [177].

Based on the optimal Bayesian mixture-density representation, [111], [110], and [112] proposed an extension of the IMM algorithm, referred to as image-enhanced IMM (IE-IMM), derived by IMM-like merging. It consists of all the familiar IMM steps plus an additional step to update the mode probabilities with the mode (image) observations by^16

$$P\{m_k^{(i)} \mid z^k, y^k\} = \frac{1}{c}\,p(y_k \mid m_k^{(i)}, z^k, y^{k-1})\,P\{m_k^{(i)} \mid z^k, y^{k-1}\} \qquad (34)$$

where the likelihood $p(y_k \mid m_k^{(i)}, z^k, y^{k-1})$ is provided by the model of the modal sensor and $P\{m_k^{(i)} \mid z^k, y^{k-1}\}$ comes from the standard IMM part after the update with the kinematic measurement $z_k$. The IE-IMM design for tracking included a constant-velocity (CV) model and two CT models with known turn rates. Simulation results demonstrated that mode observations indeed enhance IMM's performance significantly in the case of a high-quality modal sensor, but the enhancement diminishes as the quality of the mode observations becomes poorer.

As mentioned above, a recursion for the unnormalized joint posterior density $q(x_k, s_k \mid z^k, y^k)$ can be obtained by change of measure. An exact estimator in this class was presented in [109]. Based on the recursion for the unnormalized density, [326], [325], and [327] proposed the approximate implementation, the GWE described above. Again, the entire effect of the mode observations is on the mode probability, also given by (34).
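The extra mode-probability update (34) is a plain Bayes step and can be sketched as below. The sketch assumes the modal sensor is a classifier whose likelihood of the current report under each model is read from a confusion matrix; the function name `mode_observation_update` and all numbers are hypothetical.

```python
def mode_observation_update(mu_kin, lik_obs):
    """Bayes update of mode probabilities with one mode observation.

    mu_kin[i]  : mode probability after the kinematic (IMM) update
    lik_obs[i] : likelihood of the current mode observation under model i,
                 assumed to come from the modal-sensor (classifier) model
    """
    unnorm = [l * m for l, m in zip(lik_obs, mu_kin)]
    c = sum(unnorm)                      # normalization constant
    return [u / c for u in unnorm]

# Usage with three models (CV, left CT, right CT). The sensor reports
# "left CT"; the likelihoods below are one column of its confusion matrix.
mu_kin = [0.5, 0.3, 0.2]
good_sensor = mode_observation_update(mu_kin, [0.1, 0.8, 0.1])
poor_sensor = mode_observation_update(mu_kin, [0.3, 0.4, 0.3])
```

With the poorer sensor the likelihoods are nearly equal and the probabilities barely move, which is consistent with the reported loss of enhancement as the quality of the mode observations degrades.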
^16 This makes perfect sense since the mode observations are modeled as a classifier, which carries no information about the target's kinematic state directly.

The comparative simulation results for the IMM, IE-IMM, and GWE algorithms presented in [327] indicated that GWE generally provides an improvement over IE-IMM and IMM, which in some cases (e.g., good imager, poor kinematic data) may become considerable enough (25% over IE-IMM and 50% over IMM) to justify its increase in computation.

An exact hybrid filter based on change of measure that accounts for intermittent mode measurements was presented in [3], along with an approximate, GPB-type implementation and EM-based estimation of the transition probability matrix.

More details of GWE and image-enhanced MM estimation can be found in [323], [189], and [324]. Feature-aided IMM was also studied in [376] and [385].

d) Multirate IMM: References [138] and [139] proposed to use the IMM strategy to combine a bank of filters, each with a different data rate that is consistent with its assumed target dynamics (e.g., a CA filter would benefit from a higher data rate more than a CV filter). A filter with a lower data rate updates less frequently, and thus, compared with the corresponding full-rate IMM algorithm using similar models, this multirate IMM algorithm provides certain computational savings at a cost of somewhat larger peak errors at maneuver onset. The multirate data bank is obtained from the original data by a discrete wavelet transform.

e) Weighted-model based single filter: To reduce the computational complexity of MM algorithms, [237] proposed an MM-based single filter algorithm, where the filter is based on an average model, which is a probabilistically weighted sum of the models used.
The weights are updated over time by a recursive formula of hypothesis probabilities in the multihypothesis version of the Shiryayev sequential probability ratio test (i.e., the quickest change-point detector under some conditions) [238] and approximate model likelihoods assuming Gaussianity.

3) Hypothesis Reduction by Pruning: The basic idea of hypothesis reduction by pruning has been explained in Section IVC under the title "hard decision," although that subsection is for output processing. More specifically, for hypothesis reduction at time $k$ a "good" subset $\mathbb{B}_k$ of model sequences is identified and maintained while less likely ones in the set $\mathbb{M}^k$ of all possible model sequences are discarded (pruned). The number of model sequences in $\mathbb{B}_k$ may or may not change over time.

a) B-best: This approach intends to maintain only a number $B_k$ of the best (i.e., most probable or likely) model sequences at time $k$. This idea has an exact and straightforward batch implementation: select the $B_k$ sequences with the largest probabilities (or likelihoods). But recursive implementations are more realistic. Consider a recursion at time $k$ with $B_{k-1}$ "best" model histories (with associated base-state estimates $\hat{x}^{(i^{k-1})}_{k-1|k-1}$ and mode-sequence probabilities $\mu^{(i^{k-1})}_{k-1}$) and $M$ models (elemental filters). Each elemental filter performs $B_{k-1}$ conditional filtering operations, using one $\hat{x}^{(i^{k-1})}_{k-1|k-1}$ as input in each operation, yielding as many as $B_{k-1}M$ updated estimates $\{\hat{x}^{(i^{k-1},j)}_{k|k},\ (i^{k-1}, j) \in \mathbb{B}_{k-1} \times \mathbb{M}\}$ and the corresponding mode-sequence probabilities $\mu^{(i^{k-1},j)}_k$. Only $B_k$ of these $B_{k-1}M$ model sequences (and the associated base-state estimates) with the largest $\mu^{(i^{k-1},j)}_k$ are retained^17 (and renormalized) for the next recursion (and output at $k$). The above mode-sequence probabilities $\mu^{(i^{k-1},j)}_k$ can be replaced by model-sequence likelihoods.
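The recursive B-best step described above can be sketched as follows, keeping only the bookkeeping of sequence probabilities and leaving the conditional filtering abstract. The function `b_best_step` and the callable `lik` (which stands in for the likelihood delivered by the elemental filter run on the extended sequence) are hypothetical names introduced for illustration.

```python
import heapq

def b_best_step(hyps, trans, lik, B):
    """Extend each retained model sequence by every model, keep the B best.

    hyps  : list of (sequence tuple, probability) retained at time k-1
    trans : trans[i][j], transition probability from model i to model j
    lik   : callable lik(seq) -> likelihood of the newest measurement given
            the extended sequence (assumed supplied by the elemental filters)
    B     : number of sequences to retain
    """
    cand = []
    for seq, p in hyps:
        for j in range(len(trans)):
            new = seq + (j,)
            cand.append((new, p * trans[seq[-1]][j] * lik(new)))
    best = heapq.nlargest(B, cand, key=lambda h: h[1])
    total = sum(p for _, p in best)          # renormalize the survivors
    return [(s, p / total) for s, p in best]

# Usage: two models, two retained histories; for simplicity the toy
# likelihood depends only on the newest model of the sequence.
hyps = [((0,), 0.6), ((1,), 0.4)]
trans = [[0.9, 0.1], [0.1, 0.9]]
survivors = b_best_step(hyps, trans, lambda seq: [0.2, 0.7][seq[-1]], B=2)
```

Here four candidate sequences are formed and two survive; in a real tracker `lik` would depend on the whole history through the filter's predicted measurement.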
^17 As such, updated estimates $\hat{x}^{(i^{k-1},j)}_{k|k}$ not among the $B_k$ best sequences actually need not be computed, but the corresponding model-sequence likelihoods are needed to obtain $\mu^{(i^{k-1},j)}_k$.

As explained in Section IVC, this recursive implementation does not guarantee that the $B_k$ model sequences are actually the most probable among all possible sequences, because the most probable ones may have less probable partial histories and thus may have been discarded at an earlier recursion. Another drawback is that some or many of the $B$ most probable model sequences may be quite similar and had better be merged to save processing resources.

This B-best idea is the one underlying many hypothesis reduction techniques, such as the one in the multiple hypothesis tracking (MHT) algorithms (see, e.g., [293], [280], [34]). For maneuvering target tracking, it was first proposed in [338], [129], [338], [336], and [337], and extended to Markov jump systems with unknown transition probabilities in [335] and to MM smoothing in [241]. $B_k$ is usually fixed over time, as in these publications, but the approach works for time-varying $B_k$. For instance, $B_k$ can be determined automatically by requiring all surviving sequences to have a probability above a threshold, as proposed in [215], [92], and [63], or, probably better, to have their ratios of probability to the largest one above a threshold [215].

b) Viterbi algorithm: The Viterbi algorithm (see, e.g., [115]), also known as forward dynamic programming, aims at finding the best path (model sequence) over a time horizon in a recursive manner. Fig. 8 illustrates this procedure for a hypothesis tree of three modes similar to the one in Fig. 3(a). The number next to a link (i.e., mode transition) in Fig. 8(a) is its log-likelihood of being the true one; the number next to a node (mode) in Figs. 8(b) and 8(c) is the log-likelihood of reaching the node by the most likely path (sequence of transitions), which is assumed to be the sum of the log-likelihoods of the links on the path; only the most likely paths reaching each node are indicated in Figs. 8(b) and 8(c). Clearly, the most likely paths (the thicker lines) through time $k = 3$ and $k = 4$ are (1, 2, 2) and (2, 3, 3, 1), respectively. Note that at $k = 3$ the path (2, 3, 3) is not the most likely one, but it must be the most likely one arriving at node 3.

Fig. 8. Viterbi algorithm. (a) Hypotheses with link likelihoods. (b) Most likely paths (k = 3). (c) Most likely paths (k = 4).

This idea has a straightforward implementation for hypothesis reduction [11]. Consider a recursion at time $k$ with $M$ model histories (with associated $\hat{x}^{(i^{k-1})}_{k-1|k-1}$ and $\mu^{(i^{k-1})}_{k-1}$) and $M$ models. Each elemental filter $j$ performs $M$ conditional filtering operations, using one $\hat{x}^{(i^{k-1})}_{k-1|k-1}$ as input in each operation, yielding $M$ updated estimates $\hat{x}^{(i^{k-1},j)}_{k|k}$ and the corresponding mode-sequence probabilities $\mu^{(i^{k-1},j)}_k$. Only the sequence with the largest $\mu^{(i^{k-1},j)}_k$ for each $j$ is retained for the next recursion (and output at $k$). In this way, the best model history for each model at any time is identified. There are always $M$ surviving model histories at any time, each of which is the best model history for an elemental filter at that time. These $M$ histories include the best one $\hat{m}^k_{\mathrm{MAP}}$ but are not necessarily the $M$ best ones because, for example, the second best model history for a model may be better than the best one for another model. These suboptimal histories are needed to guarantee the inclusion of the best sequence for the future.
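The forward dynamic programming idea illustrated by Fig. 8 can be sketched as below, under the assumption made in that illustration that a path's log-likelihood is the sum of link log-likelihoods that do not depend on how the path reached the link. The function name `viterbi` and the numbers are hypothetical and are not those of Fig. 8.

```python
def viterbi(init, link):
    """Most likely path for history-independent link log-likelihoods.

    init    : init[j], log-likelihood of starting in mode j
    link[t] : link[t][i][j], log-likelihood of the transition i -> j at step t
    Returns (best path as a list of mode indices, its total log-likelihood).
    """
    M = len(init)
    score = list(init)          # best log-likelihood of reaching each node
    back = []                   # back[t][j]: best predecessor of node j
    for step in link:
        prev = [max(range(M), key=lambda i: score[i] + step[i][j])
                for j in range(M)]
        score = [score[prev[j]] + step[prev[j]][j] for j in range(M)]
        back.append(prev)
    end = max(range(M), key=lambda j: score[j])
    path = [end]
    for prev in reversed(back):  # trace the survivor back to the start
        path.append(prev[path[-1]])
    path.reverse()
    return path, score[end]

# Usage with two modes and two transitions (toy numbers).
init = [0.0, -1.0]
link = [[[-1.0, -3.0], [-2.0, -0.5]],
        [[-2.0, -0.5], [-3.0, -0.2]]]
best_path, best_score = viterbi(init, link)   # best_path == [0, 0, 1]
```

Only one survivor per node is stored at each step, which mirrors the retention of one best history per elemental filter in the hypothesis-reduction scheme above.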
The above recursive procedure intends to find the most probable sequence $\hat{m}^k_{\mathrm{MAP}}$, namely, the one with the largest probability

$$\mu^{(i^k)}_k = P\{m^k_{(i^k)} \mid z^k\} = \frac{1}{c}\,f(z_k \mid m^k_{(i^k)}, z^{k-1})\,P\{m_k^{(i_k)} \mid m^{k-1}_{(i^{k-1})}, z^{k-1}\}\,P\{m^{k-1}_{(i^{k-1})} \mid z^{k-1}\}$$

$$\ln \mu^{(i^k)}_k = \ln \mu^{(i^{k-1})}_{k-1} + \Delta_{i_{k-1},k} \qquad (35)$$

$$\Delta_{i_{k-1},k} = \ln f(z_k \mid m^k_{(i^k)}, z^{k-1}) + \ln P\{m_k^{(i_k)} \mid m^{k-1}_{(i^{k-1})}, z^{k-1}\} - \ln c.$$

If the mode transition log-likelihood $\Delta_{i_{k-1},k}$ were independent of the mode history $m^{k-2}_{(i^{k-2})} = (m_1^{(i_1)}, m_2^{(i_2)}, \ldots, m_{k-2}^{(i_{k-2})})$, the above Viterbi algorithm would yield the most probable sequence $\hat{m}^k_{\mathrm{MAP}}$. However, for a hybrid system $\Delta_{i_{k-1},k}$ actually depends on the mode history $m^{k-2}_{(i^{k-2})}$, which is analogous to the case with Fig. 8(a) in which the number next to a link depends on the path reaching its left node. Consequently, the above procedure is only suboptimal even if the mode sequence is a Markov chain such that $P\{m_k^{(i_k)} \mid m^{k-1}_{(i^{k-1})}, z^{k-1}\} = P\{m_k^{(i_k)} \mid m_{k-1}^{(i_{k-1})}\} = \pi_{i_{k-1},i_k}$, because in this case $\ln \mu^{(i^k)*}_k = \ln \mu^{(i^{k-1})*}_{k-1} + \Delta^*_{i_{k-1},k}$ could be larger than $\ln \mu^{(i^k)}_k$ even if $\ln \mu^{(i^{k-1})*}_{k-1} < \ln \mu^{(i^{k-1})}_{k-1}$ for some mode history $m^{k-2\,*}_{(i^{k-2})}$ different from $m^{k-2}_{(i^{k-2})}$. This dependence of $\Delta_{i_{k-1},k}$ on $m^{k-2}_{(i^{k-2})}$ is a major (but subtle) difference between the hidden Markov models (HMM) of a hybrid system (or HMM for target tracking^18) and the more standard HMM found in such applications as speech processing (see [289] and [290] for a tutorial), where $f(z_k \mid m^k_{(i^k)}, z^{k-1})$ reduces to $f(z_k \mid m_k^{(i_k)})$ because direct (continuous- or discrete-valued) observations of the mode are available. However, this dependence may be removed in the framework of the EM algorithm [287], discussed later in Section VB5.
4) Merging versus Pruning: Experience indicates that base-state estimators for maneuvering target tracking based on merging are usually superior to those based on pruning. We believe that this performance difference stems mainly from their difference in sensitivity to the mode-model mismatch, similar to our discussion in Section VA that contrasts MMSE estimation and some versions of MAP estimation. Simply put, the sequence resulting from merging is not limited to the assumed set of model sequences, and in this sense merging is less sensitive than pruning to the mismatch between the assumed models and the true model. In contrast, pruning appears more intuitively appealing than merging for target tracking in the presence of clutter, where a measurement is better treated as originating either from a target or from clutter, rather than possibly from something in between.

Another major difference between merging and pruning is that the sum of the probabilities of the merged model sequences is always one, while without renormalization that of a fixed number of the surviving model sequences after pruning is constantly decreasing, usually dramatically, as time elapses. For example, for a problem with ten possible models at each time, if after pruning ten model sequences are retained, their sum of probabilities at time $k$ could be as low as $10/10^k = 10^{-(k-1)}$ (assuming a uniform distribution) or possibly even lower because many good sequences could have been deleted earlier in a recursive implementation. Even if the most likely sequences have a probability a million times that of the average probability of a sequence, this sum has an upper bound $10 \times 10^6/10^k = 10^{-(k-7)}$, $\forall k \geq 7$, which still drops dramatically as time goes on. There is hardly any reason to expect that the mean or global maximizer of a mixture density of $10^k$ components can be approximated well by one with only 10 components.
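The arithmetic in the example above can be checked with a few lines of exact rational arithmetic. The helper name `retained_mass_bound` is hypothetical, and the bound is capped at one since it is a probability.

```python
from fractions import Fraction

def retained_mass_bound(n_models, n_kept, k, boost=1):
    """Upper bound on the total probability of n_kept sequences at time k
    when each has at most `boost` times the average sequence probability
    1 / n_models**k (capped at 1)."""
    return min(Fraction(1), Fraction(n_kept * boost, n_models ** k))

# Uniform case: ten retained sequences out of 10**k.
uniform_k5 = retained_mass_bound(10, 10, 5)                 # 10**-(5-1)
# Retained sequences a million times more probable than average.
boosted_k7 = retained_mass_bound(10, 10, 7, boost=10**6)    # bound reaches 1
boosted_k10 = retained_mass_bound(10, 10, 10, boost=10**6)  # 10**-(10-7)
```

Even with the million-fold boost, the bound falls by a factor of ten at every step beyond $k = 7$.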
It should be clear from the above discussion that a test scenario in which the true model (sequence) is one of those assumed by the MM algorithm implicitly favors the pruning or selection based techniques, which is, however, not very realistic. In other words, merging outperforms pruning and selection particularly when the true model differs significantly from those assumed, as evidenced by the comparative study of seven MM algorithms, including AMM, GPB1, GPB2, IMM, B-best, Viterbi, and reweighted IMM (RIMM), for a few simple maneuvering target tracking scenarios, reported in [283].

^18 See [284] for a tutorial on HMM for tracking.

a) IMM versus B-best: Surprisingly few implementations of pruning strategies for tracking a single maneuvering target (without clutter) have been reported in the literature. One reason is probably that early simulation results by the authors of the B-best pruning indicated that generally the performance of the B-best pruning is inferior to GPB-type merging of the same computational complexity [336]. Nevertheless, [302] reported an implementation of a B-best strategy based tracker for homing missile guidance that features rapid and large variation in target acceleration. For this quite realistic scenario the B-best strategy implemented with 11 models (quantized acceleration levels in $[-9g, 9g]$) showed a fairly satisfactory accuracy, but at the very high cost of keeping 253 model sequences.

A comparison between a two-model (CV and Singer) IMM algorithm and two B-best strategy based MMSE and MAP filters, respectively, over a simplistic one-dimensional scenario was reported in the short note [356]. It claimed that the B-best strategy provided better accuracy than the IMM algorithm, but no indication was given of how many filters were used.
A more thorough and realistic comparison between two trackers based on the IMM and B-best strategies, respectively, was presented in [146] and [147] for civilian (2D air traffic control (ATC) tracking) and military (3D antiaircraft gun tracker) scenarios. The results showed that at a comparable computational complexity the two trackers were highly competitive in terms of estimation accuracy but had complementary strengths and trade-off patterns: the IMM algorithm led to smaller peak errors at maneuver onset, while the B-best algorithm resulted in smaller steady-state errors during nonmaneuvering and maneuvering motions. Although these results make sense intuitively, they provide a rather indirect comparison of the two techniques as hypothesis reduction strategies because different underlying designs were used for the filters. While the IMM strategy used two kinematic models with a Markov model for mode transitions, the B-best strategy used three kinematic models and a sequence of independent binary random variables for the mode evolution. It would be more representative to compare the two strategies with more similar designs and parameters.

b) IMM versus Viterbi: Both the Viterbi and IMM algorithms were implemented and compared in [11] for a model-set design of 12 models, quantizing a 2D acceleration vector of a magnitude up to 4g. The simulation results showed a comparable accuracy of the Viterbi and IMM algorithms with generally smaller peak errors of the latter during maneuver onset. The Viterbi algorithm was slightly more accurate in cases with small modal separation. A closely related algorithm, along with simulation comparison results, was presented in [287] using the EM technique, discussed later in Section VB5.

c) Combined pruning and merging: It is intuitively appealing to combine pruning and merging strategies, or more generally, hard decision with soft decision.
Obviously, numerous ways of combination exist. A simple, integrated idea is to prune all mode histories ending at model $m^{(j)}$ at time $k$ if its probability

$$\mu^{(j)}_k = P\{m_k^{(j)} \mid z^k\} = \sum_{i^{k-1} \in \mathbb{M}^{k-1}} P\{m^{k-1}_{(i^{k-1})}, m_k^{(j)} \mid z^k\} = \sum_{i^{k-1} \in \mathbb{M}^{k-1}} \mu^{(i^{k-1},j)}_k$$

is below a threshold, or prune those with the same pair $(m_{k-1}^{(i)}, m_k^{(j)})$ if its probability

$$\mu^{(i,j)}_{k-1,k} = P\{m_{k-1}^{(i)}, m_k^{(j)} \mid z^k\} = \sum_{i^{k-2} \in \mathbb{M}^{k-2}} \mu^{(i^{k-2},i,j)}_k$$

is below a threshold. This is actually an integration of merging and pruning since $\mu^{(j)}_k$ and $\mu^{(i,j)}_{k-1,k}$ correspond to merging (see Section VB1) and can be obtained (approximately) by the IMM (or GPB1) and GPB2 algorithms, respectively, although the corresponding $\hat{x}^{(i)}_{k|k}$ and $\hat{x}^{(i,j)}_{k|k}$ are not needed directly here.

5) Iteration-Based Algorithms: MMSE estimation and MAP estimation are actually problems of integration and maximization of a posterior density, respectively. ML estimation amounts to finding the global maximizer of the likelihood function. The above hypothesis reduction strategies take advantage of the structure (i.e., mixture density) of the base-state distribution. Alternatively, these problems can be solved numerically without explicit reliance on the mixture-density structure (see, e.g., [175]). Although a large number of numerical integration algorithms are available, to our knowledge, few have been proposed for finding an MMSE estimator specifically in the MM context. Effort along this line appears worthwhile. The situation is better for MAP-based MM estimation. A class of iterative search based algorithms has been proposed, almost all of which rely on the so-called EM algorithm.

a) EM algorithm: The EM algorithm [87] is an iterative procedure for finding a maximizer of a likelihood function, particularly suitable for the so-called incomplete-data problems (see, e.g., [249]). Consider the problem of estimating a parameter $\theta$ using data $Z$ by the ML method, given by

$$\hat{\theta}_{\mathrm{ML}} = \arg\max_\theta f_\theta(Z). \qquad (36)$$

This is often hard or intractable if the likelihood function $f_\theta(Z)$ is unavailable (e.g., without a closed form). In many cases, however, $f_\theta(Y, Z)$ for some "complete" data $(Y, Z)$ is available and has a simple form, where $Y$ is some "nuisance" random parameter, known as missing data, hidden data, unobservable data, or left-out data. Estimation based on $f_\theta(Z)$ is in this sense an "incomplete-data" problem. It is intuitively appealing to replace $f_\theta(Z)$ with $\hat{f}_\theta(Z \mid \hat{\theta}) \triangleq E[f_\theta(Y, Z) \mid Z, \hat{\theta}]$ given the best available estimate $\hat{\theta}$ of $\theta$, where the average is over $Y$ for the given $Z$ and $\hat{\theta}$ in $E[f_\theta(Y, Z) \mid Z, \hat{\theta}]$, which is, for example, sometimes equal to $f_\theta(\hat{Y}(Z, \hat{\theta}), Z)$ with $\hat{Y}(Z, \hat{\theta}) = E[Y \mid Z, \hat{\theta}]$. The basic idea of the EM algorithm is to solve the problem of $\arg\max_\theta f_\theta(Z)$ by the iteration $\hat{\theta}_{j+1} = \arg\max_\theta \hat{f}_\theta(Z \mid \hat{\theta}_j)$ with a better and better estimate $\hat{\theta}$ of $\theta$ in the hope that

$$\hat{\theta}_{j+1} = \arg\max_\theta E[f_\theta(Y, Z) \mid Z, \hat{\theta}_j] \xrightarrow{\;j \to \infty\;} \arg\max_\theta E[f_\theta(Y, Z) \mid Z, \theta] = \arg\max_\theta f_\theta(Z) = \hat{\theta}_{\mathrm{ML}}.$$

More specifically, starting with some initial estimate $\hat{\theta}_0$ of $\theta$, each iteration of the EM algorithm consists of two conceptual steps (they are often combined in practice):

E-step (expectation): $Q(\theta \mid \hat{\theta}_j) = E[\ln f_\theta(Y, Z) \mid Z, \hat{\theta}_j]$
M-step (maximization): $\hat{\theta}_{j+1} = \arg\max_\theta Q(\theta \mid \hat{\theta}_j)$.

The iteration stops when $\|\hat{\theta}_{j+1} - \hat{\theta}_j\|$ is below some threshold and then $\hat{\theta}_{j+1}$ is taken to be $\hat{\theta}_{\mathrm{ML}}$. Clearly, $Q(\theta \mid \hat{\theta}_j)|_{\theta=\hat{\theta}_{j+1}} \geq Q(\theta \mid \hat{\theta}_j)|_{\theta=\hat{\theta}_j}$. It follows from Jensen's inequality that this iteration enjoys the monotone property $f_\theta(Z)|_{\theta=\hat{\theta}_{j+1}} \geq f_\theta(Z)|_{\theta=\hat{\theta}_j}$, which guarantees global convergence^19 of the likelihood values $f_\theta(Z)|_{\theta=\hat{\theta}_j}$ for a bounded $f_\theta(Z)$ under mild regularity conditions and likewise for global convergence of $\hat{\theta}_j$ under more stringent conditions (see, e.g., [249]).
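The E-step/M-step iteration and its monotone likelihood property can be illustrated on a toy incomplete-data problem that is deliberately simpler than any tracking problem: estimating the weight $\theta$ of the second component of a two-component Gaussian mixture with known component densities, where the missing data $Y$ are the component labels. All names and numbers below are assumptions made for illustration.

```python
import math

def normal_pdf(z, mean, var):
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(theta, data, comps):
    """Incomplete-data log-likelihood ln f_theta(Z)."""
    (m0, v0), (m1, v1) = comps
    return sum(math.log((1 - theta) * normal_pdf(z, m0, v0)
                        + theta * normal_pdf(z, m1, v1)) for z in data)

def em_mixing_weight(data, comps, theta0=0.5, tol=1e-8, max_iter=200):
    """EM for the mixing weight theta; returns (estimate, log-likelihood history)."""
    (m0, v0), (m1, v1) = comps
    theta = theta0
    history = [log_likelihood(theta, data, comps)]
    for _ in range(max_iter):
        # E-step: posterior probability that each sample came from component 1
        resp = [theta * normal_pdf(z, m1, v1)
                / ((1 - theta) * normal_pdf(z, m0, v0) + theta * normal_pdf(z, m1, v1))
                for z in data]
        # M-step: Q(theta | theta_j) is maximized by the mean responsibility
        new_theta = sum(resp) / len(resp)
        history.append(log_likelihood(new_theta, data, comps))
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta, history

# Usage: three of eight samples lie near the second component's mean.
data = [-0.2, 0.1, 0.3, 4.8, 5.1, -0.1, 0.0, 5.3]
theta_hat, history = em_mixing_weight(data, ((0.0, 1.0), (5.0, 1.0)))
```

The recorded `history` is nondecreasing from one iteration to the next, which is the monotone property stated above; it says nothing about reaching a global maximum in less benign problems.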
^19 That is, convergence from any starting point to a stationary point, including saddle points as well as (local and global) maxima.

The key in the application of the EM algorithm is to identify the missing data $Y$ and come up with Baum's auxiliary function $Q(\theta \mid \hat{\theta}_j)$ that is equivalent to $E[\ln f_\theta(Y, Z) \mid Z, \hat{\theta}_j]$ and has an easy-to-find maximum. The EM algorithm is attractive mainly for its simplicity, wide applicability, low computation and storage per iteration, and global convergence. Its main shortcoming is the lack of guarantee to converge to global maxima. A main application domain of the EM algorithm is estimation problems with a mixture density [292, 249], to which most target tracking and MM estimation problems belong. The best known applications so far of the EM algorithm in target tracking are those with the probabilistic multiple hypothesis tracking (PMHT) method [318, 319, 320, 119, 359], where the above likelihood function $f_\theta(Z)$ is replaced by the posterior density $f(\theta \mid Z)$ or equivalently the joint density $f(\theta, Z)$.

b) EM-based MM estimation: For estimation of a hybrid system with base-state sequence $x^k$, mode sequence $m^k$, and data $z^k$, the complete data is $\{\theta, Y, Z\} = \{x^k, m^k, z^k\}$ (the set is unordered here). Then, two natural choices are $(\theta, Y, Z) = (x^k, m^k, z^k)$ and $(\theta, Y, Z) = (m^k, x^k, z^k)$, depending on what is to be estimated. The first choice aims at estimating the base-state sequence by treating the mode sequence as missing/hidden data, leading to the EM formulation for the base-state sequence MAP estimation $\hat{x}^k_{\mathrm{MAP}} = \arg\max_{x^k} f(x^k \mid z^k)$ with

E: $Q(x^k \mid \hat{x}^k_{[j]}) = E[\ln p(x^k, m^k, z^k) \mid z^k, \hat{x}^k_{[j]}]$
M: $\hat{x}^k_{[j+1]} = \arg\max_{x^k} Q(x^k \mid \hat{x}^k_{[j]})$ (37)

where subscript $[j]$ stands for iteration $j$.
The second choice is for mode-sequence estimation, which treats the base-state sequence as hidden data, leading to the EM formulation $\hat{m}^k_{\mathrm{MAP}} = \arg\max_{m^k} f(m^k \mid z^k)$ with

E: $Q(m^k \mid \hat{m}^k_{[j]}) = E[\ln p(x^k, m^k, z^k) \mid z^k, \hat{m}^k_{[j]}]$
M: $\hat{m}^k_{[j+1]} = \arg\max_{m^k} Q(m^k \mid \hat{m}^k_{[j]})$. (38)

For a Markov jump system, we have the following key decomposition:

$$p(x^k, m^k, z^k) = f(z_k \mid x_k, m_k)\,f(x_k \mid m_k, x_{k-1})\,p(m_k \mid m_{k-1})\,p(x^{k-1}, m^{k-1}, z^{k-1}).$$

The results in the remainder of this subsection are valid only for an MJLS with white, mutually independent, Gaussian process and measurement noises.

As stated in [228] and [229], $\max_{x^k} Q(x^k \mid \hat{x}^k_{[j]})$ is equivalent to minimizing a sum of weighted squares of base-state prediction errors, measurement prediction errors, and initial state estimation error of a linear Gaussian system. This system is an "average" Gaussian MJLS over possible modes at each time in that it depends on the mode only through its probability $\bar{\mu}^{(i)}_\kappa = P(m_\kappa^{(i)}, z^k \mid \hat{x}^k_{[j]}) = \alpha^{(i)}_\kappa \beta^{(i)}_\kappa / c_\kappa$, $\kappa \leq k$, where $c_\kappa$ is the normalization factor, and $\alpha^{(i)}_\kappa$ and $\beta^{(i)}_\kappa$ are computed via HMM-type forward and backward recursions, respectively, as given in [228] and [229]. From the equivalence of the optimal weighted least-squares, MAP, and Kalman smoothing for such systems it thus follows that, in each batch iteration (using $z^k$) of the EM algorithm for MAP estimation of the base-state sequence $\hat{x}^k_{\mathrm{MAP}} = \arg\max_{x^k} f(x^k \mid z^k)$, the required $\{\bar{\mu}^{(i)}_\kappa, \kappa \leq k\}$ based on $\hat{x}^k_{[j]}$ can be computed (in the E-step) by HMM-type forward and backward recursions for the above "average" linear Gaussian system; and then $\hat{x}^k_{[j+1]}$ can be obtained (in the M-step) by a fixed-interval Kalman smoother [228, 229].

As derived in [228] and [229], $\max_{m^k} Q(m^k \mid \hat{m}^k_{[j]})$ is equivalent to $\max_l \delta_k(l)$, where $\delta_k(l)$ is the maximum score of a model sequence $m^k$ with $m^{(l)}$ in effect at time $k$ (i.e., $m_k = m^{(l)}$).
Importantly, $\delta_k(l)$ has a recursive form $\delta_\kappa(l) = \max_i[\delta_{\kappa-1}(i) + \ln \pi_{il}] + g_l(z_\kappa, \bar{y}_\kappa, \overline{y^2_\kappa})$, where $\pi_{il}$ is the mode transition probability (6), $g_l(\cdot)$ is a known function of model $m^{(l)}$, $y_\kappa = [x_\kappa', x_{\kappa-1}']'$, $\bar{y}_\kappa = E[y_\kappa \mid z^k, \hat{m}^k_{[j]}]$, and $\overline{y^2_\kappa} = E[y_\kappa y_\kappa' \mid z^k, \hat{m}^k_{[j]}]$. As such, within each batch iteration (using $z^k$) of the EM algorithm for MAP estimation of the mode sequence $\hat{m}^k_{\mathrm{MAP}} = \arg\max_{m^k} p(m^k \mid z^k)$, the required $\{\bar{y}_\kappa, \overline{y^2_\kappa}, \kappa \leq k\}$ can be obtained (in the E-step) by an efficient fixed-interval Kalman smoother for a system with base state $y_\kappa$ equivalent to the original linear system based on $\hat{m}^k_{[j]}$; and then $\hat{m}^k_{[j+1]}$ can be obtained (in the M-step) by solving the optimal path problem with the above score $\delta_\kappa(l)$ efficiently via dynamic programming (Viterbi algorithm) [228, 229].

Along similar lines, a simpler EM-based ML estimation algorithm of the constant mode sequence (i.e., for the first generation AMM algorithm) was given in [254] and [73] in the context of the so-called mixture of experts (see Section VIII).

c) EM-based MM estimation for tracking: Clearly, the above EM-based hybrid estimation techniques can be applied to maneuvering target tracking using multiple models explicitly or implicitly. Indeed, a number of such algorithms have been developed in [285], [286], [287], [231], [230], [358], [359], [27], and [28]. Here the key question is: What amounts to a maneuver model? Similar to the decision-based methods surveyed in Part IV [208], two answers have been proposed: 1) the unknown input (acceleration) $u_k$ [285, 286, 287, 231, 230] and 2) the statistics (mean and covariance) of process noise $w_k$ [358, 359, 27, 28].
In [285], [286], [287], [231], and [230], the target maneuver is described by a linear time-invariant system in white Gaussian noise

$$x_k = F x_{k-1} + G u_k + w_k, \qquad w_k \sim \mathcal{N}(0, Q_k)$$
$$z_k = H x_k + v_k, \qquad v_k \sim \mathcal{N}(0, R_k)$$

with an unknown input (acceleration) $u_k$ modeled as a homogeneous Markov chain having $M$ possible levels $u^{(1)}, \ldots, u^{(M)}$ and initial and transition probabilities $p(u_1)$, $p(u_k \mid u_{k-1})$. The base-state sequence $x^k$ is treated as missing data in [285], [286], and [287] for mode-sequence estimation, where $Q(m^k \mid \hat m^k_{[j]})$ of (38) reduces to

$$Q(u^k \mid \hat u^k_{[j]}) = \sum_{\kappa=1}^{k} \left\{ \ln p(u_\kappa \mid u_{\kappa-1}) - \tfrac{1}{2} \left\| G u_{\kappa-1} - \hat x^{[j]}_{\kappa|k} + F \hat x^{[j]}_{\kappa-1|k} \right\|^2_{Q^{-1}} \right\}.$$

The E-step clearly boils down to the computation of $\hat x^{[j]}_{\kappa|k} \triangleq E[x_\kappa \mid z^k, \hat u^k_{[j]}]$, obtainable by a fixed-interval Kalman smoother given $\hat u^k_{[j]}$, and the M-step can be implemented exactly by the Viterbi algorithm given $\hat x^{[j]}_{\kappa|k}$ since the score $Q(u^\kappa \mid \hat u^k_{[j]}) - Q(u^{\kappa-1} \mid \hat u^k_{[j]})$ at time $\kappa$ depends only on the transition $(u_{\kappa-1}, u_\kappa)$ [285, 286, 287]. Note that the transition score $\Delta_{i_{k-1},k}$ of (35) used in [11] depends on the entire history $m^{k-1}$ only through $x_k$ because $f(z_k \mid m^k, z^{k-1}) = E[f(z_k \mid x_k, m_k) \mid m^k, z^{k-1}]$. It would be independent of $m^{k-1}$ if $\hat f(z_k \mid m_k, z^{k-1}) = E[f(z_k \mid x_k, m_k) \mid \hat m^k, z^{k-1}]$ were used, as in the EM formulation, where the average is over $x_k$ only.

As is typical for the PMHT approach, [231] and [230] treat discrete uncertainties ($u^k$ here) as missing data (see footnote 20) for base-state sequence estimation via the EM formulation (37), leading to

$$Q(x^k \mid \hat x^k_{[j]}) = -\frac{1}{2} \left[ \sum_{\kappa=1}^{k} \left\{ \| z_\kappa - H x_\kappa \|^2_{R_\kappa^{-1}} + \| x_\kappa - F x_{\kappa-1} - G \hat u^{[j]}_\kappa \|^2_{Q_\kappa^{-1}} \right\} + \| x_0 - \bar x_0 \|^2_{P_0^{-1}} \right]$$

where $\hat u^{[j]}_\kappa = \sum_{i=1}^{M} u^{(i)} \mu^{(i)}_\kappa$ and $\mu^{(i)}_\kappa = P\{u_\kappa = u^{(i)} \mid z^k, \hat x^k_{[j]}\}$ is computed (E-step) via a forward-backward HMM smoother given $\hat x^k_{[j]}$. The maximizer of $Q(x^k \mid \hat x^k_{[j]})$ given $\hat u^k_{[j]}$ is found (M-step) by a fixed-interval Kalman smoother [231, 230].
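The E-step of the PMHT-style formulation above, computing the smoothed level probabilities $\mu^{(i)}_\kappa$ and the expected input $\hat u_\kappa = \sum_i u^{(i)} \mu^{(i)}_\kappa$, can be sketched with a scaled forward-backward recursion. This is a generic HMM smoother written under stated assumptions, not the code of [231] or [230]: the names `forward_backward` and `expected_input` are hypothetical, the input levels are taken as scalars, and the per-step likelihood table `lik` is assumed to have been evaluated from the current state-sequence iterate.

```python
def _normalize(v):
    s = sum(v)
    return [x / s for x in v]

def forward_backward(trans, init, lik):
    """Smoothed probabilities mu[t][i] = P{u_t = u^(i) | all data}.

    trans[i][j] : p(u_t = u^(j) | u_{t-1} = u^(i))
    init[i]     : p(u_1 = u^(i))
    lik[t][i]   : likelihood of step t under level i (assumed given; in the EM
                  setting it would come from the current state-sequence iterate)
    """
    T, M = len(lik), len(init)
    # forward pass (alpha), rescaled at every step to avoid underflow
    alpha = [_normalize([init[i] * lik[0][i] for i in range(M)])]
    for t in range(1, T):
        pred = [sum(alpha[-1][i] * trans[i][j] for i in range(M)) for j in range(M)]
        alpha.append(_normalize([pred[j] * lik[t][j] for j in range(M)]))
    # backward pass (beta), also rescaled; the scale cancels in the final step
    beta = [[1.0] * M for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = _normalize([
            sum(trans[i][j] * lik[t + 1][j] * beta[t + 1][j] for j in range(M))
            for i in range(M)])
    # mu = alpha * beta / c, with c the per-step normalization factor
    return [_normalize([alpha[t][i] * beta[t][i] for i in range(M)]) for t in range(T)]

def expected_input(levels, mu):
    """u_hat[t] = sum_i u^(i) * mu[t][i] for scalar levels."""
    return [sum(u * w for u, w in zip(levels, row)) for row in mu]
```

The resulting `expected_input` sequence would then be handed to a fixed-interval Kalman smoother for the M-step; that part is omitted here.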
The approach of [358] and [359] differs from that of [231] and [230] in that the maneuver models differ in their process noise $w_k$, governed by a hidden Markov chain with different covariance levels $Q^{(i)}$, rather than the unknown input levels. This approach was combined with the so-called turbo-PMHT in [298] for maneuvering target tracking in clutter. Further, by choosing the complete-data problem as $(\theta, Y, Z) = ((x^k, m^k), r^k, z^k)$, where $r^k$ stands for the sequence of data association events, another version was proposed in [298], in which the forward-backward smoothing for the M-step is replaced by an IMM smoother to handle the uncertainty in $r^k$. The EM formulation here is, however, directed to data association, not motion-mode uncertainty. Comparative results of all the above EM-based trackers were also reported in [298].

Similarly, [27] and [28] also model various maneuvers by process noise having $M$ covariance levels (two levels were implemented). However, the base state is treated as a “nuisance.” Conceptually, the E-step amounts to fixed-interval Kalman smoothing, but the M-step is greatly simplified by assuming jumps among the levels are independent over time: it is trivially implemented for each $\kappa = 1, 2, \ldots, k$ independently without a need for dynamic programming.

Footnote 20: In fact this work accounted also for clutter measurements in a PMHT framework.

Still another popular formulation of maneuvers is in terms of turn rate. To our knowledge there is no such EM-based formulation in the literature so far.

d) Remarks: The optimal MMSE-CMM estimator has an exponentially (geometrically) increasing computational complexity on the order of $M^k$, while the above EM-based MAP-CMM algorithms have a linearly increasing complexity on the order of $(M^2 + n_x^3)k$ in each iteration for the batch from initial time to time $k$, where $n_x$ is the dimension of the base state. The price paid by these EM-based algorithms to achieve this linear complexity is the following.
In general there is no hope to obtain the exact MAP estimate in finite iterations and no guarantee to converge to the exact MAP estimate even with infinite iterations. As with almost all iterative algorithms for optimization problems, there is no guarantee that an EM algorithm converges to the global maximum: it may well converge to a local maximum or saddle point or possibly even to a minimum in rare cases (see [249] for a simple example). A widely used strategy here is to try different initial points to enhance the chance of converging to the global maximum at the cost of substantially increased computation. While practitioners would not insist on having an exact optimal estimate, giving up the requirement of being close to the global maximum is a major relaxation of the MAP estimation goal. There is no reason to believe that with such a relaxation the corresponding MMSE-based algorithm with a comparable complexity could not be developed.

Nevertheless, the EM-based approach has certain undeniable merits: it is systematic, theoretically elegant, and very powerful. This does not imply that it is free of serious drawbacks for practical, real-time applications in maneuvering target tracking. (See [359] for a comprehensive discussion.) What is probably worst is that the EM algorithm requires batch processing of data, just like other MAP estimators. This is acceptable for some applications, such as trajectory determination, but not for most maneuvering target tracking problems, where real-time processing is necessary. Although it has no recursive form in general, the EM algorithm may serve as a basis for developing approximate, recursive algorithms, as described next.

e) EM-based recursive MM estimation for tracking: Two such EM-based recursive algorithms for maneuvering target tracking have been proposed recently in [285], [286], [287], [157], and [158].
The recursive algorithm proposed in [285], [286], and [287] is an approximation of the batch solution for mode-sequence MAP estimation presented therein. It reduces the batch processing to the Viterbi algorithm combined with a one-step Kalman filter by 1) modifying Baum’s auxiliary function $Q(u^k \mid \hat u^k_{[j]}) = E[\ln p(x^k, u^k, z^k) \mid z^k, \hat u^k_{[j]}]$ to an approximate sequential version $q(u_k \mid \hat u^{k-1}) = E[\ln p(x^k, u^k, z^k) \mid z^k, u^{k-1}]\big|_{u^{k-1} = \hat u^{k-1}}$ suitable for recursion, 2) replacing the smoothed state estimates by the filtered state estimates, and 3) ignoring the dependence of $x_k$ on $u^{k-1}$ given $E[x_k \mid u^{k-1}, z^k]$, $\mathrm{cov}(x_k \mid u^{k-1}, z^k)$, and $u_k$. The resulting recursive algorithm has $M^2$ Kalman filtering cycles at each recursion and is essentially the same as that of [11]. At time $k$ for each transition (link) $(u_{k-1}, u_k)$ of the Viterbi trellis, measurement residual $\tilde z_k$ and its covariance $S_k$ are obtained by a one-step Kalman filter, and then the best path for each level of $u_k$ is determined by the Viterbi algorithm based on the transition costs

$$\Delta_{i_{k-1},k} = \Delta_{k-1,k} = \ln p(u_k \mid u_{k-1}) - \tfrac{1}{2} \| \tilde z_k \|^2_{\tilde S_k^{-1}}$$

where $\tilde S_k = S_k (H_k Q_k H_k')^{-1} S_k$. As pointed out in [285], [286], and [287], it differs in effect from that of [11] only in that $\tilde S_k$ is in place of $S_k$. No interpretation for this replacement was given, although it appears beneficial judging from simulation results presented therein with an assumed model set $\{0, \pm 0.15, \pm 0.3\}$ m/s$^2$ in each of the two coordinates (thus there are 25 models). A simple scenario with a true input level jumping from 0 to 0.15 m/s$^2$ and then back to 0 again was considered, which does not seem representative of reality. Contradicting the results shown in [11], much poorer results from the IMM algorithm were also given in [287], possibly due to the particular design/implementation.
More surprisingly, [287] claimed that the computational time of the IMM algorithm, whose complexity is on the order of $M$, is 6 times that of the recursive algorithms of [285], [286], [287] and of [11], both of which have complexity on the order of $M^2$.

The above recursive EM-based algorithm is based on mode-sequence estimation. More natural is a recursive EM-based algorithm for base-state sequence estimation directly. Such an algorithm was proposed in [157] and [158], named “reweighted IMM (RIMM)” algorithm by its authors. It is identical to the IMM algorithm except that the mixing and output formulas are replaced by a new weighted sum in which the weights account for not only the probability of each model being true but also the accuracy of the estimate from each elemental filter:

$$\hat x^{(j)}_{k|k-1} = P^{(j)}_{k|k-1} \sum_{i \in M} (P^{(i,j)}_{k|k-1})^{-1} \hat x^{(i,j)}_{k|k-1}\, \mu^{i|j}_{k-1}, \qquad (P^{(j)}_{k|k-1})^{-1} = \sum_{i \in M} (P^{(i,j)}_{k|k-1})^{-1} \mu^{i|j}_{k-1} \qquad (39)$$

$$\hat x_{k|k} = P_{k|k} \sum_{i \in M} (P^{(i)}_{k|k})^{-1} \hat x^{(i)}_{k|k}\, \mu^{(i)}_k, \qquad P^{-1}_{k|k} = \sum_{i \in M} (P^{(i)}_{k|k})^{-1} \mu^{(i)}_k \qquad (40)$$

where the mixing weights $\mu^{i|j}_{k-1}$ and mode probability $\mu^{(i)}_k$ are computed as in the IMM algorithm (see Table II) and

$$\hat x^{(i,j)}_{k|k-1} = E[x_k \mid z^{k-1}, m^{(i)}_{k-1}, m^{(j)}_k] = F_j \hat x^{(i)}_{k-1|k-1} + G_j u_{k-1}$$

$$P^{(i,j)}_{k|k-1} = \frac{\pi_{ij}\, F^{(j)}_k P^{(i)}_{k-1|k-1} (F^{(j)}_k)'}{\mu^{(j)}_{k|k-1}} + G^{(j)}_k Q^{(j)}_k (G^{(j)}_k)'.$$

Both “reweighted” sum formulas are combinations of a probabilistic weighted sum for MMSE-based MM estimation and the “parallel resistors” formula for fusion of (probabilistically correct) estimates with uncorrelated errors (see [19, Sec. 8.3.3]). Note that at each recursion, the RIMM algorithm requires $M^2$ predictions but $M$ updates, while the IMM algorithm requires $M$ predictions and $M$ updates (see Section VB1). Similar to the GPB2 case, let $\bar X_{k-1} = \{\hat x^{(i)}_{k-1|k-1}, P^{(i)}_{k-1|k-1}, \mu^{(i)}_{k-1}, i \in M\}$.
The reweighted output formula follows from minimizing a sum of quadratic errors of fitting $(x_{k-1}, x_k)$ to measurement $z_k$, dynamics, and the estimates $\hat x^{(i)}_{k-1|k-1}$, $i \in M$, with a given probabilistic weight $\mu^{(j)}_k$, $j \in M$, that is, $\hat x_{k|k} = \arg\min_{x_k} q(x_k \mid \bar X_{k-1})$ with $q(x_k \mid \bar X_{k-1}) = \min_{x_{k-1}} q(x_{k-1}, x_k \mid \bar X_{k-1})$, where

$$q(x_{k-1}, x_k \mid \bar X_{k-1}) = \sum_{i \in M} \| x_{k-1} - \hat x^{(i)}_{k-1|k-1} \|^2_{\mu^{(i)}_{k-1} (P^{(i)}_{k-1|k-1})^{-1}} + \sum_{j \in M} \left\{ \| z_k - H^{(j)}_k x_k \|^2_{(R^{(j)}_k)^{-1}} + \| x_k - F^{(j)}_{k-1} x_{k-1} \|^2_{(G^{(j)}_k Q^{(j)}_k G^{(j)\prime}_k)^{-1}} \right\} \mu^{(j)}_k.$$

As shown in [157] and [158], $q(x_k \mid \bar X_{k-1})$ can be written either in the GPB2 form of $M^2$ pairs of quadratics

$$q(x_k \mid \bar X_{k-1}) = \sum_{i,j \in M} \left\{ \| z_k - H^{(j)}_k x_k \|^2_{(R^{(j)}_k)^{-1}} + \| x_k - F^{(j)}_{k-1} \hat x^{(i)}_{k-1|k-1} \|^2_{(P^{(i,j)}_{k|k-1} / \mu^{i|j}_{k-1})^{-1}} \right\} \mu^{(i)}_k$$

or in the IMM form of $M$ pairs of quadratics:

$$q(x_k \mid \bar X_{k-1}) = \sum_{j \in M} \left\{ \| z_k - H^{(j)}_k x_k \|^2_{(R^{(j)}_k)^{-1}} + \| x_k - \hat x^{(j)}_{k|k-1} \|^2_{(P^{(j)}_{k|k-1})^{-1}} \right\} \mu^{(j)}_k$$

with mixing estimate $\hat x^{(j)}_{k|k-1}$ and covariance $P^{(j)}_{k|k-1}$ given by (39). Equation (40) thus follows easily from minimizing $q(x_k \mid \bar X_{k-1})$ in this IMM form. As explained in Section VB1, these two forms are equivalent for output processing, but not for conditional filtering. Each pair of the quadratics above is minimized by a Kalman conditional filter.

As argued in [157] and [158], $-q(x_{k-1}, x_k \mid \bar X_{k-1})$ can be interpreted/justified as an approximation of Baum’s auxiliary function $Q(x^k \mid \hat x^K_{[l]}) = E[\ln p(x^K, m^{k+1}, z^K) \mid z^K, \hat x^K_{[l]}]$, $\forall k \le K$, for a data batch $z^K = (z_1, \ldots, z_k, z_{k+1}, \ldots, z_K)$ for an alternating expectation conditional maximization (AECM) algorithm [255, 249]. The approximations arise from truncation of $x^k$ to $(x_{k-1}, x_k)$ and replacement of the smoothed estimates from multiple iterations by filtered estimates from a single forward path.
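To make the difference between the reweighted output formula (40) and the usual probabilistic output combination concrete, the following NumPy sketch evaluates both on the same set of per-model estimates. It is an illustration under stated assumptions, not the code of [157] or [158]: the function names `reweighted_output` and `imm_output` are hypothetical, and the inputs are assumed to be the updated estimates, covariances, and mode probabilities delivered by the elemental filters.

```python
import numpy as np

def reweighted_output(x_list, P_list, mu):
    """Reweighted ("parallel resistors" with probabilistic weights) output:
        P^{-1} = sum_i mu_i * P_i^{-1}
        x      = P * sum_i mu_i * P_i^{-1} x_i
    """
    info = sum(m * np.linalg.inv(P) for m, P in zip(mu, P_list))
    info_state = sum(m * (np.linalg.inv(P) @ x) for m, P, x in zip(mu, P_list, x_list))
    P = np.linalg.inv(info)
    return P @ info_state, P

def imm_output(x_list, P_list, mu):
    """Standard moment-matched probabilistic output, shown for comparison."""
    x = sum(m * xi for m, xi in zip(mu, x_list))
    P = sum(m * (Pi + np.outer(xi - x, xi - x)) for m, Pi, xi in zip(mu, P_list, x_list))
    return x, P

# Made-up example: two models with equal probability but unequal accuracy.
x_list = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
P_list = [np.eye(2), 4.0 * np.eye(2)]
mu = [0.5, 0.5]
x_rw, P_rw = reweighted_output(x_list, P_list, mu)
x_imm, P_imm = imm_output(x_list, P_list, mu)
```

In this example the probabilistic output sits midway between the two estimates, while the reweighted output is pulled toward the estimate with the smaller covariance, which is the behavior the text attributes to the reweighting.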
The AECM algorithm is an extension of the so-called expectation conditional maximization algorithm [256], which replaces a complex M-step of the EM algorithm with several computationally simpler maximization steps conditioned on some constraints on the estimatee, called CM steps. In the AECM algorithm the specification of the complete data could be different for each CM step. This provides more flexibility needed for formulating sequential problems than most other EM-based algorithms. Reference [158] included simulation results suggesting that a two-model RIMM design had a more favorable speed error compared with the standard two-model and three-model IMM designs of [18]. More comparative studies are needed to draw a more general conclusion.

f) Other iterative algorithms: An iterative algorithm for JMAP estimation of the base state and mode sequences was proposed in [229] based on the block component optimization [125]:

$$\text{M1:}\quad \hat x^k_{[j+1]} = \arg\max_{x^k} f(x^k, z^k \mid \hat m^k_{[j]})$$
$$\text{M2:}\quad \hat m^k_{[j+1]} = \arg\max_{m^k} p(m^k, z^k \mid \hat x^k_{[j+1]})$$

which belongs to the general class of the so-called coordinate ascent (or descent) methods [232, 30]. For a Gauss-Markov jump-linear system, the M1 and M2 steps can be accomplished by a Kalman smoother and the Viterbi algorithm, respectively [229]. In fact, we believe it is better to swap the above M1 and M2 steps as follows

$$\text{M1:}\quad \hat m^k_{[j+1]} = \arg\max_{m^k} p(m^k, z^k \mid \hat x^k_{[j]})$$
$$\text{M2:}\quad \hat x^k_{[j+1]} = \arg\max_{x^k} f(x^k, z^k \mid \hat m^k_{[j+1]})$$

because $\hat m^k$ is usually not sensitive to $\hat x^k$ (think, e.g., of the case with $m_k = u_k$ above) but $\hat x^k$ depends on $\hat m^k$ significantly.

Another coordinate ascent algorithm was proposed in [90] to obtain the MAP estimate of the mode sequence $m^k$ by treating each $m_\kappa$ as one coordinate through iterations

$$\hat m^{[j]}_\kappa = \arg\max_{m_\kappa} p(m_\kappa \mid z^k, m^{[j]}_1, \ldots, m^{[j]}_{\kappa-1}, m^{[j-1]}_{\kappa+1}, \ldots, m^{[j-1]}_k)$$

for $\kappa = 1, \ldots, k$. Likewise for the MAP estimation of the base-state sequence $x^k$.
It was demonstrated in [90] by simulation results from maneuvering target tracking that these two MAP algorithms outperform the corresponding EM-based MAP algorithms of [229] discussed above.

C. Multiple-Model Smoothing

In maneuvering target tracking a number of problems exist that allow offline processing. One example is trajectory reconstruction (see, e.g., [273]). Also, if an estimation delay can be tolerated the tracking performance may be improved dramatically by smoothing [236, 95, 96, 97, 173, 174, 77, 78]. Besides, smoothing can also be used as an integral part of a nonlinear filtering algorithm to improve performance without time delay.

Smoothing is estimation (or more precisely “retrodiction” [95, 96, 97]) of a process at or through time $n$ using data $z^k$ through time $k$ ($k > n$). In the Bayesian setting, the complete solution amounts to finding the distribution function $f(y_n \mid z^k)$ or $f(y^n \mid z^k)$. For hybrid systems, $y$ could be the base state $x$, mode $m$, or hybrid state $\xi = (x, m)$. Then, finding $f(y_n \mid z^k)$ and $f(y^n \mid z^k)$ are state smoothing and state sequence (or trajectory) smoothing, respectively. Formal solutions to many of these smoothing problems in the sense of MMSE and MAP point estimation have been presented in Section VA. Here we focus on base-state smoothing for point (not density) estimation, that is, $\hat x_{n|k}$ with $n < k$, since almost all existing results are limited to this case.

As given in Section VA, the MMSE-optimal smoother $\hat x^{\mathrm{MMSE}}_{n|k} = E[x_n \mid z^k]$ is a weighted sum of $\hat x^{(i_k)}_{n|k} = E[x_n \mid z^k, m^{k,(i_k)}]$ given mode sequence $m^{k,(i_k)}$ with weights $\mu^{k,(i_k)}$. The probabilistic weights $\mu^{k,(i_k)} = P\{m^{k,(i_k)} \mid z^k\}$ can be obtained by Bayes’ rule. For a jump-linear system with white Gaussian noise, each conditional smoothed estimate $\hat x^{(i_k)}_{n|k}$ is given by the well-known Kalman smoother (see, e.g., [291, 251, 250, 7]).
As in the case of MMSE filtering, unfortunately, this optimal solution has an exponentially increasing complexity and is thus infeasible for real-time applications. So a number of suboptimal solutions with polynomial or even linear complexity have been proposed [241, 59, 132, 133, 76, 174, 78]. Some of the issues associated with smoothing for target tracking were discussed in [95], [96], and [97].

For smoothing, particularly recursive smoothing, three common classes have been traditionally considered: fixed interval ($\hat x_{n|k}$, $n = 1, 2, \ldots, k$ with a fixed $k$), fixed point ($\hat x_{p|k}$, $k = 1, 2, \ldots$ with a fixed $p$), and fixed lag ($\hat x_{k-L|k}$, $k = 1, 2, \ldots$ with a fixed $L$). MM smoothing has been largely limited to the cases of fixed interval and fixed lag except that of [154].

Fixed-Interval Smoothing: Reference [59] presented general results for time-reversion of discrete-time Markov jump systems (MJSs) and, in particular, models in reverse time that are equivalent to the original MJLS. As an application, it presented an optimal solution for fixed-interval smoothing of an MJS based on fusion of posterior distributions obtained by two optimal MM estimators, one running forward for the original system and the other running backward using the equivalent reverse-time model. The approach is quite general, not limited to an MJLS or point estimation. The main difficulty is to obtain the equivalent reverse-time model and the optimal forward and backward MM estimators. For an approximate implementation the IMM algorithm was suggested to replace the optimal MM estimators. This implementation was demonstrated (by simulation via trajectory smoothing of the state of an MJLS) to have very good estimation accuracy and mode identification at a low computational cost.
The approximate fixed-interval smoother of [132] is conceptually similar in that it also consists of fusing forward and backward IMM estimators, but with two significant differences. First, fusion is based on a simpler but more restrictive rule, the “parallel-resistor” formula (see, e.g., [116] or [19, Sec. 8.3.3]). As a result, the backward IMM estimator has to be initialized without any prior knowledge of the state (see footnote 21). Second, it bypassed the task of finding an equivalent reverse-time model and derived the required backward IMM algorithm directly from the original MJLS with white Gaussian noise. The simulation results for maneuvering target tracking demonstrated a dramatic improvement of the smoothed estimates in comparison with the forward/backward IMM estimators alone. In both smoothers, fusion is done between every pair of forward and backward conditional filters, resulting in $M^2$ fusion operations per time step, although fusion between the overall estimates of the two IMM estimators would reduce it to just one fusion operation per time step, but with some performance loss.

These IMM smoothers are both MMSE based, although the general approach of [59] is not limited to the MMSE criterion. As pointed out in Section VA, the components of an MMSE sequence estimate $\hat x^{k|k}_{\mathrm{MMSE}} = E[x^k \mid z^k]$ are MMSE smoothed estimates $\hat x^{\mathrm{MMSE}}_{n|k} = E[x_n \mid z^k]$:

$$\hat x^{k|k}_{\mathrm{MMSE}} = (\hat x^{\mathrm{MMSE}}_{1|k}, \ldots, \hat x^{\mathrm{MMSE}}_{k|k}),$$

but the components of a MAP sequence estimate $\hat x^{k|k}_{\mathrm{MAP}} = \arg\max_{x^k} f(x^k \mid z^k)$ are not MAP smoothed estimates $\hat x^{\mathrm{MAP}}_{n|k} = \arg\max_{x_n} f(x_n \mid z^k)$:

$$\hat x^{k|k}_{\mathrm{MAP}} = (\hat x_{1|k}, \ldots, \hat x_{k|k})_{\mathrm{MAP}} \ne (\hat x^{\mathrm{MAP}}_{1|k}, \ldots, \hat x^{\mathrm{MAP}}_{k|k})$$

because the peak location of the joint pdf $f(x, y)$ is in general not $(x^*, y^*)$, where $x^*$ and $y^*$ are the peak locations of the marginal pdfs $f(x)$ and $f(y)$, respectively. Thus, the EM-based algorithms, discussed in Section VB5, for MAP sequence estimation do not provide the fixed-interval MAP smoothed estimates in one shot.
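The distinction between the peak of a joint distribution and the peaks of its marginals can be checked on a toy case. The sketch below uses a made-up discrete joint pmf over two binary variables as a stand-in for the joint pdf $f(x, y)$; the table values and the helper names `joint_peak` and `marginal_peaks` are purely illustrative assumptions.

```python
# Hypothetical joint pmf over (x, y), each variable taking the value 0 or 1.
joint = {(0, 0): 0.4, (0, 1): 0.0, (1, 0): 0.3, (1, 1): 0.3}

def joint_peak(p):
    """Location of the maximum of the joint pmf (the 'sequence' MAP)."""
    return max(p, key=p.get)

def marginal_peaks(p):
    """Locations of the maxima of the two marginals (the 'smoothed' MAPs)."""
    px, py = {}, {}
    for (x, y), w in p.items():
        px[x] = px.get(x, 0.0) + w
        py[y] = py.get(y, 0.0) + w
    return max(px, key=px.get), max(py, key=py.get)

seq_map = joint_peak(joint)           # peak of the joint
marg_map = marginal_peaks(joint)      # tuple of marginal peaks
```

Here the joint peaks at $(0, 0)$, whereas the marginal of $x$ peaks at 1 and the marginal of $y$ peaks at 0, so the componentwise marginal maximizers $(1, 0)$ do not form the joint maximizer.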
However, the EM-based algorithms for MAP sequence estimation can be modified easily for MAP smoothing of a state sequence. More important, a MAP sequence estimate appears more meaningful and useful in practice than a sequence of such MAP smoothed estimates.

Footnote 21: This fusion rule can be replaced by a more general one (e.g., those of [225]) so that the backward estimator can be initialized as desired.

Fixed-Lag Smoothing: The fixed-lag smoothing algorithm of [241] differs from the MM filtering algorithm of [338], [129], [338], [336], and [337] based on the B-best pruning strategy (see Section VB3) only in that conditional filtering is replaced by conditional smoothing, achieved by fixed-lag Kalman smoothers [7].

Two MM algorithms were developed in [131] and [133] for one-step fixed-lag smoothing based on two different representations of $f(x_{k-1} \mid z^k)$ via the total probability theorem

$$\text{A:}\quad f(x_{k-1} \mid z^k) = \sum_{i=1}^{M} f(x_{k-1} \mid m^{(i)}_{k-1}, z^k)\, \mu^{(i)}_{k-1|k}$$
$$\text{B:}\quad f(x_{k-1} \mid z^k) = \sum_{i=1}^{M} f(x_{k-1} \mid m^{(i)}_k, z^k)\, \mu^{(i)}_k.$$

Algorithm A involves $M^2$ one-step smoothers after approximating $f(x_{k-1} \mid m^{(i)}_{k-1}, z^k)$ and $f(x_{k-1} \mid m^{(i)}_{k-1}, m^{(j)}_k, z^k)$ by single Gaussians via moment matching (see footnote 22), while algorithm B involves only $M$ one-step smoothers based on the standard IMM approximation of $f(x_{k-1} \mid m^{(i)}_k, z^k)$ by a single Gaussian. Algorithm B was extended to L-step fixed-point smoothing in [154].

The central idea of the L-step fixed-interval smoother (over a sliding window) of [76] is to form a grand state $(x_k', x_{k-1}', \ldots, x_{k-L}')'$ by state stacking (augmentation) and run the IMM estimator for this augmented system to produce the smoothed estimate $E[(x_k', x_{k-1}', \ldots, x_{k-L}')' \mid z^k]$ “automatically.” A key enabling approximation is that $f(x_k, x_{k-1}, \ldots, x_{k-L} \mid m^{(j)}_k, z^k)$ is Gaussian, which is much stronger than the approximation B above. Another implication is that no mode jumps within the interval $(k - L, \ldots, k - 1]$ are accounted for explicitly.
Additional strong approximations were made to evaluate the retrodicted probabilities $P\{m^{(i)}_{k-n} \mid z^k\}$, $n = 1, \ldots, L$, since the IMM estimator gives only $P\{m^{(i)}_k \mid z^k\}$. Another approximate way of evaluating these mode probabilities was given in [278]. Simulation results over the scenario of [132] demonstrated that even with only a small lag this smoother can outperform the IMM filtering algorithm very significantly at the cost of an increase in computation of about $(L + 1)$ times. Further successful tracking applications of this state-stacking based IMM smoother were reported in [77] for a single maneuvering target in clutter (smoother coupled with the probabilistic data association (PDA) filter), and in [78] for multiple maneuvering targets in clutter (smoother coupled with the joint-PDA filter). It was found during our investigation for [154] that this state-stacking based IMM smoother and the one-step IMM smoother of [133] had almost equal accuracy for the scenario simulated, while the latter is much less computationally demanding.

Footnote 22: These $M^2$ one-step smoothers were lumped into $M$ one-step smoothers by a hardly justifiable heuristic in [131].

For real-time applications, the above smoothing results can only provide time-delayed estimates. Nevertheless, such delayed estimation can be used to improve filtering results (i.e., without a delay) of an MM algorithm, which is inherently nonlinear, by refining its past estimates and rerunning the MM algorithm using the refined estimates. This approach was taken consciously and promoted in [154] for performance enhancement.

D. Convergence of CMM Estimation Algorithms

For a stochastic jump-linear system, a hybrid estimation algorithm that converges exponentially exists if several conditions, including those on observability, given in [145] are satisfied.
Here the exponential convergence [145] refers to an algorithm which correctly identifies the system mode in finite time and has a base-state estimate sequence with a unique mean and convergent covariance, and an estimation-error mean converging exponentially to a bounded set with a guaranteed rate.

Earlier, [281] considered the problem of mode-sequence identification. It defines

$$\tilde\mu^{(j)}_k = \frac{1}{c} \sum_{i=1}^{M} (\Pi^L)_{ij}\, \tilde\mu^{(i)}_{k-1} \exp[-A_{sj}], \qquad A_{sj} = \lim_{L \to \infty} \| Z^{(s)}_k - Z^{(j)}_k \|^2 / (L \sigma^2)$$

where $Z^{(j)}_k = [z^{(j)\prime}_{(k-1)L+1}, z^{(j)\prime}_{(k-1)L+2}, \ldots, z^{(j)\prime}_{kL}]'$ is the vector of measurements over the block (interval) of $L$ discrete times, $[(k-1)L + 1, (k-1)L + 2, \ldots, kL]$, if $m^{(j)}$ is the correct model over the block; $s$ stands for the true model; $(\Pi^L)_{ij}$ stands for the $(i, j)$th element of the $L$th power of the transition probability matrix $\Pi$; and $c$ is the sum of the numerators over $i = 1, \ldots, M$. For a hybrid system, it was shown in [281] that the weights $\tilde\mu^{(j)}_k$ will converge and the true model $s$ has the largest steady-state value $\lim_{k \to \infty} \tilde\mu^{(s)}_k$ under the condition that all mode transitions are possible but infrequent and the true model has the best fit to the data. Note that $\tilde\mu^{(j)}_k$ is an approximation of the following de facto mode probability in the Gaussian case

$$\mu^{(j)}_k = \frac{1}{c} \sum_{i=1}^{M} (\Pi^L)_{ij}\, \mu^{(i)}_{k-1} \exp[-\| Z^{(j)}_k - Z^{(s)}_k \|^2 / (L \sigma^2)]$$

assuming no mode transition within the block, by replacing the exponential factor with its steady-state value as the block size increases. As such, the above results for $\tilde\mu^{(j)}_k$ hold approximately true for $\mu^{(j)}_k$, which is meaningful if $\mu^{(j)}_k$ is used as a fitness measure for an MM estimator. In fact, such a block MM estimator was proposed in [281] for mode sequence MAP estimation.

By a theoretical analysis of a generic CV-CA IMM algorithm, [88] concluded that even if the true system has a CA plant, the error of the CV filter remains bounded and the steady-state RMS error can be estimated from the acceleration estimates of the CA filter.

E.
Tracking Applications

Almost all tracking applications of the second generation MM algorithms so far are those of the IMM algorithm [12, 14, 15]. Many of these applications have been documented in the survey of [246]. Since then numerous new successful applications have been reported. They further demonstrate the good performance of the IMM algorithm for various tracking problems. While addressing older results briefly, many already discussed in [246], we will pay more attention to more recent ones.

1) Surveillance for Air Traffic Control: The first real application of the IMM algorithm is probably the jump-diffusion prototype tracker developed by Blom’s team as the track maintenance part of a multisensor multitarget tracking system [60] (see also [61] and [55]) for Eurocontrol, the European organization for air navigation. This sophisticated tracker included four models for horizontal motion (straight constant/changing speed motion, and left/right turns) and two for vertical motion (level motion and changing altitude). Such a model set represents well typical en-route motions of a civilian aircraft: horizontal CV motions most of the time with occasional changes in speed, or horizontal constant turns, or small vertical climb/descent maneuvers. It also allows decoupled tracking of the altitude and horizontal motions. An EKF was employed to handle the nonlinearity involved in some of the horizontal models (with 2D position, ground speed, course and transversal acceleration as the state components) (see [209, Sec. V.B.2]). The more efficient second-order-dependence model (3) was employed to govern the mode transitions. The comprehensive performance evaluation with both simulation and real ATC data presented in [60] demonstrated accurate state estimation, fast response to mode changes, and high credibility of the tracker. Overall the tracker outperformed by far the existing ones based on a single model (α-β-γ or EKF based).
This IMM-based prototype tracker has been implemented and installed in the Air Traffic Management Surveillance Tracker and Server (ARTAS) of Eurocontrol, which is operational in many European countries and is the basis for the Surveillance Data Processing and Distribution (SDPD) system in Europe [136], [54].

Another comprehensive study of the capabilities of the IMM algorithm for advanced ATC systems within the Hadamard project of the French civil aviation administration was reported in [340], including detailed design and evaluation of six different MM configurations. It was found that the best trade-off between performance and computation was achieved by a two-model IMM configuration of a CV model and a CT model with an unknown turn rate ([209, Sec. VB]) as a state component and fictitious process noise for the longitudinal acceleration. In contrast to [60], no explicit model for the longitudinal acceleration was included because it is rare and small in civil aircraft motion. This two-model CV-CT IMM configuration was shown to meet the rather stringent requirements of the project very well. A principal conclusion of this study was that “stringent speed estimation can be obtained only with MM algorithms.”

A detailed design and evaluation of an IMM algorithm with a two-model (CV-CT) configuration was given in [199] (see also [18] and [21]), along with many guidelines and insights on parameter selection and tuning of the algorithm. In particular, it was demonstrated that the CT model is well suited to the ATC applications and best results are obtained if it is included as a maneuver model. An implementation of a two-model (CV-CT) IMM algorithm was reported in [161] for the Micro En Route Automated Radar Tracking System (μEARTS) that again showed the superiority of the IMM estimator to single-model (adaptive α-β or EKF) based filters.
The series of papers [367, 345, 171] presented the work for a large-scale (about 1000 targets) multisensor multitarget tracking system for ATC surveillance, which was developed into the software package MATSurv (for “Multisensor Air Traffic Surveillance”) [366]. It combines the IMM algorithm for state estimation with assignment algorithms for radar data association in a dense environment. This combination is supported by the prior comparative study of IMMPDA versus IMM-assignment on real ATC surveillance data reported in [172]. More specifically, [367] implemented an IMM configuration with two second-order linear models. Reference [345] reported a significant performance enhancement by utilizing a nonlinear CT model in the (CV-CT) IMM configuration, with an improvement of 10-50% in the horizontal prediction errors on real data obtained from five ATC radars. Additionally, the IMM configuration facilitated data association better than the Kalman filter. More results on real and simulated data were presented in [171], along with a discussion of parallelized algorithms with superlinear speedup for multitarget tracking using the IMM estimator. Other references addressing the design of IMM tracking filters for ATC surveillance include [342] and [127].

2) Defense Applications: Compared with the ATC applications, which involve relatively benign maneuvers, tracking hostile or noncooperative targets, such as evasive manned aircraft, can be significantly more difficult and challenging. For example, these targets possess very strong maneuverability, often not well known to trackers; their motion behavior is quite unpredictable without sufficient knowledge of their types, missions, tactics, etc.; they may apply countermeasures to degrade the quality of measurements and hamper tracking efforts. The good news is that data quality and rates usually are significantly higher than in the ATC case.
As evidenced by the vast majority of studies, the MM approach appears to be the most powerful framework available, capable of meeting the challenges of maneuvering target tracking in a feasible way.

a) Benchmark tracking problems: Several benchmark problems were initiated in 1994 for a unified performance evaluation and fair comparison of tracking algorithms using a phased-array radar, which will be discussed in greater detail in a subsequent part. Comparative studies and designs of a variety of tracking algorithms were reported for the first benchmark [40, 48] and second benchmark [47, 45, 49]. They suggested that among all solutions proposed only the IMM algorithm was able to handle satisfactorily the wide range of maneuver scenarios, varying from moderate 2-3 g turns of cargo aircraft to intense series of severe 5-7 g turns of fighter aircraft [44, 86, 355, 168, 35 (using 3 models), and 155 (using 2 models)]. Some of these performance studies were verified by real experiments. The IMM design of [355] used CV, constant-thrust, and 3D CT models. The constant-thrust model was implemented adaptively within the standard CA filter by correcting the predicted acceleration vector (before and after mixing) so as to make it parallel to the predicted velocity vector. In a similar manner the predicted acceleration vector of the 3D CT filter was made perpendicular to the predicted velocity and the speed was kept “nearly” constant by means of the kinematic-constraint technique of [4].

The main approaches to the second benchmark problem [50] all use the IMM algorithm as a base-state estimator. They include the one in [355] that combines IMM with the integrated PDA filter, the IMMPDA solutions of [167], and the IMM-MHT solution of [36]. Reference [167] used three coordinate-uncoupled models: a CV model with low process noise for benign motion, a CV model with high process noise for ongoing maneuver, and a CA model with high process noise for maneuver onset/termination.
Reference [36] employed a horizontal CT model with polar velocity (see [209, Sec. V.B.2]), a CV model, and a CA model, where the CA and CT models allow altitude changes (see also [35] and [68]). Generally speaking, the IMMPDA solution reduced more radar time while the IMM-MHT reduced more radar energy. More recently, [298] presented comparative results for the second benchmark problem between maneuvering (i.e., MM) PMHTs (see Section VB5), showing that the PMHT performed reasonably well, almost as well as the above two solutions. The use of the IMM algorithm in the subsequent benchmark problems as a base tracking filter appears beyond question. Additional information for adaptive sampling and waveform selection in phased-array radars using the IMM algorithm can be found in [313], [194], [213], and [316].

b) Ballistic targets: In recent years the IMM algorithm proved very useful for another practical problem of vital importance: tracking of tactical ballistic missiles (TBMs) in all flight phases, namely boost (including postboost), coast (free flight), and reentry (possibly maneuverable). The motion of a TBM is much more constrained than that of a manned maneuvering aircraft and can be modeled relatively well during any particular flight phase (mode) (see [205]). In contrast to the hard-decision-based AMM approach of [104] and [103] (see Section IVE), various IMM-based algorithms, which make soft decisions, have been proposed to avoid the deficiencies associated with a hard decision [322, 257, 39, 141, 142, 80]. The IMM algorithm was employed in a prototype system for TBM in [322].
Four models were used: specialized boost, coast, and reentry models, and an auxiliary “general-purpose” CA model intended to provide a “back-up” estimate to the other filters (through the mixing mechanism) in cases when the specialized models are inadequate (e.g., due to unexpected maneuvers, such as trajectory corrections, retargeting). A critical issue in the MM tracking of ballistic targets is to design the transition probabilities since they are time varying, because the possible transitions are strongly dependent on the current mode of flight. This dependence was accounted for in [322] by switching among five transition probability matrices depending on the estimated target altitude and flight phase. For example, the boost and coast models/filters are dropped when reentry phase is established. Another major issue, mixing of different target states/covariances, was avoided by mixing only the common position and velocity components. The implementations of [39], [257], and [141] considered only the boost and coast phases. Reference [39] used a three-model configuration with CV for coast phase, CA for a “generic” filter, and a detailed flight dynamics based ten-state model for boost phase [205]. Specific for this implementation is the unconventional ad hoc state/covariance mixing, allowed only between the boost and CA models, and between the coast and CA models. This seems to make sense if the CA filter is accurate enough in all conditions to provide a backup in case the boost or coast filter does not perform well due to a mode change. For a similar implementation (2-model boost-coast IMM) [257] proposed and analyzed different time-varying distributions of the boost-to-coast transition based on a predicted burnout time and the uncertainty of this prediction. The study showed that the 2-model IMM algorithm is able to “detect” the burnout and provide highly accurate estimates shortly thereafter. No rocket staging, however, was allowed in this study. 
It seems that a CA or other (e.g., correlated) generic filter for acceleration would help cope with possible staging. For modeling the boost-to-coast transition in a two-model IMM configuration, [141, 142] proposed a sigmoidal function for the transition probability, depending on the estimated altitude and an a priori altitude at which the booster cutoff is likely to occur. Performance comparison of this version of an IMM algorithm having constant transition probabilities with the EKF and α-β-γ trackers over simulated and real data demonstrated its superior capabilities provided the parameter guess does not mismatch the truth by far. Another implementation of IMM for tracking of a TBM in the entire flight was given in [80].

c) GPB1 applications: The early paper [267] formulated the GPB1 algorithm for a semi-Markov mode sequence, but the one simulated had exponentially distributed sojourn times, which is actually Markov. Reference [260] applied this algorithm to passive tracking of a submarine with vertical maneuvers. By quantizing the unknown input into several known levels, the GPB1 algorithm reduces to a single Kalman filter with a probabilistically weighted input for a linear target motion. Submarine tracking was studied further in [262], [263], [261], and [264] by GPB1 tracking in range, velocity, and depth using passive time delay measurements. References [123] and [266] presented detailed GPB1 designs for realistic scenarios of 3D manned maneuvering aircraft tracking. A Singer model with several known quantized mean levels of acceleration was employed for an MM description of the target dynamics and the target maneuver was modeled by (semi-)Markov transitions between the levels. Versions in rectangular and spherical coordinates were developed and investigated, showing good accuracy and filter stability over a wide range of target acceleration, unreachable by a conventional single (e.g., EKF) filter.
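The remark above that GPB1 with a quantized input reduces to a single Kalman filter with a probabilistically weighted input can be checked numerically. The following is a minimal scalar sketch, not the design of [260]: it assumes a linear model x_k = F x_{k-1} + G u + w with the unknown input u taking one of a few known levels, and fixed prior level probabilities instead of a (semi-)Markov level sequence. The function name and all parameter values are hypothetical.

```python
import math


def gauss(residual, variance):
    """Zero-mean Gaussian density evaluated at a scalar residual."""
    return math.exp(-0.5 * residual ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def gpb1_quantized_input(x, P, z, levels, priors, F=1.0, G=1.0, Q=0.1, H=1.0, R=1.0):
    """One GPB1 cycle for a scalar state with an unknown input quantized into known levels.

    Returns the merged estimate, merged covariance, posterior level
    probabilities, and the estimate of a single Kalman filter driven by
    the probability-weighted input (for comparison).
    """
    # Covariance, residual variance, and gain do not depend on the input level.
    P_pred = F * P * F + Q
    S = H * P_pred * H + R
    K = P_pred * H / S
    P_upd = P_pred - K * S * K

    # Level-conditioned predictions and posterior level probabilities.
    preds = [F * x + G * u for u in levels]
    weights = [p * gauss(z - H * xp, S) for p, xp in zip(priors, preds)]
    total = sum(weights)
    mu = [w / total for w in weights]

    # Level-conditioned updates merged by moment matching (GPB1 output).
    updates = [xp + K * (z - H * xp) for xp in preds]
    x_merged = sum(m * xu for m, xu in zip(mu, updates))
    P_merged = P_upd + sum(m * (xu - x_merged) ** 2 for m, xu in zip(mu, updates))

    # Single Kalman filter with the probabilistically weighted input.
    u_bar = sum(m * u for m, u in zip(mu, levels))
    x_single = F * x + G * u_bar
    x_single = x_single + K * (z - H * x_single)
    return x_merged, P_merged, mu, x_single


x_merged, P_merged, mu, x_single = gpb1_quantized_input(
    x=0.0, P=1.0, z=2.1, levels=[-2.0, 0.0, 2.0], priors=[0.25, 0.5, 0.25])
```

Because the gain is common to all levels, the merged state estimate coincides with the single-filter estimate; the merged covariance additionally carries the spread-of-the-means term.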
Other related implementations and applications of the GPB1 algorithm can be found in [341], [265], and [71]. Reference [160] proposed a six-model GPB1-type MM tracker for a maneuvering reentry vehicle (MaRV) that quantizes the acceleration vector to represent the possible maneuvers (left/right turns, climb/dive, and deceleration) within the MaRV model [75, 205].

3) Tracking in Presence of Correlated Noise, Glint, or Multipath:

a) Correlated noise: Radar tracking at a high sampling rate leads to significant temporal correlation of the measurement errors, which degrades the performance of those trackers relying on whiteness of the measurement noise. Techniques for handling correlated measurements within the IMM framework were proposed in [128] and [361]. The approach of [128] is based on modeling the errors as a first-order Markov process with known coefficients. After decorrelation using the standard measurement differencing method, the IMM algorithm was applied straightforwardly (see also [246] for a discussion). Reference [361] proposed a more general technique that performs the decorrelation in an adaptive manner within the IMM framework without the assumption of known correlation parameters. It was demonstrated that this adaptive version and the one with known correlation parameters have similar performance.

b) Glint: When tracking large targets at short distance, the resulting radar measurement errors (known as glint noise) are characterized by non-Gaussian distributions with a heavy tail due to the interference caused by reflections from different elements of the target. The presence of glint can seriously degrade tracking performance if white Gaussian measurement errors are assumed by the tracker. Modifications of the Kalman filter capable of accounting for glint can be found in [134], [239], and [240] (see also [246]).
A good approximate model for the distribution of glint is a mixture of a Gaussian with a moderate variance and a Laplacian distribution with a large variance and a small weight [360]. This model is now generally accepted. It was used in [360] to develop a tracking filter implementing the Masreliez filter for non-Gaussian noise [239] for the approximate spherical target-measurement model of [123]. In the context of maneuvering target tracking, this approach was extended in [362], where a two-model IMM configuration with two modified²³ Masreliez filters (instead of Kalman filters) was proposed. A different approach was suggested in [84] and [85]. Two measurement models were included in the IMM design: one matched to the system with Gaussian observation noise and the other to the system with Laplacian noise of a large variance. Conditional filtering was implemented in [84] by EKFs for both models.²⁴ Further, to handle the maneuvering target case, the IMM design was expanded in [85] with a (Singer) maneuver model. Under the assumption that the model sets describing the target motion and measurement noise, respectively, are independent,²⁵ a “layered” version of the IMM algorithm was developed, which has computational advantages. A somewhat similar approach was followed in [373].

²³ Since the Masreliez filter requires in general performing a convolution operation, an efficient approximation based on normal expansion of the predicted measurement distribution was developed.
²⁴ The use of an LMMSE-based EKF, rather than the MMSE filter, for Laplacian noise was motivated by its great computational advantage at an acceptable loss of accuracy as compared with the exact nonlinear MMSE filter (derived therein).
²⁵ Target motion and glint seem coupled due to the target attitude changes during maneuver on/off that could cause glint to appear/disappear or change.
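To make the Gaussian-Laplacian glint model concrete, the following minimal sketch evaluates the mixture density and the posterior probability that a given error sample came from the heavy-tailed Laplacian component. The weight and the two standard deviations are illustrative assumptions, not values taken from [360]; the function names are hypothetical.

```python
import math


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def laplacian_pdf(x, sigma):
    """Zero-mean Laplacian density with standard deviation sigma (scale b = sigma / sqrt(2))."""
    b = sigma / math.sqrt(2.0)
    return math.exp(-abs(x) / b) / (2.0 * b)


def glint_pdf(x, eps=0.1, sigma_g=1.0, sigma_l=10.0):
    """Mixture: weight (1 - eps) on the Gaussian, small weight eps on the Laplacian."""
    return (1.0 - eps) * gaussian_pdf(x, sigma_g) + eps * laplacian_pdf(x, sigma_l)


def glint_variance(eps=0.1, sigma_g=1.0, sigma_l=10.0):
    """Variance of the zero-mean mixture (weighted sum of component variances)."""
    return (1.0 - eps) * sigma_g ** 2 + eps * sigma_l ** 2


def laplacian_component_probability(x, eps=0.1, sigma_g=1.0, sigma_l=10.0):
    """Posterior probability that the error sample x came from the Laplacian component."""
    return eps * laplacian_pdf(x, sigma_l) / glint_pdf(x, eps, sigma_g, sigma_l)
```

Small errors are attributed almost entirely to the Gaussian component and large ones to the Laplacian component, which is the intuition behind using one measurement model per component in an IMM configuration.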
The performance evaluation over two scenarios from the benchmark problem of [49] showed a significant advantage of this layered IMM over the algorithm of [362] in terms of noise reduction, faster response to mode changes, and better mode identification. A possible enhancement here is to replace the EKFs in [84] by better filters, such as those based on measurement conversion (see [206]) or the approximate best linear unbiased filters for polar/spherical measurements of [372]. For a 2D homing missile scenario, [317] implemented an IMM algorithm with two decoupled models for range and bearing in Gaussian and Laplacian noise, respectively, as in [84]. In this setting the measurement equations are linear. Another study of target tracking in glint was presented in [331]. It employed mixture reduction techniques [300] for MM estimation.

c) Multipath: Multipath propagation effects arise in radar or sonar tracking, especially when the target is in close vicinity of a reflecting surface. For example, due to the combination of the return from a low elevation target and sea-surface reflected returns, the measurement error can be huge relative to the “normal” ones. The effect is very complex and may be devastating to a tracker that assumes normal and uncorrelated measurement errors. An elegant solution to this problem based on the IMM method was proposed in [17]. As shown therein analytically, this is essentially a hybrid estimation problem due to the jumpwise behavior of the multipath error process arising on top of the standard measurement error process. It was demonstrated that the multipath effect can be successfully alleviated by an IMM mechanism with a “no multipath” model and a first-order autocorrelated multipath model, without the need for a detailed physical model. Somewhat related, unknown noise can be identified by an IMM algorithm, as proposed in [200] and [20].

VI. THE THIRD GENERATION: VARIABLE-STRUCTURE MM ESTIMATION

A. Theoretical Foundation of VSMM Estimation

The first two generations have a fixed structure (FS) in the sense that they use a fixed set of models throughout the time, as reflected by Assumption A2. The third generation abandons A2, resulting in a variable structure, hence the name variable-structure MM (VSMM) estimation. Although not necessary, many VSMM algorithms rely on Assumption A1′ that the mode sequence is Markov or semi-Markov, mostly because the second-generation algorithms serve as building blocks of the VSMM algorithms.

State Dependency of Mode Set: A key concept in VSMM estimation is the state dependency of a mode set [191, 201, 192]. Simply put, given the current mode (and base state), the set of possible modes at the next time is a subset of the mode space, determined by the mode transition law. Consider tracking a car with three models: straight ($m^{(1)} = 1$), left turn ($m^{(2)} = 2$), and right turn ($m^{(3)} = 3$). Initially it goes straight on a street at $k = 1$; it arrives at a four-way intersection at $k = 10$, where it could go straight or take a left or right turn; at $k = 11$, if it took a left turn at $k = 10$ the car could either go straight or continue the left turn (making it a U turn); then it goes straight until it enters into an open space at $k = 20$, where any motion mode could be converted to any other. The state-dependent mode sets through $k = 20$ are

$$S_1^{(1)} = \cdots = S_9^{(1)} = \{1\}, \quad S_{10}^{(1)} = \{1, 2, 3\}, \quad S_{11}^{(1)} = \{1\},$$
$$S_{11}^{(2)} = \{1, 2\}, \quad S_{11}^{(3)} = \{1\}, \quad \ldots, \quad S_{20}^{(i)} = \{1, 2, 3\}, \quad i = 1, 2, 3.$$

The sequence of possible mode sets through $k = 20$ is

$$S^{20} = \{S_1, \ldots, S_{10}, S_{11}, S_{12}, \ldots, S_{20}\} = \{\{1\}, \ldots, \{1, 2, 3\}, \{1, 2\}, \{1\}, \ldots, \{1, 2, 3\}\}$$

where $S_k = \bigcup_i S_k^{(i)}$ is the union of state-dependent mode sets at $k$. Note that the set of possible modes at time $k$ depends on mode $s_{k-1}$ in effect at $k - 1$ and the base state $x_{k-1}$ and $x_k$. It was shown in [191], [201], and [192] that an MM estimator cannot be optimal if at some time $k$ it uses a model set $M_k$ different from the mode space $S_k$. As such, the use of a fixed model set, say, $M = \{1, 2, 3\}$, is clearly not preferable for this example.

The second generation abandons the constant mode assumption of the first generation; instead it imposes a Markov-type property on the mode sequence. Somewhat similarly, the third generation abandons the constant mode-space assumption of the first two generations and explores the state dependency of the mode set. Clearly, the state dependency of the mode set cannot be described by the mode set itself. That is why a graph-theoretic formulation of the MM estimation was proposed in [198], [201], and [192], where a mode and a possible transition from one mode to another are represented by a node and a directed edge, respectively, resulting in a directed graph (digraph) as a representation of a mode set and its associated state dependency. This formulation has certain advantages, as elaborated in [201] and [192], and is the basis of a class of VSMM algorithms [215, 195, 346].

Optimal VSMM Estimation: As presented in [198], [191], [201], and [192], the MMSE-optimal VSMM estimator is given by

$$\begin{aligned}
\hat{x}_{k|k} &= E[E[x_k \mid S^k, z^k] \mid z^k] = \sum_{S^k} \hat{x}_{k|k}^{(S^k)} P\{S^k \mid z^k\} \\
P_{k|k} &= \sum_{S^k} [P_{k|k}^{(S^k)} + (\hat{x}_{k|k} - \hat{x}_{k|k}^{(S^k)})(\hat{x}_{k|k} - \hat{x}_{k|k}^{(S^k)})'] P\{S^k \mid z^k\}
\end{aligned} \tag{41}$$

where $\hat{x}_{k|k}^{(S^k)}$ and $P_{k|k}^{(S^k)}$ are the optimal estimate and its MSE matrix, respectively, at time $k$ assuming that the true mode-set sequence is $S^k$, given by

$$\begin{aligned}
\hat{x}_{k|k}^{(S^k)} &= E[E[x_k \mid s^k, S^k, z^k] \mid S^k, z^k] = \sum_{s^k \in S^k} \hat{x}_{k|k}^{(s^k)} P\{s^k \mid S^k, z^k\} \\
P_{k|k}^{(S^k)} &= \sum_{s^k \in S^k} [(\hat{x}_{k|k}^{(S^k)} - \hat{x}_{k|k}^{(s^k)})(\hat{x}_{k|k}^{(S^k)} - \hat{x}_{k|k}^{(s^k)})' + P_{k|k}^{(s^k)}] P\{s^k \mid S^k, z^k\}
\end{aligned} \tag{42}$$

where $\hat{x}_{k|k}^{(s^k)}$ is the optimal estimate at time $k$ assuming the true mode sequence is $s^k$, and $P_{k|k}^{(s^k)}$ is its MSE matrix.
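The car example used above to illustrate the state dependency of the mode set can be reproduced by propagating the set of possible modes through a time-varying transition law. The sketch below hard-codes that law as a function; the function names and the encoding of the road layout by time index are assumptions made only for illustration (in practice the law would depend on the base state, e.g., the position on a road map).

```python
def next_modes(k, prev_mode):
    """Hypothetical transition law: modes possible at time k given prev_mode at k - 1.

    Modes: 1 = straight, 2 = left turn, 3 = right turn.
    """
    if k == 10:
        # Four-way intersection reached while going straight.
        return {1, 2, 3} if prev_mode == 1 else {1}
    if k == 11:
        # After a left turn the car may go straight or continue turning (U turn).
        return {1, 2} if prev_mode == 2 else {1}
    if k == 20:
        # Open space: any mode can follow any other.
        return {1, 2, 3}
    # On a street: only straight motion is possible.
    return {1}


def propagate_mode_sets(k_max, initial_modes=(1,)):
    """Return {k: S_k}, where S_k is the union of the state-dependent mode sets at k."""
    mode_sets = {}
    previous = set(initial_modes)
    for k in range(1, k_max + 1):
        current = set()
        for mode in previous:
            current |= next_modes(k, mode)
        mode_sets[k] = current
        previous = current
    return mode_sets


sets = propagate_mode_sets(20)
# sets[10] == {1, 2, 3}, sets[11] == {1, 2}, sets[12] == {1}, sets[20] == {1, 2, 3}
```

The function `next_modes` plays the role of the directed edges in the digraph formulation: an edge from mode i to mode j exists at time k exactly when j is in `next_modes(k, i)`.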
The summations in (41)–(42) are over all mode-set sequences such that every possible mode sequence is in one and only one $S^k$. Note, however, that a mode-set sequence may contain more than one possible mode sequence. The optimal VSMM estimator in the MAP sense is given by $\hat{x}_{k|k} = \arg\max_{x_k} f(x_k \mid z^k)$, where

$$f(x_k \mid z^k) = \sum_{S^k} f(x_k \mid S^k, z^k) P\{S^k \mid z^k\} = \sum_{S^k} \sum_{s^k \in S^k} f(x_k \mid s^k, z^k) P\{s^k \mid S^k, z^k\} P\{S^k \mid z^k\}$$

is a mixture density, each component $f(x_k \mid S^k, z^k)$ of which is itself a mixture density.

RAMS Approach: The optimal VSMM estimator is computationally infeasible. For most applications, its higher level with multiple model-set sequences should be replaced, due to limited computational resources, by a single (hopefully “best”) model-set sequence, obtained in practice through model-set adaptation in a recursive manner. This is the recursive adaptive model-set (RAMS) approach [196, 198, 191, 201, 195]. In general, each recursion of a RAMS algorithm has two tasks.

1) Model-set adaptation determines at each time the model set to use for the MM estimation, utilizing posterior information contained in the data as well as prior knowledge. This is unique to VSMM estimation. Different RAMS algorithms differ from each other primarily with respect to how the model set adapts.

2) Model-set sequence conditioned estimation intends to provide the best possible estimates given a model-set sequence. It consists of a) initialization (assigning initial probabilities to new models and initializing the filters based on them), which is absent in the first two generations, and b) cooperation strategies and conditional filtering, similar to those of the first two generations.

Model-Set Adaptation: Model-set adaptation can be decomposed as model-set expansion and model-set reduction [196, 195].
This decomposition has several significant advantages over naive model-set switching in terms of tractability, performance, and generality [196, 224, 195]. Model-set expansion is often more important than model-set reduction: Although inclusion of an impossible model could be as bad as missing a possible model, the performance of an MM algorithm will suffer greatly if a highly likely model is missed, but only slightly if a highly unlikely model is included. As a result, a delay in including the correct model will always result in significant performance deterioration, while a delay in terminating an incorrect model usually does not incur great performance loss if the correct model is in the set. Unfortunately, model-set expansion is in general much more difficult than model-set reduction.

Both expansion and reduction of a model set require two functional tasks: model-set “candidation”, which determines candidate sets for expansion or reduction, and model-set decision, which selects and retains the best candidate set(s). Model-set candidation for expansion amounts to activation or generation of a set of new models, which is the main task of each model-set adaptation algorithm, discussed in Section VID. This candidation is much easier for model-set reduction than expansion.

B. Model-Set Decision Given Candidate Sets

Model-set decision may be formulated as a statistical decision problem, in particular, a problem of testing statistical hypotheses in a sequential setting, which is natural since observations are available sequentially, and beneficial in terms of decision delay and threshold determination. As hypotheses are always assumed fixed in the standard theory of hypothesis testing, in this subsection, the true mode $s$ in effect and the model sets are assumed constant during the time period over which the test is performed.

Model-Set Likelihood and Probability: Since the task is to decide on the right model set, the probabilities and/or likelihoods of the model sets involved are naturally of major interest. The marginal likelihood of a model set $M$ at time $k$ is the sum, over all the models $m^{(i)}$ in $M$, of the predicted probability $P\{m_k^{(i)} \mid s \in M, z^{k-1}\}$ times the marginal likelihood $f[\tilde{z}_k \mid s = m^{(i)}, z^{k-1}]$ [196, 224, 195]:

$$L_k^{M} \triangleq f[\tilde{z}_k \mid s \in M, z^{k-1}] = \sum_{m^{(i)} \in M} f[\tilde{z}_k \mid s = m^{(i)}, z^{k-1}] P\{s = m^{(i)} \mid s \in M, z^{k-1}\}$$

where $\tilde{z}_k$ is the measurement residual. The joint likelihood of the model set $M$ is defined as $L_M^k \triangleq f[\tilde{z}^k \mid s \in M]$. Let $\Lambda^k$ be the joint likelihood ratio of model set $M_1$ to $M_2$, which is equal to the product of model-set marginal likelihood ratios under mild conditions [196]:

$$\Lambda^k \triangleq \frac{L_{M_1}^k}{L_{M_2}^k} = \prod_{k_0 \le \kappa \le k} \frac{L_\kappa^{M_1}}{L_\kappa^{M_2}}$$

where $k_0$ is the test starting time. The (posterior) probability that the true mode is in $M$ is defined by

$$\mu_k^{M} \triangleq P\{s \in M \mid s \in \mathbf{M}, z^k\} = \sum_{m^{(i)} \in M} P\{m_k^{(i)} \mid s \in \mathbf{M}, z^k\} = \sum_{m^{(i)} \in M} \mu_k^{(i)}$$

which is the sum of the probabilities of all modes in $M$, where $\mathbf{M}$ is the union of all model sets under consideration, including $M$ as a subset. The mode probability $\mu_k^{(i)}$ is available from an MM estimator using $\mathbf{M}$.

Several hypothesis tests were proposed in [196], [195], and [214] and applied in [224], [219], [195], [204], [203], [222], and [346] for model-set decision given candidate sets based on model-set likelihoods or probabilities.

Model-Set Decision Given Two Model Sets: Assume $s \in \mathbf{M}$. Consider the problem of choosing between two model sets $M_1$ and $M_2$, that is, testing

$$H_1: s \in M_1 \quad \text{versus} \quad H_2: s \in M_2$$

in a sequential setting, where their union $\mathbf{M} = M_1 \cup M_2$ is used before a decision is made. Note that $M_1$ and $M_2$ may include common models. This problem is solved optimally in [196], [195], and [214] by the following model-set sequential likelihood ratio test (MS-SLRT) for some thresholds $A$ and $B$: choose $M_1$ when $\Lambda^k \ge B$; choose $M_2$ when $\Lambda^k \le A$; otherwise use $\mathbf{M}$, go to the next time cycle, ask for one more measurement, and continue to test. This test is optimal in the sense of making quickest decisions with guaranteed conditional decision error bounds

$$P\{\text{Choose } M_2 \mid s \in M_1\} \le \alpha, \quad P\{\text{Choose } M_1 \mid s \in M_2\} \le \beta, \quad 0 < \alpha, \beta < 1$$

for any given $\alpha$ and $\beta$. Replacing the likelihood ratio $\Lambda^k$ in the above by the probability ratio $P^k = \mu_k^{M_1}/\mu_k^{M_2}$ yields the model-set sequential probability ratio test (MS-SPRT) [196, 195, 214]. It is optimal in the sense of making quickest decisions with guaranteed joint decision error bounds

$$P\{\text{Choose } M_2, s \in M_1\} \le \alpha, \quad P\{\text{Choose } M_1, s \in M_2\} \le \beta, \quad 0 < \alpha, \beta < 1.$$

The thresholds $A$ and $B$ are given approximately by $A = \beta/(1 - \alpha)$, $B = (1 - \beta)/\alpha$. Clearly, MS-SLRT and MS-SPRT can be used to answer such important questions as “Which model set is better to use, $M_1$ or $M_2$?” and “Is it better to delete a subset $M_1$ from the current model set $M$?” They are also the basis for solutions of problems involving more than two model sets.

Model-Set Decision Given More Than Two Model Sets: Consider the problem of whether it is better to add one of the model sets $M_1, \ldots, M_N$ to the current set $M$. This can be formulated as testing

$$H_0: s \in M \quad \text{versus} \quad H_1: s \in M_1 \quad \cdots \quad \text{versus} \quad H_N: s \in M_N$$

in a sequential setting, where $\mathbf{M}$ is used before a decision is made. The following multiple model-set sequential likelihood ratio test (MMS-SLRT) was proposed in [196], [195], and [214] as a solution to this problem.

S1. Perform $N$ MS-SLRTs simultaneously for $N$ pairs of hypotheses ($H_0: s \in M$ versus $H_1: s \in M_1$), ..., ($H_0: s \in M$ versus $H_N: s \in M_N$). These tests are one-sided in the sense that $H_0$ is never rejected, which is implemented by using thresholds $B = (1 - \beta)/\alpha$ (or $B = 1/\alpha$) and $A = 0$. This step ends when only one of the hypotheses $H_1, H_2, \ldots, H_N$ remains.
Specifically, reject all $M_i$ for which $\Lambda^k \triangleq L_M^k / L_{M_i}^k \ge B$ and continue to the next time cycle to test for the remaining pairs with one more measurement until only one of the hypotheses $H_1, H_2, \ldots, H_N$, say $H_j$, is not rejected.

S2. Perform an MS-SLRT to test $H_0: s \in M$ versus $H_j: s \in M_j$, where $H_j$ is the winning hypothesis in S1.

With a slight modification, this test can be used to answer other important questions, such as “Is it better to delete one of the model sets $M_1, \ldots, M_N$ from the current set $M$?”.

Probably the most versatile test developed so far for model-set decision is the mode-set probability sequential ranking test (MSP-SRT) [196, 195, 214]. It is based on ranking of mode-set probability: At each time $k$, rank all $N_k$ of the model sets $M_1, M_2, \ldots, M_N$ that have survived (i.e., not yet rejected or accepted) by time $k$ as $M_{(1)}, M_{(2)}, \ldots, M_{(N_k)}$ such that their mode-set probabilities $\mu_k^{M_{(i)}} \triangleq P\{s \in M_{(i)} \mid z^k\}$ are in a decreasing order:

$$\mu_k^{M_{(1)}} \ge \mu_k^{M_{(2)}} \ge \cdots \ge \mu_k^{M_{(N_k)}}.$$

Then, a sequential decision is made by comparing a mode-set probability ratio $P^k$ with a pair of thresholds $A$ and $B$, where $P^k = \mu_k^{M_{(i)}} / \mu_k^{M}$ if the current model set $M$ is involved, such as to answer the question “Is it better to add/delete some (unknown number) of the model sets $M_1, M_2, \ldots, M_N$ to/from $M$?”; otherwise, $P^k = \mu_k^{M_{(i)}} / \mu_k^{M_{(1)}}$, such as for the problems of choosing one, $L$ (known), or some (unknown) out of the model sets $M_1, M_2, \ldots, M_N$.

Alternative solutions were also presented in [196] and [195], where the MS-SLRTs and model-set likelihood ratio $\Lambda^k$ in MMS-SLRT are replaced by MS-SPRTs and mode-set probability ratio $P^k$, respectively, and $P^k$ in MSP-SRT is replaced by $\Lambda^k$. In addition, a so-called multiple-level test was also presented in [196]. See [196], [195], and [214] for more details, along with simulation results of testing model sets with some simple models typically used in maneuvering target tracking.
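The two-set test (MS-SLRT) described in this subsection can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the implementation of [196]: the per-cycle model likelihoods and predicted mode probabilities are assumed to come from an MM estimator running the union of the two sets, the thresholds use the approximate formulas quoted in the text, and the function names are hypothetical.

```python
import math


def model_set_marginal_likelihood(likelihoods, predicted_probs, model_set):
    """Marginal likelihood of a model set at one time cycle.

    likelihoods[i] and predicted_probs[i] are the model likelihood and the
    predicted mode probability of model i from an estimator using the union
    of all sets; the probabilities are renormalized within model_set.
    """
    norm = sum(predicted_probs[i] for i in model_set)
    return sum(likelihoods[i] * predicted_probs[i] for i in model_set) / norm


def slrt_thresholds(alpha, beta):
    """Approximate thresholds A and B as quoted in the text."""
    return beta / (1.0 - alpha), (1.0 - beta) / alpha


def ms_slrt(marginal_ratios, alpha=0.05, beta=0.05):
    """Sequentially test M1 against M2.

    marginal_ratios yields, per time cycle, the ratio of the marginal
    likelihood of M1 to that of M2. Returns (decision, cycles used), where
    decision is "M1", "M2", or "undecided" if the data run out first.
    """
    A, B = slrt_thresholds(alpha, beta)
    log_ratio = 0.0  # log of the joint likelihood ratio
    cycles = 0
    for cycles, ratio in enumerate(marginal_ratios, start=1):
        log_ratio += math.log(ratio)
        if log_ratio >= math.log(B):
            return "M1", cycles
        if log_ratio <= math.log(A):
            return "M2", cycles
    return "undecided", cycles
```

While the test is undecided, the estimator keeps running the union of the two sets, so each additional measurement both refines the state estimate and moves the accumulated ratio toward one of the thresholds.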
These tests are general, intuitively appealing, computationally efficient, and easy to implement because they use only model-set likelihoods or probabilities, which are available in MMSE-based MM algorithms if the model sets are already used. Note that an adaptation of the model set is accomplished whenever a model set other than the current one is accepted. As such, model-set adaptation requires a series of hypothesis tests.

C. MM Estimation Given Model-Set Sequence

In this subsection, it is assumed that the sequence of model sets has been determined by, say, model-set adaptation. For simplicity, $M_k$ and $M^k$ are also used to denote the events $\{s_k \in M_k\}$ and $\{s^k \in M^k\}$, respectively, and a perfect match between modes and models is assumed.

Initialization of New Models and Filters: A model is a new one if it is in $M_k$ but not in $M_{k-1}$. Two important questions arise naturally:
1) How to assign initial probabilities to the new models?
2) How to obtain initial estimates and error covariances for the filters based on the new models?
Answers to these questions are essential for the implementation of any MM algorithm of a truly variable structure. Many heuristics and ad hoc treatments have appeared in the literature. In fact, the key to the optimal initialization of new models and the corresponding elemental filters is the state dependency of the mode set, explained in Section VIA. As applied to model and filter initialization here with a single state estimate and mode probability, the optimal assignment of the initial probability to a new model accounts only for the probabilities of those models that may switch to it; and the optimal initial state estimate for a filter based on a (new or old) model is determined only from the estimates (and the probabilities) of the filters based on those models that may switch to that model.
After writing down formulas for the optimal initialization, it can be recognized that they are similar to those in the model-conditioned reinitialization (mixing) step of the IMM estimator (see Table II). This recognition leads to the VS-IMM recursion (Table III), presented in [193] and [195]. It gives a generic recursion for VSMM estimation based on a time-varying model set. It was shown in [193] that the VS-IMM recursion is optimal in the MMSE sense for a Markov jump-linear system under the following two fundamental assumptions of the RAMS approach (of zero depth):

$$P\{m_k^{(i)} \mid M^k, z^k\} = P\{m_k^{(i)} \mid M_k, M_{k-1}, z^k\},$$
$$E[x_k \mid m_k^{(i)}, z^k] = E[x_k \mid m_k^{(i)}, M_{k-1}, z^k] \quad \forall\, m^{(i)} \in M_k$$

and the linear-Gaussian assumption of the Kalman filter given the system mode.

TABLE III
VS-IMM Recursion

1. Model-set conditioned (re)initialization [$\forall\, m^{(i)} \in M_k$]:
   predicted mode probability: $\mu_{k|k-1}^{(i)} \triangleq P\{m_k^{(i)} \mid M_k, M_{k-1}, z^{k-1}\} = \sum_{m^{(j)} \in M_{k-1}} \pi_{ji} \mu_{k-1}^{(j)}$
   mixing weight: $\mu_{k-1}^{j|i} \triangleq P\{m_{k-1}^{(j)} \mid m_k^{(i)}, M_{k-1}, z^{k-1}\} = \pi_{ji} \mu_{k-1}^{(j)} / \mu_{k|k-1}^{(i)}$
   mixing estimate: $\bar{x}_{k-1}^{(i)} \triangleq E[x_{k-1} \mid m_k^{(i)}, M_{k-1}, z^{k-1}] = \sum_{m^{(j)} \in M_{k-1}} \hat{x}_{k-1|k-1}^{(j)} \mu_{k-1}^{j|i}$
   mixing covariance: $\bar{P}_{k-1}^{(i)} = \sum_{m^{(j)} \in M_{k-1}} [P_{k-1|k-1}^{(j)} + (\bar{x}_{k-1}^{(i)} - \hat{x}_{k-1|k-1}^{(j)})(\bar{x}_{k-1}^{(i)} - \hat{x}_{k-1|k-1}^{(j)})'] \mu_{k-1}^{j|i}$

2. Model-conditioned filtering [$\forall\, m^{(i)} \in M_k$]:
   predicted state: $\hat{x}_{k|k-1}^{(i)} \triangleq E[x_k \mid m_k^{(i)}, M_{k-1}, z^{k-1}] = F_{k-1}^{(i)} \bar{x}_{k-1}^{(i)} + G_{k-1}^{(i)} \bar{w}_{k-1}^{(i)}$
   predicted covariance: $P_{k|k-1}^{(i)} = F_{k-1}^{(i)} \bar{P}_{k-1}^{(i)} (F_{k-1}^{(i)})' + G_{k-1}^{(i)} Q_{k-1}^{(i)} (G_{k-1}^{(i)})'$
   measurement residual: $\tilde{z}_k^{(i)} \triangleq z_k - E[z_k \mid m_k^{(i)}, M_{k-1}, z^{k-1}] = z_k - H_k^{(i)} \hat{x}_{k|k-1}^{(i)} - \bar{v}_k^{(i)}$
   residual covariance: $S_k^{(i)} = H_k^{(i)} P_{k|k-1}^{(i)} (H_k^{(i)})' + R_k^{(i)}$
   filter gain: $K_k^{(i)} = P_{k|k-1}^{(i)} (H_k^{(i)})' (S_k^{(i)})^{-1}$
   updated state: $\hat{x}_{k|k}^{(i)} \triangleq E[x_k \mid m_k^{(i)}, M_{k-1}, z^k] = \hat{x}_{k|k-1}^{(i)} + K_k^{(i)} \tilde{z}_k^{(i)}$
   updated covariance: $P_{k|k}^{(i)} = P_{k|k-1}^{(i)} - K_k^{(i)} S_k^{(i)} (K_k^{(i)})'$

3. Mode probability update [$\forall\, m^{(i)} \in M_k$]:
   model likelihood: $L_k^{(i)} \triangleq p[\tilde{z}_k^{(i)} \mid m_k^{(i)}, M_{k-1}, z^{k-1}] \overset{\text{assume}}{=} \mathcal{N}(\tilde{z}_k^{(i)}; 0, S_k^{(i)})$
   mode probability: $\mu_k^{(i)} \triangleq P\{m_k^{(i)} \mid M_k, M_{k-1}, z^k\} = \dfrac{\mu_{k|k-1}^{(i)} L_k^{(i)}}{\sum_{m^{(j)} \in M_k} \mu_{k|k-1}^{(j)} L_k^{(j)}}$

4. Fusion:
   overall estimate: $\hat{x}_{k|k} \triangleq E[x_k \mid M_k, M_{k-1}, z^k] = \sum_{m^{(i)} \in M_k} \hat{x}_{k|k}^{(i)} \mu_k^{(i)}$
   overall covariance: $P_{k|k} = \sum_{m^{(i)} \in M_k} [P_{k|k}^{(i)} + (\hat{x}_{k|k} - \hat{x}_{k|k}^{(i)})(\hat{x}_{k|k} - \hat{x}_{k|k}^{(i)})'] \mu_k^{(i)}$

The VS-IMM recursion automatically initializes all new models and filters “optimally”: All new models are assigned the optimal initial probabilities and the filters based on these models are initialized with the “optimal” initial conditions (estimates and error covariances). This VS-IMM recursion is almost identical to one cycle of the IMM algorithm (compare Tables III and II). It is a natural extension of the IMM algorithm given a time-varying model set. It is extremely useful for VSMM estimation because of its cost-effectiveness, efficiency, and applicability. Other than the model sets $M_k$ and $M_{k-1}$ and the transition law between their models, it requires exactly the same thing as the IMM algorithm does. This recursion is used in most VSMM algorithms developed so far. Another nice feature of the VS-IMM recursion is that, as shown in [193], it uses the transition probabilities $\pi_{ij}$ with respect to the total set $\mathbf{M}$, rather than $M_k$ or $M_{k-1}$. Were this not true, each possible model set would require a distinct design of the corresponding set of transition probabilities.

Fusion of Two MM Estimates: A question important for MM estimation is the following: Given two separate MM estimates based on two model sets, respectively, how can the estimate based on all models in these two sets be obtained?
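As a concrete illustration of Table III, the following minimal sketch runs one VS-IMM cycle for a scalar state. It is a simplified instance under stated assumptions (scalar state and measurement, zero-mean noises, unit noise gain, time-invariant H and R), and the function name, data layout, and numerical values are hypothetical. Note that the transition probabilities are indexed over the total model set, and that a model in M_k that is not in M_{k-1} is initialized by the same mixing step as every other model.

```python
import math


def vs_imm_cycle(z, previous, models_k, trans, H=1.0, R=1.0):
    """One VS-IMM cycle for a scalar state.

    previous: {j: (x, P, mu)} for the models in M_{k-1}
    models_k: {i: (F, Q)} for the models in M_k
    trans[j][i]: transition probability pi_ji defined over the total set
    Returns ({i: (x, P, mu)} for M_k, overall estimate, overall covariance).
    Assumes every model in M_k can be reached from some model in M_{k-1}.
    """
    unnormalized = {}
    for i, (F, Q) in models_k.items():
        # Step 1: model-set conditioned (re)initialization.
        mu_pred = sum(trans[j][i] * mu for j, (_, _, mu) in previous.items())
        w = {j: trans[j][i] * mu / mu_pred for j, (_, _, mu) in previous.items()}
        x_mix = sum(w[j] * previous[j][0] for j in previous)
        P_mix = sum(w[j] * (previous[j][1] + (x_mix - previous[j][0]) ** 2)
                    for j in previous)
        # Step 2: model-conditioned (Kalman) filtering.
        x_pred = F * x_mix
        P_pred = F * P_mix * F + Q
        residual = z - H * x_pred
        S = H * P_pred * H + R
        K = P_pred * H / S
        x_upd = x_pred + K * residual
        P_upd = P_pred - K * S * K
        # Step 3: Gaussian model likelihood times predicted mode probability.
        likelihood = math.exp(-0.5 * residual ** 2 / S) / math.sqrt(2.0 * math.pi * S)
        unnormalized[i] = (x_upd, P_upd, mu_pred * likelihood)
    total = sum(weight for _, _, weight in unnormalized.values())
    posterior = {i: (x, P, weight / total) for i, (x, P, weight) in unnormalized.items()}
    # Step 4: fusion.
    x_hat = sum(mu * x for x, _, mu in posterior.values())
    P_hat = sum(mu * (P + (x_hat - x) ** 2) for x, P, mu in posterior.values())
    return posterior, x_hat, P_hat


# The model set grows from {1} to {1, 2}; model 2 has much larger process noise.
trans = {1: {1: 0.9, 2: 0.1}, 2: {1: 0.1, 2: 0.9}}
posterior, x_hat, P_hat = vs_imm_cycle(
    z=3.0,
    previous={1: (0.0, 1.0, 1.0)},
    models_k={1: (1.0, 0.01), 2: (1.0, 4.0)},
    trans=trans)
```

In this usage example the newly activated model 2 receives its initial probability and filter state from model 1 alone, the only model in M_{k-1} that can switch to it.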
For example, a model-set adaptation algorithm may decide to add a set $M_2$ of models to the current model set $M_1$ after the estimates based on model set $M_1$ have been obtained. The solution to this problem is the following optimal fusion rule, presented in [193] and [195]. Consider two optimal MM estimators based on a common model-set history $M^{k-1}$ but two distinct model sets $M_1$ and $M_2$ at time $k$, respectively:

$$\{\hat{x}_{k|k}^{(i)}, P_{k|k}^{(i)}, L_k^{(i)}, \mu_{k|k-1}^{(i)}\}_{m^{(i)} \in M_1}, \qquad \{\hat{x}_{k|k}^{(i)}, P_{k|k}^{(i)}, L_k^{(i)}, \mu_{k|k-1}^{(i)}\}_{m^{(i)} \in M_2}$$

where $L_k^{(i)}$ and $\mu_{k|k-1}^{(i)}$ are the model likelihood and predicted model probability, respectively, of model $m^{(i)}$ in set $M_1$ or $M_2$. It was shown in [193] that the optimal MM estimator based on model set $M = M_1 \cup M_2$ at time $k$ and a common history $M^{k-1}$ is given by

$$\hat{x}_{k|k} = \sum_{m^{(i)} \in M} \mu_k^{(i)} \hat{x}_{k|k}^{(i)}$$
$$P_{k|k} = \sum_{m^{(i)} \in M} [P_{k|k}^{(i)} + (\hat{x}_{k|k} - \hat{x}_{k|k}^{(i)})(\hat{x}_{k|k} - \hat{x}_{k|k}^{(i)})'] \mu_k^{(i)}$$

where

$$\mu_k^{(i)} \triangleq P\{m_k^{(i)} \mid M, M^{k-1}, z^k\} = \frac{L_k^{(i)} \mu_{k|k-1}^{(i)}}{\sum_{m^{(j)} \in M} L_k^{(j)} \mu_{k|k-1}^{(j)}}.$$

Note that for a common model $m^{(i)}$ of the two sets, its $\hat{x}_{k|k}^{(i)}, P_{k|k}^{(i)}, L_k^{(i)}, \mu_{k|k-1}^{(i)}$ are identical for the two MM estimators (i.e., they do not depend on which model set is used at $k$). If $M_1$ and $M_2$ have a common model $m^{(j)}$, then $\mu_k^{(i)}$ above can be obtained from all model probabilities $\mu_k^{(i|M_1)} \triangleq P\{m_k^{(i)} \mid s_k \in M_1, z^k\}$ and $\mu_k^{(i|M_2)} \triangleq P\{m_k^{(i)} \mid s_k \in M_2, z^k\}$ of the two MM estimators directly without knowledge of $L_k^{(i)}$ and $\mu_{k|k-1}^{(i)}$:

$$\mu_k^{(i)} = \frac{1}{\alpha} \mu_k^{(i|M_1)}, \quad \forall\, m^{(i)} \in M_1$$
$$\mu_k^{(i)} = \frac{\mu_k^{(j|M_1)}}{\alpha \mu_k^{(j|M_2)}} \mu_k^{(i|M_2)}, \quad \forall\, m^{(i)} \in M_2$$

where

$$\alpha = \sum_{m^{(i)} \in M_1} \mu_k^{(i|M_1)} + \frac{\mu_k^{(j|M_1)}}{\mu_k^{(j|M_2)}} \sum_{m^{(i)} \in (M_2 - M_1)} \mu_k^{(i|M_2)}.$$

Most VSMM algorithms developed so far, including those of [224], [219], [215], [195], [207], [212], and [346], use this optimal fusion rule.

D.
VSMM Algorithms We believe that MM estimation will eventually develop into one of a “kit of tools,” represented by various FS and VS algorithms. Development of good model-set adaptation algorithms is perhaps the most important task in VSMM estimation. As stated before, model-set adaptation consists of model-set candidation and model-set decision given candidate sets. Very general and satisfactory results have been obtained for model-set decision, but no such results are available or even in sight for model-set candidation, which as a result, becomes the main task for each individual model-set adaptation algorithm. In other words, different VSMM algorithms may have the same procedure to select the best set from the candidate sets but they differ from one another primarily in model-set candidation, namely, how the candidate sets are determined. An adaptive structure is a variable structure in which the structure varies via adaptation in real time. Many adaptive structures are possible. They can be classified into two broad families, active model-set and model-set generation [195], depending on whether the total model-set (i.e., the set of all possible models) can be specified as a finite set in advance or not. Active Model-Sets: In the active model-set family [195], the total model-set is finite and can be determined in advance before any measurement is received. At any given time it uses an active or working subset of the total model-set determined adaptively, hence the name. Its underlying idea is somewhat similar to that of the active-set method for constrained optimization problems: At each time some models may be terminated and others may be activated. As outlined in [198], [201], and [192], model-set switching is one of the simplest classes of active model-set structures in which the active set is determined by switching among a number of predetermined subsets of the total model-set. These subsets are the candidate sets for model-set adaptation. 
The switching can be soft as well as hard, similar to soft and hard decision for output processing (see Section IVC). The soft switching assumes that each predetermined subset at any time has a certain probability of having a member model matching the true mode [224]. A hard switching is one based on a set of “hard” rules. The key task with this class of structures is the design of the model subsets, determination of the candidate subsets, and decision procedure for switching. Such a structure, called model-group switching (MGS) algorithm, was presented in [224] and [218] with a comprehensive design example given in [219] and [217], where each “group” represents a certain cluster of closely related system modes, hence the name. This MGS algorithm uses a two-stage hard switching: In addition to the current group, a candidate group is activated first if deemed appropriate by a hard rule, the union of the groups is run, with the help of the optimal fusion rule of two MM estimates in Section VIC, until a decision is made between the two groups by the sequential tests of Section VIB. It runs only one group most of the time and thus provides a substantial saving in computation over an FS algorithm using the total model-set, as demonstrated in [219], [217], and [195]. Different groups may have common models, which facilitate group switching and initialization of newly activated filters. Several designs of the model subset switching (see, e.g., [226], [149], [150], [219], [217], [170]) have been reported for maneuvering target tracking, along with illustrations of their superior cost effectiveness to the FS-IMM algorithm. Another simple class of active model-set structures is called likely-model set (LMS) structure, outlined in [198], [201], and [192]. Simply put, its active set is formed by deleting the models in the total set that are unlikely to match the true mode at the given time. 
In order to follow the true mode that may jump, it must have a mechanism of expanding the active model set, i.e., determination of the candidate sets. There are various possible ways of expansion. One of the most natural ideas is based on the concept of the state dependency of the mode set (see Section VIA): given the current mode, the set of possible modes at the next time is a subset of the mode space, determined by the mode transition law (i.e., the adjacency relations of the modes). A simple implementation is the following, resulting in the so-called LMS algorithm [215, 195]. Identify each model in the model set $M_{k-1}$ in effect at time $k-1$ to be unlikely (e.g., if its probability is below a threshold $t_1$), principal (if its probability exceeds $t_2$), or significant (if its probability is in between $t_1$ and $t_2$). Then, the model set $M_k$ can be obtained as follows: 1) discard the unlikely ones; 2) keep the significant ones; and 3) activate the models to which the principal ones may switch directly. Clearly, the model activation relies on the graph-theoretic representation of the model set [198, 201, 195], as briefly mentioned in Section VIA. The unlikely models in $M_{k-1}$ are those whose ratios of probability to the largest mode probability are below a certain threshold and can be eliminated following the sequential ranking test of Section VIB. Alternatively, a simpler but less accurate way is to delete all the models in $M_{k-1}$ except the $B$ models of the largest probabilities, where $B$ is a constant, determined from computational considerations. As demonstrated in [215] and [195] for a maneuvering target tracking example, this LMS algorithm is somewhat more cost effective than the MGS algorithm of [224] and substantially outperforms the FS-IMM algorithm.
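The three-step LMS set update described above can be sketched in a few lines. This is only an illustrative sketch, not the algorithm of [215] or [195]: the function name `lms_adapt`, the threshold values, the model names, and the adjacency table are all hypothetical, and the sketch assumes that principal models are retained and that a model adjacent to a principal model is activated even if it was just classified as unlikely.

```python
def lms_adapt(probs, adjacency, t1=0.05, t2=0.5):
    """Return the next model set from the probabilities of the current one.

    probs:     dict mapping each model in the current set to its probability
    adjacency: dict mapping a model to the set of models it may switch to
    t1, t2:    hypothetical thresholds for "unlikely" and "principal"
    """
    unlikely = {m for m, p in probs.items() if p < t1}
    principal = {m for m, p in probs.items() if p > t2}
    # Steps 1 and 2: drop the unlikely models and keep the rest
    # (significant models plus principal ones; keeping the principal
    # models is an assumption of this sketch).
    next_set = set(probs) - unlikely
    # Step 3: activate every model a principal model can switch to directly.
    for m in principal:
        next_set |= set(adjacency.get(m, ()))
    return next_set


# Hypothetical example: constant-velocity (CV), constant-acceleration (CA),
# and left/right coordinated-turn (CT) models.
probs = {"CV": 0.70, "CT_left": 0.27, "CT_right": 0.03}
adjacency = {"CV": {"CA", "CT_left"}, "CT_left": {"CV"}, "CT_right": {"CV"}}
next_set = lms_adapt(probs, adjacency)
```

In this example CT_right is discarded as unlikely, CT_left is kept as significant, and CA is activated because the principal model CV can switch to it directly.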
A simplified version of the LMS idea was proposed in [346], where the model set used at any time is the state-dependent set of models that can be switched from a principal model (called minimal submodel set in [346]), including the principal model itself; the significant models are not necessarily kept and the unlikely models are not necessarily deleted. The principal model is identified as either the one with the largest probability and likelihood or the one that is closest to the estimated true mode at the time. Adaptation of the model set then amounts to switching among the state-dependent model sets, determined by the sequential tests of Section VIB. It is substantially more cost effective than the FS-IMM algorithm, as demonstrated in [346].

Still another simple class consists of structures with a hierarchical architecture. The active set in this hierarchical model-set structure consists of hierarchical levels of models [73, 297]. The makeup (i.e., model subset) of a lower level as a candidate set is determined under the guidance of the higher level(s). An MM estimator typically operates at each level but interactions among levels are generally beneficial [297]. If some models that form one or more levels are generated (instead of activated) in real time, the corresponding hierarchical structure may be deemed to belong to the model-set generation family. Not all hierarchical MM algorithms have an adaptive structure. For example, those proposed in [118], [348], [85], and [202] are hierarchical MM algorithms of a fixed structure since the model set used is time invariant, albeit of a hierarchical structure.

Model-Set Generation: In the model-set generation family [195], new models are generated in real time and thus it is impossible to specify the total model-set as a finite set in advance.
A natural idea for model-set generation is to augment the working set $M_k$ of models by one (or more) that match an estimate $\hat{m}_k$ of the true mode at time $k$, leading to the so-called estimated-mode augmentation. The augmented model $\hat{m}_k$ can be an estimate of the mode under any optimality criterion in principle, such as 1) the expected mode (conditional mean)

$$\hat{m}_{k|k}^{\mathrm{MMSE}} = E[s_k \mid s_k \in M_k, z^k] = \sum_{m^{(i)} \in M_k} m^{(i)} P\{s_k = m^{(i)} \mid s_k \in M_k, z^k\}$$

resulting in the expected-mode augmentation [207, 212]; 2) the estimate $\hat{m}_{k|k}^{\mathrm{ML}} = \arg\max_m f(z^k \mid s_k = m)$ that is the model with the largest likelihood, resulting in the ML model augmentation [297]; and 3) the MAP estimate $\hat{m}_k^{\mathrm{MAP}} = \arg\max_m P\{s_k = m \mid z^k\}$, which is the model with the largest posterior probability. A promising alternative is to augment the model set also by the predicted modes, such as

$$\hat{m}_{k+1|k}^{\mathrm{MMSE}} = E[s_{k+1} \mid s_k \in (M_k \cup \hat{m}_{k|k}), z^k]$$
$$\hat{m}_{k+1|k}^{\mathrm{ML}} = \arg\max_m f(z^k \mid s_{k+1} = m)$$
$$\hat{m}_{k+1|k}^{\mathrm{MAP}} = \arg\max_m P\{s_{k+1} = m \mid z^k\}$$

to anticipate the next mode transition, leading to what can be called predicted-mode augmentation. As shown in [207] and [212] theoretically, such an augmentation improves accuracy of MM estimation, which is supported by the demonstrations given in [186], [207], [212], and [297] for a variety of applications. Since the estimated mode is added constantly to the working set $M_k$ of models, the optimal fusion rule of two MM estimators of Section VIC is instrumental here. More generally, the model set $M_k$ can be augmented by two or more models using average modes over a (likely) subset of $M_k$ [207, 212] or models with the largest likelihoods or probabilities. Augmentations based on MMSE, ML, and MAP estimates have distinct characteristics. While the expected mode is in the convex hull formed by models in $M_k$, this is not necessarily so for the MAP and ML mode estimates.
MMSE-based augmentation is limited to the case where all models are in the same vector space (and thus their sum is meaningful) and depends on the current model set $M_k$ used, but clearly allows an iteration $\hat{m}_{k|k}^{[j]} = E[s_k \mid s_k \in M_k^{[j-1]}, z^k]$, $M_k^{[j]} = M_k^{[j-1]} \cup \hat{m}_{k|k}^{[j]}$ to improve the mode estimation. This is not the case for ML and MAP estimates. Note that the mode estimates need not be obtained by filters based on models in $M_k$. For example, an adaptive IMM algorithm for radar tracking of a maneuvering target was proposed in [186] that uses an acceleration model determined by a separate Kalman filter on top of a fixed set of models (i.e., augmented by this model).

Another natural and simple class of adaptive structure is the so-called adaptive grid structure, where a model is represented by a grid point. As outlined in [198], [201], and [192], it quantizes the mode space unevenly and adaptively. It usually starts from a coarse grid and adjusts the grid in real time based on data as well as prior information. The grid adjustment usually includes a local grid refinement over one or more highly likely subsets of the mode space. The possible locally-refined grids form the candidate model sets here, which are not given explicitly though. This structure belongs to the model-set generation family unless all models in all the grid levels can be determined in advance. The problem here is closely related to model-set design, where theoretical results of Section VIIA provide guidelines. Many practical schemes for adaptation of the grid are possible. The algorithms/designs of [185], [120], [244], [271], [105], [150], [344], [306], [114], and [288] are examples of this structure, where illustrations of their superior cost effectiveness to the FS-IMM algorithm were also given.
More specifically, adaptation of an initial coarse grid to a subsequent fine grid was proposed in [120] for an AMM algorithm combined with the PDA filter [19]. Also for an AMM algorithm, [244] presented a filter bank that moves over a predefined fixed grid according to a decision logic, including five versions based on measurement residuals, expected mode, variation in mode estimates, mode probabilities, and error covariance, respectively. This moving-bank method was adopted in several applications [126, 301, 343]. References [198] and [201] suggested employing the expected mode $\hat{m}_k^{\mathrm{MMSE}}$ as the center of an adaptive grid for an example of nonstationary noise identification. A set of target acceleration models was proposed in [270] and [271] where the acceleration of the center model is determined by an additional Kalman filter in a two-stage filtering setting (see [208, Sec. 6.3]). It appears that the performance can be enhanced if the (conditional and/or fused) estimates from the MM estimator are utilized in the two-stage filter as well. Reference [89] proposed to replace the above two-stage filter with a fuzzy Kalman filter characterized by a fuzzified process noise covariance. Reference [105] proposed a moving set of CT models centered around one with a turn rate determined by the magnitude of the acceleration divided by the speed of the target. In [149] and [150], a set $M_k$ of three CT models (left, center, and right) was made adaptive by online adjustment of their assumed turn rates $\omega_k^L$, $\omega_k^C$, and $\omega_k^R$ centered around $\omega_k^C$ based on their posterior mode probabilities $\mu_k^L$, $\mu_k^C$, $\mu_k^R$, with the expected turn rate taken to be $\omega_{k+1}^C$; that is,

$$\omega_{k+1}^C := \hat{\omega}_{k|k} = E[\omega_k \mid z^k, M_k] = \omega_k^L \mu_k^L + \omega_k^C \mu_k^C + \omega_k^R \mu_k^R.$$

As pointed out in [207] and [212], an alternative is to use the predicted turn rate

$$\omega_{k+1}^C := \hat{\omega}_{k+1|k} = E[\omega_{k+1} \mid z^k, M_k] = \omega_k^L \mu_{k+1|k}^L + \omega_k^C \mu_{k+1|k}^C + \omega_k^R \mu_{k+1|k}^R,$$

where $\mu_{k+1|k}$ are predicted mode probabilities.
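The re-centering of the three-model turn-rate grid can be illustrated with a minimal sketch. The function name `recenter_turn_rates` and the fixed `half_width` offset for the left and right models are assumptions made here for illustration; the cited references may place the outer models differently. Passing posterior probabilities gives the expected turn rate, and passing predicted probabilities gives the predicted one.

```python
def recenter_turn_rates(omegas, mus, half_width):
    """Re-center a (left, center, right) turn-rate grid on the mean turn rate.

    omegas:     (omega_L, omega_C, omega_R) assumed turn rates at time k
    mus:        matching mode probabilities (posterior or predicted)
    half_width: hypothetical fixed offset of the outer models from the center
    """
    total = sum(mus)  # normalize in case the probabilities do not sum to 1
    center = sum(w * mu for w, mu in zip(omegas, mus)) / total
    return (center - half_width, center, center + half_width)


# Hypothetical values in rad/s: the right-turn model is currently most likely,
# so the grid shifts toward positive turn rates.
new_omegas = recenter_turn_rates((-0.1, 0.0, 0.1), (0.2, 0.3, 0.5), 0.1)
```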
As presented in [288], choosing $\omega_{k+1}^C := \hat{\omega}_{k|k}$ or $\hat{\omega}_{k+1|k}$ and the corresponding $\omega_{k+1}^L$ and $\omega_{k+1}^R$ by a marginal model-set likelihood ratio test yields improved performance. An enhancement of the motion and sizing of the moving bank was proposed in [344] based on a probabilistic discretization of the mode space locally centered around $\hat{m}_k^{\mathrm{MMSE}}$ using the probability of the normalized measurement residual squared $\tilde{z}^{(i)\prime} (S^{(i)})^{-1} \tilde{z}^{(i)}$ of each elemental filter as a measure of the model-mode mismatch. The “filter spawning” technique proposed in [114] for fault detection and estimation first detects a mode change (by a MAP test), decides on the direction of the new mode, and then locally refined models (grid points) are spawned along that direction with the help of the current mode estimate. Performance superiority of these adaptive-grid structures to the corresponding fixed structures was demonstrated in all the publications above.

Two perturbation model based adaptive grid schemes were proposed in [307], [304], [305], and [306] for ship and aircraft tracking. The first scheme uses a fixed grid but each elemental filter includes the deviation (perturbation) of its assumed mode value from the true mode as a state variable and estimates it, resulting de facto in an adaptive-grid scheme. The second scheme differs from the first one in that the fixed grid is replaced by an adaptive grid, where each grid point is updated by the corresponding estimated perturbation at each recursion. To avoid the masking of the model differences by their corresponding elemental filters, it was proposed in [233] to adjust assumed model parameters in real time to keep the inter-residual distance measure $\epsilon_{ij}' S_{ij} \epsilon_{ij}$ below a threshold, where $\epsilon_{ij} = \tilde{z}^{(i)} - \tilde{z}^{(j)}$ and $S_{ij} > 0$ is a weight matrix. Another implementation using UKF instead of EKF is [382].
Reference [274] reported simulation results of automatically adjusting model parameters by an if-then rule for process noise covariance and heuristic estimation of turn rate based on kinematics. Reference [185] proposed a simplex-directed mathematical programming scheme for an AMM algorithm, where the grid is formed by the vertices of a simplex and updated by certain rules (e.g., replacement of worst vertices by their mirror images and scaling up or down of the simplex) based on mode (vertex) probabilities. Other programming schemes are of course possible here, as discussed next.

Closely related to the adaptive-grid structures, still another class of variable structures relies on optimization techniques. Here, the problem is formulated as that of finding the optimal model set given data. Although applicable to model-set adaptation, it is actually more suitable for model-set design, to be discussed in Section VIIA. Such an algorithm was proposed in [162] based on the genetic algorithm (GA) [137]. This algorithm uses a population of $n$ strings (chromosomes) of real or binary codes, $M = \{m^{(1)}, m^{(2)}, \ldots, m^{(n)}\}$, where each string $m^{(i)}$ (not necessarily an index) represents a possible model. Starting from a random sample uniformly distributed over the mode space as the initial population $M_0$, it runs an AMM algorithm to obtain posterior model probabilities in each generation. The posterior model probability serves as the objective function, known as the fitness function in GA. The next generation is produced by the genetic operations of selection, crossover, and mutation. This process is repeated as desired. Selection is a process by which individual strings are tentatively selected as candidate parents of the next generation based on their fitness values.
It is an implementation of the “survival of the fittest.” Crossover (or recombination) produces the new generation of strings with hopefully improved fitness by randomly selecting mating pairs from the tentative parent pool and crossing over of these pairs (e.g., crossover of ABCDE and abcde to generate ABcde and abCDE). It guarantees the diversity and improvement of each generation and is generally considered the most important and representative GA operation. Mutation is the occasional (with a small probability) random alteration of (the single digit of) the value of a string. Other less popular/fundamental GA operations [124], such as inversion, were not used in [162]. Although numerous possibilities exist, no concrete information was provided therein as to how the operations of selection, crossover, and mutation were implemented, except that the so-called biased roulette wheel selection was hinted at, where a string is selected at random with a probability proportional to its fitness. It was demonstrated in [162] via three simple examples that this GA-based algorithm converges to the true model in dozens of generations, each using only one measurement update and having a population of size 10 (i.e., 10 models), although a more typical size is 50 in most GA applications. The MAP estimate and the associated error covariance were chosen to reinitialize all elemental filters; however, how the prior probabilities of the models in the new generation are assigned is not clear. GA was also suggested in [72] to update the parameters of the entire model set in the context of “mixture of experts.” Other generally applicable optimization techniques (see, e.g., the survey of [175]), such as the popular simulated annealing and tabu search, are potentially applicable here as well. The simplex-directed scheme of [185] and the recursive quadratic programming of [72], [73] also belong to this class in some sense.
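For readers unfamiliar with the three genetic operations named above, the following is a generic textbook-style sketch. It is not the implementation of [162], which, as noted, was not described in detail; the function names, the binary string encoding, and the per-digit bit-flip form of mutation are assumptions chosen purely for illustration.

```python
import random


def roulette_select(population, fitness, rng):
    """Biased roulette wheel: draw parents with probability proportional to fitness."""
    return rng.choices(population, weights=fitness, k=len(population))


def one_point_crossover(a, b, point):
    """Swap the tails of two equal-length strings after position `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each binary digit independently with (small) probability `rate`."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[c] if rng.random() < rate else c for c in bits)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# With all fitness on the middle string, selection returns only that string.
parents = roulette_select(["00", "01", "10"], [0.0, 1.0, 0.0], rng)
# The crossover example from the text: ABCDE x abcde -> ABcde, abCDE.
children = one_point_crossover("ABCDE", "abcde", 2)
```

In an MM setting the fitness passed to `roulette_select` would be the posterior model probabilities produced by the AMM run of the current generation.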
It is possible to include one or more adaptive models or elemental filters, in addition to fixed models, within an adaptive structure. The above estimated-mode augmentation is an example. This makes sense intuitively since the fixed models can rapidly obtain a rough initial estimate for the adaptive models, which fine-tune themselves automatically to yield accurate estimates. This leads to a two-level adaptive structure, meaning that both the models (or their elemental filters) and the model sets are adaptive. It belongs to the model-set generation family in general.

These adaptive structures are easily implementable and more cost-effective than the state-of-the-art second-generation algorithm. They are particularly suitable for different classes of problems and thus are complementary to each other. Their combinations are certainly possible and may be advantageous for certain problems.

E. Tracking Applications

A challenging application tackled very actively in the most recent years is tracking of ground targets, in particular, in a road network. This is usually aided by reports of a ground moving target indicator (GMTI). This problem is characterized mainly by the presence of a large number of constraints on the target motion, depending on the target type as well as the terrain conditions, available in the form of topography information, such as road maps. The existence of these constraints requires the use of a model set that is too large for conventional FSMM estimation algorithms. That is why the only application here of the first two generations of MM algorithms known to us is that of an FS-IMM algorithm to tracking dim ground targets in heavy clutter without any road and terrain constraints by a ground based infrared search and track (IRST) sensor, reported in [38]. The actual mode set for any given target and transitions between modes are naturally time varying and state dependent, making the VSMM method ideally suited to this problem.
To our knowledge the VS-IMM solution is the only effective one to this problem, used by all contractors in the Affordable Moving Surface Target Engagement (AMSTE) study sponsored by the U.S. Defense Advanced Research Projects Agency. A formulation of the problem and a comprehensive VS-IMM solution were first published in [169] and [170]. The specific design implemented the VS-IMM recursive algorithm (see Table III) with an individual model-set adaptation for each target based on its current and predicted state and the known topography of the surveillance region. Main situations that require addition or deletion of a model at each revisit time are: on road/off road motion, motion in junctions/intersections, road entry/exit conditions, and road obscuration. Furthermore, [166] incorporated an additional “stopped-target” model into the total model set to cope with the possible “move-stop-move” evasion strategy of targets since the GMTI is incapable of detecting slow motion or stationary targets. Quite significant advantages of the VS-IMM algorithm over the FS-IMM algorithm were demonstrated by the simulations. Similar VS-IMM approaches and results were also presented in [310] and [275]. References [79] and [53] proposed to combine a VS-IMM estimator with a joint belief-probability data association approach to track, identify, and group multiple moving targets. A variable structure was used to capture the behavior of highly maneuverable targets through the move-stop-move cycle, incorporating features such as motion constraints on road networks and high maneuver terrain. References [9] and [295] incorporated particle filters into the framework of [170] to cope with the non-Gaussianity of the posterior densities when constraints are applied.
The VS-IMM method was employed in [113] and [379] to handle another interesting ground target problem: monitoring the motion of aircraft and vehicles in an airport area based on surface movement radar data. Such surveillance is an essential part of airport movement guidance and control systems. Map information was embedded by incorporating an elegant kinematic constraint technique [314] (e.g., by using the curvature of a taxiway) in the CT model with polar velocity (see [209, Sec. V.B.2]). Its comparison with the EKF and FS-IMM algorithm using synthesized and real data demonstrated again that the VS-IMM algorithm is highly beneficial in terms of mode identification, accuracy, and computational savings. Other, most recent references that study VSMM tracking of ground targets subject to constraints include [377], [375], [383], and [378].

VII. MM ALGORITHM DESIGN ISSUES

A. Model-Set Design

Model-set design and choice/decision are closely related but different. Model-set choice/decision deals with the problem of deciding which set is the best given a family of candidate model sets, discussed in Section VIB. Model-set design does not have a given family of candidate sets in general. It determines the model set to be used for a given problem. Clearly, model-set choice can be viewed as an integral part of model-set design.

Model-set design is the most important issue in the application of MM estimation. The performance of an MM algorithm for a given problem depends largely on the set of models used and the primary difficulty in the application of the MM method is the design of the model set. Numerous publications have appeared in which ad hoc designs were presented, as surveyed in Sections VE and VIIC.

Fig. 9. Minimum distribution-mismatch design of model set $M = \{m^{(1)}, \ldots, m^{(6)}\}$.

There are two types of model-set design: online and offline. Offline design is for the total model set $M$ used by the first two generations or the initial model set in VSMM estimation.
In an FSMM algorithm, the model set used cannot vary and is determined a priori by model-set design. In a VSMM algorithm, the model set in effect at any time is determined by model-set adaptation, discussed in Section VIB, which may be viewed as an online (real-time) design process. This section focuses on offline model-set design.

General Design Methods: Model-set offline design was formulated in [222] and [197] mathematically as a problem of approximating the true mode as a random variable $s$ with a certain distribution by a discrete random variable $m$ (random model) with a certain pmf. The range of the variable $m$ is the model set and the pmf is the initial model probabilities needed for MM estimation. Three general design methods were proposed in [221], [222], and [197] based on this formulation: minimum distribution-mismatch, minimum modal-distance, and moment matching.

The minimum distribution-mismatch design minimizes the mismatch (or distance) $\|F_s - F_m\| = \max_x |F_s(x) - F_m(x)|$ between the cumulative distribution functions (cdf) $F_s(x)$ and $F_m(x)$ of the mode $s$ and model $m$. Given the number of models $M = |\mathbf{M}|$, it was shown in [221], [222], and [197] that this method yields in the scalar case the optimal model set $\mathbf{M} = \{m^{(1)}, \ldots, m^{(M)}\}$ such that $F_m(m^{(i)}) = (2i - 1)/(2M) = F_s(x)|_{x = m^{(i)}}$, meaning that $m^{(i)}$ can be determined as follows: Divide the range of the cdf $F_s(x)$ into $2M$ equal intervals; the value of $x$ such that $F_s(x) = (2i - 1)/(2M)$ is then the optimal location of $m^{(i)}$ (see Fig. 9). This optimal model set has a uniform pmf $p_m(m^{(i)}) = 1/M$. A procedure that uses a minimal number of models given any tolerance on the distribution mismatch was also given in [222] and [223] for the case where $m$ is a vector.

The minimum modal-distance design minimizes the distance $\|s - m\|$ between $s$ and $m$ in the mode/model space, rather than in the space of distribution functions.
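The scalar minimum distribution-mismatch placement, $F_s(m^{(i)}) = (2i - 1)/(2M)$ with uniform pmf $1/M$, amounts to evaluating the inverse cdf of the mode at evenly spaced probability levels. The sketch below illustrates this; the standard Gaussian mode distribution and the function name `min_mismatch_model_set` are assumptions chosen only for illustration (six models are used to mirror Fig. 9).

```python
from statistics import NormalDist


def min_mismatch_model_set(inv_cdf, num_models):
    """Place model i at the (2i - 1)/(2M) quantile of the mode distribution.

    inv_cdf:    inverse cdf (quantile function) of the scalar mode s
    num_models: number of models M
    Returns the model locations and their uniform pmf.
    """
    levels = [(2 * i - 1) / (2 * num_models) for i in range(1, num_models + 1)]
    models = [inv_cdf(p) for p in levels]
    pmf = [1.0 / num_models] * num_models
    return models, pmf


# Assumed mode distribution for illustration: s ~ N(0, 1).
mode_dist = NormalDist(mu=0.0, sigma=1.0)
models, pmf = min_mismatch_model_set(mode_dist.inv_cdf, 6)
```

With this placement the staircase cdf of the model steps from $(i-1)/M$ to $i/M$ at $m^{(i)}$, where the mode cdf equals the midpoint, so the largest cdf gap is $1/(2M)$.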
The problem of (scalar or vector) quantization and data compression [121] is in a sense a special case of the model-set design problem in this formulation. Significant theoretical results for this design were presented in [221], [222], and [197], including the following conditions for a model set to be optimal and properties of an optimal model set. Assume that $S = \{S_1, \ldots, S_N\}$ is a partition of the mode space where $S_i$ is covered by model $m^{(i)}$ exclusively in the sense $\{s \in S_i\} = \{m = m^{(i)}\}$. Then, the following conditions hold for the optimality in the sense of minimizing distance $\|s - m\|^2 = E[(s - m)'(s - m)]$ (and some more general metrics): 1) given any partition $S = \{S_1, \ldots, S_N\}$ of mode space, a model set $\mathbf{M} = \{m^{(1)}, \ldots, m^{(M)}\}$ is optimal if each model $m^{(i)}$ is the centroid (mean) of the corresponding partition member $S_i$: $m^{(i)} = \arg\min_m E[(s - m)'(s - m) \mid s \in S_i]$; 2) given any model set $\mathbf{M} = \{m^{(1)}, \ldots, m^{(M)}\}$, a partition is optimal if and only if points in any partition member $S_i$ are closer to $m^{(i)}$ than to any other $m^{(j)} \in \mathbf{M}$; that is, a point $s$ must be assigned to its nearest neighbor $m^{(i)}$ in $\mathbf{M}$. Simply put, the optimal model set is within the class in which models are located at the centroid (mean) of members of a nearest-neighbor partition of the mode space.

This result suggests iterative procedures to find an optimal model set. For example, we may start with an initial partition of mode space; find a candidate of the model set as the centroid of each partition member; use the nearest-neighbor rule to update the partition; and repeat this process. This centroid model set that covers each $S_i$ by its centroid $m^{(i)} = E[s \mid s \in S_i]$ exclusively has several nice and intuitive properties [221]. For example, the (random) model and mode have the same mean: $E[m] = E[s]$; the modeling error is orthogonal to the model: $E[m(s - m)'] = 0$; the cross power of the mode and model is equal to the power of the model: $E[m's] = E[s'm] = E[m'm]$.
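The iterative procedure suggested above (nearest-neighbor partition followed by a centroid update, repeated) can be sketched for a scalar mode as follows. This is a Lloyd-style illustration under assumptions made here, not a design from the cited references: the mode distribution is represented by random samples from an assumed N(0, 1) law, the initial model locations are arbitrary, and the function name `centroid_model_set` is hypothetical.

```python
import random


def centroid_model_set(samples, initial_models, iterations=50):
    """Alternate nearest-neighbor partitioning and centroid updates.

    samples:        scalar draws standing in for the mode distribution
    initial_models: starting model locations
    Returns model locations and the fraction of samples in each cell,
    both computed from the same (last) partition.
    """
    models = list(initial_models)
    weights = [0.0] * len(models)
    for _ in range(iterations):
        # Nearest-neighbor rule: assign each sample to its closest model.
        cells = [[] for _ in models]
        for s in samples:
            nearest = min(range(len(models)), key=lambda i: (s - models[i]) ** 2)
            cells[nearest].append(s)
        # Centroid rule: move each model to the mean of its cell
        # (an empty cell leaves its model where it was).
        models = [sum(c) / len(c) if c else m for c, m in zip(cells, models)]
        weights = [len(c) / len(samples) for c in cells]
    return models, weights


rng = random.Random(1)
samples = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # assumed mode distribution
models, weights = centroid_model_set(samples, [-2.0, 0.0, 2.0])
model_mean = sum(w * m for w, m in zip(weights, models))
sample_mean = sum(samples) / len(samples)
```

Because each model is the mean of its own cell, `model_mean` matches `sample_mean` up to rounding, which is the sample analogue of the property $E[m] = E[s]$ stated above.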
Many of these results were inspired by those in vector quantization and data compression.

The above designs require knowledge of the distribution of the true mode, which is hard to come by in practice. The moment-matching design matches the moments of the model to those of the mode, which are much more easily available. Let $\bar{s}$ and $C_s$ be the mean and covariance matrix of mode $s$. It was shown in [222], [197], and [221] that we can always use a model set with as few as but not fewer than $\mathrm{rank}(C_s) + 1$ models to match $\bar{s}$ and $C_s$ exactly. A set of concrete moment-matching designs was presented in [221], [222], and [197], including those with minimum number of models, those with symmetric pmfs, and those with equal inter-model distance (called diamond-set designs in [221], [222], and [197] because the model locations form a diamond geometrically). In each of these designs, the probability mass and the location of every model are determined. The simplest possible diamond-set design (with one at the center and six on the first layer) was implemented in [211] and [207] for an example of maneuvering target tracking using MM algorithms. Examples of some of these design methods can be found in [365], [211], [207], [212], [221], [222], and [223].

As pointed out in Section VID, model-set design can be formulated as that of finding the optimal model set given data, based on optimization techniques. Such an algorithm was proposed in [162] based on the GA [137]. Similar to the algorithm described in Section VID, this algorithm uses a population of $N$ strings (vectors), where each string is $M$-dimensional, representing a set of $M$ models. Starting from $N$ random samples uniformly distributed over the mode space as the initial population, it runs $N$ AMM algorithms in parallel for $K$ measurement updates to obtain the probabilities of all models in each generation.
The maximum model probability $\max_i \{P\{s = m^{(i)} \mid s \in \mathbf{M}_j\},\ i = 1, \ldots, M\}$ in each string $\mathbf{M}_j$ serves as the fitness of the string. The next generation is produced by the genetic operations of selection, crossover, and mutation, as described in Section VID. This process is repeated as desired. This algorithm was demonstrated in [162] to converge to the true model in 40 generations, each over a batch of $K = 50$ measurement updates. Note that this GA-based method is applicable only to the case with a known, fixed number of models. As pointed out in Section VID, other generally applicable optimization techniques, such as simulated annealing and tabu search, are potentially applicable here and some would allow a variable number of models.

A key issue here is the choice of objective (fitness) function for optimization. Many objective functions are possible, particularly those discussed in [222], [204], [203], and [223]. The use of model probability calculated within each model set as the fitness function in [162] does not appear desirable since the probability of a model is relative only to others within the set and thus is meaningful for comparison only within a model set but not across different sets. For example, $m^{(1)}$ in the set $\mathbf{M}_1 = \{m^{(1)}, m^{(2)}, m^{(4)}\}$ may have a larger probability than $m^{(3)}$ in the set $\mathbf{M}_2 = \{m^{(1)}, m^{(2)}, m^{(3)}\}$ even if $m^{(3)}$ is closer to the true model: $P\{m = m^{(1)} \mid m \in \mathbf{M}_1\} > P\{m = m^{(3)} \mid m \in \mathbf{M}_2\} > P\{m = m^{(1)} \mid m \in \mathbf{M}_2\}$. A simple way out is to calculate model probabilities over the union of the model sets, obtainable from the model likelihoods in the $N$ AMM estimators, rather than within each model set.

Guidelines for Model-Set Design: Clearly, many criteria/measures for model-set design are possible and their choice is important. An array of such criteria and measures were proposed and discussed in [222], [204], and [203] for different purposes of MM estimation.
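These include: $\|x - \hat{x}_{\mathbf{M}}\|$ for base-state estimation, where $\hat{x}_{\mathbf{M}}$ is an estimate of the base state $x$ using model set $\mathbf{M}$ and $\|u - v\|^2 = E[(u - v)'(u - v) \mid z]$ (footnote 26); $\|s - \hat{s}\|$ for mode estimation, where $\hat{s}$ is the estimate of the mode $s$; rates (probabilities) of correct, incorrect, and no identification for mode identification; $\|\xi - \hat{\xi}_{\mathbf{M}}\|$ for hybrid-state estimation, where $\hat{\xi}_{\mathbf{M}}$ is an estimate of the hybrid state $\xi$ using model set $\mathbf{M}$; and more general information-theoretic measures based on the Kullback-Leibler information and mutual information.

One of the most natural and simplest measures is $\|x - \hat{x}_{\mathbf{M}}\|$ for base-state estimation, formally introduced in [190], [191], and [201] for model-set design. More theoretical results for model-set design are available based on this measure than on other measures. Let $\hat{x}_{\mathbf{M}} = E[x \mid z, \mathbf{S} = \mathbf{M}]$ be the optimal MM estimator obtained by assuming that model set $\mathbf{M}$ is the optimal model set, where $z$ is the data; $\hat{x}_{\mathbf{A}}$, $\hat{x}_{\mathbf{B}}$, etc., are defined likewise for model sets $\mathbf{A}$, $\mathbf{B}$. Given an arbitrary model set $\mathbf{M}$, let $\mathbf{D} = (\mathbf{M} - \mathbf{S}) \cup (\mathbf{S} - \mathbf{M})$ be its symmetric difference from the optimal set $\mathbf{S}$. Note first that it follows from (17) that $\|x - \hat{x}_{\mathbf{B}}\| \le \|x - \hat{x}_{\mathbf{A}}\|$ if and only if $\|\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{B}}\| \le \|\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{A}}\|$, where $\hat{x}_{\mathbf{S}}$ is the optimal MM estimator using the mode space $\mathbf{S}$. It was shown in [191], [201], and [192] that $\|\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{M}}\| = |1 - c|\,\|\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{D}}\|$, where $c = P\{s = m^{(i)} \mid s \in \mathbf{M}\}/P\{s = m^{(i)} \mid s \in \mathbf{S}\}$ for any model $m^{(i)}$ common to $\mathbf{M}$ and $\mathbf{S}$, which implies that use of too many models is as bad as use of too few models.

Moreover, consider the problem of adding an arbitrary model set $\mathbf{A}$ to another arbitrary model set $\mathbf{M}$ without overlap (i.e., $\mathbf{M} \cap \mathbf{A} = \emptyset$). Let $\mathbf{M}' = \mathbf{M} \cup \mathbf{A}$. It was shown in [191], [201], and [192] that set $\mathbf{M}'$ is better than set $\mathbf{M}$ in the sense that $\hat{x}_{\mathbf{M}'}$ has a smaller MSE than $\hat{x}_{\mathbf{M}}$ if and only if

$$r < \frac{\sqrt{b^2 \cos^2\theta + 1 - b^2} - b\cos\theta}{1 - b} \tag{43}$$

where $b = P\{s = m^{(i)} \mid s \in \mathbf{M}'\}/P\{s = m^{(i)} \mid s \in \mathbf{M}\}$ for any model $m^{(i)}$ in $\mathbf{M}$, $r = \|\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{A}}\|/\|\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{M}}\|$, and $\cos\theta = (\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{M}})'(\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{A}})/(\|\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{M}}\|\,\|\hat{x}_{\mathbf{S}} - \hat{x}_{\mathbf{A}}\|)$.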
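Condition (43) is easy to evaluate numerically. The Python sketch below computes its right-hand side and, as a cross-check, compares it with a direct distance test. The direct test rests on an assumption not spelled out above: that the estimate for the enlarged set is the convex combination $b\,\hat{x}_{\mathbf{M}} + (1-b)\,\hat{x}_{\mathbf{A}}$, which is what the total expectation theorem gives when $b$ is read as the posterior probability of $\mathbf{M}$ within $\mathbf{M}'$. Function names are illustrative, and $0 < b < 1$ and $\hat{x}_{\mathbf{A}} \ne \hat{x}_{\mathbf{S}}$ are assumed.

```python
import math


def distance(u, v):
    """Euclidean distance between two estimate vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))


def ratio_bound(b, cos_theta):
    """Right-hand side of (43): the largest admissible distance ratio r."""
    return (math.sqrt(b * b * cos_theta ** 2 + 1.0 - b * b) - b * cos_theta) / (1.0 - b)


def geometry(x_s, x_m, x_a):
    """Return (r, cos_theta) as defined after (43); requires x_a != x_s."""
    d_m = [s - m for s, m in zip(x_s, x_m)]
    d_a = [s - a for s, a in zip(x_s, x_a)]
    norm_m = math.sqrt(sum(v * v for v in d_m))
    norm_a = math.sqrt(sum(v * v for v in d_a))
    cos_theta = sum(p * q for p, q in zip(d_m, d_a)) / (norm_m * norm_a)
    return norm_a / norm_m, cos_theta


def adding_helps(b, x_s, x_m, x_a):
    """Test (43): is the enlarged model set better than the original one?"""
    r, cos_theta = geometry(x_s, x_m, x_a)
    return r < ratio_bound(b, cos_theta)


def adding_helps_direct(b, x_s, x_m, x_a):
    """Cross-check under the assumed convex combination for the enlarged set."""
    x_new = [b * m + (1.0 - b) * a for m, a in zip(x_m, x_a)]
    return distance(x_s, x_new) < distance(x_s, x_m)
```

For instance, with the optimal estimate at the origin, the current estimate at (1, 0), and b = 0.6, a candidate estimate at (-1, 1) passes the test while one at (3, 0) fails it, and the two functions agree in both cases.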
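Note that (43) describes a ball of radius $1/(1 - b)$ centered at $(-b/(1 - b), 0, 0, \ldots, 0)$ if $\hat{x}_{\mathbf{M}}$ and $\hat{x}_{\mathbf{S}}$ are placed at $(1, 0, 0, \ldots, 0)$ and $(0, 0, 0, \ldots, 0)$, respectively. A simple example was given in [222] and [223] that illustrates how this result, which requires knowledge of the optimal MM estimator $\hat{x}_{\mathbf{S}}$, can be used in practice.

The above result holds even if $\mathbf{M}'$ is not a subset of the mode space $\mathbf{S}$. If $\mathbf{M}' \subset \mathbf{S}$ (e.g., when $\mathbf{M}'$ is a set of discrete points of a continuous mode space $\mathbf{S}$ as a parameter space), as shown in [211], assuming $\hat{x}_{\mathbf{A}}$ and $\hat{x}_{\mathbf{M}}$ have uncorrelated estimation errors, set $\mathbf{M}'$ is better than set $\mathbf{M}$ (i.e., adding $\mathbf{A}$ to $\mathbf{M}$ is better) in the sense that $\hat{x}_{\mathbf{M}'}$ has a smaller MSE than $\hat{x}_{\mathbf{M}}$ if and only if the posterior probability of the model set $\mathbf{A}$ is below a threshold:

$$P\{s \in \mathbf{A} \mid s \in (\mathbf{M} \cup \mathbf{A}), z\} < \frac{2\|x - \hat{x}_{\mathbf{M}}\|^2}{\|x - \hat{x}_{\mathbf{M}}\|^2 + \|x - \hat{x}_{\mathbf{A}}\|^2}.$$

This condition always holds if $\|x - \hat{x}_{\mathbf{A}}\| < \|x - \hat{x}_{\mathbf{M}}\|$, which should be the case if $\hat{x}_{\mathbf{A}} =: \hat{x}_{\hat{s}} = E[x \mid z, s = \hat{s}]$ (i.e., augment $\mathbf{M}$ by $\mathbf{A} = \{\hat{s}\}$). This result provides theoretical support for the estimated-mode augmentation (see Section VID) for VSMM estimation, as presented in [211], [207], and [212]. Even if $\hat{x}_{\mathbf{A}}$ is worse than $\hat{x}_{\mathbf{M}}$, $\hat{x}_{\mathbf{M}'}$ can still be better than $\hat{x}_{\mathbf{M}}$ if and only if $P\{s \in \mathbf{A} \mid s \in (\mathbf{M} \cup \mathbf{A}), z\}$ satisfies the above inequality when $E[\tilde{x}_{\mathbf{M}}' \tilde{x}_{\mathbf{A}}] = 0$, or if and only if $E[\tilde{x}_{\mathbf{M}}' \tilde{x}_{\mathbf{A}}] < E[\tilde{x}_{\mathbf{M}}' \tilde{x}_{\mathbf{M}}]$ when $E[\tilde{x}_{\mathbf{M}}' \tilde{x}_{\mathbf{A}}] \ne 0$ [211], where the prime on an estimation error $\tilde{x}$ denotes transpose.

Footnote 26: With this definition of the norm, $\|x - \hat{x}_{\mathbf{M}}\|^2$ is actually the conditional MSE of $\hat{x}_{\mathbf{M}}$.

In order to apply the MM method to problems with uncertain parameter $s$ over space $\mathbf{S}$, two important questions are: 1) which quantity is best selected as the estimatee (i.e., the quantity to be estimated) and 2) how to quantize the parameter space $\mathbf{S}$.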
The following general guideline was presented in [194] for estimatee selection: If the ultimate goal is to estimate a parameter $s$, which is related to another parameter $p$ nonlinearly, then a model set $\{s_1, \ldots, s_M\}$ in the space of $s$ is superior to a model set $\{p_1, \ldots, p_M\}$ in the space of $p$ even if $p$ has a better physical interpretation. For the second question, a procedure to determine the choice of the quantization points $\mathbf{M} = \{m^{(1)}, \ldots, m^{(M)}\}$ was presented in [311], given the number of quantization points $M$. It minimizes $J(\mathbf{M}) = \int_{\mathbf{S}} \|x - \hat{x}_{\mathbf{M}}\|_W^2 f(s)\,ds$, where $W$ is a weighting matrix, specified by the designer, and the pdf $f(s) = 1/\int_{\mathbf{S}} ds$. The resultant choice is optimal in the sense of having the minimum average weighted MSE for the true mode over the set $\mathbf{S}$. In the Gaussian case, this vector minimization problem can be solved numerically in a straightforward fashion. An example was given in [311] that demonstrates its superiority to several heuristic choices, including the simple, popular uniform quantization scheme.

Caution must be exercised in model-set design. For example, there should be enough separation between models so that they are “distinguishable,” “observable,” or “identifiable.” This separation should well exhibit itself in the measurement residuals [145], especially between the filters based on the true model and mismatched models, respectively. Otherwise, the MM estimator will not be selective in terms of choosing the correct model because the residuals have a dominant effect on estimation results. A necessary condition for the effective performance of MM estimation was presented in [70] for a stochastic linear time-invariant system with an uncertain parameter. For a single-input single-output system with an uncertain input bias, the dc gain $G_{dc}$ of the system transfer function from the input to the output (measurement) must be non-zero. This makes sense intuitively.
The steady-state output (measurement residual) $\tilde{z}_{ss}^{(i)}$ is proportional to the dc gain times the bias difference (as the step input), which is the actual input bias $b$ minus the input bias $b^{(i)}$ assumed in filter $i$: $\tilde{z}_{ss}^{(i)} = (I - B)G_{dc}(b - b^{(i)})$, with $B$ depending on the steady-state Kalman filter gain for filter $i$, where $I$ is the identity matrix. For the case of uncertain system matrix parameters but known input $u$, $\tilde{z}_{ss}^{(i)} = (I - B)[G_{dc} - G_{dc}^{(i)}]u$, where $G_{dc}^{(i)}$ is the input gain for filter $i$. For a multiple-input multiple-output system, the necessary condition becomes that each column of the dc gain matrix (difference) must have at least one non-zero element. Other relevant results can be found in, e.g., [23]. Such results are beneficial for performance enhancement of MM estimation, such as those presented in [243].

Model Efficacy: Each model has a certain effective coverage region of the true mode within the model set in use. Knowledge about such relative efficacy is quite useful in model-set design. A concept of relative efficacy of a model in terms of its coverage, along with its quantitative measures, was introduced in [222]. More specifically, a “window” function $w_i(x)$ was introduced to quantify the efficacy of model $m^{(i)}$ in covering the true mode $s = x$ relative to other models in the set, where $w_i(x)$ is a function of $x$. The larger $w_i(x)$ is, the more effective the model $m^{(i)}$ is (relative to other models in the set) given $s = x$. Two versions of $w_i(x)$ were defined in [222]. Consider a model set $\mathbf{M} = \{m^{(1)}, \ldots, m^{(M)}\}$. A probability-based efficacy of model $m^{(i)}$ in $\mathbf{M}$ is $w_i(x) = P\{m = m^{(i)} \mid s = x, m \in \mathbf{M}\}$; that is, $w_i(x)$ is the probability that the random model $m$ will take on the value $m^{(i)}$ given that it has to take on a value in $\mathbf{M}$ and the fact that the true mode $s$ is equal to $x$.
Its effect on model probability is clear through $P\{m = m^{(i)} \mid m \in \mathbf{M}\} = \int w_i(x) f_s(x)\,dx$, where $f_s(x)$ is the pdf of the true mode. Alternatively, the testing-based model efficacy, defined by $w_i(x) = P\{H_i \text{ not rejected} \mid s = x\}/L$, is the probability that $H_i$ is not rejected by an (optimal) test given $s = x$ divided by the number $L$ of hypotheses that are not rejected at the end of the test for the hypothesis testing problem $H_1: m = m^{(1)}$ versus $\cdots$ versus $H_M: m = m^{(M)}$ using all available data. This definition is theoretically equivalent to, but implementationally more advantageous than, the definition $w_i(x) = P\{\text{accept } H_i \mid s = x\}$. A simple example of these two model efficacies can be found in [222]. More related, theoretically elegant results can be found in [386].

B. Determination of Transition Probabilities

Theoretically, post-first-generation MM estimators assume that the transition probability matrix (TPM) governing the mode jumps is completely known. In target tracking, however, it is practically unknown, since it depends critically on the unknown control inputs, or worse, the mode sequence is not really Markov. The determination (design, tuning, adaptation) of the TPM amounts to identifying a Markov transition law that “best” fits the unknown truth, similar to tuning of the process noise covariance $Q$ in Kalman filtering. Fortunately, the performance of MM estimation is not very sensitive to the choice of the TPM provided it is not too far off; but to a certain degree this choice provides a trade-off between the peak estimation errors at the onset and termination of a maneuver and the steady-state errors during CV motion (see, e.g., [199] and [21]).

Offline Design: Traditionally, the TPM has been considered in tracking as a design parameter chosen a priori. Numerous designs and tuning results have been reported in the literature.
Most of them are ad hoc, but some are more or less systematic, including those proposed and studied in [43], [18], [68], [39], [257], and [62]. A simple design of the TPM $\Pi = (\pi_{ij})_{M \times M}$, which appeared as early as in [267], [260], and [123] and has been used by many, is $\pi_{ii} = q$, $\pi_{ij} = (1 - q)/(M - 1)$ for $j \ne i$, $i, j = 1, \ldots, M$, directly in discrete time for some large $q$ (e.g., $q = 0.97$ [11]). Such a design directly in discrete time is questionable for a discretized system with a nonuniform sampling (revisit) interval since the TPM depends on the sampling intervals as well as target behavior (in continuous time).

A more systematic design is based on modeling the Markov jump process in continuous time [43, 68, 39]. It follows from the forward and backward Kolmogorov equations [279] that $\Pi(T) = e^{\Lambda T}$, where $\Lambda = (\lambda_{ij})_{M \times M}$ is the transition density matrix of the process, defined similarly as for $\Pi$, with $\lambda_{ii} < 0$, $\lambda_{ij} > 0$, $i \ne j$, and $\sum_{j=1}^{M} \lambda_{ij} = 0$. The diagonal elements $\lambda_{ii}$ of $\Lambda$ and the mean sojourn time $\bar{\tau}_i$ of mode $m^{(i)}$ are related by $\lambda_{ii} = -1/\bar{\tau}_i$ since the sojourn time $\tau_i$ of a state (mode) $m^{(i)}$ of a Markov jump process has an exponential distribution with parameter $-\lambda_{ii}$. Its direct discrete-time counterpart is $\pi_{ii} = 1 - 1/\bar{\tau}_i$, where $\bar{\tau}_i$ is expressed in discrete time [18, 19]. From $\lambda_{ii} = -1/\bar{\tau}_i$ and $\Pi(T) = e^{\Lambda T}$ it follows approximately that $\pi_{ii} = e^{-T/\bar{\tau}_i}$, which is more widely used, such as in [60] for the design of an IMM-based ATC surveillance system and in [155] for TPM adaptation in a two-model IMM tracker with adaptive sampling for the first benchmark problem [48]. Another design, used in [161], is $\pi_{ii} = 1 - T/\bar{\tau}_i$, which is in fact the above direct discrete-time design but with $\bar{\tau}_i$ in continuous time and is equal to the linear approximation of the approximate continuous-time model $\pi_{ii} = e^{-T/\bar{\tau}_i}$. This was modified to $\pi_{ii} = \max\{q_i, 1 - T/\bar{\tau}_i\}$ in [199] and [19], where $q_i$ is the minimum probability for mode $m^{(i)}$ to stay on, as opposed to jumping to any other mode.
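The offline designs just listed can be compared side by side with a few lines of Python. The sketch below is only illustrative: the function names are made up, and the matrix exponential is a plain scaled Taylor series written out so that the example has no dependencies (a numerical library routine would normally be used instead).

```python
import math


def symmetric_tpm(num_models, q):
    """pi_ii = q and pi_ij = (1 - q)/(M - 1) for j != i."""
    off = (1.0 - q) / (num_models - 1)
    return [[q if i == j else off for j in range(num_models)]
            for i in range(num_models)]


def stay_prob_exponential(T, tau):
    """pi_ii = exp(-T/tau), the approximate continuous-time design."""
    return math.exp(-T / tau)


def stay_prob_linear(T, tau, q_min=0.0):
    """pi_ii = max{q_i, 1 - T/tau}; q_min = 0 gives the plain linear design."""
    return max(q_min, 1.0 - T / tau)


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def tpm_from_rates(rate_matrix, T, terms=20):
    """Pi(T) = exp(Lambda*T) by a scaled Taylor series followed by squaring."""
    n = len(rate_matrix)
    norm = max(sum(abs(v) for v in row) for row in rate_matrix) * T
    squarings = max(0, math.ceil(math.log2(norm / 0.5))) if norm > 0 else 0
    scale = T / (2 ** squarings)
    a = [[v * scale for v in row] for row in rate_matrix]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical two-mode example: mean sojourn times of 10 s and 5 s, T = 1 s.
tau_1, tau_2, T = 10.0, 5.0, 1.0
rates = [[-1.0 / tau_1, 1.0 / tau_1],
         [1.0 / tau_2, -1.0 / tau_2]]
tpm = tpm_from_rates(rates, T)
```

In this two-mode example the exact stay probability from $e^{\Lambda T}$ is slightly larger than $e^{-T/\bar{\tau}_1}$, which in turn is larger than $1 - T/\bar{\tau}_1$, consistent with the text's description of the latter two as successive approximations.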
The choice of $q_i = \mu_\infty^{(i)}$ was suggested in [39], where $\mu_\infty^{(i)}$ is the steady-state probability of mode $m^{(i)}$, independent of the initial mode.

As presented in [68], $\Pi$ can be designed as follows if $\mu_\infty^{(i)}$ is known for each mode. First determine (numerically) $\Lambda$ from the relationship $\lim_{T \to \infty} (e^{\Lambda T})' = \Pi(\infty)' = [\mu_\infty, \mu_\infty, \ldots, \mu_\infty]$, where $\mu_\infty = [\mu_\infty^{(1)}, \mu_\infty^{(2)}, \ldots, \mu_\infty^{(M)}]'$, and then get $\Pi = \Pi(T) = e^{\Lambda T}$ for the given sampling interval $T$. This method was used in [68] and [39] for different IMM configurations with a nonuniform $T$ for air defence system applications. Compared with those based on mean sojourn time $\bar{\tau}_i$ above, this method has pros and cons: It obtains off-diagonal elements $\pi_{ij}$ as well as diagonal elements $\pi_{ii}$, but it relies on knowledge of $\mu_\infty^{(i)}$, which is often harder to come by than $\bar{\tau}_i$, and an asymptotic relationship, based on which results are usually less accurate. This reliance on the asymptotics can be replaced by $\Pi(T') = e^{\Lambda T'}$ if $\Pi(T')$ is known for some $T'$. Note that a simple use of $\pi_{ij}$ and $\pi_{ii}$ from the above two methods would usually violate the requirement $\sum_{j=1}^{M} \pi_{ij} = 1$. It would be better to combine them by solving (approximately, if needed) $\lim_{T \to \infty} (e^{\Lambda T})' = (\mu_\infty, \mu_\infty, \ldots, \mu_\infty)$, or $\Pi(T') = e^{\Lambda T'}$ if $\Pi(T')$ is known, for $\Lambda$ with $\lambda_{ii} = -1/\bar{\tau}_i$, $\forall i$, if possible.

Online Adaptation: The offline design, being completely a priori in nature, does not provide estimates of the TPM using online data. In some cases, prior information about the TPM may be inadequate or lacking. The “unreasonable” need to provide TPM a priori even in the case of insufficient information has been cited by some as one of the main reservations about using Markov-chain-based MM estimation algorithms (see, e.g., [114]). A number of algorithms have been proposed recently in [151], [153], [152], and [91] for online estimation of the TPM.
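To indicate how little computation such online adaptation can involve, the following Python sketch implements one step of the quasi-Bayesian recursion of [151], [153], and [152], whose equations are stated in the discussion that follows. It is a sketch under assumptions rather than the authors' implementation: the function and argument names are hypothetical, `alpha` is initialized with the initial TPM estimate so that each of its rows sums to $k$ before the $k$th update, and `mu` is simply whatever model-probability vector the host MM filter supplies to the recursion.

```python
def quasi_bayes_tpm_step(alpha, k, mu, lik):
    """One quasi-Bayesian TPM update (illustrative sketch).

    alpha : M x M list holding alpha_{k-1}; alpha_0 is the initial TPM estimate
    k     : index of the current update (k >= 1)
    mu    : model probabilities entering the recursion
    lik   : model likelihoods L_k
    Returns (alpha_k, pi_hat_k).
    """
    n = len(alpha)
    # Previous TPM estimate: pi_hat_{k-1} = alpha_{k-1} / k.
    pi_prev = [[a / k for a in row] for row in alpha]
    # Row-wise predicted likelihoods pi_hat^{(i)'} L_k.
    row_lik = [sum(p * l for p, l in zip(row, lik)) for row in pi_prev]
    # Normalizing scalar mu' * Pi_hat_{k-1} * L_k.
    denom = sum(m * r for m, r in zip(mu, row_lik))

    alpha_new = []
    for i in range(n):
        g = [1.0 + mu[i] * (lik[j] - row_lik[i]) / denom for j in range(n)]
        weighted = [alpha[i][j] * g[j] for j in range(n)]
        total = sum(weighted)
        # Each row of alpha gains exactly one unit of "count" per update.
        alpha_new.append([alpha[i][j] + weighted[j] / total for j in range(n)])

    pi_new = [[a / (k + 1) for a in row] for row in alpha_new]
    return alpha_new, pi_new


# Hypothetical two-model example starting from a noninformative TPM.
alpha_0 = [[0.5, 0.5], [0.5, 0.5]]
alpha_1, pi_1 = quasi_bayes_tpm_step(alpha_0, 1, mu=[0.5, 0.5], lik=[2.0, 0.5])
```

In this example the first model explains the measurement better, so the estimated probability of jumping to (or staying in) the first mode rises from 0.5 to 0.575 in both rows, and each row of the estimate still sums to one.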
These algorithms are naturally and easily incorporable into a typical MM (e.g., IMM) estimation algorithm, resulting in TPM-adaptive MM estimation. More specifically, [151], [153], and [152] developed a Bayesian framework and proposed several suboptimal algorithms for recursive MMSE estimation of the TPM starting from an initial estimate $\hat{\Pi}_0 = [\hat{\pi}_0^{(1)}, \ldots, \hat{\pi}_0^{(M)}]'$. Among them, the most cost-effective one, the so-called quasi-Bayesian estimator, assumes a Dirichlet prior distribution of the TPM and its recursion is given by

$$\hat{\pi}_k^{(ij)} = \frac{1}{k+1}\,\alpha_k^{(ij)}, \qquad \hat{\pi}_k^{(i)} = [\hat{\pi}_k^{(i1)}, \ldots, \hat{\pi}_k^{(iM)}]'$$

$$\alpha_k^{(ij)} = \alpha_{k-1}^{(ij)} + \frac{1}{\sum_{j=1}^{M} \alpha_{k-1}^{(ij)} g_k^{(ij)}}\,\alpha_{k-1}^{(ij)} g_k^{(ij)}$$

$$g_k^{(ij)} = 1 + \frac{1}{\mu_k' \hat{\Pi}_{k-1} L_k}\,\mu_k^{(i)} \left[ L_k^{(j)} - \hat{\pi}_{k-1}^{(i)\prime} L_k \right]$$

where $L_k = [L_k^{(1)}, \ldots, L_k^{(M)}]'$ and $\mu_k = [\mu_k^{(1)}, \ldots, \mu_k^{(M)}]'$ are vectors of model likelihoods and probabilities, respectively. This algorithm was shown in [151], [153], and [152] to have reasonable performance at an almost negligible computational expense. Also adopting the Dirichlet prior, [91] derived recursive hybrid estimation schemes with an unknown TPM by obtaining posterior marginal pdfs of the base and modal states analytically. Note that these online adaptation algorithms are more generally applicable than the offline designs.

It is intuitively appealing to combine offline design with online adaptation to take advantage of both prior knowledge and online information: the a priori designed TPM is refined by the online TPM adaptation using online data from the current scenario; the adaptation may be slow if the prior TPM is (nearly) “noninformative” (e.g., uniformly distributed) but could be sped up by a good initial TPM.

C.
Various MM Designs and Performance Studies

Successful application of any particular MM algorithm to a real-world maneuvering target tracking problem largely amounts to design of the model set, ad hoc adjustment of the algorithm, tuning of parameters, and choice of the best trade-off variant by performance evaluation and comparison. The tracking literature is abundant in various studies on model-set design, parameter choice/tuning, and performance evaluation/comparison. Many of them are generic enough to be applied to a wide range of problems and situations, although a “universal” approach best for all applications is impossible. Here we give a brief review of these more generic designs. More problem-specific implementations were addressed in Sections IVE, VE, and VIE.

Most MM algorithm designs follow two basic ideas concerning how maneuvers are modeled: parametric and structural. Parametric designs select one or more parameters to represent the effect of the target maneuvers; each model is characterized by a quantized value of the parameters. Structural designs use models of different structures to describe different maneuvers. All designs include one or more CV models. The reader is referred to [209] for model details.

Parametric Designs: Typical parameters to be quantized are input (acceleration), process noise level, and turn rate. Most common examples are: exponentially correlated acceleration (ECA) model (i.e., the Singer model) with constant mean levels [260, 123, 266], second-order linear kinematic model with multiple acceleration levels (see, e.g., [11]) or with multiple process noise levels (see, e.g., [367] and [298]). In particular, [260] designed a linear target motion model, with unknown input $u_k$ quantized into several known levels $u^{(i)}$. The corresponding GPB1 algorithm reduces to a single Kalman filter with input $\bar{u}_k$, which is a probabilistically weighted sum of the known input levels $u^{(i)}$ (see [208, Sec. 5.3.8 of Part IV]).
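The averaged input that drives this single Kalman filter is just a probability-weighted sum, as the following minimal Python sketch shows. The function name, the acceleration levels, and the model probabilities are hypothetical values chosen for illustration; how the probabilities themselves are obtained is outside the sketch.

```python
def averaged_input(model_probabilities, input_levels):
    """Probability-weighted sum of the quantized input levels u^(i)."""
    dim = len(input_levels[0])
    return [sum(p * u[d] for p, u in zip(model_probabilities, input_levels))
            for d in range(dim)]


# Three hypothetical 2D acceleration levels (m/s^2) and their probabilities.
levels = [[-20.0, 0.0], [0.0, 0.0], [20.0, 10.0]]
probabilities = [0.2, 0.5, 0.3]
u_bar = averaged_input(probabilities, levels)
```

Here `u_bar` evaluates to approximately (2, 3) m/s^2, which would then be used as the known input of the single Kalman filter.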
This technique has been used by many others. It is also applicable in principle when the input level is over a continuum if the transition pdf between any pair $(u_{k-1}, u_k)$ is known. This, however, involves integration in general to obtain the average $\bar{u}_k$. Assuming the transition pdf between $(u_{k-1}, u_k)$ is Gaussian and $u_{k-1}$ is Gaussian distributed, the average $\bar{u}_k$ can be obtained in a closed form, and such a form was obtained in [187] for a constant-position model with $u$ as velocity bias. References [11] and [10] investigated a 2D design consisting of one CV and 12 CA models with acceleration values distributed symmetrically over a 2D region centered at zero and bounded by 40 m/s$^2$. The tracking performance was found to depend on $|\Delta a| T^2/\sigma$, where $|\Delta a|$ is the quantization step, $T$ the sampling time, and $\sigma^2$ the measurement noise variance. This design has later been used in a number of theoretical and comparative studies (see, e.g., [226], [219], [215], and [207]). A shortcoming associated with quantization of the input acceleration is that many models/filters are needed to cover even a moderate range. A much smaller number of models is needed if instead the process noise level is quantized [367, 298].

Structural Designs: Structural and hybrid (structural/parametric) designs can be much more efficient than quantization only. Typically, CV, CA, and/or CT models are used here. Most common examples are: CV-CA [16], [18], CV-CT (see, e.g., [101], [21]), and CV-CA-CT [36]. More specifically, [16] presented two IMM designs with two models (CV-CA) and three models (CV-CA-CA), respectively, denoted as IMM2 and IMM3, and demonstrated that they beat the input-estimation method [64] significantly in accuracy and dramatically in computation.
Reference [140] suggested the inclusion of a CA model with higher process noise to cover the transitions between CV and CA motions and provide a faster response. Reference [52] introduced an innovative model, exponentially increasing acceleration (EIA), particularly suitable for fast maneuver detection and demonstrated that an IMM algorithm with a CV-EIA-ECA model set improved the accuracy achieved by the IMM2 during maneuvers. Explicit modeling of CT motion was proposed in [101] and [102], where two model sets were designed, one CA-CT-CT (left/right turns with known turn rates) and the other CV-2DCT (with estimated turn rate). Using explicit CT models proved to be very beneficial for precision tracking during turns. Further significant enhancements were presented in the series of papers [351], [352], and [353]. While utilizing the EIA for speedy maneuver responses, the proposed IMM design includes a 3D CT model, implemented with the kinematic constraint [4], to provide constant speed prediction. The resulting sophisticated tracker using three models (CV-EIA-3DCT) with kinematic constraint outperformed considerably the CV-2DCT of [101] in tracking nonhorizontal planar maneuvers [351]. These and many other comprehensive studies of various IMM designs, proposed by Blair and Watson [46, 51, 43, 44, 354], preceded and led to the solution of the second benchmark problem proposed in [355]. Solutions to the benchmark problems were discussed in Section VE and more discussions will be given in a subsequent part.

Although using only three filters, the above designs are computationally involved since nonlinear and/or correlated models are used. If computational load is of great concern, an alternative is the interacting multiple bias model scheme [42, 350, 349] (see also [246]).
The main idea is to model the maneuver acceleration as an isolated system bias and employ the two-stage (bias free + bias only) reduced-order estimator [117] rather than the complete Kalman filter for the augmented state (including the bias state). (See [208] for a survey and discussion of issues associated with the two-stage filtering.) Its two-model (CV and bias) version, referred to as interacting acceleration compensation algorithm, was demonstrated in [350] to achieve about 50% reduction in computation relative to the CV-CA configuration with similar performance if the data rate is high enough. A comprehensive study reported in [68], [37], [39], and [36] evaluated configurations with CA, CV, Singer models and several new models, including a horizontal CT model with polar velocity [122] and a 3D version with two additional states (velocity elevation angle and its rate) [66]. Comprehensive simulations over a great variety of scenarios showed that the horizontal CT model combined with decoupled altitude filtering performed slightly better than the complete 3D model overall. The former was included into the IMM-MHT solution to the second benchmark problem (see Section VE). Reference [32] includes an evaluation of an IMM design with normal and tangential accelerations within its proposed 2D curvilinear model (see [209]). Hybrid Designs: Parametric and structural designs can of course be integrated, leading to hybrid designs, which involve both models with different structures and quantized parameters within structures. Examples of hybrid designs include: CV and CA models with multiple process noise levels (see, e.g., [18] and [21]) and CV and CT models with multiple turn rates (see, e.g., [101], [199], and [18]). 
Additional references concerning design, performance evaluation/comparison, and/or other aspects related to IMM configurations for maneuvering target tracking include [67], [159], [88], [41], [165], [277], [135], [303], [143], [31], [144], [294], and [380].

VIII. NON-PROBABILISTIC/STATISTICAL TECHNIQUES

The MM approach is a general methodology, not limited to the probabilistic/statistical setting of the previous sections. Many nonstatistical methods have been proposed (see, e.g., [272]). In this section, we discuss only those that have been proposed for maneuvering target tracking. They are based on alternative means of modeling and handling the target motion-mode uncertainty, including evidential reasoning [369], neural networks [72, 73, 6], fuzzy logic [247, 89, 374], genetic algorithms [29, 162], and deterministic algorithms [281, 5, 135].

Many of the pros and cons of the probabilistic/statistical methods of MM estimation stem from their reliance on the total probability/expectation theorem and Bayes’ rule. These theorems require partitioning of the probability space (i.e., one and only one member is true) even when information is not sufficient for doing so. In other words, they force us to overstate the degree of certainty when evidence or knowledge is actually incomplete or subjective, and thus the claim of probabilistic exactitude or optimality is actually more or less artificial. This is probably the main weakness of the probabilistic formalism. Its main strengths include: it is rigorous, systematic, and particularly suitable for sequential processing, among other things. Nonstatistical approaches offer a variety of alternatives with distinctive flexibilities that are valuable for handling many real-world problems.
For example, the Dempster-Shafer reasoning overcomes the partitioning limitation of the Bayesian methods by allowing representation of neither exclusive nor exhaustive hypotheses of posterior evidence. As such, in the context of MM estimation it may offer a framework potentially better to handle possible model-truth mismatch. Like other nonprobabilistic approaches, however, its standard version does not provide a way of fusing old knowledge and new information as Bayes’ rule does in the probabilistic setting. As a result, not surprisingly we are not aware of any effective nonstatistical methods for conditional filtering, which requires sequential processing (fusion) of old evidence and new data with perfect knowledge of the underlying model. Nonstatistical techniques can be applied to all the other components of MM estimation: model-set determination, cooperation strategy, and output processing. The use of genetic algorithms for model-set determination (adaptation and design) has been discussed in Sections VID and VIIA. Application of nonstatistical methods to output processing is most natural (see, e.g., [72], [73], [247], and [374]) since it amounts to fusion of evidence obtained at the same time. By the similarity between output processing and (hard and soft) decision based cooperation strategies, these methods can also be used for cooperation strategies, although we are unaware of such use in the literature. Reference [369] proposed an approach that integrates maneuver detection with two-model MM estimation based on the Dempster-Shafer evidential reasoning (see, e.g., [309] and [39]). Consider sequential testing of no-maneuver (H0 ) against maneuver (H1 ) hypotheses. The belief Bel(Hi ) and plausibility Pls(Hi ) of Hi are obtained by an extension of the Dempster-Shafer theory. 
They can be interpreted loosely as lower and upper probabilities, respectively, in that probability is an interval (not a single number) $P\{H_i\} = [\mathrm{Bel}(H_i), \mathrm{Pls}(H_i)]$ (footnote 27). The interval length $\mathrm{Pls}(H_i) - \mathrm{Bel}(H_i)$ reflects residual ignorance in our knowledge. If $\mathrm{Bel}(H_0) > \mathrm{Pls}(H_1)$, or equivalently (for the two-model case), $\mathrm{Bel}(H_0) > 0.5$, the target is deemed not maneuvering, its estimate is based on the nonmaneuver model alone, and a maneuver onset detector is turned on, which declares a maneuver onset if $\mathrm{Bel}(H_0, H_1) > \mathrm{Pls}(H_0, H_0)$. If $\mathrm{Bel}(H_1) > \mathrm{Pls}(H_0)$ or equivalently $\mathrm{Bel}(H_1) > 0.5$, the target is deemed maneuvering, its estimate is based on the maneuver model alone, and a maneuver termination detector is turned on, which declares a maneuver termination if $\mathrm{Bel}(H_1, H_0) > \mathrm{Pls}(H_1, H_1)$. In other cases (i.e., when the intervals $P\{H_0\}$ and $P\{H_1\}$ have an overlap), including after a maneuver onset or termination is declared, the target state estimate is the weighted sum of estimates $\hat{x}^{(0)}$ and $\hat{x}^{(1)}$ from both models using their normalized belief values as weights:

$$\hat{x} = \frac{\hat{x}^{(0)}\,\mathrm{Bel}(H_0) + \hat{x}^{(1)}\,\mathrm{Bel}(H_1)}{\mathrm{Bel}(H_0) + \mathrm{Bel}(H_1)}.$$

Given new measurements, the belief is updated based on Dempster’s rule of combination. By the random-set theory, however, this rule holds only if the bodies of evidence being combined are independent, which is actually not the case here. Earlier, this way of integrating hard decision and soft decision was proposed in [332] in a probabilistic setting. The approach of [369] was developed within the AMM framework since $\hat{x}^{(0)}$ and $\hat{x}^{(1)}$ are obtained by two elemental filters working independently, but it can certainly be extended to the later generations.

References [72] and [73] considered MM estimation by means of the so-called “mixture-of-experts” system, which has gained popularity in the neural network literature recently. Here each conditional filter (rather than neural network) is viewed as an expert.
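As an aside on the evidential-reasoning scheme of [369] described above, its output logic for the two-hypothesis case can be sketched in a few lines of Python. The sketch covers only the choice between a single-model estimate and the belief-weighted sum; the belief update by Dempster's rule and the onset/termination detectors are left out. The function name is hypothetical, the relation $\mathrm{Pls}(H_1) = 1 - \mathrm{Bel}(H_0)$ is the standard one for a two-element frame, and the equal-weight fallback for zero total belief is an assumption made here only to avoid division by zero.

```python
def ds_output(bel_0, bel_1, x_0, x_1):
    """Select or blend the two elemental estimates from their belief values.

    bel_0, bel_1 : Bel(H0) and Bel(H1), with bel_0 + bel_1 <= 1
    x_0, x_1     : state estimates from the nonmaneuver and maneuver filters
    Returns (decision, estimate).
    """
    pls_0 = 1.0 - bel_1  # plausibility of H0 on a two-element frame
    pls_1 = 1.0 - bel_0  # plausibility of H1 on a two-element frame

    if bel_0 > pls_1:    # equivalent to Bel(H0) > 0.5
        return "no maneuver", list(x_0)
    if bel_1 > pls_0:    # equivalent to Bel(H1) > 0.5
        return "maneuver", list(x_1)

    # Overlapping intervals: weight by normalized belief values.
    total = bel_0 + bel_1
    if total == 0.0:     # total ignorance; equal weights assumed here
        w_0 = w_1 = 0.5
    else:
        w_0, w_1 = bel_0 / total, bel_1 / total
    return "undecided", [w_0 * a + w_1 * b for a, b in zip(x_0, x_1)]
```

For example, `ds_output(0.7, 0.1, [0.0], [10.0])` returns the nonmaneuver estimate, whereas `ds_output(0.3, 0.2, [0.0], [10.0])` blends the two estimates with weights 0.6 and 0.4.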
The overall estimate is a weighted sum (mixture) of the output of all experts. The weights are computed by the softmax operation $\mu_{k|k-1}^{(i)} = e^{z_k' a^{(i)}} / \sum_j e^{z_k' a^{(j)}}$, which is a differentiable version of the “winner takes all” strategy, meaning that the filter (expert) with the best performance will have a weight close to unity. Here $z_k$ is the measurement. The internal weight vector $a^{(i)}$ is updated (for the use at the next time) by a steepest descent search with a step size (learning rate) $\eta$: $a^{(i)} := a^{(i)} + \eta\,(\mu_k^{(i)} - \mu_{k|k-1}^{(i)})\,z_k$, where $\mu_k^{(i)}$ follows from $\mu_{k|k-1}^{(i)}$ and likelihood by Bayes’ rule as in the IMM algorithm (see Table II). A hierarchical version was also explored in [73]. The approach of [72] and [73] is essentially an AMM algorithm, but the parameters of the best model are made adaptive by quadratic programming, resulting in an algorithm with a certain VSMM flavor. The simulation results presented seem to suggest that this estimator is slightly better than the probabilistic AMM algorithm in terms of response time to mode jumps as well as best filter selection when the truth is not in the model set. Its performance, however, is sensitive to the design parameter $\eta$ and its competitiveness relative to the IMM algorithm is not known.

Footnote 27: For instance, we may think the probability of raining tomorrow is in between 50% and 70% (not in California, of course).

References [247] and [374] proposed to use fuzzy weights as an alternative to the probabilistic weights in the IMM algorithm. In [247], the space of true mode $s$ (e.g., acceleration $a$ [247] or turn rate $\omega$ [374]) is quantized to obtain a mode set $\{m^{(1)}, \ldots, m^{(M)}\}$. A corresponding set of (Gaussian or triangular) membership functions $\mu^{(i)}(s) \in [0, 1]$, centered at $m^{(i)}$, is designed as a measure of the “validity” of each model $m^{(i)}$. A mode estimate $\hat{s}_k = \hat{a}_{k|k}$ is obtained by an independent CA filter.
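Stepping back to the mixture-of-experts weighting of [72] and [73] described above, one gating step (softmax prior weights, Bayes update with the filter likelihoods, and the steepest-descent update of the internal weight vectors) can be sketched in Python as follows. This is a minimal illustration with hypothetical names and numbers; the elemental filters that supply the likelihoods, the hierarchical version, and the adaptation of the best model's parameters are not included.

```python
import math


def softmax_weights(z, a):
    """Prior weights mu_{k|k-1}^{(i)} from measurement z and weight vectors a[i]."""
    scores = [sum(zj * aj for zj, aj in zip(z, a_i)) for a_i in a]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bayes_update(prior, likelihoods):
    """Posterior weights mu_k^{(i)} from prior weights and filter likelihoods."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def gating_step(z, a, likelihoods, eta):
    """One gating update: returns (prior, posterior, updated weight vectors)."""
    prior = softmax_weights(z, a)
    posterior = bayes_update(prior, likelihoods)
    a_new = [[a_ij + eta * (posterior[i] - prior[i]) * z_j
              for a_ij, z_j in zip(a[i], z)]
             for i in range(len(a))]
    return prior, posterior, a_new


# Hypothetical two-expert example with a 2D measurement.
prior, posterior, a_new = gating_step(
    z=[1.0, 0.0], a=[[0.0, 0.0], [0.0, 0.0]], likelihoods=[3.0, 1.0], eta=0.5)
```

Starting from zero weight vectors, the prior weights are equal, the posterior favors the first expert (0.75 versus 0.25), and the weight vectors move in opposite directions along the measurement, which is where the sensitivity to the step size enters.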
The model weights are then computed as $\mu_k^{(i)} = \mu^{(i)}(\hat{s}_k) / \sum_j \mu^{(j)}(\hat{s}_k)$ and an IMM algorithm is run using these weights as if they were the posterior probability weights $P\{m_k^{(i)} \mid z^k\}$. Such total reliance on the not-so-accurate estimate $\hat{s}_k = \hat{a}_{k|k}$ for weight update is undesirable. The inferior performance of this fuzzy variant relative to the standard IMM algorithm, as indicated by the simulation results, does not justify the extra design effort required here. Reference [374] differs from [247] by using ad hoc fuzzy if-then rules to obtain $\mu^{(i)}(s)$ based on normalized measurement residual squared. Note that these fuzzy weights do not fit particularly well into the IMM configuration; they can be used equally well/poorly for any other merging-based MM estimation algorithm.

As mentioned in Section VID, [89] proposed to use a fuzzified process noise covariance in the Kalman filter to provide an acceleration estimate for grid adaptation in VSMM estimation; simulation results of automatically adjusting model parameters by an if-then rule for process noise covariance were reported in [274]. In general, such heuristic blending of a probabilistic formalism with fuzzy techniques is not appealing.

As a side effect of their flexibility, many nonstatistical techniques are more susceptible to misuse than statistical methods. For example, a main weakness of these nonstatistical methods stems from their lack of solid, systematic weight update, while the statistical methods have a built-in solid mechanism for sequential update of the weights thanks to Bayes’ rule. The above fuzzy variants rely on ad hoc heuristics for weight update, while the “mixture of experts” of [72] and [73] had recourse to an optimization (search) algorithm with the help of Bayes’ rule.

A two-model MM algorithm was derived in [5] using a deterministic approach. It can be classified as a second generation algorithm with a cooperation strategy that resembles a mixture of the IMM and GPB1 strategies.
As pointed out in [135], however, an IMM algorithm with uniform transition probabilities $\pi_{ij} = 0.5$, $\forall i, j$, has the same state estimate formula yet a superior formula for the error covariance.

As pointed out at the beginning of Section IV and in IVD, although fundamental for Bayesian MM estimation, Assumptions A1, A1′, A2b, and A2c of Sections IV and V are not needed for classical (non-Bayesian) estimation, such as ML estimation. This is even more so for the nonstatistical methods of this section, although these methods can still be classified into the three generations as above.

IX. CONCLUDING REMARKS

The MM estimation approach provides state-of-the-art solutions to many maneuvering target tracking problems. There are basically two directions for improving the existing solutions. The first one is to design a better set of models. Numerous publications have appeared in which various ad hoc designs were presented. This will certainly continue. It is extremely challenging to obtain effective, systematic, and generally applicable results for model-set design. Relevant theoretical results are scarce and this deserves more attention.

The other direction is to develop and design better algorithms. The MM approach started with the first-generation AMM algorithms in which elemental filters work independently. Its advantage over many non-MM approaches stems from its superior processing of results from elemental filters for output a posteriori. The first generation has significant applications for nontracking problems, but limited value for maneuvering target tracking, because of its inability to account for information contained in one elemental filter for better performance of another filter. Represented by the IMM algorithm, the second-generation (CMM) algorithms explore effective cooperation strategies among elemental filters while inheriting the first generation’s superior rule for output processing.
The IMM algorithm has been so successful in solving a number of maneuvering target tracking problems in the real world that it has become a standard tool for maneuvering target tracking. Significant advances have been made in recent years and further developments are sure to come. However, the fundamental limitations of these algorithms are clear and not minor: They believe that at any given time one of their elemental filters is perfect and that none of them can provide incorrect, misleading, confusing, or otherwise harmful information. In short, they trust themselves so much that they refuse to adapt themselves to the outside world, although their estimates are adaptive to a certain degree. They cannot be expected to perform well if they are exposed to an environment to which none of the existing elemental filters fit well, such as one that is unknown or new to them.

The third generation is potentially much more advanced than its ancestors in the sense of having an open architecture (a variable structure), whereas its ancestors have a closed architecture. Not only does it inherit the second generation’s effective cooperation strategies and the first generation’s superior output processing, but it also adapts to the outside world by producing new elemental filters if the existing ones are not good enough and by eliminating those elemental filters that are harmful. The decisions on terminating harmful ones are relatively easy: general and rather successful rules have been obtained. The task of producing good new filters systematically in a general setting is much more challenging. A breakthrough here would be a new milestone in MM estimation. As with model-set design, for a particular problem one can almost always obtain ad hoc designs for producing new filters that outperform the first two generations. Many products in this area can be expected in the future.
The main drawback of most algorithms in this generation is their complexity.

The first two generations do have certain “intelligence” at different levels in that they learn the environment during the course of estimation, along with their capability of self-assessment, but they stop short of drastically adjusting themselves for better performance. The third generation is more intelligent, as characterized by its self-adjustment to the outside world, in particular its ability to (re)produce elemental filters, for the best performance possible.

The operation of a non-MM algorithm amounts to deciding on the best single individual first, letting him perform, and then sending out his estimation results. For MM estimation, the first generation can be thought of as a fixed group of individuals working independently. Its superiority to non-MM algorithms stems from the fact that its output is generated after all individuals have performed, which allows, for example, use of the best performance a posteriori and optimal combination of individual results. The price paid is that all these individuals have to perform. The elemental filters in the second generation in effect form a cooperative team with a fixed membership. It outperforms the first generation because of its teamwork via cooperation. The third generation can be likened to an adaptive, cooperative team with a possibly variable membership. It may recruit new members and fire bad or incompetent members or put them on probation. This additional flexibility enables the third generation to handle a wider spectrum of intricate and challenging problems in uncertain, complex, and changing situations.

All three generations have their reasons to exist because they have their best domains of application. Clearly, a non-MM algorithm would be optimal if the best possible individual for the task at the time could always be chosen. This is possible only in the absence of uncertainty about the task.
If the task were fixed in time but unknown over a set and the group were formed by the best possible individuals for every task in the set, the first generation would be optimal. The second generation would potentially be optimal if the task might be changing over time within a set and the team were formed by the best possible individuals for every task in the set. If either the best possible individual for each of the tasks is not part of the team or some team members do not match any of the tasks, it would be possible for a variable team (third generation) to outperform the champions of the first two generations.

ACKNOWLEDGMENTS

Help from a number of people in writing this part of the survey is appreciated. The authors would like to thank particularly Yaakov Bar-Shalom, Henk Blom, and Dave Sworder.

REFERENCES

[1] Ackerson, G. A., and Fu, K. S. On state estimation in switching environments. IEEE Transactions on Automatic Control, AC-15, 1 (Jan. 1970), 10–17.
[2] Akashi, H., and Kumamoto, H. Random sampling approach to state estimation in switching environments. Automatica, 13, 4 (July 1977), 429–433.
[3] Allam, S., Dufour, F., and Bertrand, P. Discrete time estimation of a Markov chain with marked point process observations. Application to Markovian jump filtering. IEEE Transactions on Automatic Control, 46, 6 (2001), 903–908.
[4] Alouani, A. T., and Blair, W. D. Use of a kinematic constraint in tracking constant speed, maneuvering targets. IEEE Transactions on Automatic Control, 38, 7 (July 1993), 1107–1111.
[5] Alouani, A. T., and Rice, T. R. Single-model multiple-process noise soft switching filter. In Proceedings of the SPIE Conference on Sensor Fusion: Architectures, Algorithms and Applications, 1999, 260–278.
[6] Amoozegar, F. Neural-network-based target tracking state-of-the-art survey. Society of Photo-Optical Instrumentation Engineers, 37, 3 (Mar. 1998), 836–846.
[7] Anderson, B. D. O., and Moore, J. B. Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall, 1979.
[8] Andrisani, D., Kuhl, F. P., and Gleason, D. A nonlinear tracker using attitude measurements. IEEE Transactions on Aerospace and Electronic Systems, 22, 3 (Sept. 1986), 533–539.
[9] Arulampalam, M. S., Gordon, N., Orton, M., and Ristic, B. A variable structure multiple model particle filter for GMTI tracking. In Proceedings of the 2002 International Conference on Information Fusion, Annapolis, MD, July 2002, 927–934.
[10] Averbuch, A., Itzikowitz, S., and Kapon, T. Parallel implementation of multiple model tracking algorithms. IEEE Transactions on Parallel and Distributed Systems, 2, 2 (Apr. 1991), 242–252.
[11] Averbuch, A., Itzikowitz, S., and Kapon, T. Radar target tracking–Viterbi versus IMM. IEEE Transactions on Aerospace and Electronic Systems, 27, 3 (May 1991), 550–563.
[12] Bar-Shalom, Y. Recursive tracking algorithms: From the Kalman filter to intelligent trackers for cluttered environment. In Proceedings of the 1989 IEEE International Conference Control and Applications, Jerusalem, Israel, Apr. 1989.
[13] Bar-Shalom, Y. (Ed.) Multitarget-Multisensor Tracking: Advanced Applications. Norwood, MA: Artech House, 1990.
[14] Bar-Shalom, Y. (Ed.) Multitarget-Multisensor Tracking: Applications and Advances, Vol. II. Norwood, MA: Artech House, 1992.
[15] Bar-Shalom, Y., and Blair, W. D. (Eds.) Multitarget-Multisensor Tracking: Applications and Advances, Vol. III. Boston, MA: Artech House, 2000.
[16] Bar-Shalom, Y., Chang, K. C., and Blom, H. A. P. Tracking a maneuvering target using input estimation versus the interacting multiple model algorithm. IEEE Transactions on Aerospace and Electronic Systems, 25, 2 (Apr. 1989), 296–300.
[17] Bar-Shalom, Y., Kumar, A. K., Blair, W. D., and Groves, G. W. Tracking low elevation targets in the presence of multipath propagation. IEEE Transactions on Aerospace and Electronic Systems, 30, 4 (Oct. 1994).
[18] Bar-Shalom, Y., and Li, X. R. Estimation and Tracking: Principles, Techniques, and Software. Boston, MA: Artech House, 1993. (Reprinted by YBS Publishing, 1998).
[19] Bar-Shalom, Y., and Li, X. R. Multitarget-Multisensor Tracking: Principles and Techniques. Storrs, CT: YBS Publishing, 1995.
[20] Bar-Shalom, Y., Li, X. R., and Chang, K. C. Non-stationary noise identification with interacting multiple model algorithm. In Proceedings of the 5th International Symposium on Intelligent Control, Philadelphia, PA, Sept. 1990, 585–589.
[21] Bar-Shalom, Y., Li, X. R., and Kirubarajan, T. Estimation with Applications to Tracking and Navigation: Theory, Algorithms, and Software. New York: Wiley, 2001.
[22] Baram, Y. A sufficient condition for consistent discrimination between stationary Gaussian models. IEEE Transactions on Automatic Control, AC-23, 5 (Oct. 1978), 958–960.
[23] Baram, Y. Nonstationary model validation from finite data records. IEEE Transactions on Automatic Control, AC-25, 1 (Feb. 1980), 10–19.
[24] Baram, Y., and Sandell, N. R., Jr. An information theoretic approach to dynamical systems modeling and identification. IEEE Transactions on Automatic Control, AC-23, 1 (Feb. 1978), 61–66.
[25] Baram, Y., and Sandell, N. R., Jr. Consistent estimation on finite parameter sets with application to linear systems identification. IEEE Transactions on Automatic Control, 23, 3 (June 1978), 451–454.
[26] Basseville, M. Detecting changes in signals and systems. Automatica, 24, 3 (May 1988), 309–326.
[27] Bergman, N. Recursive Bayesian estimation. Navigation and tracking applications. Ph.D. dissertation, Department of Electrical Engineering, Linkoping University, Sweden, 1999.
[28] Bergman, N., and Gustafsson, F. Three statistical batch algorithms for tracking manoeuvring targets. In Proceedings of the 5th European Control Conference, Karlsruhe, Germany, 1999.
[29] Berketis, K., Katsikas, S. K., and Likothanassis, S. D. Multimodel partitioning filters and genetic algorithms. Nonlinear Analysis, Theory, Methods and Applications, 30, 4 (1997), 2421–2427.
[30] Bertsekas, D. Nonlinear Programming. Athena Scientific (2nd ed.), Sept. 1999, ISBN: 1886529000.
[31] Bessell, A., Ristic, B., Farina, A., Wang, X., and Arulampalam, M. S. Error performance bounds for tracking a manoeuvring target. In Proceedings of the 2003 International Conference on Information Fusion, Cairns, Australia, July 2003, 903–910.
[32] Best, R. A., and Norton, J. P. A new model and efficient tracker for a target with curvilinear motion. IEEE Transactions on Aerospace and Electronic Systems, 33, 3 (July 1997), 1030–1037.
[33] Bickel, P. J., Klassen, C. A. J., Ritov, Y., and Wellner, J. A. Efficient and Adaptive Estimation for Semiparametric Models. New York: Springer, 1998.
[34] Blackman, S. S. Multiple Target Tracking with Radar Applications. Norwood, MA: Artech House, 1986.
[35] Blackman, S. S., Busch, M. T., and Popoli, R. F. IMM/MHT tracking and data association for benchmark tracking problem. In Proceedings of the 1995 American Control Conference, Seattle, WA, June 1995, 2606–2610.
[36] Blackman, S. S., Busch, M. T., and Popoli, R. F. IMM/MHT solution to radar benchmark tracking problem. IEEE Transactions on Aerospace and Electronic Systems, 35, 2 (Apr. 1999), 730–737. Also in Proceedings of the 1995 American Control Conference, Seattle, WA, June 1995, 2606–2610.
[37] Blackman, S. S., Dempster, R. J., and Roszkowski, S. H. IMM/MHT application to radar and IR multitarget tracking. In Proceedings of the 1997 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3163, 1997, 429–439.
[38] Blackman, S. S., Dempster, R. J., Sasaki, D. M., Singer, P. F., and Tucker, G. K. Application of IMM/MHT tracking with spectral features to ground targets. In Proceedings of the 1999 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3809, Denver, CO, July 1999, 456–467.
[39] Blackman, S. S., and Popoli, R. F. Design and Analysis of Modern Tracking Systems. Norwood, MA: Artech House, 1999.
[40] Blair, W. D. Toward the integration of tracking and signal processing for phased array radar. In Proceedings of the 1994 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2235, Orlando, FL, 1994.
[41] Blair, W. D., and Bar-Shalom, Y. Tracking maneuvering targets with multiple sensors: Does more data always mean better estimates. IEEE Transactions on Aerospace and Electronic Systems, 32, 1 (Jan. 1996), 450–456.
[42] Blair, W. D., and Watson, G. A. Interacting multiple bias model algorithm with application to tracking maneuvering targets. In Proceedings of the 31st IEEE Conference on Decision and Control, Tucson, AZ, Dec. 1992, 3790–3795.
[43] Blair, W. D., and Watson, G. A. Interacting multiple model algorithm with aperiodic data. In Proceedings of the SPIE Symposium on Acquisition, Tracking and Pointing, Orlando, FL, Apr. 1992.
[44] Blair, W. D., and Watson, G. A. IMM algorithm for solution to benchmark problem for tracking maneuvering targets. In Proceedings of the SPIE Symposium on Acquisition, Tracking and Pointing, Orlando, FL, Apr. 1994.
[45] Blair, W. D., and Watson, G. A. Benchmark problem for radar resource allocation and tracking maneuvering targets in the presence of false alarms and ECM. Technical Report NSWCDD/TR-96/10, Naval Surface Warfare Center Dahlgren Division, Dahlgren, VA, Feb. 1996.
[46] Blair, W. D., Watson, G. A., and Alouani, A. T. Tracking constant speed targets using a kinematic constraint. In Proceedings of the 1991 IEEE Southeast Conference, 1991.
[47] Blair, W. D., Watson, G. A., Gentry, G. L., and Hoffman, S. A. Benchmark problem for beam pointing control of phased array radar against maneuvering target in the presence of ECM and FA. In Proceedings of the 1995 American Control Conference, Seattle, WA, June 1995, 2601–2605.
[48] Blair, W. D., Watson, G. A., and Hoffman, S. A. Benchmark problem for beam pointing control of phased array radar against maneuvering target. In Proceedings of the 1994 American Control Conference, Baltimore, MD, June 1994, 2071–2075.
[49] Blair, W. D., Watson, G. A., Kirubarajan, T., and Bar-Shalom, Y. Benchmark for radar resource allocation and tracking targets in the presence of ECM. IEEE Transactions on Aerospace and Electronic Systems, 34, 4 (Oct. 1998), 1097–1114. Also in Proceedings of the 1995 American Control Conference, Seattle, WA, June 1995, 2601–2605.
[50] Blair, W. D., Watson, G. A., Kirubarajan, T., and Bar-Shalom, Y. Benchmark for radar resource allocation and tracking targets in the presence of ECM. IEEE Transactions on Aerospace and Electronic Systems, 34, 4 (Oct. 1998), 1097–1114.
[51] Blair, W. D., Watson, G. A., and Rice, T. R. Interacting multiple model filter for tracking maneuvering targets in spherical coordinates. In Proceedings of IEEE Southeastcon 1991, Williamsburg, VA, Apr. 1991, 1055–1059.
[52] Blair, W. D., Watson, G. A., and Rice, T. R. Tracking maneuvering targets with an interacting multiple model filter containing exponentially correlated acceleration models. In Southeastern Symposium on Systems Theory, Columbia, SC, Mar. 1991.
[53] Blasch, E., and Connare, T. Improving track maintenance through group tracking. In Proceedings of the Workshop on Estimation, Tracking, and Fusion–A Tribute to Yaakov Bar-Shalom, Monterey, CA, May 2001, 360–371.
[54] Bloem, E. A., Blom, H. A. P., and van Schaik, F. J. Advanced data fusion for airport surveillance. In Proceedings of the JISSA 2001 International Conference On Airport Surveillance Sensors, Paris, Dec. 2001.
[55] Blom, H. A. P. A sophisticated tracking algorithm for ATC surveillance data. In Proceedings of the International Radar Conference, Paris, France, May 1984.
[56] Blom, H. A. P. An efficient filter for abruptly changing systems. In Proceedings of the 23rd IEEE Conference on Decision and Control, Las Vegas, NV, Dec. 1984.
[57] Blom, H. A. P. Overlooked potential of systems with Markovian switching coefficients. In Proceedings of the 25th IEEE Conference on Decision and Control, Athens, Greece, Dec. 1986.
[58] Blom, H. A. P., and Bar-Shalom, Y. The interacting multiple model algorithm for systems with Markovian switching coefficients. IEEE Transactions on Automatic Control, 33, 8 (Aug. 1988), 780–783.
[59] Blom, H. A. P., and Bar-Shalom, Y. Time-reversion of a hybrid state stochastic difference system with a jump-linear smoothing application. IEEE Transactions on Information Theory, 36, 4 (July 1990), 836–847.
[60] Blom, H. A. P., Hogendoorn, R. A., and van Doorn, B. A. Design of a multisensor tracking system for advanced air traffic control. In Y. Bar-Shalom (Ed.), Multitarget-Multisensor Tracking: Applications and Advances, Vol. II, Norwood, MA: Artech House, 1992, ch. 2.
[61] Blom, H. A. P., Hogendoorn, R. A., and van Schaik, F. J. Bayesian multisensor tracking for advanced air traffic control systems. In A. Benoit (Ed.), Aircraft Trajectories: Computation, Prediction and Control, AGARDOgraph 301, 1990.
[62] Bloomer, L., and Gray, J. E. Are more models better?: The effect of the model transition matrix on the IMM filter. In The 34th Southeastern Symposium on System Theory (SSST), Huntsville, AL, Mar. 2002.
[63] Boers, Y., and Driessen, H. A multiple model multiple hypothesis filter for systems with possibly erroneous measurements. In Proceedings of the 2002 International Conference on Information Fusion, Annapolis, MD, July 2002, 700–704.
[64] Bogler, P. L. Tracking a maneuvering target using input estimation. IEEE Transactions on Aerospace and Electronic Systems, AES-23, 3 (May 1987), 298–310.
[65] Brown, R. G. Introduction to Random Signals and Kalman Filtering. New York: Wiley, 1983.
[66] Bullock, T. E., and Sangsuk-Iam, S. Maneuver detection and tracking with a nonlinear target model. In Proceedings of the 23rd IEEE Conference on Decision and Control, Las Vegas, NV, Dec. 1984.
[67] Burassa, S., Fontaine, P., Shahbazian, E., and Simard, M-A. Comparison of different parallel filtering techniques. In Proceedings of the 1993 SPIE Conference on Signal and Data Processing of Small Targets, Orlando, FL, Apr. 1993, 319–330.
[68] Busch, M., and Blackman, S. Evaluation of IMM filtering for an air defence system application. In Proceedings of the 1995 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2561, 1995, 435–447.
[69] Campo, L., Mookerjee, P., and Bar-Shalom, Y. State estimation for systems with sojourn-time-dependent Markov model switching. IEEE Transactions on Automatic Control, 36, 2 (Feb. 1991), 238–243.
[70] Caputi, M. J. A necessary condition for effective performance of the multiple model adaptive estimator. IEEE Transactions on Aerospace and Electronic Systems, 31, 3 (July 1995), 1132–1139.
[71] Caputi, M. J., and Moose, R. L. A modified Gaussian sum approach to estimation of non-Gaussian signals. IEEE Transactions on Aerospace and Electronic Systems, 29, 2 (Apr. 1993), 446–451.
[72] Chaer, W. S., Bishop, R. H., and Ghosh, J. A mixture-of-experts framework for adaptive Kalman filtering. IEEE Transactions on Systems, Man, and Cybernetics–Part B: Cybernetics, 27, 3 (June 1997), 452–464.
[73] Chaer, W. S., Bishop, R. H., and Ghosh, J. Hierarchical adaptive Kalman filter for interplanetary orbit determination. IEEE Transactions on Aerospace and Electronic Systems, 34, 3 (July 1998), 883–895.
[74] Chang, C. B., and Athans, M. State estimation for discrete systems with switching parameters. IEEE Transactions on Aerospace and Electronic Systems, AES-14, 5 (May 1978), 418–425.
[75] Chang, C. B., Whiting, R. H., and Athans, M. On the state and parameter estimation for maneuvering reentry vehicles. IEEE Transactions on Automatic Control, AC-22, 2 (Feb. 1977), 99–105.
[76] Chen, B., and Tugnait, J. K. Interacting multiple model fixed-lag smoothing algorithm for Markovian switching systems. IEEE Transactions on Aerospace and Electronic Systems, 36, 1 (Jan. 2002), 432–500.
[77] Chen, B., and Tugnait, J. K. Multisensor tracking of a maneuvering target in clutter by using IMMPDA fixed-lag smoothing. IEEE Transactions on Aerospace and Electronic Systems, 36, 3 (Jan. 2000), 983–991.
[78] Chen, B., and Tugnait, J. K. Tracking of multiple maneuvering targets in clutter using IMM/JPDA filtering and fixed-lag smoothing. Automatica, 37, 2 (Feb. 2001).
[79] Connare, T., Blasch, E., Greenewald, J., Schmitz, J., Salvatore, F., and Scarpino, F. Group IMM tracking utilizing track and identification fusion. In Proceedings of the Workshop on Estimation, Tracking, and Fusion–A Tribute to Yaakov Bar-Shalom, Monterey, CA, May 2001, 205–220.
[80] Cooperman, R. L. Tactical ballistic missile tracking using the interacting multiple model algorithm. In Proceedings of the 2002 International Conference on Information Fusion, Annapolis, MD, July 2002, 824–831.
[81] Costa, O. L. V. Linear minimum mean square error estimation for discrete-time Markovian jump linear systems. IEEE Transactions on Automatic Control, 39, 8 (Aug. 1994), 1685–1689.
[82] Costa, O. L. V., and Guerra, S. Robust linear filtering for discrete-time hybrid Markov linear systems. International Journal of Control, 75, 10 (2002), 712–727.
[83] Costa, O. L. V., and Guerra, S. Stationary filter for linear minimum mean square error estimator of discrete-time Markovian jump systems. IEEE Transactions on Automatic Control, 47, 8 (Aug. 2002), 1351–1356.
[84] Daeipour, E., and Bar-Shalom, Y. An interacting multiple model approach for target tracking with glint noise. IEEE Transactions on Aerospace and Electronic Systems, 31, 2 (Apr. 1995), 706–715.
[85] Daeipour, E., and Bar-Shalom, Y. IMM tracking of maneuvering targets in the presence of glint. IEEE Transactions on Aerospace and Electronic Systems, 34, 3 (July 1998), 996–1003.
[86] Daeipour, E., Bar-Shalom, Y., and Li, X. R. Adaptive beam pointing control of a phased array radar using an IMM estimator. In Proceedings of the 1994 American Control Conference, Baltimore, MD, June 1994, 2093–2097.
[87] Dempster, A. P., Laird, N. M., and Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. Journal of Royal Statistical Society, B, 39 (1977), 1–38.
[88] Derbez, E., Remillard, B., and Jouan, A. A comparison of fixed gain IMM against two other filters. In Proceedings of the 2000 International Conference on Information Fusion, Paris, France, July 2000, ThB2-3–ThB2-9.
[89] Ding, Z., Leung, H., and Chan, K. Model-set adaption using a fuzzy Kalman filter. In Proceedings of the International Conference on Information Fusion, Paris, France, July 2000, MoD2.
[90] Doucet, A., and Andrieu, C. Iterative algorithms for state estimation of jump Markov linear systems. IEEE Transactions on Signal Processing, 49, 6 (June 2001), 1216–1227.
[91] Doucet, A., and Ristic, B. Recursive state estimation for multiple switching models with unknown transition probabilities. IEEE Transactions on Aerospace and Electronic Systems, 38, 3 (July 2002), 1098–1104.
[92] Driessen, J. N., and Boers, Y. A multiple model multiple hypothesis filter for tracking maneuvering targets. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, 2001, 279–288.
[93] Drummond, O. Feature, attribute, and classification aided target tracking. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, 2001, 542–548.
[94] Drummond, O. E. Multiple-object estimation. Ph.D. dissertation, University of California, Los Angeles, 1992.
IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 41, NO. 4 OCTOBER 2005 [95] [96] [97] [98] [99] [100] [101] [102] [103] [104] [105] [106] [107] [108] [109] Drummond, O. E. Multiple target tracking with multiple frame, probabilistic data association. In Proceedings of the 1993 SPIE Conference on Signal and Data Processing of Small Targets, vol. 1954, Apr. 1993. Drummond, O. E. Multiple sensortracking with multiple frame, probabilistic data association. In Proceedings of the 1995 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2561, Apr. 1995. Drummond, O. E. Target tracking with retrodicted discrete probabilities. In Proceedings of the 1997 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3163, 1997, 249—268. Drummond, O. E. Best hypothesis target tracking and sensor fusion. In Proceedings of the 1999 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3809, Denver, CO, July 1999, 586—599. Drummond, O. E., Li, X. R., and He, C. Comparison of various static multiple-model estimation algorithms. In Proceedings of the 1998 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3373, Apr. 1998, 510—527. Dufour, F., and Bertrand, P. An image-based filter for discrete-time Markov jump linear systems. Automatica, 32, 2 (1996), 241—247. Dufour, F., and Mariton, M. Tracking a 3D maneuvering target with passive sensors. IEEE Transactions on Aerospace and Electronic Systems, 27, 4 (July 1991), 725—739. Dufour, F., and Mariton, M. Passive sensor data fusion and maneuvering target tracking. In Y. Bar-Shalom (Ed.), Multitarget-Multisensor Tracking: Applications and Advances, Vol. II, Norwood, MA: Artech House, 1992, ch. 3. Easthope, P. F. Using TOTS for more accurate and responsive multi-sensor, end-to-end ballistic missile tracking. In Proceedings of the SPIE Conference on Signal and Data Processing of Small Targets 2000, vol. 4048, Apr. 2000. Easthope, P. F., and Heys, N. W. 
Multiple-model target-oriented tracking system. In Proceedings of the SPIE Conference on Signal and Data Processing of Small Targets 1994, vol. 2235, Apr. 1994. Efe, M., and Atherton, D. P. Maneuvering target tracking using adaptive turn rate models in the interacting multiple model algorithm. In Proceedings of the 35th IEEE Conference on Decision and Control, Kobe, Japan, Dec. 1996, 3151—3156. Efe, M., and Atherton, D. P. The IMM approach to the fault detection problem. In 11th IFAC Symposium on System Identification, Fukuoka, Japan, July 1997. Elliott, R. J., Aggoun, L., and Moore, J. B. Hidden Markov Models. New York: Springer-Verlag, 1997. Elliott, R. J., Dufour, F., and Malcolm, W. P. A comparison of angle-only tracking algorithms. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, 2001, 270—278. Elliott, R. J., Dufour, F., and Sworder, D. D. Exact hybrid filters in discrete time. IEEE Transactions on Automatic Control, 41, 12 (Dec. 1996), 1807—1810. [110] Evans, J. S. Studies in nonlinear filtering theory–Random parameter linear systems, target tracking and communication constrained estimation. Ph.D. dissertation, University of Melbourne, Melbourne, Australia, Jan. 1998. [111] Evans, J. S., and Evans, R. J. State estimation for Markov switching systems with modal observations. In Proceedings of the 36th IEEE Conference on Decision and Control, San Diego, CA, Dec. 1997, 1688—1693. [112] Evans, J. S., and Evans, R. J. Image-enhanced multiple model tracking. Automatica, 35, 11 (Nov. 1999), 1769—1786. [113] Farina, A., Ferranti, L., and Colino, G. Constrained tracking filters for A-SMGCS. In Proceedings of the 2003 International Conference on Information Fusion, Cairns, Australia, July 2003, 414—421. [114] Fisher, K. A., and Maybeck, P. S. Multiple model adaptive estimation with filtering spawning. IEEE Transactions on Aerospace and Electronic Systems, 38, 3 (2002), 755—768. [115] Forney, G. D. 
The Viterbi algorithm. Proceedings of the IEEE, 61, 3 (Mar. 1973), 268—278. [116] Fraser, D. C., and Potter, J. E. The optimum linear smoother as a combination of two optimum linear filters. IEEE Transactions on Automatic Control, AC-14, 4 (1969), 387—390. [117] Friedland, B. Treatment of bias in recursive filtering. IEEE Transactions on Automatic Control, AC-14, 4 (1969), 359—367. [118] Fry, C. M., and Sage, A. P. On hierarchical structure adaptation and systems identification. International Journal of Control, 20, 3 (1974), 433—452. [119] Gauvrit, H., Le Cadre, J. P., and Jauffret, C. A formulation of multitarget tracking as an incomplete data problem. IEEE Transactions on Aerospace and Electronic Systems, 33, 4 (Oct. 1997), 1242—1257. [120] Gauvrit, M. Bayesian adaptive filter for tracking with measurements of uncertain origin. Automatica, 20 (Mar. 1984), 217—224. [121] Gersho, A., and Gray, R. M. Vector Quantization and Signal Compression. Boston, MA: Kluwer, 1992. [122] Gertz, J. L. Multisensor surveillance for improved aircraft tracking. Lincoln Laboratory Journal, 2, 3 (1989), 381—396. [123] Gholson, N. H., and Moose, R. L. Maneuvering target tracking using adaptive state estimation. IEEE Transactions on Aerospace and Electronic Systems, AES-13, 3 (May 1977), 310—317. [124] Goldberg, D. E. Genetic Algorithms in Search, Optimization and Machine Learning. Reading, MA: Addison-Wesley, 1989. [125] Goutsias, J., and Mendel, J. M. Optimal simultaneous detection and estimation of filtered discrete semi-Markov chains. IEEE Transactions on Information Theory, 34, 3 (May 1988), 551—568. LI & JILKOV: SURVEY OF MANEUVERING TARGET TRACKING. PART V: MULTIPLE-MODEL METHODS 1311 [126] [127] [128] [129] [130] [131] [132] [133] [134] [135] [136] [137] [138] [139] [140] 1312 Gustafson, J. A., and Maybeck, P. S. Flexible spacestructure control via moving-bank multiple model algorithms. IEEE Transactions on Aerospace and Electronic Systems, 30, 3 (July 1994), 750—757. 
Gustafsson, F. Adaptive Filtering and Change Detection. New York: Wiley, 2001. Guu, J. A., and Wei, C. H. Maneuvering target tracking using IMM method at high measurement frequency. IEEE Transactions on Aerospace and Electronic Systems, 27, 3 (May 1991), 514—519. Hadidi, M. T., and Schwartz, S. C. Sequential detection with Markov interrupted observations. In Proceedings of the 16th Allerton Conference on Communication, Control and Computing, University of Illinois, Oct. 1978. Hawkes, R. M., and Moore, J. B. Performance bounds for adaptive estimation. Proceedings of the IEEE, 64, 8 (1976), 1143—1150. Helmick, R. E., Blair, W. D., and Hoffman, S. A. One-step fixed-lag smoothers for Markovian switching systems. In Proceedings of the American Control Conference, 1994, 782—786. Helmick, R. E., Blair, W. D., and Hoffman, S. A. Fixed-interval smoothing for Markovian switching systems. IEEE Transactions on Information Theory, 41, 6 (Nov. 1995), 1845—1855. Helmick, R. E., Blair, W. D., and Hoffman, S. A. One-step fixed-lag smoothers for Markovian switching systems. IEEE Transactions on Automatic Control, 41, 7 (July 1996), 1051—1056. Hewer, G. A., Martin, R. D., and Zeh, J. Robust preprocessing for Kalman filtering of glint noise. IEEE Transactions on Aerospace and Electronic Systems, 23, 1 (Jan. 1987), 120—128. Ho, T-J., and Farooq, M. Comparing an IMM algorithm and a multiple-process soft switching algorithm: Equivalence relashionship and tracking performance. In Proceedings of the 2000 International Conference on Information Fusion, Paris, France, July 2000, MoD2.17—MoD2.24. Hogendoorn, R. A., Rekkas, C., and Neven, W. H. L. ARTAS: An IMM-based multisensor tracker. In Proceedings of the 1999 International Conference on Information Fusion, Sunnyvale, CA, July 1999, 1021—1028. Holland, J. H. Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press, 1975. Hong, L., Ding, Z., and Wood, R. A. 
Development of multirate model and multirate interacting multiple model algorithm for multiplatform multisensor tracking. Optical Engineering, 37, 2 (1998), 453–467.
[139] Hong, L. Multirate interacting multiple model filtering for target tracking using multirate models. IEEE Transactions on Automatic Control, 44, 7 (July 1999), 1326–1340.
[140] Houles, A., and Bar-Shalom, Y. Multisensor tracking of a maneuvering target in clutter. IEEE Transactions on Aerospace and Electronic Systems, 25, 2 (Mar. 1989), 176–189.
[141] Hutchins, R. G., and San Jose, A. IMM tracking of a theater ballistic missile during boost phase. In Proceedings of the 1998 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3373, 1998, 528–531.
[142] Hutchins, R. G., and San Jose, A. Trajectory tracking and backfitting techniques against theater ballistic missiles. In Proceedings of the 1999 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3809, 1999, 532–526.
[143] Hutchins, R. G., Wilson, D., Allred, L. K., and Duren, R. Alternative architectures for IMM tracking of maneuvering aircraft. In Proceedings of the 2002 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4728, Orlando, FL, Apr. 2002.
[144] Hwang, I., Balakrishnan, H., and Tomlin, C. Flight-mode-based aircraft conflict detection using a residual-mean interacting multiple model algorithm. In Proceedings of AIAA Guidance, Navigation, and Control Conference, Austin, TX, Aug. 2003.
[145] Hwang, I., Balakrishnan, H., and Tomlin, C. Observability criteria and estimator design for stochastic linear hybrid systems. In Proceedings of the IEE European Control Conference, Cambridge, UK, Sept. 2003.
[146] Isaksson, A., and Gustafsson, F. Comparison of some Kalman filter based methods for manoeuvre tracking and detection. In Proceedings of the 34th IEEE Conference on Decision and Control, New Orleans, LA, Dec. 1995, 1525–1531.
[147] Isaksson, A., Gustafsson, F., and Bergman, N. Pruning versus merging in Kalman filter banks for manoeuvre tracking. URL: citeseer.nj.nec.com/isaksson97pruning.html, 1997.
[148] Jaffer, A. G., and Gupta, S. C. On estimation of discrete processes under multiplicative and additive noise conditions. Information Science, 3 (1971), 267.
[149] Jilkov, V. P., and Angelova, D. S. Performance evaluation and comparison of variable structure multiple-model algorithms for tracking maneuvering radar targets. In Proceedings of the 26th European Microwave Conference, Prague, Czech, Sept. 1996.
[150] Jilkov, V. P., Angelova, D. S., and Semerdjiev, T. A. Mode-set adaptive IMM for maneuvering target tracking. IEEE Transactions on Aerospace and Electronic Systems, 35, 1 (Jan. 1999), 343–350.
[151] Jilkov, V. P., and Li, X. R. Adaptation of transition probability matrix for multiple model estimators. In Proceedings of the 2001 International Conference on Information Fusion, Montreal, QC, Canada, Aug. 2001, ThB1.3–ThB1.10.
[152] Jilkov, V. P., and Li, X. R. On-line Bayesian estimation of transition probabilities for Markovian jump systems. IEEE Transactions on Signal Processing, 52, 6 (June 2004), 1620–1630.
[153] Jilkov, V. P., Li, X. R., and Angelova, D. Bayesian estimation of transition probabilities for Markovian jump systems by stochastic simulation. In Springer Lecture Notes in Computer Science, Vol. 2542, 2003, 307–315.
[154] Jilkov, V. P., Li, X. R., and Lu, L. Performance enhancement of IMM estimation by smoothing. In Proceedings of the 2002 International Conference on Information Fusion, Annapolis, MD, July 2002, 713–720.
[155] Jilkov, V. P., Mihaylova, L. S., and Li, X. R. An alternative IMM solution to benchmark radar tracking problem. In Proceedings of the International Conference on Multisource-Multisensor Information Fusion, July 1998, 924–929.
[156] Johansen, T.
A., and Murray-Smith, R. The operating regime approach. In R. Murray-Smith and T. A. Johansen (Eds.), Multiple Model Approaches to Modelling and Control, Taylor & Francis, 1997, ch. 1, 3–73.
[157] Johnston, L. A., and Krishnamurthy, V. Mode-matched filtering via the EM algorithm. In Proceedings of the 1999 American Control Conference, San Diego, CA, June 1999, 1930–1934.
[158] Johnston, L. A., and Krishnamurthy, V. An improvement to the interacting multiple model (IMM) algorithm. IEEE Transactions on Signal Processing, 49, 12 (2001), 2893–2908.
[159] Jouan, A., Bosse, E., Simard, M-A., and Shahbazian, E. Comparison of various schema of filter adaptivity for the tracking of maneuvering targets. In Proceedings of the 1998 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3373, Orlando, FL, 1998, 247–258.
[160] Kameda, H., Tsujimichi, S., and Kosuge, Y. Target tracking for maneuvering reentry vehicles using multiple maneuvering models. In Proceedings of the 36th SICE (Society of Instrument and Control Engineers) Annual Conference, Japan, 1997, 1031–1036.
[161] Kastella, K., and Biscuso, M. Tracking algorithms for air traffic control applications. Air Traffic Control Quarterly, 3, 1 (Jan. 1996), 19–43.
[162] Katsikas, S. K., Likothanassis, S. D., Beligiannis, G. N., Berketis, K. G., and Fotakis, D. A. Genetically determined variable structure multiple model estimation. IEEE Transactions on Signal Processing, 49, 10 (Oct. 2002), 2532–2611.
[163] Kendrick, J. D., Maybeck, P. S., and Reid, J. G. Estimation of aircraft target motion using orientation measurements. IEEE Transactions on Aerospace and Electronic Systems, AES-17, 2 (Mar. 1981), 254–260.
[164] Kerr, T. H. Duality between failure detection and radar/optical maneuver detection. IEEE Transactions on Aerospace and Electronic Systems, 25 (July 1989), 520–528.
[165] Kirubarajan, T., and Bar-Shalom, Y. Kalman filter vs. IMM estimator: When do we need the latter? IEEE Transactions on Aerospace and Electronic Systems, 39, 4 (Oct. 2003), 1452–1457.
[166] Kirubarajan, T., and Bar-Shalom, Y. Tracking evasive move-stop-move targets with a GMTI radar using a VS-IMM estimator. IEEE Transactions on Aerospace and Electronic Systems, 39, 3 (2003), 1098–1103.
[167] Kirubarajan, T., Bar-Shalom, Y., Blair, W. D., and Watson, G. A. IMMPDAF for radar management and tracking benchmark with ECM. IEEE Transactions on Aerospace and Electronic Systems, 34, 4 (Oct. 1998), 1115–1134.
[168] Kirubarajan, T., Bar-Shalom, Y., and Daeipour, E. Adaptive beam pointing control of a phased array radar in the presence of ECM and false alarms using IMMPDAF. In Proceedings of the 1995 American Control Conference, Seattle, WA, June 1995, 2616–2620.
[169] Kirubarajan, T., Bar-Shalom, Y., Pattipati, K. R., and Kadar, I. Ground target tracking with topography-based variable-structure IMM estimator. In Proceedings of the 1998 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3373, Orlando, FL, 1998, 222–233.
[170] Kirubarajan, T., Bar-Shalom, Y., Pattipati, K. R., and Kadar, I. Ground target tracking with topography-based variable structure IMM estimator. IEEE Transactions on Aerospace and Electronic Systems, 36, 1 (Jan. 2000), 26–46.
[171] Kirubarajan, T., Pattipati, K. R., Popp, R. L., and Wang, H. Large-scale air surveillance using an IMM estimator. In Proceedings of the Workshop on Estimation, Tracking and Fusion: A Tribute to Yaakov Bar-Shalom, Monterey, CA, May 2001, 427–466.
[172] Kirubarajan, T., Yeddanapudi, M., Bar-Shalom, Y., and Pattipati, K. R. Comparison of IMMPDA and IMM-assignment algorithms on real air traffic surveillance data. In Proceedings of the 1996 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2759, Orlando, FL, 1996.
[173] Koch, W. Retrodiction for Bayesian multiple hypothesis/multiple target tracking in densely cluttered environment. In Proceedings of the 1996 SPIE Conference on Signal and Data Processing of Small Targets, vol.
2759, Orlando, FL, 1996.
[174] Koch, W. Fixed-interval retrodiction approach to Bayesian IMM-MHT for maneuvering multiple targets. IEEE Transactions on Aerospace and Electronic Systems, 36, 1 (Jan. 2000), 2–14.
[175] Kolda, T. G., Lewis, R. M., and Torczon, V. Optimization by direct search: New perspectives on some classical and modern methods. SIAM Review, 45, 3 (2003), 385–482.
[176] Krishnamurthy, V., and Elliott, R. J. Filters for estimating Markov modulated Poisson processes and image-based tracking. Automatica, 33, 5 (1997), 821–833.
[177] Krishnamurthy, V., and Evans, J. Finite dimensional filters for passive tracking of Markov jump linear systems. Automatica, 33, 5 (1998), 821–833.
[178] Kyger, D. W., and Maybeck, P. S. Reducing lag in virtual display using multiple model adaptive estimation. IEEE Transactions on Aerospace and Electronic Systems, 34, 4 (Oct. 1998), 1237–1248.
[179] Lainiotis, D. G. Optimal adaptive estimation: Structure and parameter adaptation. IEEE Transactions on Automatic Control, AC-16, 2 (Apr. 1971), 160–170.
[180] Lainiotis, D. G. Partitioning: A unifying framework for adaptive systems, I: Estimation. Proceedings of the IEEE, 64, 8 (Aug. 1971), 1261–1436.
[181] Lainiotis, D. G., and Papaparaskeva, P. Efficient algorithms of clustering adaptive nonlinear filters. IEEE Transactions on Automatic Control, 44, 7 (July 1999), 1454–1459.
[182] Lainiotis, D. G., and Park, S. K. On joint detection, estimation and system identification. International Journal of Control, 17, 3 (1973), 609–633.
[183] Lainiotis, D. G., and Sims, F. L. Performance measure for adaptive Kalman estimators. IEEE Transactions on Automatic Control, AC-15 (Apr. 1970), 249–250.
[184] Lainiotis, D. G., and Sims, F. L. Estimation: A brief survey. Information Sciences, 7, 3 (1974), 191–20. Also in D. G.
Lainiotis (Ed.), Estimation Theory, New York: American Elsevier, 1974.
[185] Lamb, P. R., and Westphal, L. C. Simplex-directed partitioned adaptive filters. International Journal of Control, 30, 4 (1979), 617–627.
[186] Layne, J. Monopulse radar tracking using an adaptive interacting multiple model method with extended Kalman filters. In Proceedings of the 1998 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3373, Orlando, FL, Apr. 1998.
[187] Layne, J., and Weaver, S. Stochastic estimation using a continuum of models. In Proceedings of the 2000 International Conference on Information Fusion, Paris, France, July 2000.
[188] Lefas, C. C. Using roll-angle measurement to track aircraft maneuvers. IEEE Transactions on Aerospace and Electronic Systems, AES-20 (Nov. 1984), 672–681.
[189] Leondes, C. T., Sworder, D., and Boyd, J. E. Multiple model methods in path following. Journal of Mathematical Analysis and Applications, 251, 2 (Nov. 2000), 609–623.
[190] Li, X. R. Hybrid state estimation and performance prediction with applications to air traffic control and detection threshold optimization. Ph.D. dissertation, University of Connecticut, 1992.
[191] Li, X. R. Multiple-model estimation with variable structure: Some theoretical considerations. In Proceedings of the 33rd IEEE Conference on Decision and Control, Orlando, FL, Dec. 1994, 1199–1204.
[192] Li, X. R. Hybrid estimation techniques. In C. T. Leondes (Ed.), Control and Dynamic Systems: Advances in Theory and Applications, Vol. 76, New York: Academic Press, 1996, 213–287.
[193] Li, X. R. Model-set sequence conditioned estimation in multiple-model estimation with variable structure. In Proceedings of the 1998 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3373, Orlando, FL, Apr. 1998, 546–558.
[194] Li, X. R. Optimal selection of estimatee for multiple-model estimation with uncertain parameters. IEEE Transactions on Aerospace and Electronic Systems, 34, 2 (Apr. 1998), 653–657.
[195] Li, X. R.
Engineer’s guide to variable-structure multiple-model estimation for tracking. In Y. Bar-Shalom and W. D. Blair (Eds.), Multitarget-Multisensor Tracking: Applications and Advances, Vol. III, Boston, MA: Artech House, 2000, ch. 10, 499–567.
[196] Li, X. R. Multiple-model estimation with variable structure–Part II: Model-set adaptation. IEEE Transactions on Automatic Control, 45, 11 (Nov. 2000), 2047–2060.
[197] Li, X. R. Model-set design for multiple-model estimation–Part I. In Proceedings of the 2002 International Conference on Information Fusion, Annapolis, MD, July 2002, 26–33.
[198] Li, X. R., and Bar-Shalom, Y. Mode-set adaptation in multiple-model estimators for hybrid systems. In Proceedings of the 1992 American Control Conference, Chicago, IL, June 1992, 1794–1799.
[199] Li, X. R., and Bar-Shalom, Y. Design of an interacting multiple model algorithm for air traffic control tracking. IEEE Transactions on Control Systems Technology, Special issue on Air Traffic Control, 1, 3 (Sept. 1993), 186–194.
[200] Li, X. R., and Bar-Shalom, Y. A recursive multiple model approach to noise identification. IEEE Transactions on Aerospace and Electronic Systems, 30, 3 (July 1994), 671–684.
[201] Li, X. R., and Bar-Shalom, Y. Multiple-model estimation with variable structure. IEEE Transactions on Automatic Control, 41, 4 (Apr. 1996), 478–493.
[202] Li, X. R., and Dezert, J. Layered multiple-model algorithm with application to tracking maneuvering and bending extended target in clutter. In Proceedings of the 1998 International Conference on Information Fusion, Las Vegas, NV, July 1998, 207–214.
[203] Li, X. R., and He, C. Model-set choice for multiple-model estimation. In Proceedings of the IFAC 14th World Congress, Beijing, China, July 1999, Paper no. 3a-154, 169–174.
[204] Li, X. R., and He, C. Model-set design, choice, and comparison for multiple-model estimation.
In Proceedings of the 1999 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3809, Denver, CO, July 1999, 501–513.
[205] Li, X. R., and Jilkov, V. P. A survey of maneuvering target tracking–Part II: Ballistic target models. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, July–Aug. 2001, 559–581.
[206] Li, X. R., and Jilkov, V. P. A survey of maneuvering target tracking–Part III: Measurement models. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, July–Aug. 2001, 423–446.
[207] Li, X. R., and Jilkov, V. P. Expected-mode augmentation for multiple-model estimation. In Proceedings of the 2001 International Conference on Information Fusion, Montreal, QC, Canada, Aug. 2001, WeB1.3–WeB1.10.
[208] Li, X. R., and Jilkov, V. P. A survey of maneuvering target tracking–Part IV: Decision-based methods. In Proceedings of the 2002 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4728, Orlando, FL, Apr. 2002, 511–534.
[209] Li, X. R., and Jilkov, V. P. Survey of maneuvering target tracking. Part I: Dynamic models. IEEE Transactions on Aerospace and Electronic Systems, 39, 4 (Oct. 2003), 1333–1364.
[210] Li, X. R., and Jilkov, V. P. A survey of maneuvering target tracking–Approximation techniques for nonlinear filtering. In Proceedings of the 2004 SPIE Conference on Signal and Data Processing of Small Targets, vol. 5428, Orlando, FL, Apr. 2004.
[211] Li, X. R., Jilkov, V. P., and Ru, J-F. Multiple-model estimation with variable structure–Part VI: Expected-mode augmentation. IEEE Transactions on Aerospace and Electronic Systems, 41, 3 (July 2005).
[212] Li, X. R., Jilkov, V. P., Ru, J-F., and Bashi, A. Expected-mode augmentation algorithms for variable-structure multiple-model estimation.
In Proceedings of the IFAC 15th World Congress, Barcelona, Spain, July 2002, Paper no. 2816.
[213] Li, X. R., Slocumb, B. J., and West, P. D. Tracking in the presence of range deception ECM and clutter by decomposition and fusion. In Proceedings of the 1999 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3809, Denver, CO, July 1999, 198–210.
[214] Li, X. R., and Solanky, T. Applications of sequential tests to target tracking by multiple models. In N. Mukhopadhyay, S. Datta, and S. Chattopadhyay (Eds.), Applied Sequential Methodologies, New York: Marcel Dekker, 2004, 219–247.
[215] Li, X. R., and Zhang, Y. M. Multiple-model estimation with variable structure–Part V: Likely-model set algorithm. IEEE Transactions on Aerospace and Electronic Systems, 36, 2 (Apr. 2000), 448–466.
[216] Li, X. R., and Zhang, Y. M. Numerically robust implementation of multiple-model algorithms. IEEE Transactions on Aerospace and Electronic Systems, 36, 1 (Jan. 2000), 266–278.
[217] Li, X. R., Zhang, Y. M., and Zhi, X. R. Design and evaluation of model-group switching algorithm for multiple-model estimation with variable structure. In Proceedings of the 1997 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3163, San Diego, CA, July 1997, 388–399.
[218] Li, X. R., Zhang, Y. M., and Zhi, X. R. Multiple-model estimation with variable structure: Model-group switching algorithm. In Proceedings of the 36th IEEE Conference on Decision and Control, San Diego, CA, Dec. 1997, 3114–3119.
[219] Li, X. R., Zhang, Y. M., and Zhi, X. R. Multiple-model estimation with variable structure–Part IV: Design and evaluation of model-group switching algorithm. IEEE Transactions on Aerospace and Electronic Systems, 35, 1 (Jan. 1999), 242–254.
[220] Li, X. R., and Zhao, Z-L. Measures of performance for evaluation of estimators and filters. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, July–Aug. 2001, 530–541.
[221] Li, X. R., Zhao, Z-L., and Li, X-B.
General model-set design methods for multiple-model approach. IEEE Transactions on Automatic Control, AC-50, 9 (2005), 1260–1276.
[222] Li, X. R., Zhao, Z-L., Zhang, P., and He, C. Model-set design, choice, and comparison for multiple-model approach to hybrid estimation. In Proceedings of the Workshop on Signal Processing, Communications, Chaos and Systems, Newport, RI, June 2002, 59–92.
[223] Li, X. R., Zhao, Z-L., Zhang, P., and He, C. Model-set design for multiple-model estimation–Part II: Examples. In Proceedings of the 2002 International Conference on Information Fusion, Annapolis, MD, July 2002, 1347–1354.
[224] Li, X. R., Zhao, Z-L., and Zhang, Y. M. Multiple-model estimation with variable structure–Part III: Model-group switching algorithm. IEEE Transactions on Aerospace and Electronic Systems, 35, 1 (Jan. 1999), 225–241.
[225] Li, X. R., Zhu, Y. M., Wang, J., and Han, C. Z. Optimal linear estimation fusion–Part I: Unified fusion rules. IEEE Transactions on Information Theory, 49, 9 (Sept. 2003), 2192–2208.
[226] Lin, H-J., and Atherton, D. P. An investigation of the SFIMM algorithm for tracking manoeuvring targets. In Proceedings of the 32nd IEEE Conference on Decision and Control, San Antonio, TX, Dec. 1993, 930–935.
[227] Liu, R. H., and Zhang, Q. Nonlinear filtering: A hybrid approximation scheme. IEEE Transactions on Aerospace and Electronic Systems, 37, 2 (Apr. 2001), 470–480.
[228] Logothetis, A., and Krishnamurthy, V. MAP state sequence estimation for jump Markov linear systems via the expectation-maximization algorithm. In Proceedings of the 36th IEEE Conference on Decision and Control, San Diego, CA, Dec. 1997, 1700–1705.
[229] Logothetis, A., and Krishnamurthy, V. Expectation maximization algorithms for MAP estimation of jump Markov linear systems. IEEE Transactions on Signal Processing, 47, 8 (Aug. 1999), 2139–2156.
[230] Logothetis, A., and Krishnamurthy, V. A Bayesian EM algorithm for optimal tracking of a maneuvering target in clutter.
Signal Processing, 82, 3 (2002), 473–490.
[231] Logothetis, A., Krishnamurthy, V., and Holst, J. On maneuvering target tracking via the PMHT. In Proceedings of the 36th IEEE Conference on Decision and Control, San Diego, CA, Dec. 1997, 5024–5029. Also in [320, 157–162].
[232] Luenberger, D. G. Linear and Nonlinear Programming (2nd ed.). Reading, MA: Addison-Wesley, 1984.
[233] Lund, E. J., Balchen, J. G., and Foss, B. A. Multiple model estimation with inter-residual distance feedback. Modeling, Identification and Control, 13, 3 (1992), 127–140.
[234] Magalhaes, M. F., and Binder, Z. A true multimodel estimation algorithm. In Preprints of 10th World Congress of IFAC, vol. 10, Munich, July 1987, 260–264.
[235] Magill, D. T. Optimal adaptive estimation of sampled stochastic processes. IEEE Transactions on Automatic Control, AC-10 (1965), 434–439.
[236] Mahalanabis, A. K., Zhou, B., and Bose, N. K. Improved multi-target tracking in clutter by PDA smoothing. IEEE Transactions on Aerospace and Electronic Systems, 26 (1990).
[237] Malladi, D. P., and Speyer, J. L. A new approach to multiple model adaptive estimation. In Proceedings of the 1997 IEEE Conference on Decision and Control, San Diego, CA, 1997, 3460–3467.
[238] Malladi, D. P., and Speyer, J. L. A generalized Shiryaev sequential probability ratio test for change detection and isolation. IEEE Transactions on Automatic Control, 44, 8 (1999), 1522–1534.
[239] Masreliez, C. J. Approximate non-Gaussian filtering with linear state and observation relations. IEEE Transactions on Automatic Control, AC-20 (1975), 107–110.
[240] Masreliez, C. J., and Martin, R. D. Robust Bayesian estimation for the linear model and robustifying the Kalman filter. IEEE Transactions on Automatic Control, AC-22 (June 1977), 361–371.
[241] Mathews, V. J., and Tugnait, J. K.
Detection and estimation with fixed lag for abruptly changing systems. IEEE Transactions on Aerospace and Electronic Systems, AES-19, 5 (Sept. 1983), 730–739.
[242] Maybeck, P. S. Stochastic Models, Estimation and Control, vol. II. New York: Academic Press, 1982.
[243] Maybeck, P. S., and Hanlon, P. D. Performance enhancement of a multiple model adaptive estimator. In Proceedings of the 32nd IEEE Conference on Decision and Control, San Antonio, TX, Dec. 1993, 462–268. Also in IEEE Transactions on Aerospace and Electronic Systems, (Oct. 1995).
[244] Maybeck, P. S., and Hentz, K. P. Investigation of moving-bank multiple model adaptive algorithms. AIAA J. Guidance, Control, and Dynamics, 10, 1 (Jan.–Feb. 1987), 90–96.
[245] Maybeck, P. S., and Suizu, R. I. Adaptive tracker field-of-view variation via multiple model filtering. IEEE Transactions on Aerospace and Electronic Systems, AES-21 (July 1985), 529–539.
[246] Mazor, E., Averbuch, A., Bar-Shalom, Y., and Dayan, J. Interacting multiple model methods in target tracking: A survey. IEEE Transactions on Aerospace and Electronic Systems, 34, 1 (1998), 103–123.
[247] McGinnity, S., and Irwin, G. W. Fuzzy logic approach to manoeuvring target tracking. IEE Proceedings–Radar, Sonar, and Navigation, 145, 6 (Dec. 1998), 337–341.
[248] McLachlan, G. J., and Basford, K. E. Mixture Models: Inference and Applications to Clustering. New York: Marcel Dekker, 1988.
[249] McLachlan, G. J., and Krishnan, T. The EM Algorithm and Extensions. New York: Wiley, 1997.
[250] Meditch, J. Stochastic Linear Estimation and Control. New York: McGraw-Hill, 1969.
[251] Meditch, J. S. A survey of data smoothing for linear and nonlinear dynamic systems. Automatica, 9, 3 (Mar. 1973), 151–162.
[252] Meer, D. E., and Maybeck, P. S. Multiple model adaptive estimation for space-time point process observations. In Proceedings of the 23rd IEEE Conference on Decision and Control, Las Vegas, NV, Dec. 1984, 811–818.
[253] Mehra, R. K., Rago, C., and Seereeram, S.
Failure detection and identification using a nonlinear interactive multiple model (IMM) filtering approach with aerospace applications. In 11th IFAC Symposium on System Identification, Fukuoka, Japan, July 1997.
[254] Meila, M., and Jordan, M. Markov mixtures of experts. In R. Murray-Smith and T. A. Johansen (Eds.), Multiple Model Approaches to Modelling and Control, Taylor & Francis, 1997, ch. 5, 145–166.
[255] Meng, X. L., and Van Dyk, D. The EM algorithm–An old folk-song sung to a fast new tune. Journal of Royal Statistical Society B, 59, 3 (1997), 511–567.
[256] Meng, X. L., and Rubin, D. B. Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika, 80 (1993), 267–278.
[257] Miller, M., Drummond, O., and Perrella, A. Multiple-model filters for boost-to-coast transition of theater ballistic missiles. In Proceedings of SPIE Conference on Signal and Data Processing of Small Targets 1998, vol. 3373, 1998, 355–376.
[258] Mookerjee, P., Campo, L., and Bar-Shalom, Y. Estimation in systems with semi-Markov switching model. In Proceedings of the 26th IEEE Conference on Decision and Control, Dec. 1987.
[259] Mookerjee, P., Campo, L., and Bar-Shalom, Y. Sojourn time distribution in a class of semi-Markov chains. In Proceedings of the 1987 Conference Information Science Systems, Johns Hopkins University, Mar. 1987.
[260] Moose, R. L. An adaptive state estimator solution to the maneuvering target problem. IEEE Transactions on Automatic Control, AC-20, 3 (June 1975), 359–362.
[261] Moose, R. L. Passive range estimation of an underwater maneuvering target. IEEE Transactions on Acoustic, Speech, and Signal Processing, ASSP-35, 3 (Mar. 1987), 274–285.
[262] Moose, R. L., and Dailey, T. E. Adaptive underwater target tracking using passive multipath time-delay measurements. IEEE Transactions on Acoustic, Speech, and Signal Processing, ASSP-33 (Aug. 1985), 777–787.
[263] Moose, R. L., and Godiwala, P. M.
Passive depth tracking of underwater maneuvering targets. IEEE Transactions on Acoustic, Speech, and Signal Processing, ASSP-33 (Aug. 1985), 1040–1044.
[264] Moose, R. L., Sistanizadeh, M., and Skagfjord, G. Adaptive state estimation for a system with unknown input and measurement bias. IEEE Journal of Oceanic Engineering, (Jan. 1987), 222–227.
[265] Moose, R. L., Sistanizadeh, M. K., and Skagejord, G. Adaptive estimation for a system with unknown measurement bias. IEEE Transactions on Aerospace and Electronic Systems, AES-22, 6 (Nov. 1986), 732–739.
[266] Moose, R. L., VanLandingham, H. F., and McCabe, D. H. Modeling and estimation of tracking maneuvering targets. IEEE Transactions on Aerospace and Electronic Systems, AES-15, 3 (May 1979), 448–456.
[267] Moose, R. L., and Wang, P. L. An adaptive estimator with learning for a plant containing semi-Markov switching parameters. IEEE Transactions on Systems, Man, Cybernetics, SMC-3 (May 1973), 277–281.
[268] Mortensen, R. E. Maximum likelihood recursive nonlinear filtering. Journal of Optim. Theory Application, 2 (1968), 386–394.
[269] Mosier, D., and Sundareshan, M. A multiple model for passive ranging. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, 2001, 222–233.
[270] Munir, A., and Atherton, D. P. Maneuvering target tracking using an adaptive interacting multiple model algorithm. In Proceedings of the 1994 American Control Conference, Baltimore, MD, June 1994.
[271] Munir, A., and Atherton, D. P. Adaptive interacting multiple model algorithm for tracking a manoeuvring target. IEE Proceedings–Radar, Sonar, and Navigation, 142, 1 (Feb. 1995), 11–17.
[272] Murray-Smith, R., and Johansen, T. A. (Eds.) Multiple Model Approaches to Modelling and Control. Taylor & Francis, 1997.
[273] Neven, W. H. L., Blom, H. A. P., and de Kraker, P. C.
Jump linear model based aircraft trajectory reconstruction. In Proceedings of the 1994 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2235, Orlando, FL, 1994, 540–556.
[274] Ng, C. W., Lau, A., and How, K. Y. Auto-tuning interactive multiple model. In Proceedings of the SPIE Conference on Acquisition, Tracking, and Pointing, XII, Orlando, FL, Apr. 1998.
[275] Noe, B. J., and Collins, N. Variable structure interacting multiple model filter (VS-IMM) for tracking targets with transportation network constraints. In Proceedings of the 2000 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4048, Orlando, FL, Apr. 2000, 247–258.
[276] Oshman, Y., Shinar, J., and Weizman, S. A. Using a multiple model adaptive estimator in a random evasion missile/aircraft encounter. AIAA Journal of Guidance, Control, and Dynamics, 24, 6 (2001), 1176–1186.
[277] Owen, M. W., and Stubberud, S. C. Interacting multiple model tracking using a neural extended Kalman filter. In Proceedings of the International Joint Conference on Neural Networks, 1999, 2788–2791.
[278] Pan, Q., Jia, Y. G., and Zhang, H. G. A d-step fixed-lag smoothing algorithm for Markovian switching systems. In Proceedings of the 2002 International Conference on Information Fusion, Annapolis, MD, July 2002, 721–726.
[279] Papoulis, A., and Pillai, S. U. Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill, 2002.
[280] Pattipati, K. R., and Sandell, N. R., Jr. A unified view of state estimation in switching environments. In Proceedings of the 1983 American Control Conference, 1983, 458–465.
[281] Petridis, V., and Kehagias, A. A multi-model algorithm for parameter estimation of time varying nonlinear systems. Automatica, 34 (1998), 469–475.
[282] Petrov, A. I., and Zubov, A. G. On applicability of the interacting multiple-model approach to state estimation for systems with sojourn-time dependent Markov model switching. IEEE Transactions on Automatic Control, 41, 1 (Jan. 1996), 136–140.
[283] Pitre, R. R., Jilkov, V.
P., and Li, X. R. A comparative study of multiple-model algorithms for maneuvering target tracking. In Proceedings of the 2005 SPIE Conference Signal Processing, Sensor Fusion, and Target Recognition XIV, Orlando, FL, Mar.–Apr. 2005.
[284] Pulford, G., and Evans, R. J. A survey of HMM tracking with emphasis on over-the-horizon radar. Technical Report 7, CSSIP, Australia, May 1995.
[285] Pulford, G., and La Scala, B. Manoeuvring target tracking using the expectation-maximisation algorithm. In 4th International Conference On Control, Automation, Robotics & Vision, Singapore, Dec. 1996. Also in [320, 295–299].
[286] Pulford, G., and La Scala, B. MAP estimation of target manoeuvre sequence with the expectation-maximisation algorithm. In Studies in Probabilistic Multi-Hypothesis Tracking and Related Topics, vol. SES-98-01, Naval Undersea Warfare Center Division, Newport, RI, Feb. 1998, 277–292.
[287] Pulford, G., and La Scala, B. MAP estimation of target manoeuvre sequence with the expectation-maximisation algorithm. IEEE Transactions on Aerospace and Electronic Systems, 38, 2 (Apr. 2002), 367–377.
[288] Qiao, X., and Wang, B. A new approach to grid adaptation of AGIMM algorithm. In Proceedings of the 2003 International Conference on Information Fusion, Cairns, Australia, July 8–11, 2003, 400–405.
[289] Rabiner, L. R. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77, 2 (Feb. 1989), 257–286.
[290] Rabiner, L. R., and Juang, B. H. An introduction to hidden Markov models. IEEE ASSP Magazine, (Jan. 1986), 4–16.
[291] Rauch, H. E., Tung, F., and Striebel, C. T. Maximum likelihood estimation of linear dynamic systems. AIAA Journal, 3 (Aug. 1965), 1445–1450.
[292] Redner, R. A., and Walker, H. F. Mixture densities, maximum likelihood and the EM algorithm. SIAM Review, 26, 2 (Apr. 1984).
[293] Reid, D. B. An algorithm for tracking multiple targets.
IEEE Transactions on Automatic Control, AC-24 (Dec. 1979), 843–854.
[294] Ristic, B., and Arulampalam, M. S. Tracking a manoeuvring target using angle-only measurements: Algorithms and performance. Signal Processing, 83, 6 (2003), 1223–1238.
[295] Ristic, B., Arulampalam, S., and Gordon, N. Beyond the Kalman Filter: Particle Filters for Tracking Applications. Norwood, MA: Artech House, 2004.
[296] Rozovskii, B. L., Petrov, A., and Blazek, R. B. Interacting banks of Bayesian matched filters. In Proceedings of the 2000 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4048, Orlando, FL, Apr. 2000.
[297] Ru, J-F., and Li, X. R. Interacting multiple-model algorithm with maximum likelihood estimation for FDI. In Proceedings of the 2003 IEEE International Symposium on Intelligent Control, Houston, TX, Oct. 2003, 661–666.
[298] Ruan, Y., and Willett, P. Maneuvering PMHTs. In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, 2001, 186–197.
[299] Salmond, D. J. Mixture reduction algorithms for uncertain tracking. Technical Report 88004, Royal Aerospace Establishment, Farnborough, England, Jan. 1988.
[300] Salmond, D. J. Mixture reduction algorithms for target tracking in clutter. In Proceedings of the 1990 SPIE Conference on Signal and Data Processing of Small Targets, vol. 1305, 1990, 434–445.
[301] Schiller, G. J., and Maybeck, P. S. Control of a large space structure using MMAE/MMAC techniques. IEEE Transactions on Aerospace and Electronic Systems, 33, 4 (Oct. 1997), 1122–1131.
[302] Schnepper, K. A comparison of GLR and multiple model filters for a target tracking problem. In Proceedings of the 25th IEEE Conference on Decision and Control, Athens, Greece, Dec. 1986, 666–670.
[303] Schutz, R., Engelberg, B., Soper, W., and Mottl, R. IMM modeling for AEW applications.
In Proceedings of the 2001 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473, San Diego, CA, 2001, 210–221.
[304] Semerdjiev, E., and Mihaylova, L. Adaptive interacting multiple model algorithm for manoeuvring ship tracking. In Proceedings of the 1998 International Conference on Information Fusion, Las Vegas, NV, July 1998, 974–979.
[305] Semerdjiev, E., Mihaylova, L., and Li, X. R. An adaptive IMM estimator for aircraft tracking. In Proceedings of the 1999 International Conference on Information Fusion, Sunnyvale, CA, July 1999, 770–776.
[306] Semerdjiev, E., Mihaylova, L., and Li, X. R. Variable- and fixed-structure augmented IMM algorithm using coordinate turn model. In Proceedings of the 2000 International Conference on Information Fusion, Paris, France, July 2000, MoD2.25–MoD2.32.
[307] Semerdjiev, E., Mihaylova, L., and Semerdjiev, T. Manoeuvring ship model identification and interacting multiple model tracking algorithm design. In Proceedings of the 1998 International Conference on Information Fusion, Las Vegas, NV, July 1998, 968–973.
[308] Sengbush, R. L., and Lainiotis, D. G. Simplified parameter quantization procedure for adaptive estimation. IEEE Transactions on Automatic Control, AC-14 (Aug. 1969), 424–425.
[309] Shafer, G. A Mathematical Theory of Evidence. Princeton, NJ: Princeton University Press, 1976.
[310] Shea, P. J., Zadra, T., Klamer, D., Frangione, E., and Brouillard, R. Improved state estimation through use of roads in ground tracking. In Proceedings of the 2000 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4048, Orlando, FL, Apr. 2000, 321–332.
[311] Sheldon, S. N., and Maybeck, P. S. An optimizing design strategy for multiple model adaptive estimation and control. IEEE Transactions on Automatic Control, 38, 4 (Apr. 1993), 651–654.
[312] Shima, T., Oshman, Y., and Shinar, J. Efficient multiple model adaptive estimation in ballistic missile interception scenarios. AIAA Journal of Guidance, Control, and Dynamics, 25, 4 (2002), 667–675.
[313] Shin, H-J., Hong, S-M., and Hong, D-H. Adaptive-update-rate target tracking for phased-array radar. IEE Proceedings, Pt. G, 142, 2 (1995).
[314] Simon, D., and Chia, T. L. Kalman filtering with state equality constraints. IEEE Transactions on Aerospace and Electronic Systems, 38, 1 (Jan. 2002), 128–136.
[315] Sims, F. L., Lainiotis, D. G., and Magill, D. T. Recursive algorithm for the calculation of the adaptive Kalman filter weighting coefficients. IEEE Transactions on Automatic Control, AC-14 (Apr. 1969), 215–218.
[316] Slocumb, B. J., West, P. D., and Li, X. R. Implementation and analysis of the decomposition-fusion ECCM technique. In Proceedings of the 2000 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4048, Orlando, FL, Apr. 2000, 486–497.
[317] Song, T. L., and Lee, D. G. Effective filtering of target glint. IEEE Transactions on Aerospace and Electronic Systems, 36, 1 (Jan. 2000), 234–240.
[318] Streit, R., and Luginbuhl, T. E. A probabilistic multi-hypothesis tracking algorithm without enumeration and pruning. In Proceedings of the 6th Joint Service Data Fusion Symposium, Laurel, MD, June 14–18, 1993, 1015–1024.
[319] Streit, R., and Luginbuhl, T. E. Maximum likelihood method for probabilistic multi-hypothesis tracking. In Proceedings of the 1994 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2335, Apr. 1994.
[320] Streit, R. L. (Ed.) Scientific and Engineering Studies, Studies in Probabilistic Multi-Hypothesis Tracking and Related Topics, vol. SES-98-01, Naval Undersea Warfare Center Division, Newport, RI, Feb. 1998.
[321] Sugimoto, S., and Ishizuka, I. Identification and estimation algorithms for Markov chain plus AR process. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 83), 1983, 247–250.
[322] Sviestins, E. Multi-radar tracking for theater missile defence.
In Proceedings of the 1995 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2561, San Diego, CA, July 1995, 384–394.
[323] Sworder, D., and Boyd, J. Enhanced multiple model algorithms. Automatica, 2000.
[324] Sworder, D., and Boyd, J. Maneuver sequence identification. In Proceedings of the 2003 SPIE Conference on Signal and Data Processing of Small Targets, vol. 5204, San Diego, CA, Aug. 2003.
[325] Sworder, D., Boyd, J., and Elliott, R. Modal estimation in hybrid systems. Journal of Mathematical Analysis and Applications, 245, 1 (2002), 225–247.
[326] Sworder, D., and Boyd, J. Estimation Problems in Hybrid Systems. New York: Cambridge University Press, 1999.
[327] Sworder, D., and Boyd, J. E. A new merging formula for multiple model trackers. In Proceedings of the 2000 SPIE Conference on Signal and Data Processing of Small Targets, vol. 4048, Apr. 2000, 498–509.
[328] Sworder, D., and Boyd, J. E. Measurement rate reduction in hybrid systems. AIAA Journal of Guidance, Control, and Dynamics, 24, 2 (2001), 411–414.
[329] Sworder, D. D., and Hutchins, R. G. Utility of imaging sensors in tracking systems. Automatica, 29, 2 (Mar. 1993), 445–449.
[330] Sworder, D. D., Singer, P. F., and Hutchins, R. G. Image-enhanced estimation methods. Proceedings of the IEEE, 81, 6 (June 1993), 797–812.
[331] Tanner, G. Accounting for glint in target tracking. In Proceedings of the 1998 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3373, Orlando, FL, Apr. 1998.
[332] Thorp, J. S. Optimal tracking of maneuvering targets. IEEE Transactions on Aerospace and Electronic Systems, AES-9 (July 1973), 512–519.
[333] Titterington, D. M., Smith, A. F. M., and Makov, U. E. Statistical Analysis of Finite Mixture Distributions. New York: Wiley, 1985.
[334] Tobin, D. M., and Maybeck, P. S.
Enhancements to a multiple model adaptive estimator–Target image tracker. IEEE Transactions on Aerospace and Electronic Systems, 24 (July 1988), 417–425.
[335] Tugnait, J. K. Adaptive estimation and identification for discrete systems with Markov jump parameters. IEEE Transactions on Automatic Control, AC-27, 5 (Oct. 1982), 1054–1065.
[336] Tugnait, J. K. Detection and estimation for abruptly changing systems. Automatica, 18, 5 (Sept. 1982), 607–615.
[337] Tugnait, J. K. Detection and identification of abrupt changes in linear systems. In Proceedings of the 1983 American Control Conference, June 1983, 960–965.
[338] Tugnait, J. K., and Haddad, A. H. A detection-estimation scheme for state estimation in switching environments. Automatica, 15, 4 (July 1979), 477–481.
[339] Tzafestas, S., and Watanabe, K. Techniques for adaptive estimation and control of discrete-time stochastic systems with abruptly changing systems. In C. T. Leondes (Ed.), Advances in Control and Dynamic Systems, Vol. 55, New York: Academic Press, 1993, 111–148.
[340] Vacher, P., Barret, I., and Gauvrit, M. Design of a tracking algorithm for an advanced ATC system. In Y. Bar-Shalom (Ed.), Multitarget-Multisensor Tracking: Applications and Advances, vol. II, Norwood, MA: Artech House, 1992, ch. 1.
[341] VanLandingham, H. F., and Moose, R. L. Digital control of high performance aircraft using adaptive estimation techniques. IEEE Transactions on Aerospace and Electronic Systems, AES-13, 2 (Mar. 1977), 112–120.
[342] Varon, D. New advances in air traffic control tracking of aircraft. Journal of Air Traffic Control, (Oct.–Dec. 1994), 6–12.
[343] Vasquez, J. R., and Maybeck, P. S. Density algorithm based moving-bank MMAE. In Proceedings of the 1999 IEEE Conference on Decision and Control, Phoenix, AZ, Dec. 1999, 4117–4122.
[344] Vasquez, J. R., and Maybeck, P. S. Enhanced motion and sizing of bank in moving-bank MMAE. In Proceedings of the 1999 American Control Conference, San Diego, CA, June 1999, 1555–1562.
[345] Wang, H., Kirubarajan, T., and Bar-Shalom, Y.
Precision large scale air traffic surveillance using IMM/assignment estimators. IEEE Transactions on Aerospace and Electronic Systems, 35, 1 (Jan. 1999), 255–266.
[346] Wang, X., Challa, S., Evans, R., and Li, X. R. Minimal sub-model-set algorithm for maneuvering target tracking. IEEE Transactions on Aerospace and Electronic Systems, 39, 4 (Oct. 2003), 1218–1231.
[347] Watanabe, K. Adaptive Estimation and Control: Partitioning Approach. New York: Prentice-Hall, 1992.
[348] Watanabe, K., and Tzafestas, S. G. A hierarchical multiple model adaptive control of discrete-time stochastic systems for sensor and actuator uncertainties. Automatica, 26, 5 (Sept. 1990), 875–886.
[349] Watson, G. A. IMAM algorithm for tracking maneuvering targets in clutter. In Proceedings of the 1996 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2759, 1996, 304–315.
[350] Watson, G. A., and Blair, W. D. Interacting acceleration compensation algorithm for tracking maneuvering targets. IEEE Transactions on Aerospace and Electronic Systems, 31, 3 (July 1995), 1152–1159.
[351] Watson, G. A., and Blair, W. D. IMM algorithm for tracking targets that maneuver through coordinated turns. In Proceedings of Signal and Data Processing for Small Targets, vol. SPIE 1698, Apr. 1992, 236–247.
[352] Watson, G. A., and Blair, W. D. Multiple model estimation for control of phased array radar. In Proceedings of the 1993 SPIE Conference on Signal and Data Processing of Small Targets, Orlando, FL, Apr. 1993, 275–286.
[353] Watson, G. A., and Blair, W. D. Tracking targets with multiple sensors using the interacting multiple model algorithm. In Proceedings of the 1993 SPIE Conference on Signal and Data Processing of Small Targets, Orlando, FL, Apr. 1993.
[354] Watson, G. A., and Blair, W. D. Revisit control of a phased array radar for tracking maneuvering targets when supported by a precision ESM sensor.
In Proceedings of the 1994 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2235, Orlando, FL, 1994.
[355] Watson, G. A., and Blair, W. D. Solution to second benchmark problem for tracking maneuvering targets in the presence of FA and ECM. In Proceedings of the 1995 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2561, San Diego, CA, July 1995.
[356] Whang, I. H., and Lee, J. G. Maneuvering target tracking via model transition hypotheses. In Proceedings of the 35th IEEE Conference on Decision and Control, Japan, Dec. 1996, 3157–3158.
[357] Wheaton, B. J., and Maybeck, P. S. Second-order acceleration model for an MMAE target tracker. IEEE Transactions on Aerospace and Electronic Systems, 31, 1 (1995), 151–166.
[358] Willett, P., Ruan, Y., and Streit, R. The PMHT for maneuvering target tracking. In Proceedings of the 1998 SPIE Conference on Signal and Data Processing of Small Targets, vol. 3373, Orlando, FL, 1998, 416–427. Also in [320, 165–176].
[359] Willett, P., Ruan, Y., and Streit, R. PMHT: Problems and some solutions. IEEE Transactions on Aerospace and Electronic Systems, 38, 3 (July 2002), 738–753.
[360] Wu, W-R. Target tracking with glint noise. IEEE Transactions on Aerospace and Electronic Systems, 29, 1 (Jan. 1993), 174–185.
[361] Wu, W-R., and Chang, D-C. Maneuvering target tracking with colored noise. IEEE Transactions on Aerospace and Electronic Systems, 32, 4 (Oct. 1996), 1311–1319.
[362] Wu, W-R., and Cheng, P-P. A nonlinear IMM algorithm for maneuvering target tracking. IEEE Transactions on Aerospace and Electronic Systems, 30, 3 (July 1994), 875–885.
[363] Yang, C., and Bar-Shalom, Y. Discrete-time point process filter for image-based target mode estimation. In Proceedings of the 29th IEEE Conference on Decision and Control, Honolulu, HI, Dec. 1990.
[364] Yang, C., Bar-Shalom, Y., and Lin, C-F.
Discrete-time point process filter for mode estimation. IEEE Transactions on Automatic Control, 37, 11 (1992), 1812–1816.
[365] Yang, M., Ru, J-F., Chen, H-M., Li, X. R., and Rao, N. S. V. Predicting internet end-to-end delay: A statistical case study. In Annual Review of Communications, vol. 58, (International Engineering Consortium), 2005.
[366] Yeddanapudi, M., Bar-Shalom, Y., and Pattipati, K. R. MATSurv: Multisensor air traffic surveillance data. In Proceedings of the 1995 SPIE Conference on Signal and Data Processing of Small Targets, vol. 2561, 1995.
[367] Yeddanapudi, M., Bar-Shalom, Y., and Pattipati, K. R. IMM estimation for multitarget-multisensor air traffic surveillance. Proceedings of the IEEE, 85, 1 (Jan. 1997), 80–94.
[368] Yeom, S-W., Kirubarajan, T., and Bar-Shalom, Y. Track segment association, fine-step IMM and initialization with Doppler for improved track performance. IEEE Transactions on Aerospace and Electronic Systems, 40, 1 (Jan. 2004), 293–309.
[369] Yoon, J., Park, Y. H., Whang, I. H., and Seo, J. H. An evidential reasoning approach to maneuvering target tracking. In Proceedings of the AIAA Conference Guidance, Navigation, and Control, New Orleans, LA, Aug. 1997.
[370] Zhang, Q. Hybrid filtering for linear systems with non-Gaussian disturbances. IEEE Transactions on Automatic Control, 45, 1 (2000), 50–61.
[371] Zhang, Y. M., and Li, X. R. Detection and diagnosis of sensor and actuator failures using IMM estimator. IEEE Transactions on Aerospace and Electronic Systems, 34, 4 (Oct. 1998), 1293–1312.
[372] Zhao, Z.-L., Li, X. R., and Jilkov, V. P. Optimal linear unbiased filtering with nonlinear radar measurements for target tracking. IEEE Transactions on Aerospace and Electronic Systems, 40, 4 (Oct. 2004), 1324–1336.
[373] Zuo, D., Han, C., Bian, S., Zheng, L., and Zhu, H. Tracking maneuvering target in glint environment. In Proceedings of the 2003 International Conference on Information Fusion, Cairns, Australia, July 2003, 1394–1399.
[374] Zuo, D., Han, C., Lin, Z., Zhu, H., and Hong, H.
Fuzzy multiple model tracking algorithm for maneuvering target. In Proceedings of the 2002 International Conference on Information Fusion, Annapolis, MD, July 2002, 818–823.
[375] Benameur, K., Pannetier, B., and Nimier, V. A comparative study on the use of road network information in GMTI tracking. In Proceedings 2005 International Conference on Information Fusion, Philadelphia, PA, July 2005.
[376] Blasch, E. P., and Yang, C. Ten ways to fuse GMTI and HRRR measurements for joint tracking and identification. In Proceedings 2004 International Conference on Information Fusion, Vol. II, 1006–1013, Stockholm, Sweden, June 2004.
[377] Cheng, Y., and Singh, T. Efficient particle filtering for road-constrained target tracking. In Proceedings 2005 International Conference on Information Fusion, Philadelphia, PA, July 2005.
[378] Gattein, S., Pannetier, B., and Vannoorenberghe, P. Analysis and integration of road projection methods for multiple road target initiation and tracking. In Proceedings 2005 International Conference on Information Fusion, Philadelphia, PA, July 2005.
[379] Golino, G., and Farina, A. Plot-to-track correlation in A-SMGCS using the target images from a surface movement radar. In Proceedings 2004 International Conference on Information Fusion, Vol. II, 999–1005, Stockholm, Sweden, June 2004.
[380] Kaempchen, N., and Dietmayer, K. C. J. IMM vehicle tracking for traffic jam situations on highways. In Proceedings 2004 International Conference on Information Fusion, Vol. II, 868–875, Stockholm, Sweden, June 2004.
[381] Maybeck, P. S., and Smith, B. D. Multiple model tracker based on Gaussian mixture reduction for maneuvering targets in dense clutter. In Proceedings 2005 International Conference on Information Fusion, Philadelphia, PA, July 2005.
[382] Opitz, F., and Kausch, T. UKF controlled variable-structure IMM algorithms using coordinated turn models. In Proceedings 2004 International Conference on Information Fusion, Vol.
II, 138–145, Stockholm, Sweden, June 2004.
[383] Pannetier, B., Benameur, K., Nimier, V., and Rombaut, M. VS-IMM using map information for a ground target tracking. In Proceedings 2005 International Conference on Information Fusion, Philadelphia, PA, July 2005.
[384] Schrempf, O. C., Feiermann, O., and Hanebeck, U. D. Optimal mixture approximation of the product of mixtures. In Proceedings 2005 International Conference on Information Fusion, Philadelphia, PA, July 2005.
[385] Yang, C., Bakich, M., and Blasch, E. P. Pose angular-aiding for maneuvering target tracking. In Proceedings 2005 International Conference on Information Fusion, Philadelphia, PA, July 2005.
[386] Zhao, Z., and Li, X. R. The behavior of model probability in multiple model algorithms. In Proceedings 2005 International Conference on Information Fusion, Philadelphia, PA, July 2005.

X. Rong Li (S’90–M’92–SM’95–F’04) received the B.S. and M.S. degrees from Zhejiang University, Hangzhou, Zhejiang, PRC, in 1982 and 1984, respectively, and the M.S. and Ph.D. degrees from the University of Connecticut, USA, in 1990 and 1992, respectively. He joined the Department of Electrical Engineering, University of New Orleans in 1994, where he is now university research professor, department chair, and director of Information and Systems Technology Research Center. During 1986–1987 he did research on electric power at the University of Calgary, AB, Canada. He was an assistant professor at the University of Hartford, West Hartford, CT, from 1992 to 1994. He has authored or coauthored four books: Estimation and Tracking (with Yaakov Bar-Shalom, Norwood, MA: Artech House, 1993), Multitarget-Multisensor Tracking (with Yaakov Bar-Shalom, Storrs, CT: YBS Publishing, 1995), Probability, Random Signals, and Statistics (Boca Raton, FL: CRC Press, 1999), and Estimation with Applications to Tracking and Navigation (with Yaakov Bar-Shalom and T.
Kirubarajan, New York: Wiley, 2001); seven book chapters; and more than 200 journal and conference proceedings papers. His current research interests include signal and data processing, target tracking, information fusion, stochastic systems, statistical inference, and electric power. Dr. Li has served the International Society of Information Fusion as president (2003), vice president (1998–2002), a member of the Board of Directors (since 1998), general chair for the 2002 International Conference on Information Fusion, and steering chair or general vice-chair for the 1998, 1999, and 2000 International Conferences on Information Fusion; served IEEE Transactions on Aerospace and Electronic Systems as associate editor from 1995 to 1996 and as editor from 1996 to 2003; served Communications in Information and Systems as editor since 2001; and received a CAREER award and an RIA award from the U.S. National Science Foundation. He received the 1996 Early Career Award for Excellence in Research from the University of New Orleans and has given numerous seminars and short courses in North America, Europe, Asia, and Australia. He has won several outstanding paper awards, is listed in Marquis’ Who’s Who in America and Who’s Who in Science and Engineering, and has consulted for several companies.

Vesselin P. Jilkov (M’01) received his B.S. and M.S. degrees in mathematics from the University of Sofia, Bulgaria in 1982, the Ph.D. degree in the technical sciences in 1988, and the academic rank senior research fellow of the Bulgarian Academy of Sciences in 1997. He was a research scientist with the R&D Institute of Special Electronics, Sofia, (1982–1988) where he was engaged in research and development of radar tracking systems.
From 1989 to 1999 he was a research scientist with the Central Laboratory for Parallel Processing–Bulgarian Academy of Sciences, Sofia, where he worked as a key researcher in numerous academic and industry projects (Bulgarian and international) in the areas of Kalman filtering, target tracking, multisensor data fusion, and parallel processing. Since 1999 Dr. Jilkov has been with the Department of Electrical Engineering, University of New Orleans, where he is currently an assistant professor and is engaged in teaching and conducting research in the areas of hybrid estimation and target tracking. His current research interests include stochastic systems, nonlinear filtering, applied estimation, target tracking, and information fusion. Dr. Jilkov is author/coauthor of over 55 journal articles and conference papers. He is a member of ISIF (International Society of Information Fusion).