Sequential Adaptive Sensor Management – A. Hero

[Figure: targets in the x-y plane]
• Single-target state vector:
• Sequential: only one sensor deployed at a time
• Adaptive: next sensor selection based on present and past measurements
• Multi-modality: sensor modes can be switched at each time
• Detection/Classification/Tracking: task is to minimize decision error
• Centralized decision making: sensor has access to entire set of previous measurements
• Smart targets: may hide from active sensor

System Block Diagram

[Block diagram: Sensor Scheduler, Performance Monitor/Predictor, Preprocessor, Feature Selector/Extractor, Confidence, Feature Mapper, Detector/Classifier; signal labels: Actions, Prediction, Decisions, a1, a3]

Adaptive Sequential Acquisition

• Sensor acquires data
• Adaptive sensor scheduling having density
• Sensor selection criteria: design to
  – Minimize predicted MSE, Pe, (Pm, Pf), time-to-detect, etc.
  – Maximize predicted information gain (Kreucher&etal:IPSN03)

[Figure: panels for k=1, k=2, k=3]

Multitarget Tracking via a Particle Filter Representation of the JMPD

• Time update: evolve density according to the Chapman-Kolmogorov equation
  – Propagate particles forward in time
  – Add/remove partitions to particles to account for target birth/death
• Measurement update: update density via Bayes' rule
  – Update particle weights based on measurements z
  – Resample

Progress (since June 04)

• Developed novel multitarget particle filter to represent the JMPD and propagate through time
• Developed method of adaptively factorizing the JMPD when applicable to allow for computationally tractable proposals
• Developed interacting multiple model formulation
• Studied the effect of mismatch in target motion models on filter performance
• Developed an importance density method for simultaneous detection and tracking that accounts for target arrival and removal
• Developed sensor models based on realistic GMTI, ATR, and SAR sensors
• Developed model for multimodality sensor that provides both kinematic and identification information and used for simultaneous detect, track, and ID of
10 real targets

Information Based Sensor Resource Allocation

Progress (since June 04)

• Developed a method of information prediction based on computing the Expected Renyi Divergence between prior and posterior JMPD
• Implemented method using particle filter representation of the JMPD
• Studied the effect of mismatch in target motion models on filter performance
• Compared “task-driven” optimization to “information-driven” optimization
• Developed value-to-go approximation for tractable approximate non-myopic scheduling
• Developed reinforcement learning methods for non-myopic scheduling and applied to “smart” target problem using a multi-modality sensor
• Simulated sensor management for simultaneous detect/track and ID with multi-modality sensor

[Flow diagram]
• Predict information gain for each possible sensing action
  – Time update the JMPD
  – Compute expected information gain between time updated JMPD and time/measurement updated JMPD
• Make best observation
• Measurement update the JMPD

Progress Highlighted Today

1. Particle Filtering for simultaneous detection, tracking, and identification (Kreucher&Etal Aerospace2005)
2. Investigation of sensitivity to model mismatch
3. Multi-modality non-myopic sensor management via Reinforcement Learning and Value-to-go Approximation (Kreucher&Hero:ICASSP2005)
4.
Optimal multi-stage design of experiments for adaptive waveform design (Rangarajan&etal:ICASSP2005)

Progress 1: PF for Simultaneous Detection, Tracking and Identification

• JMPD formulation simultaneously addresses detection, tracking and identification
• Until recently, our PF implementation has ignored the detection problem
  – Problem becomes significantly more complicated when target number is unknown and time varying
  – There is a non-zero probability of a new target arriving at each position within the surveillance area (leads to exponential explosion of possibilities)
  – Particle filter implementation must use an importance density that efficiently samples from distributions on target number and target state
• Solution is a measurement-directed importance density that is biased toward proposing new targets in areas of high (accumulated) likelihood and toward removing targets in areas of low likelihood
• This extension allows us to solve the complete problem: target detection, tracking and identification via sensor management with no initial knowledge about the number and states of the targets.
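The idea of a measurement-directed importance density can be illustrated with a toy sketch. This is not the authors' implementation: the surveillance area is reduced to a handful of grid cells, the arrival prior is assumed uniform, and the names `birth_proposal`, `sample_births`, and the `floor` parameter are hypothetical. New-target locations are proposed in proportion to accumulated likelihood, and importance weights (prior divided by proposal) correct for the bias.

```python
import numpy as np

def birth_proposal(acc_likelihood, floor=1e-3):
    """Proposal q over cells, proportional to accumulated likelihood.

    The small floor (an assumption) keeps every cell reachable, so the
    importance weights below stay finite.
    """
    q = np.asarray(acc_likelihood, dtype=float) + floor
    return q / q.sum()

def sample_births(acc_likelihood, n_samples, rng):
    """Draw candidate birth cells from q and return normalized importance weights."""
    q = birth_proposal(acc_likelihood)
    prior = np.full(q.size, 1.0 / q.size)       # uniform arrival prior (assumption)
    cells = rng.choice(q.size, size=n_samples, p=q)
    weights = prior[cells] / q[cells]           # importance correction: prior / proposal
    return cells, weights / weights.sum()

# Toy usage: cell 2 has accumulated most of the measurement likelihood.
acc = [0.1, 0.1, 5.0, 0.1]
q = birth_proposal(acc)
cells, weights = sample_births(acc, n_samples=1000, rng=np.random.default_rng(0))
```

Most proposals land in the high-likelihood cell, while the weights keep the weighted sample consistent with the assumed prior. A removal proposal biased toward low-likelihood areas could be built the same way.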
Simultaneous Detection, Tracking and Identification

• Simulation result
  – No tip-offs at startup
    • Unknown number of targets
    • Unknown position & velocities
  – Goal is to detect and track the ten real targets
• Monte Carlo testing on the algorithm
  – Performance measured in two ways:
    • The number of targets correctly detected and tracked versus time (true number of targets is 10)
    • The filter estimate of target number versus time (true number of targets is 10)

Simultaneous Detection, Tracking and Identification

• Simulation result
  – No tip-offs about anything at startup
    • Unknown number of targets
    • Unknown position, velocity, ID
  – Goal is to detect, track and identify the ten real targets
• Performance measured in two ways:
  – The number of targets correctly detected and tracked versus time (truth is 10)
  – The filter estimate of target number versus time (truth is 10)

Progress 2: Effect of model mismatch

Approach

• We investigate the effect of mismatch between the filter estimate of SNR and the actual SNR
• Experiment: 10 (real) targets with myopic SM.
• CFAR detection w/ pf = .001, and pd = pf^(1/(1+SNR*M))
  – i.e. Rayleigh distributed energy returns from both background & signal. Threshold set for pf = .001.
  – For a constant pf, SNR determines what pd is
• Filter has an estimate of SNR (and hence pd) and uses this for SM and filtering. What is the effect on tracking of erroneous SNR info?
• Bottom line: Filter appears quite robust to mismatch in SNR, pd, pf, target model.
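A short sketch may help show how an SNR estimate turns into pd and then into the information gain used for myopic SM. It reads the slide's CFAR relation as pd = pf^(1/(1+SNR*M)). The factor M is not defined on the slide, so it appears here as a hypothetical multiplier `m` defaulting to 1. The gain function assumes a single target whose particles carry only a cell index, a thresholded detect/no-detect measurement, and a Renyi order alpha = 0.5. All of these are illustration choices, not the authors' implementation.

```python
import numpy as np

def detection_probability(pf, snr, m=1.0):
    """CFAR relation for Rayleigh returns: pd = pf ** (1 / (1 + snr * m)).

    m is a hypothetical stand-in for the slide's undefined factor M.
    """
    return pf ** (1.0 / (1.0 + snr * m))

def expected_renyi_gain(weights, cells, look_cell, pd, pf, alpha=0.5):
    """Expected Renyi divergence between posterior and prior for one look."""
    p_hit = np.where(cells == look_cell, pd, pf)    # P(detection | particle)
    gain = 0.0
    for lik in (p_hit, 1.0 - p_hit):                # outcomes: detection, no detection
        pz = np.sum(weights * lik)                  # predicted probability of the outcome
        if pz > 0.0:
            # D_alpha(posterior || prior) = ln(sum_i w_i (lik_i / pz)^alpha) / (alpha - 1)
            gain += pz * np.log(np.sum(weights * (lik / pz) ** alpha)) / (alpha - 1.0)
    return gain

pf = 0.001
pd_assumed = detection_probability(pf, snr=2.0)     # filter's (possibly wrong) SNR
pd_actual = detection_probability(pf, snr=9.0)      # SNR actually seen by the sensor

# Four equally weighted particles: three in cell 0, one in cell 1, none in cell 2.
weights = np.full(4, 0.25)
cells = np.array([0, 0, 0, 1])
gains = [expected_renyi_gain(weights, cells, c, pd_assumed, pf) for c in range(3)]
best_cell = int(np.argmax(gains))
```

Looking at a cell that holds no particles yields zero expected gain, so the scheduler never prefers it. A mismatch study in this toy setting would compute gains with `pd_assumed` while simulating measurements with `pd_actual`.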
Effect of Pd, Pf mismatch

• We use a sensor model: p(y|S,a)
  – For thresholded GMTI returns, this is characterized by Pd and Pf
• Simulation: 10 (real) targets, tracking and (myopic) sensor management.
  – How does misestimating Pd & Pf affect performance?

Effect of dynamic model mismatch

• Diffusive target model p(Sk,Tk|Sk-1,Tk-1) includes models of how individual targets move and how targets arrive at and leave the surveillance region
  – We have been in a mismatch scenario all along since we use real targets
  – This study quantifies how mismatch in the motion model affects performance

[Figure: normalized tracking error (ratio of tracking error with mismatch to tracking error when matched); axes: mismatch of the filter (measured as amount of overestimation), true diffusivity of the targets]

Progress 3: Non-Myopic Sensor Management

• There are many situations where long-term planning provides benefit
  – Sensor platform motion creates time varying sensor/target visibility
    • Sensor/target line of sight may change, resulting in targets becoming obscured
    • Delay measuring targets that will remain visible in order to interrogate targets that are predicted to become obscured
  – Convoy movement may involve targets that overtake/pass one another
    • Targets may become closely spaced (and unresolvable to the sensor)
    • Plan ahead to measure targets before they become unresolvable to the sensor
  – Crossing targets become unresolvable to the sensor
    • Sensor resolution may prohibit successful target identification if targets are too close together
    • Plan ahead to identify targets before they become too close
• Planning ahead in these situations allows better prediction of reemergence point, target trajectory, target intention

Relevant Multi-target Tracking Scenario

[Figure panels: Time 1, Time 2, Time 3; legend: Sensor Position, Shadowed Target, Visible Target, Region of Interest]
• Extra dwells at time 1 help predict where target reemerges at time 6
• Non-myopic strategy scans regions that will become obscured while deferring regions that will
remain visible in the future.
[Figure panels: Time 4, Time 5, Time 6; annotation: "Not made by myopic strategy"]

Value Function Approximation

The Bellman equation describes the value of an action in terms of the immediate (myopic) benefit and the long-term (non-myopic) benefit.

Bellman equation (term labels): value of state; myopic part of V under action a; non-myopic correction under a

I. VTG approximation:
II. Linear Q-learning approximation:
  1. Generate s, a, s', r
  2. Calculate Qest = r + γ max_a' Qk(s', a')
  3. From s, a, s', Qest, update Qk to Qk+1

Example: Two Real Targets

• Target trajectories taken from real, recorded data
  – 2 moving ground targets
  – Need to estimate the position and velocity in x and y (4-d state vector for each target)
• Time varying visibility taken from real elevation map & simulated platform trajectory
• Sensor decides where to steer an agile antenna and illuminates a 100m x 100m patch on the ground. Thresholded measurements indicate the presence or absence of a target (with pd and pfa)
• At initialization, the target position is known to the filter to be in a 300m x 500m area on the ground (i.e. the prior for target position is uniform over this region)

Comparing the Management Strategies

Algorithm             Time for Training   Time for Testing
Random                -                   0.04s / second
Myopic                -                   0.12s / second
Non-myopic via VTG    -                   0.37s / second
Non-myopic via RL     ~50 hours           0.60s / second

We suspect that the training time for the RL algorithm could be reduced (perhaps by even an order of magnitude with a C-based implementation)

Non-myopic via RL timing

• Generate training episodes: (50 timesteps x 0.5s/second + 10s fixed cost per episode) * 2000 episodes = 1200 minutes
• Batch training: 36 possible actions (Q-functions to estimate) x 20 minutes per action = 720 minutes
• Update value of Q function (i.e.
2nd pass): 500 minutes
• Batch train on second pass: 720 minutes

Example: Multiple Modality Sensor

• A sensor has two waveforms
  – Waveform 1 (X-band) has good detection performance but is susceptible to line of sight visibility
  – Waveform 2 (HF) has poorer detection performance but is not susceptible to visibility
• The platform is moving and so sensor to ground visibility changes with time
• The filter is to detect and track a target in the surveillance area
  – No information about target location a priori
  – Q-learning used to learn the best non-myopic policy

Progress 4: Optimal Experimental Design

[Block diagram]
• Upper left box - Beam scheduling, waveform selection, beam steering operator, and transmission into the medium, denoted by channel function
• Right side box - Processes received signals and retransmits.
• Lower left box - Processes output after reinsertion.

Motivation

• Imaging a medium using an array of sensors.
• Widely studied in mine detection, ultrasonic medical imaging, foliage penetrating radar, nondestructive testing, and active audio.
• GOAL: Optimally design a sequence of measurements to image a medium of multiple scatterers using an array of transducers.
• Four signal processing steps:
  1. Transmission of time varying signals into the medium.
  2. Recording of backscattered field from medium.
  3. Transmission of the processed backscatter signals.
  4. Measurement and spatial filtering of backscattered signals.
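Returning to the batch Q-learning procedure of Progress 3 (generate episodes, compute Qest, batch train one Q-function per action, then repeat for a second pass), the following is a minimal sketch of one such pass with a linear Q-function per action. It is a generic fitted-Q illustration under stated assumptions, not the authors' implementation: the discount factor `gamma`, the least-squares fit, the feature vectors, and the name `fitted_q_pass` are all hypothetical.

```python
import numpy as np

def fitted_q_pass(transitions, weights, gamma):
    """One batch pass: form Qest = r + gamma * max_a' Q_k(s', a'), refit each action.

    transitions: list of (s, a, r, s_next), with s and s_next as feature vectors.
    weights:     array of shape (n_actions, n_features); Q_k(s, a) = weights[a] @ s.
    """
    new_weights = weights.copy()
    for a in range(weights.shape[0]):
        batch = [(s, r, s_next) for (s, act, r, s_next) in transitions if act == a]
        if not batch:
            continue                                 # no data for this action: keep Q_k
        features = np.array([s for s, _, _ in batch])
        q_est = np.array([r + gamma * np.max(weights @ s_next) for _, r, s_next in batch])
        # Least-squares fit of the linear Q-function for action a (an assumed choice).
        new_weights[a] = np.linalg.lstsq(features, q_est, rcond=None)[0]
    return new_weights

# Toy problem: one constant feature, two actions; action 0 pays reward 1, action 1 pays 0.
one = np.array([1.0])
transitions = [(one, 0, 1.0, one), (one, 1, 0.0, one)]
q_weights = np.zeros((2, 1))
for _ in range(60):
    q_weights = fitted_q_pass(transitions, q_weights, gamma=0.5)
```

In this toy problem the passes converge to Q = 2 for action 0 and Q = 1 for action 1, the fixed point of the update with gamma = 0.5. The slides report two passes over 36 actions; the loop count here is only for the toy example.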

Mathematical Description

• Channel between transmitted field and received backscattered field,
• Four signal processing steps
• where receiver noises are i.i.d.
• Design objective: minimize MSE under transmitted energy constraint

Analytical Results

• Constraint:
• Nearly optimal design:
• MSE improvement factor:

Comments and Extensions

• Results are robust to variation of estimator error residual, especially at low SNR
• Results apply to 2-stage min MSE design under average energy constraint when Green's function is known and non-random
• Analytical results for multi-stage (>2) waveform design?
• Random (Rayleigh/Rician) media?
• Extension to non-quadratic objective functions?
• Classification, detection, regularized image reconstruction?

Pubs Since June 2004

• Sequential adaptive sensor management
  – “Adaptive Multi-modality Sensor Scheduling for Detection and Tracking of Smart Targets,” C. Kreucher, D. Blatt, A. Hero, and K. Kastella, accepted for publication, Nov. 2004
  – “Sensor Management Using An Active Sensing Approach,” C. Kreucher, D. Blatt, A. Hero, and K. Kastella, accepted for publication, Oct. 2004
  – “Multitarget tracking using a particle filter representation of the joint multi-target probability density,” C. Kreucher, K. Kastella, and A. Hero, accepted for publication, Sept. 2004
  – “Efficient methods of non-myopic sensor management for multitarget tracking,” C. Kreucher, A. Hero, K. Kastella, and D. Chang, 43rd IEEE Conference on Decision and Control, December 2004
  – “Multiplatform Information based Sensor Management,” C. Kreucher, A. Hero, and K. Kastella, to appear at SPIE Defense and Security Symposium, March 2005
  – “Non-myopic Approaches to Scheduling Agile Sensors for Multitarget Detection, Tracking, and Identification,” C. Kreucher and A. Hero, to appear at IEEE ICASSP, March 2005
  – “Particle Filtering for Multitarget Detection and Tracking,” C. Kreucher, M. Morelande, A. Hero and K.
Kastella, to appear at IEEE Aerospace Conference, March 2005

Pubs Since June 2004 (ctd)

• Iterative function optimization
  – “A convergent incremental gradient algorithm with constant stepsize,” D. Blatt, A. Hero, H. Gauchman, SIAM Optimization, submitted Sept. 2004
  – “Convergent incremental optimization transfer algorithms,” S. Ahn, J. Fessler, D. Blatt, A. Hero, IEEE Trans. on Medical Imaging, submitted Oct. 2004
• Predicting model mismatch
  – “Tests for global maximum of the likelihood function,” D. Blatt and A. O. Hero, Proc. of ICASSP, Philadelphia, March 2005
  – “On tests for global maximum of the log-likelihood function,” D. Blatt and A. O. Hero, IEEE Trans. on Info Theory, submitted Jan. 2005
• Sequential waveform scheduling
  – “Optimal experimental design for an inverse scattering problem,” R. Rangarajan, R. Raich and A. O. Hero, to appear in Proc. of ICASSP, Philadelphia, March 2005

Synergistic Activities and Awards (2003-2004)

• General Dynamics Medal Paper Award
  – C. Kreucher, K. Kastella, and A. O. Hero, “Multitarget sensor management using alpha divergence measures,” Proc. First IEEE Conference on Information Processing in Sensor Networks, Palo Alto, April 2003
• A. Hero plenary speaker: EMM-CVPR-03, ASP-03, EUSIPCO-04, ICASSP-05, SSP-05
• General Dynamics, Inc
  – K. Kastella: collaboration with A. Hero in sensor management, July 2002-
  – C. Kreucher: doctoral student of A. Hero, Sept. 2002-2004
• ARL
  – ARLTAB oversight: A. Hero is member, 2004-
  – ARL SEDD: A. Hero is member of yearly review panel, May 2002-
  – NAS-Robotics: A. Hero chaired cross-cutting review panel, May 2004
  – B. Sadler: N. Patwari (doctoral student of A. Hero) internship in distributed sensor information processing, summer 2003
• ERIM Intl.
  – B. Thelen & N. Subotic: H. Neemuchwala (Hero's PhD student) internship in applying entropic graphs to pattern classification, summer 2003
• Chalmers Univ., Sweden
  – M. Viberg: A.
Hero was Opponent on multimodality landmine detection doctoral thesis, Aug 2003

Transitions

• PF/SM to ISP Phase II (Schmidt at Raytheon)
• MRF backscatter modeling to GD (Kastella/Onstott)
• SM to NSF-ITR (UM, UW, BU)
• SM approaches integrated into
  – Dynamic Machine Learning (Prof. Satinder Singh/Chris Kreucher)
  – Generalization error (Prof. Susan Murphy/Doron Blatt)
• Collaboration with Prof. Hillel Gauchman (UIUC Math) on distributed optimization
• Collaboration with GD on Willow Run experiment for multi-modal tracking of dismounts and vehicles

Personnel on A. Hero's sub-Project (2003-2004)

• Chris Kreucher, 4th year grad student
  – UM-Dearborn
  – General Dynamics Sponsorship
• Neal Patwari, 3rd year doctoral student
  – Virginia Tech
  – NSF Graduate Fellowship/MURI GSRA
• Doron Blatt, 3rd year doctoral student
  – Univ. Tel Aviv
  – Dept. Fellowship/MURI GSRA
• Raghuram Rangarajan, 3rd year doctoral student
  – IIT Madras
  – Dept. Fellowship/MURI GSRA