STATISTICAL AND ANALYTICAL TECHNIQUES IN SYNTHETIC APERTURE RADAR IMAGING

By Kaitlyn Voccola

A Thesis Submitted to the Graduate Faculty of Rensselaer Polytechnic Institute in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

Major Subject: MATHEMATICS

Approved by the Examining Committee:
Margaret Cheney, Thesis Adviser
William Siegmann, Member
David Isaacson, Member
Matthew Ferrara, Member
Richard Albanese, Member

Rensselaer Polytechnic Institute
Troy, New York
August 2011 (For Graduation August 2011)

© Copyright 2011 by Kaitlyn Voccola. All Rights Reserved.

CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGMENT
ABSTRACT
1. Introduction
2. Standard SAR & Backprojection
   2.1 Model for SAR data
   2.2 Image Formation, Backprojection, and Microlocal Analysis
3. SAR and Detection & Estimation
   3.1 Detection & Estimation Theory and the Generalized Likelihood Ratio Test
       3.1.1 Detection & Estimation
       3.1.2 The Generalized Likelihood Ratio Test
   3.2 Continuous GLRT
       3.2.1 General Continuous-Time Random Processes
   3.3 Reproducing-Kernel-Hilbert-Space Representations of Continuous-Time Random Processes
   3.4 The Relationship between the GLRT and Backprojection in SAR imaging
4. Polarimetric synthetic-aperture inversion for extended targets in clutter
   4.1 Introduction
       4.1.1 Polarimetric Concepts
   4.2 Radar Cross Section for Extended Targets
       4.2.1 Radar Cross Section and Polarimetric Radar Cross Section
       4.2.2 Method of Potentials Solution of Maxwell's Equations
       4.2.3 Helmholtz Equation in Cylindrical Coordinates
       4.2.4 Scattering in Two Dimensions
       4.2.5 RCS for Infinitely Long Cylinder
             4.2.5.1 Normal Incidence
             4.2.5.2 Oblique Incidence
       4.2.6 Finite Cylinder RCS
   4.3 Dipole SAR Scattering Model
       4.3.1 Comparison to the extended target RCS model
       4.3.2 Scattering Model for the Target
       4.3.3 Scattering Model for Clutter
   4.4 Total Forward Model
   4.5 Image Formation in the presence of noise and clutter
       4.5.1 Statistically Independent Case
       4.5.2 Correlated Clutter and Target Case
   4.6 Numerical Simulations
       4.6.1 Numerical Experiments
             4.6.1.1 Example One - Horizontally Polarized Target
             4.6.1.2 Example Two - Vertically Polarized Target
             4.6.1.3 Example Three - 45° Polarized Target
5. Conclusions and Future Work
LITERATURE CITED
APPENDICES
A. FIOs and Microlocal Analysis
   A.1 Fourier Integral Operators
   A.2 Microlocal Analysis
B. Calculation of the Radiation of a Short Dipole
   B.0.1 Vector potential
   B.0.2 Far-field radiation fields
   B.0.3 Radiation vector for a dipole

LIST OF TABLES

4.1 Initial SCR in dB vs. final standard processed image SCR in dB, horizontally polarized target
4.2 Initial SCR in dB vs. final coupled processed image SCR in dB, horizontally polarized target
4.3 Initial SCR in dB vs. final standard processing image SCR in dB, vertically polarized target
4.4 Initial SCR in dB vs. final coupled processing image SCR in dB, vertically polarized target
4.5 Initial SCR in dB vs. final standard processing image SCR in dB, 45° polarized target
4.6 Initial SCR in dB vs. final coupled processing image SCR in dB, 45° polarized target

LIST OF FIGURES

4.1 Linear, circular, and elliptical polarization states
4.2 The polarization ellipse
4.3 Cylindrical coordinates
4.4 Scattering scenario with an infinite-length cylinder lying along the z-axis; the incident field is normal to the cylinder [28]
4.5 Scattering scenario for an infinite-length cylinder when the incident field makes an angle φ with the x-y plane (oblique incidence) [28]
4.6 Scattering scenario for a finite-length cylinder [28]
4.7 Spherical coordinates
4.8 HH component of the target vector and the target-plus-clutter vector, horizontally polarized target
4.9 HH, HV, and VV target-only data for the case of a horizontally polarized target
4.10 HH, HV, and VV target-embedded-in-clutter data for the case of a horizontally polarized target
4.11 HH image created using the standard processing vs. the true target function
4.12 HH image created using the coupled processing vs. the true target function
4.13 SCR vs. MSE for the standard processed images and coupled processed images respectively, horizontally polarized target
4.14 VV component of the target vector and the target-plus-clutter vector, vertically polarized target
4.15 HH, HV, and VV target-only data for the case of a vertically polarized target
4.16 HH, HV, and VV target-embedded-in-clutter data for the case of a vertically polarized target
4.17 VV image created using the standard processing vs. the true target function
4.18 VV image created using the coupled processing vs. the true target function
4.19 SCR vs. MSE for the standard processed images and coupled processed images respectively, vertically polarized target
4.20 HV component of the target vector and the target-plus-clutter vector, 45° polarized target
4.21 HH, HV, and VV target-only data, 45° polarized target
4.22 HH, HV, and VV target-embedded-in-clutter data, 45° polarized target
4.23 HV image created using the standard processing vs. the true target function
4.24 HV image created using the coupled processing vs. the true target function
4.25 SCR vs. MSE for the standard processed images and coupled processed images respectively, 45° polarized target

ACKNOWLEDGMENT

To my advisor, Dr. Margaret Cheney, for her support and encouragement. Thank you for your guidance and for sharing your expertise, for allowing me to explore what I found most interesting. To Dr. Birsen Yazici, Dr. Matthew Ferrara, and Dr. Richard Albanese, thank you for the many fruitful discussions; all your input is invaluable. Also to Dr. William Siegmann and Dr. David Isaacson, thank you for being a part of my committee and for all your time and input to my thesis.

To my friends and colleagues in the math department, Lisa Rogers, Analee Miranda, Heather Palmeri, Tegan Webster, Joseph Rosenthal, Jessica Jones, Ashley Thomas, Peter Muller, and Jensen Newman, thank you for your support, friendship, and our productive discussions as well as our gossip sessions. In particular a thank you to Joseph Rosenthal for his computer support and his help with the figures in this document. Also a special thank you to Dawnmarie Robens for always being there to listen and for getting me through some of my toughest days as a graduate student.

To Subin George, Meredith Anderson, Jismi Johnson, Laura Hubelbank, Tinu Thampy, Peter Ruffin, Ebby Zachariah, and Sam John, thank you for being incredible friends and my second family. Also a special thanks to Mrs. Mosher for getting me through my worst days and making it possible for me to get to this point.

To my many amazing teachers along the way who have encouraged my love of mathematics and science and shared with me their passion for learning, thank you: Mrs. Fedornak, Mrs. Dunham, Ms. Gomis, Mr. Seppa, Mme. Refkofsky, and Dr. Kovacic.
To Stevie, thank you for being my first example, for bringing me laughter and for always finding a way to make me smile. Thank you for being who you are, your creativity and passion inspire me. To Elissa, my best friend, for your support, your love, your phone calls and bathroom dance parties. You believe in me more than I do myself and I will always viii be thankful. Your drive and your success amaze me everyday, and I will forever be your biggest fan. To my parents, for everything. Thank you for your constant love and support, and for allowing us to always dream big dreams. ix For Mom, for never letting me give up. x ABSTRACT In synthetic-aperture radar (SAR) imaging, a scene of interest is illuminated by electromagnetic waves. The goal is to reconstruct an image of the scene from the measurement of the scattered waves using airborne antenna(s). This thesis is focused on incorporating statistical modeling into imaging techniques. The thesis first considers the relationship between backprojection in SAR imaging and the generalized likelihood ratio test (GLRT), a detection and estimation technique from statistics. Backprojection is an analytic image reconstruction algorithm. The generalized likelihood ratio test is used when one wants to determine if a target of interest is present in a scene. In particular it considers the case when the target depends on a parameter which is unknown prior to processing the data. Under certain assumptions, namely that the noise present in the scene can be described by a Gaussian distribution, we show that the test statistic calculated in the GLRT is equivalent to the value of a backprojected image for a given location in the scene. Next we consider the task of developing an imaging algorithm for extended targets embedded in clutter and thermal noise. We consider the case when a fully polarimetric radar system is used. Also note that we assume scatterers in our scene are made up of dipole scattering elements in order to model the directional scattering behavior of extended targets. We formulate a statistical filtered-backprojection scheme in which we assume the clutter, noise, and the target are all represented by stochastic processes. Because of this statistical framework we choose to find the filter which minimizes the mean-square error between the reconstructed image and the actual target. Our work differs from standard polarimetric SAR imaging in that we do not perform channel-by-channel processing. We find that it is preferable to use what we call a coupled processing scheme in which we use all sets of collected data to form all elements of the scattering matrix. We show in our numerical experiments that not only is mean-square error minimized but also the final signal-to-clutter ratio is reduced when utilizing our coupled processing scheme. xi CHAPTER 1 Introduction In synthetic-aperture radar (SAR) imaging, a scene of interest is illuminated by electromagnetic waves. The goal is to reconstruct an image of the scene from the measurement of the scattered waves using airborne antenna(s). This thesis is focused on the use of statistics in SAR imaging problems. The theory of statistics has been widely used in detection and estimation schemes. These techniques process radar data in order to determine if a target is present in the scene of interest and also to estimate parameters describing the target. However most existing imaging algorithms were derived based on purely deterministic considerations. 
The significant and unwelcome effects of noise make the inclusion of statistical aspects critical. In addition the complexity of the scattering objects suggests a stochastic modeling approach. For example, treating foliage as a random field seems an obvious choice, but a stochastic model is best even for objects such as vehicles whose radar cross-section varies widely with angle of view. The first body of work in this thesis investigates the relationship of methods that arise from this statistical theory and a standard imaging technique, backprojection. In particular we consider the relationship between backprojection in SAR imaging and the generalized likelihood ratio test (GLRT) [39]. Backprojection is a commonly used analytic image reconstruction algorithm [1, 2]. It has the advantage of putting the visible edges of the scene at the right location and orientation in the reconstructed images. This property can be shown using the theory of microlocal analysis [47, 48, 49, 50]. The generalized likelihood ratio test is used when one wants to determine if a target of interest is present in a scene. In particular it considers the case when the target depends on a parameter, such as location, which is unknown prior to processing the data. We focus on the GLRT because of its wide use in SAR detection problems [17, 23, 39]. Emanuel Parzen developed an entire theory of representing stochastic processes in terms of reproducing kernel Hilbert spaces [8]. This theory enables one to formulate detection and estimation algorithms, in- 1 2 cluding the GLRT, for the case when the data is defined on a continuous index set. Parzen developed this theory mainly for communication applications. This dissertation shows how this theory can be applied to radar problems. We show, moreover, that the test statistic calculated in the GLRT under Gaussian noise assumptions is equivalent to the value of the backprojected image at each pixel or location on the ground. A summary of this work appears in [46]. A similar connection was noted in [6] for image formation using the Radon transform in the case when the image is parametrized by a finite number of parameters. This result sheds light on the overlapping results that are derived from two very different theories. We find that in special cases utilizing microlocal analysis in developing imaging algorithms produces the same data processing techniques as those derived using statistical theory. This suggests that using both techniques and also further investigation of this relationship can lead to a better understanding of image reconstruction. The second half of the the thesis focuses on the develop of a hybrid technique that uses both analytical and statistical theory in the framework of backprojection. This technique was previously developed for the scalar case in [15]. This work showed that incorporating the statistics of the scene into the imaging algorithm leads to clutter mitigation and also minimizes the effect of noise on the image. We extend these results for a full vector treatment of the transmission and scattering of the electromagnetic waves. That is, we consider the case when a fully polarimetric radar system is used. Polarimetric radar has the advantage of producing multiple sets of data during a single data collection. Therefore it provides one with more information for the image reconstruction or detection task. 
However previous work has not shown that this extra information actually improves image quality or detection ability enough to justify the additional hardware and computation cost incurred when utilizing a polarimetric system. We present a technique that gives quantifiable improvements in image quality, namely reduced mean-square error (MSE) and improved final image signal-to-clutter ratio (SCR). We also note that most work in polarimetry has focused solely on detection and estimation schemes [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28]. It is typically assumed that one can reconstruct each element of the target scattering vector from 3 the corresponding data set. Therefore standard imaging schemes are applied to each set of data separately. In this work we begin by deriving the model for the scattered field systematically and uncover the assumption that leads to this treatment of polarimetric radar data. We choose to not make this assumption and we find that it is optimal to use what we called a coupled processing technique. Optimal in this case is in the mean-square sense. This coupled processing uses every data set to reconstruct each element of the target vector. This does add to the computation time of the imaging algorithm but we show that this improves MSE and final image SCR as stated above. Also this processing enables one to reconstruct target orientation correctly when typical polarimetric processing fails. We also note that in developing this polarimetric imaging scheme we consider specifically extended targets embedded in clutter and thermal noise. It is of interest in SAR to develop better target models that display anisotropic or directional scattering. A simple target that displays this behavior is a curve, though most man-made objects scatter electromagnetic waves anisotropically. We consider the curve specifically for simplicity but this work can be extended for more complicated targets. As we stated previously this is essentially the same as saying the radar cross section of man-made targets varies widely with angle of view. This type of scattering has been studied previously in many different capacities. For example in [29] the radar data is broken up into data received from different sets of observation angles, or different intervals of the bandwidth. This does help one to characterize the different returns from different angles, but it leads to a reduction in resolution by processing data defined on smaller apertures or with smaller bandwidths. We instead choose to characterize this behavior in the scattering or forward model. We assume scatterers in our scene are made up of dipole scattering elements as opposed to the typical point scatterer assumption. This idea to model scatterers as dipoles was previously considered in [27]. Dipole scatterers have the advantage of having an orientation and a location associated with them and they display anisotropic scattering behavior. Our work differs from [27] in that we make simplifying assumptions that allow us to write out an analytic expression for the image obtained. The work [27], on the other hand, focuses on a purely numerical scheme. 4 The thesis is organized as follows. Chapter 2 describes the standard SAR forward model using the scalar wave equation to describe the wave propagation. Next we consider the general method of filtered-backprojection and describe the pseudolocal property of the image-fidelity operator. 
The third chapter focuses on detection and estimation techniques, namely the generalized likelihood ratio test. We summarize the use of the Radon-Nikodym derivative and the reproducing-kernelHilbert-space representation of a stochastic process to write out the expression for the test statistic. We conclude with our result describing the relationship between backprojection and the GLRT. In the fourth chapter we shift our study to the task of imaging extended targets. We begin by reviewing the basics of polarimetry and also a full-vector solution of Maxwell’s equations which is used to arrive at the expressions for the incident and scattered fields. We then outline our specific forward model for a dipole SAR system and using the assumption that scatterers on the ground are made up of dipole elements. Next we derive the optimal BP filters in the mean-square (MS) sense and finish with the results of numerical simulations. We conclude with final remarks about the scope and impact of this thesis work and describe areas in which the work may be utilized further. CHAPTER 2 Standard SAR & Backprojection The received data in synthetic-aperture radar can be thought of as weighted integrals of a function T over some curved manifolds such as circles, ellipses, or hyperbolas. The function T models the reflectivity or irradiance of the region of interest. We consider three different SAR modalities: mono-static SAR, bi-static SAR, and hitchhiker, or passive, SAR. Mono-static SAR involves one antenna used for both transmission and reception while bi-static SAR utilizes two different antennas for transmission and reception possibly mounted on separate airborne platforms. Passive SAR involves one or more receiving antenna(s) which attempt to receive a scattered field that is the result of fields transmitted by sources of opportunity. These sources are transmitters already present in the environment such as cell-phone towers or satellites. We will derive the expression for the data in the mono-static case and give the analogous formulas for the bi-static and hitchhiker cases with references. We use the following font conventions: bold face italic font (x) denotes two-dimensional vectors and bold face roman font (x) denotes threedimensional vectors. Our goal is to end up with the following relationship between data, denoted d, and T , d = F[T ] (2.1) where F is called the forward operator. We will then seek an inverse of F, known as a backprojection operator, in order to reconstruct, or form an image of T . 5 6 2.1 Model for SAR data In SAR, the waves are electromagnetic and therefore their propagation is de- scribed by Maxwell’s equations. In the time domain these equations have the form ∇ × E(t, x) = − ∂B(t, x) ∂t ∇ × H(t, x) = J (t, x) + (2.2) ∂D(t, x) ∂t (2.3) ∇ · D(t, x) = ρ (2.4) ∇ · B(t, x) = 0 (2.5) where E is the electric field, B is the magnetic induction field, D is the electric displacement field, H is the magnetic intensity or magnetic field, ρ is the charge density, and J is the current density. Since most of the propagation takes place in dry air we assume that the electromagnetic properties of free space hold. Therefore we have that ρ = 0 and J = 0. We also have the free space constitutive relations which are expressed as follows: D = 0 E (2.6) B = µ0 H. (2.7) Using these assumptions and taking the curl of (2.2) and substituting the result into (2.3) gives us the following expression: ∂ 2E ∇ × ∇ × E = −µ0 0 2 . 
∂t (2.8) We use the triple-product identity (also known as the BAC-CAB identity) on the left-hand side of the above equation to obtain ∇(∇ · E) − ∇2 E = −µ0 0 ∂ 2E . ∂t2 (2.9) Note that ∇ · E = 0 because of the free space assumption, which implies we have ∂ 2E ∇ E = µ0 0 2 . ∂t 2 (2.10) 7 This is equivalent to stating that in Cartesian coordinates each component of the vector E satisfies the scalar wave equation. Using similar steps we may say that H also satisfies the wave equation in free space. In particular these expressions hold for a constant wave speed, that is c0 = (µ0 0 )−1/2 . When a scatterer is present the wave speed is no longer constant and it will change when the wave interacts with the object. We may think of this as a perturbation in the wave speed, which we write as 1 c2 (x) = 1 − T (x) c20 (2.11) where T is known as the scalar reflectivity function. Using this function to describe scatterers is analogous to assuming all scatterers are made up of point scatterers, as pointed out in [27]. This will be contrasted in chapter 4 with a dipole model for scatterers. Using this variable wave speed we say that E satisfies the following wave equation: 1 ∂2 ∇ − 2 E(t, x) = 0. c (x) ∂t2 2 (2.12) This model is often used for radar scattering despite the fact that it is not entirely accurate. The reflectivity function T really represents a measure of the reflectivity for the polarization measured by the antenna. In chapter 4 we will describe an object by a scattering vector dependent on all the different possible polarizations of the antennas. This second approach does not neglect the vector-nature of the electromagnetic fields. In order to incorporate the antennas we consider the total electric field E tot = E in + E sc which is composed of the incident and scattered fields. The full wave propagation and scattering problem is described by the following two wave equations: 1 ∂2 2 ∇ − 2 E tot (t, x) = j(t, x) c (x) ∂t2 1 ∂2 2 ∇ − 2 2 E in (t, x) = j(t, x) c0 ∂t (2.13) (2.14) where j is a model for the current density on the antenna. Using (2.11) and sub- 8 tracting (2.14) from (2.13) results in the following expression for the scattered field: 1 ∂2 2 ∇ − 2 2 E sc (t, x) = −T (x)∂t2 E tot (t, x). c0 ∂t (2.15) We then utilize the Green’s function solution of the wave equation to arrive at the Lippmann-Schwinger integral equation [1]: Z sc E (t, x) = δ(t − τ − |x − y|/c0 ) T (y)∂t2 E tot (τ, y)dτ dy. 4π|x − y| (2.16) It is important to observe that E sc appears on both sides of the equation. Also note that this equation is nonlinear in that the two unknown quantities, E sc and T , are mutiplied on the right-hand side of (2.16). In order to linearize, and ultimately reconstruct T , we invoke the Born, or single-scattering, approximation. This amounts to replacing E tot with E in in (2.16). For more details on the Born approximation see [1]. We now obtain the following approximate expression for the scattered field: Z sc E (t, x) = δ(t − τ − |x − y|/c0 ) T (y)∂t2 E in (τ, y)dτ dy. 4π|x − y| (2.17) If we take the Fourier transform of (2.17) we obtain the frequency domain expression for the scattered field Z sc E (ω, x) = eik|x−y| T (y)ω 2 E in (ω, y)dy 4π|x − y| (2.18) where k = ω/c0 . We will now outline a specific model for the incident field and also explain how to take into account the fact that the antennas are in motion for the synthetic aperture radar case. i.) Mono-static SAR: Recall we assume that E in satisfies the scalar wave equation given in (2.14). 
In the frequency domain this equation is written as (∇2 + k 2 )E in (ω, x) = J(ω, x) (2.19) where J is the Fourier transform of j, the current density on the antenna. If we 9 again use the Green’s function solution of the wave equation we have Z E(ω, x) = eik|x−y| eik|x−x0 | J(ω, y)dy ≈ F (k, x\ − x0 ) 4π|x − y| 4π|x − x0 | (2.20) where x\ − x0 indicates we have taken the unit vector in the direction of x−x0 . Also note we have used the far-field approximation and assumed that the antenna center is located at the position x0 . F is what is known as the radiation pattern of the transmitting antenna and is analogous to the radiation vector [1], which is explicitly calculated for a dipole antenna in Appendix B. This quantity is proportional to the Fourier transform of the current density. Using this model for the incident field we are able to write out the scattered field as Z sc E (ω, x) = eik|y−x0 | eik|x−y| T (y)ω 2 F (k, y\ − x0 )dy. 4π|x − y| 4π|y − x0 | (2.21) To obtain the expression for the data received we evaluate E sc at the received antenna location, which is x0 in the mono-static case. Therefore we have sc Z e2ik|x0 −y| A(ω, x0 , y)T (y)dy (2.22) ω 2 Ft (k, y\ − x0 )Fr (k, y\ − x0 ) , 2 (4π|x0 − y|) (2.23) E (ω, x0 ) = where A, the amplitude, is given by A(ω, x0 , y) = Ft is the radiation pattern and Frec is the reception pattern of the antenna. In order to include the antenna motion in our model we assume the antenna follows a path γ ∈ R3 . This may also be thought of as the flight path for the aircraft which the antenna is mounted on in a SAR system. If we for example consider a pulsed system where pulses are transmitted at times tn , we have that the antenna has a position γ(tn ). To simplify our analysis we choose to instead parametrize the flight path with a continuous parameter denoted s. We call s the slow-time and in contrast now call t fast-time. We utilize these two different time scales because the scale on which the antenna moves is significantly slower than the scale on which 10 electromagnetic waves travel. We therefore have that the position of the antenna at a given slow time s is γ(s). We now replace x0 with γ(s) in (2.22) to obtain the following expression for the received mono-static SAR data Z D(s, ω) = e2ik|γ(s)−y| A(ω, s, y)T (y)dy (2.24) where D is the Fourier transform of d. In the time domain we have Z d(s, t) = F[T ](s, t) = e−iω(t−rs,x ) A(x, s, ω)T (x)dxdω, (2.25) where rs,x = 2|γ(s) − x|/c0 . (2.26) Note that for simplicity we will now assume flat topography where x = (x1 , x2 ), x = (x, 0). Also observe we have obtained a linear relationship between the data, d, and the target reflectivity T , in terms of the forward operator F. The method of stationary phase can be used to show that the main contributions to (2.25) come from the critical set X = {(x1 , x2 , 0) : c0 t = 2|γ(s) − x|}, (2.27) which consists of circles centered at γ(s) = (γ1 (s), γ2 (s), γ3 (s)) with radius p c20 t2 /4 − γ32 (s). We now quote the analogous expression for the data received in the bi-static and hitchhiker SAR cases. ii.) Bi-static SAR: We begin by assuming that the transmitting antenna and receiving antenna are located on separate airborne platforms, and hence move along different paths γT (s) ∈ R3 and γR (s) ∈ R3 , respectively. The data can be written [3] Z d(s, t) = F[T ](s, t) = e−iω(t−rs,x ) A(x, s, ω)T (x)dxdω, (2.28) 11 where rs,x = |γT (s) − x|/c0 + |γR (s) − x|/c0 . 
(2.29) The method of stationary phase can be used to show that the main contributions to (2.28) come from the critical set X = {(x1 , x2 , 0) : c0 t = |γT (s) − x| + |γR (s) − x|}. (2.30) We conclude that our received data are closely related to integrals of T along ellipses. iii.) Hitchhiker SAR: This is a passive SAR modality which relies on sources of opportunity to image the ground irradiance. The system consists of airborne receiver(s) that traverse arbitrary paths over the scene of interest. The first step in received signal processing involves correlating the fast-time signal at different slowtime points. The tomographic reconstruction uses the correlated signal dij defined by Z 0 dij (s, s , t) = di (s, τ )dj (τ − t, s + s0 )dτ. (2.31) The received signal at the ith receiver di , is given as follows Z di (s, t) = di,y (s, t)dy. (2.32) This is a superposition of signals over all transmitters at locations denoted y. If we assume there are N ≥ 1 receiving antennas, each following a trajectory γRi (s) ∈ R3 , i = 1, ..., N we can write [4] the Born approximated model of the correlated signal for receivers i and j 0 0 dij (s, s , t) = F[T ](s, s , t) = Z 0 e−iω(t−rij (s,s ,x)) T (x)A(x, s, s0 , ω)dxdω (2.33) for i, j = 1, ..., N where rij (s, x) = |x − γRi (s)|/c0 − |x − γRj (s + s0 )|/c0 (2.34) is the hitchhiker range. Note A(x, s, s0 , ω) is again an amplitude which includes 12 geometrical spreading factors, antenna beam patterns, and transmitter waveforms. Here T is the scene irradiance. The leading-order contribution comes from the critical set X = {(x1 , x2 , 0) : c0 t = |x − γRi (s)| − |x − γRj (s + s0 )|}, (2.35) which is a set of hyperbolas. In general the data in all three modalities can be written as Z d(s, t) = F[T ](s, t) = e−iω(t−φ(s,x)) A(x, s, ω)T (x)dxdω. (2.36) The general phase φ takes on the following forms φ(s, x) = rs,x = 2|γ(s) − x|/c0 ) φ(s, x) = rs,x = |γT (s) − x|/c0 + |γR (s) − x|/c0 φ(s, x) = rij (s, s0 , x) = |x − γRi (s)| − |x − γRj (s + s0 )| (2.37) in the mono-SAR, bi-SAR, and hitchhiker SAR cases respectively. Under mild conditions, the operator F of (2.36) that connects the scene T to the data d is a Fourier Integral Operator (FIO). A precise definition of an FIO can be found in Appendix A. It is known that the behavior of an FIO is determined mainly by the critical set X of the phase. In the case of mono-SAR, X is the set of circles, for bi-static SAR, it is the set of ellipses, and in hitchhiker SAR it is the set of hyperbolas. An approximate inverse of F can be computed by another FIO which is described in the next section. 2.2 Image Formation, Backprojection, and Microlocal Analysis Let us now focus primarily, for the sake of simplicity, on the general model for the data (2.36). To form an image we aim to invert (2.36) by applying an imaging 13 operator K to the collected data. Because F is a Fourier integral operator we can compute an approximate inverse by means of another FIO. This FIO typically has the form of a filtered-backprojection (FBP) operator. In our case, we form an approximate inverse of F as a FBP operator, which first filters the data and then backprojects to obtain the image. Our imaging operator therefore takes on the form Z I(z) = K[d](z) := eiω(t−φ(s,z)) Q(z, s, ω)dωd(s, t)dsdt Z = e−iωφ(s,z) Q(z, s, ω)D(s, ω)dωds, (2.38) where z = (z1 , z2 ), z = (z, 0), D(s, ω) is the Fourier transform of the data in fasttime, and Q is a filter which is determined in several different ways depending on the application. 
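To make the backprojection step concrete, the following is a minimal numerical sketch of a discretized version of the imaging operator (2.38), specialized to the mono-static phase φ(s, x) = 2|γ(s) − x|/c0. The straight flight path, grid sizes, and the choice Q ≡ 1 are illustrative assumptions for this sketch, not the filter derived in the remainder of this section.

```python
import numpy as np

# Minimal discretization of the imaging operator (2.38) for mono-static SAR.
# Illustrative assumptions (not from the thesis): a straight, level flight
# path, flat topography, a trivial filter Q = 1, and data D(s, omega) already
# given in the fast-time frequency domain.

c0 = 3.0e8                                            # wave speed (m/s)
s_grid = np.linspace(0.0, 1.0, 64)                    # slow-time samples
omega = 2 * np.pi * np.linspace(0.9e9, 1.1e9, 128)    # angular frequencies (rad/s)

def gamma(s):
    """Antenna position gamma(s): straight flight at 3 km altitude."""
    return np.array([7.0e3 * s - 3.5e3, -5.0e3, 3.0e3])

def phase(s, z):
    """Mono-static phase phi(s, z) = 2|gamma(s) - z|/c0 with z = (z1, z2, 0)."""
    return 2.0 * np.linalg.norm(gamma(s) - np.array([z[0], z[1], 0.0])) / c0

def backproject(D, pixels):
    """I(z) = sum over s and omega of exp(-i omega phi(s, z)) Q D(s, omega), Q = 1."""
    image = np.zeros(len(pixels), dtype=complex)
    for i, z in enumerate(pixels):
        for a, s in enumerate(s_grid):
            image[i] += np.sum(np.exp(-1j * omega * phase(s, z)) * D[a, :])
    return image

# Synthesize data from a single point scatterer (as in (2.24)) and image it.
target = np.array([10.0, 20.0])
D = np.array([[np.exp(1j * w * phase(s, target)) for w in omega] for s in s_grid])
pixels = [np.array([x, 20.0]) for x in np.linspace(0.0, 20.0, 41)]
print(np.abs(backproject(D, pixels)).argmax())   # brightest pixel sits at the true target
```

With the trivial filter this reduces to matched-filter backprojection; it is the choice of Q developed below that makes the composite operator KF behave like a pseudodifferential operator.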
If we insert the equation for the data into above we have Z I(z) = KF[T ](z) = eiω(φ(s,x)−φ(s,z)) Q(z, s, ω)A(x, s, ω)dωdsT (x)dx. (2.39) In synthetic aperture imaging we are especially interested in identifying singularities, or edges and boundaries of objects, from the scene of interest in our image. The study of singularities is part of microlocal analysis, which also encompasses FIO theory [47, 48, 49, 50]. We define the singular structure of a function by its wavefront set which is the collection of singular points and their associated directions. A precise definition of the wavefront set is given in Appendix A. We use the concepts of microlocal analysis to analyze how singularities in the scene, or in T , correspond to singularities in the image. We can rewrite (2.39) as Z I(z) = L[T ](z) = L(z, x)T (x)dx, (2.40) where L = KF is known as the image-fidelity operator. Note L, the kernel of L, is called the point-spread function (PSF). We will show that L is a pseudodifferential operator which has a kernel of the form Z L(z, x) = ei(x−z)·ξ p(z, x, ξ)dξ, (2.41) 14 where p must satisfy certain symbol estimates as in the definition for a general FIO. See Appendix A for more details. It is essential that our image-fidelity operator be a pseudodifferential operator because this class of FIOs has the property that they map wavefront sets in a desirable way. This property is known as the pseudolocal property which states W F (Lf ) ⊆ W F (f ), that is, the operator L does not increase the wavefront set. In the imaging setting this says that the visible singularities of the scene, T , are put in the correct location with the correct orientation. This is desirable for our application because we can say with certainty that no singularities or edges in the image are the result of artifacts. However, we note that it is possible some edges present in T may not appear in the image, especially if the viewing aperture is limited. Also we observe that singularities in the data due to multiple scattering effects which are not captured by the model may give rise to artifacts in the image. We now show that the filtered-backprojection operator is indeed a pseudodifferential operator and therefore produces an image where edges are in the correct location and have the correct orientation. In order to demonstrate that L is the kernel of a pseudodifferential operator our goal is to determine K, the imaging operator, so that L is of the form in equation (2.41). First we must ensure K is an FIO by requiring Q to satisfy a symbol estimate similar to that of p, i.e. we assume for some mQ sup |∂ωα ∂sβ ∂zρ11 ∂zρ22 Q(z, s, ω)| ≤ C0 (1 + ω 2 )(mQ −|α|)/2 (2.42) (s,z)∈K where K is any compact subset of R × R2 and for every multi-index α, β, γ, there is a constant C = C(K, α, β, γ). Again see Appendix A for more details on the symbol estimate. We now must show that the phase of L can be written in the form Φ(x, z, ξ) = i(x − z) · ξ. In order to determine how close the phase is to that of a pseudodifferential operator, one applies the method of stationary phase to the s and ω integrals. The stationary phase method gives us an approximate formula for the large-parameter behavior of our oscillatory integral. First we introduce a large pa- 15 rameter β by the change of variables ω = βω 0 . The stationary phase theorem tells us that the main contribution to the integral comes from the critical points of the phase, i.e. 
points satisfying the critical conditions 0 = ∇ω0 Φ ∝ φ(s, x) − φ(s, z) (2.43) 0 = ∇s Φ ∝ ∇s (φ(s, x) − φ(s, z)). One of the solutions of the above equations is the critical point x = z. Other critical points lead to artifacts in the image, however their presence depends on the measurement geometry and the antenna beam pattern. We assume the flight trajectories and antenna beam patterns are such that we obtain only the critical point when x = z. In the neighborhood of the point x = z, we use a Taylor expansion of the exponent to force the phase to look like that of a pseudodifferential operator. We make use of the formula Z f (x) − f (z) = 0 1 d f (z + µ(x − z))dµ = (x − z) · dµ Z 1 5f |z+µ(x−z) dµ (2.44) 0 where in our case f (z) = ωφ(s, z). We then make the Stolt change of variables Z (s, ω) → ξ = Ξ(s, ω, x, z) = 1 5f |z+µ(x−z) dµ. (2.45) 0 This change of variables allows us to obtain a new form of the point-spread function Z L(z, x) = i(x−z)·ξ e ∂(s, ω) dξ. QA(x, s(ξ), ω(ξ)) ∂ξ (2.46) From this expression, we see the phase of L is of the form required of pseudodifferential operators. With the symbol estimate requirements made on Q, we have shown that our image-fidelity operator is indeed a pseudodifferential operator. We conclude that the pseudolocal property holds and therefore our microlocal-analysisbased reconstruction method preserves the singularities or edges in our scene of interest. CHAPTER 3 SAR and Detection & Estimation Detection and estimation theory is an area of statistics focused on processing random observations, or measurements. The goal is to extract information about whatever physical process led to the observations. Typically one assumes that there exists a random observation Y ∈ Γ, where Γ is known as the observation space. The two types of problems addressed in this theory are: (i). detection, in which one makes a decision among a finite number of possible situations describing Y ; and (ii). estimation in which one assigns values to some quantity not observed directly. To demonstrate these concepts more concretely we consider the case of radar, and SAR in particular. In this case Y is thought of as the signal received at the antenna, and Γ would be the set of all possible returns. The detection task would be to decide whether or not a specific scatterer, or target, contributed to Y . That is, we seek to decide if a target is present in the scene of interest or not. The estimation task seeks to determine quantities associated with a target such as location, shape, velocity, etc. In this work, we focus on the task of determining whether or not a target is present in the scene and also we will aim to estimate its location. This goal encompasses both detection and estimation and therefore we will utilize a technique which performs both tasks. In particular we consider the well known generalized likelihood ratio test (GLRT) [37, 39]. This test is designed specifically for the problem of detecting signals (targets) which depend on unknown parameters. It includes a detection step using the likelihood ratio test from Neyman-Pearson detection theory and also a maximum likelihood estimation step to determine the location (or any other desired unknown parameter). We motivate each of these steps individually and then explain in more detail how they are combined to form the GLRT. We begin with this background information on the general idea of estimation and detection schemes and the GLRT in particular. 
We then summarize Parzen’s work on the use of the Hilbert space inner products for expressing the test statistic in the case of 16 17 continuous-time Gaussian random processes [8]. We conclude the chapter with our result of applying this work to the SAR problem and will demonstrate that the test statistic calculated in the case of the GLRT is in fact a backprojected image formed with a matched filter. 3.1 Detection & Estimation Theory and the Generalized Likelihood Ratio Test In most cases when we consider a detection or estimation problem our measure- ments will be corrupted by noise. The effect of this noise on the data is unknown. This leads to a probabilistic treatment of the task. We therefore must define a family of probability distributions on Γ, each of which corresponds to some state that contributed to the data or some set of the desired parameters one wishes to calculate. After we model these distributions we will then attempt to determine the optimal way of processing the data in order to make our decision or estimate the unknown parameters. This method is chosen based on a variety of criteria. For example the choice depends on whether the data is a discrete-time or continuous-time random process. Note the random process may instead be dependent on discrete or continuous spatial variables such as position. In addition the a priori information we have about the data, and also the way in which we will evaluate the performance of our detection and/or estimation scheme contributes to this decision. Usually one chooses between a Bayesian criterion, such as minimizing a cost function, and the Neyman-Pearson criterion in which we search for the most powerful test given a test size. We will go into more detail about these choices below. Note that in order to assign a probability distribution on Γ we first must determine probabilities for each subset of the observation space. In some cases it is not possible to perform this in a manner that is consistent for all subsets so we must assume that there exists a σ−algebra, denoted G, containing all the subsets of Γ to which we are able to assign probabilities. We now call the pair (Γ, G) the observation space. 18 3.1.1 Detection & Estimation The detection part of our task may be formulated as a binary hypothesis testing problem in which we hope to decide whether the observation is associated with the target-absent or target-present situation. We call these two cases the null and alternative hypotheses respectively. We model these situations with the two probability distributions P0 and P1 defined on (Γ, G). We typically express this type of problem in the following manner: H0 : Y ∼ P0 H1 : Y ∼ P1 . (3.1) We can also express this in terms of probability density functions p0 and p1 as H0 : Y ∼ p0 H1 : Y ∼ p1 . (3.2) Specifically for the radar problem we write H0 : Y = n H1 : Y = d + n (3.3) where Y denotes the measured data, d is specifically the data from the target (typically given by equation (2.36)), and n is additive, or thermal, noise. Our goal is to process Y is such a way that we are able to determine if the target was present in our scene of interest. That is, we must decide if the null or alternative hypothesis is the true hypothesis describing our measured data. In order to process the data we define a decision rule denoted δ. This rule, or test, partitions the observation space Γ into two subsets Γ1 ∈ G and Γ0 = ΓC 1 where the superscript C denotes the complement. When Y ∈ Γj we decide that Hj is true for j = 0, 1. 
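As a deliberately simplified illustration of a decision rule, the sketch below simulates the radar hypotheses (3.3) with scalar Gaussian measurements and takes δ to be a threshold test; the signal level, noise level, and threshold are arbitrary values chosen only for illustration. The two probabilities it estimates are exactly the quantities traded off in the discussion that follows: the size and the power of δ.

```python
import numpy as np

# Toy illustration of a decision rule delta partitioning the observation space.
# Illustrative assumptions: scalar observations, a known signal level, Gaussian
# noise, and a fixed threshold eta; none of these values come from the thesis.

rng = np.random.default_rng(0)
n_trials, signal, sigma, eta = 100_000, 1.0, 1.0, 1.5

y_h0 = sigma * rng.standard_normal(n_trials)             # H0: Y = n      (target absent)
y_h1 = signal + sigma * rng.standard_normal(n_trials)    # H1: Y = d + n  (target present)

def delta(y):
    """Decide H1 (return 1) when y lands in Gamma_1 = {y : y > eta}, else H0."""
    return (y > eta).astype(int)

print("size  P_F =", delta(y_h0).mean())   # empirical probability of false alarm
print("power P_D =", delta(y_h1).mean())   # empirical probability of detection
```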
In the GLRT we choose the Neyman-Pearson criterion in order to define δ. This method seeks to maximize the probability of detection (also known as the 19 power of δ) for a given probability of false alarm (also known as the size of δ). It is important to observe that when using this criterion there will always be a trade-off between power, or probability of detection, and size, or probability of false alarm. We choose this criterion over the Bayes approach because it is often the case that we do not have prior information on the probability distributions describing the data. In Bayes estimation one typically assigns costs to our decisions. Therefore in this approach one seeks to minimize the Bayes risk, which is the average cost incurred by a decision rule δ. In order to calculate this quantity one must assign probabilities to the occurrences of H0 and H1 . We do not necessarily have information on how often H1 occurs versus H0 . Consequently for the GLRT we focus on the Neyman-Pearson approach from this point onward. We may express the process of maximizing the power given a certain size explicitly as: max PD (δ) subject to PF (δ) ≤ α δ (3.4) where PD is the probability of detection or the power of δ, PF is the probability of false alarm or the size of δ, and α is the desired test size, or accepted falsealarm rate. We define δ via the Neyman-Pearson lemma [37] which states that the likelihood ratio test is the uniformly most powerful test given the false-alarm rate α. Specifically we define δ as 1 if λ = p1 (Y )/p0 (Y ) > η δ(Y ) = q if λ = p1 (Y )/p0 (Y ) = η 0 if λ = p1 (Y )/p0 (Y ) < η. (3.5) Note that p1 and p0 are the probability density functions of the data under each R hypothesis. We determine η and q such that PF = p0 (y)dy = α. The quantity λ = p1 (Y )/p0 (Y ) is known as the likelihood ratio. It is often referred to as the test statistic as well. It is well known that λ is a sufficient statistic for Y [37]. In words this test amounts to calculating the likelihood ratio λ and comparing it some threshold η. If λ exceeds η then we decide that a target was indeed present in our 20 scene. If λ < η then we decide the null hypothesis was true. The case when λ exactly equals the threshold is usually assigned to one of the hypotheses. As we said η may be calculated such that the probability of false alarm is exactly equal to our predetermined acceptable false alarm rate α. One may also determine η experimentally by performing the likelihood ratio test for several possible η values and then choosing whichever value gives the best possible probability of detection when the false alarm rate is α or less. For those who are familiar with estimation and detection this process is usually done in conjunction with the formation of the receiver operating characteristics (ROC) curve [37]. We note here that if Y is a discrete-time random process it is a simple task to define and express p1 and p0 explicitly. The discrete case corresponds to our data depending on a discrete set of slow-times and fast-times. However, if the measurements depend on continuous-time variables writing an explicit expression for the probability density functions is not possible. We are required to use the RadonNikodym derivative of the probability measure P1 with respect to the measure P0 in order to calculate λ in this case. As seen in the previous chapter we assume that our data depends on some interval of the real line for both slow-time and fast-time when deriving the backprojected image. 
We will therefore need to express λ in the continuous case. This issue will be discussed directly in the next section.

We now consider the second task of estimating unknown parameters describing Y. In particular we will be interested in estimating the location of a possible target. In this problem we will again need to define probability distributions to describe the data, but now the distributions will vary depending on the unknown parameter, which we denote θ. We assume that θ lies in some set or space of possible values called the parameter space, which we denote Λ. We focus specifically on the maximum-likelihood estimation technique, which aims to find the value of θ that makes our observation Y most likely. Note that we now write the probability density function associated with Y with the notation p_θ(Y). We then choose our estimate of θ in the following manner:
\[
\hat{\theta}_{ML} = \arg\max_{\theta \in \Lambda} p_\theta(Y) \tag{3.6}
\]
where θ̂_ML is the maximum-likelihood estimate of θ. Note this is equivalent to maximizing the function log(p_θ(Y)). This quantity, log p, is simpler to compute than p when p has an exponential form, for example in the case of a Gaussian random process.

3.1.2 The Generalized Likelihood Ratio Test

We now describe the joint detection and estimation procedure known as the generalized likelihood ratio test [37, 39]. In general we now have the hypothesis testing problem
\[
H_0 : Y \sim p_{\theta_0}, \qquad H_1 : Y \sim p_{\theta_1} \tag{3.7}
\]
where θ_0 and θ_1 are two unknown parameters and p_{θ_0} and p_{θ_1} are the probability density functions of the data under the null and alternative hypotheses respectively. The goal is to decide which hypothesis is true and also to estimate the corresponding unknown parameter, θ_0 or θ_1, depending on which hypothesis is true. In the radar case we let θ_0 = 0, since the target is not present under H_0 and there is no parameter to estimate in this case. Therefore our hypothesis testing problem has the form
\[
H_0 : Y \sim p, \qquad H_1 : Y \sim p_\theta. \tag{3.8}
\]
Comparing this to equation (3.3) we see that n ∼ p and that θ is related to the target data, d. We already know what form the expression for d has, so the true unknown will end up being the target reflectivity itself, T. Later we will assume a specific type of scatterer to further simplify the unknown we wish to estimate. We first calculate the likelihood ratio for our data and then we maximize the resulting test statistic over all possible θ ∈ Λ in order to obtain our estimate θ̂_ML. Mathematically we calculate
\[
\lambda_\theta = \frac{p_\theta}{p} \tag{3.9}
\]
\[
\hat{\theta}_{ML} = \arg\max_{\theta \in \Lambda} \lambda_\theta. \tag{3.10}
\]
We would then decide whether H_0 or H_1 is true using the statistic λ_{θ̂_ML} in the test (3.5). This hypothesis testing problem is known as a detection problem in the presence of unknowns.

It is important to observe that the GLRT is not optimal, in contrast to the likelihood ratio test, which is optimal in the standard detection case. We define an optimal test as one that is uniformly most powerful. It is well known that this ideal test exists when λ(Y) is monotonically increasing or decreasing [37]. In practice our test statistic is not usually monotonic, so finding this optimal test is very challenging, and it is often necessary to rely on a suboptimal approach. We choose to utilize the GLRT because if a uniformly most powerful test does exist for our hypothesis testing problem, it coincides with the GLRT. Also we note that the GLRT is asymptotically uniformly most powerful as the number of observations approaches infinity.
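The following sketch works through (3.9)-(3.10) in a discretized setting: a known target signature at an unknown location θ in white Gaussian noise, for which the log-likelihood ratio reduces to a correlation of the data with the candidate signal minus an energy term. The pulse shape, noise level, and threshold are hypothetical choices made only for illustration.

```python
import numpy as np

# Sketch of the GLRT of (3.9)-(3.10) for a target at an unknown location.
# Illustrative assumptions (not from the thesis): discrete-time data, white
# Gaussian noise of variance sigma^2, and a known pulse shape so that the
# signal under H1 is the pulse shifted to the unknown location theta.

rng = np.random.default_rng(1)
n, sigma, true_theta = 256, 0.5, 90
t = np.arange(n)

def pulse(theta):
    """Hypothetical target signature centered at location theta."""
    return np.exp(-0.5 * ((t - theta) / 4.0) ** 2)

y = pulse(true_theta) + sigma * rng.standard_normal(n)   # measured data under H1

def log_glr(theta):
    """log lambda_theta = (y . M_theta - 0.5 ||M_theta||^2) / sigma^2 for Gaussian noise."""
    M = pulse(theta)
    return (y @ M - 0.5 * (M @ M)) / sigma**2

stats = np.array([log_glr(th) for th in range(n)])
theta_ml = stats.argmax()               # maximum-likelihood location estimate (3.10)
eta = 10.0                              # threshold set from the accepted false-alarm rate
print(theta_ml, stats.max() > eta)      # decide H1 if the maximized statistic exceeds eta
```

Scanning the statistic over candidate locations in this way is the structure that Section 3.4 relates to evaluating a backprojected image pixel by pixel.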
3.2 Continuous GLRT As noted previously typically we restrict ourselves to problems using discrete- time in order to be able to explicitly express the probability density functions of the data under each hypothesis and hence express the test statistic λ. There are some applications in which the observations are best modeled as a continuous-time random process. In particular if one recalls from the previous chapter we defined our data in terms of continuous fast-time and slow-time parameters. Therefore we find that extending the standard GLRT for continuous-time random processes is necessary in order for us to compare the GLRT with the backprojection method. We define a continuous-time random process Y as a collection of random variables {Y (t), t ∈ [0, T ]} indexed by a continuous parameter t. We have chosen the observation interval, or index set S = [0, T ] for simplicity. We describe the extension 23 for measurements dependent on two time parameters in the following section. We begin by describing the standard treatment of continuous-time random processes in detection and estimation schemes [37]. 3.2.1 General Continuous-Time Random Processes In the continuous-time case the observation space Γ becomes a function space. We therefore must consider the hypothesis testing problem H0 : Y ∼ P H1 : Y ∼ Pθ (3.11) where P and Pθ are probability measures as opposed to probability density functions in the discrete-time case. In order to perform the detection and estimation task as before we must define these families of densities on function spaces. We note that a density is a function that can be integrated in order to calculate probabilities. Thus it is necessary for us to choose a method of integration on function spaces; we focus on the Lebesgue-Stieltjes integral. This integral is a type of Lebesgue integral with respect to the Lebesgue-Stieltjes measure. An obvious example of such a measure is a probability measure. We will look at this example in order to better understand this type of integral since integration with respect to probability measures is familiar. We assume that we have a probability measure µ on the observation space (Γ, G). We let X be a measurable function from (Γ, G) to (R, B), where B denotes the Borel sets of R (i.e. the smallest σ−field containing all intervals (a, b], a, b ∈ R). We may then define the following probability distribution, PX on (R, B): PX (A) = µ(X −1 (A)), A ∈ B (3.12) where X −1 (A) = {y ∈ Γ|X(y) ∈ A}. Clearly X is a random variable and we can demonstrate a Lebesgue-Stieltjes integral by taking the expectation of X. Taking 24 the expectation is often thought of as averaging X weighted by µ. We have Z E[X] = Z X(y)µ(dy) = Xdµ. (3.13) Γ This is a Lebesgue-Stieltjes integral. We now are be able to define the idea of a probability density which is needed to calculate the likelihood ratio. We first state the fellowing definition: Definition: Absolute continuity of measures [7]. Suppose that µ0 and µ1 are two measures on (Γ, G). We say that µ1 is absolutely continuous with respect to µ0 (or that µ0 dominates µ1 ) if the condition µ0 (F ) = 0 implies that µ1 (F ) = 0. We use the notation µ1 µ0 to denote this condition. We also now quote the Radon-Nikodym theorem which will be key in defining the probability densities. The Radon-Nikodym Theorem [7]. Suppose that µ0 and µ1 are σ−finite measures on (Γ, G) and µ1 µ0 then there exists a measurable function f : Γ → R such that Z f dµ0 , µ1 (F ) = (3.14) F for all F ∈ G. 
Moreover f is uniquely defined except possibly on a set G0 with µ0 (G0 ) = 0. We call the function f the Radon-Nikodym derivative of µ1 with respect to µ0 and write it as f = dµ1 /dµ0 . Now recall we introduced the Radon-Nikodym derivative in order to express explicitly the probability densities p and pθ . Using the Radon-Nikodym theorem we see that if there exists a σ−finite measure µ on (Γ, G) such that P µ and Pθ µ for all θ ∈ Λ we may express the densities as p = dP/dµ and pθ = dPθ /dµ, θ ∈ Λ. Note that P and Pθ are the probability measures associated with p and pθ respectively. We observe that we can always define a measure µ such that µ = P +Pθ . 25 Therefore there always exists µ on (Γ, G) such that P µ and Pθ µ. Thus we can define our densities with respect to µ as suggested, i.e. p = dP/dµ and pθ = dPθ /dµ, θ ∈ Λ. Now to perform the GLRT we still need to write explicitly the likelihood ratio pθ /p. In order to compute this quantity we require the additional condition that Pθ P in order to use the Radon-Nikodym theorem again and have that the Radon-Nikodym derivative of Pθ with respect to P exists. It may be shown [37] that for any µ dominating both Pθ and P that we have dPθ /dµ pθ dPθ = = . dP dP/dµ p (3.15) We conclude that the likelihood ratio is simply the Radon-Nikodym derivative of Pθ with respect to P when Pθ P . We now briefly remark on the case when Pθ is not absolutely continuous with respect to P . In this case there exists a set F such that P (F ) = 0 and Pθ (F ) > 0. The extreme case when we have P (F ) = 0 and Pθ (F ) = 1 is known as orthogonality of P and Pθ . This case is important to observe in light of detection theory because in this case we say the hypothesis testing problem (3.7) is perfectly detectable. We therefore have the following steps to consider when given a hypothesis testing problem as in (3.7). First determine if the two measures are orthogonal; in this case we are finished with the detection task and no processing of the data is necessary. If this is not the case we then must determine if Pθ P . If so we then can calculate the Radon-Nikodym derivative as the likelihood ratio and perform the steps in the GLRT. 3.3 Reproducing-Kernel-Hilbert-Space Representations of Continuous-Time Random Processes We now move on to discuss Parzen’s reproducing kernel Hilbert space treat- ment of the estimation and detection tasks with continuous-time random processes [8]. The hope is to find a condition which guarantees that the measures P and 26 Pθ are absolutely continuous with respect to each other. If this condition is found one is able to guarantee the existence of the Radon-Nikodym derivative of Pθ with respect to P and in turn write out an explicit expression for the likelihood ratio. His technique begins by approximating a continuous-time process with a discrete-time process. He then develops the condition for when the limit of the discrete-time likelihood ratio exists and is equivalent to the continuous-time likelihood ratio. Parzen is then able to give explicit formulas for the likelihood ratio in the case of Gaussian random processes. It turns out that the main condition that must be satisfied is that the signal we are trying to detect must lie in the reproducing kernel Hilbert space with reproducing kernel given by the covariance kernel of the noise process. We will outline the main details of his work below. In order to calculate the likelihood ratio in the continuous case one begins by approximating with the discrete finite dimensional case. 
3.3 Reproducing-Kernel-Hilbert-Space Representations of Continuous-Time Random Processes

We now move on to discuss Parzen's reproducing kernel Hilbert space treatment of the estimation and detection tasks for continuous-time random processes [8]. The hope is to find a condition which guarantees that the measures P and P_θ are absolutely continuous with respect to each other. If such a condition is found, one is able to guarantee the existence of the Radon-Nikodym derivative of P_θ with respect to P and in turn write out an explicit expression for the likelihood ratio. Parzen's technique begins by approximating a continuous-time process with a discrete-time process. He then develops the condition under which the limit of the discrete-time likelihood ratio exists and is equivalent to the continuous-time likelihood ratio. Parzen is then able to give explicit formulas for the likelihood ratio in the case of Gaussian random processes. It turns out that the main condition that must be satisfied is that the signal we are trying to detect must lie in the reproducing kernel Hilbert space whose reproducing kernel is the covariance kernel of the noise process. We outline the main details of his work below.

In order to calculate the likelihood ratio in the continuous case one begins by approximating with the discrete, finite-dimensional case. We consider the case when the index set is a finite subset of S; that is, we let

S′ = (t_1, ..., t_n) ⊂ S.    (3.16)

We then define the hypothesis testing problem on this discrete subset S′ as

H_0 : Y ∼ P_{S′},    (3.17)
H_1 : Y ∼ P_{θ,S′},    (3.18)

where P_{S′} and P_{θ,S′} denote the probability measures associated with [Y(t), t ∈ S′] under H_0 and H_1 respectively. In addition we assume that P_{θ,S′} ≪ P_{S′}, and therefore the Radon-Nikodym derivative, or likelihood ratio, exists on this subset and is given by

λ_{S′} = dP_{θ,S′}/dP_{S′}.    (3.19)

The assumption that P_{θ,S′} ≪ P_{S′} is not a strong one, as we are typically able to write out the probability density functions, and hence the likelihood ratio, whenever S′ is discrete. Parzen then shows that if the quantity known as the divergence,

J_S = lim_{S′→S} J_{S′} = lim_{S′→S} [ E_{H_1}(log λ_{S′}) − E_{H_0}(log λ_{S′}) ],    (3.20)

is finite, then the probability measures defined on the full continuous index set S satisfy P_θ ≪ P. Therefore the Radon-Nikodym derivative of P_θ with respect to P exists. In the case that the divergence is finite Parzen also showed that we may calculate the Radon-Nikodym derivative, or likelihood ratio, λ via

λ = dP_θ/dP = lim_{S′→S} λ_{S′},    (3.21)

as the limit above exists in this case.

Now let us consider the case when the data has a Gaussian distribution. In particular we consider the hypothesis testing problem

H_0 : m(t) = n(t),    (3.22)
H_1 : m(t) = M(t) + n(t),    (3.23)

where m is the measured data, n is additive noise, and M is the signal we wish to detect. In the radar case M would correspond to data received from a target. Note that we assume that n is Gaussian with zero mean and covariance kernel given by

K(t, t′) = E[n(t) n(t′)].    (3.24)

Therefore m under H_1 is also a Gaussian random process with covariance kernel K, but with mean-value function M in this case. We begin by assuming that the time variable t resides in a continuous index set S = (0, ∞). Next it is assumed [8] that we can approximate S with a discrete subset S′. It is well known that the likelihood ratio, given a discrete-time Gaussian noise process, can be written as

log λ_{S′} = (m, M)_{K,S′} − (1/2)(M, M)_{K,S′},    (3.25)

where S′ is the discrete index set described above and the inner product (·, ·)_{K,S′} is given by

(f, g)_{K,S′} = Σ_{t,t′ ∈ S′} f(t) K⁻¹(t, t′) g(t′)    (3.26)

for any two functions f, g defined on S [8, 37]. In this case Parzen has shown that the divergence is finite, J_S < ∞, if and only if lim_{S′→S} (M, M)_{K,S′} < ∞. Therefore we have P_θ ≪ P, and consequently we can calculate λ, if (M, M)_{K,S′} approaches a limit as S′ → S. Parzen has also shown that the functions M having this property are elements of the reproducing kernel Hilbert space with reproducing kernel K, denoted H(K).

We now discuss briefly the reproducing kernel Hilbert space H(K) and how Parzen uses this theory to calculate the likelihood ratio. First note that if K is the covariance kernel of a random process {m(t), t ∈ S}, it may be shown that there exists a unique Hilbert space, denoted H(K), which satisfies the following definition.

Definition - Reproducing kernel Hilbert space [8]. A Hilbert space H is said to be a reproducing kernel Hilbert space, with reproducing kernel K, if the members of H(K) are functions on some set S, and if there is a kernel K on S ⊗ S having the two properties: for every t ∈ S, where K(·, t) is the function defined on S with value at s ∈ S equal to K(s, t),

K(·, t) ∈ H(K),
(g, K(·, t))_K = g(t)  for all g ∈ H(K),

where (·, ·)_K denotes the inner product on H(K).
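As a minimal sketch of the discrete-time quantities (3.25)-(3.26) (Python; the covariance kernel, signal, and sample grid are assumed purely for illustration), one can invert K on a finite index set S′ and evaluate the log-likelihood ratio directly.

import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)                            # finite index set S'
K = np.exp(-np.abs(t[:, None] - t[None, :]) / 0.1)       # assumed covariance kernel of the noise
M = np.sin(2 * np.pi * 3 * t)                            # assumed signal to detect

Kinv = np.linalg.inv(K)
inner = lambda f, g: f @ Kinv @ g                        # (f, g)_{K,S'} as in (3.26)

n = np.linalg.cholesky(K) @ rng.standard_normal(t.size)  # one realization of the Gaussian noise
for m, label in [(n, "H0"), (M + n, "H1")]:
    log_lam = inner(m, M) - 0.5 * inner(M, M)            # log-likelihood ratio (3.25)
    print(label, log_lam)
# On average the statistic sits near +0.5*(M,M)_K under H1 and near -0.5*(M,M)_K under H0.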
Again we mention that Parzen found that lim (M, M )K,S 0 < ∞ if and only if M ∈ H(K) S 0 →S (3.27) 29 and also we may say in this case that lim (M, M )K,S 0 = (M, M )K.S . (3.28) S 0 →S That is, we may calculate the inner product between two elements of H(K) using the limit (3.28). Using this fact we may use the following result to calculate λ. Theorem - Radon-Nikodym derivative [8]. Let Pθ be the probability measure induced on the space of sample functions of a time series {m(t), t ∈ S} with covariance kernel K and mean value function M . Assume that either (i) S is countable or (ii) S is a separable metric space, K is continuous, and the stochastic process {m(t), (t) ∈ S} is separable. Let P be the probability measure corresponding to the Gaussian process with covariance kernel K and with zero mean. Then Pθ and P are absolutely continuous with respect to one another, or orthogonal, depending on whether M does belong or does not belong to H(K). If M ∈ H(K) then the Radon-Nikodym derivative of Pθ with respect to P is given by 1 λ[m(t)] = exp (m, M )K − (M, M )K 2 (3.29) where (·, ·)K denotes the inner product on H(K). We remark that in words a function is an element of the reproducing kernel Hilbert space if it is as smooth as the noise. If this is the case we have that Pθ P and we can calculate λ using the above theorem. If this is not the case then Pθ and P are orthogonal and M is perfectly detectable. We quote another result of Parzen’s which will be key in our calculation of the likelihood ratio in the following section. These next set of results outline how one describes the elements of H(K) which we will use in order to determine if M ∈ H(K) and hence determines if we can calculate λ. Integral Representation Theorem [8]. Let K be a covariance kernel. If a measurable space (Q, B, µ) exists, and in the Hilbert space of all B−measurable 30 functions on Q satisfying Z |f |2 dµ < ∞ (f, f )µ = (3.30) Q there exists a family [f (t), t ∈ S] of functions satisfying 0 Z 0 K(t, t ) = (f (t), f (t ))µ = f (t)f (t0 )dµ (3.31) Q then the reproducing kernel Hilbert space H(K) consists of all functions g on S which may be represented as Z g(t) = g ∗ f (t)dµ (3.32) Q for some unique function g ∗ in the Hilbert subspace L[f (t), t ∈ S] of L2 (Q, B, µ) spanned by the family of functions [f (t), t ∈ S]. Note the superscript ∗ is simply notation and the bar (or overline) is used to indicate complex conjugation. The norm of g is given by ||g||2K = (g, g)K,S = (g ∗ , g ∗ )µ . (3.33) If [f (t), t ∈ S] spans L2 (Q, B, µ) then m(t) may be represented as a stochastic integral with respect to an orthogonal random set function [Z(B), B ∈ B] with covariance kernel µ: Z m(t) = f (t)dZ (3.34) E[Z(B1 )Z(B2 )] = µ(B1 B2 ). (3.35) Q Further, Z (g, m)K,S = g ∗ dZ. (3.36) Q We also quote the integral representation theorem specifically for the case when the random process m is wide-sense stationary. 31 Integral Representation Theorem for Stationary Processes [8]. Let S = [t : −∞ < t < ∞] and let [m(t), t ∈ S] be a stationary time series with spectral density function f (ω) so that Z 0 0 eiω(t−t ) f (ω)dω. K(t, t ) = Then H(K) consists of all functions g on S of the form Z G(ω)eiωt dω g(t) = where G(ω) = g ∗ (ω)f (ω) (for some unique function g ∗ (ω)), for which the norm ||g||2K |G(ω)|2 dω f (ω) Z = is finite. The corresponding random variable (m, g)K can be expressed in terms of the spectral representation of m. If Z eiωt dZ(ω) m(t) = then Z G(ω) dZ(ω). 
f (ω) (m, g)K = Parzen also extends the integral representation theorem result for the case when we have a discrete set of stationary processes. We may think of this as being analogous to the case of a discrete slow-time parameter. That is, we have the random process [ms (t), −∞ < t < ∞, s = s1 , ..., sn ] where si ∈ R for all i = 1, ..., n. In this case we express the covariance kernel of the noise process as 0 0 Z ∞ Ks,s0 (t, t ) = E[Xs (t)X s0 (t )] = 0 eiω(t−t ) fs,s0 (ω)dω (3.37) −∞ where fs,s0 (ω) is the spectral density function. We now extend the integral representation theorem for stationary processes to define the elements of H(K). Any gs (t) ∈ H(K) defined on s ∈ [s1 , ..., sn ] and t ∈ R is given by Z gs (t) = Gs (ω)eitω dω (3.38) 32 where 1 Gs (ω) = 2π Z eitω gs (t)dt (3.39) is the Fourier transform of gs and is written as Gs (ω) = g ∗ (ω)fs,s0 (ω). Note g ∗ (ω) is unique. The norm on H(K) is written as ||g||2K = Z X sn Gs (ω)f s,s0 (ω)Gs0 (ω) dω < ∞ (3.40) s,s0 =s1 0 where f s,s (ω) is the inverse of fs,s0 (ω). We are now left to write the inner product expressions (M, M )K and (m, M )K . Clearly we may use (3.40) to express (M, M )K . For the second term of the likelihood ratio we utilize the integral representation theorem again, that is sn X Z (m, M )K = s,s0 =s 0 Gs (ω)f s,s (ω)dZs0 (ω) (3.41) eitω dZs (ω). (3.42) 1 where Z ms (t) = 3.4 The Relationship between the GLRT and Backprojection in SAR imaging We now outline our specific hypothesis testing problem and discuss how to calculate the likelihood ratio. For simplicity, we assume that the object in our scene is a point scatterer with scattering strength C: T (x) = Cδ(x − y), where y is the unknown location of the object in our scene. Note that we did not previously address the maximum likelihood estimation step of the GLRT, though the only unknown we defined was the signal itself M . Now we define a specific form for the signal dy which depends on an unknown parameter y. This parameter, y, will be what we estimate in the maximum likelihood step of the GLRT. In this case, (2.36) has the 33 form Z dy (s, t) = F[T ](s, t) = C e−iω(t−φ(s,y)) A(y, s, ω)dω. (3.43) We note that A depends on y only through the geometrical spreading factors, which are slowly varying. Consequently we neglect the dependence on y and write A(y, s, ω) ≈ Ã(s, ω). (3.44) Thus we write d˜y (s, t) = F[T ](s, t) = C Z e−iω(t−φ(s,y)) Ã(s, ω)dω. (3.45) In order to determine whether the target is present, and if it is present determine its location, we consider the hypothesis testing problem H0 : m(s, t) = n(s, t) Hy : m(s, t) = d˜y (s, t) + n(s, t) (3.46) where m(s, t) is our measured data and n(s, t) is additive white Gaussian noise. Also we assume that our additive white Gaussian noise is stationary in fast-time and slow-time with spectral density S(ω; s1 , s2 ) = σ 2 δ(s1 − s2 ). (3.47) Taking the Fourier transform gives us the following covariance kernel Z K(t1 , t2 ; s1 , s2 ) = E[n(s1 , t1 )n(s2 , t2 )] = Z = σ 2 eiω(t1 −t2 ) dω. σ 2 eiω(t1 −t2 ) dωeiα(s1 −s2 ) δ(s1 − s2 )ds (3.48) for (s1 , t1 ) and (s2 , t2 ) ∈ S, where S = {(s, t) : s ∈ R, t ∈ R} is the index set on which our stochastic processes m and n are defined. In addition note that σ is a constant. 34 For our GLRT task we wish to detect the presence of T and also estimate its location y, i.e. we wish to find the maximum likelihood estimate of y and calculate the likelihood ratio in order to determine if T is present or not. 
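As a minimal numerical sketch of this detection-and-localization task (Python; the flight path, frequency band, and noise level are assumed for illustration, with Ã ≡ 1 and the phase φ(s, y) taken to be the two-way travel time), one can correlate the data against the noise-free signature of each candidate location and maximize over a grid; this anticipates the form of the test statistic derived below.

import numpy as np

rng = np.random.default_rng(1)
c = 3e8
s = np.linspace(-500.0, 500.0, 64)                       # slow-time sample positions along a straight path
gamma = np.stack([s, np.full_like(s, -2000.0), np.full_like(s, 3000.0)], axis=1)   # antenna positions
omega = 2 * np.pi * np.linspace(0.95e9, 1.05e9, 64)       # angular frequencies
y_true = np.array([30.0, 40.0, 0.0])                      # unknown target location

def phi(y):
    # assumed phase delay: two-way travel time from each antenna position to y
    return 2 * np.linalg.norm(gamma - y, axis=1) / c

# Frequency-domain data for a unit-strength point target plus white noise (amplitude taken as 1).
D = np.exp(1j * omega[None, :] * phi(y_true)[:, None])
D += 0.5 * (rng.standard_normal(D.shape) + 1j * rng.standard_normal(D.shape))

# Test statistic lambda(y) on a small grid of candidate ground locations.
xs = np.linspace(0.0, 60.0, 31)
ys = np.linspace(10.0, 70.0, 31)
lam = np.array([[abs(np.sum(np.exp(-1j * omega[None, :] * phi(np.array([x, yy, 0.0]))[:, None]) * D))
                 for x in xs] for yy in ys])
i, j = np.unravel_index(lam.argmax(), lam.shape)
print(xs[j], ys[i])                                       # maximum of the statistic, near (30, 40)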
We express this as follows: py (m) p(m) = arg(max λ(y)) λ(y) = yM L y∈Λ (3.49) where Λ is the set of ground locations. In order to decide if a target did exist at location yM L we would compare the statistic λyM L to a predetermined threshold η as in the test defined in (3.5). Recall in the previous section in order to calculate λ for our continuous-time data we must be able to calculate dPy /dP which exists if and only if Py P . We will use the Hilbert space techniques of Parzen’s to form an expression for the likelihood ratio and the maximum likelihood estimate of y in terms of reproducing kernel inner products. We summarize our result in the following theorem. Theorem. Given the hypothesis testing problem (3.46) and the definitions (3.45) of the data d˜y and (3.48) of the noise covariance kernel K respectively we have that the likelihood ratio, or test statistic, for detecting d˜y is given by the following backprojection operator: Z λ(y) = KF[T ](y) = eiω(t−φ(s,y)) Ã(s, ω)dωm(s, t)dsdt. (3.50) We also have that the maximum likelihood estimate of y is given by, yM L Z iω(t−φ(s,y)) = arg max |λ(y)| = arg max e Ã(s, ω)dωm(s, t)dsdt. y∈Λ y∈Λ (3.51) Proof. We begin by describing the reproducing kernel Hilbert space generated by the covariance kernel of the noise process n. Recall that previously we only considered random processes dependent on a discrete slow-time parameter. Our process 35 depends on a continuous slow-time and therefore we must generalize Parzen’s results for our situation. For a continuous slow-time process, i.e. [m(s, t), s ∈ R, t ∈ R], we simply replace the summations in the preceding expressions with integrals over the slow-time parameter. In this case the elements of H(K) are functions g defined on s ∈ R and t ∈ R of the form Z g(s, t) = G(s, ω)eitω dω (3.52) where G(s, ω) = g ∗ (s, ω)f (ω; s, s0 ) and f (ω; s, s0 ) is again the spectral density of the noise process n. If we assume K has the form in equation (3.48) we have that f (ω; s, s0 ) = σ 2 . (3.53) We now note that we may write the data in the form d˜y (s, t) = Z eitω D̃y (s, ω)dω, (3.54) D̃y (s, ω) = Ceiωφ(s,y) Ã(s, ω). (3.55) where Therefore we see that d˜y ∈ H(K) and we may use Parzen’s result for the RadonNikodym derivative to compute λ(y). If we extend the expression for (m, d˜y )K for continuous slow-time we have Z D̃y (s, ω) dZ(ω; s)ds f (ω; s, s) Z Z Ce−iωφ(s,y) Ã(s, ω) itω e m(s, t)dt dωds = σ2 Z C = 2 e−iωφ(s,y) Ã(s, ω)M (s, ω)dωds σ (m, d˜y )K = (3.56) where M (s, ω) is the fast-time Fourier transform of m(s, t) and also observe we have written this statement in terms of a single slow-time s as our process is stationary in slow-time. We may use similar steps to evaluate the other term in the likelihood 36 ratio (d˜y , d˜y )K . We find that 1 (d˜y , d˜y )K = 2 σ Z |C Ã(s, ω)|2 dsdω. (3.57) Thus using Parzen’s theorem we find that C λ(y) = 2 σ Z −iωφ(s,y) e 1 Ã(s, ω)M (s, ω)dωds + 2 σ Z |C Ã(s, ω)|2 dsdω. (3.58) Note that the second term of (3.58) does not depend on the unknown parameter y and therefore does not provide any information for our estimation and detection task. We can therefore neglect the second term of (3.58) and obtain the following expression for the test statistic at each possible location y: Z λ(y) = e−iωφ(s,y) Ã(s, ω)M (s, ω)dsdω. (3.59) If we take the inverse Fourier transform of M (s, ω) we then obtain the time domain version of the test statistic: Z λ(y) = eiω(t−φ(s,y)) Ã(s, ω)dω m(s, t) dsdt. 
(3.60) We see that (3.60) is a special case of (2.38), where in (3.60) we have used m rather than d to denote the collected data. In (3.60), the filter Q of (2.38) is the matched filter, namely the complex conjugate of the amplitude Ã. We have shown that, with this choice of filter, the FBP image is equivalent to the test statistic calculated at each possible location y. We observe that one may think of the values of the test statistic for each y as a value assigned to a pixel. All these pixel values can be plotted to obtain a corresponding ‘test statistic image’, which as we have shown, is equivalent to a filtered-backprojection image formed with a matched filter. The final step of the detection and estimation problem is to estimate the unknown y. We achieve this simply by maximizing the above expression over all 37 possible y ∈ X, i.e. yM L Z iω(t−φ(s,y)) = arg max |λ(y)| = arg max e Ã(s, ω)dω m(s, t) dsdt. y∈Λ y∈Λ (3.61) This completes the proof. We remark that this result can easily be extended to the case when our additive noise is colored by replacing σ 2 with a spectral density function f (ω). In this case we obtain a similar result to above, but instead the filter is a matched filter in conjunction with a whitening filter, as one would expect. It is also important to observe that this result holds only for the case when (d˜y , d˜y )K is not dependent on R y, that is, the filter energy σ12 |C Ã(s, ω)|2 dsdω does not depend on the unknown parameter y. If this were not the case, which it often is not, one would not obtain simply a backprojection image expression for the test statistic λ(y). We conclude our study of the relationship of backprojection and the generalized likelihood ratio test with a discussion of the consistency property of the maximum likelihood estimate yM L . Recall that in backprojection we can use the pseudolocal property of the image-fidelity operator to guarantee that the target appears at the correct location y. It is well known [37] that yM L → y in probability as the number of measurements approaches infinity. We observe that in reality our data depends on a finite number of measurements, that is, t and s do not really span the entire real line. Therefore our image-fidelity operator is an approximate pseudodifferential operator. However as the number of measurements approaches infinity we will have an exact pseudodifferential operator. Similarly in the GLRT as the intervals on which t and s take values approaches the entire real line our maximum likelihood estimate of y converges to the true location in probability. This is known as consistency of the estimate. We note that further work is needed to truly understand why such similar results come from markedly different theories. This leads one to assume that use of both analysis (microlocal analysis in particular) and statistics is necessary and may lead to new breakthroughs in imaging capabilities. CHAPTER 4 Polarimetric synthetic-aperture inversion for extended targets in clutter 4.1 Introduction In this second body of work we consider the task of developing an imaging algorithm specifically for extended targets (curve-like, edges). Our goal is to create a model that reflects the directional scattering of edges. We also choose to work with a polarimetric radar system so that all sets of polarimetric data are available for the reconstruction task. The specific polarimetric radar system we consider includes two dipole antennas mounted on an aircraft. 
Each antenna has a linear polarization orthogonal to the other. Also both antennas are used for transmission and reception and in this way one is able to collect four sets of data, one for each transmitter and receiver pair. This type of system is unique in that it includes the polarization state of the electromagnetic waves in the model of the wave propagation. This model is derived from a full vector solution of Maxwell’s equations. This is different from standard SAR models of wave propagation in which one assumes the scalar wave equation is sufficient. A key difference between standard SAR and polarimetric SAR is how one describes the scatterers present in the scene of interest. In standard scalar SAR a scatterer is described by a scalar scattering strength, or a reflectivity function. This assumption indicates that each complex scatterer is made up of point scatterers. In polarimetric SAR the scatterers are described by a scattering vector (a 4 × 1 vector in particular), and therefore its scattering strength is dependent on the polarization states of the antennas used for transmission and reception. This may be thought of as describing each complex scatterer as a collection of dipole elements [27]. In addition when one uses the standard scalar wave equation model there is only one polarimetric channel of data available for use in the reconstruction. A polarimetric system enables one to incorporate all polarimetric channels of data and therefore provides more information for the reconstruction scheme. 38 39 Most current work in polarimetric radar, or polarimetry, is not focused on imaging algorithms. It is assumed that one may obtain an image of each element of the scattering vector describing the object of interest from the corresponding data set using standard SAR imaging algorithms. That is, if one antenna’s polarization state is denoted a and the second b, we may reconstruct the element of the scattering vector denoted Sa,b from the data set collected when antenna a is used for transmission and antenna b for reception. One goal in polarimetry is target detection which focuses on applying detection and estimation schemes to polarimetric images [16, 17, 18, 19, 20, 22, 23, 24, 26]. There is also another body of work in polarimetry aimed at other applications such as geographical or meteorological imaging. This work focuses on estimating parameters that distinguish types of distributed scatterers such as foliage or rain droplets. It has been found by many researchers [18, 20, 21, 26, 41] that a single scattering vector does not describe these distributed scatterers adequately. These scatterers, like foliage or vegetation, are prone to spatial and/or time variations and therefore a correlation, or covariance matrix is needed to describe them. In this work we will focus on man-made targets. In particular we investigate the optimal backprojection imaging operator for polarimetric SAR data. We assume that objects in the scene of interest are made up of the dipole elements described above. The actual scattering vector for any object is assumed to be a second-order random process. In addition, measurement noise is included in the data model as a second-order process. The object of interest is assumed to display directional or anisotropic scattering behavior, therefore modeling an edge or curve. This object may be thought of as an edge of any manmade object, for example a vehicle. 
We make the assumption that any individual dipole element making up the curve is only visible when the radar look direction is perpendicular to the orientation of that dipole element. This assumption serves to incorporate directional scattering and also to make a distinction between scatterers that make up the object of interest and other scatterers present in the scene (i.e. clutter). The clutter scattering behavior is assumed to be isotropic. This directional scattering assumption is strong, but it enables us to write an analytic inversion scheme. We discuss the validity of this assumption further in section 4.6. The imaging technique used is an extension of the algorithm in [15] to the vector case. A filtered-backprojection type reconstruction method [1, 2] is used, where a minimum mean-square error (MSE) criterion is utilized for selecting the optimal filter. We found that it is optimal to use what is called a coupled filter, or a fully dense filter matrix. This differs from the standard polarimetric SAR imaging algorithms, which assume that a diagonal filter (i.e. one that reconstructs each element of the scattering vector from its corresponding data set) may be used.

We begin with a short introduction to polarimetric radar and the polarization state of electromagnetic waves. We also briefly review the standard radar cross section (RCS) models used for extended targets in the radar literature. We then consider our specific dipole SAR forward model, which stems from a method-of-potentials solution to Maxwell's equations. This model will be rewritten in the same variables as the standard RCS model in order to make a formal comparison of the two models. We then describe the assumptions made in order to capture the directional scattering behavior of the target and clutter. These assumptions also serve to linearize the forward model so that we are able to write out an analytic inversion scheme. We describe the imaging process in general and then discuss the optimal filters both in the case when target and clutter are statistically independent and in the case when the two processes are correlated. We conclude with numerical simulations comparing our imaging scheme with the standard polarimetric channel-by-channel processing method. We demonstrate that our method improves the mean-square error (as the filter was defined to be optimal in the mean-square sense) and also the final image signal-to-clutter ratio. We also show examples where the coupled processing technique allows us to reconstruct the correct target orientation when the standard processing fails in this respect.

4.1.1 Polarimetric Concepts

When discussing wave propagation for polarimetric SAR we begin again with Maxwell's equations. Recall

∇ × E(t, x) = −∂B(t, x)/∂t    (4.1)
∇ × H(t, x) = J(t, x) + ∂D(t, x)/∂t    (4.2)
∇ · D(t, x) = ρ    (4.3)
∇ · B(t, x) = 0    (4.4)

where E is the electric field, B is the magnetic induction field, D is the electric displacement field, H is the magnetic intensity or magnetic field, ρ is the charge density, and J is the current density. For simplicity we again consider the case in which Maxwell's equations simplify to a wave equation for each component of the electric and magnetic fields. These assumptions are used solely to obtain a wave solution simple enough for us to visualize the polarization state. We return to the full Maxwell's equations to model the wave propagation in the following section.
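As a quick symbolic check of this simplification (Python/SymPy; the plane-wave component and the dispersion relation below are written out explicitly as assumptions), one Cartesian component of a monochromatic plane wave satisfies the source-free wave equation exactly when |k|² = ω²µ₀ε₀.

import sympy as sp

x, y, z, t = sp.symbols('x y z t', real=True)
kx, ky, kz, omega, E0, mu0, eps0 = sp.symbols('k_x k_y k_z omega E_0 mu_0 epsilon_0', positive=True)

# One Cartesian component of a monochromatic plane wave travelling in direction k.
E = E0 * sp.cos(kx*x + ky*y + kz*z - omega*t)

# Source-free wave equation: Laplacian(E) - mu0*eps0 * d^2 E / dt^2.
wave_op = sum(sp.diff(E, v, 2) for v in (x, y, z)) - mu0*eps0*sp.diff(E, t, 2)

# Impose the dispersion relation omega = |k| / sqrt(mu0*eps0) and simplify.
residual = sp.simplify(wave_op.subs(omega, sp.sqrt(kx**2 + ky**2 + kz**2) / sp.sqrt(mu0*eps0)))
print(residual)    # 0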
The simplest solution to the wave equation (for a linear source free homogeneous medium) is known as a plane wave. These waves have constant amplitude in a plane perpendicular to the direction of propagation. We express the electric field of a plane wave as E(r, t) = E(r)cos(ωt) (4.5) where r ∈ R3 is the position vector, k ∈ R3 is the direction of propagation, ω is the angular frequency, and t is time (specifically our fast-time). Note that in particular we call this type of wave a monochromatic plane wave because it varies in time with a single angular frequency. We may also write its representation in a form that is independent of t E(r) = Eeik·r (4.6) where E is a constant-amplitude field vector. It is important to observe that we have defined a right-handed coordinate system, denoted (ĥ, v̂, k̂). We have that 42 E lies in the plane perpendicular to k̂ and therefore may be written as a linear combination of the basis vectors which define this plane, that is, E = Eh ĥ + Ev v̂. (4.7) We may now discuss polarization of waves. This quantity is used to describe the behavior of the field vector in time. If we look specifically in the plane perpendicular to the direction of propagation and let the field vector vary with time it will trace out its polarization state. In general the vector traces out an ellipse, which is known as the polarization ellipse. As shown in Figure (4.1) the shape of the el- Figure 4.1: Linear, Circular, and Elliptical Polarization States lipse varies with polarization state. We have pictured from left to right respectively, linear, circular, and elliptical polarization states. These states are defined by two angles, the orientation angle ψ and the ellipticity angle χ. 43 Figure 4.2: The Polarization Ellipse The orientation angle describes the direction, or slant, of the ellipse and takes on values in the range 0 ≤ ψ ≤ π. The ellipticity angle takes on the values −π/4 ≤ χ ≤ π/4 and characterizes the shape of the ellipse. For example χ = 0 describes the linear states, in particular we have the horizontal state described further by ψ = 0 and the vertical state where in this case ψ = π. For circular polarization we have χ = π/4. All other ellipticity angles describe the various elliptical states. We now discuss briefly the polarimetric scattering scenario. Again for simplicity we will assume that our transmitting antenna transmits a fully polarimetric monochromatic plane wave denoted E i . This wave has propagation direction ki and its field vector is written in terms of the basis vectors defining the plane perpendicular to ki . That is, E i = Ehi ĥi + Evi v̂i . (4.8) As in standard SAR this wave will interact with the target, or scatterer, and the wave speed will change. In addition the polarization state and/or degree of polarization may change due to this target interaction. We will focus on characterizing this change now and will add in the wave speed change, or reflectivity function, when we write our full forward model. Now we assume that the wave which scatters off the target is received at the antenna which lies in direction ks , in the far-field of the object. We therefore may express the scattered field vector in terms of the basis vectors defining the plane 44 perpendicular to ks . That is E s = Ehs ĥs + Evs vˆs . (4.9) Note that the right-handed coordinate system (ĥs , v̂s , k̂s ) is not typically the same as the coordinates (ĥi , v̂i , k̂i ). This process, which takes E i and returns E s , is thought of as a transformation performed by the scatterer. 
We describe this transformation mathematically as E s = [S]E i Shh Shv E i, = Svh Svv (4.10) where [S] is known as the scattering matrix for the scatterer present in the scene. This will be incorporated into the quantity we reconstruct later when we discuss polarimetric imaging. Note that for a given frequency and scattering geometry [S] depends only on the scatterer, however it does depend on the basis we use to describe the waves. Also we remark that in polarimetric SAR we attempt to measure the scattering matrix by transmitting two orthogonal polarizations on a pulse-to-pulse basis and then receiving the scattered waves in the same two orthogonal polarizations. In our specific case we will perform this task with two orthogonal dipole antennas, denoted a and b. 4.2 Radar Cross Section for Extended Targets In the preceding section we have discussed how to model the polarization change when a field interacts with a target. Also in chapter two we discussed modeling the change in wave speed with the scalar reflectivity function. We will describe a third way of characterizing a target known as the radar cross section. This quantity is most common in radar literature and preferred by radar engineers. For this reason we will outline the accepted radar cross section for our extended or curve-like targets and we will also compare our scattering model to this after our model is explained in section 4.3. 45 4.2.1 Radar Cross Section and Polarimetric Radar Cross Section We begin by describing the radar equation which may be used to express the interaction between the incident field, the target, and the receiving antenna. It is written in terms of the power the target absorbs, or intercepts, from the incident wave and then reradiates to the receiving antenna. Mathematically we express this relationship as PR = PT GT (θ, φ) Aer (θ, φ) σ 2 4πrT2 4πrR (4.11) where PR is the power detected at the receiving antenna, PT is the transmitted power, GT is the transmitting antenna gain, Aer is the effective aperture of the receiving antenna, and rT and rR are the distances between the target and the transmitting and receiving antenna respectively. Also we have the spherical angles θ and φ that describe the azimuth and elevation angles of observation. We note that one may arrive at the radar equation directly from our standard SAR forward model given in equation (2.25). The RCS is given by the quantity σ. It is defined as the cross section of an equivalent idealized isotropic scatterer that generates the same scattered power density as the target in the observed direction. We may express σ in the form σ = 4πr2 |E s |2 . |E i |2 (4.12) Observe that σ depends on the frequency transmitted, the polarization state of the wave transmitted, the flight path or antenna placement, and also the target’s geometry and dielectric properties. We are mainly concerned with its polarization dependence so we will discuss that in more detail now. We define the polarization-dependent RCS as σqp = 4πr2 |Eqs |2 |Epi |2 (4.13) where p is the polarization state of the transmitted field and q is the polarization of the scattered field. We now recall the expression for the polarization scattering process (4.10). In the literature the relationship between the radar cross section and 46 the scattering matrix elements is defined as σqp = 4π|Sqp |2 . (4.14) Note that this expression neglects to include all terms of the scattered field. For example, if we use the standard (ĥ, v̂) basis we have that Ehs = Shh Ehi + Shv Evi . 
However for equation (4.14) to hold it must be assumed that Ehs ≈ Shh Ehi . In our scattering model we choose not to make this assumption. This will be the main difference between our model and the accepted model from the radar literature. This point will be discussed in more detail in the following section. 4.2.2 Method of Potentials Solution of Maxwell’s Equations We now go on to discuss how to arrive at the specific expressions for the electric fields E i and E s . We begin with Maxwell’s equations in the frequency domain: ∇ × E(ω, x) = iωB(ω, x) (4.15) ∇ × H(ω, x) = J (ω, x) − iωD(ω, x) (4.16) ∇ · D(ω, x) = ρ(x) (4.17) ∇ · B(ω, x) = 0. (4.18) First note that since the magnetic induction field B has zero divergence it may be expressed as the curl of another field A. We call A the vector potential and write B = ∇ × A. (4.19) We insert (4.19) into (4.15) to arrive at ∇ × (E(ω, x) − iωA(ω, x)) = 0. (4.20) 47 Now we use another fact from vector calculus, that is, a vector field whose curl is zero can be written as the gradient of a potential. Mathematically E − iωA = −∇Φ (4.21) where Φ is called the scalar potential. We rewrite this to express the electric field as E = −∇Φ + iωA. (4.22) We now assume that our medium is free space and therefore the free space constitutive relations hold. That is, we have D = 0 E (4.23) B = µ0 H. (4.24) Using these and also (4.22) in (4.16) and (4.17) we arrive at a system of equations for A and Φ. We have ∇ × (µ0 ∇ × A) = J − iωE = J − iω0 (−∇Φ + iωA) ∇ · (0 E) = 0 ∇ · (iωA − ∇Φ) = ρ. (4.25) (4.26) We pause here to discuss an issue with the definitions of A and Φ. Observe that if one adds the gradient of any scalar field, say ψ, to A, the physical magnetic induction field will not change because the curl of ψ is always zero. Also if one adds the quantity (iωψ) to Φ then E will remain unchanged. Therefore the transformation A → A + ∇ψ Φ → Φ + iωΦ (4.27) does not affect E and H. This is called a gauge transformation. In order to solve for our fields we must add an additional constraint to the system of equations for A and Φ. We will use the constraint known as the Lorenz gauge, which states, ∇ · A − iω0 µ0 Φ = 0. 48 Now returning to our system of equations for A and Φ we begin to solve by using the triple-product (or BAC-CAB identity) in (4.25). We have µ−1 0 2 −1 ∇ × (∇ × A) = µ0 ∇(∇ · A) − ∇ A = J − iω0 µ0 (iωA − ∇Φ). (4.28) Rearranging terms gives us ∇2 A + ω 2 0 µ0 A = µ0 J − iω0 µ0 ∇Φ + ∇(∇ · A). (4.29) Next we use the definition k 2 = ω 2 0 µ0 and write ∇2 A + k 2 A = µ0 J + ∇(∇ · A − iω0 µ0 Φ). (4.30) Note that the expression in parentheses is exactly the Lorenz gauge constraint. Therefore this expression simplifies to the Helmholtz equation ∇ 2 A + k 2 A = µ0 J . (4.31) Now we move on to solve for Φ. To find the expression for Φ we begin with equation (4.26), which we restate here 0 ∇ · (iωA − ∇Φ) = ρ. (4.32) From the Lorenz gauge we have that ∇ · A = iω0 µ0 Φ, which gives us iω0 (iω0 µ0 Φ) − 0 ∇ · (∇Φ) = ρ. (4.33) This expression may be rewritten in terms of k again as ∇2 Φ + k 2 Φ = −ρ/0 . (4.34) Therefore we see that solving the Maxwell’s equations and finding the electric and magnetic field expressions reduces to solving two uncoupled Helmholtz equations 49 in free space. We will focus from this point on specifically on cylindrical extended targets for simplicity. This type of extended target is dealt with extensively in radar cross section literature and we will ultimately compare our forward model with this accepted RCS model. 
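Before specializing to cylindrical coordinates, a short symbolic check (Python/SymPy; ψ, Φ, and A below are arbitrary smooth fields, and the gauge shift of Φ is taken to be iωψ, as the invariance of E requires) confirms the gauge freedom invoked above: the transformed potentials produce the same B = ∇ × A and E = −∇Φ + iωA.

import sympy as sp

x, y, z, omega = sp.symbols('x y z omega', real=True)
psi = sp.Function('psi')(x, y, z)
Phi = sp.Function('Phi')(x, y, z)
A = sp.Matrix([sp.Function('A%d' % i)(x, y, z) for i in range(3)])

grad = lambda f: sp.Matrix([sp.diff(f, v) for v in (x, y, z)])
curl = lambda F: sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                            sp.diff(F[0], z) - sp.diff(F[2], x),
                            sp.diff(F[1], x) - sp.diff(F[0], y)])

# Gauge-transformed potentials: A -> A + grad(psi), Phi -> Phi + i*omega*psi.
A2, Phi2 = A + grad(psi), Phi + sp.I * omega * psi

print((curl(A2) - curl(A)).applyfunc(sp.simplify))                                        # zero vector: B unchanged
print(((-grad(Phi2) + sp.I*omega*A2) - (-grad(Phi) + sp.I*omega*A)).applyfunc(sp.simplify))  # zero vector: E unchanged

This is why the Lorenz gauge may be imposed without loss of generality when reducing Maxwell's equations to the two Helmholtz equations above.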
It is common to solve the Helmholtz equation in cylindrical coordinates in order to find the expressions for our vector and scalar potentials when one considers a target that is cylindrical in shape. This solution of the Helmholtz equation will then be used to calculate E and H. Once we have the expression for the electric field, we may write down the radar cross section for our target. 4.2.3 Helmholtz Equation in Cylindrical Coordinates As shown above, finding the expression for the electric (and hence magnetic) field amounts to solving the Helmholtz equation. We begin by considering the scalar Helmholtz equation in cylindrical coordinates in a source-free region. This equation is given by ∂ψ 1 ∂ 2ψ ∂ 2ψ 1 ∂ ρ + 2 2 + 2 + k 2 ψ = 0. ρ ∂ρ ∂ρ ρ ∂φ ∂z (4.35) We define ρ, φ, and z in Figure (4.3) depicting standard cylindrical coordinates. To solve this partial differential equation we will use the method of separation of variables and hence look for a solution of the form: ψ = R(ρ)Φ(φ)Z(z). (4.36) Without going through all the intermediary calculations we arrive at the following set of separated equations for R, Φ, and Z d dR ρ ρ + [(kρ ρ)2 − n2 ]R = 0 dρ dρ d2 Φ + n2 Φ = 0 dφ2 d2 Z + kz2 Z = 0 dz 2 (4.37) (4.38) (4.39) where n is the separation constant and we have the relation kρ2 + kz2 = k 2 . We 50 Figure 4.3: Cylindrical Coordinates first observe that equations (4.38) and (4.39) are simply harmonic equations and therefore the solutions are given by Φ = h(nφ) and Z = h(kz z) where h is any harmonic function. Now equation (4.37) is a Bessels equation of order n. We will denote the solution of the equation as Bn (kρ ρ). It is well known that in general Bn (kρ ρ) is a linear combination of any two linearly independent Bessel functions. We have now that a solution of the Helmholtz equation in cylindrical coordinates is given by the elementary wave function ψkρ ,n,kz . These functions are written ψkρ ,n,kz = Bn (kρ ρ)h(nφ)h(kz z). (4.40) The general solutions are therefore linear combinations of these functions (4.40). The general solution is given as a sum over all possible values of n and kz (or n and kρ ), i.e. ψ= XX n kz Cn,kz ψkρ ,n,kz = XX n Cn,kz Bn (kρ ρ)h(nφ)h(kz z), (4.41) kz where Cn,kz are constants. We may also obtain general solutions which integrate 51 over all possible kz (or kρ ) when these quantities are continuous. We note that n is usually discrete and therefore we will continue to sum over n values. In this case the general solution is given by ψ= XZ n fn (kz )Bn (kρ ρ)h(nφ)h(kz z)dkz (4.42) kz where the integration is over any contour in C when kz ∈ C, or any interval in R when kz ∈ R. The functions fn (kz ) are analogous to the constants Cn,kz and are obtained from the boundary conditions imposed in a specific propagation and/or scattering scenario. In order to fully define these solutions we must choose appropriate Bessel and harmonic functions. We will first choose the harmonic functions h(nφ) = einφ and h(nkz ) = eikz z as they are linear combinations of both sine and cosine. For Bessels equation we may choose functions based on the behavior we expect to see at ρ = 0 or ρ → ∞. If we seek a solution that is non-singular at ρ = 0 then we are required to select Bessels functions of the first kind, denoted Jn (kρ ρ). If instead we seek a solution which decays as ρ → ∞, i.e. outward-traveling waves, we will choose (2) Hankel function of the second kind, denoted Hn (kρ ρ). Now we move on to write expressions for E and H in terms of these elementary wave functions ψ. 
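Before doing so, a quick numerical check (Python/SciPy; the order n and the argument grid are arbitrary choices) confirms that J_n satisfies the radial equation (4.37) and recalls the large-argument decay that motivates choosing H_n^(2) for outward-traveling waves.

import numpy as np
from scipy.special import jv, jvp, hankel2

n = 3
x = np.linspace(0.5, 20.0, 400)          # x = k_rho * rho

# Bessel's equation of order n in the variable x: x^2 R'' + x R' + (x^2 - n^2) R = 0, cf. (4.37).
residual = x**2 * jvp(n, x, 2) + x * jvp(n, x, 1) + (x**2 - n**2) * jv(n, x)
print(np.max(np.abs(residual)))          # at machine-precision level: J_n solves the radial equation

# Outward-travelling choice: |H_n^(2)(x)| behaves like sqrt(2/(pi*x)) for large x, so it decays with rho.
print(abs(hankel2(n, 50.0)), np.sqrt(2.0 / (np.pi * 50.0)))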
Using the method of potential solutions in cylindrical coordinates we may write the elements for a field polarized along the z-axis, also known as TM to z (i.e. no Hz element) as Eρ = Eφ = Ez = Hρ = Hφ = Hz = 1 ∂ 2ψ iω ∂ρ∂z 1 ∂ 2ψ ρiω ∂φ∂z 1 ∂2 ( + k 2 )ψ iω ∂z 2 1 ∂ψ ρ ∂φ ∂ψ − ∂ρ 0. (4.43) (4.44) (4.45) (4.46) (4.47) (4.48) 52 One may express any field TM to z in terms of these solutions when in a source-free region. Similarly we may express a field orthogonally polarized, known as TE to z (i.e. no Ez component) as Hρ = Hφ = Hz = Eρ = Eφ = Ez = 1 ∂ 2ψ iωµ ∂ρ∂z 1 ∂ 2ψ ρiωµ ∂φ∂z 1 ∂2 ( + k 2 )ψ iωµ ∂z 2 −1 ∂ψ ρ ∂φ ∂ψ ∂ρ 0. (4.49) (4.50) (4.51) (4.52) (4.53) (4.54) Any TE field may be expressed in terms of these solutions. Also note we may express any arbitrarily polarized field as a superposition of the TM and TE fields above. 4.2.4 Scattering in Two Dimensions We now consider an example scattering problem in two dimensions. We begin with this simple scenario and build off these solutions to obtain the RCS of our extended target in three dimensions. We assume that we have an incident plane wave which scatters off an infinitely long cylinder lying along the z-axis. For example we assume that the incident field is z-polarized (TM to z), therefore we have that Ezi = E0 e −ikρ cos φ = E0 ∞ X i−n Jn (kρ)einφ . (4.55) n=−∞ For details on how to obtain this expression for the incident field see [31]. The total field is the sum of the incident and scattered fields which is expressed mathematically as Ez = Ezi + Ezs . (4.56) We assume that our solution is composed of outward-traveling waves and therefore 53 Figure 4.4: Scattering scenario with infinite length cylinder lying along the z-axis, incident field is normal to the cylinder [28] we have that the scattered field must include the Hankel function of the second kind as described above. Explicitly we write Ezs = E0 ∞ X i−n an Hn(2) (kρ)einφ . (4.57) n=−∞ This gives us the following expression for the total field: Ez = E0 ∞ X i−n (Jn (kρ) + an Hn(2) (kρ))einφ . (4.58) n=−∞ Now in order to determine the coefficients an we must impose a boundary condition. We will assume that for this cylinder the z-component of the electric field is zero on the surface of the scatterer. If we assume that the cylinder has radius a we say that Ez = 0 at ρ = a. This boundary condition allows us to solve for the coefficients an which are given by an = −Jn (ka) (2) . (4.59) Hn (ka) It is also of interest to consider an asymptotic or approximate solution for the scattered field in the far-field of the cylinder. In this case one may utilize asymptotic (2) formulas for Hn for kρ → ∞. We find in this case that Ezs approaches the following 54 form: s Ezs → E0 ∞ 2i −ikρ X an einφ e πkρ n=−∞ (4.60) where an is defined as in equation (4.59). We may also consider the case when the incident field has the orthogonal polarization, that is transverse to z or TE to z. In this case we write the incident field as Hzi −ikx = H0 e ∞ X = H0 i−n Jn (kρ)einφ . (4.61) n=−∞ Repeating similar steps we obtain the expression for the scattered field in the far field of the cylinder s Hzs → H0 ∞ 2i −ikρ X e bn einφ πkρ n=−∞ where bn = −Jn0 (ka) (2)0 . (4.62) (4.63) Hn (ka) For more details on the derivation for this incident field see [31]. 4.2.5 RCS for Infinitely Long Cylinder Note that so far we have not commented on the length of the object. Obviously in our actual scattering scenario the object has finite length. 
However we will first consider an object of infinite length which simplifies the cross section calculations. We begin with the case when the incident field is normal to the cylinder and then extend for oblique incidence. Finally we will consider the case when the cylinder has finite length. 4.2.5.1 Normal Incidence For an infinitely long object one is required to use a variation of the radar cross section called the scattering cross section. It is defined as σ c = 2π lim ρ ρ→∞ (E s · E s∗ ) . (E i · E i∗ ) (4.64) 55 We consider an infinite cylinder because our target of interest will be significantly longer than it is wide. We first express the scattered fields obtained above in the case when the quantity ka 1. This indicates that as the object length approaches infinity its width becomes infinitesimally small. In this case we use approximate values for the coefficients an and bn and consider only the first few terms in the series. For the case when the incident field is polarized TM to z we have that the n=0 term is dominant and in that term we use the small argument formula (as (2) ka 1) for H0 leading to the following expression for the scattered field: Ezs = Ezi r π ei(kρ+π/4) √ 2 kρ(log(2/γka) − iπ/2) (4.65) where γ = 1.781 is Euler’s constant and the logarithm in the denominator arises in the small argument expansion of the Bessel function. In the case when the incident field is polarized TE to z we have that the n = 0, ±1 all contribute significantly leading to the following expression for the scattered field: Hzs = Hzi r π (ka)2 ei(kρ−3π/4) √ (1 + 2 cos(φ)). 2 2 kρ (4.66) If we insert these expressions into the definition for the scattering cross section we obtain the following bi-static scattering widths (again for ka 1): π2a ka(log2 (2/γka) + π 2 /4) σTc E (φ) = π 2 a[(ka)3 (1/2 + cos(φ))2 ], σTc M (φ) = (4.67) (4.68) where φ is defined in the diagram in the previous section. 4.2.5.2 Oblique Incidence In this case we assume the incident field has a propagation direction in the x − z plane and that the cylinder axis still lies along the z−axis. The angle φ is the angle between the incident wave’s propagation direction and the x − y plane. We calculate the scattered field and hence the scattering cross section at a point, say P 0 , which lies in a plane intersecting the z−axis and makes an angle φ0 with the 56 x−axis. The TM waves lies in the x − z plane and the TE wave lies orthogonal to the TM component. Figure 4.5: Scattering Scenario for an infinite length cylinder when the incident field makes an angle φ with the x − y plane (oblique incidence) [28] We have that the z−components of the H and E fields are given by EzT M = E0T M cos(Ψ) exp(ik(z sin(Ψ) − x cos(Ψ))) (4.69) HzT M = H0T M cos(Ψ) exp(ik(z sin(Ψ) − x cos(Ψ))) (4.70) p where H0T E = − /µE0T E . We now simply quote the resulting scattering cross sections, again in the case when ka 1. We have 4 π 2 a cos(Ψ) cos2 (Ψ) ka cos(Ψ)[log2 (2/γka) + π 2 /4] 4 σTc E (φ) = π 2 a cos(Ψ)[(ka cos(Ψ))3 (1/2 + cos(φ))2 ]. cos2 (Ψ) σTc M (φ) = (4.71) (4.72) For more details on calculating the scattered fields and the cross sections see [28]. 57 4.2.6 Finite Cylinder RCS We now consider the case of a finite cylinder which is the actual scatterer we consider in our scenario. We assume its length, denoted h, is significantly longer than several wavelengths. This assumption allows us to ignore the resonance effect of the scattered field when the cylinder has a length which is a multiple of a half-wavelength. 
As the length increases the scattered field appears mainly in the specular direction so we may use the results for the infinitely long cylinder to calculate the scattering cross section for a long, thin cylinder. We essentially assume that the scattered field is the same as for an infinitely long cylinder in the regions very near the cylinder radially and also when |z| > h. Otherwise we assume the fields are zero. More details may be found in [32]. We are mainly concerned with the mono-static case so we assume Ψs = Ψi = Ψ. Figure 4.6 describes the scattering scenario. In this case we obtain the following cross Figure 4.6: Scattering Scenario for a finite length cylinder [28] section expression 2 2πh2 cos2 (γi ) cos2 (γs ) sin(2kh sin(Ψ)) σ(Ψ) = 2kh sin(Ψ) [log2 (2/γka cos(Ψ)) + π 2 /4] (4.73) 58 where γi and γs are the angles that define the directions of the desired incident and scattered polarization states with respect to the TM planes. 4.3 Dipole SAR Scattering Model We now move on to discuss our mathematical model for scattering. We assume our SAR system is made up of two dipole antennas, a and b which travel along paths γ a and γ b . We assume that dipole a transmits the waveform pa (t), and the scattered field is received on both a and b. Similarly dipole b transmits the waveform pb (t), and the scattered field is received on both a and b. We denote the Fourier transforms of the waveforms by Pa and Pb . We also assume the dipoles have direction eba and ebb respectively. We model our object of interest, or target, as a collection of dipoles located at various pixels and with various orientations. We say a given target dipole at location x has orientation, or direction, ebT (x) = [cos θ(x), sin θ(x), 0]. Similarly we model our clutter as unwanted scatterers which are again made up of dipoles at various locations y with orientations ebC (y). We also assume our measurements are corrupted by noise n. Therefore we can say our forward model in the frequency domain is of the form Di,j (k, s) = F T [Ti,j ](k, s) + F C [Ci,j ](k, s) + ni,j (k, s) (4.74) where i = a, b and j = a, b. We call Di,j the set of the data collected when we transmit on the ith antenna and receive on the jth antenna. Also note Ti,j and Ci,j are the functions that describe the target and clutter and ni,j is the noise that corrupts the measurements when we transmit on i and receive on j. We now go into more detail to describe the scattering from the dipoles that make up our target and clutter. Please note we use the convention where vectors appear in bold font e.g. x and matrices are underlined e.g. A. Now we will return to the method of potentials solution to Maxwell’s equations 59 in section 4.2.2. Instead of solving the resulting Helmholtz equations (4.31) and (4.38) in cylindrical coordinates we will remain in the standard Cartesian coordinate system and utilize the Green’s function solutions as in chapter 2. Therefore we have that Z A(x) = eik|x| eik|x−y| µ0 J (y)dy ≈ µ0 4π|x − y| 4π|x| Z e | eik|x| b) J (y)dy = µ0 F (k x 4π|x| {z } b·y −ikx b)=F [J](kx b) F (kx (4.75) where F is the radiation vector. Observe that we have also used the far-field approximation [2] in (4.75). The expression for Φ may be obtained using the Lorenz gauge equation, ∇ · A − iω0 µ0 Φ = 0. We then obtain E from A via the following expression E = iω A + k −2 ∇(∇ · A) . 
(4.76) Taking |x| large results in Erad (x) = iωµ0 eik|x| eik|x| b(b [F − x x · F )] = iωµ0 [b x × (b x × F )] 4π|x| 4π|x| (4.77) where we have used the triple product vector identity. For more details on the farfield approximation utilized here see [2]. We will now adopt this general result for our specific antenna. In the frequency domain, the far-field electric field due to a radiating dipole of length a, located at position γ a (s) and pointing in direction eba , is a eikRx,s a a b b b a · eba )Pa (k) F a (k R Ea (k, x) = Rx,s × Rx,s × eba x,s a 4πRx,s (4.78) a a a where Rx,s = x − γ a (s), Rx,s = |Rx,s |, and a F (k cos θ) = asinc ka cos θ 2 (4.79) is the antenna pattern of the dipole a. We calculate the radiation vector for our dipole antenna in Appendix B. We may obtain the time domain version of the 60 electric field by taking the Fourier transform; we have Z E a (t, x) ∝ eikt Ea (k, x)dk. (4.80) Recall that we assume that any scatterer in our scene of interest is modeled as a dipole located at position x and pointing in direction ebsc (x) = [cos θsc (x), sin θsc (x), 0]. This scatterer may be part of the extended target or a clutter scatterer. Each dipole making up the scatterer acts as a receiving antenna with antenna pattern F sc . We can calculate the current excited on the dipole at position x and pointing in direction ebsc (x), due to the incident field Ea . We have b a · ebsc ) Isc ∝ ebsc · Ea F sc (k R x,s a h i eikRx,s a a a a sc a b b b b Pa (k). = ebsc · Rx,s × Rx,s × eba F (k Rx,s · eba )F (k Rx,s · ebsc ) a 4πRx,s (4.81) We assume that the current induced on the dipole radiates again as a dipole antenna, and in this process has strength ρ(x) and again antenna pattern F sc . Thus we obtain the field back at γ b (s): a b h i eik(Rx,s +Rx,s ) a a b b b b b b b b b e · R E(k, γ b (s)) ∝ ρ(x) × R × e R × R × e sc a sc x,s x,s x,s x,s a Rb 16π 2 Rx,s x,s b a · ebsc )F sc (k R b b · ebsc )F b (k R b b · ebb )F a (k R b a · eba )Pa (k). F sc (k R x,s x,s x,s x,s (4.82) We will assume the measured data is given by the current on the dipole located at 61 position γ b (s) with orientation ebb . We calculate this as in equation (4.81). We have Da,b (k, s) ∝ ebb · E(k, γ b (s)) Z a b eik(Rx,s +Rx,s ) sc b a b b · ebsc ) = ρ(x) F (k Rx,s · ebsc )F sc (k R x,s a Rb 16π 2 Rx,s x,s b a · eba )Pa (k) b b · ebb )F a (k R F b (k R x,s x,s i h i h b a × eba ebb · R bb × R b b × ebsc ba × R dx. ebsc · R x,s x,s x,s x,s (4.83) Here the two subscripts on the left side of (4.83) indicate that we transmit on dipole a and receive on dipole b. Also note that we integrate over all possible ground locations x in the scene of interest. Now recall as in standard SAR that we ultimately aim to have a forward model of the form D = F[T + C] + n or more specifically as in equation (4.74). Ideally this forward operator F T (and also F C ) is a linear operator. In this case it is very simple to calculate analytically the appropriate approximate inverse operator. We observe that in our current form our model is far from linear. Our two unknown quantities are ρ(x) and also ebsc (x). The model is linear in terms of ρ, however ebsc appears as the argument of the radiation patterns (or sinc functions) and also appears in the vector triple products. Note we suppress the dependence of ebsc on x for ease of writing. In order to linearize our model we will ultimately make simplifying assumptions. We first approach the vector product expressions. 
We will show that no approximation is needed to write this portion of the model in a linear fashion. We note that the vector expressions on the last line of (4.83) can be rewritten with the help of the triple product, or BAC-CAB, identity as h i h i bb × R b b × ebsc ba × R b a × eba ebb · R ebsc · R x,s x,s x,s x,s h i h i b a × ebsc · R b a × eba b b × ebsc · R b b × ebb = R R x,s x,s x,s x,s h i h i ba × R b a × eba · ebsc R bb × R b b × ebb · ebsc = R x,s x,s x,s x,s (4.84) 62 We note, moreover, that again from the BAC-CAB identity, we have h i R̂ × R̂ × ê = R̂ R̂ · ê − ê R̂ · R̂ = − ê − R̂ R̂ · ê = −PR⊥ b ê, (4.85) where PR⊥ b denotes the operator that projects a vector onto the plane perpendicular b Thus we can write (4.84) as to R. h i h i ⊥ PR⊥ ê · ê P ê · ê . a sc b sc ba bb R x,s (4.86) x,s b the direction of propagation. Thus the above We observe that we can consider R operation projects the antenna directions onto the plane perpendicular to the direction of propagation. This is precisely the right-handed coordinate system we discussed in section 4.1.1. In addition we may rewrite (4.84) using tensor products, that is we have h i h i a a b b b b b b Rx,s × Rx,s × eba · ebsc Rx,s × Rx,s × ebb · ebsc = (Ra⊥ ⊗ Rb⊥ ) : (b esc ⊗ ebsc ) (4.87) where R = R̂ × R̂ × ê , ⊗ is the standard tensor product, and : is known as the ⊥ double dot product, where you multiply the entries of the matrices component-wise and sum (the matrix analog to the vector dot product). In addition we may express this operation in a linear fashion. If Ra⊥ = (xa , ya , za ) and Rb⊥ = (xb , yb , zb ) we have that x2a xa ya xa ya ya2 cos2 θsc x x x y y x y y cos θ sin θ a b a b a b a b sc sc esc ⊗ ebsc ) = (Ra⊥ ⊗ Rb⊥ ) : (b . xa xb ya xb xa yb yb yb sin θsc cos θsc 2 2 2 xb xb yb xb yb yb sin θsc (4.88) Note that the third coordinate of R⊥ is not present above because in the double dot product these are multiplied by the third coordinates of ebsc which are always zero. 63 Also note that we now have expressed our forward model in terms of the quantity cos2 θ cos θ sin θ S(θ) = , sin θ cos θ (4.89) sin2 θ which is the scattering vector, or the vectorized scattering matrix, from polarimetry literature for a dipole scatterer [16, 26]. We now choose to define our two unknowns as ρ(x) and S(θsc ). Observe that we have suppressed the dependence of θsc on x for ease of writing. If we receive on both a and b, we obtain a data matrix consisting of Da,a Da,b , D(k, s) = Db,a Db,b (4.90) or equivalently we have the data vector Da,a D a,b D(k, s) = . Db,a (4.91) Db,b If we make the assumption that antennas a and b are collocated, that is, we assume a monostatic system, we have the following data expression for any dipole scatterer: Z D(k, s) = Aa,a x2a A x x a,b a b e2ik(Rx,s ) Ab,a xa xb Ab,b x2b Aa,a xa ya Aa,a xa ya Aa,b xa yb Aa,b ya xb Ab,a ya xb Ab,a xa yb Ab,b xb yb Ab,b xb yb Aa,a ya2 cos2 θsc cos θ sin θ Aa,b ya yb sc sc ρsc (x) dx sin θsc cos θsc Ab,a yb yb Ab,b yb2 sin2 θsc (4.92) where we define a b b x,s · ebs ))2 F i (k R b x,s · ebi )F j (k R b x,s · ebj )Pi (k) (4.93) Ai,j = (1/16π 2 Rx,s Rx,s )(F sc (k R 64 for i = a, b and j = a, b. 4.3.1 Comparison to the extended target RCS model We now pause in the derivation of our forward model in order to comment on how our model compares with the RCS model for an extended object described in section 4.2.6. Recall that we have the following expression for the RCS in this case (i.e. 
finite length cylinder, length much greater than wavelength) [28, 33]: 2 sin(2kh sin(Ψ)) 2πh2 cos2 (γi ) cos2 (γs ) . σ(Ψ) = 2kh sin(Ψ) [log2 (2/γka cos(Ψ)) + π 2 /4] (4.94) We will now express our model in terms of the angles Ψ, γi , and γs in order to make the comparison. We begin by defining a right-handed coordinate system in b (analogous to R), b and b h, spherical coordinates. We have the three basis vectors k b or R) b the direction b. The last two basis vectors lie in a plane perpendicular to k v of propagation and hence define a polarization basis. Explicitly we have b = − sin φ cos αb k x − sin φ sin αyb − cos φb z (4.95) b = sin αb h x − cos αyb (4.96) b = cos φ cos αb v x + cos φ sin αyb − sin φb z (4.97) where we define φ and α as the elevation and azimuth angles describing the direction of observation or propagation. In Figure (4.7) we illustrate this coordinate system: where θsc is the orientation of the cylinder or scatterer as it was in the previous b + sin θsc yb. Also it is important section. We note that we may write ebsc = cos θsc x to observe here that in terms of our angles we may write the incident direction Ψ = α − θsc as Ψ was defined as the incident wave direction in the plane that the cylinder lies in. Next we write the vectors Ra⊥ and Rb⊥ in terms of the right-handed coordinate system. Recall that R⊥ = −PR⊥ b ê and therefore each of these vectors lies in the b −v b which is precisely the h b plane. We therefore may plane perpendicular to R 65 Figure 4.7: Spherical coordinates express these vectors as linear combinations of the two basis vectors b + A2 v b + cos βav v b = |Ra⊥ |(cos βah h b) Ra⊥ = A1 h (4.98) b + B2 v b + cos βbv v b = |Rb⊥ |(cos βbh h b). Rb⊥ = B1 h (4.99) b = |R⊥ | cos βah , where βah is the angle Note we have used the fact that A1 = Ra⊥ · h a ⊥ b between R and h. We may perform this same dot product in order to calculate a the other coefficients A2 , B1 , and B2 in a similar fashion. Note that βhi and βvi , i = a, b correspond to the angles γi and γs in equation (4.94). We now calculate the quantities (Ri⊥ ⊗ Rj⊥ ) : (b esc ⊗ ebsc ) for each of the four combinations of i = a, b and j = a, b. We rewrite this quantity in the form (Ri⊥ ⊗ Rj⊥ ) : (b esc ⊗ ebsc ) = (Ri⊥ · ebsc )(Rj⊥ · ebsc ) (4.100) as it is easier to see intuitively how to substitute in the quantities we have defined above. We will work out the details for the example when i = a and j = b and then 66 quote the results for the other antenna pairs. In this case we have Ra⊥ · ebsc = A1 sin α cos θsc + A2 cos φ cos α cos θsc − A1 cos α sin θsc + A2 cos φ sin α sin θsc = A1 (sin α cos θsc − cos α sin θsc ) + A2 cos φ(cos α cos θsc + sin α sin θsc ). (4.101) Similarly we have Rb⊥ · ebsc = B1 (sin α cos θsc − cos α sin θsc ) + B2 cos φ(cos α cos θsc + sin α sin θsc ). (4.102) Next we use the angle sum and difference trigonometric identities to rewrite the quantities in the parentheses above which gives us the following expressions: Ra⊥ · ebsc = A1 sin(α − θsc ) + A2 cos φ cos(α − θsc ) (4.103) Rb⊥ · ebsc = B1 sin(α − θsc ) + B2 cos φ cos(α − θsc ). (4.104) We may replace the angle α − θsc with Ψ as stated earlier. Using these new angles we may now write the quantity present in our data model as (Ra⊥ ⊗ Rb⊥ ) : (b esc ⊗ ebsc ) = (Ra⊥ · ebsc )(Rb⊥ · ebsc ) = A1 B1 sin2 (Ψ) + (A1 B2 + A2 B1 ) cos φ sin(Ψ) cos(Ψ) + A2 B2 cos2 (Ψ). 
(4.105) With all the possible antenna pairs we have the vector ⊥ ) : (b bsc ) A21 sin2 (Ψ) + 2A1 A2 cos φ sin(Ψ) cos(Ψ) + A22 cos2 (Ψ) (R⊥ ⊗ Ra esc ⊗ e a (R⊥ ⊗ R⊥ ) : (b 2 2 bsc ) esc ⊗ e a A1 B1 sin (Ψ) + (A1 B2 + A2 B1 ) cos φ sin(Ψ) cos(Ψ) + A2 B2 cos (Ψ) b = . 2 2 (R⊥ ⊗ R⊥ ) : (b bsc ) esc ⊗ e a b A1 B1 sin (Ψ) + (A1 B2 + A2 B1 ) cos φ sin(Ψ) cos(Ψ) + A2 B2 cos (Ψ) bsc ) (Rb⊥ ⊗ Rb⊥ ) : (b esc ⊗ e B12 sin2 (Ψ) + 2B1 B2 cos φ sin(Ψ) cos(Ψ) + B22 cos2 (Ψ) Note that this is equivalent to AS(θsc ) from equation (4.92). Now looking at the expression for σ(Ψ), equation (4.94), we see that the RCS model simply takes the first term from each element of the vector given above. This corresponds again to neglecting the “cross-term” elements present in the scattered field, i.e. the fact that 67 Ehs = Shh Ehi + Shv Evi and Evs = Svh Ehi + Svv Evi as we stated in section 4.2.1. In the following sections we will see that our inclusion of these “cross-terms” aids in scattering vector reconstruction in the presence of noise and clutter. We now return to deriving our forward model. 4.3.2 Scattering Model for the Target Recall we had the expression for the data Z D(k, s) = e 2ik(Rx,s ) Aa,a x2a A x x a,b a b Ab,a xa xb Ab,b x2b Aa,a xa ya Aa,a xa ya Aa,b xa yb Aa,b ya xb Ab,a ya xb Ab,a xa yb Ab,b xb yb Ab,b xb yb Aa,a ya2 cos2 θsc cos θ sin θ Aa,b ya yb sc sc dx ρsc (x) sin θsc cos θsc Ab,a yb yb sin2 θsc Ab,b yb2 (4.106) b a b x,s · ebs ))2 F i (k R b x,s · ebi )F j (k R b x,s · ebj )Pi (k) for )(F sc (k R Rx,s where Ai,j = (1/16π 2 Rx,s i = a, b and j = a, b. The issue that remains to be dealt with in terms of linearizing is the fact that the argument of the radiation pattern of the scatterer, F sc , contains one of our unknowns, ebsc . In order to remove this nonlinearity we will make an assumption about the radiation pattern of the scatterer. However we will make different assumptions based on whether the scatterer is part of the extended target we seek to image or if it is a clutter scatterer present in the scene. In this section we focus on the scatterers that make up our target of interest. These linearizing assumptions also serve a second purpose. They will demonstrate the different type of scattering behavior we expect to see with an extended target versus a clutter scatterer. Recall that the specific type of target we wish to image is a line or curve, which can be thought of as an edge of many manmade objects. We have already mentioned that we expect a specific directional, or anisotropic, scattering response from our target. In particular, based on this target-type we assume that we will only obtain a strong return from the scatterer when the direction of the target is perpendicular to the look direction or the direction of propagation of the electromagnetic waves. Therefore we assume that F T is narrowly peaked around 0, that is, the main contributions to the data arise when b x,s · ebsc ≈ 0. Note that in a bi-static system we would require that R b i · ebT ≈ 0 R x,s for i = a, b. Also we will now change the subscript indicating the scatterer to the letter T to indicate we are considering a target scatterer. 
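As a numerical sanity check of the vector identities used in this derivation, the following NumPy sketch verifies, for randomly chosen (purely hypothetical) propagation and antenna directions, that the BAC-CAB identity gives R̂ × (R̂ × ê) = −P⊥ê, that the double dot product (Ra⊥ ⊗ Rb⊥) : (êsc ⊗ êsc) equals (Ra⊥ · êsc)(Rb⊥ · êsc), and that this quantity is reproduced by a linear form against the scattering vector S(θsc). All numerical values below are arbitrary and serve only to exercise the identities; this is a minimal sketch, not part of the forward model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def unit(v):
    return v / np.linalg.norm(v)

# Hypothetical propagation direction, antenna orientations, and dipole angle.
R_hat = unit(rng.normal(size=3))
e_a, e_b = unit(rng.normal(size=3)), unit(rng.normal(size=3))
theta = rng.uniform(0.0, np.pi)
e_sc = np.array([np.cos(theta), np.sin(theta), 0.0])

def R_perp(e):
    # BAC-CAB identity: R_hat x (R_hat x e) = R_hat (R_hat . e) - e = -P_perp e
    return np.cross(R_hat, np.cross(R_hat, e))

assert np.allclose(R_perp(e_a), -(e_a - R_hat * (R_hat @ e_a)))

# Double dot product (R_a_perp tensor R_b_perp) : (e_sc tensor e_sc)
Ra, Rb = R_perp(e_a), R_perp(e_b)
double_dot = np.sum(np.outer(Ra, Rb) * np.outer(e_sc, e_sc))
assert np.allclose(double_dot, (Ra @ e_sc) * (Rb @ e_sc))

# The same quantity as a linear form against the scattering vector S(theta_sc).
S = np.array([np.cos(theta) ** 2, np.cos(theta) * np.sin(theta),
              np.sin(theta) * np.cos(theta), np.sin(theta) ** 2])
xa, ya, xb, yb = Ra[0], Ra[1], Rb[0], Rb[1]
row = np.array([xa * xb, xa * yb, ya * xb, ya * yb])
assert np.allclose(double_dot, row @ S)
print("projection, double-dot, and linear-form identities all verified")
```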
Using our directional 68 assumption we can simplify the expression (4.84) h i h i b x,s × R b x,s × eba · ebT R b x,s × R b x,s × ebb · ebT R = (−b eT · eba )(−b eT · ebb ) = [b ea ⊗ ebb ] • [b eT a2 a1 a2 1 a b a b 1 2 1 1 = a1 b1 a2 b1 b21 b1 b2 ⊗ ebT ] a1 a2 a2 b1 a1 b2 b1 b2 a22 cos2 θ a2 b2 cos θ sin θ a2 b2 sin θ cos θ b22 sin2 θ (4.107) where we let eba = (a1 , a2 , 0) and ebb = (b1 , b2 , 0). Note we have used the triple product, or BAC-CAB, identity like in (4.85) and the tensor notation like in (4.87). We can also say that b i · ebT ) = F T (k R x,s 1 if R b i · ebT ≈ 0 x,s (4.108) 0 otherwise for i = a, b. This eliminates the remaining nonlinearity in our forward model. Therefore we may express the data received from the target in the following form: Z T D (k, s) = e2ik(Rx,s ) AT (k, s, x)T (x)dx (4.109) where we define T (x) = ρT (x)S(θT ) as the target function. This quantity is the unknown we will reconstruct in our imaging scheme. Also we have defined the amplitude matrix AT : Aa,a a21 Aa,a a1 a2 Aa,a a1 a2 A a b A a b a,b 1 2 a,b 1 1 T A (k, s, x) = Ab,a a1 b1 Ab,a a2 b1 Ab,b b21 Ab,b b1 b2 Aa,a a22 Aa,b a2 b1 Aa,b a2 b2 Ab,a a1 b2 Ab,a a2 b2 2 Ab,b b1 b2 Ab,b b2 (4.110) 69 where b x,s · eba )F a (k R b x,s · eba )Pa (k) F a (k R a Rb 16π 2 Rx,s x,s a b b x,s · eba )F (k R b x,s · ebb )Pa (k) F (k R = a Rb 16π 2 Rx,s x,s b x,s · eba )F b (k R b x,s · ebb )Pb (k) F a (k R = a Rb 16π 2 Rx,s x,s b x,s · ebb )F b (k R b x,s · ebb )Pb (k) F b (k R = . a Rb 16π 2 Rx,s x,s Aa,a = (4.111) Aa,b (4.112) Ab,a Ab,b (4.113) (4.114) We see clearly now that the amplitude matrix no longer depends on the unknown quantity ebT and therefore we obtain the following linear forward model for data obtain from scattering from the target: T Z T D (k, s) = F [T ](k, s) = e2ik(Rx,s ) AT (k, s, x)T (x)dx where F T is a linear forward operator. We observe the special case when our model coincides with that of the radar literature. Note that if we let eba = [1, 0] and ebb = [0, 1] AT becomes diagonal, that is, Aa,a 0 AT (k, s, x) = 0 0 0 0 Aa,b 0 0 Ab,a 0 0 0 0 . 0 Ab,b (4.115) In this case the “cross-terms” would not be included in the forward model. This corresponds to assuming that one can reconstruct each element of the scattering vector Si,j from its corresponding data set Di,j for i = a, b and j = a, b. In the following sections we will compare reconstruction results using our data model and those created assuming that the amplitude matrix is diagonal as above. 70 4.3.3 Scattering Model for Clutter We can also take the general model and adapt it to the expected scattering model of our clutter, or the unwanted scatterers in our scene. In this case we replace the generic scatterer subscript with the letter C to indicate we are considering data received from the scattering of an object in the scene which is not part of our target. We assume our clutter scatters isotropically, since it is most likely not made up of edges like our target. This implies that b x,s · ebC ) = 1 F C (k R (4.116) ∀k, s, x. This removes the nonlinearity from the forward model as we did for the target. We obtain the following forward model for clutter data: C Z C D (k, s) = F [C](k, s) = e2ikRx,s AC (k, s, x)C(x)dx, (4.117) where we let the function that describes the clutter be C(x) = ρC (x)SC (θ). Note ρC (x) is the clutter scattering strength at x, and SC (θ) is the clutter scattering vector which depends on the orientation of the clutter dipole element at location x. 
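The same outer-product structure appears in both the target and clutter amplitude matrices. Under our reading of (4.110), each row of the target amplitude matrix is, up to the scalar factor Ai,j, the flattened outer product of the in-plane antenna orientation vectors; the NumPy sketch below builds that structure with the scalar factors set to one for illustration. With generic antenna orientations the matrix is fully dense, so every data channel mixes every component of S(θ), while the choice êa = [1, 0], êb = [0, 1] reproduces the diagonal special case (4.115) in which the cross-terms drop out. This is a structural sketch only, not the full amplitude computation.

```python
import numpy as np

def target_amplitude_structure(e_a, e_b, A_pair=None):
    """Polarization structure of the 4x4 amplitude matrix for antennas e_a, e_b.

    e_a, e_b : length-2 sequences of in-plane antenna components (a1, a2), (b1, b2)
    A_pair   : 2x2 array of the scalar factors A_{i,j} (spreading, antenna
               patterns, transmit waveform); ones by default so that only the
               polarization structure is visible.
    """
    if A_pair is None:
        A_pair = np.ones((2, 2))
    ants = [np.asarray(e_a), np.asarray(e_b)]
    rows = [A_pair[i, j] * np.outer(ants[i], ants[j]).ravel()
            for i in range(2) for j in range(2)]
    return np.array(rows)

# Generic (rotated) antenna orientations give a fully dense matrix.
phi = np.deg2rad(30.0)
print(np.round(target_amplitude_structure([np.cos(phi), np.sin(phi)],
                                           [-np.sin(phi), np.cos(phi)]), 3))

# Pure H and V antennas, e_a = [1, 0] and e_b = [0, 1], give the diagonal
# special case of (4.115): the cross-terms disappear.
print(target_amplitude_structure([1.0, 0.0], [0.0, 1.0]))
```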
Observe the amplitude matrix has the form Aa,a ya2 Aa,a x2a Aa,a xa ya Aa,a xa ya A x x A x y A y x A y y a,b a b a,b a b a,b a b a,b a b AC (k, s, x) = , Ab,a xa xb Ab,a ya xb Ab,a xa yb Ab,a yb yb Ab,b x2b Ab,b xb yb Ab,b xb yb (4.118) Ab,b yb2 where Ai,j for i = a, b and j = a, b are defined as in equations (4.111, 4.112, 4.113, 4.114). Also recall that Ra⊥ = (xa , ya , za ) and Rb⊥ = (xb , yb , zb ). Note that the operator F C is different from F T but it is still a linear operator. We will now combine all elements of the data into one expression and then move on to discuss the reconstruction, or imaging, process. 71 4.4 Total Forward Model We now can combine the target and clutter data with measurement, or ther- mal, noise n to obtain the full forward model expression. That is, we expect our collected data D to be of the form D(k, s) = F T [T ](k, s) + F C [C](k, s) + n(k, s) = D T (k, s) + D C (k, s) + n(k, s). (4.119) More specifically we have Z D(k, s) = e 2ik(Rx,s ) T A (k, s, x)T (x)dx + Z e2ikRx,s AC (k, s, x)C(x)dx + n(k, s), (4.120) where we assume n is a 4 × 1 vector. We also make the assumption now that our target vector T (x), our clutter vector C(x), and our noise vector n(k, s) are all second-order stochastic processes. A second-order stochastic process has finite variance, or each element of the covariance matrix is finite in this case. It is typical to make this assumption for clutter and noise. We choose to apply the same assumption for the target for the same reason that in Bayes estimation one assumes the parameter is a random variable or stochastic process. Since the location, shape, length, and all other descriptive quantities regarding the target are unknown to us prior to the data collection it is a standard techinique in statistics to make the assumption that this unknown is in fact stochastic in nature. This allows one to use statistical information regarding the object in the reconstruction or imaging task. We also note that although our target is relatively simple in nature as it is a curve, most radar targets are significantly more complicated in terms of their appearance. Even our simple object’s RCS varies widely as one changes the angle of the observation. Therefore it seems like an acceptable modeling choice to assume that the object is random in some respect. We may assume that only its descriptive parameters are random, like its location and orientation, and assume that the standard form of the target function is known. Or one may assume that the target function lies in some space of possible functions where there exists a probability distribution defined on that space describing how likely it is that the target function coincides 72 with any given element of that space. We have already specified a somewhat rigid form for our target function, that is, T (x) = ρT (x)S(θT ). In this case the form of S is assumed and we assume that only the parameter θT is stochastic. However we leave the form of the scattering strength unspecified and therefore we will eventually need to define a probability distribution describing the stochastic nature of ρT (x). For now we do not assign specific distributions to these random quantities. In the our numerical experiments we will ascribe distributions and they will be discussed in detail in the following sections. We do however need to specify some statistical assumptions on T , C, and n. 
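Before stating those assumptions, the following Python sketch discretizes the total forward model (4.120) on a toy scene in order to make its structure explicit: a 4-vector contribution from a target pixel, a 4-vector contribution from clutter dipoles with uniformly distributed orientations and complex Gaussian strengths, and additive noise. The geometry, frequency sampling, and the identity placeholders used in place of the amplitude matrices AT and AC are illustrative assumptions only; in the full model the amplitude matrices are the ones derived above.

```python
import numpy as np

rng = np.random.default_rng(1)
c0 = 3.0e8

# Hypothetical geometry: linear flight path gamma(s) = [x0, s, z0] and a
# 1 - 1.5 GHz frequency band (all values illustrative).
k_vals = 2.0 * np.pi * np.linspace(1.0e9, 1.5e9, 32) / c0
s_vals = np.linspace(-50.0, 50.0, 33)
gamma = lambda s: np.array([100.0, s, 500.0])

# Flat scene pixels and the 4-vector fields T(x) and C(x).
g = np.linspace(-25.0, 25.0, 15)
xs = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2)
npix = xs.shape[0]

T = np.zeros((npix, 4)); T[npix // 2] = [1.0, 0.0, 0.0, 0.0]   # one target pixel
theta_C = rng.uniform(0.0, np.pi / 2.0, npix)                  # clutter orientations
rho_C = (rng.normal(size=npix) + 1j * rng.normal(size=npix)) / np.sqrt(2.0)
C = rho_C[:, None] * np.stack([np.cos(theta_C) ** 2,
                               np.cos(theta_C) * np.sin(theta_C),
                               np.sin(theta_C) * np.cos(theta_C),
                               np.sin(theta_C) ** 2], axis=-1)

# Identity placeholders for A^T(k,s,x) and A^C(k,s,x).
A_T, A_C = np.eye(4), np.eye(4)

D = np.zeros((len(k_vals), len(s_vals), 4), dtype=complex)
scene = np.column_stack([xs, np.zeros(npix)])                  # z = 0 ground plane
for si, s in enumerate(s_vals):
    R = np.linalg.norm(scene - gamma(s), axis=1)               # R_{x,s}
    contrib = T @ A_T.T + C @ A_C.T                            # A^T T(x) + A^C C(x)
    for ki, k in enumerate(k_vals):
        D[ki, si] = np.exp(2j * k * R) @ contrib               # sum over pixels
D += 0.01 * (rng.normal(size=D.shape) + 1j * rng.normal(size=D.shape))  # noise n(k,s)
```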
For the first-order statistics we have that E[T (x)] = µ(x) (4.121) E[C(x)] = 0 (4.122) E[n(k, s)] = 0, (4.123) where µ(x) = [E[Ta,a (x)], E[Ta,b (x)], E[Tb,a (x)], E[Tb,b (x)]] and above 0 is the 4 × 1 zero vector. We also specify the autocovariance matrices for T , C, and n where we define the (l, k)th entry as follows T Cl,k (x, x0 ) = E[(Tl (x) − µl (x))(Tk (x0 ) − µk (x))] (4.124) 0 0 RC l,k (x, x ) = E[Cl (x)Ck (x )] (4.125) n Sl,k (k, s; k 0 , s0 ) = E[nl (k, s)nk (k 0 , s0 )]. (4.126) where here l = aa, ab, ba, bb and k = aa, ab, ba, bb. We assume the three processes are second-order, and we define all integrals involving the three processes in the mean-square sense. In addition we assume the Fourier transforms for CT , RC , and Rn exist. We will consider two cases, in the first we assume that the target, clutter, and noise are mutually statistically independent. In the second case we will allow there to be a correlation between the target and clutter processes. Now that our forward model is fully derived and specified we will move on to derive our imaging scheme. We begin with a brief discussion on the backprojection 73 process in general in the polarimetric SAR case and then give the results for the two statistical cases described above. 4.5 Image Formation in the presence of noise and clutter In order to form an image of our target, we will use a filtered-backprojection- based reconstruction method. Specifically we apply the backprojection operator K to our data to form an image I of our target, i.e. Z I(z) = (KD)(z) = e−i2kRz,s Q(z, s, k)D(k, s)dkds (4.127) where I(z) = [Ia,a (z), Ia,b (z), Ib,a (z), Ib,b (z)]. Plugging in the expression for D we have Z I(z) = e−i2k(Rz,s −Rx,s ) Q(z, s, k) AT (k, s, x)T (x) + AC (k, s, x)C(x) dx + n(k, s) dkds, (4.128) where we define Q as a 4 × 4 filter matrix. The filter Q can be chosen in a variety of ways. One way was already described in chapter 2. This method attempts to provide an image-fidelity operator that most closely resembles a delta function. This method works well in the case when we assume our target function can be described deterministically. We will instead consider a statistical criterion for selecting the optimal filter Q. In particular we will attempt to minimize the mean-square error between the reconstructed image I and the actual target function T . This method seeks to minimize the effect of noise and clutter on the resulting image. This method was first described for the case of standard SAR in the work of Yazici et al [15]. We begin by first defining an error process Ia,a (z) − Ta,a (z) I (z) − T (z) a,b a,b E(z) = I(z) − T (z) = . Ib,a (z) − Tb,a (z) Ib,b (z) − Tb,b (z) (4.129) 74 We also define the mean-square error as Z J (Q) = 2 E[|E(z)| ]dz = Z E[(E(z)† (E(z))]dz. (4.130) where E † indicates we are taking the complex conjugate and the transpose of the vector E. Note that we have J (Q) = V(Q) + B(Q) (4.131) where Z V(Q) = Z B(Q) = E[|E(z) − E[E(z)]|2 ]dz (4.132) |E[I(z)] − T (z)|2 dz. (4.133) Here V is known as the variance of the estimate and B is the bias. It is well-known that mean-square error is made up of variance and bias and that when we attempt to minimize such a quantity there is always a tradeoff between minimizing variance and bias. We will also see that minimizing this quantity will come at cost with respect to visible singularities of the target function. 
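Before analyzing this tradeoff, a minimal discretized version of the backprojection operator (4.127) is sketched below for a constant 4 × 4 filter; in the analysis that follows, Q is allowed to depend on z, s, and k and is precisely the quantity being optimized. The geometry and the white test data are placeholders chosen only so that the sketch runs on its own.

```python
import numpy as np

def backproject(D, k_vals, s_vals, gamma, grid, Q):
    """Discretized vector backprojection I(z) ~ sum_{k,s} e^{-2ik R_{z,s}} Q D(k,s).

    D    : (nk, ns, 4) complex polarimetric data vector
    Q    : (4, 4) filter matrix (constant here; z-, s-, k-dependent in the text)
    grid : (npix, 3) image points z
    Returns the (npix, 4) complex image vector I(z).
    """
    I = np.zeros((grid.shape[0], 4), dtype=complex)
    for si, s in enumerate(s_vals):
        R = np.linalg.norm(grid - gamma(s), axis=1)            # R_{z,s}
        for ki, k in enumerate(k_vals):
            I += np.exp(-2j * k * R)[:, None] * (Q @ D[ki, si])[None, :]
    return I

# Self-contained demo with hypothetical geometry and white test data.
rng = np.random.default_rng(2)
k_vals = 2.0 * np.pi * np.linspace(1.0e9, 1.5e9, 16) / 3.0e8
s_vals = np.linspace(-50.0, 50.0, 17)
gamma = lambda s: np.array([100.0, s, 500.0])
grid = np.column_stack([np.linspace(-25.0, 25.0, 11), np.zeros(11), np.zeros(11)])
D = rng.normal(size=(16, 17, 4)) + 1j * rng.normal(size=(16, 17, 4))
I = backproject(D, k_vals, s_vals, gamma, grid, Q=np.eye(4))
print(I.shape)   # (11, 4): one 4-vector image sample per grid point z
```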
As discussed in [15], as we suppress the clutter and noise contributions to the image we also may be suppressing the strengths of these singularities which are key in identifying a target from an image. We will see though that our image-fidelity operator will again be of the form of a pseudodifferential operator and therefore the location and orientation of these singularities will be maintained. 4.5.1 Statistically Independent Case We will now consider the case when T , C, and n are all mutually statistically independent. We begin by stating the following theorem which summarizes the resulting optimal filter obtained in the case when we minimize mean-square error. Theorem. 1. Let D be given by (4.119) and let I be given by (4.128). Assume 75 S n is given by (4.126) and define S T , S C , and M as follows: T 0 C 0 Z C (x, x ) = Z R (x, x ) = † 0 µ(x)µ (x ) = Z 0 0 0 0 0 0 e−ix·ζ eix ·ζ S T (ζ, ζ 0 )dζdζ 0 (4.134) e−ix·ζ eix ·ζ S C (ζ, ζ 0 )dζdζ 0 (4.135) e−ix·ζ eix ·ζ M (ζ, ζ 0 )dζdζ 0 (4.136) then any filter Q satisfying a symbol estimate and also minimizing the leadingorder mean-square error J (Q) must be a solution of the following integral equation ∀r and ∀k: Z ηeix·(ζ 0 −ζ) (QAT η − χ̃Ω )(S T + M )(AT )† + (QAC η)S C (AC )† dζ 0 n + QS̃ η (r,k) =0 (r,k) 2. If we make the following stationarity assumptions: S T (ζ, ζ 0 ) = S T (ζ)δ(ζ − ζ 0 ) (4.137) S C (ζ, ζ 0 ) = S C (ζ)δ(ζ − ζ 0 ). (4.138) Then the filter Q minimizing the total error variance V(Q) is given by: n 2 T T † C C † Q |η| (A S T (A ) + A S C (A ) ) + η S̃ = η χ̃Ω S T (AT )† . (4.139) We also include in this theorem the case when we minimize simply the variance of the error process. This calculation leads to an algebraic expression for Q as opposed to the integral equation expression we obtain when we minimize meansquare error. Note that in this second result we also make additional stationarity assumptions on the target and clutter process. Proof. 1. Our goal is to minimize J (Q), which is given by the expression Z J (Q) = E[|(K(F T (T ) + F C (C) + n))(z) − T (z)|2 ]dz. (4.140) 76 Because we have assumed that T , C, and n are mutually statistically independent, this mean-square error can be written with three terms, one dependent on each process present in the data. That is, J (Q) = JT (Q) + JC (Q) + Jn (Q) (4.141) where Z JT (Q) = Z JC (Q) = Z Jn (Q) = E[|(K(F T ) − IΩ )(T )(z)|2 ]dz (4.142) E[|K(F C (C))(z)|2 ]dz (4.143) E[|K(n)(z)|2 ]dz. (4.144) Also note we have Z IΩ T (z) = 0 χ̃Ω (z, ξ)ei(z −z)·ξ T (z 0 )dξdz 0 (4.145) where χ̃Ω (z, ξ) is a smoothed characteristic function to avoid ringing. The next step is to simplify the expression for JT (Q). We first rewrite the expression for K(F T T )(z). Recall we had T K(F T )(z) = Z e−i2k(Rz,s −Rx,s ) Q(z, s, k)AT (x, s, k)T (x)dxdkds. (4.146) From the method of stationary phase in the variables (s, k), we know that the the main contributions to the integral come from the critical points of the phase. We have assumed that only the critical point x = z is actually visible to the radar. In order to obtain a phase that resembles that of a delta function, namely i(x − z) · ξ, we expand the phase about the point x = z. 77 We utilize the following Taylor expansion formula as in chapter two: Z 1 d f (z + µ(x − z))dµ 0 dµ Z 1 ∇f |z+µ(x−z) dµ = (x − z) · Ξ(x, z, s, k), = (x − z) · f (x) − f (z) = 0 (4.147) where in our case f (z) = 2kRz,s . We now perform the Stolt change of variables from (s, k) to ξ = Ξ(x, z, s, k). 
Therefore we have Z T ei(x−z)·ξ Q(z, s(ξ), k(ξ))AT (x, s(ξ), k(ξ))T (x)η(x, z, ξ)dxdξ, K(F T )(z) = (4.148) where η is the Jacobian resulting from the change of variables, sometimes called the Beylkin determinant. We can now substitute (4.148) into our expression for JT (Q), (4.157). We have Z JT (Q) = Z 2 n o T (x−z)·ξ E e Q(z, ξ)A (x, ξ)η(x, z, ξ) − χ̃Ω (z, ξ) T (x)dξdx dz (4.149) We note that (4.149) involves terms of the form 2 Z i(x−z)·ξ ∆= e Ã(z, x, ξ)dξ T (x)dx 2 (4.150) L A standard result in the theory of pseudodifferential operators [50] tells us that each term in the integral expression in (4.150) can be written Z AT (x) := i(x−z)·ξ e Z Ã(z, x, ξ)dξ T (x)dx = ei(x−z)·ξ p(ξ, z)dξ T (x)dx (4.151) 78 where p(ξ, z) = e−iz·ξ A(eiz·ξ ). The symbol p has an asymptotic expansion p(ξ, z) ∼ X i|α| α≥0 α! Dξα Dxα Ã(z, x, ξ) (4.152) z=x where α is a multi-index. In other words, the leading-order term of p(ξ, z) is simply Ã(z, z, ξ). The expression (4.150) can be written ∆ = hAT, AT i = hA† AT, T i, (4.153) where h·, ·i denotes the L2 inner product. The symbol calculus for pseudodifferential operators [50] tells us that the leading-order term of the composition A† A is † Z A AT = ei(x−z)·ξ p∗ (ξ, z)p(ξ, z)dξ T (x)dx. (4.154) This implies that the leading-order contribution to (4.149) is Z JT (Q) ∼ Z n o† T 0 0 0 e T (x) Q(x , ξ)A (x, ξ)η(x, x , ξ) − χ̃Ω (x , ξ) E o n T 0 0 0 0 0 0 × Q(x , ξ)A (x , ξ)η(x , x , ξ) − χ̃Ω (x , ξ) T (x ) dξdxdx0 . i(x−x0 )·ξ † (4.155) Our next task is to simplify the expression inside the expectation. We note that in the argument of the expectation we have a quantity that may be expressed as T † (x)H † H̃T (x0 ) where T is a 4 × 1 vector and H and H̃ are two 4 × 4 matrices. If we write out the matrix and vector multiplication using 79 summations altogether we have 4 X 4 X 4 X † E[T (x)H H̃T (x )] = E[ Hl,p H̃p,r Tl† (x)Tr (x0 )] † † 0 (4.156) l=1 p=1 r=1 = 4 X 4 X 4 X † Hl,p H̃p,r E[Tl† (x)Tr (x0 )] l=1 p=1 r=1 = 4 X 4 X 4 X † T Hl,p H̃p,r (Cr,l (x0 , x) + µr (x0 )µl (x)) l=1 p=1 r=1 = tr(H † (H̃C T (x0 , x))) + tr(H † (H̃µ(x0 , x))) where recall C T is the covariance matrix of the target and µ(x) is the mean of the target. Also note we define the following parameter µ(x, x0 ) = µ(x)µ† (x0 ) which is a 4 × 4 matrix as well. We therefore have that JT (Q) ∼ J˜T (Q) + B(Q) n Z o† T i(x−x0 )·ξ 0 0 0 Q(x , ξ)A (x, ξ)η(x, x , ξ) − χ̃Ω (x , ξ) ∼ e tr n o T T 0 0 0 0 0 0 Q(x , ξ)A (x , ξ)η(x , x , ξ) − χ̃Ω (x , ξ) C (x , x) dξdxdx0 n Z o† i(x−x0 )·ξ + e tr Q(x0 , ξ)AT (x, ξ)η(x, x0 , ξ) − χ̃Ω (x0 , ξ) n o T 0 0 0 0 0 0 Q(x , ξ)A (x , ξ)η(x , x , ξ) − χ̃Ω (x , ξ) µ(x , x) dξdxdx0 . (4.157) Here we see explicitly the bias term of the mean-square error. The rest of the terms make up the variance portion. We can repeat the same steps for the clutter term JC (Q) (i.e. write KF in terms of ξ, express JC in terms of the composition of pseudodifferential 80 operators, and use the symbol calculus) to obtain the leading-order expression Z † Q(x0 , ξ)AC (x, ξ)η(x, x0 , ξ) C 0 C 0 0 0 0 Q(x , ξ)A (x , ξ)η(x , x , ξ) R (x , x) dξdxdx0 . JC (Q) ∼ i(x−x0 )·ξ e tr (4.158) The last term we need to simplify is the noise term, i.e. Jn (Q). We write it out explicitly as Z (Kn)(z) = e−i2kRz,s Q(z, s, k)n(s, k)dkds, (4.159) which implies that Z Z ei2kRz,s n† (s, k)Q† (z, s, k) Z −i2k0 Rz,s0 0 0 0 0 0 0 × e Q(z, s , k )n(s , k )dkdsdk ds dz. 
Jn (Q) = E (4.160) We rewrite the matrix and vector multiplication as before to obtain Z Jn (Q) = e i2(kRz,s −k0 Rz,s0 ) tr Q (z, s, k)Q(z, s , k )S (s , k ; s, k) dkdsdk 0 ds0 dz † 0 0 n 0 0 (4.161) where S n is the covariance matrix of the noise. We denote it by the letter S because the noise is already written in terms of a frequency variable k and is therefore analogous to a spectral density function. In order to simplify our expression so that we may add it to the target and clutter terms we make the assumption that the noise is stationary in both s and k. This is equivalent to assuming that the noise has been prewhitened. Mathematically this assumption is written as n n Si,j (s, k; s0 , k 0 ) = S̃i,j (s, k)δ(s − s0 )δ(k − k 0 ) (4.162) 81 Inserting this specific S n into equation (4.161) we obtain Z Jn (Q) = tr Q (z, s, k)Q(z, s, k)S̃ (s, k) dkdsdz n † (4.163) where without loss of generality we replace s0 , k 0 with s, k. Our last step is to perform the Stolt change of variables from (s, k) to ξ to obtain, Z Jn (Q) = tr Q (z, ξ)Q(z, ξ)S̃ (ξ) η(z, z, ξ)dξdz. n † (4.164) We rewrite our expression for JT and JC in terms of the spatial frequency variable now in order to eventually be able to combine these terms with the noise terms. Our goal is to have the same integrations present in all three terms. We define the following spectral density functions as they were defined in the statement of the theorem: T 0 C 0 Z C (x, x ) = Z R (x, x ) = 0 0 0 0 e−ix·ζ eix ·ζ S T (ζ, ζ 0 )dζdζ 0 (4.165) e−ix·ζ eix ·ζ S C (ζ, ζ 0 )dζdζ 0 . Note these are both 4 × 4 matrices as well. Switching the two arguments of the covariance matrices gives us the following expressions T 0 Z C (x , x) = RC (x0 , x) = Z 0 0 0 0 e−ix ·ζ eix·ζ S T (ζ, ζ 0 )dζdζ 0 (4.166) e−ix ·ζ eix·ζ S C (ζ, ζ 0 )dζdζ 0 . We find these specifically because we require these two expressions in order to rewrite the mean-square error in terms of ζ and ζ 0 . In particular we insert these into the equations for J˜T (Q) and JC (Q). We have J˜T (Q) ∼ Z n o† e e tr Q(x0 , ξ)AT (x, ξ)η(x, x0 , ξ) − χ̃Ω (x0 , ξ) n o Q(x0 , ξ)AT (x0 , ξ)η(x0 , x0 , ξ) − χ̃Ω (x0 , ξ) S T (ζ, ζ 0 ) dξdxdx0 dζdζ 0 i(x−x0 )·ξ −i(x0 ·ζ−x·ζ 0 ) (4.167) 82 and Z JC (Q) ∼ † Q(x0 , ξ)AC (x, ξ)η(x, x0 , ξ) C 0 0 0 0 0 Q(x , ξ)A (x , ξ)η(x , x , ξ) S C (ζ, ζ ) dξdxdx0 dζdζ 0 . (4.168) i(x−x0 )·ξ −i(x0 ·ζ−x·ζ 0 ) e e tr We now use (4.151) and the symbol calculus again to carry out the integrations in the variables x0 and ξ. We obtain the leading order contributions to J˜T and JC as n o† J˜T (Q) ∼ e tr Q(x, ζ 0 )AT (x, ζ 0 )η(x, x, ζ 0 ) − χ̃Ω (x, ζ 0 ) o n T 0 0 0 0 0 Q(x, ζ )A (x, ζ )η(x, x, ζ ) − χ̃Ω (x, ζ ) S T (ζ, ζ ) dxdζdζ 0 (4.169) Z ix·(ζ 0 −ζ) and Z † Q(x, ζ 0 )AC (x, ζ 0 )η(x, x, ζ 0 ) C 0 0 0 0 Q(x, ζ )A (x, ζ )η(x, x, ζ ) S C (ζ, ζ ) dxdζdζ 0 . JC (Q) ∼ ix·(ζ 0 −ζ) e tr (4.170) Now for the bias term, B(Q), we introduce the function M , which is analogous to a spectral density function and is defined as follows: 0 † Z 0 µ(x, x ) = µ(x)µ (x ) = 0 0 e−ix·ζ eix ·ζ M (ζ, ζ 0 )dζdζ 0 . (4.171) Again switching the two arguments we obtain 0 µ(x , x) = Z 0 0 e−ix ·ζ eix·ζ M (ζ, ζ 0 )dζdζ 0 . (4.172) Now we substitute (4.172) into the expression for B(Q) and perform the same 83 stationary phase calculation in x0 and ξ to arrive at n o† B(Q) ∼ e tr Q(x, ζ 0 )AT (x, ζ 0 )η(x, x, ζ 0 ) − χ̃Ω (x, ζ 0 ) n o T 0 0 0 0 0 Q(x, ζ )A (x, ζ )η(x, x, ζ ) − χ̃Ω (x, ζ ) M (ζ, ζ ) dxdζdζ 0 . 
(4.173) Z ix·(ζ 0 −ζ) We have now finished simplifying the terms that make up the mean-square error. Our next step is to return to task of finding the optimal filter Q. Recall our goal is to find the Q which minimizes J (Q). We do this by finding the variation of J with respect to Q. That is, we look for the Q which satisfies d d d ˜ 0 = JT (Q + Q ) + JC (Q + Q ) + Jn (Q + Q ) d =0 d =0 d =0 d + B(Q + Q ) (4.174) d =0 for all possible Q . This variational optimization technique comes from calculus of variations and is analogous to the Euler-Lagrange method. We use such a method because our quantity J (Q) is a functional and not a function. We now begin calculating this derivative. We focus on the first term on the right-hand side of (4.174) and then apply similar steps to obtain the other terms in the derivative. We have Z d 0 −ζ) T T ix·(ζ † J˜T (Q + Q ) = e tr (Q A η) ((QA η − χ̃Ω )S T ) dxdζdζ 0 d =0 Z T T ix·(ζ 0 −ζ) † + e tr (QA η − χ̃Ω ) ((Q A η)S T ) dxdζdζ 0 . (4.175) Now if we interchange ζ and ζ 0 in the second integral and use the fact that 84 S †T (ζ, ζ 0 ) = S T (ζ 0 , ζ) we obtain Z d T T ix·(ζ 0 −ζ) † ˜ tr (Q A η) ((QA η − χ̃Ω )S T ) dxdζdζ 0 JT (Q + Q ) = e d =0 Z † T T ix·(ζ−ζ 0 ) † 0 + e tr (QA η − χ̃Ω ) ((Q A η)S T (ζ, ζ )) dxdζdζ 0 . (4.176) Now using the fact that for any square matrix M, tr(M) = tr(M0 ) (where the superscript 0 here refers to transpose) we have Z d 0 −ζ) T T † ix·(ζ J˜T (Q + Q ) = e tr (Q A η) ((QA η − χ̃Ω )S T ) dxdζdζ 0 d =0 Z T T 0 ix·(ζ−ζ 0 ) 0 + e tr S T (ζ, ζ )(Q A η) (QA η − χ̃Ω ) dxdζdζ 0 . (4.177) And finally we use the fact that for any square matrices A, B, and C, tr(ABC) = tr(BCA) to obtain Z d 0 −ζ) T T ix·(ζ † J˜T (Q + Q ) = e tr (Q A η) ((QA η − χ̃Ω )S T ) dxdζdζ 0 d =0 Z T T ix·(ζ−ζ 0 ) 0 0 + e tr (Q A η) (QA η − χ̃Ω )S T (ζ, ζ ) dxdζdζ 0 . (4.178) We notice that the second term is exactly the complex conjugate of the first term. This leads us to the following expression: Z d ix·(ζ 0 −ζ) T † T ˜ JT (Q + Q ) = 2 Re e tr (Q A η) ((QA η − χ̃Ω )S T ) dxdζdζ 0 . d =0 (4.179) Performing similar steps we obtain the expressions for the variational deriva- 85 tives of JC , Jn , and B: Z d C † C ix·(ζ 0 −ζ) tr (Q A η) ((QA η)S C ) dxdζdζ 0 , JC (Q + Q ) = 2 Re e d =0 Z d n † Jn (Q + Q ) = 2 Re tr Q QS̃ ηdxdζ, d =0 Z d T T ix·(ζ 0 −ζ) † B(Q + Q ) = 2 Re e tr (Q A η) ((QA η − χ̃Ω )M ) dxdζdζ 0 . d =0 (4.180) Now inserting the above results into equation (4.174) we have 0 = 2 Re e tr (Q A η) ((QA η − χ̃Ω )S T ) dxdζdζ 0 (4.181) Z C † C ix·(ζ 0 −ζ) + 2 Re e tr (Q A η) ((QA η)S C ) dxdζdζ 0 Z n † + 2 Re tr Q QS̃ ηdxdζ Z T T ix·(ζ 0 −ζ) † + 2 Re e tr (Q A η) ((QA η − χ̃Ω )M ) dxdζdζ 0 . Z ix·(ζ 0 −ζ) † T T We combine the four terms to obtain 0 = 2 Re e tr (Q A η) ((QA η − χ̃Ω )(S T + M )) dxdζdζ 0 Z C † C ix·(ζ 0 −ζ) + 2 Re e tr (Q A η) ((QA η)S C ) dxdζdζ 0 Z n † + 2 Re tr Q QS̃ ηdxdζ (4.182) Z ix·(ζ 0 −ζ) T † T Now in the first two terms we use the fact that for any square matrices A, B, 86 C, and D we have tr(ABCD) = tr(BCDA). This allows us to write Z 0 Q† (ηeix·(ζ −ζ) )(QAT η − χ̃Ω )(S T + M )(A ) dxdζdζ 0 Z † C C † ix·(ζ 0 −ζ) + 2 Re tr Q (ηe )(QA η)(S C )(A ) dxdζdζ 0 Z n † + 2 Re tr Q QS̃ dxdζ Z 0 = 2 Re tr Q† ηeix·(ζ −ζ) [(QAT η − χ̃Ω )(S T + M )(AT )† Z n C C † † 0 + (QA η)(S C )(A ) ] dxdζdζ + 2 Re tr Q QS̃ dxdζ. 
0 = 2 Re tr T † (4.183) In order to derive the condition which guarantees our equation equals zero we write out the trace operator in summation form: 0 = 2 Re Z X 4 X 4 Q†,(k,r) Z ix·(ζ 0 −ζ) ηe (QAT η − χ̃Ω )(S T + M )(AT )† k=1 r=1 C † C + (QA η)(S C )(A ) dζ + (QS̃ η)(r,k) dxdζ. 0 n (r,k) (4.184) We see that (4.184) holds for all Q if Q satisfies the following integral equation Z 0= ix·(ζ 0 −ζ) ηe T T † C C † (QA η − χ̃Ω )(S T + M )(A ) + (QA η)S C (A ) dζ 0 (r,k) n + (QS̃ η)(r,k) , (4.185) ∀r and ∀k. This completes the proof of part (1) of the theorem. We now consider part (2). 2. We now consider the task of minimizing the variance of the error process. 87 Ultimately we decide to perform this alternate task because it results in an algebraic expression for Q which aids in numerical calculations. We begin with the leading order contribution to V(Q) (the variance of the error term) which is given by V(Q) = J˜T (Q) + JC (Q) + Jn (Q). (4.186) Again note the only term missing is the bias term B(Q). Following the above calculations we have that ∀r and ∀k Z 0= ix·(ζ 0 −ζ) ηe T T † C C † (QA η − χ̃Ω )S T (A ) + (QA η)S C (A ) dζ 0 (r,k) n + (QS̃ η)(r,k) . (4.187) We can make a stationarity assumption on (T − µ) and C such that S T (ζ, ζ 0 ) = S T (ζ)δ(ζ − ζ 0 ) (4.188) S C (ζ, ζ 0 ) = S C (ζ)δ(ζ − ζ 0 ). (4.189) With these assumptions our condition simplifies to become T † T C C † 0 = η ((QA η − χ̃Ω )S T (A ) + (QA η)S C (A ) n + (QS̃ η)(r,k) (4.190) (r,k) ∀r and ∀k. We may rewrite this as n T T † C C † 2 Q |η| (A S T (A ) + A S C (A ) ) + η S̃ = η χ̃Ω S T (AT )† . (4.191) Then we take the adjoint of the above equation to obtain |η| (A (S ) (A ) + A (S ) (A ) ) + η(S̃ ) Q† = ηAT (S T )† χ̃†Ω (4.192) 2 T T † T † C C † C † n † 88 Therefore if the matrix in brackets is invertible we obtain the following filter † 2 T T † T † C C † C † n † Q = |η| (A (S ) (A ) +A (S ) (A ) )+η(S̃ ) −1 ηAT (S T )† χ̃†Ω . (4.193) This completes the proof of part (2). 4.5.2 Correlated Clutter and Target Case We now consider the case when the clutter process and the target process are statistically dependent. Recall our forward model Z D(k, s) = e2ikRx,s AT (x, k, s)T (x) + AC (x, k, s)C(x) dx + n(k, s). (4.194) Also recall our statistical assumptions; the first-order statistics remain the same: E[T (x)] = µ(x) (4.195) E[C(x)] = 0 (4.196) E[n(k, s)] = 0. (4.197) For the second-order statistics we again have the autocovariance matrices for T , C, and n where we define the (l, k)th entry as follows T Cl,k (x, x0 ) = E[(Tl (x) − µl (x))(Tk (x0 ) − µk (x))] (4.198) 0 0 RC l,k (x, x ) = E[Cl (x)Ck (x )] (4.199) Rnl,k (k, s; k 0 , s0 ) = E[nl (k, s)nk (k 0 , s0 )]. (4.200) We assume that T and n are statistically independent and also that C and n are statistically independent as before. However, we now assume that T and C are statistically dependent with the following cross-covariance matrices. We define the 89 (l, k)th entries of the cross-covariance matrices, C T,C and C C,T as T,C Cl,k (x, x0 ) = E[(Tl (x) − µl (x))Ck (x0 )] = E[Tl (x)Ck (x0 )] (4.201) C,T Cl,k (x, x0 ) = E[Cl (x)(Tk (x0 ) − µk (x0 ))] = E[Cl (x)Tk (x0 )]. (4.202) Theorem. 1. Let D be given by (4.119) and let I be given by (4.128). 
Assume S n is given by (4.126) and define S T , S C , M , S T,C , and S C,T as follows: T 0 C 0 Z C (x, x ) = Z R (x, x ) = µ(x)µ† (x0 ) = Z Z 0 C T,C (x, x ) = C C,T 0 Z (x, x ) = 0 0 0 0 0 0 0 0 0 0 e−ix·ζ eix ·ζ S T (ζ, ζ 0 )dζdζ 0 e−ix·ζ eix ·ζ S C (ζ, ζ 0 )dζdζ 0 e−ix·ζ eix ·ζ M (ζ, ζ 0 )dζdζ 0 e−ix·ζ eix ·ζ S T,C (ζ, ζ 0 )dζdζ 0 e−ix·ζ eix ·ζ S C,T (ζ, ζ 0 )dζdζ 0 (4.203) (4.204) (4.205) (4.206) (4.207) then any filter Q satisfying a symbol estimate and also minimizing the meansquare error J (Q) must be a solution of the following integral equation ∀r and ∀k: Z ix·(ζ 0 −ζ) (QAT η − χ̃Ω )[(S T + M )(AT )† + S C,T (AC )† ] C C † T † + (QA η)[S C (A ) + S T,C (A ) dζ 0 0= ηe (r,k) + ηQS n (r,k) 2. If we make the stationarity assumptions S T (ζ, ζ 0 ) = S T (ζ)δ(ζ − ζ 0 ) (4.208) S C (ζ, ζ 0 ) = S C (ζ)δ(ζ − ζ 0 ) (4.209) S T,C (ζ, ζ 0 ) = S T,C (ζ)δ(ζ − ζ 0 ) (4.210) S C,T (ζ, ζ 0 ) = S C,T (ζ)δ(ζ − ζ 0 ), (4.211) 90 then the filter Q minimizing the total error variance V(Q) is given by: −1 Q† = |η|2 AT (S T )† + AC (S C,T )† (AT )† + AC (S C )† + AT (S T,C )† (AC )† + ηR†n ×η AT (S T )† + AC (S C,T )† χ̃† . (4.212) Proof. 1. We begin again by defining the error process and minimizing the mean- square error with respect to the filter Q. Recall the error process E(z) = I(z) − T (z) and the functional describing the mean-square error J (Q) given by Z J (Q) = Z 2 E[|E(z)| ]dz = E[|(K(F T (T ) + F C (C) + n))(z) − T (z)|2 ]dz = JT (Q) + JC (Q) + Jn (Q) + JC,T (Q). (4.213) Note that because of the correlation of the target and clutter processes we have the additional cross-term JC,T . The derivations for the first three terms remain unchanged so for now we focus on simplifying the cross-term. Explicitly JC,T is given by † C e C (x) Q(z, ξ)A (x, ξ)η(x, z, ξ) dξdx JC,T (Q) = 2 Re E Z T i(x0 −z)·ξ0 0 0 0 0 0 0 0 0 0 × e Q(z, ξ )A (x , ξ )η(x , z, ξ ) − χ̃Ω (z, ξ ) T (x )dξ dx dz. Z Z −i(x−z)·ξ † (4.214) Following the same outline of steps in the proof of the statistically independent result, we now use (4.151) to carry out the integrations in z and ξ. We then 91 find that the leading-order contribution to the cross-term can be written † C 0 0 e C (x) Q(x , ξ)A (x, ξ)η(x, x , ξ) JC,T (Q) ∼ 2 Re E T 0 0 0 0 0 0 × Q(x , ξ)A (x , ξ)η(x , x , ξ) − χ̃Ω (x , ξ) T (x ) dξdxdx0 . Z Z i(x−x0 )·ξ † (4.215) Now if we write out the matrix multiplication in summation form and carry the expectation through to the random elements T and C, we obtain the following expression † C 0 0 JC,T (Q) ∼ 2 Re e tr Q(x , ξ)A (x, ξ)η(x, x , ξ) T C,T 0 0 0 0 0 0 Q(x , ξ)A (x , ξ)η(x , x , ξ) − χ̃Ω (x , ξ) C (x, x ) dξdxdx0 . Z i(x−x0 )·ξ (4.216) We then define the cross-spectral density matrix S C,T (ζ, ζ 0 ) and S T,C (ζ, ζ 0 ) as follows 0 C C,T (x, x ) = C T,C 0 (x, x ) = Z Z 0 0 0 0 e−ix·ζ eix ·ζ S C,T (ζ, ζ 0 )dζdζ 0 e−ix·ζ eix ·ζ S T,C (ζ, ζ 0 )dζdζ 0 . (4.217) (4.218) In order to write this term in terms of ζ and ζ 0 we insert (4.217) into the expression for JC,T to obtain the expression † C 0 0 JC,T (Q) ∼ 2 Re e e tr Q(x , ξ)A (x, ξ)η(x, x , ξ) T 0 0 0 0 0 0 Q(x , ξ)A (x , ξ)η(x , x , ξ) − χ̃Ω (x , ξ) S C,T (ζ, ζ ) dξdxdx0 dζdζ 0 . Z i(x−x0 )·ξ −i(x·ζ−x0 ·ζ 0 ) (4.219) Note we define ST,C for use in the simplification of the variational derivative. Again we use (4.151) and the symbol calculus to obtain the leading-order 92 contribution from the integrations in x and ξ: † C tr Q(x, ζ)A (x, ζ)η(x, x, ζ) JC,T (Q) ∼ 2 Re e T 0 Q(x, ζ)A (x, ζ)η(x, x, ζ) − χ̃Ω (x, ζ) S C,T (ζ, ζ ) dxdζdζ 0 . 
Z ix·(ζ 0 −ζ) (4.220) Our next step is to find the variational derivative of JC,T with respect to Q and rewrite it in such a way that we can combine it easily with the terms previously derived. We first note that the variational derivative can be written as Z d ix·(ζ 0 −ζ) C † T 0 (JC,T (Q + Q )) = 2 Re e tr (QA η) Q A ηS C,T (ζ, ζ ) dxdζdζ 0 d =0 Z 0 +2 Re eix·(ζ −ζ) tr η(AC )† Q† (QAT η − χ̃Ω )S C,T (ζ, ζ 0 ) dxdζdζ 0 . (4.221) In the first term we interchange ζ and ζ 0 and use the fact that S †T,C (ζ, ζ 0 ) = S C,T (ζ 0 , ζ) to obtain Z d † 0 ix·(ζ−ζ 0 ) C † T (JC,T (Q + Q )) = 2 Re e tr (QA η) Q A ηS T,C (ζ, ζ ) dxdζdζ 0 d =0 Z ix·(ζ 0 −ζ) 0 C † † T +2 Re e tr η(A ) Q (QA η − χ̃Ω )S C,T (ζ, ζ ) dxdζdζ 0 . (4.222) We then use the fact that for any square matrix A we have that tr(A0 ) = tr(A) (where 0 indicates transpose) in the first term to obtain Z d ix·(ζ−ζ 0 ) T 0 C (JC,T (Q + Q )) = 2 Re e tr S T,C (Q A η) (QA η) dxdζdζ 0 d =0 Z C † † T 0 ix·(ζ 0 −ζ) +2 Re e tr η(A ) Q (QA η − χ̃Ω )S C,T (ζ, ζ ) dxdζdζ 0 . (4.223) Next we use the fact that for any square matrices A, B, and C that tr(ABC) = 93 tr(BCA) in the first term. This gives us Z d T C 0 ix·(ζ−ζ 0 ) (JC,T (Q + Q )) = 2 Re e tr (Q A η) (QA η)S T,C dxdζdζ 0 d =0 Z C † † T 0 ix·(ζ 0 −ζ) +2 Re e tr η(A ) Q (QA η − χ̃Ω )S C,T (ζ, ζ ) dxdζdζ 0 . (4.224) Also use the fact that for any complex number z, Re(z) = Re(z), in the first term to find Z d T C † ix·(ζ−ζ 0 ) (JC,T (Q + Q )) = 2 Re e tr (Q A η) (QA η)S T,C dxdζdζ 0 d =0 Z C † † T ix·(ζ 0 −ζ) 0 +2 Re e tr η(A ) Q (QA η − χ̃Ω )S C,T (ζ, ζ ) dxdζdζ 0 . (4.225) Finally in both terms we use the face that for any square matrices A, B, C, and D tr(ABCD) = tr(BCDA) to write Z d † C T † ix·(ζ 0 −ζ) (JC,T (Q + Q )) = 2 Re e tr Q η(QA η)S T,C (A ) dxdζdζ 0 d =0 Z ix·(ζ 0 −ζ) +2 Re e tr Q† η(QAT η − χ̃Ω )S C,T (AC )† dxdζdζ 0 . (4.226) Combining with the other terms of the MSE we find that the variational derivative of the MSE with respect to Q is given by d 0 = (J (Q + Q )) d =0 Z Z † ix·(ζ 0 −ζ) = 2 Re tr Q ηe (QAT η − χ̃Ω )[(S T + M )(AT )† + S C,T (AC )† ] C C † T † 0 + (QA η)[S C (A ) + S T,C (A ) dζ + ηQRn dxdζ. (4.227) This expression holds for all Q if Q satisfies the following integral equation 94 for all r and for all k ηe (QAT η − χ̃Ω )[(S T + M )(AT )† + S C,T (AC )† ] 0= C C † T † 0 + (QA η)[S C (A ) + S T,C (A ) dζ + ηQRn Z ix·(ζ 0 −ζ) (4.228) (r,k) (r,k) This completes the proof of part (1). 2. Now if we consider minimizing simply the variance as before, we obtain the following integral equation for Q Z 0= ηe (QAT η − χ̃Ω )[S T (AT )† + S C,T (AC )† ] C C † T † 0 + ηQRn + (QA η)[S C (A ) + S T,C (A ) dζ ix·(ζ 0 −ζ) (r,k) (4.229) (r,k) We may also make the following stationarity and joint stationarity assumptions S T (ζ, ζ 0 ) = S T (ζ)δ(ζ − ζ 0 ) (4.230) S C (ζ, ζ 0 ) = S C (ζ)δ(ζ − ζ 0 ) (4.231) S T,C (ζ, ζ 0 ) = S T,C (ζ)δ(ζ − ζ 0 ) (4.232) S C,T (ζ, ζ 0 ) = S C,T (ζ)δ(ζ − ζ 0 ). (4.233) This leads to the following algebraic expression for Q T T T † C † C C † T † 0 = η (QA η − χ̃Ω )[S (A ) + S C,T (A ) ] + (QA η)[S C (A ) + S T,C (A ) ] + QRn η. (4.234) We may rearrange the equation and take the transpose as in the last step of 95 the filter derivation in the independent case to obtain −1 T T † C C † † 2 † † † † Q = |η| A (S T ) + AC (S C,T ) (A ) + A (S C ) + AT (S T,C ) (A ) + ηRn T C † † × η A (S T ) + A (S C,T ) χ̃† . † (4.235) which completes the proof of the theorem. 
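Before turning to the simulations, the following NumPy sketch evaluates the minimum-variance filter of (4.193) at a single spatial frequency, treating the Jacobian η and the cutoff χ̃Ω as scalars and using arbitrary placeholder matrices for the amplitudes and spectral densities; the correlated-clutter filter (4.235) follows the same steps with the cross-spectral terms added to both factors. The point of the sketch is that a non-diagonal ST yields a fully dense, coupled filter Q.

```python
import numpy as np

def min_variance_filter(A_T, A_C, S_T, S_C, S_n, eta=1.0, chi=1.0):
    """Filter Q of equation (4.193) at one spatial frequency.

    A_T, A_C      : 4x4 target and clutter amplitude matrices
    S_T, S_C, S_n : 4x4 spectral density matrices of target, clutter, noise
    eta, chi      : Beylkin determinant and smoothed cutoff, taken as scalars here
    """
    H = lambda M: M.conj().T
    bracket = (abs(eta) ** 2 * (A_T @ H(S_T) @ H(A_T) + A_C @ H(S_C) @ H(A_C))
               + eta * H(S_n))
    Q_dagger = np.linalg.solve(bracket, eta * A_T @ H(S_T) * np.conj(chi))
    return H(Q_dagger)

# Placeholder values: a non-diagonal target spectrum, white-ish clutter and noise.
rng = np.random.default_rng(3)
B = rng.normal(size=(4, 4))
S_T = B @ B.T + np.eye(4)
S_C = 0.5 * np.eye(4)
S_n = 0.01 * np.eye(4)
A_T = rng.normal(size=(4, 4))
A_C = rng.normal(size=(4, 4))
print(np.round(min_variance_filter(A_T, A_C, S_T, S_C, S_n), 3))  # fully dense Q
```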
4.6 Numerical Simulations We conclude our study of SAR imaging of extended targets with some nu- merical experiments to verify the theory proved in the previous section. Recall we indicated that the most significant difference between our polarimetric SAR processing and the type of processing in practice is that we utilize every set of polarimetric data to reconstruct each element of the scattering matrix. This amounts to the fact that the optimal filter Q is a fully dense filter matrix. In standard SAR processing one would make the assumption that Q is a diagonal matrix. That is, the element Ti,j (x) may be reconstructed from the data set Di,j (k, s) for each i = a, b and j = a, b. We saw this assumption presenting itself in the forward model in our comparison to the standard RCS model for cylindrical extended targets. We have calculated the optimal filter and allowed it to be in general fully dense. We consider a simple thought experiment that shows even under the simplest assumptions on the target and clutter processes the optimal filter in the mean-square sense is not diagonal. Observe that for the remainder of the chapter we will consider specifically the case when target and clutter are not statistically correlated and we will look only at the optimal filter for the case of minimizing variance as in (4.139). This choice is made solely for simplicity in terms of the numerical calculations necessary. Now note that if the spectral density matrices S T , S C , and S n are diagonal, 96 we obtain a diagonal filter Q where the entries along the diagonal simplify to T Qi,i = ηA(i,i) S T(i,i) χ̃Ω,(i,i) n 2 C |η|2 (|AT(i,i) |2 S T(i,i) + |AC (i,i) | S (i,i) ) + ηS (i,i) . (4.236) This corresponds to performing a component-by-component backprojection where the filter is derived to minimize the mean-square error in each component of the image with respect to each component of the actual target function. This result is derived directly in [15]. This is an example of a backprojection filter that can be used in standard polarimetric SAR imaging. It is not necessarily the case that these spectral density matrices are diagonal. We may assume that the noise has been whitened and therefore S n is diagonal. It is not as simple to reason that the target and clutter spectral densities have such a structure. To demonstrate this we calculate example covariance matrices for specific target and clutter functions. For our target example we begin by assuming all randomness in T (x) arises in the scattering vector S(θT ). Therefore for simplicity we assume that ρT (x) is deterministic. In particular we assume that the orientation θT is a random process dependent on the location x. For simplicity we assume θT is a Gaussian random process where for any θT (x) and θT (x0 ) the joint probability density function has the form f (θT , θT0 ) 1 2 1 02 0 0 = √ exp − (2θT − 2θT θT − 10θT − 10θT + 2θT + 50) 3 2π 3 (4.237) where θT = θT (x) and θT0 = θT (x0 ). This corresponds to a joint Gaussian density with mean µ = [5, 5]T and covariance matrix C= 2 1 1 2 . We note that this implies the marginal density function for any θT is given by √ θT2 2 25 f (θT ) = √ exp − + 5θT − . 2 2 4 π (4.238) 97 Under these assumptions we obtain the following covariance matrix for T (x) −0.00498 0.05602 CT (x, x0 ) = CT (θT , θT0 ) = ρT (x)ρT (x0 ) −0.00498 0.05478 0.05478 −0.00422 −0.00422 0.08440 (4.239) 0.00805 where we have assumed for simplicity that Tab (x) = Tba (x). 
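The entries above can be checked with a short Monte Carlo computation: we draw pairs (θT, θ'T) from the joint Gaussian density (4.237) and average the outer products of the mean-removed reduced scattering vectors, which uses the same assumption Tab(x) = Tba(x). The sketch below (with an arbitrary sample size) estimates the covariance up to the deterministic factor ρT(x)ρT(x'); the essential observation is that the estimate is not diagonal.

```python
import numpy as np

rng = np.random.default_rng(4)

# Joint Gaussian orientations at two locations, mean [5, 5], covariance
# [[2, 1], [1, 2]], as in (4.237).
samples = rng.multivariate_normal([5.0, 5.0], [[2.0, 1.0], [1.0, 2.0]],
                                  size=1_000_000)
th, th_p = samples[:, 0], samples[:, 1]

def S3(theta):
    """Reduced 3-component scattering vector (using T_ab = T_ba)."""
    return np.stack([np.cos(theta) ** 2,
                     np.cos(theta) * np.sin(theta),
                     np.sin(theta) ** 2], axis=-1)

S, S_p = S3(th), S3(th_p)
C_T = (S - S.mean(axis=0)).T @ (S_p - S_p.mean(axis=0)) / len(th)
print(np.round(C_T, 5))   # a non-diagonal 3x3 matrix, cf. (4.239)
```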
This assumption reduces the dimensions of our original covariance matrix to a 3 × 3 matrix. Note we calculate the covariance matrix by averaging over θT and θT0 . We also calculated the result for when µ = [0, 0]T with C the same as above. In this case our covariance structure was slightly simpler, however still not diagonal. We have 0.086885 CT (x, x0 ) = CT (θT , θT0 ) = ρT (x)ρT (x0 ) 0 0.05503 0.05503 0.0083 0 . 0 0.05305 0 (4.240) Also for the sake of completeness we found the covariance structure within each pixel, that is the case when x = x0 . The result is given by 0.11 CT (x, x) = |ρT (x)|2 −0.0051 −0.0051 0.000794 −0.0040978 . 0.000794 −0.0040978 .13839 0.06215 (4.241) We see that even a Gaussian random target process does not result in a diagonal C T and therefore S T will also not be diagonal. In addition, we consider an example clutter function, where again we assume all the randomness arises due to the random process θC (x). In this case we assume that each realization θC (x) is uniformly distributed in the interval [0, π/2]. This assumption has been previously used for clutter in [21]. We also assumed that any two realizations θC (x) and θC (x0 ) were independent. Therefore the covariance 98 matrix for C(x) for any two x and x0 is given by 0.25 CC (x, x0 ) = ρC (x)ρC (x0 ) 0.25π 0.25 0.25π π −2 0.5π −1 0.25 0.5π −1 . (4.242) 0.25 Here we have used CC to denote the covariance as opposed to RC because in this case the mean of the clutter function is not zero as assumed before. It is clear that even with rather simple statistical assumptions our covariance matrices for our target and clutter functions, and hence the spectral density functions, are not diagonal. We can therefore say with certainty that our resultant filter Q defined in (4.193) is different from the filter in the component-by-component backprojection from (4.236). Our derived filter implies that we utilize all the components of the data when creating each component of the image, that is each Di,J is used to create the image Ii,j for i = a, b and j = a, b. It is also important to observe that these statistical assumptions are rather simple and introduce randomness in one specific way. We may also assume, for example, that the scattering strengths ρT and ρC are random processes dependent on the location x. This would introduce complexity in the joint density functions, and complicate the covariance matrix calculation perhaps significantly. We stress that the assumptions made to arrive at the above example matrices are simple ones and do not reflect a very realistic target and clutter scene. We find that even in this simple case, these covariance matrices and hence the spectral density matrices are not diagonal. Therefore our derived filter (4.193) is indeed different and novel in comparison with a component-by-component backprojection scheme in which one treats each set of data independently to form the four images that make up our I(x). This component-by-component scheme may be the one defined in (4.236), or an even simpler backprojection scheme in which the statistics of the target and clutter scene are not taken into account. 99 4.6.1 Numerical Experiments We move on now to discuss the specific numerical experiments performed to verify our theory. We assume that our scene on the ground is of the size 50 meters by 50 meters where there exist 100 pixels by 100 pixels. That is, our resolution cell size is .5 meters by .5 meters. 
The coordinate system used is target-centered so the target is always located at the origin of the scene. We consider targets with varying orientation for example θT (x) = 0, π/4, π/2 at each x location that the target is found. We also have that ρT (x) = 1 for all target locations x. Note we assume the target is always twenty pixels in length and one pixel in width. For the clutter process we assume that a clutter dipole is located at every possible x in the scene of interest. Also note that all the random variables ρC (x) are independent identically distributed (i.i.d.) Gaussian random variables with zero-mean and unit complex variance. The random variables θC (x) are i.i.d uniform between the angles [0, π/2]. We note that in this case the total clutter process C(x) is wide-sense stationary and therefore the stationarity assumption (4.138) holds. Also observe that measurement noise is not explicitly included in the numerical simulations as it is simple to assume that the data has been prewhitened. The flight path is always assumed to be linear with the coordinates given by γ(s) = [x0 , s, z0 ], where we have assumed that x0 and z0 are fixed or constant. The two antennas used for transmission and reception have orientations eba = [1, 0, 0]0 and ebb = [0, 1, 0]0 which are defined with respect to the origin in the scene on the ground. Note we may think of a as having the horizontal or H orientation and b therefore has the vertical or V orientation. Our frequency range is 1 − 1.5 GHz, where we sample at a rate above Nyquist. We also note the way in which we calculate the spectral density functions of the processes T (x) and C(x). Since the target is not actually random in the experiments we calculate its spectral density via the formula Z 2 −ix·ζ T (x)dx . S T (ζ) = e (4.243) The clutter covariance matrix was calculated by hand given the simple assumptions 100 on its distribution. That is, we average over C(x) and C(x0 ). Then we take the Fourier transform in order to calculate S C (ζ). We also note our definition of signalto-clutter ratio (SCR) is given by SCR = 20 log 1 N PN 1 |(T (xi ) − µT (xi )|2 E[|C|2 ] (4.244) where N is the number of grid points and µT is the mean of the target function. Note that in producing the data the directional scattering assumptions on the target and clutter process, (4.108) and (4.116), are not used. In this way we avoid making any crimes of inversion. After the simulated data is produced we image the target scattering vector, or target function, T (x) using both the standard SAR processing and our coupled polarimetric processing. We will then compare two sets of images for each example. We have Is (z) and Ic (z) where these are the component-by-component backprojection image vector and coupled backprojection image vector respectively. We will also note the differences in mean-square error and the final image signal-to-clutter (SCR) ratios. Before we go into specific results we note one issue present in our coupled numerical reconstruction scheme. The matrix in the expression for Q in equation (4.193) is typically close to being singular. This poses some issues in finding its inverse numerically. We have implemented a regularization scheme in which we diagonally weight the matrix in order to improve its condition number. This diagonal weighting depends on a constant factor which we call our regularization parameter, similar to the terminology used in Tikhonov regularization. 
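A small illustration of this diagonal weighting is given below: a nearly singular 4 × 4 matrix, standing in for the bracketed matrix in (4.193), is loaded with α times the identity before the solve, which brings the condition number, and the norm of the resulting filter, under control. The matrix and the values of the regularization parameter α are made up for the illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)

# A nearly singular symmetric matrix standing in for the bracket in (4.193).
U, _ = np.linalg.qr(rng.normal(size=(4, 4)))
M = U @ np.diag([1.0, 0.5, 1e-10, 1e-12]) @ U.T
rhs = rng.normal(size=(4, 4))

for alpha in [0.0, 1e-8, 1e-4]:
    M_reg = M + alpha * np.eye(4)              # diagonal weighting
    Q = np.linalg.solve(M_reg, rhs)
    print(f"alpha={alpha:8.1e}  cond={np.linalg.cond(M_reg):10.3e}"
          f"  ||Q||={np.linalg.norm(Q):10.3e}")
```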
The choice of the regularization parameter is done for each case individually and has not yet been optimized for minimizing MSE or final image SCR. This is left as future work. Once this is optimized we expect the improvements made in using our coupled scheme to be even more significant. 4.6.1.1 Example One - Horizontally Polarized Target We first consider the case when the target has the orientation ebT (x) = [1, 0, 0]0 which is parallel to the a, or H, antenna. In Figure (4.8) we show the actual target scene on the left and then the target-embedded-in-clutter scene on the right. We 101 Figure 4.8: HH component of target vector and target plus clutter vector, horizontally polarized target assume that the target will not be visible when the antenna reaches the line y = 0 as b x,s is parallel with ebT . We display the this is where the target lies and at this point R data obtained using the a antenna for both transmission and reception, using a for transmission and b for reception, and also the case when b is used for both processes in Figures (4.9) and (4.10). The first group of data is target-only data, Figure (4.9), and the second set has target-embedded-in-clutter, Figure (4.10). We see that indeed there appears to be no data collected when s = 38 as this is the point the flight path crosses the x-axis where the target lies. When clutter is present the target data is completely obscured. We do note however that the target is visible from almost all other points on the flight path indicating that our directional scattering assumption is not entirely accurate. However we observe that making the assumption aids in formulating the inversion scheme. In the next figure, Figure (4.11) we present the results of the standard image processing and then follow with the results of our coupled processing in Figure (4.12). We present these side-by-side with the actual target function. Here we only show the result of the HH image as the other two images are flat as expected. Note in this case we have signal-to-clutter ratio of 10dB. We observe that in Figure (4.12) the scale of pixel values of the image is about nine orders of magnitude greater than that of the scale in the standard processed image in Figure (4.11). Also note that the image is significantly more focused in the coupled processing case. We also 102 Figure 4.9: HH, HV, and VV target only data for the case, horizontally polarized target plot the signal-to-clutter ratio versus the mean-square error in Figure (4.13). The MSE is reduced by an order of magnitude with the coupled processing technique. This is of note because in our filter derivation we minimized the variance of the error process instead of MSE and yet the MSE is significantly reduced with our technique. We note the slight increase when the SCR is 20dB, this is most likely due to the current realization of the clutter used in that calculation. Lastly we display the final image signal-to-clutter ratio in Tables (4.1) and (4.2). We calculate final SCR by performing the reconstruction techniques on targetonly data and clutter-only data and then compare the energy in each set of images. Here we see a significant improvement in final image SCR which is clear from comparing the coupled processed image with the standard processed image. These are two key parameters when considering the success of an imaging algorithm. 
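One plausible way to implement the energy comparison just described is sketched below; the exact normalization, and whether the ratio is reported linearly or in decibels, may differ from the convention used for the tables, so the numbers produced here are illustrative only.

```python
import numpy as np

def final_image_scr_db(I_target_only, I_clutter_only):
    """Energy in the image formed from target-only data relative to the energy
    in the image formed from clutter-only data, reported here in dB."""
    e_t = np.mean(np.abs(I_target_only) ** 2)
    e_c = np.mean(np.abs(I_clutter_only) ** 2)
    return 10.0 * np.log10(e_t / e_c)

# Toy image vectors (npix x 4), purely illustrative.
rng = np.random.default_rng(6)
I_t = np.zeros((100, 4), dtype=complex); I_t[45:55, 0] = 3.0
I_c = 0.5 * (rng.normal(size=(100, 4)) + 1j * rng.normal(size=(100, 4)))
print(f"final image SCR ~ {final_image_scr_db(I_t, I_c):.1f} dB")
```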
This 103 Figure 4.10: HH, HV, and VV target embedded in clutter data for the case, horizontally polarized target trend will be displayed again in the next two examples, however the method does perform best in this example. 104 Figure 4.11: HH image created using the standard processing vs. the true target function Figure 4.12: HH image created using the coupled processing vs. the true target function Figure 4.13: SCR vs. MSE for the standard processed images and coupled processed images respectively, horizontally polarized target 105 SCR(dB) Final Image SCR(dB) -20 0.3122 -10 0.9873 0 3.1223 10 9.8735 20 31.2226 Table 4.1: Initial SCR in dB vs. Final Standard Processed Image SCR in dB, horizontally polarized target SCR(dB) Final Image SCR(dB) -20 0.7869 -10 2.4884 0 7.8691 10 24.8844 20 78.6912 Table 4.2: Initial SCR in dB vs. Final Coupled Processed Image SCR in dB, horizontally polarized target 106 Figure 4.14: VV component of target vector and target plus clutter vector, vertically polarized target 4.6.1.2 Example Two - Vertically Polarized Target We next consider the case when the target has the orientation ebT (x) = [0, 1, 0]0 which is parallel to the b antenna, or V antenna, and perpendicular to the flight path. We display the true target scene and also the target embedded in clutter scene in Figure (4.14). We expect to see much of the target in this case because of the relationship between the target orientation and the flight path. We again display the data obtained using the a antenna for both transmission and reception, using a for transmission and b for reception, and also the case when b is used for both processes in Figures (4.15) and (4.16). Also again the first group of data is target-only data, and the second set has target-embedded-in-clutter. We see that in this case the target is visible for almost all slow-time values, or for the length of the flight path. We also note that the addition of clutter does not completely obscure data from the target in Figure (4.16). Since there is significant information in all channels we do not expect our coupled processing to provide that much of an advantage over standard processing. In the Figure (4.17) we again present the results of the standard image processing and then in Figure (4.18) the results of our coupled processing. These are shown side-by-side with the actual target function. Here we only show the result of the VV image as the other two images are flat as expected. Note in this case we have signal-to-clutter ratio of −20dB. In this case, as expected, the difference between 107 Figure 4.15: HH, HV, and VV target only data for the case, vertically polarized target the two schemes is not as obvious. The scale on both images is the same. We also plot the signal-to-clutter ratio versus the mean-square error in Figure (4.19). Here we see a slight reduction in mean-square error, however again the improvement is not as noticeable. Finally in Tables (4.3) and (4.4) we see that final image SCR is improved again slightly with our scheme. This result is expected as the target is very visible in this scattering scenario. 108 Figure 4.16: HH, HV, and VV target embedded in clutter data for the case, vertically polarized target Figure 4.17: VV image created using the standard processing vs. the true target function 109 Figure 4.18: VV image created using the coupled processing vs. the true target function Figure 4.19: SCR vs. 
MSE for the standard processed images and coupled processed images respectively, vertically polarized target SCR(dB) Final Image SCR(dB) -20 0.308 -10 0.974 0 3.0802 10 9.7404 20 32.5014 Table 4.3: Initial SCR in dB vs. Final Standard Processing Image SCR in dB, vertically polarized target 110 SCR(dB) Final Image SCR(dB) -20 0.3213 -10 1.0159 0 3.2126 10 10.159 20 30.9024 Table 4.4: Initial SCR in dB vs. Final Coupled Processing Image SCR in dB, vertically polarized target 111 Figure 4.20: HV component of target vector and target plus clutter vector, 45 ◦ polarized target 4.6.1.3 Example Three - 45 ◦ Polarized Target Our third example considers the case when the target has orientation ebT = √ √ [1/ 2, 1/ 2, 0]0 . In this case we expect the coupled processing to aid even more in target reconstruction as there is more information to be gained by using all three data sets to construct each target vector element. In Figure (4.20) we show the true target scene and the target embedded in clutter scene. Next we display the target-only and also target embedded-in-clutter data in Figures (4.21) and (4.22). We see in this case that the target is not visible for much of the flight path and also the addition of clutter completely obscures the target. However there is data in all three channels so we expect to see some improvement by using our coupled processing. Next we display example images processed using the two different techniques. Here we show only the result of the HV image as the other two images are almost identical. These are shown in Figures (4.23) and (4.24). Note in this case we have signal-to-clutter ratio of 20dB. In this case both algorithms struggle to reconstruct the target however we note that our coupled scheme is able to properly display the orientation of the target while the standard processing fails in this respect. Next we plot the mean-square error versus SCR in Figure (4.25). Here we see slight improvement again when using our scheme. We do not expect that our scheme will improve MSE much as the target is not visible often and therefore the 112 amount of data available for reconstruction is minimal in all channels. Lastly we calculate the final image SCR for each type of image and display the results in Tables (4.5) and (4.6). We again see the same results as the previous two examples with an improvement in final image SCR. Again we do expect the gain to be significant in this case due to the lack of data. We note though that if the target has an orientation that is not along the coordinate axes the additional information used in our reconstruction scheme aids in producing a more accurate target image as we are able to properly reconstruct target orientation using our method. Figure 4.21: HH, HV, and VV target only data, 45 ◦ polarized target 113 Figure 4.22: HH, HV, and VV target embedded in clutter data, 45 ◦ polarized target Figure 4.23: HV image created using the standard processing vs. the true target function 114 Figure 4.24: HV image created using the coupled processing vs. the true target function Figure 4.25: SCR vs. MSE for the standard processed images and coupled processed images respectively, 45 ◦ polarized target SCR(dB) Final Image SCR(dB) -20 0.1646 -10 0.5039 0 1.5934 10 5.0389 20 15.9344 Table 4.5: Initial SCR in dB vs. Final Standard Processing Image SCR in dB, 45 ◦ polarized target 115 SCR(dB) Final Image SCR(dB) -20 0.1858 -10 0.5069 0 1.6031 10 5.0694 20 16.0308 Table 4.6: Initial SCR in dB vs. 
CHAPTER 5
Conclusions and Future Work

In this work we have studied the incorporation of statistical techniques into SAR imaging. In particular, we have shown that the generalized likelihood ratio test is equivalent to backprojection imaging when a matched filter is used. It is striking that identical data-processing operations arise from such different theories and problem formulations, and this raises further questions about the relationship between statistics and microlocal analysis, particularly when the two theories are applied to imaging. Future work is needed to fully understand why they lead to the same results.

The second body of work developed a novel polarimetric imaging technique. This technique not only demonstrates how to incorporate statistical knowledge into the imaging scheme, but also demonstrates a way to exploit the additional information a polarimetric radar provides. Previous polarimetric imaging techniques did not take full advantage of all the data sets available to them; as a result, polarimetric SAR has not been used widely, because the additional computation and hardware costs are not outweighed by the improvements obtained from the extra data sets. We have demonstrated that the mean-square error of the reconstructed target scattering vector can be reduced by the coupled processing, and that the final image signal-to-clutter ratio may also be improved by this method of data processing. In addition, the method has proved especially useful when the target does not share a polarization state with the antennas used for transmission and reception: in this case standard polarimetric techniques fail to correctly display the orientation of the target, while the polarimetric coupled processing is able to recover this information.

Future work may use these coupled-processed images as input to a detection and estimation scheme. It would be interesting to determine whether these images provide more information for the detection or estimation technique and whether the probability of detection is improved in this case. Combining this work with the first part of the thesis, one could also study whether simply performing a generalized likelihood ratio test on the polarimetric data would produce results similar to the coupled backprojection imaging. We have demonstrated that, with new computing capabilities, polarimetric radar may prove useful in improving SAR images and target detection capabilities.

Finally, the additional information provided by using every data set to reconstruct each element of the target vector suggests that there may be further ways to improve polarimetric imaging and detection. There is clearly a relationship between the elements of the scattering vector, especially in the case of dipole scatterers. One may use this information and instead attempt to reconstruct the second-order statistics (the covariance matrix) of the target scatterer. This makes sense because, in the current formulation, we are actually imaging only one realization of the random target process; if the stochastic assumption is accurate, it would be more informative to recover the covariance structure of the random process. This topic will be studied in the future by the author.
We conclude that statistics is a powerful tool that can be combined with the mathematical theory of microlocal analysis to create new imaging algorithms which improve our ability to detect targets in SAR images.
APPENDIX A
FIOs and Microlocal Analysis

A.1 Fourier Integral Operators

Definition. A Fourier integral operator (FIO) $P$ of order $m$ is defined as
$$Pu = \int e^{i\Phi(x,y,\xi)}\, p(x,y,\xi)\, u(y)\, dy\, d\xi,$$
where $p(x,y,\xi) \in C^\infty(X \times Y \times \mathbb{R}^n)$ satisfies the following estimate: for every compact set $K \subset X \times Y$ and for all multi-indices $\alpha, \beta, \gamma$, there is a constant $C = C(K,\alpha,\beta,\gamma)$ such that
$$|\partial_\xi^\alpha \partial_x^\beta \partial_y^\gamma\, p(x,y,\xi)| \le C\,(1+|\xi|)^{m-|\alpha|}$$
for all $x, y \in K$ and all $\xi \in \mathbb{R}^n$. In addition, $\Phi$ must be a phase function, i.e.,
1.) $\Phi$ is positively homogeneous of degree 1 in $\xi$; that is, $\Phi(x,y,r\xi) = r\,\Phi(x,y,\xi)$ for all $r > 0$.
2.) $(\partial_x\Phi, \partial_\xi\Phi)$ and $(\partial_y\Phi, \partial_\xi\Phi)$ do not vanish for any $(x,y,\xi) \in X \times Y \times \mathbb{R}^n\setminus\{0\}$.

The phase variable $\xi$ is the analogue of $\omega$ in (2.36). One example of an FIO is a pseudodifferential operator, which is an FIO with a phase function of the form $\Phi(x,z,\xi) = (z - x)\cdot\xi$.

A.2 Microlocal Analysis

The mathematical theory of microlocal analysis is a way of analyzing singularities; in our case these singularities are the edges and boundaries we seek to identify. We begin our discussion of the theory by describing the way singularities are characterized in microlocal analysis, namely by their location and direction. We define the singular structure of a function by its wavefront set, which is the collection of singular points and their associated directions. Intuitively, the directions included in the wavefront set are those in which the function oscillates most in the frequency domain. A formal definition of the wavefront set is as follows.

Definition. The point $(y, \xi_0)$ is not in the wavefront set $WF(f)$ of the function $f$ if there is a smooth cutoff function $\psi$ with $\psi(y) \neq 0$ for which the Fourier transform $\widehat{(f\psi)}(\lambda\xi)$ decays rapidly (i.e., faster than any polynomial in $1/\lambda$) as $\lambda \to \infty$, uniformly for $\xi$ in a neighborhood of $\xi_0$.

We can break this definition down into three steps for determining whether a point is in the wavefront set: i.) localize around $y$ by multiplying by a smooth cutoff function $\psi$ supported in a neighborhood of $y$; ii.) Fourier transform $f\psi$; and iii.) study the decay of the Fourier transform in the direction $\xi_0$. If the Fourier transform decays rapidly in that direction, the point is not in the wavefront set. We include two examples to make this concept more concrete.

Example 1. A point scatterer: if $f(x) = \delta(x)$, then $WF(f) = \{(0,\xi) : \xi \neq 0\}$.
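To verify the point-scatterer example directly from the definition: for $f = \delta$ and any smooth cutoff $\psi$,
$$\widehat{(f\psi)}(\lambda\xi) = \int \delta(x)\,\psi(x)\,e^{-i\lambda\xi\cdot x}\,dx = \psi(0).$$
If $y \neq 0$, we may choose $\psi$ with $\psi(y)\neq 0$ supported away from the origin, so that $f\psi \equiv 0$ and the transform decays trivially; hence no point $(y,\xi_0)$ with $y \neq 0$ lies in $WF(f)$. If $y = 0$, every admissible cutoff has $\psi(0)\neq 0$, so $\widehat{(f\psi)}(\lambda\xi) = \psi(0)$ does not decay as $\lambda \to \infty$ in any direction, and therefore $WF(\delta) = \{(0,\xi) : \xi \neq 0\}$.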
Example 2. A line: if $f(x) = \delta(x\cdot\nu)$, then $WF(f) = \{(x,\alpha\nu) : x\cdot\nu = 0,\ \alpha \neq 0\}$.

APPENDIX B
Calculation of the Radiation of a Short Dipole

B.0.1 Vector potential

The vector potential satisfies
$$\nabla^2 A + k^2 A = -\mu_0 J, \qquad \text{(B.1)}$$
which can be solved as
$$A(x) = \int \frac{e^{ik|x-y|}}{4\pi|x-y|}\,\mu_0 J(y)\, dy \approx \mu_0\,\frac{e^{ik|x|}}{4\pi|x|}\underbrace{\int e^{-ik\hat{x}\cdot y}\, J(y)\, dy}_{F(k\hat{x}) \,=\, \mathcal{F}[J](k\hat{x})} = \mu_0\,\frac{e^{ik|x|}}{4\pi|x|}\, F(k\hat{x}), \qquad \text{(B.2)}$$
where $F$ is the radiation vector. If we use the Lorenz gauge
$$\nabla\cdot A - i\omega\,\epsilon_0\mu_0\,\Phi = 0, \qquad \text{(B.3)}$$
then we obtain the following expression for $E$ in terms of $A$:
$$E = i\omega\left[A - \frac{c_0^2\,\nabla(\nabla\cdot A)}{(i\omega)^2}\right] = i\omega\left[A + k^{-2}\,\nabla(\nabla\cdot A)\right]. \qquad \text{(B.4)}$$

B.0.2 Far-field radiation fields

Taking $|x|$ large results in
$$E_{\mathrm{rad}}(x) = i\omega\mu_0\,\frac{e^{ik|x|}}{4\pi|x|}\left[F - \hat{x}(\hat{x}\cdot F)\right] = -\,i\omega\mu_0\,\frac{e^{ik|x|}}{4\pi|x|}\left[\hat{x}\times(\hat{x}\times F)\right], \qquad \text{(B.5)}$$
where we have used the "bac-cab" vector identity, together with $k^2 = \omega^2/c_0^2 = \mu_0\epsilon_0\,\omega^2$. We see that one effect of going to the far field is that the longitudinal component of $E$ is removed, so that $E \perp \hat{x}$, which must be true for a plane wave propagating in the direction $\hat{x}$.

B.0.3 Radiation vector for a dipole

For an antenna consisting of a wire that is short relative to the wavelength, the current density is approximately constant along the antenna. If we consider the antenna to be a line in space, say the set of points $x(s) = s\hat{e}$ with $-L/2 \le s \le L/2$, and denote the current density by the constant vector $I\hat{e}$, then the radiation vector is
$$F(k\hat{x}) = \int_{-L/2}^{L/2} I\hat{e}\; e^{-ik\hat{x}\cdot(s\hat{e})}\, ds = I\hat{e}\;\frac{e^{-ikL\hat{x}\cdot\hat{e}/2} - e^{ikL\hat{x}\cdot\hat{e}/2}}{-ik\hat{x}\cdot\hat{e}} = I\hat{e}\;\frac{2i\sin(kL\hat{x}\cdot\hat{e}/2)}{ik\hat{x}\cdot\hat{e}} = LI\hat{e}\,\operatorname{sinc}(kL\hat{x}\cdot\hat{e}/2). \qquad \text{(B.6)}$$
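As a quick sanity check on (B.6), the closed-form sinc expression can be compared against direct numerical evaluation of the defining integral. The following Python sketch is illustrative only; the dipole length, wavenumber, and look direction are arbitrary choices, not values taken from the thesis.

import numpy as np

# Arbitrary illustrative parameters.
L = 0.1                        # dipole length
k = 2.0 * np.pi / 0.3          # wavenumber for a wavelength of 0.3 (same length units)
I_amp = 1.0                    # constant current amplitude I
c = np.cos(np.deg2rad(40.0))   # x_hat . e_hat for an arbitrary look direction

# Direct numerical evaluation of the integral in (B.6), i.e. the scalar
# coefficient multiplying the unit vector e_hat, using a composite trapezoidal rule.
s = np.linspace(-L / 2.0, L / 2.0, 20001)
ds = s[1] - s[0]
vals = I_amp * np.exp(-1j * k * c * s)
numeric = (np.sum(vals) - 0.5 * vals[0] - 0.5 * vals[-1]) * ds

# Closed-form result from (B.6): L * I * sinc(k L (x_hat . e_hat) / 2), sinc(u) = sin(u)/u.
u = k * L * c / 2.0
closed_form = L * I_amp * np.sin(u) / u

print(numeric.real, closed_form)   # the two values agree to numerical precision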