Using Spatial Statistics in Research: Examples from work at UT-Dallas

• Faculty research
• Ph.D. dissertations
• Masters Projects
• Former UTD graduates “at work”

Spatial Autoregressive Model for Population Estimation at the Census Block Level Using LIDAR-derived Building Volume Information
Qiu, Fang*; Sridharan, Harini***; Chun, Yongwan**
Cartography and Geographic Information Science, Volume 37, Number 3, July 2010, pp. 239-257
*associate professor **assistant professor ***Ph.D. candidate, University of Texas at Dallas

Objective
• Estimate population in small geographic areas (city block) using remote sensing data
– Cheaper than carrying out a census
– Census may not provide data for small areas
[Map: population by block; legend classes 1-50, 51-125, 126-200, 201-400, >400; scale bar 500 m]

Previous Work (literature review)
• Previous work used remote sensing image analysis to measure density of roads or area of residential land use
– Population then estimated using these data
• Data is only 1 or 2 dimensional
– Does not measure multi-story housing units
– Would not work in China!
• Use LIDAR data to measure building volume

LiDAR
• Light Detection And Ranging (LiDAR) technology
• Collects elevation data using a laser scanner
– Laser beam bounces (reflects) back from ground, top of buildings, top or side of trees, etc.
• Produces point cloud of 3-D information
– x, y, z: longitude, latitude, elevation
• Very detailed and accurate
– Points every few cm if desired

Data
• Obtain building footprints and their area from analysis of digital ortho images
• Buffer 1 m around footprint
• Height of building is the difference between the median Lidar elevation within the footprint (top of building) and the median elevation within the buffer (ground around building)
• Area x height = volume

Model
• $P = aA^{b}$: allometric growth model used in previous research (P = population, A = area)
– Population is an increasing function of area (A)
• $P = \alpha V^{\beta}$: modified allometric growth model used in this research
– Population is an increasing function of volume (V)
• $\log P = \log \alpha + \beta \log V$
– Take log of both sides to linearize the equation
– Use linear regression to estimate the coefficients

Results
Model type   Predictor               R2 / Pseudo R2   AIC      RMSE     Adj RMSE
OLS          Building volume based   0.844            131.04   28.415   0.4023
OLS          Building area based     0.812            139.41   53.581   0.7268
OLS          Land use area based     0.638            207.88   53.622   0.4381
OLS          Road length based       0.619            185.48   244.50   0.909
Spatial      Building volume based   0.850            128.84   28.173   0.288
Spatial      Building area based     0.824            138.96   35.072   0.484
Spatial      Land use area based     0.674            189.61   53.884   0.44
Spatial      Road length based       0.72             178.55   74.770   0.546
• Volume always better than area or road
• Spatial always better than OLS

Case study: A Spatial Analysis of West Nile Virus
Diffusion of WNV across the US
Daniel A. Griffith, Ashbel Smith Professor
http://www.ij-healthgeographics.com/articles/browse.asp
A comparison of six analytical disease mapping techniques as applied to West Nile Virus in the coterminous United States, International Journal of Health Geographics 2005, 4:18.

Geographic distribution of West Nile virus (WNV) reported cases in 2002. Black denotes states with, and white denotes states without reported cases.
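
Returning to the population-estimation study above: the building-volume step and the log-linearized allometric fit can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code. It uses NumPy, the function names `building_volume` and `fit_allometric` are hypothetical, the elevations and block data are synthetic, and the fit is plain ordinary least squares (the spatial autoregressive version reported in the results is not shown). Both population and volume must be positive for the logarithms to exist.

```python
import numpy as np


def building_volume(footprint_z, buffer_z, footprint_area):
    """Volume = footprint area x (median roof elevation - median ground elevation)."""
    height = np.median(footprint_z) - np.median(buffer_z)
    # Clamp at zero so noisy returns never produce a negative volume (an assumption).
    return footprint_area * max(height, 0.0)


def fit_allometric(volume, population):
    """Fit log(P) = log(alpha) + beta * log(V) by ordinary least squares."""
    log_v = np.log(np.asarray(volume, dtype=float))
    log_p = np.log(np.asarray(population, dtype=float))
    # polyfit returns the slope first, then the intercept, for deg=1.
    beta, log_alpha = np.polyfit(log_v, log_p, deg=1)
    return np.exp(log_alpha), beta


# Hypothetical LiDAR returns for one building (elevations in metres).
roof_z = [10.0, 12.0, 11.0]      # points inside the footprint
ground_z = [1.0, 1.0, 2.0]       # points inside the 1 m buffer
example_volume = building_volume(roof_z, ground_z, footprint_area=100.0)

# Synthetic block-level data generated from P = 2 * V**0.8 with no noise.
block_volume = np.array([500.0, 1000.0, 5000.0, 20000.0])
block_population = 2.0 * block_volume ** 0.8
alpha_hat, beta_hat = fit_allometric(block_volume, block_population)
```

With the synthetic inputs, `example_volume` is 100 m² x (11 m - 1 m) = 1000 m³, and the fit recovers the generating coefficients (alpha of about 2, beta of about 0.8) because the data contain no noise. Real block data would scatter around the fitted line.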
[Map panels: 2002; % WNV deaths in 2003; % WNV deaths in 2004]

Challenges of spatial statistics in analyzing WNV
What are the issues/problems?
• Predicting where it will spread/occur.
• Calculating the correct margin of error for predicting its occurrence when nearby values are similar (i.e., related).
Why do they need to be resolved?
• People are dying.
How are these issues being addressed?
• Specifying correct spatial statistical models.

[Figure: scatterplots of observed versus predicted values]
Surprising spatial filter result: a jump to California

A Predictive Terrestrial Clutter Model for Ground-to-Ground Automated Target Detection Applications
By Gene A. Feighny
Ph.D. dissertation, UT-Dallas 2010
Adviser: Dr. Denis Dean
(currently Senior Research Engineer, E-Systems Inc.)

Problem Statement and Objective
• Automated target detection (ATD) algorithms are important for both military and civilian use
– Identify an “object of interest”: tank, plane wreck, “suspicious” package or person
• How do we separate the “object” from the “background clutter”?
• Clutter has consistent characteristics
– Identify those characteristics
• Object will have different characteristics
– It will “stand out”
• Therefore we need to identify the characteristics of clutter

[Figure: two scenes that obviously have different clutter characteristics]
• What are some of the characteristics of clutter?
– Degree of spatial clustering at various distances
• How do we measure this?
– Ripley’s K function

URBAN FOREST INVENTORY USING AIRBORNE LIDAR DATA AND HYPERSPECTRAL IMAGERY
by Caiyun Zhang
Ph.D. dissertation, UT-Dallas 2010
Adviser: Dr. Fang Qiu
(currently Assistant Professor, Florida Atlantic University)

Research Objectives
1. Develop a relatively simple and robust algorithm to isolate individual trees using LiDAR vector point cloud data.
2. Estimate single tree metrics such as tree heights, tree distributions, stem density, crown diameters, crown depths, and base heights, from original LiDAR vector data.
3. Develop a neural network based approach to identifying tree species at the individual tree level using the detailed spectral information derived from high spatial resolution hyperspectral images.
4. Produce urban forest 3-D scenes by constructing 3-D tree visualization models using the LiDAR derived information.
5. Map urban forests at the individual tree level using state-of-the-art geographic information system (GIS) techniques.

Point pattern analysis was one of the many techniques used to meet these objectives.

Lidar produces a 3-D “point cloud”
• Various cluster analysis techniques are used to identify different objects

Turtle Creek, Dallas: Lidar data (laser derived elevations) identifies trees
[Figure legend: ground points]

Turtle Creek, Dallas: Hyperspectral data (2151 bands) identifies species
[Figure legend: ground points]
• Accuracy doubled from existing methods: 60%-70% versus 30%-40%
• One research question to explore is whether or not tree species cluster
– In urban forests: no (for U.S.), because they are planted by people
– In natural forests: yes

3-D forest model based on cluster analysis of Lidar point cloud
• Each tree is identified
• Each tree is modeled independently based on height, crown depth, and crown diameter in 4 directions
[Figure: real trees in 2-D image, labeled with height, crown depth, and crown diameter]

Proposal for Dissertation
Point Cloud Segmentation-based Filtering and Object-based Feature Extraction from Airborne LiDAR Data
Jie Chang
Ph.D. Program in Geospatial Sciences, University of Texas at Dallas
May 3, 2010
Supervising Committee: Dr. Ronald Briggs, Dr. Yongwan Chun, Dr. Denis Dean, Dr. Fang Qiu (Chair)

LiDAR Characteristics
– 3D remote sensing
– Direct 3D position measurements
– Very good vertical accuracy
– Capable of capturing multiple returns and intensity values from different parts of objects
– Capable of penetrating openings in tree canopies and measuring ground elevation

Aerial Photo (0.3 m, True Color)
How do we identify each house and each tree?
Constrained 3D K Mutual Nearest Neighborhood Point Segmentation Algorithm

Incorporating Time And Daily Activities Into An Analysis Of Urban Violent Crime
Or
Measuring Crime Rates Realistically
Janis Schubert
Ph.D. dissertation, University of Texas at Dallas, 2009
Adviser: Dr. Dan Griffith
(currently Senior Research Scientist, Critical Infrastructure Protection Program, Los Alamos National Laboratory)

Night time population density
• This is what the US Census reports.

Daily Change in Population Density
• Crime statistics invariably use the residential (night time) population when calculating rates.
• But the geographic distribution of population varies substantially during any 24 hour period as people go about their daily business (work, shop, play, etc.).
[Map panels: 10am-4pm; 10pm-4am]

Day/Night Aggravated Assault Rates
• Uses a simulation model of daily traffic flows to estimate population at each location at different times of the day
• Then uses crime counts for the same locations and time periods to re-calculate crime rates

Application of GIS in Law Enforcement
Peter V. Pennesi
Crime Analyst, Plano Police Department
MGIS Graduate, UT-Dallas
Enhancing Public Service with Locational Awareness

Selected Law Enforcement Areas of Interest For GIS Researchers and Developers
• Do home addresses of registered sex offenders cluster? Where are these clusters? (I don’t want to live there!)
• Where are the hotspots for automobile accidents? Avoid these intersections! Can we redesign them?
• Hotspot street segments for crime. Police these streets!

Enhancing Business with Location Intelligence
Wayne Geary
Staubach Companies: Advisers and Analysts for the Real Estate Industry
• Site selection
• Geographies of opportunity
• Leads to a real estate solution

An Automated System For Image-to-Vector Georeferencing
Yan Li Ph.D.
dissertation, UT-Dallas 2009
Adviser: Dr. Ronald Briggs
(currently GIS Data Base Manager, City of Dallas, Tx.)

Finding the location and appropriate transformation to position and align an image at its true world location
• Image is distorted and its location is unknown
• Where in the world is this image?
• Reference data: City of Dallas Street Centerline file, 68,000 street segments

The Problem
The current way of georeferencing:
– Manually create a set of control point pairs (CPPs) linking the raster image and a reference map
– Difficult, time consuming, tedious, inaccurate, inconsistent
– Often impossible to find locations without prior knowledge, either of the image’s approximate location or of the region by the operator
• An automated solution is highly desirable
(GeoInfo 2010, Dr. Yan Li & Dr. Ron Briggs)

Automated Approach
Inputs:
– An unknown distorted image, from which Image Point Set R is extracted
– An arbitrarily large reference road network (the vector base), which provides Vector Point Set V
Steps:
1. Automatic feature extraction
2. Automatic feature matching
3. Optimize transformation result
(Go Home China Project, June 2010, Dr. Yan Li & Dr. Ron Briggs)

Methodology
• Searches for similar patterns of road intersections: these must be invariant to the underlying transformation
[Figure: intersection point a0 with vectors v0, v1, v2, vi on X-Y axes]
• For a similarity transformation, angles are preserved and distances between points stay proportional
• For an affine transformation, the ratio of the areas of triangles between intersections is a constant

Photorealistic Modeling of Geological Formations
Mohammed Alfarhan
Ph.D. dissertation, UT-Dallas 2010
Adviser: Dr. Carlos Aiken
(currently faculty member, King Saud University, Saudi Arabia)

GeoAnalysis Tool with Surface Extrusions
Not just a movie!
It’s a model of the formation from which measurements can be made
Display and measurements using ArcGIS/ArcMap

Articles in Chinese
• He and Pan, Geographical Concentration and Agglomeration of Industries, Progress in Geography, Vol. 26, No. 2, 2007, pp 1-13
– Uses Ripley’s K-function
• Wei, Zhang and Chen, Study on Construction Land Distribution using Spatial Autocorrelation Analysis, Progress in Geography, Vol. 26, No. 3, 2007, pp 1-17
– Uses Moran’s I

I have really enjoyed being here. I hope that you have learned some new and useful things!
briggs@utdallas.edu
www.utdallas.edu/~briggs
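
As a closing illustration of the Moran’s I statistic named in the article list above, here is a minimal sketch of the global statistic in its usual form, $I = \frac{n}{S_0} \frac{\sum_i \sum_j w_{ij} z_i z_j}{\sum_i z_i^2}$, where $z_i$ is the deviation of area $i$ from the mean and $S_0$ is the sum of all weights. It assumes NumPy and a small binary contiguity matrix. The function name `morans_i` and the four-area example are hypothetical and are not taken from any of the studies in this deck.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for one variable and a spatial weights matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()          # deviations from the mean
    s0 = w.sum()              # sum of all spatial weights
    return (len(x) / s0) * (z @ w @ z) / (z @ z)


# Four areas in a row; each area neighbours the one next to it
# (binary contiguity weights, zero on the diagonal).
w_chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

i_trend = morans_i([1, 2, 3, 4], w_chain)          # similar values are adjacent
i_alternating = morans_i([1, -1, 1, -1], w_chain)  # dissimilar values are adjacent
```

Here `i_trend` is 1/3 (positive spatial autocorrelation) and `i_alternating` is -1 (negative spatial autocorrelation). A value near the expectation of -1/(n-1) would indicate no spatial pattern. Judging significance needs a permutation or normal-approximation test, which this sketch leaves out.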