Change Detection: An Interdisciplinary Investigation Across Climate Sc., Computer Sc./Eng., Statistics, & Remote Sensing

Snigdhansu Chatterjee, Abdollah Homaifar, Shashi Shekhar
Students: Zhe Jiang, Keith Harding, Mohammad Gorji Sefidmazgi, Ansu’s student, Lian Rampi, Xun Zhou
Joseph F. Knight, Stefan Liess, Peter K. Snyder

On-site review of NSF Expeditions in Computing: Understanding Climate Change: A Data Driven Approach. Minneapolis, MN, Oct. 16, 2012
Sponsor: NSF CISE?/EIA?
Slide 1 Oct 16, 2012

Change Detection Questions in Climate Sc.
• Sahel: Characterize the spatial extent of the Sahel over time
  • How does one define Savanna using remotely sensed data?
  • Identify an appropriate variable to detect the Sahel (and droughts) from among precipitation, soil moisture, vegetation, water supplies, etc.
  • How does one efficiently find the Sahel footprint given a Savanna definition?
  • How will the statistical distribution of the top k-percentile change?
• Regimes:
  • How does one efficiently detect an interesting interval in a time series?
  • How does one detect persistent regime intervals in a collection of time series?
Slide 2

Contributions to Computer Sc./Eng. & Statistics
• Statistics:
  • Optimally detect change in multiple climate characteristics, their statistics, and the relationships among these characteristics and variables
  • Quantify the uncertainty and confidence in change detection with incomplete and spatio-temporally dependent data
• Computer Sc./Eng.:
  • Efficiently discover interesting sub-paths from spatio-temporal (ST) datasets: a Sub-path Enumeration and Pruning (SEP) approach
  • Spatial Decision Tree Learning algorithm (global spatial autocorrelation)
  • Finding common intervals of change among time series (need name of the approach/algorithm from Abbie’s group)
Slide 3

Computer Sc. Problem: Interesting Sub-path Query (ISQ)
• Input
  • An interest measure and thresholds.
  • A path and its attribute
• Output
  • All dominant interesting sub-paths
• Constraints
  • Correctness & completeness
  • Automation & scalability
Example (interest measure: average change (slope) ≥ 3.5):
  Unit interval | 1-2 | 2-3 | 3-4 | 4-5 | 5-6 | 6-7 | 7-8 | 8-9 | 9-10 | 10-11 | 11-12
  Change        |   7 |  -6 |   1 |  -1 |   5 |   5 |   4 |  -3 |    5 |     5 |   -11
  Output: [1,2] (slope = 7), [5,11] (slope = 3.5)
Slide 4

Computational Structure & A Naive Algorithm
• Naive approach (O(n^4) in worst case):
  • Phase 1: Collect qualifying sub-paths
    • For each possible sub-path, evaluate the interest measure
  • Phase 2: Identify dominant sub-paths by comparing pairs of qualifying sub-paths
• Will Dynamic Programming reduce computational cost?
[Figure: grid of start location vs. end location (1-12); legend: examined interval, skipped interval, invalid interval, dominated interesting sub-path, dominant interesting sub-path]
Slide 5

Why is the ISQ Problem Hard?
• Concept Definitions
  • Sahel footprint: rectangle or irregular polygon
  • Interest measure: characterize the Sahel signature in remotely sensed data
• Large Data Volume and Computations
  • Trillion computations per time step for GIMMS/MODIS (resolution 0.07 degree)
  • Thousand time steps per variable
• Non-monotonic Interest Measure
  • Example: Average Slope (AS)
  • AS(interval) does not bound AS(sub-interval)
• Dynamic programming principle violated
  • Lack of (optimal) sub-structure
Slide 6

Computer Sc.
Contributions for ISQ Problem
• Formalize the Interesting (change) Sub-path Query problem
• Characterize its computational structure
• A novel algorithm: Sub-path Enumeration and Pruning (SEP)
• Evaluation
  • Cost model
  • Computational experiments
  • Case study with eco-climate data
Slide 7

Related Work, Its Limitations, Novelty of Our Approach
• Interesting sub-region query
• Change-points, e.g., CUSUM [3] [6]
• Sub-paths, e.g., SEP (Our Work): [1,2], [5,11]
• Sub-regions (Future Work)
• CUSUM score: S_0 = 0, S_{n+1} = max(0, S_n + x_n - θ_n)
  • Here θ is chosen to be the mean of the data
[Figure labels: change; below mean; above mean]
Slide 8

The SEP approach
• Insight 1: The interest measure is an algebraic function
• Insight 2: Dominance imposes a partial order among sub-paths
• Insight 3: The partial order is a grid-based DAG (G-DAG)
• Better way to traverse the G-DAG? BFS? DFS (preorder)? DFS (postorder)?
• AVG = SUM/COUNT
  - Build a lookup table for SUM and COUNT
  - Pre-compute in O(n), access in O(1)
• Row-wise: scan each row, stop when a pattern is found
• Top-down: smart BFS over the G-DAG
  - A node has 2 parents: a pruned node may reappear!
  - No phase 2 needed; more space needed for recording
Lookup table:
  Sub-path | Cnt | SUM
  1-2      |   1 |   7
  1-3      |   2 |   1
  1-4      |   3 |   2
  1-5      |   4 |   1
  1-6      |   5 |   6
  1-7      |   6 |  11
  1-8      |   7 |  15
  1-9      |   8 |  12
  1-10     |   9 |  17
  1-11     |  10 |  22
  1-12     |  11 |  11
[Figure: Grid-based Directed Acyclic Graph (G-DAG) over start location and end location (1-12), with traversal direction; nodes run from 1-12 down to the unit intervals 1-2, 2-3, ..., 11-12; sub-paths 1-2 and 5-11 are labelled]
Slide 9

A Comparison of Techniques for Traversing G-DAG
  Technique                      | A: Redundant leaf visits | B: Unnecessary dominated non-leaf visits | C: Memory needs to avoid B
  DFS (w/o pruning)              | Yes                      | Yes                                      | O(n)
  BFS (w/o pruning)              | Yes                      | Yes                                      | O(n)
  BFS (w/ leaf scan and pruning) | None                     | None                                     | O(n^2)
  SEP pruning border approach    | None                     | None                                     | O(1)
Slide 10

Generalizable contribution to computer science
• New graph traversal order (for G-DAG)
• Can benefit many other problems for scaling up to larger datasets
  • Space (e.g., spatial field data)
  • Time (e.g., time series)
  • Space-time (e.g., Lagrangian path?)
  • Trajectories
  • Hui’s paper (see if it applies)
• Space-filling curves are designed for traversing planar space, not graphs
  • Hilbert
[Figure: Hilbert curve (source: Wikipedia)]
Slide 11

Theoretical and Experimental Evaluations of SEP
• Theoretical Evaluation:
  • SEP is correct and complete
  • Correct: all the reported sub-paths are qualifying dominant sub-paths
  • Complete: all the dominant interesting sub-paths are reported
• Experimental Evaluation*:
  • SEP is orders of magnitude faster than the competition
  • SEP top-down is faster for longer patterns
  • SEP row-wise is faster for shorter patterns
[Figure panels: Case 1: short patterns (PLR** = 0.1); Case 2: long patterns; Case 3: PLR = 1; row-wise vs. top-down]
* Synthetic dataset: length 10k-50k; unit differences follow a Gaussian distribution. Code in Matlab.
** Pattern Length Ratio (PLR) is the length of the longest interesting sub-path divided by the length of the entire path, between 0 and 1.
Slide 12

Case Study (1)
• Data: Vegetation data (in NDVI) by GIMMS [4], Africa, August 1981. Resolution: 8 km. Smoothed within 1x1 degree.
• Path: along each longitude (south to north)
• Interest measure: (Slope) Sameness Degree (SD), defined from AVG{∆} and AVG_{≥α}{∆}, where ∆ is the unit slope
• Thresholds: α = 20% percentile, SD ≥ 0.5
Slide 13

Case Study (2)
• The Sahara desert is growing towards the south
• What is the spatial pattern of the Sahel over time?
• Time: August, 1982-1985, 1990, 2000
Slide 14

(Path to) Contribution to Climate Science
• Current
  • Identify the spatial extent of the Sahel and its change over time.
  • Characterize existing land cover/use applicable to climate studies (e.g., savanna)
• Near Future: Understand Sahel Drought Occurrences
  • Attribution: human influence vs. natural processes
  • Changes in intensity, location, frequency
  • Tele-connections
  • Predict future changes using projected climate information (CMIP5)
  • How is regional climate changing (e.g., moisture content, evapo-transpiration, boundary layer energetics)?
  • Characterizing changes in the general circulation and its effect on extreme events; detecting changes in Rossby wave amplitude and wave number
• Long Term
  • Improve vegetation representation in climate simulations
Slide 15

Future research directions in Computer Sc. & Statistics
• Computer science directions
  • Exploring two-dimensional change patterns
    • Two-dimensional transitional zone (e.g., rectangle)
    • Arbitrary change direction
  • Exploring three-dimensional change patterns
    • Space-time change zone
  • Reduce memory needs of the SEP algorithm
  • Spatial Decision Tree Learning algorithm + local autocorrelation (from Zhe)
• Statistics future directions
  • Needs input from Ansu
Slide 16

List of Publications and References
Contributors’ Publications:
[1] Xun Zhou, Shashi Shekhar, Pradeep Mohan, Stefan Liess, Peter K. Snyder. Discovering interesting sub-paths in spatiotemporal datasets: a summary of results. GIS 2011: 44-53.
[2] Need publications from Ansu, Abby, and Joe’s group
References:
[3] E. Page. Continuous inspection schemes.
    Biometrika, 41(1/2):100-115, 1954.
[4] C. J. Tucker, J. E. Pinzon, M. E. Brown. Global inventory modeling and mapping studies. Global Land Cover Facility, University of Maryland, College Park, Maryland, 1981-2006.
[5] Needs references from Ansu, Abby, and Joe’s group
Slide 17

Backup Slides Start here
Slide 18

Slide 19

Traversal order on the G-DAG (Top-down/smart BFS)
[Figure: Grid-based Directed Acyclic Graph (G-DAG) with nodes 1-12, 5-11, and 1-2 labelled]
Slide 20

Traversal order on the G-DAG (Pruning border/smart DFS)
[Figure: Grid-based Directed Acyclic Graph (G-DAG) with nodes 1-12, 5-11, and 1-2 labelled]
Slide 21

General contribution to computer science
1. New graph traversal order
2. Can benefit many other problems for scaling up to larger datasets
   1. Space
   2. Time
   3. Space-time
   4. Trajectories
   5. Hui’s paper (see if it applies)
3. Space-filling curves are for space, not for graph space
   1. + pictures of Hilbert
Slide 22

What is a drought?
• A period of unusually persistent dry weather that persists long enough to cause serious problems such as crop damage and/or water supply shortages
• Four different ways to define drought
  • Meteorological: a measure of departure of precipitation from normal. Due to climatic differences, what might be considered a drought in one location of the country may not be a drought in another location.
  • Agricultural: refers to a situation where the amount of moisture in the soil no longer meets the needs of a particular crop.
  • Hydrological: occurs when surface and subsurface water supplies are below normal.
  • Socioeconomic: refers to the situation that occurs when physical water shortages begin to affect people.
Source: NOAA http://www.wrh.noaa.gov/fgz/science/drought.php?wfo=fgz
Slide 23

Desertification (1)
• The Sahel is the transition zone between the desert and savannas.
• The Arabic word Sahel means shore (coastline of the Sahara desert).
• Sahel droughts have occurred numerous times over the centuries, including 2012, 2010, 1984-85 (Ethiopia), 1968-73, 1940s, 1910s, 1898, etc.
• Possible correlates include AMO, global warming/dimming, solar (89-120 years) Wolf-Gleissberg cycles, overgrazing/deforestation, land management practices, ...
• The UN Convention to Combat Desertification shows a map of areas at high risk of desertification. This map looks very similar to the map produced in our case study with vegetation data. http://en.wikipedia.org/wiki/Desertification
• Desertification is the process of fertile land transforming into desert, typically as a result of deforestation, drought, or improper/inappropriate agriculture.
• A billion people are under threat from further desertification.
• The Sahara is currently expanding southward at 48 km/year.
• Desertification creates increasingly larger empty spaces over a large strip of land, a phenomenon known as "tiger fur pattern".
• Pictorial details of Sahel desertification are at http://oceanworld.tamu.edu/resources/environment-book/desertificationinsahel.html
Slide 24

Desertification (2)
• The current decade (2010-2020) is the UN Decade for Deserts and the Fight against Desertification.
• Last week, Colorado State U hosted a UN meeting on desertification. See http://www.today.colostate.edu/story.aspx?id=4888 It suggests that desertification is a key issue for the US (West, Mid-west).
• A recent paper lists six research priorities, including: increase understanding of the nature, extent, and severity of desertification, drought, and dryland degradation, and develop more effective ways to measure and monitor it. See pages 8, 12-13, 25-26 (Dust Bowl), 27-28 (Sahel) of Desertification, Drought, Poverty and Agriculture: Research Lessons and Opportunities, Mark Winslow et al., 2004.
  http://www.iwmi.cgiar.org/Assessment/files/Synthesis/Land%20Degradation/DDPAARLO_text.pdf
• Another report on desertification from 2009-2010 is IDEntifying and Analysing New Issues in Desertification: Research Trends and Research NeedS: http://www.uni-marburg.de/fb02/ike/forschung/projekte/finalreport.pdf
Slide 25
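
Backup sketch: the naive ISQ approach on the Slide 4 example

The two-phase naive approach from Slide 5, combined with the SUM/COUNT lookup-table idea from Slide 9, could be sketched in Python roughly as follows. This is an illustrative sketch only, not the SEP algorithm and not the Matlab code used in the experiments. The function names (prefix_sums, qualifying_subpaths, dominant_subpaths) are hypothetical, and "dominant" is assumed here to mean a qualifying sub-path that is not contained in any other qualifying sub-path.

```python
from itertools import accumulate


def prefix_sums(unit_changes):
    """Lookup table: sums[k] is the total change from location 1 to location k+1."""
    return [0] + list(accumulate(unit_changes))


def qualifying_subpaths(unit_changes, threshold):
    """Phase 1: every sub-path [i, j] whose average unit change is >= threshold."""
    sums = prefix_sums(unit_changes)  # built once in O(n), O(1) per lookup
    n = len(sums)                     # number of locations on the path
    found = []
    for i in range(n):
        for j in range(i + 1, n):
            total = sums[j] - sums[i]       # SUM over the sub-path
            count = j - i                   # COUNT of unit intervals
            if total >= threshold * count:  # same test as total / count >= threshold
                found.append((i + 1, j + 1))  # report 1-based locations
    return found


def dominant_subpaths(qualifying):
    """Phase 2 (assumed dominance rule): drop sub-paths contained in another one."""
    def contains(outer, inner):
        return outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]

    return [p for p in qualifying
            if not any(contains(q, p) for q in qualifying)]


# Unit changes for intervals 1-2, 2-3, ..., 11-12 from the Slide 4 example.
changes = [7, -6, 1, -1, 5, 5, 4, -3, 5, 5, -11]
result = dominant_subpaths(qualifying_subpaths(changes, 3.5))
print(result)  # [(1, 2), (5, 11)]
```

The pairwise comparison in phase 2 is what drives the worst-case cost of the naive approach; the sketch makes no attempt at the pruning that SEP adds on top of the lookup table.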