Synchrony, Parallelism and Multimodeling
Louis J. Gross
The Institute for Environmental Modeling
Departments of Ecology and Evolutionary Biology and Mathematics
University of Tennessee

Overview
• Synchrony and concurrency in ecology
• Everglades restoration and ATLSS - Tales from the Real World: Mathematics and Computing Meets Greed, Politics, Lawyers and the Army Corps of Engineers
• Computational ecology and some parallelization results
• Distributed, grid-based computing for ecological modeling, providing stakeholders with the capability to investigate their own hypotheses
• Spatial control - some examples
• Some educational issues in computational science

Key Points
• The availability of parallel computing in its many forms offers opportunities to rethink how to model many systems. Accounting for concurrency and the possibility of synchronous processes arising has great potential to change the way that many biological and social systems are modeled, going beyond the serial mindset that underlies much of applied science today. Developing the capability for this will require computational scientists with insight into the phenomena being modeled as well as a deep understanding of parallelism.
• A central question in science is what macroscopic properties arise from the properties of the entities which make up the system, and how these are affected by modifications in the properties of the entities themselves and the interactions between the entities. Parallel computation naturally provides a means to investigate these issues outside of any constraints arising from a limited set of available mathematical approaches.
• Realistic modeling of natural systems requires multiple linked approaches (multimodeling), and new methods are needed to develop and analyze these.
Such multimodels, sometimes called hybrid models, use a mixture of different underlying mathematical or computational approaches, are a reasonable way to analyze multiscale phenomena, and present problems appropriate for coarse parallelization.
• Much of applied ecology deals with problems of spatial control (what to do, where to do it, when to do it, and how to monitor it). These problems are not easily solved, opening up many new, fascinating problems in applied mathematics and computational science. They offer the opportunity to tie simulation methods to one of the most pervasive technological tools in environmental analysis, geographic information systems (GIS). Though readily accepted throughout applied ecology, GIS has had little connection to the system dynamics methods needed for decision support in resource management.

This report notes that Simulation-Based Engineering Science is central to advances in numerous fields, including biomedicine, nanomanufacturing, and energy and environmental sciences.

This report comments that:
• Methods are needed for linking models at various scales and simulating multiphysics phenomena
• The US must be in the forefront of methods to make simulation easier and more reliable
• The simulation methods available cover only limited ranges of spatial and temporal scales, and the principal physics governing events typically changes with scale, so the models themselves must change in structure as the ramifications of events pass from one scale to another
• Simulation-based decision making gives rise to complex optimization problems, which are governed by large-scale simulations
• Despite arguing that new thinking on how to model events at multiple scales is required to alleviate the “tyranny of scales”, the report says nothing about the potential for parallelization methods in this

Why is there not more emphasis on the role of parallelization in driving innovative modeling approaches?
“there are so many important computational problems that are, besides the issue of efficiency, much more elegantly solved in parallel, and in particular can be naturally mapped onto an architecture with multiple processors. However, it is chiefly the performance, and among various performance metrics, primarily the speed of execution, that have been the dominant driving forces behind the quest for massive parallelism”
Predrag T. Tosic, ACM 2004

So parallelization is still viewed as primarily for speedup, not for reconceptualizing the underlying model.

This used spatial grid partitioning and MPI. A key lesson from this effort was the need to rethink the rules of movement and interaction of deer and panther from the serial implementation to allow for concurrent actions. The results were NOT the same as those for the serial implementation, and could be argued to be more realistic, as the parallel implementation accounted for interactions within a model time step in a way that accords better with field biology.

Wet Season: May-October
Dry Season: November-April
Photos: South Florida Water Management District

Collaborators: Don DeAngelis, Rene Salinas, Holly Gaff, Jon Cline, Mark Palmer, Michael Peek, Scott Duke-Sylvester, Jane Comiskey, Eric Carr, Paul Wetzel, Brian Beckage, numerous field biologists
Computational facilities provided by the NSF-supported Scalable Intracampus Research Grid (SInRG)
www.cs.utk.edu/sinrg
www.tiem.utk.edu

Everglades natural system management requires decisions over short time periods about what water flows to allow where, and over longer planning horizons about how to modify the control structures so that appropriate controls can be applied. This is very difficult!
• The control objectives are unclear and differ with different stakeholders.
• Natural system components are poorly understood.
• The scales of operation of the physical system models are coarse.

So what have we done?
Developed a multimodel (ATLSS - Across Trophic Level System Simulation) to link the physical and biotic components.
Compare the dynamic impacts of alternative hydrologic plans on various biotic components spatially.
Let different stakeholders make their own assessments of the appropriate ranking of alternatives.
http://atlss.org

[Diagram: ATLSS model structure, © TIEM / University of Tennessee 1999. Labels in extraction order: Individual-Based Models; Age/Size Structured Models; Cape Sable Seaside Sparrow; Snail Kite; White-tailed Deer; Wading Birds; Florida Panther; Fish Functional Groups; Alligators; Radio-telemetry Tracking Tools; Reptiles and Amphibians; Linked Cell Models; Lower Trophic Level Components; Vegetation; Process Models; Spatially-Explicit Species Index Models; Cape Sable Seaside Sparrow; Long-legged Wading Birds; Short-legged Wading Birds; Snail Kite; Abiotic Conditions Models; High Resolution Topography; High Resolution Hydrology; White-tailed Deer; Alligators; Disturbance]

ATLSS High Resolution Topography
* The High Resolution Topography model provides more detail about local variation in elevation.
* The detail captures variation in elevation due to important features such as tree islands.
[Figure panels: High Resolution Topography; Water Management Model Topography]

ATLSS High Resolution Hydrology
* With the High Resolution Topography, High Resolution Hydrology values can be created from the SFWMD hydrology.
* Hydrology values created in this way provide the spatial variation and resolution required to model the dynamics of many animal populations in South Florida.
[Figure panels: High Resolution Hydrology; SFWMD Hydrology. Scale bars: 4 miles]

How High Resolution Topography Is Made
[Figure: Habitat cover map, provided by the Florida GAP analysis. Scale bar: 4 miles]
At each location in the Florida GAP map, the model predicts a ground surface which is higher or lower than the base ground surface. The offset is derived from the hydroperiod of the cell, as given by the SFWMD hydrology data, and the estimated hydroperiod for the habitat type at that location.
The total volume of water predicted by the SFWMD model in each grid cell is preserved in the High Resolution Hydrology Model.

Estimates of hydroperiod for each habitat type in the Florida GAP analysis map.
[Table: columns Class, MinHp, Max HP; classes 0 to 7. Values in extraction order: 0 365 45 180 30 15 40 45 0 365 10 60 0 0 10 0 ….]

A hydroperiod curve for each location on the map showing the number of days the water surface was at or above each elevation. This curve is generated from the Calibration/Validation run of the SFWMD hydrology model.

Spatially-Explicit Species Index (SESI) Models
These are designed as extensions of habitat suitability index models, to provide yearly assessments of the effects of within-year and between-year hydrology variation on basic requirements for foraging and breeding in a spatially-explicit manner. They allow comparisons of alternative scenarios, and allow different stakeholders to focus on their own criteria.

ATLSS SESI Models
Implement and Execute the Models for a Hydrology Scenario
Objectives: Integrate SESI components into a cohesive computational framework and apply the models to a hydrology scenario.

Hydrology Scenario
Daily Water Depth
Distribute water over high resolution topography
High Resolution Hydrology
SESI Models
• Cape Sable Seaside Sparrow: Are the nests flooded during egg incubation?
• Snail Kite: Are conditions favorable for the apple snails they depend on?
• Wading Birds: Are water depths in the correct range for the fish they eat?
• White-tailed Deer: Is breeding disrupted by high water levels?
• American Alligator: Is there high ground to build a nest on?
Standard Output Generation/Visualization Tools

SESI Output for Long-Legged Wading Birds in N.
Taylor Slough: For 1993

[Figure: Long-Legged Wading Bird SESI Index - WCA-2B Subregion, Comparing F2050 (blue) with D13R (red). x-axis: Year (from 1965), 1 to 33; y-axis: Foraging Index, 0.0000 to 0.3500]

ATLSS Fish Functional Group Dynamics Model
Fish biomass is one of the most important components of the Everglades system. To produce projections of fish biomass, ATLSS uses a spatially explicit, size-structured dynamic simulation model, ALFISH.
ALFISH simulates the number, size structure and biomass densities of “small fish” and “large fish” functional groups in the freshwater marsh on 5-day time steps. This represents the temporally and spatially varying food base for wading birds.
ALFISH has been evaluated through comparisons to some sites in Shark Slough and WCA3.

ALFISH Objectives
• Provide estimates of the effects of alternative water management scenarios on the spatial and temporal distribution of food resources for upper trophic level consumers (wading birds).
• Provide a method to evaluate the hypothesized impact of hydrologic changes on fish community composition.

ATLSS Landscape Fish Model
Holly Gaff, Rene Salinas, Louis Gross, Don DeAngelis, Joel Trexler, Bill Loftus and John Chick

Approach
A size-structured population model for fish functional groups (large and small fish) that operates on a spatial cell basis with movement between cells and between habitats within cells.
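
As a rough illustration of the approach just described, the Python sketch below advances a toy size-structured fish model on a row of spatial cells. It is not ALFISH. The function name `step`, the one-dimensional layout, and every rate (`grow`, `survive`, `move`) are assumptions made for this example. Only the general ingredients come from the slides: size classes, movement between cells from high to low fish density, and marsh cells that periodically dry. Movement between habitats within a cell is left out. The movement step is computed from a snapshot of all cells at once, so the result does not depend on the order in which cells are visited.

```python
import numpy as np


def step(fish, depth, grow=0.2, survive=0.95, move=0.1):
    """Advance a toy size-structured fish model by one time step.

    fish  : array (n_cells, n_sizes), fish density per cell and size class
    depth : array (n_cells,), water depth; cells with depth <= 0 are dry
    grow, survive, move : illustrative placeholder rates (hypothetical,
                          not ATLSS/ALFISH parameters)
    """
    fish = fish * survive  # background mortality; also copies the input

    # Growth: a fraction of each size class advances to the next class.
    advancing = grow * fish[:, :-1]
    fish[:, :-1] -= advancing
    fish[:, 1:] += advancing

    # Movement between neighbouring wet cells, from high to low density.
    # All flows are computed from the same snapshot (synchronous update).
    wet = depth > 0
    open_edge = wet[:-1] & wet[1:]           # both neighbours must be wet
    flow = 0.5 * move * (fish[:-1] - fish[1:]) * open_edge[:, None]
    fish[:-1] -= flow                        # positive flow leaves cell i
    fish[1:] += flow                         # and enters cell i + 1

    # Fish left in dry cells are lost.
    fish[~wet] = 0.0
    return fish


if __name__ == "__main__":
    fish = np.array([[10.0, 0.0], [0.0, 0.0], [4.0, 2.0]])
    depth = np.array([0.5, 0.3, 0.0])        # third cell has dried out
    print(step(fish, depth))
```

Because the flows are taken from a single snapshot, the same update could be split across spatial partitions with only boundary cells exchanged between them, which is the kind of grid partitioning listed among the parallelizations investigated for the fish model.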
ALFISH FLOW CHART
[Figure: ALFISH flow chart]

Fish Cell Layout
Pond areas assumed permanently wet, marsh areas periodically dry

Example of Small Fish
Least Killifish, Heterandria formosa (Female, Male)

Landscape Layout and Movement
White - movement from low water to high water areas
Red - movement from high fish density to low density areas

Fish as Prey
Fish provide the prey-base for endangered wading bird species such as Great Egret (Casmerodius albus)

ALFISH MODEL EXAMPLE RESULTS - Alt D13r4 compared to F2050Base
[Figure panels: Fish Available as Prey during a Typical Rainfall Year; Fish Available as Prey during a High Rainfall Year; Average Fish Available as Prey from 1965 - 1995; Fish Available as Prey during a Low Rainfall Year; Distribution of Sizes for Fish in WCA 3A; Total Fish Densities through 31-year Model Run; Average Fish Available as Prey during Breeding Season for Wading Birds; Total Fish Densities for Certain Years in Given Areas]

Parallelizations for Everglades Fish model investigated
• Comparison of serial version to grid partitioning by region to analyze impacts of compartmentalization
• Comparisons of MPI methods on clusters and SMP
• Analysis of dynamic load balancing with row-stripe partitioning on SMP
• Comparison of alternative MPI and multithread (Pthread) implementations
• Comparisons of parallelization by component structure (age classes) to spatial grid partitioning using MPI and Pthread implementations
• Multiple model implementation combining Fish model with Wading Bird model

ATLSS grid-service module
ATLSS Model Interface
Wang et al. 2005. A grid service module for natural resource managers. IEEE Internet Computing 9:35-41

[Diagram: layered architecture. Labels in extraction order: Information Analysis/Control and Data Representation Layer; Ecological Models Components Layer; Biotic component; Ecological Modeling Oriented Data Assimilation Layer; Spatial Information (GIS, topology, etc); External Models (hydrology, climate, etc); Abiotic component]
GEM: Grid-based Ecological Modeling, http://www.tiem.utk.edu/gem
Wang, D., M. W. Berry, E. A. Carr, L. J.
Gross. Towards Ecosystem Modeling on Computing Grids, Computing in Science and Engineering Vol. 13, No. 1, pp. 55-76, 2005

If space is the final frontier, then spatial control theory sets our course to apply our ecological understanding of spatial effects to many practical problems in applied ecology.
Supported by NSF Awards DMS-0110920, DEB-0219269 and IIS-0427471

What is spatial control?
What do we do? How do we do it? Where do we do it? How do we assess/monitor to determine success?

Why is spatial control important?
Much of applied ecology involves questions for which spatial control is required.
• Harvesting
• Reserve Design
• Water planning
• Intercropping
These problems offer us the opportunity to demonstrate the utility of computing in very practical situations, and to link together models with the GIS and decision support tools that natural system managers and policy-makers need.

Example problems in spatial control
– The ATLSS project and Everglades restoration
– Black bears (Salinas, Lenhart)
  • Metapopulation approach and human-bear interactions
  • Reserves and individual-based models
– Invasives - Lygodium microphyllum (Duke-Sylvester)
– Invasive control of foci vs outliers (Whittle, Lenhart)
– Control of integro-difference equation models (Lenhart, Joshi, Gaff, Whittle)
– Fisheries harvesting (Ding, Lenhart)
– Tick-borne disease control (Gaff)
– Control theory and intercropping (Lenhart, Joshi)
– Managing antibiotic resistance (Duke-Sylvester)
– Wildfire control and optimization (Bains, Berry, Shaw)

American Black Bear (Ursus americanus)
Rene Salinas and Suzanne Lenhart
Salinas, R., S. Lenhart and L. Gross. 2005. Control of a metapopulation harvesting model for black bears. Natural Resource Modeling 18:307-321

Current Black Bear Distribution, Southeastern U.S.
Source: Pelton and van Manen (1994)

Current Issues
• The human population surrounding the Great Smoky Mountains National Park (GSMNP) has also grown over the last 70 years.
• Nuisance bear activity is a major problem all along the Appalachian range.
• With the increase in bear-human encounters, the likelihood of harmful encounters also increases.

[Figure panels: BASE scenario during a good mast year; BASE scenario during a poor mast year; ALT2 scenario during the same poor mast year]

Spatial treatment for control of an invasive - Detection, Mapping & Prediction of Spread of Lygodium microphyllum in Loxahatchee NWR (Scott Duke-Sylvester)

Background About Lygodium
• Old world climbing fern
• Ranges from Africa to SE Asia/Australia (Pemberton et al.)
• Introduced to South Florida prior to 1958 (Nauman and Austin, 1978)
• Negatively impacts both flora and fauna

[Figure panels: SRF Data, 2000 and 2002]

Goals of Modeling
• Provide a method to collect all available data and suggest additional data requirements
• Provide a means to assess the impacts of alternative possible control schemes
• Provide guidance to managers regarding economics of control

Spatial Model Dynamics
\[
\begin{aligned}
P_{x,y}(t+1) &= P_{x,y}(t) - a_0 \sum_{x',y'} k(x,y,x',y')\, I_{x,y}(t)\, P_{x,y}(t) + r R_{x,y}(t) \\
I_{x,y}(t+1) &= I_{x,y}(t) + a_0 \sum_{x',y'} k(x,y,x',y')\, I_{x,y}(t)\, P_{x,y}(t) + a_1 \sum_{x',y'} k(x,y,x',y')\, I_{x,y}(t)\, R_{x,y}(t) - T_{x,y}(t) \\
R_{x,y}(t+1) &= R_{x,y}(t) - a_1 \sum_{x',y'} k(x,y,x',y')\, I_{x,y}(t)\, R_{x,y}(t) + T_{x,y}(t) - r R_{x,y}(t) \\
P_{x,y}(0) &= P^0_{x,y}, \quad I_{x,y}(0) = I^0_{x,y}, \quad R_{x,y}(0) = R^0_{x,y} \\
P_{x,y}(t) + I_{x,y}(t) + R_{x,y}(t) &= 1 \quad \forall\, x, y, t
\end{aligned}
\]

Results
• Optimal control with limited resources
[Figure panels: Infected; Treated. Color scale: 0% to 91-100%]

Results
• Optimal control with limited resources
[Figure: Total Treatment Effort]

Some thoughts on educational issues:
• Collaborations between disciplines can be effective at providing a common language for interdisciplinary computational science problems, but cannot be effectively established in a single class or workshop; sustained effort is required for effective collaboration
• The move of computer science programs to Engineering colleges may be effective at encouraging students to develop skills beyond coding, but it is far from clear that computer science units are the most
effective home for new computational science programs
• Far greater exposure of science students at the undergraduate level to simulation methods is necessary, given the importance of simulation across science. This implies a change from a curriculum focusing on scientific computing (e.g. numerical analysis) for these students to one containing scientific simulation. Hosts of good products are available, but many are little used
• The development of curricula which encourage new uses of parallel computing beyond simply its potential for speedup should be supported
• Enhanced graduate student use of computational science can arise from development of more of an “outreach”, service orientation from those units at universities focused on high performance computing