Computational Sustainability Carla P. Gomes Institute for Computational Sustainability Computing and Information Science Cornell University Support: Expeditions in Computing (CISE) February 2011 The Institute for Computational Sustainability (ICS) Research Team ICS members and collaborators for their many contributions towards the development of a vision for research, education, and outreach activities in the new area of Computational Sustainability Expeditions in Computing (CISE) Support: 29 graduate students 24 undergrad. students Background and Motivation: Perspective on the Future of Computer Science Computer Science is a fast moving discipline, and it is becoming crucial to other fields. “Computational Thinking” is permeating and shaping so many areas and disciplines. I - Foundations up to 1985 Early days computer science and engineering focused on making computers more efficient, faster, easier to program, and better at communicating. Topics: Hardware Architecture Operating systems Networks and Internet Programming Languages Compilers, Theory of Computation: Algorithms and Complexity, etc Many challenges remain in these areas and much work is required to ensure continued advances in computational power , data, sensing, and communication networks. Our computing and data infra-structure will be 100x to 1000x more powerful by 2030. Our current setup will look like a good “calculator” with a primitive data network. 5 With developments in these areas, we can now create highly complex systems, with software with millions of lines of code running on hundreds of networked machines, that is fast, reliable and (reasonably) secure. We can control: Huge supply chains (eg Amazon) Search engines (eg Google), Financial , data, and communication networks Sophisticated Medical Equipment, and almost any complex artifact. Average household has lots of processors embedded. We’ll soon have produced more transistors than grains of sand in the world! II- Broadening the Field 1986 to 2000 (and beyond) Key goal: making computers smarter, and easier to interact with. New applications and subfields emerged, such as: Information retrieval (Search engines), Computer graphics, AI (machine learning, computer vision, natural language understanding), and Robotics. These trends will continue with exciting things coming up within the next 10 to 20 years. For example Smart buildings, Self-driving cars, Simultaneous automatic translation, Household robots, Robotic remote surgery, … III Reaching Out 2001 to current Computer Science is reaching out to other disciplines. Basic theme: infuse computational thinking into other disciplines. Tremendous opportunities: Any scientist can now have a local 10 TB disk, storing the equivalent of 10 million books, right at his or her desk, with sophisticated compute power . Explosion and proliferation of a variety of sensing devises with advanced embedding and networking capabilities We see some of the first results in e.g.: biology (e.g. genetics) social sciences (eg social network analysis), and linguistics (eg evolution of word usage). A New Kind of Science Opportunities go much beyond “just” better and larger scale data processing. “There is the option of training complex computational models with good predictive power but too complex for human comprehension (no explanatory capacity).” Strogatz, NYT 2010; On the unreasonable effectiveness of data, Halevy et al. 2010. A New Kind of Science? In other words, in the future our scientific theories are no longer restricted in complexity by our own cognitive abilities. Rather, given the right amount of data and inference power, a machine can potentially train a highly complex model and use it as a predictive engine. Example: Google’s machine translation. Works well for dozens of languages. No clear linguistic underlying model. Instead, highly complex statistical model, automatically trained on 1+ million examples. Programmers often did not know the languages! Consider providing such computational thinking to many other fields! Examples of new emerging computational fields: Computational biology, Computational Economics, Healthcare Informatics, and Computational Sustainability Pervasive & ubiquitous computing and communications will play a key role in information capture and reasoning, as well as adaption to highly dynamic and uncertain environments I will discuss “Computational Sustainability.” Computational Sustainability Sustainability and Sustainable Development The 1987 UN report, “Our Common Future” (Brundtland Report): Raised serious concerns about the State of the Planet. Introduced the notion of sustainability and sustainable development: Sustainable Development: “development that meets the needs of the present without compromising the ability of future generations to meet their needs.” UN World Commission on Environment and Development,1987. Gro Brundtland Norwegian Prime Minister 14 Chair of WCED Follow-Up Reports: Intergovernmental Panel on Climate Change (IPCC 07) Global Environment Outlook Report (GEO 07) ”There are no major issues raised in Our Common Future for which the foreseeable trends are favourable.” Erosion of Biodiversity Examples: •At the current rates of human destruction of natural ecosystems, 50% of all species of life on earth will be extinct in 100 years. Wilson, E.O. The Future of Life (2002) . Global Warming +130 countries •The biomass of fish is estimated to be 1/10 of what it was 50 years ago and is declining. Worm et al. (2006) [Nobel Prize with Gore 2007] 15 Safe Operating Boundaries: Crucial Biophysical Systems Source: Planetary Boundaries: A Safe Operating Space for Humanity, Nature, 2009 16 Sustainability: Interlinked environment, economic, and social issues Our Common Future recognized that environmental, economic and social issues are interlinked. The economy only exists in the context of societies, and both society and economic activity are constrained by the earth’s natural systems. A secure future depends upon the health of all 3 systems (environment, society, economy). Sustainable Development encompasses balancing environmental, economic, and societal needs. 17 Challenging Sustainability Problems Complex Dynamical Systems Sustainability problems are unique in scale, impact, complexity, and richness: Often involving multiple and highly interconnected components and players, in highly dynamic and uncertain environments. ecosystems.noaa.gov Natural Ecosystem Offer challenges but also opportunities for the advancement of the state of the art in computing and information science. Smart Grid: Complex Digital Ecosystem 18 Computational Sustainability: Vision Computer scientists can — and should — play a key role in increasing the efficiency and effectiveness of the way we manage and allocate our natural resources, while enriching and transforming Computing and Information Science and related disciplines. We need critical mass in the new field of Computational Sustainability!!! 19 Outline I Examples of Computational Sustainability problems highlighting research themes II Our research themes III Building a community in Computational Sustainability IV Conclusions 20 Sample of Computational Sustainability Problems I Environment E.g. Conservation and Biodiversity II Socio-Economic Systems E.g. Poverty mapping and poverty reduction, and harvesting policies. III Energy Fuel distributors E.g. Smart grid, Material discovery and Biofuels Farmers Energy crops Water quality Soil quality Gasoline producers Non-energy crops Food supply Environmental impact Biodiversity Local air pollution Consumers Energy market Social welfare Economic impact 21 Conservation and Biodiversity 22 Combatting Biodiversity Loss: Landscape Connectivity Ideas from circuit theory for mapping critical linkages in complex landscapes Brad McRae et al. Maintaining movement and connectivity across landscapes can help reduce inbreeding, increase genetic diversity, and/or colonize new habitat. Connectivity for mountain in southern California. Blue - low current density (low densities of dispersing mountain lions); Yellow - movement bottlenecks, where connectivity is most vulnerable to habitat destruction. Red - high current flow high priority areas for conservation or restoration. “With limited funding and constant threats of habitat loss how do we choose which habitats to protect so that landscapes will stay well-connected for wild animal species?” Opportunities for new computational models integrating ecological and economic constraints. Combatting Habit Loss and Fragmentation: Wildlife Corridors Wildlife Corridors link core biological areas, allowing animal movement between areas. Typically: low budgets to implement corridors. Example: Goal: preserve grizzly bear populations in the U.S. Northern Rockies by creating wildlife corridors connecting 3 reserves: Yellowstone National Park Glacier Park / Northern Continental Divide Salmon-Selway Ecosystem cost suitability Conservation and Biodiversity: Wildlife Corridors Challenges in Constraint Reasoning and Optimization Wildlife corridor design Computational problem Connection Sub-graph Problem Connection Sub-graph Problem Given a graph G with a set of reserves: Find a sub-graph of G that: Connection Sub-Graph - NP-Hard Worst Case Result --- Real-world problems possess hidden structure that can be exploited allowing scaling up of solutions. contains the reserves; is fully connected; with cost below a given budget; and with maximum utility Interdisciplinary Research Project: Conrad, Dilkina, Gomes, van Hoeve, Sabharwal, Sutter; 2007-2010 25 Minimum Cost Corridor for the Connected Sub-Graph Problem Ignore utilities Min Cost Steiner Tree Problem Fixed parameter tractable – polynomial time solvable for fixed (small) number of terminals or reserves (not used in conservation literature) 50x50 grid 167 Cells $1.3B <1 sec 40x40 grid 242 Cells $891M <1 sec 25x25 grid 570 Cells $449M <1 sec 10x10 grid 3299 Cells $99M 10 mins 25 km2 hex 1288 Cells $7.3M 2 hrs 25 km2 hex Extend with 2xB=$15M 10x in Util What if we were allowed extra budget? Need to solve problems large number of cells! Scalability Issues Models Are Important!!! Single Commodity Flow Quite compact (poly size) Directed Steiner Tree Exponential Number of Constraints ! Captures Better the Connectedness Structure ! Provides good upper bounds (discover good cuts with ML)! Dilkina and Gomes 2010 Science of Computation - Understanding Structure: “Typical” Case Analysis and Identification of Critical Parameters (Synthetic Instances) Runtime How is hardness affected as the budget fraction is varied? Problem evaluated on semi-structured graphs 3 reserves m x m lattice / grid graph with k terminals Inspired by the conservation corridors problem Place a terminal each on top-left and bottom-right Maximizes grid use Place remaining terminals randomly Assign uniform random costs and utilities from {0, 1, …, 10} Runtime for Optimal Solution No reserves: “pure optimization” From 6x6 to 10x10 grid (100 parcels): 1000 instances per data-point; Utility Gap (Optimally Extended Min cost/ Optimal) 3 reserves Real world instance: Glacier Park Corridor for grizzly bears in the Northern Rockies, connecting: Yellowstone Salmon-Selway Ecosystem Glacier Park Salmon-Selway (12788 nodes) Yellowstone Scaling up Solutions by Exploiting Structure •Synthetic generator / Typical Case Analysis •Identification of Tractable Sub-problems •Exploiting structure via Decomposition Static/Dynamic Pruning •Streamlining for Optimization •New Encodings 5 km grid (12788 land parcels): minimum cost solution 5 km grid (12788 land parcels): +1% of min. cost Our approach allows us to handle large problems and reduce corridor cost dramatically (hundreds of millions of dollars) compared to existing approaches while providing guarantees of optimality in terms of utility: Optimal or within 1% of optimality for interesting budget levels. Multiple Species Identification of new problems to address multiple species: E.g. Wolverines Steiner Multigraph Problem Upgrading Shortest Path Problem Dynamics and Game Theory : Grizzly Bear Lynx Collaborators: Michael K. Schwartz USDA Forest Service, Rocky Mountain Research Station Claire Montgomery Oregon State University • Study dynamics of interactions • How to be fair??? • What metrics? 30 Outreach and Education Pedagogical Games Shortest path, Steiner trees, and much more about Computational Sustainability Boynton Middle School Math Day “Edutainment” Video Games for middle school Effort led by David Schneider Lots of undergrad/Meng students Having fun designing games for A Travel Museum on Computational Sustainability 31 Connectivity Problems: Other applications If a person is infected with a disease, who else is likely to be? Network of Pandemic Influenza Robert J. Glass,* Laura M. Glass,† Walter E. Beyeler,* Facebook Network What characterizes the connection between two individuals? The shortest path? Size of the connected component? A “good” connected subgraph? Which people have unexpected ties to any members of a list of other individuals? [Faloutsos, McCurley, Tompkins ’04] and H. Jason Min* 2006 Sensor Web as predicted by Matt Heavner at University of Alaska Southeast . Sensor and Wireless Networks 32 Bird Conservation Information Sciences D. Fink 33 Conservation and Biodiversity: Reserve Design for Bird Conservation Red Cockaded Woodpecker (RCW) is a federally endangered species Current population is estimated to be about 1% of original stable population (~12,000 birds) Conservation Funds manages Palmetto Peartree Preserve (North Carolina) 32 active RCW territories (as of Sept 2008) Goal: Increase RCW population level Management options: Prioritizing land acquisition adjacent to current RCW populations Building artificial cavities Translocation of birds Sheldon, D., Dilkina, B., Ahmadizadeh, K., Elmachtoub, A., Finseth, R., Conrad, J., Gomes, C., Sabharwal, A. Shmoys, D., Amundsen, O., Allen, W., Vaughan, B.; 2009-10 Conservation and Biodiversity: Reserve Design for Bird Conservation Red Cockaded W. Management Decisions: Biological and Ecological Model Land acquisition Artificial cavities Translocation of birds Must explicitly consider interactions between biological/ecological patterns and management decisions Maximizing the Spread of Cascades i pij 1-β j pkj k t=1 t= 2 t=3 Stochastic diffusion model (movement and survival patterns) in RCW populations Stochastic optimization model Decisions: where and when to acquire land parcels Goal: Maximize expected number of surviving RCW Approaches: Sample Average Approximation Computational Challenge: scaling up solutions Upper confidence Bounds for considering a large number of years (100+) decomposition methods and exploiting structure Maximize the spread Influencing Cascades in Networks: Other Examples Maximizing or Minimizing spread Minimize the spread Business Networks: Technology adoption among friends/peers. Social Networks: Spread of rumor/news/articles on Facebook, Twitter, or among blogs/websites. Targeted-actions (e.g. marketing campaigns) can be chosen to maximize the spread of these phenomena. Viral Marketing [Domingos and Richardson, KDD 2001] [Krause and Guestrin 2007] [Kempe, Kleinberg, and Tardos, KDD 2003] Epidemiology: Spread of disease – In human networks, or between networks of households, schools, major cities, etc. – In agriculture settings. Contamination: The spread of toxins / pollutants within water networks. Invasive species Mitigation strategies can be chosen to minimize the spread of such phenomena. Additional Levels of Complexity: Large-Scale Data Modeling, Stochasticity, Uncertainty, • How to estimate species distributions and habitat suitability • Movements and migrations; • Understanding species interaction • Climate change • Stochasticity and Uncertainty • How to get the data? • Other factors (e.g., different models of land conservation (e.g., purchase, conservation easements, auctions) typically over different time periods). 37 How to estimate species distributions? Active research area bringing together machine and statistical learning: Eastern Phoebe Migration Steven Philips, Miro Dudik & Rob Schapire Maxent Information Sciences Kelling D. I. MacKenzie, J. D. Nichols, G. B. Lachman, S. Droege, J. A. Royle, and C. A. Langtimm. 2002 Jun Yu, Weng-Keen Wong, Rebecca Hutchinson (2010). Extension of Occupancy-Detection model of Mackenzie et al. 2006 38 : Wind Energy and Bird Conservation Existing and proposed wind farms in US and Mexico (2008) • 26,000+ turbines, 1.5% of potential “Build-out” to reach potential would require >1.7 million turbines • Areas with most favorable winds are also often associated with migratory pathways IWhere to not locate wind farms? Guidelines for locating wind farms Andrew Farnsworth, Ken Rosenberg, 2009 Ken Rosenberg (ICS) part of the science team for the State of the Birds Report eBird: Citizen Science at the Cornell Lab. Of Ornithology Increase scientific knowledge Gather meaningful data to answer large-scale research questions Increase scientific literacy Enable participants to experience the process of scientific investigation and develop problemsolving skills Increase conservation action Apply results to science-based conservation efforts Examples of research and outreach outputs: Over 60M bird observations from more than 400,000 locations CLO: Track record of influencing Labconservation of Ornithology science.at The Citizen Science project at the Cornell empowers everyone interested in birds — from research labs to backyards to remote forests, anyone who counts birds – to contribute to research. The State of Online Research Kit The Birds 40 Processing and Analysis eBird Patterns in Bird Species Occurrence Information Sciences Model results eBird Occurrence of Indigo Bunting (2008) Land Cover Jan Apr Jun Sep Dec Meteorology Species distribution Models . MODIS – Remote sensing data Potential Uses• Examine patterns of migration • Infer impacts ofclimate change • Measure patterns of habitat useage • Measure population trends Ebird: Citizen Science @ Lab. Of Ornithology @ Cornell http://www.birds.cornell.edu/ Source: Daniel Fink. Eastern Phoebe Migration Information Sciences 42 2009. Daniel Fink,Wesley Hochachka, Art Munson, Mirek Riedewald, Ben Shaby, Giles Hooker, and Steve Kelling, Citizen Science and HCI HCI US eBirdLocations • Tools to support public engagement and decision making: Collecting, modeling, and presenting relevant information via usableSampling interfaces; bias: • Crowd-sourcing and citizenHighly science; concentrated data • Computer games and intelligent tutoring systems; How to infer and factor in expertise level of observer? • Models, methods and tools for dissemination How to engage citizens? of Ideally to the locations and increasing awareness of desired sustainable practices, behaviors, and attitudes. Active Learning: where to sample next? How to reduce uncertainty & increase accuracy of species distribution models Slide by Vipin Kumar Monitoring of Global Forest Cover l Automated Land change Information technology can make a huge Evaluation, Reporting andcommunity Tracking by impact on the environmental System (ALERT) helping global datasets (e.g., • develop Planetary Scale Information System for for of disturbances in the water, assessment food, energy), that are key for global forest ecosystem: sustainability Forest fires, droughts,research. floods, • logging/deforestation, conversion to agriculture l This system will help • • • l quantify the carbon impact of changes in the forests Understand the relationship to global climate variability and human activity Provide monitoring and verification system needed for the UN REDD+ program for saving tropical forests Provide ubiquitous web-based access to changes occurring across the globe, creating public awareness Planetary Skin Institute Planetary Skin a global “nervous system” that will integrate land-sea-air-and-space-based sensors helping the public and private sectors make decisions to prevent and adapt to climate change. Time Sensing and Monitoring for Sustainability Slide by Andreas Krause Sensing for managing Path planning for robotic Detect contamination in drinking intelligent buildings environmental monitoring water distribution networks Sensor networks for monitoring environments: data collection, analysis, synthesis, and inference in largescale sensor networks. Want most useful information at minimum cost Exploit structure to efficiently find near-optimal solutions – Many important sensing utility functions are submodular E.g.: Greedy algorithm finds near-optimal set of observations Resulting algorithms perform well on real applications – E.g., top score in Battle of the Water Sensor Networks challenge – Can handle robustness, complex constraints, dynamics, … 45 Understanding Species Interactions: Food Webs Food webs who eats whom among species within their natural habitats – Nodes – species – Edges – feeding relationships between the species Question how does Nature balance the abundance of different species? Can we predict the consequences of invasive alien species? Can we predict the consequences of extinction due to biodiveristy loss? Neo D. Martinez Pacific Ecoinformatics and Computational Ecology Lab www.FoodWebs.org 46 Understanding Species Interactions: Network Science - Food Webs Food webs who eats whom among species within their natural habitats – Nodes – species – Edges – feeding relationships between the species Question how does Nature balance the abundance of different species? Can we predict the consequences of invasive alien species? Neo D. Martinez Can we predict the consequences Pacific Ecoinformatics and of extinction due to biodiveristy loss? Computational Ecology Lab www.FoodWebs.org 47 Understanding Species Interactions: Network Science - Food Webs Food webs who eats whom among species within their natural habitats – Nodes – species – Edges – feeding relationships Network modeling, prediction, and between the speciesoptimization – challenging research area when Question considering the dynamics of nodes. how does Nature balance the Key to understanding coupled abundance of different species? human-natural systems and allow prediction of potential ecological and Can we predict the consequences economic consequences of natural of invasive alien species? and human-driven forces acting upon them. Neo D. Martinez Can we predict the consequences Pacific Ecoinformatics and of extinction due to biodiveristy loss? Computational Ecology Lab www.FoodWebs.org 48 Understanding Climate Change: A Data-Driven Approach Disagreement between IPCC models • Climate change is the defining issue of our era • Actionable predictive insights are required to inform policy • Physics-based models are essential but not adequate – – Models make relatively reliable predictions at global scale for ancillary variables such as temperature But provide the least reliable predictions for variables that are crucial for impact assessment such as regional precipitation Regional hydrology exhibits large variations among major IPCC model projections Project aim: A new and transformative data-driven approach that complements physics-based models and improves prediction of the potential impacts of climate change “... data-intensive science [is] …a new, fourth paradigm for scientific exploration." - Jim Gray Transformative Computer Science Research Predictive Modeling Enable predictive modeling of typical and extreme behavior from multivariate spatio-temporal data Relationship Mining Enable discovery of complex dependence structures: nonlinear associations or long range spatial dependencies Lead PI: Vipin Kumar Complex Networks Enable studying of collective behavior of interacting eco-climate systems High Performance Computing Enable efficient large-scale spatiotemporal analytics on exascale HPC platforms with complex memory hierarchies • Science Contributions – – Data-guided uncertainty reduction by blending physics models and data analytics A new understanding of the complex nature of the Earth system and mechanisms contributing to adverse consequences of climate change • Success Metric – Inclusion of data-driven analysis as a standard part of climate projections and impact assessment (e.g., for IPCC) Socio-Economic Systems: Poverty, Agricultural Systems, Harvesting Policies 50 Poverty maps Identifying the poor is the first essential step Poverty maps: - Use multiple data sets to estimate and map poverty patterns not directly measured. - Machine learning and related methods can permit more efficient use of data from varied sources analogy to species distribution but little research has been done. Example: 2002 Uganda poverty map Chris Barrett and Corey Lang Targeting maps Targeting the best response to reduce poverty Targeting maps: How to estimate the impact and marginal returns of different assets? Poverty interventions need to be targeted to specific areas Asset-based investments have spatially-varying marginal returns Average marginal returns of different assets (Uganda) Policies for Poverty Reduction: Which set of interventions to apply to each targeted area to maximize the poverty reduction, subject to resource constraints (e.g. budget))? Challenging problems at the intersection of machine learning and optimization, Chris Barrett and Corey Lang applied economics and social science. AI for Development Catastrophe Modeling for Rwandan Disease Surveillance Can mobile phones be used as an early warning system for disease outbreaks? Bayesian anomaly detection algorithms A. Kapoor, N. Eagle, E. Horvitz Spatiotemporal Diffusion of Contraceptive Norms How do contraceptive norms spread through rural areas of the developing world? Spatiotemporal diffusion models have the potential to better evaluate the efficacy of HIV prevention techniques and inform policy decisions related to public health. – H. Yoshioka, N. Eagle Many other exciting projects!!!! 53 Harvesting Policies Fishery Management 54 Natural Resource Management: Policies for harvesting renewable resources Over Harvesting The biomass of fish is estimated to be 1/10 of what it was 50 years ago and is declining (Worm et al. 2006). The state of the world’s marine fisheries is alarming Researchers believe that the collapse of the world’s major fisheries is primarily the result of the mismanagement of fisheries (Clark 2006; Costello et al. 2008). There is therefore a clear urgency of finding ways of defining policies for managing fisheries is a sustainable manner. Natural Resource Management: Policies for harvesting renewable resources Economy Example of a Biological Growth Function F(x): Logistic map: x t+1 = r xt (1 - xt), r is the growth rate Increasing Complexity: more complex models and multiple species interactions We are interested in identifying policy decisions (e.g. when to open/close a fishery ground over time). Combinatorial optimization problems with an underlying dynamical model. Class of Computationally Hard Hybrid Dynamic Optimization Models Clark 1976; Conrad 1999; Ang, Conrad, Just, 2009; Ermon , Conrad, Gomes and Selman 2010 56 An Application to the Pacific Halibut Fishery in Area 3A Regulated by International Pacific Halibut Commission (IPHC) Pacific Halibut Fishery in Area 3A Ermon , Conrad, Gomes and Selman UAI 2010 General MDP Framework Resource dynamics Resource growing over time Uncertain dynamics Economics of harvesting New stock size Amount left Stochasticity Closed loop policy To maximize total discounted utility Goal: policy for optimal exploitation over time natural MDP formulation Ermon , Conrad, Gomes and Selman UAI 2010 Main result Fixed costs and variable costs Concavity of the growth function Robust optimization framework (Worst-case for Nature) Theorem: The optimal policy is of S-s type. s growth S harvest • If the stock is greater than s, harvest down to S • If the stock is smaller than s, let the stock grow The Pacific Halibut Fishery in Area 3A Regulated by International Pacific Halibut Commission (IPHC) Total Allowable Catch Policy Constant Proportion Policy rate of fishing ~ 12.3% of estimated stock. Our Result: Policies with periodic closures of a fishery can outperform Constant Proportion Policies Optimal policies for management of Ermon, Conrad, Gomes, Selman, UAI 2010 renewable resources using MDPs Markov Decision Processes (MDPs) to Model for many renewable resource allocation problems. General Approach to model renewable resources with complex dynamics!!!! Renewable resources: forests and fisheries Problems with a similar mathematical structure: - Pollution management, - Invasive species control, - Supply chain management and Inventory control, and many more. - Optimal policies for management of Ermon, Conrad, Gomes, Selman, UAI 2010 renewable resources using MDPs Energy Energy efficiency and renewable energy 62 Energy Efficiency Global warming and climate change concerns have led to major changes in energy policy in many industrial countries. There are tremendous opportunities to increase energy efficiency, such as through the design of control systems for smart buildings, smart grids, smart vehicles, data centers, etc. – For example, in the United States buildings consume 39% of the energy used nationwide and 70% of the electric energy used. Smart buildings Data Centers Environmental Protection Agency estimates data center capacity will grow 10% year in next decade Data center costs increase 20% year (v.s. 6% for overall IT). Huge Inefficiencies - server utilization rarely exceeds 6% and facility utilization as low as 50%. (McKinsey & Co Report 2008) Main reasons for companies to try to cut energy costs Co2 – carbon emissions of the world’s data centers emissions of Argentina and the Netherlands (McKinsey & Co Report 2008) The IT industry and academia are actively researching new ways of increasing energy efficiency ----- advanced power management hardware, smart cooling systems , virtualization tools, dense server configurations, etc. But increasing energy efficiency is not the main path towards reducing carbon emissions64 Renewable Energy The development of renewable energy sources that are cleaner and generate little or no carbon can have a much greater environmental impact. There has been considerable technological progress in the area of renewable energy sources, in part fostered by government incentives. Renewable Energy 65 Energy Independence and Security Act (2007) DOE was charged with implementing the Smart Grid: Complex Digital Ecosystem from largely non-digital, electromechanical grid to a network of digital systems and power infrastructure; from a centralized, producer-controlled network to a more decentralized system allowing greater consumer and local producer interaction. For example consumers will have smart meters that can 1. track energy consumption, 2. monitor individual power circuits in the home, 3. control smart appliances and actively manage their energy use; 4. dynamically select different providers, 5. sell excess energy (e.g., car battery, wind, biomass) to grid, etc Research in sensing and measuring technologies, advanced control methods (monitor and respond to events), dynamic pricing, improved interfaces, decision support and optimization tools, and security and66 privacy. Slide by Eve M. Schooler How to get people to care? Personalization Source : EPRI Identify places consumers have an impact and are impacted by resource usage – Use social science to understand consumer attitudes – Make sustainability economically relevant Identify reasons for industry to care, e.g., Source : ASU – Perception as Environmental Leader – Energy reductions in operations and products – New markets, products, services • Address tussle between Security & Privacy of data 67 Goal: Trusted Personal – Facilitate ubiquitous monitoring-inferencingactuation – Simplify device and data management for mere Energy Management at the edge of the Smart mortals Grid Challenge of Managing the Power Grid: Balancing different supply and demand sources Total energy produced must be equal to the amount consumed by the system loads AT ALL TIMES: Managing the Power Grid is a Balancing Act!!! How to integrate the different sources into the smart grid ? challenging dynamical problems!!! What is the best mix of energy generation technologies? What is the best mix of storage technologies? Where to locate renewable energy plants? Energy Efficiency and Renewable Energy: Biofuels Energy Independence and Security Act (Signed into law in Dec. 2007) Ambitious mandatory goal of 36 billion gallons of renewable fuels by 2022 (five-fold increase from 2007 level) First generation Biofuels Advanced Biofuels (“Cleaner”) (non food crops and biomass ) Fuel distributors Gasoline producers Farmers Non-energy crops Food supply Energy crops Energy market Water quality Soil quality Biodiversity Switchgrass Wood Waste Animal Waste Municipal Waste Consumers Environmental impact Local air pollution Social welfare Economic impact Computational challenges: Large Scale Logistics Planning Realistic Computational Models to Evaluate Impact 69 Large Scale Stochastic Logistics Planning for Biofuels Large-scale investment in new technology provides exciting logistical planning and optimization challenges and opportunities Resulting optimization models are beyond the scope of the current state of the art in several ways: Transportation Network (Roads, Rail, Marine) Feedstock Map Distribution Terminals and Inter-Modal Facilities to Transfer Liquid Fuel Biomass Map - large-scale input; - stochastic nature (e.g.,feedstock and demand) new models to capture uncertainty new stochastic optimization algorithms; - dynamics of evolution of demand and capacity Potential biorefinery locations 70 Realistic Computational Models and Metrics to Study Sustainability Impacts (e.g. Impact of Renewable Energy Sources ) Current approaches limited in scope and complexity Life Cycle Analysis E.g. based on general equilibrium models (e.g., Nash style) deling and control of complex high-dimensional systems General Equilibrium Complex Adaptive Systems Strong convexity assumptions to keep the model simple enough Models mbining physics-based models, model-based reasoning, and for analytical, closed-form solutions (unrealistic scenarios) optimization and control methods, and Multi-Agent Systems Limited computational thinking Models and Metrics!!!!! machine learning models built from large-scale heterogeneous data. Transformative research directions More realistic computational models in which meaningful solutions can be computed Multi-agent systems, multi-agent Large-scale data, beyond state-of-the-art CS techniques equilibrium models, game theory, and Study of dynamics of reaching equilibrium — key for adaptive policy making! design of effective mechanisms and How to measure risks/ Fuel distributors policies for thepredict exchange of goods. rare events? Gasoline producers Farmers Impact on Land-use Non-energy crops Food supply Energy crops Energy market Water quality Soil quality Biodiversity Consumers Environmental impact Local air pollution Social welfare Economic impact Impact on food prices? Realistic Computational Models and Metrics to Study Sustainability Impacts (e.g. Impact of Renewable Energy Sources ) Current approaches limited in scope and complexity Life Cycle Analysis E.g. based on general equilibrium models (e.g., Nash style) General Equilibrium Systems Strong convexity assumptions to keep the model simple Complex enough Adaptive Models and for analytical, closed-form solutions (unrealistic scenarios) Multi-Agent Systems Limited computational thinking Models and Metrics!!!!! Transformative research directions More realistic computational models in which meaningful solutions can be computed Multi-agent systems, multi-agent Large-scale data, beyond state-of-the-art CS techniques equilibrium models, game theory, and Study of dynamics of reaching equilibrium — key for adaptive policy making! design of effective mechanisms and How to measure risks/ Fuel distributors policies for thepredict exchange of goods. rare events? Gasoline producers Farmers Impact on Land-use Non-energy crops Food Energy crops Water quality Soil quality Biodiversity Environmental impact Local air pollution Consumersof complex supply Impact on food prices? Modeling and control high-dimensional systems byEnergy combining physics-based models, model-based reasoning, market optimization and control methods, and Social welfare machine learning models built from large-scale Economic impact heterogeneous data. The role of Experimental Research Crop Soil Science Material Discovery 73 Agricultural Systems and GHG’s emissions Agriculture is the primary driver of land use change and deforestation and an important source of Greenhouse Gases (GHG)s: 52% and 84% of global anthropogenic CH4 and N2O, significant C02 31% Nitrogen Based Fertilizers Source of Greenhouse Gases 74 Sustainable Agriculture and Sustainable Communities Super eco EcoVillage in Ithaca,NY Sustainable Agriculture and Sustainable Communities Collaborators Laurie Drinkwater and Norm Scott Super Techno Nitrogen Cycle and Fertilizers Dead Zone in Nitrogen Based Gulf of Mexico Fertilizers Townsend & Howarth (2010) Scientific American Spatial and temporal analysis of Nitrogen cycle Fertilizers have played a key role in the increase of food production but they key culprits in the production of greenhouse gases emissions creating dead zones Study of fertilizers and design of Experiments Collaborator Harold van Es 76 Design of Scientific Experiments (4 Treatments: A,B,C,D) Hybrid IP/CSP based – Assignment formulation – Packing formulation – Different CSP models SAT/ CSP based approach Latin Square – State of the art model for Latin Square + symmetry breaking by initializing first row and column (SBDD doesn’t help; this is not a completion problem) Local search based approach These approaches do not scale up max order 6 Below average Perfectly balanced 2.33 Above average Spatially Balanced Latin Square (Target number 30). Science of Computation: Discovering patterns, laws, and hidden structure in computational phenomena Streamlining Constraint Reasoning Streamlining Scaling up of solutions Discovery of structural properties across solutions (machine learning); Divide (“streamline”) the search space by imposing such additional properties. Design of Scientific Experiments for Studying Fertilizers – Spatially Balanced Latin Squares Existence of SBLS – open problem in combinatorics Streamlining by Permutation of Complete Columns of a Cyclic Latin Square Discovery of Construction for SBLS for experimental design LeBras, Perrault, Gomes, 2010 Can we find an domain independent way of streamlining? YES: XOR-Streamlining based on random parity constraints SBLS Order 35 Gomes and Sellmann 2003/2004 (Valiant and Vazirani 1986, Unique SAT) Provable bounds for Counting and Sampling. Gomes, Sabharwal, Selman, 2006 78 Material Discovery for Fuel Cell Technology 79 Material Discovery for Fuel Cell Technology Goal of Material Discovery: – Find new products – Find product substitutes – Understand material properties Approach: Analysis of inorganic libraries Sputter 2 or 3 metals (or oxides) onto a silicon wafer (which produces a thin-film) Use x-ray diffraction to study crystallographic structure of the thin-film Ronan LeBras, Damoulas, John M. Gregoire, Sabharwal, Gomes, van Dover Note: electromagnetic radiation experiments are very expensive!! Fuel Cell Example: study of platinum-tantulum library showed correlation between crystallographic phase and improved fuel cell oxidation catalysis (Gregoire et al 2010) Applications 80 Problem Definition 81 Problem Definition Al Fe Si 82 Problem Definition Al (38% Fe, 45% Al, 17% Si) Fe Si 83 Problem Definition Al [Source: Pyrotope, Sebastien Merkel] (38% Fe, 45% Al, 17% Si) Fe Si 84 Problem Definition 120 Al 100 80 60 40 20 0 15 20 25 30 35 40 45 50 55 60 (38% Fe, 45% Al, 17% Si) Fe Si 85 The Problem: Labeling Points with “Phase(s)” INPUT: UNDERLYING PHYSICAL STRUCTURE: Al 120 100 80 60 40 20 0 15 20 25 30 35 40 45 50 55 60 Al 120 100 pure phase regions 80 60 40 20 0 15 20 25 30 35 40 45 50 55 60 70 60 50 40 30 20 10 0 15 20 25 30 35 40 45 50 55 60 65 120 120 100 100 80 80 60 60 40 40 20 20 0 15 20 25 30 35 40 45 50 55 60 Fe 0 15 Si 20 25 30 35 40 45 50 55 60 Si Fe OUTPUT: NP-hard Al Labeled regions mixed phase region + Si Fe mixed phase region 86 Synthesis of Constraint Reasoning and Machine Learning Approaches for Material Discovery Standard Clustering Approaches: Based on pure Machine learning techniques + Good at providing a “rough”, data driven, global picture + Can incorporate complex dependencies and global similarity structure - Easily miss critical details (physical properties) Constraint reasoning and optimization Two “global alignment kernels” for a ternary problem instance Optimization Model + Great at enforcing the detailed constraints + Can encode and capture the “physics” behind the process - Do not scale up 87 What’s New: Solving it “Properly” Requires… … a robust, physically meaningful, scalable, automated solution method that combines: Underlying Physics A language for Constraints Machine Learning for a “global data-driven view” enforcing “local details” Constraint Programming model Significantly outperforming previous methods. Similarity “Kernels” & Clustering 88 Outline I Computational Sustainability II Examples of Computational Sustainability problems highlighting research themes III Institute for Computational Sustainability IV Building a community in Computational Sustainability V Conclusions 89 Interdisciplinary Research Projects (IRPs): The Building Blocks of our Institute A major part of the effort has been identifying a variety of collaborative projects that bring computer scientists together with domain experts in a number of fields Individual projects organized under the umbrella of IRPs (Interdisciplinary Research Projects) – Currently 13 Application IRPs, 5 Technical IRPs – Many more in incubation! Examples: Application IRP: Bird Conservation Application IRP: Biofuels .... Technical IRP: Dynamical Systems Technical IRP: Optimization Species distribution modeling Equilibrium models Parameter estimation Problem structure Bird migration Impact of gasoline taxes Synchronization Stochastic optimization RCW preservation Carbon offsets in cap-and-trade Monitoring of sustainability Multi-agent inference Critical transitions Balanced designs Reducing sampling bias Application IRPs IRP Name Researchers Collaborators Students Reserve Selection and Wildlife Corridors Conrad, Gomes, Sabharwal, Selman, Shmoys 2. Bird Conservation • Species distribution • Bird migration • RCW preservation Lab of O: Rosenberg, Webb TCF: Allen, Amundsen, Vaughan Cornell: Conrad, Damoulas, Gomes, Sabharwal, Shmoys OSU: Dietterich, Hutchinson, Sheldon, Wong Lab of O: Farnsworth, Fink, Hames, Hochachka, Kelling, Wood Cornell: Hooker, Joachims, Riedewald Ahmadizadeh, Castorena, Dilkina, Elmachtoub, Finseth 3. Fishery Management and Fish Preservation Conrad, Gomes, Sabharwal, Selman, Yakubu, Zeeman Jerald, Ziyadi Alam, Ermon, Haycraft, Li, Wiley 4. Disease Prevention in Dairy Cows Damoulas, Ellner, Gomes, Sabharwal, Selman Grohn, Cha Ahmadizadeh, LeBras, Prakash, Smith, Teose Forest Fires Borradaile, Dietterich, Montgomery, Rusmevichientong, Shmoys, Wong 1. 5. Dilkina, Dimiduk, Drusvyatskiy Ferst, Spencer, Gagnon, Houtman, Mark, Reynolds continued … Application IRPs IRP Name 6. Modeling Ecological Dynamics Researchers Ellner (continued ) Collaborators Students Jones, Mao-Jones, Ritchie, King, Rohani, Ionides, Kendall, Reumann, Newman, Ferrari, Hooker 7. Biofuels •Equilibrium models •Cap-and-trade Bento, Walker Anderson Bergstrom, Cuvilliez, Kang, Klotz, Landry, Latza, Mayton, Patel, Roth, Stanislaus, Taranto, Weiz 8. Pastoral Systems in East Africa Barrett, Gomes, Selman, Wong Kariuki, Leeuw, Smith Guo, Toth, Zhuo 9. Poverty Mitigation and Food Insecurity Barrett, Damoulas, Gomes, Lentz, Naschold Allen, Bell, Lang, Maxwell, McGlinchy, Murray Dilkina, Guo, Harou, Lang 10. Invasive Species Conrad, Shmoys Dietterich,Gomes Hutchinson, Sheldon 11. Impact of Climate Change Guckenheimer, Mahowald, Zeeman Many colleagues in climate science 12. Stress/Environment Interactions Zeeman McCobb Gribizis 13. Material Discovery Damoulas, Gomes, Sabharwal Gregoire, van Dover LeBras Yang, Ermon, Spencer Technical IRPs IRP Name Researchers Collaborators Students Dynamical Systems • Parameter estimation 14. • Synchronization • Monitoring of sustainability Guckenheimer, Strogatz, Zeeman Haiduc, Kuehn 15. Network Science • High dimensional data • Sensor networks • Dynamics & Sync Gomes, Hopcroft, Selman, Strogatz Ermon, Sheldon 16. Bridging Constraint Reasoning and Machine Learning Damoulas, Dietterich, Gomes, Sabharwal 17. Combinatorial Reasoning and Stochastic Optimization • Problem structure • Stochastic optimization • Multi-agent inference • Balanced designs Gomes, Sabharwal, Selman, Shmoys, Sheldon 18. Large Scale Computing Paradigms for Massively Large Data Sets and Simulation-Intensive Studies Gomes, Sabharwal, Selman Gregoire, van Dover LeBras Malitski, Sellmann Cheung, Dilkina, Ermon, Guo, Kroc, LeBras, Perrault, Ramanujan Ahmadizadeh, Dilkina, Kroc 93 Transformative Computer Science Research: Driven by Deep Research Challenges posed by Sustainability Design of policies to manage natural resources translating into large-scale optimization and learning problems, combining a mixture of discrete and continuous effects, in a highly dynamic and uncertain environment increasing levels of complexity Study computational problems as natural phenomena Science of Computation Many highly interconnected components; From Centralized to Distributed: multi agent systems Distributed Highly Interconnected Components / Agents Dynamics Complex dynamics and Multiple scales From Statics to Dynamics: Complex systems and dynamical systems Large-scale data and uncertainty Machine Learning, Statistical Modeling Constraint Satisfaction and Optimization Complex decision and optimization problems Constraint Reasoning and Optimization Complexity levels in Computational Sustainability Problems Data and Uncertainty 94 Transformative Computer Science Research: Driven by Deep Research Challenges posed by Sustainability Design of policies to manage natural resources translating into large-scale optimization and learning problems, combining a mixture of discrete and continuous effects, in a highly dynamic and uncertain environment Constraint Reasoning & Optimization Gomes PI,W, Hopcroft Co-PI SelmanCo-PI, ShmoysCo-PI increasing levels of complexity Study computational problems as natural phenomena Science of Computation Many highly interconnected components; From Centralized to Distributed: multi agent systems Complex dynamics and Multiple time scales From Statics to Dynamics: Complex systems and dynamical systems Large-scale data and uncertainty Machine Learning, Statistical Modeling Complex decision and optimization problems Constraint Reasoning and Optimization Resource Economics, Environmental Sciences & Engineering AlbersCo-PI,W, Amundsen, Barrett, Bento, ConradCo-PI, DiSalvo, MahowaldW, MontgomeryCo-PI,W Rosenberg,SofiaW, WalkerAA Environmental & Socioeconomic Needs Dynamical Models Guckenheimer Strogatz ZeemanPI,W YakubuAA Transformative Synthesis: Dynamics & Learning Science of Computation Science of Computation Data & Machine Learning DietterichPI WongCo-PI ChavarriaH 95 Institute Activities Education Research Coordinating transformative synthesis collaborations Postdocs Doctoral students Research seminar series Honors projects Summer REU program targeting minority students Interdisciplinary Research Projects (IRPs) Computational Sustainability courses ICS Outreach K-12 Activities Citizen Science Lab. Of Ornithology Rocky Mountain Research Station Conservation Fund Cornell Cooperative Extension Travel Museum The Nature Conservancy Building Research Community Conference & Workshops Web Portal Seminars External Collaborations Host visiting Scientists Computational Sustainability Community Conferences, Workshops, Discussion Groups CompSust’09: Cornell, 2009 Blogs: Over 225 international researchers from several disciplines and institutions (universities, labs, government) International Workshop on Constraint Reasoning and Optimization for Computational Sustainability • CROCS-09 at CP-09 NIPS-09 Mini Symposium: Machine Learning for Sustainability CompSust’10: MIT, 2010 Over 150 participants and lots of topics!! • CROCS at CPAIOR-10 • CROCS at CP-10 AAAI 2011 Special Track on Computational Sustainability February 3, 2011: abstracts due February 8, 2011: papers due Conclusions Computational Sustainability Interdisciplinary field that aims to apply techniques from computing and information science, and related fields (e.g., engineering, operations research, applied mathematics, and statistics ) for Sustainable Development. Sustainable Development encompasses balancing environmental, economic, and societal needs. 99 Climate Public Policy Energy Natural Resources Social Factors Sustainability Economic factors Human Factors Education 100 Computational Sustainability attempt at a definition Wide interdisciplinary field , encompassing disciplines as diverse as economics, sociology, environmental sciences and engineering , biology, crop and soil science, meteorology and atmospheric science. Focus: Develop computational models, methods and tools for a broad range of sustainability related applications and task: from supporting decision making and policy analysis concerning the management and allocation of natural resources to the development of new sustainable techniques, products, practices, attitudes, and behaviors. Key challenge: to effectively and efficiently establish interdisciplinary collaborations – the level of interconnectedness of social, economic, and environmental issues makes it really challenging! 101 Computational Sustainability Sustainability fields New challenging applications for Computer Science New methodologies In Computer Science Computational Thinking that will provide new insights into sustainability problems: New methodologies In Sustainability fields Computer science and related fields Analogy with Computational Biology Summary Computational Sustainability has great potential to advance the state of the art of computer science and related disciplines with unique societal benefits! Thank You! 103 Why Computational Sustainability? Thank you! From the Cloud Institute, NY 104 Statistics Alg. Theory eScience Climate Public Policy Natural Resources Networks & Systems AI Social Factors CompSust Computational Energy Sustainability Appl. Mathematics HCI Economic factors Human Factors Education Game Theory Systems Eng. Operations Research 105