Computational Sustainability Carla P. Gomes Cornell University Institute for Computational Sustainability

advertisement
Computational Sustainability
Carla P. Gomes
Institute for Computational Sustainability
Computing and Information Science
Cornell University
Support:
Expeditions
in Computing
(CISE)
February 2011
The Institute for Computational Sustainability (ICS)
Research Team
ICS members and collaborators for their many
contributions towards the development of a vision for
research, education, and outreach activities in the new
area of Computational Sustainability
Expeditions
in Computing
(CISE)
Support:
29 graduate students
24 undergrad. students
Background and Motivation:
Perspective on the Future of Computer Science
Computer Science is a fast moving discipline,
and it is becoming crucial to other fields.
“Computational Thinking”
is permeating and shaping so many areas and disciplines.
I - Foundations
up to 1985
Early days computer science and engineering focused
on making computers more efficient, faster, easier to
program, and better at communicating.
Topics:
Hardware Architecture
Operating systems
Networks and Internet
Programming Languages
Compilers,
Theory of Computation: Algorithms and Complexity,
etc
Many challenges remain in these areas and much work is
required to ensure continued advances in computational
power , data, sensing, and communication networks.
Our computing and data infra-structure will be
100x to 1000x more powerful by 2030.
Our current setup will look like a good
“calculator” with a primitive data network.
5
With developments in these areas, we can now create
highly complex systems, with software with millions of lines of code
running on hundreds of networked machines, that is fast, reliable and
(reasonably) secure.
We can control:
Huge supply chains (eg Amazon)
Search engines (eg Google),
Financial , data, and communication networks
Sophisticated Medical Equipment,
and almost any complex artifact.
Average household has lots of processors embedded.
We’ll soon have produced more transistors than grains of sand in the world!
II- Broadening the Field
1986 to 2000 (and beyond)
Key goal: making computers smarter, and easier
to interact with.
New applications and subfields emerged, such as:
Information retrieval (Search engines),
Computer graphics,
AI (machine learning, computer vision, natural language
understanding),
and Robotics.
These trends will continue with exciting things coming
up within the next 10 to 20 years.
For example
Smart buildings,
Self-driving cars,
Simultaneous automatic translation,
Household robots,
Robotic remote surgery, …
III Reaching Out
2001 to current
Computer Science is reaching out to other disciplines.
Basic theme:
infuse computational thinking into other disciplines.
Tremendous opportunities:
Any scientist can now have a local 10 TB disk, storing the equivalent of
10 million books, right at his or her desk, with sophisticated compute power .
Explosion and proliferation of a variety of sensing devises with advanced
embedding and networking capabilities
We see some of the first results in e.g.:
biology (e.g. genetics)
social sciences (eg social network analysis), and
linguistics (eg evolution of word usage).
A New Kind of Science
Opportunities go much beyond “just” better and larger scale data
processing.
“There is the option of training complex computational
models with good predictive power but too complex for
human comprehension (no explanatory capacity).”
Strogatz, NYT 2010;
On the unreasonable effectiveness of data, Halevy et al. 2010.
A New Kind of Science?
In other words, in the future our scientific theories are no longer
restricted in complexity by our own cognitive abilities.
Rather, given the right amount of data and inference power, a
machine can potentially train a highly complex model and use it
as a predictive engine.
Example: Google’s machine translation.
Works well for dozens of languages. No clear linguistic underlying model.
Instead, highly complex statistical model, automatically trained on 1+
million examples. Programmers often did not know the languages!
Consider providing such computational thinking
to many other fields!
Examples of new emerging computational fields:
Computational biology,
Computational Economics,
Healthcare Informatics,
and Computational Sustainability
Pervasive & ubiquitous computing and communications will play a
key role in information capture and reasoning, as well as adaption to
highly dynamic and uncertain environments
I will discuss “Computational Sustainability.”
Computational Sustainability
Sustainability and Sustainable Development
The 1987 UN report, “Our Common Future” (Brundtland Report):
 Raised serious concerns about the State of the Planet.
 Introduced the notion of sustainability and sustainable
development:
Sustainable Development: “development that meets the needs
of the present without compromising the ability of future
generations to meet their needs.”
UN World Commission on Environment and Development,1987.
Gro Brundtland
Norwegian
Prime Minister
14
Chair of WCED
Follow-Up Reports:
Intergovernmental Panel on Climate Change (IPCC 07) Global
Environment Outlook Report (GEO 07)
”There are no major issues raised in Our Common Future for
which the foreseeable trends are favourable.”
Erosion of Biodiversity
Examples:
•At the current rates of human destruction of natural
ecosystems, 50% of all species of life on earth will be
extinct in 100 years. Wilson, E.O. The Future of Life (2002) .
Global Warming
+130
countries
•The biomass of fish is estimated to be 1/10 of what it
was 50 years ago and is declining. Worm et al. (2006)
[Nobel Prize
with Gore 2007]
15
Safe Operating Boundaries:
Crucial Biophysical Systems
Source:
Planetary Boundaries: A Safe Operating Space for Humanity, Nature, 2009
16
Sustainability:
Interlinked environment, economic, and social issues
Our Common Future recognized that
environmental, economic and social issues are
interlinked.
The economy only exists in the context of societies,
and both society and economic activity are
constrained by the earth’s natural systems.
A secure future depends upon the health of all 3
systems (environment, society, economy).
Sustainable Development encompasses
balancing environmental, economic, and
societal needs.
17
Challenging Sustainability Problems
Complex Dynamical Systems
Sustainability problems are unique in scale,
impact, complexity, and richness:
Often involving multiple and highly interconnected
components and players, in highly dynamic and
uncertain environments.
ecosystems.noaa.gov
Natural Ecosystem
Offer challenges but also opportunities for
the advancement of the state of the art in
computing and information science.
Smart Grid: Complex Digital Ecosystem
18
Computational Sustainability:
Vision
Computer scientists can — and should — play a key role in
increasing the efficiency and effectiveness of the way we manage
and allocate our natural resources, while enriching and transforming
Computing and Information Science and related disciplines.
We need critical mass in the new field of
Computational Sustainability!!!
19
Outline
I Examples of Computational Sustainability problems
highlighting research themes
II Our research themes
III Building a community in Computational Sustainability
IV Conclusions
20
Sample of Computational Sustainability Problems
I Environment
E.g. Conservation and Biodiversity
II Socio-Economic Systems
E.g. Poverty mapping and poverty reduction,
and harvesting policies.
III Energy
Fuel
distributors
E.g. Smart grid, Material discovery and Biofuels
Farmers
Energy
crops
Water quality
Soil quality
Gasoline
producers
Non-energy
crops
Food
supply
Environmental
impact
Biodiversity
Local air
pollution
Consumers
Energy
market
Social welfare
Economic
impact
21
Conservation and Biodiversity
22
Combatting Biodiversity Loss:
Landscape Connectivity
Ideas from circuit theory for mapping
critical linkages in complex landscapes
Brad McRae et al.
Maintaining movement and connectivity across
landscapes can help reduce inbreeding,
increase genetic diversity,
and/or colonize new habitat.
Connectivity for mountain in southern California.
Blue - low current density (low densities of dispersing mountain lions);
Yellow - movement bottlenecks, where connectivity is most vulnerable to
habitat destruction.
Red - high current flow high priority areas for conservation or restoration.
“With limited funding and constant threats of habitat loss how do we choose
which habitats to protect so that landscapes will stay well-connected for
wild animal species?”
Opportunities for new computational models integrating
ecological and economic constraints.
Combatting Habit Loss and Fragmentation:
Wildlife Corridors
Wildlife Corridors link core biological areas,
allowing animal movement between areas.
Typically: low budgets to implement corridors.
Example:
Goal: preserve grizzly bear populations in
the U.S. Northern Rockies by creating
wildlife corridors connecting 3 reserves:
Yellowstone National Park
Glacier Park / Northern Continental Divide
Salmon-Selway Ecosystem
cost
suitability
Conservation and Biodiversity: Wildlife Corridors
Challenges in Constraint Reasoning and Optimization
Wildlife corridor design
Computational problem  Connection Sub-graph Problem
Connection Sub-graph Problem
Given a graph G with a set of reserves:
Find a sub-graph of G that:
Connection Sub-Graph - NP-Hard
Worst Case Result --- Real-world problems possess
hidden structure that can be exploited allowing
scaling up of solutions.
contains the reserves;
is fully connected;
with cost below a given budget;
and
with maximum utility
Interdisciplinary Research Project:
Conrad, Dilkina, Gomes, van Hoeve, Sabharwal, Sutter; 2007-2010
25
Minimum Cost Corridor for the Connected Sub-Graph Problem
 Ignore utilities  Min Cost Steiner Tree Problem
 Fixed parameter tractable – polynomial time solvable for fixed (small)
number of terminals or reserves (not used in conservation literature)
50x50 grid
167 Cells
$1.3B
<1 sec
40x40 grid
242 Cells
$891M
<1 sec
25x25 grid
570 Cells
$449M
<1 sec
10x10 grid
3299 Cells
$99M
10 mins
25 km2 hex
1288 Cells
$7.3M
2 hrs
25 km2 hex
Extend with
2xB=$15M
10x in Util
What if we were allowed extra budget?
Need to solve problems large number of cells! Scalability Issues
Models Are Important!!!
Single Commodity Flow
Quite compact (poly size)
Directed Steiner Tree
Exponential Number of Constraints !
Captures Better the Connectedness Structure !
Provides good upper bounds
(discover good cuts with ML)!
Dilkina and Gomes 2010
Science of Computation - Understanding Structure:
“Typical” Case Analysis and Identification of Critical Parameters
(Synthetic Instances)
Runtime
How is hardness affected
as the budget fraction is varied?
Problem evaluated on semi-structured graphs
3 reserves
m x m lattice / grid graph with k terminals
Inspired by the conservation corridors problem
Place a terminal each on top-left and bottom-right
Maximizes grid use
Place remaining terminals randomly
Assign uniform random
costs and utilities
from {0, 1, …, 10}
Runtime for Optimal Solution
No reserves:
“pure optimization”
From 6x6 to 10x10 grid (100 parcels):
1000 instances per data-point;
Utility Gap (Optimally Extended Min cost/ Optimal)
3 reserves
Real world instance:
Glacier Park
Corridor for grizzly bears in the
Northern Rockies, connecting:
Yellowstone
Salmon-Selway Ecosystem
Glacier Park
Salmon-Selway
(12788 nodes)
Yellowstone
Scaling up Solutions by Exploiting Structure
•Synthetic generator / Typical Case Analysis
•Identification of Tractable Sub-problems
•Exploiting structure via Decomposition
Static/Dynamic Pruning
•Streamlining for Optimization
•New Encodings
5 km grid
(12788 land parcels):
minimum cost solution
5 km grid
(12788 land parcels):
+1% of min. cost
Our approach allows us to handle large problems and
reduce corridor cost dramatically (hundreds of millions of dollars)
compared to existing approaches while providing
guarantees of optimality in terms of utility:
Optimal or within 1% of optimality for interesting budget levels.
Multiple Species
Identification of new problems
to address multiple species:
E.g.
Wolverines
Steiner Multigraph Problem
Upgrading Shortest Path Problem
Dynamics and Game Theory :
Grizzly Bear
Lynx
Collaborators: Michael K. Schwartz
USDA Forest Service, Rocky Mountain Research Station
Claire Montgomery Oregon State University
• Study dynamics of interactions
• How to be fair???
• What metrics?
30
Outreach and Education
Pedagogical Games
Shortest path, Steiner trees,
and much more about
Computational Sustainability
Boynton Middle School
Math Day
“Edutainment”
Video Games for
middle school
Effort led by
David
Schneider
Lots of undergrad/Meng students
Having fun designing games for
A Travel Museum on Computational
Sustainability
31
Connectivity Problems:
Other applications
If a person is
infected with a
disease,
who else is likely to
be?
Network of Pandemic
Influenza
Robert J. Glass,* Laura M. Glass,† Walter E. Beyeler,*
Facebook Network
 What characterizes the connection between
two individuals?
The shortest path?
Size of the connected component?
A “good” connected subgraph?
 Which people have unexpected ties to any
members of a list of other individuals?
[Faloutsos, McCurley, Tompkins ’04]
and H. Jason Min*
2006
Sensor Web as predicted by Matt Heavner
at University of Alaska Southeast
.
Sensor and Wireless Networks
32
Bird Conservation
Information Sciences
D. Fink
33
Conservation and Biodiversity:
Reserve Design for Bird Conservation
Red Cockaded Woodpecker (RCW) is a federally endangered species
Current population is estimated to be about
1% of original stable population (~12,000 birds)
Conservation Funds manages
Palmetto Peartree Preserve (North Carolina)
32 active RCW territories (as of Sept 2008)
Goal: Increase RCW population level
Management options:
Prioritizing land acquisition adjacent
to current RCW populations
Building artificial cavities
Translocation of birds
Sheldon, D., Dilkina, B., Ahmadizadeh, K., Elmachtoub, A., Finseth, R., Conrad, J.,
Gomes, C., Sabharwal, A. Shmoys, D., Amundsen, O., Allen, W., Vaughan, B.; 2009-10
Conservation and Biodiversity:
Reserve Design for Bird Conservation
Red Cockaded W.
Management Decisions:
Biological and
Ecological Model
Land acquisition
Artificial cavities
Translocation of birds
Must explicitly consider
interactions between
biological/ecological
patterns and management
decisions
Maximizing
the Spread of Cascades
i
pij
1-β
j
pkj
k
t=1
t=
2
t=3
Stochastic diffusion model
(movement and survival patterns)
in RCW populations
Stochastic optimization model
Decisions: where and when to acquire land parcels
Goal: Maximize expected number of surviving RCW
Approaches: Sample Average Approximation
Computational Challenge: scaling up solutions
Upper confidence Bounds
for considering a large number of years (100+)
 decomposition methods and exploiting structure
Maximize the spread
Influencing Cascades in Networks:
Other Examples
Maximizing or Minimizing spread
Minimize the spread
 Business Networks:
Technology adoption among friends/peers.
 Social Networks:
Spread of rumor/news/articles on
Facebook, Twitter, or among
blogs/websites.
Targeted-actions (e.g. marketing
campaigns) can be chosen to
maximize the spread of these
phenomena.
Viral Marketing
[Domingos and Richardson, KDD 2001]
[Krause and Guestrin 2007]
[Kempe, Kleinberg, and
Tardos, KDD 2003]
 Epidemiology: Spread of disease
– In human networks, or between
networks of households, schools,
major cities, etc.
– In agriculture settings.
 Contamination: The spread of
toxins / pollutants within water
networks.
 Invasive species
Mitigation strategies can be chosen to
minimize the spread of such
phenomena.
Additional Levels of Complexity:
Large-Scale Data Modeling, Stochasticity, Uncertainty,
•
How to estimate species distributions and habitat suitability
•
Movements and migrations;
•
Understanding species interaction
•
Climate change
•
Stochasticity and Uncertainty
•
How to get the data?
•
Other factors
(e.g., different models of land conservation (e.g., purchase, conservation
easements, auctions) typically over different time periods).
37
How to estimate species distributions?
Active research area bringing together machine and statistical learning:
Eastern Phoebe Migration
Steven Philips, Miro Dudik & Rob Schapire
Maxent
Information Sciences
Kelling
D. I. MacKenzie, J. D. Nichols, G. B. Lachman,
S. Droege, J. A. Royle, and C. A. Langtimm. 2002
Jun Yu, Weng-Keen Wong, Rebecca Hutchinson (2010).
Extension of Occupancy-Detection model of
Mackenzie et al. 2006
38
:
Wind Energy and Bird Conservation
Existing and proposed wind farms in US and Mexico (2008)
• 26,000+ turbines, 1.5% of potential
“Build-out” to reach potential would
require >1.7 million turbines
• Areas with most favorable winds are
also often associated with migratory
pathways
IWhere to not locate wind farms?
Guidelines for locating wind farms
Andrew Farnsworth, Ken Rosenberg, 2009
Ken Rosenberg
(ICS) part of
the science team for the
State of the Birds Report
eBird: Citizen Science at
the Cornell Lab. Of Ornithology
 Increase scientific knowledge
Gather meaningful data to answer large-scale
research questions
 Increase scientific literacy
Enable participants to experience the process of
scientific investigation and develop problemsolving skills
 Increase conservation action
Apply results to science-based conservation
efforts
Examples of research and outreach outputs:
Over 60M bird observations from more than
400,000 locations
CLO: Track record
of influencing
Labconservation
of Ornithology
science.at
The Citizen Science project at the
Cornell empowers
everyone interested in birds — from research labs to backyards to remote
forests, anyone who counts birds – to contribute to research.
The State of
Online Research Kit
The Birds 40
Processing and Analysis
eBird
Patterns in Bird Species Occurrence
Information Sciences
Model results
eBird
Occurrence of Indigo Bunting (2008)
Land Cover
Jan
Apr
Jun
Sep
Dec
Meteorology
Species distribution Models
.
MODIS –
Remote
sensing data
Potential Uses• Examine patterns of migration
• Infer impacts ofclimate change
• Measure patterns of habitat
useage
• Measure population trends
Ebird: Citizen Science @ Lab. Of Ornithology @ Cornell
http://www.birds.cornell.edu/
Source: Daniel Fink.
Eastern Phoebe Migration
Information Sciences
42 2009.
Daniel Fink,Wesley Hochachka, Art Munson, Mirek Riedewald, Ben Shaby, Giles Hooker, and Steve Kelling,
Citizen Science and HCI
HCI
US eBirdLocations
• Tools to support public engagement and decision
making: Collecting, modeling, and presenting
relevant information via usableSampling
interfaces;
bias:
• Crowd-sourcing and citizenHighly
science;
concentrated data
• Computer games and intelligent tutoring
systems;
How to infer and factor in expertise level of observer?
• Models, methods and tools for dissemination
How to engage
citizens? of
Ideally
to the
locations
and increasing
awareness
of desired
sustainable
practices,
behaviors,
and attitudes.
 Active Learning:
where
to sample
next?
How to reduce uncertainty & increase accuracy of species distribution models
Slide by Vipin Kumar
Monitoring of Global Forest Cover
l
Automated
Land change
Information
technology
can make a huge
Evaluation,
Reporting andcommunity
Tracking by
impact
on
the
environmental
System (ALERT)
helping
global
datasets
(e.g.,
• develop
Planetary Scale
Information
System
for for
of disturbances
in the
water, assessment
food, energy),
that are
key for
global forest ecosystem:
sustainability
Forest fires, droughts,research.
floods,
•
logging/deforestation, conversion to agriculture
l
This system will help
•
•
•
l
quantify the carbon impact of changes
in the forests
Understand the relationship to global
climate variability and human activity
Provide monitoring and verification
system needed for the UN REDD+
program for saving tropical forests
Provide ubiquitous web-based
access to changes occurring across
the globe, creating public awareness
Planetary Skin Institute
Planetary Skin  a global “nervous system” that will integrate land-sea-air-and-space-based sensors
helping the public and private sectors make decisions to prevent and adapt to climate change. Time
Sensing and Monitoring
for Sustainability
Slide by Andreas Krause
Sensing for managing
Path planning for robotic Detect contamination in drinking intelligent buildings
environmental monitoring water distribution networks
Sensor networks for monitoring environments: data
collection, analysis, synthesis, and inference in largescale sensor networks.


Want most useful information at minimum cost
Exploit structure to efficiently find near-optimal solutions
– Many important sensing utility functions are submodular
E.g.: Greedy algorithm finds near-optimal set of observations

Resulting algorithms perform well on real applications
– E.g., top score in Battle of the Water Sensor Networks challenge
– Can handle robustness, complex constraints, dynamics, …
45
Understanding Species Interactions:
Food Webs
 Food webs who eats whom among species within their natural habitats
– Nodes – species
– Edges – feeding relationships
between the species
Question
how does Nature balance the
abundance of different species?
Can we predict the consequences
of invasive alien species?
Can we predict the consequences
of extinction due to biodiveristy loss?
Neo D. Martinez
Pacific Ecoinformatics and
Computational Ecology Lab
www.FoodWebs.org
46
Understanding Species Interactions:
Network Science - Food Webs
 Food webs who eats whom among species within their natural habitats
– Nodes – species
– Edges – feeding relationships
between the species
Question
how does Nature balance the
abundance of different species?
Can we predict the consequences
of invasive alien species?
Neo D. Martinez
Can we predict the consequences
Pacific Ecoinformatics and
of extinction due to biodiveristy loss? Computational Ecology Lab
www.FoodWebs.org
47
Understanding Species Interactions:
Network Science - Food Webs
 Food webs who eats whom among species within their natural habitats
– Nodes – species
– Edges – feeding
relationships
Network
modeling, prediction, and
between the speciesoptimization –
challenging research area when
Question considering the dynamics of nodes.
how does Nature balance the
Key to understanding coupled
abundance of different species?
human-natural systems and allow
prediction of potential ecological and
Can we predict the consequences
economic consequences of natural
of invasive alien species?
and human-driven forces acting upon
them.
Neo D. Martinez
Can we predict the consequences
Pacific Ecoinformatics and
of extinction due to biodiveristy loss? Computational Ecology Lab
www.FoodWebs.org
48
Understanding Climate Change: A Data-Driven Approach
Disagreement between IPCC models
• Climate change is the defining issue of our era
• Actionable predictive insights are required to inform policy
• Physics-based models are essential but not adequate
–
–
Models make relatively reliable predictions at global scale for ancillary variables such as
temperature
But provide the least reliable predictions for variables that are crucial for impact assessment
such as regional precipitation
Regional hydrology exhibits large variations among
major IPCC model projections
Project aim:
A new and transformative data-driven approach
that complements physics-based models and
improves prediction of the potential impacts of
climate change
“... data-intensive science [is] …a
new, fourth paradigm for scientific
exploration." - Jim Gray
Transformative Computer Science Research
Predictive Modeling
Enable predictive modeling of
typical and extreme behavior from
multivariate spatio-temporal data
Relationship Mining
Enable discovery of complex
dependence structures: nonlinear associations or long range
spatial dependencies
Lead PI:
Vipin Kumar
Complex Networks
Enable studying of collective
behavior of interacting eco-climate
systems
High Performance Computing
Enable efficient large-scale spatiotemporal analytics on exascale HPC
platforms with complex memory
hierarchies
• Science Contributions
–
–
Data-guided uncertainty reduction by blending physics models and
data analytics
A new understanding of the complex nature of the Earth system and
mechanisms contributing to adverse consequences of climate change
• Success Metric
–
Inclusion of data-driven analysis as a standard part of climate
projections and impact assessment (e.g., for IPCC)
Socio-Economic Systems:
Poverty, Agricultural Systems, Harvesting Policies
50
Poverty maps
Identifying the poor is the first essential step
Poverty maps:
- Use multiple data sets to estimate and map
poverty patterns not directly measured.
- Machine learning and related methods can
permit more efficient use of data from varied
sources  analogy to species distribution but
little research has been done.
Example: 2002 Uganda poverty map
Chris Barrett and Corey Lang
Targeting maps
Targeting the best response to reduce poverty
Targeting maps:
How to estimate the impact and
marginal returns of different assets?
Poverty interventions need to be
targeted to specific areas
Asset-based investments have
spatially-varying marginal returns
Average marginal returns of different assets (Uganda)
Policies for Poverty Reduction:
Which set of interventions to apply to each targeted area
to maximize the poverty reduction, subject to resource constraints (e.g. budget))?
Challenging problems at the intersection of machine learning and optimization,
Chris Barrett and Corey Lang
applied economics and social science.
AI for Development
Catastrophe Modeling for Rwandan
Disease Surveillance
Can mobile phones be used as an early warning system for
disease outbreaks? Bayesian anomaly detection algorithms
A. Kapoor, N. Eagle, E. Horvitz
Spatiotemporal Diffusion of Contraceptive Norms
How do contraceptive norms spread through rural areas of the
developing world? Spatiotemporal diffusion models have the
potential to better evaluate the efficacy of HIV prevention
techniques and inform policy decisions related to public health. –
H. Yoshioka, N. Eagle
Many other exciting projects!!!!
53
Harvesting Policies
Fishery Management
54
Natural Resource Management:
Policies for harvesting renewable resources

Over Harvesting
The biomass of fish is estimated to be 1/10 of what it
was 50 years ago and is declining (Worm et al. 2006).
 The state of the world’s marine fisheries is
alarming
 Researchers believe that the collapse of the
world’s major fisheries is primarily the result of the
mismanagement of fisheries (Clark 2006; Costello
et al. 2008).
There is therefore a clear urgency of finding ways of
defining policies for managing fisheries is a sustainable manner.
Natural Resource Management:
Policies for harvesting renewable resources
Economy
Example of a Biological Growth Function F(x):
Logistic map: x t+1 = r xt (1 - xt), r is the growth rate
Increasing Complexity: more complex models
and multiple
species interactions
We are interested in identifying policy decisions
(e.g. when to open/close a fishery ground over time).
Combinatorial optimization problems with an underlying dynamical model.
Class of Computationally Hard Hybrid Dynamic Optimization Models
Clark 1976; Conrad 1999; Ang, Conrad, Just, 2009; Ermon , Conrad, Gomes and Selman 2010
56
An Application to the Pacific Halibut Fishery in Area 3A
Regulated by International Pacific Halibut Commission (IPHC)
Pacific Halibut Fishery in Area 3A
Ermon , Conrad, Gomes and Selman UAI 2010
General MDP Framework
Resource dynamics
Resource growing over time
Uncertain dynamics
Economics of harvesting
New stock size
Amount left
Stochasticity
Closed loop policy
To maximize total discounted utility
Goal: policy for optimal exploitation
over time
 natural MDP formulation
Ermon , Conrad, Gomes and Selman UAI 2010
Main result
 Fixed costs and variable costs
 Concavity of the growth function
 Robust optimization framework (Worst-case for Nature)
Theorem: The optimal policy is of S-s type.
s
growth
S
harvest
• If the stock is greater than s, harvest down
to S
• If the stock is smaller than s, let the stock
grow
The Pacific Halibut Fishery in Area 3A
 Regulated by International Pacific Halibut Commission
(IPHC)
Total Allowable Catch Policy 
Constant Proportion Policy
rate of fishing ~ 12.3% of estimated stock.
Our Result: Policies with periodic closures of a fishery
can outperform Constant Proportion Policies
Optimal policies for management of
Ermon, Conrad, Gomes, Selman, UAI 2010 renewable resources using MDPs
Markov Decision Processes (MDPs) to Model for
many renewable resource allocation problems.
General Approach to model renewable resources with complex dynamics!!!!
Renewable resources:
forests and fisheries
Problems with a similar mathematical
structure:
- Pollution management,
- Invasive species control,
- Supply chain management and
Inventory control, and many more.
-
Optimal policies for management of
Ermon, Conrad, Gomes, Selman, UAI 2010 renewable resources using MDPs
Energy
Energy efficiency and renewable energy
62
Energy Efficiency

Global warming and climate change concerns have led to
major changes in energy policy in many industrial countries.

There are tremendous opportunities to increase energy
efficiency, such as through the design of control systems for
smart buildings, smart grids, smart vehicles, data centers, etc.
– For example, in the United States buildings consume 39% of the
energy used nationwide and 70% of the electric energy used.
Smart buildings
Data Centers
 Environmental Protection Agency estimates data center capacity
will grow 10% year in next decade
 Data center costs
 increase 20% year (v.s. 6% for overall IT).
 Huge Inefficiencies - server utilization rarely exceeds 6% and facility
utilization as low as 50%. (McKinsey & Co Report 2008)
Main reasons for companies to
try to cut energy costs
 Co2 – carbon emissions of the world’s data centers
 emissions of Argentina and the Netherlands (McKinsey & Co Report 2008)
The IT industry and academia are actively researching new ways of increasing
energy efficiency ----- advanced power management hardware,
smart cooling systems , virtualization tools, dense server configurations, etc.
But increasing energy efficiency is not the main
path towards reducing carbon emissions64
Renewable Energy
 The development of renewable energy sources that
are cleaner and generate little or no carbon can have
a much greater environmental impact.
 There has been considerable technological progress
in the area of renewable energy sources, in part
fostered by government incentives.
Renewable
Energy
65
Energy Independence and Security Act (2007)
DOE was charged with implementing the Smart Grid:
Complex Digital Ecosystem
from largely non-digital, electromechanical grid to a network of digital
systems and power infrastructure;
from a centralized, producer-controlled network to a more decentralized
system allowing greater consumer and local producer interaction.
For example consumers will have smart
meters that can
1. track energy consumption,
2. monitor individual power circuits in the
home,
3. control smart appliances and actively
manage their energy use;
4. dynamically select different providers,
5. sell excess energy (e.g., car battery,
wind, biomass) to grid, etc
Research in sensing and measuring technologies, advanced control methods
(monitor and respond to events), dynamic pricing, improved interfaces,
decision support and optimization tools, and security and66 privacy.
Slide by Eve M. Schooler
How to get people to care?
Personalization
Source : EPRI
 Identify places consumers have an
impact and are impacted by resource
usage
– Use social science to understand consumer
attitudes
– Make sustainability economically relevant
 Identify reasons for industry to care, e.g.,
Source : ASU
– Perception as Environmental Leader
– Energy reductions in operations and products
– New markets, products, services
• Address tussle between Security &
Privacy of data
67
Goal: Trusted Personal
– Facilitate ubiquitous monitoring-inferencingactuation
– Simplify device and data management for mere
Energy Management
at the edge of the Smart
mortals
Grid
Challenge of Managing the Power Grid:
Balancing different supply and demand sources
Total energy produced must be equal to the
amount consumed by the system loads AT ALL TIMES:
Managing the Power Grid is a Balancing Act!!!
How to integrate the different
sources into the smart grid ?
challenging dynamical
problems!!!
What is the best mix of
energy generation
technologies?
What is the best mix of
storage technologies?
Where to locate renewable
energy plants?
Energy Efficiency and Renewable Energy:
Biofuels
Energy Independence and Security Act
(Signed into law in Dec. 2007)
Ambitious mandatory goal of
36 billion gallons of renewable fuels by 2022
(five-fold increase from 2007 level)
First generation Biofuels
Advanced Biofuels (“Cleaner”)
(non food crops and biomass )
Fuel
distributors
Gasoline
producers
Farmers
Non-energy
crops
Food
supply
Energy
crops
Energy
market
Water quality
Soil quality
Biodiversity
Switchgrass Wood Waste Animal Waste Municipal Waste
Consumers
Environmental
impact
Local air
pollution
Social welfare
Economic
impact
Computational challenges:
Large Scale Logistics Planning
Realistic Computational Models to Evaluate Impact
69
Large Scale Stochastic Logistics Planning for Biofuels
Large-scale investment in new technology provides exciting
logistical planning and optimization challenges and opportunities
Resulting optimization models are
beyond the scope of the current
state of the art in several ways:
Transportation Network
(Roads, Rail, Marine)
Feedstock Map
Distribution Terminals and
Inter-Modal Facilities to
Transfer Liquid Fuel
Biomass Map
- large-scale input;
- stochastic nature (e.g.,feedstock
and demand)
 new models to capture
uncertainty
new stochastic
optimization algorithms;
- dynamics of evolution of demand
and capacity
Potential
biorefinery
locations
70
Realistic Computational Models and Metrics
to Study Sustainability Impacts
(e.g. Impact of Renewable Energy Sources )
Current approaches limited in scope and complexity
Life Cycle Analysis

E.g.
based
on
general
equilibrium
models
(e.g.,
Nash
style)
deling and control of complex high-dimensional systems
General Equilibrium
Complex
Adaptive Systems

Strong
convexity
assumptions
to
keep
the
model
simple
enough
Models
mbining physics-based models, model-based reasoning,
and
for analytical, closed-form solutions (unrealistic scenarios)
optimization and control methods, and
Multi-Agent
Systems
 Limited
computational
thinking
Models
and Metrics!!!!!
machine
learning
models built
from large-scale
heterogeneous data.
Transformative research directions
 More realistic computational models in which meaningful
solutions
can be
computed
Multi-agent
systems,
multi-agent
 Large-scale data, beyond state-of-the-art CS techniques
equilibrium models, game theory, and
 Study of dynamics of reaching equilibrium — key for adaptive policy making!
design of effective mechanisms and
How to measure risks/
Fuel
distributors
policies for thepredict
exchange
of goods.
rare events?
Gasoline
producers
Farmers
Impact on Land-use
Non-energy
crops
Food
supply
Energy
crops
Energy
market
Water quality
Soil quality
Biodiversity
Consumers
Environmental
impact
Local air
pollution
Social welfare
Economic
impact
Impact on food prices?
Realistic Computational Models and Metrics
to Study Sustainability Impacts
(e.g. Impact of Renewable Energy Sources )
Current approaches limited in scope and complexity
Life Cycle Analysis
 E.g. based on general equilibrium models (e.g., Nash style)
General Equilibrium
Systems
 Strong convexity assumptions to keep the model simple Complex
enough Adaptive Models
and
for analytical, closed-form solutions (unrealistic scenarios)
Multi-Agent
Systems
 Limited computational thinking
Models
and Metrics!!!!!
Transformative research directions
 More realistic computational models in which meaningful
solutions
can be
computed
Multi-agent
systems,
multi-agent
 Large-scale data, beyond state-of-the-art CS techniques
equilibrium models, game theory, and
 Study of dynamics of reaching equilibrium — key for adaptive policy making!
design of effective mechanisms and
How to measure risks/
Fuel
distributors
policies for thepredict
exchange
of goods.
rare events?
Gasoline
producers
Farmers
Impact on Land-use
Non-energy
crops
Food
Energy
crops
Water quality
Soil quality
Biodiversity
Environmental
impact
Local air
pollution
Consumersof complex
supply
Impact on
food prices?
Modeling
and control
high-dimensional
systems
byEnergy
combining physics-based models, model-based reasoning,
market
optimization and control methods, and
Social welfare
machine learning models built from large-scale
Economic
impact
heterogeneous data.
The role of Experimental Research
Crop Soil Science
Material Discovery
73
Agricultural Systems and GHG’s emissions
Agriculture is the primary driver of land use change
and deforestation and an important source of
Greenhouse Gases (GHG)s: 52% and 84% of
global anthropogenic CH4 and N2O, significant C02
31%
Nitrogen Based
Fertilizers
Source of Greenhouse Gases
74
Sustainable Agriculture and Sustainable Communities
Super eco
EcoVillage in Ithaca,NY
Sustainable Agriculture and
Sustainable Communities
Collaborators
Laurie Drinkwater and Norm Scott
Super Techno
Nitrogen Cycle and Fertilizers
Dead Zone in
Nitrogen Based Gulf of Mexico
Fertilizers
Townsend & Howarth (2010)
Scientific American
Spatial and temporal analysis of
Nitrogen cycle
Fertilizers have played a key role in the
increase of food production but they key
culprits in the production of greenhouse
gases emissions creating dead zones
Study of fertilizers and
design of Experiments
Collaborator Harold van Es
76
Design of Scientific Experiments
(4 Treatments: A,B,C,D)
 Hybrid IP/CSP based
– Assignment formulation
– Packing formulation
– Different CSP models
 SAT/ CSP based approach
Latin Square
– State of the art model for Latin Square
+ symmetry breaking by initializing
first row and column (SBDD doesn’t help;
this is not a completion problem)
 Local search based approach
These approaches do not scale up
max order 6
Below average
Perfectly balanced
2.33
Above average
Spatially Balanced Latin Square
(Target number 30).
Science of Computation:
Discovering patterns, laws, and hidden structure
in computational phenomena
Streamlining Constraint Reasoning
Streamlining
Scaling up of solutions
Discovery of structural properties across solutions (machine learning);
Divide (“streamline”) the search space by imposing such additional properties.
Design of Scientific Experiments for Studying Fertilizers – Spatially Balanced Latin Squares
Existence of SBLS – open problem in combinatorics
Streamlining by Permutation of
Complete Columns of a
Cyclic Latin Square
Discovery of Construction for
SBLS for experimental design
LeBras, Perrault, Gomes, 2010
Can we find an domain independent way of
streamlining?
YES: XOR-Streamlining based on
random parity constraints
SBLS Order 35
Gomes and Sellmann 2003/2004
(Valiant and Vazirani 1986, Unique SAT)
Provable bounds for Counting and Sampling.
Gomes, Sabharwal, Selman, 2006
78
Material Discovery for Fuel Cell Technology
79
Material Discovery for Fuel Cell Technology
Goal of Material Discovery:
– Find new products
– Find product substitutes
– Understand material properties
Approach: Analysis of inorganic libraries
Sputter 2 or 3 metals (or oxides) onto a
silicon wafer (which produces a thin-film)
Use x-ray diffraction to study
crystallographic structure of the thin-film Ronan LeBras, Damoulas, John M. Gregoire,
Sabharwal, Gomes, van Dover
Note: electromagnetic radiation
experiments are very expensive!!
Fuel Cell
Example: study of platinum-tantulum library showed correlation
between crystallographic phase and improved fuel cell
oxidation catalysis (Gregoire et al 2010)
Applications
80
Problem Definition
81
Problem Definition
Al
Fe
Si
82
Problem Definition
Al
(38% Fe, 45% Al, 17% Si)
Fe
Si
83
Problem Definition
Al
[Source: Pyrotope, Sebastien Merkel]
(38% Fe, 45% Al, 17% Si)
Fe
Si
84
Problem Definition
120
Al
100
80
60
40
20
0
15
20
25
30
35
40
45
50
55
60
(38% Fe, 45% Al, 17% Si)
Fe
Si
85
The Problem: Labeling Points with “Phase(s)”
INPUT:
UNDERLYING
PHYSICAL
STRUCTURE:
Al
120
100
80
60
40
20
0
15
20
25
30
35
40
45
50
55
60
Al
120
100
pure phase
regions
80
60
40
20
0
15
20
25
30
35
40
45
50
55
60
70
60
50
40
30
20
10
0
15
20
25
30
35
40
45
50
55
60
65
120
120
100
100
80
80
60
60
40
40
20
20
0
15
20
25
30
35
40
45
50
55
60
Fe
0
15
Si
20
25
30
35
40
45
50
55
60
Si
Fe
OUTPUT:

NP-hard
Al
Labeled regions
mixed phase
region
+

Si
Fe
mixed phase
region
86
Synthesis of Constraint Reasoning and
Machine Learning Approaches for
Material Discovery
Standard Clustering Approaches:
Based on pure Machine learning techniques
+ Good at providing a “rough”, data driven,
global picture
+ Can incorporate complex dependencies
and global similarity structure
- Easily miss critical details (physical properties)
Constraint reasoning and optimization
Two “global alignment kernels”
for a ternary problem instance
Optimization Model
+ Great at enforcing the detailed constraints
+ Can encode and capture the “physics” behind the process
- Do not scale up
87
What’s New: Solving it “Properly” Requires…
… a robust, physically meaningful, scalable,
automated solution method that combines:
Underlying
Physics
A language for
Constraints
Machine Learning
for a “global
data-driven view”
enforcing
“local details”
Constraint Programming model
Significantly outperforming previous methods.
Similarity “Kernels”
& Clustering
88
Outline
I Computational Sustainability
II Examples of Computational Sustainability problems
highlighting research themes
III Institute for Computational Sustainability
IV Building a community in Computational Sustainability
V Conclusions
89
Interdisciplinary Research Projects (IRPs):
The Building Blocks of our Institute

A major part of the effort has been identifying a variety of collaborative projects
that bring computer scientists together with domain experts in a number of fields

Individual projects organized under the umbrella of IRPs
(Interdisciplinary Research Projects)
– Currently 13 Application IRPs, 5 Technical IRPs
– Many more in incubation!
Examples:
Application IRP:
Bird Conservation
Application IRP:
Biofuels
....
Technical IRP:
Dynamical Systems
Technical IRP:
Optimization
Species distribution
modeling
Equilibrium models
Parameter
estimation
Problem structure
Bird migration
Impact of gasoline taxes
Synchronization
Stochastic
optimization
RCW preservation
Carbon offsets in
cap-and-trade
Monitoring of
sustainability
Multi-agent inference
Critical transitions
Balanced designs
Reducing
sampling bias
Application IRPs
IRP Name
Researchers
Collaborators
Students
Reserve Selection and
Wildlife Corridors
Conrad, Gomes,
Sabharwal, Selman,
Shmoys
2.
Bird Conservation
• Species distribution
• Bird migration
• RCW preservation
Lab of O: Rosenberg,
Webb
TCF: Allen, Amundsen,
Vaughan
Cornell: Conrad,
Damoulas, Gomes,
Sabharwal, Shmoys
OSU: Dietterich,
Hutchinson, Sheldon,
Wong
Lab of O:
Farnsworth, Fink,
Hames,
Hochachka,
Kelling, Wood
Cornell: Hooker,
Joachims,
Riedewald
Ahmadizadeh,
Castorena, Dilkina,
Elmachtoub, Finseth
3.
Fishery Management
and Fish Preservation
Conrad, Gomes,
Sabharwal, Selman,
Yakubu, Zeeman
Jerald, Ziyadi
Alam, Ermon, Haycraft,
Li, Wiley
4.
Disease Prevention in
Dairy Cows
Damoulas, Ellner,
Gomes, Sabharwal,
Selman
Grohn, Cha
Ahmadizadeh, LeBras,
Prakash, Smith, Teose
Forest Fires
Borradaile, Dietterich,
Montgomery,
Rusmevichientong,
Shmoys, Wong
1.
5.
Dilkina, Dimiduk,
Drusvyatskiy
Ferst, Spencer,
Gagnon, Houtman,
Mark, Reynolds
continued …
Application IRPs
IRP Name
6.
Modeling Ecological
Dynamics
Researchers
Ellner
(continued )
Collaborators
Students
Jones, Mao-Jones,
Ritchie, King, Rohani,
Ionides, Kendall,
Reumann, Newman,
Ferrari, Hooker
7.
Biofuels
•Equilibrium models
•Cap-and-trade
Bento, Walker
Anderson
Bergstrom, Cuvilliez,
Kang, Klotz, Landry,
Latza, Mayton, Patel,
Roth, Stanislaus,
Taranto, Weiz
8.
Pastoral Systems in
East Africa
Barrett, Gomes,
Selman, Wong
Kariuki, Leeuw, Smith
Guo, Toth, Zhuo
9.
Poverty Mitigation and
Food Insecurity
Barrett, Damoulas,
Gomes, Lentz,
Naschold
Allen, Bell, Lang,
Maxwell, McGlinchy,
Murray
Dilkina, Guo, Harou,
Lang
10.
Invasive Species
Conrad, Shmoys
Dietterich,Gomes
Hutchinson, Sheldon
11.
Impact of Climate
Change
Guckenheimer,
Mahowald, Zeeman
Many colleagues in
climate science
12.
Stress/Environment
Interactions
Zeeman
McCobb
Gribizis
13.
Material Discovery
Damoulas, Gomes,
Sabharwal
Gregoire, van Dover
LeBras
Yang, Ermon, Spencer
Technical IRPs
IRP Name
Researchers
Collaborators
Students
Dynamical Systems
• Parameter estimation
14.
• Synchronization
• Monitoring of sustainability
Guckenheimer,
Strogatz, Zeeman
Haiduc, Kuehn
15.
Network Science
• High dimensional data
• Sensor networks
• Dynamics & Sync
Gomes, Hopcroft,
Selman, Strogatz
Ermon, Sheldon
16.
Bridging Constraint
Reasoning and Machine
Learning
Damoulas,
Dietterich, Gomes,
Sabharwal
17.
Combinatorial Reasoning
and Stochastic Optimization
• Problem structure
• Stochastic optimization
• Multi-agent inference
• Balanced designs
Gomes, Sabharwal,
Selman, Shmoys,
Sheldon
18.
Large Scale Computing
Paradigms for Massively
Large Data Sets and
Simulation-Intensive Studies
Gomes, Sabharwal,
Selman
Gregoire, van
Dover
LeBras
Malitski,
Sellmann
Cheung, Dilkina,
Ermon, Guo, Kroc,
LeBras, Perrault,
Ramanujan
Ahmadizadeh,
Dilkina, Kroc
93
Transformative Computer Science Research:
Driven by Deep Research Challenges posed by Sustainability
Design of policies to manage natural resources translating
into large-scale optimization and learning problems,
combining a mixture of discrete and continuous effects, in a
highly dynamic and uncertain environment  increasing
levels of complexity
Study computational problems as natural phenomena
 Science of Computation
Many highly interconnected components;
 From Centralized to Distributed:
multi agent systems
Distributed
Highly
Interconnected
Components / Agents
Dynamics
Complex dynamics and Multiple scales
 From Statics to Dynamics:
Complex systems and dynamical systems
Large-scale data and uncertainty
 Machine Learning, Statistical Modeling
Constraint Satisfaction and Optimization
Complex decision and optimization problems
 Constraint Reasoning and Optimization
Complexity levels in
Computational Sustainability Problems
Data and Uncertainty
94
Transformative Computer Science Research:
Driven by Deep Research Challenges posed by Sustainability
Design of policies to manage natural resources translating
into large-scale optimization and learning problems,
combining a mixture of discrete and continuous effects, in a
highly dynamic and uncertain environment
Constraint
Reasoning
& Optimization
Gomes PI,W, Hopcroft Co-PI
SelmanCo-PI, ShmoysCo-PI
 increasing levels of complexity
Study computational problems as natural phenomena
 Science of Computation
Many highly interconnected components;
 From Centralized to Distributed:
multi agent systems
Complex dynamics and Multiple time scales
 From Statics to Dynamics:
Complex systems and dynamical systems
Large-scale data and uncertainty
 Machine Learning, Statistical Modeling
Complex decision and optimization problems
 Constraint Reasoning and Optimization
Resource Economics,
Environmental
Sciences & Engineering
AlbersCo-PI,W, Amundsen, Barrett, Bento,
ConradCo-PI, DiSalvo, MahowaldW,
MontgomeryCo-PI,W
Rosenberg,SofiaW, WalkerAA
Environmental &
Socioeconomic
Needs
Dynamical
Models
Guckenheimer
Strogatz
ZeemanPI,W
YakubuAA
Transformative
Synthesis:
Dynamics &
Learning
Science of Computation
Science of Computation
Data
& Machine
Learning
DietterichPI
WongCo-PI
ChavarriaH
95
Institute Activities
Education
Research
Coordinating
transformative
synthesis
collaborations
Postdocs
Doctoral
students
Research
seminar
series
Honors
projects
Summer REU program targeting
minority students
Interdisciplinary
Research
Projects (IRPs)
Computational Sustainability courses
ICS
Outreach
K-12 Activities
Citizen Science
Lab. Of Ornithology
Rocky Mountain
Research Station
Conservation
Fund
Cornell
Cooperative
Extension
Travel Museum
The Nature Conservancy
Building Research Community
Conference &
Workshops
Web Portal
Seminars
External
Collaborations
Host visiting
Scientists
Computational Sustainability Community
Conferences, Workshops,
Discussion Groups
CompSust’09: Cornell, 2009
Blogs:
Over 225 international researchers from several
disciplines and institutions (universities, labs, government)
International Workshop on Constraint
Reasoning and Optimization for
Computational Sustainability
• CROCS-09 at CP-09
NIPS-09 Mini Symposium:
Machine Learning for
Sustainability
CompSust’10: MIT, 2010
Over 150 participants and lots of topics!!
• CROCS at CPAIOR-10
• CROCS at CP-10
AAAI 2011 Special Track on
Computational Sustainability
February 3, 2011: abstracts due
February 8, 2011: papers due
Conclusions
Computational Sustainability
Interdisciplinary field that aims to apply techniques
from computing and information science, and related fields
(e.g., engineering, operations research, applied
mathematics, and statistics ) for Sustainable Development.
Sustainable Development
encompasses balancing
environmental, economic, and
societal needs.
99
Climate
Public
Policy
Energy
Natural
Resources
Social
Factors
Sustainability
Economic
factors
Human
Factors
Education
100
Computational Sustainability
attempt at a definition
Wide interdisciplinary field , encompassing disciplines as diverse as
economics, sociology, environmental sciences and engineering , biology,
crop and soil science, meteorology and atmospheric science.
Focus:
Develop computational models, methods and tools for a broad range of
sustainability related applications and task:
from supporting decision making and policy analysis concerning the
management and allocation of natural resources
to the development of new sustainable techniques, products, practices,
attitudes, and behaviors.
Key challenge: to effectively and efficiently establish interdisciplinary collaborations
– the level of interconnectedness of social, economic, and environmental issues
makes it really challenging!
101
Computational Sustainability
Sustainability fields
New challenging applications
for Computer Science
New methodologies
In Computer Science
Computational Thinking that will
provide new insights into
sustainability problems:
New methodologies
In Sustainability fields
Computer science and related fields
Analogy with Computational Biology
Summary
Computational Sustainability has great potential to
advance the state of the art of computer science and
related disciplines with unique societal benefits!
Thank You! 103
Why Computational Sustainability?
Thank you!
From the Cloud Institute, NY
104
Statistics
Alg.
Theory
eScience
Climate
Public
Policy
Natural
Resources
Networks
&
Systems
AI
Social
Factors
CompSust
Computational
Energy
Sustainability
Appl.
Mathematics
HCI
Economic
factors
Human
Factors
Education
Game
Theory
Systems
Eng.
Operations
Research
105
Download