UN Development Report 2015
Disaster Risk Reduction Chapter
Unlocking the Big Data for Resilience with Google Earth Engine: Spotlight on Flood
Vulnerability Predictions for Senegal
Bessie Schwarz (1), Beth Tellman (2), Christiaan Adams (3)
1. Yale Project on Climate Change Communication
2. Arizona State University
3. Google Inc.
Introduction
Imagine if anyone could type their zip code into a browser and see how climate change will affect their community. What if any community around the world, no matter how small or poor, could conduct a sophisticated hydrological assessment to prepare for coming environmental changes? Revolutions in the amount, resolution, and speed of data today make solutions to these previously unanswerable disaster questions possible. Big Data opens new avenues for science, action, and communication at every stage of the disaster cycle. However, unlocking the promise of the Big Data revolution requires an equal revolution in computing capacity. Does it matter that the Senegalese government has cell phone records for its citizens if it does not have the processing power to analyze them?
This brief focuses specifically on Big Data for geographic information systems (GIS) and
Google Earth Engine (EE), a new cloud-based GIS processing platform and data repository.
EE not only revolutionizes the kinds of geographic questions we can answer and who can answer them, but also makes the solutions more intelligible for policy-makers on the ground.
This brief will first describe EE and its applications for disaster science and management.
Second, it will present a global flood vulnerability prediction model being developed in EE by the authors as an example of what both of these revolutions mean practically for disaster work. Applying the model to Senegal reveals the specific locations and populations in the country currently at risk from inland flooding.
Google Earth Engine: What Is It and What Can It Do for Science?
EE brings together the world's satellite imagery — trillions of scientific measurements dating back over 40 years — in a cloud-based GIS processor and combines it with the massive computing power of a modern data center.i (For a full list of available datasets, see https://earthengine.google.org/#index.) The platform is unique in three primary ways: 1) the amount of data it stores, 2) the speed of its processing, and 3) the fact that it is browser based. EE's data catalog is a multi-petabyte archive of georeferenced datasets, including images from earth-observing satellites and airborne sensors (e.g. USGS Landsat, NASA MODIS, USDA NAIP), weather and climate datasets, and digital elevation models. All these datasets are quickly accessible, and many images have already been cleaned of clouds, greatly reducing time-intensive and occasionally prohibitive research steps like dataset requesting and data cleaning. Furthermore, any user can directly upload their own raster or vector data into the repository, allowing users to share files and customize their analyses.
Second, EE can process massive datasets extremely quickly by parallelizing the work across thousands of central processing units (CPUs). Analysis is done in a just-in-time computation model that enables real-time preview and debugging during algorithm development. This supports sophisticated statistical functions and can reduce computational time by days or even years. Finally, all this analytical power is accessible from any computer with a good Internet connection.ii
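The parallel model can be pictured in miniature: split a raster into tiles, reduce each tile independently on a separate worker, then combine the partial results. The sketch below is our own toy illustration of that pattern in Python; the function names and tiny tile sizes are invented for illustration and say nothing about EE's actual internals.

```python
# Toy illustration of parallel raster reduction: each tile is reduced
# independently, then the partial results are merged. Not EE's internals.
from concurrent.futures import ThreadPoolExecutor

def tile_mean(tile):
    """Reduce one tile of pixel values to a single statistic."""
    return sum(tile) / len(tile)

def parallel_mean(tiles, workers=4):
    """Map tile_mean over all tiles in parallel, then merge the partial
    means, weighting each by its tile's pixel count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        means = list(pool.map(tile_mean, tiles))
    total = sum(m * len(t) for m, t in zip(means, tiles))
    return total / sum(len(t) for t in tiles)

# The parallel answer matches a serial mean over all pixels.
pixels = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
print(parallel_mean(pixels))  # 5.0
```

At EE's scale the same divide-and-merge idea runs across thousands of CPUs, which is what turns multi-year computations into hours.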
In addition to these technical strengths, the tool affords impressive communication capabilities through unparalleled visualizations. Not only are the resulting maps highly engaging, easy to understand, and interactive, but the results are presented on the generally recognizable Google Maps base layer.
EE Is Designed for Applied Tools
Today scientists, independent researchers, and nations are mining EE datasets to map global trends, quantify differences on the Earth's surface, and monitor environmental problems. The platform is being used to detect deforestation, classify land cover, estimate forest biomass and carbon, and map the world's roadless areas. A landmark example is the new version of Global Forest Watch (www.globalforestwatch.org), a decades-old project of the World Resources Institute focused on illegal deforestation. In partnership with the University of Maryland and 40 other organizations, the project analyzed almost 700,000 Landsat images — a total of 20 tera-pixels of Landsat data — using one million CPU hours on 10,000 computers working in parallel. The analysis determined annual global change in forest cover since 2000 (Hansen et al. 2013); see the full results at http://earthenginepartners.appspot.com/science-2013-global-forest. Google scientists estimate that a standard computer would have taken 15 years to complete this analysis.
In another example, a human settlement mapping project of unprecedented size will determine the urban extents of the entire globe. Using Landsat imagery in EE, the team, led by Dr. Paolo Gamba at the University of Pavia, aims to eventually produce a global map at 30 m resolution showing change over the last 10-20 years (Patel et al., 2015). To explore more examples of EE analysis, see the Google Earth Engine Map Gallery: http://maps.google.com/gallery/details?id=z9yCydrmDxbc.k7cU5yQydBjM&hl=en
EE science in the cloud means tangible applications on the ground. The Brazilian environmental group Imazon (imazon.org.br/?lang=en) created its own forest monitoring tools in EE to track Amazon deforestation when it suspected the government's data were inadequate. The group now produces monthly reports (see imazon.org.br/mapas/?lang=en), as well as near real-time alerts when forest loss is detected. When forest alerts sound, Imazon's team of experts on the ground and the engaged public respond. EE-based science can help increase independent ecosystem monitoring, generate more participatory science, and better communicate environmental problems. Similarly, Global Forest Watch today conducts high-resolution loss/gain analyses every month (http://google-latlong.blogspot.com/2014/02/monitoring-worlds-forests-withglobal.html).
EE and Big Data GIS can contribute most in data-poor countries. Previously hidden corners of the earth, including areas most vulnerable to disaster, are now regularly observed from space and detectable through social media. To be fully effective and equitable, Big Data and big computing should include the governments and people of traditionally poor and less technically advanced countries, not just as research subjects or recipients of findings, but as equal partners in analysis and decision-making. We will not create sustainable tools for disaster if all Big Data researchers come from the least vulnerable areas. However, one of the major challenges to empowered Big Data has been developing a cost-effective and distributable tool. EE's speed, flexibility, and accessibility enable more people to perform sophisticated analysis, allow scientific users to respond to needs on the ground, and help researchers better communicate the science to all types of end users.
EE for Disaster and a Global Flood Vulnerability Application
With these tools we can now harness the best of human and natural systems science to predict and visualize disasters and our vulnerability to them. Faster processing power allows for a better understanding of a myriad of disasters, and a more open, browser-based platform allows the science to be more participatory. This can produce specific tools for disaster management, such as hotspot mapping to identify vulnerable areas, improved early warning systems, and monitoring tools to reduce the environmental degradation behind disaster. Increased participation in science helps with risk reduction by increasing community buy-in (Sherbinin, 2014) and harnessing local knowledge.
In collaboration with Google Earth Outreach and Google Crisis Maps, this team is building a practical tool in EE for managing riverine flooding globally. The tool predicts socio-physical vulnerability to flooding at a high, sub-national resolution around the world, in near real-time and under future climate scenarios. The model relies on publicly available satellite imagery in EE and uploaded demographic data to refine a surface of risk inside any area of interest (e.g. county, watershed, or storm prediction zone). In one application, depicted in Figure 1, the model uses weather estimates to dynamically determine the impact of a storm as it approaches New York State.
Figure 1. Mock Web Interface of the Model. The model can be dynamically run on a time step using the latest weather prediction (e.g. the NOAA 5-day flood prediction) to determine vulnerability to an oncoming storm as that storm approaches. This image depicts a sample storm over New York State.
Over the next year, the model will incorporate climate and socio-economic projections. The result will be a Google Maps-like web interface that assesses the impact of today’s flood events, changes in U.S. floodplains given different emissions scenarios, and future losses from increased flooding and changes in society. This information can help individuals, governments, and NGOs better understand current and future flood risks in their localities; it gives them more tools to identify hotspots of vulnerability, prepare for future hazards, and respond to new challenges in a changing world.
Spotlight on Senegal
We applied our flood vulnerability prediction model to Senegal to determine the country's existing vulnerability to riverine flooding. The results show 782 square kilometers of the country to be at high risk. Additionally, 242,739 Senegalese are highly exposed to potential flooding, and 91,732 of these are also highly socially vulnerable.

Figure 2. Physical Vulnerability to Flooding in Senegal. The sections of Senegal most susceptible to riverine flooding are pictured in black (top 10% most vulnerable).
To assess the biophysical dimension of vulnerability, the model computed a qualitative index of flood likelihood for each pixel from several indicators: slope, elevation, location within watersheds with a high percentage of impervious surface, and a high topographic index (the ratio of upstream contributing area to slope). To assess the social vulnerability of Senegal, the model created a qualitative risk surface using three variables for age structure and population density. The indicators of social vulnerability used 100 m resolution data from WorldPop (for more data see www.worldpop.org.uk), a global high-resolution database, and were selected in part based on vulnerability scholarship from the Hazards and Vulnerability Research Institute (Cutter et al., 2003). The social vulnerability index was further aggregated at the department level.
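As a simplified sketch of how such a per-pixel index can be assembled (the equal weights, the minimum-slope clamp, and the function names below are our illustrative assumptions, not the model's published parameters), each biophysical indicator can be rescaled to [0, 1] and averaged:

```python
# Illustrative sketch of a qualitative flood vulnerability index.
# Weights and the slope clamp are assumptions, not the model's parameters.
import math

def normalize(values, invert=False):
    """Rescale indicator values to [0, 1]; invert when low raw values
    (e.g. low elevation, gentle slope) mean higher flood risk."""
    lo, hi = min(values), max(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1 - s for s in scaled] if invert else scaled

def topographic_index(upstream_area, slope_deg):
    """ln(upstream contributing area / tan(slope)), a common wetness
    proxy; slope is clamped at 0.1 degrees to avoid division by zero."""
    return [math.log(a / math.tan(math.radians(max(s, 0.1))))
            for a, s in zip(upstream_area, slope_deg)]

def flood_vulnerability(slope_deg, elevation, upstream_area):
    """Equal-weight combination of three biophysical indicators:
    one value in [0, 1] per pixel (higher = more vulnerable)."""
    wet = normalize(topographic_index(upstream_area, slope_deg))
    flat = normalize(slope_deg, invert=True)   # flatter -> more flood-prone
    low = normalize(elevation, invert=True)    # lower -> more flood-prone
    return [(a + b + c) / 3 for a, b, c in zip(wet, flat, low)]

# A flat, low pixel draining a large upstream area scores highest.
index = flood_vulnerability(slope_deg=[0.5, 10.0, 30.0],
                            elevation=[5.0, 50.0, 300.0],
                            upstream_area=[5000.0, 500.0, 50.0])
```

In the actual model, the top decile of the resulting risk surface is what gets flagged as high risk, as in Figure 2.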
Overall, the most vulnerable Senegalese are located in densely populated, poor, and low-lying areas near rivers. Vulnerable communities are concentrated in three clusters: the northernmost region of Saint-Louis, the southeasternmost regions, and the coastal region around Fatick. These results are limited and coarse, but they serve as an indicator of the places most in need of attention during all stages of the disaster cycle.iii This information could help determine the right populations for social programs and the best areas for infrastructure projects. More expensive, sophisticated, and qualitative assessments could then be conducted for the identified hotspots. The map can also be adjusted in near real-time during a flood to visualize and recommend areas to prioritize for immediate assistance.

The model's final product is an interactive, online map of the results. Given that the data for this model are publicly available and that the EE analytical platform is cloud-based, this map and its underlying data and methods could easily be made public for communication and citizen science. Running the model takes only minutes, given the appropriate data. We suggest that the model be applied to under-researched and under-resourced countries as a quick and inexpensive diagnostic tool.

Figure 3. Combined Socio-physical Vulnerability to Flooding in Senegal. Areas most susceptible to riverine flooding are pictured in blue. Vulnerable areas with the populations most at risk are pictured in pink.
Conclusion
The flooding vulnerability model presented here is just an early example of what the future of disaster science can look like – distributed, democratic, and data-driven. Big computing tools like EE that are open and accessible are the only way to fully leverage Big GIS Data.
This body can tap into this new potential by aiding the development of tangible tools such as hotspot identifiers and early warning systems. It should further assist practitioners in integrating these new technologies and methods into their planning and practice, especially in data-poor countries.
References
Cutter, S.L., Boruff, B.J., Shirley, W.L., 2003. Social Vulnerability to Environmental Hazards.
Soc. Sci. Q. 84, 242–261. doi:10.1111/1540-6237.8402002
Hansen, M., Potapov, P., Moore, R., Hancher, M., Turubanova, S., Tyukavina, A., Thau, D.,
Stehman, S., Goetz, S., Loveland, T., Kommareddy, A., Egorov, A., Chini, L., Justice, C.,
Townshend, J., 2013. High-Resolution Global Maps of 21st-Century Forest Cover
Change. Science. 342: 850–853 doi: 10.1126/science.1244693
Patel, N., Angiuli, E., Gamba, P., Gaughan, A., Lisini, G., Stevens, G., Tatem, A., Trianni, G.
2015. Multitemporal settlement and population mapping from Landsat using
Google Earth Engine. International Journal of Applied Earth Observation and
Geoinformation . 35: 199–208 http://dx.doi.org/10.1016/j.jag.2014.09.005
Sherbinin, A., 2014. Climate Change Hotspots Mapping: What Have We Learned? Climatic
Change. 123: 23–37 doi: 10.1007/s10584-013-0900-7

Notes
i. While EE initially focused on supporting remote sensing, the tool is increasingly useful for vector and other GIS analysis.
ii. We recognize that technical skills are a significant barrier, particularly for underdeveloped countries.
iii. The analysis requires further refinement with more highly resolved data before it can be used for practical purposes.