Automated Geoprocessing, Web-based Mapping, and Data Manipulation using Open Source Software
Dylan Keon, NACSE
GEO 580 guest lecture, 9 May 2011
Northwest Alliance for Computational Science and Engineering

Topics
Or… “Other ways of doing things”
• Raster/vector data manipulation
• Automated geoprocessing
• Spatial databases
• Web-based mapping
• Information visualization
• User-defined simulation

Keon - GEO 580 - 9 May 2011

Look Familiar?

NACSE
Northwest Alliance for Computational Science and Engineering
• Interdisciplinary research group established in 1995
• Provider of tools, infrastructure, data, and expertise for the interactive publication of scientific knowledge
• Develops solutions for automated web-based spatial analysis and display of large scientific databases

Open Source Software
Open Source?
• Free - FOSS
• Code is open – anyone can modify/improve/adapt
• Licensing: GPL (GNU General Public License) most common
Examples of large projects
• Linux
• Mozilla (Firefox, Thunderbird, etc.)
• Drupal
• Apache
• PHP, Python

Open Source GIS Software
• Enormous growth in past few years
• Many tools
  • Spatially-enabled databases
  • Desktop applications
  • Web-based mapping tools
  • Programmatic tools
Examples:
• GRASS (Geographic Resources Analysis Support System)
• GMT (Generic Mapping Tools)
• MapServer
• OpenLayers
• GDAL/OGR
• GEOS
• PROJ.4
• QGIS (Quantum GIS)

Open GIS Standards
Open Geospatial Consortium (OGC)
• Consortium of over 400 companies, agencies and universities (including OSU)
• “Develop encoding specifications that enable interoperability among diverse geospatial data stores, services, and applications”
• Reduces need to duplicate data, eases data updates
• Enables usage across diverse formats, projections
Example specifications:
• Web Mapping Service (WMS)
• Web Feature Service (WFS)
• Web Coverage Service (WCS)
• Sensor Model Language (SensorML)
• Geography Markup Language (GML)
• Keyhole Markup Language (KML)
www.opengeospatial.org
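To make the WMS specification listed above more concrete, here is a minimal sketch of how a client might assemble a WMS 1.1.1 GetMap request URL. The endpoint URL, layer name, and bounding box are hypothetical placeholders and do not refer to any service from this lecture; only the request parameter names come from the WMS specification.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def wms_getmap_url(base_url, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; bbox is (minx, miny, maxx, maxy)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),   # comma-separated layer names
        "STYLES": "",                 # empty string requests default styles
        "SRS": srs,                   # WMS 1.3.0 uses CRS instead of SRS
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer, for illustration only.
url = wms_getmap_url("http://example.org/wms", ["counties"],
                     (-125.0, 41.9, -116.4, 46.3), 800, 400)
print(url)

# Parse the query string back to confirm what the server would receive.
query = parse_qs(urlparse(url).query, keep_blank_values=True)
print(query["BBOX"][0])
```

Because every WMS server accepts the same parameter names, the same function could in principle be pointed at any compliant server by changing only the base URL and layer list.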

Open GIS Standards
Open Source Geospatial Foundation (OSGeo)
• Supports collaborative development of key community-led open source GIS projects
• Provides infrastructure, funding, other support
• Promotes freely available data
• Example projects: MapServer, OpenLayers, GDAL/OGR, GEOS, PostGIS, Quantum GIS
• Supports annual conferences
  • FOSS4G 2011 in Denver!
www.osgeo.org

Raster Data Manipulation
GDAL (Geospatial Data Abstraction Library)
• Translator library for raster GIS data formats
• Reads over 120 raster data formats, writes to ~60
• Common formats supported
  • GeoTIFF (read/write)
  • ArcInfo binary grid (read only)
  • ArcInfo ASCII grid (read/write)
  • ENVI and ESRI .hdr labeled (read/write BSQ/BIP/BIL)
  • HDF4, HDF5, NetCDF
• Large number of utilities
  • gdal_translate, gdalwarp, gdal_merge.py, gdal_rasterize, etc.
www.gdal.org

Raster Data Manipulation
GDAL Example: Transform raster data
Use gdal_translate to convert raster data from one format to another:
  > ls
  us_tmax_2010.07.asc
  > gdal_translate -of GTiff us_tmax_2010.07.asc us_tmax_2010.07.tif
  Input file size is 7025, 3105
  0...10...20...30...40...50...60...70...80...90...100 - done.
  > ls
  us_tmax_2010.07.asc us_tmax_2010.07.tif
Other flags: -a_srs (projection), -a_nodata (nodata value), -stats (calculate stats), -projwin (clip), etc…
www.gdal.org/gdal_translate.html

Raster Data Manipulation
GDAL Example: Reproject raster data
Use gdalwarp to transform raster data from one projection to another:
  > gdalwarp -s_srs 'EPSG:4326' -t_srs 'EPSG:102003' us_tmax_2010.07.tif us_tmax_2010.07_aea.tif
  Creating output file that is 7920P x 4269L.
  Processing input file us_tmax_2010.07.tif.
  Using internal nodata values (eg. -9999) for image us_tmax_2010.07.tif.
  0...10...20...30...40...50...60...70...80...90...100 - done.
Other flags: -r (resample method), -multi (multiprocessor), -cutline (mask), -csql (db mask), etc…
www.gdal.org/gdalwarp.html
www.spatialreference.org

Raster Data Manipulation
GDAL Example: Mosaic raster data
Use gdal_merge.py to mosaic multiple rasters:
  > python gdal_merge.py -of GTiff -o corvallis_mosaic.tif *.tif
  0...10...20...30...40...50...60...70...80...90...100 - done.
Size of one input tile:
  > ls -sh 11s5w20.tif
  9.4M 11s5w20.tif
  > gdalinfo 11s5w20.tif | grep 'Size is'
  Size is 1787, 1827
Size of the resulting mosaic:
  > ls -sh corvallis_mosaic.tif
  654M corvallis_mosaic.tif
  > gdalinfo corvallis_mosaic.tif | grep 'Size is'
  Size is 11726, 19488
Other flags: -ps (output pixel size), -separate (put each input in separate band), etc…
www.gdal.org/gdalwarp.html

Raster Data Manipulation
Why is this useful?
• Utility, flexibility
• Speed
• Packages can be combined
• Automation
• Complex processing via scripted programming

Raster Data Manipulation
Utility: automate processing
  > for i in /data/ascii_grids/*.asc;
  > do echo "processing $i...";
  > gdal_translate -of GTiff "$i" "/data/geotiffs/$(basename "$i" .asc).tif";
  > done
Endless possibilities using Python, Perl, etc.
Speed: gdal_translate vs.
arcpy.ASCIIToRaster_conversion, converting 10,000 ASCII raster files:
  gdal_translate (~4 sec per file):  10000*4/3600  = 11.11 hours
  arcpy (~35 sec per file):          10000*35/3600 = 97.22 hours

Vector Data Manipulation
OGR (OGR Simple Features Library)
• Reads ~60 vector data formats, writes to ~30
• Common formats supported
  • ESRI Shapefile (read/write)
  • ESRI Personal GeoDatabase (read only)
  • ArcInfo Binary Coverage (read only)
  • GML, KML (read/write)
  • Spatial Databases (PostGIS, Informix, SQL Server - all r/w)
• Command-line utilities
  • ogrinfo, ogr2ogr, ogrtindex
www.gdal.org/ogr

Vector Data Manipulation
OGR Example: Transform vector data
Use ogr2ogr to convert vector data from one format to another:
  > ogr2ogr -f "KML" counties.kml counties.shp
Use ogr2ogr to subset a shapefile:
  > ogr2ogr -f "ESRI Shapefile" -where "STATE_NAME='Oregon' AND NAME IN ('Benton','Lane','Polk')" counties_local.shp counties.shp
Use ogr2ogr to subset & reproject a shapefile:
  > ogr2ogr -f "ESRI Shapefile" -where "STATE_NAME='Oregon' AND NAME IN ('Benton','Lane','Polk')" -t_srs "EPSG:2992" counties_local.shp counties.shp
www.gdal.org/ogr2ogr.html

Coordinate Reprojection (PROJ.4)
Example: Batch reproject a set of coordinates
  # ---- From Coordinate System ----
  # Lambert Conformal Conic
  # +init=epsg:3111 +proj=lcc +lat_1=-36 +lat_2=-38 +lat_0=-37 +lon_0=145
  # +x_0=2500000 +y_0=2500000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m
  # +no_defs
  # ---- To Coordinate System ----
  # Lat/long (Geodetic)
  # +proj=latlong +ellps=GRS80 +towgs84=0,0,0,0,0,0,0
  #
  # begin input processing
  cs2cs +init=epsg:3111 -f "%.7f" <<EOF
  1 2264101 2477454
  2 2265580 2477495
  ...
  # output
  27606 142.3430129 -37.1735168 0.0000000
  27607 142.3596755 -37.1735182 0.0000000
  ...
  55186 148.4020836 -37.2040045 0.0000000
proj.osgeo.org

Spatial Databases (PostgreSQL/PostGIS)
Example: Simple spatial query
Find the area of Benton County:
  db=> SELECT ST_Area(the_geom)
       FROM gis.counties
       WHERE name = 'Benton' AND state_name = 'Oregon';
        st_area
  -------------------
   0.198952480006028
This is in square decimal degrees, so first transform to an appropriate projection (Oregon Lambert, EPSG:2992, units of feet), then divide by 5280^2 to convert square feet to square miles:
  db=> SELECT ST_Area(ST_Transform(the_geom, 2992))/(5280^2) AS sqmi
       FROM gis.counties
       WHERE name = 'Benton' AND state_name = 'Oregon';
         sqmi
  ------------------
   678.641255124375
postgis.refractions.net

Spatial Databases (PostgreSQL/PostGIS)
Example: More spatial queries
Find all counties named Benton…which is largest?
  db=> SELECT state_name, name,
              round((ST_Area(ST_Transform(the_geom, 2992))/(5280^2))::numeric,2) AS sqmi
       FROM gis.counties
       WHERE name = 'Benton'
       ORDER BY sqmi;
   state_name  |  name  |  sqmi
  -------------+--------+---------
   Indiana     | Benton |  407.78
   Minnesota   | Benton |  413.23
   Mississippi | Benton |  419.11
   Tennessee   | Benton |  444.64
   Oregon      | Benton |  678.64
   Iowa        | Benton |  719.21
   Missouri    | Benton |  760.10
   Arkansas    | Benton |  896.50
   Washington  | Benton | 1760.89
  (9 rows)
postgis.refractions.net

Spatial Databases (PostgreSQL/PostGIS)
Example: More spatial queries
Determine which county a lat/lon point falls within:
  db=> SELECT state_name, name AS county_name
       FROM gis.counties
       WHERE ST_Within(ST_GeomFromText('POINT(-121.4328 45.9881)',4326), the_geom);
   state_name | county_name
  ------------+-------------
   Washington | Klickitat
How about township/range…join to PLSS layer (the_geom is qualified with a table alias because both tables have a geometry column):
  db=> SELECT c.state_name, c.name AS county_name, p.twnrng, p.sqmi
       FROM gis.counties c, gis.plss p
       WHERE p.state = c.state
         AND ST_Within(ST_GeomFromText('POINT(-121.4328 45.9881)',4326), c.the_geom)
         AND ST_Within(ST_GeomFromText('POINT(-121.4328 45.9881)',4326), p.the_geom);
   state_name | county_name |  twnrng  |  sqmi
  ------------+-------------+----------+--------
   Washington | Klickitat   | T6N R11E | 34.102
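The point-in-county lookup above can also be issued from a program rather than typed at the db=> prompt. The sketch below only builds the parameterized SQL and its parameters. It assumes the gis.counties table and the_geom column from the queries above, and a DB-API driver that uses %s placeholders, such as psycopg2 (an assumption, not something named in the lecture). The helper names point_wkt and county_lookup are hypothetical.

```python
def point_wkt(lon, lat):
    """Return the WKT text for a lon/lat point, e.g. 'POINT(-121.4328 45.9881)'."""
    return "POINT({} {})".format(lon, lat)


# %s placeholders are filled in by the database driver, never by string
# formatting, so user-supplied coordinates cannot alter the SQL itself.
COUNTY_LOOKUP_SQL = (
    "SELECT state_name, name AS county_name "
    "FROM gis.counties "
    "WHERE ST_Within(ST_GeomFromText(%s, %s), the_geom)"
)


def county_lookup(lon, lat, srid=4326):
    """Return (sql, params) for finding the county that contains a point."""
    return COUNTY_LOOKUP_SQL, (point_wkt(lon, lat), srid)


sql, params = county_lookup(-121.4328, 45.9881)
print(sql)
print(params)

# With an open DB-API connection `conn` (hypothetical), this could be run as:
#   cur = conn.cursor()
#   cur.execute(sql, params)
#   rows = cur.fetchall()
```

Keeping the SQL text separate from the coordinates is a design choice that matters for web applications, where the point usually comes from a map click.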
postgis.refractions.net

Spatial Databases (PostgreSQL/PostGIS)
Why is this useful?
• Spatial operations handled internally
• Sub-second queries (important for web!)
• Can connect to database programmatically
• Programs that produce map output can connect to the database directly (no shapefiles, etc. needed)
• Can combine multiple complex spatial operations
• Can join spatial and “non-spatial” data based on the result of spatial queries

Programmatic Tools
Raster manipulation with GDAL via Python
• GDAL has native hooks into Python
• Fairly simple to do sophisticated raster processing with a few commands
• Command-line Python example: determine the average value of a raster across all rows and columns:
  # Data are in ArcInfo binary format (integer)
  >>> from osgeo import gdal
  >>> import numpy
  >>> dataset = gdal.Open('/gis/US_Elevation/usdem_2k/001001.adf', gdal.GA_ReadOnly)
  >>> array = dataset.ReadAsArray()        # whole raster as a 2-D array
  >>> avg = numpy.average(numpy.ravel(array))  # flatten, then average all cells
  >>> avg
  -0.0071967281963325313

Web-based Mapping Tools
Examples (2)
MapServer (server-side), mapserver.org
• Powerful open source web mapping software
• OGC-compliant, supports many formats
• Runs on multiple platforms
• Good spatial db support (PostGIS, ArcSDE, etc.)
• Integrates with GDAL to display raster data
• Programmatic manipulation via PHP, Python, Perl, etc.
OpenLayers (client-side), openlayers.org
• Multiple configurations, base layers, etc.
• Navigation like Google Maps
• Can overlay layers from server-side mapping tools
• Easy to “drop in” to a website

Web-based Mapping Tools
Live Examples…
TFDD
• http://ocid.nacse.org/tfdd
• OpenLayers, MapServer, PostGIS/PostgreSQL, GDAL, custom PHP code
IHNV & VHSV
• http://gis.nacse.org/ihnv
• http://gis.nacse.org/vhsv
• IHNV & VHSV use MapServer, tiled TIGER/Line data, custom PHP code
SC Your Way 2009
• http://scyourway.supercomputing.org/transportation/by_map
GEO 560 project example
• http://geo.oregonstate.edu/~keondy/geo560/wells

Simulation Input - TCP
• Shared resource with multiple tsunami modeling codes
• Common parameterization
• Web-based portal
• Model codes run on large parallel systems, computed outputs downloaded by user

Portal Design and Use
• Fine grids must nest exactly within coarse grids
• 5:1 or 3:1 ratio
• No border intersections
• Portal talks to spatial database to verify alignment on the fly

Research Goals
• Model potential human response to tsunami events via dynamically generated user-defined simulations
• Use tsunami computational modeling output from the TCP, combine visualization with human response simulation data
• Entirely Web-based interface using open source client- and server-side software
• Support interactive animation and querying

Simulation System Architecture
[Architecture diagram. Labels: TCP (U,V velocities, bathy/topo data, runup series, max inundation); simulation config Website (PHP, JS); params; Filesystem (geoTIFFs, config files, data); Mapping Engine (MapServer, GDAL, Proj.4; config files, map images); Simulation Model (C++, Python; geoTIFFs, geometries); time series of events, point objects, and person data; Spatial DB (PostgreSQL, PostGIS, GEOS, Proj.4; time-series data); results Website (PHP, JS, Google Viz API, OpenLayers)]

Simulation Input Parameters

Simulation Model
• Casualty model based on Yeh 2010
• 11 variables represent average male/female: foot breadth, hip breadth, shoulder breadth, stature, hip height, shoulder height, foot length, height, abdominal depth, chest depth, weight
• Population distribution based on configuration parameters, structure data, time of day
• Intelligent routing over existing road networks
• Algorithmic assessment of casualty status at each time step
Yeh, H. 2010. Gender and age factors in tsunami casualties. Natural Hazards Rev 11(1): 29-34.

Spatial Database
• Open source software – PostgreSQL, PostGIS, GEOS, Proj.4
• Simulation model code writes output to the database for each time step:
  • Event data
  • Point object data
  • Person data
• Seaside simulation: initial condition of 6409 people with 400 time steps generated ~740K events

Mapping Engine
• Open source software – MapServer, GDAL, Proj.4
• Mapping engine code renders for each time step:
  • Tsunami runup (colormapped geoTIFFs)
  • Person location
  • Person casualty status
• Vector data rendered directly from spatial database
• OpenLayers gives significant client-side flexibility, and can include multiple server-side spatial data sources

Simulation Results – Map Interaction

Simulation Results – Interactive Graph

Simulation Results – Interactive Table

Contact Info
Dylan Keon
KEC 2007
keon@nacse.org -or- keondy@geo.oregonstate.edu
541-737-6608