Bulk Interpolation using R Environment Jiří Kadlec – Aalto University, Finland Pavel Treml – Masaryk Water Research Institute and Charles University, Czech Republic 20.9.2013 FOSS4G Conference - Nottingham Fields with Observations in time and space What, Where, When (irregular sampling) Time, T “When” t A data value 2013-09-20 c D 4.2 s “What” Praha Vi “Where” Space, S Air Temperature (C) Variables, V Image created by CUAHSI, 2010 Type of data in this tutorial station Observations longitude latitude variable time value Praha 14.430 50.0156 temperature 2013-03-31 -0.2 Praha 14.430 50.0156 temperature 2013-04-01 1.1 Praha 14.430 50.0156 snow 2013-04-01 3.0 Brno 16.254 49.147 temperature 2013-03-31 -0.1 Brno 16.254 49.147 snow 2013-03-01 12.0 • Tens of variables • Hundreds of stations • Hundreds of observations per station • Incomplete data TASK For each time-step, Examine how one or More variables change In space Interpolation • Deterministic methods • Geostatistical methods • Repeated over many time steps • How to automate? Interpolation - Space Interpolation - Time value lat ? ? time lon the R Environment Free statistical software Windows, Mac, Linux Create High-quality graphics Many input data formats – Text file – Vector data (shapefile) – Raster data (grid) Many output data formats – Picture – Text file – Raster, vector Scripting language for automating repeated tasks Case study: Maps of air temperature and snow, Czech Republic • Year 2013 (365 maps) • Require identical color-scale START Load Data sets Select next time Select matching observations • Need to show rivers, boundaries • Need to show point values (to assess interpolation) YES Run Interpolation Create map cartography More times ? NO END Load Data Sets – Raster (SRTM elevation) Library(“sp”) library(“raster") library(“rcolorBrewer") srtm <- raster("srtm_utm.asc") colors=brewer.pal(6, "YlOrRd") intervals = c(0, 250, 500, 750, 1000, 1600) spplot(srtm_proj, col.regions=colors, at=intervals) Visualize raster using spplot Note: You can get DEM for your area in R using: Select packages With needed functions Raster: Load from local file Color ramp setting Load Data Sets – Vector (boundaries, rivers) library(“maptools") library(“sp") Select packages With needed functions border = readShapeLines(“border.shp") border_layer = list("sp.lines", border, lwd=2.0,col=“black") rivers = readShapeLines(“rivers.shp") proj4string(rivers) <- CRS("+proj=krovak +lat_0=49.5 +lon_0=24.83333333333333") rivers_proj <- spTransform(rivers, CRS("+proj=utm +zone=33")) river_layer = list("sp.lines", rivers_proj, lwd=1,col=“blue") layout = list(river_layer, border_layer) spplot(srtm, sp.layout=layout) Visualize vector datasets on top of raster Borders: Load shapefile Rivers: Load shapefile, Reproject vector from Krovak to UTM system Load Text File – Point Data (Stations) At first, we can use a subset (for first date/time in the dataset) data = read.table(“data.txt”, header = TRUE, sep = "\t", dec = ".") st = subset(data, DateTimeUTC == ‘2012-08-12’ & VariableCode==“SNOW") Coordinates(st) = ~Long + Lat proj4string(st) = CRS("+init=epsg:4326") stations <- spTransform(st, CRS("+proj=utm +zone=33")) stations_layer = list("sp.points", stations, pch=19, cex=1.0, col="black") labels = list("panel.text", coordinates(stations)[,1], coordinates(stations)[,2], labels=stations$value, col="black", font=1, pos=2) layout = list(stations_layer, labels) Reproject from Lat/Lon to UTM system All Layers together (raster, vector, points) (We use different color ramp in this example) Interpolation: IDW and Kriging Methods Interpolation: Covariate (Elevation) model = lm(value ~ elev, data) intercept = model$coefficients[1] slope = model$coefficients[2] plot(value~elev, data) abline(model) In our case temperature is often negatively correlated with elevation TEMP = a * ELEVATION + b Interpolation: Elevation as covariate Using TEMP = a * ELEVATION + b Instead of interpolating temperature directly, we create grid using regression equation and we only interpolate the residuals residuals Interpolated residuals TEMP = a * ELEVATION + b Combination (regression + interpol. residuals Color Breaks, Color Ramps One color ramp and set of user-defined color breaks easily re-used for different grids #user-defined color breaks colors = rev(brewer.pal(8, "YlGnBu")) brk <- c(-40,-35,-30,-25,-20,-15,-10,-5,0) plot(grid,breaks=brk,col=palette(colors) RColorBrewer packages provides Pre-defined color ramps Example Final Map: One time step Combines the previous steps … Snow depth (cm): 2013-01-01 Saving map to file • Set image size • Set image resolution • Other options (margins, multiple maps in one image) figure = “2012-02-07.jpg” png(filename = figure, width = 1500, height = 1000, pointsize = 25, quality = 100, bg = "white", res = 150) DO THE PLOT COMMANDS HERE dev.off() PNG picture JPEG picture PDF file WMF file .asc ascii grid file (for GIS softwares) …….. Bulk Interpolation: “FOR” loop (multiple time steps) START Schema of the Loop Load Data sets for (j in 1:length(timesteps)) { # select subset of observations # run interpolation # create map # export map to file (picture, pdf, …) } Select next time Select matching observations YES Run Interpolation Create map cartography More times ? NO END Final Map: Multiple time steps R as Compared with Desktop GIS Desktop GIS R Statistical Software Environment Maps are highly interactive (-) Maps are static Create small number of maps with graphical user interface (-) Some commands counter-intuitive for novice user Automated cartography (label placement…) (+) Create very large number of maps with same cartographic symbology Map-Layout, model builder tools (+) Task is easy to automate and reproducible ArcGIS, QGIS, SAGA, MapWindow,… IDV, Panoply, … (+) Good documentation and large user base Desktop GIS: project file (.mxd, .qpj, …) R: script (.R) References BOOKS • Bivand, Roger S., Edzer J. Pebesma, and Virgilio Gómez Rubio. Applied spatial data: analysis with R. Springer, 2008. • Hengl, Tomislav. A practical guide to geostatistical mapping of environmental variables. Vol. 140. No. 4. 2007. WEBSITES • www.spatial-analyst.net • www.r-project.org • http://hydrodata.info/api/ (source of data in our demos) R-PACKAGES USED RColorBrewer, maptools, rgdal, sp, raster Try the scripts yourself! hydrodata.info/R/ • Sample scripts for bulk interpolation • This presentation • Source data { central Europe meteorological observations }