Geoprocessing with Python and Numpy

```
Geoprocessing with GDAL and Numpy in Python
Delong Zhao
11-03-2011
• GDAL - Geospatial Data Abstraction Library
• Numpy - the N-dimensional array package for
scientific computing with Python.
• Both of them are open source software
Workflow
• Read a geospatial dataset using GDAL
• Do some calculation using Numpy
• Output to a geospatial dataset using GDAL
GDAL
• Supports about 100 raster formats
– ArcInfo grids, ArcSDE raster, Imagine, Idrisi,
– ENVI, GRASS, GeoTIFF
– HDF4, HDF5
– USGS DOQ, USGS DEM
– ECW, MrSID
– TIFF, JPEG, JPEG2000, PNG, GIF, BMP
– See http://www.gdal.org/formats_list.html
Numpy
• the fundamental package needed for scientific
computing with Python.
• a powerful N-dimensional array object
• tools for integrating C/C++ and Fortran code
• useful linear algebra, Fourier transform, and
random number capabilities.
• http://numpy.scipy.org/
Installation
• 1. Enthought python scientific computing
package, includes numpy
– http://www.enthought.com/
• 2. GDAL - Geospatial Data Abstraction Library
– http://www.lfd.uci.edu/~gohlke/pythonlibs/
• Alternatively, all of these are already installed on the
EOMF and Cybercommons servers
Tutorial
• http://itmetr.net/itmetr.cgi/PyIntro
• http://www.gis.usu.edu/~chrisg/python/2009/
• http://www.scipy.org/NumPy_for_Matlab_Users
• https://www.cfa.harvard.edu/~jbattat/computer/python/science/idl-numpy.html
Sample 1
• Read two tif files (red band and nir band)
• Calculate NDVI = (NIR - Red) / (NIR + Red)
• Output NDVI in same projection and
georeference as the input file.
• Numpy example
Algorithm development for global cropping
intensity from 2000-2011
1-crop per year
2-crop per year
3-crop per year
Dynamics of winter wheat and paddy rice fields in Nanjing, Jiangsu, China
[Figure: time series of vegetation indices (NDVI, LSWI, EVI) at a field site in Jiangsu; x-axis: time (8-day interval), 1/1/02 to 1/1/03; y-axis: vegetation indices, 0.0 to 1.0; annotation: winter wheat]
[Photos: (b) 6/11/99 rice field preparation; (c) 7/3/99 2 weeks after rice transplanting]
MODIS 8-day composites of surface reflectance product (MOD09A1)
• Indices: NDSI, NDVI, EVI, LSWI
• Flowchart boxes (labels truncated): Evergreen, Permanent
• Temporal profile analysis of individual pixels
• Cropping intensity (# of crops per year)
• Crop calendar (planting & harvesting dates)
Global Mapping of Croplands
[Figure: vegetation indices (NDVI, LSWI, EVI) for one MODIS pixel in Bangkok area; x-axis: month in 2004 (1 to 12); y-axis: vegetation indices, -0.2 to 1.0; annotations: two "Starting date" markers]
[Figure: vegetation indices (NDVI, LSWI, EVI) for one MODIS pixel in Mekong basin; x-axis: month in 2004 (1 to 12); y-axis: vegetation indices, -0.2 to 1.0; annotations: three "flooding & transplanting" markers]
Cropping Intensity map in 2004
Python multiprocessing
• http://docs.python.org/library/multiprocessing.html
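import multiprocessing
# doprocess (per-file worker) and findfiles (input file listing) are
# user-defined functions not shown on this slide.
pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
pool.map(doprocess, findfiles(root_dir))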
Benchmark
• I ran a benchmark using all 8 CPUs and 16 GB of
memory on one EOMF server.
• It can finish the NDVI, EVI, CLOUD, SNOW, LANDWATER,
FLOOD, and Drought products for 1 MODIS tile in 15
minutes.
• This means we can finish the global 296 tiles of
2000-2011 MODIS data in 786 hours (about 33 days) with
one server.
• We have 6 computation servers, so this drops to about
6 days (786 / 6 = 131 hours) if all the servers can
share the job.
```
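
Sample 1 in the slides (read the red and NIR bands, calculate NDVI, write the result with the same projection and georeference as the input) could be sketched roughly as follows. This is a minimal sketch, not the presenter's code: the function names `compute_ndvi` and `ndvi_from_files` are made up for illustration, the inputs are assumed to be single-band GeoTIFFs of the same size, and pixels where NIR + Red is zero are arbitrarily set to 0.

```python
import numpy as np


def compute_ndvi(red, nir):
    """Return NDVI = (NIR - Red) / (NIR + Red) as float32.

    Pixels where NIR + Red == 0 are set to 0 (an assumption of this sketch).
    """
    # Cast first: subtracting unsigned integer arrays would wrap around.
    red = red.astype(np.float32)
    nir = nir.astype(np.float32)
    total = nir + red
    ndvi = np.zeros_like(total)
    # Divide only where the denominator is non-zero.
    np.divide(nir - red, total, out=ndvi, where=total != 0)
    return ndvi


def ndvi_from_files(red_path, nir_path, out_path):
    """Read two single-band rasters and write NDVI as a GeoTIFF."""
    # Imported here so compute_ndvi can be used without GDAL installed.
    from osgeo import gdal

    red_ds = gdal.Open(red_path)
    nir_ds = gdal.Open(nir_path)
    red = red_ds.GetRasterBand(1).ReadAsArray()
    nir = nir_ds.GetRasterBand(1).ReadAsArray()

    ndvi = compute_ndvi(red, nir)

    driver = gdal.GetDriverByName("GTiff")
    out_ds = driver.Create(out_path, red_ds.RasterXSize, red_ds.RasterYSize,
                           1, gdal.GDT_Float32)
    # Copy georeferencing and projection from the red-band input.
    out_ds.SetGeoTransform(red_ds.GetGeoTransform())
    out_ds.SetProjection(red_ds.GetProjection())
    out_ds.GetRasterBand(1).WriteArray(ndvi)
    out_ds.FlushCache()
    out_ds = None  # dropping the reference closes the file
```

A call such as `ndvi_from_files("red.tif", "nir.tif", "ndvi.tif")` (file names are placeholders) would then cover all three steps of the sample. Keeping the arithmetic in `compute_ndvi` separate from the GDAL I/O makes the Numpy part easy to check on small arrays.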

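The multiprocessing slide calls `doprocess` and `findfiles` without showing them. A self-contained sketch of the same pattern might look like the following. The bodies of both helpers are hypothetical stand-ins: `findfiles` is assumed to list files by suffix (`.hdf` is a guess at the input format), and `doprocess` only returns the file size where the real worker would compute the products for one file.

```python
import multiprocessing
import os


def findfiles(root_dir, suffix=".hdf"):
    """Hypothetical helper: list files under root_dir ending with suffix."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def doprocess(path):
    """Hypothetical worker: placeholder for the per-file computation."""
    return path, os.path.getsize(path)


def run(root_dir):
    """Process every matching file using one worker per CPU."""
    # The with-block shuts the pool down once map() has returned.
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        return pool.map(doprocess, findfiles(root_dir))
```

On platforms that start workers by spawning a new interpreter (for example Windows), `run(root_dir)` should be called from under an `if __name__ == "__main__":` guard. Each worker handles whole files, so this pattern suits the one-tile-per-task layout described in the benchmark.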