Online GeoSpatial Processing (OLGP): An Experimentation With UMN-MapServer
  Ranga Raju Vatsavai
  SRG, Department of Computer Science and Engineering
  RSL, Department of Forest Resources
  University of Minnesota
  Plan-B Presentation on March 26, 2003
  Committee: Prof. Shashi Shekhar, Prof. Jaideep Srivastava, Prof. Thomas E. Burk

Outline
  Motivation
  Contributions
  Related Work
  Our Approach
  Architecture
  GeoProcessing
  Design Issues
  Validation (Case Study)
  WebGIS Architectures
  NRAMS and kNN Applications
  Conclusions

Motivation
  Proper use and monitoring of environmental resources requires remote sensing and GIS
    Invaluable input for natural resource analysis and mapping
  Problem?
    Timely and accurate data on land use
    Regular availability
    Lack of an efficient and easy-to-use delivery mechanism
  Internet
    The Web has become popular as a vehicle for information distribution and client/server applications
    GISes are standalone
    The Web offers a convenient way to share complex multimedia data

Background
  ForNet
    One of 18 Remote Sensing Database (RSD) programs funded by NASA
    MapServer and ImageServer
  MapServer
    Monolithic CGI program
    Map creation, simple feature query, feature annotation, feature classification, on-the-fly projection, etc.
    MapScript: rapid prototyping of web applications with server-side scripting languages
  Beyond WebMapping (OLGP extension to MapServer)
    GeoSpatial analysis and native DBMS support

Contributions
  Beyond Web Mapping
  Efficient Implementation
  Remote Sensing, Sampling Data (FIADB)
  Extending Queries
  Load Balancing Client/Server Approach, Fine-tuning
  Integration of disparate data sources
  Online GeoSpatial Processing (OLGP)
  Arbitrary region of interest (ROI)
  Integration with RDBMS
  Innovative Applications: from theory to practice

WebGIS Architectures
  WWW
    Intelligent mix of protocols
    Popularity can be attributed to:
      Client/server handshaking and HTTP
      HTML and XML
      User-friendly Web clients (Netscape, IE, ..)
      Recent advancements in development environments
  WebGIS
    Main components: the client, the server, and the network
    Limitations: no support for geographic elements
    Initial developments: visualization (MapViewer from Xerox PARC)

Related Work - WebGIS Architectures
  Initial focus: map visualization (e.g. MapViewer, Xerox)
  First noticeable WebGIS: GRASSLinks (UC Berkeley)
  Industry’s initial response
    CGI wrappers to their standalone GIS
    This resulted in “thin-client/fat-server” systems
    Limitations
      Server is overburdened with data access and analysis
      As the number of requests increases, server performance decreases
    Benefits
      Users do not need any additional resources
  Recent advances in Internet development environments
    Applets, ActiveX controls, and extendible Web clients (“plug-ins”)
    Client-side GIS
    Resulted in “thick-client/thin-server” systems

WEBSAS - Architecture
  “Balanced Client/Server” paradigm
  3-tier architecture
    Tier 1: Client
    Tier 2: Application Server
      CGI Module (+MapServer)
      GeoSpatial Analysis System
      Communication System
    Tier 3: GeoSpatial Database Access System
      Generic image support (BIL/BSQ/BIP)
      Native RDBMS support (MySQL, Oracle, etc.)

GeoSpatial Analysis
  NDVI and NDVI profiles
    Availability of multi-temporal AVHRR imagery made it possible to:
      Study plant phenology
      Quantitatively describe NPP patterns in time and space
      Monitor and map natural resources at regional and global scales
    A temporal profile is a graphical plot of sequential NDVI observations against time.
    These profiles quantify the seasonality and dynamics of remotely sensed vegetation.
  ROI and polygon-based queries
  Change detection

GeoSpatial Analysis – Spatial Interpolation Queries
  Given
    A set of sample plots (locations) and a set of corresponding attributes
    A set of spatial database layers (RS, ..)
    A user-specified arbitrary region of interest (ROI)
  Find
    Estimates for each location inside the ROI, i.e. generalize queries over space
  Constraints
    Non-numerical attributes, auto-correlation
  Objective
    Minimize error

GeoSpatial Analysis – Spatial Interpolation Queries
  Given: FIADB
    SURVEY(..,statecd,cycle,subcycle,…)
    COUNTY(..,statecd,unitcd,countycd,..)
    PLOT(..,statecd,cycle,unitcd,plot,..)
    SUBPLOT(..,statecd, ..,subp, ..)
    COND(..,statecd,..,conid,..)
    TREE(..,statecd,..,tree,..)
    SEEDLING(..,statecd,..,spcd,..)
  [Figure: query window over sample plots labeled #12, #6, #28, #13, #48, #61, #12]
  Find: estimates at each cell

GeoSpatial Analysis – Spatial Interpolation Queries
  Algorithm
  1) Extract FIA plot-id and coordinates from FIADB
       plot-id[], x[], y[] <- SELECT p.plot, p.lat, p.lon FROM Plot p WHERE p.countycd = '137' and …
  2) Coordinate transform: latitude/longitude into UTM (meters)
       img_x[], img_y[] <- geo_to_utm(p.lat[], p.lon[])

GeoSpatial Analysis – Spatial Interpolation Queries
  Algorithm
  3) For each plot, compute the mean of a 3x3 window
       Signature[][] <- mean(p.plot, DN[][])
  4) For each pixel vector (scan each line, and each pixel in a line)
       pixel[] <- DN_i, where i = 1, 2, .., # of channels

       p.plot   Signature[][]
       ..       ..
       203      [21,38,24,…]’
       204      [28,31,22,…]’
       ..       ..

GeoSpatial Analysis – Spatial Interpolation Queries
  Algorithm
  5) Compute the Euclidean distance between pixel[] and each spectral signature
       Distance[plotid[]] <- euc_dist(pixel[], signature[][])
  [Figure: feature space with Red and IR axes (0 to 255), showing Pixel[] and Signature[][]]

GeoSpatial Analysis – Spatial Interpolation Queries
  Algorithm
  6) Assign the closest FIA plot-id to the output pixel(x,y)
       Opixel(x,y) <- plot-id with min(Distance[])
  7) Repeat (for all pixels)

GeoSpatial Analysis – Spatial Interpolation Queries
  Generic FIA Query
    Calculate the total number of all live white pine trees on timberland in the state of Michigan
      SELECT SUM(p.expcurr * t.tpacurr)
      FROM plot p, cond c, tree t
      WHERE p.statecd = 26 AND (join conditions ..)
  Limitations?
    How about estimates for a census block?

GeoSpatial Analysis – Spatial Interpolation Queries
  Solution
    Extract plot-ids from the plot-id image
    Generate plot-id histogram ({<plot-id, frequency>, …})
    Formulate query (on-the-fly)
      SELECT p.plot, p.expcurr, t.tpacurr
      FROM plot p, cond c, tree t
      WHERE p.statecd = 26 AND (join conditions ..) AND p.plot IN (plot-id-list)
    Results = SUM(frequency[p.plot] * p.expcurr * t.tpacurr)
  [Figure: FCC image and plot-id image (integration)]

       plot   Expcurr   ..
       ..     ..
       203    1.88      ...
       …
       308    2.11

Design Choices – System Level
  Performance issues
    Communication: amount of data to be transferred
      Increasing the speed of the Internet connection
      Decreasing the amount of data to be transferred
        Progressive Vector Transmission: M.J. Egenhofer et al.
        Efficient Spatial Data Transmission in WEBGIS: Z.-K. Wei
    Computation: GeoSpatial analysis functions
      Designing efficient algorithms
      Efficient data structures
  Our approach
    Load Balancing
    Fine Tuning
    Partial Materialization
    System Configuration - Ease of Use

Design Choices
  [Figure: request flow and design choices. Labels: Client Request, Web Server, MapServer, Hard-coded, Server, Mapping, Configuration, LB => C or S?, Fg, Client, Presentation, Template Based, Static, Database, Pre-compute, Global Optimization, Global + Local]

Design Choices – Load Balancing Client/Server
  Where should fg be computed?
    Choices: server, client, pre-realization
  Our approach
    Based on the amount of data to be transmitted over the network: “Output(fg) < Input(fg)”
    Based on response time
    fg on server if ((tf < tc) && ( dp di ) || (dp << di)); otherwise fg on client
    In the server case, data to be transmitted = Output(fg) ( di)
    In the client case, data to be transmitted = Input(fg)

Design Choices – Fg Level Fine Tuning
  Pre-realization
    An important criterion we have adopted in the development of the geospatial database
  Criteria
    Apply fg first and populate the geospatial database if
      the operation is computationally intensive &&
      parameters are fixed (output is the same) &&
      Size(output) Size(input)
    Else apply fg on-the-fly in WEBSAS

Design Choices – Fg Level Fine Tuning
  Multi-temporal image organization: Band Interleaved by Pixel (BIP)
  [Figure: given images, BIP vs. BSQ organization]

Design Choices – Partial Materialization
  Partial materialization
    Division of fg into a sequence of sub-tasks
    If possible, pre-compute one of the compute-intensive sub-tasks
    Example: kNN in queries involving interpolation
  [Figure: query response time vs. storage cost for no, partial, and full materialization]

                                 Storage cost           Query response time
       No materialization (NM)   484 MB (St. Louis)     >10 H
       Partial (PM)              48 MB                  25W-15s, 365W-4m
       Full (FM)                 96 MB/attribute        25W-1s, 365W-1m

Design Choices – System Configuration
  Application is configured using server-side configuration files
    Map Object
    Label Object
    Layer Object
    Feature Object
    Web Object
    Process Object
  Front-end
    Standard HTML elements
    JavaScript

Design Choices – System Configuration
  Map Object
       MAP
         NAME Application’s Name
         STATUS On/Off
         IMAGECOLOR R G B
         UNITS Meters/Feet
         FONTSET Fonts file name
         MARKERSET Filename (Shade/Line)
         SIZE X Y
         Layers
         Scale
         …..
       END
  Layer Object
       Layer
         NAME Layer Name
         GROUP Name
         DATA File Name
         STATUS ON/OFF
         TYPE Annotation|Point|Line|…
         MINSCALE N
         CLASSITEM Column
         CLASS …
         LABELITEM Column
       END

Validation
  Case studies
    The application has been on the Web since March 1999.
    Natural Resource Analysis and Mapping System (NRAMS): a WebGIS application for land managers
    Efficient Interpolation Queries (kNN Application)
    AVHRR/MODIS Download Facility
  Usage statistics
    Public domain software
    Mailing lists
    Over 600 users

NRAMS-Frontend
  [Screenshot: NRAMS front-end. Labels: Analysis Window Size, Mapping, Numerical Window, Histogram Plot, Spectral Plot, Save Image, Browse, Map, Query, Data Layers]

NRAMS – Numerical Window and Histograms
  [Screenshot]

NRAMS – Spectral Profiles
  [Screenshot]

Conclusions
  Visualization, query, and beyond
  Various design choices to improve performance
    Load balancing client/server architecture
    Fine tuning, pre/partial materialization
    Template-based system configuration
  Open source, documentation, mailing lists, trusted user base, dynamic, innovative applications
  Online GeoSpatial Analysis (OLGP)
    600 users, 10 developers, universities, public, private
    Avg. 10,000 data products/month
  Future directions
    Caching: both server side and client side
    Persistent database connections
    Web Coverage Standard

Acknowledgements
  http://terrasip.gis.umn.edu/projects/
  NASA funding
  Prof. Shekhar, Prof. Tom Burk, Prof. Jaideep, Steve Lime (DNR), Perry, Jamie, Mark Hansen, SRG members, RSL members.
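
The Algorithm slides (steps 3 to 7) and the Solution slide for spatial interpolation queries can be illustrated with a small pure-Python sketch. This is not the WEBSAS implementation: the function names, the nested-list image layout, and the clipping of the 3x3 window at image edges are assumptions made for illustration, and the sample values in the usage note are made up.

```python
from collections import Counter
from math import dist  # Euclidean distance (Python 3.8+)


def plot_signatures(image, plots):
    """Step 3: per-channel mean of the 3x3 window centred on each plot.

    image[row][col] is a tuple of DN values, one per channel (assumed layout).
    plots maps plot-id -> (row, col) in image coordinates.
    """
    height, width = len(image), len(image[0])
    signatures = {}
    for plot_id, (row, col) in plots.items():
        # Assumption: the window is clipped where it would leave the image.
        window = [image[r][c]
                  for r in range(max(row - 1, 0), min(row + 2, height))
                  for c in range(max(col - 1, 0), min(col + 2, width))]
        signatures[plot_id] = tuple(sum(ch) / len(window) for ch in zip(*window))
    return signatures


def assign_plot_ids(image, signatures):
    """Steps 4-7: give every pixel the plot-id of its nearest signature."""
    return [[min(signatures, key=lambda pid: dist(pixel, signatures[pid]))
             for pixel in line]
            for line in image]


def roi_estimate(plot_id_image, rows):
    """Solution slide: SUM(frequency[plot] * expcurr * tpacurr).

    plot_id_image holds only the pixels inside the region of interest.
    rows are (plot, expcurr, tpacurr) tuples, as returned by the on-the-fly query.
    """
    frequency = Counter(pid for line in plot_id_image for pid in line)
    return sum(frequency[plot] * expcurr * tpacurr
               for plot, expcurr, tpacurr in rows)
```

For example, with a two-channel image whose left half is dark and right half is bright, `plots = {203: (1, 1), 204: (1, 4)}` yields one signature per half, `assign_plot_ids` labels each half with its plot-id, and `roi_estimate` weights each plot's query rows by the number of pixels assigned to it. Keeping the plot-id image separate from the attribute query is what lets the same image serve any attribute in the database.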