Homework 3 – Textbook Problems

Q 5.2) Consider the rough guide to worst-case time complexity of algorithms (Table 5.2, pp. 171). Classify the complexity of the following algorithms into the classes constant, logarithmic, linear, n*log(n), polynomial, and exponential. Assume relevant line-strings have N segments, polygons have N vertices, and graphs have N vertices and E edges.

Douglas-Peucker algorithm to discretize arcs (sec. 5.2.3, pp. 176-177)
Compute area of a simple polygon (sec. 5.5.1, pp. 195-197)
Compute centroid of a polygon (sec. 5.5.1, pp. 195-197)
Point in polygon (sec. 5.5.2, pp. 197-199)
Intersection of a polygon-pair, each with N vertices (sec. 5.5.3, pp. 201-202)
Depth-first or breadth-first graph traversal (sec. 5.7.2, pp. 213-217)
Single-pair shortest path in a graph (sec. 5.7.2, pp. 213-217)
All-pair shortest paths in a graph (sec. 5.7.2, pp. 213-217)
Hamiltonian circuit (or Traveling salesman problem) (sec. 5.7.2, pp. 213-217)

Answer

Problem: Complexity

Douglas-Peucker algorithm to discretize arcs: O(n^2) in the worst case (polynomial), where n = number of points on the input arc. Each recursive call scans its sub-arc for the farthest point, and the splits can be maximally unbalanced.

Compute area of a simple polygon: O(n) (linear), where n = number of vertices of the polygon.

Compute centroid of a polygon: O(n) (linear), where n = number of vertices of the polygon.

Point in polygon: O(n) (linear), where n = number of vertices of the polygon. The semi-line from the query point is tested against each edge once.

Intersection of a polygon-pair, each with N vertices: O(n^2) (polynomial), where n = number of vertices of each polygon.

Depth-first or breadth-first graph traversal: O(|V| + |E|) (linear in the size of the graph) for both depth-first and breadth-first search, where |V| and |E| are the numbers of vertices and edges respectively. Each vertex and each edge is examined a constant number of times.

Single-pair shortest path in a graph: O(n^2) (polynomial) by Dijkstra's algorithm, where n = number of vertices in the graph.

All-pair shortest paths in a graph: O(n^3) (polynomial), e.g. by running Dijkstra's algorithm from each of the n nodes in the graph.

Hamiltonian circuit (or Traveling salesman problem): Exponential time.
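As an illustration of the polygon computations in Q 5.2, the sketch below computes the area, the centroid, and a point-in-polygon test, each with a single pass over the vertices. It is a minimal sketch, not the textbook's code: the function names are hypothetical, the polygon is assumed to be simple and given as a list of (x, y) vertices in order, and points lying exactly on the boundary are not treated specially.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(poly: List[Point]) -> float:
    """Shoelace formula: one pass over the edges.

    The result is positive for counter-clockwise vertex order.
    """
    total = 0.0
    n = len(poly)
    for i in range(n):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % n]  # wrap around to close the ring
        total += x0 * y1 - x1 * y0
    return total / 2.0


def centroid(poly: List[Point]) -> Point:
    """Area-weighted centroid of a simple polygon with non-zero area."""
    a = signed_area(poly)
    cx = cy = 0.0
    n = len(poly)
    for i in range(n):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return (cx / (6.0 * a), cy / (6.0 * a))


def point_in_polygon(p: Point, poly: List[Point]) -> bool:
    """Semi-line (ray casting) test: count edge crossings to the right of p."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % n]
        # The edge straddles the horizontal line through p (so y0 != y1).
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside  # each crossing flips the parity
    return inside


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(abs(signed_area(square)))
print(centroid(square))
print(point_in_polygon((1.0, 1.0), square), point_in_polygon((5.0, 1.0), square))
```

Each function contains exactly one loop over the n vertices with constant work per edge, which is where the linear bound comes from.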
Q 5.4) Consider alternative data models for the spatial object domain (section 5.3, pp. 177-187), including spaghetti, node-arc-area, DCEL, etc. Which data model is closest to the following popular formats: 1. GML Simple Features. 2. ESRI Shapefile. Briefly justify your answer assuming Euclidean space.

Answer

Format: GML Simple Features
Data Model: Spaghetti
Explanation: The model explicitly defines Points, Line-strings, and Polygons, but line-strings and polygons are defined only as lists of point-locations. Shared boundaries are not stored once and adjacency between features is not recorded, as it would be in a Node-Arc-Area (NAA) or DCEL structure. Topological operations are supported, but they are computed from the coordinates rather than read from stored topology.

Format: ESRI Shapefile
Data Model: Spaghetti
Explanation: Same as above; a shapefile stores non-topological geometry.

Q 6.2) Compare and contrast R-trees and R+-trees. Consider a minimum orthogonal bounding rectangle for rectangles T and L. Add it to the set of rectangles in Fig. 6.29 (pp. 253) and Fig. 6.30 (pp. 254). Redraw the R-tree and R+-tree in Fig. 6.29 and 6.30 if this new rectangle is inserted. Briefly justify your solutions by recalling how the R-tree and R+-tree deal with large objects, based on the narrative of section 6.6.2 (pp. 252-254).

Answer: R-trees and R+-trees are methods for indexing rectangles (Minimum Bounding Rectangles enclosing rectangles, polygons, and complex spatial objects). Both index structures consist of a rooted tree in which each node represents a rectangle. Higher-level nodes represent rectangles enclosing their lower-level child nodes, while the leaf nodes contain the actual rectangles being indexed. Both approaches divide the rectangles into groups while minimizing the total area covered by the groups and the overlap between groups. The R-tree and R+-tree differ in how they divide the rectangles into groups. The R+-tree prohibits rectangles represented by higher-level nodes from overlapping. A rectangle indexed by the R+-tree might therefore be partitioned into sub-rectangles, with each part stored under a separate node, so the rectangle is replicated in the leaf nodes.
Thus the size of an R+-tree might be greater than that of an R-tree for the same number of rectangles. Although this makes point and range search queries more efficient, insertion into and deletion from the index, including the handling of underflow and overflow conditions, become more complex.

Insertion of the large rectangle in the R-tree: We try to minimize overlap and the total area of the containing rectangles. Let A be the new MBR surrounding rectangles T and L. A' is a rectangle containing the MBRs X and Y.

[Figure: redrawn R-tree after inserting A. Labels in the source: A, A', B, F, H, L, N, S, T, W, X, Y, Z.]

The R+-tree prohibits overlapping rectangles. Therefore the resulting tree is:

[Figure: redrawn R+-tree after inserting A. Labels in the source: A, A', B, F, H, L, N, N', P, Q, S, T, W, X, Y, Z.]

For each of the following GIS problems, state whether or not it can be solved using algorithms. If it can, name an efficient algorithm for the problem:

Geo-locate all human settlements, e.g. tents (or usable roads), given an aerial image taken a few days after the Jan. 2010 Haiti earthquake.
Answer: Cannot be solved. Also, since aerial images are raster data, they can be stored using region quadtrees.

Given a digital satellite image, create an object-based model for water bodies including wells (points), streams (line-strings), and lakes (polygons). Note: Students interested in the state of the art on such problems may find Textbook section 5.6, titled vectorization and rasterization (pp. 207-211), interesting.
Answer: The given raster image can be vectorized using algorithms such as the Zhang-Suen erosion algorithm and the Douglas-Peucker algorithm. A PM quadtree can be used to store the resulting vector data.

Given a news article (e.g. those from Google News), photograph, or video-clip, identify the geographic location it is describing. Note: Students interested in the state of the art on such problems may find a recent New York Times article interesting.
Answer: Attempts have been made to solve this problem using techniques from image processing.
Find the elevation of a given point of interest (POI), given a regular tessellation based representation of the elevation field defined over the USA. For simplicity, use a square-based tessellation of a Euclidean plane. Assume the availability of a geo-coding service to convert the POI to a grid row and column. For simplicity, focus on non-nested tessellations.
Answer: The height is obtained directly from the cell at the given row and column. Raster images may be stored using region quadtrees.

Revisit the previous problem using a regular triangular tessellation. List your assumptions about the geocoding service.
Answer: The height is obtained directly from the cell containing the POI. Raster images may be stored using region quadtrees.

Revisit the previous problem using an irregular triangular tessellation. List your assumptions about the geocoding service.
Answer: The height is obtained by interpolation within the triangle containing the POI.

Given postal addresses of hotels, determine the hotel closest to a specified point of interest (POI), e.g. the Mall of America. Use Euclidean space and straight-line distance for simplicity. Assume the availability of a geo-coding service to convert postal addresses to geographic point-locations.
Answer: An R-tree over the hotel locations can be used to efficiently answer range and nearest-neighbor queries around the POI.

Revisit the previous problem using network space and network distances. Assume roadmaps are represented as a graph using road-intersections as nodes and road-segments connecting adjacent intersections as edges. Also available is a geo-coding service to place hotels and POIs at the nearest road-intersection (node).
Answer: A PM quadtree is used to store the network. Network distances from the POI node to the hotel nodes can then be computed with Dijkstra's algorithm.

Given a graph representation of a road-network (or electricity distribution network), determine if it is connected.
Answer: The given problem is solvable. We start from a node and visit all nodes in the network reachable from the starting node using the BFS algorithm. If any node remains unvisited, the graph is not connected. The graph can be accessed through an adjacency matrix; a PM quadtree can also be used.
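The connectivity test described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, the network is undirected and given as a symmetric 0/1 adjacency matrix, and an empty graph is treated as connected by convention.

```python
from collections import deque
from typing import List


def is_connected(adj: List[List[int]]) -> bool:
    """Return True if every node is reachable from node 0 by BFS."""
    n = len(adj)
    if n == 0:
        return True  # convention assumed here: the empty graph is connected

    visited = [False] * n
    visited[0] = True
    queue = deque([0])
    reached = 1

    while queue:
        u = queue.popleft()
        # Scan row u of the adjacency matrix for unvisited neighbours.
        for v in range(n):
            if adj[u][v] and not visited[v]:
                visited[v] = True
                reached += 1
                queue.append(v)

    # Connected exactly when BFS reached all n nodes.
    return reached == n


# Path 0 - 1 - 2: connected.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
# Edge 0 - 1 plus an isolated node 2: not connected.
split = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(is_connected(path), is_connected(split))
```

With an adjacency matrix, each dequeued node scans a full row, so this version takes O(n^2) time; with adjacency lists the same traversal takes O(|V| + |E|).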