Imageodesy on MPI & GRID for Co-seismic Shift Study Using Satellite Optical Imagery

J. G. Liu and J. Ma
Department of Earth Science and Engineering, Imperial College London, SW7 2AZ

Abstract

Geohazard monitoring and assessment is one of the major application research fields of the DiscoveryNet project, which serves scientific knowledge discovery via distributed high-throughput devices. This paper reports the algorithm and software development of imageodesy and its implementation on MPI and GRID. Imageodesy techniques based on normalised cross-correlation and phase correlation are capable of measuring horizontal terrain shift at sub-pixel accuracy and are thus a powerful tool for geohazard (e.g. earthquake) studies. An initial processing experiment using imageodesy on the DiscoveryNet workbench has led to important scientific results on measuring the co-seismic shift of a major earthquake, using cross-event Landsat-7 ETM+ images, in a vast uninhabitable area along the eastern Kunlun Mountains.

1. Introduction

The DiscoveryNet project [1] is developing grid-based algorithms and a workbench for the integration and analysis of data generated in a variety of application areas, including earth science and remote sensing and, in particular, geohazard monitoring and assessment. A typical characteristic of geohazards is that they produce movement and displacement. For instance, an earthquake produces co-seismic displacement, while a landslide is characterised by slope failure and mass movement. Detecting and measuring these changes is of vital importance for earth scientists to assess a hazard accurately, understand its mechanism and thus draw up the best plan for hazard prevention.

Remote sensing datasets of the surface of the earth are detailed, uniform and spatially comprehensive over wide areas. Changes on the land surface can be identified by comparing two images acquired with the same sensor system over a time period spanning the event.
Unfortunately, the changes relating to geohazards are often subtle, at sub-pixel level in commonly used Earth Observation satellite images. Effective and robust techniques for sub-pixel change detection are therefore needed. SAR interferometry [2-4] is one such technique, with very high accuracy at half the wavelength of the radar beam. However, the technique is restricted by data availability and environmental conditions. Another technique is ‘imageodesy’, which is capable of detecting changes at sub-pixel accuracy and is robust to environmental conditions [5, 6]. One major obstacle preventing its wide use is its great demand on computing for intensive data processing. As an essential data mining functionality for environmental hazard monitoring in the DiscoveryNet application, imageodesy algorithms and software based on Normalised Cross-Correlation (NCC) and phase correlation [7] have been developed and implemented on MPI parallel computers and GRID, which enable very demanding processing of large datasets and intensive computing at high speed.

2. Imageodesy on MPI/GRID

2.1 The principle of Imageodesy

The imageodesy technique proposed by R. E. Crippen [5] is based on local correlation feature matching techniques, which are widely used in computing science. The processing flow of imageodesy, which uses local normalised cross-correlation (NCC) for feature matching and shift measurement, is shown in figure 1.

[Figure 1. Flow chart of NCC Imageodesy: read the “before” and “after” image datasets; set the search window and the calculation window; move the calculation window until the maximum correlation is found; output Shift-X, Shift-Y and the correlation coefficient.]

In the first step, the pre-event (master) and post-event (slave) images (the event being, e.g., an earthquake) are very precisely co-registered. Then the NCC between the two images, at any one pixel, is calculated in a calculation window centred at this pixel. The calculation window for the slave image moves, line by line and column by column, within a search window in the image to calculate the NCC coefficient between the master and slave images at every pixel position in the effective area of the search window (excluding the margin frame of half the calculation window size). Lastly, the position of maximum correlation indicates the best matching point between the master and slave images. The differences between the coordinates of the two images, at the matching point, are the X-shift (along the image line direction) and Y-shift (along the image column direction) of this pixel in the slave image. The magnitude and direction of the shift between pre- and post-event images can thus be calculated for every image pixel. The maximum NCC at each pixel is also output as the R image, which gives a measure of data quality. Low quality data in the X- and Y-shift images can be eliminated by NCC thresholding. The calculation window must be of adequate size to match the image textural scale, and the search window should be large enough to cover the potential maximum shift as well as any co-registration errors, but also small enough to assure processing efficiency. The search and correlation processing can also be performed at sub-pixel level after interpolation, to achieve higher accuracy.

2.2 MPI implementation of NCC imageodesy

The processing task of imageodesy is massive. A 15m resolution Landsat-7 ETM+ Panchromatic band image is about 3.75GB after interpolating to 3m pixel size. Each sub-pixel point requires 2500 NCC calculations in a 75×75 window within a 125×125 search window in order to locate the feature shift between two images at sub-pixel level. Considering that the purpose of imageodesy is to measure sub-pixel level shift at each image pixel (not sub-pixel) position, we only need to find the optimal matching at each pixel, i.e. at an interval of 5 sub-pixels in this case.
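The window search described in section 2.1 can be illustrated with a brute-force sketch. This is a minimal Python/NumPy illustration and not the authors' software: the function names `ncc` and `best_shift`, the small 5×5 calculation window and 9×9 search window, and the synthetic random test image are all assumptions chosen to keep the example short.

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation coefficient of two equal-sized windows."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def best_shift(master, slave, row, col, calc=5, search=9):
    """Return (x_shift, y_shift, r) for the pixel at (row, col).

    The master calculation window is centred at (row, col); the slave
    calculation window is moved over every position inside the search window.
    Window sizes here are illustrative, not the 75x75 / 125x125 of the paper.
    """
    h = calc // 2
    reach = (search - calc) // 2            # largest shift that can be tested
    ref = master[row - h:row + h + 1, col - h:col + h + 1]
    best = (0, 0, -1.0)
    for dy in range(-reach, reach + 1):      # along the image column direction
        for dx in range(-reach, reach + 1):  # along the image line direction
            r0, c0 = row + dy, col + dx
            win = slave[r0 - h:r0 + h + 1, c0 - h:c0 + h + 1]
            r = ncc(ref, win)
            if r > best[2]:
                best = (dx, dy, r)
    return best

# Synthetic test: the "slave" is the master moved 1 line down and 2 columns left.
rng = np.random.default_rng(0)
master = rng.random((40, 40))
slave = np.roll(master, shift=(1, -2), axis=(0, 1))
x_shift, y_shift, r = best_shift(master, slave, 20, 20)

# Number of window positions tested per pixel: (search - calc + 1) squared.
n_positions = (9 - 5 + 1) ** 2
```

With the 75×75 and 125×125 windows quoted above, the same count gives (125 − 75 + 1)² = 2601 positions per point, which is of the same order as the 2500 calculations stated in the text.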
While the NCC calculation and the search for the maximum NCC are at sub-pixel accuracy, the whole scene can therefore be processed at pixel interval. Thus the computing load is reduced by a factor of $n^2$, where $n$ is the data interpolation number ($n = 5$ in this case). To further speed up the processing, a fast NCC (FNCC) algorithm, modified from [7], was implemented. The NCC is defined as:

$$R(u,v) = \frac{\sum_{x,y}\,[f(x,y)-\bar{f}_{u,v}]\,[t(x-u,y-v)-\bar{t}\,]}{\left\{\sum_{x,y}[f(x,y)-\bar{f}_{u,v}]^{2}\;\sum_{x,y}[t(x-u,y-v)-\bar{t}\,]^{2}\right\}^{1/2}}$$

where, in the notation of [7], $f$ is the image, $t$ is the template (calculation window), $\bar{t}$ is the mean of the template and $\bar{f}_{u,v}$ is the mean of $f$ in the region under the template.

High computing efficiency is achieved in the FNCC algorithm by computing the terms in the denominator of the above formula using a lookup table containing the integral (running sum) of image columns of calculation window width. In a search window of size $M^2$ with a moving calculation window of size $N^2$, NCC requires $N^2(M-N+1)^2$ additions and $2N^2(M-N+1)^2$ multiplications, while FNCC only needs $N^2$ additions and $N^2 + N^2(M-N+1)^2$ multiplications. Thus, at the cost of slightly more complex programming, the algorithm avoids all the repeated operations in the calculation of the NCC denominator and speeds up the processing by 5 to 10 times (depending on the search window and calculation window sizes).

However, the processing task is still beyond the capacity of a PC or a UNIX workstation to complete within a reasonable time. Using one UNIX processor, it takes more than 400 hours to complete the imageodesy processing of one pair of cross-event ETM+ images. The core program of FNCC imageodesy was therefore adapted to MPI to benefit from the power of UNIX parallel processor systems and large shared resources. This is accomplished by a data farming scheme that distributes the input datasets as data blocks to the processors, all of which execute the same core imageodesy program in parallel. In the data farming scheme, the imagery data are split into line stripe blocks with overlaps of half the search window size.
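The fixed line-stripe split can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' MPI program: the function name `split_stripes` is hypothetical, the overlap is assumed to be added on both sides of each stripe, stripes are assumed to be of near-equal height, and the MPI distribution itself is not shown.

```python
def split_stripes(n_lines, n_workers, search=125):
    """Split image lines into contiguous stripes, each padded by an overlap.

    Returns a list of (core_start, core_stop, read_start, read_stop) tuples.
    The core range is the part of the output a worker is responsible for;
    the read range adds an overlap of half the search window on each side
    (assumed here), clipped at the image edges.
    """
    overlap = search // 2
    base, extra = divmod(n_lines, n_workers)
    stripes, start = [], 0
    for w in range(n_workers):
        # The first `extra` workers take one additional line each.
        stop = start + base + (1 if w < extra else 0)
        stripes.append((start, stop,
                        max(0, start - overlap),
                        min(n_lines, stop + overlap)))
        start = stop
    return stripes

# Example: 1000 image lines farmed out to 4 processors.
stripes = split_stripes(n_lines=1000, n_workers=4)
```

Because the core ranges are contiguous and non-overlapping, full-scene outputs can be rebuilt by concatenating each worker's core lines in order and discarding the overlap margins.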
When the parallel processing for imageodesy is completed, the output line stripe blocks from all the processors are merged automatically to build up full-scene X-shift, Y-shift and R images. There is one problem with MPI parallel processing: one slow or heavily loaded processor can dramatically delay the completion of the whole task. From the parallel processing point of view, this neither wastes computing resources nor drags down the performance of the system, as the released processors can immediately be employed for other tasks. However, for the user, it means a long wait for the results. A smart data farming scheme is usually desirable for parallel processing, to manage the data distribution to each processor dynamically based on its load and on the balance between the computing costs of waiting and of data re-distribution. Unfortunately, dynamic data re-distribution is very difficult and inefficient for imageodesy processing because each such attempt needs to relocate and process the overlapping margins. Frequent data re-distribution means that the processed data blocks are fragmented, which wipes out the efficiency of the FNCC algorithm. Our current MPI program of FNCC imageodesy uses a fixed data farming scheme for simplicity. Processing one pair of cross-event ETM+ images using 24 fast UNIX parallel processors took 10-12 hours. This speed is adequate for scientific research on earthquakes but still too slow for monitoring environmental hazards such as landslides.

From the online data mining point of view, processing time is largely spent on computing. Once the results are output, the X- and Y-shift images as well as the R image can be browsed online via the DiscoveryNet workbench, on which image visualisation and analysis are driven by graphic resolution rather than data resolution. Imageodesy is therefore less demanding on the network than on computing.
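The running-sum idea that FNCC relies on for the NCC denominator in this section can be illustrated with a short sketch. It is a minimal Python/NumPy illustration assuming a two-dimensional summed-area table in the style of [7], rather than the column-based lookup table of the authors' implementation; the function names `window_sums` and `window_energy` and the small array sizes are hypothetical.

```python
import numpy as np

def window_sums(img, n):
    """Sum of img over every n-by-n window, using a summed-area table."""
    s = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    s[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    # Four table look-ups per window, independent of the window size n.
    return s[n:, n:] - s[:-n, n:] - s[n:, :-n] + s[:-n, :-n]

def window_energy(img, n):
    """Sum of squared deviations from the local mean for every n-by-n window.

    This is the image term in the NCC denominator, evaluated for all window
    positions at once instead of being recomputed at each position.
    """
    total = window_sums(img, n)
    total_sq = window_sums(img * img, n)
    return total_sq - total * total / (n * n)

rng = np.random.default_rng(1)
search_area = rng.random((12, 12))
energy = window_energy(search_area, 5)

# Direct evaluation of the same quantity, for comparison.
direct = np.empty((8, 8))
for i in range(8):
    for j in range(8):
        w = search_area[i:i + 5, j:j + 5]
        direct[i, j] = ((w - w.mean()) ** 2).sum()
```

The table is built once per search area, after which each window's denominator term costs a fixed number of operations, which is where the saving over the direct evaluation comes from.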
2.3 GRID implementation of NCC imageodesy

Ideally, end users would prefer to perform imageodesy analysis in true real time rather than waiting for hours or even days to see the results. This is especially important when emergency measures need to be taken for hazard prevention. In concept, GRID provides shared computing resources with virtually no limit. To test the potential, we adapted the FNCC imageodesy software to run on GRID. The key part of the GRID version is dynamic automatic data distribution based on the nodes available on the GRID. To achieve this, the input images are split into the minimal possible input unit: the image line. Such a scheme wipes out 50% of the FNCC efficiency. Because FNCC is a neighbourhood process, each image line (the minimal data unit) must carry its neighbouring image lines with it in order to conduct the processing. This increases data communication by m times, where m is the search window size. The experiments on GRID using a small image (512×512) completed much more slowly than local processing on a single PC (2GHz processor). Submitting larger images of a few thousand lines and columns to the GRID simply blocked the processing pipeline, and the task failed to complete. The current status of GRID is not sufficient for the massive neighbourhood processing of FNCC imageodesy. A network bottleneck is created when the dataset is split into very small fragments, which introduces a tremendous demand on data communication among nodes. The future of GRID for the type of processing required by imageodesy depends on very fast, high-throughput networks.

2.4 Phase-correlation imageodesy

The further development of the imageodesy software package is to introduce FFT (Fast Fourier Transform) based phase correlation as the engine to achieve optimal feature matching (figure 2).
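As a minimal illustration of FFT-based phase correlation with Hamming windowing, the following Python/NumPy sketch estimates an integer-pixel shift between two windows. It is a sketch under stated assumptions and not the authors' software: it uses NumPy's FFT routines rather than FFTW, the function name `phase_correlation` and the 64×64 synthetic test window are hypothetical, and no sub-pixel peak fitting is attempted.

```python
import numpy as np

def phase_correlation(master, slave):
    """Return (x_shift, y_shift, peak) of slave relative to master.

    Shifts are integer pixels; x is along image lines (columns index),
    y is along image columns (rows index).
    """
    # 2-D Hamming window to reduce edge effects before the transform.
    win = np.outer(np.hamming(master.shape[0]), np.hamming(master.shape[1]))
    F = np.fft.fft2(master * win)
    G = np.fft.fft2(slave * win)
    cross = G * np.conj(F)
    cross /= np.maximum(np.abs(cross), 1e-12)   # keep the phase only
    corr = np.fft.ifft2(cross).real
    peak_row, peak_col = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    rows, cols = corr.shape
    y = peak_row - rows if peak_row > rows // 2 else peak_row
    x = peak_col - cols if peak_col > cols // 2 else peak_col
    return int(x), int(y), float(corr[peak_row, peak_col])

# Synthetic test: the "slave" is the master moved 2 lines down and 3 columns left.
rng = np.random.default_rng(2)
master = rng.random((64, 64))
slave = np.roll(master, shift=(2, -3), axis=(0, 1))
dx, dy, peak = phase_correlation(master, slave)
```

The shift is read directly from the position of the correlation peak, so no explicit search over window positions is needed; the cost is one forward FFT per window and one inverse FFT per pixel position processed.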
By transforming the image data within a matching window into the frequency domain via FFT, phase correlation can pinpoint the best matching position directly as the peak of the overlap between the frequency distributions of the two images, without the time-consuming search [8]. It seems that FFT-based phase correlation has the potential to speed up imageodesy significantly, but this may not always be the case. For imageodesy, phase correlation needs to be performed at every pixel, and the process involves forward and inverse FFTs. The technique saves the time spent on searching; however, FFT can be a much slower process than FNCC for a large calculation window. It is therefore more efficient for a small correlation window and a large search area, which is the case for matching features of high spatial frequency. The open-source FFTW library from MIT has been used for our software development. The current FFT library cannot be shared by parallel processors. Phase-correlation imageodesy was therefore only implemented for a single UNIX processor and PC. For large dataset processing to study the co-seismic displacement of an earthquake, the MPI software of FNCC imageodesy is the only practical tool at the moment.

[Figure 2. The flow chart of the phase correlation imageodesy algorithm: read the “before” and “after” image datasets; apply Hamming windowing and FFTW to each; compute the phase correlation; apply the inverse FFTW; output Delta X, Delta Y and the correlation coefficient.]

3. Co-seismic shift of the Ms 8.1 Kunlun earthquake

3.1 Background

An Ms 8.1 earthquake occurred on 14 Nov 2001 at 09:26:18 UTC in the East Kunlun Mountains along the Kusai Lake segment of the Kunlun fault. A surface rupture zone 400 km long, striking E-W to WNW-ESE, was produced, and the left-lateral strike-slip was as large as 16.3m according to field observations conducted by Chinese scientists immediately after the earthquake [9].
Because the earthquake occurred in a high-altitude no man’s land, the undisturbed co-seismic surface ruptures and shift features are ideal evidence for studying the tectonic movement and stress field of the mighty Kunlun fault. Remote sensing satellite observation is obviously among the most useful and effective data sources for a regional study, and sometimes the only one. SAR interferometry (InSAR) would be an ideal technique to reveal the deformation field of the earthquake and to provide 2-D quantitative measurements of the movement. Unfortunately, there are no suitable cross-event ERS SAR fringe pairs available in the region. With 15m resolution in its panchromatic band, Landsat-7 ETM+ images have the potential for detecting and measuring shifts of land surface features at metre-level accuracy using the imageodesy technique, and are therefore feasible for studying this earthquake, with its field-measured maximum strike-slip displacement of 16.3m. The identities of the ETM+ images used in this study are shown in Table 1.

Table 1. The ETM+ scenes used for this study

Area        Event   Path/Row  Upper-left corner (lat/long)  Imaging date
Kusai Lake  Before  138/035   36.9984856N, 91.2616043E      3 Oct. 2001
Kusai Lake  After   138/035   37.0025253N, 91.2769623E      15 May 2002

3.2 Data processing

The cross-event images were accurately co-registered to 0.3 pixel RMS by a linear transform using ground control points (GCPs). Co-registration based on a linear transform only rotates, shifts or rescales an image to fit another, and therefore does not remove the earthquake-induced local mismatch. Any error introduced by the linear transform as a result of GCP inaccuracy is linearly propagated and is thus easy to identify and remove from the shift detection images produced. The co-registered images were then interpolated to 3m pixel size. Bi-linear re-sampling was used for both image co-registration and interpolation. Full-scene imageodesy processing was then carried out using the FNCC algorithm on MPI.
The calculation window is 75×75 and the search window is 125×125. Phase correlation was also tested for full-scene processing using non-interpolated data. The results are very similar to the outputs from FNCC. The X- and Y-shift images were carefully analysed for possible global errors. There are no obvious patterns indicating non-negligible mis-registration or effects of different sun illumination angles, as expected [6]. However, the high accuracy of imageodesy revealed detailed scanning patterns of the ETM+ scanner, showing a weak point of this type of instrument for imageodesy analysis. The phenomenon will be discussed in a separate paper. To reveal the true co-seismic shift information against the noisy background of scanning patterns, a smoothing filter was used in combination with NCC thresholding.

3.3 The scientific results

The powerful data mining functionality of DiscoveryNet (a combination of image processing, GIS and imageodesy) enables versatile visual analysis of the final imageodesy results. Figure 3 shows the analysis workflow. As the image lines happen to be nearly parallel to the Kunlun fault direction, the smoothed X-shift image alone provides a simple and effective presentation of the regional strike-slip displacement of the fault. Figure 4 is the X-shift image (0.7 NCC threshold) displayed as a pseudo-colour layer with the post-earthquake ETM+ Pan greyscale image as a ‘backdrop’ and with interpreted faults (the red lines) overlain. Positive values (red to green) represent shift to the right (east), and negative values (cyan to blue) represent shift to the left (west); figure 4 thus reveals striking patterns of regional movement along the Kunlun fault as the result of the earthquake. The southern side of the fault, in yellow-red, is shown to have moved significantly to the right (east) relative to the northern side, in blue-green.
According to the measurements from this image, the average left-lateral shift along the main segment of the Kunlun fault is 4.8m, ranging from 1.5m to 8.1m, and the maximum shift is as great as 13m to the west, near Kusai Lake and the Harvard CMT epicentre (111401B, http://www.seismology.harvard.edu/cgibin/CMT2/). The image measurements are highly compatible with the field observations [9].

To illustrate the actual magnitude and direction of the co-seismic displacement, a vector presentation for the area where the maximum left-lateral shift occurred was derived from the X- and Y-shift images with 371×371 window averaging, a 20% cut-off for elimination of extreme values and a 0.8 NCC criterion, as shown in figure 5. The image demonstrates that the south side of the fault slipped significantly to the right (east) against a largely stable or slightly right-shifted north block of the Kunlun fault. The relative movement of the fault is left-lateral and the south side of the fault is the active block. This observation coincides with the fact that the earthquake wave propagated a long distance to the south, shaking areas of Sichuan Province more than a thousand kilometres away, while it diminished rapidly toward the north. The regional pattern of the displacement implies that the Eastern Kunlun fault acted as a transitional boundary allowing the southern block of the fault to rotate clockwise, as indicated by a number of researchers [10]. The scientific results of this study provide the first 2-D quantitative assessment of the regional co-seismic displacement of this major Kunlun earthquake.

4. Conclusions

As an essential function of remote sensing data mining for geohazard study in the DiscoveryNet project, the imageodesy technique has been implemented on MPI and GRID using FNCC and phase correlation algorithms. So far the most efficient approach is based on the FNCC algorithm operating on MPI.
The FNCC implementation on GRID yields very disappointing performance because the demand for data communication increases dramatically when the neighbourhood processing of imageodesy is distributed to many nodes. The bottleneck should be resolved once much faster networks become available. The phase correlation based algorithm is currently not operational in parallel or GRID processing mode. It is only more efficient when the forward and inverse FFT operations in phase correlation take less time than the searching in FNCC.

As an initial experiment, the FNCC MPI imageodesy was applied to process a pair of cross-event Landsat-7 ETM+ images to study the Ms 8.1 earthquake that occurred on 14 Nov 2001 in a vast uninhabitable area along the eastern Kunlun Mountains. The data produced revealed striking patterns of the co-seismic left-lateral displacement along the Kunlun fault, in a range of 1.5-8.1m. This is the first 2-D measurement of the regional movement of this earthquake and an important scientific knowledge discovery.

5. Acknowledgement

This research is part of the DiscoveryNet project (GR/R67750/01) supported by an EPSRC e-science pilot project grant. The computing centre of Imperial College London provided parallel processing facilities and technical support. The MIT phase correlation website provided free access and technical support for software development. Xinjiang Bureau of Seismology is acknowledged for providing field photos and some reference materials.

References

[1] V. Curcin, M. Ghanem, Y. Guo, M. Kohler, A. Rowe, J. Syed, P. Wendel. Discovery Net: Towards a Grid of Knowledge Discovery. Proceedings of KDD-2002, the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 23-26, 2002, Edmonton, Canada.
[2] Massonnet, D., Rossi, M., Carmona, C., Adragna, F., Peltzer, G., Feigl, K. and Rabaut, T., 1993. The displacement field of the Landers earthquake mapped by radar interferometry. Nature, 364, 138-142.
[3] Zebker, H., Rosen, P., Goldstein, R., Gabriel, A. K. and Werner, C., 1994. On the derivation of coseismic displacement fields using differential radar interferometry: the Landers earthquake. Journal of Geophysical Research, 99, 19617-19634.
[4] Peltzer, G., Crampe, F. and King, G., 1999. Evidence of Nonlinear Elasticity of the Crust from the Mw7.6 Manyi (Tibet) Earthquake. Science, 286, 272-276.
[5] Crippen, R. E., 1992. Measurements of Subresolution Terrain Displacements Using SPOT Panchromatic Imagery. Episodes, 15, 56-61.
[6] van Puymbroeck, N., Michel, R., Binet, R., Avouac, J.-P. and Taboury, J., 2000. Measuring Earthquakes from Optical Satellite Images. Applied Optics, 39, 3486-3494.
[7] Lewis, J. P., 1995. Fast Normalized Cross-Correlation. Expanded version of the paper ‘Fast Template Matching’, Vision Interface, 120-123.
[8] Srikrishnan, S., Araiza, R., Xie, H., Starks, S. A. and Kreinovich, V., 2001. Automatic Referencing of Satellite and Radar Images. Proceedings of the 2001 IEEE Systems, Man, and Cybernetics Conference. NASA Pan-American Center for Earth and Environmental Studies, University of Texas.
[9] Lin, A., et al., 2002. Co-seismic strike-slip and rupture length produced by the 2001 Ms 8.1 central Kunlun earthquake. Science, 296 (5575), 2015-2017.
[10] Tapponnier, P., et al., 2001. Oblique stepwise rise and growth of the Tibet Plateau. Science, 294, 1671-1677.

Figure 3. Workflow chart of imageodesy data analysis in raster and vector formats.

Figure 4. The smoothing-filtered X-shift image of the Kusai Lake scene with 0.7 NCC threshold, overlain on the post-earthquake ETM+ Panchromatic image, together with interpreted fault lineaments (the red lines). The X-shift image is presented in pseudo-colour, in a spectrum from blue (negative values), through cyan (zero), to red (positive values), representing a maximum value range of -10.0 m to 14.0 m.

Figure 5. The co-seismic shift vectors of the Kusai Lake area overlaid on the post-earthquake ETM+ Pan image. The vectors were derived from the X- and Y-shift images, with 371×371 window averaging, a 20% cut-off for elimination of extreme values, and a 0.8 NCC coefficient criterion.
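The vector derivation summarised in the Figure 5 caption can be sketched for a single averaging window as follows. This is a minimal Python/NumPy illustration under stated assumptions, not the authors' workflow: the function name `block_vector` is hypothetical, the 20% cut-off is assumed to mean discarding 10% of the values at each tail, and a small 10×10 block with made-up shift values stands in for a 371×371 window.

```python
import numpy as np

def block_vector(x_shift, y_shift, r, r_min=0.8, cut=0.2):
    """Average shift vector of one block after NCC masking and trimming.

    Pixels with NCC below r_min are ignored; of the remaining values, a
    fraction `cut` is discarded (half at each tail, an assumption) before
    averaging each component.
    """
    keep = r >= r_min
    out = []
    for comp in (x_shift, y_shift):
        v = np.sort(comp[keep])
        k = int(len(v) * cut / 2)           # samples dropped from each tail
        if len(v) > 2 * k:
            v = v[k:len(v) - k]
        out.append(float(v.mean()) if len(v) else float("nan"))
    return tuple(out)

# Illustrative block: uniform shift of (2, -1) with one outlier and one
# low-correlation pixel; values are invented for the example.
x = np.full((10, 10), 2.0)
y = np.full((10, 10), -1.0)
r = np.full((10, 10), 0.9)
x[0, 0] = 100.0                  # extreme value, removed by the cut-off
x[5, 5], r[5, 5] = -50.0, 0.5    # poor match, removed by the NCC criterion
dx, dy = block_vector(x, y, r)
magnitude = float(np.hypot(dx, dy))
```

Applying the same function to each averaging window in turn yields one vector per window, whose magnitude and direction follow from the two averaged components.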