New Method for automatic lineament extraction. Case Study:
Turcianska Kotlina, Slovakia
Outline according to the Elsevier guidelines for Computers & Geosciences:
1. Introduction
2. Material and methods
3. Results
4. Discussion
5. Conclusions
1.
Introduction
State the objectives of the work and provide an adequate background, avoiding a detailed literature survey or a
summary of the results.
The aim of this paper is to present a new method for automated lineament extraction. The
method is tested on a study area and compared with the results of previous geomorphological
research. The analysis is carried out at two scales to show that the method is robust to scale.
The DEM, or various surfaces derived from it, has been used for automatic lineament delineation
by other authors, especially shaded relief (Abdullah et al., 2010; Masoud, Koike, 2011; Jordan,
Schott, 2005), second derivatives of the DEM (Wladis, 1999) and the pure DEM (Vaz, 2011; Mallast et
al., 2011).
Authors use different methods of (semi)automated line extraction. Most authors use image
pre-processing (edge enhancement, thresholding) followed by edge-linking methods (Hough
Transform). In some cases, the pre-processing is part of the extraction (closed software
modules).
Pradhan (2010) used a manual extraction method based on automatically pre-processed images
with enhanced edges. Abdullah et al. (2010) use the PCI Geomatica software with the LINE module,
which extracts linear features from raster images. Mallast et al. (2011) use
ERDAS Imagine modules and PCI Geomatica. Argialas, Mavranza (2004) use an
optimized Hough Transform method developed by Fitton and Cox (1998). Pinto et al. (2013) use the
Hough Transform and the LESSA software developed by Zlatopolsky (1992).
Extracted lines are compared with known reference data in many
papers. Geological fault lines or expert lineaments are frequently used as reference data.
Although the methodology of comparison differs from paper to paper, subjective visual
assessment is used by all authors. Some papers also apply more objective quantitative
methods. Abdullah et al. (2010) compute simple statistics of the count and length of lineaments to compare
different datasets. A raster comparison approach is implemented in Vaz (2012). The distance
between lineaments and reference point data (wells) is calculated as a comparison metric in
Mallast et al. (2011) to demonstrate correlation with hydrological data.
2.
Data
The goal of this article is to compare automatically extracted lines with data from previous
geomorphological research by Minár, Sládek (2009). The results were compared at two
scales: 1:50 000 (Area A in Figure 1) and 1:10 000 (Area B in Figure 1).
Fig. 1: The study area is located in Turcianska kotlina, Slovakia. The two areas, analysed at different
scales, partially overlap.
2.1.
Scale 1:50 000
The Expert dataset corresponds to the expert lineament map delineated from the 1:50 000
topographic map. The Expert generalized dataset was made by simplifying the Expert dataset and
emphasizing its main directions (Minár, Sládek 2009). The Geology fault lines dataset corresponds to detected
and expected fault lines from the 1:50 000 geological map. The Auto 30 m and Auto 50 m datasets
contain lines extracted by the algorithm described in this paper. The length and count statistics are
shown in Table 1.
Tab. 1: Descriptive statistics of the 1:50 000 datasets
2.2.
Scale 1:10 000
The datasets are analogous, except that the geology dataset is missing due to
insufficient coverage of the study area by detailed geological maps (1:10 000). Table 2
summarizes the descriptive statistics of the 1:10 000 datasets.
Tab. 2: Descriptive statistics of the 1:10 000 datasets
3.
Method of automated lineament extraction
Provide sufficient detail to allow the work to be reproduced. Methods already published should be indicated by a
reference: only relevant modifications should be described.
This paper presents a new method of automated lineament extraction based on raster
image analysis of surfaces derived from a digital elevation model (DEM) and on post-processing of the results to
obtain the final lineament map. The method is composed of six steps (Figure 2): a) creation of
the DEM, b) derivation of hillshades from the DEM, c) line extraction based on edge detection, d) noise
removal, e) cluster line analysis, f) classification of lineaments.
Fig. 2: The workflow of the new method for automated lineament extraction
The results are then compared with geomorphological research data from the study area to
evaluate the usefulness of the automatic method. The method of comparison is described in
Section 4.
3.1.
Creation of DEM
The parameters of the DEM (its source and spatial resolution) crucially influence the quality and
scale of the results. The source of the DEM affects the spatial accuracy and noise of the results
and directly determines the finest spatial resolution that can be used. The spatial resolution should be
chosen according to the scale of the analysis. In our study area, two DEMs
interpolated from the contours of topographic maps 1:10 000 and 1:50 000 were used. A spatial
resolution of 30 m was chosen.
After an appropriate DEM is chosen, it must be pre-processed. Part of the
algorithm is based on hydrological principles, so cells with an
undefined drainage direction must be filled (Fill algorithm from ESRI, 2013).
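The filling step can be illustrated with a minimal sketch. The ESRI Fill tool itself is not reproduced here; the sketch substitutes the well-known priority-flood approach, and the function name fill_sinks and the plain NumPy array input are assumptions made only for illustration.

```python
import heapq
import numpy as np

def fill_sinks(dem):
    """Raise every closed depression to its spill level (priority-flood sketch).

    Not the ESRI Fill implementation; a substitute technique for illustration.
    """
    dem = np.asarray(dem, dtype=float)
    filled = dem.copy()
    rows, cols = dem.shape
    visited = np.zeros(dem.shape, dtype=bool)
    heap = []
    # Seed the queue with all border cells: water can always leave there.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r, c], r, c))
                visited[r, c] = True
    # Grow inwards from the lowest known cell; a neighbour lower than the
    # cell it is reached from lies in a depression and is raised.
    while heap:
        z, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not visited[nr, nc]:
                    visited[nr, nc] = True
                    filled[nr, nc] = max(filled[nr, nc], z)
                    heapq.heappush(heap, (filled[nr, nc], nr, nc))
    return filled
```

Cells that already drain to the raster border are left unchanged, so the operation only affects pits and flat-bottomed depressions.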
3.2.
Derived surfaces of DEM
Authors who use shaded relief for lineament extraction mention the dependency of the results on the
illumination azimuth. Mallast et al. (2011) and Abdullah et al. (2010) avoid this azimuth
bias by combining differently illuminated rasters into one raster, which is then used to extract
the results. In this paper, this bias is taken as an advantage. The differently illuminated
shaded reliefs are used to extract different results, which are processed separately.
The parameters which influence the shaded relief are the spatial resolution, the altitude of illumination
(height of the light source) and the azimuth of illumination (angle of the light source). The resolution of
the hillshade raster is the same as that of the input DEM. Although the altitude of illumination influences the
image contrast, changing its value has minimal effect on the results.
A value of 30° was chosen for our study area.
The azimuth of illumination has a distinct impact on the results, so this
parameter is varied. Its value covers the full circle (0 to 360°) in steps of 15°. Each
of the resulting 24 hillshade rasters is an input to the line extraction module.
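The generation of the 24 hillshades can be sketched as follows. This is a minimal illustration, not the tool used in the study: the shading is computed as the dot product of the surface normal and the light direction, row 0 of the array is assumed to be the northern edge of the raster, and the names hillshade, hillshade_stack and AZIMUTHS are hypothetical.

```python
import numpy as np

# 0, 15, ..., 345 degrees: 24 illumination azimuths (clockwise from north).
AZIMUTHS = np.arange(0, 360, 15)

def hillshade(dem, cellsize, azimuth_deg, altitude_deg=30.0):
    """Shaded relief in [0, 1] for one light direction.

    Assumes row 0 is the northern edge, so northing decreases with row index.
    """
    az = np.radians(azimuth_deg)
    alt = np.radians(altitude_deg)
    d_row, d_col = np.gradient(np.asarray(dem, dtype=float), cellsize)
    dz_east, dz_north = d_col, -d_row
    # Unit vector pointing from the surface towards the light source.
    light = (np.sin(az) * np.cos(alt), np.cos(az) * np.cos(alt), np.sin(alt))
    # Surface normal is (-dz_east, -dz_north, 1) before normalisation.
    norm = np.sqrt(dz_east ** 2 + dz_north ** 2 + 1.0)
    shade = (-dz_east * light[0] - dz_north * light[1] + light[2]) / norm
    return np.clip(shade, 0.0, 1.0)

def hillshade_stack(dem, cellsize, altitude_deg=30.0):
    """One hillshade per azimuth, keyed by the azimuth in degrees."""
    return {int(a): hillshade(dem, cellsize, a, altitude_deg) for a in AZIMUTHS}
```

Each raster of the stack would then be passed separately to the line extraction step, which keeps the azimuth-dependent results apart instead of merging them.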
Edge-enhancing filters are commonly applied to rasters before edge detection. In
this paper, the enhancing filters are part of the line extraction algorithm, the LINE module of the
PCI Geomatica software.
3.3.
Line extraction
The PCI Geomatica software with the LINE module and a variant of the Hough Transform (HT) used by
Fitton and Cox (1998) were tested for line extraction. The results from the PCI software corresponded
much better to the dataset of expert lineaments than the results from the HT, so the PCI
software was chosen for the line extraction step.
A detailed description of the workflow and parameter settings of the LINE module is given in the
internal PCI Help and in Abdullah et al. (2009) and Mallast et al. (2011). These authors
used the linking and generalization abilities of the LINE module. In our tests, these options produced noisy
and meaningless results (see Figure X with an example), so they were switched off.
The other parameters were set by trial and error as follows:
Binary threshold: GTHR = 10,
Length threshold: LTHR = 10,
Radius of Gaussian filter: RADI = 10,
No generalization: FTHR = 1,
Line linking switched off: ATHR = 0, DTHR = 0.
3.4.
Noise reduction
Most authors reduce noise using different techniques, e.g. length thresholding (citations) or the
morphological operation of closing (dilation followed by erosion) for raster images (Mallast et al., 2011).
In the present work, the data to be reduced consisted of a huge number of vector lines. The
algorithm removes all non-relevant lines. Relevance is defined as
follows: a line is relevant if it lies in a place where a sufficient number of other lines with
similar length and orientation are located. This sufficient number of lines is called the frequency
threshold and depends on the total number of line sets. (It is recommended to choose this
parameter interactively, based on visual interpretation of the results after noise reduction.)
The relevance of each line is determined using a raster approach. The main principle is to remove lines
which stand alone. To achieve this, all vector line sets are converted to binary
rasters. The conversion is applied to buffers created around each line in order to set a spatial
tolerance. The buffer size is chosen with respect to the raster spatial resolution and the character of the
study area. The binary rasters are summed cell by cell to create a single raster called the
raster of relevance. High values represent areas with a high occurrence of lines (see Figure
X). The information from the raster of relevance is transferred to every line of all sets, and the
frequency threshold is applied to every line. Only lines whose value from the raster of relevance exceeds
the threshold are preserved.
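The raster approach can be sketched as below. This is a simplified illustration under stated assumptions: each line is a single straight segment given by two (x, y) endpoints, the raster origin is at (0, 0), and the value transferred to a line is taken as the mean of the raster of relevance inside the line's own buffer (the text does not specify how the value is transferred). The names buffer_mask and reduce_noise are hypothetical.

```python
import numpy as np

def buffer_mask(seg, xs, ys, radius):
    """Boolean raster of cell centres lying within `radius` of segment `seg`."""
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    len2 = dx * dx + dy * dy
    if len2 == 0:
        t = np.zeros_like(xs)
    else:
        # Position of the nearest point on the segment, clamped to its ends.
        t = np.clip(((xs - x1) * dx + (ys - y1) * dy) / len2, 0.0, 1.0)
    return np.hypot(xs - (x1 + t * dx), ys - (y1 + t * dy)) <= radius

def reduce_noise(line_sets, shape, cellsize, radius, freq_threshold):
    """Return the raster of relevance and the line sets without isolated lines."""
    rows, cols = np.indices(shape)
    xs = (cols + 0.5) * cellsize
    ys = (rows + 0.5) * cellsize
    relevance = np.zeros(shape, dtype=int)
    for lines in line_sets:
        # One binary raster per line set (e.g. per illumination azimuth).
        set_mask = np.zeros(shape, dtype=bool)
        for seg in lines:
            set_mask |= buffer_mask(seg, xs, ys, radius)
        relevance += set_mask
    kept = []
    for lines in line_sets:
        survivors = []
        for seg in lines:
            mask = buffer_mask(seg, xs, ys, radius)
            # Assumption: a line's value is the mean relevance under its buffer.
            if mask.any() and relevance[mask].mean() > freq_threshold:
                survivors.append(seg)
        kept.append(survivors)
    return relevance, kept
```

Because only location is rasterized, similarity of length and orientation is not checked in this sketch, which is consistent with the remark that not every line is evaluated well by the raster approach.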
The logical core of this method is also used for the classification of the final lineaments (see Section
3.6). Although the raster approach is faster than processing the sets line by line, not every line is
evaluated well (see examples). Noise removal is a necessary step before the cluster analysis, because it
decreases the total number of lines and thus speeds up the cluster analysis.
3.5.
Cluster line analysis
As can be seen in Figure X, the lines form clusters. The purpose of this step is to recognize
all the clusters and replace each bundle of lines with a single representative line. The bundles are
easily seen by eye, but the computer must use a cluster line algorithm.
(Mention cluster analysis examples from other authors (KIV).) No existing analysis addresses this
specific task, so a dedicated algorithm for cluster line analysis was developed.
Figure X. Sets of lines clearly form clusters
The basic principle is to trace every line and explore its local neighbourhood to identify similarly
oriented lines. To facilitate the analysis, all line sets are merged into one layer and the azimuth
and length of each line are calculated.
The merged set is sorted by line length in descending order, and the following
workflow is applied:
1. Choose the first (longest) line from the set
2. Make a buffer around the chosen line
3. Select all lines which are completely within the buffer
4. Refine the selection to lines whose azimuth is within +/- 20° of the chosen line
5. If the selection contains more than 4 lines, continue to step 6; otherwise continue to
step 7
6. Create a buffer around the selected lines (= cluster) with the following attributes: count of
selected lines, average length, average azimuth
7. Delete all selected lines from the set
8. Repeat from step 1
The algorithm's variables (buffer size, count threshold, azimuth condition) were set
based on an analysis of the line bundles (average width, count, azimuth dispersion).
The cluster line analysis results in a set of identified clusters. Each cluster is saved as a polygon feature
with all necessary attributes. The post-processing creates an average line for each cluster using the
centroid of the polygon, the average azimuth and the average length. The set of average lines is the
final lineament layer.
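The eight steps and the post-processing can be sketched as follows. Several simplifications are assumptions of this sketch rather than part of the described method: each line is one straight segment, the cluster centre is the mean of the segment midpoints instead of the centroid of a buffer polygon, and the average azimuth is computed as an axial (doubled-angle) mean so that 1° and 179° average to about 0°. All function names are hypothetical.

```python
import math

def length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)

def azimuth(seg):
    """Axial azimuth in degrees, clockwise from north, folded to [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(x2 - x1, y2 - y1)) % 180.0

def axial_diff(a, b):
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def point_to_segment(p, seg):
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    len2 = dx * dx + dy * dy
    t = 0.0 if len2 == 0 else max(0.0, min(1.0, ((p[0] - x1) * dx + (p[1] - y1) * dy) / len2))
    return math.hypot(p[0] - (x1 + t * dx), p[1] - (y1 + t * dy))

def average_line(cluster):
    """Representative line of a cluster and the number of lines it replaces."""
    n = len(cluster)
    cx = sum((s[0][0] + s[1][0]) / 2 for s in cluster) / n
    cy = sum((s[0][1] + s[1][1]) / 2 for s in cluster) / n
    mean_len = sum(length(s) for s in cluster) / n
    sin2 = sum(math.sin(math.radians(2 * azimuth(s))) for s in cluster)
    cos2 = sum(math.cos(math.radians(2 * azimuth(s))) for s in cluster)
    az = math.radians((math.degrees(math.atan2(sin2, cos2)) / 2) % 180.0)
    hx, hy = math.sin(az) * mean_len / 2, math.cos(az) * mean_len / 2
    return ((cx - hx, cy - hy), (cx + hx, cy + hy)), n

def cluster_lines(lines, radius, azimuth_tol=20.0, min_count=4):
    remaining = sorted(lines, key=length, reverse=True)      # longest first
    lineaments = []
    while remaining:
        seed = remaining[0]                                   # step 1
        # Steps 2-4: a segment lies completely within the (convex) buffer of
        # the seed when both of its endpoints do; then filter by azimuth.
        picked = [i for i, s in enumerate(remaining)
                  if all(point_to_segment(p, seed) <= radius for p in s)
                  and axial_diff(azimuth(s), azimuth(seed)) <= azimuth_tol]
        if len(picked) > min_count:                           # steps 5-6
            lineaments.append(average_line([remaining[i] for i in picked]))
        picked_set = set(picked)                              # step 7
        remaining = [s for i, s in enumerate(remaining) if i not in picked_set]
    return lineaments                                         # step 8 is the loop
```

The seed line always selects itself, so at least one line is removed in every pass and the loop terminates.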
3.6.
Classification of lineaments
The extracted lines originate in discontinuities of the raster image. Their geomorphological
meaning follows from the use of the shaded relief raster as the input to the analysis. To
classify different geomorphological structures, positive and negative lineaments are defined.
Positive lineaments represent ridges and negative lineaments represent valleys. To
interpret these types separately, it is necessary to distinguish the two classes. (Citations about
different geomorphological meaning of these types).
Abdullah et al. (2010) state that a specific azimuth of illumination makes it possible to distinguish
positive and negative lineaments. This hypothesis is possibly true only for specific locations; our
tests showed that it is not generally valid. Other authors classified positive and negative lines in
these ways … (citations).
A dedicated algorithm was developed to classify ridge and valley lines. The main principle is to
assess the proximity of the lineaments to the drainage network. Logically, the proximity is high in the
case of negative lineaments and low in the case of positive lineaments.
Figure X. Classification of lineaments into positive and negative classes
In Section 3.4, the algorithm which assesses the frequency of lines was presented. The same
algorithm is adapted here. Instead of the raster of relevance, the water accumulation raster is used
as the input. For each line, the mean and median values of water accumulation are computed.
These two statistics are used to differentiate between a line on a ridge and a line in a valley.
The median is required to avoid uncertainty in specific cases where the mean could be
influenced by a few pixels with high accumulation values.
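A minimal sketch of this classification is given below. The text does not state how the mean and median are combined or which threshold is used, so the rule applied here (a line is negative only if both statistics reach a user-supplied threshold) is an assumption, as are the sampling of cells along the line instead of a buffer, the raster origin at (0, 0) with row index growing with y, and the function names.

```python
import numpy as np

def sample_along_line(raster, seg, cellsize):
    """Raster values at points spaced roughly one cell apart along a segment."""
    (x1, y1), (x2, y2) = seg
    n = max(int(np.hypot(x2 - x1, y2 - y1) / cellsize), 1) + 1
    xs = np.linspace(x1, x2, n)
    ys = np.linspace(y1, y2, n)
    cols = np.clip((xs / cellsize).astype(int), 0, raster.shape[1] - 1)
    rows = np.clip((ys / cellsize).astype(int), 0, raster.shape[0] - 1)
    return raster[rows, cols]

def classify_lineament(accumulation, seg, cellsize, threshold):
    """Return ('negative' | 'positive', mean, median) for one lineament.

    Assumed rule: valley (negative) only when both the mean and the median
    flow accumulation along the line reach `threshold`.
    """
    values = sample_along_line(accumulation, seg, cellsize)
    mean, median = float(values.mean()), float(np.median(values))
    label = "negative" if mean >= threshold and median >= threshold else "positive"
    return label, mean, median
```

With this rule, a ridge line that crosses a single stream cell keeps a low median and stays positive even though its mean is inflated, which is the situation the median is meant to handle.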
4.
The comparative methods
In this paper, quantitative and qualitative methods are used to assess the automatically
extracted lines. The quantitative methods statistically describe the datasets and their relations. The
qualitative methods visually assess the datasets and interpret them based on the results of the
quantitative methods.
4.1.
Correlation method
A vector-based comparison is proposed. The main principle is to imitate visual assessment; in
other words, the method tries to find whether a line from the test dataset has a corresponding line in the
reference dataset.
The proposed quantitative geometric method does not replace visual assessment. More than
geometric values are needed to evaluate the geomorphological quality of lineaments. Nevertheless, this
method offers a good comparison metric.
The proposed method computes the mutual correlation of two vector datasets, dataset A and
dataset B. First, dataset A is compared to B, then dataset B is compared to A; both correlation
indices are needed to evaluate the mutual correlation.
For each line from dataset A, the algorithm finds similar lines in dataset B. Similarity is
defined by spatial proximity and azimuth tolerance. The correlation index is computed as the length
ratio of the lines found. The ratio is always less than or equal to 1. The workflow of the comparison
method is illustrated in Figure X.
The method is driven by two parameters: the search radius (size of the line's buffer) and the azimuth
tolerance. The search radius depends on the study area and the scale of the research. It
expresses the maximum distance between two lines for them to be considered similar. The azimuth
tolerance expresses the maximum deviation between line azimuths for them to be considered
similar. In this study, a search radius of 250 m and an azimuth tolerance of 25° were used.
Figure X. The workflow of the comparison algorithm
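The comparison can be sketched as below. The sketch assumes that the correlation index is the summed length of the test lines that have at least one similar reference line, divided by the total length of the test lines; this reading of "length ratio" is an assumption, as are the representation of lines as single segments and all function names.

```python
import math

def length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)

def azimuth(seg):
    """Axial azimuth in degrees, clockwise from north, folded to [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(x2 - x1, y2 - y1)) % 180.0

def axial_diff(a, b):
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def point_to_segment(p, seg):
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    len2 = dx * dx + dy * dy
    t = 0.0 if len2 == 0 else max(0.0, min(1.0, ((p[0] - x1) * dx + (p[1] - y1) * dy) / len2))
    return math.hypot(p[0] - (x1 + t * dx), p[1] - (y1 + t * dy))

def _orient(p, q, r):
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segment_distance(a, b):
    """Shortest distance between two segments (0 if they cross)."""
    crosses = (_orient(*b, a[0]) * _orient(*b, a[1]) < 0
               and _orient(*a, b[0]) * _orient(*a, b[1]) < 0)
    if crosses:
        return 0.0
    return min(point_to_segment(a[0], b), point_to_segment(a[1], b),
               point_to_segment(b[0], a), point_to_segment(b[1], a))

def correlation_index(test, reference, radius=250.0, azimuth_tol=25.0):
    """Share of the test dataset's length that has a similar reference line."""
    total = sum(length(s) for s in test)
    matched = sum(
        length(s) for s in test
        if any(segment_distance(s, r) <= radius
               and axial_diff(azimuth(s), azimuth(r)) <= azimuth_tol
               for r in reference))
    return matched / total if total else 0.0
```

Calling correlation_index(A, B) and correlation_index(B, A) gives the two indices; they generally differ, which is why both are reported.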
4.2.
Directional analysis
Morphometric statistics are widely used by authors (citations). Statistics which support
lineament interpretation are rose diagrams and length and azimuth distributions (citations). An
algorithm which computes length and azimuth from coordinates was applied to each line.
Software such as GEORIENT can be used to plot rose diagrams from these data (citations for
Georient).
Rose diagrams are usually plotted using classes of 5° or 10°. This means that the numbers of lines
in the intervals 0-4°, 5-9°, ... are accumulated into classes before plotting. This approach has one
disadvantage: changing the start of the interval (1-5°, 6-10°, ...) produces different-looking rose
diagrams, which could lead to a different interpretation (see Figure X a and b). Figure X c) shows a
rose diagram that is independent of the interval start (the interval size is 1°), but this diagram is too
detailed and difficult to interpret.
The solution to this problem is to use a moving average line to smooth the detailed data and
emphasize the main trends. The GEORIENT software cannot plot trend lines, so a
histogram graph in Excel was chosen as an appropriate solution. For each angle on the x-axis, the
graph shows the relative count of lines on the y-axis. The graph is overlaid with a 7-value moving
average line computed from the three previous, the current and the three next values (see Figure X d).
Fig. X: Methods of directional analysis. Top: rose diagrams of one dataset, a) 5° classes, start
interval 0°, b) 5° classes, start interval 1°, c) 1° classes. Bottom: d) histogram with moving average
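The length and azimuth computation and the smoothed histogram can be sketched as follows. The sketch is not the Excel workflow used in the study. It assumes azimuths folded to 0-179° (lineaments have no direction) and a moving average that wraps around at the ends of the range, since 179° and 0° are neighbouring directions; both choices and the function names are assumptions.

```python
import math
import numpy as np

def length_and_azimuth(seg):
    """Length and axial azimuth (degrees clockwise from north, [0, 180))."""
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    return math.hypot(dx, dy), math.degrees(math.atan2(dx, dy)) % 180.0

def azimuth_histogram(azimuths_deg, window=7):
    """Relative counts in 1-degree classes and their circular moving average."""
    az = np.asarray(azimuths_deg, dtype=float) % 180.0
    bins = np.floor(az).astype(int) % 180       # guard against rounding to 180
    counts = np.bincount(bins, minlength=180)
    relative = counts / counts.sum()
    half = window // 2
    # Pad with values from the opposite end so the window wraps around.
    padded = np.concatenate([relative[-half:], relative, relative[:half]])
    smoothed = np.convolve(padded, np.ones(window) / window, mode="valid")
    return relative, smoothed
```

With window=7 each smoothed value is the mean of the three previous, the current and the three next 1° classes, so the result does not depend on where a class interval starts.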
5.
Results
Results should be clear and concise.
Results of lineament extraction on study area.
5.1.
Scale 1:50 000
Comparison of methods
Quantitative comparison
Tab. X: Results of statistical comparison of datasets 1:50 000
Interpretation
Figure X. Dataset to compare 1:50 000
Figure X. Results of directional analysis of datasets 1:50 000 with markers of main directions.
5.2.
Scale 1:10 000
Tab. X: Results of statistical comparison of datasets 1:10 000
Figure X. Dataset to compare 1:10 000
Figure X. Results of directional analysis of datasets 1:10 000 with markers of main directions.
Subjective qualitative comparison
6.
Discussion
This should explore the significance of the results of the work, not repeat them. A combined Results and Discussion
section is often appropriate. Avoid extensive citations and discussion of published literature.
Influence of the parameters on the results, possibilities for the user to influence the results
7.
Conclusions
The main conclusions of the study may be presented in a short Conclusions section, which may stand alone or form a
subsection of a Discussion or Results and Discussion section.
Technical details here, application in another article
References
MINÁR, J., SLÁDEK, J. (2009). Morphological network as an indicator of a morphotectonic field
in the central Western Carpathians (Slovakia) In Z. Geomorph. N.F. Berlin, p. 23-29.