Haijiang Zhang - University of Wisconsin

-ECE539 Project Report (Professor Yu Hen Hu)-
Application of Multilayer Perceptron (MLP) Neural
Network in Identification and Picking P-wave arrival
Haijiang Zhang
Department of Geology and Geophysics
University of Wisconsin-Madison
Abstract
Quickly detecting and accurately picking the first-arrival of a P wave is of great
importance in locating earthquakes and characterizing velocity structure, especially in the
era of large volumes of digital and real-time seismic data. The detector should be capable
of finding the onset of the P-wave arrival against the background of microseismic and
cultural noise. Normally, P-wave onset is characterized by a rapid change in the
amplitude and/or the arrival of high-frequency energy.
The Akaike Information Criterion (AIC) picker has been used to detect and pick the P-wave
arrival (Maeda 1986; Maeda 1989). However, the AIC picker requires an appropriate time
window; otherwise it will pick the wrong P-wave arrival. The Multilayer Perceptron (MLP)
neural network is used to detect the P-wave arrival, from which a time window can be
chosen for the AIC picker. This method has been applied to our PASO array data set.
About 90% of P first-arrivals are detected correctly. Compared with manual picks, this
picker provides onset times and uncertainties with high confidence. 91% of autopicks are
within 0.15 seconds of analyst picks for this data set.
1. Introduction
Quickly detecting and picking the arrival times for P and S waves from the recordings
of earthquake events are of great importance in event location, event identification,
source mechanism analysis, and spectral analysis. Traditionally, this work is done by an
analyst who checks the seismograms and picks the P and S arrivals based on his or her
individual experience. This task is time consuming and subjective, especially in the era of
large volumes of digital and real-time seismic data. There is a need to provide a more
reliable and robust alternative, which is less time consuming and perhaps more objective.
Several techniques for detecting and picking seismic wave arrivals have been described in
the literature. The traditional approach to automatic phase detection has been to apply a series
of narrow bandpass frequency filters and then use the absolute value as the characteristic
function (CF). When the ratio between the short term average (STA) and the long-term
average (LTA) of the CF exceeds a predefined threshold, a detection is declared.
Absolute values and the envelope function of the seismogram are usually used as CF
(Allen, 1982).
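The STA/LTA rule described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the absolute value as the CF and omitting the bandpass filtering step; the function name, window lengths, and threshold are hypothetical choices, not values from the literature cited here.

```python
def sta_lta_onset(trace, n_sta, n_lta, threshold):
    """Return the first sample index at which STA/LTA exceeds the threshold, else None."""
    cf = [abs(v) for v in trace]  # characteristic function: absolute value
    for end in range(n_lta, len(cf) + 1):
        lta = sum(cf[end - n_lta:end]) / n_lta  # long-term average
        sta = sum(cf[end - n_sta:end]) / n_sta  # short-term average
        if lta > 0 and sta / lta > threshold:
            return end - 1  # index of the newest sample in both windows
    return None


# Synthetic trace: 200 samples of weak noise followed by 50 samples of signal.
trace = [0.1 * (-1) ** i for i in range(200)] + [1.0 * (-1) ** i for i in range(50)]
onset = sta_lta_onset(trace, n_sta=5, n_lta=50, threshold=3.0)
quiet = sta_lta_onset(trace[:200], n_sta=5, n_lta=50, threshold=3.0)
print(onset, quiet)
```

The detection is declared slightly after the true onset (sample 200 here), because the short-term average needs a few signal samples before the ratio crosses the threshold.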
Artificial neural networks have also been used to construct the characteristic function
to detect and pick the seismic phases (Dai et al., 1995, 1997; Zhao et al., 1999; Wang et
al., 1997). These studies report that the ANN approach is successful and promising for
detecting and picking seismic phases. Two types of input vector have been fed to the neural
network: attributes derived from the seismograms, such as mean amplitude, spectral
properties, and planarity, or the absolute values of the seismograms themselves. The
former may lose information and require too much computing time, so using the full
waveforms as the network input might be a better choice. Although the ANN detects seismic
phases well, it is difficult to pick the arrival time from the characteristic function: a
whole region of the characteristic function exceeds the predefined threshold, so it is not
obvious which point should be taken as the arrival. A multi-term method has been tried to
shrink this region, but it still requires an empirical value to determine the phase arrival
(Zhao et al., 1999). In contrast to these methods, the Akaike Information
Criterion (AIC) picker is used to pick the P-wave arrival in this report. When the time
window is chosen properly, AIC picker can choose the phase arrival very accurately. The
MLP neural network will choose a time window for the AIC picker.
This report will review the AIC picker and the Multilayer Perceptron (MLP) neural
network first. Then I will discuss the problem of constructing the MLP neural network to
detect the P-wave arrival and how the AIC picker is used to pick the P-wave arrival.
Finally, the application of this method to the PASO array data is presented.
2. AIC Picker
Suppose that the seismogram can be divided into locally stationary segments each
modeled as an Autoregressive (AR) process and the intervals before and after the onset
time are two different stationary processes (Sleeman et al, 1999). The order and the value
of the AR coefficients change when the characteristic of the current segment of
seismogram is different from before. For example, the typical seismic noise is well
represented by a relatively low order AR process, whereas seismic signals usually require
higher order AR processes (Leonard et al., 1999). The Akaike Information Criterion (AIC) is
commonly used to determine the order of the AR process when fitting a time series with an AR
model; it indicates the badness of the model fit as well as its unreliability (Akaike,
1974). This method has been used in onset estimation by analyzing the variation in AR
coefficients representing both multi-component and single-component traces of
broadband and short period seismogram (Leonard et al., 1999). When the order of the AR
process is fixed, AIC function is a measure for the model fit, and the point where AIC is
minimized determines the optimal separation of the two stationary time series in the least
squares sense, and thus is interpreted as the phase onset (Sleeman et al, 1999). This
picker is known as AR-AIC picker (Leonard, 2000).
Unlike the AR-AIC picker, Maeda calculates the AIC function directly from the
seismogram, without using the AR coefficients (Maeda, 1985 and Maeda, 1986). The
onset is the point where the AIC has a minimum value. For the seismogram x, the AIC
value is defined as
AIC(k)=k*log(variance(x[1,k]))+(n-k-1)*log(variance(x[k+1,n]))
where k ranges over all samples of the seismogram, n is the total number of samples, and
x[i,j] denotes samples i through j.
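As an illustration, the AIC function above can be evaluated directly with the Python standard library. This is a minimal sketch, not the implementation used for the results in this report: the function names are hypothetical, the split points are restricted so that each segment has at least two samples, and the test trace is synthetic Gaussian noise whose standard deviation jumps at sample 150.

```python
import math
import random


def variance(x):
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def aic_curve(x):
    """Return {k: AIC(k)} for every split point k with at least 2 samples per side."""
    n = len(x)
    curve = {}
    for k in range(2, n - 1):
        v1, v2 = variance(x[:k]), variance(x[k:])
        if v1 > 0 and v2 > 0:  # the logarithm is undefined for zero variance
            curve[k] = k * math.log(v1) + (n - k - 1) * math.log(v2)
    return curve


def aic_pick(x):
    """Return the k with the smallest AIC; x[k] is the first sample after the onset."""
    curve = aic_curve(x)
    return min(curve, key=curve.get)


# Synthetic window: 150 samples of weak noise, then 150 samples of stronger "signal".
random.seed(0)
window = ([random.gauss(0.0, 0.1) for _ in range(150)]
          + [random.gauss(0.0, 1.0) for _ in range(150)])
pick = aic_pick(window)
print(pick)
```

This direct form recomputes both variances for every k, which is adequate for windows of a few hundred samples.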
Note that the AIC picker takes the global minimum as the onset point. For this reason, it
is necessary to choose a time window that includes only the segment of seismogram of
interest. If the time window is chosen properly, the AIC picker can find the P-wave arrival
accurately. For a seismogram with a very clear onset, the AIC values have a very clear
global minimum, which corresponds to the P-wave arrival (Figure 1a). For a
seismogram with a relatively low S/N ratio, there are a few local minima in the AIC values,
but the global minimum still accurately indicates the P-wave onset (Figure 1b). When
the seismogram is noisier still, the global minimum is no longer guaranteed to indicate
the P-wave arrival (Figure 1c). That is, the signal-to-noise ratio of the seismogram
affects the accuracy of the AIC picker to some extent, although the effect is not
significant. For this reason, we do not filter the seismogram in advance, because a
band-pass filter can reduce the first motion and distort the true P-wave arrival (Douglas et al.,
1997).
Figure 1. Seismograms and their corresponding AIC values. a) For a seismogram with a clear
P-wave arrival, the AIC function has a very clear minimum. b) For a seismogram with a clear
P-wave arrival but a relatively lower S/N ratio, the AIC function has many local minima,
whereas the global minimum still corresponds to the P-wave onset. c) For a very low S/N
seismogram, there are a few local minima close to each other. In this case, the global
minimum cannot be guaranteed to be the P-wave arrival.
If there is more than one seismic phase in a time window, the AIC picker will choose the
stronger phase (Figure 2). On the other hand, the AIC picker is not "smart": it
will usually pick an "onset" for any segment of data, whether or not there is a true
phase arrival in the time window (Figure 3). For this reason, we need to guide the
AIC picker by choosing an appropriate window for it.
Figure 2. Seismogram with two phases and the corresponding AIC values. There are
clear local minima at each phase arrival, but the global minimum
indicates the arrival of the stronger phase.
Figure 3. Seismic noise data and its AIC values. The minimum value does not indicate
any phase arrival although it divides the data into two different stationary segments.
3. Artificial Neural Network: Multilayer Perceptrons (MLP)
Multilayer perceptrons have been successfully applied to solve many difficult and
diverse problems. The mathematical neuron model was proposed by McCulloch and Pitts
(1943) to mimic the behavior of a biological neuron (Haykin, 1999). The biological
neuron is mainly composed of three parts: the dendrites, the soma, and the axon. The
dendrites accept information from other neurons by synapses. These input signals are
attenuated with an increasing distance from the synapses to the soma. The soma
integrates the received signal and thereafter activates an output depending on the total
input. The axon transmits the output signal to other neurons by the synapses located at the
tree structure at the end of the axon (Ban, 2000).
The mathematical neuron proceeds in a similar but simpler way, as integration
takes place only over space. Typically, the network is made up of sets of nodes arranged
in layers, an input layer, one or more hidden layers and an output layer. The input signal
propagates through the network in a forward direction, on a layer-by-layer basis. Each
node is the basic processing unit with a nonlinear activation function. The outputs of the
nodes in one layer are transmitted to nodes in another layer through links called weights,
which can effectively amplify or attenuate the signals. Except for the input layer, the net
input to each node is the sum of the weighted outputs of nodes in the previous layer.
An MLP solves such problems after being trained in a supervised
manner with a highly popular algorithm known as the error back-propagation algorithm,
which is based on the error-correction learning rule. Error-correction
learning consists of two phases: a forward phase and a backward phase. In the forward
phase, the input vector is fed into the nodes of the input layer and propagates through the
network layer by layer, with the weights connecting the network nodes held fixed. The
output vector is produced as the actual response of the network.
During the backward phase, the synaptic weights are all adjusted based on an
error-correction rule. The method seeks the weights that minimize the mismatch between
the desired output pattern and the actual output over all
of the training samples. The mismatch for each input-output pair is computed at the
output layer, used to correct the synaptic weights between the hidden and output layers, and then
propagated backwards through the network to adjust the remaining synaptic
weights, so that the actual response of the network moves closer to the desired response
in a statistical sense.
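The forward and backward phases can be made concrete with a small sketch in Python. It is an illustration under stated assumptions rather than the training program used in this report: the network has one tanh hidden layer and a sigmoid output, a constant 1.0 is appended to each layer's input as a bias term, weights are updated after every sample without momentum, and the toy task (logical AND), the learning rate, and all names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, W2):
    """Forward phase: weights are fixed; returns hidden and output activations."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in W1]
    o = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in W2]
    return h, o


def train_step(x, target, W1, W2, lr):
    """Backward phase: propagate the output error back and adjust all weights."""
    h, o = forward(x, W1, W2)
    # Output deltas for squared error; the sigmoid derivative is y * (1 - y).
    d_out = [(t - y) * y * (1.0 - y) for t, y in zip(target, o)]
    # Hidden deltas: output deltas sent back through W2; tanh derivative is 1 - h^2.
    d_hid = [(1.0 - hj * hj) * sum(d_out[k] * W2[k][j] for k in range(len(W2)))
             for j, hj in enumerate(h)]
    for k, row in enumerate(W2):
        for j, v in enumerate(h + [1.0]):
            row[j] += lr * d_out[k] * v
    for j, row in enumerate(W1):
        for i, v in enumerate(x + [1.0]):
            row[i] += lr * d_hid[j] * v


def total_error(data, W1, W2):
    return sum(sum((t - y) ** 2 for t, y in zip(target, forward(x, W1, W2)[1]))
               for x, target in data) / 2.0


# Toy training set (logical AND), used only to show that the error decreases.
data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [0.0]), ([1.0, 0.0], [0.0]), ([1.0, 1.0], [1.0])]
random.seed(1)
W1 = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]  # 2 inputs + bias
W2 = [[random.uniform(-0.5, 0.5) for _ in range(4)]]                    # 3 hidden + bias
err_before = total_error(data, W1, W2)
for _ in range(2000):
    for x, target in data:
        train_step(x, target, W1, W2, lr=0.5)
err_after = total_error(data, W1, W2)
print(err_before, err_after)
```

Because the update follows the local gradient of the error, the procedure can settle in a local minimum; the sketch only demonstrates that the mismatch shrinks on this simple task.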
A multilayer perceptron has three distinctive characteristics (Haykin, 1999):
(1) The model of each neuron in the network includes a nonlinear activation
function. Two types of nonlinear activation function are usually used: the
sigmoid function and the hyperbolic tangent function.
(2) The network includes one or more hidden layers, which enable the
network to learn complex tasks by extracting progressively more meaningful
features from the input patterns.
(3) The network exhibits a high degree of connectivity, which is determined by the
synapses of the network.
4. MLP neural network: Detection of the P-wave arrival
Several characteristic functions of the seismogram can be used as the input of the
neural network, such as the absolute value function, the square function, Allen’s function,
the envelope function, and the modified differential function. Following Dai’s method
(Dai et al., 1995, 1997), the absolute values of the seismogram are chosen as the input of
the MLP neural network, since they have the highest fidelity and processing speed and are
the most objective of these functions. The reason that the seismogram itself is not used
is that the first motion of an arrival has two directions (up and down) and is source
dependent.
Thirty samples of the absolute values of the seismogram are fed into the neural network. The
input samples are normalized because the amplitude of the seismogram is strongly
dependent on the magnitude and epicentral distance of an earthquake. With this
normalization, a small set of training data can cover recordings with different
amplitudes. For a P-wave segment, the arrival is located at the 20th sample. The noise
segment is extracted from the part of the record preceding the P-wave arrival. The part before
the onset is made longer than the part after it in order to achieve better distinction between
the signal patterns and noise patterns. Figure 4 shows a P-wave segment and a noise segment.
The neural network has two output nodes, which flag the input segment
with (1, 0) for P arrivals and (0, 1) for the background noise.
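The construction of one training pair can be sketched as follows. The report does not state how the samples are normalized or exactly where the noise segment is taken, so this sketch assumes division by the largest absolute value in the window and a noise window ending just before the onset; the function names are hypothetical.

```python
P_LABEL = (1, 0)      # desired output for a P-arrival segment
NOISE_LABEL = (0, 1)  # desired output for a background-noise segment


def make_input(trace, start, length=30):
    """Absolute values of `length` samples, scaled so that the largest is 1."""
    if start < 0 or start + length > len(trace):
        raise ValueError("window falls outside the trace")
    seg = [abs(v) for v in trace[start:start + length]]
    peak = max(seg)
    return [v / peak for v in seg] if peak > 0 else seg


def p_wave_segment(trace, onset, length=30, onset_pos=20):
    """Window in which the onset sample is the 20th sample (index 19)."""
    return make_input(trace, onset - (onset_pos - 1), length)


def noise_segment(trace, onset, length=30):
    """Window taken entirely from the record before the onset (assumed placement)."""
    return make_input(trace, onset - length, length)


# Synthetic trace with an onset at sample 100.
trace = [0.01 * (-1) ** i for i in range(100)] + [2.0 * (-1) ** i for i in range(50)]
p_seg = p_wave_segment(trace, onset=100)
n_seg = noise_segment(trace, onset=100)
training_pairs = [(p_seg, P_LABEL), (n_seg, NOISE_LABEL)]
print(p_seg[18], p_seg[19])
```

After normalization the P-wave window has small values for its first 19 samples and values near 1 afterwards, whatever the original amplitude of the recording.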
It is very important to select the appropriate training sets. The training sets should
represent the typical features of a signal with different frequency characters. A rule of
thumb is to begin with a very small training set and add new patterns until performance is
satisfactory. For the PASO array data, 9 pairs of the P-wave arrival and noise segments
are chosen to train the MLP network (Figure 5).
For the input vector, the MLP neural network creates a decision boundary in the input
space, making it possible to recognize patterns. Any given decision boundary can be closely
approximated by a two-layer network (one hidden layer and one output layer) with a
sigmoid activation function. For this reason, only one hidden layer is used in
the MLP neural network.
Figure 4. P-wave arrival and noise segments
Figure 5. 9 pairs of P-wave arrival and noise segments are used to
train the MLP neural network.
Currently, there is no reliable rule for determining the number of hidden nodes, which is
highly problem dependent (Hu, 2001). With too few hidden nodes, the network may not
be powerful enough for a given learning task. If too many hidden nodes are used,
however, the computation becomes too expensive and the network may over-fit the
current training sets and fail to generalize to other data sets. For our PASO dataset, 5
hidden nodes perform best, with a classification rate of 94.5% for the training set and 82%
for a separate testing set. Professor Hu’s program bp.m (Hu, 2001) is used to
train the network. Figure 6 shows the learning curve for the 18 P-wave arrival and noise
segments.
Figure 6. Learning curve for the MLP network with 5 hidden nodes. To train the network,
the learning rate is 0.1, the momentum is 0.8, and the epoch size is 18; the hyperbolic tangent
function and the sigmoid function are used for the hidden layer and the output layer, respectively.
After the MLP neural network is trained, it is applied to the entire seismogram by
sliding a time window of the same size as the input layer. The resulting outputs are
converted into a time series N(t) (Dai et al., 1995):
\[ N(t) = \frac{1}{2}\left[ o_1(t)^2 + \left(1 - o_2(t)\right)^2 \right] \]
which is used to detect the seismic arrivals. This function exaggerates the difference
between the desired output and the background noise. Figure 7 shows the seismogram
and its corresponding N(t) values. The point at which N(t) exceeds a
predefined threshold can be used to detect the P-wave arrival. For the PASO array data, 0.3 is
chosen as the threshold. With this method, 90% of P-wave arrivals are detected.
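The sliding-window detection can be sketched as below, writing o1 and o2 for the two network outputs so that N = (o1^2 + (1 - o2)^2) / 2 equals 1 for the ideal P-arrival output (1, 0) and 0 for the ideal noise output (0, 1). The trained MLP is not reproduced here: `toy_classifier` is a hypothetical stand-in that merely compares the first 19 and last 11 samples of the window, and the max-amplitude normalization is likewise an assumption.

```python
def detection_function(o1, o2):
    """N = 0.5 * (o1^2 + (1 - o2)^2), computed from the two network outputs."""
    return 0.5 * (o1 ** 2 + (1.0 - o2) ** 2)


def detect(trace, classify, length=30, threshold=0.3):
    """Slide a window over the trace; return (first window start with N > threshold, N values)."""
    n_values = []
    first = None
    for start in range(len(trace) - length + 1):
        seg = [abs(v) for v in trace[start:start + length]]
        peak = max(seg)
        if peak > 0:
            seg = [v / peak for v in seg]
        o1, o2 = classify(seg)
        n_values.append(detection_function(o1, o2))
        if first is None and n_values[-1] > threshold:
            first = start
    return first, n_values


def toy_classifier(seg):
    """Hypothetical stand-in for the trained MLP; NOT the network from this report."""
    before = sum(seg[:19]) / 19.0
    after = sum(seg[19:]) / 11.0
    o1 = 1.0 if after > 3.0 * before else 0.0
    return o1, 1.0 - o1


# Synthetic trace: weak noise up to sample 99, signal from sample 100 onward.
trace = [0.01 * (-1) ** i for i in range(100)] + [1.0 * (-1) ** i for i in range(60)]
first, n_values = detect(trace, toy_classifier)
print(first, max(n_values))
```

Any callable that maps a 30-sample window to a pair (o1, o2) can be passed as `classify`, so a trained network can replace the stand-in without changing `detect`.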
Figure 7. (a) Seismogram and (b) its N(t) values constructed from the outputs of the
neural network. Note that the N(t) function is shifted to the left by about 20 samples,
because the P-wave arrival corresponds to the 20th sample in the time window.
5. MLP neural network: picking the P-wave arrival
In Dai’s method, N(t) is also used to pick the arrival onset, which is taken as the local
maximum of N(t) within a window that begins when N(t) exceeds the
threshold and has the length of the input segment (Dai, 1995). However, Zhao (1999) noticed that
the detector is activated when a signal enters the window and is inhibited when it leaves
the window. In his opinion, it is more reasonable to determine the arrival time based on
the rising edge of the peak rather than on its maximum or the central point between the rising
edge and falling edge. The arrival time is then obtained by correcting T, the time when N(t)
exceeds the threshold, by an offset that is chosen empirically, say 0.2 seconds.
In contrast to the above methods, the AIC picker is used to pick the P-wave arrival for
our dataset. As noted in Section 2, the AIC picker alone is not smart enough to pick the
P-wave arrival correctly, but when a time window is chosen properly, it can pick the arrival
time accurately. For this reason, the MLP neural network is used only to detect the P-wave
arrival and then to choose a time window for the AIC picker (Figure 8). For the PASO
array data, we take the time when N(t) exceeds the threshold as the estimated arrival,
and a time window of 3 seconds is centered on this estimated arrival.
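The windowing step can be sketched as follows: given an estimated arrival index, cut a 3-second window centered on it and take the AIC minimum inside that window. This is an illustrative sketch only; the sampling rate of 100 Hz, the function names, and the synthetic trace are assumptions, and the AIC is computed here with running sums so that each split point costs constant time.

```python
import math
import random


def aic_min_index(x):
    """Return the split index k (start of the second segment) that minimizes AIC."""
    n = len(x)
    s1, s2 = [0.0], [0.0]  # running sums of x and x**2
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)
    best_k, best = None, float("inf")
    for k in range(2, n - 1):
        m = n - k
        v1 = s2[k] / k - (s1[k] / k) ** 2
        v2 = (s2[n] - s2[k]) / m - ((s1[n] - s1[k]) / m) ** 2
        if v1 <= 0 or v2 <= 0:  # skip degenerate segments
            continue
        aic = k * math.log(v1) + (n - k - 1) * math.log(v2)
        if aic < best:
            best_k, best = k, aic
    return best_k


def pick_in_window(trace, estimate, fs, window_sec=3.0):
    """AIC pick inside a window of `window_sec` seconds centered on `estimate`."""
    half = int(round(window_sec * fs / 2.0))
    lo = max(0, estimate - half)
    hi = min(len(trace), estimate + half)
    return lo + aic_min_index(trace[lo:hi])


random.seed(2)
fs = 100.0  # hypothetical sampling rate in Hz
trace = ([random.gauss(0.0, 0.1) for _ in range(600)]
         + [random.gauss(0.0, 1.0) for _ in range(400)])
estimate = 580  # rough detection, e.g. from the N(t) threshold crossing
pick = pick_in_window(trace, estimate, fs)
print(pick, pick / fs)
```

The rough estimate only needs to fall close enough to the onset that the true arrival lies inside the window; the AIC minimum then refines it.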
Compared with manual picks, this picker provides onset times and uncertainties with
high confidence. 91% of autopicks are within 0.15 seconds of analyst picks for this data
set (Figure 9).
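The agreement statistic quoted above is simply the fraction of automatic picks that fall within a tolerance of the analyst picks. A minimal sketch, with a hypothetical function name and made-up pick times that are not PASO data:

```python
def fraction_within(auto_picks, analyst_picks, tolerance=0.15):
    """Fraction of paired picks (in seconds) that differ by at most `tolerance`."""
    pairs = list(zip(auto_picks, analyst_picks))
    hits = sum(1 for a, m in pairs if abs(a - m) <= tolerance)
    return hits / len(pairs)


# Illustrative values only: differences are 0.02, 0.20, 0.05, and 0.10 seconds.
auto = [10.02, 5.50, 7.90, 3.00]
manual = [10.00, 5.30, 7.95, 3.10]
rate = fraction_within(auto, manual)
print(rate)
```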
6. Conclusions and further work
An MLP-AIC picker is proposed to detect and pick the P-wave arrival. An MLP neural
network can be trained well with a small set of P-wave arrival and noise segments. The
network is used to detect the P-wave arrival and provide a time window for the AIC
picker, which can pick the P-wave arrival accurately within the time window. Tested with
real data, the MLP-AIC picker does not appear to be very sensitive to the signal-to-noise ratio
and detects 90% of P-wave arrivals. Among those picks, 91% of autopicks are within 0.15 seconds of
analyst picks. Compared with the conventional STA/LTA method, the MLP neural
network is more adaptive with regard to phase frequency, since the network can be trained
with patterns covering a variety of frequency characteristics.
Further work should focus on applying this method to pick the S-wave arrival and
improving the detection rate of the picker.
Acknowledgements
The author acknowledges Professor Yu Hen Hu for his insightful instruction on the
Artificial Neural Network and Fuzzy Systems, and his permission to use his MLP
programs.
Figure 8. (a) The same seismogram as shown in Figure 7a, and (b) its corresponding
AIC values. The minimum AIC value indicates the P-wave arrival.
Figure 9. The MLP-AIC picker is used to pick the P-wave arrival for 3 seismograms from
PASO array data.
References
Akaike, H., Markovian representation of stochastic processes and its application to the
analysis of autoregressive moving average processes, Ann. Inst. Stat. Math., 26, 363-387, 1974
Allen, R.V., Automatic earthquake recognition and timing from single trace, Bull. Seism.
Soc. Am., 68, 1521-1532, 1978
Ban, M., C. Jutten, Neural networks in geophysical applications, Geophysics, 65, 4,
1032-1047, 2000
Dai, H., C. MacBeth, Automatic picking of seismic arrivals in local earthquake data using
an artificial neural network, Geophysical Journal International, 120, 758-774,
1995
Dai, H., C. MacBeth, The application of back-propagation neural network to automatic
picking seismic arrivals from single-component recordings, Journal of
Geophysical Research, 102, B7, 15,105-15,114, 1997
Haykin, S., Neural Networks: A Comprehensive Foundation, Prentice Hall, New Jersey,
second edition, 1999
Hu, Yu Hen, Course notes on Introduction to Artificial Neural Network and Fuzzy
Systems, 2001
Leonard, M., B.L.N. Kennett, Multi-component autoregressive techniques for the
analysis of seismograms, Physics of the Earth and Planetary Interiors, 113, 247-263, 1999
Maeda, N., A method for reading and checking phase times in auto-processing system
of seismic wave data, Zisin=Jishin, 38, 3, 365-379, 1985
Sleeman, R., T. van Eck, Robust automatic P-phase picking: an on-line implementation
in the analysis of broadband seismogram recordings, Physics of the Earth and
Planetary Interiors, 113, 265-275, 1999
Wang, J., T. Teng, Identification and picking of S-phase using an artificial neural
network, Bulletin of the Seismological Society of America, 87, 5, 1140-1149, 1997
Zhao, Y., K. Takano, An artificial neural network approach for broadband seismic phase
picking, Bulletin of the Seismological Society of America, 89, 3, 670-680, 1999