Analysis of Untraditional Risks Based on Data Mining
Zhong Ning1, Yue Yang1, Tong Liu2
School of Management, Fudan University, Shanghai 200433,
Dep. of Economy, Customs College, Shanghai 200433
Abstract - Customs development faces not only traditional security risks but also a growing number of non-traditional security factors that exacerbate the risk of customs operations. In this paper, data mining is used to analyze a large volume of customs data, and the applicability of a neural network model to non-traditional security risks in customs is explored. The study suggests that data mining is a promising approach to non-traditional security issues and provides a methodological basis for future research.
Keywords - Data Mining; Non-traditional Security;
Risk Management; Neural Network
In recent years, with the accelerating process of economic globalization, international and domestic factors have required customs to perform more non-traditional functions. Customs authorities face an increasing and expanding range of non-traditional functional tasks.
In the 1970s, some American scholars proposed the concept of non-traditional security and defined its scope and nature, as well as its differences from and connections with traditional security. In the late 20th and early 21st centuries, research on non-traditional security issues expanded. Since the events of September 11 and the SARS outbreak, non-traditional security issues have attracted great concern. Thurman, Mathias, and Emshwiller identified non-traditional security factors facing customs. Linda Rosencrance discussed the relationship between information technology and non-traditional security risks in customs.
In recent years, as the non-traditional security threat to China has expanded, domestic academic research on non-traditional security has been significantly strengthened. Some scholars, such as Yu Xiaofeng, Guozhi Lin, and Zhou, have introduced the concept, background, and features of "non-traditional security", discussed its relationship to building China's role and image as a responsible big country, and proposed a Chinese way of "non-traditional security maintenance." Although many scholars have put forward ideas for China's response at the strategic level, few have addressed applications at the technical level, and even fewer have addressed the non-traditional security functions of customs. The traditional security functions of Chinese customs are already quite mature, whereas the non-traditional security functions have not yet attracted sufficient attention.
Data mining is the process of extracting valid, novel, potentially useful, and ultimately understandable patterns from large amounts of data stored in databases, data warehouses, or other repositories. In many cases, data mining is also known as knowledge discovery in databases. Data mining is the most important part of knowledge discovery. In technical terms, it refers to extracting previously unknown and valuable patterns or rules from large amounts of data, using methods that include association rules, time series analysis, artificial intelligence, statistics, and database techniques. The knowledge discovered from databases can be used in information management, process control, scientific research, decision support, and many other areas.
The essential difference between data mining and traditional data analysis (such as queries, reports, and online analytical processing) is that data mining discovers knowledge without a clear prior hypothesis. The information obtained by data mining has three characteristics: it is previously unknown, valid, and useful. Previously unknown information is information that was not anticipated in advance. That is, data mining seeks information or knowledge that cannot be found by intuition and may even be counterintuitive. The more unexpected the information uncovered, the more valuable it may be. Therefore, data mining is more suitable than traditional data analysis for this exploratory study.
Back propagation is a neural network learning algorithm. A neural network is a set of connected input/output units in which each connection has an associated weight. During the learning phase, the network learns by adjusting these weights so that it can predict the correct class label of the input tuples.
A. Multi-layer feed forward neural networks
The back propagation algorithm performs learning on a multilayer feed forward neural network. It iteratively learns a set of weights for predicting the class label of tuples. A multilayer feed forward neural network consists of an input layer, one or more hidden layers, and an output layer. An example of a multilayer feed forward network is shown in the following figure:
Figure 1. Multi-layer feed forward neural network
Each layer is composed of units. The inputs to the network correspond to the attributes measured for each training tuple. The inputs are fed simultaneously into the units making up the input layer. These inputs pass through the input layer and are then weighted and fed simultaneously to a second layer of "neuron-like" units, known as a hidden layer. The outputs of the hidden layer units can be input to another hidden layer, and so on. The number of hidden layers is arbitrary, although in practice usually only one is used. Finally, the weighted outputs of the last hidden layer are input to the units making up the output layer, which emits the network's prediction for the given tuples.
The units in the input layer are called input units. The units in the hidden layers and the output layer are sometimes called neurodes, because of their symbolic biological basis, or output units. The multi-layer neural network shown in Figure 1 has two layers of output units and is therefore called a two-layer neural network. (The input layer is not counted, because it only passes the input values to the next layer.) Similarly, a network containing two hidden layers is called a three-layer neural network, and so on. The network is feed forward if none of its weights cycles back to an input unit or to an output unit of a previous layer. The network is fully connected if each unit provides input to every unit in the next layer.
Each output unit takes as input a weighted sum of the outputs of the units in the previous layer (see Figure 1). It applies a nonlinear (activation) function to the weighted input. A multilayer feed forward neural network can model the class prediction as a nonlinear combination of the inputs. From a statistical standpoint, it performs nonlinear regression. Given enough hidden units and enough training samples, a multilayer feed forward network can closely approximate any function.
B. Defining the network topology
Before training begins, the user must decide on the network topology by specifying the number of units in the input layer, the number of hidden layers (if more than one), the number of units in each hidden layer, and the number of units in the output layer.
Normalizing the input values for each attribute measured in the training tuples helps speed up the learning phase. Typically, input values are normalized to fall between 0.0 and 1.0. Discrete-valued attributes may be encoded so that there is one input unit per domain value. For example, if attribute A has three possible or known values {a0, a1, a2}, three input units I0, I1, I2 can be assigned to represent A. Each unit is initialized to 0. If A = a0, then I0 is set to 1; if A = a1, then I1 is set to 1; and so on. Neural networks can be used for classification (predicting the class label of a given tuple) or prediction (predicting a continuous-valued output). For classification, one output unit can represent two classes (a value of 1 represents one class and a value of 0 the other). If there are more than two classes, one output unit per class is used.
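The two input encodings described above can be illustrated with a minimal Python sketch. The function names and the sample values are hypothetical and are chosen only for illustration; the sketch assumes simple min-max scaling as the normalization method, which the text does not specify.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly into the range 0.0 to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(value, domain):
    """One input unit per domain value; only the matching unit is set to 1."""
    units = [0] * len(domain)
    units[domain.index(value)] = 1
    return units


# Hypothetical continuous attribute (for example, a declared quantity).
print(min_max_normalize([10.0, 20.0, 40.0]))
# Attribute A = a1 with domain {a0, a1, a2} becomes I0 = 0, I1 = 1, I2 = 0.
print(one_hot("a1", ["a0", "a1", "a2"]))
```

With this encoding, an attribute with three possible values contributes three input units to the network, so the size of the input layer depends on the attribute domains as well as on the number of attributes.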
There are no clear rules for the "best" number of hidden layer units. Network design is a trial-and-error process, and the design may affect the accuracy of the trained network. The initial values of the weights may also affect the resulting accuracy. Once a network has been trained, if its accuracy is not acceptable, the training process is usually repeated with a different network topology or a different set of initial weights. Cross-validation techniques for accuracy estimation can be used to help decide when an acceptable network has been found.
C. Back propagation
Back propagation learns by iteratively processing a data set of training tuples, comparing the network's prediction for each tuple with the actual known target value. The target value may be the known class label of the training tuple (for classification) or a continuous value (for prediction). For each training tuple, the weights are modified so as to minimize the mean squared error between the network's prediction and the actual target value. These modifications are made in the "backward" direction, i.e., from the output layer, through each hidden layer, down to the first hidden layer (hence the name back propagation). Although it is not guaranteed, in general the weights will eventually converge and the learning process stops.
(1) Initialize the weights: the weights in the network are initialized to small random numbers (for example, ranging from -1.0 to 1.0, or from -0.5 to 0.5). Each unit has an associated bias, and the biases are similarly initialized to small random numbers.
Each training tuple X is processed by the following steps.
(2) Propagate the inputs forward: First, the training tuple is fed to the input layer of the network. The inputs pass through the input units unchanged. In other words, for an input unit j, its output Oj is equal to its input value Ij. Next, the net input and output of each unit in the hidden and output layers are computed. The net input to a unit in the hidden or output layers is computed as a linear combination of its inputs. To help explain this, Figure 2 shows a hidden layer or output layer unit. Each such unit has a number of inputs, which are the outputs of the units connected to it in the previous layer. Each connection has a weight. To compute the net input to the unit, each input connected to the unit is multiplied by its corresponding weight, and the results are summed. Given a unit j in a hidden or output layer, the net input Ij to unit j is

$$I_j = \sum_i w_{ij} O_i + \theta_j \qquad (1)$$

where wij is the weight of the connection from unit i in the previous layer to unit j; Oi is the output of unit i from the previous layer; and θj is the bias of unit j. The bias acts as a threshold in that it serves to vary the activity of the unit.
Figure 2. Forward propagation input
A hidden or output layer unit j: the inputs to unit j are outputs from the previous layer. These are multiplied by their corresponding weights to form a weighted sum, which is added to the bias associated with unit j. A nonlinear activation function is applied to the net input. (For ease of explanation, the inputs to unit j are labeled y1, y2, ..., yn. If unit j were in the first hidden layer, these inputs would correspond to the input tuple (x1, x2, ..., xn).)
Each unit in the hidden and output layers takes its net input and then applies an activation function to it, as shown above. The function symbolizes the activation of the neuron represented by the unit. The logistic, or sigmoid, function is used. Given the net input Ij to unit j, the output Oj of unit j is computed as

$$O_j = \frac{1}{1 + e^{-I_j}} \qquad (2)$$

This function is also known as a squashing function, because it maps a large input domain onto the smaller range of 0 to 1. The logistic function is nonlinear and differentiable, allowing the back propagation algorithm to model classification problems that are not linearly separable.
The output values Oj are computed for each hidden layer, up to and including the output layer, which gives the network's prediction. In practice, it is a good idea to cache the intermediate output values of each unit, because they are required again later when the error is propagated backward. This technique can significantly reduce the amount of computation required.
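The forward propagation step can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used in this study: the function names, the 3-2-1 topology, and all weight and bias values below are hypothetical. Each unit computes its net input as the weighted sum of the previous layer's outputs plus its bias and then applies the logistic function, and the outputs of every layer are cached for the later backward pass.

```python
import math


def sigmoid(net_input):
    """Logistic function: squashes any real number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-net_input))


def layer_forward(inputs, weights, biases):
    """Outputs of one hidden or output layer.

    weights[i][j] is the weight from unit i of the previous layer to unit j.
    """
    outputs = []
    for j in range(len(biases)):
        net = sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        net += biases[j]  # net input = weighted sum + bias
        outputs.append(sigmoid(net))
    return outputs


def forward(x, layers):
    """Propagate tuple x through all layers and cache every layer's outputs."""
    cached = [list(x)]  # input units pass their values through unchanged
    for weights, biases in layers:
        cached.append(layer_forward(cached[-1], weights, biases))
    return cached


# Hypothetical network: 3 input units, 2 hidden units, 1 output unit.
hidden = ([[0.2, -0.3], [0.4, 0.1], [-0.5, 0.2]], [-0.4, 0.2])
output = ([[-0.3], [-0.2]], [0.1])
inputs, hidden_out, net_out = forward([1, 0, 1], [hidden, output])
print(hidden_out, net_out)
```

For these hypothetical values the two hidden units output roughly 0.332 and 0.525, and the output unit roughly 0.474.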
(3) Back propagate the error: the error is propagated backward by updating the weights and biases to reflect the error of the network's prediction. For a unit j in the output layer, the error Errj is computed by

$$Err_j = O_j (1 - O_j)(T_j - O_j) \qquad (3)$$

where Oj is the actual output of unit j, and Tj is the known target value of the given training tuple. Note that Oj(1 - Oj) is the derivative of the logistic function.
To compute the error of a hidden layer unit j, the weighted sum of the errors of the units connected to unit j in the next layer is considered. The error of a hidden layer unit j is

$$Err_j = O_j (1 - O_j) \sum_k Err_k w_{jk} \qquad (4)$$

where wjk is the weight of the connection from unit j to a unit k in the next higher layer, and Errk is the error of unit k.
The weights and biases are updated to reflect the propagated errors. Weights are updated by equations (5) and (6), where Δwij is the change in weight wij:

$$\Delta w_{ij} = (l)\, Err_j\, O_i \qquad (5)$$
$$w_{ij} = w_{ij} + \Delta w_{ij} \qquad (6)$$

The variable l is the learning rate, a constant typically having a value between 0.0 and 1.0. Back propagation learns using a gradient descent method to search for a set of weights that fits the training data so as to minimize the mean squared distance between the network's class prediction and the known target value of the tuples. The learning rate helps avoid getting stuck at a local minimum in decision space (i.e., where the weights appear to converge but are not the optimal solution) and encourages finding the global minimum. If the learning rate is too small, learning will be very slow. If the learning rate is too large, oscillation between inadequate solutions may occur. A rule of thumb is to set the learning rate to 1/t, where t is the number of iterations through the training set so far.
Biases are updated by the following equations, where Δθj is the change in bias θj:

$$\Delta\theta_j = (l)\, Err_j \qquad (7)$$
$$\theta_j = \theta_j + \Delta\theta_j \qquad (8)$$
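The error and update rules can be combined into a single update for one training tuple. The following Python sketch assumes a network with exactly one hidden layer; the function name `train_on_tuple`, the 3-2-1 topology, the learning rate of 0.9, and all numeric values are hypothetical. Each weight is changed by the learning rate times the error of the receiving unit times the output of the sending unit, and each bias by the learning rate times the error of its unit.

```python
import math


def sigmoid(net_input):
    return 1.0 / (1.0 + math.exp(-net_input))


def train_on_tuple(x, target, w_hid, b_hid, w_out, b_out, rate):
    """Process one training tuple and update weights and biases in place.

    w_hid[i][j]: weight from input unit i to hidden unit j.
    w_out[j][k]: weight from hidden unit j to output unit k.
    """
    n_in, n_hid, n_out = len(x), len(b_hid), len(b_out)
    # Forward pass; intermediate outputs are kept for the backward pass.
    hid = [sigmoid(sum(x[i] * w_hid[i][j] for i in range(n_in)) + b_hid[j])
           for j in range(n_hid)]
    out = [sigmoid(sum(hid[j] * w_out[j][k] for j in range(n_hid)) + b_out[k])
           for k in range(n_out)]
    # Output layer error: O(1 - O)(T - O).
    err_out = [out[k] * (1 - out[k]) * (target[k] - out[k])
               for k in range(n_out)]
    # Hidden layer error: O(1 - O) times the weighted sum of next-layer errors,
    # computed with the weights as they were before this update.
    err_hid = [hid[j] * (1 - hid[j])
               * sum(err_out[k] * w_out[j][k] for k in range(n_out))
               for j in range(n_hid)]
    # Weight and bias updates.
    for j in range(n_hid):
        for k in range(n_out):
            w_out[j][k] += rate * err_out[k] * hid[j]
    for k in range(n_out):
        b_out[k] += rate * err_out[k]
    for i in range(n_in):
        for j in range(n_hid):
            w_hid[i][j] += rate * err_hid[j] * x[i]
    for j in range(n_hid):
        b_hid[j] += rate * err_hid[j]
    return out


# Hypothetical 3-2-1 network and a single tuple with target value 1.
w_hid = [[0.2, -0.3], [0.4, 0.1], [-0.5, 0.2]]
b_hid = [-0.4, 0.2]
w_out = [[-0.3], [-0.2]]
b_out = [0.1]
train_on_tuple([1, 0, 1], [1], w_hid, b_hid, w_out, b_out, rate=0.9)
print(w_out, b_out)
```

Because the second input value is 0, the weights leaving that input unit are unchanged by this update; only connections that carried a nonzero signal are adjusted.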
Note that here the weights and biases are updated after the presentation of each tuple, which is called case updating. Alternatively, the weight and bias increments can be accumulated in variables, so that the weights and biases are updated after all the tuples in the training set have been presented. The latter strategy is called epoch updating, where one iteration through the training set is an epoch (cycle). In theory, the mathematical derivation of back propagation employs epoch updating, yet in practice case updating is more common because it usually produces more accurate results.
Termination conditions: training stops when
all Δwij in the previous epoch are smaller than a specified threshold, or
the percentage of tuples misclassified in the previous epoch is less than a threshold, or
a pre-specified number of epochs has expired.
The computational efficiency depends on the time spent training the network. Given |D| tuples and w weights, each epoch requires O(|D| × w) time. However, in the worst case, the number of epochs can be exponential in n, the number of inputs. In practice, the time required for the network to converge is highly variable.
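The three termination conditions can be expressed as a training loop. To keep the example short, the sketch below trains a single sigmoid output unit with no hidden layer, using the output-unit error rule and case updating; the function name `train`, the threshold values, and the toy data set (the logical OR of two inputs) are assumptions made for illustration only.

```python
import math


def sigmoid(net_input):
    return 1.0 / (1.0 + math.exp(-net_input))


def train(data, weights, bias, rate=0.5, max_epochs=1000,
          min_delta=1e-4, max_error_rate=0.0):
    """Train one sigmoid unit and report why training stopped.

    Returns (weights, bias, epochs_run, reason).
    """
    for epoch in range(1, max_epochs + 1):
        largest_delta = 0.0
        misclassified = 0
        for x, target in data:
            out = sigmoid(sum(xi * wi for xi, wi in zip(x, weights)) + bias)
            if (out >= 0.5) != (target == 1):
                misclassified += 1
            err = out * (1 - out) * (target - out)
            for i, xi in enumerate(x):
                delta = rate * err * xi
                weights[i] += delta
                largest_delta = max(largest_delta, abs(delta))
            bias += rate * err
        # Condition 1: every weight change in this epoch was tiny.
        if largest_delta < min_delta:
            return weights, bias, epoch, "weight changes below threshold"
        # Condition 2: the share of misclassified tuples is small enough.
        if misclassified / len(data) <= max_error_rate:
            return weights, bias, epoch, "error rate below threshold"
    # Condition 3: the pre-specified number of epochs has expired.
    return weights, bias, max_epochs, "epoch limit reached"


# Toy data set: logical OR of two binary inputs.
or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b, epochs, reason = train(or_data, [0.0, 0.0], 0.0)
print(epochs, reason)
```

Each epoch of this loop touches every weight once per tuple, which is the source of the O(|D| × w) cost per epoch.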
D. Neural network design
For non-traditional security risks facing customs, such as illegal smuggling and infringement of intellectual property rights, the final result of the analysis falls into one of two classes: no violation found, or a violation found. These two outcomes are completely different in nature. If customs can determine in advance, from basic clearance data, whether goods carry a risk and at what level, it can concentrate on high-risk goods and inspect them more intensively, while goods with no risk or a low risk rating can go through simplified clearance procedures with appropriate checks. This can greatly improve the efficiency of customs and allow a rational allocation of officers.
Currently, customs determines the risk of a piece of information only after investigation, based on the results of the investigation and of the subsequent hearing. If the information carries only an ordinary level of risk, or none, there is no need for the department to devote so many resources to it. To conserve resources, data mining technology can be used to make a simple analysis of the risk of the goods, based on the basic attributes of the goods being cleared combined with the records in the customs database, and thereby support the follow-up management work of customs.
This study selected a portion of the import and export customs clearance data of Shanghai Customs Xuzhou for 2006 - 2010 as the research sample; the sample is given in the appendix. Constrained by data availability, only 82 representative samples were selected, of which 62 were used for model training and 20 for model testing.
When selecting indicators, the indicators available for general cargo clearance include the business number, business name, corporate credit rating, import and export method, trade mode, import and export cargo information (quantity, price, etc.), country (region) of import or export, and accompanying documents. For smuggling and infringement cases in which irregularities have been found, the risk indicators include the illegal method, illegal channel, selected clearance channel, seized tools, case type, amount of money involved, and amount of tax evaded. In this paper, indicators with relatively complete data were selected first. Then, by consulting relevant customs documents and customs personnel, and after testing and validation, five risk indicators were finally selected as the model input variables.
The original customs database used as a reference includes fields such as corporate name, import and export method, and illegal method.
The raw data show that many fields are described in free text: attribute names and attribute values are not discrete but consist of somewhat vague descriptions, and some attributes, such as the seizing unit, have little to do with the purpose and focus of the classification. Building a data warehouse also requires the raw data to be pre-processed so that specific algorithms can be applied to it.
Data cleaning "cleans up" the data by filling in missing values, smoothing noisy data, identifying and removing outliers, and resolving inconsistencies. In this study, the raw data were processed in the following respects:
(1) Text attribute values were converted into numeric values that the neural network algorithm can accept. For example, the enterprise risk level has five grades (AA, A, B, C, D), which were transformed so that "0" denotes a Class AA enterprise, "1" a Class A enterprise, "2" a Class B enterprise, "3" a Class C enterprise, and "4" a Class D enterprise. The data thus become recognizable to the neural network algorithm.
(2) Attributes in the original data that have little to do with the purpose and focus of the classification, such as the seizing unit and the seizure time, were removed.
(3) Missing values in the database were handled. For example, the enterprise risk level is divided into five grades (AA, A, B, C, D), but the database often contains violations and smuggling by individuals, who have no such grade. These individuals were classified as Class C enterprises.
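The three pre-processing steps above can be sketched in Python as follows. The record layout and the field names (`credit_rating`, `seizing_unit`, `seizure_time`, `case_type`) are hypothetical, since the structure of the customs database is not reproduced here; only the rating codes and the treatment of individuals follow the description above.

```python
# Step (1): numeric codes for the five enterprise risk levels.
RATING_CODES = {"AA": 0, "A": 1, "B": 2, "C": 3, "D": 4}
# Step (2): attributes unrelated to the classification (hypothetical names).
DROPPED_FIELDS = {"seizing_unit", "seizure_time"}


def preprocess(record):
    """Return a cleaned copy of one raw record (a dict of field -> value)."""
    cleaned = {k: v for k, v in record.items() if k not in DROPPED_FIELDS}
    # Step (3): individuals have no rating; treat them as Class C.
    rating = cleaned.get("credit_rating") or "C"
    cleaned["credit_rating"] = RATING_CODES[rating]
    return cleaned


raw = {"credit_rating": None, "case_type": "smuggling", "seizing_unit": "X"}
print(preprocess(raw))
```

Note that integer codes impose an ordering on the grades, which is reasonable here because the five levels are themselves ordered from AA to D.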
The neural network was built, trained, and tested on the samples using the methods described above. The parameters were set as follows: 5 units in the input layer, 20 units in the hidden layer, and 2 units in the output layer. The training function was TRAINGD, the maximum number of training epochs was 1000, the training accuracy goal was 0.1, and the learning rate was 0.05. The training function uses an adaptive learning rate algorithm.
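To make the stated topology concrete, the following Python sketch builds weight and bias tables of the corresponding sizes. It is not the code used in this study: the names, the random seed, and the initialization range of -0.5 to 0.5 (one of the ranges mentioned in the initialization step) are assumptions.

```python
import random

# Settings stated in the text; the dictionary keys are hypothetical names.
CONFIG = {"n_input": 5, "n_hidden": 20, "n_output": 2,
          "max_epochs": 1000, "goal": 0.1, "learning_rate": 0.05}


def init_layer(n_in, n_out, rng, limit=0.5):
    """Small random weights and biases drawn uniformly from [-limit, limit]."""
    weights = [[rng.uniform(-limit, limit) for _ in range(n_out)]
               for _ in range(n_in)]
    biases = [rng.uniform(-limit, limit) for _ in range(n_out)]
    return weights, biases


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
w_hidden, b_hidden = init_layer(CONFIG["n_input"], CONFIG["n_hidden"], rng)
w_output, b_output = init_layer(CONFIG["n_hidden"], CONFIG["n_output"], rng)

n_weights = (CONFIG["n_input"] * CONFIG["n_hidden"]
             + CONFIG["n_hidden"] * CONFIG["n_output"])
n_biases = CONFIG["n_hidden"] + CONFIG["n_output"]
print(n_weights, n_biases)  # 140 connection weights and 22 biases
```

With 62 training samples and 140 connection weights, one pass over the training set involves on the order of 62 × 140 = 8680 weight evaluations, so a network of this size trains quickly.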
The output has two classes, infringement not found and infringement found, each encoded as a two-element output vector ([1 0] for the first class).
Verifying the actual output of the neural network, we found that the fitting accuracy was 85%, so the model can be used to assist non-traditional security risk management. At the same time, this accuracy is relatively low, indicating that the model can be further optimized. The reasons are that the indicators built from the data are rough, and that, because of the limited number of model input variables and limited data availability, many risk factors were not considered in the model, so that the predicted values do not agree closely with the actual values. This work needs to be strengthened and improved in the future to make the model more accurate.
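An accuracy figure of this kind can be computed by decoding each two-unit output into a class and comparing it with the known class. The Python sketch below uses made-up network outputs, not the outputs of the model in this study, and assumes that the class is taken from the output unit with the larger activation.

```python
def predicted_class(outputs):
    """Index of the output unit with the largest activation."""
    return max(range(len(outputs)), key=lambda k: outputs[k])


def accuracy(network_outputs, true_classes):
    """Fraction of tuples whose predicted class equals the known class."""
    correct = sum(1 for out, cls in zip(network_outputs, true_classes)
                  if predicted_class(out) == cls)
    return correct / len(true_classes)


# Hypothetical outputs for 20 tuples that all belong to class 0:
# 17 are decoded as class 0 and 3 as class 1.
outputs = [[0.9, 0.1]] * 17 + [[0.2, 0.8]] * 3
classes = [0] * 20
print(accuracy(outputs, classes))
```

In this made-up case, 17 correct decisions out of 20 give an accuracy of 0.85.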
Starting from the concept of data mining and the existence of non-traditional security risk information, this paper proposed a neural network model for risk analysis. The textual customs import and export information was converted into standardized discrete data according to the relevant rules, a three-layer artificial neural network model was established on these discrete data to rate non-traditional security risks in customs, and the validity of the model was finally verified empirically. The model supports the establishment of an effective and scientific risk management mechanism, scientific management, and the efficient allocation of resources. It can improve the efficiency of customs officers, reduce the blindness and randomness of their work, and provide an effective decision support tool. As artificial neural network technology is an important means of developing intelligent systems, research on data-mining-based analysis of non-traditional security risks in customs is of great significance.
This research is aided financially by the National Natural Science Foundation (60974087), the Natural Science Foundation of Shanghai (09ZR1420900) and the Research and Innovation Project of Shanghai Education
[1].G. Hodge. Impact of the Internet on Customer Service and
Product Development Among the CENDI Agencies[J],
[2].Q. C. Thurman, A. L. Giacomazzi, M. D. Reisig et al.
Community-based gang prevention and intervention: An
evaluation of the neutral zone[J]. Crime & Delinquency,
1996, 42(2): 279
[3].M. Koenig-Archibugi. International governance as new
Raison The case of the EU common foreign and security
[4].L. G. Easley, R. L. Martin. System and method for providing
container security[J], 2006
2001, 35(43): 8
[6].J. Han, M. Kamber, J. Pei. Data mining: concepts and
techniques[M]: Morgan Kaufmann Pub, 2011
[7].P. Giudici, S. Figini. Applied data mining for business and
industry[M]: Wiley Online Library, 2009
[8].M. J. A. Berry, G. S. Linoff. Data mining techniques[M]:
Wiley-India, 2009