Artificial Neural Networks (ANN)
For Data Mining
The Biology Analogy


Brain cells vs. other cells? Neurons are the brain's cells.
- Nucleus (at the center)
- Dendrites provide inputs
- Axons send outputs
- Synapses increase or decrease connection strength and cause
  excitation or inhibition of subsequent neurons
Biological vs. Artificial Neurons
(Figure: three interconnected artificial neurons)

Biological                        Artificial
Soma                              Node
Dendrites                         Input
Axon                              Output
Synapse                           Weight
Slow speed                        Fast speed
Many neurons (50-150 billion)     Few neurons (dozens)
ANN Fundamentals

Components and Structure
- “A network is composed of a number of processing elements
  organized in different ways to form the network structure”
- Processing Elements (PEs) – neurons
- Network: a collection of neurons (PEs) grouped in layers
- Structure of the network: topologies / architectures – different
  ways to interconnect PEs
ANN Fundamentals

(Figure: calculations on the PE level)
ANN Fundamentals

Processing Information by the Network
- Inputs
- Outputs
- Connection weights
- Summation function
- Transfer function
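The per-PE processing described above can be sketched in a few lines of Python. This is a minimal illustration; the input values, weights, and linear transfer used below are assumptions, not from the slides.

```python
def pe_output(inputs, weights, transfer):
    """Compute a processing element's output: the summation function
    (weighted sum of inputs) passed through a transfer function."""
    s = sum(x * w for x, w in zip(inputs, weights))  # summation function
    return transfer(s)                               # transfer function

# Illustrative values (assumed): two inputs, two connection weights,
# and a linear transfer function.
y = pe_output([0.5, 1.0], [0.8, -0.2], transfer=lambda s: s)
```

The same `pe_output` works with any transfer function (sigmoid, tanh, ...) passed in as `transfer`.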
ANN Fundamentals

Transformation (Transfer) Function
- Computes the activation level of the neuron
- Function types: linear, sigmoid (logistic activation), or
  hyperbolic tangent
(Figure: example with output Y = 0.77)
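The three transfer function types can be written directly. As a hedged aside: the slide's example output Y = 0.77 is consistent with a sigmoid applied to a net input of about 1.2, but that net input is an assumption on my part.

```python
import math

def linear(s):
    """Identity transfer: output equals the weighted sum."""
    return s

def sigmoid(s):
    """Logistic activation: squashes the sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

def tanh(s):
    """Hyperbolic tangent: squashes the sum into (-1, 1)."""
    return math.tanh(s)

# Assumed net input of 1.2 reproduces the slide's Y = 0.77 example
y = round(sigmoid(1.2), 2)
```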
ANN Architectures / Structures

(Figure: example network structures)
Learning in ANN
1. Compute outputs
2. Compare outputs with desired targets
3. Adjust the weights and repeat the process
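The three-step cycle above can be sketched as one update of a single linear PE. The delta-rule update and the learning rate are illustrative assumptions; the slides do not prescribe a specific rule here.

```python
def learning_step(inputs, weights, target, lr=0.1):
    """One compute-compare-adjust cycle (delta rule on a linear PE).
    Returns the adjusted weights and the error that drove the change."""
    output = sum(x * w for x, w in zip(inputs, weights))  # 1. compute output
    error = target - output                               # 2. compare with target
    new_weights = [w + lr * error * x                     # 3. adjust the weights
                   for w, x in zip(weights, inputs)]
    return new_weights, error
```

Repeating the cycle shrinks the error step by step, which is exactly the "repeat the process" part of step 3.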
Neural Network Application Development

Preliminary steps:
- Requirement determination
- Feasibility study
- Top management champion

ANN Application Development Process
1. Collect Data
2. Separate into Training and Test Sets
3. Define a Network Structure
4. Select a Learning Algorithm
5. Set Parameter Values, Initialize Weights
6. Transform Data to Network Inputs
7. Start Training (Revise Weights)
8. Stop and Test
9. Implementation/Deployment: Use the Network with New Cases
Data Collection and Preparation

Collect data and separate it into:
- Training set (60%)
- Cross-validation set (20%)
- Testing set (20%)
Make sure that all three sets represent the population: true
random sampling (stratification).

(Figure: error (MSE) vs. number of iterations (epochs) for the
training and cross-validation sets, with the point of best
generalization marked)

- Use training and cross-validation cases to adjust the weights
- Use test cases to validate the trained network
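The 60/20/20 split can be sketched with a simple shuffle. This is a plain random split; a real project should stratify by class, as the slides recommend, and the seed value is an arbitrary assumption.

```python
import random

def split_data(records, seed=42):
    """Split records into 60% training, 20% cross-validation,
    and 20% testing sets via simple random sampling."""
    rng = random.Random(seed)   # fixed seed for reproducibility
    shuffled = records[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.6 * n)
    n_cv = int(0.2 * n)
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]
    return train, cv, test
```

The training and cross-validation sets drive weight adjustment (and early stopping at the best-generalization point); the test set is held back for final validation.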
Neural Network Architecture

Feed-forward Neural Network
- Multilayer perceptron – two, three, sometimes four or five layers

Example (figure): a movie box-office (BO) prediction network.

Input layer (26 PEs) encodes seven variables:
- MPAA Rating (5 PEs): G, PG, PG13, R, NR
- Competition (3 PEs): High, Medium, Low
- Star Value (3 PEs): High, Medium, Low
- Genre (10 PEs): Sci-Fi, Action, ...
- Technical Effects (3 PEs): High, Medium, Low
- Sequel (1 PE): Yes, No
- Number of Screens (positive integer)

Two hidden layers (Hidden Layer I: 18 PEs, Hidden Layer II: 16 PEs)
feed an output layer of 9 PEs, one per box-office class:
- Class 1 – FLOP (BO < 1M)
- Class 2 (1M < BO < 10M)
- Class 3 (10M < BO < 20M)
- Class 4 (20M < BO < 40M)
- Class 5 (40M < BO < 65M)
- Class 6 (65M < BO < 100M)
- Class 7 (100M < BO < 150M)
- Class 8 (150M < BO < 200M)
- Class 9 – BLOCKBUSTER (BO > 200M)
Neural Network Preparation
- Choose the network's structure (nodes and layers)
- Determine several parameters:
  - Learning rate (high or low) / momentum
  - Initial weight values
  - Other parameters
- Select initial conditions (randomize the weights)
- Transform training and testing data to the required format
- Non-numerical input data (text, pictures): preparation may
  involve simplification or decomposition
Training the Network
- Present the training data set to the network
- Adjust weights to produce the desired output for each of the inputs
- Several iterations of the complete training set are needed to get
  a consistent set of weights that works for all the training data
- Each iteration is called an epoch
- Batch vs. online learning
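The epoch loop can be sketched for a single linear PE. The delta-rule update, learning rate, and toy data are illustrative assumptions. Online learning (shown here) updates the weights after every training case; batch learning would instead accumulate the updates over the whole set and apply them once per epoch.

```python
def train_online(data, weights, lr=0.05, epochs=20):
    """Online (per-pattern) training of a linear PE.
    data: list of (input_vector, target) pairs."""
    for _ in range(epochs):              # each full pass is one epoch
        for inputs, target in data:
            output = sum(x * w for x, w in zip(inputs, weights))
            error = target - output
            # weights change after every case -> online learning
            weights = [w + lr * error * x
                       for w, x in zip(weights, inputs)]
    return weights
```

After enough epochs the weights become consistent across all the training cases, which is the stopping condition the slides describe.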
Supervised Learning: Backpropagation
- Back-propagation (back-error propagation)
- The most widely used learning algorithm
- Relatively easy to implement
- Requires training data for conditioning the network before
  using it as a predictor
- The network includes one or more hidden layers
- The network is considered feed-forward

* Also, look at the other learning methods in your book
Backpropagation Algorithm

How does the backpropagation algorithm minimize the error?
- By taking the partial derivative of the network's error with
  respect to each weight, the ANN learns the direction of the error
  and, if possible, moves to minimize it.

Backpropagation steps:
1. Initialize the weights
2. Read the input vector
3. Generate the output
4. Compute the error: Error = Output - Desired
5. Change the weights

Drawbacks:
- A large network can take a very long time to train
- May not converge
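The five steps above can be sketched as a minimal backpropagation trainer for a 2-input network with one hidden layer of sigmoid units. The network size, learning rate, and training task are assumptions chosen for illustration; this is a teaching sketch, not a production trainer.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train_backprop(data, n_hidden=2, lr=0.5, epochs=1000, seed=0):
    """Backpropagation for a 2-input / n_hidden / 1-output network.
    Each weight vector carries a trailing bias term."""
    rng = random.Random(seed)
    # 1. Initialize the weights with small random values
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for (x1, x2), target in data:        # 2. read the input vector
            # 3. Generate the output (forward pass)
            h = [sigmoid(w[0]*x1 + w[1]*x2 + w[2]) for w in w_h]
            out = sigmoid(sum(wo*hi for wo, hi in zip(w_o, h)) + w_o[-1])
            # 4. Compute the error and its partial derivatives
            delta_o = (out - target) * out * (1 - out)
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j])
                       for j in range(n_hidden)]
            # 5. Change the weights (gradient descent)
            for j in range(n_hidden):
                w_o[j] -= lr * delta_o * h[j]
            w_o[-1] -= lr * delta_o
            for j in range(n_hidden):
                w_h[j][0] -= lr * delta_h[j] * x1
                w_h[j][1] -= lr * delta_h[j] * x2
                w_h[j][2] -= lr * delta_h[j]
    return w_h, w_o

def predict(w_h, w_o, x1, x2):
    h = [sigmoid(w[0]*x1 + w[1]*x2 + w[2]) for w in w_h]
    return sigmoid(sum(wo*hi for wo, hi in zip(w_o, h)) + w_o[-1])
```

Even this tiny example shows the drawbacks the slides list: the nested loops grow with network and data size, and convergence depends on the learning rate and the random initial weights.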
Testing
- Test the network after training
- Examine network performance: measure the network’s
  prediction/classification ability
- Do the inputs produce the appropriate outputs?
- Not necessarily 100% accurate, but may be better than most
  other algorithms
- The test plan should include:
  - Routine cases
  - Potentially problematic situations
- May have to retrain based on the test results
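Measuring classification ability on the test set can be as simple as counting correct predictions. The threshold of 0.5 and the assumption that the network outputs a value in [0, 1] are illustrative choices, not from the slides.

```python
def accuracy(network, test_cases, threshold=0.5):
    """Fraction of test cases the trained network classifies correctly.
    `network` maps an input vector to a value in [0, 1]; outputs at or
    above `threshold` count as class 1, below it as class 0."""
    correct = 0
    for inputs, target in test_cases:
        predicted = 1 if network(inputs) >= threshold else 0
        if predicted == target:
            correct += 1
    return correct / len(test_cases)
```

An accuracy below expectations on routine or problematic cases is the signal to retrain, as the test plan above suggests.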
Implementation
- Frequently requires:
  - Interfaces with other CBIS
  - Embedding into parent software applications
  - User training
- Gain the confidence of users and management early
A Sample Neural Network Project
Bankruptcy Prediction – Sharda et al.
ANN Development Tools
- NeuroSolutions
- Statistica Neural Network Toolkit
- Braincel (Excel Add-in)
- NeuralWorks
- Brainmaker
- PathFinder
- Trajan Neural Network Simulator
- NeuroShell Easy
- SPSS Neural Connector
- Matlab Neural Network Toolkit
Benefits of ANN
- Pattern recognition, learning, classification, generalization
  and abstraction, and interpretation of incomplete and noisy inputs
- Character, speech, and visual recognition
- Can tackle highly complex / nonlinear problems
- Robust
- Fast
- Flexible and easy to maintain
- Powerful hybrid systems
Limitations of ANN
- Lack of explanation capability (a.k.a. black-box syndrome)
- Training time can be excessive and tedious for large and
  complex data sets
- Usually requires significantly large amounts of training and
  test data
- Also requires knowledge to set the proper parameters to
  generate a good model