Artificial Neural Network
Lecture Note: 6
Multi-class classification
Multi-class classification can be converted into multiple logistic
regressions using the one-vs-all (one-vs-rest) strategy.
What actions can the human brain (BNN) do?
§ Humans can do Classification
o Ex: when a child sees a pen, he can say that this pen belongs to the class of
pens
§ Humans can do Clustering (group similar patterns together)
o Ex: a child can say that some things are in the same group because they are
similar
§ Humans can do Mapping (pattern association)
o Associate a pattern with itself (storing information)
§ Ex: when studying, we save information by associating it with itself
Note:
A child must be trained before being asked to classify something or to
cluster similar patterns.
• There is a part of the human brain that allows humans to perform the
previous actions, called the Biological Neural Network (BNN).
• The main component of the human nervous system is the neuron cell.
• A neuron can be considered a small processor and memory in the human brain.
Human brain
• The brain is a highly complex, non-linear, parallel information-processing
system.
• It performs tasks like pattern recognition, perception, and motor control
many times faster than the fastest digital computers.
• It is characterized by being:
– Robust and fault tolerant
– Flexible – can adjust to a new environment by learning
– Able to deal with fuzzy, probabilistic, noisy, or inconsistent information
– Highly parallel
– Small, compact, and requiring little power
Human Brain VS Von Neumann Computer

                          Human brain              Von Neumann computer
# elements                10^10 – 10^12 neurons    10^7 – 10^8 transistors
# connections / element   10^4                     50
switching frequency       10^3 Hz                  10^9 Hz
energy / operation        10^-16 Joule             10^-6 Joule
power consumption         10 Watt                  100 – 500 Watt
reliability of elements   low                      reasonable
reliability of system     high                     reasonable
data representation       analog                   digital
memory localization       distributed              localized
control                   distributed              localized
processing                parallel                 sequential
skill acquisition         learning                 programming
Biological Neuron structure
• The brain consists of approximately 10^11 elements called neurons.
• Neurons communicate through a network of long fibers called axons.
• Each axon splits up into a series of smaller fibers, which communicate
with other neurons via junctions called synapses that connect to small
fibers called dendrites attached to the main body of the neuron (the soma).
• The basic computational unit is the neuron:
► Dendrites (inputs, 1 to 10^4 per neuron)
► Soma (cell body)
► Axon (output)
► Synapses
How neurons work
• A synapse acts like a one-way valve.
• An electrical signal is generated by the neuron, passes down the axon,
and is received by the synapses that join onto other neurons' dendrites.
• The electrical signal causes the release of transmitter chemicals, which
flow across a small gap in the synapse (the synaptic cleft).
• The chemicals can have an excitatory effect on the receiving neuron
(making it more likely to fire) or an inhibitory effect (making it less
likely to fire).
• The inhibitory and excitatory contributions to a particular neuron are
summed; if this value exceeds the neuron's threshold the neuron fires,
otherwise it does not.
Learning in networks of neurons
• Knowledge is represented in neural networks by the strength of the
synaptic connections between neurons (hence “connectionism”)
• Learning in neural networks is accomplished by adjusting the
synaptic strengths (weights)
• There are three primary categories of neural-network learning
algorithms:
1. Supervised — exemplar pairs of inputs and (known, labeled) target
outputs are used for training.
2. Reinforcement — a single good/bad training signal is used for training.
3. Unsupervised — no training signal; self-organization and clustering
are produced by the "training".
Artificial Neural network (ANN)
► An Artificial neural network is an information-processing system
that has certain performance characteristics in common with
biological neural networks.
► ANNs have been developed as generalizations of mathematical
models of human cognition or neural biology.
BNN VS ANN

Biological neural network (BNN)            Artificial neural network (ANN)
Soma                                       Neuron
Dendrite                                   Input
Axon                                       Output
Strength of connection between neurons     Weight value for the specific connection
Synapse                                    Weight
Learning the solution to a problem         Changing the connection weights
Examples                                   Training data

§ Figure of a neuron
How does an ANN work?
1) Information processing occurs at many simple elements called
neurons.
2) Signals are passed between neurons over connection links.
3) Each connection link has an associated weight, which
multiplies the signal transmitted.
4) The net input is calculated as the weighted sum of the input signals.
5) Each neuron applies an activation (transfer) function to its
net input (the sum of weighted input signals) to determine its output
signal.
6) Each neuron has a single threshold value.
7) An output signal is either discrete (e.g., 0 or 1) or a real-valued
number (e.g., between 0 and 1).
y = f(net input), where f is the activation function.
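Steps 4 and 5 can be sketched in a few lines of Python with a step activation; the weights, inputs, and threshold below are made-up illustrative values.

```python
# A minimal sketch of a single artificial neuron: weighted sum of the
# inputs (step 4), then a step activation function (step 5).
# All numeric values here are invented for illustration.

def step(net, threshold=0.0):
    """Binary step activation: fire (1) if the net input reaches the threshold."""
    return 1 if net >= threshold else 0

def neuron_output(inputs, weights, threshold=0.0):
    # net input = weighted sum of the input signals
    net = sum(x * w for x, w in zip(inputs, weights))
    # the activation function determines the output signal
    return step(net, threshold)

print(neuron_output([1, 0, 1], [0.5, -0.3, 0.8], threshold=1.0))  # net = 1.3 -> 1
```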
Adding Bias
• A linear neuron is a more flexible model if we include a bias.
• A bias unit can be thought of as a unit that always has an
output value of 1 and is connected to the hidden and output
layer units via modifiable weights.
• It sometimes helps convergence of the weights to an acceptable
solution.
• A bias is exactly equivalent to a weight on an extra input line
that always has an activity of 1.
y = f(net input), where f is the activation function.
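The equivalence claim can be checked directly with made-up values: adding a bias b gives the same net input as adding an extra input fixed at 1 whose weight is b.

```python
# Sketch: a bias is equivalent to a weight on an extra input line that
# always has an activity of 1. The inputs, weights, and bias are
# invented values for illustration.

def net_with_bias(inputs, weights, bias):
    return bias + sum(x * w for x, w in zip(inputs, weights))

def net_extra_input(inputs, weights, bias):
    # prepend a constant input of 1 whose weight is the bias
    return sum(x * w for x, w in zip([1] + inputs, [bias] + weights))

x, w, b = [0.2, 0.7], [1.5, -0.4], 0.6
print(abs(net_with_bias(x, w, b) - net_extra_input(x, w, b)) < 1e-12)  # True
```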
Example-1:
Calculate the net input.
net input = 3
Calculate the output using the activation function (step function).

Example-2
Example-3 – Sigmoidal function
net input = 3
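For this example the output can be checked with a short snippet; the binary (log) sigmoid is defined later in the note, and with net input 3 it gives roughly 0.95.

```python
# Sigmoid activation applied to the example's net input of 3.
import math

def sigmoid(net):
    """Binary (log) sigmoid: squashes the net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))

print(round(sigmoid(3), 4))  # 0.9526
```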
Characteristics of ANN
1. Architecture (structure):
the pattern of nodes and the connections between them
2. Training, or learning, algorithm:
the method of determining the weights on the connections
3. Activation function:
the function that produces an output based on the input values
received by a node
Characteristics of an Artificial Neural Network
• Architecture: the arrangement of neurons into layers and the connection
pattern between layers
1- Feed-forward NN
• Single layer
• Multi-layer
2- Feed-backward (recurrent) NN
3- Associative networks
• Training algorithm: setting the values of the weights
1. Supervised training
2. Unsupervised training
3. Reinforcement training
• Activation function
1. Identity function
2. Binary step function
3. Bipolar step function
4. Binary sigmoid
5. Bipolar sigmoid
Architecture
1- Feed-Forward NN
- The neurons are arranged in separate layers (input – hidden – output).
- There are no connections between neurons in the same layer.
- The neurons in one layer receive inputs from the previous layer.
- The neurons in one layer deliver their outputs to the next layer.
- Signals travel one way only, from input to output.
- No feedback.
- Associates inputs with outputs.
- The connections are unidirectional (hierarchical).
A- Single layer: has one layer of connection weights
► An input layer of source neurons connected to the neurons of the output
layer.
► Input neurons are fully connected to output units but are not
connected to other input units.
► Output neurons are not connected to other output units.
B- Multi-layer network:
► A net with one or more layers (or levels) of hidden neurons
between the input and output layers.
► There is a layer of weights between each two adjacent levels of
units (input, hidden, or output).
► Multilayer nets can solve more complicated problems than
single-layer nets can, but training may be more difficult.
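A forward pass through such a multilayer net can be sketched as follows (one hidden layer, sigmoid activations); all weights, biases, and inputs are made-up values for illustration.

```python
# Minimal sketch of a multilayer feed-forward pass: each layer computes
# weighted sums of the previous layer's outputs and applies an
# activation function. All numeric values are invented.
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def layer_forward(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs plus a bias."""
    return [sigmoid(b + sum(x * w for x, w in zip(inputs, ws)))
            for ws, b in zip(weights, biases)]

x = [0.5, -1.0]                                                  # input layer
hidden = layer_forward(x, [[0.8, 0.2], [-0.5, 0.9]], [0.1, 0.0]) # 2 hidden neurons
output = layer_forward(hidden, [[1.0, -1.0]], [0.2])             # 1 output neuron
print(len(hidden), len(output))  # 2 1
```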
2- Feed-back (Recurrent) NN:
► Some connections are present from a layer to the previous
layers.
► More biologically realistic.
► Signals travel in both directions by introducing loops.
► Powerful and complicated.
► Dynamic network.
► A recurrent network has at least one feedback loop.
3- Associative Network:
► There is no hierarchical arrangement
► The connections can be bidirectional
Training – Learning algorithm.
1- Supervised Learning
• The network is presented with inputs together with the target (teacher-signal) outputs.
• The neural network then tries to produce an output as close as possible to the target
signal by adjusting the values of its internal weights.
• The most common supervised learning approach is the "error-correction method", using
methods such as
– Least Mean Square (LMS)
– Back-propagation
2- Unsupervised Learning
• There is no teacher (target signal) from outside; the network adjusts its weights in
response to the input patterns only.
• Competitive learning: the neurons take part in a competition for each input.
• The winner of the competition, and sometimes some other neurons, are allowed to
change their weights.
• In simple competitive learning only the winner is allowed to learn (change its weights).
• In self-organizing maps, other neurons in the neighborhood of the winner may also
learn.
3- Reinforcement Training
• A generalization of supervised learning.
• Uses some random search strategy until the correct answer is found.
• The teacher scores the performance on the training examples.
• Based on actions.
• Uses the performance score to change the weights randomly.
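The "random search scored by a teacher" idea can be illustrated with a toy sketch. This is not a standard algorithm; the AND task, the search ranges, and the iteration cap are all made up for illustration: sample random weights, score each attempt, and keep the best until the correct answer is found.

```python
# Toy reinforcement-style random search: the "teacher" is score(),
# which grades a thresholded neuron on the AND function. Random
# weights are tried until a perfect score is found (with a cap).
import random

def score(weights, bias):
    """Performance score: fraction of AND cases the neuron gets right."""
    cases = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    correct = sum(
        (1 if bias + x1 * weights[0] + x2 * weights[1] >= 0 else 0) == target
        for (x1, x2), target in cases)
    return correct / len(cases)

random.seed(0)
best, best_wb = 0.0, None
for _ in range(10000):                      # random search with a cap
    w = [random.uniform(-2, 2), random.uniform(-2, 2)]
    b = random.uniform(-2, 2)
    if score(w, b) > best:
        best, best_wb = score(w, b), (w, b)
    if best == 1.0:                         # correct answer found
        break
print(best)
```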
             1- Supervised                           2- Unsupervised
Define:      Each training pattern is associated     Each training pattern is not associated
             with a target output vector             with a target output vector
Data:        (input, desired output)                 (input only)
Problems:    Classification, regression,             Clustering, data reduction
             pattern recognition
NN models:   Perceptron, Hebb                        Self-organizing maps (SOM), Hopfield
Activation function
1- Identity function – linear transfer function
• Performs no input squashing
• Not very interesting...
• output = input
2- Binary step function – threshold function – hard-limit
transfer function – unipolar
• Converts the net input, which is a continuously valued
variable, to a binary output (1 or 0).
• The binary step function is also known as the Heaviside function.
3- Bipolar step function
output = 1 if net ≥ θ
output = -1 if net < θ
4- Binary sigmoid – log sigmoid
• Squashes the neuron's pre-activation between 0 and 1
• Always positive
• Bounded
• Strictly increasing
5- Hyperbolic tangent ("tanh")
• Squashes the neuron's pre-activation between -1 and 1
• Can be positive or negative
• Bounded
• Strictly increasing
6- Rectified linear activation function (ReLU)
• Bounded below by 0 (always non-negative)
• Not bounded above
• Monotonically increasing (strictly increasing for positive inputs)
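The activation functions above can be sketched as follows (taking θ = 0 for the step forms; the built-in tanh stands in for the bipolar/tanh case):

```python
# Sketches of the activation functions listed in this section.
import math

def identity(net):        return net                        # 1- linear transfer
def binary_step(net):     return 1 if net >= 0 else 0       # 2- Heaviside, θ = 0
def bipolar_step(net):    return 1 if net >= 0 else -1      # 3- θ = 0
def binary_sigmoid(net):  return 1 / (1 + math.exp(-net))   # 4- output in (0, 1)
def tanh_act(net):        return math.tanh(net)             # 5- output in (-1, 1)
def relu(net):            return max(0.0, net)              # 6- output in [0, inf)

for f in (identity, binary_step, bipolar_step, binary_sigmoid, tanh_act, relu):
    print(f.__name__, f(-2), f(2))
```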
7- Radial basis activation function
Artificial Neural Network Development Process
Linearly Separable Function
§ The function is capable of assigning all inputs to two categories.
§ Used when the number of classes is 2.
§ Decision boundary: the line that partitions the plane into two
decision regions.
§ The decision boundary has the equation
b + Σ_{i=1}^{n} x_i w_i = 0
o Positive region: the decision region for output 1, with equation
b + x1 w1 + x2 w2 ≥ 0
o Negative region: the decision region for output -1, with
equation
b + x1 w1 + x2 w2 < 0
§ If two classes of patterns can be separated by a decision boundary,
they are said to be linearly separable.
§ If such a decision boundary does not exist, the two classes
are said to be linearly inseparable (non-linearly separable).
§ Linearly inseparable problems cannot be solved by the simple
network; a more sophisticated architecture is needed.
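As a sketch of the decision-boundary idea, take the made-up weights w1 = w2 = 1 and bias b = -1.5, so the boundary is the line x1 + x2 = 1.5 (which happens to separate the AND pattern):

```python
# Classify 2-D points using the decision boundary b + x1*w1 + x2*w2 = 0.
# Points in the positive region get +1, points in the negative region -1.
# The weights and bias are illustrative values.

def classify(x1, x2, w1=1.0, w2=1.0, b=-1.5):
    return 1 if b + x1 * w1 + x2 * w2 >= 0 else -1

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
print([classify(x1, x2) for x1, x2 in points])  # [-1, -1, -1, 1]
```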
Capacity of a single neuron
• Can do binary classification (two outputs):
• Also known as a logistic regression classifier
o if the output is greater than 0.5, predict class 1
o otherwise, predict class 0
• Can solve linearly separable problems
• Can't solve non-linearly separable problems
The First Artificial Neuron (McCulloch-Pitts network)
• The McCulloch-Pitts neuron is perhaps the earliest artificial neuron.
• The neuron has binary inputs (0 or 1), labelled xi where i =
1, 2, ..., n.
• The activation (output) of a McCulloch-Pitts neuron is binary.
• The neuron either fires (has an activation of 1) or does not fire
(has an activation of 0).
• Each neuron has a fixed threshold value T such that if the net
input to the neuron is greater than the threshold, the neuron fires.
• The activation function is the binary step function.
Architecture
• In general, a McCulloch-Pitts neuron Y may receive signals from any
number of other neurons.
• Each connection path is either excitatory, with weight w > 0, or
inhibitory, with weight w < 0.
• All excitatory connections into a particular neuron have the same
weight.
• The output of each neuron is as follows:
• Figure of a McCulloch-Pitts neuron
Algorithm
• The weights for a McCulloch-Pitts neuron are set by analysis, together
with the threshold for the neuron's activation function, rather than
learned by training.
• Analysis is used to determine the values of the weights and the
threshold.
• Logic functions will be used as simple examples for a number of
neural nets.
Example-1: AND Function
Example-2: OR Function
Example-3: AND NOT Function
Example-4: NAND Function
Example-5: XOR Function
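The gate examples can be sketched with fixed weights and thresholds. The particular values below are common textbook choices and may differ from those in the lecture's figures; XOR is included to show that it needs two layers of McCulloch-Pitts neurons, since it is not linearly separable.

```python
# McCulloch-Pitts neurons for the logic-gate examples: fixed (hand-set)
# weights and thresholds, binary step activation.

def mp_neuron(inputs, weights, threshold):
    """Fires (1) iff the net input reaches the fixed threshold T."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= threshold else 0

def AND(x1, x2):     return mp_neuron([x1, x2], [1, 1], threshold=2)
def OR(x1, x2):      return mp_neuron([x1, x2], [1, 1], threshold=1)
def AND_NOT(x1, x2): return mp_neuron([x1, x2], [2, -1], threshold=2)  # x1 AND NOT x2
def NAND(x1, x2):    return mp_neuron([x1, x2], [-1, -1], threshold=-1)

def XOR(x1, x2):
    # Needs two layers: XOR = (x1 AND NOT x2) OR (x2 AND NOT x1)
    return OR(AND_NOT(x1, x2), AND_NOT(x2, x1))

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2), NAND(x1, x2), XOR(x1, x2))
```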
Applications of Neural Networks
• Financial modelling – predicting stocks, currency exchange rates
• Other time-series prediction – climate, weather
• Computer games – intelligent agents, backgammon
• Control systems – autonomous adaptable robotics
• Pattern recognition – speech recognition, handwriting recognition
• Data analysis – data compression, data mining
• Noise reduction – ECG noise reduction
• Bioinformatics – DNA sequencing
Advantages of Neural Networks
• ANNs are powerful computation systems: they consist of many neurons.
• Generalization:
§ can learn from training data and generalize to new data.
§ use responses to prior input patterns to determine the
response to a novel input.
• Fault tolerance:
§ able to recognize a pattern that contains some noise.
§ still works when part of the net fails.
• Massive parallel processing:
§ can process more than one pattern at the same time using the same
set of weights.
• Distributed memory representation.
• Adaptability:
§ the network's recognition ability can be increased by more training.
• Low energy consumption.
• Useful for brain modelling.
• Used for pattern recognition.
• Able to learn any complex non-linear mapping.
• Learning instead of programming.
• Robust: can deal with incomplete and/or noisy data.
Disadvantages of Neural Networks
• Need training to operate.
• High processing time for training.
• May require specialized hardware and software.
• Lack of understanding of the behavior of the NN.
• Convergence is not guaranteed (reaching a solution is not
guaranteed).
• No mathematical proof for the learning process.
• Difficult to design.
• There are no clear design rules for arbitrary applications.
• The learning process can be very time consuming.
• Can over-fit the training data, becoming useless for
generalization.
Types of Problems Solved by NNs
• Classification: determine to which of a discrete number of classes
a given input case belongs.
• Regression: predict the value of a (usually) continuous variable
(e.g., weather).
• Time series: predict the value of variables from earlier
values of the same or other variables.
• Clustering (natural language processing, data mining).
• Control (automotive control, robotics).
• Function approximation (modelling): modelling of highly nonlinear
industrial processes, financial market prediction.
Who is concerned with NNs?
• Computer scientists want to find out about the properties of
non-symbolic information processing with neural nets and about learning
systems in general.
• Statisticians use neural nets as flexible, nonlinear regression and
classification models.
• Engineers of many kinds exploit the capabilities of neural networks
in many areas, such as signal processing and automatic control.
• Cognitive scientists view neural networks as a possible apparatus for
describing models of thinking and consciousness (high-level brain
function).
• Neuro-physiologists use neural networks to describe and explore
medium-level brain function (e.g., memory, sensory systems, motorics).
• Physicists use neural networks to model phenomena in statistical
mechanics and for many other tasks.
• Biologists use neural networks to interpret nucleotide sequences.
• Philosophers and some other people may also be interested in
neural networks for various reasons.