EC 9170
Deep Learning for Electrical &
Computer Engineers
Lecture 01:
Deep feedforward networks
Faculty of Engineering, University of Jaffna
•Is Artificial Intelligence, Machine learning,
and Deep Learning the same thing?
• Artificial Intelligence: the ability of machines to imitate human
intelligence
• Machine Learning: algorithms that incorporate intelligence into
machines by automatically learning from data
• Deep Learning: algorithms that mimic the human brain to
incorporate intelligence into machines
Artificial Intelligence
• AI is any technique, code or algorithm that enables machines to
develop, demonstrate and mimic human cognitive behaviour or
intelligence, hence the name "Artificial Intelligence."
• The broadest term, AI is the development of machines that can
mimic human intelligence and behavior. AI can enable machines to
understand, interact, and communicate with humans.
• Some of the most successful applications of AI around us can be seen
in Robotics, Computer Vision, Virtual Reality, Speech Recognition,
Automation, Gaming and so on.
Artificial Intelligence
Weak AI
General AI
• Apple Siri
• Google Assistant
• Alexa
Strong AI
Machine learning
• Machine learning is a sub-field of AI that gives machines the
ability to improve their performance over time without explicit
intervention or help from humans.
• ML is the use of algorithms to learn from data and discover patterns.
ML systems can automatically learn and improve.
• In this approach machines are shown thousands or millions of
examples and trained how to correctly solve a problem.
• Most of the current applications of machine learning leverage
supervised learning
• Other uses of ML can be broadly classified into unsupervised
learning and reinforcement learning.
Machine learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Deep learning
• Deep learning is a sub-field of ML that tries to mimic the way the
human brain works, using artificial neurons.
• DL is the use of complex neural networks to learn from large,
unstructured data sets.
• These techniques focus on building Artificial Neural Networks (ANN)
using several hidden layers.
Neural Network
• Artificial Neural Network (ANN)
• Convolutional Neural Network (CNN)
• Recurrent Neural Network (RNN)
Deep learning
• There is a variety of deep learning networks, such as the Multilayer
Perceptron (MLP), Autoencoders (AE), Convolutional Neural Network
(CNN), Recurrent Neural Network (RNN), Deep Feedforward
Network, etc.
Why is Deep learning growing?
• The processing power needed for deep learning is becoming readily
available through GPUs, distributed computing and powerful CPUs.
• Moreover, deep learning models tend to outperform traditional machine
learning models as the amount of data grows.
• Explosion of features and datasets
• Focus on customisation and real-time decisions
• Uncovers patterns that are hard to detect with traditional techniques
when the incidence rate is low
• Higher operational efficiency
Challenges with Deep learning
• Data Quality and Quantity: Obtaining high-quality labeled data can be expensive and time-consuming.
Additionally, the quality of the data can significantly impact the performance and robustness of
the models.
• Computational Resources: Training needs substantial computational resources, including powerful GPUs or even
specialized hardware like TPUs (Tensor Processing Units). This can be a barrier for smaller
organizations or researchers with limited access to such resources.
• Overfitting: Deep models are prone to overfitting, especially when trained on limited data or when the model capacity is too high
relative to the complexity of the problem. Techniques like dropout, regularization, and data
augmentation are commonly employed to mitigate this issue.
• Interpretability: Deep learning models are often considered "black boxes" due to their
complexity, making it challenging to understand how they arrive at a particular prediction. This
lack of interpretability can be problematic, especially in critical applications like healthcare or
finance, where understanding the reasoning behind a decision is crucial.
Build a Neural Network
• Neural Network: A computational model that works in a similar
way to the neurons in the human brain.
• Biological neurons are organized in a vast network of billions of neurons.
• Each neuron is typically connected to thousands of other neurons.
Build a Neural Network
• A biological neuron is composed of a cell body, many dendrites (branching
extensions), one axon (long extension), and synapses.
• Biological neurons receive signals from other neurons via these synapses.
When a neuron receives a sufficient number of signals within a few milliseconds, it
fires its own signals.
• Comparison between biological neuron and artificial neuron
Neural Network
• A neural network consists of a large number of highly interconnected
neurons.
• Each neuron takes an input, performs some operations then passes
the output to the following neuron.
Two-Layer Neural Network
Key Components
1. Layers
• Input layer: It contains artificial neurons which receive input data, which could be
raw data (e.g., pixel values of an image). The number of input layer neurons depends
on the number of features.
• Output layer: The final layer in the neural network. It contains artificial neurons that
are responsible for producing the model's predictions or outputs. The number of output
layer neurons depends on the number of outputs.
• Hidden layers: layers of neurons that perform computations and transformations on
the input data. They are called "hidden" because they are not directly observable as
inputs or outputs of the system. Instead, they serve as intermediate layers between
the input and output layers, capturing complex patterns and features in the data.
More neurons = More calculation = More time
Key Components Cont…
2. Neurons - The basic unit of a Neural Network. A neuron takes inputs from
other neurons and gives the corresponding output. In the earliest
artificial neuron models the inputs and output are binary (0 or 1); in
general they are real numbers.
3. Weights - Each connection between a pair of neurons carries a weight,
which sets the importance given to that input in computing the output.
Weights are typically chosen randomly in the first run and then optimized
using backpropagation.
Key Components Cont…
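4. Activation Function - Function applied to the weighted sum of the
inputs (the matrix multiplication of inputs and weights) plus the bias
to generate the neuron's output.
[Figure: worked example in which the activation output is F(x) = 0.67]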
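A minimal sketch of a single neuron, assuming the logistic (sigmoid) function as the activation. The function names and the input, weight and bias values are hypothetical and chosen only for illustration; they are not the values from the slide's figure.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical values, for illustration only.
x = [1.0, 2.0]
w = [0.5, -0.25]
b = 0.0
print(neuron_output(x, w, b))  # z = 0.5 - 0.5 + 0 = 0, and sigmoid(0) = 0.5
```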
Key Components Cont…
➢ Neural Network Notation
Key Components Cont…
5. Forward Propagation - Inputs are passed through the network using the
current weights (randomly initialized before the first pass) to make
predictions and compute the error. The output from each layer is fed
forward to the next layer.
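Forward propagation can be sketched as a loop over layers. This is an illustrative example, not the lecture's implementation: the layer sizes (2 inputs, 3 hidden neurons, 1 output), the weight values and the helper names dense and forward are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer; weights[j] holds the input weights of neuron j."""
    return [
        activation(sum(w * x for w, x in zip(w_row, inputs)) + b)
        for w_row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Feed the output of each layer forward as the input of the next one."""
    a = x
    for weights, biases, activation in layers:
        a = dense(a, weights, biases, activation)
    return a

# Hypothetical fixed weights; in practice they start out random.
layers = [
    ([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], [0.0, 0.0, 0.0], sigmoid),  # hidden layer: 2 -> 3
    ([[0.7, 0.8, 0.9]], [0.0], sigmoid),                               # output layer: 3 -> 1
]
y_hat = forward([1.0, 0.0], layers)
print(y_hat)  # a single prediction between 0 and 1
```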
Key Components Cont…
6. Loss Function - Computes the error between the actual and predicted
values and measures the model's performance. During training the weights
and biases are adjusted to minimize the loss function, and hyperparameters
are tuned to lower it further. Some common loss functions are Mean Squared
Error and Cross-Entropy (Log Loss).
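The two loss functions named above can be sketched in a few lines. The cross-entropy here assumes binary labels (0 or 1) and predicted probabilities; the eps clipping constant and the sample values are assumptions added to keep the example safe and concrete.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Log loss for binary labels; predictions are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

# Hypothetical targets and predicted probabilities.
y_true = [1, 0, 1]
y_pred = [0.9, 0.2, 0.6]
print(mean_squared_error(y_true, y_pred))
print(binary_cross_entropy(y_true, y_pred))
```

Both losses shrink toward zero as the predictions approach the targets, which is what training tries to achieve.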
A Simple Artificial Neural Network
• One or more binary inputs and one binary output
• Activates its output when more than a certain number of its inputs are
active.
➢ Linear Threshold Unit (LTU)
• Inputs of an LTU are numbers (not binary).
• Each input connection is associated with a weight.
• Computes a weighted sum of its inputs and applies a step function to
that sum.
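A minimal sketch of an LTU. The bias term (acting as the threshold), the convention that the step function outputs 1 when the sum is at least 0, and the AND-gate weights are assumptions for illustration.

```python
def step(z):
    """Step function: 1 if the weighted sum reaches the threshold, else 0."""
    return 1 if z >= 0 else 0

def ltu(inputs, weights, bias=0.0):
    """Weighted sum of the inputs (plus an assumed bias) followed by a step."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)

# Hypothetical weights that make the LTU behave like a logical AND.
and_weights, and_bias = [1.0, 1.0], -1.5
for a in (0, 1):
    for b in (0, 1):
        print(a, b, ltu([a, b], and_weights, and_bias))
```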
➢ Perceptron
• The perceptron is a single layer of LTUs.
• The input neurons output whatever input they are fed.
• A bias neuron, which just outputs 1 all the time.
• If we use a logistic function (sigmoid) instead of a step function, it
computes a continuous output.
➢ How is a Perceptron Trained?
• For an LTU to give an output it needs to know the values of the
weights w1, w2… wn.
• The Perceptron training algorithm is inspired by Hebb's rule.
• When a biological neuron often triggers another neuron, the
connection between these two neurons grows stronger.
• Feed one training instance x at a time to each neuron j and compute its
prediction ŷ (y hat).
• Update the connection weights.
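The training steps above can be sketched as follows. This assumes the standard perceptron learning rule, w_i ← w_i + η (y − ŷ) x_i, which the slide does not state explicitly. The learning rate eta, the epoch count, the zero initialization and the OR data set are illustrative choices.

```python
def step(z):
    return 1 if z >= 0 else 0

def predict(x, weights, bias):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_perceptron(samples, n_inputs, eta=0.1, epochs=20):
    """Perceptron learning rule: nudge the weights only when a prediction is wrong."""
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for x, y in samples:
            error = y - predict(x, weights, bias)   # -1, 0 or +1
            weights = [w + eta * error * xi for w, xi in zip(weights, x)]
            bias += eta * error
    return weights, bias

# Logical OR is linearly separable, so a single perceptron can learn it.
or_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_perceptron(or_data, n_inputs=2)
print([predict(x, weights, bias) for x, _ in or_data])
```

Connections whose inputs were active when the output was too low are strengthened, which is the link to Hebb's rule.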
➢ Perceptron in Keras
Multi-Layer Perceptron (MLP)
Perceptron Weakness
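A single perceptron is incapable of solving some trivial problems, e.g., the XOR classification problem. Why?
Its decision boundary is a straight line (a hyperplane), and the two XOR classes are not linearly separable.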
Multi-Layer Perceptron (MLP)
• Some of the limitations of Perceptrons can be eliminated by stacking multiple
Perceptrons.
• The resulting network is called a Multi-Layer Perceptron (MLP) or deep
feedforward neural network.
• A feedforward neural network is composed of:
• One input layer
• One or more hidden layers
• One final output layer
Every layer except the output layer includes
a bias neuron and is fully connected to the
next layer
➢ How Does it Work?
• The model is associated with a directed acyclic graph describing how
the functions are composed together.
• E.g., assume a network with just a single neuron in each layer.
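For the single-neuron-per-layer example, the network can be written as a chain of composed functions. The depth of three layers and the symbols g (activation), w_k and b_k (weight and bias of layer k) are assumptions for illustration.

```latex
\hat{y} = f(x) = f^{(3)}\bigl(f^{(2)}\bigl(f^{(1)}(x)\bigr)\bigr),
\qquad
f^{(k)}(a) = g\bigl(w_k\, a + b_k\bigr)
```

Here f^{(1)} is the first layer, f^{(2)} the second, and so on; the length of the chain is the depth of the model.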
➢ XOR with Feedforward Neural Network
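One well-known hand-picked solution uses a single hidden layer of two ReLU units followed by a linear output. The sketch below assumes that architecture and those specific weights; it is not necessarily the configuration shown on the slide.

```python
def relu(z):
    return max(0.0, z)

# Hand-picked parameters (an assumption, not learned from data).
W = [[1.0, 1.0], [1.0, 1.0]]   # W[i][j]: weight from input i to hidden unit j
c = [0.0, -1.0]                # hidden biases
w = [1.0, -2.0]                # output weights
b = 0.0                        # output bias

def xor_net(x1, x2):
    """Hidden layer h = relu(W^T x + c), then linear output y = w^T h + b."""
    x = [x1, x2]
    h = [relu(sum(x[i] * W[i][j] for i in range(2)) + c[j]) for j in range(2)]
    return sum(w[j] * h[j] for j in range(2)) + b

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), xor_net(x1, x2))  # outputs 0.0, 1.0, 1.0, 0.0
```

The hidden layer maps the inputs (0, 1) and (1, 0) to the same point, which makes the transformed problem linearly separable.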
➢ How to Learn Model Parameters W?
Feedforward Neural Network - Cost Function
We use the cross-entropy (minimizing the negative log-likelihood) between the
training data y and the model's predictions ŷ as the cost function.
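Written out, this cost is the average negative log-likelihood of the training labels under the model. The notation below is an assumption: m training examples (x^{(i)}, y^{(i)}), and in the second line binary labels with ŷ^{(i)} the predicted probability of class 1.

```latex
J(W) = -\frac{1}{m} \sum_{i=1}^{m} \log p_{\text{model}}\bigl(y^{(i)} \mid x^{(i)}; W\bigr)

J(W) = -\frac{1}{m} \sum_{i=1}^{m}
       \Bigl[\, y^{(i)} \log \hat{y}^{(i)} + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - \hat{y}^{(i)}\bigr) \Bigr]
```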
➢ Gradient-Based Learning
• What is the most significant difference between the linear models we have seen so
far and a feedforward neural network?
• The non-linearity of a neural network causes its cost function to become non-convex.
➢ Gradient-Based Learning Cont…
• For linear models with a convex cost function, optimization is guaranteed
to find the global minimum.
• Convex optimization converges starting from any initial parameters.
• Stochastic gradient descent applied to non-convex cost functions has no
such convergence guarantee.
• It is sensitive to the values of the initial parameters.
• For feedforward neural networks, it is important to initialize all weights to
small random values.
• The biases may be initialized to zero or to small positive values.
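The sensitivity to initial parameters can be seen on a toy one-dimensional problem. The cost function f(x) = x^4 − 3x^2 + x, the learning rate and the two starting points are assumptions chosen only to illustrate the point; plain gradient descent stands in for stochastic gradient descent.

```python
def f(x):
    """Toy non-convex cost with two separate minima."""
    return x**4 - 3 * x**2 + x

def grad(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=1000):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

left = gradient_descent(-2.0)   # settles near x = -1.30
right = gradient_descent(2.0)   # settles near x = 1.13
print(left, f(left))
print(right, f(right))          # higher cost: this run is stuck in a local minimum
```

Both runs use the same algorithm and learning rate; only the starting point differs, yet they end at different minima with different costs.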
Training Feedforward Neural Networks
Training Feedforward Neural Networks Cont…
Hidden Units
Feedforward Network in Keras
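A minimal sketch of a feedforward network in Keras, assuming TensorFlow's bundled Keras is installed. The layer sizes (2 inputs, 4 hidden ReLU units, 1 sigmoid output), the optimizer and the loss are illustrative assumptions, not the configuration from the lecture.

```python
from tensorflow import keras

# Input layer -> one hidden layer -> output layer, each fully connected.
model = keras.Sequential([
    keras.Input(shape=(2,)),                        # 2 input features (assumed)
    keras.layers.Dense(4, activation="relu"),       # hidden layer
    keras.layers.Dense(1, activation="sigmoid"),    # binary output
])

# Cross-entropy cost minimized with stochastic gradient descent.
model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

By default a Dense layer starts with small random weights and zero biases. Training would then be a call such as model.fit(X, y, epochs=...) on the data set.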
Thank you!