4/29/2021
Neural Networks
and deep learning
Kara Salar
Sahand Salah
G.A1
Supervised by: M. Hawkar Kamaran
Table of Contents
Abstract
Introduction
Neural Network Definition
The need for neural networks
Advantages of Neural Network
Limitations of Neural Network
What are Neurons?
Different types of Neural Networks in Deep Learning
    Artificial Neural Networks (ANN)
    Recurrent Neural Network (RNN)
    Convolution Neural Network (CNN)
Training Neural Networks
    Linear Regression
        Simple Linear Regression With scikit-learn
        Polynomial Regression With scikit-learn
Conclusion
Table of Figures
The preceding diagram displays the hyperplane
Artificial Neural Networks
Perceptron
Convolution Neural Network
MSE diagram
Abstract
Deep learning is based on neural networks, which are a class of machine learning methods that are
being used in a wide range of fields including industry, health, technology, and science.
This report examines some of the most important characteristics of deep neural networks, as well
as aspects of their nature and architecture. We provide an overview of some of the various types
of networks and how they can be used.
Introduction
Neural networks have been around since the 1940s and, as a result, they have quite a bit of history.
A neural network is a mathematical model for information processing. A neural net is not a fixed
program, but rather a model, a system that processes information, or inputs.

The characteristics of a neural network are as follows: Information processing occurs in its
simplest form, over simple elements called neurons. Neurons are connected and they exchange
signals between them through connection links. Connection links between neurons can be stronger
or weaker, and this determines how information is processed. Each neuron has an internal state
that is determined by all the incoming connections from other neurons. Each neuron has a different
activation function that is calculated on its state and determines its output signal.

A more general description of a neural network would be as a computational graph of mathematical
operations, but we will learn more about that later. We can identify two main characteristics of
a neural net: The neural net architecture: this describes the set of connections between the
neurons (namely, feedforward, recurrent, multi- or single-layered, and so on), the number of
layers, and the number of neurons in each layer. The learning: this describes what is commonly
defined as the training. The most common, but not exclusive, way to train a neural network is
with gradient descent and backpropagation.

A standard neural network (NN) consists of many simple, connected processors called neurons,
each producing a sequence of real-valued activations. Input neurons get activated through sensors
perceiving the environment; other neurons get activated through weighted connections from
previously active neurons. Some neurons may influence the environment by triggering actions.
Learning, or credit assignment, is about finding weights that make the NN exhibit desired
behavior, such as driving a car.

Depending on the problem and how the neurons are connected, such behavior may require long causal
chains of computational stages, where each stage transforms (often in a non-linear way) the
aggregate activation of the network. Deep learning is about accurately assigning credit across
many such stages.
Neural Network Definition
Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed
to recognize patterns.

They interpret sensory data through a kind of machine perception, labeling or clustering raw
input. The patterns they recognize are numerical, contained in vectors, into which all real-world
data, be it images, sound, text or time series, must be translated.

Neural networks help us cluster and classify. You can think of them as a clustering and
classification layer on top of the data you store and manage. They help to group unlabeled data
according to similarities among the example inputs, and they classify data when they have a
labeled dataset to train on.

(Neural networks can also extract features that are fed to other algorithms for clustering and
classification, so you can think of deep neural networks as components of larger machine-learning
applications involving algorithms for reinforcement learning, classification and regression.)
The need for neural networks
Neural networks have been around for many years, and they have gone through several periods
during which they have fallen in and out of favor. Recently, however, they have steadily gained
ground over many other competing machine-learning algorithms.

This resurgence is due to faster computers, the use of graphical processing units (GPUs) versus
the more conventional use of central processing units (CPUs), better algorithms and neural net
designs, and increasingly larger datasets.

To get an idea of their success, consider the ImageNet Large Scale Visual Recognition Challenge,
where one of the tasks is to classify unknown images into 1,000 categories. In 2011, the winner
achieved a top-five accuracy of 74.2%. In 2012, Alex Krizhevsky and his team entered the
competition with a convolutional network (a special type of deep network). That year, they won
with a top-five accuracy of 84.7%. Since then, the winners have always been convolutional
networks, and the current top-five accuracy is 97.7%. Deep learning algorithms have excelled in
other areas as well; for example, both Google Now and Apple's Siri assistants rely on deep
networks for speech recognition, and Google uses deep learning for its translation engines.

Here is why this matters: to begin with, knowing the theory of neural networks will help you
understand the rest of this report, since a large majority of neural networks in use today share
common principles. Understanding simple networks means that you will understand deep networks
too. In addition, having some fundamental knowledge is always good; it will help you a lot when
you face new material.
Advantages of Neural Network

• Storing information on the entire network: Unlike conventional programming, information is
stored on the whole network and not in a database. If some pieces of information vanish from
one place, it does not stop the whole network from functioning.

• The ability to work with incomplete knowledge: After an ANN is trained, the output produced
by the data can be incomplete or partial. The importance of that missing information determines
the loss of performance.

• Good fault tolerance: The output is not affected by the corruption of one or more cells of
the artificial neural network. This makes the networks better at tolerating faults.

• The ability to learn by example: For an artificial neural network to become able to learn,
it is necessary to outline the examples and to teach it according to the desired output by
showing those examples to the network. The progress of the network is directly proportional to
the instances that are selected.

• Slow corruption: A network does experience relative degradation and slows down over time,
but it does not erode immediately.

• The capacity to train a machine: ANNs learn from events and make decisions by commenting on
similar events.

• The capacity for parallel processing: These networks have numerical strength, which makes
them capable of performing more than one task at a time.
Limitations of Neural Network
1. Black box

Arguably, the best-known drawback of neural networks is their "black box" nature. Simply put,
you don't know how or why your NN came up with a certain output. For example, when you put an
image of a cat into a neural network and it predicts it to be a car, it is very difficult to
understand what caused it to arrive at this prediction. When you have features that are
human-interpretable, it is much easier to understand the cause of the mistake. By comparison,
algorithms like decision trees are very interpretable. This is important because in some domains
interpretability is critical. This is why a lot of banks don't use neural networks to predict
whether a person is creditworthy: they need to explain to their customers why they didn't get
the loan, otherwise the person may feel unfairly treated. The same holds true for sites like
Quora. If a machine learning algorithm decided to delete a user's account, the user would be
owed an explanation as to why. I doubt they would be satisfied with "that's what the computer
said." Other scenarios would be important business decisions. Can you imagine the CEO of a big
company making a decision about millions of dollars without understanding why it should be done,
just because the "computer" says so?
2. Duration of development

Although there are libraries like Keras that make the development of neural networks fairly
simple, sometimes you need more control over the details of the algorithm, for example when you
are trying to solve a difficult machine-learning problem that no one has tackled before. In that
case, you might use TensorFlow, which gives you more options, but it is also more complicated,
and development takes much longer (depending on what you want to build). A practical question
then arises for any company: is it really worth it for expensive engineers to spend weeks
developing something that could be solved much faster with a simpler algorithm?
3. Amount of data

Neural networks usually require much more data than traditional machine-learning algorithms, as
in at least thousands if not millions of labeled samples. This is not an easy problem to deal
with, and many machine-learning problems can be solved well with less data if you use other
algorithms. Although there are some cases where neural networks do well with little data, most
of the time they don't. In that case, a simple algorithm like naive Bayes, which deals much
better with little data, would be the appropriate choice.
4. Computationally expensive

Usually, neural networks are also more computationally expensive than traditional algorithms.
State-of-the-art deep learning algorithms, which achieve successful training of really deep
neural networks, can take several weeks to train completely from scratch. By contrast, most
traditional machine-learning algorithms take much less time to train, ranging from a few minutes
to a few hours or days. The amount of computational power needed for a neural network depends
heavily on the size of your data, but also on the depth and complexity of your network. For
example, a neural network with one layer and 50 neurons will be much faster than a random forest
with 1,000 trees, whereas a neural network with 50 layers will be much slower than a random
forest with only 10 trees.
Page 7 of 22
What are Neurons?

A neuron is a mathematical function that takes one or more input values and outputs a single
numerical value. In the diagram, we can see the different elements of the neuron. The neuron is
defined as follows:

1. First, we compute the weighted sum of the inputs xi and the weights ωi (also known as the
activation value): a = ω1x1 + ω2x2 + ... + ωnxn + b. Here, xi is either a numerical value that
represents the input data, or the output of another neuron (that is, if the neuron is part of a
neural network). The weights ωi are numerical values that represent either the strength of the
inputs or, alternatively, the strength of the connections between the neurons. The weight b is
a special value called the bias, whose input is always 1.

2. Then, we use the result of the weighted sum as input to the activation function f, which is
also known as the transfer function: y = f(a). There are many types of activation functions,
but they all have to satisfy the requirement of being non-linear, which we will explain later
in this report. You may have noticed that the neuron is very similar to logistic regression and
the perceptron. You can think of it as a generalized version of these two algorithms: if we use
the logistic function or the step function as the activation function, the neuron turns into
logistic regression or a perceptron respectively. Furthermore, if we do not use any activation
function, the neuron turns into linear regression. We are not limited to these cases, however,
and, as you will see later, they are rarely used in practice.

The activation value defined previously can be interpreted as the dot product between the vector
w and the vector x: a = w·x + b. The vector x is perpendicular to the weight vector w if
w·x = 0. Therefore, all vectors x such that w·x + b = 0 define a hyperplane in the feature space
Rⁿ, where n is the dimension of x. That sounds complicated! To understand it better, let's
consider a special case where the activation function is f(x) = x and we only have a single
input value, x. The output of the neuron then becomes y = ωx + b, which is the linear equation.
This shows that in a one-dimensional input space, the neuron defines a line. If we visualize the
same for two or more inputs, we will see that the neuron defines a plane, or a hyperplane, for
an arbitrary number of input dimensions. We can also see that the role of the bias, b, is to
allow the hyperplane to shift away from the origin of the coordinate system.
The preceding diagram displays the hyperplane
If we do not use a bias, the neuron will have limited representation power: the perceptron (and
hence the neuron) only works with linearly separable classes, and now we know why, since it
defines a hyperplane. To overcome this limitation, we need to organize the neurons in a neural
network.
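To make the neuron concrete, here is a minimal Python sketch of the computation just described
(the function and variable names are illustrative, not from any library, and the input values
are made up):

import numpy as np

def neuron(x, w, b, f):
    # weighted sum of the inputs plus the bias, passed through
    # the activation function f
    return f(np.dot(w, x) + b)

# identity activation: the neuron reduces to linear regression
identity = lambda a: a
# logistic (sigmoid) activation: the neuron becomes logistic regression
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

x = np.array([0.5, -1.0, 2.0])  # inputs
w = np.array([0.8, 0.2, -0.5])  # weights
b = 0.1                         # bias

print(neuron(x, w, b, identity))  # w·x + b, a point on a line/hyperplane
print(neuron(x, w, b, sigmoid))   # the same sum squashed into (0, 1)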
Different types of Neural Networks in Deep Learning
This article focuses on three important types of neural networks that form the basis for most
pre-trained models in deep learning:

• Artificial Neural Networks (ANN)
• Recurrent Neural Networks (RNN)
• Convolution Neural Networks (CNN)
Artificial Neural Networks (ANN)
A single perceptron (or neuron) can be envisioned as a logistic regression. An Artificial Neural
Network, or ANN, is a group of multiple perceptrons/neurons at each layer. An ANN is also known
as a feed-forward neural network because inputs are processed only in the forward direction:
Artificial Neural Networks
As you can see here, an ANN consists of 3 layers: input, hidden and output. The input layer
accepts the inputs, the hidden layer processes the inputs, and the output layer produces the
result. Essentially, each layer tries to learn certain weights.

ANNs can be used to solve problems related to:

• Tabular data: data that is structured into rows, each of which contains information about a
particular thing.
• Image data: photographic or traced objects representing the underlying pixel data of an area
of an image, created, collected and stored using image capture devices.
• Text data: the subject of text mining, the process of deriving high-quality information from
text; high-quality information is typically obtained by devising patterns and trends by means
such as statistical pattern learning.
Advantages of Artificial Neural Network (ANN)
An Artificial Neural Network is capable of learning any nonlinear function. Hence, these networks
are popularly known as Universal Function Approximators. ANNs have the capacity to learn weights
that map any input to the output.

One of the main reasons behind universal approximation is the activation function. Activation
functions introduce nonlinear properties to the network. This helps the network learn any complex
relationship between input and output.
Perceptron

As you can see here, the output at each neuron is the activation of a weighted sum of inputs.
But wait: what happens if there is no activation function? The network can only learn linear
functions and can never learn complex relationships.
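We can check this with a small numpy sketch (the layer sizes and random weights below are
arbitrary, chosen only for illustration): two layers with no activation between them collapse
into a single linear map, so stacking them adds no expressive power.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))  # first-layer weights (arbitrary sizes)
W2 = rng.normal(size=(2, 4))  # second-layer weights
x = rng.normal(size=3)        # an arbitrary input vector

# Without an activation, two layers equal one layer with weights W2 @ W1:
print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))  # True

# A non-linearity between the layers breaks this collapse:
relu = lambda a: np.maximum(a, 0.0)
y = W2 @ relu(W1 @ x)  # no single weight matrix reproduces this in general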
Recurrent Neural Network (RNN):
A Recurrent Neural Network is a generalization of the feedforward neural network that has an
internal memory. An RNN is recurrent in nature because it performs the same function for every
input of data, while the output for the current input depends on the previous computation. After
producing the output, it is copied and sent back into the recurrent network. To make a decision,
it considers the current input and the output that it has learned from the previous input.
As you can see here, an RNN has a recurrent connection on the hidden states. This looping
constraint ensures that sequential information is captured in the input data.
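A minimal sketch of that loop (the weight shapes and the tanh activation here are common
conventions, not mandated by the text): the same weights are applied at every time step, and the
hidden state h carries information from earlier inputs forward.

import numpy as np

def rnn_forward(xs, Wxh, Whh, bh):
    # h is the internal memory, updated once per time step
    h = np.zeros(Whh.shape[0])
    for x in xs:
        # the same function is applied at every step; the new state
        # depends on the current input and the previous state
        h = np.tanh(Wxh @ x + Whh @ h + bh)
    return h

rng = np.random.default_rng(1)
Wxh = rng.normal(scale=0.1, size=(5, 3))  # input-to-hidden weights
Whh = rng.normal(scale=0.1, size=(5, 5))  # hidden-to-hidden loop weights
bh = np.zeros(5)

sequence = [rng.normal(size=3) for _ in range(4)]  # four time steps
print(rnn_forward(sequence, Wxh, Whh, bh))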
We can use recurrent neural networks to solve problems related to:

• Time series data: also referred to as time-stamped data, a sequence of data points indexed in
time order, collected at different points in time.
• Text data: deriving high-quality information from text, typically by devising patterns and
trends by means such as statistical pattern learning.
• Audio data: you are always in contact with audio; your brain is continuously processing and
understanding audio data and giving you information about the environment.
Advantages of Recurrent Neural Network

An RNN can model a sequence of data so that each sample can be assumed to be dependent on the
previous ones.

Recurrent neural networks are even used with convolutional layers to extend the effective pixel
neighbourhood.

Disadvantages of Recurrent Neural Network

Gradient vanishing and exploding problems.

Training an RNN is a very difficult task.

It cannot process very long sequences when using tanh or ReLU as an activation function.
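The first of these disadvantages can be seen with a quick numerical sketch (the values below are
purely illustrative): backpropagating through many tanh steps multiplies the gradient by the
local tanh derivative and a recurrent weight at every step, and since that product is typically
below 1, the gradient shrinks exponentially with sequence length.

import numpy as np

h = 0.8      # a typical pre-activation value (illustrative)
w = 0.9      # a recurrent weight below 1 (illustrative)
grad = 1.0
for step in range(50):
    # one backprop-through-time step: multiply by the local derivative
    grad *= w * (1 - np.tanh(h) ** 2)   # tanh'(h) = 1 - tanh(h)^2 <= 1
print(grad)  # vanishingly small after 50 steps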
Convolution Neural Network (CNN)
Convolutional neural networks (CNN) are all the rage in the deep learning community right now.
These CNN models are being used across different applications and domains, and they're especially
prevalent in image and video processing projects. The building blocks of CNNs are filters, a.k.a.
kernels. Kernels are used to extract the relevant features from the input using the convolution
operation. Let's try to grasp the importance of filters using images as input data.
Convolution Neural Network
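As a sketch of what a kernel does (this 3×3 kernel is a standard vertical-edge detector, used
here purely as an example, and the 4×4 image is made up), we can slide the kernel over a small
grayscale image and compute the convolution directly:

import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # element-wise product of the kernel with one image patch
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# a tiny image with a vertical edge between dark (0) and bright (1)
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

# a vertical edge-detecting kernel
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

print(convolve2d(image, kernel))  # strong responses where the edge is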
Training Neural Networks
The general concept we have to understand is the following: every neural network is an
approximation of a function, so each neural network will not be equal to the desired function,
but instead will differ by some value called the error. During training, the aim is to minimize
this error.

Since the error is a function of the weights of the network, we want to minimize the error with
respect to the weights. The error function is a function of many weights and, therefore, a
function of many variables.

Mathematically, the set of points where this function is zero represents a hypersurface, and to
find a minimum on this surface, we need to pick a point and then follow a curve in the direction
of the minimum.

We should note that a neural network and its training are two separate things. This means we
could adjust the weights of the network in some way other than gradient descent and
backpropagation, but this is the most popular and efficient way to do so and is, apparently, the
only way that is currently used in practice.
Linear Regression
Linear regression is a special case of a neural network; that is, it is a single neuron with the
identity activation function. In this section, we will learn how to train linear regression with
gradient descent and, in the following sections, we will extend it to training more complex
models. You can see how gradient descent works in the following code block.
Initialize the weights w with some random values
repeat:
    # compute the mean squared error (MSE) loss function
    # for all samples of the training set (we denote MSE with J)
    # update the weights w based on the derivative of J
    # with respect to each weight
until MSE falls below a threshold
At first, this might look intimidating, but fear not! Behind the scenes, it is very basic and
straightforward mathematics. But let's not lose sight of our goal, which is to adjust the
weights, ω, in a way that will help the algorithm to predict the target values. To do this,
first we need to know how the output yⁱ differs from the target value tⁱ for each sample of the
training dataset (we use superscript notation to mark the i-th sample). We will use the mean
squared error loss function (MSE), which is equal to the mean value of the squared differences
yⁱ - tⁱ over all samples (the total number of samples in the training set is n):

J = (1/n) Σᵢ (yⁱ - tⁱ)²

We will denote MSE with J for ease of use and to underscore that we could use other loss
functions as well. Each yⁱ is a function of ω, and therefore, J is also a function of ω.

As we mentioned previously, the loss function J represents a hypersurface of dimension equal to
the dimension of ω (we are implicitly also considering the bias). To illustrate this, imagine
that we have only one input value, x, and a single weight, ω. We can see how the MSE changes
with respect to ω in the following chart:
MSE diagram
Our goal is to minimize J, which means finding the ω where the value of J is at its global
minimum. To do this, we need to know whether J increases or decreases when we modify ω, or, in
other words, the first derivative (the gradient) of J with respect to ω.
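Here is a minimal runnable version of the earlier pseudocode for the single-weight case (the toy
data, learning rate and iteration count are made up for illustration): each iteration computes
the MSE and its derivatives, then steps the weights against the gradient.

import numpy as np

# toy data generated from t = 2x + 1 plus a little noise
rng = np.random.default_rng(42)
x = np.linspace(0, 1, 50)
t = 2 * x + 1 + rng.normal(scale=0.05, size=x.shape)

w, b = 0.0, 0.0   # initialize the weights with some values
lr = 0.5          # learning rate (step size along the gradient)
for _ in range(500):
    y = w * x + b                # the model's predictions
    error = y - t
    J = np.mean(error ** 2)      # MSE loss
    dw = 2 * np.mean(error * x)  # dJ/dw
    db = 2 * np.mean(error)      # dJ/db
    w -= lr * dw                 # move against the gradient
    b -= lr * db

print(w, b, J)  # w and b should approach 2 and 1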
Simple Linear Regression With scikit-learn

There are five basic steps when you're implementing linear regression, shown together in the
sketch after this list:

• Import the packages and classes you need.
• Provide data to work with, and eventually do appropriate transformations.
• Create a regression model and fit it with existing data.
• Check the results of model fitting to know whether the model is satisfactory.
• Apply the model for predictions.
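Putting those five steps together gives a sketch like the following (the numbers are made up for
illustration):

import numpy as np
from sklearn.linear_model import LinearRegression

# Step 2: provide data; the input must be a two-dimensional array
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])

# Step 3: create a regression model and fit it with existing data
model = LinearRegression().fit(x, y)

# Step 4: check the results of model fitting
print('coefficient of determination:', model.score(x, y))
print('intercept:', model.intercept_)
print('slope:', model.coef_)

# Step 5: apply the model for predictions
print('predicted response:', model.predict(x))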
Polynomial Regression With scikit-learn
Implementing polynomial regression with scikit-learn is very similar to linear regression. There
is only one extra step: you need to transform the array of inputs to include non-linear terms
such as 𝑥².
Step 1: Import packages and classes:
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
Step 2a: Provide data:
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])
Now you have the input and output in a suitable format. Keep in mind that you need the input to
be a two-dimensional array; that is why .reshape() is used.
Step 2b: Transform input data
This is the new step you need to implement for polynomial regression! As you've seen earlier,
you need to include 𝑥² (and perhaps other terms) as additional features when implementing
polynomial regression. For that reason, you should transform the input array x to contain the
additional column(s) with the values of 𝑥² (and eventually more features). It's possible to
transform the input array in several ways (such as using insert() from numpy), but the class
PolynomialFeatures is very convenient for this purpose. Let's create an instance of this class:
transformer = PolynomialFeatures(degree=2, include_bias=False)
You can provide several optional parameters to PolynomialFeatures:

• degree is an integer (2 by default) that represents the degree of the polynomial regression
function.
• interaction_only is a Boolean (False by default) that decides whether to include only
interaction features (True) or all features (False).
• include_bias is a Boolean (True by default) that decides whether to include the bias
(intercept) column of ones (True) or not (False).
This example uses the default values of all parameters, but you’ll sometimes want to experiment
with the degree of the function, and it can be beneficial to provide this argument anyway.
Before applying transformer, you need to fit it with .fit():
transformer.fit(x)
Once transformer is fitted, it's ready to create a new, modified input array. You apply
.transform() to do that:
x_ = transformer.transform(x)
You can also use .fit_transform() to replace the three previous statements with only one:
x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
That's fitting and transforming the input array in one statement with .fit_transform(). It takes
the input array and effectively does the same thing as .fit() and .transform() called in that
order, and it also returns the modified array. This is how the new input array looks:
>>> print(x_)
[[   5.   25.]
 [  15.  225.]
 [  25.  625.]
 [  35. 1225.]
 [  45. 2025.]
 [  55. 3025.]]
Step 3: Create a model and fit it
model = LinearRegression().fit(x_, y)
Step 4: Get results
>>> r_sq = model.score(x_, y)
>>> print('coefficient of determination:', r_sq)
coefficient of determination: 0.8908516262498564
>>> print('intercept:', model.intercept_)
intercept: 21.372321428571425
>>> print('coefficients:', model.coef_)
coefficients: [-1.32357143 0.02839286]
You can obtain a very similar result with different transformation and regression arguments:

x_ = PolynomialFeatures(degree=2, include_bias=True).fit_transform(x)

With include_bias=True, the leftmost column of x_ is a column of ones that already models the
intercept, so the regression should be created with fit_intercept=False and refitted, that is,
model = LinearRegression(fit_intercept=False).fit(x_, y). The variable model again corresponds
to the new input array x_; therefore x_ should be passed as the first argument instead of x.
This approach yields the following results, which are similar to the previous case:
>>> r_sq = model.score(x_, y)
>>> print('coefficient of determination:', r_sq)
coefficient of determination: 0.8908516262498565
>>> print('intercept:', model.intercept_)
intercept: 0.0
>>> print('coefficients:', model.coef_)
coefficients: [21.37232143 -1.32357143 0.02839286]
Step 5: Predict response
If you want to get the predicted response, just use .predict(), but keep in mind that the
argument should be the modified input x_ instead of the old x:
>>> y_pred = model.predict(x_)
>>> print('predicted response:', y_pred, sep='\n')
predicted response:
[15.46428571 7.90714286 6.02857143 9.82857143 19.30714286 34.46428571]
As you can see, prediction works almost the same way as in the case of linear regression. It
just requires the modified input instead of the original.
Conclusion
Most of the AI we know today works on a principle of deep learning: a machine is given a set of
data and a desired output, and from that it produces its own algorithm to solve it. The system
then repeats, propagating itself.

This is called a neural network. It is necessary to use this method to create AI, as a computer
can code faster than a human; it would take lifetimes to code it manually.

Tommi Jaakkola, professor of Electrical Engineering and Computer Science at MIT, says, "If you
had a very small neural network, you might be able to understand it. But once it becomes very
large, and it has thousands of units per layer and maybe hundreds of layers, then it becomes
quite un-understandable."

We are at the stage of these huge systems now. So, in order to make these machines explain
themselves, a problem that will have to be solved before we can put any trust in them, what
methods are we using?