Neural Networks and Deep Learning

4/29/2021
Kara Salar, Sahand Salah
G.A1
Supervised by: M. Hawkar Kamaran

Table of Contents
Abstract
Introduction
Neural Network Definition
The need for neural networks
Advantages of Neural Network
Limitations of Neural Network
What Is a Neuron?
Different types of Neural Networks in Deep Learning
    Artificial Neural Networks (ANN)
    Recurrent Neural Network (RNN)
    Convolution Neural Network (CNN)
Training Neural Networks
    Linear Regression
        Simple Linear Regression With scikit-learn
        Polynomial Regression With scikit-learn
Conclusion

Table of Figures
The preceding diagram displays the hyperplane
Artificial Neural Networks
Perceptron
Convolution Neural Network
MSE diagram

Abstract
Deep learning is based on neural networks, a class of machine learning methods used in a wide range of fields, including industry, health, technology, and science.
This chapter examines some of the most important characteristics of deep neural networks, as well as aspects of their nature and architecture. We provide an overview of some of the various types of networks and how they can be used.

Introduction
Neural networks have been around since the 1940s, so they have quite a bit of history. A neural network is a mathematical model for information processing. A neural net is not a fixed program, but rather a model, a system that processes information, or inputs. The characteristics of a neural network are as follows:
Information processing occurs, in its simplest form, over simple elements called neurons.
Neurons are connected and exchange signals between them through connection links.
Connection links between neurons can be stronger or weaker, and this determines how information is processed.
Each neuron has an internal state that is determined by all the incoming connections from other neurons.
Each neuron has an activation function that is calculated on its state and determines its output signal.
A more general description of a neural network would be as a computational graph of mathematical operations, but we will learn more about that later. We can identify two main characteristics of a neural net:
The neural net architecture: This describes the set of connections between the neurons (feedforward, recurrent, multi- or single-layered, and so on), the number of layers, and the number of neurons in each layer.
The learning: This describes what is commonly defined as the training. The most common, but not exclusive, way to train a neural network is with gradient descent and backpropagation.
A standard neural network (NN) consists of many simple, connected processors called neurons, each producing a sequence of real-valued activations. Input neurons are activated through sensors perceiving the environment; other neurons are activated through weighted connections from previously active neurons. Some neurons may influence the environment by triggering actions. Learning, or credit assignment, is about finding weights that make the NN exhibit desired behavior, such as driving a car. Depending on the problem and how the neurons are connected, such behavior may require long causal chains of computational stages, where each stage transforms (often in a non-linear way) the aggregate activation of the network. Deep learning is about accurately assigning credit across many such stages.

Neural Network Definition
Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. They interpret sensory data through a kind of machine perception, labeling, or clustering of raw input. The patterns they recognize are numerical, contained in vectors, into which all real-world data, be it images, sound, text, or time series, must be translated. Neural networks help us cluster and classify. You can think of them as a clustering and classification layer on top of the data you store and manage. They help group unlabeled data according to similarities among the example inputs, and they classify data when they have a labeled dataset to train on. (Neural networks can also extract features that are fed to other algorithms for clustering and classification, so you can think of deep neural networks as components of larger machine-learning applications involving algorithms for reinforcement learning, classification, and regression.)
The need for neural networks
Neural networks have been around for many years, and they have gone through several periods during which they have fallen in and out of favor. Recently, however, they have steadily gained ground over many other competing machine-learning algorithms. This resurgence is due to fast computers, the use of graphical processing units (GPUs) versus the more traditional use of central processing units (CPUs), better algorithms and neural-net designs, and increasingly larger datasets. To get an idea of their success, consider the ImageNet Large Scale Visual Recognition Challenge, where one of the tasks is to classify unknown images into categories. In 2011, the winner achieved a top-five accuracy of 74.2%. In 2012, Alex Krizhevsky and his team entered the competition with a convolutional network (a special type of deep network). That year, they won with a top-five accuracy of 84.7%. Since then, the winners have always been convolutional networks, and the current top-five accuracy is 97.7%. Deep learning algorithms have excelled in other areas as well; for example, both Google Now and Apple's Siri assistants rely on deep networks for speech recognition, and Google uses deep learning for its translation engines. Knowing the theory of neural networks will help you understand the rest of this report, since the large majority of neural networks in use today share common principles: understanding simple networks means you will understand deep networks too. In addition, having some fundamental knowledge is always good; it will help you a lot when you face new material.

Advantages of Neural Network
Storing information on the entire network: Unlike conventional programming, where information is stored in a database, information here is stored on the entire network. The disappearance of a few pieces of information from one place does not stop the whole network from functioning.
The ability to work with incomplete knowledge: After training, an ANN can produce output even when the input data is incomplete. The importance of the missing information determines the loss in performance.
Good fault tolerance: Output generation is not affected by the corruption of one or more cells of the artificial neural network. This makes the networks better at tolerating faults. For an artificial neural network to become able to learn, it is necessary to outline the examples and to teach the network according to the desired output by showing it those examples. The progress of the network is directly proportional to the instances that are selected.
Gradual corruption: A network degrades gradually and slows over time; it does not erode immediately.
Ability to train the machine: ANNs learn from events and make decisions by commenting on similar events.
Parallel processing capability: These networks have numerical strength, which makes them capable of performing more than one job at a time.

Limitations of Neural Network
1. Black box
Arguably, the best-known drawback of neural networks is their "black box" nature. Simply put, you don't know how or why your NN came up with a certain output.
For example, when you put an image of a cat into a neural network and it predicts it to be a car, it is very hard to understand what caused it to arrive at this prediction. When you have features that are human-interpretable, it is much easier to understand the cause of the mistake. By comparison, algorithms like decision trees are very interpretable. This matters because in some domains interpretability is critical. That is why a lot of banks don't use neural networks to predict whether a person is creditworthy: they need to explain to their customers why they didn't get the loan; otherwise, the person may feel unfairly treated. The same holds true for sites like Quora. If a machine learning algorithm decided to delete a user's account, the user would be owed an explanation as to why. I doubt they would be satisfied with "that's what the computer said." Other scenarios would be important business decisions. Can you imagine the CEO of a big company making a decision about millions of dollars without understanding why it should be done, just because the "computer" says so?
2. Duration of development
Although there are libraries like Keras that make the development of neural networks fairly simple, sometimes you need more control over the details of the algorithm, such as when you are trying to solve a difficult machine learning problem that no one has tackled before. In that case, you might use TensorFlow, which gives you more possibilities, but it is also more complicated, and development takes much longer (depending on what you want to build). Then a practical question arises for any company: is it really worth it for expensive engineers to spend weeks developing something that could be solved much faster with a simpler algorithm?
3. Amount of data
Neural networks usually require much more data than traditional machine learning algorithms, as in at least thousands, if not millions, of labeled samples. This is not an easy problem to deal with, and many machine learning problems can be solved well with less data if you use other algorithms. Although there are some cases where neural networks do well with little data, most of the time they don't. In such cases, a simple algorithm like naive Bayes, which deals much better with little data, would be the appropriate choice.
4. Computationally expensive
Usually, neural networks are also more computationally expensive than traditional algorithms. State-of-the-art deep learning algorithms, which achieve successful training of really deep neural networks, can take several weeks to train completely from scratch. By contrast, most traditional machine learning algorithms take much less time to train, ranging from a few minutes to a few hours or days. The amount of computational power needed for a neural network depends heavily on the size of your data, but also on the depth and complexity of your network. For example, a neural network with one layer and 50 neurons will be much faster than a random forest with 1,000 trees. By comparison, a neural network with 50 layers will be much slower than a random forest with only 10 trees.

What Is a Neuron?
A neuron is a mathematical function that takes one or more input values and outputs a single numerical value. In this diagram, we can see the different elements of the neuron. The neuron is defined as follows:
1. First, we compute the weighted sum of the inputs xi and the weights ωi (also known as the activation value): a = Σᵢ ωixi + b. Here, xi is either a numerical value that represents the input data or the output of another neuron (that is, if the neuron is part of a neural network). The weights ωi are numerical values that represent either the strength of the inputs or, alternatively, the strength of the connections between the neurons. The weight b is a special value called the bias, whose input is always 1.
2. Then, we use the result of the weighted sum as the input to the activation function f, which is also known as the transfer function. There are many types of activation functions, but they all have to satisfy the requirement of being non-linear, which we will explain later in the chapter.
You might have noticed that the neuron is very similar to logistic regression and the perceptron. You can think of it as a generalized version of these two algorithms: if we use the logistic function or the step function as the activation function, the neuron turns into logistic regression or a perceptron, respectively. Furthermore, if we do not use any activation function, the neuron turns into linear regression. We are not limited to these cases, however, and, as you will see later, they are rarely used in practice.
The activation value defined previously can be interpreted as the dot product between the vector w and the vector x: a = w · x + b. The vector x is perpendicular to the weight vector w if w · x = 0. Therefore, all vectors x such that w · x = 0 define a hyperplane in the feature space Rⁿ, where n is the dimension of x.
That sounds complicated! To understand it better, let's consider a special case where the activation function is f(x) = x and we only have a single input value, x. The output of the neuron then becomes y = ωx + b, which is the linear equation. This shows that, in a one-dimensional input space, the neuron defines a line. If we visualize the same for two or more inputs, we will see that the neuron defines a plane, or a hyperplane, for an arbitrary number of input dimensions. We can also see that the role of the bias, b, is to allow the hyperplane to move away from the center of the coordinate system.

The preceding diagram displays the hyperplane

If we do not use a bias, the neuron will have limited representation power: the perceptron (and hence the neuron) only works with linearly separable classes, and now we know why, since it defines a hyperplane. To overcome this limitation, we need to organize the neurons in a neural network.
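To make this definition concrete, here is a minimal sketch of a single neuron in Python. It assumes a sigmoid activation function; the function names and the numbers are illustrative, not taken from any particular library.

import numpy as np

def sigmoid(a):
    # A common non-linear activation (transfer) function
    return 1.0 / (1.0 + np.exp(-a))

def neuron_output(x, w, b):
    # Step 1: weighted sum of the inputs and the weights (the activation value)
    a = np.dot(w, x) + b
    # Step 2: pass the activation value through the activation function f
    return sigmoid(a)

# A neuron with two inputs
x = np.array([0.5, -1.2])      # input values
w = np.array([0.8, 0.3])       # weights (connection strengths)
b = 0.1                        # bias, whose input is always 1
print(neuron_output(x, w, b))  # a single numerical output

With f(a) = a instead of the sigmoid, this reduces to linear regression, and with a step function it becomes a perceptron, as described above.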
Different types of Neural Networks in Deep Learning
This section focuses on three important types of neural networks that form the basis for most pre-trained models in deep learning:
Artificial Neural Networks (ANN)
Recurrent Neural Networks (RNN)
Convolution Neural Networks (CNN)

Artificial Neural Networks (ANN)
A single perceptron (or neuron) can be envisioned as logistic regression. An Artificial Neural Network, or ANN, is a group of multiple perceptrons/neurons at each layer. An ANN is also known as a feed-forward neural network because inputs are processed only in the forward direction:

Artificial Neural Networks

As you can see here, an ANN consists of three layers: input, hidden, and output. The input layer accepts the inputs, the hidden layer processes the inputs, and the output layer produces the result. Essentially, each layer tries to learn certain weights. ANNs can be used to solve problems related to:
Tabular data: data that is structured into rows, each of which contains information about some thing.
Image data: pixel data representing an area of an image, created, collected, and stored using imaging devices.
Text data: text from which high-quality information can be derived (text mining, similar to text analytics), typically by devising patterns and trends through means such as statistical pattern learning.

Advantages of Artificial Neural Network (ANN)
An artificial neural network is capable of learning any nonlinear function. Hence, these networks are popularly known as universal function approximators. ANNs have the capacity to learn weights that map any input to the output. One of the main reasons behind universal approximation is the activation function. Activation functions introduce nonlinear properties to the network. This helps the network learn any complex relationship between input and output.

Perceptron

As you can see here, the output at each neuron is the activation of a weighted sum of inputs. But wait: what happens if there is no activation function? The network can then only learn linear functions and can never learn complex relationships.

Recurrent Neural Network (RNN)
A recurrent neural network is a generalization of a feedforward neural network that has an internal memory. An RNN is recurrent in nature, as it performs the same function for every input of data, while the output for the current input depends on the previous computation. After the output is produced, it is copied and sent back into the recurrent network. To make a decision, the network considers the current input and the output that it has learned from the previous input (see the sketch after the advantages list below).

As you can see here, an RNN has a recurrent connection on the hidden states. This looping constraint ensures that sequential information is captured in the input data. We can use recurrent neural networks to solve problems related to:
Time series data: also referred to as time-stamped data; a sequence of data points indexed in time order and collected at different points in time.
Text data: as described above.
Audio data: you are always in contact with audio; your brain is continuously processing and understanding audio data and giving you information about the environment.

Advantages of Recurrent Neural Network
An RNN can model a sequence of data so that each sample can be assumed to be dependent on the previous ones.
Recurrent neural networks are even used with convolutional layers to extend the effective pixel neighbourhood.
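To make the recurrence concrete, here is a minimal sketch of a single RNN step. The weight matrices, their shapes, and the name rnn_step are illustrative assumptions rather than any library's API; the point is that the same function is applied at every time step, with the hidden state h acting as the internal memory.

import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # The new hidden state depends on the current input x_t
    # and on the previous hidden state h_prev (the internal memory)
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 2))   # input-to-hidden weights
W_h = rng.normal(size=(3, 3))   # hidden-to-hidden (recurrent) weights
b = np.zeros(3)

h = np.zeros(3)                 # initial memory
sequence = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for x_t in sequence:
    h = rnn_step(x_t, h, W_x, W_h, b)   # the output is copied back in
print(h)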
Disadvantages of Recurrent Neural Network
Gradient vanishing and exploding problems.
Training an RNN is a very difficult task.
It cannot process very long sequences when using tanh or relu as the activation function.

Convolution Neural Network (CNN)
Convolutional neural networks (CNN) are all the rage in the deep learning community right now. These CNN models are being used across different applications and domains, and they are especially prevalent in image and video processing projects. The building blocks of CNNs are filters, a.k.a. kernels. Kernels are used to extract the relevant features from the input using the convolution operation. Let's try to grasp the importance of filters using images as input data.

Convolution Neural Network

Training Neural Networks
The general concept we have to understand is the following: every neural network is an approximation of a function, so each neural network will not be equal to the desired function, but will instead differ from it by some value, called the error. During training, the aim is to minimize this error. Since the error is a function of the weights of the network, we want to minimize the error with respect to the weights. The error function is a function of many weights and, therefore, a function of many variables. Mathematically, the set of points where this function is zero represents a hypersurface, and to find a minimum on this surface, we pick a point and then follow a curve in the direction of the minimum.
We should note that a neural network and its training are two separate things. This means we could adjust the weights of the network in some way other than gradient descent and backpropagation, but gradient descent with backpropagation is the most popular and efficient way to do so and is, apparently, the only way that is currently used in practice.

Linear Regression
Linear regression is a special case of a neural network; that is, it is a single neuron with the identity activation function. In this section, we will learn how to train linear regression with gradient descent and, in the following sections, we will extend it to training more complex models. You can see how gradient descent works in the following code block:

Initialize the weights w with some random values
repeat:
    # compute the mean squared error (MSE) loss function
    # for all samples of the training set (we denote MSE with J)
    # update the weights w based on the derivative of J
    # with respect to each weight
until MSE falls below a threshold

At first, this might look scary, but fear not! Behind the scenes, it is very simple and straightforward math. But let's not lose sight of our objective, which is to adjust the weights, ω, in a way that will help the algorithm predict the target values. To do this, first we need to know how the output yⁱ differs from the target value tⁱ for each sample of the training dataset (we use superscript notation to mark the i-th sample). We will use the mean squared error loss function (MSE), which is equal to the mean value of the squared differences yⁱ − tⁱ over all samples (the total number of samples in the training set is n):

J = (1/n) Σᵢ (yⁱ − tⁱ)²

We will denote MSE with J for ease of use, and to underscore that we could also use other loss functions. Each yⁱ is a function of ω, and so J is also a function of ω. As we mentioned previously, the loss function J represents a hypersurface of dimension equal to the dimension of ω (we are implicitly also considering the bias). To illustrate this, imagine that we have only one input value, x, and a single weight, ω. We can see how the MSE changes with respect to ω in the following chart:

MSE diagram

Our goal is to minimize J, which means finding the ω where the value of J is at its global minimum. To do this, we need to know whether J increases or decreases when we modify ω; in other words, we need the first derivative (or gradient) of J with respect to ω. The sketch below turns the earlier pseudocode into runnable code.
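This is a minimal sketch of that loop for a single-input linear regression, y = ωx + b, written with numpy. The dataset, the learning rate, and the stopping threshold are made-up illustrations; the two derivative lines follow from differentiating J = (1/n) Σᵢ (yⁱ − tⁱ)² with respect to w and b.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # inputs (illustrative)
t = np.array([3.0, 5.0, 7.0, 9.0])   # targets on the line t = 2x + 1

w, b = 0.0, 0.0    # initialize the weights with some values (w plays the role of ω)
lr = 0.01          # learning rate: the step size of each update

for epoch in range(5000):
    y = w * x + b                  # the neuron's output for every sample
    J = np.mean((y - t) ** 2)      # mean squared error
    if J < 1e-6:                   # until MSE falls below a threshold
        break
    dJ_dw = 2 * np.mean((y - t) * x)   # derivative of J with respect to w
    dJ_db = 2 * np.mean(y - t)         # derivative of J with respect to b
    w -= lr * dJ_dw                # step against the gradient
    b -= lr * dJ_db

print(w, b)   # approaches 2 and 1

After enough iterations, w and b approach the slope and intercept of the line that generated the targets, which is exactly what minimizing J means.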
Simple Linear Regression With scikit-learn
There are five essential steps when you are implementing linear regression (illustrated in the sketch after this list):
1. Import the packages and classes you need.
2. Provide data to work with, and eventually do appropriate transformations.
3. Create a regression model and fit it with existing data.
4. Check the results of model fitting to know whether the model is satisfactory.
5. Apply the model for predictions.
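Here is a minimal sketch of those five steps. It reuses the small dataset from the polynomial example that follows, so only the calls themselves are new.

# Step 1: Import packages and classes
import numpy as np
from sklearn.linear_model import LinearRegression

# Step 2: Provide data; .reshape() makes x two-dimensional, as required
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

# Step 3: Create a regression model and fit it with existing data
model = LinearRegression().fit(x, y)

# Step 4: Check the results of model fitting
print('coefficient of determination:', model.score(x, y))
print('intercept:', model.intercept_)
print('slope:', model.coef_)

# Step 5: Apply the model for predictions
print('predicted response:', model.predict(x))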
Polynomial Regression With scikit-learn
Implementing polynomial regression with scikit-learn is very similar to linear regression. There is only one extra step: you need to transform the array of inputs to include nonlinear terms such as 𝑥².

Step 1: Import packages and classes:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

Step 2a: Provide data:

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])

Now you have the input and output in a suitable format. Keep in mind that you need the input to be a two-dimensional array. That is why .reshape() is used.

Step 2b: Transform input data
This is the new step you need to implement for polynomial regression! As you have seen earlier, you need to include 𝑥² (and perhaps other terms) as additional features when implementing polynomial regression. For that reason, you should transform the input array x to contain the additional column(s) with the values of 𝑥² (and eventually more features). It is possible to transform the input array in several ways (such as using insert() from numpy), but the class PolynomialFeatures is very convenient for this purpose. Let's create an instance of this class:

transformer = PolynomialFeatures(degree=2, include_bias=False)

You can provide several optional parameters to PolynomialFeatures:
degree is an integer (2 by default) that represents the degree of the polynomial regression function.
interaction_only is a Boolean (False by default) that decides whether to include only interaction features (True) or all features (False).
include_bias is a Boolean (True by default) that decides whether to include the bias (intercept) column of ones (True) or not (False).
This example uses the default values of all parameters except include_bias, but you'll sometimes want to experiment with the degree of the function, and it can be beneficial for readability to provide this argument anyway.

Before applying transformer, you need to fit it with .fit():

transformer.fit(x)

Once transformer is fitted, it is ready to create a new, modified input. You apply .transform() to do that:

x_ = transformer.transform(x)

You can also use .fit_transform() to replace the three previous statements with only one:

x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

That is fitting and transforming the input array in one statement with .fit_transform(). It takes the input array and effectively does the same thing as .fit() and .transform() called in that order. It also returns the modified array. This is how the new input array looks:

>>> print(x_)
[[   5.   25.]
 [  15.  225.]
 [  25.  625.]
 [  35. 1225.]
 [  45. 2025.]
 [  55. 3025.]]

Step 3: Create a model and fit it

model = LinearRegression().fit(x_, y)

Step 4: Get results

>>> r_sq = model.score(x_, y)
>>> print('coefficient of determination:', r_sq)
coefficient of determination: 0.8908516262498564
>>> print('intercept:', model.intercept_)
intercept: 21.372321428571425
>>> print('coefficients:', model.coef_)
coefficients: [-1.32357143  0.02839286]

You can obtain a very similar result with different transformation and regression arguments, this time keeping the bias column of ones in the input and dropping the intercept from the regression:

x_ = PolynomialFeatures(degree=2, include_bias=True).fit_transform(x)
model = LinearRegression(fit_intercept=False).fit(x_, y)

The variable model again corresponds to the new input array x_. Therefore, x_ should be passed as the first argument instead of x. This approach yields the following results, which are similar to the previous case:

>>> r_sq = model.score(x_, y)
>>> print('coefficient of determination:', r_sq)
coefficient of determination: 0.8908516262498565
>>> print('intercept:', model.intercept_)
intercept: 0.0
>>> print('coefficients:', model.coef_)
coefficients: [21.37232143 -1.32357143  0.02839286]

Step 5: Predict response
If you want to get the predicted response, just use .predict(), but keep in mind that the argument should be the modified input x_ instead of the old x:

>>> y_pred = model.predict(x_)
>>> print('predicted response:', y_pred, sep='\n')
predicted response:
[15.46428571  7.90714286  6.02857143  9.82857143 19.30714286 34.46428571]

As you can see, prediction works almost the same way as in the case of linear regression. It just requires the modified input instead of the original.

Conclusion
AI As We Understand It
Most of the AI we know today works on a principle of deep learning: a machine is given a set of data and a desired output, and from that it produces its own algorithm to solve the problem. The system then repeats, propagating itself. This is called a neural network. It is necessary to use this method to create AI, as a computer can code faster than a human; it would take lifetimes to code it manually. Professor of Electrical Engineering and Computer Science at MIT Tommi Jaakkola says, "If you had a very small neural network, you might be able to understand it. But once it becomes very large, and it has thousands of units per layer and maybe hundreds of layers, then it becomes quite un-understandable." We are at the stage of these huge systems now. So, in order to make these machines explain themselves (a problem that will have to be solved before we can put any trust in them), what methods are we using?