Artificial Neural Networks
Introduction to Artificial Intelligence, Lecture 16: Neural Network Paradigms I (March 31, 2016)

Computers vs. Neural Networks

  "Standard" computers              Neural networks
  one CPU / few processing cores    highly parallel processing
  fast processing units             slow processing units
  reliable units                    unreliable units
  static infrastructure             dynamic infrastructure

Why Artificial Neural Networks?

There are two basic reasons why we are interested in building artificial neural networks (ANNs):
• Technical viewpoint: Some problems, such as character recognition or the prediction of future states of a system, require massively parallel and adaptive processing.
• Biological viewpoint: ANNs can be used to replicate and simulate components of the human (or animal) brain, thereby giving us insight into natural information processing.

Why do we need a paradigm other than symbolic AI for building "intelligent" machines?
• Symbolic AI is well suited for representing explicit knowledge that can be appropriately formalized.
• However, learning in biological systems is mostly implicit – it is an adaptation process based on uncertain information and reasoning.
• ANNs are inherently parallel and work extremely efficiently if implemented in parallel hardware.

How do NNs and ANNs work?

• The "building blocks" of neural networks are the neurons.
• In technical systems, we also refer to them as units or nodes.
• Basically, each neuron
  – receives input from many other neurons,
  – changes its internal state (activation) based on the current input,
  – sends one output signal to many other neurons, possibly including its input neurons (recurrent network).
  (A small code sketch of such a unit appears below.)
• Information is transmitted as a series of electric impulses, so-called spikes.
• The frequency and phase of these spikes encode the information.
• In biological systems, one neuron can be connected to as many as 10,000 other neurons.

Structure of NNs (and some ANNs)

• In biological systems, neurons of similar functionality are usually organized in separate areas (or layers).
• Often there is a hierarchy of interconnected layers, with the lowest layer receiving sensory input and neurons in higher layers computing more complex functions.
• For example, neurons in macaque visual cortex have been identified that are activated only when there is a face (monkey, human, or drawing) in the macaque's visual field.
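The generic unit just described can be sketched in a few lines of code. This is only a minimal illustration of the receive/update/send cycle and of stacking units into layers; the weights, names, and step activation are illustrative choices, not taken from the lecture.

```python
# Minimal sketch of a generic unit: receive inputs, update the internal
# state (activation) from the weighted inputs, and send one output signal.
# Weights and names here are illustrative, not from the lecture.

def unit_output(weights, inputs, activation):
    net = sum(w * x for w, x in zip(weights, inputs))  # combine the inputs
    return activation(net)                             # new activation = output

step = lambda net: 1 if net >= 0 else 0                # a simple activation

# Units in a higher layer receive the outputs of lower-layer units,
# giving the kind of layered hierarchy described above:
sensory_input = [1, 0]
hidden = [unit_output([0.5, -0.5], sensory_input, step),
          unit_output([-0.5, 0.5], sensory_input, step)]
top = unit_output([1.0, 1.0], hidden, step)
print(hidden, top)   # [1, 0] 1
```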
"Data Flow Diagram" of Visual Areas in Macaque Brain

[Figure: diagram of the visual areas in the macaque brain. Blue: motion perception pathway; green: object recognition pathway.]

[Figure: stimuli in the receptive field of a neuron.]

Structure of NNs (and some ANNs)

• In a hierarchy of neural areas such as the visual system, those at the bottom (near the sensory "input" neurons) only "see" local information.
• For example, each neuron in primary visual cortex receives input only from a small area (about 1° in diameter) of the visual field, called its receptive field.
• As we move towards higher areas, the responses of neurons become less and less location dependent.
• In inferotemporal cortex, some neurons respond to face stimuli shown at any position in the visual field.

Receptive Fields in Hierarchical Neural Networks

[Figure: a neuron A in a lower layer and its small receptive field.]
[Figure: a neuron B in the top layer and its much larger receptive field in the input layer.]

How do NNs and ANNs work?

• NNs are able to learn by adapting their connectivity patterns so that the organism improves its behavior in terms of reaching certain (evolutionary) goals.
• The strength of a connection, and whether it is excitatory or inhibitory, depends on the state of the receiving neuron's synapses.
• The NN achieves learning by appropriately adapting the states of its synapses.

An Artificial Neuron

[Figure: an artificial neuron $i$ receiving inputs $x_1, \ldots, x_n$ through synapses with weights $w_{i,1}, \ldots, w_{i,n}$ and producing the output $x_i$.]

Net input signal: $net_i(t) = \sum_{j=1}^{n} w_{i,j}(t)\, x_j(t)$

Output: $x_i(t) = f_i(net_i(t))$

The Net Input Signal

The net input signal is the sum of all inputs after passing the synapses:

$net_i(t) = \sum_{j=1}^{n} w_{i,j}(t)\, x_j(t)$

This can be viewed as computing the inner product of the vectors $w_i$ and $x$:

$net_i(t) = \|w_i(t)\| \, \|x(t)\| \cos \varphi,$

where $\varphi$ is the angle between the two vectors.

The Activation Function

One possible choice is a threshold function:

$f_i(net_i(t)) = \begin{cases} 1, & \text{if } net_i(t) \geq \theta \\ 0, & \text{otherwise} \end{cases}$

[Figure: the graph of this function, a step that jumps from 0 to 1 at $net_i(t) = \theta$.]

Binary Analogy: Threshold Logic Units

Example: a TLU with inputs $x_1, x_2$, weights $w_1 = 1$, $w_2 = 1$, and threshold $\theta = 1.5$ computes $x_1 \wedge x_2$. Adding a third input $x_3$ with weight $w_3 = -1$ turns this into a TLU that computes $x_1 \wedge x_2 \wedge \neg x_3$. (A code sketch of these two units follows the definition of linear separability below.)

Yet another example: can a TLU with inputs $x_1, x_2$ and some choice of weights $w_1, w_2$ and threshold $\theta$ compute $x_1 \operatorname{XOR} x_2$? Impossible! TLUs can only realize linearly separable functions.

Linear Separability

A function $f: \{0, 1\}^n \to \{0, 1\}$ is linearly separable if the space of input vectors yielding 1 can be separated from those yielding 0 by a linear surface (hyperplane) in $n$ dimensions.
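Before turning to examples, here is a minimal sketch of the threshold logic units above: output 1 iff the weighted input sum reaches the threshold. The weights and the threshold 1.5 for the two-input unit are from the slide; that the three-input unit keeps the same threshold is my assumption (any threshold in (1, 2] works).

```python
# Minimal sketch of a threshold logic unit (TLU): output 1 iff the
# weighted sum of the inputs reaches the threshold theta.
from itertools import product

def tlu(weights, theta, inputs):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

# w = (1, 1), theta = 1.5 computes x1 AND x2:
for x in product((0, 1), repeat=2):
    assert tlu((1, 1), 1.5, x) == int(x[0] and x[1])

# Adding x3 with w3 = -1 (threshold kept at 1.5, an assumption)
# computes x1 AND x2 AND (NOT x3):
for x in product((0, 1), repeat=3):
    assert tlu((1, 1, -1), 1.5, x) == int(x[0] and x[1] and not x[2])

print("Both TLUs match their Boolean functions.")
```

Running the same exhaustive check over candidate weights and thresholds would show that no single TLU passes an XOR test, matching the impossibility claim above.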
Examples (two dimensions):

[Figure: two 2×2 input grids. Left: a labeling of the four input vectors whose 1-outputs can be separated from the 0-outputs by a straight line (linearly separable). Right: the XOR labeling, for which no such line exists (linearly inseparable).]

Linear Separability

To explain linear separability, let us consider the function $f: \mathbb{R}^n \to \{0, 1\}$ with

$f(x_1, x_2, \ldots, x_n) = \begin{cases} 1, & \text{if } \sum_{i=1}^{n} w_i x_i \geq \theta \\ 0, & \text{otherwise} \end{cases}$

where $x_1, x_2, \ldots, x_n$ represent real numbers. This will also be useful for understanding the computations of artificial neural networks.

[Figure: the input space in the two-dimensional case (n = 2): three plots showing the separating line $w_1 x_1 + w_2 x_2 = \theta$ and the resulting 0/1 regions for $(w_1, w_2, \theta) = (1, 2, 2)$, $(-2, 1, 2)$, and $(-2, 1, 1)$.]

So by varying the weights and the threshold, we can realize any linear separation of the input space into a region that yields output 1 and another region that yields output 0.

As we have seen, a two-dimensional input space can be divided by any straight line. A three-dimensional input space can be divided by any two-dimensional plane. In general, an n-dimensional input space can be divided by an (n-1)-dimensional plane, or hyperplane. Of course, for n > 3 this is hard to visualize.

Of course, the same applies to our original function f using binary input values. The only difference is the restriction of the input values to 0 and 1. Obviously, we cannot find a straight line to realize the XOR function:

[Figure: the XOR input grid, with output 1 at (0, 1) and (1, 0) and output 0 at (0, 0) and (1, 1).]

In order to realize XOR with TLUs, we need to combine multiple TLUs into a network.

Multi-Layered XOR Network

[Figure: a two-layer network of TLUs. Two hidden units receive $x_1$ and $x_2$: one with weights (1, -1) and threshold 0.5, computing $x_1 \wedge \neg x_2$, and one with weights (-1, 1) and threshold 0.5, computing $\neg x_1 \wedge x_2$. An output unit with weights (1, 1) and threshold 0.5 combines them, so the network computes $x_1 \operatorname{XOR} x_2$.] (A code sketch of this network appears at the end of this section.)

Capabilities of Threshold Neurons

What can threshold neurons do for us? To keep things simple, let us consider such a neuron with two inputs:

[Figure: a neuron $i$ with inputs $x_1, x_2$ and weights $w_{i,1}, w_{i,2}$.]

The computation of this neuron can be described as the inner product of the two-dimensional vectors $x$ and $w_i$, followed by a threshold operation.

Let us assume that the threshold $\theta = 0$ and illustrate the function computed by the neuron for sample vectors $w_i$ and $x$:

[Figure: the vectors $w_i$ and $x$ plotted by their two vector components, with a dotted line through the origin perpendicular to $w_i$.]

Since the inner product is non-negative for $-90° \leq \varphi \leq 90°$, in this example the neuron's output is 1 for any input vector $x$ to the right of, or on, the dotted line, and 0 for any other input vector.
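As a closing sketch, the multi-layered XOR network can be checked directly with the TLU rule, using the weights and thresholds given in the network figure above (hidden units (1, -1) and (-1, 1), output unit (1, 1), all thresholds 0.5):

```python
# Sketch of the multi-layered XOR network: two hidden TLUs feed one
# output TLU; weights and thresholds are the ones given in the figure.
from itertools import product

def tlu(weights, theta, inputs):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

def xor_net(x1, x2):
    h1 = tlu((1, -1), 0.5, (x1, x2))   # fires for x1 AND (NOT x2)
    h2 = tlu((-1, 1), 0.5, (x1, x2))   # fires for (NOT x1) AND x2
    return tlu((1, 1), 0.5, (h1, h2))  # fires if either hidden unit fires

for x1, x2 in product((0, 1), repeat=2):
    print(x1, x2, "->", xor_net(x1, x2))   # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```

The hidden layer carves the input space with two separating lines, and the output unit then ORs the two half-planes together; this is exactly how a network of TLUs escapes the single-hyperplane limitation of an individual unit.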