Lecture 1:

Book: Read Chapters 1.1 and 1.2 (inclusive) from "Neural Networks and Learning Machines" (3rd Edition) by Simon O. Haykin (Nov 28, 2008).

o The human brain computes in an entirely different way than a conventional computer.
o The brain is a parallel computer. It organizes its neurons to perform computations (such as pattern recognition, perception, and motor control) many times faster than the fastest computer.
o At birth, a brain already has considerable structure and the ability to build its own rules of behavior through experience. Plasticity permits the developing nervous system to adapt to its surrounding environment.
o A neural network is a machine designed to model the way in which the brain performs a particular task or function of interest. We focus on the class of neural networks that perform useful computations through learning. To achieve good performance, neural networks use a massive interconnection of simple computing cells (neurons, or processing units).
o Neural network viewed as an adaptive machine: a neural network is a massively parallel distributed processor, made up of simple processing units, that has a natural tendency for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
  1. Knowledge is acquired by the network from its environment through a learning process.
  2. Interneuron connection strengths (synaptic weights) are used to store the acquired knowledge.
o Learning algorithm – the procedure used to perform the learning process; its function is to modify the synaptic weights of the network.
o A neural network can also modify its own topology (motivated by the fact that neurons in the brain can die and new synaptic connections can grow).

Benefits of Neural Networks:
o A NN derives its computing power from a) its massively parallel distributed structure and b) its ability to learn and therefore generalize (a NN produces reasonable outputs for inputs not encountered during training [learning]).
o These two capabilities make it possible to find good approximate solutions to intractable problems.
o In practice, large problems are decomposed into smaller ones, and a NN is assigned to the sub-problems that match its capabilities.
o NN offer:
  Nonlinearity – an artificial neuron can be linear or nonlinear. A NN made up of nonlinear neurons is itself nonlinear, and the nonlinearity is distributed throughout the network.
  Input-Output Mapping – a popular paradigm of learning: modification of the synaptic weights of a neural network by applying a set of training examples (task examples). Each example consists of a unique input signal and a corresponding desired response. The NN is presented with an example, and its synaptic weights are modified to minimize the difference between the desired response and the actual response of the NN. Training is repeated for many examples until the NN reaches a steady state where there are no further significant changes in the synaptic weights; the training set may be re-applied in a different order. The NN thus learns from examples by constructing an input-output mapping for the problem at hand, with no prior assumptions made on the input data. (A minimal code sketch of this training loop appears after this list.)
  Adaptivity – NN are capable of adapting their synaptic weights to changes in the surrounding environment (a NN trained in one environment can easily be retrained to deal with minor changes in the environment it operates in). When a NN operates in a nonstationary environment (where statistics change with time), it may be designed to change its synaptic weights in real time. The architecture of a NN and its ability to adapt make it a useful tool for adaptive pattern classification, adaptive signal processing, and adaptive control. Generally, the more adaptive we make a system (while ensuring it remains stable), the more robust its performance will be when the system is required to operate in a nonstationary environment. Note, however, that adaptivity does not always lead to robustness; it may do the opposite. A system with short time constants changes rapidly and may respond to spurious disturbances, causing degradation of the system's performance. Time constants should be long enough for the system to ignore disturbances, yet short enough to respond to meaningful changes in the environment (the stability-plasticity dilemma).
  Evidential Response – in the context of pattern classification, a NN can be designed to provide information not only about which particular pattern to select, but also about the confidence in the decision made. This confidence information may be used to reject ambiguous patterns and thereby improve classification performance.
  Contextual Information – knowledge is represented by the very structure and activation state of a NN. Every neuron is potentially affected by the global activity of all other neurons, so contextual information is handled naturally.
  Fault Tolerance – a NN implemented in hardware has the potential to be inherently fault tolerant. For example, if a neuron or its links are damaged, recall of a stored pattern is degraded in quality; but, due to the distributed nature of the NN, a considerable portion of the network must be damaged before performance degrades seriously. This has been demonstrated empirically; however, it may be necessary to take corrective measures when designing the algorithm used to train the network.
  VLSI Implementability – the massively parallel nature of a NN makes it potentially fast for the computation of certain tasks. The same feature makes a NN well suited for implementation using very-large-scale-integrated (VLSI) technology. A benefit of VLSI is that it provides a means of capturing truly complex behavior in a highly hierarchical fashion.
  Uniformity of Analysis and Design – NN are universal as information processors; the same notation is used in all domains where NN are applied. Manifestations of this feature:
  - Neurons, in one form or another, are an ingredient common to all NN
  - This commonality makes it possible to share theories and learning algorithms across different applications
  - Modular networks can be built through a seamless integration of modules
  Neurobiological Analogy – the motivation for NN comes from the brain (a fault-tolerant, fast, powerful processor).
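The input-output mapping bullet above describes error-correction learning in words. The following minimal Python sketch is not from Haykin's book; the sigmoid activation, the learning rate, and the toy AND dataset are illustrative assumptions. It shows a single nonlinear neuron having its synaptic weights adjusted to minimize the difference between desired and actual responses:

```python
import math
import random

def sigmoid(v):
    # Nonlinear activation: squashes the induced local field into (0, 1)
    return 1.0 / (1.0 + math.exp(-v))

# Toy training set: each example pairs a unique input signal with its
# desired response (here the AND function, an assumption for illustration).
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

random.seed(0)
w = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]  # synaptic weights
b = 0.0    # bias
eta = 0.5  # learning rate (assumed value)

for epoch in range(5000):
    random.shuffle(examples)           # re-apply the training set in a different order
    for x, d in examples:
        v = w[0]*x[0] + w[1]*x[1] + b  # induced local field
        y = sigmoid(v)                 # actual response of the neuron
        e = d - y                      # desired response minus actual response
        delta = e * y * (1.0 - y)      # error scaled by the sigmoid's slope
        # Error-correction rule: nudge each weight to shrink the difference.
        w[0] += eta * delta * x[0]
        w[1] += eta * delta * x[1]
        b    += eta * delta

for x, d in sorted(examples):
    print(x, "desired:", d, "actual: %.3f" % sigmoid(w[0]*x[0] + w[1]*x[1] + b))
```

After enough passes the weight changes become negligible (the "steady state" stopping condition described above) and the actual responses approximate the desired ones.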
The Human Brain
o The human nervous system can be depicted as a three-stage system:
  Stimulus -> Receptors <-> Neural net (Brain) <-> Effectors -> Response
  The forward arrows indicate the forward transmission of information-bearing signals; the backward arrows indicate feedback.
o The neural net (brain) continually receives information, perceives it, and makes appropriate decisions.
o Receptors convert stimuli from the human body or the external environment into electrical impulses that convey information to the neural net (brain).
o Effectors convert electrical impulses generated by the neural net (brain) into discernible responses as system outputs.
o Neurons are five to six orders of magnitude slower than silicon logic gates (neural events take milliseconds; silicon gates switch in nanoseconds). However, the brain makes up for its relatively slow rate of operation by its enormous number of neurons (about 10 billion, with about 60 trillion synapses/connections) and the massive interconnections between them. In addition, the brain has a very efficient structure.

Lecture 2:

o The brain's energetic efficiency is about 10^(-16) joules per operation per second, while computers require far more energy per operation.
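As a quick sanity check on the figures just quoted, the arithmetic below derives the average connectivity they imply. The neuron and synapse counts and the 10^(-16) J figure come from the notes; the 10^(-6) J/operation figure for a conventional computer is an assumed round number for illustration only:

```python
# Back-of-envelope check of the brain figures quoted above.
neurons  = 10e9    # ~10 billion neurons (figure from the notes)
synapses = 60e12   # ~60 trillion synapses (figure from the notes)

# Average connectivity implied by those two counts:
print(synapses / neurons)   # -> 6000.0 synaptic connections per neuron

brain_energy    = 1e-16  # J per operation per second (figure from the notes)
computer_energy = 1e-6   # assumed round number for a conventional computer
print(computer_energy / brain_energy)  # -> 1e+10, i.e. ~10 orders of magnitude apart
```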
o Synapses (nerve endings) mediate the interactions between neurons. The most common kind is the chemical synapse: a presynaptic process releases a transmitter substance that diffuses across the synaptic junction between neurons and then acts on a postsynaptic process. The synapse thus converts a presynaptic electrical signal into a chemical signal and back into a postsynaptic electrical signal; it behaves as a nonreciprocal two-port device. In traditional descriptions, a synapse is a simple connection that can impose either excitation or inhibition (but not both) on the receptive neuron.
o Plasticity serves a) the creation of new synaptic connections between neurons and b) the modification of existing synapses.
o Axons – the transmission lines of a neuron (smoother surface, fewer branches, greater length).
o Dendrites – the receptive zones (tree-like, with an irregular surface and more branches).
o Neurons come in a wide variety of shapes and sizes.
o Most neurons encode their outputs as a series of brief electrical pulses (action potentials, or spikes) that originate at or close to the cell body and propagate across the individual neuron at constant velocity and amplitude. Spikes are needed because passively conducted voltage decays exponentially with distance (roughly V(x) = V(0)*e^(-x/lambda), with lambda the length constant), which makes passive signaling unusable over any appreciable distance.
o Structural organization of levels in the brain, from largest to smallest:
  Central nervous system -> Interregional circuits -> Local circuits -> Neurons -> Dendritic trees -> Neural microcircuits -> Synapses -> Molecules
o Different sensory inputs (visual, auditory, etc.) and motor functions are mapped onto corresponding areas of the cerebral cortex.
o We are nowhere near recreating these levels of organization with artificial neural networks. The networks we are able to design are primitive compared even to the brain's local circuits and interregional circuits.

Lecture 3: