Introduction & Fundamentals
• Part I: Introduction
• Part II: Fundamental Concepts
• Part III: Classification Lab

Part I: Introduction

What Is The Problem?
Our world is full of data. After collection and organization, data, if we are lucky, becomes information.
In today's interconnected world, information exists in electronic form that can be stored and transmitted instantly.
The challenge is to understand, integrate, and apply information to generate useful knowledge (“actionable intelligence”).
Are we drowning in data and information but starved for knowledge?

Data, Data Everywhere…
How do we extract knowledge from a noisy mass of data?
Every source of data, from process and product manufacturing to medical research to activity in financial markets to patient examinations to the billions of consumer and business purchase transactions that occur every day, is influenced by “other data” from the surrounding environment.
Our world is a noisy and messy source of data; virtually nothing is known with certainty.
Knowledge is based on data analysis that accommodates uncertainty.

Empirical Models that Learn
What Is The Solution?
Interpretation requires data acquisition, cleaning (preparing the data for analysis), analysis, and presentation in a way that permits knowledgeable decision making and action.
The key is to extract information about the data from relationships buried within the data itself.

Tools and Technology
The human brain is the most powerful pattern recognition engine ever invented; however, it is not very good at serially processing huge quantities of discrete data.
Enter a New Breed of Processor: Artificial Neural Networks
• Instead of programming a computational system to do specific tasks, teach the system how to perform the task
• To do this, build an Artificial Intelligence (AI) system
• An empirical model that can rapidly and accurately find the patterns buried in data that reflect useful knowledge
• One case of these AI models is neural networks
• AI systems must be adaptive – able to learn from data on a continuous basis

Artificial Intelligence
• Artificial Intelligence techniques such as neural networks, genetic algorithms and fuzzy logic are among the most powerful tools available for detecting and describing subtle relationships in massive amounts of seemingly unrelated data.
• Neural networks can learn and are actually taught instead of being programmed.
• Teaching mode can be supervised or unsupervised
• Neural networks learn in the presence of noise

Question: What are Artificial Neural Networks?
Answer:
[Figure: layered network; labels from top to bottom: Output Layer, Hidden Layer 2, Hidden Layer 1, Input Buffer]

Biological Basis of ANNs
• Animals exhibit intelligence……………….
• Biological neural networks…………..
• Human beings can benefit from simulation of biological neural networks on computers. These are Artificial Neural Networks (ANNs)
• Artificial Neural Networks (ANNs) are……….
• ANNs represent an attempt to simulate……
• ANNs have many names: connectionist systems, neural nets, neurocomputers, parallel distributed processing systems, machine learning algorithms, etc.
• Each neuron is linked to its neighbors with varying coefficients of connectivity that represent the strengths of these connections
• Learning…………..

What are Neural Networks Used For?
2 Basic Types of Learning
Neural Nets
• Supervised Learning
• Unsupervised Learning

Types of Problems
• Mathematical Modeling (Function Approximation)
• Classification
• Clustering
• Forecasting
• Vector Quantization
• Pattern Association
• Control
• Optimization

Mathematical Modeling (Function Approximation)
• Modeling – often the mathematical relationship between two sets of data is unknown analytically
• No closed form expression available
• Empirical data available defining output parameters for each input
• Data is often noisy
• Need to construct a mathematical model that correctly generates outputs from inputs (see fig 1.22 on page 29)
• Approximation carried out using ………………………….
• Network learns ………………………………………….
• Trained neural net can be substituted for ………………………….
• Fast computation, reasonable approximation

Classification
• Assignment of objects to a specific class
• Given a database of objects and the classes of those objects
• Deduce ……………………………
• Create a classifier that will ………………..

Clustering
• Grouping together objects similar to one another
• Usually based on some “distance” measurement in object parameter space
• Objects and distance relationships available
• No prior info on classes or groupings
• Objects clustered based on ………………..
• Clustering may precede ………………………
• Similar to statistical k-nearest neighbor clustering method

Forecasting
• Prediction of future events based on history
• Laws underlying behavior of system sometimes hidden; too many related variables to handle
• Trends and regularities often masked by noise
• Prediction system must be able to ……………….
• Time series forecasting – special case of ……….
• Weather, stock market indices, machine performance

Vector Quantization
• Kohonen classifier – most well known
• Object space divided into several connected regions
• Objects classified based on proximity to regions
• Closest region or node is “winner”
• Form of compression of high dimensional input space
• Successfully used in many geological and environmental classification problems where input object characteristics are often unknown

Pattern Association
• Auto-associative systems are useful when incoming data is a corrupted version of the actual object, e.g. face, handwriting
• Corrupt input sample should trigger ……………
• Require a response which …………………
• May require several iterations of repeated modification of input
• Will be discussed under ……………………..

Control
• Manufacturing, robotic and industrial machines have complex relationships between input and output variables
• Output variables define state of machine
• Input variables define machine parameters determined by operating conditions, time and human input
• System may be static or dynamic
• Need to map inputs to outputs for stable, smooth operation
• Examples include chemical plants, truck backup, robot control

Optimization
• Requirement to improve system performance or costs subject to constraints
• Maximize or minimize …………………….
• “Terrain” of objective function typically very …………………………………..
• Large number of …………….. affecting objective function (high …………… of problem)
• Design variables often subject to ……………….
• Lots of local ………………………..
• Neural nets can be used to find global optima (Ch 7)

Now for some Practical Applications!

Neural Network Applications
• GOTCHA!
• Biomedical Signal Processing
• Biometric Identification
• System Reliability
• Business
• Spiral Inductor Modeling
• Target Tracking
Neural networks have performed successfully where other methods have not: predicting system behavior and recognizing and matching complicated, vague, or incomplete data patterns.
Apply ANNs to pattern recognition, interpretation, prediction, diagnosis, planning, monitoring, debugging, repair, instruction, control.

• A common use for neural networks is to project what will most likely happen (demand prediction). Can help in setting ……………………. For example, hospital emergency rooms, communications systems, power distribution, consumer goods manufacture and storage, …………
• Extremely successful in categorization and pattern recognition. System classifies the object under investigation (e.g. an illness, a pattern, a picture, a chemical compound, a word, the financial profile of a customer) into one of numerous possible categories. This triggers …………………………………

GOTCHA!
• GOTCHA………………………….
• Current surveillance and reconnaissance (S&R) systems are structured to observe huge areas and attempt to detect movement of hostile forces.
• “Forensic” approach: …………………………………….
• USAF Command and Control Battlelab (C2B): use past S&R imagery to locate an explosion, run data back in time to identify the vehicle (or object) which carried munitions, “lock” onto the vehicle and backtrack to locate significant portions of its path – assembly areas, passenger pickup, arming site, and any other spot with intelligence value
• Technology: imagery, net-centric gathering & sharing of data, target identification, …………………………..
• Collaboration of information ………………………..

Biomedical Signal Processing
• “Instant Physician” developed using a neural net
• Net presented with a set of symptoms and medical records
• Output is best diagnosis and treatment

Biometric Identification
• Fingerprints never change. Bifurcations or “minutiae” ………………………………………
• Minutiae-based techniques find minutiae points and map their relative placement on the finger
• Large volumes of fingerprints are collected and stored every day in a wide range of applications including forensics, access control, and driver license registration
• Automatic recognition of people based on fingerprints requires ………………………….
• FBI database contains 70 million fingerprints!

System Control & Reliability
• Backing up a truck to a loading dock is a difficult problem for a novice, easy for an experienced driver
• Very difficult problem mathematically
• Can train a neural net to …………………….
• Automobile airbags can do serious damage …………………………
• Accelerometer MEMS are …………….
• System reliability continuously assessed & failure pre-empted by correct interpretation of data from accelerometers

Business
• Mortgage Risk Assessment – reduces delinquency rates
• Inputs include years of employment, number of dependents, property info, income, loan-to-value ratio
• Output is ……………………….
• Prediction of behavior of stock market indices
• Requires knowledge of ……………..
• Time series forecasting
• Short and long term predictions

Spiral Inductor Modeling
• In today’s portable wireless communications market, demand is for low cost, low power dissipation, high frequency IC building blocks that incorporate spiral inductors on the silicon substrate
• Challenge: ………………………
• Empirical models based on actual measurements are widely reported but are non-predictive and do not permit re-design of inductor layout
• Neural network approach serves as basis for ………………… ……. and permits ………………………. from post-optimization inductor circuit-level parameters
• Ilumoka & Park, Proc. SSST 04, Georgia Tech, Atlanta GA, March 2004

Automatic Target Tracking & Recognition
• Algorithms for automated tracking & recognition of targets have recently been identified by the National Critical Technologies Panel as of critical importance to the mission of the Department of Defense.
• Automated target recognition, localization, and tracking in the presence of ……………. is an important signal processing problem
• Algorithms have been developed for …………………………… such as those that occur during active jamming, non-cooperative maneuvering & complex battlefield scenarios of the future

MUTUAL FUNDS: NEURAL NETWORKS versus REGRESSION ANALYSIS
• For neural networks to be successful, they must outperform methods currently being used in the marketplace.
• Mutual funds are basically ……………………………………. …. Mutual funds have become a major force on Wall Street over the past few years. They function much like an individual security and their prices should reflect all public information. Relationships between …………………………………………………. are very hard to forecast. For years, regression analysis has been a popular tool investors have used to forecast …………………… of mutual funds.
• Investors know that neural networks might be able to pinpoint these relationships better than old methods.
• Predictions made for Net Asset Value (NAV) using 15 economic variables as inputs showed that neural networks were 40% better as tools for forecasting: …………………………….(difference between actual and forecasted NAV) was ………. for neural nets as compared to …….. for regression.
• An important reason for the superior performance of neural networks is their ………….. They were able to look at all aspects of relationships, whereas regression analysis was ………………………………………………

Historical Perspective
Origin of ANNs is neurobiological research in the early 20th century
Several fronts of attack:
• Neurobiologists: How do nerve cells behave when stimulated by an electric current?
• Psychologists: How is learning accomplished by animals?
• Mathematicians: How can we apply gradient descent to neuron learning?
• McCulloch & Pitts – 1st math model of neuron
• Learning rules developed by Hebb (1949), Rosenblatt (1958), Widrow-Hoff (1960)
• Several limitations encountered
• See pages 5-7 for chronological history

Neurons
• Biological neurons are ………………………. receiving & sending signals across synapses to other neurons via tree-like dendrites at both ends (see fig 1.3, page 8)
• Artificial neurons are ……………………….. (fig 1.4, page 10) in which inputs are weighted and summed to produce a weighted sum output net
• Activation function f is ……………………………… f(net) corresponding to firing frequency of the biological neuron
• Several different activations are possible including ………………
• Although neural networks have great potential, there is still a long way to go. Complex neural networks have less than the brain power of a ………………. (100,000 neurons). Human brains contain about …………………….. neurons.
• Neural network software sales annually exceed $2 billion because they offer ………………………………..

Part II: Fundamental Concepts

_ a _ a _ _ _ m
What is a conceptual framework that permits investigation of phenomena in a field of enquiry?
Answer ……………………..

Important ANN Parameters
1. Architecture (or Topology)
2. Learning Rule
3. Paradigm: Combination of Architecture & Learning Rule – complete neural network model – emulates ………………………….
[Diagram: ……………….., ……………….., PARADIGM]

Architecture
• What is architecture? ……………………..
• Single node, single layer insufficient for practical problems; require multiple nodes connected by excitatory (positive) or inhibitory (negative) weights
• 3 types of nodes: ……….. (receive external inputs), ………. (generate external outputs), ……….. (no interaction with external environment)
• Nodes often partitioned into layers (layered nets); intra-layer connections may be prohibited (acyclic) – see fig 1.13, 1.14 page 19
• Feedforward nets – ……………………….
• Recurrent nets – ……………………………
[Diagram labels: MLP, Feedforward, LVQ, Hopfield Net, Hamming, BAM, ART, Modular Net, Backpropagation]

Learning Rule
• What is a learning rule or algorithm?
• Learning is the process by which a neural net adapts itself to a stimulus in order to produce a desired response
• Learning rule is …………………….
• Just as individuals learn differently, neural networks have different learning rules
• Learning may be Supervised or Unsupervised
• Supervised learning requires that when the input stimuli are applied, the desired output is known a priori
[Diagram labels: Competitive, Correlation, ……………., ………………..]

Paradigms
Examples of Paradigms
• Adaptive Resonance Theory (ART) (………………………………)
• Modular Neural Networks MNN (……………………………………)
• Learning Vector Quantization (……………………)
Modular Neural Network Paradigm successfully applied to Spiral Inductor Modeling
[Diagram labels: Backpropagation, Modular Architecture]

Course Objective
To understand, successfully apply and evaluate neural network paradigms for problems in science, engineering and business

Supervised Learning
Will begin with supervised learning. Procedure is:
• Select Neural Net …………… & …………………..
• Present………. to neural net
• Supply ……………………
• Train neural net to learn ………………………………
• Test or verify that network has learned and can ……………. well
• ………………….. network

Biological Neural Net Example: Human Eye
[Figure labels: retina, fovea, rods, cones, lens, optic nerve]
• Eye is ………………. of brain – powerful bio-electrochemical computer
• Light enters through …… and focuses on …….. (similar to photographic film)
• Retina is a dense matrix of photoreceptors – ……………………………….
• Rods – form ……… images in dim light, 100X more sensitive than cones
• Cones handle …………….., 4X faster than rods in response to light
• Rods and cones convert light to electric signals; total 130 million, 6% cones
• Highest concentration of cones is in ……….., 1.5mm diameter, 2000 cones
• Retinal neurons arranged in layers receive electric signals via synapses
• Pre-processing of image takes place at retinal level in neuron network
• Signals arrive at ……………. (1 million), axons of which form ………….
• Optic nerve fibers terminate in the lateral geniculate nucleus (LGN) in the brain

Visual Pathways
[Figure: layers of retinal neurons (rods, cones); optic nerves from left & right eye to right & left LGN]

Questions on Biological Example
• Q1: What is the basic building block of the nervous system?
• Q2: What are the basic parts of a neuron?
• Q3: If cones were absent from the retina, how would color pictures be perceived?
• Q4: How can the eye be adjusted to large differences in light intensity? (e.g. sun to star is 10 billion range)

Part III: Classification Lab

Botanical Application Example: Iris Flower Classification
• 3 species of Iris – SETOSA, VERSICOLOR, VIRGINICA
• Each flower has parts called PETALS & SEPALS
• Length and width of sepal & petal can be used to determine iris type
• Data collected on a large number of iris flowers
• For example, in one flower petal length=6.7mm and width=4.3mm, also sepal length=22.4mm & sepal width=62.4mm. Iris type was SETOSA
• Neural net will be trained to determine species of iris for a given set of petal and sepal width and length

Iris training and testing data:

Sepal Length   Sepal Width   Petal Length   Petal Width   Iris Class
0.224          0.624         0.067          0.043         Setosa
0.749          0.502         0.627          0.541         Versicolor
0.557          0.541         0.847          1.000         Virginica
0.110          0.502         0.051          0.043         Setosa
0.722          0.459         0.663          0.584         Versicolor
0.776          0.416         0.831          0.831         Virginica
0.196          0.667         0.067          0.043         Setosa
0.612          0.333         0.612          0.584         Versicolor
0.612          0.416         0.812          0.875         Virginica
0.055          0.584         0.067          0.082         Setosa
0.557          0.541         0.627          0.624         Versicolor
0.165          0.208         0.592          0.667         Virginica
0.027          0.376         0.067          0.043         Setosa
0.639          0.376         0.612          0.498         Versicolor
0.667          0.208         0.812          0.710         Virginica
0.306          0.710         0.086          0.043         Setosa
0.196          0.000         0.424          0.376         Versicolor
0.612          0.502         0.694          0.792         Virginica
0.137          0.416         0.067          0.000         Setosa

Iris Flower Classification
• Since output is non-numeric, will use a 3-bit binary code to specify output
• 1 0 0 represents SETOSA
• 0 1 0 represents VERSICOLOR
• 0 0 1 represents VIRGINICA
• Columns 1-4 represent sepal length, sepal width, petal length and petal width in mm × 0.01
• Sample data below

Sepal L   Sepal W   Petal L   Petal W   Output code
0.224     0.624     0.067     0.043     1 0 0
0.749     0.502     0.627     0.541     0 1 0
0.557     0.541     0.847     1         0 0 1
0.11      0.502     0.051     0.043     1 0 0
0.722     0.459     0.663     0.584     0 1 0
0.776     0.416     0.831     0.831     0 0 1
0.196     0.667     0.067     0.043     1 0 0
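
The 3-bit output coding and the seven sample rows can be tried out with a short Python sketch. This is only an illustration, not the lab's prescribed method: it assumes a single-layer network with softmax outputs trained by plain gradient descent, and the names `encode`, `decode`, `forward`, `mean_loss` and `train`, the learning rate and the epoch count are all hypothetical choices made for this example.

```python
import math

# 3-bit output codes from the slide: one position per species.
CLASSES = ["SETOSA", "VERSICOLOR", "VIRGINICA"]


def encode(species):
    """Return the 3-bit code for a species name, e.g. SETOSA -> [1, 0, 0]."""
    return [1 if name == species.upper() else 0 for name in CLASSES]


def decode(outputs):
    """Pick the species whose output node has the largest value."""
    return CLASSES[outputs.index(max(outputs))]


# The seven sample rows: sepal L, sepal W, petal L, petal W (scaled values).
SAMPLES = [
    ([0.224, 0.624, 0.067, 0.043], "SETOSA"),
    ([0.749, 0.502, 0.627, 0.541], "VERSICOLOR"),
    ([0.557, 0.541, 0.847, 1.000], "VIRGINICA"),
    ([0.110, 0.502, 0.051, 0.043], "SETOSA"),
    ([0.722, 0.459, 0.663, 0.584], "VERSICOLOR"),
    ([0.776, 0.416, 0.831, 0.831], "VIRGINICA"),
    ([0.196, 0.667, 0.067, 0.043], "SETOSA"),
]


def softmax(z):
    """Turn raw node sums into values that are positive and add up to 1."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(weights, x):
    """weights[k] holds 4 input weights plus a bias for output node k."""
    nets = [sum(w * xi for w, xi in zip(wk, x)) + wk[-1] for wk in weights]
    return softmax(nets)


def mean_loss(weights, data):
    """Average cross-entropy between network outputs and the 3-bit targets."""
    total = 0.0
    for x, label in data:
        total -= math.log(forward(weights, x)[CLASSES.index(label)])
    return total / len(data)


def train(data, epochs=500, rate=0.5):
    """Supervised training: present inputs, supply targets, adjust weights."""
    weights = [[0.0] * 5 for _ in CLASSES]
    for _ in range(epochs):
        grads = [[0.0] * 5 for _ in CLASSES]
        for x, label in data:
            out = forward(weights, x)
            target = encode(label)
            for k in range(len(CLASSES)):
                err = out[k] - target[k]          # output error at node k
                for i, xi in enumerate(x + [1.0]):  # 1.0 feeds the bias
                    grads[k][i] += err * xi / len(data)
        for k in range(len(CLASSES)):
            for i in range(5):
                weights[k][i] -= rate * grads[k][i]
    return weights


if __name__ == "__main__":
    w = train(SAMPLES)
    for x, label in SAMPLES:
        print(label, "->", decode(forward(w, x)))
```

Decoding by the largest output node means the network never has to produce exact 0s and 1s; it only has to make the correct node win. With so few rows this sketch trains and tests on the same data, so a real lab run would hold back part of the full table for testing.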