Monday March 22, 2004

DSES 6180-01 DATA MINING AND KNOWLEDGE DISCOVERY

Instructor: Prof. Mark J. Embrechts (x 4009 or 371-4562)
Office hrs: CII 5217, Thursday 10:00-11:00
Class Time: Monday/Thursday 4-5:20 pm (Jonsson-Rowland Science Center 2C30)
Book: Margaret H. Dunham, Data Mining: Introductory and Advanced Topics, Prentice Hall 2003.

LECTURES #22&23: METANEURAL HANDS-ON

About 80 percent of artificial neural network applications are based on the backpropagation algorithm. This lecture will review backpropagation (history and algorithm) and illustrate the use of the MetaNeural software to solve some practical neural network problems. We will discuss a fully connected feedforward (perceptron) type of neural network trained with the backpropagation algorithm.

Neural network applications addressed in the literature include pattern recognition, classification, clustering, visualization, (process) control and neuro-control, game playing, diagnostics (e.g., car industry, medical, manufacturing), forecasting, prediction, regression models, optimization (e.g., finance), hardware and embedded control implementation, code simulation, and in-silico drug design.

The discussion of practical implementations of neural networks will cover the bias, the number of hidden layers, the number of neurons, transfer functions, training parameters, momentum, data preparation and preprocessing, other cost functions, training strategies, pruning and growing of networks, when to stop training, the curse of dimensionality, bootstrapping, …

Handouts:
1. Drew Van Camp, “Neurons for Computers,” Scientific American, pp. 170-172 (September 1992).
2. Geoffrey E. Hinton, “How Neural Networks Learn from Experience,” Scientific American, pp. 145-151 (September 1992).

Homework #4 (due April 19)
Experiment with the iris data, then analyze and show your predictions for iris.tes. The homework should be in report style and rich in graphics.
The purpose of the homework is to show that you are competent in applying a neural network to make predictions on a dataset. You may additionally run other datasets or use different neural network software.

Deadlines:
January 22      HW #0 (Web browsing)
January 29      Project Proposal
February 16     HW #1
March 1         Quiz #1 on PLS paper by Svante Wold et al.
March 4         HW #2
March 18        HW #3
March 8&11      Spring Break
March 15        Progress Report
April 8         Guest lecture
April 19        HW #4
April 22/26     Final Presentations
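
The lecture description in this handout mentions a fully connected feedforward network trained with backpropagation, with momentum among the training parameters. The following is a minimal sketch of that idea in plain Python, not the MetaNeural software and not necessarily the variant presented in class. All names (sigmoid, forward, train, mean_squared_error, AND_DATA) and all settings (two hidden neurons, learning rate 0.5, momentum 0.5, 2000 epochs, a toy logical-AND dataset) are assumptions chosen only for illustration.

```python
import math
import random

# Toy dataset (an assumption for illustration): logical AND, as (inputs, target).
AND_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]


def sigmoid(z):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hid, w_out):
    """Return (hidden activations, network output) for one input vector."""
    xb = x + [1.0]  # append a constant 1.0 so the last weight acts as the bias
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
    hb = hidden + [1.0]  # bias unit feeding the output neuron
    output = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
    return hidden, output


def mean_squared_error(data, w_hid, w_out):
    """Average squared difference between targets and network outputs."""
    return sum((t - forward(x, w_hid, w_out)[1]) ** 2 for x, t in data) / len(data)


def train(data, n_hidden=2, rate=0.5, momentum=0.5, epochs=2000, seed=0):
    """Pattern-by-pattern backpropagation with a momentum term."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Small random initial weights; each row holds n_in weights plus one bias.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    # Previous weight changes, needed for the momentum term.
    dw_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    dw_out = [0.0] * (n_hidden + 1)
    initial_error = mean_squared_error(data, w_hid, w_out)

    for _ in range(epochs):
        for x, t in data:
            hidden, y = forward(x, w_hid, w_out)
            # Output delta: error times the sigmoid derivative y * (1 - y).
            delta_out = (t - y) * y * (1.0 - y)
            # Hidden deltas: output delta propagated back through the
            # (not yet updated) output weights.
            delta_hid = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                         for j in range(n_hidden)]
            hb = hidden + [1.0]
            for j in range(n_hidden + 1):
                dw_out[j] = rate * delta_out * hb[j] + momentum * dw_out[j]
                w_out[j] += dw_out[j]
            xb = x + [1.0]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw_hid[j][i] = rate * delta_hid[j] * xb[i] + momentum * dw_hid[j][i]
                    w_hid[j][i] += dw_hid[j][i]
    return w_hid, w_out, initial_error


if __name__ == "__main__":
    w_hid, w_out, initial_error = train(AND_DATA)
    print("error before training:", initial_error)
    print("error after training: ", mean_squared_error(AND_DATA, w_hid, w_out))
    for x, t in AND_DATA:
        print(x, "target", t, "output", forward(x, w_hid, w_out)[1])
```

The sketch has a single output neuron and no separate test set. Applying the same idea to the iris data would need four inputs, an output coding for the three classes, and a held-out file for predictions, none of which is implemented here.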