SWARM INTELLIGENCE IN DATA MINING
Written by Crina Grosan, Ajith Abraham & Monica Chis
Presented by Megan Rose Bryant

INTRODUCTION
We will cover the following:
Biological motivation of some theoretical concepts of swarm intelligence
Particle Swarm Optimization
Ant Colony Optimization
Basic Data Mining Terminologies
Implementation with Swarm Intelligence
[Image: Insect Swarm]

BIOLOGICAL COLLECTIVE BEHAVIOR
Swarm behavior can be seen in bird flocks, fish schools, and insects.
Group behaviors of some organisms are so integrated that they appear to move as a coherent entity.
These behaviors are the inspiration behind swarm optimization.
[Image: Swarm ("School") of Fish]

MAIN PRINCIPLES OF COLLECTIVE BEHAVIOR
Homogeneity: every organism in the swarm follows the same behavior model; there is no leader.
Locality: motion is influenced only by the nearest members.
Collision Avoidance: each member avoids colliding with nearby members.
Velocity Matching: each member attempts to match the velocity of nearby members.
Flock Centering: each member attempts to stay close to nearby members.

TYPES OF COLLECTIVE DYNAMICAL BEHAVIOR
Swarm: an aggregate with cohesion, but a low level of parallel alignment among members.
Torus: individuals perpetually rotate around an empty core. The direction of rotation is random.
Dynamic Parallel Group: individuals are polarized and move as a coherent group, but group form and density fluctuate.
Highly Parallel Group: much more static in terms of exchange of spatial positions. Variation in form and density is minimal.

SWARMS AND ARTIFICIAL LIFE
Collective behavior algorithms have been applied to a variety of well-known problems, including:
Traveling Salesman Problem
Quadratic Assignment Problem
Graph Problems
Clustering
Data mining
etc.
[Image: TSP Point Set of Argentina]

PARTICLE SWARM OPTIMIZATION (PSO)
PSO is a population-based search algorithm.
It is initialized with a population of random solutions (called 'particles').
Each particle has an associated velocity.
Particles fly through the search space with velocities that are dynamically adjusted according to their historical behavior.
Particles fly towards better and better search areas over time.
[Image: Swarm of Starlings]

BIOLOGICAL INTUITION FOR PSO
Imagine the following: you are a bird in a flock that is searching for a single French fry in a McDonald's parking lot (the search area).
You don't know where the fry is, but you do know how far away the food is and the position of all flock members.
What is the best strategy to find the French fry?
An effective strategy is to follow the bird closest to the food.
[Image: Flock of Birds]

PSO ALGORITHM
PSO learns from this scenario and uses it to solve optimization problems.
Each bird is a particle (a single solution).
Each bird has a fitness value and a velocity.
Birds fly through the search space by following the bird nearest the food thus far.
[Image: Political Cartoon]

ANT COLONY OPTIMIZATION
Now imagine that you are an ant in a colony of ants.
When searching for food, you begin by searching the area closest to the nest in a random fashion.
As you go, you leave behind a pheromone trail to tell your ant friends what you have found.
When you find food, you use these pheromones to let everyone know how much there is and its quality.
[Image: Pheromone Trail]

DATA MINING
Data mining is the application of specific algorithms for extracting patterns from data.
Historically, this application has been given many names, including knowledge extraction, information discovery, and data pattern processing.
Swarm optimization can be very helpful in this process of knowledge discovery.

STEPS OF KNOWLEDGE DISCOVERY
1. Developing an understanding of the domain, prior knowledge, and the goal.
2. Creating a target data set.
3. Data cleaning and preprocessing.
4. Data reduction and projection.
5. Matching the goals with a particular data mining method.
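
Looking back at the PSO ALGORITHM slide, the loop it describes can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `pso` and `sphere`, the toy objective, the inertia weight `w`, the coefficients `c1` and `c2`, and the search bounds are all assumptions chosen for the example. Each particle keeps its own best position so far (its "historical behavior") and is also pulled towards the best position found by any particle (the "bird nearest the food thus far").

```python
import random


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, n_particles=20, iters=100, bounds=(-5.0, 5.0),
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Random initial solutions ("particles"), each with an associated velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Best position each particle has visited, and the best over the whole swarm.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # New velocity = inertia + pull to own best + pull to swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp the coordinate to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Calling `pso(sphere)` returns the best position found and its objective value. The fixed `seed` only makes runs repeatable; in practice the parameter values would need tuning for the problem at hand.
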
[Image: Success requires continual growth]

SWARM INTELLIGENCE AND KNOWLEDGE DISCOVERY
Swarm optimization can be combined with data mining methods, and the combination often leads to very good results.
PSO methods have been used successfully in pattern recognition, image processing, unsupervised classification, and image segmentation.

PSO IN DATA MINING
Particle Swarm Optimization has been used in several data mining algorithms, including the following:
Visual Data Mining
Recommender Systems
Classification Tasks
etc.
PSO can often be employed when other approaches would be too large or too costly.
[Image: Recommender System]

ANT COLONY OPTIMIZATION AND DATA MINING
Ant Colony Optimization has been used with great success in clustering.
Modeled after real ant behaviors, the computer ants 'pick up' data items and move them to other areas with similar data.
Several ant species have been studied to model different behaviors.
[Image: Cluster Analysis]

CONCLUSIONS
Biological behaviors can inform efficient optimization techniques such as Particle Swarm Optimization and Ant Colony Optimization.
These optimization techniques have a variety of applications in data mining.
They can often be employed when other techniques prove too costly.
[Image: Zerg Rush, Starcraft II]
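
As a closing illustration, the ant-based clustering described above (computer ants that pick up data and carry it to areas with similar data) can be sketched in Python. This is a simplified sketch under stated assumptions, not the method from the paper: the names `ant_cluster` and `local_similarity`, the toroidal grid, the one-dimensional numeric items, and the constants `k1`, `k2`, and `alpha` are all hypothetical choices. The pick-up and drop probabilities use a form that is common in the ant-clustering literature: an ant is likely to pick up an item that differs from its neighbors and likely to drop a carried item next to similar ones.

```python
import random


def local_similarity(grid, size, x, y, value, alpha=0.5):
    """Average similarity of `value` to items in the 8 cells around (x, y)."""
    total = 0.0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            other = grid.get(((x + dx) % size, (y + dy) % size))
            if other is not None:
                # Similar neighbors add up to 1; very different ones subtract.
                total += 1.0 - abs(value - other) / alpha
    return max(0.0, total / 8.0)


def ant_cluster(values, size=10, n_ants=5, steps=20000, k1=0.1, k2=0.15, seed=0):
    """Scatter numeric items on a wrap-around grid and let ants rearrange them."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    # Each item starts in its own random cell: cell -> item value.
    grid = dict(zip(rng.sample(cells, len(values)), values))
    ants = [{"pos": rng.choice(cells), "load": None} for _ in range(n_ants)]

    for _ in range(steps):
        ant = rng.choice(ants)
        x, y = ant["pos"]
        here = grid.get((x, y))
        if ant["load"] is None and here is not None:
            # Unladen ant on an item: pick it up if it fits poorly here.
            f = local_similarity(grid, size, x, y, here)
            if rng.random() < (k1 / (k1 + f)) ** 2:
                ant["load"] = grid.pop((x, y))
        elif ant["load"] is not None and here is None:
            # Laden ant on an empty cell: drop the item if it fits well here.
            f = local_similarity(grid, size, x, y, ant["load"])
            if rng.random() < (f / (k2 + f)) ** 2:
                grid[(x, y)] = ant["load"]
                ant["load"] = None
        # Random step to one of the four neighboring cells (grid wraps around).
        dx, dy = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
        ant["pos"] = ((x + dx) % size, (y + dy) % size)

    # Items still being carried when the loop ends are returned separately.
    return grid, [a["load"] for a in ants if a["load"] is not None]
```

For example, `grid, carried = ant_cluster([0.0, 0.05, 0.1, 0.9, 0.95, 1.0])` returns the final item positions plus any items still in transit. No item is ever created or destroyed, so the two together always contain exactly the input values. How tightly the items end up grouped depends on the constants and the number of steps.
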