ARTIFICIAL INTELLIGENCE

Artificial intelligence is concerned with how to make computers do things at which, at the moment, people are better. It is the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings. Artificial intelligence involves developing systems that are able to carry out intellectual processes characteristic of humans, such as the ability to reason, discover meaning, generalize, or learn from past experience.

Categories of AI

Narrow AI: This is when a machine has superior performance to a human when doing one specific task.
General AI: This is when a machine is similar in its performance to a human in any intellectual task.
Strong AI: This is when a machine has superior performance to a human in many tasks.

Purpose and Structure of graphs

A graph is a collection of nodes (vertices) between which there can be edges. Each node has a name. An edge can have an associated label, which is a numerical value. An example is presented below.

A graph can be used to represent a variety of scenarios. One common representation is that the nodes represent places and the edge labels represent the distances between those places. Edges are only included in the graph when there is a route available for direct travel between the pair of nodes. Such graphs can be used, for example, to find the shortest route between two nodes that are not adjacent to each other.

• When two nodes are connected by an edge, they are called neighbours.
• The degree of a node is the number of other nodes that it is connected to (i.e. the number of neighbours that it has).
• A loop is an edge that connects a node to itself.
• A path is a sequence of nodes that are connected by edges.
• A cycle is a closed path, i.e. a path that starts and ends at the same node (and no other node is visited more than once).

A graph can be used to represent a wide variety of complex data relationships.
For example:

• Social networking: the nodes could be individual people. An edge could represent that two people are acquaintances.
• Transport networks: the nodes could be towns. An edge could represent that a train line connects two towns.
• The internet: the nodes could be routers. An edge could represent that two routers are connected.
• The World Wide Web: the nodes could be webpages. An edge could represent that the pages are linked.

TYPES OF GRAPHS

❖ UNDIRECTED GRAPHS

An undirected graph allows you to move (traverse) in either direction between nodes. Below is a diagram of an undirected graph. The edges are simple lines, not arrows.

The graph has 7 nodes and 11 edges. Dunwich is a neighbour of both Blaxhall and Harwich. Clacton has 3 edges directly connecting it to 3 other vertices; it therefore has a degree of 3. There are several paths from Dunwich to Clacton. Examples are:

• Dunwich → Harwich → Clacton
• Dunwich → Blaxhall → Harwich → Clacton

Dunwich → Blaxhall → Harwich → Dunwich is a cycle.

❖ DIRECTED GRAPHS (DIGRAPHS)

Here, the edges have direction, which means that you can only move between nodes in the specified direction. In diagrams, arrows are used (instead of lines) to represent the edges. If an edge is bidirectional, two arrows are used (although you may sometimes see versions with double-headed arrows). In the road map example, most roads will be bidirectional; however, there will also be one-way streets.

❖ WEIGHTED / LABELLED GRAPHS

This is a graph that has values (weights) associated with its edges. Weighted graphs can be either directed or undirected. Weights can be used to record information relating to the edges. For example, in a mapping application, you may want to record the distance between towns, or the time needed to travel from one town to another. Weights enable you to find the shortest path between nodes.
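The ideas of neighbours, degree, and path cost in a weighted, undirected graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names EDGES, build_graph, degree, path_cost and simple_paths are hypothetical, and the edge weights are assumed values chosen to agree with the route arithmetic used in these notes (for example A to B to C to D = 90). They are not read from the diagrams, which may contain further edges.

```python
# A weighted, undirected graph stored as {node: {neighbour: weight}}.
# The edge list is an assumption made for illustration only.
EDGES = [
    ("A", "B", 40), ("A", "F", 60), ("B", "C", 10), ("B", "F", 15),
    ("C", "D", 40), ("F", "E", 20), ("E", "D", 5),
]


def build_graph(edges):
    """Build an adjacency dictionary; each edge is stored in both directions."""
    graph = {}
    for u, v, weight in edges:
        graph.setdefault(u, {})[v] = weight
        graph.setdefault(v, {})[u] = weight
    return graph


def degree(graph, node):
    """Number of neighbours of a node."""
    return len(graph[node])


def path_cost(graph, path):
    """Add up the weights of the edges along a path given as a list of nodes."""
    total = 0
    for u, v in zip(path, path[1:]):
        if v not in graph[u]:
            raise ValueError(f"no edge between {u} and {v}")
        total += graph[u][v]
    return total


def simple_paths(graph, start, goal, path=None):
    """Yield every path from start to goal that visits no node twice."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for neighbour in graph[start]:
        if neighbour not in path:
            yield from simple_paths(graph, neighbour, goal, path)


GRAPH = build_graph(EDGES)

if __name__ == "__main__":
    # Brute force: list every route from A to D and keep the cheapest one.
    best = min(simple_paths(GRAPH, "A", "D"), key=lambda p: path_cost(GRAPH, p))
    print(best, path_cost(GRAPH, best))
```

Listing every route like this is practical only for very small graphs, because the number of possible routes grows very quickly as nodes are added.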
[Figures: weighted, undirected graph; weighted digraph]

We could use our intelligence to find the shortest route (using the first graph above) between node A and node D by considering all of the possible routes and calculating the overall distance for each route.

For A to B to C to D, overall distance is 40 + 10 + 40 = 90.
For A to B to F to E to D, overall distance is 40 + 15 + 20 + 5 = 80, which is the shortest.
For A to F to E to D, overall distance is 60 + 20 + 5 = 85.
For A to F to B to C to D, overall distance is 60 + 15 + 10 + 40 = 125.

For a graph containing 100 nodes, this could be quite time consuming. Fortunately, a number of artificial intelligence algorithms have been developed to solve this type of problem.

DIJKSTRA’S ALGORITHM

Dijkstra’s algorithm has one motivation: to find the shortest paths from a start node to all other nodes on the graph.

• The cost of a path that connects two nodes is calculated by adding the weights of all the edges that belong to the path.
• The shortest path is the sequence of nodes, in the order they are visited, which results in the minimum cost to travel between the start and end node.

When the algorithm has finished running, it produces a list that holds the following information for each node:

• The node label
• The cost of the shortest path to that node (from the start node)
• The label of the previous node in the path

Using the information in this list, you can backtrack through the previous nodes to the start node. This gives you the shortest path (sequence of visited nodes) from the start node to each node, and the cost of each path.

For example:

1) The graph above illustrates the connections between five nodes. Using Dijkstra’s algorithm, find the shortest path from the start node A to all other nodes.

2) Given the following set of information produced by running Dijkstra’s algorithm, in the form of a visited list, what is the route of the shortest path from A to G?
Visited list

Node    Cost (from start)    Previous
A       0                    none
C       3                    A
F       9                    C
E       10                   F
B       12                   A
D       13                   E
G       15                   E

Assignment:
Carry out research and make short notes on the following:
a) Writing Dijkstra’s algorithm using structured English
b) Writing Dijkstra’s algorithm using code
c) Limitations of Dijkstra’s algorithm

A* ALGORITHM

This is a searching algorithm that searches for the shortest path between the initial and final state. The A* algorithm has 3 parameters:

- (g): The cost of moving from the initial cell to the current cell. Basically, it is the sum of the costs of all the cells that have been visited since leaving the first cell.
- (h): The heuristic value. It is the estimated cost of moving from the current cell to the final cell. The actual cost cannot be calculated until the final cell is reached, hence h is an estimate. NB: We must make sure that there is never an overestimation of the cost.
- (f): The value used to find the least cost from one node to another, where f = g + h. It is responsible for finding the optimal path between source and destination.

The algorithm decides where to move next by taking the f value into account: it selects the cell with the smallest f value and moves to that cell. The process continues until the algorithm reaches its goal cell.

Example: Find the shortest path from A to F using the A* algorithm and the chart below.

Assignment:
Carry out research and make short notes on the following:
a) Writing the A* algorithm using structured English
b) Writing the A* algorithm using code
c) Comparing the A* and Dijkstra’s algorithms

MACHINE LEARNING

In machine learning, systems learn without being programmed to learn. The requirements for machine learning can be summarized as follows:

1. The system has a defined task or tasks to perform.
2. Knowledge is acquired through the experience of performing the tasks.
3. As a result of this experience and the knowledge gained, the performance of future tasks improves.
a) SUPERVISED LEARNING

The system is fed knowledge with associated classification. For example, an AI program might be under development for marking exam paper questions. In supervised learning, answers to examination questions could be provided together with a grade for each one, or with categorized comments.

A special case of supervised learning is where an expert system is being developed. An expert system always has a focus on a narrowly defined domain of knowledge. In this case, human experts are given samples of data requiring analysis. The experts provide the conclusions to be drawn from the data. The data and conclusions are input into the knowledge base. The effectiveness of the system can be tested by a human expert providing sample data and checking the accuracy of the conclusions provided by the expert system. If performance is poor, then further data and conclusions are input into the system.

b) UNSUPERVISED LEARNING

The system has to draw its own conclusions from its experience of the results of the tasks it has performed. For this, algorithms are needed that can organize or categorize the knowledge acquired. An example is where conceptual clusters are identified which are based on a hierarchical framework. In this approach, the knowledge is initially all placed in the root of a tree structure. Then, depending on attributes of the knowledge, selected groups are moved into branches of the tree.

Nowadays, unsupervised learning is a dominant activity. Powerful computer systems with access to massive data banks are regularly used to make decisions based on previously recorded actions. We all have our activity on the World Wide Web recorded and stored. This stored data is then used to make decisions about what products or services should be recommended to us in future internet use. Systems are able to identify hidden patterns in the data provided. They are not trained using the ‘right’ answer.
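As a rough illustration of finding hidden patterns without being given the ‘right’ answer, the sketch below groups unlabelled numbers into clusters. It uses k-means clustering as a simple stand-in; it is not the hierarchical conceptual clustering described above. The function name k_means_1d, the sample values, and the starting centres are all hypothetical.

```python
def k_means_1d(values, centres, iterations=10):
    """Group numbers around centres; no labels or 'right' answers are supplied."""
    centres = list(centres)
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        # Step 1: assign each value to its nearest centre.
        clusters = [[] for _ in centres]
        for value in values:
            nearest = min(range(len(centres)), key=lambda i: abs(value - centres[i]))
            clusters[nearest].append(value)
        # Step 2: move each centre to the mean of the values assigned to it
        # (a centre with no values stays where it is).
        centres = [
            sum(group) / len(group) if group else centres[i]
            for i, group in enumerate(clusters)
        ]
    return centres, clusters


if __name__ == "__main__":
    # Hypothetical data: minutes spent on a website by six users.
    minutes = [1, 2, 3, 10, 11, 12]
    centres, clusters = k_means_1d(minutes, centres=[0, 5])
    print(centres, clusters)
```

The program is never told which users are ‘short visit’ or ‘long visit’; the two groups emerge from the data alone, which is the defining feature of unsupervised learning.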
c) REINFORCEMENT LEARNING

This has some features similar to supervised and unsupervised learning. The system is not trained. It learns on the basis of ‘reward and punishment’ when carrying out an action. That is, its algorithms use trial and error to determine which action gives the best outcome (the highest reward). Examples include search engines, online games and robotics.

For example:

• An agent is learning how best to perform in an environment.
• The environment has many defined states.
• At each step, the agent takes an action.
• The agent has a policy that guides its actions.
• The policy is influenced by the recorded history and the knowledge of the current state of the environment.
• An action changes the environment to a new state.
• The agent receives a reward that is a measure of how effective the action was in relation to the achievement of the overall goal.
• The policy will guide the agent in deciding whether the next action should exploit knowledge already gained or explore a new avenue.

ARTIFICIAL NEURAL NETWORKS

The neural networks in our brains provide our intelligence. It therefore seems natural that artificial neural networks should be considered as a foundation for AI systems.

An artificial neural network can be created in software or hardware. The triangles are the nodes in the network, which represent artificial neurons. In general, a node can receive one or more inputs and can provide an output to one or more of the other nodes.

Layers of artificial neural networks

- The Input layer: the data’s entry point into the system
- The Hidden layer: where the information gets processed
- The Output layer: where the system decides how to proceed based on the data

The column of three nodes on the left receives input. The column on the right provides output. The nodes in between form a hidden layer. Some artificial neural networks contain several hidden layers.
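The layer structure above can be sketched as a signal passing from an input layer of three nodes, through one hidden layer, to an output layer. This is a minimal sketch, not a definitive implementation: the weights, biases, and the choice of the sigmoid function are assumptions made for illustration, and the names sigmoid, layer_output, forward, HIDDEN and OUTPUT are hypothetical.

```python
import math


def sigmoid(x):
    """A common nonlinear function that squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, biases):
    """Compute the outputs of one layer.

    Each row of `weights` holds the incoming weights of one node in the layer.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the input signal through every layer in turn."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_output(signal, weights, biases)
    return signal


# Assumed example values: 3 input nodes -> 2 hidden nodes -> 1 output node.
HIDDEN = ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1])
OUTPUT = ([[1.0, -1.0]], [0.0])

if __name__ == "__main__":
    print(forward([1.0, 0.0, 1.0], [HIDDEN, OUTPUT]))
```

Adding further (weights, biases) pairs to the list passed to forward gives a network with several hidden layers.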
A neural network functions via a collection of connected units, or nodes, called artificial neurons. These nodes loosely model the network of neurons in the animal brain. An artificial neuron receives a signal in the form of a stimulus, processes it, and signals other neurons connected to it.

The workings of an artificial neural network

An artificial neuron receives a stimulus in the form of a signal that is a real number. Then:

- The output of each neuron is computed by a nonlinear function of the sum of its inputs.
- The connections among the neurons are called edges.
- Both neurons and edges have a weight. This parameter is adjusted as learning proceeds.
- The weight increases or decreases the strength of the signal at the connection.
- Neurons may have a threshold. A signal is sent onward only if the aggregate signal crosses this threshold.

DEEP LEARNING

Here, machines process data in a way that loosely resembles the human brain. They handle huge amounts of data using artificial neural networks.

Backpropagation in neural networks

This is the central mechanism by which artificial neural networks learn. It is the messenger telling the neural network whether or not it made a mistake when it made a prediction.

To propagate is to transmit something (light, sound, motion or information) in a particular direction or through a particular medium. To backpropagate is to transmit something in response, to send information back upstream, in this case with the purpose of correcting an error.

With backpropagation in deep learning, we deal with the transmission of information, and that information relates to the error produced by the neural network when it makes a guess about data. Backpropagation is synonymous with correction.

Algorithms experience the world through data. So by training a neural network on a relevant dataset, we seek to decrease its ignorance.
The knowledge of a neural network with regard to the world is captured by its weights: the parameters that alter input data as its signal flows through the neural network towards the net’s final layer, which makes a decision about that input.

Those decisions are often wrong at first, because the parameters transforming the signal into a decision are poorly calibrated; they have not learned enough yet. Forward propagation is when a data instance sends its signal through a network’s parameters towards the prediction at the end. Once the prediction is made, its distance from the ground truth (the error) can be measured.

So, the parameters of the neural network have a relationship with the error the net produces. When the parameters change, the error changes too. We change the parameters using optimization algorithms such as gradient descent.

Assignment:
Make short notes on the uses of error backpropagation.
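The relationship between a parameter and the error described in this section can be sketched with gradient descent on the smallest possible model: one weight, with prediction = weight × input. This is a sketch under stated assumptions, not full backpropagation through several layers; here the backward step is a single derivative. The function name train, the learning rate, and the sample data (where each target is twice its input) are hypothetical.

```python
def train(samples, weight=0.0, learning_rate=0.1, epochs=50):
    """Adjust one weight by gradient descent; return it with the error per epoch."""
    history = []
    for _ in range(epochs):
        total_error = 0.0
        for x, target in samples:
            prediction = weight * x              # forward propagation
            error = prediction - target          # distance from the ground truth
            total_error += error ** 2
            gradient = 2 * error * x             # derivative of error**2 w.r.t. weight
            weight -= learning_rate * gradient   # step against the gradient
        history.append(total_error)
    return weight, history


if __name__ == "__main__":
    # Assumed training data: every target is exactly 2 x input.
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    weight, history = train(data)
    print(weight, history[0], history[-1])
```

Each pass measures how wrong the prediction was and sends that information back to correct the weight, so the recorded error shrinks as the weight approaches 2. In a real network the same correction is passed backwards through every layer.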