An introduction to Machine Learning: Applications and value

Wahidullah Rahmani, student number: 216422
Pawan K.C, student number: 216368

Table of contents
1 Introduction
2 Literature search
3 What is machine learning?
3.1 Supervised learning, unsupervised learning and semi-supervised learning
3.2 Deep Learning
4 Applications
4.1 Computer vision
4.2 Natural language processing
4.3 Anomaly detection and Predictive modelling
5 Conclusion
6 References

1 Introduction

Machine learning (ML) is a fascinating subfield of artificial intelligence that allows computers to learn from data.
By running computer algorithms over large data sets and building a model from the results, ML makes it possible to solve complex human tasks with increased efficiency and accuracy. As the field develops, machine learning shows potential to transform a wide range of sectors, and businesses have already reaped the benefit by using the technology to increase departmental efficiency and productivity. The Data Science and Machine Learning Market Study reported that in 2019, 40% of marketing and sales teams considered AI and machine learning critical to their success as a department. Furthermore, the study finds that R&D, Business Intelligence Competency Centres (BICC) and executive management all score above 60% on a measure of importance level [2]. Figure 1 below shows the importance level of ML and AI by department:

Figure 1: Graph of importance measurement of ML in departments

With the explosion of available data (big data) and the increase in computational power in recent years, the field of machine learning has seen exciting new advances. In areas such as healthcare, education, transport and maintenance, machine learning has already improved the effectiveness of products and services [2]. For example, ML is integrated in image recognition systems that support doctors with medical images, in recommender systems used by online retailers to improve consumer satisfaction, and in the development of autonomous vehicles in transport. Furthermore, machine learning can add new insights and value to scientific research in fields such as neuroscience and particle physics [3].
Research on the impact of ML on the economy and employment in the UK shows that [4]:
- 35% of jobs could, with a 66% chance, be automated over the next decade
- 15 million jobs could be automated over the next decade
- Technically, it is possible to automate over 70% of the component tasks for 10% of jobs

In any case, the applications of ML are vast and already seen in many different areas, highlighting the potential and importance of the technology. In this paper I will define machine learning and discuss its current applications and value. The paper will first discuss what machine learning is and how it works, thereafter discuss its applications in three different subfields of ML and briefly discuss the value it creates in research and business. The last chapter will conclude with comments and future work.

2 Literature search

In this chapter I will clarify how I conducted the literature review for this paper. The objective of the study is to analytically evaluate and investigate the definition of machine learning and the application of ML in computer vision, natural language processing and anomaly detection. The database Google Scholar was used for the search. The keywords used were: Machine learning introduction, Machine learning applications, Machine learning in computer vision, Machine learning in natural language processing and Anomaly detection. This resulted in a large set of articles, with a mean of approximately 4,540,000 results per keyword. After examining the content of around 50 articles, excluding citations, the number was narrowed down to 20 articles. These articles were chosen mainly for their introductory content, which is relevant to this paper as it covers the basics of machine learning, and as a result of a further advanced search. The advanced search added the words ‘application’, ‘business’ and ‘impact’ to narrow down the search, as this paper mainly discusses the applications of each subfield of machine learning.
In the next chapter I will define a few machine learning concepts, outline the steps used to build a machine learning application (model) and explain the current methods that exist.

3 What is machine learning?

Machine learning, a branch of artificial intelligence, provides a system with the ability to learn from data and improve from experience without being explicitly programmed. A machine learning algorithm, which in data science is defined as a sequence of statistical processing steps, is trained on a data set to find patterns and make predictions. The accuracy of the predictions and decisions depends on the quality of the resulting model. Below I will describe the process of building a machine learning model that utilizes such algorithms. When data scientists build a machine learning model there are typically 4 main steps involved:

1. Collect and prepare data
For the purpose of building the model, relevant data must be collected and prepared before it is ingested to solve a specific problem. In some cases, the data is labelled (tagged data) and therefore tells the algorithm which features and classifications the model needs to identify. In other cases the data is unlabelled, and the model has the task of assigning classifications and extracting features from that data. In either case, the collected data is divided into a ‘training set’, a ‘cross validation set’ and a ‘test set’, each separate from the others:
• The training set is used to train the algorithm to solve the problem it is designed to solve.
• The cross validation set is used to validate and improve the accuracy of the model.
• The test set tests the final trained model’s predictions on new, unseen data.
Before the model is trained to solve a problem, however, the data must be prepared: biases and imbalances are analysed and removed, duplicates are removed (deduplication) if necessary, and the data is randomized or labelled.
This is called ‘pre-processing’ the data, and it is an important first step in model building as it can increase the overall efficiency and accuracy of the model.

2. Choosing a model
The next step is to choose a model, which depends on the type (labelled or unlabelled) and amount of data in the training set, and on the type of problem at hand.

Common types of machine learning models used with labelled data are:
• Regression models: There are two main types of regression models: logistic regression and linear regression. Linear regression is used to predict the value of a dependent variable (output) based on the independent variables (input), whereas logistic regression is used when the dependent variable (output) is binary in nature. An example of linear regression is predicting an engineer’s annual salary (dependent variable) based on years of experience or education (independent variables), whereas an application of logistic regression (most often used in classification problems) could be to classify whether a given type of fruit is present in an image.
• Decision trees: A decision tree is a predictive model that utilizes classified data (the branches) in order to make recommendations (the leaves) based on a set of decision rules. For instance, predicting whether a person is fit or not (a classification problem) based on their eating habits, diet and physical activity would use a classification tree. In this case, the classified data is the description of that person and their habits, and the recommendation of the predictive model is either fit or unfit.
• Instance-based algorithmic models: A machine learning algorithm that generalizes to new examples by measuring their similarity to the training examples. It is called instance-based because it compares new problem instances with instances from the training.
An example of an instance-based model is K-Nearest Neighbours (KNN), a classification model that estimates the likelihood of a data point belonging to one group or another based on its distance to the closest other data points.

Common types of machine learning models used with unlabelled data are:
• Clustering models: A clustering model uses an algorithm that automatically groups data points with similar records into ‘clusters’ and labels them according to the group they belong to. This process is carried out without prior knowledge of the groups’ characteristics. The most common types of clustering algorithms are K-means, Kohonen clustering and TwoStep.
• Association models: The model uses an association algorithm that finds important relations between variables or features in a data set. The algorithm is a rule-based machine learning technique used in data mining (turning raw data into useful information), which applies a measure of interestingness in order to generate new association rules for new information. An application of this is to predict customer preference based on shopping history, where the association rule algorithm improves its performance and generates new rules as more data on the customer is analysed.
• Neural networks: A computational learning system based on a collection of connected units or nodes called artificial neurons, which resembles the biological neural network of a human brain. A neural network has 3 different types of layers: an input layer (data ingestion), at least one hidden layer (calculations performed on the input data) and an output layer, where a probability is assigned to each possible conclusion drawn from the hidden layers. The neural network algorithm does not require specific rules that define what to expect from the input data, and instead learns from many labelled examples of the training set.
By learning, through many processed training examples, the characteristics of the input that are needed to construct a correct output, the neural network model can accurately predict the desired results on new unseen inputs.

Figure 2: Neural network with layers

An example of a neural network application is predicting whether an image contains a cat or not. By processing many training examples consisting of data points (e.g., colour pixel values) that represent a cat, the neural network can learn to recognise a cat and predict whether a cat is present in a new, unseen image. A simplified version of this is shown in Figure 3 below:

Figure 3: Neural network algorithm classifying a cat

3. Training the algorithm to create the model
In this step the model’s algorithm is trained, which is a process where the algorithm’s ability to predict the correct outcome is iteratively improved. With each iteration the algorithm is trained as follows:
• The relevant variables are processed through the model’s algorithm.
• The output is compared with the actual results that it should produce (error calculation).
• The weights (parameters) and possible biases are adjusted so that the accuracy of the predictions might be improved.
• The variables are run through the algorithm again and the above steps are repeated if necessary.
• The trained algorithm with optimal accuracy is used to create the machine learning model.

4. Testing and improving the model
The final step of the machine learning process is to test the trained model with new data (test data). For example, if the model is to identify spam, the test data will be new incoming email messages that must be classified. This final step is necessary as it gives a better approximation of how the model will perform in the real world.

3.1 Supervised learning, unsupervised learning and semi-supervised learning

There are four main methods used in machine learning: supervised learning, unsupervised learning, semi-supervised learning and reinforcement learning.
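Before turning to the individual methods, the training iteration of step 3 and the final test of step 4 can be illustrated with a short example. The sketch below is only a minimal illustration under stated assumptions: it fits a one-variable linear regression by gradient descent in plain Python, and the toy data, the learning rate, the number of iterations and the names `train_linear_model` and `mean_squared_error` are hypothetical choices made for this example, not taken from the cited sources.

```python
def train_linear_model(xs, ys, learning_rate=0.01, iterations=5000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Process the input variables through the current model.
        predictions = [w * x + b for x in xs]
        # Compare the output with the actual results (error calculation).
        errors = [p - y for p, y in zip(predictions, ys)]
        # Adjust the weight and the bias in the direction that reduces the error.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(w, b, xs, ys):
    """Average squared difference between predictions and actual values."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data generated from y = 2x + 1 (assumption for illustration).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]
# Separate test set that the model never sees during training (step 4).
test_x = [5.0, 6.0]
test_y = [11.0, 13.0]

w, b = train_linear_model(train_x, train_y)
test_error = mean_squared_error(w, b, test_x, test_y)
```

On this noise-free toy data the learned weight and bias approach 2 and 1, and the error on the held-out test points becomes very small. With real data the test error would normally stay above zero, which is why a separate test set is needed to estimate real-world performance.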
Supervised learning
This machine learning task uses labelled training data consisting of inputs and desired outputs, and the model learns a function that maps new examples to outputs based on these input-output pairs. The cat example mentioned above is a supervised learning problem, as the model is designed to detect the existence of a cat based on a data set of various labelled cat images. The advantage of supervised learning compared to the other methods is that it requires less training data and makes the training process easier, as the model predictions are based on already labelled results. However, labelled data can be expensive to prepare and the model can be subject to bias or overfitting, where the model is so closely tied to the training data that it fails to handle the variations in new data accurately.

Figure 4: A typical supervised learning algorithm

Unsupervised learning
Unsupervised learning methods take unlabelled (raw) training data and use an algorithm to extract meaningful features that need to be classified, labelled or sorted with minimal human supervision. In the example of detecting spam, an unsupervised learning algorithm can take a huge training set of emails, reveal patterns and features that indicate spam and, as a result, classify new examples more efficiently over time.

Figure 5: Typical unsupervised learning algorithm

Semi-supervised learning
Semi-supervised learning is a middle ground between supervised and unsupervised learning. This method typically uses a labelled data set to guide classification and feature extraction for a larger, unlabelled data set. Alternatively, if there are not enough labelled examples and/or the labelling process is too expensive, an unsupervised learning algorithm can first structure the data and feed its output into a supervised learning algorithm.
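The first approach described above, where a labelled data set guides the labelling of a larger unlabelled one, can be sketched as follows. This is a minimal illustration of the idea (often called self-training or pseudo-labelling), not a definitive implementation: the nearest-centroid classifier, the one-dimensional toy data and all function names are assumptions chosen to keep the example short.

```python
def centroids(points, labels):
    """Return the mean value of the points belonging to each label."""
    sums, counts = {}, {}
    for x, label in zip(points, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centres, x):
    """Assign x to the label whose centroid is closest."""
    return min(centres, key=lambda label: abs(centres[label] - x))


def self_train(labelled_x, labelled_y, unlabelled_x, rounds=3):
    """Pseudo-label the unlabelled data, then refit on all data, several times."""
    # Start from a model trained on the labelled examples only.
    centres = centroids(labelled_x, labelled_y)
    for _ in range(rounds):
        # Use the current model to guess labels for the unlabelled examples.
        pseudo_labels = [predict(centres, x) for x in unlabelled_x]
        # Refit the model on labelled and pseudo-labelled examples together.
        centres = centroids(labelled_x + unlabelled_x, labelled_y + pseudo_labels)
    return centres


# Hypothetical toy data: two labelled examples and six unlabelled ones.
labelled_x = [1.0, 9.0]
labelled_y = ["small", "large"]
unlabelled_x = [0.5, 1.5, 2.0, 8.0, 8.5, 10.0]

centres = self_train(labelled_x, labelled_y, unlabelled_x)
label_for_new_point = predict(centres, 3.0)
```

Here the two labelled points are enough to pseudo-label the six unlabelled points, and the refitted centroids then classify the new value 3.0 as "small". In practice only pseudo-labels with high confidence are usually kept, a refinement that is left out of this sketch.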
Reinforcement learning
This method is also called behavioural machine learning, and it is concerned with how software agents (colloquially known as bots or robots) ought to act in an environment in order to arrive at a successful outcome. The model is similar to a supervised learning model, but rather than learning from labelled data it learns by trial and error. This is a sequence of exploring uncharted territory (action) and exploiting current knowledge (state of environment). Reinforcement learning is, for example, used to teach an agent to play blackjack. In this example, the state would be the sum of the cards, the action is either to hit or stand, and the successful outcome is to get 21.

3.2 Deep Learning

Deep learning is a subset of machine learning based on multi-layered artificial neural networks. It resembles the neural network described previously, but with more than 3 hidden layers between the input layer and the output layer of the network. Within each hidden layer calculations are performed on the data, biases and weights are applied, and the algorithm progressively learns over time and improves the accuracy of the desired outcome. Deep learning models can be supervised, semi-supervised or unsupervised. Currently, deep learning is used in many AI applications and services and has improved areas such as automation (self-driving cars) and the performance of analytical and physical tasks without human effort.

4 Applications

4.1 Computer vision

In this chapter, I will discuss the interaction between machine learning and computer vision and the limitations that arise between them. Although these two subfields of artificial intelligence have a long-standing tradition and can be considered mature technologies in industry [5], their application has been limited in the past.
Recently, due to the availability of large amounts of data and a new abundance of processing power, significant advances in computer vision and machine learning have been made [6]. Computer vision is the scientific field that deals with how computers can gain a high level of understanding from digital images and videos. From an engineering perspective, it is the ability of a computer agent to comprehend and automate tasks that the human visual system can do. There are three widely popular machine learning paradigms for computer vision today: neural networks, support vector machines (SVMs) and probabilistic graphical methods [2]. The most common neural network used is the convolutional neural network (CNN), a category of neural networks whose neurons are arranged in three dimensions (width, height and depth). SVMs belong to supervised machine learning and are popular in classification tasks due to their efficiency [3]. Furthermore, there are currently many open libraries available for implementing complex computer vision algorithms, which can be integrated with programming languages such as Python and Java.

Machine learning is currently used in the analysis and classification of images, specifically in the disciplines of image recognition and signal processing, which fall under the general discipline of pattern recognition [9]. The image below shows a typical example of image analysis:

Figure 6: Classification, object detection and instance segmentation

The image shows a machine learning algorithm performing object detection, classification of the objects (cat or dog) and their instance segmentation (classification and delineation). For this application a deep learning algorithm is used, which has learned from many training examples in the form of images of dogs and cats [13].

Machine learning and computer vision have many innovative applications in fields such as engineering, healthcare, agriculture and sports, to name a few.
Table 1 below categorizes and describes some common solutions that exist:

Demonstrated area | Description | Reference
Food security, agriculture production and flood prediction | Identify related plant and weed species using ML on satellite images. Determine the health of crops in Sub-Saharan African agriculture. | [3]
Haemorrhoid detection and endoscopy | Deep learning algorithm used to detect breast cancer using X-ray images. DL is also used to interpret high quality gastrointestinal endoscopic images. | [12]
Traffic prediction | Traffic flow detection (classifying objects such as cars, bicycles, pedestrians etc.). In Los Angeles, CV and ML are used to track and manage traffic. | [9]
Self-driving cars | CNN technologies used in research for path planning, scene perception and motion control of autonomous vehicles. | [2]
Performance evaluation and game predictions | Automated cricket scoreboard from umpire gestures, and analysis of tennis player performance to predict future outcomes. | [2]

Moreover, computer vision is applied in predictive maintenance (fault detection in manufacturing units), human behavioural studies and many other industry-based problems to increase the effectiveness and accuracy of solutions [12]. The field of computer vision is rapidly advancing as the robustness of the algorithms increases [5], and as research evolves in areas such as biological sciences, human activity, management, maintenance and many other related areas. The study in [10] highlights the development of research areas in the integration of machine learning with computer vision:

From the graph we can see a wide range of research being conducted in important areas. In conclusion, machine learning offers strong capabilities for solving human vision problems.
In applying a synthesis of vision algorithms and machine learning models we can implement advanced techniques of image understanding and, as a result, increase efficiency in areas such as predictive maintenance, medical imaging and quality control. The integration of the two fields has strong potential to contribute to the changing dynamics of computer vision systems, provide agents with strong image analysis capabilities and ultimately improve safety, healthcare and food production. In the next chapter, I will discuss another application of machine learning: Natural Language Processing (NLP).

4.2 Natural language processing

NLP is a subfield of linguistics and artificial intelligence defined as the ability of a computer to process and analyse human language. For example, computers can use NLP for reading text, recognising speech and measuring sentiment. In any case, the goal of NLP is to use raw language input data coupled with linguistics and algorithms to transform or enrich texts. Because massive amounts of unstructured data are being generated, machine learning has become critical for analysing text and speech data automatically and efficiently. In general, the technique of NLP is to break sentences into pieces, interpret the relationships between them and explore the meaning that arises. The most common NLP algorithms include tasks such as tokenization and parsing, stemming, language detection and the identification of semantic relationships. These tasks are used in the following popular methods of NLP:
• Content categorization: A summary of a text document, produced by searching and indexing words, detecting duplicates and using content alerts.
• Sentiment analysis: Also known as opinion mining, it is the process of identifying, extracting and quantifying affective states and subjective information.
• Conversion: The method of recognizing phonemes or words and transforming text into speech or vice versa.
• Machine translation: Automatically translating words or speech between languages.

Natural language processing methods have a wide range of technological solutions. The most used machine learning techniques for NLP are deep learning neural networks, vector quantization and dynamic time-warping [4]. Table 2 below outlines and describes a few such applications that exist today:

Demonstrated area | Description | Reference
Chatbots | A software agent that automatically generates a conversation with a human agent. Used by many businesses to aid consumer queries. | []
Spam filters | Detection of spam in emails by extraction of meaning and frequency of words within the content of the email. | [2]
Information extraction | Extract relevant and useful information from social media, customer service representatives, email etc. to improve and develop business. Automatic grammar check is also used widely. | [5]
Voice driven interface | Mobile apps, home automation systems, virtual assistance and video games are among the technologies that use speech recognition algorithms. | [5]
Machine translation | Google Translate, Microsoft Translator and Amazon Translate use NLP algorithms to translate text and speech. | [5]

NLP is, therefore, extremely useful in areas such as streamlining customer service, telecommunications, fraud detection and improving business performance. From this we can already see emerging applications of future technologies such as automated home security systems, thermostats, lights and other extensions of our everyday tools. By integrating machine learning algorithms with NLP we increase the efficiency of analysing and manipulating texts and develop exciting new technologies that can interact with humans and provide them with quick solutions.

In the next chapter I will discuss another two important machine learning applications: anomaly detection and prediction.

4.3 Anomaly detection and Predictive modelling

Anomaly detection (AD) is the process of identifying any outliers of a data set.
Typically, the outliers might be unusual network traffic, faults in a system or perhaps data that should be cleaned before analysis [6]. Traditionally, anomaly detection has been a manual and therefore tedious task; however, with the implementation of machine learning models there have been significant improvements [9]. The resulting AD system works more accurately, adapts over time and handles big data more efficiently. A few examples of ML algorithms used in anomaly detection are SVMs, decision trees and k-nearest neighbours.

Predictive modelling in machine learning uses historical data to make predictions on new data. For these models, the most widely used methods are decision trees, regression (linear and logistic) and neural networks. Machine learning and predictive modelling are in most cases used in predictive analysis, for example of possible stock market changes and customer behaviour. Anomaly detection and predictive modelling can also be used together in predictive analysis such as fraud detection and predictive maintenance. Table 3 below outlines and describes applications of anomaly detection and predictive modelling:

Demonstrated area | Description | Reference
Data cleaning | Identifying and removing anomalous data from the data set in supervised learning to increase the accuracy of the ML model. | [1]
Fraud detection | Distinguish legitimate and fraudulent behaviour using supervised learning algorithms. An example is MasterCard, where ML processes data such as transaction, location and time. | [1]
System health monitoring | Using anomaly detection in railway maintenance to predict potential failures in advance. | [9]
Stock market prediction | Predict changes in the stock market or reduce market risk. Banking and financial services use such models to increase revenue, productivity etc. | [1]
Recommendation systems | Understanding consumer behaviour is important in retail and production companies for planning and increasing revenue. Netflix uses predictive modelling to generate movie recommendations for its members. | [2]

In today’s world, managing and monitoring the performance of a distributed system can be extremely difficult. Anomaly detection and predictive modelling integrated with machine learning can efficiently monitor errors within thousands of items and rapidly inform the responsible parties. Furthermore, companies drive revenue and increase productivity by increasing security, predicting faults early and taking advantage of their data to predict consumer needs and trends.

5 Conclusion

Machine learning is an exciting and developing subfield of artificial intelligence. By enabling computers to act intelligently, learning directly from data examples and experience, machine learning algorithms can automatically carry out complex tasks. To highlight the importance of machine learning, a wide range of applications has been discussed, such as image analysis for classifying and detecting objects in autonomous vehicles, manipulating and interpreting text for machine translation, and automatically detecting faults in components, among others. Many companies already implement ML and are showing promising results of increased efficiency and productivity, driving revenue in a wide range of businesses. In healthcare, engineering, traffic management, human activity and many other sectors I believe machine learning has the potential to bring significant benefit. Future work should address the accuracy and limitations of current machine learning algorithms.

6 References

[1] Choudhary, Patrick S. Introduction to Anomaly Detection. Towards Data Science. John Wiley & Sons. 2019
[2] Columbus, L., 2020. State of AI and Machine Learning In 2019. [online] Forbes.
Available at: https://www.forbes.com/sites/louiscolumbus/2019/09/08/state-of-ai-and-machine-learning-in-2019/?sh=1f36b6211a8d
[3] Debates, Stephanie Renee, Mapping Sub-Saharan African Agriculture in High-Resolution Satellite Imagery with Computer Vision & Machine Learning. 2017, Princeton University.
[4] Dash, Tirtharaj and Tanistha Nayak, English Character Recognition using Artificial Neural Network. arXiv preprint arXiv:1306.4621, 2013.
[5] Goldberg Y. Neural network methods for natural language processing. Synthesis Lectures on Human Language Technologies. 2017;10(1):1–309.
[6] Hofmann, A., Schmitz, C. and Sick, B. Rule extraction from neural networks for intrusion detection. International Journal of Computer Applications (0975 – 8887) Volume 79 – No.2, October 2013.
[7] Karlsen, Simen Skaret, Automated Front Detection-Using computer vision and machine learning to explore a new direction in automated weather forecasting. 2017, The University of Bergen.
[8] Kerle, Norman, Markus Gerke, and Sébastien Lefèvre, GEOBIA 2016: Advances in Object-Based Image Analysis—Linking with Computer Vision and Machine Learning. 2019.
[9] Mitra, Shounak, Applications of Machine Learning and Computer Vision for Smart Infrastructure Management in Civil Engineering. 2017.
[10] Steger, Carsten, Markus Ulrich, and Christian Wiedemann, Machine vision algorithms and applications. 2018: John Wiley & Sons.
[11] Vinyes Mora, Silvia, Computer vision and machine learning for in-play tennis analysis: framework, algorithms and implementation. 2018, Imperial College London.
[12] Vemuri, Anant S, Survey of Computer Vision and Machine Learning in Gastrointestinal Endoscopy. arXiv preprint arXiv:1904.13307, 2019.
[13] Zhang, Fan, Wei Li, Yifan Zhang, and Zhiyong Feng, Data Driven Feature Selection for Machine Learning Algorithms in Computer Vision. IEEE Internet of Things Journal, 2018. 5(6): p. 4262-4272.