Uploaded by Wahidullah Rahmani

SEL3000-1-Kunstig Intelligens

advertisement
An introduction to Machine Learning:
Applications and value
Wahidullah Rahmani
Student number: 216422
Pawan K.C
Student number: 216368
1
Table of contents
An introduction to Machine Learning: importance and applications ........................................ 1
1
Introduction ....................................................................................................................... 3
2
Literature review ................................................................................................................ 5
3
Machine learning ............................................................................................................... 6
4
3.1
What is machine learning?........................................ Feil! Bokmerke er ikke definert.
3.2
Supervised learning, unsupervised learning and semi-supervised learning ............. 10
3.3
Deep Learning ........................................................................................................... 12
Applications ...................................................................................................................... 12
4.1
Computer vision ........................................................................................................ 12
4.2
Natural language processing ..................................................................................... 15
4.3
Anomaly detection and Predictive modelling ........................................................... 17
5
Conclusion ........................................................................................................................ 18
6
References........................................................................................................................ 19
2
1 Introduction
Machine learning (ML) is the fascinating subfield of artificial intelligence that allows
computers to learn from data. By performing calculations on big data using computer
algorithms and build a model that help structure said algorithm, ML enable the ability to
solve complex human tasks at an increased efficiency and accuracy. As the field develops,
machine learning shows potential in transforming a wide range of sectors, where business
has already reaped the benefit of ML using the technology to increase department efficiency
and productivity.
The Data science and Machine learning Market study reported that in 2019, 40% of
Marketing and Sales teams consider AI and machine learning critical to their success as a
department. Furthermore, the study finds that the R&D, Business Intelligence Competency
Centres (BICC) and executive management all score above 60% in a measure of importance
level [2]. Figure 1 below show a graph of the importance level by department for ML and AI:
Figure 1: Graph of importance measurement of ML in departments
3
From the explosion of data (big data) that is available and the increase of computational
power in recent years, the field of machine learning has seen exiting new advancement. In
areas such as healthcare, education, transport and maintenance machine learning has
already improved the effectiveness of products and services [2]. For example, ML is
integrated in image recognition systems supporting doctors with medical images,
recommender systems used by online retailers to improve consumer satisfaction and
development of autonomous vehicles in transport. Furthermore, machine learning can add
new insights and value to scientific research such as neuroscience and particle physics [3]. A
research on the impact of ML on the economy and employment in the UK show that [4]:
-
35% of job could with a 66% chance be automated over the next decade
-
15 million jobs could be automated over the next decade
-
Technically, it is possible to automate over 70% of the component tasks for 10% of jobs
In any case, the application of ML is vast and already seen in many different areas,
highlighting the potential and importance of the technology. In this paper I will define
machine learning and discuss its current applications and values.
The paper will first discuss what machine learning is and how it works, thereafter discuss its
applications in three different sub-field of ML and briefly discuss the value it creates in
research and business. The last chapter will conclude with comments and future work.
4
2 Literature search
In this chapter I will clarify how I conducted the literature review for this paper. The
objective of the study is to analytically evaluate and investigate the definition of machine
learning and the application of ML in computer vision, natural language processing and
anomaly detection. The database Google Scholar was included in the search.
Keyword used were: Machine learning introduction, Machine learning applications, Machine
learning in computer vision, Machine learning in natural language processing and Anomaly
detection. This resulted in a large set of articles, where the mean result of all the keywords
were approximately 4540000 articles. After examinations of around 50 articles’ content
excluding citations the number of narrow down to 20 articles. These articles were chosen
mainly due to their introductory content which is relevant to this paper as it covers the
basics of machine learning, and due to a further advance search. The advance search added
the word ‘application’, ‘business’ and ‘impact’ to narrow down the search as this paper
discuss mainly the application of each sub field of machine learning.
In the next chapter I will start defining a few machine learning concepts, outline the steps
used to build a machine learning application (model) and explain the current methods that
exist.
5
3 What is machine learning?
Machine learning, a branch of artificial intelligence, is an application that provides a system
the ability to learn from data and improve from experience without being explicitly
programmed. In machine learning algorithms, which from data science is defined as a
sequence of statistical processing steps, a data set is trained to find patterns and make
predictions. The accuracy of the predictions and decisions depends on the quality of such
models. Below I will describe the process of building a machine learning model that utilize
such algorithms.
When data scientists build a machine learning model there are typically 4 main steps
involved:
1. Collect and prepare data
For the purpose of building the model, relevant data must be collected and prepared before
it is ingested to solve a unique problem. In some cases, the data is labelled (tagged data) and
therefore will clarify to the algorithm the features and classifications the model needs to
identify. The other data will be unlabelled, and the model will have the task of assigning
classifications and extracting features from that data.
In either case, the data collected is either a ‘training set’, ‘cross validation set’ or ‘test set’
where each set are separate from each other:
•
Training set is used to train the algorithm to solve the problem it is designed to solve,
•
Cross validation set is for validation and improvement of the model accuracy
•
Test set will test the final trained model’s prediction on new unseen data.
Before the model is trained to solve a problem, however, the data must be prepared –
analyse and remove biases and imbalances, dedupe (remove duplicate data) if necessary,
randomized or label data. This is called ‘pre-processing’ the data and it is an important first
step for model building as it can increase overall efficiency and accuracy of the model.
2. Choosing a model
The next step is to choose a model, which is dependent on the type (labelled or unlabelled)
and the amount of data in the training set, and the type of problem at hand.
6
Common types of machine learning models used with labelled data are:
•
Regression models: There are two main types of regression models: logistic regression
and linear regression models. Linear regression is used to predict the value of a
dependent variable(output) based on the independent variables (input) whereas
logistic regression is used when the output dependent variables is binary in nature
based on its independent variables (input). An example of linear regression is
predicting an engineer’s annual salary (dependent variable) based on the years of
experience or education (independent variable), whereas an application of logistic
regression (most often used in classifications problems) could be to classify a type of
fruit present in an image.
•
Decision trees: Decision tree is a predictive model that utilizes classified data (the
branches) in order to make recommendations (the leaves) based on a set of decision
rules. For instance, if we want to predict whether a person is fit or not (classification
problem) based on their eating habits, diet and physical activity this would be a
classification tree. In this case, the classified data is the description of that person and
his habits and the recommendation by the predictive model is either fit or unfit.
•
Instance-based algorithmic models: A machine learning algorithm that generalizes to
new examples by measuring its similarity with the training examples. This is called
instance-based because it compares the new problem instances with instances from
the training. An example of instance-based model is the K-Nearest Neighbour (KNN)
which is a classification model that estimates the likelihood of a data point belonging
to one group or another based on its closest distance to other data points.
7
Common type of unlabelled machine learning models:
•
Clustering models: A clustering model uses an algorithm that automatically groups a set
of data points that have similar records into ‘clusters’ and labels them according to the
group they belong to. This process is carried out without prior knowledge of the
groups’ characteristic. The most common types of clustering algorithms are: K-means,
Kohen clustering and TwoStep.
•
Association models: The model uses an association algorithm that finds important
relations between variables or features in a data set. The algorithm is a rule-based
machine learning technique used in data mining (turning raw data into useful
information) which applies a measure of interestingness in order to generate a new
association rule for new information. An application of this to predict customer
preference based on their shopping history, where the association rule algorithm
improves performance and generates new rules as more data on the customer is
analysed.
•
Neural networks: A computational learning system based on a collection of connected
units or nodes called artificial neurons which resembles the biological neural network
of a human brain. A neural network has 3 different layers: input layers (data ingestion),
at least on hidden layer (calculations performed on the input data) and an output layer
where the probability of each conclusion of the hidden layers are assigned. The neural
network algorithm does not require specific rules that defines what to expect from the
input data, and instead learns from many labelled examples of the training set. By
learning the characteristic of the input that are needed to construct a correct output
through many processed training examples, the neural network model can now
accurately predict the desired results on new unseen inputs.
8
Figure 2: Neural network with layers
An example of a neural network algorithm is to predict whether an image contains a cat or
not. By processing many training examples consisting of data points (e.g., colour pixel values)
that represent a cat, the neural network can learn to classify a cat and from new unseen
image predict whether a cat is present. Below in Figure a simplified version of this is shown:
Figure 3: Neural network algorithm classifying a cat
3. Training the algorithm to create the model: In this step the model’s algorithm is trained,
which is a process where the algorithm’s ability to predict the correct outcome is
iteratively improved. With each iteration the algorithm is trained as following:
•
The relevant variables are processed through the models’ algorithm,
•
Compares the output with the actual results that it should produce (error
calculation
•
Adjust weight (parameters) and possible biases such that the accuracy of
predictions might be improved
9
•
Run the variables through the algorithm again and repeat the above steps if
necessary
•
The trained algorithm with optimal accuracy is used create the machine
learning model
4. Testing and improving model: The final step of the machine learning process is to test
the trained model with new data (test data). For example, if the model is to identify
spam the test data will be new incoming email messages that must be classified. This
final step is necessary as the model shows a better approximation on how it will
perform in the real world.
3.1 Supervised learning, unsupervised learning and semi-supervised learning
There are four main methods used in machine learning: Supervised learning, unsupervised
learning, semi-supervised learning and reinforcement learning.
Supervised learning
This is the machine learning task that uses labelled training data that consists of inputs and
desired outputs, and the model learns a function that maps new examples according to
these input-output pairs. The example of identifying a cat in the image mentioned is a
supervised learning algorithm, as the model is designed to existence of the cat based on a
data set of various labelled cat images.
The advantage of supervised learning compared to the other methods is that it requires less
training data and makes the training process easier as the model predictions are based on
already labelled results. However, labelled data can be expensive to prepare and the model
can be subjected to bias or overfitting, which is where the model is too closely tied or too
biased to the training data that the algorithm fails to handle the variations in the new data
set accurately.
10
Figure 4: A typical supervised learning algorithm
Unsupervised learning
Unsupervised learning method take unlabelled (raw) training data and uses an algorithm to
extract meaningful features that needs to either be classified, labelled or sorted with
minimum human supervision. In the example of detecting spam, an unsupervised learning
algorithm can take a huge training set of emails, reveal patterns and features that indicate
spam and as a result classify new examples more efficiently over time.
Figure 5: Typical unsupervised learning algorithm
Semi-supervised learning
Semi-supervised learning is a medium between supervised and unsupervised learning. This
method typically uses labelled data set to guide classification and feature extraction for a
larger, unlabelled data set. On the other hand, if labelling there are not enough labelled
examples and/or the labelling process is too expensive, an unsupervised learning algorithm
can solve this problem and feed the output data into a supervised learning algorithm.
Reinforcement learning
This machine learning method is called behavioural machine learning and it concerned with
how software agents (colloquially known as bots or robots) ought to act in an environment
in order to derive at a successful outcome. The model is like the supervised learning model,
however rather than learning from labelled data it learns by using trial and error. This is a
11
sequence of exploring uncharted territory (action) and exploiting current knowledge (state
of environment). Reinforcement learning is, for example, used in learning an agent to play
blackjack. In this example, the state would be the sum of the cards, the action to take is
either to hit or stand and the successful outcome is to get 21.
3.2 Deep Learning
Deep learning is a subset of machine learning based on multi-layered artificial neural
networks. This is like a neural network described previously but with more than 3 hidden
layers between the input layer and output layer of the network. Within each hidden layer
data is calculated, biases and weights are applied, and the algorithm progressively learns
over time and improves the accuracy of the desired outcome.
Deep learning models can be supervised, semi-supervised or unsupervised. Currently, the
deep learning model is used in many AI applications and services and has improved areas
such as automation (self-driving cars) and performing analytical and physical task without
any human effort.
4 Applications
4.1
Computer vision
In this chapter, I will discuss the interaction between machine learning and computer vision
and the limitations that arises between them. Although these two sub fields of artificial
intelligence have been a long-standing tradition and can be considered matured
technologies in industry [5], their application has been limited in the past. Recently, due to
the availability of large amount of data and a new abundance of processing power significant
advances in computer vision and machine learning have been made [6].
Computer vision is the scientific field that deals with how computers can gain a complex
level of understanding from digital images and videos. From the perspective of engineering,
it is the ability of a computer agent to comprehend and automate task that a human visual
system would do. There are three widely popular machine learning paradigms for computer
vision today: neural networks, support vector machines (SVM) and probabilistic graphical
methods [2]. The most common neural network used is convolutional neural networks
(CNN’s) which is a category of neural networks that has neurons with dimensions (width,
12
height and depth). SVM’s are sub-domains of supervised machine learning and it is popular
in classification tasks due to its efficiency [3]. Furthermore, there are currently many open
libraries available for the implementation of complex computer vision algorithms which can
be integrated with programming languages such Python, Java etc.
Machine learning is currently used in the analysis and classification of images, specifically the
discipline of image recognition and signal processing which falls under the general discipline
of pattern recognition [9]. Image below shows a typical example of image analysis:
Figure 6: Classification, object detection and instance segmentation
The image shows a machine learning algorithm for object detection, classification of the
object (cat or dog) and their instance segmentation (classification and delineation). For this
application, a deep learning algorithm is used, which has learned from many training
examples in the form of images of dogs and cats [13].
Machine learning and computer vision has many innovative applications in fields such as
engineering, healthcare, agriculture and sports to name a few. Table 1 below will categorize
and describe some common solutions that exists:
Demonstrated area
Description
Food security, agriculture
production and flood
prediction
Identify related plant and
[3]
weed species using ML on
satellite images. Determine the
health of crops in Sub-Saharan
African agriculture.
Deep learning algorithm used
[12]
to detect breast cancer using X
ray images. DL is also used to
interpret high quality
gastrointestinal endoscopic
images.
Haemorrhoid detection and
endoscopy
Reference
13
Traffic prediction
Self-driving cars
Performance evaluation and
game predictions
Traffic flow detection
(classifying objects such as
cars, bicycles, pedestrians etc.)
In Los Angeles, CV and ML is
used to track and manage
traffic.
CNN’s technologies used in
research for path planning,
scene perception and motion
control of autonomous
vehicles.
Automated cricket scoreboard
from umpire gestures and in
analysation of tennis player
performance to predict future
outcomes.
[9]
[2]
[2]
Moreover, the application of computer vision is used in predictive maintenance (fault
detection in manufacturing units) human behavioural studies and many other industrybased problems for increased effectivity and accuracy of solutions [12].
The field of computer vision is rapidly advancing as the robustness of the algorithm increase
[5], and as research evolves in areas such as biological sciences, human activity,
management, maintenance and many other related areas. From [10] the study highlights the
development of research areas in the integration of machine learning with computer vision:
From the graph we can see a wide range of research being conducted in important areas.
Conclusively, machine learning renders big capabilities of solving human vision problems. In
applying a synthesis of vision algorithms and machine learning models we can implement
14
advanced techniques of image understanding and as a result increase efficiency in areas
such as predictive maintenance, medical images and quality control. From the integration of
the two fields, we have a strong potential of contributing to the changing dynamic of a
computer vision system, provide agents with strong image analysis capabilities and
ultimately improve safety, healthcare and food production. In the next chapter, I will discuss
another application of machine learning: Natural Language Processing (NLP).
4.2
Natural language processing
NPL is a subfield of linguistics and artificial intelligence defined as the ability of a computer
to process and analyse human language. For example, computers can use NPL for reading
text, speech recognition and measure sentiment. In any case, the goal of NPL is to use raw
language input data coupled with linguistics and algorithms to transform or enrich texts. For
this process, due to a massive amount of unstructured data being generated, machine
learning has become critical in automating analysation of texts and speech data efficiently.
In general, the technique of NLP is to break sentences into piece, interpret the relationships
between them and explore the meaning that arises. Most common NPL algorithms include
tasks such as tokenization and parsing, stemming, language detection and the identification
of semantic relationships. These tasks are used in the following popular methods of NPL:
•
Content categorization: A summary of a text document, produced by searching and
indexing words, detecting duplicates and using content alerts.
•
Sentiment analysis: Also known as opinion mining, it is the process of identifying,
extracting and quantifying affective states and subjective information.
•
Conversion: The method of recognizing phonemes or words and transform text into
speech or vice versa.
•
Machine translation: Automatically translating words or speech between languages.
Natural language processing methods have a wide range of technological solutions. The
most used machine learning techniques for NPL are deep learning neural networks, vector
quantization and dynamic time-warping [4]. Table 2 below outlines and describes a few such
applications that exists today:
15
Demonstrated area
Description
Reference
Chatbots
A software agent that
automatically generates a
conversation with a human
agent. Used by many
businesses to aid consumer
queries.
Detection of spam in emails by
extraction of meaning and
frequency of words within the
content of the email.
Extract relevant and useful
information from social media,
customer service
representatives, email etc. to
improve and develop business.
Automatic grammar check is
also used widely.
Mobile apps, home
automation systems, virtual
assistance and video games
are among the technologies
that use speech recognition
algorithms.
Google translate, Microsoft
translator and Amazon
translate use NPL algorithms to
translate text and speech.
[]
Spam filters
Information extraction
Voice driven interface
Machine translation
[2]
[5]
[5]
[5]
NPL is, therefore, extremely useful in areas such as streamlining customer service,
telecommunications, fraud detection and improving business performance. From this we can
already see emerging application of future technologies such as automating home security
systems, thermostats, lights and other extensions of our everyday tool. By integrating
machine learning algorithms with NPL we increase efficiency of analysing and manipulating
texts and develop exciting new technologies that can interact with and provide quick
solutions to humans.
16
In the next chapter I will discuss another two
important machine learning applications:
anomaly detection and prediction.
4.3 Anomaly detection and Predictive modelling
Anomaly detection (AD) is the process of identifying any outliers of a data set. Typically, the
outliers might be unusual network traffic, faults in the system or perhaps data that should be
cleaned before analysis [6]. Traditionally, anomaly detection has been a manual task and
therefore a tedious job, however, with the implementation of machine learning models
there have been significant improvements [9]. The engineering process, then, creates an AD
system that works more accurately, is adaptive over time and can handle big data more
efficiently. A few examples of ML algorithms in anomaly detection are SVM’s, decision trees
and k-nearest neighbours.
Predictive modelling in machine learning use historical data to make predictions on new
data. For these models, the most widely used methods are decision trees, regression (linear
and logistic) and neural networks. Machine learning and predictive modelling area in most
cases used in predictive analysis such as possible stock market changes and customer
behaviour. Both anomaly detection and predictive modelling can be used interactively in
predictive analysis such as fraud detection and predictive maintenance.
Below in Table 3 outline and describe the application of anomaly detection and predictive
modelling:
Demonstrated area
Description
Data cleaning
Identifying and removing
[1]
anomalous data from data set
in supervised learning to
increase accuracy of ML model
Distinguish legitimate and
[1]
fraudulent behaviour using
supervised learning algorithms.
An example is MasterCard
where ML process data such as
transaction, location and time.
Fraud detection
Reference
17
System health monitoring
Stock market prediction
Recommendation systems
Using anomaly detection in
railway maintenance task of
predicting potential failures in
advance
Predict changes in stock
market or reduce market risk.
The banking and financial
services use such models to
increase revenue, productivity
etc.
Understanding consumer
behaviour is important in retail
and production companies for
planning and increasing
revenue. Netflix use predictive
modelling to generate
recommendations of movies
for their members.
[9]
[1]
[2]
In today’s world managing and monitoring a distributed systems’ performance can be
extremely difficult. Anomaly detection and predictive modelling integrated with machine
learning can efficiently monitor errors within thousands of items and rapidly inform the
responsible parties. Furthermore, companies drive revenue and increase productivity by
increasing security, predicting early faults and take advantage of their data to predict
consumer needs and trends.
5 Conclusion
Machine learning is an exciting and developing sub-field of artificial intelligence. Through
enabling computers to act intelligently, by learning directly from data examples and
experience, machine learning algorithms can automatically carry out complex tasks. In
highlighting the importance of machine learning, a wide range of applications has been
discussed such as image analysis for classifying and detecting objects in autonomous
vehicles, manipulating and interpreting text for machine translation, automatically detecting
fault in components, among others.
Many companies already implement ML and showing promising results of increased
efficiency productivity - driving revenue in a wide range of businesses. In healthcare,
engineering, traffic management, human activity and many other sectors I believe machine
learning has potential to bring significant benefit. Future work should address accuracy and
limitations of current machine learning algorithms.
18
6 References
[1] Choudhary, Patrick S. Introduction to Anomaly Detection. Towards Data Science.
John Wiley & Sons. 2019
[2] Columbus, L., 2020. State of AI and Machine Learning In 2019. [online] Forbes.
Available at: https://www.forbes.com/sites/louiscolumbus/2019/09/08/state-of-ai-andmachine-learning-in-2019/?sh=1f36b6211a8d
[3] Debates, Stephanie Renee, Mapping Sub-Saharan African Agriculture in HighResolution Satellite Imagery with Computer Vision & Machine Learning. 2017,
Princeton University
[4] Dash, Tirtharaj and Tanistha Nayak, English Character Recognition using Artificial
Neural Network. arXiv preprint arXiv:1306.4621, 2013.
[5] Goldberg Y. Neural network methods for natural language processing. Synthesis
Lectures on Human Language Technologies. 2017;10(1):1–309.
[6] Hofmann, A., Schmitz, C. and Sick, B. Rule extraction from neural networks for
intrusion detection. International Journal of Computer Applications (0975 – 8887)
Volume 79 – No.2, October 2013
[7] Karlsen, Simen Skaret, Automated Front Detection-Using computer vision and
machine learning to explore a new direction in automated weather forecasting. 2017,
The University of Bergen
[8] Kerle, Norman, Markus Gerke, and Sébastien Lefèvre, GEOBIA 2016: Advances
in Object-Based Image Analysis—Linking with Computer Vision and Machine
Learning. 2019.
[9] Mitra, Shounak, Applications of Machine Learning and Computer Vision for Smart
Infrastructure Management in Civil Engineering. 2017.
[10] Steger, Carsten, Markus Ulrich, and Christian Wiedemann, Machine vision
algorithms and applications. 2018: John Wiley & Sons.
[11] Vinyes Mora, Silvia, Computer vision and machine learning for in-play tennis
analysis: framework, algorithms and implementation. 2018, Imperial College London.
[12] Vemuri, Anant S, Survey of Computer Vision and Machine Learning in
Gastrointestinal Endoscopy. arXiv preprint arXiv:1904.13307, 2019.
[13] Zhang, Fan, Wei Li, Yifan Zhang, and Zhiyong Feng, Data Driven Feature
Selection for Machine Learning Algorithms in Computer Vision. IEEE Internet of
Things Journal, 2018. 5(6): p. 4262-4272.
19
Download