Introduction to Machine Learning

1. Introduction

What is Machine Learning?

- We need an algorithm to solve a problem on a computer.
- An algorithm is a sequence of instructions to transform input into output.
- Example: sort a list of numbers.
  - Input: a set of numbers
  - Output: an ordered list of numbers
- There are many algorithms for the same task.
- We may be interested in finding the most efficient one.
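The sorting example above can be written out directly; a minimal sketch in Python (insertion sort, one of the many possible algorithms for this task):

```python
def insertion_sort(numbers):
    """Transform the input (a collection of numbers) into the output (an ordered list)."""
    result = list(numbers)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift elements larger than `key` one position to the right.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result

print(insertion_sort([5, 2, 9, 1]))  # [1, 2, 5, 9]
```

This algorithm takes O(n^2) comparisons, while e.g. merge sort takes O(n log n); both solve the same task, which is the sense in which one algorithm can be "more efficient" than another.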
Based on E Alpaydın 2004 Introduction to Machine Learning © The MIT Press (V1.1)
What is Machine Learning?

- For some tasks we don't have an algorithm.
- Example: telling spam e-mails from legitimate e-mails.
  - We know the input (an e-mail) and the output (yes/no).
  - We don't know how to transform the input into the output.
  - The definition of spam changes over time and from individual to individual.
- We don't have the knowledge, so we replace it with data.
  - We can easily produce a large number of examples.
- We want the computer to extract an algorithm from the examples.
What is Machine Learning?

- We believe there is a process that explains the data.
- We don't know the details of the process, but we know it's not random.
- Example: consumer behavior.
  - People frequently buy beer with chips.
  - People buy more ice cream in summer.
- There are certain patterns in the data.
- We can rarely identify the patterns completely.
- We can construct a good and useful approximation.
Approximations to Patterns

- An approximation may not explain everything.
- It can still detect some patterns and regularities.
- Uses of the patterns:
  - Understand the process.
  - Make predictions.
Data Mining: Application of ML to Large Databases

- Retail: market basket analysis, customer relationship management (CRM)
- Finance: credit scoring, fraud detection
- Manufacturing: optimization, troubleshooting
- Medicine: medical diagnosis
- Telecommunications: quality-of-service optimization
- Bioinformatics: motifs, alignment
- Web mining: search engines
- ...
Examples of ML Applications

- Learning associations: basket analysis.
- If people buy X, they typically also buy Y.
- A customer who buys X but does not buy Y is a potential Y customer.
- Find such customers and target them for cross-selling.
- Find an association rule: estimate P(Y|X).
- With D = customer attributes (e.g., age, gender), estimate P(Y|X, D).
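The rule P(Y|X) can be estimated from transaction data by simple counting; a minimal sketch, using a made-up set of baskets:

```python
# Hypothetical transaction data: each set lists products bought together.
transactions = [
    {"chips", "beer"}, {"chips", "beer", "salsa"},
    {"chips", "soda"}, {"beer"}, {"chips", "beer"},
]

def confidence(x, y, baskets):
    """Estimate P(Y|X): among baskets containing x, the fraction that also contain y."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0
    return sum(1 for b in with_x if y in b) / len(with_x)

print(confidence("chips", "beer", transactions))  # 0.75
```

Here 3 of the 4 baskets containing chips also contain beer, so the estimated P(beer|chips) is 0.75; customers in the remaining chips-only basket are the cross-selling targets.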
Association Rule Examples

- A bookseller instead of a supermarket:
  - Products are books or authors.
- A web portal:
  - Predict the links the user is likely to click.
  - Pre-download those pages in advance.
Credit Scoring

- A bank wants to predict the risk associated with a loan.
- The probability that a customer will default, given information about the customer:
  - income, savings, profession
- There is an association between customer attributes and risk.
- The bank fits a model to past data to be able to calculate the risk for a new application.
- Accept or refuse the application.
Classification Problem

- Two classes of customers: low-risk and high-risk.
- Input: information about a customer.
- Output: assignment to one of the two classes.
- Example of a classification rule:
  - IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
- Discriminant: a function that separates the examples of different classes.
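The threshold rule above translates directly to code. The thresholds here are made-up values for illustration; in practice they would be fitted to past loan data:

```python
# Hypothetical thresholds theta1 (income) and theta2 (savings);
# real values would be learned from past customers.
THETA1, THETA2 = 30_000, 5_000

def classify(income, savings):
    """IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk."""
    return "low-risk" if income > THETA1 and savings > THETA2 else "high-risk"

print(classify(45_000, 8_000))  # low-risk
print(classify(45_000, 2_000))  # high-risk
```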
Discriminant Rule

(Figure omitted: examples of the two classes separated by a discriminant.)

- Prediction: use the rule for novel instances.
- In some cases we may want to calculate probabilities instead of a 0/1 decision.
- P(Y|X): e.g., P(Y = 1 | X = x) = 0.8 means the customer has an 80% probability of being high-risk.
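Such probabilities come naturally from a model like logistic regression; a sketch with illustrative (not fitted) weights:

```python
import math

def p_high_risk(income, savings):
    """Logistic model P(Y = 1 | income, savings); the weights are illustrative only,
    not fitted to any real data."""
    w_income, w_savings, bias = -5e-5, -2e-4, 3.0
    z = bias + w_income * income + w_savings * savings
    return 1.0 / (1.0 + math.exp(-z))  # squashes z into (0, 1)

# Lower income and savings should give a higher probability of being high-risk.
print(p_high_risk(10_000, 1_000) > p_high_risk(60_000, 20_000))  # True
```

The output is a probability rather than a hard label, so the bank can choose its own decision threshold.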
Pattern Recognition

- Optical character recognition (OCR): recognizing character codes from their images.
- There are as many classes as there are characters we would like to recognize.
- Handwritten characters (zip codes on envelopes or amounts on checks).
- Different handwriting styles, character sizes, pen or pencil.
- We don't have a formal description that covers all 'A' characters and none of the non-'A' characters.
OCR

- All 'A's have something in common.
- Extract the pattern from examples.
- Use the redundancy in human languages:
  - A word is a sequence of characters.
  - Not all sequences are equally likely.
  - We can still r?ad some w?rds.
- ML algorithms should model the dependencies among characters in a word and among words in the sequence.
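The word-level redundancy can be sketched by matching an incomplete word against a dictionary (a tiny made-up word list here):

```python
import re

WORDS = ["read", "road", "word", "words", "maze"]  # tiny illustrative dictionary

def candidates(pattern, words=WORDS):
    """Fill in an unreadable character '?' using redundancy:
    only some character sequences form real words."""
    rx = re.compile("^" + pattern.replace("?", ".") + "$")
    return [w for w in words if rx.match(w)]

print(candidates("r?ad"))   # ['read', 'road']
print(candidates("w?rds"))  # ['words']
```

"r?ad" is still ambiguous (two candidates), which is why a good model also uses context, i.e., the dependencies among words in the sequence.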
Face Recognition

- Input: an image.
- Classes: the people to be recognized.
- The learning program learns to associate faces with identities.
- More difficult than OCR:
  - More classes.
  - Larger images.
  - Differences in pose and lighting cause significant changes in the image.
  - Occlusions: glasses, beard.
Face Recognition

(Figure omitted: training examples of a person, and test images.)

AT&T Laboratories, Cambridge UK
http://www.uk.research.att.com/facedatabase.html
Medical Diagnosis

- Input: information about the patient.
  - Age, past medical history, symptoms.
- Output: illnesses.
- Additional tests can be applied:
  - Costly and inconvenient.
  - Some information might be missing.
  - We can decide to apply a test if we believe it will be valuable.
- The price of a wrong decision is high.
Speech Recognition

- Input: an acoustic signal.
- Output: words.
- Different accents and voices.
- Can integrate language models.
- Can combine with lip movements:
  - Sensor fusion.
Knowledge Extraction

- The rule is simpler than the data.
- Example: a discriminant separating low-risk and high-risk customers also helps to define the low-risk customer.
- Target low-risk customers through advertising.
Compression

- The explanation is simpler than the data.
- Discard the data, keep the rule.
- Less memory is needed.
- Example: image compression.
  - Learn the most common colors in the image.
  - Represent slightly different but similar colors by a single value.
Outlier Detection

- Find instances that do not obey the rule.
- What is interesting is not the rule itself but the exceptions not covered by the rule.
- Example: learn the properties of standard credit card transactions.
- An outlier is a suspected fraud.
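One simple way to flag such exceptions is a z-score test against past transactions; a sketch with hypothetical card amounts:

```python
from statistics import mean, stdev

def z_score_outliers(amounts, threshold=3.0):
    """Flag amounts more than `threshold` standard deviations
    from the mean of the observed transactions."""
    m, s = mean(amounts), stdev(amounts)
    return [a for a in amounts if abs(a - m) / s > threshold]

# Hypothetical card transactions; the 5000 charge stands out.
history = [20, 35, 18, 42, 25, 30, 22, 5000]
print(z_score_outliers(history, threshold=2.0))  # [5000]
```

Real fraud detection would model many attributes per transaction, but the principle is the same: learn what "normal" looks like, then flag what the model does not cover.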
Why "Learn"?

- Machine learning is programming computers to optimize a performance criterion using example data or past experience.
- There is no need to "learn" to calculate payroll.
- Learning is used when:
  - Human expertise does not exist (navigating on Mars).
  - Humans are unable to explain their expertise (speech recognition).
  - The solution changes over time (routing on a computer network).
  - The solution needs to be adapted to particular cases (user biometrics).
What We Talk About When We Talk About "Learning"

- Learning general models from data of particular examples.
- Data is cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce.
- Example in retail: from customer transactions to consumer behavior.
  - People who bought "Da Vinci Code" also bought "The Five People You Meet in Heaven" (www.amazon.com).
- Build a model that is a good and useful approximation to the data.
What is Machine Learning?

- Optimize a performance criterion using example data or past experience.
- Role of statistics: inference from a sample.
- Role of computer science: efficient algorithms to
  - solve the optimization problem;
  - represent and evaluate the model for inference.
Regression

- Example: price of a used car.
  - x: car attributes
  - y: price
- y = g(x | θ), where g() is the model and θ are its parameters.
- Linear model: y = wx + w0
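Fitting the linear model y = wx + w0 is a least-squares calculation; a sketch on made-up used-car data:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w*x + w0 to the data."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    w0 = my - w * mx
    return w, w0

# Hypothetical used-car data: x = age in years, y = price in $1000s.
ages = [1, 2, 3, 4, 5]
prices = [20, 18, 15, 13, 11]
w, w0 = fit_line(ages, prices)
print(round(w, 2), round(w0, 2))  # -2.3 22.3
```

The fitted parameters say the price drops by about $2,300 per year of age in this toy data.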
Supervised Learning

- Regression and classification are supervised learning problems.
- There is an input and an output.
- We need to learn the mapping from input to output.
- ML approach: assume a model defined up to a set of parameters:
  - y = g(x | θ)
- The machine learning program optimizes the parameters to minimize the error.
- A linear model might be too restrictive (large error).
- Use more complex models:
  - y = w2 x^2 + w1 x + w0
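The quadratic model has three parameters and can likewise be fitted by least squares, here via the normal equations; a self-contained sketch:

```python
def polyfit2(xs, ys):
    """Least-squares fit of y = w2*x^2 + w1*x + w0 by solving
    the 3x3 normal equations for the basis [x^2, x, 1]."""
    rows = [[x * x, x, 1.0] for x in xs]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, 3):
            f = A[r][c] / A[c][c]
            for k in range(c, 3):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    w = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        w[r] = (b[r] - sum(A[r][k] * w[k] for k in range(r + 1, 3))) / A[r][r]
    return w  # [w2, w1, w0]

# Toy data generated exactly by y = 2x^2 + 3x + 1; the fit recovers the parameters.
w2, w1, w0 = polyfit2([0, 1, 2, 3, 4], [1, 6, 15, 28, 45])
print(round(w2, 6), round(w1, 6), round(w0, 6))
```

Adding parameters reduces the fitting error on the training data, though (as later chapters discuss) an overly complex model can fit noise rather than the underlying process.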
Supervised Learning: Uses

- Prediction of future cases: use the rule to predict the output for future inputs.
- Knowledge extraction: the rule is easy to understand.
- Compression: the rule is simpler than the data it explains.
- Outlier detection: exceptions that are not covered by the rule, e.g., fraud.
Unsupervised Learning

- Learning "what normally happens".
- No output, only input.
- Statistics: density estimation.
- Clustering: grouping similar instances.
- Example applications:
  - Customer segmentation in CRM.
  - Image compression: color quantization.
  - Document clustering.
Reinforcement Learning

- Learning a policy: a sequence of outputs (actions).
  - A single action is not important in isolation.
  - An action is good if it is part of a good policy.
  - Learn from past good policies.
  - The reward is delayed.
- Example: game playing.
- Example: a robot in a maze.
  - Reach the goal state from an initial state.
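The maze example can be sketched with tabular Q-learning (a standard reinforcement learning algorithm, not specific to this text) on a toy 1-D corridor where the only reward arrives at the goal:

```python
import random

def q_learning_corridor(n=5, episodes=400, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning in a toy 1-D maze: states 0..n-1, start at 0,
    goal at n-1. The +1 reward is delayed until the goal is reached."""
    random.seed(0)
    q = [[0.0, 0.0] for _ in range(n)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != n - 1:
            # Epsilon-greedy action selection.
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = 0 if q[s][0] >= q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else min(n - 1, s + 1)
            r = 1.0 if s2 == n - 1 else 0.0
            # Q-learning update: bootstrap from the best action in the next state.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    # Greedy policy after learning: move right (action 1) in every state.
    return [0 if q[s][0] >= q[s][1] else 1 for s in range(n - 1)]

print(q_learning_corridor())  # [1, 1, 1, 1]
```

No single step is rewarded on its own; "move right" is learned to be good only because it is part of the policy that eventually reaches the goal, which is the delayed-reward point above.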