Uploaded by Neha Gupta

Machine Learning Notes

• The goal of machine learning is to build computer systems that can adapt and learn from their experience.
• Hypothesis: concept (i.e., classification function) belonging to the Hypothesis Space of a learning
algorithm
• Hypothesis Space: The space of classifiers from which the learning algorithm selects a hypothesis.
• Inductive reasoning moves from specific instances to a generalized conclusion, while deductive
reasoning moves from generalized principles that are known to be true to a true and specific conclusion.
The accuracy of inductive reasoning is questionable.
• Inductive bias of a learning algorithm is the set of assumptions that the learner uses to predict outputs
given inputs that it has not encountered.
• Example of Machine Learning:
1) Learning to recognize spoken words (Lee, 1989; Waibel, 1989).
2) Learning to drive an autonomous vehicle (Pomerleau, 1989).
3) Learning to classify new astronomical structures (Fayyad et al., 1995).
4) Learning to play world-class backgammon (Tesauro 1992, 1995).
• Supervised learning – Uses specific examples to reach general conclusions or extract a general rule
1) Classification 2) Regression
• Unsupervised learning (Clustering) – Unsupervised identification of natural groups in data
• Reinforcement learning– Feedback (positive or negative reward) given at the end of a sequence of steps
Inductive reasoning and Deductive reasoning
• Deduction: reasoning from general premises, which are known or presumed to be known, to more specific, certain conclusions. Deductive conclusions can be proven to be valid if the premises are known to be true.
Example (deductive):
All men are mortal. (Premise)
John is a man. (Premise)
John is mortal. (Conclusion)
Example (deductive):
All politicians believe in the inclusive idea of their nations. (Premise)
X is a politician. (Premise)
X believes in the inclusive idea of his country. (Conclusion)
Example (deductive):
All basketball team members are tall. (Premise)
John and Tim are in the college basketball team. (Premise)
John and Tim are tall. (Conclusion)
• Induction: reasoning from specific cases to more general, but uncertain, conclusions. Inductive conclusions may be incorrect even if the premises are true.
Example (inductive):
John and Tim are in the college basketball team. (Premise)
John and Tim are tall. (Premise)
All basketball team members are tall. (Conclusion, which may be false)
Hypothesis
• Target function: In machine learning, we want to learn or approximate a particular function
that maps an input x to f(x). For example, suppose we want to distinguish spam from non-spam email.
The target function f(x) = y is the true function f that we want to model.
• Hypothesis: A hypothesis is a certain function that we believe (or hope) is similar to the true
function, the target function that we want to model. In the context of email spam classification, it
would be the rule we came up with that allows us to separate spam from non-spam emails.
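The distinction can be sketched in code. The snippet below is a minimal illustration, not a real spam filter: the rule inside `h` and the sample emails are invented for this example, and the unknown target function f is only visible through its labels.

```python
# A hypothesis h is our guess at the unknown target function f.
def h(email: str) -> bool:
    """Hypothetical hand-written rule: flag emails that mention prizes or money."""
    text = email.lower()
    return "winner" in text or "free money" in text

# Labeled examples (email, f(email)) drawn from the unknown target function f.
examples = [
    ("You are a WINNER, claim your prize", True),
    ("Meeting moved to 3pm", False),
    ("free money inside!!!", True),
    ("Lunch tomorrow?", False),
]

# Measure how well the hypothesis h approximates the target f on these examples.
accuracy = sum(h(x) == y for x, y in examples) / len(examples)
print(accuracy)
```

Learning amounts to searching for an h that agrees with f on the labeled examples and, we hope, on unseen emails as well.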
What is a Concept?
• A Concept is a subset of objects or events defined over a larger set. [Example: the concept of a bird is the subset of all
objects (i.e., the set of all things or all animals) that belong to the category of bird.]
• Alternatively, a concept is a Boolean-valued function defined over this larger set. [Example: a function defined over all
animals whose value is true for birds and false for every other animal.]
[Figure: Venn diagram of the set of Things, containing Animals (with Birds as a subset) and Cars.]
What is Concept-Learning?
• Given a set of examples labeled as members or non-members of a concept, concept-learning consists of automatically
inferring the general definition of this concept.
• In other words, concept-learning consists of approximating a Boolean-valued function from training examples of its input
and output.
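The two views of a concept (a subset vs. a Boolean-valued function) can be shown in a short sketch; the sets of things and birds below are illustrative placeholders.

```python
# A concept as a Boolean-valued function over a larger set:
# "things" is the larger set, and bird() is the concept.
BIRDS = {"sparrow", "penguin", "eagle"}

def bird(thing: str) -> bool:
    """Concept 'bird': true exactly for members of the concept's subset."""
    return thing in BIRDS

things = ["sparrow", "car", "penguin", "dog"]
members = [t for t in things if bird(t)]
print(members)  # the subset of 'things' covered by the concept
```

Concept-learning would mean inferring a function like `bird` from labeled examples rather than writing it by hand.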
Example of a Concept Learning task
• Concept: Good Days for Water Sports
• Attributes/Features:
• Sky (values: Sunny, Cloudy, Rainy)
• AirTemp (values: Warm, Cold)
• Humidity (values: Normal, High)
• Wind (values: Strong, Weak)
• Water (values: Warm, Cool)
• Forecast (values: Same, Change)
• Target attribute (values: Yes, No)
Example of a Training Point:
<Sunny, Warm, High, Strong, Warm, Same, Yes>
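One common hypothesis representation for a task like this is a conjunction of attribute constraints, where "?" means any value is acceptable. The sketch below assumes that representation; the particular hypothesis shown ("sunny and warm days") is invented for illustration.

```python
# Attributes in order: Sky, AirTemp, Humidity, Wind, Water, Forecast.

def matches(hypothesis, instance):
    """True if every constraint in the conjunctive hypothesis accepts the instance."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))

# The training point from above (its label is Yes).
x = ("Sunny", "Warm", "High", "Strong", "Warm", "Same")

# A hypothesis: "any sunny, warm day is a good day for water sports".
h = ("Sunny", "Warm", "?", "?", "?", "?")

print(matches(h, x))  # True: the hypothesis classifies this point as Yes
```

A hypothesis is consistent with a labeled example when `matches` agrees with the example's label.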
Concept Learning as Search
• Concept Learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation.
• Selecting a Hypothesis Representation is an important step since it restricts (or biases) the space that can be searched. [For example, the hypothesis "If the air temperature is cold or the humidity is high then it is a good day for water sports" cannot be expressed in our chosen representation.]
Sl. No   x1    x2     y
1        0.7   0.7    0
2        0.8   0.9    1
3        0.8   0.25   0
4        1.2   0.8    1
5        0.6   0.4    0
6        1.3   0.5    1
7        0.9   0.5    0
8        0.9   1.1    1
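The table can be checked in code. As an illustration of inductive bias, suppose we assume the concept is linearly separable and guess the rule "predict 1 when x1 + x2 > 1.5" (the threshold 1.5 is our assumption, not something stated in the notes); that single rule happens to fit all eight rows.

```python
# The eight training points from the table: (x1, x2, y).
data = [
    (0.7, 0.7, 0), (0.8, 0.9, 1), (0.8, 0.25, 0), (1.2, 0.8, 1),
    (0.6, 0.4, 0), (1.3, 0.5, 1), (0.9, 0.5, 0), (0.9, 1.1, 1),
]

# Inductive bias in action: we *assume* a linear rule; a learner with a
# different bias could pick a different hypothesis that also fits the data.
def h(x1, x2):
    return 1 if x1 + x2 > 1.5 else 0

consistent = all(h(x1, x2) == y for x1, x2, y in data)
print(consistent)  # True: the assumed rule agrees with every training point
```

Which of the many data-consistent hypotheses the learner prefers is exactly what its inductive bias determines.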
The inductive bias of a learning algorithm is the set of
assumptions that the learner uses to predict outputs
given inputs that it has not encountered.
Symbols used in propositional logic
– Connectives:
∧     and              Conjunction
∨     or               Disjunction
¬     not              Negation
→     implies          If ... then
↔     equivalent to
⊕     xor
F, T                   False, True
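The connectives map directly onto Python operators; a small sketch with one choice of truth values:

```python
# Propositional connectives expressed with Python's Boolean operators.
p, q = True, False

conj  = p and q        # conjunction (p AND q)
disj  = p or q         # disjunction (p OR q)
neg   = not p          # negation (NOT p)
impl  = (not p) or q   # implication: false only when p is true and q is false
equiv = p == q         # equivalence: true when both sides agree
xor   = p != q         # exclusive or: true when exactly one side is true

print(conj, disj, neg, impl, equiv, xor)
```

With p = True and q = False, only the disjunction and the exclusive or come out true.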
Decision Trees
• The most commonly used classification technique
• Uses supervised learning
• Easy for us to understand the learned results
• Can deal with missing values and irrelevant features
• Computationally cheap to use

1) A decision tree classifies instances by sorting them down from the root of the tree to some leaf node.
2) Each node specifies a test of some attribute of the instance.
3) Each branch from a node corresponds to one of the possible values of this attribute.
4) An instance is classified by starting at the root node of the tree, testing the attribute specified by this node, then
moving down the tree branch corresponding to the value of the attribute in the given example.
5) This process is repeated for the subtree rooted at the new node.
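The sorting-down procedure can be sketched with a tree of nested dicts; the tree below is the standard PlayTennis example, used here purely as sample data.

```python
# A tiny decision tree: each internal node names the attribute to test;
# its branches map attribute values to subtrees or to leaf labels.
tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(node, instance):
    """Sort an instance down from the root to a leaf (steps 1-5 above)."""
    while isinstance(node, dict):
        attribute = next(iter(node))                 # the test at this node
        node = node[attribute][instance[attribute]]  # follow the matching branch
    return node

x = {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "High", "Wind": "Strong"}
print(classify(tree, x))  # "No"
```

Note that the Temperature attribute is never consulted: the tree simply does not test it on this path.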
A Decision Tree for Play Tennis
(Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong)
YES = (Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)
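The disjunctive rule can be evaluated directly on the given instance; this is a one-off check, not a general classifier.

```python
# The instance (Outlook=Sunny, Temperature=Hot, Humidity=High, Wind=Strong).
x = {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "High", "Wind": "Strong"}

# YES = (Sunny AND Normal) OR (Overcast) OR (Rain AND Weak)
yes = (
    (x["Outlook"] == "Sunny" and x["Humidity"] == "Normal")
    or (x["Outlook"] == "Overcast")
    or (x["Outlook"] == "Rain" and x["Wind"] == "Weak")
)
print("Yes" if yes else "No")  # "No": no disjunct matches this instance
```

Each disjunct corresponds to one root-to-leaf path in the tree that ends in a Yes leaf, which is why the rule and the tree agree.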
Select the attribute which partitions the learning set into subsets that are as "pure" as possible.

Entropy(S) = − Σ_i p_i log₂ p_i   (with the convention 0 log 0 = 0)

Information gain of an attribute A relative to a collection of examples S:

Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) · Entropy(S_v)
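Entropy and information gain can be sketched in a few lines of Python; the four-example dataset at the bottom is invented to show a perfectly informative attribute.

```python
from math import log2

def entropy(labels):
    """Entropy(S) = -sum_i p_i * log2(p_i); only occurring values are
    counted, so p > 0 and the 0*log(0) = 0 convention never triggers."""
    n = len(labels)
    out = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        out -= p * log2(p)
    return out

def information_gain(labels, attribute_values):
    """Gain(S, A) = Entropy(S) - sum over v in Values(A) of |S_v|/|S| * Entropy(S_v)."""
    n = len(labels)
    remainder = 0.0
    for v in set(attribute_values):
        subset = [y for a, y in zip(attribute_values, labels) if a == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

labels = ["Yes", "Yes", "No", "No"]
attr   = ["Warm", "Warm", "Cold", "Cold"]
print(entropy(labels))                 # 1.0 bit for a 50/50 split
print(information_gain(labels, attr))  # 1.0: this attribute is perfectly informative
```

The attribute with the largest gain is the one chosen to split the current node.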
Repeat the steps for each non-terminal descendant node, using only the training examples associated with that node. Attributes that have been incorporated higher in the tree are excluded. This process continues until either of two conditions is met:
(1) every attribute has already been included along this path through the tree, or
(2) the training examples associated with this leaf node all have the same target attribute value.
• The depth of a node is the number of edges from the node to the tree's root node. A root node has a depth of 0.
• The height of a node is the number of edges on the longest path from the node to a leaf. A leaf node has a height of 0.
• The height of a tree is the height of its root node, or equivalently, the depth of its deepest node.
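These definitions translate directly into two recursive functions; the small tree below is invented sample data (root with children a and b, where a has leaves c and d).

```python
# A tree as nested tuples: (value, list_of_children).
leaf = lambda v: (v, [])
tree = ("root", [("a", [leaf("c"), leaf("d")]), leaf("b")])

def height(node):
    """Edges on the longest path from this node down to a leaf; leaves are 0."""
    _, children = node
    return 0 if not children else 1 + max(height(c) for c in children)

def depths(node, d=0):
    """Map each node's value to its depth (edges from the root); root is 0."""
    value, children = node
    out = {value: d}
    for c in children:
        out.update(depths(c, d + 1))
    return out

print(height(tree))  # 2: the longest root-to-leaf path is root -> a -> c
print(depths(tree))  # {'root': 0, 'a': 1, 'c': 2, 'd': 2, 'b': 1}
```

The height of the whole tree (2) equals the depth of its deepest nodes (c and d), matching the equivalence stated above.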