
An Unusual Approach to MACHINE LEARNING

Nipuna Samarasekara
Any guesses what this graph is?
What’s unusual about today’s session?
● Not going to teach ML
● Perhaps not even talk about ML until halfway.
● Please don’t panic, we will discuss ML today :)
ML Algorithms < = > Hammers
Learn Everything About Hammers
Hammer Expert: designs new hammers
Typical User: uses hammers to get their work done
Hammer Users
● Use hammers to build useful things
● In the process develop expertise on hammers
● Cannot foresee and prepare for all the challenges you’ll face
Let’s Try This Approach
● Let’s try to solve an interesting problem
● Apologies for using a problem that is not from the real world
  ○ We don’t have enough time in this session to accurately model a real-world problem and demonstrate the process
N Queens Problem - Queen’s legal moves
N Queens Problem - Queens attacking each other
N Queens Problem - Queens not attacking each other (rare in the real world :D)
N Queens Problem - Finally
Place N queens on an N by N board such that no two queens are attacking
each other. Example: On a 4x4 board 4 queens are placed like this.
How do we find a solution?
Let’s start placing queens on the board.
How do we find a solution? - cont
How do we find a solution? - cont
How do we find a solution? - Final Solution
For an 8x8 board
Let’s analyse our solution
● We are traversing a tree structure.
● If we design a naive algorithm
  ○ We’d be traversing this tree.
  ○ With some additional observations we can reduce the number of nodes we need to traverse in order to find a solution.
  ○ This solution is O(n!)
● O(n!) grows very fast with n.
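The tree traversal described above can be sketched as a row-by-row backtracking search. This is a minimal illustration, not necessarily the exact procedure shown on the slides; the function name `solve_n_queens` and the 1-based column output are assumptions made for this example.

```python
def solve_n_queens(n):
    """Return one solution as a list of 1-based columns (one per row), or None."""
    columns = []  # columns[r] is the 0-based column of the queen placed in row r

    def is_safe(row, col):
        # A new queen is safe if it shares no column or diagonal with earlier rows.
        for r, c in enumerate(columns):
            if c == col or abs(c - col) == row - r:
                return False
        return True

    def place(row):
        if row == n:          # every row has a queen: a leaf of the tree, a solution
            return True
        for col in range(n):  # each candidate column is a child node in the tree
            if is_safe(row, col):
                columns.append(col)
                if place(row + 1):
                    return True
                columns.pop()  # dead end: backtrack and try the next column
        return False

    return [c + 1 for c in columns] if place(0) else None


print(solve_n_queens(4))  # prints [2, 4, 1, 3]
```

Pruning unsafe placements early is one of the "additional observations" that cut down the number of nodes visited, but the worst case still grows factorially with n.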
Let’s analyse our solution - cont
● O(n!) is not fast. People have managed to traverse a very optimized version of the complete tree for n = 27 (last time I checked), but not beyond.
● So how do we solve this problem for larger n?
Machine Learning (Finally)
● Now that we are stuck, we will explore whether ML is helpful.
● As we discussed before, ML algorithms are tools, and we leverage them to solve actual problems.
● Today, we’ll see how Genetic Algorithms can help us solve the problem at hand.
Machine Learning is for Lazy People
Exact Algorithms
● Not customizable.
● Can be very difficult to come up with.
● Requires a deep understanding of the domain.
Machine Learning
● Slight variants can solve a variety of problems.
● Once implemented, easier to modify and tune for other tasks.
Selecting the right tool
● Selecting an algorithm to solve the problem is tricky. Knowing about a lot of algorithms can help. However, having used most of them is even better.
Reinforcement Learning
● Don’t need data to get started (in general)
● Maximizing a reward (measurable in some form)
Genetic Algorithms
● Tool of the day!
● Let’s take a quick look at how it works.
● Inspired by the process of evolution.
● Selection is based on a fitness function. Phenotypes with high fitness are more likely to survive.
● Results in phenotypes with high fitness.
Genetic Algorithms - Let’s mimic nature
● Modelling the N queens problem so that we can use ML algorithms to solve it
  ○ Queen configurations are our phenotypes (creatures).
Some Observations (Back to the N queens problem)
● Each row of the chessboard should not contain more than 1 queen (otherwise they will attack each other).
Modelling the problem
So each one of the N queens should live in a different row (the same is true for columns, but let’s not focus on that).
This allows us to represent an N-queen placement (or a queen configuration) using a sequence of N numbers.
The i-th element of the sequence represents the column number of the queen in the i-th row.
Modelling the problem
● The configuration shown on the right will be represented as the sequence 21446857.
● Yes, these configurations still have conflicts (queens attacking each other).
● And now, we no longer worry about chess boards. We think in terms of number sequences.
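To make the sequence representation concrete, here is a small sketch that turns a sequence back into a board drawing. The helper name `render_board` is hypothetical, and parsing the sequence digit by digit only works while n is at most 9.

```python
def render_board(sequence):
    """Draw a configuration; sequence[i] is the 1-based column of the queen in row i + 1."""
    n = len(sequence)
    rows = []
    for col in sequence:
        # One text row per queen: 'Q' at the queen's column, '.' everywhere else.
        rows.append(" ".join("Q" if c == col else "." for c in range(1, n + 1)))
    return "\n".join(rows)


# The example sequence from the slide, read one digit per row.
config = [int(digit) for digit in "21446857"]
print(render_board(config))
```

Rows 3 and 4 both hold the value 4, so the drawing shows two queens in the same column, which is one of the conflicts this configuration still has.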
Genetic Algorithms - Initialization
We need an initial population of creatures to work with. We generate random sequences of numbers (each between 1 and n) of length n (n = 8 to demonstrate the process).
● 12345678
● 14235345
● 11111111
● … 10,000 more randomly generated sequences.
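A minimal sketch of the initialization step, assuming each position is drawn uniformly and independently. The function names and the `rng` parameter are illustrative choices, not part of the original talk.

```python
import random


def random_configuration(n, rng=random):
    """One creature: n column numbers, each drawn uniformly from 1..n."""
    return [rng.randint(1, n) for _ in range(n)]


def initial_population(size, n, rng=random):
    """A list of `size` random creatures for an n x n board."""
    return [random_configuration(n, rng) for _ in range(size)]


population = initial_population(size=10_000, n=8)
print(len(population), population[0])
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing parameter choices.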
Genetic Algorithms - Selection
Now we want to select the strongest creatures (sequences) from the initial population. We need a fitness function to evaluate each one of them.
This is a critical piece of problem modelling. We need to define a fitness function such that when it’s optimized, the resultant configuration is a valid solution (in our case, a queen configuration with no conflicts).
We will choose a very simple, but powerful fitness function.
fitness(configuration) = Number of conflicts in the configuration
Conflict: A pair of queens attacking each other.
With this definition a lower value means a fitter configuration, and a valid solution has a fitness of 0.
Examples: a single pair of attacking queens is 1 conflict. When all 8 queens attack each other there are many conflicts (28 of them, one per pair).
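The fitness function can be sketched as a pairwise count. Because each row holds exactly one queen in this representation, only shared columns and shared diagonals need checking. The name `count_conflicts` is an assumption for this example.

```python
from itertools import combinations


def count_conflicts(configuration):
    """Number of queen pairs that share a column or a diagonal (lower is fitter)."""
    conflicts = 0
    # enumerate() supplies the row index; the stored value is the column.
    for (r1, c1), (r2, c2) in combinations(enumerate(configuration), 2):
        same_column = c1 == c2
        same_diagonal = abs(c1 - c2) == abs(r1 - r2)
        if same_column or same_diagonal:
            conflicts += 1
    return conflicts


print(count_conflicts([1, 1, 1, 1, 1, 1, 1, 1]))  # prints 28
print(count_conflicts([2, 4, 1, 3]))              # prints 0
```

The first call reproduces the 28 conflicts of the all-attacking case (8 queens give 8 * 7 / 2 = 28 pairs), and the second confirms that a valid 4x4 solution scores zero.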
Genetic Algorithms - Selection
We use the number of conflicts in each configuration to select a reduced set (1000 out of the 10,000 we generated) of configurations with the fewest conflicts.
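One simple way to do this selection is to sort by conflict count and keep the front of the list. This is a sketch under that assumption; the talk does not say how ties are broken, and the helper names are hypothetical.

```python
from itertools import combinations


def count_conflicts(configuration):
    """Pairs of queens sharing a column or diagonal (rows are distinct by construction)."""
    return sum(
        1
        for (r1, c1), (r2, c2) in combinations(enumerate(configuration), 2)
        if c1 == c2 or abs(c1 - c2) == r2 - r1
    )


def select_fittest(population, keep):
    """Keep the `keep` configurations with the fewest conflicts."""
    return sorted(population, key=count_conflicts)[:keep]


sample = [[1, 1, 1, 1], [2, 4, 1, 3], [1, 2, 3, 4]]
print(select_fittest(sample, keep=1))  # prints [[2, 4, 1, 3]]
```

With a population of 10,000 the call would be `select_fittest(population, keep=1000)`.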
Genetic Algorithms - Crossover
● We use the reduced set for the following.
● We stochastically choose 2 configurations from the set and use the pair to create a brand new configuration (next generation).
● We repeat this process 10,000 times to generate 10,000 configurations (next generation).
Crossover - Explained
● Randomly choose a length (say L).
● The first L elements of the new sequence will come from one parent and the rest of the elements will come from the other.
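A sketch of single-point crossover as described above. Two details are assumptions, since the slides do not fix them: L is drawn from 1 to n - 1 so both parents contribute, and parents are picked uniformly at random from the selected set.

```python
import random


def crossover(parent_a, parent_b, rng=random):
    """First L elements from parent_a, the remaining elements from parent_b."""
    n = len(parent_a)
    cut = rng.randint(1, n - 1)  # assumption: 1 <= L <= n - 1
    return parent_a[:cut] + parent_b[cut:]


def next_generation(selected, size, rng=random):
    """Create `size` children, each from two distinct randomly chosen parents."""
    children = []
    for _ in range(size):
        parent_a, parent_b = rng.sample(selected, 2)  # assumption: uniform choice
        children.append(crossover(parent_a, parent_b, rng))
    return children


child = crossover([1] * 8, [2] * 8)
print(child)  # some 1s followed by some 2s; the split point varies per run
```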
Genetic Algorithms - Evolution
● We repeat the entire process.
  ○ i.e. select the best 1000 of the current gen -> crossover to create the next gen (10,000) -> ….
  ○ With each iteration the average fitness of the population will improve. (In our case the number of conflicts will decrease.)
  ○ And eventually we’ll arrive at a configuration with zero conflicts (with some probability).
Wait “with some probability”?
● What if the process doesn’t arrive at a solution?
● This is where knowing about fine details can help you. If you don’t already know them, this is the time to read and learn more about the algorithm you are using.
Fine Details
● In genetic algorithms, an entire population can get stuck in a local minimum. (It’s fine if you don’t understand these details now. You eventually will.)
● We are essentially trying to find a minimum of a function in a high-dimensional space.
● Each creature (configuration) represents a point on that surface. And the location of the subsequent generation is bounded (as of now) solely by the locations of the creatures in the current generation.
Overcoming this problem - Mutations
● Remember crossover, where we merge two parents to create a new configuration.
● We randomly decide to introduce a mutation to the resultant configuration.
  ○ I.e. we pick one of the queens and move that queen to a random column within the same row.
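The mutation step could look like the following sketch. The mutation probability (`rate=0.1`) is a made-up default for illustration; the talk does not state a value.

```python
import random


def mutate(configuration, rate=0.1, rng=random):
    """With probability `rate`, move one randomly chosen queen to a random column."""
    mutated = list(configuration)  # copy, so the input configuration is left untouched
    if rng.random() < rate:
        row = rng.randrange(len(mutated))           # which queen (row) to move
        mutated[row] = rng.randint(1, len(mutated))  # new column within the same row
    return mutated


original = [2, 1, 4, 4, 6, 8, 5, 7]
print(mutate(original, rate=1.0))  # differs from `original` in at most one position
```

Note that the new column may happen to equal the old one, in which case the configuration is unchanged.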
Mutations
● Performing mutations on the configurations in the new generation will allow us to break out of local minima.
● And thus the algorithm will produce a solution with zero conflicts with a much higher probability.
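Putting selection, crossover and mutation together, the whole loop might be sketched as below. All parameter values (population size, number kept, mutation rate, generation limit) are illustrative assumptions and much smaller than the 10,000 / 1000 figures used in the talk, and the function name `evolve` is hypothetical. The run is not guaranteed to reach zero conflicts.

```python
import random
from itertools import combinations


def count_conflicts(config):
    """Pairs of queens sharing a column or diagonal; 0 means a valid solution."""
    return sum(
        1
        for (r1, c1), (r2, c2) in combinations(enumerate(config), 2)
        if c1 == c2 or abs(c1 - c2) == r2 - r1
    )


def evolve(n=8, population_size=500, keep=50, mutation_rate=0.2,
           max_generations=100, seed=None):
    """Return (best configuration found, its conflict count)."""
    rng = random.Random(seed)
    population = [[rng.randint(1, n) for _ in range(n)]
                  for _ in range(population_size)]
    best = min(population, key=count_conflicts)
    for _ in range(max_generations):
        # Selection: keep the configurations with the fewest conflicts.
        selected = sorted(population, key=count_conflicts)[:keep]
        if count_conflicts(selected[0]) < count_conflicts(best):
            best = selected[0]
        if count_conflicts(best) == 0:
            break
        # Crossover plus occasional mutation builds the next generation.
        population = []
        for _ in range(population_size):
            a, b = rng.sample(selected, 2)
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:
                child[rng.randrange(n)] = rng.randint(1, n)
            population.append(child)
    # Also consider the last generation produced before the limit was hit.
    best = min([best] + population, key=count_conflicts)
    return best, count_conflicts(best)


best, conflicts = evolve(seed=42)
print(best, conflicts)
```

Setting `mutation_rate=0.0` gives the crossover-only variant, which makes it easy to compare how often each version gets stuck.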
Things to note
● This approach is clearly better than the tree traversal approach.
● Tree traversal can yield a solution for n < 100 within a reasonable amount of time.
● The genetic algorithm we discussed above (with improved fitness functions and mutation strategy) can solve the problem for n ~ 10,000 within a reasonable amount of time.
Key Learnings
● Using ML to solve a problem
a. Model the actual problem such that we can use ML to tackle it.
b. Identify a small set of ML algorithms (to be used to solve the problem).
■ You should not try to use all the ML algorithms you are aware of.
■ This step is tricky. With experience you’ll be able to examine the structure of the problem closely and pick a handful of algorithms.
■ Also note that steps a. and b. are often intertwined. The way you formulate your problem should take certain characteristics of the ML algorithm you are planning to use into account.
c. Collect data. Refine, preprocess and extract features. (Usually very time consuming.) [In our example we employed reinforcement learning, so we didn’t really work on retrieving and processing data.] Once again, availability of data can impact your choices in b.
d. Employ ML algorithms to solve the problem. You’ll come across challenges. Read and learn about the algorithms you are using to improve and fine-tune them.
Key Learnings
There are additional steps. We need to test whether our ML solutions generalize well (e.g. to unseen data with different characteristics, or to large input sizes).
And if you want to productionize and deploy the solution, the inner workings of the ML models should be carefully examined to make sure that they produce reasonably good results for the entirety of the input space.
Where to start?
● Artificial neural networks
● Decision tree learning
● Support vector machines
● Regression analysis
● Bayesian networks
● Genetic algorithms
Some basic algorithms you can add to your repertoire.
In order to start, you don’t need to go through the details and become an expert.
Just go through the Wikipedia page of each one and understand an oversimplified version of what it does.
And then?
Start solving problems using ML.
Other tools
● Computer programming
  ○ Languages
    ■ Considerations
      ● Availability of frameworks and libraries
      ● Development speed
      ● GPU support
      ● Speed of execution
    ■ Widely used
      ● Python
      ● C++
      ● Java
      ● MATLAB
      ● R
I don’t have a powerful computer at home?
Don’t worry, there are online tools like Colab which allow you to experiment with ML (or anything) and give you reasonable computing power. There are probably other tools as well.
● Gives you access to GPUs (and even TPUs)
● Easy sharing (Python notebooks)
Style Transfer Demo
Other Examples
Thank You
Questions are welcome