
Machine Learning - Rational Forest

A Design-Driven Approach to A.I. Explainability
CUNY SPS MS Data Science
DATA 698 Prof Sabrina Khan
Rob Hodde
May 17, 2022
Introduction
* Leading A.I. Models are black boxes
* People are afraid of black boxes
* This is not good
And we’re worried plenty…
What is being done?
Introducing:
XAI: Explainable Artificial Intelligence
So What’s wrong with XAI?
It’s another black box!
We propose a transparent A.I. engine,
easy to explain,
yet powerful…
Rational Forest (RAF)
GAM (Generalized Additive Model) + RF (Random Forest)
Maps the entire training space as unique decision trees
RAF
Instead of randomly building many decision trees that disagree with each other…
and inviting them to vote democratically…
Why not use all the data,
Build every species of tree,
And use the best tree for the job?
We propose that a Rational Forest is better than a Random Forest
OK, PROVE IT!
Build a competitive prediction engine, explainable to the non-technical end user, that answers the following question:
“If I buy this stock today, will the price go up in the next week?”
Methodology: no free lunch
The “No Free Lunch” theorem states that no engine can work best on all problems.
The Rational Forest is designed to answer the research question.
It may not perform well on other questions.
FASTEN SEAT BELT
MAJOR JARGON FEST APPROACHING
Methodology: dev environment
Python: VS Code
.NET/SQL: VS
Methodology: data store
All data is stored in MS SQL Server.
Fast, reliable, powerful, integrated, scalable!
Methodology: data collection
Commercial Provider: First Rate Data
Methodology: data model
A tabular data model allows dynamic SQL generation.
When the table is updated, the code updates itself.
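A minimal sketch of what dynamic SQL generation from a tabular data model could look like, assuming the model holds one row per predictor with a name and a SQL expression. The row layout, the column names, and the `prices` table are hypothetical and are not taken from the project.

```python
# Hypothetical data-model rows: one row per predictor (names and expressions are illustrative).
PREDICTOR_MODEL = [
    {"name": "pct_change_5d", "expr": "(close_price - close_5d_ago) / close_5d_ago"},
    {"name": "range_pct", "expr": "(high_price - low_price) / close_price"},
]


def build_select(model_rows, source_table):
    """Generate a SELECT statement with one output column per data-model row."""
    columns = ",\n    ".join(f"{row['expr']} AS {row['name']}" for row in model_rows)
    return f"SELECT\n    symbol,\n    trade_date,\n    {columns}\nFROM {source_table};"


sql = build_select(PREDICTOR_MODEL, "prices")
print(sql)
```

Adding a row to the model adds a column to the generated query without touching the code, which is one way to read "the code updates itself".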
Methodology: response variable
Tesla
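The response variable follows from the research question. A minimal sketch, assuming daily closing prices and reading "the next week" as five trading days ahead; the horizon and the function name are assumptions, not the project's definition.

```python
def label_price_up(closes, horizon=5):
    """Return 1 if the close `horizon` days later is higher than that day's close, else 0.

    The last `horizon` days get None because their future price is not yet known.
    """
    labels = []
    for i, price in enumerate(closes):
        if i + horizon < len(closes):
            labels.append(1 if closes[i + horizon] > price else 0)
        else:
            labels.append(None)
    return labels


closes = [100.0, 101.0, 99.0, 102.0, 103.0, 104.0, 98.0]
labels = label_price_up(closes)
print(labels)  # [1, 0, None, None, None, None, None]
```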
Experimentation: predictors
Common measures of recent volatility and price movement
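The slide does not list the measures used. As an illustration only, two common ones are the standard deviation of recent daily returns (volatility) and the fractional price change over a recent window (price movement). The window length and function names below are assumptions.

```python
from statistics import pstdev


def daily_returns(closes):
    """Fractional change from each close to the next."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def recent_volatility(closes, window=5):
    """Population standard deviation of the last `window` daily returns."""
    return pstdev(daily_returns(closes)[-window:])


def recent_change(closes, window=5):
    """Fractional price change over the last `window` days (needs window + 1 closes)."""
    start = closes[-1 - window]
    return (closes[-1] - start) / start


closes = [100.0, 110.0, 121.0]
print(recent_volatility(closes, window=2), recent_change(closes, window=2))
```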
Experimentation: response curves
Experimentation: classifiers
One-hot encode predictors to vote Yay or Nay
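One way to picture the encoding: cut a continuous predictor into bins and emit one binary flag per bin, so that exactly one flag is 1. In this sketch a flag of 1 is read as a Yay vote from that binary predictor and 0 as Nay. The bin edges and names are hypothetical.

```python
def one_hot_bins(value, edges, name):
    """One-hot encode `value` into len(edges) + 1 bins; exactly one flag is set to 1."""
    # Count how many edges the value has reached; that count is the bin index.
    bin_index = sum(value >= edge for edge in edges)
    return {f"{name}_bin{i}": int(i == bin_index) for i in range(len(edges) + 1)}


flags = one_hot_bins(0.03, edges=[-0.02, 0.02], name="pct_change_5d")
print(flags)
```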
Experimentation: collinearity
After removing weaker predictors, we are ready to vote!
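The slide does not say how the weaker predictors were chosen. A common approach, sketched here as an assumption, is to rank predictors by some strength score and drop any predictor that is highly correlated with a stronger one already kept. The strength scores and the 0.9 threshold are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length columns (assumes neither is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_collinear(columns, strength, threshold=0.9):
    """Visit predictors strongest first; skip any too correlated with one already kept."""
    kept = []
    for name in sorted(columns, key=strength.get, reverse=True):
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


columns = {
    "a": [1, 0, 1, 0, 1, 1],
    "a_copy": [1, 0, 1, 0, 1, 1],  # duplicates "a", so it should be dropped
    "b": [0, 0, 1, 1, 0, 1],
}
strength = {"a": 0.70, "a_copy": 0.60, "b": 0.65}  # hypothetical scores
kept = drop_collinear(columns, strength)
print(kept)
```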
Experimentation: vote
Four “Yay” votes for TSLA on March 8 = 71% likely to profit
WIPE YOUR EYES
CHUG MOUNTAIN DEW
Experimentation: power of the vote
Wait…
How does it calculate the 71%?
Experimentation: RAF build
1: Start with predictor pairs:
Use the training data to calculate how strong they are together:
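A minimal sketch of scoring pairs, assuming "strength" means precision: of the training rows where both predictors vote Yay, the share that turned out profitable. That reading, the row layout, and the data are assumptions made for illustration.

```python
from itertools import combinations


def team_precision(rows, team):
    """Share of rows where every predictor in `team` votes 1 that were also profitable.

    Returns None when the team never votes Yay together.
    """
    hits = [row["up"] for row in rows if all(row[p] == 1 for p in team)]
    return sum(hits) / len(hits) if hits else None


# Hypothetical training rows: binary predictor votes plus the response `up`.
train = [
    {"a": 1, "b": 1, "c": 0, "up": 1},
    {"a": 1, "b": 1, "c": 1, "up": 1},
    {"a": 1, "b": 0, "c": 1, "up": 0},
    {"a": 1, "b": 1, "c": 1, "up": 0},
    {"a": 0, "b": 1, "c": 1, "up": 1},
]

pair_scores = {pair: team_precision(train, pair) for pair in combinations(["a", "b", "c"], 2)}
for pair, score in pair_scores.items():
    print(pair, round(score, 3))
```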
Experimentation: RAF build
2: Add another predictor:
Precision
If the new predictor makes the team stronger, keep it.
Otherwise, discard.
Keep adding predictors; up to twelve can play on a team.
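The keep-or-discard rule above can be sketched as a greedy loop. This assumes strength is measured as precision on the training data and that candidates are tried in a fixed order; both are assumptions. The cap of twelve comes from the slide.

```python
def team_precision(rows, team):
    """Share of rows where every predictor in `team` votes 1 that were also profitable."""
    hits = [row["up"] for row in rows if all(row[p] == 1 for p in team)]
    return sum(hits) / len(hits) if hits else None


def grow_team(rows, start_pair, candidates, max_size=12):
    """Greedily add predictors, keeping each one only if the team's precision improves."""
    team = list(start_pair)
    best = team_precision(rows, team)
    for predictor in candidates:
        if len(team) >= max_size:
            break  # up to twelve can play on a team
        if predictor in team:
            continue
        score = team_precision(rows, team + [predictor])
        if score is not None and best is not None and score > best:
            team.append(predictor)  # the team got stronger: keep it
            best = score
        # otherwise discard the candidate and move on
    return team, best


# Hypothetical training rows: binary predictor votes plus the response `up`.
train = [
    {"a": 1, "b": 1, "c": 1, "d": 0, "up": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1, "up": 1},
    {"a": 1, "b": 1, "c": 0, "d": 1, "up": 0},
    {"a": 1, "b": 1, "c": 1, "d": 1, "up": 0},
    {"a": 0, "b": 1, "c": 1, "d": 1, "up": 1},
]

team, precision = grow_team(train, ("a", "b"), ["c", "d"])
print(team, round(precision, 3))
```

In this toy data, adding `c` raises precision from 1/2 to 2/3 and is kept, while adding `d` would lower it back to 1/2 and is discarded. Under this reading, a figure such as the 71% on the vote slide would be the training precision of the team whose votes match the day in question.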
Experimentation: RAF build
At the end you get something like this:
Experimentation: Example 1
[Figure: panels labeled Stock and RAF]
Experimentation: Example 2
[Figure: panels labeled Stock and RAF]
Experimentation: accuracy
RAF classification
Scoring is based on test (holdout) data only.
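A minimal sketch of holdout scoring, assuming the most recent rows are held back as the test set and that accuracy is the share of correct Yay/Nay predictions. The slide does not say how the split was made, so the chronological split and the test size are assumptions.

```python
def holdout_split(rows, test_size):
    """Keep the last `test_size` rows as the test (holdout) set."""
    return rows[:-test_size], rows[-test_size:]


def accuracy(predictions, actuals):
    """Share of predictions that match the actual outcome."""
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)


rows = list(range(10))  # stand-in for ten dated observations
train_rows, test_rows = holdout_split(rows, test_size=2)

# Score only on the holdout rows; these predictions are made up for illustration.
score = accuracy(predictions=[1, 0, 1, 1], actuals=[1, 1, 1, 0])
print(len(train_rows), len(test_rows), score)
```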
Experimentation: accuracy comparisons
RAF
TPOT Rec 1 hour
TPOT Rec 12 hours
TPOT Rec 3 days
Experimentation: explainability
Specific Lift table:
More About Lift:
Like triage, the first intervention is the most important.
Additional countermeasures are necessary, but add less.
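The triage analogy can be illustrated with an incremental lift table: start from the base rate with no predictors, add the team's predictors one at a time, and record how much each one raises precision. This is a sketch of the general idea only. It does not attempt to reproduce the deck's Specific Lift or General Lift calculations, which are not defined in the text, and the data is hypothetical.

```python
def team_precision(rows, team):
    """Share of rows where every predictor in `team` votes 1 that were also profitable.

    With an empty team every row qualifies, so the result is the base rate.
    """
    hits = [row["up"] for row in rows if all(row[p] == 1 for p in team)]
    return sum(hits) / len(hits) if hits else None


def lift_table(rows, team):
    """List (predictor, precision so far, lift added) as each predictor joins the team.

    Assumes every prefix of the team votes Yay together at least once.
    """
    table = []
    previous = team_precision(rows, [])  # base rate
    for size in range(1, len(team) + 1):
        current = team_precision(rows, team[:size])
        table.append((team[size - 1], current, current - previous))
        previous = current
    return table


# Hypothetical training rows: base rate 0.5, "a" alone 0.8, "a" and "b" together 1.0.
train = [
    {"a": 1, "b": 1, "up": 1}, {"a": 1, "b": 1, "up": 1},
    {"a": 1, "b": 1, "up": 1}, {"a": 1, "b": 0, "up": 0},
    {"a": 1, "b": 1, "up": 1}, {"a": 0, "b": 1, "up": 1},
    {"a": 0, "b": 0, "up": 0}, {"a": 0, "b": 1, "up": 0},
    {"a": 0, "b": 0, "up": 0}, {"a": 0, "b": 0, "up": 0},
]

table = lift_table(train, ["a", "b"])
for name, precision, lift in table:
    print(name, round(precision, 2), round(lift, 2))
```

Here the first predictor adds about 0.3 over the base rate and the second adds about 0.2: still useful, but less than the first.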
Experimentation: explainability
General Lift
IS THIS THING EVER GOING TO END
TAKE DEEP BREATHS
Conclusion
1. RAF = Hybrid Ensemble Classifier
2. Competitive
3. Explainable
Next steps
1. Graded Voting
2. Scrambled Lift
3. Mo’ Models
ALL DONE !!!
THANK YOU !! 