Karen Haigh Steven Harp BBN Technologies Adventium Labs

Improving Self-Defense by
Learning from Limited Experience
Karen Haigh
BBN Technologies
Steven Harp
Adventium Labs
Haigh & Harp, Learning from Limited Experience
Overview
• Goal: Systems that autonomously improve
their defenses with experience.
• Several ways to do this...
• Examples discussed:
– Learning to recognize anomalies
– Self Immunizing against observed exploits
– Acquiring multistage attack concepts
– Learning effective responses
Learning in Cyber Security
• What is (machine) learning?
– Automatically using prior experience to improve
performance over time
• Problems addressable by learning?
– Detection: distinguish problem from non-problem
– Immunity:
• Good: “an exploit should succeed at most once”
• Better: “a vulnerability should be exploitable at most
once”
– Response: how best to actively counter an attack?
Long Term Goal: Cognitive Immunity
Opportunities & Techniques
• Passive observation
– Detecting attacks: Relatively well explored. Example: anomaly detection. Work remains to be done to detect attacks extended over multiple hosts and steps.
– Responding to attacks: Not well explored. CSISM innovation: situation-dependent utility of responses.
• Experiment (e.g. in sandbox / taster / laboratory)
– Detecting attacks: Cortex innovation: use experiments to generalize from instances of attacks to classes of attacks. CSISM innovation: use experiments to identify necessary & sufficient elements of multi-step attacks.
– Responding to attacks: Not well explored. CSISM innovation: variations on responses.
Modelling Defended Systems
• Expert Rules
• Offline Learning
• Online Learning
• Experimental Sandbox
Trade-offs (+ strength, - weakness):
– Expert Heuristics: + good data; - complex environment; - dynamic system
– Offline Training: + good data; + complex environment; - dynamic system
– Online Training: - unknown data; + complex environment; + dynamic system
– Experimental Sandbox: + good data (self-labeled); + complex environment; + dynamic system
Very hard for adversary to “train” the learner!
Complex Domain: Human Rules are Incomplete
• Quads 0 & 1 are slower than Quads 2 & 3.
• Complex domain: human calibration (incorrectly) claimed that Quad 1 was slowest, missing Quad 0.
[Figure: Registration Time by Quad]
DPASA (DARPA OASIS)
Complex Domain (2)
• caf_plan, chem_haz and maf_plan are slower than other clients.
• Complex domain: human calibration (incorrectly) claimed that caf_plan & maf_plan were slowest because of hand-typed password, missing chem_haz.
[Figure: Registration Time by Client Type]
DPASA (DARPA OASIS)
Learning for Calibration
• Calibrate the parameters of rules for normal operating
conditions
– Important first step because it learns how to respond to
normal conditions
– For example: learn timing parameters for rapid response
controller, e.g.
• Client Registration, PSQ server local probes, SELinux
enforcement, SELinux flapping, File integrity checks
– Need to handle multi-modal data:
CSISM / BBN
Results for all Registration times
These two “shoulder” points indicate upper and lower limits.
Beta=0.0005
As more observations are collected, the
estimates become more confident of the
range of expected values (i.e. tighter
estimates to observations)
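As a rough illustration of calibrating timing limits from observations, the sketch below derives lower and upper limits from empirical quantiles, which does not assume the data has a single mode. This is a swapped-in simple technique and not the algorithm used in the project; `timing_limits`, `is_anomalous`, the `tail` parameter and the sample times are all hypothetical.

```python
import math

def timing_limits(samples, tail=0.05):
    """Return (lower, upper) limits as empirical quantiles of observed times.

    `tail` is a hypothetical parameter: the fraction of observations allowed
    to fall outside each limit. Quantiles do not assume a single mode.
    """
    if not samples:
        raise ValueError("need at least one observation")
    ordered = sorted(samples)
    last = len(ordered) - 1
    lower = ordered[math.floor(tail * last)]
    upper = ordered[math.ceil((1.0 - tail) * last)]
    return lower, upper

def is_anomalous(value, limits):
    """Flag a timing that falls outside the calibrated limits."""
    lower, upper = limits
    return value < lower or value > upper

# Made-up registration times (seconds) with two modes, e.g. fast and slow clients.
times = [1.0, 1.1, 0.9, 1.2, 1.0, 5.0, 5.2, 4.9, 5.1, 5.0]
limits = timing_limits(times, tail=0.0)
```

A single pair of limits cannot flag a value that falls in the gap between two modes; handling that case would need one range per mode.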
CSISM / BBN
Algorithm of Last & Kandel, 2001
Generalization of Attack Signatures
Cortex Project
Generalization
• Goal: Learn a most general concept from instances
of attacks and block all similar attacks against the
vulnerability.
→ Dealing with zero-day attacks...
• Payload Analysis Challenges
– How to automatically recognize which element(s) of an
attack are essential?
– How to generalize them to their boundary conditions?
• avoid the fragility of simple pattern matching rules
• Approach: Experimentation
– Validation of attack concepts → 0 false positives
Cortex / Honeywell
Generalization by Experimentation
[Diagram labels: Model of normal traffic; Taste Tester; Experiment; Blocking Rules]
Experiment
1) Score suspicious elements
2) Replace with innocuous or generalized values
3) Validate in tester
Model contains axes of vulnerability
• Payload content
– Binary machine instructions
– Unusual payload (e.g. unix commands, registry keys, database administrative commands)
– Length (# bytes/terms)
• Resource consumption patterns
• Probing (e.g. password guessing)
• Session-wide (multiple queries)
Cortex / Honeywell
Cortex Demo Architecture and Use Cases
Use case: Normal Query
[Diagram. Labels: AMP; CSM; Mission Planning; Master DB; “Once per phase”; Query]
• Proxy (Dexter): block known bad queries; taste test; log results
• RTS: replicate; switch tasters; rebuild tasters; send to learning
• Replicator: replicate queries; switch tasters; create tasters; delete tasters; heartbeat status
• Learner: read training data; experiment; generate rules
• Tasters (three instances)
Cortex / Honeywell
Cortex Demo Architecture and Use Cases
Use case: Attack
[Diagram: the same components as on the previous slide (AMP, CSM, Master DB, Proxy (Dexter), RTS, Replicator, Learner, Tasters), annotated with two cases: “Attack gets through” and “Attack is blocked”.]
Cortex / Honeywell
Example Results: MySQL
Attacks and notes:
• String buffer overflow (password): correctly generalized single attack to number of valid bytes.
• Integer overflow: correctly generalized single attack to 0x7FFF max value.
• MySQL DoS attack: noted that hex bytes were suspicious, so generalized bytes and correctly blocked integer overflow!
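Generalizing a single overflow to its boundary can be illustrated with a binary search over a tester. The sketch below assumes monotonic behaviour (every length up to the boundary is safe, every longer one fails); `largest_safe_length`, the `crashes` oracle and the toy limit of 16 are hypothetical and do not describe how Cortex found its boundaries.

```python
def largest_safe_length(crashes, known_safe, known_bad):
    """Binary-search the largest length the tester accepts without failing.

    Assumes monotonic behaviour: every length up to the boundary is safe and
    every longer one fails. `crashes` is a hypothetical tester oracle.
    """
    if crashes(known_safe) or not crashes(known_bad):
        raise ValueError("need one safe and one failing length to start from")
    lo, hi = known_safe, known_bad  # invariant: lo is safe, hi fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(mid):
            hi = mid
        else:
            lo = mid
    return lo

toy_limit = 16  # made-up buffer size for the toy tester
boundary = largest_safe_length(lambda n: n > toy_limit, known_safe=1, known_bad=4096)
```

Each probe costs one tester run, so the search needs about log2(known_bad - known_safe) experiments rather than one per candidate length.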
Project was tested with a red-team model
Cortex / Honeywell
Identification of Multistage Attacks
CSISM Project
MultiStage Attacks: Challenges
• Detect and generalize multi-step attacks across time and space.
– Multistage attacks involve a sequence of actions that span multiple
hosts and take multiple steps to succeed.
• Challenges:
– Which observations are necessary & sufficient?
• Incidental observations that are either
– side effects of normal operations, or
– chaff explicitly added by an attacker to divert the defender.
• Concealment (e.g. to remove evidence)
• Probabilistic actions (e.g. to improve probability of attack success)
– What are the most reliable observations?
– What are the parameter boundaries?
• Approach: Experimentation
– Allows validation of pruning
CSISM / BBN
Architectural Schema
[Diagram legend: numbers (1–6) are observations; letters (A–D) are actions.]
• CSISM Sensors (ILC, IDS): observations ending in failure of protected system. Only some are essential.
• “Sandbox”: the Attack Theory Experimenter yields viable attack theories; the Defense Measures Experimenter yields viable defense strategies and detection rules.
CSISM / BBN
Multi-Stage Learner
• Do {
– Generate Theory according to heuristic (the hard part!)
• Complete set of theories is Permutations( Powerset( observations ))
– Test Theory
– Incrementally update controller rulebases
• } while Theories remain
• For only 10 observations, there are roughly 10,000,000 possible theories (not including variations on steps!)
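The size of the theory space can be checked directly. The sketch below takes one reading of Permutations( Powerset( observations )): every non-empty subset of the observations, in every order, is one theory. Under that assumption 10 observations give 9,864,100 theories, on the order of ten million. The function names are hypothetical.

```python
from itertools import permutations
from math import perm

def count_theories(n):
    """Number of non-empty ordered subsets of n observations."""
    return sum(perm(n, k) for k in range(1, n + 1))

def theories(observations):
    """Yield every candidate theory as a tuple, shortest first."""
    for k in range(1, len(observations) + 1):
        yield from permutations(observations, k)

total_for_ten = count_theories(10)
small_space = list(theories("AB"))
```

The count grows roughly as e·n!, which is why exhaustive testing is impractical and the generation heuristic matters.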
CSISM / BBN
Hypothesis Generation
• Query learner generates attack hypotheses
– in heuristic order to acquire the concept rapidly
• Candidate Heuristics
– Look for shorter attacks first (adjustable prior)
– Suspect order of steps has an influence
– Suspect steps to interact positively (for the attacker)
– Prefer hypotheses with less common / more
suspicious elements
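Two of the candidate heuristics (shorter attacks first, and preference for more suspicious elements) can be sketched as a priority ordering over hypotheses, each then tested in a sandbox. This is an illustration only: the scoring formula, `ranked_hypotheses`, `learn`, the `suspicion` scores and the `reproduces_attack` oracle are assumptions, and the exhaustive enumeration is only viable for a handful of observations.

```python
import heapq
from itertools import permutations

def ranked_hypotheses(observations, suspicion, length_penalty=1.0):
    """Yield hypotheses (ordered tuples of observations) in heuristic order.

    Score = length_penalty * len(h) - sum of suspicion scores; lower is
    tried first. Both the score and `suspicion` are illustrative assumptions.
    """
    heap = []
    for k in range(1, len(observations) + 1):
        for h in permutations(observations, k):
            score = length_penalty * len(h) - sum(suspicion.get(o, 0.0) for o in h)
            heapq.heappush(heap, (score, h))
    while heap:
        yield heapq.heappop(heap)[1]

def learn(observations, suspicion, reproduces_attack, budget=None):
    """Test hypotheses in heuristic order; return those the sandbox confirms."""
    confirmed = []
    for tried, h in enumerate(ranked_hypotheses(observations, suspicion)):
        if budget is not None and tried >= budget:
            break
        if reproduces_attack(h):  # hypothetical sandbox experiment
            confirmed.append(h)
    return confirmed

def toy_sandbox(h):
    # Toy ground truth: the attack works when A happens before C; B and X are chaff.
    return "A" in h and "C" in h and h.index("A") < h.index("C")

result = learn(["A", "B", "C", "X"], {"A": 0.5, "C": 0.5}, toy_sandbox, budget=3)
```

In this toy run the two-step theory (A, C) is confirmed on the third experiment, before any hypothesis containing chaff is tried.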
Project was tested with a red-team model
CSISM / BBN
Response Learning
CSISM Project
Situation-dependent Action Utilities
• Learn tradeoffs among potential responses; as context changes, the
appropriateness of responses changes
– Context includes descriptions of users, attack elements, system
performance, etc
– Benefit is effectiveness of defense action
– Cost includes effort to mount response and impact on availability
• Challenges:
– Measuring the effect of responses is hard:
• Complex domain → rarely identical situations → non-deterministic
actions/effects
• Approach: Experimentation
– System “snapshots” get close to identical conditions
CSISM / BBN
Response Learning: Results Pending
• Bias toward results that worked in similar situations in
the past
– Hybrid Reinforcement learning and Nearest-Neighbour
approaches
• Given a set of hypotheses about the locus of an attack
– Search for true locus:
• Hierarchical based on system architecture
• Bias by historical attack patterns
– Select response based on similarity match to prior attacks:
• Same response when quality was high
• Alternate response when quality was low
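The similarity-based selection rule can be sketched with a plain nearest-neighbour lookup over prior incidents. Only the nearest-neighbour part is shown; the reinforcement-learning part is omitted. The numeric context features, the `quality_threshold`, the response names and the history records are all made up for illustration.

```python
import math

def select_response(context, history, responses, quality_threshold=0.5):
    """Pick a response by similarity to the most similar prior situation.

    context   -- tuple of numeric features describing the current situation
    history   -- list of (context, response, quality) from earlier incidents
    responses -- all available responses, in default preference order
    """
    if not history:
        return responses[0]
    # Nearest neighbour by Euclidean distance over context features.
    _, past_response, quality = min(
        history, key=lambda rec: math.dist(context, rec[0])
    )
    if quality >= quality_threshold:
        return past_response  # same response when quality was high
    # Alternate response when quality was low.
    for candidate in responses:
        if candidate != past_response:
            return candidate
    return past_response

history = [
    ((0.9, 0.1), "restart_server", 0.9),  # worked well last time
    ((0.1, 0.8), "block_source", 0.2),    # worked poorly last time
]
responses = ["block_source", "restart_server", "quarantine_host"]
repeat = select_response((0.8, 0.2), history, responses)
switch = select_response((0.2, 0.9), history, responses)
```

Here `repeat` reuses the response that scored well in the closest prior situation, while `switch` moves to an alternative because the closest prior response scored poorly.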
→ Project will be tested by a red-team on 20 May 2008. Goal is to demonstrate
“better” responses over time.
CSISM / BBN
Conclusion
Learning Benefits
• Learning can improve the defensive posture
– better knowledge (about the attacks or attacker), better policies
• Learning can improve how the system responds to
symptoms
– better connection between response actions and their triggers
• Active Learning
– A mechanism for recognizing Zero-day attacks
– No false positives — only validated attacks are added
• Learning techniques are enablers for the next level of
enhancements in adaptive defense
Adaptation is the key to survival
From Proof-of-Concept to Production
• Generalization
– Demonstrated: able to generalize instances to classes.
– Future directions: more axes of vulnerability; more handling of joint probabilities; more domains; meta learning to induce new axes
• Multi-stage attack
– Demonstrated: able to identify chaff
– Future directions: probabilistic actions; concealment
• Responses
– Demonstrated: able to map context to response
– Future directions: model of normal; generalization; richer context, richer responses; automatic measurement of benefit; scalable “snapshots”
Backup
Multistage Attacks
• Detect and then generalize multi-step attacks across time and
space.
• Multistage attacks involve a sequence of actions that span
multiple hosts and take multiple steps to succeed.
– A sequence of actions with causal relationships.
– An action A must occur to set up the initial conditions for action B.
– Action B would have no effect without previously executing action A.
For example
1. gain ability to execute commands on Box1 as unprivileged user by
exploiting a buffer overflow in Service1
2. gain root shell by running an exploit of a race condition
3. disable protection mechanism, e.g. SELinux
4. replace dpasa jar with attacker jar code
5. run attacker code that sends bad refs to Box2, Box3, Box4.
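The causal structure of this example (each step sets up the initial conditions for the next) can be sketched by giving every step preconditions and effects, then checking whether a sequence is viable. The `Step` type, `first_blocked_step` and the condition names such as `user_shell` are hypothetical labels chosen for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    requires: frozenset  # conditions that must already hold
    provides: frozenset  # conditions established by this step

def first_blocked_step(steps, initial=frozenset()):
    """Return the name of the first step whose preconditions are unmet, or None."""
    state = set(initial)
    for step in steps:
        if not step.requires <= state:
            return step.name
        state |= step.provides
    return None

chain = [
    Step("overflow Service1", frozenset(), frozenset({"user_shell"})),
    Step("race-condition exploit", frozenset({"user_shell"}), frozenset({"root_shell"})),
    Step("disable SELinux", frozenset({"root_shell"}), frozenset({"selinux_off"})),
    Step("replace jar", frozenset({"root_shell", "selinux_off"}), frozenset({"attacker_jar"})),
    Step("run attacker code", frozenset({"attacker_jar"}), frozenset({"bad_refs_sent"})),
]
full_chain = first_blocked_step(chain)
without_first = first_blocked_step(chain[1:])
```

Dropping the first step leaves the second with no effect, which is the sense in which action B depends on action A.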
Attacks (MySQL DoS-1)
• mysql-com_table-dump-memory-corruption
– Malformed request leaves MySQL unstable
• Countermeasures:
– Block the malformed com_table_dump command using
learned pattern and proxy filter rules.
– Restart the server
– Block all requests from the offending sources
Attacks (MySQL DoS-2)
• mysql-password-handler-buffer-overflow
– Excessive password length can crash server
• Countermeasures:
– Block connections which proffer “abnormal”
passwords (learned response or statistical
anomaly).
– Restart the server.
– Block all requests from the offending sources.
Attacks (MySQL DoS-3)
• mysql-remote-fulltext-search-DoS
– Malformed request crashes server
• Countermeasures:
– Detect and block malformed queries
– Block all queries of this type (fulltext-search)
– Block all requests from the offending sources.
– Restart the server