Kazuhiko Kato
University of Tsukuba
Japan
Modeling
Approximates complex human or software behavior.
• Enables effective analysis of usage patterns.
Virtualization
Simulates real resources,
adding capabilities such as access control
and modifying some semantics.
Model
Program/User
Virtualization
Resource
Anomaly Detection Based on a
Feature Extraction Approach
M. Oka and K. Kato
(1) Misuse detection
Matched one is a misuse.
"Pattern matching"
Misuse pattern
DB
(2) Anomaly detection
Matched one is a normal one.
Non-matched one is an anomaly.
"Model matching"
Normal model
DB
Our view: an IDS can be regarded as a
"pattern recognition" problem.
I. Bottom-up approach
Normal input Learning
(histogram, n-grams)
Vector-space model
Classify
Tested input
II. Top-down approach
A specialist gives a structural model
(automaton, Bayesian/HMM network)
Learning Classify
Normal input
Structural model with parameters
Tested input
• Apply a feature extraction technique to construct a structural model automatically.
Can be regarded as a hybrid approach.
• Inspired by the Eigenface technique developed in the computer vision area.
Pioneering work in the 1990s by Matthew Turk and Alex Pentland.
“Eigenface approach is considered the first facial recognition technology that worked.”
(Wikipedia: Eigenface)
By applying the PCA technique, every face can be approximately represented by:
$a_i\,\alpha + b_i\,\beta + c_i\,\gamma + d_i\,\delta$
where α, β, γ, δ are eigenfaces.
Some eigenfaces from AT&T Laboratories Cambridge.
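The four-term expression can be read as a special case of the usual PCA approximation. As a sketch in standard notation (the mean face $\bar{x}$, the number of retained components $K$, and the symbols $e_k$ and $c_{ik}$ are assumptions that do not appear on the slide):

```latex
x_i \;\approx\; \bar{x} + \sum_{k=1}^{K} c_{ik}\, e_k,
\qquad
c_{ik} = e_k^{\top}\,(x_i - \bar{x})
```

With $K = 4$, $(e_1, \dots, e_4) = (\alpha, \beta, \gamma, \delta)$ and $(c_{i1}, \dots, c_{i4}) = (a_i, b_i, c_i, d_i)$, this matches the slide's expression up to the mean term, which the slide leaves out.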
Event sequence Embedded structural relations
Generalization
Instantiation
Co-occurrence between event pairs
Scope size = 3
[Figure: an event sequence, example co-occurrence counts for event pairs within the scope, and the resulting co-occurrence matrix]
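A minimal sketch of how such a matrix could be computed. The counting rule is an assumption, since the slide does not spell it out: entry (a, b) counts how often event b appears within `scope` positions after event a. The function name and the integer event encoding are hypothetical.

```python
def cooccurrence_matrix(events, num_events, scope):
    """Count ordered event pairs (a, b) where b follows a within `scope` positions.

    events: list of integer event IDs in the range 0 .. num_events - 1.
    The counting rule is an assumption, not taken from the slides.
    """
    matrix = [[0] * num_events for _ in range(num_events)]
    for i, a in enumerate(events):
        # Look at the next `scope` events after position i.
        for b in events[i + 1 : i + 1 + scope]:
            matrix[a][b] += 1
    return matrix


if __name__ == "__main__":
    for row in cooccurrence_matrix([0, 1, 2, 1], num_events=3, scope=2):
        print(row)
```

The matrix is asymmetric by construction, because the order of the two events in a pair is preserved.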
Event sequence Matrix Structural model
0 1 2 0
0 0 2 1
0 1 1 1
0 1 0 0
Interpreted as an adjacency matrix.
Its dimension is huge in general, but can be reduced by PCA.
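To illustrate the adjacency-matrix reading, the 4x4 matrix from this slide can be turned into a list of weighted directed edges. This is a sketch only: treating rows as source events and columns as destination events is an assumption, and `matrix_to_edges` is a hypothetical helper name.

```python
def matrix_to_edges(matrix):
    """Return (source, destination, weight) for every non-zero matrix entry.

    Rows are taken as source events and columns as destination events
    (an assumption; the slide does not state the orientation).
    """
    return [
        (i, j, weight)
        for i, row in enumerate(matrix)
        for j, weight in enumerate(row)
        if weight != 0
    ]


# The example matrix shown on the slide.
SLIDE_MATRIX = [
    [0, 1, 2, 0],
    [0, 0, 2, 1],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
]

if __name__ == "__main__":
    for edge in matrix_to_edges(SLIDE_MATRIX):
        print(edge)
```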
Model generation phase
Co-occurrence matrix & Vectorization
Principal Component Analysis
Top eigenvectors
Profiling phase
Co-occurrence matrix & Vectorization
Inner product
Feature vector
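The two phases can be sketched with NumPy as follows. This is an illustration under assumptions, not the authors' implementation: mean-centering before the inner product is assumed, PCA is computed through an SVD, and the function names are hypothetical.

```python
import numpy as np


def vectorize(matrix):
    """Flatten a co-occurrence matrix into a 1-D float vector."""
    return np.asarray(matrix, dtype=float).ravel()


def generate_model(training_matrices, n_components):
    """Model generation phase: PCA over vectorized co-occurrence matrices.

    Returns the mean vector and the top `n_components` eigenvectors (as rows).
    """
    data = np.stack([vectorize(m) for m in training_matrices])
    mean = data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]


def feature_vector(matrix, mean, eigvecs):
    """Profiling phase: inner products with the top eigenvectors."""
    return eigvecs @ (vectorize(matrix) - mean)
```

Each component of the feature vector is the inner product of the centered, vectorized matrix with one eigenvector, so the vector has as many entries as eigenvectors were kept.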
User profile
Tester profile
Feature vector
Structural model
Similarity testing
Threshold Normal
Reject feature vector
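The threshold decision can be illustrated with a simple stand-in. The slides do not say which similarity measure is used, so cosine similarity between two feature vectors is swapped in here purely as an assumption. The function names and the threshold value in the usage lines are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(user_profile, tested, threshold):
    """Accept the tested feature vector as normal if it is similar enough."""
    if cosine_similarity(user_profile, tested) >= threshold:
        return "normal"
    return "reject"


if __name__ == "__main__":
    print(classify([1.0, 0.2], [0.9, 0.3], threshold=0.9))
    print(classify([1.0, 0.2], [-0.5, 1.0], threshold=0.9))
```

Moving the threshold trades false positives against missed detections, which is what a ROC curve summarizes.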
[Figure: for each of E1, E2, E3, the matrix is combined with its feature value (f1, f2, f3) and then thresholded]
• Schonlau Data
Collected by Dr. Matt Schonlau http://www.schonlau.net/
- UNIX command log
- 70 users
- Randomly chosen 50 users
- 20 users were masqueraders
- 15,000 truncated commands for each user
- 100 commands as one block
[Figure: command logs of users 1 ... 50, each divided into a Learn part and a Test part]
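The block structure can be sketched as below. The slide gives the block size (100) and the per-user length (15,000) but not where the Learn/Test boundary falls, so `n_learn` is a parameter here. The value 5,000 in the usage lines is an assumption, and the function names are hypothetical.

```python
def split_blocks(commands, block_size=100):
    """Cut a command sequence into consecutive blocks of `block_size` commands."""
    return [
        commands[i : i + block_size]
        for i in range(0, len(commands), block_size)
    ]


def learn_test_split(commands, n_learn, block_size=100):
    """Use the first `n_learn` commands for learning, the rest for testing."""
    blocks = split_blocks(commands, block_size)
    k = n_learn // block_size
    return blocks[:k], blocks[k:]


if __name__ == "__main__":
    # Synthetic stand-in for one user's 15,000-command log.
    log = ["cmd%d" % (i % 7) for i in range(15000)]
    learn, test = learn_test_split(log, n_learn=5000)  # 5000 is an assumption
    print(len(learn), len(test))
```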
ROC Curve
[Figure: ROC plot with regions labeled Perfect, Better, Random guess, and Worse; horizontal axis: False positive rate]
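As a sketch of how the points of such a curve can be obtained, the function below sweeps a threshold over anomaly scores and records the false positive rate and true positive rate at each step. The score convention (higher means more anomalous) and the name `roc_points` are assumptions.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    scores: anomaly score per block, higher meaning more anomalous (assumed).
    labels: True for masquerader blocks, False for normal blocks.
    Requires at least one True and one False label.
    """
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        flagged = [label for s, label in zip(scores, labels) if s >= threshold]
        true_pos = sum(1 for label in flagged if label)
        false_pos = len(flagged) - true_pos
        points.append((false_pos / negatives, true_pos / positives))
    return points


if __name__ == "__main__":
    for point in roc_points([0.9, 0.8, 0.3, 0.1], [True, True, False, False]):
        print(point)
```

A perfect detector passes through the point (0, 1), while a random guess stays near the diagonal.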
[Figure: ROC plot; both axes range from 0 to 100; horizontal axis: False Positive Rate]
(Non-tuned prototype in MATLAB)
Offline phase (minutes)
Transform sequences to co-occurrence matrices: 26.77
Calculate N eigenvectors: 23.60
Obtain feature vectors: 6.76
Construct layered networks: 677.1
Generate lookup tables with subnetworks: 106.5
Total: 840.73
Online phase (seconds)
Transform a sequence to a co-occurrence matrix: 0.642
Obtain a feature vector: 0.162
Construct layered networks: 16.25
Compare networks with a lookup table: 5.08
Total: 22.134
In Japan (as well as in other countries :-), security is a nationwide problem.
Some years ago, attacks on servers were the most serious problem.
Recently, information leakage from end-user or client environments has become the most serious problem.
• Operational mistakes
Lost or stolen notebook PCs.
Lost USB memory devices.
Discarded computers and hard disks.
• P2P file-exchange systems.
Serious problem in Japan: Winny
Data-exposing viruses
• Servers are relatively easy to protect, since:
Limited number
Located in a closed space
Managed by specialists
• Clients are problematic:
Huge number
Not limited to a closed space
Often managed by end-users
• Applying software version updates and patches.
For bugs and software vulnerabilities
Applicable only to known and solved problems.
• Using anti-virus middleware
Announcement
On March 15, 2006, Mr. Abe, Japanese Chief Cabinet Secretary (the previous Prime Minister), announced:
“The most effective way to avoid information leakage is not to use Winny. I sincerely beg it to all of you.”
NISC (National Information Security Center) of the Cabinet Secretariat in Japan decided to develop a Secure Virtual Machine in spring 2006.
• The development was entrusted to a university-based team headed by myself.
Developers were gathered from Fujitsu, Hitachi, NEC, and NTT Data.
Applications
OS
Hardware
Applications
OS
SVM
Hardware
• University of Tsukuba
K. Kato, T. Shinagawa, Y. Shinjyo, H. Eiraku, K. Omote, S. Hasegawa, T. Horie, K. Tanimoto
• Supporting professors
Y. Oyama (UEC), K. Korai, S. Chiba (TITech), K. Kono (Keio), E. Kawai, Yagi, Seiki (NAIST), M. Hirano (TNCT)
• Support assured confidentiality
Enforced and transparent encryption of storage & network data
• Strict ID/Key management using IC cards
• Support commodity operating systems
Windows XP/Vista, Linux
• Practical use
Could be used in the Japanese government
Released as open source software
Supported by IT vendors in the future.
Internet LAN
VPN Server
PIN: ****
Card Reader
IC Card
• Insert IC card
User and role identification
Private keys
• Enter PIN
Boot OS with authorized security level
Connect VPN to authorized servers
• Remove IC card
Suspend/Shutdown OS
[Figure: SVM architecture]
Windows XP/Vista, Linux: Storage Drivers, NIC Drivers
SVM: Storage Data Encryption, ID and Key Management, Access Control, Network VPN Management
Hardware: Storage, Card Reader, NIC
• Hybrid device access
Non-storage/network access is almost pass-through
Graphics, sound, mouse, keyboard, power mgmt, etc.
Storage/network access is intercepted/virtualized by the VMM
HDD, USB memory, NIC, etc.
• Protection-based virtualization
“Device does access control and encryption”
Sensitive control I/O is checked and data I/O is encrypted
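The split between pass-through and intercepted accesses can be illustrated with a toy dispatcher. This is a sketch, not the SVM's actual code: the device names follow the examples on the slide, the `allowed_controls` policy set and the function names are hypothetical, and the XOR routine is only a placeholder for real encryption.

```python
PASS_THROUGH = {"graphics", "sound", "mouse", "keyboard", "power"}
INTERCEPTED = {"hdd", "usb_memory", "nic"}


def xor_cipher(data, key):
    """Toy placeholder for real encryption (NOT secure); applying it twice decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def handle_io(device, kind, payload, key, allowed_controls):
    """Dispatch one I/O request the way the slide describes.

    Non-storage/network devices pass through unchanged. For storage/network
    devices, control I/O is checked against a policy and data I/O is encrypted.
    """
    if device in PASS_THROUGH:
        return payload
    if device not in INTERCEPTED:
        raise ValueError("unknown device: %s" % device)
    if kind == "control":
        if payload not in allowed_controls:
            raise PermissionError("control command denied: %s" % payload)
        return payload
    if kind == "data":
        return xor_cipher(payload, key)
    raise ValueError("unknown I/O kind: %s" % kind)


if __name__ == "__main__":
    key = b"demo-key"
    print(handle_io("mouse", "data", b"click", key, set()))
    print(handle_io("hdd", "data", b"secret", key, set()))
    print(handle_io("hdd", "control", "READ", key, {"READ"}))
```

Because only the storage and network paths are handled, the guest's own drivers keep working for every other device.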
[Figure: device access paths]
Guest OS: Device Drivers
SVM: Passed-through Drivers (almost pass-through); Intercepted Drivers (Control I/O: Access Control; Data I/O: Encryption)
Hardware: Non-Storage/Network Devices; Storage/Network Devices
• Specially designed VMM for client security.
• Lightweight
Limited virtualization
Device drivers in a guest OS are reused.
Low overhead is expected.
Incremental development is possible.
e.g., increasing the number of virtualized devices.
• A VM core written from scratch already runs Windows XP/Vista and Linux.
• Source code status (lines)
VMM core - 19K
Network (IPsec) - 18K
NIC driver - working
IC card for Japanese government officials - 15K
IDE hard disk - 1.5K
--- Current total: 52K(+)
• The VM framework and network virtualization will be available as OSS soon.
• Currently supports only a single guest operating system.
• No modification to guest OSs
Cannot prevent virus infection.
Cannot prevent Winny installation/execution.
Can control network connections.
Can watch packet patterns.
• Verification of SVM.
SVM framework itself.
Its code size is limited.
• Device virtualization code.
(Semi-)automatic generation of device virtualization code from specification.
Simultaneous multiple guest OSs.
VM-level IDS
Organizational control of access policy
Sub organization A
Organization
Access control policy
Sub organization B Internet