1 - CSIE -NCKU

GPEP: Graphics Processing Enhanced Pattern Matching for High-Performance Deep Packet
Inspection
Author:
Lucas John Vespa, Ning Weng
Publisher:
2011 IEEE International Conferences on Internet of Things, and Cyber, Physical and Social
Computing (4th CPSCom)
Presenter:
Ye-Zhi Chen
Date:
2012/04/25
Introduction
GPEP uses an optimized version of our pattern matching algorithm,
P3FSM, which keeps operational complexity low while reducing
the memory requirement so that the state tables fit into the
small on-chip memories of a GPU.
P3FSM
1. DFA Optimization (split-DFA):
This optimization splits the DFA transitions into primary and
secondary blocks at the first level of the DFA.
All incoming transitions to the primary block are removed from the
DFA.
An example split-DFA is shown in Figure 1(b).
The two blocks are encoded into two separate memory tables. If a
transition is not present in the secondary block table, then the
primary block table acts as a default transition lookup for the
current input character.
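The two-table lookup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the table contents, the state names, and the choice to return to the start state when neither table has an entry are all assumptions made for the example.

```python
# Hypothetical tables for illustration; the states and transitions
# below are NOT taken from the slides.

# Primary block: default next state for each input character.
PRIMARY = {"H": "S1", "S": "S2"}

# Secondary block: explicit (current state, character) -> next state.
SECONDARY = {("S1", "E"): "S3", ("S2", "H"): "S4"}

START = "S0"  # assumed fallback when neither table has an entry


def next_state(state, ch):
    """Try the secondary block first, then fall back to the primary block."""
    hit = SECONDARY.get((state, ch))
    if hit is not None:
        return hit
    # No secondary entry: the primary table acts as the default lookup
    # for the current input character.
    return PRIMARY.get(ch, START)


def run(text, state=START):
    """Feed every character of text through the split tables."""
    for ch in text:
        state = next_state(state, ch)
    return state
```

Because the primary table is indexed by the character alone, its entries are stored once rather than once per state, which is where the memory saving comes from.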
[Figure: split-DFA divided into primary and secondary blocks]
2. Deriving State Codes:
(1) Group all states in the SDFA that have the same next state into a
group.
(2) Groups with the same character are combined into a cluster.
(3) The number of clusters is reduced by merging all clusters that
have no common states in the secondary block into one cluster.
(4) Encoding the groups:
a) Character Signature (cs): identifies the character required
for transitions to a state
b) State Signature (ss): identifies the next state
(5) State code: the state code for each state is obtained by concatenating
the group codes of the groups that the state is a member of.
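Step (1) can be sketched as below. The transition list is hypothetical, and the code assumes that every transition into a given next state uses the same character; it raises an error if that assumption does not hold.

```python
# Hypothetical secondary-block transitions: (state, character, next state).
TRANSITIONS = [
    ("S1", "E", "S3"),
    ("S1", "I", "S4"),
    ("S5", "I", "S4"),
    ("S5", "E", "S6"),
]


def group_by_next_state(transitions):
    """Step (1): states that share a next state form one group.

    Returns a dict: next state -> (character, list of member states).
    """
    groups = {}
    for state, ch, nxt in transitions:
        char, members = groups.setdefault(nxt, (ch, []))
        if char != ch:
            # Assumption of this sketch: one character per next state.
            raise ValueError(f"{nxt} is reached on both {char!r} and {ch!r}")
        members.append(state)
    return groups
```

Here S1 and S5 both move to S4 on the character I, so they end up in the same group, while S3 and S6 each get a single-member group.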
Example groups, written as group[member states][character]:
G1[S0][H]
G2[S0][S]
G3[S1][E]
G4[S1 S5][I]
G5[S2 S7 S9][H]
G6[S3 S8][R]
G7[S4][S]
G8[S5][E]
G9[S6][S]
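Steps (2) and (3) can be tried on the groups listed above. The parsing format is read off the listing itself. The first-fit merge order is an assumption of this sketch, and so is treating every listed state as relevant for the overlap test, so the resulting assignment of characters to clusters may differ from the one in the original figure.

```python
import re

LISTING = """\
G1[S0][H]
G2[S0][S]
G3[S1][E]
G4[S1 S5][I]
G5[S2 S7 S9][H]
G6[S3 S8][R]
G7[S4][S]
G8[S5][E]
G9[S6][S]
"""


def parse_groups(listing):
    """Parse lines such as 'G4[S1 S5][I]' into (name, states, character)."""
    groups = []
    for line in listing.splitlines():
        match = re.fullmatch(r"(G\d+)\[([^\]]*)\]\[(\w)\]", line.strip())
        name, states, char = match.groups()
        groups.append((name, set(states.split()), char))
    return groups


def clusters_by_char(groups):
    """Step (2): groups with the same character form one cluster."""
    clusters = {}
    for _name, states, char in groups:
        clusters.setdefault(char, set()).update(states)
    return clusters  # character -> union of member states


def merge_disjoint(clusters):
    """Step (3), first-fit: merge clusters that share no states."""
    merged = []  # list of (characters, states seen so far)
    for char, states in clusters.items():
        for chars, seen in merged:
            if seen.isdisjoint(states):
                chars.append(char)
                seen.update(states)
                break
        else:
            merged.append(([char], set(states)))
    return [chars for chars, _seen in merged]
```

With this merge order, the five per-character clusters collapse into two merged clusters, because the clusters for E and I both contain S1 and S5 and the clusters for H and S both contain S0.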
[Figure: the groups arranged into clusters C1 and C2 by character (H, S, E, R, I)]
Operating Tables:
(1) Character / Cluster Table (cc)
(2) Code Table (code)
Failure index
$S_{index} = Ch_{offset} + S_{sig}$
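The index formula can be illustrated with a small sketch. Only the formula itself comes from the slides. The table contents, and the reading that the character/cluster table supplies the offset while the current state supplies the signature, are assumptions.

```python
# Hypothetical table contents; only the index formula is from the slides.

# Character / cluster table: character -> offset into the code table.
CC_TABLE = {"H": 0, "E": 2}

# Code table: one entry per (offset + signature) position.
CODE_TABLE = ["code_a", "code_b", "code_c", "code_d"]


def code_index(ch, s_sig):
    """S_index = Ch_offset + S_sig."""
    return CC_TABLE[ch] + s_sig


def lookup(ch, s_sig):
    """Fetch the code-table entry selected by the character and signature."""
    return CODE_TABLE[code_index(ch, s_sig)]
```

Under this reading, each character owns a contiguous slice of the code table and the signature selects an entry inside that slice.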
Memory Efficient:
Equation 1: $STT = Q \cdot \lceil \log_2 Q \rceil \cdot 2^8$
Q is the total number of states of the DFA
Equation 2: $P3FSM = Q \cdot (L + \lceil \log_2 P \rceil)$
L is the length of the state code
P is the number of patterns to be detected
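Equation 1 and Equation 2 can be compared numerically with a short sketch. The function names are made up for this example, and the results are taken to be in bits on the assumption that each ceiling term is a field width. The sample values for Q, L and P are illustrative only.

```python
def ceil_log2(n):
    """Exact ceil(log2(n)) for n >= 1, using integer arithmetic only."""
    return (n - 1).bit_length()


def stt_bits(q):
    """Equation 1: Q states, 2^8 characters each, ceil(log2 Q) bits per entry."""
    return q * ceil_log2(q) * 2**8


def p3fsm_bits(q, code_len, patterns):
    """Equation 2: Q entries of L + ceil(log2 P) bits each."""
    return q * (code_len + ceil_log2(patterns))


# Illustrative values (assumed, not from the slides): Q = 10, L = 8, P = 4.
print(stt_bits(10))          # 10 * 4 * 256
print(p3fsm_bits(10, 8, 4))  # 10 * (8 + 2)
```

The factor of 2^8 in Equation 1 is what dominates: a full state transition table stores one entry per state per possible byte value, whereas Equation 2 grows only with the state code length and the number of patterns.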
GPEP ARCHITECTURE
Host:
• The host creates and optimizes the DFA.
• The host transfers the resulting tables to the memory of the GPU.
• The host also maintains the current packet buffer, which is mapped
to the global memory of the GPU.
Device:
The memory tables necessary for the P3FSM kernel operation are
stored in the local data store (LDS) of each compute unit and in the
private memory of each stream core.