Simulation

Web Data Management
Bisimulation
In this lecture
• Semistructured data model
• Graph Simulation and Bisimulation
• Computing (bi)simulation
Resources
Adding Structure to Unstructured Data, by Buneman, Davidson, Fernandez, Suciu, in ICDT 97
Data on the Web, by Abiteboul, Buneman, Suciu: Section 6.4
The Semistructured Data Model
[Figure: Object Exchange Model (OEM) graph. The root Bib (&o1, a complex object) has paper, paper and book edges to objects &o12, &o24 and &o29, which are linked to each other by references edges. Further edges (author, title, year, http, publisher, page, firstname, lastname, first, last) lead to atomic objects such as “Serge”, “Abiteboul”, “Victor”, “Vianu”, 1997, 122 and 133.]
Syntax for Semistructured Data
May omit oid’s:
{ paper: { author: “Abiteboul”,
           author: { firstname: “Victor”,
                     lastname: “Vianu” },
           title: “Regular path queries …”,
           page: { first: 122, last: 133 }
         }
}
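One way to hold such an expression in a program, as a minimal sketch. The encoding is an assumption made for illustration: a complex object is a list of (label, child) pairs, an atomic object is a plain Python value, and the helper `children` is hypothetical. A dict would not work directly, because the label author occurs twice.

```python
# Assumed encoding (not part of OEM itself):
#   complex object = list of (label, child) pairs
#   atomic object  = plain str or int
db = [
    ("paper", [
        ("author", "Abiteboul"),
        ("author", [("firstname", "Victor"), ("lastname", "Vianu")]),
        ("title", "Regular path queries …"),
        ("page", [("first", 122), ("last", 133)]),
    ]),
]

def children(obj, label):
    """Return the children of a complex object reached by edges named `label`."""
    return [child for (lab, child) in obj if lab == label]

paper = children(db, "paper")[0]
authors = children(paper, "author")   # both author edges are kept
```

Keeping the edges in a list preserves repeated labels; whether duplicates matter is exactly what set semantics decides.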
Set Semantics for Trees
Want to say that {a, a, b} = {a, b}
Define equality for trees first, then for graphs
Definition: Two trees t, t’ are equal, t = t’, if either:
1. They are both atomic values with the same value, or
2. t = {t1, ..., tm}, t’ = {t1’, ..., tn’} and:
– ∀ i=1,...,m, ∃ j=1,...,n s.t. ti = tj’
– ∀ j=1,...,n, ∃ i=1,...,m s.t. ti = tj’
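The definition translates almost literally into a recursive check. This is a sketch under an assumed encoding: an atomic value is a str or int, and a set of trees is a Python list (so duplicates can be written down). The function name `tree_equal` is illustrative.

```python
def tree_equal(t, u):
    """Set-semantics equality of trees.

    Atoms are equal when their values are equal (case 1); sets are equal
    when every member of one has an equal member in the other (case 2).
    """
    t_is_set, u_is_set = isinstance(t, list), isinstance(u, list)
    if not t_is_set and not u_is_set:
        return t == u                      # case 1: both atomic
    if t_is_set != u_is_set:
        return False                       # an atom never equals a set
    # case 2: for all i there is a j, and for all j there is an i
    return (all(any(tree_equal(ti, uj) for uj in u) for ti in t)
            and all(any(tree_equal(ti, uj) for ti in t) for uj in u))

same = tree_equal(["a", "a", "b"], ["a", "b"])    # duplicates are ignored
nested = tree_equal([["a", "a"], "b"], ["b", ["a"]])
different = tree_equal(["a"], ["a", "b"])
```

The recursion terminates only because trees have finite depth, which is why the definition does not carry over directly to graphs with cycles.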
Set Semantics: Example
[Figure: trees with edge labels a, b, c, d, e and leaf values 1, 2, 3, joined by an = sign, illustrating that duplicated subtrees do not affect equality.]
Set Semantics for Graphs
• Previous definition does not apply directly
to graphs with cycles
• Need to adapt it → bisimulation
• First, we will define a simulation
Graph Simulation
Definition: Let G1, G2 be two edge-labeled graphs.
A simulation is a relation R between nodes(G1) and nodes(G2) such that:
• if (x1, x2) ∈ R and (x1,a,y1) ∈ G1,
then there exists (x2,a,y2) ∈ G2 (same label)
s.t. (y1,y2) ∈ R
[Figure: an a-labeled edge x1 → y1 in G1 and an a-labeled edge x2 → y2 in G2; R relates x1 to x2 and y1 to y2.]
Note: if we insist that R be a function, we get a graph homomorphism
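The simulation condition can be checked mechanically for a given R. A minimal sketch, assuming a graph is a set of (source, label, target) triples and R is a set of node pairs; the node names and the function name `is_simulation` are illustrative.

```python
def is_simulation(R, G1, G2):
    """Return True if R (a set of node pairs) is a simulation from G1 to G2.

    G1 and G2 are sets of edges (source, label, target).
    """
    for (x1, x2) in R:
        for (s1, a, y1) in G1:
            if s1 != x1:
                continue
            # Need an a-labeled edge out of x2 whose target is R-related to y1.
            if not any(s2 == x2 and b == a and (y1, y2) in R
                       for (s2, b, y2) in G2):
                return False
    return True

G1 = {("x1", "a", "y1")}
G2 = {("x2", "a", "y2"), ("x2", "b", "z2")}
R = {("x1", "x2"), ("y1", "y2")}

print(is_simulation(R, G1, G2))               # True
print(is_simulation({("x1", "x2")}, G1, G2))  # False: (y1, y2) is missing
```

G2 may have extra edges (here the b-edge) that G1 never has to match, which is the sense in which simulation behaves like a one-directional containment.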
Graph Bisimulation
Definition: Let G1, G2 be two edge-labeled graphs.
A bisimulation is a relation R between nodes
s.t. both R and R⁻¹ are simulations
Set Semantics for Semistructured Data
Definition: Two rooted graphs G1, G2 are equal if
there exists a bisimulation R from G1 to G2
such that (root(G1), root(G2)) ∈ R
• Notation: G1 ≈ G2
• For trees, this is precisely our earlier definition
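Unlike the tree definition, this one also works on cycles. The sketch below checks a candidate relation on an assumed example: a single node with an a-labeled self-loop against a two-node a-cycle. Graphs are sets of (source, label, target) triples, and all names are illustrative.

```python
def is_simulation(R, G1, G2):
    """R is a simulation if every G1 edge out of x1 is matched from x2."""
    return all(
        any(s2 == x2 and b == a and (y1, y2) in R for (s2, b, y2) in G2)
        for (x1, x2) in R
        for (s1, a, y1) in G1 if s1 == x1
    )

def is_bisimulation(R, G1, G2):
    """Both R and its inverse must be simulations."""
    inverse = {(x2, x1) for (x1, x2) in R}
    return is_simulation(R, G1, G2) and is_simulation(inverse, G2, G1)

def witnesses_equality(R, G1, root1, G2, root2):
    """R proves the rooted graphs equal if it is a bisimulation relating the roots."""
    return (root1, root2) in R and is_bisimulation(R, G1, G2)

loop = {("u", "a", "u")}                             # one node, self-loop
cycle = {("v1", "a", "v2"), ("v2", "a", "v1")}       # two-node cycle
R = {("u", "v1"), ("u", "v2")}

print(witnesses_equality(R, loop, "u", cycle, "v1"))  # True
```

This only verifies a relation that is handed in; finding one is a separate step.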
Examples of Bisimilar Graphs
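[Figure: pairs of bisimilar graphs joined by an = sign: one pair over edge labels a, b, c, and one pair built only from a-edges, whose right-hand side continues indefinitely (...).]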
Examples of Non-Bisimilar Graphs
[Figure: two graphs G1 and G2 over edge labels a, b, c, with a relation drawn between their nodes.]
• This is a simulation but not a bisimulation
– Why?
• Notice: G1, G2 have the same sets of paths
Examples of Simulation
• Simulation acts like “subset”
– {a, b} is simulated by {a, b, c}
– {a, b:{c}} is simulated by {d, a:{e,f}, b:{c,g}}
[Figure: the trees above drawn as edge-labeled graphs.]
• Question: if DB1 is simulated by DB2 and DB2 is simulated by DB1, are DB1 and DB2 bisimilar?
Answer
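If DB1 is simulated by DB2 and DB2 is simulated by DB1, are DB1 and DB2 bisimilar?
No. Here is a counterexample:
[Figure: DB1 and DB2 drawn as small graphs with a- and b-edges.]
DB1 is simulated by DB2 and DB2 is simulated by DB1, but DB1 and DB2 are NOT bisimilar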
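The claim can be checked exhaustively on a small instance. The graphs below are an assumed reconstruction, not necessarily the ones drawn on the slide: DB1 = {a:{b}} and DB2 = {a:{}, a:{b}}. The sketch tries every relation that contains the root pair, which is only feasible for tiny graphs; all names are illustrative.

```python
from itertools import combinations, product

# Assumed counterexample: DB1 = {a: {b}}, DB2 = {a: {}, a: {b}}
DB1 = {("r1", "a", "n1"), ("n1", "b", "m1")}
DB2 = {("r2", "a", "p2"), ("r2", "a", "q2"), ("q2", "b", "s2")}

def nodes(G):
    return {n for (s, _, t) in G for n in (s, t)}

def is_simulation(R, G1, G2):
    return all(
        any(s2 == x2 and b == a and (y1, y2) in R for (s2, b, y2) in G2)
        for (x1, x2) in R
        for (s1, a, y1) in G1 if s1 == x1
    )

def relations_with(pair, N1, N2):
    """Yield every relation between N1 and N2 that contains `pair`."""
    others = [p for p in product(N1, N2) if p != pair]
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            yield {pair, *extra}

N1, N2 = nodes(DB1), nodes(DB2)
sim_12 = any(is_simulation(R, DB1, DB2)
             for R in relations_with(("r1", "r2"), N1, N2))
sim_21 = any(is_simulation(R, DB2, DB1)
             for R in relations_with(("r2", "r1"), N2, N1))
bisim = any(is_simulation(R, DB1, DB2)
            and is_simulation({(y, x) for (x, y) in R}, DB2, DB1)
            for R in relations_with(("r1", "r2"), N1, N2))

print(sim_12, sim_21, bisim)   # True True False
```

The two simulations use different relations: DB2's childless a-node may be mapped onto DB1's a-node, but no single relation works in both directions, because that a-node has a b-edge the childless node cannot match.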
Facts About a (Bi)Simulation
• The empty set is always a (bi)simulation
• If R, R’ are (bi)simulations, so is R ∪ R’
• Hence, there always exists a maximal
(bi)simulation:
– Checking if DB1=DB2: compute the maximal
bisimulation R, then test (root(DB1),root(DB2)) in R
Computing a (Bi)Simulation
• Computing the maximal (bi)simulation:
– Start with R = nodes(G1) x nodes(G2)
– While there exists (x1, x2) ∈ R that violates the
definition, remove (x1, x2) from R
• This runs in polynomial time! Better:
– O((m+n) log(m+n)) for bisimulation
– O(m n) for simulation
– Compare to finding a graph homomorphism:
NP-complete
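The refinement loop above can be sketched directly. This is the naive polynomial-time version, not the faster algorithms behind the stated bounds. Assumptions: a graph is a set of (source, label, target) triples, its nodes are those that appear in some edge, and all function names are illustrative.

```python
def nodes(G):
    return {n for (s, _, t) in G for n in (s, t)}

def violates(pair, R, G1, G2):
    """True if some G1 edge out of x1 has no matching G2 edge out of x2."""
    x1, x2 = pair
    return any(
        not any(s2 == x2 and b == a and (y1, y2) in R for (s2, b, y2) in G2)
        for (s1, a, y1) in G1 if s1 == x1
    )

def maximal_simulation(G1, G2):
    # Start with all pairs, then remove violating pairs until none is left.
    R = {(x1, x2) for x1 in nodes(G1) for x2 in nodes(G2)}
    changed = True
    while changed:
        changed = False
        for pair in list(R):
            if violates(pair, R, G1, G2):
                R.discard(pair)
                changed = True
    return R

def maximal_bisimulation(G1, G2):
    # Same loop, but a pair must satisfy the condition in both directions.
    R = {(x1, x2) for x1 in nodes(G1) for x2 in nodes(G2)}
    changed = True
    while changed:
        changed = False
        for (x1, x2) in list(R):
            inverse = {(q, p) for (p, q) in R}
            if (violates((x1, x2), R, G1, G2)
                    or violates((x2, x1), inverse, G2, G1)):
                R.discard((x1, x2))
                changed = True
    return R

loop = {("u", "a", "u")}
cycle = {("v1", "a", "v2"), ("v2", "a", "v1")}
edge = {("p", "a", "q")}

print(("u", "v1") in maximal_bisimulation(loop, cycle))  # True
print(maximal_simulation(loop, edge))                    # set()
```

Removing one pair can make other pairs violate the definition, so the loop repeats until a full pass removes nothing.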