slides - DBPL 2013

advertisement
Introducing Access Control in
Webdamlog
Serge Abiteboul
INRIA Saclay & ENS Cachan
Joint work with Emilien Antoine, Gerome Miklau,
Julia Stoyanovich and Vera Zaychik Moffitt
Mai 30, 2012
ICDE 2012
•
•
•
•
•
The Web as a distributed knowledge base
Webdamlog: a rule-based language for the
Web
Access control in Webdamlog
The Webdamlog system
Conclusion
Abiteboul – DBPL 2013
2
•
A typical Web user’s data
What kinds of data?
-
all kinds
data: photos, music, movies, reports, email
metadata: photo taken by Alice in Paris on ...
ontologies: Alice’s ontology and mapping with other
ontologies
localization: Alice’s pictures are on Picasa, back-ups are at
INRIA
- security: Facebook credentials (Alice, 123456)
Social
- annotations: Alice likes Elvis’ website
data
- beliefs: Alice believes Elvis is alive
- external knowledge: Bob keeps copies of Alice’s pictures
Abiteboul – DBPL -2013time, provenance, ...
3
A typical Web user’s data
•
•
What kinds of data?
all kinds
Where is the data?
everywhere
-
laptop, desktop, smartphone, tablet, car computer
mail, address book, agenda
Facebook, LinkedIn, Picasa, YouTube, Tweeter
svn, Google docs
also access to data / information of family, friends,
companies associations
Abiteboul – DBPL 2013
4
A typical Web user’s data
•
•
•
What kinds of data?
all kinds
Where is the data?
everywhere
heterogeneous
What kind of organization?
-
terminology: different ontologies
systems: personal machines, social networks
distribution: different localization
security: different protocols
quality: incomplete / inconsistent information
Abiteboul – DBPL 2013
5
Example of processing
Alice and Bob are getting engaged. Their friends want to offer
them an album of photos where they are together
To make such a photo album
•
•
Find friends of Alice & Bob (say with Facebook)
for each friend, find where she keeps her photos (say,
Picassa)
•
•
•
find the means to access her photos possibly via
friends
find the photos that feature Bob and Alice together,
e.g., using tags or face recognition software
possibly ask someone to verify the results
Some reasoning is needed to execute these tasks automatically!
Abiteboul – DBPL 2013
6
A typical Web user
•
•
•
•
Overwhelmed by the mass of information
Cannot find the information needed
Is not aware of important events
Cannot manage/control how others
access and use his/her own data
Abiteboul – DBPL 2013
7
How can systems help?
•
We need to move from a Web of text to a
Web of knowledge
•
In the spirit of semantic Web
To better support user needs,
-
Systems need to analyze what is
happening and construct knowledge
Systems should exchange knowledge
Systems should reason and infer
knowledge
Abiteboul – DBPL 2013
8
YOU need help!
Thesis
All this forms a distributed knowledge base
with processing based on automated reasoning
Abiteboul – DBPL 2013
9
Our topic
•
Distributed reasoning
Exchanging facts and rules
Webdamlog
•
Access control
control
Abiteboul – DBPL 2013
with access
10
•
•
•
•
•
The Web as a distributed knowledge base
Webdamlog: a rule-based language for the
Web
Access control in Webdamlog
The Webdamlog system
Conclusion
Abiteboul – DBPL 2013
11
Webdamlog: a datalog-style
language
Datalog
A prehistoric language by Web time...
+ nice and compact syntax
+ well-studied with many extensions
+ recursion essential: network cycles
Webdamlog
Not as simple/beautiful & procedural
Needed for real Web applications!
Abiteboul – DBPL 2013
12
Webdamlog is not
datalog
Webdamlog: an extension of
datalog
Datalog program
fof(x,y) :- friend(x,y)
fof(x,y) :- friend(x,z), fof(z,y)
Extensional facts
(stored in the database)
friend(“peter”,”paul”) friend(“paul”, “mary”)
friend(“mary”,”sue”)
Intentional facts
(derived)
fof(“peter”,”paul”) fof(“peter”,”mary”) fof(“peter”, “sue”)
fof(“paul”, “mary”) fof(“paul”, “sue”)
fof(“mary”,”sue”)
Abiteboul – DBPL 2013
13
Webdamlog: an extension of
datalog
Extends datalog
• negation, updates, distribution, delegation, time
For a world that is
• distributed: autonomous and asynchronous peers
• dynamic: knowledge evolves; peers come and go
Influenced by
• Active XML (INRIA) - for distribution & intentional
data
• Dedalus (UC Berkeley) - for time & implementation
Abiteboul – DBPL 2013
14
Facts
Facts are of the form m@p(a1, ..., an), where
m is a relation name
& p is a peer name
a1, ..., an are data values (n is the arity of m@p)
the set of data values includes the relations and peer
names
Examples
friend@my-iphone(“peter”, “paul”)
fof@my-iphone(“adam”, “paul”)
Abiteboul – DBPL 2013
15
extensional
intentional
Examples of facts
data & metadata: pictures@alice-iphone(1771.jpg, “Paris”,
11/11/2011)
ontology: isA@yago.com("Elvis”, theKing)
annotations: tags@delicious.com(“wikipedia.org”, encyclopedia)
localization: where@alice(pictures, picasa/alice)
access rights: right@picasa(pictures, friends, read)
security: secret@picasa/alice; public@picasa/alice
Abiteboul – DBPL 2013
16
Rules
Rules are of the form
A term is a
variable or
a constant
$R@$P($U) :- (not) $R1@$P1($U1), ..., (not) $Rn@$Pn($Un)
where
$R, $Ri are relation terms
$P, $Pi are peer terms
$U, $Ui are tuples of terms
Safety condition
$R and $P must appear positively bound in the body
each variable in a negative literal must appear positively bound in the
body
Abiteboul – DBPL 2013
Examples coming up,
stay tuned
17
State transition
Choose some peer p randomly – asynchronously
Compute the transition of p
the database updates at p
the messages sent to other peers
the delegations of rules to other peers
Keep going forever
(I0, Γ0, ∅) (I1, Γ1, Γ1*) ... (In, Γn, Γn*) ...
Fair sequence: each peer is selected infinitely often
Abiteboul – DBPL 2013
18
The semantics of rules
Classification based on locality and nature of head
predicates (intentional or extensional)
•
Local rule at my-laptop: all predicates in the body
of the rules are from my-laptop
Local with local intentional head
classic datalog
Local with local extensional head
database update
Local with non-local extensional head messaging between peers
Local with non-local intentional head
view delegation
Non-local
general delegation
Abiteboul – DBPL 2013
19
Local rules with local intentional head
Example: Rule at peer my-laptop
friend is extensional, fof is intentional
fof@my-iphone($x, $y) :- friend@my-iphone($x,$y)
fof@my-iphone($x,$y) :- friend@my-iphone($x,$z), fof@myiphone($z,$y)
fof is the transitive closure of friend
Datalog = Webdamlog with only local rules and local intentional
head
Abiteboul – DBPL 2013
20
Local rules with local extensional
head
A new fact is inserted into the local database
believe@my-iphone(“Alice”, $loc) :tell@my-iphone($p,”Alice”, $loc),
friend@my-iphone($p)
Abiteboul – DBPL 2013
21
Local rules with non-local extensional
head
A new fact is sent to an external peer via a message
$message@$peer($name, “Happy birthday!”) :today@my-iphone($date),
birthday@my-iphone($name, $message, $peer, $date)
Extensional facts:
today@my-iphone(March 6)
birthday@my-iphone("Manon”, “sendmail”, “gmail.com”, March
sendmail@gmail.com("Manon”, “Happy birthday”)
Abiteboul – DBPL 2013
22
Local rules with non-local intentional
head
View delegation!
boyMeetsGirl@gossip-site($girl, $boy) :girls@my-iphone($girl, $loc),
boys@my-iphone($boy, $loc)
Semantics of boyMeetGirl@gossip-site is a join of relations girls
and boys from my-iphone
Formally, my-iphone delegates a rule boyMeetGirl@gossip-site(g,b)
for each g, b, l, girls@my-iphone(g,l), boys@my-iphone(b,l)
Abiteboul – DBPL 2013
23
Non-local rules: general delegation
(at my-iphone):
boyMeetsGirl@gossip-site($girl, $boy) :girls@my-iphone($girl, $loc),
boys@alice-iphone($boy, $loc)
Suppose that girls@my-iphone(“Alice”, “Julia's birthday”) holds.
Then my-iphone installs the following rule at alice-iphone
(at alice-iphone):
boyMeetsGirl@gossip-site(“Alice”, $boy) :boys@alice-iphone($boy, “Julia's
birthday”)
When girls@my-iphone(“Alice”, “Julia's birthday”) no longer holds,
my-iphone uninstalls the rule
Abiteboul – DBPL 2013
24
Complexity of delegation: illustration
fof(x,y) :- friend(x,y)
(at p) fof@p(x,y) :- peers@p($q), friend@$q(x,y)
If peers@p contains 100 000 tuples
peers@p(q1), ...., peers@p(q100 000)
This rule will install 100 000 rules!
for i=1 to 100 000 (at qi) fof@p(x,y) :friend@qi(x,y)
Data complexity transformed into program
complexity
Abiteboul – DBPL 2013
26
Summary of results [PODS 2011]
•
•
Formal definition of the semantics of Webdamlog
Results on expressivity
•
the model with delegation is more general, unless
all peers and programs are known in advance
Convergence is very hard to achieve
-
positive Webdamlog
strongly stratified programs with negation
Abiteboul – DBPL 2013
27
•
•
•
•
•
The Web as a distributed knowledge base
Webdamlog: a rule-based language for the
Web
Access control in Webdamlog
The Webdamlog system
Conclusion
Abiteboul – DBPL 2013
28
Requirements
Data access Users would like to control who can read
and modify their information
Data dissemination Users would like to control how their
data are transferred from one participant to another, and
how they are combined, with the owner of each piece of
data keeping some control over it
Application control Users would like to control which
applications can run on their behalf, and what
information these applications can access.
Abiteboul – DBPL 2013
29
The general picture
•
•
The privileges we consider: read, write, grant
For read:
•
•
Coarse grained access control: at the relation
level
Fine grain access control: at the tuple level
Abiteboul – DBPL 2013
30
Insertion in extentional relations
Definition of intensional relations
• Requires write privilege on the target relation
• [at Alice] alicePhotos@Bob($f) :person@Alice($p, “Friend”),
personInPhoto@Alice($pid, $p),
photo@Alice($pid,−, $f)
•
•
[at Alice] allPhotos@Alice($f) :
alicePhotos@Alice($f)
[at Bob] allPhotos@Alice($f) :- bobPhotos@Bob($f)
Abiteboul – DBPL 2013
31
Who can read a fact ? – default
•
•
Extensional relations: if you have read privilege to
the relation
Intensional relations:
if you have read privilege to the relation &
if you can read all the tuples that have been used to
create this fact – provenance of the fact
Abiteboul – DBPL 2013
32
Digression: provenance
•
Provenance of a tuple
•
•
How it was constructed: conjunction
Alternatives: disjunction
Abiteboul – DBPL 2013
33
Digression: provenance graph
(Also used for maintenance in case of update)
girls@p(Jane, Julia’s birthday) rule1 boys@p(John, Julia's birthday)
×
rule3
×
×
boyMeetsGirl@p(Jane, John)
Abiteboul – DBPL 2013
gossip@p(Jane, John)
34
Coarse grain access control
•
•
•
[at Alice] alicePhotos@Bob($f) :person@Alice($p, “Friend”),
personInPhoto@Alice($pid, $p),
photo@Alice($pid,−, $f)
alicePhotos@Bob is extensional
Whoever has read access to alicePhotos@Bob
sees all the relation
Abiteboul – DBPL 2013
35
Fine grain access control
•
•
•
•
•
[at Alice] allPhotos@Alice($f) :
alicePhotos@Alice($f)
[at Bob] allPhotos@Alice($f) :- bobPhotos@Bob($f)
allPhotos@Alice is intensional
Sue who has read privilege to allPhotos@Alice and
alicePhotos only, can see only the photos of Alice in
allPhotos
Lili who has read privilege to the three relations,
sees everything
Abiteboul – DBPL 2013
36
Overwriting the default for
intensional data
•
•
•
Let us change the rule to:
[at Alice] allPhotos@$x($f) :alicePhotos@Alice($f), friends@Alice($x)
Issue: you can read the photos only if you also have
read privilege to friends@Alice
Abiteboul – DBPL 2013
37
Overwriting the default for
intensional data
•
•
•
[at Alice] allPhotos@$x($f) :alicePhotos@Alice($f),
[hide friends@Alice($x)]
Hide: block the provenance from friends@Alice
Similar mechanism for extensional data – expose
Abiteboul – DBPL 2013
38
•
Issues with non local rules
[at Bob]
message@Sue(“I hate you”) :- date@Alice(d)
aliceSecret@Bob(x) :- date@Alice(d),
secret@Alice(x)
Ignoring access rights, by delegation, this results in
running
•
[at Alice]
message@Sue(“I hate you”) :- date@Alice(d)
aliceSecret@Bob(x) :- date@Alice(d),
secret@Alice(x)
Abiteboul – DBPL 2013
39
Default solution: sand box
We run the rule at Alice in a Sandbox
•
We use the access rights of Bob
So the second rule does not succeed in sending
secrets
•
The message specifies that this is done at Bob’s
request
So requires authentication/signatures
•
Alternative: delegation without sandbox. Possible if
the peer that asks for the delegation is given the
privilege to install rules at the other peer – Here if
Alice
gives
Bob
the
right
to
install
a
rule
in
her
Abiteboul – DBPL 40
2013
environment
Access control implementation
•
•
A program with access control is compiled locally in
a Webdamlog program without that is executed
Access control data is managed like any other data
Relation acl (defines relation access)
Relation kind
•
•
(ext or int)
Based on provenance implemented as a distributed
graph
On-going work on optimization
Abiteboul – DBPL 2013
41
•
•
•
•
The Web as a distributed knowledge base
Webdamlog: a rule-based language for the
Web
Access control in Webdamlog
The Webdamlog system
Abiteboul – DBPL 2013
42
The Webdamlog engine
Based on Bud
• developed at UC Berkeley
Manages knowledge
- Stores facts and rules
- exchanges knowledge with
other engines
-
performs reasoning
Abiteboul – DBPL 2013
43
The engine: beyond Bud
•
Compilation of
Webdamlog+AC
⇒
Webdamlog
⇒
Bloom
(Bud’s
language)
•
Main Webdamlog features not
supported by Bud
1. Variable relation and peer
names
2. Delegations with dynamic
changes of the program
Abiteboul – DBPL 2013
44
The Webdamlog peer
Support communication with other peers and with users
Support common security protocols
Support wrappers to external systems such as
Facebook
Provides Web interfaces
Abiteboul – DBPL 2013
45
Provenance graphs
•
•
•
•
•
Records the history of derivation
Provenance semiring semantics [Green et al. 07]
Used for performance optimization
Used for fine grain access control
Other possible uses such as explanation of results
Abiteboul – DBPL 2013
46
•
•
•
•
•
The Web as a distributed knowledge base
Webdamlog: a rule-based language for the
Web
Access control in Webdamlog
The Webdamlog system
Conclusion
Abiteboul – DBPL 2013
47
Thesis
Let us turn the Web into a distributed knowledge base
with billions of users
supported by billions of systems
analyzing information
extracting knowledge
exchanging knowledge
inferring knowledge
Abiteboul – DBPL 2013
48
Language
Webdamlog
• A language for distributed data management [PODS
2011]
• Datalog with distribution, updates, messaging
• Main novelty: delegation
Implementation
• WebdamExchange peer in Java [demo ICDE 2011]
• Webdamlog engine based on Bud [demo Sigmod
2013]
Access control:
Stoyanovich
on-going work with Miklau-
Probabilistic Webdamlog:
Abiteboul – DBPL 2013
on-going work with Deutch-
49
Grazie !
Cambridge University Press,
2012
http://webdam.inria.fr/Jorge
Download