CS 290C: Formal Models for Web Software Lecture 9: Analyzer and SMT-Solvers

CS 290C: Formal Models for Web Software
Lecture 9: Analyzing Data Models Using Alloy Analyzer and SMT-Solvers
Instructor: Tevfik Bultan
Three-Tier Architecture
Browser
Web Server
Backend
Database
Three-Tier Arch. + MVC Pattern
• MVC pattern has become the standard way to structure web applications:
– Ruby on Rails
– Zend for PHP
– CakePHP
– Struts for Java
– Django for Python
– …
[Diagram: Browser; Web Server containing Controller, Views, and Model; Backend Database]
Benefits of the MVC-Architecture
• Benefits of the MVC architecture:
• Separation of concerns
• Modularity
• Abstraction
• These are the basic principles of software design
• Can we exploit these principles for analysis?
A Data Model Verification Approach
[Diagram: Application (Ruby on Rails), structured by MVC Design Principles -> Data Model (ActiveRecords) -> Automatic Extraction -> Formal Model (Alloy) -> add data model properties -> Bounded Verification (Alloy Analyzer)]
Rails Data Models
• Data model verification: Analyzing the
associations/relations between data objects
• Specified in Rails using association declarations inside the
ActiveRecord files
– The basic relation types
• One-to-one
• One-to-many
• Many-to-many
– Extensions to the basic relations using Options
• :through, :conditions, :polymorphic, :dependent
The Three Basic Relations in Rails
• One-to-One (One-to-ZeroOrOne)
  class User < ActiveRecord::Base
    has_one :account
  end
  class Account < ActiveRecord::Base
    belongs_to :user
  end
  [Diagram: User 1 -- 0..1 Account]
• One-to-Many
  class User < ActiveRecord::Base
    has_many :projects
  end
  class Project < ActiveRecord::Base
    belongs_to :user
  end
  [Diagram: User 1 -- * Project]
The Three Basic Relations in Rails
• Many-to-Many
  class Author < ActiveRecord::Base
    has_and_belongs_to_many :books
  end
  class Book < ActiveRecord::Base
    has_and_belongs_to_many :authors
  end
  [Diagram: Author * -- * Book]
Options to Extend the Basic Relations
• :through Option
– To express transitive relations, or
– To express a many-to-many relation using a join model
as opposed to a join table
• :conditions Option
– To relate a subset of objects to another class
• :polymorphic Option
– To express polymorphic relations
• :dependent Option
– On delete, this option expresses whether to delete the
associated objects or not
The :through Option
class Book < ActiveRecord::Base
  has_many :editions
  belongs_to :author
end
class Author < ActiveRecord::Base
  has_many :books
  has_many :editions, :through => :books
end
class Edition < ActiveRecord::Base
  belongs_to :book
end
[Diagram: Author 1 -- * Book; Book 1 -- * Edition; Author 1 -- * Edition]
The :conditions Option
class Account < ActiveRecord::Base
  has_one :address,
          :conditions => "address_type='Billing'"
end
class Address < ActiveRecord::Base
  belongs_to :account
end
[Diagram: Account -- Address, restricted to address_type='Billing']
The :polymorphic Option
class Address < ActiveRecord::Base
  belongs_to :addressable, :polymorphic => true
end
class Account < ActiveRecord::Base
  has_one :address, :as => :addressable
end
class Contact < ActiveRecord::Base
  has_one :address, :as => :addressable
end
[Diagram: Account -- Address; Contact -- Address]
The :dependent Option
class User < ActiveRecord::Base
  has_many :contacts, :dependent => :destroy
end
class Contact < ActiveRecord::Base
  belongs_to :user
  has_one :address, :dependent => :destroy
end
[Diagram: User 1 -- * Contact; Contact 1 -- 0..1 Address]
• :delete directly deletes the associated objects without looking at their dependencies
• :destroy first checks whether the associated objects themselves have associations with the :dependent option set
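The difference between the two options can be sketched with a small in-memory simulation. This is plain Ruby, not the ActiveRecord API: the names destroy_object and run_example, the hash encoding of associations, and the object ids are all hypothetical choices made for illustration.

```ruby
# Hypothetical encoding (not Rails):
#   assocs: { class => { assoc_name => [target_class, dependent_option] } }
#   db:     { class => [ids] }
#   links:  { [class, assoc_name] => [[owner_id, target_id], ...] }
def destroy_object(assocs, db, links, klass, id)
  db[klass].delete(id)
  assocs[klass].each do |assoc, (target, dependent)|
    children = links.fetch([klass, assoc], [])
                    .select { |owner, _| owner == id }
                    .map(&:last)
    case dependent
    when :destroy
      # Recurse, so the child's own :dependent options are honored.
      children.each { |child| destroy_object(assocs, db, links, target, child) }
    when :delete
      # Remove the child only; its own associations are not inspected.
      children.each { |child| db[target].delete(child) }
    end
  end
  db
end

# User has_many contacts (option under test); Contact has_one address (:destroy).
def run_example(contact_option)
  assocs = {
    user:    { contacts: [:contact, contact_option] },
    contact: { address:  [:address, :destroy] },
    address: {}
  }
  db    = { user: [1], contact: [10, 11], address: [100] }
  links = { [:user, :contacts]   => [[1, 10], [1, 11]],
            [:contact, :address] => [[10, 100]] }
  destroy_object(assocs, db, links, :user, 1)
end
```

Under these assumptions, run_example(:destroy) empties all three classes, while run_example(:delete) removes the contacts but leaves address 100 behind.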
Formalizing Rails Semantics
Formal data model: M = <S, C, D>
• S: The sets and relations of the data model (data model
schema)
– e.g. {Account, Address, Project, User} and the relations
between them
• C: Constraints on the relations
– Cardinality constraints, transitive relations, conditional
relations, polymorphic relations
• D: Dependency constraints express conditions on two consecutive instances of a relation such that deletion of an object from the first instance leads to the other instance
Formalizing Rails Semantics
• Data model instance: I = <O,R> where O = {o1, o2, ..., on} is a set of object classes and R = {r1, r2, ..., rm} is a set of object relations, and for each ri ∈ R there exist oj, ok ∈ O such that ri ⊆ oj × ok
• I = <O,R> is an instance of the data model M = <S,C,D>,
denoted by I |= M,
if and only if
1. the sets in O and the relations in R follow the schema
S, and
2. R |= C
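The two conditions above can be checked mechanically on a finite instance. The following is a minimal sketch in plain Ruby, assuming a hypothetical hash encoding of instances and a single kind of constraint (one-to-many cardinality); the method names and the encoding are illustrative and not part of the formal model.

```ruby
# Hypothetical encoding:
#   schema:   { relation_name => [source_class, target_class] }
#   instance: { objects: { class => [ids] },
#               relations: { relation_name => [[src_id, tgt_id], ...] } }

# Condition 1: every tuple of every relation draws its ends from the
# object sets named in the schema.
def follows_schema?(instance, schema)
  schema.all? do |name, (src, tgt)|
    instance[:relations].fetch(name, []).all? do |a, b|
      instance[:objects][src].include?(a) && instance[:objects][tgt].include?(b)
    end
  end
end

# One cardinality constraint from C: in a one-to-many relation every
# target object is related to exactly one source object.
def one_to_many?(instance, schema, name)
  _src, tgt = schema[name]
  pairs = instance[:relations].fetch(name, [])
  instance[:objects][tgt].all? { |b| pairs.count { |_, y| y == b } == 1 }
end

# I |= M for this restricted setting: schema conformance plus constraints.
def instance_of_model?(instance, schema, one_to_many_names)
  follows_schema?(instance, schema) &&
    one_to_many_names.all? { |name| one_to_many?(instance, schema, name) }
end
```

For example, with users {1, 2}, projects {10, 11}, and the relation {(1, 10), (1, 11)}, the check succeeds; dropping the tuple (1, 11) leaves project 11 without a user and the check fails.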
Formalizing Rails Semantics
• Given a pair of data model instances I = <O,R> and I’ =
<O’,R’>, (I, I’) is a behavior of the data model M = <S,C,D>,
denoted by (I, I’) |= M,
if and only if
1. O and R and O’ and R’ follow the schema S
2. R |= C and R’ |= C, and
3. (R,R’) |= D
Data Model Properties
Given a data model M = <S,C,D>, we define four types of
properties:
1. state assertions (AS): properties that we expect to hold for
each instance of the data model
2. behavior assertions (AB): properties that we expect to hold
for each pair of instances that form a behavior of the data
model
3. state predicates (PS): predicates we expect to hold in some
instance of the data model
4. behavior predicates (PB): predicates we expect to hold in
some pair of instances that form a behavior of the data
model
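The four property types differ only in the quantifier (all versus some) and in what is quantified over (instances versus behaviors). A minimal sketch, assuming the instances and behaviors are given as finite Ruby arrays; the method names are hypothetical.

```ruby
# instances: Array of data model instances (any Ruby objects).
# behaviors: Array of [pre, post] instance pairs.
# Each method takes the property as a block.

# State assertion: holds in every instance.
def state_assertion?(instances)
  instances.all? { |i| yield i }
end

# Behavior assertion: holds in every behavior.
def behavior_assertion?(behaviors)
  behaviors.all? { |pre, post| yield pre, post }
end

# State predicate: holds in some instance.
def state_predicate?(instances)
  instances.any? { |i| yield i }
end

# Behavior predicate: holds in some behavior.
def behavior_predicate?(behaviors)
  behaviors.any? { |pre, post| yield pre, post }
end
```

In the verification setting the collections are not listed explicitly; a solver searches them symbolically, but the quantifier structure is the same.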
Data Model Verification
• The data model verification problem: Given a data model
property, determine if the data model satisfies the property.
• An enumerative (i.e., explicit state) search technique is not likely to be efficient for bounded verification
• We can use SAT-based bounded verification!
– Main idea: translate the verification query to a Boolean
SAT instance and then use a SAT solver to search the
state space
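To make the bounded query concrete, the sketch below answers it by explicit enumeration for a tiny scope. This is the inefficient enumerative approach, shown only because it is easy to read; a SAT-based tool answers the same question symbolically. The three relations (Note to User, Note to Project, Project to User) and the property are assumed for illustration, and every name in the code is hypothetical.

```ruby
# Enumerate every total function from `domain` to `codomain` as a Hash.
def functions(domain, codomain)
  return [{}] if domain.empty?
  first, *rest = domain
  functions(rest, codomain).flat_map do |f|
    codomain.map { |c| f.merge(first => c) }
  end
end

# Search all instances with between 1 and `bound` objects per class for a
# counterexample to the property: note.user == note.project.user.
# Returns nil if the property holds within the bound.
def find_counterexample(bound)
  sizes = (1..bound).to_a
  sizes.product(sizes, sizes).each do |nu, np, nn|
    users    = (1..nu).to_a
    projects = (1..np).to_a
    notes    = (1..nn).to_a
    functions(projects, users).each do |project_user|
      functions(notes, users).each do |note_user|
        functions(notes, projects).each do |note_project|
          notes.each do |n|
            next if note_user[n] == project_user[note_project[n]]
            return { note: n, note_user: note_user,
                     note_project: note_project, project_user: project_user }
          end
        end
      end
    end
  end
  nil
end
```

With a bound of 1 there is only one User, so no counterexample exists; with a bound of 2 the search finds an instance in which a Note's User differs from its Project's User. This also shows why a bounded result says nothing about scopes larger than the bound.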
Data Model Verification
• SAT-based bounded verification: This is exactly what the
Alloy Analyzer does!
• Alloy language allows specification of objects and relations, and the specification of constraints on relations using first-order logic
• In order to do bounded verification of Rails data models,
automatically translate the Active Record specifications to
Alloy specifications
Translation to Alloy
RAILS:
  class ObjectA
    has_one :objectB
  end
ALLOY:
  sig ObjectA {
    objectB: lone ObjectB
  }

RAILS:
  class ObjectA
    has_many :objectBs
  end
ALLOY:
  sig ObjectA {
    objectBs: set ObjectB
  }

RAILS:
  class ObjectA
    belongs_to :objectB
  end
ALLOY:
  sig ObjectA {
    objectB: one ObjectB
  }

RAILS:
  class ObjectA
    has_and_belongs_to_many :objectBs
  end
ALLOY:
  sig ObjectA {
    objectBs: set ObjectB
  }
  fact { ObjectA <: objectBs = ~(ObjectB <: objectA) }
Translating the :through Option
RAILS:
  class Book < ActiveRecord::Base
    has_many :editions
    belongs_to :author
  end
  class Author < ActiveRecord::Base
    has_many :books
    has_many :editions, :through => :books
  end
  class Edition < ActiveRecord::Base
    belongs_to :book
  end
ALLOY:
  sig Book {
    editions: set Edition,
    author: one Author
  }
  sig Author {
    books: set Book,
    editions: set Edition
  } { editions = books.editions }
  sig Edition {
    book: one Book
  }
  fact {
    Book <: editions = ~(Edition <: book)
    Book <: author = ~(Author <: books)
  }
[Diagram: Author 1 -- * Book; Book 1 -- * Edition; Author 1 -- * Edition]
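The Alloy constraint editions = books.editions uses relational join, and the facts use transpose (~). A minimal sketch of both operators on finite relations, in plain Ruby with relations encoded as arrays of pairs; the method names are hypothetical and this is not how the Alloy Analyzer evaluates them.

```ruby
# Relational join (Alloy "."): all pairs (a, c) such that
# (a, b) is in r and (b, c) is in s for some b.
def join(r, s)
  r.flat_map do |a, b|
    s.select { |b2, _| b2 == b }.map { |_, c| [a, c] }
  end.uniq
end

# Transpose (Alloy "~"): reverse every pair.
def transpose(r)
  r.map { |a, b| [b, a] }
end

# The :through constraint: an Author's editions are exactly the
# editions of that Author's books.
def through_holds?(author_editions, author_books, book_editions)
  author_editions.sort == join(author_books, book_editions).sort
end
```

For instance, if author a1 wrote books b1 and b2 with editions e1 and e2, the join yields (a1, e1) and (a1, e2); any other value for a1's editions violates the constraint.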
Translating the :dependent Option
• The :dependent option specifies what behavior to take on
deletion of an object with regards to its associated objects
• To incorporate this dynamism, the model must allow analysis
of how sets of objects and their relations change from one
state to the next
RAILS:
  class User < ActiveRecord::Base
    has_one :account
  end
  class Account < ActiveRecord::Base
    belongs_to :user, :dependent => :destroy
  end
ALLOY:
  sig User {}
  sig Account {}
  one sig PreState {
    accounts: set Account,
    users: set User,
    relation1: Account lone -> one User
  }
  one sig PostState {
    accounts': set Account,
    users': set User,
    relation1': Account set -> set User
  }
Translating the :dependent Option
pred deleteAccount [s: PreState, s': PostState, x: Account] {
  all x0: Account | x0 in s.accounts
  all x1: User | x1 in s.users
  s'.accounts' = s.accounts - x
  s'.users' = s.users
  s'.relation1' = s.relation1 - (x <: s.relation1)
}
– We also update the relations of the deleted object's associated object(s) based on the use of the :dependent option
Translating the :dependent Option
pred deleteContext [s: PreState, s': PostState, x:Context] {
all x0: Context | x0 in s.contexts
all x1: Note | x1 in s.notes
all x2: Preference | x2 in s.preferences
all x3: Project | x3 in s.projects
all x4: RecurringTodo | x4 in s.recurringtodos
all x5: Tag | x5 in s.tags
all x7: Todo | x7 in s.todos
all x8: User | x8 in s.users
s'.contexts' = s.contexts - x
s'.notes' = s.notes
s'.preferences' = s.preferences
s'.projects' = s.projects
s'.recurringtodos' = s.recurringtodos
s'.tags' = s.tags
s'.todos' = s.todos - x.(s.context_todos)
s'.users' = s.users
s'.notes_user' = s.notes_user
s'.completed_todos_user' = s.completed_todos_user
s'.recurring_todos_user' = s.recurring_todos_user
s'.todos_user' = s.todos_user - (x.(s.context_todos) <: s.todos_user)
s'.active_contexts_user' = s.active_contexts_user
s'.active_projects_user' = s.active_projects_user
s'.projects_user' = s.projects_user
s'.contexts_user' = s.contexts_user - (x <: s.contexts_user)
s'.recurring_todo_todos' = s.recurring_todo_todos - (s.recurring_todo_todos :> x.(s.context_todos)) ...
Verification Overview
[Diagram: Active Records -> Translator -> Alloy Specification; Alloy Specification + Data Model Properties -> Alloy Analyzer -> Verified, or Counterexample Data Model Instance]
Experiments
• We used two open-source Rails applications in our
experiments:
– TRACKS: An application to manage things-to-do lists
– Fat Free CRM: Customer Relations Management
software
                     TRACKS        Fat Free CRM
LOC                  6062 lines    12069 lines
Data model classes   13 classes    20 classes
Alloy spec LOC       301 lines     1082 lines
• We wrote 10 properties for TRACKS and 20 properties for
Fat Free CRM
Types of Properties Checked
• Relationship Cardinality
– Is an Opportunity always assigned to some Campaign?
• Transitive Relations
– Is a Note’s User the same as the Note’s Project’s User?
[Diagram: Note, User, Project]
• Deletion Does Not Cause Dangling References
– Are there any dangling Todos after a User is deleted?
• Deletion Propagates to Associated Objects
– Does the User related to a Lead still exist after the Lead has been deleted?
Experimental Results
• Of the 30 properties we checked, 7 failed
• For example, in TRACKS a Note’s User can be different from the Note’s Project’s User
– Currently being enforced by the controller
– Since this could have been enforced using the :through option, we consider this a data-modeling error
• Another example from TRACKS: User deletion creates dangling Todos
[Diagram: User 1 -- * Context; Context 1 -- * Todo; labeled :dependent => :delete]
– User deletion does not get propagated into the relations of the Context object, including the Todos
Performance
• To measure performance, we recorded
– the amount of time it took for Alloy to run and check the properties
– the number of variables generated in the boolean formula
generated for the SAT-solver
• The time and number of variables are averaged over the
properties for each application
• Taken over an increasing bound, from at most 10 objects for
each class to at most 35 objects for each class
Summary
• An approach to automatically discover data model errors in
Ruby on Rails web applications
• Automatically extract a formal data model, verify using the
Alloy Analyzer
• An automatic translator from Rails ActiveRecords to Alloy
– Handles three basic relationships and several options
(:through, :conditions, :polymorphic, :dependent)
• Found several data model errors on two open source
applications
• Bounded verification of data models is feasible!
What About Unbounded Verification?
• Bounded verification does not guarantee correctness for
arbitrarily large data model instances
• Is it possible to do unbounded verification of data models?
An Approach for Unbounded Verification
[Diagram: Web Application (Ruby on Rails), structured by the MVC Design Pattern -> Data Model (ActiveRecords) -> Automatic Extraction -> Formal Model (Sets and Relations) + Properties -> Automatic Translation + Automatic Projection -> Unbounded Verification (SMT Solver)]
Another Rails Data Model Example
class User < ActiveRecord::Base
  has_and_belongs_to_many :roles
  has_one :profile, :dependent => :destroy
  has_many :photos, :through => :profile
end
class Role < ActiveRecord::Base
  has_and_belongs_to_many :users
end
class Profile < ActiveRecord::Base
  belongs_to :user
  has_many :photos, :dependent => :destroy
  has_many :videos, :dependent => :destroy,
                    :conditions => "format='mp4'"
end
class Tag < ActiveRecord::Base
  belongs_to :taggable, :polymorphic => true
end
class Video < ActiveRecord::Base
  belongs_to :profile
  has_many :tags, :as => :taggable
end
class Photo < ActiveRecord::Base
  ...
[Diagram: Role * -- * User; User 1 -- 0..1 Profile; User 1 -- * Photo; Profile 1 -- * Photo; Profile 1 -- * Video (format='mp4'); Taggable 1 -- * Tag]
Translation to SMT-LIB
• Given a data model M = <S, C, D> we translate the
constraints C and D to formulas in the theory of
uninterpreted functions
• We use the SMT-LIB format
• We need quantification for some constraints
Translation to SMT-LIB
• One-to-Many Relation
RAILS:
  class Profile
    has_many :videos
  end
  class Video
    belongs_to :profile
  end
SMT-LIB:
  (declare-sort Profile 0)
  (declare-sort Video 0)
  (declare-fun my_relation (Video) Profile)
• One-to-One Relation
RAILS:
  class User
    has_one :profile
  end
  class Profile
    belongs_to :user
  end
SMT-LIB:
  (declare-sort User 0)
  (declare-sort Profile 0)
  (declare-fun my_relation (Profile) User)
  (assert (forall ((x1 Profile) (x2 Profile))
    (=> (not (= x1 x2))
        (not (= (my_relation x1) (my_relation x2))))))
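The quantified assertion above states that my_relation is injective: two distinct Profiles never map to the same User. On a finite instance this can be checked directly. A minimal sketch, assuming the function is given as a Ruby Hash from Profile ids to User ids; the method name injective? is hypothetical.

```ruby
# fn: Hash mapping each Profile id to its User id (a total function on
# the Profiles present). Injective means no two keys share a value.
def injective?(fn)
  fn.values.uniq.size == fn.size
end
```

The one-to-many translation omits this assertion, which is exactly what allows several Videos to share one Profile there.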
Translation to SMT-LIB
• Many-to-Many Relation
RAILS:
  class User
    has_and_belongs_to_many :roles
  end
  class Role
    has_and_belongs_to_many :users
  end
SMT-LIB:
  (declare-sort Role 0)
  (declare-sort User 0)
  (declare-fun my_relation (Role User) Bool)
Translating the :through Option
RAILS:
  class Profile < ActiveRecord::Base
    belongs_to :user
    has_many :photos
  end
  class Photo < ActiveRecord::Base
    belongs_to :profile
  end
  class User < ActiveRecord::Base
    has_one :profile
    has_many :photos, :through => :profile
  end
[Diagram: User 1 -- 0..1 Profile; Profile 1 -- * Photo; User 1 -- * Photo]
SMT-LIB:
  (declare-sort Profile 0)
  (declare-sort Photo 0)
  (declare-sort User 0)
  (declare-fun profile_photo (Photo) Profile)
  (declare-fun user_profile (Profile) User)
  (declare-fun user_photo (Photo) User)
  ; = on Bool terms is logical equivalence
  (assert (forall ((u User) (ph Photo))
    (= (= u (user_photo ph))
       (exists ((p Profile))
         (and (= u (user_profile p))
              (= p (profile_photo ph)))))))
Translating the :dependent Option
• The :dependent option specifies what behavior to take on deletion of
an object with regards to its associated objects
• To incorporate this dynamism, the model must allow analysis of how
sets of objects and their relations change from one state to the next
RAILS:
  class User < ActiveRecord::Base
    has_one :profile, :dependent => :destroy
  end
  class Profile < ActiveRecord::Base
    belongs_to :user
  end
SMT-LIB:
  (declare-sort Profile 0)
  (declare-sort User 0)
  (declare-fun Post_User (User) Bool)
  (declare-fun Post_Profile (Profile) Bool)
  (declare-fun user_profile (Profile) User)
  (declare-fun Post_user_profile (Profile User) Bool)
Translating the :dependent Option
(assert (not (forall ((x User)) (=> (and
  (forall ((a User)) (ite (= a x) (not (Post_User a)) (Post_User a)))
  (forall ((b Profile)) (ite (= x (user_profile b))
    (not (Post_Profile b)) (Post_Profile b)))
  (forall ((a Profile) (b User)) (ite (and (= b (user_profile a)) (Post_Profile a))
    (Post_user_profile a b)
    (not (Post_user_profile a b))))
  ) ; Remaining property-specific constraints go here
))))
– Update the sets and relations of the deleted object's associated object(s) based on the use of the :dependent option
Verification
• Once the data model is translated to SMT-LIB format we
can state properties about the data model again in SMT-LIB
and then use an SMT-Solver to check if the property holds
in the data model
• However, when we do that, for some large models, the SMT solver times out!
• Can we improve the efficiency of the verification process?
Property-Based Data Model Projection
• Basic idea: Given a property to verify, reduce the size of the
generated SMT-LIB specification by removing declarations
and constraints that do not depend on the property
• Formally, given a data model M = <S, C, D> and a property p,
Π(M, p) = M_P
where M_P = ⟨S, C_P, D_P⟩ is the projected data model such that C_P ⊆ C and D_P ⊆ D
Property-Based Data Model Projection
• Key Property: For any property p,
M |= p ⇔ Π(M, p) |= p
• Projection Input: Active Record files, property p
• Projection Output: The projected SMT-LIB specification
• Removes constraints on classes and relations that are neither explicitly mentioned in the property nor related to the mentioned ones through transitive relations, dependency constraints, or polymorphic relations
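One way to picture the projection is as a reachability computation: start from the classes the property mentions, follow the "related to" links, and keep only the constraints whose classes were all reached. The sketch below is an assumption about how such a pass could look, not the projection algorithm used in this work; the encoding (depends_on, constraint names) and the method names are hypothetical.

```ruby
require 'set'

# depends_on: { class => [classes tied to it through a transitive
#               (:through), dependency, or polymorphic declaration] }
# Returns the set of classes reachable from the property's classes.
def relevant_classes(property_classes, depends_on)
  seen  = Set.new(property_classes)
  queue = property_classes.dup
  until queue.empty?
    depends_on.fetch(queue.shift, []).each do |c|
      queue << c if seen.add?(c) # add? returns nil when c was already seen
    end
  end
  seen
end

# constraints: { constraint_name => [classes it mentions] }
# Keeps only the constraints that mention relevant classes exclusively.
def project(constraints, property_classes, depends_on)
  keep = relevant_classes(property_classes, depends_on)
  constraints.select { |_, classes| classes.all? { |c| keep.include?(c) } }
end
```

For a property about a User's Photos, with User linked to Profile and Profile linked to Photo, this keeps the constraints among User, Profile, and Photo and drops those involving Role, Video, and Tag, so the solver sees a smaller specification.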
Data Model Projection: Example
Property, p: A User’s Photos are the same as the User’s Profile’s Photos.
[Diagram: the data model M with Role, User, Profile, Photo, Video, Tag, and Taggable, next to its projection for p, which keeps only User, Profile, and Photo]
Verification Overview
[Diagram: Active Records -> Formal Data Model -> Projection (using the Data Model Properties) -> Translator -> SMT-LIB Specification -> SMT Solver (Z3) -> Verified, Counterexample Data Model Instance, or Unknown]
Experiments
• We used five open-source Rails apps in our experiments:
– LovdByLess: Social networking site
– Tracks: An application to manage things-to-do lists
– OpenSourceRails(OSR): Social project gallery
application
– Fat Free CRM: Customer relations management software
– Substruct: An e-commerce application
                     LovdByLess  Tracks  OSR    Fat Free CRM  Substruct
LOC                  3787        6062    4295   12069         15639
Data Model Classes   13          13      15     20            17
• We wrote 10 properties for each application
Types of Properties Checked
• Relationship Cardinality
– Is an Opportunity always assigned to some Campaign?
• Transitive Relations
– Is a Note’s User the same as the Note’s Project’s User?
[Diagram: Note, User, Project]
• Deletion Does Not Cause Dangling References
– Are there any dangling Todos after a User is deleted?
• Deletion Propagates to Associated Objects
– Does the User related to a Lead still exist after the Lead has been deleted?
Experimental Results
• 50 properties checked, 16 failed, 11 were data model errors
• For example, in Tracks a Note’s User can be different from the Note’s Project’s User
– Currently being enforced by the controller
– Since this could have been enforced using the :through option, we consider this a data-modeling error
• From OpenSourceRails: User deletion fails to propagate to associated Bookmarks
[Diagram: User 1 -- * Bookmark]
– Leaves orphaned bookmarks in the database
– Could have been enforced in the data model by setting the :dependent option on the relation between User and Bookmark
Performance
• To measure performance, we recorded
– The amount of time it took for Z3 to run and check the
properties
– The number of variables produced in the SMT
specification
• The time and number of variables are averaged over the
properties for each application
Performance
• To compare with bounded verification, we repeated these
experiments using the tool from our previous work and
Alloy Analyzer
– The amount of time it took for Alloy to run
– The number of variables generated in the boolean
formula generated for the SAT solver
– Taken over an increasing bound, from at most 10
objects for each class to at most 35 objects for each
class
Performance: Verification Time
[Figure: verification time (s) versus scope (10 to 35), one panel each for Tracks, OSR, Substruct, LovdByLess, and FatFreeCRM, comparing Alloy, Z3, and Z3+proj]
Performance: Formula Size (Variables)
[Figure: two panels. Z3: number of variables for LovdByLess, Tracks, OSR, Substruct, and FatFreeCRM, non-proj versus proj. Alloy: number of variables (thousands) versus scope (10 to 35).]
Unbounded vs Bounded Performance
• Why does unbounded verification out-perform bounded so
drastically?
Possible reasons:
• SMT solvers operate at a higher level of abstraction than
SAT solvers
• Z3 uses many heuristics to eliminate quantifiers in formulas
• Implementation languages are different
– Z3 implemented in C++
– Alloy (as well as the SAT Solver it uses) is implemented
in Java
Summary
• Automatically extract a formal data model, translate it to the theory of uninterpreted functions, and verify using an SMT solver
– Use property-based data model projection for efficiency
• An automatic translator from Rails ActiveRecords to SMT-LIB
– Handles three basic relationships and several options
(:through, :conditions, :polymorphic, :dependent)
• Found multiple data model errors on five open source
applications
– Unbounded verification of data models is feasible and
more efficient than bounded verification!
Possible Extensions
• Analyzing dynamic behavior
– Model object creation in addition to object deletion
– Fuse the data model with the navigation model in order
to analyze dynamic data model behavior
– Check temporal properties
• Automatic Property Inference
– Manual property writing is error prone
– Use the inherent graph structure in the data model to automatically infer properties about the data model
• Automatic Repair
– When the verifier concludes that a data model property is violated, automatically generate a repair that establishes the violated property