Data Model Property Inference and Repair

Jaideep Nijjar and Tevfik Bultan
{jaideepnijjar, bultan}@cs.ucsb.edu
Verification Lab, Department of Computer Science
University of California, Santa Barbara
ISSTA 2013

Motivation
• Web applications influence every aspect of our lives
• Our dependence on web applications keeps increasing
• It would be nice if we could improve the dependability of web applications

Acknowledgement: NSF Support
[Screenshot of a web application error page]
HTTP Status 500
type: Exception report
message:
description: The server encountered an internal error () that prevented it from fulfilling this request.
exception: javax.servlet.ServletException: flp.fl_appl_stts not found. Specify owner.objectname or use sp_help to check whether the object e
root cause: java.sql.SQLException: flp.fl_appl_stts not found. Specify owner.objectname or use sp_help to check whether the object exists (sp
    gov.nsf.fastlane.util.ApplicationStatus.<init>(ApplicationStatus.java:95)
    org.apache.jsp.fastlane_jsp._jspService(fastlane_jsp.java:79)
    [remaining stack frames omitted]

Three-Tier Architecture
[Diagram: Browser, Web Server, Backend Database]

Three-Tier Arch. + MVC Pattern
[Diagram: Browser, Web Server (Controller, Model, Views), Backend Database]

Model-View-Controller Pattern
• MVC has become the standard way to structure web applications
  • Ruby on Rails
  • Zend for PHP
  • CakePHP
  • Struts for Java
  • Django for Python
  • …
• Benefits of the MVC pattern:
  • Separation of concerns
  • Modularity
  • Abstraction
[Diagram: DB, Model, View, Controller, and User, with User Request and User Response]

Data Model
• The data model is the heart of the web application
• It specifies the set of objects and the associations (i.e., relations) between them
• Using an Object-Relational Mapping, the data model is mapped to the back-end datastore
• Any error in the data model can have serious consequences for the dependability of the application

Our Data Model Analysis Approach
[Diagram: analysis pipeline. Active Records → Model Extraction → Data Model (Schema + Constraints) → Property Inference → Data Model + Inferred Properties → Verification → Failing Properties → Repair Generation. Property Inference works on the Data Model Schema and infers Orphan Prevention, Transitive Relations, and Delete Propagation properties.]

Outline
• Motivation
• Overview of Our Approach
• Rails Data Models
  • Basic Relations
  • Options to Extend Relations
  • Formalization of Semantics
• Verification Techniques
• Property Inference
• Property Repair
• Experiments
• Conclusions and Future Work

A Rails Data Model Example

    class User < ActiveRecord::Base
      has_and_belongs_to_many :roles
      has_one :profile, :dependent => :destroy
      has_many :photos, :through => :profile
    end

    class Role < ActiveRecord::Base
      has_and_belongs_to_many :users
    end

    class Profile < ActiveRecord::Base
      belongs_to :user
      has_many :photos, :dependent => :destroy
      has_many :videos, :dependent => :destroy,
               :conditions => "format='mp4'"
    end

    class Tag < ActiveRecord::Base
      belongs_to :taggable, :polymorphic => true
    end

    class Video < ActiveRecord::Base
      belongs_to :profile
      has_many :tags, :as => :taggable
    end

    class Photo < ActiveRecord::Base
      ...

[Class diagram: Role, User, Profile, Photo, Video, Tag, and Taggable with their multiplicities; the Profile-to-Video association carries the condition format='mp4']

Rails Data Models
• Data model verification: analyzing the relations between the data objects
• Relations are specified in Rails using association declarations inside the ActiveRecord files
• Three basic relations
  • One-to-one
  • One-to-many
  • Many-to-many
• Extensions to the basic relations using options
  • :through, :conditions, :polymorphic, :dependent

Three Basic Relations in Rails
• One-to-One

    class User < ActiveRecord::Base
      has_one :profile
    end

    class Profile < ActiveRecord::Base
      belongs_to :user
    end

[Diagram: User 1 to 0..1 Profile]

• One-to-Many

    class Profile < ActiveRecord::Base
      has_many :videos
    end

    class Video < ActiveRecord::Base
      belongs_to :profile
    end

[Diagram: Profile 1 to * Video]

Three Basic Relations in Rails
• Many-to-Many

    class User < ActiveRecord::Base
      has_and_belongs_to_many :roles
    end

    class Role < ActiveRecord::Base
      has_and_belongs_to_many :users
    end

[Diagram: User * to * Role]

Extensions to the Basic Relations
• :through option
  • To express transitive relations
• :conditions option
  • To relate a subset of objects to another class
• :polymorphic option
  • To express polymorphic relationships
• :dependent option
  • On delete, this option expresses whether to delete the associated objects or not

The :through Option

    class User < ActiveRecord::Base
      has_one :profile
      has_many :photos, :through => :profile
    end

    class Profile < ActiveRecord::Base
      belongs_to :user
      has_many :photos
    end

    class Photo < ActiveRecord::Base
      belongs_to :profile
    end

[Diagram: User 1 to 0..1 Profile; Profile 1 to * Photo; User 1 to * Photo]

The :dependent Option

    class User < ActiveRecord::Base
      has_one :profile, :dependent => :destroy
    end

    class Profile < ActiveRecord::Base
      belongs_to :user
      has_many :photos, :dependent => :destroy
    end

[Diagram: User 1 to 0..1 Profile; Profile 1 to * Photo]
• :delete directly deletes the associated objects without looking at their dependencies
• :destroy first checks whether the associated objects themselves have associations with the :dependent option set, and recursively propagates the delete to those associated objects

Formalizing Data Model Semantics
Formal data model: M = <S, C, D>
• S: Data model schema
  • The sets and relations of the data model, e.g., {Photo, Profile, Role, Tag, Video, User} and the relations between them
• C: Constraints on the relations
  • Cardinality constraints, transitive relations, conditional relations, polymorphic relations
• D: Dependency constraints about deletions
  • Express conditions on two consecutive instances of a relation such that deletion of an object from the first instance leads to the other instance

Outline
• Motivation
• Overview of Our Approach
• Rails Data Models
• Verification Techniques
  • Bounded Verification
  • Unbounded Verification
• Property Inference
• Property Repair
• Experiments
• Conclusions and Future Work

Verification Overview
[Diagram: Active Records → Model Extraction → data model + properties. Bounded verification: Alloy Translator (takes a bound) → formula → Alloy Analyzer → instance or unsat → Results Interpreter. Unbounded verification: SMT-LIB Translator → formula → SMT Solver → instance or unsat or unknown → Results Interpreter. Outcomes: Property Verified, Property Failed + Counterexample, or Unknown.]

Sample Translation to Alloy

    class User < ActiveRecord::Base
      has_one :profile
    end

    class Profile < ActiveRecord::Base
      belongs_to :user
    end

    sig Profile {}
    sig User {}
    one sig State {
      profiles: set Profile,
      users: set User,
      relation: Profile lone -> one User
    }

Sample Translation to SMT-LIB

    class User < ActiveRecord::Base
      has_one :profile
    end

    class Profile < ActiveRecord::Base
      belongs_to :user
    end

    (declare-sort User)
    (declare-sort Profile)
    (declare-fun relation (Profile) User)
    (assert (forall ((p1 Profile) (p2 Profile))
      (=> (not (= p1 p2))
          (not (= (relation p1) (relation p2))))))

Property Inference: Motivation
• Verification techniques require properties as input
  • Effectiveness depends on the quality of the properties written
• Manually writing properties is time-consuming, error-prone, and lacks thoroughness
  • Requires familiarity with the modeling language
• We propose techniques for automatically inferring properties about the data model of a web application
• Inference is based on the data model schema
  • A directed, annotated graph that represents the relations

Data Model Schema Example
[Figure: example data model schema graph]

Outline
• Motivation
• Overview of Our Approach
• Rails Data Models
• Verification Techniques
• Property Inference
  • Orphan Prevention Pattern
  • Transitive Relation Pattern
  • Delete Propagation Pattern
• Property Repair
• Experiments
• Conclusions and Future Work

Property Inference: Overview
• Identify patterns in the data model schema graph that indicate that a certain property should hold in the data model
• Extract the data
  model schema from the ActiveRecords declarations
• Search for the identified patterns in the data model schema graph
• If a match is found, report the corresponding property

Orphan Prevention Pattern
• For objects of a class that has only one relation: when the object it is related to is deleted but the object itself is not, such an object becomes orphaned
• Orphan chains can also occur
• The heuristic looks at all one-to-many or one-to-one relations to identify all potential orphans or orphan chains
• Infers that deleting an object does not create orphans
[Diagram: chain of nodes 0, 1, ..., n]

Transitive Relation Pattern
• Looks at one-to-one or one-to-many relations in the schema
• Finds paths of relations that are of length > 1
• If there is a direct edge between the first and last node of the path, infer that this edge should be transitive, i.e., the composition of the others
[Diagram: path of nodes 1, 2, ..., n]

Delete Propagation Pattern
• Look at the schema with all many-to-many and transitive relations removed
• Remove cycles in the graph by collapsing strongly connected components to a single node
• Assign levels to all nodes, indicating their depth in the graph
  • Root nodes, those with no incoming edges, are at level 0
  • Each remaining node is at a level 1 more than the maximum level of its predecessors
• Propagate deletes if the level difference between two nodes is one
[Diagram: example graph with nodes at level=0, level=1, and level=2]

Repair Generation
• Check the inferred properties on the formal data model
• If a property fails, we can point out the option that needs to be set in the data model to make sure that the inferred property holds
• For the delete propagation and orphan prevention patterns: set the :dependent option accordingly to propagate the deletes
• For the transitive property: set the :through option accordingly so that the relation is the composition of two other relations

Repair Examples

    class User < ActiveRecord::Base
      has_one :preference, :conditions => "is_active=true",
              :dependent => :destroy
      has_many :contexts,
               :dependent => :destroy
      has_many :todos, :through => :contexts
    end

    class Preference < ActiveRecord::Base
      belongs_to :user
    end

    class Context < ActiveRecord::Base
      belongs_to :user
      has_many :todos, :dependent => :delete
    end

    class Todo < ActiveRecord::Base
      belongs_to :context
      # belongs_to :user
      has_and_belongs_to_many :tags
    end

    class Tag < ActiveRecord::Base
      has_and_belongs_to_many :todos
    end

Outline
• Motivation
• Overview of Our Approach
• Rails Data Models
• Verification Techniques
• Property Inference
• Property Repair
• Experiments
• Conclusions and Future Work

Experiments on Five Applications

    Application   LOC     Classes   Data Model Classes
    LovdByLess     3787   61        13
    Substruct     15639   85        17
    Tracks         6062   44        13
    FatFreeCRM    12069   54        20
    OSR            4295   41        15

A Social Networking Application
• LovdByLess: A social networking application
• Users can write blog entries
• Users can comment on a friend’s blog entry
• Friend deletes blog entry

A Social Networking Application
• A friend writes a blog entry
• User comments on the friend’s blog entry
• Friend deletes the blog entry

A Failing Inferred Property
• deletePropagates property inferred for LovdByLess
[Diagram: Blog to Comment, labeled "delete should propagate"]

A Todo List Application
• Tracks: A todo list application
• Todos can be organized by Contexts
• Users can also create Recurring Todos
• Delete the Context. Then edit the Recurring Todo.
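
The Tracks scenario above is the kind of case the Delete Propagation Pattern is meant to catch. As a rough illustration only, the level-assignment heuristic from that pattern can be sketched in plain Ruby. The edge list below is a hypothetical encoding of the relations on the Repair Examples slide, with the many-to-many (Todo/Tag) and :through (User/Todo) relations already removed; the names `SCHEMA_EDGES`, `level_of`, `assign_levels`, and `infer_delete_propagation` are assumptions made for this sketch and are not taken from the authors' tool. It also assumes the graph is already acyclic, i.e., that strongly connected components have been collapsed beforehand.

```ruby
# Hypothetical schema edges [parent, child]: one-to-one and one-to-many
# relations only. Many-to-many and transitive (:through) relations are
# assumed to have been removed, as the pattern prescribes.
SCHEMA_EDGES = [
  [:User, :Preference],
  [:User, :Context],
  [:Context, :Todo]
].freeze

# Level of a node: 0 for roots (no incoming edges), otherwise one more
# than the maximum level of its predecessors. Results are memoized in
# `levels`. Assumes the graph is acyclic.
def level_of(node, preds, levels)
  return levels[node] if levels.key?(node)

  parent_levels = preds[node].map { |p| level_of(p, preds, levels) }
  levels[node] = parent_levels.empty? ? 0 : parent_levels.max + 1
end

# Compute the level of every node that appears in the edge list.
def assign_levels(edges)
  preds = Hash.new { |hash, key| hash[key] = [] }
  edges.each { |from, to| preds[to] << from }
  levels = {}
  edges.flatten.uniq.each { |node| level_of(node, preds, levels) }
  levels
end

# Infer "delete should propagate" for every edge whose endpoints are
# exactly one level apart.
def infer_delete_propagation(edges)
  levels = assign_levels(edges)
  edges.select { |from, to| levels[to] - levels[from] == 1 }
end

LEVELS = assign_levels(SCHEMA_EDGES)
INFERRED = infer_delete_propagation(SCHEMA_EDGES)
INFERRED.each { |from, to| puts "deletePropagates: #{from} -> #{to}" }
```

In this sketch an edge that skips a level (for example, a direct edge from a level-0 node to a level-2 node) is not reported, which mirrors the "levels between nodes is one" condition on the slide.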

A Failing Inferred Property
• Data Model and Application Error: deletePropagates property inferred for Tracks
[Diagram: Context to RecurringTodo, labeled "delete should propagate"]

False Positives
• deletePropagates property inferred for FatFreeCRM
[Diagram: Account to Contact, labeled "delete should propagate"]
• But in FatFreeCRM it is valid to have a contact not associated with any account

False Positives
• transitive property inferred for LovdByLess
[Diagram: ForumTopic, User, ForumPost]
• Just not a transitive relation due to the semantics of the application

Experiment Results

    Application   Property Type      # Inferred   # Timeout   # Failed
    LovdByLess    deletePropagates   13           0           10
                  noOrphans           0           0            0
                  transitive          1           0            1
    Substruct     deletePropagates   27           0           16
                  noOrphans           2           0            1
                  transitive          4           0            4
    Tracks        deletePropagates   15           0            6
                  noOrphans           1           0            1
                  transitive         12           0           12
    FatFreeCRM    deletePropagates   32           1           19
                  noOrphans           5           0            0
                  transitive          6           2            6
    OSR           deletePropagates   19           0           12
                  noOrphans           1           0            1
                  transitive          7           0            7

    Application   Property Type      # Data Model &       # Data Model   # Failures Due to    # False
                                     Application Errors   Errors         Rails Limitations    Positives
    LovdByLess    deletePropagates   1                     9             0                    0
                  noOrphans          0                     0             0                    0
                  transitive         0                     0             0                    1
    Substruct     deletePropagates   1                     3             5                    7
                  noOrphans          0                     1             0                    0
                  transitive         0                     1             0                    3
    Tracks        deletePropagates   1                     1             3                    1
                  noOrphans          0                     0             0                    5
                  transitive         0                     7             0                    5
    FatFreeCRM    deletePropagates   0                    18             1                    0
                  noOrphans          0                     0             0                    0
                  transitive         0                     0             0                    6
    OSR           deletePropagates   0                    12             0                    0
                  noOrphans          0                     1             0                    0
                  transitive         0                     7             0                    0

Conclusions and Future Work
• It is possible to extract formal specifications from MVC-based data models and analyze them
• We can automatically infer properties and find errors in real-world web applications
• Most of the errors come from the fact that developers are not using the ActiveRecords extensions properly
  • This breaks the modularity, separation of concerns, and abstraction principles provided by the MVC pattern
• We are working on analyzing actions that update the data store
• We are also investigating verifiable model-driven development for data models

Related Work
• Automated Discovery of Likely Program Invariants
  • Daikon [Ernst et al, ICSE 1999] discovers likely invariants by observing the runtime behaviors of a program
  • [Guo et al, ISSTA 2006] extends this style of analysis and applies it to the inference of abstract data types
  • We analyze the static structure of an extracted data model to infer properties
• Static Verification of Inferred Properties
  • [Nimmer and Ernst, 2001] integrate Daikon with ESC/Java, a static verification tool for Java programs
  • We focus on data model verification in web applications

Related Work
• Verification of Web Applications
  • [Krishnamurthi et al, Springer 2006] focuses on correct handling of the control flow given the unique characteristics of web apps
  • Works such as [Hallé et al, ASE 2010] and [Han et al, MoDELS 2007] use state machines to formally model navigation behavior
  • In contrast to these works, we focus on analyzing the data model
• Formal Modeling of Web Applications
  • WebAlloy [Chang, 2009]: the user specifies the data model and access control policies; the implementation is automatically synthesized
  • WebML [Ceri et al, Computer Networks 2000]: a modeling language developed specifically for modeling web applications; no verification
  • In contrast, we perform model extraction (reverse engineering)

Related Work
• Verification of Ruby on Rails applications
  • Rubicon [Near et al, FSE 2012] verifies the Controller whereas we verify the Data Model
  • Requires manual specification of application behavior, whereas we verify manually written properties
  • Limited to bounded verification
• Data Model Verification using Alloy
  • [Cunha and Pacheco, SEFM 2009] maps relational database schemas to Alloy; not automated
  • [Wang et al, ASWEC 2006] translates ORA-SS specifications to Alloy, and uses the Analyzer to produce instances of the data model to show consistency
  • [Bordbar et al, Trends 2005] uses Alloy to discover bugs in browser and business logic interactions

Related Work
• Unbounded Verification of Alloy Specifications using SMT Solvers
  • [Ghazi et al, FM 2011], approach not implemented
  • More challenging domain since the Alloy language contains constructs such as transitive closures
• Specification and Analysis of Conceptual Data Models
  • [Smaragdakis et al ASE 2009, McGill et al ISSTA 2011] use Object Role Modeling to express the data model and constraints
  • Focus is on checking consistency and producing test cases efficiently

Questions?
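
Backup sketch: the Transitive Relation Pattern from the Property Inference part of the talk can be illustrated in a few lines of plain Ruby. This is a minimal sketch under stated assumptions, not the authors' implementation: the edge list is a hypothetical encoding of the User/Profile/Photo relations from the :through example, the names `THROUGH_EDGES`, `long_paths`, and `infer_transitive` are invented here, and the schema graph is assumed to be acyclic.

```ruby
# Hypothetical one-to-one / one-to-many edges from the :through example:
# User -> Profile, Profile -> Photo, plus the direct edge User -> Photo.
THROUGH_EDGES = [
  [:User, :Profile],
  [:Profile, :Photo],
  [:User, :Photo]
].freeze

# Return every path (array of nodes) from `from` to `to` that uses at
# least two edges. Assumes the graph is acyclic, otherwise this recursion
# would not terminate.
def long_paths(edges, from, to, path = [from])
  edges.select { |source, _| source == from }.flat_map do |_, nxt|
    extended = path + [nxt]
    found = (nxt == to && extended.length > 2) ? [extended] : []
    found + long_paths(edges, nxt, to, extended)
  end
end

# For each direct edge, collect the longer paths between its endpoints.
# An edge with at least one such path is inferred to be transitive,
# i.e., it should be the composition of the relations along that path.
def infer_transitive(edges)
  edges.each_with_object({}) do |(from, to), inferred|
    paths = long_paths(edges, from, to)
    inferred[[from, to]] = paths unless paths.empty?
  end
end

TRANSITIVE = infer_transitive(THROUGH_EDGES)
TRANSITIVE.each do |(from, to), paths|
  puts "transitive: #{from} -> #{to} via #{paths.map { |p| p.join(' -> ') }.join('; ')}"
end
```

If such an inferred property fails verification, the repair suggested in the talk is to set the :through option on the direct relation.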