Eliminating Navigation Errors in Web
Applications via Model Checking and Runtime
Enforcement of Navigation State Machines
Sylvain Halle, Taylor Ettema, Chris Bunch, and
Tevfik Bultan
University of California, Santa Barbara
Web software
• Web software is becoming increasingly dominant
• Web applications are used extensively in many areas:
– Commerce: online banking, online shopping, …
– Entertainment: online music, videos, …
– Interaction: social networks
• We will rely on web applications more in the future:
– Health records (Microsoft HealthVault, Google Health)
– Controlling and monitoring of national infrastructures (Google
Powermeter)
• Web software is also rapidly replacing desktop applications
– cloud computing + software-as-service
– Google Docs, Google …
One Major Road Block
• Web applications are not dependable!
• Web applications are error prone
– Many web applications have navigation errors: They mishandle
unexpected user requests
• Web applications are notorious for security vulnerabilities
– Their global accessibility makes them a target for many malicious
users
• As web applications are becoming increasingly dominant and as their
use in safety critical areas is increasing, their dependability is
becoming a critical issue
Web applications are error prone
• Most web applications have navigation errors where an unexpected
user request can cause a web application to
– display cryptic error messages
– display sensitive information that might be exploited by malicious
users
– execute an unintended action
A Web Application: Bamboo Invoice
• At the top of the Bamboo Invoice project homepage, it states:
“BambooInvoice is free Open Source invoicing software intended for
small businesses and independent contractors. Our number one
priorities are ease of use, user-interface, and beautiful code.”
Navigation errors: Bamboo Invoice
Another Web Application: Digitalus
• At the top of the Digitalus project homepage, it states:
“Digitalus CMS is a new kind of CMS. The focus of this open source
project is usable software as opposed to endless lists of features.”
Navigation errors: Digitalus
How Did We Generate These Screens?
• Not very difficult, just try to do something unexpected
• For example
• delete yourself
• try to access a page that you should not have access to
• See the step by step scenarios in the paper
• The point is:
• A normal user can accidentally do these operations
• A malicious user can intentionally do these operations
Why are web applications error prone?
• Script-oriented programming:
– A web application consists of a collection of scripts
– These scripts call each other indirectly through interaction by the
user and the browser
• The form that one script generates has the address of the next
script that will consume the user input
– There are no systematic checks that guarantee that the caller and
the callee agree on an interface
• For example in a procedure call, the caller and the callee must
agree on the number of arguments and their types
– There is no explicit control flow identifying the execution order
• The control flow is buried in the links of the generated html
pages
Why are web applications error prone?
• Extensive string manipulation:
– Web applications use extensive string manipulation
• To construct html pages, to construct database queries in SQL,
etc.
– The user input comes in string form and must be validated and
sanitized before it can be used
• This requires the use of complex string manipulation functions
such as string-replace
– String manipulation is error prone
Why are web applications error prone?
• Interactivity
– User interaction is not under the control of the developer
• The user can use the back button of the browser
• The user can open multiple windows
• The user can cut and paste the URL
– Imagine you develop a desktop application where all dialog boxes
and all menu items could be displayed by the user at any moment...
...regardless of whether this makes sense in the current state of
the application
Why are web applications error prone?
• Interactivity
– Stateful interaction over stateless protocols (HTTP)
– Interactions between different software components
• browser, server, back-end database
• the need to maintain session state across these components
– One web application can be composed of many applications
• Mash-ups, web services
Automated Verification to the Rescue
• What can automated verification do for you?
– Exhaustive state-space exploration
• Using state space reduction techniques to enable exhaustive
exploration of the state space of a program
– Symbolic analysis
• Using compact symbolic representations (such as BDDs) to
explore large state spaces
– Runtime verification
• Check or enforce properties at runtime
– Combining static and dynamic checks
• Check as much as possible statically, for the rest use runtime
enforcement
• What can you do for automated verification?
• Specify the intended behavior!
Request processing in a Web application
• Request processing in Web applications that use MVC (Model View
Controller) frameworks
Navigation modeling and analysis
• We developed a simple language to specify navigation state machines
– It is a state machine that shows the allowable sequences of
controller action executions in each session of a web application
• MVC frameworks typically use a hierarchical structure where actions
are combined in the controllers and controllers are grouped into
modules
– We exploit this hierarchy to specify the navigation state machines
as hierarchical state machines
Navigation state machines
• The states of a navigation state machine are defined by
– the values of the session variables,
– the last action executed by the application
– and the request parameters of the last action
• We assume that this information is enough to determine which
actions the application can execute next
• NSM specification and verification is session modular
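The state definition above can be sketched in a few lines of Python. This is only an illustration: the class name `NavState`, the helpers `make_state` and `next_actions`, the action names and the `logged_in` session variable are all hypothetical, not taken from the authors' NSM language.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

Pairs = Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class NavState:
    """Hypothetical NSM state: session variables, last action, its parameters."""
    session: Pairs       # session variables as sorted (name, value) pairs
    last_action: str     # e.g. "invoices/index"
    params: Pairs        # request parameters of the last action


def make_state(session: Dict[str, str], action: str, params: Dict[str, str]) -> NavState:
    # Sorting makes two states with the same content compare equal.
    return NavState(tuple(sorted(session.items())), action, tuple(sorted(params.items())))


def next_actions(state: NavState) -> Set[str]:
    """Allowed next actions, computed from the state alone (toy rules)."""
    session = dict(state.session)
    if session.get("logged_in") != "yes":
        return {"login/index"}
    if state.last_action == "invoices/index":
        return {"invoices/view", "invoices/new", "logout/index"}
    return {"invoices/index", "logout/index"}


anonymous = make_state({"logged_in": "no"}, "login/index", {})
browsing = make_state({"logged_in": "yes"}, "invoices/index", {})
print(sorted(next_actions(anonymous)))
print(sorted(next_actions(browsing)))
```

Because `next_actions` reads nothing but the state, the sketch reflects the assumption that the three components are enough to decide what may happen next within one session.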
What can we do with NSMs?
• If we can check that the web application conforms to the NSM,
– then we can verify navigation properties on the NSM and conclude
that the navigation properties hold for the application
– We can also use automated verification techniques to check
properties of NSMs
– This way we can eliminate the navigation errors
• Problem: How do we ensure that the application conforms to the NSM?
– Two approaches
• Automatically extract the NSM from the application
• Manually specify the NSM and use runtime enforcement to
make sure that the application follows the NSM
• Or use a combination of these two
Runtime Enforcement with NSMs
• Statically verifying that a web application conforms to a navigation
state machine is a very difficult problem (in general undecidable)
• So, instead, we use runtime enforcement
– We have a plugin that can be easily added to an MVC web
application; it takes an NSM as input and makes sure that every
incoming request conforms to the NSM
– If the incoming request does not obey the NSM, then the plugin
either ignores the request and refreshes the previous page or
generates an appropriate error message
– This way non-compliant user requests can be handled uniformly
without generating strange error messages
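A minimal sketch of the enforcement idea, written in Python rather than the PHP of the actual plugin. The table `NSM`, the function `enforce`, the action names and the session layout are assumptions made for illustration; the real plugin reads its NSM from an external file and hooks into the MVC framework.

```python
from typing import Dict, Set, Tuple

# Hypothetical NSM: maps the last executed action to the actions allowed next.
NSM: Dict[str, Set[str]] = {
    "start": {"login/index"},
    "login/index": {"home/index"},
    "home/index": {"invoices/index", "logout/index"},
    "invoices/index": {"invoices/view", "home/index", "logout/index"},
    "invoices/view": {"invoices/index", "logout/index"},
    "logout/index": {"login/index"},
}


def enforce(session: dict, requested: str) -> Tuple[str, str]:
    """Decide what to run for an incoming request.

    Returns (action_to_run, status). A compliant request is dispatched and
    recorded in the session; a non-compliant one is not dispatched, and the
    previous page is shown again instead.
    """
    last = session.get("last_action", "start")
    if requested in NSM.get(last, set()):
        session["last_action"] = requested
        return requested, "ok"
    # Nothing has been shown yet at "start", so fall back to the login page.
    fallback = last if last != "start" else "login/index"
    return fallback, "rejected"


session: dict = {}
print(enforce(session, "invoices/view"))  # rejected: not logged in yet
print(enforce(session, "login/index"))    # ok
print(enforce(session, "home/index"))     # ok
print(enforce(session, "invoices/view"))  # rejected: must go through invoices/index
```

The point of the design is that the rejection path is a single, uniform branch, so the application code never sees a request it was not written to handle.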
Model Checking NSMs
• Runtime enforcement ensures that the violations of NSM behavior will
be handled uniformly at runtime
• However, we may also want to check properties of NSMs
• Is the logout page always followed by the homepage?
• Our approach:
• Ask developer to write properties of NSMs as temporal logic
formulas
• We translate NSMs to SMV specifications
• We check if the properties hold on the NSMs using the NuSMV
model checker
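To make the example property concrete, here is a hand-rolled check on an explicit graph, assuming a toy NSM whose states are just page names. It is not NuSMV and not the authors' translator; it only illustrates what "the logout page is always followed by the homepage" means as a property of every reachable state. The function names and both toy graphs are hypothetical.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first search: all states reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def always_followed_by(graph, start, source, target):
    """Check that every successor of a reachable `source` state is `target`.

    Returns (holds, counterexample_edge). A `source` state without any
    successor satisfies the check vacuously.
    """
    if source in reachable(graph, start):
        for nxt in graph.get(source, ()):
            if nxt != target:
                return False, (source, nxt)
    return True, None


good = {"login": ["home"], "home": ["invoices", "logout"],
        "invoices": ["home", "logout"], "logout": ["home"]}
bad = {"login": ["home"], "home": ["invoices", "logout"],
       "invoices": ["home", "logout"], "logout": ["login"]}

print(always_followed_by(good, "login", "logout", "home"))  # (True, None)
print(always_followed_by(bad, "login", "logout", "home"))   # (False, ('logout', 'login'))
```

As with a model checker, a failed check comes with a counterexample, here the offending transition.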
Overview of Our Approach
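[Figure: overview of the approach. Inputs: ACTL properties and a Navigation State Machine (NSM) specification. Static verification: the NSM-to-SMV translator produces an SMV model that is checked against the ACTL properties, yielding either "verified" or a counterexample. Runtime enforcement: the NSM plugin (NSM interpreter) takes the NSM specification as input.]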
Some examples
• We studied three real-world, freely available web applications:
– BambooInvoice: invoice management application
159,000 lines of code, 60 actions
– Capstone: student project management system
41,000 lines of code, 33 actions
– Digitalus: content management system
401,000 lines of code, 26 actions
Extracting the NSM
• We extracted the NSMs from the applications by hand, by exploring
potential error sequences
– Amount of effort: Half a day per application (including taking
screenshots, drawing the graph, etc.)
Application      States   Transitions   Variables
Digitalus        32       48            7
BambooInvoice    63       80            8
Capstone         8        16            1
Extracting the NSM
• Most NSMs are a collection of groups of logically interrelated pages,
with few entry and exit points between groups
• Extraction by hand is easier than the numbers suggest
• The NSM only needs to be a (reasonably) conservative
approximation of all the paths that the application tolerates
• Future work:
1) semi-automated extraction of NSM from source code
2) promote NSM as part of code documentation
Fragment of BambooInvoice's NSM
• Yellow transitions have guards which, if violated, cause PHP warnings
• Target states for red transitions cause PHP error messages
• Black transitions either have guards that are handled gracefully by the
application and do not cause PHP messages or have no guard
Model Checking NSMs
• We used the NuSMV model checker to statically check navigation
properties of NSMs expressed in ACTL
• Some examples:
– Once you login, the only way to go back to the login page is by
traversing the logout page
– Each controller has the 'index' action as its only entry point from
other controllers
• The original BambooInvoice and Digitalus assume, but do not enforce
either of these properties
– This is the cause for many cryptic error messages we found
• We could statically verify with NuSMV that the NSM for both
applications did fulfill these properties
– 4.4 MB of memory
– 0.4 sec running time
Runtime enforcement of NSMs
• PHP plugin for enforcement of NSMs at runtime
– Intercepts page requests in an MVC application and validates them
against the NSM (supplied in an external XML file)
– One line of code to insert in MVC frameworks (Zend, CodeIgniter)
– Simple: 1,100 lines of PHP code (3% of the smallest application)
• Average processing time when an action conforms to the NSM:
Application      Time without plugin (ms)   Time with plugin (ms)
Digitalus        11                         12
BambooInvoice    183                        199
Capstone         90                         122
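The relative overhead implied by these averages can be worked out directly. The short Python snippet below uses only the figures from the table; the helper name `overhead` is illustrative, and since the times are reported in whole milliseconds, the percentage for the fastest application is necessarily coarse.

```python
# (time without plugin, time with plugin) in ms, copied from the table above.
timings_ms = {
    "Digitalus": (11, 12),
    "BambooInvoice": (183, 199),
    "Capstone": (90, 122),
}


def overhead(without_plugin, with_plugin):
    """Return (absolute overhead in ms, relative overhead in percent)."""
    delta = with_plugin - without_plugin
    return delta, 100.0 * delta / without_plugin


for app, (base, plugged) in timings_ms.items():
    delta, percent = overhead(base, plugged)
    print(f"{app}: +{delta} ms ({percent:.1f}%)")
# Digitalus: +1 ms (9.1%)
# BambooInvoice: +16 ms (8.7%)
# Capstone: +32 ms (35.6%)
```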
Runtime enforcement of NSMs
• Processing time when an action provokes a PHP warning (which the
NSM-enabled application blocks):
Application      Action          Time without plugin (ms)   Time with plugin (ms)
Digitalus        Create folder   28                         26
Digitalus        Upload media    18                         32
BambooInvoice    New invoice     938                        574
• ...and when an action provokes a PHP error:
Application      Action          Time without plugin (ms)   Time with plugin (ms)
Digitalus        Edit page       416                        32
Digitalus        Delete folder   424                        36
BambooInvoice    View invoice    564                        594
Runtime enforcement of NSMs
• Take-home point: runtime enforcement pays
– Reasonable overhead when everything is OK
– Can actually save CPU time by sparing the application from
processing an error
• Other advantage: prevents an error from occurring, instead of
recovering from it after the fact
– E.g.: rolling back database operations
– Catching an exception at the earliest moment vs. propagating it
deeper in the stack trace
• Reminder: also guarantees that the results obtained by static
verification hold
How many errors do we prevent?
• We can provide an estimate based on the ''colored'' transitions we
found while extracting the NSM
• Count the number of valid traces of length k-1 (i.e., that follow the NSM
from its start state)
• Then count the ways these traces can be extended to an invalid trace
of length k by executing an unexpected action
• Calculate the percentage of these unexpected traces that are not
caught by the application
• This ratio represents the proportion of traces of length k for which a
navigation constraint is assumed, but not checked, by the application.
• BambooInvoice: 64% of all unexpected navigation traces longer
than 4 can generate a cryptic error message
• Digitalus: 52% of all unexpected navigation traces longer than 4
can generate a cryptic error message
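The counting procedure can be sketched as follows, assuming a toy NSM and a hand-written set of "uncaught" (state, action) pairs standing in for the colored transitions. All names and data here are hypothetical; the sketch does not reproduce the BambooInvoice or Digitalus figures. A trace of length n is taken to be a sequence of n actions beginning with the start action.

```python
def valid_traces(nsm, start, length):
    """All action sequences of `length` actions that follow the NSM from `start`."""
    traces = [(start,)]
    for _ in range(length - 1):
        traces = [t + (nxt,) for t in traces for nxt in nsm.get(t[-1], ())]
    return traces


def uncaught_ratio(nsm, actions, uncaught, start, k):
    """Share of invalid length-k traces that the application fails to catch.

    Each valid trace of length k-1 is extended by every action the NSM does
    not allow at that point; an extension counts as uncaught when the pair
    (last action, unexpected action) is in `uncaught`.
    """
    total = bad = 0
    for trace in valid_traces(nsm, start, k - 1):
        last = trace[-1]
        for action in actions:
            if action not in nsm.get(last, ()):
                total += 1
                if (last, action) in uncaught:
                    bad += 1
    return bad / total if total else 0.0


actions = ["login", "home", "view", "logout"]
nsm = {"login": ["home"], "home": ["view", "logout"],
       "view": ["home", "logout"], "logout": ["login"]}
# Toy assumption: requesting "view" while logged out yields a cryptic error.
uncaught = {("login", "view"), ("logout", "view")}

print(uncaught_ratio(nsm, actions, uncaught, "login", 4))  # 0.2
```

With k = 4 there are two valid traces of length 3; they admit five unexpected extensions, one of which is uncaught, hence the ratio of 0.2.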
Related Work
• Navigation problems in Web applications were identified some time
ago [Licata and Krishnamurthi, ASE 2004]
• There are programming language based solutions for this problem that
use continuations [Krishnamurthi et al. 2006]
• Modeling web applications as state machines has been proposed and
investigated before [Miao, Zeng ICECCS 2008], [Han, Hofmeister
MODELS 2007]
• Runtime enforcement of navigation state machines is related to earlier
work on runtime monitoring and verification [see Runtime Verification
Conference]
Conclusions
• Web applications suffer from weak enforcement mechanisms for valid
navigation sequences; this is the source of cryptic and confusing errors
• Navigation State Machines (NSM) are finite state machines that can
formally represent valid navigation paths, along with constraints on
request parameters
• By combining enforcement of NSMs at runtime with static verification of
NSMs, we can...
– Prevent navigation errors from occurring
– Verify navigation properties by model checking the NSMs rather
than the applications themselves