CS 290C: Formal Models for Web Software
Lecture 6: Language and Model-Based Solutions
to Navigation Errors
Instructor: Tevfik Bultan
Eliminating Navigation Errors
• There are several approaches that have been proposed for
eliminating navigation errors
– Model driven development approaches where the
application is specified or enhanced using a formal
model
• For example: statecharts for modeling navigation
– Reverse engineering approaches where a formal model
is extracted from the application
• For example: Extracting a state machine model for
navigation by analyzing the links that are inserted in
web pages
Model Driven Development Approach
• Model driven development approach enables
– Specification of the behavior of the application at a high
level of abstraction, making it easier to develop
applications.
– The actual implementation can be automatically or
semi-automatically generated from the high level models
– Separation of concerns can be achieved by specifying
different concerns about the application (such as the
data model or the navigation constraints) using different
specification mechanisms
• However, model driven development requires the
developers to learn and use the modeling languages
• There is a concern about the mapping between the actual
implementation and the model (they have to be maintained
together)
Reverse Engineering Approach
• Reverse engineering approaches do not require
developers to learn a new specification language
• Since reverse engineering approaches extract a model
directly from the code, there are no maintenance issues
(when the application changes, we can extract a new
model)
• However, reverse engineering is hard:
– Extracting sound models using static analysis can lead
to very approximate models that do not contain much
information, while extracting more precise models can be
undecidable
– Extracting models by observing runtime behavior is not
sound and cannot be used to guarantee correctness
Language Based Approaches
• Both model driven development and reverse engineering
approaches can be considered software engineering
approaches
• Another approach would be to use a programming
language based approach
• Can we model the problems that appear in Web
applications in programming language terms and possibly
suggest solutions using programming language
mechanisms (such as type checking)?
Today I will discuss two approaches
• A language based approach for modeling and analyzing
navigation problems in Web applications where the
navigation problems are resolved using language-based
constructs (such as types):
“Modeling Web Interactions and Errors,” S. Krishnamurthi, R.
B. Findler, P. Graunke, and M. Felleisen.
• A model-driven approach where the navigation problems
are addressed using a formal model specifying the
navigation behavior and analyzing it:
"Eliminating Navigation Errors in Web Applications via Model
Checking and Runtime Enforcement of Navigation State
Machines.” Sylvain Halle, Taylor Ettema, Chris Bunch and
Tevfik Bultan.
Web Applications
• A Web program’s execution consists of a series of
interactions between a Web browser and a Web server
• When a browser submits an HTTP request whose URL points
to a Web program, the server invokes the corresponding
program
• It then waits for the program to terminate and turns the
program’s output into a response that the browser can
display, i.e., it returns a Web page.
• Each such program is a “script” that reads some inputs and
writes some output
Challenges of script oriented programming
• This simple request-response style programming using
scripts makes design of multi-stage Web interactions
difficult
• A multi-stage interactive Web program consists of many
scripts each handling one request
– These scripts communicate with each other via external
media since they must remember the earlier part of the
interaction
– Forcing scripts to communicate this way causes
problems since it leads to unstated and easily violated
invariants
Web Applications
• Use of the Web browser creates further complications
– A browser is designed to let a user navigate a web of
hyperlinked nodes
– When a user uses this power to navigate an interaction
with an application many unexpected scenarios can
happen
• User can backtrack to an earlier stage of the
interaction
• User can duplicate a page and generate parallel
interactions
A Language Based Approach
• We will first describe a formal model that captures the
essence of Web application behavior
• Then we will investigate the use of language based
techniques to address the navigation problems
A Formal Model
• A Web application (W) consists of
– a server (S) and
– a client (C)
• Server consists of
– a storage, and
– a dispatcher
• Dispatcher contains
– a table (P) of programs that associates URLs with
programs and
– an evaluator that applies programs from the table to the
submitted form
A Formal Model
• Every page is simply a form (F) that contains
– the URL to which the form is submitted, and
– a set of form fields
• Each field pairs a name with a value that can be edited by the client
• The client stores
– the current form and
– the sequence of all the forms that have been visited by
the client so far (cached pages)
Web Program Behavior
• The behavior of the Web program is described using three
types of actions:
– Fill-form: This corresponds to client editing values of
fields in the current form. The modified form becomes
the current form and is added to the cache
– Switch: Makes a form from the cache the current form
– Submit: dispatches on the current form’s URL to find a
program in the table P. This program accesses the
server state and the current form and updates the server
state and generates a new form which becomes the
current form
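The formal model and its three actions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the notation of the paper: the class names, the `greet` program and the URLs are hypothetical, and adding the server's response to the cache on submit is an assumption based on the cache holding all visited forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    url: str            # URL to which the form is submitted
    fields: tuple = ()  # (name, value) pairs, kept immutable

    def get(self, name):
        # Extract a field from the form.
        return dict(self.fields)[name]

    def with_field(self, name, value):
        # Build a new form with one field modified.
        updated = dict(self.fields)
        updated[name] = value
        return Form(self.url, tuple(sorted(updated.items())))


class Server:
    def __init__(self, programs):
        self.storage = {}         # server-side state
        self.programs = programs  # table P: URL -> program

    def dispatch(self, form):
        # Look up the program for the form's URL and apply it.
        return self.programs[form.url](self.storage, form)


class Client:
    def __init__(self, start_form):
        self.current = start_form
        self.cache = [start_form]  # all forms visited so far

    def fill_form(self, name, value):
        self.current = self.current.with_field(name, value)
        self.cache.append(self.current)

    def switch(self, index):
        self.current = self.cache[index]

    def submit(self, server):
        self.current = server.dispatch(self.current)
        self.cache.append(self.current)  # assumption: responses are cached too


def greet(storage, form):
    # Hypothetical program: counts visits and builds a new form.
    storage["visits"] = storage.get("visits", 0) + 1
    return Form("/done", (("message", "hello " + form.get("name")),))


server = Server({"/greet": greet})
client = Client(Form("/greet", (("name", ""),)))
client.fill_form("name", "Ada")
client.submit(server)
client.switch(1)       # like the Back button: return to the filled-in form
client.submit(server)  # the same form is submitted a second time
```

The last two lines show why Switch matters: the server state is updated twice although the user filled in the form only once.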
A Simple Web Programming Language
• A simple functional programming language can be specified
to characterize the basic operations that are required to
write a web application:
– Extract a field from a form
– Construct a new form
– Modify fields of a form
• To allow stateful programming we can introduce read and
write operations that allow read and write access to the
server storage
Navigation Problems
• Two navigation problems can be characterized formally in
this model:
– Script communication problem: Where a script
accepts a different type of form than what is delivered to
it. For example, the script tries to access a field that
does not exist in the form.
– HTTP observer problem: Since the HTTP protocol does
not allow a proper implementation of the observer
pattern (which enables independent observers to be
notified of state changes) a page received by the client
can become outdated when the data model changes in
the server.
Script Communication Problem and Types
• The main issue in script communication problem is type
mismatch between the forms generated and consumed by
different scripts
• Since these scripts are loosely coupled programs, there is
no standard type checking mechanism that can be used to
make sure that these type mismatches do not happen
• Checking all scripts together is not feasible since they are
developed incrementally and may reside on different Web
servers and may be written using different programming
languages
An Incremental Type System for Web Applications
• The proposed solution is the following:
– When the Web server receives a request for a URL that
is not already in its table, it installs the relevant program
– Before installing the relevant program it checks that
there is no type mismatch between the input form and the
installed program (internal consistency check)
– Furthermore it generates the type constraints that this newly
installed program imposes on the other programs on the
server that it interacts with (these become external
consistency checks)
• If either the internal or external check fails the program is
rejected resulting in an error
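The installation procedure above can be sketched as follows. This is a simplified illustration, not the type system of the paper: form types are plain dictionaries from field names to type names, two form types match only when they are equal, and the names `Program`, `Server.install` and the example URLs are hypothetical.

```python
class TypeMismatch(Exception):
    """Raised when an internal or external consistency check fails."""


class Program:
    def __init__(self, consumes, sends):
        self.consumes = consumes  # form type accepted: field name -> type name
        self.sends = sends        # target URL -> form type submitted there


class Server:
    def __init__(self):
        self.installed = {}    # URL -> Program
        self.constraints = {}  # URL -> form types that installed programs send there

    def install(self, url, program, input_form_type):
        # Internal check: the submitted form must match what the program consumes.
        if input_form_type != program.consumes:
            raise TypeMismatch(f"{url}: input form does not match program")
        # External check 1: constraints that earlier programs placed on this URL.
        for expected in self.constraints.get(url, []):
            if expected != program.consumes:
                raise TypeMismatch(f"{url}: violates an earlier constraint")
        # External check 2: constraints this program places on installed programs.
        for target, form_type in program.sends.items():
            other = self.installed.get(target)
            if other is not None and other.consumes != form_type:
                raise TypeMismatch(f"{url}: sends a wrong form to {target}")
        # All checks passed: record the new constraints and install the program.
        for target, form_type in program.sends.items():
            self.constraints.setdefault(target, []).append(form_type)
        self.installed[url] = program


server = Server()
search = Program({"city": "str"}, {"/book": {"flight": "int"}})
server.install("/search", search, {"city": "str"})

# /search promised to send {"flight": "int"} to /book, so this program is rejected.
bad_book = Program({"flight": "str"}, {})
try:
    server.install("/book", bad_book, {"flight": "str"})
    rejected = False
except TypeMismatch:
    rejected = True
```

Note that the rejected program passes the internal check; it fails only because of the constraint recorded when `/search` was installed, which is the incremental part of the scheme.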
A Simple Typed Web Programming Language
• The simple functional Web programming language can be
extended with types by requiring type declarations for
function arguments
• The type system for this language shows how external type
checking can be done
– While traversing the program, the type system generates
a set of type constraints on external programs
– Each constraint states a condition such as: a program
associated with a particular URL should consume Web
forms of a particular type
Solving Script Communication Problem with Type
Checking
• Using type checking with this incremental system it can be
guaranteed that
– scripts do not get stuck when they are processing
appropriately typed forms
– Server does not apply the scripts to forms with wrong
types
Solving the HTTP observer problem with timestamps
• Server keeps track of the number of processed
submissions (this represents time)
• The external storage is changed so that it maps locations to
values + timestamp for the last write
• The server also maintains the set of all storage locations
read or written during the execution of a script (called a
carrier set CS)
– When the server sends a page to the consumer, it adds the
current time stamp and this set of locations as an extra
hidden field
Solving the HTTP observer problem with timestamps
• A form with carrier set CS and time stamp T submitted to a
server is out of date if and only if any of the locations in CS
have a timestamp at the server that is greater than T
• A runtime error can be generated when out of date forms
are submitted preventing execution of scripts with out of
date data
– This approach solves the Orbitz problem of booking an
unintended flight
• However, this approach can also generate false positives
(for example a page counter value may make the form out
of date)
– So the programmers must specify which reads or writes
are relevant, and an error is generated only when a
relevant field is out of date
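A minimal sketch of the timestamp scheme, assuming a single in-memory server. The names `Server.submit`, `show_seat` and `take_seat` are hypothetical, and the refinement where programmers mark only some reads and writes as relevant is not modeled here.

```python
class StaleForm(Exception):
    """Raised when an out-of-date form is submitted."""


class Server:
    def __init__(self):
        self.clock = 0     # number of processed submissions, used as time
        self.storage = {}  # location -> (value, timestamp of last write)

    def is_stale(self, form):
        # Out of date iff some location in the carrier set was written after T.
        return any(self.storage[loc][1] > form["time"] for loc in form["carrier"])

    def submit(self, script, form=None):
        if form is not None and self.is_stale(form):
            raise StaleForm("form is out of date")
        self.clock += 1
        carrier = set()  # locations read or written by this script run

        def read(loc):
            carrier.add(loc)
            return self.storage[loc][0]

        def write(loc, value):
            carrier.add(loc)
            self.storage[loc] = (value, self.clock)

        page = script(read, write, form)
        # The timestamp and carrier set travel with the page as hidden fields.
        return {"page": page, "time": self.clock, "carrier": frozenset(carrier)}


def show_seat(read, write, form):
    return "seat: " + read("seat")


def take_seat(read, write, form):
    write("seat", "taken")
    return "booked"


server = Server()
server.storage["seat"] = ("free", 0)
window_a = server.submit(show_seat)  # first browser window
window_b = server.submit(show_seat)  # second window on the same data
server.submit(take_seat, window_b)   # accepted: nothing changed since window_b
try:
    server.submit(take_seat, window_a)  # window_a still shows the old state
    stale = False
except StaleForm:
    stale = True
```

The second booking attempt is refused because the location `seat` in the carrier set of `window_a` was written after that page was generated.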
Modeling web application behavior with continuations
• Another language-based approach that has been
investigated in web application development is the use of
continuations for modeling web application behavior
• A “continuation” is an abstract representation of the control
state of a program
• In the continuation-passing-style of programming the
control is passed explicitly using continuations
– When invoking a function written in the continuation-passing
style, the caller passes a continuation
that will be invoked with the return value of the callee
after the callee terminates
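As a small illustration of continuation-passing style, here is a sketch in Python; the function names are made up for this example. No function returns its result: each one hands the result to the continuation `k` it received.

```python
def add_cps(a, b, k):
    # Instead of returning a + b, pass it to the continuation k.
    k(a + b)


def square_cps(x, k):
    k(x * x)


def sum_of_squares_cps(a, b, k):
    # Each lambda is the continuation: "what to do next" with the callee's result.
    square_cps(a, lambda a2:
        square_cps(b, lambda b2:
            add_cps(a2, b2, k)))


results = []
sum_of_squares_cps(3, 4, results.append)  # the final continuation stores the value
```

After the call, `results` holds the single value 25.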
Modeling web application behavior with continuations
• Using continuations we do not have to think of a web
application as a collection of scripts
• Using continuations we can capture the behavior of a web
application as a single program that suspends its behavior
while interacting with the user
• When a web application is invoked by submitting a form,
after it performs its task, it outputs the result and a
continuation
– This continuation then is used to process the next form
submission
Modeling web application behavior with continuations
• In the continuation-based model, when a page is sent to the
user, the current “continuation” is captured and stored in a
table.
– The form sent to the user contains a URL that contains a
reference to that table entry
– When user submits the form, the server invokes the
corresponding continuation which then continues the
execution from the corresponding control location
• If you are interested in this topic here is a paper that
discusses this view:
“The Influence of Browsers on Evaluators or, Continuations to
Program Web Servers,” Christian Queinnec.
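The table of continuations described above can be sketched with Python closures. This is an assumption-laden illustration, not Queinnec's implementation: the `/k/<id>` URL scheme, the class `ContinuationServer` and the `adder` application are hypothetical.

```python
import itertools


class ContinuationServer:
    def __init__(self):
        self.table = {}                # id -> stored continuation
        self._ids = itertools.count(1)

    def send_page(self, prompt, continuation):
        # Capture the continuation and embed a reference to it in the page URL.
        key = next(self._ids)
        self.table[key] = continuation
        return {"prompt": prompt, "url": f"/k/{key}"}

    def submit(self, url, value):
        # Resume execution from the control location the URL refers to.
        key = int(url.rsplit("/", 1)[1])
        return self.table[key](value)


def adder(server):
    # One program that asks for two numbers, suspending at each page.
    def got_first(first):
        def got_second(second):
            return {"result": int(first) + int(second)}
        return server.send_page("second number?", got_second)
    return server.send_page("first number?", got_first)


server = ContinuationServer()
page1 = adder(server)
page2 = server.submit(page1["url"], "3")
answer = server.submit(page2["url"], "4")

# Back button: resubmitting page1 starts a second, independent branch.
page3 = server.submit(page1["url"], "10")
other = server.submit(page3["url"], "4")
again = server.submit(page2["url"], "5")  # the first branch still remembers 3
```

Because each stored continuation closes over its own `first`, going back and resubmitting an earlier page does not disturb the other branch.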
Modeling web application behavior with continuations
• Using this continuation-based approach, one can
investigate the effects of using the Back button, multiple
window creation, direct URL entry, etc. in a web application.
A model-based approach to navigation problems
• We have discussed some language-based ideas for dealing
with navigation problems in web applications, now we will
discuss a model-based approach
• One successful approach to web application development
has been adoption of design patterns that bring some
structure to the scripts that implement the web application
• Web application development frameworks that adopt these
design patterns have become very successful
Model-View-Controller (MVC) Architecture
• MVC is a design structure for separating representation
from presentation using a subscribe/notify protocol
• The basic idea is to separate
– where and how data (or more generally some state) is
stored, i.e., the model
– from how it is presented, i.e., the views
• Follows basic software engineering principles:
– Separation of concerns
– Abstraction
Model-View-Controller (MVC) Architecture
• MVC consists of three kinds of objects
– Model is the application object
– View is its screen presentation
– Controller defines the way the user interface
reacts to user input
[Figure: a model holding the values a=50%, b=30%, c=20%, presented through several views]
Model-View-Controller (MVC) Architecture
• MVC decouples views and models by establishing a
subscribe/notify protocol between them
– whenever model changes it notifies the views that
depend on it
– in response each view gets an opportunity to update
itself
• This architecture allows you to attach multiple views to a
model
– it is possible to create new views for a model without
rewriting it
Model-View-Controller (MVC) Architecture
• Taken at face value this may be seen as an architecture for
user interface design
– It actually addresses a more general problem:
• decoupling objects so that changes to one can affect
any number of others without requiring the changed
object to know the details of the others
– This is called the Observer pattern in the design patterns
catalog
• Observer pattern is a design pattern that is used as part of
the Model-View-Controller (MVC) architecture to handle
notification of multiple views that depend on a single model
A Brief Overview of Design Patterns
• Think about the common data structures you learned
– Trees, Stacks, Queues, etc.
• These data structures provide a set of tools on how to
organize data
• Probably you implement them slightly differently in different
projects
A Brief Overview of Design Patterns
• Main concepts about these data structures, such as
– how to store them
– manipulation algorithms
are well understood
• You can easily communicate these data structures to
another software developer by just stating their name
• Knowing them helps you when you are dealing with data
organization in your software projects
– Better than re-inventing the wheel
A Brief Overview of Design Patterns
• This is the question:
– Are there common ideas in architectural design of
software that we can learn (and give a name to) so that
• We can communicate them to other software
developers
• We can use them in architectural design in a lot of
different contexts (rather than re-inventing the wheel)
• The answer is yes according to E. Gamma, R. Helm, R.
Johnson, J. Vlissides
– They developed a catalog of design patterns that are
common in object oriented software design
A Brief Overview of Design Patterns
• Design patterns provide a mechanism for expressing
common design structures
• Design patterns identify, name and abstract common
themes in software design
• Design patterns can be considered micro architectures that
contribute to overall system architecture
• Design patterns are helpful
– In developing a design
– In communicating the design
– In understanding a design
A Brief Overview of Design Patterns
• The origins of design patterns are in architecture (not in
software architecture)
• Christopher Alexander, a professor of architecture at UC
Berkeley, developed a pattern language for expressing
common architectural patterns
• Work of Christopher Alexander inspired the work of Gamma
et al.
• In explaining the patterns for architecture, Christopher
Alexander says:
“Each pattern describes a problem which occurs over
and over again in our environment, and then describes
the core of the solution to that problem, in such a way
that you can use this solution a million times over,
without ever doing it the same way twice”
• These comments also apply to software design patterns
Resources for Design Patterns
• Original paper:
– “Design Patterns: Abstraction and Reuse of Object-Oriented Design” by E. Gamma, R. Helm, R. Johnson, J.
Vlissides
• Later, same authors published a book which contains an
extensive catalog of design patterns:
– “Design Patterns: Elements of Reusable Object-Oriented Software”, by E. Gamma, R. Helm, R. Johnson,
J. Vlissides, Addison-Wesley, ISBN 0-201-63361-2
Cataloging Design Patterns
• Gamma et al. present:
– A way to describe design patterns
– A way to organize design patterns by giving a
classification system
• More importantly, in their book on design patterns, the
authors give a catalog of design patterns
– As a typical developer you can use patterns from this
catalog
– If you are a good developer you can contribute to the
catalog by discovering and reporting new patterns
• The template for describing design patterns used by
Gamma et al. is given in the next slide
Design Pattern Template
DESIGN PATTERN NAME
The name should convey the pattern’s essence succinctly
Jurisdiction Characterization
used for categorization
Intent
What particular design issue or problem does the design pattern address?
Motivation
A scenario in which the pattern is applicable. This will make it easier to understand the
more abstract description that follows.
Applicability
What are the situations in which the design pattern can be applied?
Participants
Describe the classes and/or objects participating in the design pattern and their
responsibilities.
Collaborations
Describe how the participants collaborate to carry out their responsibilities.
Diagram
A class diagram representation of the pattern (extended with pseudo-code).
Consequences
What are the trade-offs and results of using the pattern?
Implementation
What pitfalls, hints, or techniques should one be aware of when implementing the pattern?
Examples
Examples of applications of the pattern in real systems.
See Also
What are the related patterns and what are their differences?
Observer Pattern
• The Observer pattern is a design pattern based on the
Model-View-Controller (MVC) architecture
• In the next slides, I will give the design pattern catalog entry
for the Observer pattern.
Observer
Behavioral
Intent
The Observer pattern defines a one-to-many dependency between a subject object and
any number of observer objects so that when the subject object changes state, all its
observer objects are notified and updated automatically.
Motivation
The Observer design pattern has two parts: subject and observer. The
relationship between subject and observer is one-to-many. In order to reuse subject and
observer independently, they have to be decoupled. An example of using the
observer pattern is a graphical interface toolkit that separates the presentation
aspect from the application data. The presentation aspect is the observer part and the
application data aspect is the subject part.
For example, in a spreadsheet program, the Observer pattern can be applied to separate
the spreadsheet data from its different views. In one view the spreadsheet data can be
presented as a bar graph and in another view it can be presented as a pie chart.
The spreadsheet data object notifies the observers whenever there is a data change that
can make its state inconsistent with its observers.
Class Diagram for the Observer Pattern
Subject: observers; Attach(Observer); Detach(Observer); Notify()
Notify(): for all o in observers { o->Update(); }
Observer: Update()
ConcreteSubject: subjectState; GetState(); SetState()
GetState(): return subjectState;
ConcreteObserver: observerState; Update()
Update(): observerState = subject->GetState();
[Figure: sequence diagram with :ConcreteSubject, a:ConcreteObserver and b:ConcreteObserver. SetState() is followed by Notify(), which calls Update() on a and on b; each Update() calls GetState().]
Applicability
Use the observer pattern in any of the following situations:
• When the abstraction has two aspects with one dependent on the other.
Encapsulating these aspects in separate objects will increase the chance to reuse
them independently.
• When the subject object doesn't know exactly how many observer objects it has.
• When the subject object should be able to notify its observer objects without
knowing who these objects are.
Participants
• Subject
– Knows its observers
– Has any number of observers
– Provides an interface for attaching and detaching observer objects at run time
• ConcreteSubject
– Stores the subject state that is of interest to the observers
– Sends notifications to its observers
• Observer
– Provides an update interface for receiving notifications from the subject
• ConcreteObserver
– Maintains a reference to a ConcreteSubject object
– Maintains the observer state
– Implements the update operation
Consequences
Further benefits and drawbacks of the Observer pattern include:
• Abstract coupling between subject and observer: each can be extended and reused
individually.
• Dynamic relationship between subject and observer: the relationship can be
established at run time. This gives a lot more programming flexibility.
• Support for broadcast communication. The notification is broadcast automatically to
all interested objects that subscribed to it.
• Unexpected updates. Observers have no knowledge of each other and are blind to the
cost of changing the subject. With the dynamic relationship between subject and
observers, the update dependency can be hard to track down.
Known Uses
• Smalltalk Model/View/Controller (MVC): a user interface framework where the Model is
the subject and the View is the observer.
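The catalog entry above can be made concrete with a short Python sketch of the spreadsheet example. The class names `SpreadsheetData` and `ChartView` are illustrative choices for the ConcreteSubject and ConcreteObserver roles; they are not taken from the catalog.

```python
class Subject:
    def __init__(self):
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def detach(self, observer):
        self._observers.remove(observer)

    def notify(self):
        # Iterate over a copy so observers may detach during notification.
        for observer in list(self._observers):
            observer.update()


class SpreadsheetData(Subject):  # plays the ConcreteSubject role
    def __init__(self):
        super().__init__()
        self._state = {}

    def get_state(self):
        return dict(self._state)

    def set_state(self, state):
        self._state = dict(state)
        self.notify()  # every state change is broadcast to the observers


class ChartView:  # plays the ConcreteObserver role
    def __init__(self, subject):
        self.subject = subject
        self.observer_state = {}
        subject.attach(self)

    def update(self):
        # Pull the new state from the subject when notified.
        self.observer_state = self.subject.get_state()


data = SpreadsheetData()
bar_chart = ChartView(data)
pie_chart = ChartView(data)
data.set_state({"a": 50, "b": 30, "c": 20})  # both views are updated
data.detach(pie_chart)
data.set_state({"a": 60, "b": 40})           # only bar_chart sees this change
```

The subject never refers to `ChartView` by name; it only relies on each observer having an `update` method, which is the abstract coupling listed under Consequences.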
Back to MVC
How do the MVC architecture and the Observer pattern relate
to Web applications?
• The reason we are discussing the MVC architecture is that
many Web applications nowadays are built based on the
MVC architecture
• The reason we are discussing the Observer pattern is that
the MVC-based Web applications do not properly use the
Observer pattern, causing problems
MVC Architecture in Web Applications
• Many web frameworks support web application
development based on the MVC architecture
– Ruby on Rails, Zend Framework for PHP, CakePHP,
Spring Framework for Java, Struts Framework for Java,
Django for Python, …
• MVC architecture has become the standard way to
structure web applications
MVC Framework for Web Applications
• Use of MVC architecture in Web applications
– Model: This is the data model which is an abstract
representation of the data stored in the backend
database. Typically uses an object-relational mapping to
map the class structure for the data model to the tables
in the back-end database
– Views: These are responsible for rendering of the web
pages, i.e., how the data is presented in the user’s browser
– Controllers: Controllers are basically event handlers that
process incoming user requests. Based on a user
request, they can update the data model, and create a
new view to be presented to the user
MVC Framework for Web Applications
• Note that use of MVC in web applications does not fit the
Observer pattern
– typically it is not possible to refresh a browser window
directly when the data model changes (i.e., it is not
possible to actively notify the observers when the state
of the subject has changed)
• This can create navigation problems
– when there are multiple windows open, they may
represent stale views
– this was the problem in the Orbitz example we discussed
earlier
Abstraction in MVC Frameworks
• MVC framework provides separation of concerns and
abstraction, which can be exploited for analysis
– For example, for analyzing properties of the data model
we can focus on the data model and ignore the views
– We can focus on the behaviors of the controllers to
eliminate navigation errors
Achieving Navigation Correctness in MVC
• I will discuss some work we have done recently on
analyzing navigation behavior in web applications
developed using MVC frameworks
• The idea is
– to exploit the abstraction provided by the MVC
architecture by enforcing navigation constraints at the
controller
– use model driven development to provide a navigation
model and analyze it statically
– enforce the navigation model dynamically at runtime to
prevent navigation errors
Request processing in a Web application
A formal model
We can formally model an MVC application as
• M: is a set of data model states, where the data model can
include any stateful representation of application data, such
as a database,
• V: is a set of session variables, i.e., data stored on the
server on a per-client basis,
• I: is a set of sessions used by the server to associate clients
with session variables,
• A: is a set of controller actions, i.e., the program segments
that are invoked based on the HTTP requests sent by the
user,
• P: is a set of request parameters, i.e., input data from the
user received as part of the HTTP requests via GET or
POST.
A formal model
• A web application is a tuple A = (Q, B, T) where:
– Q is the set of states, which is the Cartesian product of
the model states and the domains of the session
variables for each session
– B is the set of initial states
– T is the transition relation mapping a state, an action, a
set of request parameters and a session to a next state
• The transition relation must guarantee that each session
can only modify its own session variables
• The model can change even when there is no action
executed (i.e., the backend database contents can change
without a request from a user)
Session traces
• Given the formal model, we can define global execution
traces of a web application
– Each trace starts from an initial state
– Each element in the trace is a tuple:
(state, action, request parameters, session index)
– Any two consecutive tuples in the trace must be
consistent with the transition relation of the application
• We can project each global trace to a session i (by deleting
the tuples which do not contain i) and obtain a session
trace
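Projection of a global trace onto one session can be sketched as a simple filter. The tuple layout follows the slide; the `Step` type, the state labels and the example requests are hypothetical, and the check against the transition relation is left out.

```python
from typing import NamedTuple


class Step(NamedTuple):
    state: str     # global application state, abstracted to a label here
    action: str    # controller action that was executed
    params: tuple  # request parameters as (name, value) pairs
    session: int   # index of the session that issued the request


def project(trace, session):
    """Keep only the steps of the global trace that belong to one session."""
    return [step for step in trace if step.session == session]


global_trace = [
    Step("q0", "login", (("user", "ann"),), 1),
    Step("q1", "login", (("user", "bob"),), 2),
    Step("q2", "view", (("page", "3"),), 1),
    Step("q3", "logout", (), 2),
    Step("q4", "logout", (), 1),
]
session_1 = project(global_trace, 1)
actions_1 = [step.action for step in session_1]
```

For session 1 the projected action sequence is login, view, logout, even though requests from session 2 are interleaved in the global trace.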
Navigation state machines (NSMs)
• A navigation state machine (NSM) is a state machine
– that specifies acceptable sequences of actions and
request parameters that can appear in a session trace
• Given a navigation state machine, and a session trace,
– the session trace conforms to the navigation state
machine
• if the sequence of actions and request parameters for
that session trace is accepted by the navigation state
machine
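A conformance check can be sketched as running the session's requests through the machine. This sketch makes simplifying assumptions that are not part of the definition above: the machine is deterministic, every state is accepting, and guards are arbitrary Python predicates over the request parameters. The state and action names are hypothetical.

```python
class NSM:
    def __init__(self, initial, transitions):
        self.initial = initial
        # Each transition is (source, action, guard, target); the guard is a
        # predicate over the request parameters.
        self.transitions = transitions

    def step(self, state, action, params):
        for source, act, guard, target in self.transitions:
            if source == state and act == action and guard(params):
                return target
        return None  # no transition allows this request

    def accepts(self, requests):
        state = self.initial
        for action, params in requests:
            state = self.step(state, action, params)
            if state is None:
                return False
        return True


def anything(params):
    return True


def numeric_page(params):
    return params.get("page", "").isdigit()


nsm = NSM("anonymous", [
    ("anonymous", "login", anything, "member"),
    ("member", "view", numeric_page, "member"),
    ("member", "logout", anything, "anonymous"),
])

good = [("login", {}), ("view", {"page": "3"}), ("logout", {})]
skipped_login = [("view", {"page": "3"})]
tampered = [("login", {}), ("view", {"page": "3; drop"})]
```

Here `good` conforms, `skipped_login` is rejected because of the action order, and `tampered` is rejected because of the request parameters.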
Navigation state machines
• The states of a navigation state machine are defined by
– the values of the session variables,
– the last action executed by the application,
– and the request parameters that were sent with the last
action
• We assume that this information is enough to figure out
what are the next actions that can be executed by the
application
Navigation state machines
• We developed a simple language to specify navigation
state machines
• It is a state machine that shows the allowable sequences of
controller action executions in a web application
• MVC frameworks typically use a hierarchical structure
where actions are combined in the controllers and
controllers are grouped into modules
– We exploit this hierarchy to specify the navigation state
machines as hierarchical state machines (statecharts)
Navigation state machines
• In addition to identifying which action can be executed after
which other action,
– navigation state machines also identify constraints
among the request parameters between two consecutive
requests
– This can be used to make sure that the values stored in
cookies are not changed, for example
Navigation state machine example
• A portion of the navigation state machine for the Digitalus
system (an open source content management system)
What can we do with NSMs
• If we can check that the web application conforms to the
NSM,
– then we can verify navigation properties on the NSM and
conclude that the navigation properties hold for the
application
– We can use model checking to check properties of
NSMs
– This way we can eliminate the navigation errors
• Big problem: How do we ensure that the application
conforms to the NSM?
Runtime Enforcement
• Statically verifying that a web application conforms to a
navigation state machine is a very difficult problem (in
general undecidable)
• So, instead, we use runtime enforcement
– We have a plugin that can be easily added to an MVC
web application that takes an NSM as input and makes
sure that every incoming request conforms to the NSM
– If the incoming request does not obey the NSM, then the
plugin either ignores the request and refreshes the
previous page or generates an appropriate error
message
– This way non-compliant user requests can be handled
uniformly without generating strange error messages
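The idea of the enforcement plugin can be sketched as a wrapper placed in front of the controllers. This is not the actual plugin: the class `NavigationEnforcer`, the flat transition table and the error text are assumptions made for illustration, and parameter constraints are left out.

```python
class NavigationEnforcer:
    def __init__(self, transitions, initial, controllers):
        self.transitions = transitions  # (state, action) -> next state
        self.initial = initial
        self.controllers = controllers  # action -> function(params) -> page
        self.session_state = {}         # session id -> current NSM state
        self.last_page = {}             # session id -> last page served

    def handle(self, session, action, params):
        state = self.session_state.get(session, self.initial)
        next_state = self.transitions.get((state, action))
        if next_state is None:
            # Non-compliant request: refresh the previous page if there is
            # one, otherwise answer with a uniform error message.
            return self.last_page.get(session, "error: request not allowed here")
        page = self.controllers[action](params)
        self.session_state[session] = next_state
        self.last_page[session] = page
        return page


transitions = {
    ("anonymous", "login"): "member",
    ("member", "view"): "member",
    ("member", "logout"): "anonymous",
}
controllers = {
    "login": lambda params: "welcome",
    "view": lambda params: "page " + params["page"],
    "logout": lambda params: "goodbye",
}
enforcer = NavigationEnforcer(transitions, "anonymous", controllers)
first = enforcer.handle("s1", "view", {"page": "3"})   # not logged in yet
second = enforcer.handle("s1", "login", {})
third = enforcer.handle("s1", "login", {})             # second login is refused
fourth = enforcer.handle("s1", "view", {"page": "3"})
```

The controller for a refused request is never invoked, so the application code does not need its own checks for out-of-order requests.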
Model checking NSMs
• We translate NSM models to SMV
• We write navigation constraints in temporal logic:
G (login => (!login U logout))
• We check the properties using a model checker
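To give a feeling for what the model checker does, here is a hand-written explicit-state search in Python. It is a stand-in for illustration only, not SMV and not the translation used in this work. It checks one property in the spirit of the formula above, namely that after a login no second login occurs before a logout, and returns a counterexample when the property fails. The two example machines are hypothetical.

```python
from collections import deque


def find_double_login(transitions, initial):
    """Breadth-first search over (NSM state, logged_in) pairs.

    transitions maps a state to a list of (action, next_state) pairs.
    Returns a violating action sequence, or None if the property holds.
    """
    start = (initial, False)
    parent = {start: None}  # node -> (previous node, action taken)
    queue = deque([start])
    while queue:
        node = queue.popleft()
        state, logged_in = node
        for action, next_state in transitions.get(state, []):
            if action == "login" and logged_in:
                # Rebuild the counterexample by walking back to the start.
                trace = [action]
                current = node
                while parent[current] is not None:
                    previous, taken = parent[current]
                    trace.append(taken)
                    current = previous
                return trace[::-1]
            if action == "login":
                flag = True
            elif action == "logout":
                flag = False
            else:
                flag = logged_in
            successor = (next_state, flag)
            if successor not in parent:
                parent[successor] = (node, action)
                queue.append(successor)
    return None


good_nsm = {
    "home": [("login", "dash")],
    "dash": [("view", "dash"), ("logout", "home")],
}
bad_nsm = {
    "home": [("login", "dash")],
    "dash": [("view", "dash"), ("logout", "home"), ("login", "dash")],
}
no_violation = find_double_login(good_nsm, "home")
counterexample = find_double_login(bad_nsm, "home")
```

For `good_nsm` the search finds no violation; for `bad_nsm` it reports two consecutive logins as the counterexample.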
Model checking and Runtime Enforcement
• Our approach combines the following ideas:
– Using model driven development for specification of
navigation constraints
– Using model checking to verify properties of formal
navigation models
– Using runtime enforcement to ensure that the navigation
behavior at runtime obeys the navigation model
• We show that when all these are combined, navigation
errors can be eliminated