Transcript slides ppt

Verifying Interactive
Web Programs
Daniel R. Licata
Shriram Krishnamurthi
Brown University
Popular Press
Quote:
But when I clicked on the National [car rental]
price […], the site responded with this message:
“You have back-buttoned too far”. This was my
first experience with “back-button” as a verb. […]
Since that was patently untrue, I decoded its true
meaning: “We ran out [of cars]”.
–M. Slatalla, New York Times, 2003-07-17
A Headache for Companies
●
●
Minor problem: Users might get booked into the
wrong hotels, onto the wrong flights, etc.
Major problem: People might embarass you in
newspapers and in public talks
The Orbitz Property
●
●
Orbitz Property: the user should receive a
reservation at the hotel that was displayed on the
page he submitted
In other words, the result does not depend on the
page on which you did not click “Reserve”
Should all sites have this property?
Question
What would Amazon want?
The Amazon Property
Amazon property: at the end, every book the user
added to his shopping cart is actually in his
shopping cart
These properties are
●
not fixed in number
●
temporal in nature
 model checking
Model Checking
1. From the source code of a program, generate a
model that captures the behaviors of interest
2. Consume properties written by the developer
3. Automatically check whether or not the model
satisfies the properties
Model Checking
1. From the source code of a program, generate a
model that captures the behaviors of interest
2. Consume properties written by the developer
3. Automatically check whether or not the model
satisfies the properties
Modelling Web Programs
Model = control-flow graph (CFG)
What would a model of Orbitz look like?
Modelling Orbitz
display
hotel list
set
chosen
use chosen
to compute
displayed
display
reservation
use chosen
to compute
reserved
display details
for displayed
User Operations add Control Flow
●
●
The browser's back-button introduced control
flow not present in the original CFG
Other browser operations do the same
How many operations do today's browsers provide?
One Browser
Alt+Tab
How can we model all of these operations?
User Operation Calculus
●
Express all browser operations in terms of
primitive user operations:
●
submit form to server
●
switch to previously-visited page
[Graunke et al., 2003]
●
Only need to account for these two operations'
control flow
Our Model: the WebCFG
●
submit corresponds to program's control flow
Already in the CFG
●
switch permits returning to any previouslyvisited Web-interaction point
Add edges from each Web-interaction node to the
successors of all the others (WebCFG)
The Orbitz CFG
display
hotel list
set
chosen
use chosen
to compute
displayed
display
reservation
use chosen
to compute
reserved
display details
for displayed
The Orbitz WebCFG
display
hotel list
set
chosen
use chosen
to compute
displayed
display
reservation
use chosen
to compute
reserved
display details
for displayed
Model Checking
1. From the source code of a program, generate a
model that captures the behaviors of interest
2. Consume properties written by the developer
3. Automatically check whether or not the model
satisfies the properties
Properties
We want to state properties about Web pages
Properties
Web pages are written as HTML source
<html>
<body bgcolor=yellow>
<table>
<td>
<p>
Residence Inn by
Marriot Charleston
Downtown
...
</html>
Properties
We want to reason about Web page texts
<html>
<body bgcolor=yellow>
<table>
<td>
<p>
Residence Inn by
Marriot Charleston
Downtown
...
</html>
Properties
How can we associate these texts with
the corresponding HTML source?
<html>
<body bgcolor=yellow>
<table>
<td>
<p>
Residence Inn by
Marriot Charleston
Downtown
...
</html>
Relating Web Page Content to Source
●
Parse the text?
Too hard
●
Static-distance coordinates?
Too brittle
What else can we do?
Relating Web Page Content to Source
Capitalize on Cascading Style Sheet (CSS) ID tags!
<html>
<body bgcolor=yellow>
<table>
<td>
<p id=”reserved”>
Residence Inn by
Marriot Charleston
Downtown
...
</html>
Relating Web Page Content to Source
●
●
If the tag is in the HTML, it must be present in
the source of the program that generates the page
This relates Web page text to the Web program
source expression that generates it
Annotating the WebCFG
Annotate each WebCFG state with the propositions
true in that state
<html>
<body bgcolor=yellow>
<table>
generate reservation page
<td>
<p id=”reserved”>
Residence Inn by
Marriot Charleston
Downtown
...
</html>
generate reservation text
tag=reserved
Defining our Property Language
●
●
●
The annotated WebCFG describes the set of
traces that potentially occur
The developer writes an automaton accepting the
set of traces that should occur
Verification is containment of the former
in the latter
[Vardi and Wolper, 1986]
Example Property
Password-Page Property: Before reaching an accesscontrolled page, the user must go through a
password page
tag=password-entry
2
1
tag=access-controlled
violation
Note: In properties,
tags label transitions
Expressing the Orbitz Property
Orbitz Property: the user should receive a
reservation at the hotel that was displayed on the
page he submitted
Divide and conquer!
Orbitz Subproperty 1
Property:
chosen does not
change between
the computation
of displayed
and the
computation of
reserved
display
hotel list
set
chosen
use chosen
to compute
displayed
display
use
chosen
display
details
reservation
to compute
for
displayed
reserved
We need additional propositions to express this property!
Orbitz Subproperty 1
Property:
display
hotel list
chosen does not
change between
the computation
of displayed
and the
computation of
reserved
set
chosen
use chosen
to compute
displayed
display
use
chosen
display
details
reservation
to compute
for
displayed
reserved
set and join enable reasoning about data
Orbitz Subproperty 1
Property:
chosen does not
change between
the computation
of displayed
and the
computation of
reserved
(set,chosen)
(join,chosen)
1
violation
2
tag=reserved
set and join enable reasoning about data
Orbitz Subproperty 2
display
hotel list
set
chosen
Property:
the value of
reserved comes
from the value of
displayed
use chosen
to compute
displayed
display
use
chosen
display
details
reservation
to compute
for
displayed
reserved
We need additional propositions to express this property!
Orbitz Subproperty 2
(tagged,displayed,X)
Property:
the value of
reserved comes
from the value of
displayed
1
2
(tagged,reserved,X)
violation
(tagged,reserved,X)
Augment CSS tagged propositions with additional
information for reasoning about value flow
Property Idioms
●
●
●
Writing these automata correctly is tricky
The two Orbitz subproperties and the Amazon
property occur repeatedly
We provide abstractions of these properties as
idioms in our property language
Model Checking
1. From the source code of a program, generate a
model that captures the behaviors of interest
2. Consume properties written by the developer
3. Automatically check whether or not the model
satisfies the properties
Verification Process
The model and properties we have described
are checkable by language containment
The Orbitz WebCFG
display
hotel list
set
chosen
use chosen
to compute
displayed
display
reservation
use chosen
to compute
reserved
display details
for displayed
The Orbitz WebCFG
display
hotel list
set
chosen
use chosen
to compute
displayed
display
reservation
use chosen
to compute
reserved
display details
for displayed
Verification Process
●
The model and properties we have described are
compatible with the FLAVERS algorithms
[Cobleigh, Naumovich, Clarke, and Osterweil, 2001-2002]
●
●
FLAVERS supports “constraint” automata
We can automatically generate constraints that
rule out all the infeasible forward paths
Status
We have begun to apply our model checker to
CONTINUE, a Web-based conference management
application
●
●
Written in Scheme; send/suspend primitive creates
Web-interaction points
MrFlow implements SBA
[Heintze, 1994; Flanagan and Felleisen, 1996; Meunier, 2001]
Minimization
●
●
●
Some WebCFG states are not labeled
We remove these from the model without
affecting results
CONTINUE: from ~17,000 to ~300 states
Future Work
●
Better data reasoning (verification conditions)
●
Concurrency
●
Case studies and more idioms
Perspective
●
Work encompasses traditional verification
●
Structure of Web source programs matters
●
Nature of environment models changes