Politecnico
di Milano
Web quality
Luciano Baresi
Politecnico di Milano – Dipartimento di Elettronica e Informazione
Piazza L. da Vinci, 32 – 20133, Milano (Italy)
[email protected]
Who am I?
Research positions
Associate professor at Politecnico di Milano (Italy)
Visiting positions at
University of Oregon (USA)
University of Paderborn (Germany)
Professional Activities
Co-chair of GT-VMT 01, ICECCS 02, UMICS 03
PC member of several conferences (ICWE 04)
Co-founder and senior partner of INQuaS
www.elet.polimi.it/~baresi
Web Quality - Buenos Aires (Argentina) July 21-25
Politecnico di Milano
The oldest technical university in Italy
One of the oldest in Europe
5 schools
1,000 faculty members
25,000 students
Quality
A definition
A methodological approach to the analysis and management of
all company processes
This activity aims at
Reducing waste
Improving products and services
Improving processes
We all know what quality is
Different perspectives
Quality is quality but different roles have different views. For
example
Stakeholders
They “see” the business
Users
They want to “use” the system
Developers
They must develop and maintain the system
Process quality vs. product quality
Process quality
We aim at introducing special-purpose characteristics to
control
Costs
Time-to-delivery
Quality
Competitiveness
Product quality
We aim at products with particular quality indicators,
regardless of the process behind them
They must be able to match explicit and implicit requests
from customers
Quality & development process
RUP
Quality dimensions
Usability
Dependability/Reliability
Correctness
Robustness
Scalability
Performability
Security
internal vs. external properties
Web applications
(at least their architectures)
Web applications vs. Web sites
Why Web applications?
N-tier architecture
Client
Internet
Firewall
internal users
Internal network
SMTP server Web server DB server
The Web is heterogeneous by definition
Too many technologies (Applet, Servlet, JSP, ASP, XML, …)
Much more importance to presentation and communication
Distribution
…
Heterogeneity
Components
HTML
Scripting languages
Databases
Multimedia contents
Expertise
System designers
DB administrators
Designers
Programmers
Testers
Development and maintenance are complex (more complex?)
Client/Server
Entities that are logically distinct, but linked by a network,
and that work together to accomplish a given task
Requests
Services
Logical components
Presentation
User interactions
Data management
Access, persistence, queries
Application logic
Business oriented computations
[Figure: the three logical components stacked as Presentation, Application logic, Data management]
In some cases some of these elements can be absent
Embedded systems have no presentation layers
Client and server responsibilities
[Figure: five ways, labeled A to E, of splitting Presentation, Logic, and Data management between client and server across the network]
Client and server responsibilities
Distributed presentation
The server owns all the knowledge
The client only interacts with the server
Example: HTML form (no control on the quality of data)
Remote presentation
The client is fully in charge of presentation
Distributed logic
The logic is partly on the server and partly on the client
Remote data access
Presentation and logic are on the client that interacts with a
server to access data (SQL interface)
Distributed database
The data management functionality is partially on the client and
partially on the server (e.g., IBM's Distributed Relational Database
Architecture)
Fat client vs. Fat server
Clients are fat if they implement part of the application logic
(C, D, E)
Why fat clients
The system can handle user inputs more quickly
The system interacts with the user with a finer granularity
Different clients can implement interfaces that are
specific to different users
The system is more scalable
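As a rough sketch, the five configurations can be written down as data, and the fat-client rule derived from them. The mapping of the letters A to E onto the five names assumes they follow the order in which the configurations are listed; the layer names and the `is_fat_client` helper are illustrative and not taken from the slides.

```python
# Layers hosted (fully or partly) on the client in each configuration.
# Assumption: the letters A-E follow the order in which the slides list them.
CLIENT_LAYERS = {
    "A": ("Distributed presentation", {"presentation"}),
    "B": ("Remote presentation", {"presentation"}),
    "C": ("Distributed logic", {"presentation", "logic"}),
    "D": ("Remote data access", {"presentation", "logic"}),
    "E": ("Distributed database", {"presentation", "logic", "data"}),
}


def is_fat_client(letter):
    """A client is fat if it implements part of the application logic."""
    _name, layers = CLIENT_LAYERS[letter]
    return "logic" in layers


fat_clients = [letter for letter in CLIENT_LAYERS if is_fat_client(letter)]
print(fat_clients)  # ['C', 'D', 'E']
```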
Fat client vs. Fat server
Why not fat clients
Interactions with the server can become too frequent
during complex computations
Servers could better control data access if they control
the computation
We lose data encapsulation
The client must know in detail how data are organized
on the server
System management and maintenance become more
complex
Both clients and server must be updated
It would be easier if we could update only the server
Three-tier architectures
Client
Intermediate
component
Server
Presentation
Application
logic
Data
management
In the ideal model the intermediate component contains all the application
logic
In many real cases the logic is also spread across client and server
This way we can distinguish between presentation and application
logic
Different distributed data sources
Different client types
Another example
Client: Order Entry, Credit Check, Customer Service, Scheduling
Shared Application Services: Customer Information, Distribution Service, Inventory Manager
Data Sources: Customers, Distribution, Inventory, Credit, Service
From two-tier to n-tier
[Figure: from 2-tier (client and server) to n-tier (client, several intermediate components, server), with presentation, application logic, and data management distributed over the tiers]
Computational models
[Figure: taxonomy of computational models, with the labels host/terminal model, models based on file transfer, models based on distributed computation, client/server, peer-to-peer, object-oriented, and models based on events]
What can we do?
Basically, we can
Measure
We “understand” the quality of our applications by
measuring it
Several different measures
Analyze
We “understand” the quality by studying the artifacts we
have
Interesting, but complex
Test
We can “understand” the quality by trying with some
special executions
No magic solutions!
Design for testability
Testability
[Figure: exercising direct control over the source code to check an interesting property P is usually not feasible; instead we define an analysis model, control a model property P′ algorithmically, and relate P′ back to P by implication]
Important factors
Systems can be tested more easily if they:
Work decently (operability)
Can be controlled (controllability)
Can be observed (observability)
Do not add useless complexity (simplicity)
Are documented consistently and completely
(understandability)
Are well-known (suitability)
Are stable (stability)
Five dimensions
Requirements
We must consider the way we represent them
Design
We should anticipate as many constraints as we can
Implementation
Test oracles should be “added” during the implementation
Test
We should identify test cases as soon as we can
Documentation
Better testability if we have good documentation
Some very preliminary comments
Design projects as simple as possible
No added (useless) complexity
Use/add contracts (assertions) to our components
Maximize the visibility of all products
Consider comments and documentation
Write “standard” code
Prefer formal (or semi-formal) notations over informal
models
Is this true with the Web?
noweb
Test and analysis
Why test and analysis
Software is never correct
Regardless of the domain
Regardless of the techniques we use
Any software must be verified/validated
Test and analysis are
Important to control and assess the quality of products
But they have an impact on the process
Usually expensive
Difficult, but interesting
Good compromises
IEEE terminology
Error
What causes the problem (deviation) between the product and
the ideal program
Errors and faults are not consistent
For example, typos, cut&paste, wrong requirements
Fault
Program elements that do not correspond to the expectations
Faults do not respect locality and are not consistent with failures
For example, the program has a multiply operator instead of a
sum operator
Failure
Behavior not consistent with system specification
For example, 4 + 3 = 12
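The slide's own example can be turned into a small sketch that separates the fault (a wrong operator in the code) from the failure (an observed result that contradicts the specification). The function names are illustrative. The same fault produces no failure for some inputs, which is why faults are not consistent with failures.

```python
def add_specified(a, b):
    """Reference behavior: the specification asks for a sum."""
    return a + b


def add_faulty(a, b):
    """Contains the fault: a multiply operator instead of a sum operator."""
    return a * b


def is_failure(observed, expected):
    """A failure is observed behavior that contradicts the specification."""
    return observed != expected


# The fault shows up as a failure for the inputs 4 and 3 (result 12, not 7) ...
print(add_faulty(4, 3), is_failure(add_faulty(4, 3), add_specified(4, 3)))  # 12 True
# ... but stays hidden for the inputs 2 and 2 (result 4 either way).
print(add_faulty(2, 2), is_failure(add_faulty(2, 2), add_specified(2, 2)))  # 4 False
```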
Properties
Process oriented (internal) properties
Reusability
Maintainability
Modularity
External properties that can be verified
Interoperability
Timeliness
External properties that cannot be verified
User-friendliness
Usability
Dependability properties
Correctness
Robustness
Safety
Reliability
Dependability
Four properties: Correct, Reliable, Safe, Robust
Reliable, but not correct: we can seldom have failures
Safe, but not correct: we can have “light” failures
Robust, but not safe: we can have catastrophic errors
Correct, but not safe or robust: the specification is not enough
Validation and verification
Validation (requirements vs. system): building the right system
Includes usability test, user feedback
Verification (formal description vs. system): building the system right
Includes test, inspection, static analysis
Validation vs. Verification
If we say that the page must display quickly, we cannot verify
the property, but we can validate it
If we say that the page must display in 30 seconds, we can
verify it.
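Once the requirement carries a number, it can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `fake_page_load` action that stands in for whatever actually renders the page; only the 30-second limit comes from the slide.

```python
import time

MAX_DISPLAY_SECONDS = 30.0  # the limit stated in the requirement


def measure_seconds(action):
    """Run the action once and return the elapsed wall-clock time."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def meets_display_limit(action, limit=MAX_DISPLAY_SECONDS):
    """Verification: compare a measurement against the stated limit."""
    return measure_seconds(action) <= limit


def fake_page_load():
    """Hypothetical stand-in for loading and displaying a page."""
    time.sleep(0.01)


print(meets_display_limit(fake_page_load))
```

"Quickly" offers no such threshold, so it can only be validated with users.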
The real problem
[Figure: a decision procedure takes a property and a program as input and answers yes/no]
Correctness properties are undecidable
What do they offer?
Sample the
input space
Optimistic approximation
(testing)
Pessimistic approximation
(analysis, proof)
Perfect verification
Simplified properties
We must settle for some kind of inaccuracy
to be able to deal with the problem
Impact of the software type
The software type and its characteristics impact test and
analysis activities in different ways
Different emphasis on the same property
Timeliness
“Correctness”
Different properties
Usability
User-friendliness
New techniques
Presentation
Navigation
Principles
The basic principles are
Sensitivity: it is better to fail every time than sometimes
Redundancy: we should make our intentions explicit
Partitioning: divide et impera
Restriction: we should try to reduce the scope
Feedback: we should use the experience to improve the
process
Sensitivity
Better to fail every time than sometimes
Consistency helps:
A test selection criterion is better if any selected test gives
the same results, that is, if the program fails with a given
test, it should fail with all tests selected with that
criterion
For example, deadlock analysis at run-time is better if it
does not depend on the machine, that is, if the program
fails on a machine within a given execution, then it should
fail on all machines within that execution
Redundancy
Make decisions explicit
Redundant control can increase the capability of capturing
errors in advance or more efficiently
Static type checking is redundant with respect to testing, but
it can solve many problems in advance
The validation of requirements is redundant with respect
to the validation of the final product, but it can solve
several problems in advance and more efficiently
Test and model checking are redundant, but they are used
together in some cases to increase the confidence in the
right behavior of the product
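Assertions are a small-scale instance of the same idea: they restate, inside the code, intentions that the computation already implies. The sketch below uses a hypothetical `apply_discount` function working on integer cents; the function and its contract are illustrative only.

```python
def apply_discount(price_cents, percent):
    """Return the discounted price in cents, with the contract made explicit."""
    # Preconditions: redundant with what callers are supposed to pass.
    assert price_cents >= 0, "price must be non-negative"
    assert 0 <= percent <= 100, "percent must be between 0 and 100"

    discounted = price_cents * (100 - percent) // 100

    # Postcondition: redundant with the formula, but it catches future edits.
    assert 0 <= discounted <= price_cents
    return discounted


print(apply_discount(20000, 25))  # 15000
```

A call such as `apply_discount(100, 150)` is rejected at the precondition instead of silently returning a negative price. Note that Python skips `assert` statements when run with the `-O` option.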
Partition
Divide et impera
Difficult problems can be treated by partitioning the input
space
Criteria for both functional and structural test selection
identify meaningful partitions of the input space
Verification techniques partition the input space by
grouping data that are homogeneous with respect to the
properties that we want to prove
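A minimal sketch of partition-based test selection, assuming a hypothetical form field that holds an age: the input space is split into classes expected to behave alike, and one representative per class is picked as a test input. The class boundaries are invented for illustration.

```python
def partition_of(age):
    """Assign an input to its class (boundaries are an assumption)."""
    if age < 0:
        return "invalid"
    if age < 18:
        return "minor"
    return "adult"


def pick_representatives(candidates):
    """Keep the first candidate seen for each class of the partition."""
    representatives = {}
    for value in candidates:
        representatives.setdefault(partition_of(value), value)
    return representatives


reps = pick_representatives([-5, -1, 0, 17, 18, 40])
print(reps)  # {'invalid': -5, 'minor': 0, 'adult': 18}
```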
Restriction
Simplify the problem
Clever restrictions can make problems that are difficult (and
undecidable) simple and tractable
In some cases a weaker property can be easier to verify
For example, we cannot demonstrate that pointers are
used in the right way, but if we use Java we can enforce
it easily
In other cases, a stronger property can be easier to verify
For example, in general we cannot demonstrate that
we do not have type errors with languages that have a
dynamic type system, but we can demonstrate it if the
language is statically typed.
Feedback
Fine tune the development process
Learn from the experience:
Checklists are built on errors discovered in the past
The way errors are classified can help define meaningful
criteria to select test cases
The mechanisms to revise the process are based on the
fact that we must improve the process to improve the
product
Test and analysis in a development process
Activities related to quality control and types of development
processes
Degrees of freedom and compromises
How to balance budget, risks, and quality
Error analysis and feedback
Impact of the development process on test activities
Responsibilities of a test group
Course outline
Usability
Accessibility
Functional correctness
Testing and analysis techniques
Robustness and scalability
Performance
Security
Test process
Tools
Conclusions
Usability
Usability
It is the measure of how well a software system satisfies the needs
of its users
Ease of use
Efficacy and efficiency
Ease of storing produced artifacts
Low number of errors and ease of recovery
Satisfaction while using the product
Jakob Nielsen
Some design principles
Design of pages based on the device
Correct visualization with different browsers
Correct visualization regardless of the screen
characteristics
Light-to-load pages
Ease of access to supplied contents
Structure of contents to facilitate their reading
Coherence in the page style
Visibility of links and their meaning
Consistency of the presentation style with respect to users
Ease of navigation
Page design and device
Correct visualizations and browsers
The semantics of tags can change in different browsers
The design can
Use the minimum common set, trying not to use the latest version
of the language
Use the latest version
To stimulate users to update
Many users may not be able to access some contents
We should test our applications with all well-known browsers
If we knew the users, we could better understand what we need
Specific classes of users
Skilled users
We can start developing the application with old versions of the
languages and
Add the extensions later
All main contents must be published in a format that
all users can access
Screen-independent design
800x600 is the standard resolution
Many displays are already set to this resolution
If we used higher resolutions, the visualization would need
horizontal scroll bars
Vertical scroll bars are acceptable, but we should avoid
horizontal ones
We could use relative dimensions (percentages)
The visualization adapts to the current resolution and the
current screen size
But we must be sure that all visualizations are meaningful
Design light pages
Some studies have revealed that
0.1 second is the limit to give the user the perception that
the system is reacting
We do not need any message, but we need to start
displaying the result
1 second is the limit within which any human thought is
not interrupted
The user perceives the delay, but it is still acceptable
10 seconds is the limit to keep the user attention on the
dialogue
If delays are longer, the user changes his focus
If we deliver pages in more than 10 seconds, this means
that we lose the client
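The three limits can be applied directly to measured response times. This is a minimal sketch; the thresholds come from the slide, while the function name and the wording of the returned labels are illustrative.

```python
def perceived_response(seconds):
    """Classify a response time against the 0.1 s, 1 s, and 10 s limits."""
    if seconds <= 0.1:
        return "system feels reactive"
    if seconds <= 1.0:
        return "delay perceived, thought not interrupted"
    if seconds <= 10.0:
        return "attention still on the dialogue"
    return "user changes focus"


for measured in (0.05, 0.5, 5, 12):
    print(measured, "->", perceived_response(measured))
```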
Possible problems and solutions
The speed can be influenced by the number and size of
images
Light and few images
Multimedia effects used only if they are really helpful
They improve the way information is perceived
No page should take more than 20 seconds to load
We should check page loading in the different situations
Maybe even with different cache settings
First screen at a glance
We should reduce the time needed to load the first page
This page (the first part) should provide as much
information as possible (more text than images)
Neat and clear information
We should use some 50% less text than “standard” newspapers
When we read on a screen, we are 25% slower
Users do not like scrolling windows
Contents should be well-structured: titles, paragraphs, and
itemized lists are useful to locate contents
Rule of the pyramid (taken from journalism)
Each page should start with a short conclusion (summary)
and then present all details
Hypertextual structure
We should use hypertext to split long text into pages
The hypertext should not be used to break linear
information
Each piece of information should focus on a specific and
well-defined topic
We should be able to identify the entities that are more
relevant and then identify their subcomponents
We should then identify the relationships that exist
between entities and between subcomponents within the
same entity
Homogeneous visualization style
We should choose coherence and uniformity
We should plan the themes, information structures, and
navigation paths that contribute to the uniformity of the site
We should define a visual schema that should be adopted in
all pages
Same fonts, colors, rendering of contents to communicate
the adopted schema and facilitate the comprehension to
users
We should avoid changing the style
Linear transitions and uniformity give the user the idea of
being in the same site (known borders)
Better ease of use and learning since users can find well-known models
An example (I)
An example (II)
Linear transitions
Same background
Same fonts
Same grid to partition the page
Same positioning of the different page elements
Same use of empty spaces
Example: National Gallery of Art, Washington (www.nga.gov)
Facilitate access to contents
Ease the interaction
Identification of links and their meaning
Link types
Structural links: They define the structure of the
information space and allow the user to move within it
Association links: Textual links that let the user deepen their
knowledge of the text used as anchor
Lists of alternative pointers: They can help users find what
they need
Starting rhetoric
The user must be able to identify the added value that he
will find in the target page
Ending rhetoric
The target page must clarify the ending context and give
value to the source page
Describing textual links
We should not use long anchors
They must only be pointers
Too many words do not allow us to identify the meaning of
the link (max 4 words)
We should use simple words that clearly identify the target
Examples
– To know about my job click here
– Here is my job
We can add text to supply further information
It can be used to differentiate similar links
Suitable presentation styles
The visual style must give the right impression that we want
to communicate
We should choose the right style
For example
http://www.nasa.gov
http://kids.msfc.nasa.gov
Ease of navigation
Where am I? Where was I? Where can I go?
We should always define the page title
Breadcrumb links identify the path followed through the site to
reach a given page
A link to the home page (or to the starting pages of the
different sections) from each page helps the user
understand where he is
We should be careful with the links defined so far
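A breadcrumb trail can be derived from the page's position in the site. The sketch below assumes, purely for illustration, that the URL path mirrors the site hierarchy; the `breadcrumbs` function and the sample path are hypothetical.

```python
def breadcrumbs(path, home_label="Home"):
    """Return (label, url) pairs from the home page down to the given page."""
    parts = [part for part in path.strip("/").split("/") if part]
    trail = [(home_label, "/")]
    url = ""
    for part in parts:
        url += "/" + part
        trail.append((part, url))
    return trail


for label, url in breadcrumbs("/courses/web-quality/usability"):
    print(label, url)
```

Each entry answers "Where was I?" for one level, and the last entry answers "Where am I?".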
Site organization
Each application must have a home page
It provides a summary and links to some important pages
HP
A Web site is the set of pages that are linked to the home
page through links
They should supply a homogeneous set of information
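Read literally, this definition makes the site the set of pages reachable from the home page. A minimal sketch of that reading follows, using a breadth-first traversal over a hypothetical link table (the page names are invented).

```python
from collections import deque


def site_pages(links, home):
    """Return every page reachable from the home page by following links."""
    seen = {home}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical link structure: page -> pages it links to.
LINKS = {
    "home": ["about", "courses"],
    "courses": ["usability"],
    "usability": ["home"],
    "orphan": ["home"],  # links to the site, but nothing links to it
}

print(sorted(site_pages(LINKS, "home")))  # ['about', 'courses', 'home', 'usability']
```

Pages such as `orphan` exist on the server but, under this definition, are not part of the site.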
Linear structure
It guides the user along a path
Useful when the presentation requires the user to follow a
predefined path
examples: lessons, book chapters, collections, etc.
HP
Argument
SubArgument1
Argument
…
Argument
SubArgument2
Users navigate forward and backward by following a predefined
path
Each page has a link to the initial page
Network structure
Typical in many simple
applications (personal site)
Links to and from any page in
the application
Home Page
HP
We need a navigation bar to
guide the user
Navigation bar: Link1, Link2, Link3, ..., LinkN
Hierarchical structure
[Figure: tree with the Home Page at the root, Area pages below it, optional Sub area pages, and Contents pages as leaves]
Areas identify and organize contents
In each page we can have links to the home page and to the other pages of
the area
Usability test
A simple example
Information density is
inversely proportional to
the search time and
the capability of
finding what we were
searching for
A useful page should
not contain an
excessive number of
non-homogeneous data
We have many pages,
even professional ones,
that are not useful
according to this
criterion
Usability test
Test with sample users to study their interaction with the
system
Human participation is essential
Testing the HTML code to verify that it complies with the W3C
guidelines
It can easily be automated
They are complementary techniques
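To illustrate how easily the second kind of test can be automated, the sketch below uses Python's standard `html.parser` module to flag images that lack a text alternative. This single check is only an example chosen here; it is not a complete W3C compliance test, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src", "?"))


def images_without_alt(html):
    checker = MissingAltChecker()
    checker.feed(html)
    checker.close()
    return checker.missing


page = '<p><img src="logo.png" alt="Logo"><img src="chart.png"></p>'
print(images_without_alt(page))  # ['chart.png']
```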
Usability test
The tests with sample users apply techniques to collect qualitative
and quantitative data while they interact with the product to stress
particular features
We have two main approaches with sample users
The use of formally defined tests to validate or invalidate
particular hypotheses (rarely used)
The use of iterative tests to gradually expose usability problems
The goals are
Identification and correction of usability problems before
releasing the application
Facilitate the creation of products that
Are easy to learn and use
Satisfy user needs
Supply useful functionality to the set of target users
Create a repository of tests to be used for future releases
Non technical goals
Minimize the costs to support product maintenance
Increase the number of sold products
Acquire a competitive advantage with respect to competitors
Usability is a key feature for many products
Minimize risks before the release
Types of usability tests
Analysis of user
requirements
Requirements
specification
Preliminary
design
Comparison tests
Explorative
tests
Detailed design
Evaluation tests
Implementation
Validation tests
Explorative tests (I)
When
At the beginning of the development process
When we know the user profile and how the system will be
used, but we are working on the functional specifications
Before working on the detailed design
Goals
Understand the validity of the preliminary design
Verify the conceptual idea that the user has of the product
(high abstraction level)
Verify the hypothesis on the user
Explorative tests (II)
Methodology
We have representative users evaluate product
prototypes
At the beginning, we can also use static interfaces, maybe
even on whiteboards (story boards)
We must highly interact with participants to
Verify the efficacy of the concepts on which the
preliminary design is based
Help fill the gaps left by not-yet-implemented
functionality
Evaluation tests (I)
It is the most widely used usability test
When
At the beginning or during the development cycle
After the definition of the preliminary design
Goals
We want to extend the results of explorative tests to
evaluate the usability of low-level operations
While explorative tests work on the skeleton of the
product, these tests consider all characteristics
We do not want to evaluate how intuitive the product is,
but how a user can complete realistic tasks and maybe
identify possible problems
Evaluation tests (II)
Methodology:
The user always works on tasks, instead of surfing around
and making comments
We do not want to understand mental processes, but we
consider the actual behavior
We collect quantitative measures
Validation tests (I)
Also called verification tests
When
Late in the development cycle
It is used to certify the usability of the product
Before releasing the product
Goals
We want to compare the product against predefined
standards or standards used by competitors
We want our product to comply with standards before
releasing it
If it is not compliant, we want to understand why
These standards are defined at the beginning of the
development cycle along with the usability goals
Validation tests (II)
Usability goals
They are defined by thinking of
Usability tests on previous versions
Market analysis
Interviews with users
They are characterized by means of
Criteria to measure performance (speed, user accuracy
while working on a task)
Criteria for user preferences
Validation test (III)
Methodology
Similar to evaluation test
Before testing, we need to identify the reference
standards
We must also define the tolerance we want to use to
accept results (e.g., % of failures)
We propose specific tasks to participants
They do not interact with the test monitor
We collect quantitative data
Comparison tests (I)
When
It is not associated with any specific step in the
development process
In the first phases it can be used to compare different
alternatives through exploratory tests
It can be used to assess the efficacy of a single component
It can be used to compare the product with what
competitors have developed
Goals
It can be associated with any of the other testing methods
It can be used to understand pros and cons of different
projects
It can be used to understand the effectiveness of different
designs
Comparison tests (II)
Methodology
We propose a side-by-side comparison between two or more
alternatives
For each alternative, we collect information and
observations on performance and preferences
The format we use depends on what we want to get
In many cases we discover that some alternatives can be
the winning ones
The best results come when we compare projects that are
radically different (instead of just similar)
The phases of usability test
Definition of test plan
Selection and recruiting of participants
Preparation of test material
Executing the test
Debriefing
Conclusions and recommendations
Definition of test plan
Objectives: We want to describe the reasons why we want to
test the application
For example: We have user feedback on some particular
problems with the application
Test objectives: We must describe the test questions in a clear
and neat way
For example: is the on-line help easier through hot-keys or
the mouse?
Is the on-line help enough and self-contained?
Is the navigation good enough to allow the user to always
know where he is?
Is the product usable?
This is wrong because it is too vague
Test plan: how to design tests
We must define the test procedure by identifying the steps
that testers should follow and the “tools” they should use
We must decide the distribution of participants with respect
to
Experience, background, age, sex, ..
Execution order of tasks
They need a reference guide for all participants (test monitor
and external observers)
Different test monitors can manage the same test in
equivalent ways
Test plan: final report
It must identify all elements that are significant to evaluate
the application and that must be collected
For example
For each task or group of tasks
Assigned time frame
Percentage of participants that have completed their
task successfully
Percentage of participants that have made some
mistakes
Percentage of participants that have not been able to
complete their task
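These figures are straightforward to compute once each participant's outcome has been recorded. The sketch below assumes a hypothetical record format (a `completed` flag and a `mistakes` count per participant); the sample data are invented and are not results from any real test.

```python
def task_report(results):
    """Summarize one task as percentages of participants."""
    total = len(results)
    completed = sum(1 for r in results if r["completed"])
    with_mistakes = sum(1 for r in results if r["mistakes"] > 0)
    return {
        "completed_pct": 100 * completed / total,
        "with_mistakes_pct": 100 * with_mistakes / total,
        "not_completed_pct": 100 * (total - completed) / total,
    }


# Invented sample: four participants working on the same task.
sample = [
    {"completed": True, "mistakes": 0},
    {"completed": True, "mistakes": 2},
    {"completed": True, "mistakes": 0},
    {"completed": False, "mistakes": 3},
]

print(task_report(sample))
```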
Selection and recruiting of participants
Identification of classes of possible users
For example: professors, expert students, novice students
Definition of the optimum sample (10-12 participants)
Some participants from each class according to the
frequencies in the real world
For example
3 professors, 1 professor of computer science
5 students who are used to navigating the Internet
2 novice students
Definition of the minimal sample (4-5 participants)
Identify the minimum number for each class
1 professor, better if not expert
2 expert students
1 novice student
Preparation of test material
Screening questionnaire to select participants
Questionnaire to study their background
Tools to collect data
What data
Performance: measures on the behavior of the
application (e.g., number of errors, mean time to
answer)
Preferences: opinions on the product (e.g., preferences
between two possible versions)
How to classify them
Declarations to comply with privacy rules
Pre-test
Post-test
Test execution
Introduction
Presentation of the system and test
Filling of a questionnaire to know the participants
Identification of goals
Let the participants know what we want with the test
We want to evaluate the system not the participant
We should be careful with cameras and microphones
Execution
Each participant must have a list of tasks that must be
completed in a given time frame
The test monitor oversees the process, but does not
influence it
Debriefing
We interview participants on what they did during the test
It is useful to understand the reasons for the discovered problems
and how to solve them
We should not judge participants or try to score them
We must encourage participants to share their comments
Conclusions and recommendations
We collect and summarize data (mean values and accuracy)
We analyze data to:
Identify those tests that did not match quality criteria
Identify the origin/nature of the problems
Discover new problems
Classify priorities: Criticality = Severity + Probability
Severity (to be defined with the development team):
– Unusable
– Hard
– Mild
– Not important
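The ranking rule Criticality = Severity + Probability can be sketched as follows. The numeric scales are assumptions made only for illustration: the slide gives the four severity labels and says they must be agreed with the development team, so the 1-4 values, the `Problem` class, and the sample problems are all hypothetical.

```python
from dataclasses import dataclass

# Assumed numeric scale for the four severity labels on the slide.
SEVERITY = {"unusable": 4, "hard": 3, "mild": 2, "not important": 1}


@dataclass
class Problem:
    name: str
    severity: str     # one of the SEVERITY keys
    probability: int  # assumed scale: 1 (rare) .. 4 (almost always)

    @property
    def criticality(self) -> int:
        # Criticality = Severity + Probability
        return SEVERITY[self.severity] + self.probability


def rank(problems):
    """Return the problems sorted from most to least critical."""
    return sorted(problems, key=lambda p: p.criticality, reverse=True)


problems = [
    Problem("search box hard to find", "mild", 3),
    Problem("checkout form loses data", "unusable", 2),
    Problem("logo is blurry", "not important", 1),
]
for p in rank(problems):
    print(p.criticality, p.name)
```

A sum keeps a frequent but mild problem comparable with a rare but blocking one, which is the point of classifying priorities instead of sorting by severity alone.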
Analysis of results
After executing tests, the test monitor analyzes results based
on questionnaires and discussions with participants
He produces a report to identify problems and limits of the
system (as to usability)
He can define new requirements
Whoever is responsible for the system can decide that the system
must be reworked
The test monitor decides whether tests must be re-executed, and which ones
Accessibility
W3C WAI
The WAI (Web Accessibility Initiative) defines 14 guidelines that
must be followed to make a Web application accessible
Each guideline is split into checkpoints, and each checkpoint has a
priority level based on the impact it has on accessibility
According to the priority, we have 3 conformance levels (A, AA,
AAA)
A: the application complies with all MUST
(Priority 1) checkpoints
AA: the application complies with all
MUST and SHOULD (Priority 1 and 2) checkpoints
AAA: the application complies with all
MUST, SHOULD, and MAY (Priority 1, 2, and 3) checkpoints
Priorities
[Priority 1]
A Web content developer must satisfy this checkpoint.
Otherwise, one or more groups will find it impossible to access
information in the document. Satisfying this checkpoint is a basic
requirement for some groups to be able to use Web documents.
[Priority 2]
A Web content developer should satisfy this checkpoint.
Otherwise, one or more groups will find it difficult to access
information in the document. Satisfying this checkpoint will
remove significant barriers to accessing Web documents.
[Priority 3]
A Web content developer may address this checkpoint.
Otherwise, one or more groups will find it somewhat difficult to
access information in the document. Satisfying this checkpoint
will improve access to Web documents.
Example
In General (Priority 1)
1.1 Provide a text equivalent for every non-text element (e.g., via "alt", "longdesc", or
in element content). This includes: images, graphical representations of text
(including symbols), image map regions, animations (e.g., animated GIFs), applets
and programmatic objects, ascii art, frames, scripts, images used as list bullets,
spacers, graphical buttons, sounds (played with or without user interaction), standalone audio files, audio tracks of video, and video.
2.1 Ensure that all information conveyed with color is also available without color, for
example from context or markup.
4.1 Clearly identify changes in the natural language of a document's text and any text
equivalents (e.g., captions).
6.1 Organize documents so they may be read without style sheets. For example, when
an HTML document is rendered without associated style sheets, it must still be
possible to read the document.
6.2 Ensure that equivalents for dynamic content are updated when the dynamic content
changes.
7.1 Until user agents allow users to control flickering, avoid causing the screen to
flicker.
14.1 Use the clearest and simplest language appropriate for a site's content.
And if you use images and image maps (Priority 1)
1.2 Provide redundant text links for each active region of a server-side image map.
9.1 Provide client-side image maps instead of server-side image maps except where the
regions cannot be defined with an available geometric shape.
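Part of checkpoint 1.1 can be checked automatically. The sketch below, which uses only Python's standard `html.parser`, lists the images that have no `alt` attribute at all. It is a partial check under stated assumptions: it does not judge whether an existing text equivalent is meaningful, and the class name, function name, and sample page are made up for the example.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect <img> tags that have no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src values of the offending images

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            if "alt" not in attributes:
                self.missing.append(attributes.get("src", "<no src>"))


def images_without_alt(html: str):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


page = """
<p><img src="logo.gif" alt="Politecnico di Milano logo">
<img src="spacer.gif">
<img src="chart.png" alt=""></p>
"""
print(images_without_alt(page))
```

An empty `alt=""` is accepted here because the attribute is present; whether an empty text is appropriate for a given image still needs a human reviewer.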
Functional correctness
Traditional testing
Granularity levels
Acceptance testing: the software behavior is compared with
end user requirements
System testing: the software behavior is compared with the
requirements specifications
Integration testing: checking the behavior of module
cooperation.
Unit testing: checking the behavior of single modules
Regression testing: to check the behavior of new releases
The test case generation problem
How to generate test data
Partition testing: divide the program input space into (quasi-)equivalence
classes
random
functional (black box)
based on specifications
structural (white box)
based on code
fault based
based on classes of faults
White vs black box
Black box
it depends on the specification notation
it scales up
(different techniques at different granularity levels)
it cannot reveal errors that depend on how the code is written
(the same specification can be implemented with different modules)
White box
it is based on control or data flow coverage
it does not scale up (mostly applicable at unit and
integration testing level)
it cannot reveal missing path errors
(parts of the specification that are not implemented)
Specification-based Testing
From formal specifications
can be automated
EXAMPLES: Test case generation from
Algebraic specifications
Finite state automata (UML statecharts)
Grammars
From semi-formal specifications
partitions can be easily identified
can be partially automated
Test-case Generation from Informal
Specifications (Natural Language)
cannot be automated
some structure (e.g., organization standards) can help
guidelines to increase confidence level and reduce
arbitrary choices:
at least one test case for each:
subsets of “valid” homogeneous data
“non valid” (combination of) data
boundary data
specific data (treated independently, error prone,...)
Fault-based Testing
Identify a set of program locations
(related to specific faults)
generate alternate programs by seeding faults in the original
program in the identified locations
generate test cases to estimate adequacy in detecting real
faults from adequacy in detecting seeded faults
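A minimal sketch of the idea, assuming a toy program and three hand-written alternate programs (all names are hypothetical): each alternate program seeds one fault, and a test set is scored by the fraction of seeded faults it detects.

```python
def absolute(x):
    """Original program under test."""
    return -x if x < 0 else x


# Alternate programs: each one seeds a single fault in the original.
def mutant_wrong_comparison(x):
    return -x if x > 0 else x      # '<' replaced by '>'


def mutant_missing_negation(x):
    return x if x < 0 else x       # unary minus dropped


def mutant_boundary(x):
    return -x if x <= 0 else x     # '<' replaced by '<='


MUTANTS = [mutant_wrong_comparison, mutant_missing_negation, mutant_boundary]


def killed(mutant, tests):
    """A seeded fault is detected if some input makes the outputs differ."""
    return any(mutant(x) != absolute(x) for x in tests)


def seeded_fault_score(tests):
    """Fraction of seeded faults that the test set detects."""
    return sum(killed(m, tests) for m in MUTANTS) / len(MUTANTS)


print(seeded_fault_score([5]))       # only a positive input
print(seeded_fault_score([5, -3]))   # positive and negative inputs
```

The third alternate program behaves exactly like the original for every integer (negating 0 gives 0), so no test can detect it. Seeded faults of this kind are one reason why the estimate of adequacy is only approximate.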
Partition Testing
Basic idea: Divide program input space into (quasi-)
equivalence classes
Underlying idea of specification-based, structural, and
fault-based testing
The Category-Partition Method
STEP 1: Analyze the specification:
Identify individual functional units that can be tested
separately. For each unit identify:
parameters and characteristics
environment and characteristics
classify units into categories
STEP 2: Partition the categories into choices
STEP 3: Determine constraints among the choices
STEP 4: Write tests and documentation
The Category-Partition Method:
an example
......... *
* From Ostrand, Balcer, The Category-Partition Method for Specifying and Generating Functional Tests
Command:
find
Syntax:
find <pattern> <file>
Function:
The find command is used to locate one or more instances of a given pattern in a file. All
lines in the file that contain the pattern are written to standard output. A line containing
the pattern is written only once, regardless of the number of times the pattern occurs in
it.
The pattern is any sequence of characters whose length does not exceed the maximum
length of a line in the file. To include a blank in the pattern, the entire pattern must be
enclosed in quotes (“). To include a quotation mark in the pattern, two quotes in a row
(““) must be used.
...........
Step A - analyze the specification:
identify categories
find is an individual function that can be tested separately
parameters: pattern, file
characteristics (pattern)
explicit (immediately derivable from specs):
pattern length
pattern enclosed in quotes
pattern contains blanks
pattern contains enclosed quotes
implicit (“hidden” in specs):
quoted patterns with/without blanks
several successive quotes included in the pattern
........
Step B - partition categories
Parameters:
Pattern size:
empty
single character
many characters
longer than any line in the file
Quoting:
pattern is quoted
pattern is not quoted
pattern is improperly quoted
Embedded blanks:
none
one
several
Parameters (cont.....)
Embedded quotes:
none
one
several
File name:
....
Environment:
Number of occurrences of pattern
in a file:
none
one
several
Pattern occurrences on target line:
....
Step C: Determine Constraints
Parameters:
Pattern size:
empty [property Empty]
single character [property NonEmpty]
many characters [property NonEmpty]
longer than any line in the file [single]
Quoting:
pattern is quoted [property Quoted]
pattern is not quoted [if NonEmpty]
pattern is improperly quoted [error]
........
Environment:
Number of occurrences of pattern in a file:
none [if NonEmpty] [single]
one [if NonEmpty] [property Match]
.....
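How the constraints prune the combinations can be sketched for the two fully listed categories, pattern size and quoting. The annotations are taken from the Step C slide, read in the order in which they are listed. The tuple layout and the function name are assumptions of this sketch, not part of the method's notation.

```python
from itertools import product

# Each choice: (name, properties it sets, property it requires, marker).
CATEGORIES = {
    "pattern size": [
        ("empty", {"Empty"}, None, None),
        ("single character", {"NonEmpty"}, None, None),
        ("many characters", {"NonEmpty"}, None, None),
        ("longer than any line in the file", set(), None, "single"),
    ],
    "quoting": [
        ("pattern is quoted", {"Quoted"}, None, None),
        ("pattern is not quoted", set(), "NonEmpty", None),
        ("pattern is improperly quoted", set(), None, "error"),
    ],
}


def build_frames(categories):
    """Combine unmarked choices, dropping combinations that violate an [if]."""
    normal = [[c for c in choices if c[3] is None] for choices in categories.values()]
    combined = []
    for combo in product(*normal):
        props = set().union(*(c[1] for c in combo))
        if all(c[2] is None or c[2] in props for c in combo):
            combined.append(tuple(c[0] for c in combo))
    # Each [single] or [error] choice is tested once, not in every combination.
    special = [c[0] for choices in categories.values() for c in choices if c[3]]
    return combined, special


combined, special = build_frames(CATEGORIES)
for frame in combined:
    print(frame)
print("tested once:", special)
```

Without constraints the two categories would give 4 x 3 = 12 combinations. The `[if NonEmpty]` constraint removes the pair (empty, not quoted), and the `[single]` and `[error]` markers take two choices out of the product, which is how the method keeps the number of test frames manageable.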
Some Considerations on
the Category Partition Method
a practical implementation of general principles:
partition testing
boundary testing
erroneous conditions
other approaches with similar goals, but different procedures:
condition tables
cause effect graphs
equivalence partitioning
Structural Coverage Testing
(In)adequacy criteria
If significant parts of program structure are not tested,
testing is surely inadequate
Control flow coverage criteria
Statement (node, basic block) coverage
Branch (edge) coverage
Condition coverage
Path coverage
Data flow (syntactic dependency) coverage
Attempted compromise between the impossible and the
inadequate
Statement Coverage
int select(int A[], int N, int X)
{
    int i=0;
    while (i<N && A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: control flow graph of select. Nodes: i=0; i<N && A[i] <X (True/False); A[i]<0 (True/False); A[i] = - A[i]; i++; return(1)]
One test datum (N=1, A[0]=-7, X=9) is enough to guarantee
statement coverage of function select
Faults in handling positive values of A[i] would not be revealed
Branch Coverage
int select(int A[], int N, int X)
{
    int i=0;
    while (i<N && A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: control flow graph of select. Nodes: i=0; i<N && A[i] <X (True/False); A[i]<0 (True/False); A[i] = - A[i]; i++; return(1)]
We must add a test datum (N=1, A[0]=7, X=9) to cover branch
False of the if statement. Faults in handling positive values of
A[i] would be revealed. Faults in exiting the loop with condition
A[i] <X would not be revealed
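The difference between statement and branch coverage can be checked by instrumenting the function. The sketch below is a Python port of select written for this purpose (the `seen` parameter and the helper names are assumptions); it records which branch outcomes each test datum exercises.

```python
def select(a, n, x, seen):
    """Python port of select(); records every branch outcome it takes in `seen`."""
    i = 0
    while True:
        loop = i < n and a[i] < x
        seen.add(("while", loop))
        if not loop:
            break
        negative = a[i] < 0
        seen.add(("if", negative))
        if negative:
            a[i] = -a[i]
        i += 1
    return 1


ALL_BRANCHES = {("while", True), ("while", False), ("if", True), ("if", False)}


def uncovered(test_data):
    """Branches of select() that no test datum exercises."""
    seen = set()
    for a, n, x in test_data:
        select(list(a), n, x, seen)
    return ALL_BRANCHES - seen


print(uncovered([([-7], 1, 9)]))
print(uncovered([([-7], 1, 9), ([7], 1, 9)]))
```

With the first datum alone, every statement runs but the False outcome of the if statement is never taken. Adding (N=1, A[0]=7, X=9) leaves no branch uncovered.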
Condition Coverage
int select(int A[], int N, int X)
{
    int i=0;
    while (i<N && A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: control flow graph of select. Nodes: i=0; i<N && A[i] <X (True/False); A[i]<0 (True/False); A[i] = - A[i]; i++; return(1)]
Both conditions (i<N), (A[i]<X) must be false and true for
different tests. In this case, we must add tests that cause the
while loop to exit for a value greater than X. Faults that arise
after several iterations of the loop would not be revealed.
Path Coverage
int select(int A[], int N, int X)
{
    int i=0;
    while (i<N && A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: control flow graph of select. Nodes: i=0; i<N && A[i] <X (True/False); A[i]<0 (True/False); A[i] = - A[i]; i++; return(1)]
The loop must be iterated a given number of times.
PROBLEM: uncontrolled growth of test sets. We need to
select a significant subset of test cases.
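The growth can be made concrete by enumerating the paths of select. In this sketch a path is identified only by the outcome of the if statement in each loop iteration; that simplification, and the function name, are assumptions of the example.

```python
from itertools import product


def paths(max_iterations):
    """Enumerate paths of select() with at most max_iterations loop iterations.

    Each iteration takes either the True or the False branch of the if statement.
    """
    for iterations in range(max_iterations + 1):
        yield from product(("if-true", "if-false"), repeat=iterations)


for k in (1, 2, 5, 10):
    print(k, "iterations at most:", sum(1 for _ in paths(k)), "paths")
```

The count doubles with each extra iteration allowed (it is 2^(k+1) - 1 for at most k iterations), so exhaustive path coverage is out of reach even for this small function.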
Data Flow Coverage
int select(int A[], int N, int X)
{
    int i=0;
    while (i<N && A[i] <X)
    {
        if (A[i]<0)
            A[i] = - A[i];
        i++;
    }
    return(1);
}
[Figure: control flow graph of select annotated with definitions and uses]
function entry: DEF={A,N,X}
i=0: DEF={i}
i<N && A[i] <X: USE={i,N,A,X}
A[i]<0: USE={A,i}
A[i] = - A[i]: USE={A,i}, DEF={A}
i++: USE={i}, DEF={i}
Exercise Def-Use paths: select paths
based on their effects on the variables, rather
than on the number of loop iterations
The Infeasibility Problem
Syntactically indicated behaviors (paths, data flows, etc.) are
often impossible
Infeasible control flow, data flow, and data states
Adequacy criteria are typically impossible to satisfy
Unsatisfactory approaches:
Manual justification for omitting each impossible test case
(esp. for more demanding criteria)
Adequacy “scores” based on coverage
example: 95% statement coverage, 80% def-use
coverage
Regression Testing
Testing a new version (release): how can we minimize effort
using results of testing of previous versions?
On a previous release:
save scaffoldings (drivers,stubs,oracles)
record test cases (<inputs,outputs>)
On the new release:
keep track of changes
evaluate impact of changes
Create Scaffolding
DRIVER
initialization of non-local variables
initialization of parameters
activation of the unit
PROGRAM UNIT
STUB
“templates” of modules used by the
unit (functions called by the unit)
“templates” of any other entity used
by the unit
ORACLE
check the correspondence between the produced and
the expected result
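The three pieces of scaffolding can be sketched around a small unit. Everything here is hypothetical (the unit, the tax service it depends on, the 20% rate, the tolerance); the point is only the division of labor between driver, stub, and oracle.

```python
# Unit under test: depends on a module that may not exist yet.
def total_price(items, tax_service):
    subtotal = sum(price for _, price in items)
    return round(subtotal * (1 + tax_service.rate_for("IT")), 2)


class TaxServiceStub:
    """STUB: a "template" of the module used by the unit, with a canned answer."""

    def rate_for(self, country):
        return 0.20


def oracle(produced, expected):
    """ORACLE: check the produced result against the expected one."""
    return abs(produced - expected) < 0.005


def driver(test_cases):
    """DRIVER: initialize parameters, activate the unit, ask the oracle."""
    failures = []
    for items, expected in test_cases:
        produced = total_price(items, TaxServiceStub())
        if not oracle(produced, expected):
            failures.append((items, expected, produced))
    return failures


# Recorded test cases (<inputs, outputs>) can be re-run on each new release.
TEST_CASES = [
    ([("book", 10.0), ("pen", 2.5)], 15.0),
    ([], 0.0),
]
print(driver(TEST_CASES))
```

Keeping the recorded cases as data separate from the driver is what makes them reusable for regression testing: a new release is checked by calling `driver(TEST_CASES)` again.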
Problems and Tradeoffs
[Figure: effort in test execution and regression testing (vertical axis) plotted against effort in developing drivers/stubs (horizontal axis)]
poorly designed drivers/stubs: low effort in development, high effort in test execution and regression testing
well designed drivers/stubs: high effort in development, low effort in test execution and regression testing
What is an oracle?
An “Inspector” of executions:
do test executions produce
acceptable results?
[Figure: sample outputs (13245, 35968, ....) examined by the oracle]
An oracle can be:
human being
machine
a former version of the same program
another program
.....
What is a good oracle?
Testing large, complex applications may
require millions of test runs
The size of the outputs to be inspected exceeds the capabilities of human eyes
Human eyes are slow and unreliable examiners even of a small number of outputs
AUTOMATED ORACLES ARE ESSENTIAL!
How can we build acceptable oracles?
There is NO universal recipe
Different solutions for different
application domains
• GUI
• protocols
• .....
development environments
• no specifications
• informal specifications
• formal specifications
• .......
development phases
• system testing
• regression testing
• ........
Oracles from Design
Example: UML design notations
Message sequence charts
A UML message sequence chart indicates a test case and
expected outcome, which can be interpreted by a driver
and oracle
Typical of “scenario-based” oracles
scenarios combine test case with special oracle
StateChart (finite state acceptor)
A UML finite state machine describes all permissible
behaviors of a module
oracle can be used with large numbers of automatically
generated test cases
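A finite state machine used as an oracle can be sketched as an acceptor of event traces. The machine below is a hypothetical login session, not taken from the slides; any trace that leaves the table of permitted transitions is reported as a failure.

```python
# Hypothetical state machine of a login session: (state, event) -> next state.
TRANSITIONS = {
    ("logged_out", "login_ok"): "logged_in",
    ("logged_out", "login_failed"): "logged_out",
    ("logged_in", "view_page"): "logged_in",
    ("logged_in", "logout"): "logged_out",
}


def accepts(trace, start="logged_out"):
    """Oracle: True if the observed event trace is a permissible behavior."""
    state = start
    for event in trace:
        state = TRANSITIONS.get((state, event))
        if state is None:
            return False
    return True


print(accepts(["login_ok", "view_page", "logout"]))
print(accepts(["view_page"]))
```

Because the oracle needs no expected output per test case, it can judge any number of automatically generated traces, which is the advantage named on the slide.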
Oracles from Code Documentation
Parnas’ tabular annotations precisely describe the functional
behavior of the unit. The table can be evaluated with respect to
the produced outputs to check for their correctness.
[Figure: Display 1 *, the documentation of procedure Find(x,A,j,present). It has three parts: Display 1 Specification (a condition R0 on n and on consecutive elements ‘A[i], ‘A[i+1], a table that gives j’ and present’ depending on whether some i between 1 and n has ‘A[i]=‘x, and NC(x,A)); Display 1 Program (procedure find (...) ... end {find}); Display 1 Specifications of the Invoked Programs]
* from: Parnas, Madey, Iglewski, Precise Documentation of Well-Structured Programs, IEEE-TSE Vol. 20 N. 12 Dec 1994
Harness vs. Embedded Assertions
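[Figure: two configurations, each made of Driver, Oracle, Unit or Subsystem, and Stubs; in one the Oracle belongs to the test harness, in the other Oracles are embedded in the Unit or Subsystem]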
Embedded assertions act as oracles
within the unit under test
Assertions as Oracles
/*
* Alphabetic sort of an array of strings
*/
void sort( char *words[ ], int nwords )
{
...
assert( is_sorted(words, nwords) );
return;
}
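The same pattern in a complete, runnable form: a Python sketch in which the body of the sort is filled in with an insertion sort (an assumption, since the slide elides it) and the assertion checks the postcondition before returning.

```python
def is_sorted(words):
    """True if every word is <= its successor."""
    return all(a <= b for a, b in zip(words, words[1:]))


def sort_words(words):
    """Alphabetic sort of a list of strings (insertion sort, in place)."""
    for i in range(1, len(words)):
        current = words[i]
        j = i - 1
        while j >= 0 and words[j] > current:
            words[j + 1] = words[j]
            j -= 1
        words[j + 1] = current
    # Embedded oracle: fails loudly if the postcondition does not hold.
    assert is_sorted(words)
    return words


print(sort_words(["pear", "apple", "fig"]))
```

The assertion is a partial oracle: it checks that the result is ordered, not that it contains the same words as the input, so a faulty sort that dropped elements would still pass.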
Another example: HttpUnit
The main class is WebConversation
It is a browser that interacts with a server
With this class we can build several interactions
WebConversation wc = new WebConversation();
WebResponse resp =
    wc.getResponse("http://httpunit.sourceforge.net/doc/Cookbook.html");
WebLink link = resp.getLinkWith("response");
An example
[Figure: screenshots of the example, with the callouts “Check” and “Check password”]
Analysis
Software Inspection:
Low tech but effective
Fagan Code Inspections
One of many “walk-through” and inspection techniques;
among the most successful
More formal and well-defined than “structured walkthroughs” etc.
Has been extended to designs, requirements, etc. with
similar organizing principles
A completely manual technique for finding and correcting
errors
Software Inspection Roles
Moderator:
Typically borrowed from another project. Chairs meeting,
chooses participants, controls process
Readers, Testers:
Read code to group, look for flaws
Author:
Passive participant; answers questions when asked
Software Inspection Process
Planning
Moderator checks entry criteria, chooses participants,
schedules meeting
Overview
Provide background education, assign roles
Preparation
Inspection (see ahead)
Rework
Follow-up (& possible re-inspection)
In the Meeting
Goal: Find as many faults as possible
max 2 x 2 hour sessions per day
approx. 150 source lines/hour
Approach: Line-by-line paraphrasing
Reconstruct intent of code from source
May also “hand test”
Find and log defects, but don’t fix them
Moderator responsible for staying on track
Checklists — NASA example
About 2.5 pages for C code, 4 for FORTRAN
Divided into: Functionality, Data Usage, Control,
Linkage, Computation, Maintenance, Clarity
Examples:
Does each module have a single function?
Does the code match the Detailed Design?
Are all constant names upper case?
Are pointers not typecast (except assignment of NULL)?
Are nested “INCLUDE” files avoided?
Are non-standard usages isolated in subroutines and well
documented?
Are there sufficient comments to understand the code?
Inspection Automation
Although a manual technique, many kinds of automated
support are possible:
Automate trivial checks (e.g., formatting)
Reference: Checklists, standards w/ examples
Focus (highlight, selection) on relevant parts
Annotation & Communication
Process guidance and (partial) enforcement
e.g., InspeQ will not allow check-off until all relevant
parts of a document have been observed
Why does inspection work?
The evidence says it is cost-effective. Why?
Detailed, formal process, with record keeping
Check-lists; self-improving process
Social aspects of process, esp. for author
Consideration of whole input space
Applies to incomplete programs
Limitations
Scale: Inherently a unit-level technique
Non-incremental; what about evolution?
Data flow analysis
function absdiff (a, b: integer)
return integer is
    tmp: integer;
begin
    if (a < b) then
        tmp := a;
        a := b;
        b := tmp;
    end if;
    return ( a - b );
end absdiff;
[Figure: control flow graph of absdiff. Nodes: if (a < b); tmp := a; a := b; b := tmp; absdiff := a - b]
Classic data flow analyses to find
program errors
Uninitialized variable
“May” result from classic “avail” analysis
but conservative analysis can be annoying
“Must” version is also possible (how?)
Dead assignment (no possible use)
Classic “live variables” analysis
In FORTRAN, Awk, BASIC, PERL, etc., usually indicates a
misspelled variable
less useful in languages requiring declarations
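Both analyses can be sketched on straight-line code, where there is a single path and the "may" and "must" versions coincide. The statement encoding, as pairs of a defined variable and the variables it uses, and the misspelled `tpm` are assumptions chosen to echo the swap in absdiff.

```python
def analyze(statements, inputs=(), outputs=()):
    """statements: list of (defined variable, variables used), in execution order."""
    # Forward pass: a use with no earlier definition is uninitialized.
    defined = set(inputs)
    uninitialized = []
    for index, (target, uses) in enumerate(statements):
        for var in uses:
            if var not in defined:
                uninitialized.append((index, var))
        defined.add(target)

    # Backward pass: a definition is dead if the variable is not live after it.
    live = set(outputs)
    dead = []
    for index in range(len(statements) - 1, -1, -1):
        target, uses = statements[index]
        if target not in live:
            dead.append((index, target))
        live.discard(target)
        live.update(uses)
    return uninitialized, sorted(dead)


# tmp := a; a := b; b := tpm (misspelled); result := a - b
program = [("tmp", ["a"]), ("a", ["b"]), ("b", ["tpm"]), ("result", ["a", "b"])]
print(analyze(program, inputs=["a", "b"], outputs=["result"]))
```

A single misspelling shows up twice: `tpm` is used without being defined, and the assignment to `tmp` is never used. With branches and loops the same two passes must be iterated over the control flow graph until the sets stop changing.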
Precision & Safety
An analysis is conservative (safe) if it doesn’t miss errors
An analysis is precise to the extent that it doesn’t report
spurious errors
Static flow analysis considers all (syntactic) program paths; it
can be conservative or precise, but not both
An overly conservative, imprecise analysis may be useless.
A well-defined but overly strict property may be
preferable to spurious error reports
Analysis of Models:
State-Space Exploration
Concurrency (multi-threading, distributed programming, ...)
makes testing harder
introduces non-determinism; time- and load-dependent
bugs escape extensive testing
Finite-state models can be exhaustively verified
[Figure: finite-state models are extracted from the code (e.g., accept E do ..., with the events -E, ?E, !E), combined, and checked (Extract, Combine, Check)]
Automated Finite-State Verification
G. Holzmann, “The model checker SPIN.”
IEEE TSE 23(5), May 1997
Example tool SPIN (one of many)
verifies simple program-like design model
high-level design of process interaction, ignoring other
aspects of computation (e.g., functional behavior)
used for protocols, OS scheduling, ...
useful despite limited capacity; best for verifying high-level design before coding
Domain-specific analysis
limited “proof” of simple but critical properties in a limited
domain
State explosion problem
Size of composite state graph is
product of individual state graphs.
OK for a simple two-party protocol, but impossibly
expensive for systems with many processes
Brute force state enumeration is limited
to a few processes
State explosion is one face of a (provably) hard problem.
The same fundamental limits appear in different form for
non-enumerative analyses
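The product rule is easy to see by brute force. This sketch assumes a hypothetical process with three local states and counts the composite states of n independent copies; real tools explore only the reachable part, so the figure is an upper bound.

```python
from itertools import product

# Hypothetical process with 3 local states.
LOCAL_STATES = ("idle", "waiting", "critical")


def composite_states(processes):
    """All global states of `processes` independent copies of the process."""
    return product(LOCAL_STATES, repeat=processes)


for n in (1, 2, 5, 10):
    count = sum(1 for _ in composite_states(n))
    print(n, "processes:", count, "composite states")
```

Two processes give 9 composite states, ten give 3^10 = 59049, and each further process multiplies the total by three.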
What is static analysis good for?
Not a replacement for testing
focused, (mostly) automated analysis for limited classes of
faults
More thorough than testing (within scope)
conservative analyses are tantamount to formal
verification
Also augments testing, e.g., dependence analysis for data
flow testing
Load and performance test
Robustness and scalability
Robustness: It is the capability of behaving “decently” even in
cases not explicitly considered in the requirements definition
document
If we consider Web applications, the number of users is the
key factor
Scalability: It is the capability of serving a given load, along
with a predefined QoS, and being able to adapt to the
evolutions of the load
An example load problem
Yahoo was attacked by sending thousands of emails and the
servers collapsed
Approaches
We can analyze the architecture to discover problems in its
components or in the way they interact
We can verify their design and interactions
We can produce models
We can test the application (load test)
Load test
We simulate a given load on the server by means of virtual
users
These users behave like real users and test the capability
of the system to support the load
We must have the application before being able to work on
load testing
With this test, we can
Study the HW/SW configuration needed to support a given
load
Discover the maximum load we can support with the current
configuration
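The mechanics of virtual users can be sketched with threads. To stay self-contained the sketch calls a stand-in function instead of a real server; `fake_server`, its 10 ms delay, and the script of URLs are assumptions, and in a real load test that function would issue HTTP requests to the application.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def fake_server(request):
    """Stand-in for the application under test (assumption: ~10 ms per request)."""
    time.sleep(0.01)
    return "200 OK"


def virtual_user(script):
    """Replay a recorded script and return the response time of each step."""
    timings = []
    for request in script:
        start = time.perf_counter()
        response = fake_server(request)
        timings.append(time.perf_counter() - start)
        assert response == "200 OK"
    return timings


def load_test(users, script):
    """Run `users` virtual users in parallel; return mean response time in seconds."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(virtual_user, [script] * users))
    return mean(t for timings in results for t in timings)


script = ["/home", "/search?q=quality", "/cart"]
for users in (1, 10, 50):
    print(users, "users:", round(load_test(users, script) * 1000, 1), "ms")
```

Raising the number of users until the mean response time exceeds an agreed threshold gives a rough estimate of the maximum load the configuration supports.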
How
Manual tests
We use employees during the week-ends
It is really expensive and we cannot simulate exceptional
situations
We need millions of users for Web applications
Automatic tests (Mercury, Rational, Empirix):
We do not need too many employees
We can perform what-if simulations
We can better collect statistics
Available tools record the activities done by real users, create
the scripts that virtual users should execute, and measure the
mean time to answer
Scripts can be parametric and configurable and use a DB to
store and retrieve data and generate coherent use cases
Astra load test (Mercury)
It can emulate thousands of users
and provides a graphical monitor to
identify and isolate problems
It allows users to record
browsing/interaction sessions
It allows users to fill forms in a
parametric way
[Figure: screenshot of the tool; visible values: San Francisco, Portland, Acapulco]
Astra load test
The test monitor allows us to
Define the number of virtual
users
Identify the tests that we
want to execute and are
already stored in the right
format
Choose the machines on which we
want to execute the test
The monitor also displays the
results even during the
execution of the tests
Performance
Performance
We should always consider our clients before designing the
application
Its weight basically should depend on them
We should carefully plan performance testing at the beginning
of our project
Our requirements should clearly state what we want
We must clearly state pass and fail conditions
A transaction that takes too long can be seen as an
error
What is the meaning of “too long”?
What should we test? Where?
We must identify the cases that we think can be critical
Trivial problems become important on the server because
of the degree of parallelism
Different applications have different criticalities
In client-server like architectures we should always identify
the bottleneck
Is the server slow?
Is available bandwidth limited on the client side?
Are CPU and memory not sufficient?
Usually, we mainly consider problems on the server side
Before starting we must clearly know the HW/SW architecture
Three levels
Ad hoc performance testing
Testers validate if the application answers with a
reasonable delay. They may record failures, but they do
not locate them
Observational testing
We take the first measurements, but only with watches and similar
tools
Measured testing
We use objective measures and specific tools
What do we measure?
Measures on the server
Megacycles (MCs)
Memory footprint
Measures that consider the network
Time to last byte (TTLB)
User-perceived response time
Other measures
Bytes over the wire (BoW)
Megacycles (MCs)
CPU cost =
(CPU usage * number of CPUs * CPU speed in MHz) / requests
per second
This measure defines the computation cost on the CPU
If I know this cost and the number of users, I can foresee
the number of CPUs I need
When I add a new CPU, at most it gives an improvement of 0.8
because of the hardware it shares (bus, ram, ...)
Example
I have two bi-processor Web servers that work at 400 MHz
Their usage is equal to 60%
They must serve 30 requests per second
The mean value of each measure is:
Available MCs = ((1+0.8)+(1+0.8))*400 = 1440
Used MCs = 1440*0.60 = 864
MCs used for each request = 864/30 = 28.8
Requests that we can serve per second = 1440/28.8 = 50
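The same arithmetic as a small function, so that other configurations can be tried. The function and parameter names are made up for the sketch; the 0.8 gain for every additional CPU on a server is the assumption stated on the Megacycles slide.

```python
def available_megacycles(servers, cpus_per_server, mhz, extra_cpu_gain=0.8):
    """The first CPU counts 1, every additional CPU on the same server counts 0.8."""
    per_server = 1 + extra_cpu_gain * (cpus_per_server - 1)
    return servers * per_server * mhz


def capacity(servers, cpus_per_server, mhz, usage, requests_per_second):
    """Reproduce the four figures of the example for a given configuration."""
    available = available_megacycles(servers, cpus_per_server, mhz)
    used = available * usage
    cost_per_request = used / requests_per_second
    return {
        "available MCs": available,
        "used MCs": used,
        "MCs per request": cost_per_request,
        "max requests per second": available / cost_per_request,
    }


for name, value in capacity(2, 2, 400, 0.60, 30).items():
    print(f"{name}: {value:.1f}")
```

Calling `capacity(3, 2, 400, 0.60, 30)` instead would show how a third server changes the figures, under the assumption that the cost per request measured on the current configuration stays the same.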
Other measures
Memory footprint
It is used to study the maximum usage of memory
It is not interesting if we have only static pages and images
It is interesting when we have dynamic pages, accesses to
DBs, and code that uses memory (Java and C#)
Time to last byte (TTLB)
It measures the time from the instant at which the request
leaves the client till the server sends the last byte
It does not consider the time used by the client to process
what it receives (to execute a script, render a page)
If the “execution” is complex and time-consuming, the
user can have significant delays even if the TTLB is low
Other measures
User-perceived response time
It is the time needed to fully load a page on the client and
have it ready to interact with the user
It is usually greater than TTLB since we must consider the
overhead necessary after the last byte
Bytes over the wire (BoW)
It counts the number of bytes moved between client and
server. It distinguishes between:
First request: the cache is empty
Further request: the cache contains some data and we
must transmit only what changes (dynamic data)
Model-based validation
Can provide quick feedback during the design
Allow for the tuning of resources
[Figure: model with the nodes Client, Internet, Internal LAN, and Internal Web, exchanging Req, page, and new page messages]
Performance testing
We need testing tools
For example: E-Test (Empirix)
We create a script that corresponds to a
given browsing/interaction session
To measure the performance of the
server with respect to the number of
users
To verify the performance of each
single component and identify
bottlenecks
[Figure: charts of Kb/sec, Pages/sec, Transactions/sec, CPU usage by the DB, and Memory usage by the DB]
Testing services through the Web
We can use some applications that allow us to test our
applications without installing anything on our machines
For example: www.netmechanic.com
They work on the load time for each page and on the
“weight” of each page or component
Load Time: 38.71 seconds, height/width problems

Size (bytes)  Object  URL
78698         HTML    http://www.tiscali.it/
6743          IMG     http://www.tiscali.it/magazine/Italy/spettacoli/mytv/foto/gino_maggio_2.jpg
5591          IMG     http://www.tiscali.it/magazine/Italy/eventi/eventi/Media/Foto/2002/aprile/19/junior_spazio.jpg
5246          IMG     http://www.tiscali.it/magazine/Italy/eventi/eventi/Media/Foto/2002/maggio/2/foto_epoca.jpg
4179          SCRIPT  http://www.tiscali.it/inc/hp/cookie.js

Original: JPEG, Size: 6743 Bytes, H: 130, W: 150, DL Time (28.8): 0:02
Type: JPEG, Quality: 30, Size: 4170 bytes, DL Time (28.8): 0:01, Savings: 38 %

Load Time by Modem Speed
Modem Speed    Download Time
14.4k          75.42 seconds
28.8k          38.71 seconds
56k            20.68 seconds
ISDN (128k)    10.26 seconds
T1 (1.44 MB)   2.73 seconds

• You may want to break this Web page into several smaller pages.
• Click on an image's file size to reduce it with GIFBot.
• Adding HEIGHT and WIDTH attributes to your images will help browsers display your page sooner.
• Adding WIDTH attributes to your TABLE tags will help browsers display your page sooner.
• Your page's overall rating was lowered one level because of HTML problems.
• Correct the HTML problems to get our highest rating.
Security test
????
Process
Software Qualities and Process
Qualities cannot be added after development
Quality results from a set of inter-dependent activities
Analysis and testing are crucial but far from sufficient.
Testing is not a phase, but a lifestyle
Testing and analysis activities occur from early in
requirements engineering through delivery and subsequent
evolution.
Quality depends on every part of the software process
An essential feature of software processes is that software
test and analysis is thoroughly integrated and not an
afterthought
The Quality Process
Quality process: set of activities and responsibilities
focused primarily on ensuring adequate dependability
concerned with project schedule or with product usability
The quality process provides a framework for
selecting and arranging activities
considering interactions and trade-offs with other
important goals.
Interactions and tradeoffs
Example: high dependability vs. time to market
Mass market products:
Better to achieve a reasonably high degree of
dependability on a tight schedule
Than to achieve ultra-high dependability on a much longer
schedule
Critical medical devices:
Better to achieve ultra-high dependability on a much
longer schedule
Than a reasonably high degree of dependability on a tight
schedule
Properties of the Quality Process
Completeness: appropriate activities are planned to detect
each important class of faults.
Timeliness: faults are detected at a point of high leverage (as
early as possible)
Cost-effectiveness: activities are chosen depending on cost
and effectiveness
cost must be considered over the whole development cycle
and product life
the dominant factor is likely to be the cost of repeating an
activity through many change cycles.
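A small worked example may make the point about change cycles concrete. The figures below are hypothetical person-hours chosen only for illustration; `total_cost` simply adds a one-time setup cost to the cost of repeating the activity at every change cycle.

```python
def total_cost(setup_hours, hours_per_run, change_cycles):
    """Cost of an activity over the product life: one-time setup plus
    the cost of repeating it at every change cycle."""
    return setup_hours + hours_per_run * change_cycles

# Hypothetical figures (person-hours), not taken from any real project.
manual = dict(setup_hours=2, hours_per_run=8)
automated = dict(setup_hours=40, hours_per_run=0.5)

for cycles in (1, 5, 50):
    m = total_cost(change_cycles=cycles, **manual)
    a = total_cost(change_cycles=cycles, **automated)
    print(f"{cycles:2d} cycles: manual {m:6.1f} h, automated {a:6.1f} h")
```

With these numbers the manual activity is cheaper for a single cycle (10 h vs. 40.5 h), while after 50 cycles the repeated cost dominates (402 h vs. 65 h).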
Planning and Monitoring
The quality process
balances several activities across the whole development
process
selects and arranges them to be as cost-effective as
possible
improves early visibility
Quality goals can be achieved only through careful planning
Planning is integral to the quality process
Process Visibility
A process is visible to the extent that one can answer the
question
“How does our progress compare to our plan?”
Are we on schedule?
How far ahead or behind?
The quality process has not achieved adequate visibility if
one cannot gain strong confidence in the quality of the
software system before it reaches final testing
quality activities are usually placed as early as possible
design test cases at the earliest opportunity (not “just
in time”)
use analysis techniques on software artifacts produced
before actual code
A&T Strategy
Identifies company- or project-wide standards that must be
satisfied
procedures required for obtaining quality certificates
techniques and tools that must be used
documents that must be produced
Quality plan
a comprehensive description of the quality process that includes:
objectives and scope of quality activities
documents and other items that must be available
items to be tested
features to be tested and not to be tested
analysis and test activities
staff involved in quality
constraints
pass and fail criteria
schedule
deliverables
hardware and software requirements
risks and contingencies
Improving the Process
Long-lasting errors are common
It is important to structure the process for
identifying the most critical persistent faults
tracing them back to frequent errors
adjusting the development and quality processes to
eliminate errors
Feedback mechanisms are the main ingredient of the quality
process for identifying and removing errors
Organizational factors
Different teams for development and quality?
separate development and quality teams are common in
large organizations
indistinguishable roles are advocated by some
methodologies (e.g., extreme programming)
Different roles for development and quality?
test designer is a specific role in many organizations
rotating engineers between development and testing tasks
across different projects is another option
Example of Allocation of Responsibilities
Allocating tasks and responsibilities is a complex job:
we can allocate
Unit testing
to the development team (requires detailed knowledge of the code)
but the quality team may control the results (structural coverage)
Integration, system and acceptance testing
to the quality team
but the development team may produce scaffolding and oracles
Inspection and walk-through
to mixed teams
Regression testing
to quality and maintenance teams
Process improvement related activities
to external specialists interacting with all teams
Product vs. process improvement
Product improvement:
Fault is detected (by inspection, testing, user report, ...)
Fault is diagnosed and repaired
Process improvement
Faults are detected (and maybe repaired)
Fault record is analyzed to tune process
Fault analysis
What are the faults?
Categorize by kind (Memory leak, interface error, misfeature, etc.)
And by severity
When did they occur? And when were they found?
Coding? Design? Requirements?
Why did they occur?
Look for “root causes”
How could they be prevented?
Categorizing faults
There is no “right” categorization
May depend on design style, implementation language,
process and documents, ...
Should probably be revised occasionally
Goal is enough precision for “Pareto” analysis (80/20 rule)
considering severity and cost
Categorization needn’t be perfect or painful, but keeping
records is essential
Fault severity
Typical breakdown (of failures):
Critical: Product is unusable
Severe: Product feature cannot be used; no workaround
Moderate: Product feature can be used only with
workaround (loss of efficiency, reliability, or significant
loss of convenience)
Cosmetic or minor inconvenience
Cost may be distinct from severity
80/20 rule (a.k.a. Pareto analysis)
Identify one or two “dominant” fault categories
Considering severity, cost, and frequency
Further problem analysis is limited to these
Categories may “level” over time
A good time for rethinking the categories
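A Pareto analysis over a fault log might be sketched as follows. The fault records, category names, and numeric severity weights are all invented for illustration (the slides give no numeric scale), and cost is left out to keep the example short.

```python
from collections import Counter

# Hypothetical severity weights (assumption, not from the slides).
SEVERITY_WEIGHT = {"critical": 8, "severe": 4, "moderate": 2, "cosmetic": 1}

# Hypothetical fault log: (category, severity) pairs.
FAULT_LOG = [
    ("interface error", "severe"),
    ("interface error", "critical"),
    ("interface error", "moderate"),
    ("memory leak", "severe"),
    ("memory leak", "moderate"),
    ("misfeature", "cosmetic"),
    ("broken link", "cosmetic"),
    ("interface error", "severe"),
]

def dominant_categories(log, share=0.8):
    """Return the top-ranked categories, taken in order of weighted score,
    until they cover at least `share` of the total score."""
    scores = Counter()
    for category, severity in log:
        scores[category] += SEVERITY_WEIGHT[severity]
    total = sum(scores.values())
    selected, covered = [], 0
    for category, score in scores.most_common():
        if covered >= share * total:
            break
        selected.append((category, score))
        covered += score
    return selected

print(dominant_categories(FAULT_LOG))
# [('interface error', 18), ('memory leak', 6)]
```

Here two of the four categories account for 24 of the 26 weighted points, so further analysis would concentrate on them.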
Test Documentation
Must be an organization standard
Depends on
organization (size, turnover, ...),
type of software (criticality, average life, complexity,
number of versions, ...)
It must include at least:
test suite documentation
test case documentation
Test Documentation (cont.)
Documentation of test suites:
software tested
version
goal
overall results
author
Documentation of Test Cases:
goal
“environment” (driver, stub, oracle)
input
expected output
actual output
result
observations
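The test case fields listed above map naturally onto a small record type. The sketch below is one possible shape, not a standard: `CaseRecord`, `execute`, and the `normalize_path` unit under test are hypothetical names, and the oracle is simply equality with the expected output.

```python
from dataclasses import dataclass

@dataclass
class CaseRecord:
    goal: str
    environment: str           # driver, stubs, and oracle used
    input_data: object
    expected_output: object
    actual_output: object = None
    result: str = "not run"    # "pass", "fail", or "not run"
    observations: str = ""

def execute(record, unit_under_test):
    """Driver: run the unit on the recorded input and let the oracle
    (equality with the expected output) decide pass or fail."""
    record.actual_output = unit_under_test(record.input_data)
    record.result = "pass" if record.actual_output == record.expected_output else "fail"
    return record

def normalize_path(path):
    """Hypothetical unit under test: lowercase path with one leading slash."""
    return "/" + path.strip("/").lower()

case = execute(
    CaseRecord(
        goal="Paths are normalized to lowercase with one leading slash",
        environment="driver: execute(); stubs: none; oracle: expected_output",
        input_data="/Magazine/Italy/",
        expected_output="/magazine/italy",
    ),
    normalize_path,
)
print(case.result)  # pass
```

Keeping actual output and result in the same record as the expected output makes each run self-documenting, which is the point of the documentation standard.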
Tools
Tools
Test process management
Validator/Req
Test Director
Rational Robot
Mercury WinRunner
Segue SilkTest
Compuware QA Run
Usability
WebSAT
Bobby
Page Valet
W3C HTML Validator
Service
HTML Authoring Service
Xenu
Alert LinkRunner
HTML Link Validator
HTML Validator 5.0
Web Link Validator
Tools
Performance
Web Performance Trainer
Httpload
Webserver Stress Tool
Http/s Load OpenSTA
jMeter
Coverage
Logiscope TestChecker
Deep Cover
jCover
LDRA TestBed
Attol Coverage (Rational)
Scaffolding
ATTOL Unit Test (now Rational)
Cantata++
jUnit
Cactus
HTTPunit
Memory leaks
Purify (Rational)
Sentinel
Static analysis
Logiscope Audit
LDRA TestBed
References
Web sites
www.w3c.org
www.useit.com
www.softwareqatest.com/qatweb1.html
www.junit.org/index.htm
www.swquality.com/users/pustaver/index.shtml
standards.ieee.org/catalog/olis/se.html
www.stickyminds.com
www.qaforums.com
www.qualitytree.com
www.cs.uoregon.edu/~michal/book/
Books
Usability
Jakob Nielsen, "Designing Web Usability: The Practice of Simplicity", New Riders Publishing, Indianapolis, 2000
Software engineering
Ian Sommerville, "Software Engineering", Addison-Wesley, 2000
Roger S. Pressman, "Software Engineering: A Practitioner's Approach", McGraw-Hill, 2000
Test in general
Cem Kaner, Hung Quoc Nguyen, Jack Falk, "Testing Computer Software", John Wiley & Sons, 1999
Books
Test for object-oriented software
Robert V. Binder, "Testing Object-Oriented Systems: Models, Patterns, and Tools", Addison-Wesley, 1999
Software metrics
Norman E. Fenton, Shari Lawrence Pfleeger, "Software Metrics: A Rigorous and Practical Approach, Revised", Brooks/Cole, 1998
Stephen H. Kan, "Metrics and Models in Software Quality Engineering", Addison-Wesley, 1995
Thank you!
Luciano Baresi
DEI - Politecnico di Milano
Piazza L. da Vinci, 32 - 20133 Milano (Italy)
tel: 02 2399 3638
email: [email protected]
www.elet.polimi.it/~baresi
Homework
Study and analyze one of the topics presented during these
days
Some particular methodologies
Some interesting examples
…
Imagine you are a test manager and try to identify the
activities that should be carried out to test a Web application
For example, www.amazon.com
Consider what, how, and when
Try to estimate costs in person-months
Constraints
10 days
5 pages (at most)