Detection and Localization of HTML
Presentation Failures Using Computer
Vision-Based Techniques
Sonal Mahajan and William G. J. Halfond
Department of Computer Science
University of Southern California
Presentation of a Website
• What do we mean by presentation?
– “Look and feel” of the website in a browser
• What is a presentation failure?
[Diagram: web page rendering ≠ expected appearance; the end user moves to another website at no penalty; the business loses out on valuable customers]
• Why is it important?
– It takes users only 50 ms to form an opinion about
your website (Google research, 2012)
– Affects impressions of trustworthiness, usability,
company branding, and perceived quality
2
Motivation
• Manual detection is difficult
– Complex interaction between HTML, CSS, and JavaScript
– Hundreds of HTML elements + CSS properties
– Labor intensive and error-prone
• Our approach – Automate debugging of
presentation failures
3
Two Key Insights
1. Detect presentation failures
[Diagram: oracle image + test web page → visual comparison → presentation failures]
Use computer vision techniques
4
Two Key Insights
2. Localize to faulty HTML elements
[Diagram: test web page → layout tree → faulty HTML elements]
Use rendering maps
5
Limitations of Existing Techniques
• Regression Debugging
– Current version of the web app is modified
• Correct bug
• Refactor HTML (e.g., convert <table> layout to <div> layout)
– DOM comparison techniques (XBT) are not useful if the DOM has changed significantly
• Mockup-driven development
– Front-end developers convert high-fidelity mockups to HTML
pages
– DOM comparison techniques cannot be used, since there is no
existing DOM
– Invariant specification techniques (Selenium, Cucumber, Sikuli) are not practical, since all correctness properties need to be specified
– Fighting Layout Bugs: an app-independent correctness checker
6
Running Example
Web page rendering ≠ expected appearance (oracle)
7
Our Approach
Goal – Automatically detect and localize presentation failures in web pages
[Diagram: oracle image + test web page → P1. Detection → visual differences → P2. Localization (pixel-HTML mapping) → report]
8
P1. Detection
• Find visual differences (presentation failures)
• Compare oracle image and test page
screenshot
• Simple approach: strict pixel-to-pixel
equivalence comparison
– Drawbacks
• Spurious differences due to platform rendering differences
• Small differences may be “OK”
9
Perceptual Image Differencing (PID)
• Uses models of the human visual system
– Spatial sensitivity
– Luminance sensitivity
– Color sensitivity
Shows only human-perceptible differences (a rough sketch follows after this slide)
• Configurable parameters
– Δ : Threshold value for perceptible difference
– F : Field of view of the observer
– L : Brightness of the display
– C : Sensitivity to colors
10
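PID itself is a published algorithm with a full model of the human visual system; purely as an illustration of the thresholded-comparison idea, here is a minimal Python sketch that flags pixels whose luminance difference exceeds a configurable delta. The function name, the luminance-only approximation, and the default threshold are assumptions for this sketch, not WebSee's implementation.

```python
import numpy as np
from PIL import Image

def perceptual_diff_mask(oracle_path, test_path, delta=0.05):
    """Crude stand-in for PID: flag pixels whose luminance difference
    exceeds a perceptibility threshold (real PID also models spatial
    frequency and color sensitivity)."""
    oracle = np.asarray(Image.open(oracle_path).convert("RGB"), dtype=np.float64) / 255.0
    test = np.asarray(Image.open(test_path).convert("RGB"), dtype=np.float64) / 255.0
    assert oracle.shape == test.shape, "screenshots must be normalized to the same size"

    # Rec. 601 luminance weights for the RGB channels.
    weights = np.array([0.299, 0.587, 0.114])
    lum_oracle = oracle @ weights
    lum_test = test @ weights

    return np.abs(lum_oracle - lum_test) > delta   # boolean (H, W) mask

# diff_pixels = np.argwhere(perceptual_diff_mask("oracle.png", "test.png"))
```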
P1. Detection – Example
[Diagram: oracle and test web page screenshots → visual comparison using PID → filter differences belonging to dynamic areas → apply clustering (DBSCAN) → difference clusters A, B, C; a clustering sketch follows after this slide]
11
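To group the surviving difference pixels into regions such as A, B, and C, a density-based clustering like DBSCAN can be applied to their coordinates. A minimal sketch with scikit-learn follows; the eps and min_samples values are illustrative only.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_difference_pixels(diff_pixels, eps=5.0, min_samples=4):
    """Cluster (row, col) difference-pixel coordinates into spatially
    coherent regions; label -1 marks isolated pixels treated as noise."""
    diff_pixels = np.asarray(diff_pixels)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(diff_pixels)
    clusters = {}
    for label, pixel in zip(labels, diff_pixels):
        if label == -1:
            continue
        clusters.setdefault(int(label), []).append(tuple(int(v) for v in pixel))
    return clusters   # e.g. {0: [...], 1: [...], 2: [...]} for clusters A, B, C
```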
P2. Localization
• Identify the faulty HTML element
Use rendering maps to find faulty HTML
elements corresponding to visual differences
• Use an R-tree to map pixel-level visual differences to HTML elements (see the sketch after this slide)
• “R”ectangle-tree: a height-balanced tree, popular for storing multidimensional data
12
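A sketch of the pixel-to-element lookup using the Python rtree package, chosen here only for illustration (WebSee's own data structures may differ): every rendered element's bounding box is inserted into the index, and a point query returns each element whose rectangle contains a given difference pixel.

```python
from rtree import index

def build_rendering_map(elements):
    """elements: iterable of (xpath, (x1, y1, x2, y2)) bounding boxes taken
    from the browser's layout of the test page."""
    idx = index.Index()
    xpaths = []
    for i, (xpath, bbox) in enumerate(elements):
        idx.insert(i, bbox)
        xpaths.append(xpath)
    return idx, xpaths

def elements_at_pixel(idx, xpaths, x, y):
    """Return the XPaths of all elements whose rendered box contains (x, y)."""
    return [xpaths[i] for i in idx.intersection((x, y, x, y))]

# Hypothetical usage:
# idx, xpaths = build_rendering_map([("/html/body/div[1]", (0, 0, 800, 100))])
# candidates = elements_at_pixel(idx, xpaths, 100, 400)
```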
P2. Localization - Example
13
P2. Localization - Example
[Diagram: sub-tree of the R-tree with bounding rectangles R1–R5]
14
P2. Localization - Example
Result Set for difference pixel (100, 400):
/html/body/…/tr[2] (R1)
/html/body/…/tr[2]/td[1] (R2)
/html/body/…/tr[2]/td[1]/table[1] (R3)
/html/body/…/tr[2]/td[1]/table[1]/tr[1]
/html/body/…/tr[2]/td[1]/table[1]/td[1]
Map pixel visual differences to HTML elements
15
Special Regions Handling
• Special regions = Dynamic portions (actual content
not known)
1. Exclusion Region
2. Dynamic Text Region
16
1. Exclusion Regions
• Only the size-bounding property is applied (a filtering sketch follows after this slide)
[Diagram: test web page vs. oracle; difference pixels inside the advertisement box are filtered in detection, and the advertisement box element is reported as faulty]
17
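A minimal sketch of how exclusion regions could be honored, under the assumption that each region is given as a rectangle: difference pixels that fall inside a region are dropped before clustering, and only the size-bounding check can still implicate the region's element. Function and parameter names are hypothetical.

```python
def filter_exclusion_regions(diff_pixels, exclusion_regions):
    """Drop difference pixels that fall inside any exclusion rectangle
    (x1, y1, x2, y2), e.g. an advertisement box with unknown content."""
    def inside(pixel, rect):
        x, y = pixel
        x1, y1, x2, y2 = rect
        return x1 <= x <= x2 and y1 <= y <= y2

    return [p for p in diff_pixels
            if not any(inside(p, r) for r in exclusion_regions)]

def violates_size_bound(actual_box, allowed_box):
    """Only the size-bounding property is enforced for an exclusion region:
    its element is reported as faulty if it outgrew the allowed box."""
    ax1, ay1, ax2, ay2 = actual_box
    bx1, by1, bx2, by2 = allowed_box
    return (ax2 - ax1) > (bx2 - bx1) or (ay2 - ay1) > (by2 - by1)
```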
2. Dynamic Text Regions
• Style properties of the text are known, e.g., text color: red, font-size: 12px, font-weight: bold (a sketch follows after this slide)
[Diagram: the test web page is modified with the known styles to act as the oracle, P1 and P2 are rerun, and the news box element is reported as faulty]
18
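One plausible way to sketch the dynamic text handling, not necessarily WebSee's exact mechanism: render the test page itself with the declared style properties forced onto the region, use that rendering as the oracle for the region, and rerun P1 and P2 so only deviations from the declared style surface. The selector and style dictionary below are illustrative assumptions.

```python
from selenium import webdriver

def render_styled_oracle(url, region_selector, expected_style, out_path="region_oracle.png"):
    """Screenshot the test page with the expected text styles forced onto
    the dynamic region, to serve as the oracle for that region."""
    css = "; ".join(f"{prop}: {val}" for prop, val in expected_style.items())
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        driver.execute_script(
            "document.querySelector(arguments[0]).style.cssText += ';' + arguments[1];",
            region_selector, css)
        driver.save_screenshot(out_path)
    finally:
        driver.quit()
    return out_path

# render_styled_oracle("http://example.com", "#news-box",
#                      {"color": "red", "font-size": "12px", "font-weight": "bold"})
```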
P3. Result Set Processing
• Rank the HTML elements in the order of
likelihood of being faulty
Use heuristics based on element relationships
• Weighted prioritization score
• The lower the score, the higher the likelihood of being faulty
19
3.1 Contained Elements (C)
[Diagram: expected appearance shows parent containing child1 and child2; in the actual appearance, parent, child1, and child2 are all flagged as faulty]
20
3.2 Overlapped Elements (O)
[Diagram: expected appearance shows parent containing child1 and child2; in the actual appearance, parent and child1 are flagged as faulty]
21
3.3 Cascading (D)
[Diagram: expected appearance shows element1, element2, and element3 in sequence; in the actual appearance, element2 and element3 are flagged as faulty]
22
3.4 Pixels Ratio (P)
[Diagram: parent and child both flagged as faulty; child pixels ratio = 100%, parent pixels ratio = 20%]
A sketch of the combined prioritization score follows after this slide.
23
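The four heuristics, contained elements (C), overlapped elements (O), cascading (D), and pixels ratio (P), feed a weighted prioritization score where a lower score means a more likely culprit. The linear combination and the unit weights below are assumptions for illustration, not the published formula.

```python
def prioritization_score(c, o, d, p, weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine per-element heuristic values into one score.
    c, o, d: penalty terms in [0, 1], higher when the element is likely only
    a by-product of another element's fault; p: fraction of the element's
    pixels covered by the difference cluster. Lower score = more suspicious."""
    wc, wo, wd, wp = weights
    return wc * c + wo * o + wd * d + wp * (1.0 - p)

def rank_candidates(candidates):
    """candidates: list of (xpath, c, o, d, p); most suspicious first."""
    return [xpath for xpath, c, o, d, p in
            sorted(candidates, key=lambda t: prioritization_score(*t[1:]))]
```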
P3. Result Set Processing - Example
[Diagram: difference clusters A, B, C, D, E on the test page; the report for cluster A ranks candidate elements, for example:
1. /html/body/table/…/img
…
5. /html/body/table
6. /html/body
7. /html]
24
Empirical Evaluation
• RQ1: What is the accuracy of our approach
for detecting and localizing presentation
failures?
• RQ2: What is the quality of the localization
results?
• RQ3: How long does it take to detect and
localize presentation failures with our
approach?
25
Experimental Protocol
• Approach implemented in “WebSee”
• Five real-world subject applications
• For each subject application
– Download page and take screenshot, use as
the oracle
– Seed a unique presentation failure to create a
variant
– Run WebSee on oracle and variant
26
Subject Applications
Subject Application   Size (Total HTML Elements)   Generated # test cases
Gmail                 72                           52
USC CS Research       322                          59
Craigslist            1,100                        53
Virgin America        998                          39
Java Tutorial         159                          50
27
RQ1: What is the accuracy?
• Detection accuracy: Sanity check for PID
• Localization accuracy: % of test cases in
which the expected faulty element was
reported in the result set
[Chart: localization accuracy per subject application (Java Tutorial, Virgin America, Craigslist, USC CS Research, Gmail), ranging from 90% to 97%]
28
RQ2: What is the quality of localization?
[Chart: result set size per subject application (Java Tutorial, Virgin America, Craigslist, USC CS Research, Gmail), ranging from 8 to 49 elements (3% to 16% of the page's elements)]
[Illustration: in a ranked result set of 23 elements, the expected faulty element has an average rank of 4.8 (top 2%); when the faulty element is not present in the result set, its distance to a reported element (e.g., 6) is measured]
29
RQ3: What is the running time?
[Chart: analysis time of roughly 87 sec per test case (values of 7 sec and 3 min also shown); a pie chart splits the time among P1: Detection, P2: Localization, and P3: Result Set Processing (21%, 54%, 25%), with a note on the sub-image search for the cascading heuristic]
30
Comparison with User Study
• Graduate-level students
• Manual detection and localization using Firebug
[Chart: accuracy of detection and localization; WebSee scored 100% (detection) and 93% (localization), while students scored 76% (detection) and 36% (localization)]
• Time
– Students: 7 min
– WebSee: 87 sec
31
Case Study with Real Mockups
• Three subject applications
• 45% of the faulty elements reported in the top five
• 70% reported in the top 10
• Analysis time was similar
32
Summary
• Technique for automatically detecting and
localizing presentation failures
• Use computer vision techniques for detection
• Use rendering maps for localization
• Empirical evaluation shows positive results
33
Thank you
Detection and Localization of HTML
Presentation Failures Using Computer
Vision-Based Techniques
Sonal Mahajan and William G. J. Halfond
[email protected]
[email protected]
34
Normalization Process
• Pre-processing step before detection
1. Browser window size is adjusted based on the oracle
2. Zoom level is adjusted
3. Scrolling is handled (a Selenium-based sketch follows after this slide)
35
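A minimal Selenium-based sketch of the normalization step, assuming the oracle's pixel dimensions are known up front; the specific driver, the JavaScript zoom reset, and the scroll reset are illustrative choices rather than WebSee's exact procedure.

```python
from selenium import webdriver

def capture_normalized_screenshot(url, oracle_width, oracle_height, out_path="test.png"):
    """Render the test page under the same viewing conditions as the oracle."""
    driver = webdriver.Firefox()
    try:
        driver.set_window_size(oracle_width, oracle_height)            # 1. window size
        driver.get(url)
        driver.execute_script("document.body.style.zoom = '100%';")    # 2. zoom level
        driver.execute_script("window.scrollTo(0, 0);")                # 3. scrolling
        driver.save_screenshot(out_path)
    finally:
        driver.quit()
    return out_path
```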
Difference with XBT
• XBT techniques use DOM comparison
– Find matched nodes, compare them
• Regression debugging
– Correct bug, refactor HTML (e.g. <table> to <div> layout)
– DOM significantly changed
• XBT cannot find matching DOM nodes, not accurate
comparison
• Mockup Driven Development
– No “golden” version of page (DOM) exists
– XBT techniques cannot be used
• Our approach
– Uses computer vision techniques for detection
– Applies to both scenarios
36
Pixel-to-pixel Comparison
[Images: oracle and test web page screenshot shown side by side]
37
Pixel-to-pixel Comparison
98% of the entire image is reported as different!
[Difference image: legend distinguishes difference pixels from matched pixels]
(A strict-equality sketch follows after this slide.)
38
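For contrast, the strict pixel-to-pixel comparison is trivial to write, which is exactly why its brittleness matters: a platform-level rendering difference or a one-pixel shift can flag most of the page. A minimal numpy sketch, with names chosen for this example:

```python
import numpy as np
from PIL import Image

def strict_pixel_diff_fraction(oracle_path, test_path):
    """Return the fraction of pixels that differ under exact RGB equality."""
    oracle = np.asarray(Image.open(oracle_path).convert("RGB"))
    test = np.asarray(Image.open(test_path).convert("RGB"))
    if oracle.shape != test.shape:
        raise ValueError("screenshots must have identical dimensions")
    mismatched = np.any(oracle != test, axis=-1)   # True where any channel differs
    return float(mismatched.mean())

# A small font-rendering difference across platforms can push this toward
# values like the 98% shown on the slide.
```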
P1. Detection
• Find visual differences (presentation failures)
• Simple approach: strict pixel-to-pixel
equivalence comparison
Analyze using computer vision
techniques
• Our approach: Perceptual image
differencing (PID)
39
Perceptual Image Differencing
[PID difference image: legend distinguishes difference pixels from matched pixels]
40
High fidelity mockups… reasonable?
[Slides 41–52: a series of real-world high-fidelity mockup examples]