Prefetching for Visual Data Exploration
Punit R. Doshi, Elke A. Rundensteiner, Matthew O. Ward
Computer Science Department
Worcester Polytechnic Institute
Support: NSF grants IIS-9732897, EIA-9729878, and IIS-0119276.
Overview
• Why visually explore data?
– Fact: Increasing data set sizes
– Need: Efficient techniques for exploring the data
– Possible solution: Interactive data visualization, since humans can detect certain patterns better and faster than data mining tools
• Why cache and prefetch?
– Interactive data visualization tools do not scale well
– Interactive use requires real-time response
– Caching and prefetching improve response time
• Goal: Propose and evaluate prefetching for visualization tools
Example Visual Exploration Tool: XmdvTool
[Figure panels: Flat Display, Data Hierarchy, Hierarchical Display]
Example Visual Exploration Tool: XmdvTool
• Drill Down: Structure-Based Brush1, Parallel Coordinates (Linked with Brush1)
• Roll-Up: Structure-Based Brush2, Parallel Coordinates (Linked with Brush2)
Characteristics of a Visualization Environment
Characteristics that can be exploited for caching and prefetching:
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
[Figure labels: Move up/down, Move left/right]
Overview of Semantic Caching
• Purpose
– reduce response time and network traffic
• Issues
– visual query cannot directly translate into object IDs
– so a high-level cache specification is needed to avoid complete scans
• Semantic Caching: queries are cached rather than objects
– minimize cost of cache lookup
– dynamically adapt cached queries to patterns of queries
[Figure labels: GUI and cache (Client machine), DB (Server machine)]
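To make "queries are cached rather than objects" concrete, here is a minimal sketch, not XmdvTool's actual cache. It assumes that each visual query reduces to a closed integer range and that each row is identified by an integer key; the `SemanticCache` class and its methods are hypothetical names chosen for illustration. A lookup returns the rows already covered by cached queries plus the remainder ranges that would still have to go to the server.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticCache:
    """Caches results keyed by the query range that produced them (assumed model)."""
    # Maps a closed integer range (lo, hi) to the row keys that fall in that range.
    entries: dict = field(default_factory=dict)

    def store(self, lo, hi, rows):
        """Remember the result of the range query [lo, hi]."""
        self.entries[(lo, hi)] = list(rows)

    def lookup(self, lo, hi):
        """Return (cached_rows, missing_ranges) for the query [lo, hi]."""
        cached, missing, cursor = [], [], lo
        for (c_lo, c_hi), rows in sorted(self.entries.items()):
            if c_hi < cursor or c_lo > hi:
                continue  # no overlap with the still-unanswered part
            if c_lo > cursor:
                missing.append((cursor, c_lo - 1))  # gap before this entry
            # Take only rows inside the unanswered part of the query.
            cached.extend(r for r in rows if cursor <= r <= hi)
            cursor = max(cursor, c_hi + 1)
            if cursor > hi:
                break
        if cursor <= hi:
            missing.append((cursor, hi))  # tail not covered by any entry
        return cached, missing


cache = SemanticCache()
cache.store(0, 9, range(0, 10))
cache.store(20, 29, range(20, 30))
rows, remainder = cache.lookup(5, 24)
print(rows)       # row keys answered from the cache
print(remainder)  # ranges that still need a server query
```

Only the remainder ranges need to be sent to the server, which is how this kind of cache can cut both response time and network traffic without scanning cached objects one by one.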
Effectiveness of Caching
• In XmdvTool, caching reduced response time by 85%
• Prefetching can further improve response time.
[Bar chart: Response Time (seconds), axis 0 to 200, for four caching configurations: Client OFF / Server OFF, Client OFF / Server ON, Client ON / Server OFF, Client ON / Server ON]
Prefetching
• Locality of exploration
• Contiguity of user movements
• Idle time due to user viewing display
• User’s next request can be predicted with high accuracy
• Time to prefetch
[Figure labels: Fetching, Idle time, New user query; Cache, Prefetching, DB]
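The idea of using the user's viewing time to fetch predicted requests can be sketched as follows. This is an illustrative simulation under stated assumptions, not the XmdvTool prefetcher: the cache is a plain dictionary, every fetch is assumed to cost the same fixed time, and the function names `serve_request` and `prefetch_during_idle` are hypothetical.

```python
def serve_request(cache, request, fetch):
    """Answer a request from the cache if possible, else fetch it.

    Returns True on a cache hit, False on a miss.
    """
    hit = request in cache
    if not hit:
        cache[request] = fetch(request)
    return hit


def prefetch_during_idle(cache, predictions, fetch, idle_time, fetch_cost):
    """Fetch predicted requests, most likely first, while idle time remains.

    Returns the amount of idle time actually spent prefetching.
    """
    remaining = idle_time
    for request in predictions:
        if remaining < fetch_cost:
            break  # the user is expected to issue a new query soon
        if request not in cache:
            cache[request] = fetch(request)
            remaining -= fetch_cost
    return idle_time - remaining


cache = {}
fetch = lambda request: f"data for {request}"  # stand-in for a database query
serve_request(cache, 5, fetch)                 # miss: fetched on demand
prefetch_during_idle(cache, [6, 4, 7], fetch, idle_time=2.0, fetch_cost=1.0)
print(serve_request(cache, 6, fetch))          # True: 6 was prefetched
print(serve_request(cache, 7, fetch))          # False: idle time ran out first
```

The longer the user looks at the display, the more predictions fit into the idle window, which matches the intuition that prefetching benefits from delay between requests.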
Prefetching Strategies
• Random Strategy
• Direction Strategy
• Focus Strategy
• Mean Strategy
• Exponential Weight Average Strategy
[Diagram group labels: Localized Speculative Strategies, Data Set Driven Strategy, Vector Strategies]
[Diagram annotations: the label 1/4 shown four times; (m-1), m, (m+1); m(n-2), m(n-1), m(n), m(n+1); Current Navigation Window; Hot Regions]
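A rough sketch of how four of these predictors might look, assuming a user move is a 2-D vector (left/right, up/down) and the prefetcher keeps a list of past moves. The function names, the unit-move model, and the `alpha` smoothing parameter are assumptions made for illustration; the slides do not give the exact formulas, and the Focus strategy is omitted because it depends on hot-region information not detailed here.

```python
import random

# The four unit moves in the assumed 2-D navigation space.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def predict_random(rng=random):
    """Random strategy: each of the four moves has probability 1/4."""
    return rng.choice(MOVES)


def predict_direction(moves):
    """Direction strategy: assume the user repeats the last move."""
    return moves[-1]


def predict_mean(moves):
    """Mean strategy: unweighted average of all observed moves."""
    n = len(moves)
    return (sum(dx for dx, _ in moves) / n, sum(dy for _, dy in moves) / n)


def predict_ewa(moves, alpha=0.5):
    """EWA strategy: exponentially weighted average, recent moves weigh more."""
    ex, ey = moves[0]
    for dx, dy in moves[1:]:
        ex = alpha * dx + (1 - alpha) * ex
        ey = alpha * dy + (1 - alpha) * ey
    return (ex, ey)


history = [(1, 0), (1, 0), (0, 1)]  # two moves right, then one move up
print(predict_direction(history))
print(predict_mean(history))
print(predict_ewa(history))
```

Direction needs only the last move, while Mean and EWA summarize the whole history; that difference in required state is one way to read the later remark that Direction is simple and needs no prior knowledge.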
XmdvTool Implementation
Used:
– C/C++
– TCL/TK
– OpenGL
– Oracle 8i
– Pro*C
[Architecture diagram, divided into an OFF-LINE PROCESS and an ON-LINE PROCESS. Component labels: MinMax Labeling, Loader, Schema Info, Flat Data, Hierarchical Data, DB, Translator, Rewriter, Estimator, CACHE, Buffer, Queries, Exploration Variables, User, GUI, Prefetcher. Library: Random, Direction, Focus, Mean, EWA]
Evaluation of Prefetching Strategies
• Setup:
– Testbed: XmdvTool, a freeware system for n-dimensional exploration
– User Traces:
• Synthetic user traces with varying number of hot regions, % directionality, and average delay between user requests
• Real user traces collected by a user study
• Study effect of different navigation patterns:
– number of hot regions
– erratic vs. directional
– delay between user requests
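A synthetic trace with these three knobs could be generated along the following lines. This is a hypothetical generator, not the one used in the study: it assumes a 1-D navigation space, treats directionality as the probability of repeating the previous move, sends erratic steps one unit towards a randomly chosen hot region, and draws delays from an exponential distribution with the given mean.

```python
import random


def synthetic_trace(steps, keep_direction, hot_regions, mean_delay, seed=0):
    """Generate a list of (position, delay_before_next_request) pairs.

    keep_direction: probability in [0, 1] of repeating the previous move.
    hot_regions:    positions that erratic steps are drawn towards (assumed model).
    mean_delay:     average delay between user requests, in seconds.
    """
    rng = random.Random(seed)  # seeded so that traces are reproducible
    position = hot_regions[0]
    move = 1
    trace = []
    for _ in range(steps):
        if rng.random() >= keep_direction:
            # Erratic step: head one unit towards a randomly chosen hot region.
            target = rng.choice(hot_regions)
            move = 1 if target >= position else -1
        position += move
        delay = rng.expovariate(1.0 / mean_delay)
        trace.append((position, delay))
    return trace


directional = synthetic_trace(5, keep_direction=1.0, hot_regions=[10, 50], mean_delay=2.0)
erratic = synthetic_trace(5, keep_direction=0.0, hot_regions=[10, 50], mean_delay=2.0)
print([p for p, _ in directional])
print([p for p, _ in erratic])
```

Sweeping `keep_direction`, the number of entries in `hot_regions`, and `mean_delay` one at a time mirrors the three navigation-pattern effects the evaluation sets out to study.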
Focus strategy best as # hot regions increases
• Prefetching improves response time
[Line chart: Normalized Latency (0 to 1) vs. Number of Hot Regions (1 to 5); series: No Prefetch, Random, Direction, Focus, Mean, EWA]
Random Strategy – best for erratic traces.
Direction Strategy – best for directional traces.
[Line chart: Normalized Latency (0 to 1) vs. 'Keep Direction' factor (0 to 100); series: No Prefetch, Random, Direction, Focus, Mean, EWA]
Prefetcher performance improves and plateaus as delay between user operations increases.
• Prefetcher performance improved up to 28%.
• Recall: Caching improved response time by 85% over no caching.
[Line chart: Percentage Improvement (%) (0 to 30) vs. Delay between User Operations (seconds) (0 to 8)]
What Can We Conclude?
• Focus: hot region calculation overhead
• Mean and EWA: offer more than needed
• Direction: simple, no prior knowledge required
NOTE:
• Our experiments on real user traces show that real users are highly directional
If only one strategy can be chosen, select Directional Prefetching.
Related Work
• Integrated visualization-database systems: Tioga, IDEA, DEVise [have not used caching and prefetching]
• Prefetching research: mostly on (1) web prefetching, (2) prefetching for memory caches by OS, (3) I/O prefetching [no prefetching research for visualization apps]
Contributions
• Identified key characteristics of visualization tools exploitable for optimizing data access performance
• Developed, implemented and tested prefetching strategies in XmdvTool
• Shown that caching coupled with prefetching at client-side improves data access performance
– Caching reduces response time by 85% over no-caching.
– Prefetching further improves response time by up to 28% over no-prefetching.
Future Work
• No single prefetcher works best for all types of user navigation patterns
• Proposed direction: Adaptive Prefetching (preliminary results show that this further improves response time and reduces prediction errors, at a minimal overhead cost)
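One simple way to picture adaptive prefetching is a selector that scores every strategy against what the user actually did and then follows the best scorer. The slides do not describe the authors' adaptive scheme, so the sketch below is purely an assumption: the `AdaptivePrefetcher` class, its hit-count scoring rule, and the two toy strategies are all hypothetical.

```python
class AdaptivePrefetcher:
    """Follows whichever strategy has predicted the most requests correctly so far."""

    def __init__(self, strategies):
        # strategies: name -> function(history) -> predicted next request
        self.strategies = strategies
        self.hits = {name: 0 for name in strategies}
        self.last_predictions = {}
        self.history = []

    def observe(self, actual):
        """Score each strategy's last prediction, then refresh all predictions."""
        for name, predicted in self.last_predictions.items():
            if predicted == actual:
                self.hits[name] += 1
        self.history.append(actual)
        self.last_predictions = {
            name: predict(self.history) for name, predict in self.strategies.items()
        }

    def predict(self):
        """Return the prediction of the currently best-scoring strategy."""
        best = max(self.hits, key=self.hits.get)
        return self.last_predictions.get(best)


def keep_direction(history):
    """Toy strategy: continue the last step (or step +1 if there is no step yet)."""
    if len(history) < 2:
        return history[-1] + 1
    return history[-1] + (history[-1] - history[-2])


def stay_put(history):
    """Toy strategy: expect the user to request the same position again."""
    return history[-1]


prefetcher = AdaptivePrefetcher({"direction": keep_direction, "stay": stay_put})
for request in [1, 2, 3, 4]:
    prefetcher.observe(request)
print(prefetcher.hits)
print(prefetcher.predict())
```

Scoring every strategy on every request costs one extra prediction per strategy, which is the kind of overhead an adaptive scheme has to keep small.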
Thank You
XmdvTool Homepage:
http://davis.wpi.edu/~xmdv
[email protected]
Code is free for research and education.
Contact author: [email protected]