IBM Research
Caching Dynamic Web Content:
Designing and Analyzing an Aspect-Oriented Solution
Sara Bouchenak – INRIA, France
Alan Cox – Rice University, Houston
Steven Dropsho – EPFL, Lausanne
Sumit Mittal – IBM Research, India
Willy Zwaenepoel – EPFL, Lausanne
© 2006 IBM Corporation
Dynamic Web Caching
Dynamic Web Content
[Figure: Client -> Internet -> Web server (web tier) -> Application server (business tier) -> Database server (database tier), showing HTTP request/response and SQL request/result flows and a cache]
Motivation for Caching
Dynamic content represents a large portion of web requests
– Stock quotes, bidding and buying status on an auction site, best-sellers in a bookstore
Generating it places a heavy burden on application servers
Caching Dynamic Web Content
Dynamic content is not easy to cache
– Ensure consistency, invalidate cached entries due to updates
• Write requests can modify entries used by read requests
– Caching logic inserted at different points in the application
• Entry and exit of requests, access to underlying database
• Correlation between requests and their database accesses
Most solutions rely on “manually” understanding complex application logic
Our Contributions
Design of a cache, “AutoWebCache”, that
• Ensures consistency of cached documents
• Inserts caching logic transparently to the application
– Makes use of aspect-oriented programming
Analysis of the cache
• Transparency of injecting caching logic
• Improvement in response time for test-bed applications
Dynamic Web Caching – Solution Approach
[Figure: the client / web server / application server / database server pipeline extended with AutoWebCache. Consistency: correlation between read and write requests. Transparency: capture the information flow (request info, database accesses). Caching logic: cache checks, cache inserts, and invalidations on the web page cache.]
Outline
Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache Structure
Aspectizing Web Caching
– Insertion of caching logic transparently
Evaluation
– Analysis of effectiveness, transparency
Conclusion
Maintaining Cache Consistency – Read Requests
Responses to read-only requests are cached
Read SQL queries are recorded with the cache entry
Index: URI (readHandlerName + readHandlerArgs)   Cached web page   Associated read queries
URI1                                             WebPage1          { Read Query 11, Read Query 12, … }
URI2                                             WebPage2          { Read Query 21, Read Query 22, … }
…                                                …                 …
Maintaining Cache Consistency – Write Requests
Results of write requests are not cached
Write SQL queries are recorded
Write SQL queries are intersected with the read queries of cached pages
A cached page is invalidated if the intersection is non-empty
[Figure: write set (WS) and read set (RS). Disjoint sets: no invalidation. Overlapping sets: invalidation.]
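The read and write handling above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the AutoWebCache implementation: the class name `PageCacheSketch`, its methods, and the pluggable `intersects` predicate are hypothetical, and queries are kept as plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Illustrative sketch only: names are hypothetical, not the AutoWebCache API.
class PageCacheSketch {
    // URI -> cached web page
    private final Map<String, String> pages = new HashMap<>();
    // URI -> read queries recorded while the page was generated
    private final Map<String, List<String>> readQueries = new HashMap<>();
    // (writeQuery, readQuery) -> true if the write may affect the read
    private final BiPredicate<String, String> intersects;

    PageCacheSketch(BiPredicate<String, String> intersects) {
        this.intersects = intersects;
    }

    // Cache check: returns null on a miss.
    String get(String uri) {
        return pages.get(uri);
    }

    // Cache insert: store the page together with its associated read queries.
    void add(String uri, String page, List<String> queries) {
        pages.put(uri, page);
        readQueries.put(uri, new ArrayList<>(queries));
    }

    // Invalidation: remove every page with a read query the write intersects.
    // Returns the number of pages removed.
    int invalidate(String writeQuery) {
        int removed = 0;
        Iterator<Map.Entry<String, List<String>>> it = readQueries.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, List<String>> entry = it.next();
            boolean affected = entry.getValue().stream()
                    .anyMatch(readQuery -> intersects.test(writeQuery, readQuery));
            if (affected) {
                pages.remove(entry.getKey());
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

Keeping the intersection test as a parameter mirrors the slides' separation between the cache and the query analysis engine; any of the three granularities could be plugged in.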
Invalidating Cache Entries
[Figure: a write query is compared with the associated read queries of each cache entry; entries it intersects are removed]
Index: URI (readHandlerName + readHandlerArgs)   Cached web page   Associated read queries
URI1                                             WebPage1          { Read Query 11, Read Query 12, … }
URI2                                             WebPage2          { Read Query 21, Read Query 22, … }
URI3                                             WebPage3          { Read Query 31, Read Query 32, … }
URIn
Query Analysis Engine
Determines intersection between SQL queries
Three levels of granularity for intersection
– Column based
– Value based
– Extra query based
Balance precision with complexity
Column Based Intersection
Invalidate if Column_Read = Column_Updated
Table T:
a    b    c
5    8    7
1    10   9
Read:  SELECT T.a FROM T WHERE T.b = 8
Write: UPDATE T SET T.c = 7 WHERE T.b = 10     (Ok)
Write: UPDATE T SET T.a = 12 WHERE T.b = 10    (Invalidate)
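A minimal Java sketch of the column-based test, assuming the queries have already been parsed into a table name and a set of column names. The class and method names are hypothetical, and only the selected columns are treated as read, following the slide's example.

```java
import java.util.Collections;
import java.util.Set;

// Illustrative sketch only: assumes queries are already parsed.
class ColumnBasedSketch {
    // Invalidate when both queries touch the same table and the write
    // updates at least one column that the read selects.
    static boolean mustInvalidate(String readTable, Set<String> columnsRead,
                                  String writeTable, Set<String> columnsUpdated) {
        if (!readTable.equals(writeTable)) {
            return false;
        }
        return !Collections.disjoint(columnsRead, columnsUpdated);
    }
}
```

For the slide's read query (selects `a`), an update of `c` leaves the cached page valid, while an update of `a` invalidates it regardless of which rows are touched.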
Value Based Intersection
Invalidate if Rows_Read = Rows_Updated
Table T:
a    b    c
5    8    7
1    10   9
Read:  SELECT T.a FROM T WHERE T.b = 8
Write: UPDATE T SET T.a = 7 WHERE T.b = 10     (Ok; column-based would invalidate)
Write: UPDATE T SET T.a = 12 WHERE T.b = 8     (Invalidate)
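The value-based test can be sketched in Java as below, assuming single-table queries whose predicates are a single equality on one column. The names are hypothetical; when the two predicates use different columns, the sketch cannot compare rows and invalidates conservatively.

```java
import java.util.Collections;
import java.util.Set;

// Illustrative sketch only: equality predicates on a single column are assumed.
class ValueBasedSketch {
    static boolean mustInvalidate(Set<String> columnsRead, String readPredColumn, int readPredValue,
                                  Set<String> columnsUpdated, String writePredColumn, int writePredValue) {
        // No shared column: the write cannot change what the read returned.
        if (Collections.disjoint(columnsRead, columnsUpdated)) {
            return false;
        }
        // Predicates on different columns cannot be compared by value alone,
        // so stay on the safe side and invalidate.
        if (!readPredColumn.equals(writePredColumn)) {
            return true;
        }
        // Same predicate column: the rows overlap only if the values match.
        return readPredValue == writePredValue;
    }
}
```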
Extra Query Based Intersection
Generate extra query to find missing values
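Table T:
a    b    c
5    8    7
1    10   9
Read:        SELECT T.a FROM T WHERE T.b = 8
Write:       UPDATE T SET T.a = 3 WHERE T.c = 9     (T.b of the updated rows unknown; value-based would invalidate)
Extra query: SELECT T.b FROM T WHERE T.c = 9        (returns 10, which differs from 8: Ok)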
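The extra-query step can be sketched in Java with an in-memory list of rows standing in for the database. Everything here is an assumption for illustration: the class, the `select` helper, and the restriction to equality predicates are not taken from AutoWebCache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: an in-memory table stands in for the database.
class ExtraQuerySketch {
    // Stand-in for: SELECT <column> FROM T WHERE <predColumn> = <predValue>
    static List<Integer> select(List<Map<String, Integer>> table,
                                String column, String predColumn, int predValue) {
        List<Integer> result = new ArrayList<>();
        for (Map<String, Integer> row : table) {
            if (Integer.valueOf(predValue).equals(row.get(predColumn))) {
                result.add(row.get(column));
            }
        }
        return result;
    }

    // The write's predicate uses a different column than the read's, so an
    // extra query fetches the read-predicate column of the rows being written.
    static boolean mustInvalidate(List<Map<String, Integer>> table,
                                  String readPredColumn, int readPredValue,
                                  String writePredColumn, int writePredValue) {
        List<Integer> affected = select(table, readPredColumn, writePredColumn, writePredValue);
        return affected.contains(readPredValue);
    }
}
```

With the slide's table, the extra query for `c = 9` yields `b = 10`, so the page cached for `b = 8` stays valid. The price is one additional database query per such write, which is the precision versus complexity trade-off the slides mention.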
Outline
Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache Structure
Aspectizing Web Caching
– Insertion of caching logic transparently
Evaluation
– Analysis of effectiveness, transparency
Conclusion
Dynamic Web Caching – Solution Approach
[Figure: the solution approach diagram repeated, highlighting transparency: AutoWebCache captures the information flow (request info, database accesses) and applies the caching logic (cache checks, cache inserts, invalidations) on the web page cache]
Aspect-Oriented Programming (AOP)
Modularize cross-cutting concerns - Aspects
– Logging, billing, exception handling
Works on three principles
– Capture the execution points of interest – Pointcuts (1)
• Method calls, exception points, read/write accesses
– Determine what to do at these pointcuts – Advice (2)
• Encode cross-cutting logic (before / after / around)
– Bind Pointcuts and Advice together – Weaving (3)
• AspectJ compiler for Java
Insertion of Caching Logic
[Figure: the original web application, the weaving rules, and the caching library are inputs to aspect weaving (AspectJ), which produces the cache-enabled version of the web application]
Aspectizing Read Requests
Original code of a read-only request handler, with the woven caching steps (capturing the handler's main method):

Capturing request entry: cache check
    String cachedDoc = Cache.get(uri, inputInfo);
    if (cachedDoc != null)
        return cachedDoc; // Cache hit

Capturing SQL queries: collecting dependency info
    // Execute SQL queries
    …
    SQL query 1
    SQL query 2
    …
    // (collect SQL query info)
    // Generate a web document
    webDoc = …

Capturing request exit: cache insert
    Cache.add(webDoc, uri, inputInfo, dependencyInfo); // Cache miss
    // Return the web document
    …
Aspectizing Write Requests
Original code of a write request handler, with the woven caching steps (capturing the handler's main method):

Capturing SQL queries: collecting invalidation info
    // Execute SQL queries
    …
    SQL query 1
    SQL query 2
    …
    // (collect SQL query info)

Capturing request exit: cache invalidation
    // Cache consistency
    Cache.remove(invalidationInfo);
    // Return
Capturing Servlet’s main Method
// Pointcut for Servlets’ main method
pointcut servletMainMethodExecution(...) :
    execution(
        void HttpServlet+.doGet(
            HttpServletRequest, HttpServletResponse))
    || execution(
        void HttpServlet+.doPost(
            HttpServletRequest, HttpServletResponse));
Pointcut captures entry and exit points of web request handlers
– Cache checks and inserts for read requests
– Invalidations for update requests
Weaving Rules for Cache Checks and Inserts
// Advice for read-only requests
// (around advice declares a return type; doGet/doPost return void)
void around(...) : servletMainMethodExecution (...) {
    // Pre-processing: Cache check
    String cachedDoc;
    cachedDoc = ... call Cache.get of AutoWebCache
    if (cachedDoc != null) {
        ... return cachedDoc
    }
    // Normal execution of the request
    proceed(...);
    // Post-processing: Cache insert
    ... call Cache.add of AutoWebCache
}
Weaving Rules for Cache Invalidations
// Advice for write requests
after(...) : servletMainMethodExecution (...) {
    // Cache invalidation
    ... call Cache.remove of AutoWebCache
}
Weaving Rules for Collecting Consistency Information
// Pointcut for SQL query calls
pointcut sqlQueryCall( ) :
    call(ResultSet PreparedStatement.executeQuery())
    || call(int PreparedStatement.executeUpdate());
// Advice for SQL query calls
after( ) : sqlQueryCall ( ) { ... collect consistency info ...}
After each SQL query, note
– Query template
– Query instance values
Transparency of AutoWebCache
Ability to Capture Information Flow
– Entry and exit points of request handlers
• e.g. doGet(), doPost() APIs for Java Servlets
– Modification to underlying data sets
• e.g. JDBC calls for SQL requests
– Multiple sources of dynamic behavior
• Currently handle dynamic behavior from SQL queries
• Need standard interfaces for all sources
Hidden State Problem
Request execution:
    …
    Number number = getRandom ( );
    Image img = getImage (number);
    displayImage (img);
    …
The request does not contain all the information needed to create the response
Occurs when the application uses random numbers, timers, etc.
Subsequent requests result in different responses
It is the developer's duty to declare such requests non-cacheable
Use of Application Semantics
Aspect-orientedness relies on code syntax
– Cannot capture semantic concepts
In the TPC-W application
– Best Seller requests allow dirty reads for 30 sec
– Conforms to specification clauses 3.1.4.1 and 6.3.3.1
Application semantics can be used to improve performance
– Best Seller cache entry time-out set to 30 sec
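A time-out of this kind can be sketched in Java as a cache entry that carries its own expiry time. The class name and methods are hypothetical, and the current time is passed in explicitly so the behavior is easy to check; a real cache would read a clock.

```java
// Illustrative sketch only: a cache entry that expires after a fixed time-out.
class TimedEntrySketch {
    private final String page;
    private final long expiresAtMillis;

    TimedEntrySketch(String page, long createdAtMillis, long timeToLiveMillis) {
        this.page = page;
        this.expiresAtMillis = createdAtMillis + timeToLiveMillis;
    }

    // Returns the cached page while the entry is fresh, or null once the
    // time-out has elapsed (the caller then regenerates the page).
    String getIfFresh(long nowMillis) {
        return nowMillis < expiresAtMillis ? page : null;
    }
}
```

With a 30-second time-out, a Best Seller page would be served from the cache even if an intervening write touched its data, which is the relaxation the slide describes.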
Outline
Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache Structure
Aspectizing Web Caching
– Insertion of caching logic transparently
Evaluation
– Analysis of effectiveness
Conclusion
Evaluation Environment
RUBiS
– Auction site based on eBay
– Browsing items, bidding, leaving comments etc.
– Large number of requests that can be satisfied quickly
TPC-W
– Models an on-line bookstore
– Listing new products, best-sellers, shopping cart etc.
– Small number of requests that are database intensive
Client Emulator
– Client browser emulator generates requests
– Average think time and session time conform to the TPC-W v1.8 specification
– Cache warmed for 15 min, statistics gathered over 30 min
Response Time for RUBiS – Bidding Mix
[Figure: response time (ms) versus number of clients (0 to 1000); series: No cache, AutoWebCache]
Relative Benefits for different Requests in RUBiS
[Figure: percent of requests (0 to 25) by request type, split into hits and misses]
Response Time for TPC-W – Shopping Mix
[Figure: response time (ms, axis ticks from 1 to 10000) versus number of clients (50 to 400); series: No cache, AutoWebCache, Optimization for Semantics]
Relative Benefits for different Requests in TPC-W
[Figure: percent of requests (0 to 25) by request type, split into hits, hits based on application semantics, and misses]
Implementation of AutoWebCache
Web application
Application   # Java classes   Java code size
TPC-W         46               12K lines
RUBiS         25               5.8K lines

AOP-based caching
Caching library                      Weaving rules
# Java classes   Java code size      # AspectJ files   Size of AspectJ code
13               4.6K lines          1                 150 lines
Conclusion
AutoWebCache - a cache that
• Ensures consistency of cached documents
– Query Analysis
• Inserts caching logic transparently to the application
– Makes use of aspect-oriented programming
Transparency of AutoWebCache
• Well-defined, standard interfaces for information flow
• Presence of hidden states
• Use of application semantics
Questions / Comments / Suggestions!
Thank You!!
SQL Query Structure
SELECT T.a FROM T WHERE T.b=10
– Column(s) selected: T.a
– Table concerned: T
– Predicate condition: T.b=10
UPDATE T SET T.c WHERE 20 < T.d < 35
– Column(s) updated: T.c
– Table concerned: T
– Predicate condition: 20 < T.d < 35
Response Time for RUBiS – Bidding Mix
[Figure: response time (ms) versus number of clients (0 to 1000); series: No cache, AC extra query, AC column based, Hand-coded, AC value based]
Response Time for TPC-W – Shopping Mix
[Figure: response time (ms, axis ticks from 1 to 10000) versus number of clients (0 to 450); series: No cache, AC extra query, AC column based, Hand-coded, AC value based]
Cache Structure in AutoWebCache
Page index
Index: URI (readHandlerName + readHandlerArgs)   Cached web page
URI1                                             WebPage1
URI2                                             WebPage2
…                                                …

Query index
Index: SQL string      <value vector, URI> pairs
ReadQueryTemplate1     <instance values1a, URI1>, <instance values1b, URI41>, <instance values1c, URI57>
ReadQueryTemplate2     <instance values2a, URI7>
ReadQueryTemplate3     <instance values3a, URI12>
…                      …

If a write query invalidates ReadQueryTemplate1 with instance values1a, the corresponding page (URI1) is removed
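The two indexes can be sketched in Java as nested maps. This is a simplified illustration with hypothetical names: value vectors are modeled as lists of strings, and the sketch does not clean up the other index entries of a page that has been removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: names and types are assumptions.
class DependencyIndexSketch {
    // Page index: URI -> cached web page
    private final Map<String, String> pages = new HashMap<>();
    // Query index: read query template -> (instance values -> URIs)
    private final Map<String, Map<List<String>, Set<String>>> byTemplate = new HashMap<>();

    // Record a page together with one read query (template plus instance values).
    void add(String uri, String page, String template, List<String> values) {
        pages.put(uri, page);
        byTemplate.computeIfAbsent(template, t -> new HashMap<>())
                  .computeIfAbsent(new ArrayList<>(values), v -> new HashSet<>())
                  .add(uri);
    }

    String get(String uri) {
        return pages.get(uri);
    }

    // Remove every page that depends on the given template instance.
    // Entries left under other templates for a removed page are harmless
    // here: removing an absent URI from the page index is a no-op.
    void invalidate(String template, List<String> values) {
        Map<List<String>, Set<String>> instances = byTemplate.get(template);
        if (instances == null) {
            return;
        }
        Set<String> uris = instances.remove(values);
        if (uris == null) {
            return;
        }
        for (String uri : uris) {
            pages.remove(uri);
        }
    }
}
```

Indexing by query template first means a write only has to be analyzed against each distinct template once, rather than against every cached page.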
Evaluation
Analysis of AutoWebCache
– Effect on performance of applications
– Relation of application semantics to cache efficiency
– Relative benefit of caching on different read-only requests
– Usefulness of AOP techniques in implementing the caching system
Breakdown of Response Times for Requests in RUBiS
[Figure: response time (ms, 0 to 350) by request type; bars show the overall average response time and the extra time for a miss on top of it]
Breakdown of Response Times for Requests in TPC-W
[Figure: response time (ms, 0 to 350) by request type; bars show the overall average response time and the extra time for a miss on top of it]
Key Aspect-Oriented Programming Concepts
“Join points” identify executable points in system
– Method calls, read and write accesses, invocations
“Pointcuts” allow capturing of various join points
“Advice” specifies actions to be performed at pointcuts
– Before or after the execution of a join point matched by a pointcut
– Encode the cross-cutting logic
Conclusion
Dynamic content is not easy to cache
– Ensure consistency, invalidate cached entries as a result of updates (AutoWebCache: query analysis)
– Caching logic inserted at different points in the application
• Entry and exit of requests, access to underlying database
– Most solutions rely on understanding complex application logic (AutoWebCache: transparent insertion of caching logic using AOP)
Transparency affected by
• Well-defined, standard interfaces for information flow
• Presence of hidden states
• Use of application semantics
Web Caching versus Query Caching
The two are complementary
Web caching is useful when the application server is the bottleneck
Documents can be cached nearer to the client, distributed
Application semantics can be used with web page caching (Best Seller for TPC-W)