IBM Research
Caching Dynamic Web Content:
Designing and Analyzing an Aspect-Oriented Solution
Sara Bouchenak – INRIA, France
Alan Cox – Rice University, Houston
Steven Dropsho – EPFL, Lausanne
Sumit Mittal – IBM Research, India
Willy Zwaenepoel – EPFL, Lausanne
© 2006 IBM Corporation
Dynamic Web Caching
Dynamic Web Content
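[Figure: request path through a three-tier web site, with a cache. Client -> Internet -> Web server (web tier) -> Application server (business tier) -> Database server (database tier). HTTP requests and HTTP responses flow between the client and the web server; SQL requests and SQL results flow between the application server and the database server.]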
Motivation for Caching
▪ Dynamic content represents a large portion of web requests
▪ Examples: stock quotes, bidding and buying status on an auction site, best-sellers in a bookstore
▪ Generating it places a huge burden on application servers
Caching Dynamic Web Content
▪ Dynamic content is not easy to cache
– Consistency must be ensured: cached entries are invalidated when updates occur
• Write requests can modify entries used by read requests
– Caching logic is inserted at different points in the application
• Entry and exit of requests, access to the underlying database
• Correlation between requests and their database accesses
Most solutions rely on “manually” understanding complex application logic
Our Contributions
▪ Design of a cache, “AutoWebCache”, that
• Ensures consistency of cached documents
• Inserts caching logic transparently to the application
– Makes use of aspect-oriented programming
▪ Analysis of the cache
• Transparency of injecting caching logic
• Improvement in response time for test-bed applications
Dynamic Web Caching – Solution Approach
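[Figure: the three-tier request path (Client, Internet, Web server, Application server, Database server) with AutoWebCache attached. Transparency: the caching logic captures the information flow, namely request info at the application server and database accesses at the database server. Consistency: correlation between read and write requests. AutoWebCache combines a cache check against the web page cache with caching logic that issues cache inserts and invalidations.]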
Outline
▪ Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache Structure
▪ Aspectizing Web Caching
– Insertion of caching logic transparently
▪ Evaluation
– Analysis of effectiveness, transparency
▪ Conclusion
Maintaining Cache Consistency – Read Requests
▪ Responses to read-only requests are cached
▪ Read SQL queries are recorded with the cache entry

Index: URI (readHandlerName + readHandlerArgs)    Cached web page    Associated read queries
URI1                                              WebPage1           { Read Query 11, Read Query 12, … }
URI2                                              WebPage2           { Read Query 21, Read Query 22, … }
…                                                 …                  …
Maintaining Cache Consistency – Write Requests
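▪ The result of a write request is not cached
▪ Write SQL queries are recorded
▪ Write SQL queries are intersected with the read queries of cached pages
▪ A cached page is invalidated if the intersection is non-empty
[Figure: two diagrams of a write set (WS) and a read set (RS). Disjoint sets: no invalidation. Overlapping sets: invalidation.]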
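
The bookkeeping on this slide can be sketched in a few lines of Java. This is only an illustration under assumptions: the class name `PageCacheSketch`, its methods, and the pluggable `intersects` predicate are hypothetical and are not AutoWebCache's actual API. Queries are kept as plain strings, and the decision whether a write query affects a read query is left to the caller.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical sketch: a page cache that remembers the read queries behind each entry.
final class PageCacheSketch {
    private static final class Entry {
        final String page;
        final List<String> readQueries;
        Entry(String page, List<String> readQueries) {
            this.page = page;
            this.readQueries = new ArrayList<>(readQueries);
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    // (writeQuery, readQuery) -> true if the write may affect the read (assumed to be supplied by the caller).
    private final BiPredicate<String, String> intersects;

    PageCacheSketch(BiPredicate<String, String> intersects) {
        this.intersects = intersects;
    }

    // Returns the cached page for this URI, or null on a miss.
    String get(String uri) {
        Entry e = entries.get(uri);
        return e == null ? null : e.page;
    }

    // Caches the response of a read-only request together with its read queries.
    void add(String uri, String page, List<String> readQueries) {
        entries.put(uri, new Entry(page, readQueries));
    }

    // Removes every page with a read query that intersects one of the write queries.
    // Returns the number of invalidated pages.
    int invalidate(List<String> writeQueries) {
        int removed = 0;
        Iterator<Entry> it = entries.values().iterator();
        while (it.hasNext()) {
            Entry e = it.next();
            boolean hit = writeQueries.stream()
                    .anyMatch(w -> e.readQueries.stream().anyMatch(r -> intersects.test(w, r)));
            if (hit) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

The predicate is deliberately left open here; a coarse one invalidates more pages than necessary, while a precise one costs more analysis per write.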
Invalidating Cache Entries
Index: URI (readHandlerName + readHandlerArgs)    Cached web page    Associated read queries
URI1                                              WebPage1           { Read Query 11, Read Query 12, … }
URI2                                              WebPage2           { Read Query 21, Read Query 22, … }
URI3                                              WebPage3           { Read Query 31, Read Query 32, … }
URIn
[Figure annotation: an incoming write query is compared with the associated read queries, and the matching entries are removed.]
Query Analysis Engine
▪ Determines the intersection between SQL queries
▪ Three levels of granularity for the intersection
– Column based
– Value based
– Extra query based
▪ Balances precision against complexity
Column Based Intersection
Invalidate if Column_Read = Column_Updated
Table T:
a    b    c
5    8    7
1    10   9

SELECT T.a FROM T WHERE T.b = 8
UPDATE T SET T.c = 7 WHERE T.b = 10     -> Ok
UPDATE T SET T.a = 12 WHERE T.b = 10    -> Invalidate
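
As a rough illustration of the column-based rule, the check below works on queries that are assumed to be already parsed into a table name and a set of column names. The class `ColumnIntersection` and its signature are hypothetical. Which columns count as "read" (selected columns only, or predicate columns as well) is not spelled out on the slide; the sketch simply uses whatever set the caller passes.

```java
import java.util.Set;

// Hypothetical sketch of the column-based test on pre-parsed queries.
final class ColumnIntersection {
    // True when the write touches the same table and at least one column that the read uses.
    static boolean invalidates(String readTable, Set<String> columnsRead,
                               String writeTable, Set<String> columnsUpdated) {
        if (!readTable.equals(writeTable)) {
            return false; // different tables cannot intersect
        }
        for (String column : columnsUpdated) {
            if (columnsRead.contains(column)) {
                return true; // Column_Read = Column_Updated
            }
        }
        return false;
    }
}
```

With the slide's example, a read of column `a` is left alone by an update of column `c` and invalidated by an update of column `a`, regardless of the predicates.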
Value Based Intersection
Invalidate if Rows_Read = Rows_Updated
Table T:
a    b    c
5    8    7
1    10   9

SELECT T.a FROM T WHERE T.b = 8
UPDATE T SET T.a = 7 WHERE T.b = 10     -> Ok (column-based would invalidate)
UPDATE T SET T.a = 12 WHERE T.b = 8     -> Invalidate
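
A simplified Java sketch of the value-based test follows. It assumes that both predicates are conjunctions of equality conditions, modelled as a map from column name to required value; the class `ValueIntersection` and this model are assumptions made for illustration only. When the two predicates do not constrain a common column, the sketch cannot prove the rows disjoint and invalidates conservatively.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the value-based test for equality-only predicates.
final class ValueIntersection {
    static boolean invalidates(Set<String> columnsRead, Map<String, ?> readPredicate,
                               Set<String> columnsUpdated, Map<String, ?> writePredicate) {
        // Step 1: the column-based test still has to succeed.
        boolean columnsOverlap = columnsUpdated.stream().anyMatch(columnsRead::contains);
        if (!columnsOverlap) {
            return false;
        }
        // Step 2: rows are provably disjoint if both predicates pin one column to different values.
        for (Map.Entry<String, ?> w : writePredicate.entrySet()) {
            Object readValue = readPredicate.get(w.getKey());
            if (readValue != null && !readValue.equals(w.getValue())) {
                return false;
            }
        }
        // Overlap cannot be ruled out: invalidate conservatively.
        return true;
    }
}
```

For the slide's example, the update with `T.b = 10` is accepted because the read requires `T.b = 8`, while the update with `T.b = 8` invalidates the cached page.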
Extra Query Based Intersection
Generate extra query to find missing values
Table T:
a    b    c
5    8    7
1    10   9

SELECT T.a FROM T WHERE T.b = 8
UPDATE T SET T.a = 3 WHERE T.c = 9      -> ?? (value-based would invalidate)
Extra query: SELECT T.b FROM T WHERE T.c = 9    -> Ok (the updated row has T.b = 10, not 8)
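
The extra-query step can be sketched as follows. To keep the example self-contained, the database is replaced by an in-memory list of rows (column name to value); a real implementation would presumably issue the extra SELECT through JDBC before the update runs. The class `ExtraQueryIntersection` and its parameters are hypothetical, and only single-column equality predicates are handled.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of the extra-query test, with an in-memory table standing in for the database.
final class ExtraQueryIntersection {
    // Read predicate:  readPredCol = readPredVal
    // Write predicate: writePredCol = writePredVal
    static boolean invalidates(List<Map<String, Integer>> table,
                               String readPredCol, int readPredVal,
                               String writePredCol, int writePredVal) {
        // Extra query: SELECT readPredCol FROM T WHERE writePredCol = writePredVal
        return table.stream()
                .filter(row -> Objects.equals(row.get(writePredCol), writePredVal))
                // Invalidate only if an updated row is also selected by the read predicate.
                .anyMatch(row -> Objects.equals(row.get(readPredCol), readPredVal));
    }
}
```

The extra precision costs one additional query per write, which is the trade-off between precision and complexity that the query analysis engine has to balance.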
Outline
▪ Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache Structure
▪ Aspectizing Web Caching
– Insertion of caching logic transparently
▪ Evaluation
– Analysis of effectiveness, transparency
▪ Conclusion
Dynamic Web Caching – Solution Approach
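[Figure: the three-tier request path (Client, Internet, Web server, Application server, Database server) with AutoWebCache attached, highlighting transparency. The caching logic captures the information flow, namely request info at the application server and database accesses at the database server. AutoWebCache combines a cache check against the web page cache with caching logic that issues cache inserts and invalidations.]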
Aspect-Oriented Programming (AOP)
▪ Modularizes cross-cutting concerns as aspects
– Logging, billing, exception handling
▪ Works on three principles
– Capture the execution points of interest – Pointcuts (1)
• Method calls, exception points, read/write accesses
– Determine what to do at these pointcuts – Advice (2)
• Encode cross-cutting logic (before / after / around)
– Bind pointcuts and advice together – Weaving (3)
• AspectJ compiler for Java
Insertion of Caching Logic
[Figure: the original web application, the weaving rules (aspect), and the caching library are the inputs to aspect weaving (AspectJ), which produces a cache-enabled version of the web application.]
Aspectizing Read Requests
Original code of a read-only request handler, with the woven caching steps marked in brackets:

[Capturing request entry: cache check]
String cachedDoc = Cache.get(uri, inputInfo);
if (cachedDoc != null)
    return cachedDoc; // Cache hit

[Capture main: original handler code]
// Execute SQL queries
…
SQL query 1
SQL query 2
…
[Capturing SQL queries: collect SQL query info as dependency info]
// Generate a web document
webDoc = …

[Capturing request exit: cache insert]
Cache.add(webDoc, uri, inputInfo, dependencyInfo); // Cache miss
// Return the web document
…
Aspectizing Write Requests
Original code of a write request handler, with the woven caching steps marked in brackets:

[Capture main: original handler code]
// Execute SQL queries
…
SQL query 1
SQL query 2
…
[Capturing SQL queries: collect SQL query info as invalidation info]
…

[Capturing request exit: cache invalidation]
// Cache consistency
Cache.remove(invalidationInfo);
// Return
Capturing Servlet’s main Method
// Pointcut for Servlets’ main method
pointcut servletMainMethodExecution(...) :
    execution(void HttpServlet+.doGet(HttpServletRequest, HttpServletResponse))
    || execution(void HttpServlet+.doPost(HttpServletRequest, HttpServletResponse));

The pointcut captures the entry and exit points of web request handlers
– Cache checks and inserts for read requests
– Invalidations for update requests
Weaving Rules for Cache Checks and Inserts
// Advice for read-only requests
around(...) : servletMainMethodExecution(...) {
    // Pre-processing: Cache check
    String cachedDoc;
    cachedDoc = ... call Cache.get of AutoWebCache
    if (cachedDoc != null) {
        ... return cachedDoc
    }
    // Normal execution of the request
    proceed(...);
    // Post-processing: Cache insert
    ... call Cache.add of AutoWebCache
}
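
The control flow of this around advice (check, proceed, insert) can be mimicked in plain Java with a wrapper function. The sketch below does not use AspectJ and is not the woven code; `AroundCacheSketch`, `withCache`, and the modelling of a handler as a function from URI to page are assumptions chosen to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical plain-Java analogue of the around advice for read-only requests.
final class AroundCacheSketch {
    // Wraps a read-only handler (URI -> web page) with a cache check and a cache insert.
    static Function<String, String> withCache(Function<String, String> handler,
                                              Map<String, String> cache) {
        return uri -> {
            // Pre-processing: cache check
            String cachedDoc = cache.get(uri);
            if (cachedDoc != null) {
                return cachedDoc; // cache hit: the original handler is skipped
            }
            // Normal execution of the request (the role of proceed(...))
            String webDoc = handler.apply(uri);
            // Post-processing: cache insert
            cache.put(uri, webDoc);
            return webDoc;
        };
    }

    public static void main(String[] args) {
        Function<String, String> handler = uri -> "page for " + uri;
        Function<String, String> cached = withCache(handler, new HashMap<>());
        System.out.println(cached.apply("/bestSellers")); // miss: handler runs
        System.out.println(cached.apply("/bestSellers")); // hit: served from the cache
    }
}
```

The difference from the aspect-based version is that here the caller has to wrap each handler explicitly, whereas weaving applies the same logic to every matching servlet without touching its source.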
Weaving Rules for Cache Invalidations
// Advice for write requests
after(...) : servletMainMethodExecution(...) {
    // Cache invalidation
    ... call Cache.remove of AutoWebCache
}
Weaving Rules for Collecting Consistency Information
// Pointcut for SQL query calls
pointcut sqlQueryCall() :
    call(ResultSet PreparedStatement.executeQuery())
    || call(int PreparedStatement.executeUpdate());
// Advice for SQL query calls
after() : sqlQueryCall() { ... collect consistency info ... }
After each SQL query, note
– Query template
– Query instance values
Transparency of AutoWebCache
▪ Ability to Capture Information Flow
– Entry and exit points of request handlers
• e.g. doGet(), doPost() APIs for Java Servlets
– Modification to underlying data sets
• e.g. JDBC calls for SQL requests
– Multiple sources of dynamic behavior
• Currently handle dynamic behavior from SQL queries
• Need standard interfaces for all sources
Hidden State Problem
[Code executed during the request:]
…
Number number = getRandom();
Image img = getImage(number);
displayImage(img);
…
▪ The request does not contain all the information needed to create the response
▪ Occurs when the application uses random numbers, timers, etc.
▪ Subsequent identical requests result in different responses
▪ It is the developer's duty to declare such requests non-cacheable
Use of Application Semantics
▪ Aspect orientation relies on code syntax
– Cannot capture semantic concepts
▪ In the TPC-W application
– Best Seller requests allow dirty reads for 30 sec
– Conforms to specification clauses 3.1.4.1 and 6.3.3.1
▪ Application semantics can be used to improve performance
– Best Seller cache entry time-out set to 30 sec
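
A time-out of this kind can be sketched as a cache whose entries carry an expiry time. The class `TimedCacheSketch` and its methods are hypothetical and only illustrate the idea; the current time is passed in explicitly so that the behaviour can be checked without waiting.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cache entries that expire after a fixed time-to-live.
final class TimedCacheSketch {
    private static final class Entry {
        final String page;
        final long expiresAtMillis;
        Entry(String page, long expiresAtMillis) {
            this.page = page;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Stores a page that stays valid for ttlMillis after nowMillis.
    void add(String uri, String page, long ttlMillis, long nowMillis) {
        entries.put(uri, new Entry(page, nowMillis + ttlMillis));
    }

    // Returns the page if it has not expired yet, otherwise drops it and returns null.
    String get(String uri, long nowMillis) {
        Entry e = entries.get(uri);
        if (e == null) {
            return null;
        }
        if (nowMillis >= e.expiresAtMillis) {
            entries.remove(uri);
            return null;
        }
        return e.page;
    }
}
```

With a 30-second time-to-live, a Best Seller page added at time 0 would be served until just before 30 000 ms, even if intervening writes would otherwise have invalidated it.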
Outline
▪ Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache Structure
▪ Aspectizing Web Caching
– Insertion of caching logic transparently
▪ Evaluation
– Analysis of effectiveness
▪ Conclusion
Evaluation Environment
▪ RUBiS
– Auction site based on eBay
– Browsing items, bidding, leaving comments etc.
– Large number of requests that can be satisfied quickly
▪ TPC-W
– Models an on-line bookstore
– Listing new products, best-sellers, shopping cart etc.
– Small number of requests that are database intensive
▪ Client Emulator
– Client browser emulator generates requests
– Average think time and session time conform to the TPC-W v1.8 specification
– Cache warmed for 15 min, statistics gathered over 30 min
Response Time for RUBiS – Bidding Mix
[Figure: response time (ms), 0 to 140, versus number of clients, 0 to 1000, for two configurations: No cache and AutoWebCache.]
Relative Benefits for different Requests in RUBiS
[Figure: percent of requests (0 to 25) per request type, split into hits and misses.]
Response Time for TPC-W – Shopping Mix
[Figure: response time (ms), logarithmic axis from 1 to 10000, versus number of clients, 50 to 400, for three configurations: No cache, AutoWebCache, and Optimization for Semantics.]
Relative Benefits for different Requests in TPC-W
[Figure: percent of requests (0 to 25) per request type, split into hits, hits based on app. semantics, and misses.]
Implementation of AutoWebCache
Web application
Application    # Java classes    Java code size
TPC-W          46                12K lines
RUBiS          25                5.8K lines

Caching library
# Java classes    Java code size
13                4.6K lines

AOP-based caching
# AspectJ files (weaving rules)    Size of AspectJ code
1                                  150 lines
Conclusion
▪ AutoWebCache: a cache that
• Ensures consistency of cached documents
– Query analysis
• Inserts caching logic transparently to the application
– Makes use of aspect-oriented programming
▪ Transparency of AutoWebCache depends on
• Well-defined, standard interfaces for information flow
• Presence of hidden states
• Use of application semantics
Questions / Comments / Suggestions !
Thank You!!
SQL Query Structure
SELECT T.a FROM T WHERE T.b=10
– Column(s) selected: T.a
– Table concerned: T
– Predicate condition: T.b=10
UPDATE T SET T.c WHERE 20 < T.d < 35
– Column(s) updated: T.c
– Table concerned: T
– Predicate condition: 20 < T.d < 35
Response Time for RUBiS – Bidding Mix
[Figure: response time (ms), 0 to 140, versus number of clients, 0 to 1000, for five configurations: No cache, AC extra query, AC column based, AC value based, and Hand-coded.]
Response Time for TPC-W – Shopping Mix
[Figure: response time (ms), logarithmic axis from 1 to 10000, versus number of clients, 0 to 450, for five configurations: No cache, AC extra query, AC column based, AC value based, and Hand-coded.]
Cache Structure in AutoWebCache
Page table
Index: URI (readHandlerName + readHandlerArgs)    Cached web page
URI1                                              WebPage1
URI2                                              WebPage2
…                                                 …

Query table
Index: SQL string       <value vector, URI> pairs
ReadQueryTemplate1      <instance values1a, URI1>, <instance values1b, URI41>, <instance values1c, URI57>
ReadQueryTemplate2      <instance values2a, URI7>
ReadQueryTemplate3      <instance values3a, URI12>
…                       …

Remove: if a write query invalidates ReadQueryTemplate1 with instance values1a, the page cached under URI1 is removed.
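
The two-table layout can be sketched with two hash maps: one from URI to page, one from query template to its list of <value vector, URI> pairs. The names below (`TwoIndexCacheSketch`, `addPage`, `addDependency`, `invalidate`) are hypothetical, and matching value vectors by equality is a simplification; in AutoWebCache the decision would come from the query analysis.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a page table plus a query-template table.
final class TwoIndexCacheSketch {
    private static final class Dependency {
        final List<?> values; // instance values bound to the query template
        final String uri;     // page that depends on this query instance
        Dependency(List<?> values, String uri) {
            this.values = values;
            this.uri = uri;
        }
    }

    private final Map<String, String> pagesByUri = new HashMap<>();
    private final Map<String, List<Dependency>> depsByTemplate = new HashMap<>();

    void addPage(String uri, String page) {
        pagesByUri.put(uri, page);
    }

    void addDependency(String template, List<?> values, String uri) {
        depsByTemplate.computeIfAbsent(template, t -> new ArrayList<>())
                .add(new Dependency(values, uri));
    }

    String get(String uri) {
        return pagesByUri.get(uri);
    }

    // Removes the pages that depend on the given template with the given instance values.
    // Dependencies of a removed page under other templates are left behind; they are
    // harmless because removing an absent URI later is a no-op.
    void invalidate(String template, List<?> values) {
        List<Dependency> deps = depsByTemplate.get(template);
        if (deps == null) {
            return;
        }
        deps.removeIf(d -> {
            if (!d.values.equals(values)) {
                return false;
            }
            pagesByUri.remove(d.uri);
            return true;
        });
    }
}
```

Indexing by template means a write query only has to be compared against the templates it can affect, not against every cached page.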
Evaluation
▪ Analysis of AutoWebCache
– Effect on performance of applications
– Relation of application semantics to cache efficiency
– Relative benefit of caching on different read-only requests
– Usefulness of AOP techniques in implementing the caching system
Breakdown of Response Times for Requests in RUBiS
[Figure: response time (ms), 0 to 350, per request type; each bar shows the overall average response time and, on top of it, the extra time for a miss.]
Breakdown of Response Times for Requests in TPC-W
[Figure: response time (ms), 0 to 350, per request type; each bar shows the overall average response time and, on top of it, the extra time for a miss.]
Key Aspect-Oriented Programming Concepts
▪ “Join points” identify executable points in the system
– Method calls, read and write accesses, invocations
▪ “Pointcuts” allow capturing of various join points
▪ “Advice” specifies actions to be performed at pointcuts
– Before, after, or around the join points matched by a pointcut
– Encodes the cross-cutting logic
Conclusion
▪ Dynamic content is not easy to cache
– Ensure consistency, invalidate cached entries as a result of updates
▪ AutoWebCache – Query Analysis
– Caching logic inserted at different points in the application
• Entry and exit of requests, access to underlying database
– Most solutions rely on understanding complex application logic
▪ AutoWebCache – Transparent insertion of caching logic using AOP
▪ Transparency affected by
• Well-defined, standard interfaces for information flow
• Presence of hidden states
• Use of application semantics
Web Caching versus Query Caching
▪ The two are complementary
▪ Web caching is useful when the application server is the bottleneck
▪ Documents can be cached nearer to the client and distributed
▪ Web page caching can make use of application semantics (Best Seller for TPC-W)