PPT - CSE, IIT Bombay

Web Caching
By
Neeraj Agrawal
Caching

Caching is widely used for improving
performance in many contexts (e.g.,
processor caches in hardware, buffer pools
in a DBMS).



Where and what to cache in web context?
Many points and many kinds of objects!
Focus is on Transactional/Database apps.
What, Where, When and How
Considerations for caching in web context
are:



What, when and where to cache
Granularity of caching: web pages, fragments of
pages, servlet execution results, SQL query results, etc.
Location of cache: client, proxy, edge of the net,
internet service provider (ISP), edge of enterprise,
application server (EJBs etc.), web server, database
server.
What, Where, When and How

Caching and invalidation policies: pull vs.
push, freshness maintenance, triggers.

Related DB Technologies:




Replication
Materialized views
Mediator systems
Buffer management.
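The pull vs. push distinction above can be illustrated with a minimal sketch. It assumes a TTL as the pull-side freshness rule and an explicit `invalidate` call standing in for a database trigger on the push side; the class names, the `fetch` callback, and the fake clock are all hypothetical.

```python
import time


class PullCache:
    """Pull-based freshness: an entry is trusted until its TTL expires."""

    def __init__(self, ttl_seconds, fetch, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.fetch = fetch          # callable that reads from the origin
        self.clock = clock
        self.entries = {}           # key -> (value, time stored)

    def get(self, key):
        hit = self.entries.get(key)
        if hit is not None and self.clock() - hit[1] < self.ttl:
            return hit[0]           # still considered fresh, may be stale
        value = self.fetch(key)     # expired or missing: pull from origin
        self.entries[key] = (value, self.clock())
        return value


class PushCache:
    """Push-based invalidation: the origin tells the cache what changed."""

    def __init__(self, fetch):
        self.fetch = fetch
        self.entries = {}

    def get(self, key):
        if key not in self.entries:
            self.entries[key] = self.fetch(key)
        return self.entries[key]

    def invalidate(self, key):
        # Stand-in for a trigger fired by the data source on update.
        self.entries.pop(key, None)


origin = {"price": 10}
now = [0.0]                         # fake clock so the example is deterministic

pull = PullCache(ttl_seconds=60, fetch=origin.__getitem__, clock=lambda: now[0])
push = PushCache(fetch=origin.__getitem__)
pull.get("price")
push.get("price")

origin["price"] = 12                # the underlying data changes
push.invalidate("price")            # push side is notified immediately

stale = pull.get("price")           # pull side still serves the old value
fresh = push.get("price")           # push side refetches
now[0] = 61.0                       # TTL elapses
refreshed = pull.get("price")       # pull side finally refetches
```

The pull cache is simpler but serves stale data until the TTL runs out; the push cache stays fresh at the cost of coupling the data source to every cache that must be notified.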
Common Points of Caching






Browser
Proxy (Forward proxy cache)
Enterprise/ISP Edge Servers
Web Server
Application Server
Databases
Cache Models

Front End Cache




Caches data + markup
Cache can be app independent
Static pages easily cached
Data Cache


Caches data
Effectiveness depends on app
design
Cache Models

Distributed Applications


Multiple copies
distributed around the net
Turns caching into a
content management
problem.
HTTP Caching Today


Multiple caches between browser and server
HTTP headers control:



Whether a page can be cached
Cache expiration time.
Full pages and images can be cached.

Unable to cache HTML fragments
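As a rough illustration of how HTTP headers control cacheability and expiration, the sketch below reads a `Cache-Control` header the way a shared (proxy) cache might. It is deliberately simplified: it ignores `Expires`, heuristic freshness, and validators, and the function name `freshness_lifetime` is hypothetical.

```python
def freshness_lifetime(headers):
    """Return how many seconds a shared cache may reuse a response.

    Returns None when the response should not be stored by a shared cache
    or when no explicit lifetime is given (simplifying assumption: real
    caches may fall back to Expires or heuristics).
    """
    directives = {}
    for part in headers.get("Cache-Control", "").split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip()] = value.strip().strip('"')

    # no-store forbids storing; private restricts storage to the browser cache.
    if "no-store" in directives or "private" in directives:
        return None
    # no-cache allows storing but requires revalidation before every reuse.
    if "no-cache" in directives:
        return 0
    # For shared caches, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(0, int(directives[name]))
            except ValueError:
                return None
    return None


one_hour = freshness_lifetime({"Cache-Control": "public, max-age=3600"})
shared = freshness_lifetime({"Cache-Control": "max-age=60, s-maxage=600"})
personalized = freshness_lifetime({"Cache-Control": "private, max-age=60"})
```

Note that the decision applies to the whole response, which is why a page that mixes cacheable and personalized fragments cannot be partially cached at this level.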
Dynamically Generated Pages

Increasingly common due to:




Database-centric e-commerce applications
Frequently updated content
Personalization
Proxy caching is ineffective for such pages
Caching Data-intensive Web Sites





Relies on a declarative specification of web sites
Data content is extracted from the DBMS
Website structure and content are separate
from the graphical representation (XSLT).
The mapping between the raw data and the logical
model of the web site is described by a declarative
language (WeaveL)
The HTML page is generated from XML and XSLT
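The three stages described above (raw data, intermediate XML, final HTML) can be sketched as a small pipeline. This is only an illustration of the separation between content and presentation: the function names and the product data are hypothetical, a plain Python function stands in for the XSLT stylesheet, and no attempt is made to reproduce WeaveL.

```python
import xml.etree.ElementTree as ET
from html import escape


def rows_to_xml(rows):
    """Content stage: map raw database rows to a logical XML fragment."""
    root = ET.Element("products")
    for name, price in rows:
        item = ET.SubElement(root, "product", {"price": f"{price:.2f}"})
        item.text = name
    return root


def xml_to_html(root):
    """Presentation stage: stand-in for an XSLT stylesheet."""
    items = "".join(
        f"<li>{escape(p.text)}: {escape(p.get('price'))}</li>"
        for p in root.findall("product")
    )
    return f"<ul>{items}</ul>"


rows = [("pen", 1.5), ("ink & nib", 4.0)]   # pretend these came from the DBMS
fragment = rows_to_xml(rows)
page = xml_to_html(fragment)
```

Each stage boundary (query result, XML fragment, HTML page) is a possible place to keep a cached copy.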
Materialization Strategies





What kind of data must be materialized?
When must materialization be performed?
Where should the materialized intermediate
results be placed for effective performance
improvement?
How are updates propagated to the
materialized data?
Which particular data must be materialized and
which must be computed upon request?
Approaches

Materialization (HTML)





Good response time
High space overhead
Propagating updates to materialized pages is difficult
Not always possible (forms)
Materialization granularity is not always
appropriate
Approaches

Cache DBMS query results




Significant performance improvement
Simple update propagation
Granularity
Cache intermediate XML



Complex update propagation
Granularity
Eliminate cost of database connection
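A minimal sketch of the "cache DBMS query results" approach follows, using an in-memory SQLite database. The invalidation rule (drop every cached result that reads a table when that table is updated) is an assumption chosen to show why update propagation is simple at this level; the class `QueryResultCache` and its methods are hypothetical.

```python
import sqlite3


class QueryResultCache:
    """Caches SELECT results and invalidates them per table on update."""

    def __init__(self, conn):
        self.conn = conn
        self.results = {}    # (sql, params) -> list of rows
        self.by_table = {}   # table name -> keys of results that read it
        self.db_hits = 0     # how often the database was actually queried

    def query(self, sql, params=(), tables=()):
        key = (sql, tuple(params))
        if key not in self.results:
            self.db_hits += 1
            self.results[key] = self.conn.execute(sql, params).fetchall()
            for table in tables:
                self.by_table.setdefault(table, set()).add(key)
        return self.results[key]

    def update(self, sql, params=(), table=None):
        self.conn.execute(sql, params)
        self.conn.commit()
        # Coarse-grained propagation: forget everything that read this table.
        for key in self.by_table.pop(table, set()):
            self.results.pop(key, None)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL)")
conn.execute("INSERT INTO product VALUES (1, 9.5)")

cache = QueryResultCache(conn)
select_price = "SELECT price FROM product WHERE id = ?"

first = cache.query(select_price, (1,), tables=("product",))
second = cache.query(select_price, (1,), tables=("product",))  # from cache
hits_before_update = cache.db_hits

cache.update("UPDATE product SET price = ? WHERE id = ?", (12.0, 1),
             table="product")
third = cache.query(select_price, (1,), tables=("product",))   # refetched
```

Table-level invalidation is easy to get right but coarse: one update discards every cached result that touched the table, which is the granularity trade-off the slide points to.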
WeaveL
XML Fragment
Weave Web Site Management System
Runtime Policies
Conclusion

Results

Better performance with mixed caching