Dynamic-Content Web Caching using Cooperative Proxy Scheme


Dynamic-Content Web Caching
with
Cooperative Proxy Scheme
Βελισκάκης Μανώλης
National Technical University of Athens
Dept. of Electrical & Computer Engineering
Knowledge and Database Systems Laboratory
DBLAB Meeting
Tuesday, 20 January 2004
Outline
• Problem Definition
• Dynamic-Data Web Caching vs Cooperative Schemes
• Proposed Web Caching Algorithm
• Current and Future Work
• Discussion

Problem Definition – What?
• Query Results
• Dynamic Data for personalization purposes
Problem Definition – Where?
• Client
• Proxy
• Edge-of-net
• Internet Service Provider
• Edge-of-Enterprise
• Application Server
• Web Server
• DBMS

Problem Definition – How?
Current Approaches
• Exact matching query
• Materialized Views
• DB Characteristics to Proxies
Problem Definition – Topology Scheme
• Broadcast queries
• Hierarchical Caching
• URL Hashing
• Directory based Cooperation

Problem Definition - Issues
• Replacement Policy
• Cache Consistency
• Proxy Communication
• Web objects placement

Dynamic-Data Web Caching vs Cooperative Schemes
[Diagram relating the three groups below]
Dynamic-data caching approaches:
• Exact matching query
• Materialized Views
• DB Characteristics to Proxies
Issues:
• Replacement Policy
• Proxy Communication
• Web objects placement
Cooperative schemes:
• Broadcast queries
• Hierarchical Caching
• URL Hashing
• Directory based Cooperation
Dynamic-Data Web Caching vs Cooperative Schemes
Conclusions (?)
• Exact Matching Query
– Common Web Caching Issues
– Not interesting
• DB Characteristics to Proxies
– Common DB Replication Issues
– Interesting Issue: create Cache Tables knowing that there is a cooperative proxy scheme
Dynamic-Data Web Caching vs Cooperative Schemes
Conclusions (?)
• Materialized Views
– Many interesting issues
  • Query rewriting
  • Replacement Algorithm
  • Appropriate Cooperative Scheme
  • Web Objects exchange between Proxies
  • Consideration of DBMS structure
  • Dynamic or a priori definition of Materialized Views
  • Giving DB capabilities to Proxies (queries on Materialized Views)
  • Communication between Proxies
Proposed Web Caching Algorithm – Hybrid Topology (Hierarchical-Directory Based)
[Figure: hybrid topology. Labels recovered from the slide: CLIENTS (two groups); PROXY 1a, 1b, 1c and PROXY 2a, 2b, 2c, each with Q.M, CACHE and DIRECTORY components; WEB SERVER (two); DATABASE SERVER (two).]
Proposed Web Caching Algorithm – Web Objects description
• There are 3 different ways to refer to a Web Object
– URL
– QTag
– QTag + Query Result (Whole Web Object)
Proposed Web Caching Algorithm – Web Objects description
QTAG
<QTag
ID:Number, //Unique identifier for every QTag
Query:String, //Contains the query that has been asked to the Back-End Database
LocationOfWebServer:URL, //Contains the URL Location of the Web Server that stands in front of the Database
DatabaseID:Number, //Contains the ID of the Database where the query was asked
TimeToLive:Number (sec), //Determines the period in which the query is valid and can satisfy Requests
Weight:Number, //Determines the significance (Weight) of the query
Relationships:List of QTag.ID //Determines a list of Web Objects that are frequently used with the current Web Object in order to satisfy query requests
/>
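The QTag fields listed above could be modelled as a small record type. The following is a minimal sketch, not the authors' implementation: the Python names, the `created_at` field and the `is_valid` helper are assumptions introduced here, since the slide does not say when the TimeToLive period starts.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QTag:
    """Descriptor of a cached query result, mirroring the fields on the slide."""
    id: int                      # unique identifier of the QTag
    query: str                   # query sent to the back-end database
    location_of_web_server: str  # URL of the web server in front of the database
    database_id: int             # ID of the database the query was sent to
    time_to_live: float          # validity period in seconds
    weight: float = 0.0          # significance of the query
    relationships: List[int] = field(default_factory=list)  # IDs of related QTags
    # Assumption: the TTL clock starts when the QTag object is created.
    created_at: float = field(default_factory=time.time)

    def is_valid(self, now: Optional[float] = None) -> bool:
        """Return True while the query result may still satisfy requests."""
        if now is None:
            now = time.time()
        return now - self.created_at < self.time_to_live
```

A proxy could call `is_valid()` before answering a request from its cache and treat an expired QTag as a miss.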
Proposed Web Caching Algorithm – Web Objects description
QTAG + Query Results
<QTag
ID=1029384,
Query="Select name, surname, age from Customers where Name='John'",
LocationOfWebServer="http://www.dblab.ece.ntua.gr/siteNo1",
DatabaseID=1,
TimeToLive=1000,
Weight=0.65,
Relationships="15433456, 15433766, 15682456, 15432456"
/>
Query Result:
John, Manolopoulos, 28
John, Nikolaidis, 35
.
.
.
John, Fissas, 40
Proposed Web Caching Algorithm – Proxy Structure
[Figure: proxy structure. Components: URL/QTag Transformer, Query Rewriter, Cache Directory, Main Cache, Weight Calculator, Cooperative Scheme Directory; the proxy connects to the rest of the cooperative scheme.]
Proposed Web Caching Algorithm – Proxy Structure – URL/QTag Transformer
• Proxies manipulate Web Objects (Query Results) through their <QTags>
• Extracts from a Web Object's URL the
– Query (knowing the CGI that produces the Query Result)
– LocationOfWebServer
– DatabaseID
• 1-1 correspondence between URLs and QTags
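The extraction step could look roughly like the sketch below. The slides do not specify the URL layout, so the parameter names `query` and `db` are hypothetical, as is deriving the QTag ID from a hash of the three extracted fields so that the same URL always maps to the same QTag.

```python
import hashlib
from urllib.parse import parse_qs, urlsplit


def url_to_qtag_fields(url: str) -> dict:
    """Extract the QTag fields carried by a CGI-style URL (assumed layout)."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    query = params["query"][0]          # hypothetical parameter name
    database_id = int(params["db"][0])  # hypothetical parameter name
    location = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # Assumption: a hash of the fields serves as the QTag ID.
    key = f"{location}|{database_id}|{query}".encode("utf-8")
    qtag_id = int(hashlib.sha1(key).hexdigest()[:12], 16)
    return {
        "id": qtag_id,
        "query": query,
        "location_of_web_server": location,
        "database_id": database_id,
    }
```

With this layout, two URLs that differ only in parameter order produce the same QTag, so a real transformer would need to normalize URLs first to keep the correspondence strictly one-to-one.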
Proposed Web Caching Algorithm – Proxy Structure – Query Rewriter
• Rewrites the requested Web Objects (queries) when no exact match of the requested query is cached but the query can be satisfied from other, already cached Web Objects (queries).
• The Query Rewriter will follow standard query-rewriting methods and techniques already used in database systems and environments.
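As an illustration of answering a query from a cached, more general one, here is a minimal sketch restricted to single-table queries with equality conditions. It is a simplification chosen for this example, not the rewriting method of the proposal; `CachedQuery` and `answer_from_cache` are hypothetical names.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


class CachedQuery:
    """A cached query result: selected columns, equality conditions, rows."""

    def __init__(self, columns: List[str], conditions: Dict[str, object],
                 rows: List[Row]):
        self.columns = columns
        self.conditions = conditions
        self.rows = rows


def answer_from_cache(columns: List[str], conditions: Dict[str, object],
                      cached: CachedQuery) -> Optional[List[Row]]:
    """Answer the request from `cached`, or return None if that is impossible."""
    # The cached query must be at least as general as the request: every
    # condition it applied must appear in the request with the same value.
    for col, value in cached.conditions.items():
        if col not in conditions or conditions[col] != value:
            return None
    # Conditions the cached query did not apply must be checked locally,
    # so their columns have to be present in the cached rows.
    extra = {c: v for c, v in conditions.items() if c not in cached.conditions}
    needed = set(columns) | set(extra)
    if not needed.issubset(cached.columns):
        return None
    return [
        {c: row[c] for c in columns}
        for row in cached.rows
        if all(row[c] == v for c, v in extra.items())
    ]
```

For example, a cached result for name = 'John' with columns name, surname and age can answer a request for the surnames of customers named John aged 35, but not a request for customers named Mary.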
Proposed Web Caching Algorithm – Proxy Structure – Weight Calculator
Every web object will be characterized by a weight W, determined by the following factors:
• S: determined by the web object's size
• Πs: the influence percentage of factor S on the weight
• CS: determined by the web object's retrieval cost
• Πcs: the influence percentage of factor CS on the weight
• P: determined by the web object's popularity
• Πp: the influence percentage of factor P on the weight
• R: determined by the web object's significance with respect to its relationships
• Πr: the influence percentage of factor R on the weight
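The slides name the factors but do not give the formula that combines them. One plausible reading, shown here purely as an assumption, is a weighted sum in which each factor is a score normalized to [0, 1] and the four influence percentages add up to 1.

```python
def weight(s: float, cs: float, p: float, r: float,
           pi_s: float, pi_cs: float, pi_p: float, pi_r: float) -> float:
    """Combine the four factor scores into a weight W.

    Assumption: W = pi_s*S + pi_cs*CS + pi_p*P + pi_r*R, with the
    influence percentages expressed as fractions that sum to 1.
    """
    total = pi_s + pi_cs + pi_p + pi_r
    if abs(total - 1.0) > 1e-9:
        raise ValueError("influence percentages must sum to 1")
    return pi_s * s + pi_cs * cs + pi_p * p + pi_r * r
```

How each score is derived from the raw size, retrieval cost, popularity and relationships is left open here, as it is on the slide.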
Proposed Web Caching Algorithm
[Flowchart. Steps and branch conditions recovered from the slide; the arrows between them are not preserved.]
Steps:
• The request arrives at a Proxy.
• The URL/QTag Transformer finds the QTag that best describes the incoming URL.
• The QTag is sent to the Query Rewriter.
• The Query Rewriter rewrites the query and produces Sub-QTags.
• The Query Rewriter asks the Cache Directory whether any of these Sub-QTags is already cached in the Main Cache.
• The Query Rewriter asks the Cooperative-scheme Directory whether the rest of the Sub-QTags are cached in other Proxies.
• The Query Rewriter retrieves the locally cached Web Objects.
• The Proxy retrieves the cached Web Objects from the other Proxies and sends them to the Query Rewriter.
• The Proxy caches the retrieved Web Objects locally.
• The Proxy sends the request to the Web Server and caches the response.
• The Query Rewriter combines the Sub-QTags and the Proxy sends the response.
• The Weight Calculator refreshes the weight value and parameters of the Sub-QTags.
Branch conditions:
– All the Sub-QTags are cached
– Some of the Sub-QTags are cached
– None of the Sub-QTags are cached
– All of the rest of the Sub-QTags are cached in other Proxies
– Not all of the rest of the Sub-QTags are cached in other Proxies
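One possible reading of the flow above can be sketched as follows. The wiring of the branches is an assumption, because the flowchart arrows are not available: in particular, going straight to the web server when none of the Sub-QTags is cached locally is a guess. Caches are modelled as plain dictionaries, web objects as lists of rows, combining as concatenation, and the weight refresh is omitted; all names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Cache = Dict[str, list]


def handle_request(qtag: str, sub_qtags: List[str], local_cache: Cache,
                   peer_caches: Dict[str, Cache],
                   fetch_from_web_server: Callable[[str], list]) -> Tuple[list, str]:
    """Serve `qtag` from its Sub-QTags; return (response, source)."""
    missing = [t for t in sub_qtags if t not in local_cache]
    source = "local"
    if missing:
        found: Cache = {}
        # Assumed branch: peers are consulted only when some Sub-QTags
        # are already cached locally.
        if len(missing) < len(sub_qtags):
            for tag in missing:
                for cache in peer_caches.values():
                    if tag in cache:
                        found[tag] = cache[tag]
                        break
        if len(found) < len(missing):
            # Not everything is available in the cooperative scheme:
            # ask the web server and cache its response.
            response = fetch_from_web_server(qtag)
            local_cache[qtag] = response
            return response, "web server"
        local_cache.update(found)  # cache the retrieved web objects locally
        source = "peers"
    # Combine the web objects of all Sub-QTags (here: concatenate rows).
    combined: list = []
    for tag in sub_qtags:
        combined.extend(local_cache[tag])
    return combined, source
```

In this sketch a request is answered by the cooperative scheme only if every missing Sub-QTag is found in some peer; a partial hit still goes to the web server.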
Current and Future Work
• Study and testing of the proposed new approaches
• Definition of the workload
• Better definition and testing of the proposed algorithm

Discussion
• Efficiency of Testing Tools (Simulator)
• Ideas for efficient Web Caching for Dynamic-Data
• Comments

Thank You