Web Caching
By
Amisha Thakkar
Alpa Shah
Overview
• What is a Web Cache?
• Caching Terminology
• Why use a cache?
• Disadvantages of Web Cache
• Other Features
• Caching Rules
Overview
• Caching Architectures
• Comparison of Architectures
• Cache Deployment Schemes
• Client Side Cache Cooperation
• Active Caching
What is a Web Cache ?
• A cache is a place where temporary copies of objects are stored
• Cached information is generally closer to the requester than the permanent information is
• Objects: HTML pages, images, files
Caching Terminology
• Client: an application program that establishes connections for sending requests
• Server: an application program that accepts connections to service requests by sending back responses
• Origin Server: the server on which a given resource resides or is to be created
Caching Terminology
• Proxy: an intermediary program which acts both as a server and as a client, making requests on behalf of other clients
• A proxy is not necessarily a cache
  * A proxy does not always cache the replies passing through it
  * It may be used on a firewall to monitor accesses
Why use a cache ?
• To reduce latency
• To reduce network traffic
• To reduce the load on origin servers
• To isolate end users from network failures
Disadvantages of Web Cache
• With cached data there is always a chance of receiving stale information
• Content providers lose access counts when cache hits are served
• Manual configuration is often required
• Operation of a cache requires additional resources
• In some situations the cache can be a single point of failure
Other Features
• Depending on the perspective, the following may be good or bad:
  * The cache requests on behalf of clients; the servers never see the clients' IP addresses
  * The cache provides an easy opportunity to monitor and analyze browsing activities
  * The cache can be used to block certain requests
Types of Web Caches
• Proxy caches
  * Serve a large number of users
  * Large corporations and ISPs often set them up on their firewalls
  * They are a type of shared cache
• Browser caches
  * Use a section of the computer's hard disk to store objects that you have seen
Caching Rules
• Rules by which caches work:
  * Some of them are set in protocols
  * Some are set by the cache administrator
• Most common rules (see the sketch below):
  * If the object is authenticated or secure, it won't be cached
  * The object's headers indicate whether the object is cacheable or not
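As a rough illustration of these rules, the sketch below decides cacheability from the request and response headers. The header names are standard HTTP; the policy itself is a simplified assumption, not the behaviour of any particular cache.

```python
# Simplified sketch: decide whether a response may be stored in the cache at all.
def is_cacheable(request_headers: dict, response_headers: dict) -> bool:
    cache_control = response_headers.get("Cache-Control", "").lower()

    # Authenticated or secure objects are not cached.
    if "Authorization" in request_headers:
        return False

    # The object's own headers can forbid caching outright.
    if "no-store" in cache_control or "private" in cache_control:
        return False

    # Otherwise treat the object as cacheable; freshness is checked separately.
    return True
```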
Caching Rules
* An object is considered fresh when:
  - It has an expiry time or other age-controlling directive set and is still within the fresh period
  - The browser cache has already seen the object and has been set to check only once a session
Caching Rules
  - A proxy cache has seen the object recently and it was modified relatively long ago
* Fresh documents are served directly from the cache without checking with the origin server
Caching Rules
* For a stale object, the origin server will be asked to validate the object, or to tell the cache whether the copy is still good
* The most common validator is the time that the object was last changed (see the sketch below)
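A minimal sketch of the freshness and validation logic just described, assuming an Expires header as the age-controlling directive and Last-Modified / If-Modified-Since as the validator (both standard HTTP); the function names themselves are hypothetical.

```python
import email.utils
import time
import urllib.error
import urllib.request

def is_fresh(response_headers: dict) -> bool:
    """Fresh objects are served straight from the cache, without contacting the origin."""
    expires = response_headers.get("Expires")
    if not expires:
        return False
    expiry_ts = email.utils.parsedate_to_datetime(expires).timestamp()
    return time.time() < expiry_ts

def still_good(url: str, cached_headers: dict) -> bool:
    """For a stale object, ask the origin server whether the cached copy is still good,
    using the last-modified time as the validator."""
    request = urllib.request.Request(url)
    last_modified = cached_headers.get("Last-Modified")
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    try:
        with urllib.request.urlopen(request):
            return False              # 200 OK: the server returned a newer copy
    except urllib.error.HTTPError as err:
        return err.code == 304        # 304 Not Modified: the cached copy is still good
```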
Caching Architectures
Hierarchical / Simple Cache
• Browser-cache interaction is the same as browser-host interaction, i.e. a TCP connection is made and the item is requested
• If the item is not found, the request is sent to the parent cache
• A hierarchy is built up, each level indirectly serving a wider community of users (see the sketch below)
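A minimal sketch of this hierarchical lookup, assuming each cache simply forwards a miss to its parent and stores the reply on the way back; the class and names below are hypothetical, not any particular product.

```python
class HierarchicalCache:
    """Institutional, regional and national caches chained via `parent`;
    the topmost cache falls through to the origin server."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}                           # url -> object

    def get(self, url, fetch_from_origin):
        if url in self.store:                     # hit at this level
            return self.store[url]
        if self.parent is not None:               # miss: ask the parent cache
            obj = self.parent.get(url, fetch_from_origin)
        else:                                     # top of the hierarchy: origin server
            obj = fetch_from_origin(url)
        self.store[url] = obj                     # keep a copy on the way back down
        return obj

# Usage: institutional -> regional -> national, with the widest community at the top.
national = HierarchicalCache("national")
regional = HierarchicalCache("regional", parent=national)
institutional = HierarchicalCache("institutional", parent=regional)
```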
Caching Architectures
Hierarchical / Simple Cache
[Figure: cache hierarchy with institutional networks at the bottom, regional networks above them, and national networks at the top]
Caching Architectures
Distributed /Co-operating Cache
• Decentralized (cache mesh)
• Multiple servers cooperate in such a way that they share their individual caches to create one large distributed cache
• Simply put, caching proxies communicate with each other to serve different users
• On a cache miss, a proxy checks with the other proxy caches before contacting the origin server
Caching Architectures
Distributed /Co-operating Cache
• Caches communicate amongst themselves using a protocol like ICP (Internet Cache Protocol)
• Caches can be selected on the basis of:
  * Distance from the end user
  * Specialization in particular URLs (location hints)
Caching Architectures
Distributed /Co-operating Cache
• Why distributed? Limitations of the hierarchy:
  * Width of a cache in the hierarchy: caches at the same level are inaccessible to each other
  * The LRU policy implies sufficient disk space
  * Cost of replicating disk storage
  * The amount of disk space required depends on the number of users served and the breadth of their reading
Caching Architectures
Distributed /Co-operating Cache
    The more users, the more disk space is needed higher in the hierarchy
  * Exponential growth of the number of documents on the WWW
Caching Architectures
Distributed /Co-operating Cache
• Caching close to the user is more effective; the higher the level, the lower the efficiency
• Can be created for load balancing
• Most effective when serving a community of interests
Caching Architectures
Distributed /Co-operating Cache
• First, a UDP packet is sent as a cache inquiry
• The cache selection decision is determined by RTT
• Potential problem: network congestion because of UDP
• In favor:
  * A UDP exchange takes 2 IP packets; TCP takes at least 8 packets
Caching Architectures
Distributed /Co-operating Cache
  * The UDP reply from a cache can indicate (see the sketch below):
    a. Presence
    b. Speed
    c. Availability of the requested documents
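A simplified sketch of this UDP inquiry: the proxy queries each co-operating cache and takes the first positive reply, which in effect selects by RTT. The one-line text message used here is an illustrative assumption, not the real ICP packet format.

```python
import socket

def query_siblings(url, siblings, timeout=0.2):
    """Send a UDP cache inquiry for `url` to each (host, port) in `siblings`
    and return the address of the first cache that reports a HIT.
    Taking the first reply approximates selecting the lowest-RTT cache."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    for addr in siblings:
        sock.sendto(b"QUERY " + url.encode(), addr)    # 1 packet out per sibling
    try:
        while True:
            reply, addr = sock.recvfrom(2048)          # 1 packet back per sibling
            if reply.startswith(b"HIT"):
                return addr                            # fetch the object from this cache
    except socket.timeout:
        return None                                    # all misses: go to the origin server
```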
Caching Architectures
Hybrid Cache
• Note: caches at each level of the hierarchy cooperate using ICP
Comparison of Architectures
• Hierarchical: caches are placed at multiple levels
• Distributed: caches only at the bottom level; no intermediate caches
Comparison of Architectures
• Performance parameters (see the sketch below):
  Connection time (Tc) is defined as the time from when the document is requested until the first data byte is received
  Transmission time (Tt) is defined as the time taken to transmit the document
  Total latency = Tc + Tt
  Bandwidth usage
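A small worked sketch of this latency model, assuming the transmission time is the document size divided by the available bandwidth; the numbers in the usage line are purely illustrative.

```python
def total_latency(connection_time_s, document_size_kb, bandwidth_kbps):
    """Total latency = Tc + Tt, with Tt modelled here as size / bandwidth."""
    transmission_time_s = document_size_kb * 8 / bandwidth_kbps
    return connection_time_s + transmission_time_s

# Illustrative only: a 200 KB document over a 1000 kbit/s path with Tc = 0.4 s.
print(total_latency(0.4, 200, 1000))   # 0.4 + 1.6 = 2.0 seconds
```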
Comparison of Architectures
• Fig. 3: Connection time for documents of different popularity
Comparison of Architectures
• For unpopular documents, the connection time is high
• As the number of requests increases, the average connection time decreases
• For extremely popular documents, distributed caching has smaller connection times
Comparison of Architectures
• Fig. 4: Network traffic generated
Comparison of Architectures
• On lower levels, distributed caching practically doubles the network bandwidth usage
• Around the root node in the national network, the network traffic is reduced to half
• Distributed caching uses all possible network shortcuts between institutional caches, generating more traffic in the less congested low network levels
Comparison of Architectures
• Fig. 5a: Uncongested national network
Comparison of Architectures
• The only bottleneck on the path from the client to the origin server is the international path; hence transmission times are similar for both architectures
Comparison of Architectures
• Fig. 5b: Congested national networks
Comparison of Architectures
• Both have higher transmission times
compared to the previous case
• Distributed caching gives shorter
transmission times than hierarchical
because many requests travel through
lower network levels
Comparison of Architectures
• Fig. 6: Average total latency
Comparison of Architectures
• For large documents, transmission time is more relevant than connection time
• Hierarchical caching gives lower latencies for documents smaller than 200 KB, due to lower connection times
• Distributed caching gives lower latencies for larger documents, due to lower transmission times
Comparison of Architectures
• The size threshold depends on the degree of congestion in the national network
• The higher the congestion, the lower the size threshold above which distributed caching has lower latencies than hierarchical
Comparison of Architectures
With Hybrid Scheme
• Fig. 7: Connection time
Comparison of Architectures
With Hybrid Scheme
• Fig 8.
Comparison of Architectures
With Hybrid Scheme
• In the hybrid scheme, if the number of cooperating caches (kc) is very small, the connection time is high
• As the number of cooperating caches increases, the connection time decreases to a minimum
• If the number increases beyond the threshold, the connection time increases very fast
Comparison of Architectures
With Hybrid Scheme
• Fig. 9: Transmission time
Comparison of Architectures
With Hybrid Scheme
• For an uncongested network, the number of cooperating caches (kt) at every level hardly influences Tt
• If the number of cooperating caches is very small, Tt is high, and vice versa
• If the number increases above the threshold, Tt increases
• The optimum number of caches depends on the number of caches reachable while avoiding congested links
Comparison of Architectures
With Hybrid Scheme
• Fig 10
Comparison of Architectures
With Hybrid Scheme
• Fig. 11: Total latency
Comparison of Architectures
With Hybrid Scheme
• The optimum number of cooperating caches (kopt) at every level that minimizes the total latency depends on the document size
• For small documents, the optimum number is closer to kc
• For large documents, the optimum number is closer to kt
Comparison of Architectures
With Hybrid Scheme
• Fig 12
Comparison of Architectures
With Hybrid Scheme
• For any document, the optimum kopt that minimizes the total latency is such that kc ≤ kopt ≤ kt
Cache Deployment Schemes
• Proxy caching
Cache Deployment Schemes
• Advantages
  Clients point all web requests directly to the cache: no effect on non-web traffic (see the example below)
  The cost of upgrading hardware and software is limited
  Administration of the caches is limited to basic configuration
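As an example of the first point, a client can be pointed explicitly at such a cache; the sketch below does this with Python's standard library, where the cache host name and port are placeholders for whatever the deployment actually uses.

```python
import urllib.request

# Hypothetical cache address; 3128 is a commonly used proxy port, adjust as needed.
proxy = urllib.request.ProxyHandler({"http": "http://cache.example.com:3128"})
opener = urllib.request.build_opener(proxy)

# Every web request made through this opener goes via the proxy cache;
# non-web traffic is unaffected.
response = opener.open("http://example.com/")
print(response.status)
```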
Cache Deployment Schemes
• Disadvantages
  Every browser must be configured to point to the cache
  Each client can hit only one cache
  Single point of failure
  Unnecessary duplication of data
  Bottleneck in cases where the content is otherwise available in the LAN
Cache Deployment Schemes
• Transparent Proxy caching
Cache Deployment Schemes
• Advantages
  No browser configuration
  The cost of upgrading hardware and software is limited
  No administration of intermediate systems is required
Cache Deployment Schemes
• Disadvantages
  Each client can hit only one cache
  If the cache goes down, both internet and intranet access are lost
  Negative impact on non-web traffic:
    The cache has to route non-web traffic
    Routing, packet examination and network address translation steal CPU cycles from the main cache-serving function
Cache Deployment Schemes
• Transparent proxy caching with web cache
redirection.
Cache Deployment Schemes
• Advantages
  The switch/router examines the packets
  Minimal impact on non-web traffic
  Frees up CPU cycles for the web cache
  Allows the client load to be dynamically spread over multiple caches (see the sketch below)
  Eliminates the single point of failure, especially if redundant redirectors are used
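One way a redirector can spread load over several caches is to hash a stable key from each packet, for example the destination address, into a cache index, so the same origin always maps to the same cache. The sketch below illustrates that general idea only; it is an assumption, not any specific switch feature or redirection protocol.

```python
import hashlib

CACHES = ["cache1.example.com", "cache2.example.com", "cache3.example.com"]

def pick_cache(destination_ip: str) -> str:
    """Map a destination address to one of the caches.
    Hashing keeps the mapping stable, so one origin's objects are not
    needlessly duplicated across every cache."""
    digest = hashlib.md5(destination_ip.encode()).digest()
    return CACHES[digest[0] % len(CACHES)]

# If a cache fails, the redirector can drop it from CACHES and re-spread
# the load, avoiding a single point of failure.
print(pick_cache("93.184.216.34"))
```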
Cache Deployment Schemes
• Disadvantages
Additional intermediate systems must be
deployed
Increases expense
Client Side Cache Cooperation.
Active Caching
• Current problem: dynamic documents cannot be cached
• Active caching: caching dynamic content on the web
• A cache applet is server-supplied code that is attached to a URL, or to a collection of URLs
• The applet is written in a platform-independent language
Active Caching
• On a user request, the applet is invoked by the cache
• The applet decides what is to be sent to the user
• Other functions of the applet (see the sketch below):
  * Logging user accesses
  * Checking access permissions
  * Rotating advertising banners
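A very rough sketch of the cache-applet idea, not an actual cache-applet API: a server-supplied object is attached to one URL and invoked by the proxy on each request, here logging the access and rotating a banner before the cached page is returned. All names and the placeholder markup are hypothetical.

```python
import itertools

class BannerRotatingApplet:
    """Hypothetical cache applet attached to one URL: on each request it
    logs the access and splices a different advertising banner into the
    cached page before it is sent back to the user."""

    def __init__(self, cached_page: str, banners):
        self.cached_page = cached_page
        self.banners = itertools.cycle(banners)
        self.access_log = []

    def handle_request(self, client_id: str) -> str:
        self.access_log.append(client_id)          # log user accesses
        return self.cached_page.replace("<!--AD-->", next(self.banners))

# The proxy invokes the applet instead of returning the raw cached copy.
applet = BannerRotatingApplet("<html><!--AD--></html>",
                              ["<img src='a.gif'>", "<img src='b.gif'>"])
print(applet.handle_request("client-42"))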
Active Caching
• The proxy has the freedom not to invoke the applet but to send the request on to the server
• The proxy promises not to send back a cached copy without invoking the applet
• If the applet is too large, the request is sent to the server
• The proxy is not obligated to cache any applet; in that case it agrees not to service requests for that document
Active Caching
• The proxy can devote resources to the applets associated with the URLs hottest among its users
• Since the proxy that receives the request is typically the proxy closest to the user, the scheme automatically migrates server processing to nodes that are close to users
• This increases the scalability of web-based services