Transcript part I
Improving Web Servers performance
Objectives:
Scalable Web server System
Locally distributed architectures
Cluster-based Web systems
Distributed Web systems
Cluster-based solutions
Distributed Web-based solutions
Dispatching algorithms for cluster-based
Web systems
1
Reference
“The State of the Art in Locally
Distributed Web-server Systems”
Valeria Cardellini, Emiliano Casalicchio, Michele Colajanni
and Philip S. Yu
2
Concepts
Web server System is a system that
Provides web services
The trend is
Increasing number of clients
Growing complexity of web applications
Scalable Web server systems
The ability to support large numbers of
accesses and resources while still providing
adequate performance
3
Architecture solutions for scalable
Web-server systems
4
Model architecture for a locally
distributed Web system
5
Locally Distributed Web
System
Cluster Based Web System
The server nodes hide their IP addresses from clients behind a
Virtual IP (VIP) address that corresponds to one device (the web switch)
in front of the set of servers
Web switch receives all packets and then sends them to
server nodes
Distributed Web System
The IP addresses of the web server nodes are visible to
clients
There is no web switch; a layer-3 router may be employed to
route the requests
6
Cluster based Architecture
7
Distributed Architecture
8
Request routing mechanisms
After classifying the two Web systems
Cluster Based Web System
Distributed Web System
The question now becomes: “How are packets
routed to each of the web servers?”
9
Request routing mechanisms for
cluster-based Web systems
layer-4 switch
Content-blind routing
layer-7 switch
Content-aware switches
Also called layer-5 switches in the TCP/IP protocol stack
What are the trade-offs between layer-4 and layer-7 switches?
10
Two Approaches
11
Taxonomy of cluster-based
architecture
12
Layer-4 two-way architecture
13
Layer-4 one-way architecture
14
Layer-4 one-way mechanisms
Packet single-rewriting
Same as the two-way architecture, except that the server, not the
Web switch, rewrites the source address of outbound packets
Packet tunneling
This is also known as IP encapsulation
IP datagrams are encapsulated within IP datagrams
Requires that all servers support IP tunneling
Packet forwarding
Assumes that the Web switch and the server nodes are on
the same LAN
All nodes share the VIP address
Server nodes need to disable ARP
Web switch forwards the inbound packet to the target
server without modifying the TCP/IP header
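The three one-way mechanisms differ mainly in which header fields change on the way to the server. The sketch below models that on a toy packet record. The `Packet` type, the addresses, and the function names are all hypothetical, chosen only for illustration; real switches do this in the kernel or in hardware.

```python
from dataclasses import dataclass, replace
from typing import Any

VIP = "10.0.0.100"  # hypothetical virtual IP shared by the cluster


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_mac: str
    payload: Any  # application bytes or, for tunneling, an inner Packet


def rewrite_inbound(pkt: Packet, server_ip: str, server_mac: str) -> Packet:
    """Packet single-rewriting, switch side: destination VIP -> server IP."""
    return replace(pkt, dst_ip=server_ip, dst_mac=server_mac)


def rewrite_outbound(pkt: Packet) -> Packet:
    """Packet single-rewriting, server side: source server IP -> VIP."""
    return replace(pkt, src_ip=VIP)


def tunnel(pkt: Packet, switch_ip: str, server_ip: str, server_mac: str) -> Packet:
    """Packet tunneling: the original datagram becomes the payload of a new one."""
    return Packet(src_ip=switch_ip, dst_ip=server_ip, dst_mac=server_mac, payload=pkt)


def forward(pkt: Packet, server_mac: str) -> Packet:
    """Packet forwarding: only the MAC address changes; IP header is untouched."""
    return replace(pkt, dst_mac=server_mac)
```

In this model only packet forwarding leaves both IP addresses as the client sent them, which is why every server must own the VIP and stay silent in ARP for it.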
15
LAN Addresses
Each adapter on LAN has unique LAN address
16
LAN Address (more)
MAC address allocation administered by IEEE
manufacturer buys portion of MAC address space
(to assure uniqueness)
Analogy:
MAC address: like Social Security Number
IP address: like postal address
MAC flat address => portability
IP hierarchical address NOT portable
17
Routing discussion
Starting at A, given IP
datagram addressed to B:
look up net. address of B, find B
on same net. as A
link layer sends datagram to B
inside link-layer frame
Frame source, dest address: A’s MAC addr, B’s MAC addr
Datagram source, dest address: A’s IP addr, B’s IP addr
Frame contents: frame addresses, datagram addresses, IP payload
[Figure: network with hosts A (223.1.1.1), B (223.1.1.3), and E; other
addresses shown: 223.1.1.2, 223.1.1.4, 223.1.2.1, 223.1.2.2, 223.1.2.9,
223.1.3.1, 223.1.3.2, 223.1.3.27]
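The "find B on same net. as A" step can be sketched with Python's standard `ipaddress` module. The /24 prefix length is an assumption (the slide does not state a mask), and the function names are illustrative only.

```python
import ipaddress


def on_same_network(src: str, dst: str, prefix_len: int = 24) -> bool:
    """True if dst falls inside the network that src belongs to.

    prefix_len=24 is an assumed mask; the slide does not give one.
    """
    network = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
    return ipaddress.ip_address(dst) in network


def delivery_decision(src: str, dst: str, prefix_len: int = 24) -> str:
    """Decide between direct link-layer delivery and handing off to a router."""
    if on_same_network(src, dst, prefix_len):
        return "direct"  # send inside a link-layer frame addressed to dst's MAC
    return "router"      # send the frame to the next-hop router instead
```

With the addresses from the figure, `delivery_decision("223.1.1.1", "223.1.1.3")` takes the direct branch, while a destination such as 223.1.2.2 would go through a router under the assumed mask.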
18
ARP: Address Resolution Protocol
Question: how to determine
MAC address of B
knowing B’s IP address?
Each IP node (Host or
Router) on LAN has
ARP table
ARP Table: IP/MAC
address mappings for
some LAN nodes
< IP address; MAC address; TTL>
TTL (Time To Live): time
after which address
mapping will be forgotten
(typically 20 min)
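A minimal sketch of an ARP table whose entries are forgotten after a TTL, as described above. The class and method names are hypothetical, and the injectable `clock` parameter exists only so the expiry can be exercised without waiting 20 minutes.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ArpTable:
    """IP -> MAC mappings that expire after ttl_seconds (default 20 min)."""

    def __init__(self, ttl_seconds: float = 20 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # ip -> (mac, expiry)

    def learn(self, ip: str, mac: str) -> None:
        """Store or refresh a mapping; the TTL restarts from now."""
        self._entries[ip] = (mac, self._clock() + self._ttl)

    def lookup(self, ip: str) -> Optional[str]:
        """Return the MAC for ip, or None if unknown or timed out."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[ip]  # mapping is forgotten once the TTL passes
            return None
        return mac
```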
19
ARP protocol
A wants to send datagram
to B, and A knows B’s IP
address.
Suppose B’s MAC address
is not in A’s ARP table.
A broadcasts ARP query
packet, containing B's IP
address
all machines on LAN
receive ARP query
B receives ARP packet,
replies to A with its (B's)
MAC address
frame sent to A’s MAC
address (unicast)
A caches (saves) IP-to-
MAC address pair in its
ARP table until information
becomes old (times out)
soft state: information
that times out (goes
away) unless refreshed
ARP is “plug-and-play”:
nodes create their ARP
tables without
intervention from net
administrator
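The query and reply exchange above can be simulated in a few lines. This is a toy model under stated assumptions: the `Node` and `Lan` classes are hypothetical, the LAN is a plain list, and entry timeout (the soft-state part) is left out to keep the sketch short.

```python
from typing import Dict, List, Optional


class Node:
    def __init__(self, ip: str, mac: str) -> None:
        self.ip = ip
        self.mac = mac
        self.arp_table: Dict[str, str] = {}
        self.queries_seen = 0  # how many broadcast queries reached this node


class Lan:
    def __init__(self) -> None:
        self.nodes: List[Node] = []

    def attach(self, node: Node) -> None:
        self.nodes.append(node)

    def broadcast_arp_query(self, sender: Node, target_ip: str) -> Optional[str]:
        """Deliver the query to every other node; only the owner answers."""
        reply = None
        for node in self.nodes:
            if node is sender:
                continue
            node.queries_seen += 1
            if node.ip == target_ip:
                reply = node.mac  # unicast reply back to the sender
        return reply


def resolve(node: Node, lan: Lan, target_ip: str) -> Optional[str]:
    """Use the cached mapping if present, otherwise query the LAN and cache."""
    mac = node.arp_table.get(target_ip)
    if mac is None:
        mac = lan.broadcast_arp_query(node, target_ip)
        if mac is not None:
            node.arp_table[target_ip] = mac
    return mac
```

A second call to `resolve` for the same address is answered from the table and produces no broadcast, which is the point of caching the pair.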
20
Layer-7 two-way architecture
21
Layer-7 two-way mechanisms
TCP gateway
An application level proxy running on the web switch
mediates the communication between the client and the
server
Makes separate TCP connections to client and server
TCP splicing
Reduces the overhead of the TCP gateway. For outbound packets,
packet forwarding occurs at the network level by rewriting the
client IP address
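A TCP gateway in the sense above can be sketched as a user-level relay that holds one TCP connection to the client and a separate one to the server. The sketch handles a single request on the loopback interface; the helper names, the `backend:` reply prefix, and the read-until-EOF framing are assumptions made for illustration, not part of the original design.

```python
import socket
import threading
from typing import Tuple

Addr = Tuple[str, int]


def recv_all(sock: socket.socket) -> bytes:
    """Read until the peer shuts down its sending side."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def start_backend() -> Addr:
    """Hypothetical back-end server: answers one request, prefixed 'backend:'."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    addr = srv.getsockname()

    def serve() -> None:
        conn, _ = srv.accept()
        with conn:
            conn.sendall(b"backend:" + recv_all(conn))
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return addr


def start_gateway(backend_addr: Addr) -> Addr:
    """Application-level proxy: two separate TCP connections, data copied between."""
    gw = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    gw.bind(("127.0.0.1", 0))
    gw.listen(1)
    addr = gw.getsockname()

    def serve() -> None:
        client_conn, _ = gw.accept()                      # connection 1: client
        with client_conn:
            request = recv_all(client_conn)
            with socket.create_connection(backend_addr, timeout=5) as server_conn:
                server_conn.sendall(request)              # connection 2: server
                server_conn.shutdown(socket.SHUT_WR)
                response = recv_all(server_conn)
            client_conn.sendall(response)
        gw.close()

    threading.Thread(target=serve, daemon=True).start()
    return addr


def fetch(addr: Addr, request: bytes) -> bytes:
    """Client side: send one request and read the whole reply."""
    with socket.create_connection(addr, timeout=5) as sock:
        sock.sendall(request)
        sock.shutdown(socket.SHUT_WR)
        return recv_all(sock)
```

Every byte crosses the gateway process twice here, once per connection, which is the overhead that TCP splicing aims to remove.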
22
Layer-7 two-way Mechanisms
TCP gateway
An application-level proxy running
on the web switch mediates the
communication between the client
and the server
[Figure labels: user, kernel]
TCP splicing
Reduces the overhead of the TCP
gateway. Packet forwarding occurs
at the network level, between the
network interface driver and the
TCP/IP stack, and is carried out
directly by the OS
[Figure labels: user, kernel]
23
Content-aware Switch
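[Figure: a request for www.yahoo.com crosses the Internet to the switch,
which sits in front of an image server, an application server, and an
HTML server. The packet carries IP and TCP headers followed by
application data:]
GET /cgi-bin/form HTTP/1.1
Host: www.yahoo.com…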
• Front end of a set of web servers
• Route packets based on layer 5/7 (content)
information
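A minimal sketch of content-aware routing: the switch reads the HTTP request line and picks a server pool from the path. The pool names follow the three servers in the figure; the routing rules themselves (the `/cgi-bin/` prefix and the image extensions) are assumptions made for illustration.

```python
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")  # assumed list


def parse_request_target(raw_request: bytes) -> str:
    """Extract the request target (path and query) from an HTTP request."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    _method, target, _version = request_line.split(" ")
    return target


def choose_pool(target: str) -> str:
    """Map a request target to 'application', 'image', or 'html'."""
    path = target.split("?", 1)[0]
    if path.startswith("/cgi-bin/"):
        return "application"
    if path.lower().endswith(IMAGE_EXTENSIONS):
        return "image"
    return "html"
```

A layer-4 switch could not make this choice, because the request line only arrives after the TCP connection with the client is already established.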
24
Why use Content-aware Switching
Servers can be specialized for certain types of
request
Content segregation
Exploit locality
Affinity-based routing
Increase the performance because of the improved hit
rate
Partial replication of server file set
Partition the server’s file set over different nodes
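The hit-rate argument for affinity-based routing can be illustrated with a toy simulation. Each server keeps an unbounded cache of the URLs it has served (an assumption; real caches are finite), and the affinity policy hashes the URL with CRC32, which is one possible choice rather than the method of any particular switch.

```python
import zlib
from typing import Callable, List

Route = Callable[[int, str, int], int]  # (request index, url, n_servers) -> server


def round_robin(index: int, url: str, n_servers: int) -> int:
    """Content-blind policy: ignore the URL, rotate through the servers."""
    return index % n_servers


def by_affinity(index: int, url: str, n_servers: int) -> int:
    """Content-aware policy: the same URL always goes to the same server."""
    return zlib.crc32(url.encode("utf-8")) % n_servers


def count_hits(requests: List[str], n_servers: int, route: Route) -> int:
    """Count requests that find their URL already cached on the chosen server."""
    caches = [set() for _ in range(n_servers)]
    hits = 0
    for index, url in enumerate(requests):
        server = route(index, url, n_servers)
        if url in caches[server]:
            hits += 1
        else:
            caches[server].add(url)
    return hits
```

For the stream `["/a", "/a", "/b", "/b"]` on two servers, round robin sends each repeat to a server that has not seen the URL, while the affinity policy serves each repeat from cache.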
25
URL Parsing is expensive!!
Performing content-aware routing implies that
some kind of string searching and matching
algorithm is required
Such a time-consuming function is expensive in a
heavy traffic web site
Experience showed that the system
performance would be severely degraded if we
implement some URL parsing functions in the
distributor
26
TCP splicing
Once the two TCP connections are established,
they are spliced
IP packets are forwarded at the network layer
TCP splicing requires
Connection binding
Packet analyzer to rewrite packets
• Appropriate address translation
• Sequence number modifications to be performed on the
packets
Basically, we are deploying connection re-use
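The sequence number modification can be sketched as two fixed offsets computed when the connections are bound. Sequence numbers on the client connection and on the server connection start from unrelated values, so each forwarded packet needs its sequence and acknowledgment numbers shifted. The `Splice` class and its parameter names are hypothetical; address translation is omitted.

```python
from dataclasses import dataclass
from typing import Tuple

MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


@dataclass(frozen=True)
class Splice:
    c2s_delta: int  # added to seq of client -> server packets
    s2c_delta: int  # added to seq of server -> client packets

    @classmethod
    def bind(cls, client_next_seq: int, switch_to_server_next_seq: int,
             server_next_seq: int, switch_to_client_next_seq: int) -> "Splice":
        """Record the offsets between the two connections at binding time."""
        return cls(
            c2s_delta=(switch_to_server_next_seq - client_next_seq) % MOD,
            s2c_delta=(switch_to_client_next_seq - server_next_seq) % MOD,
        )

    def client_to_server(self, seq: int, ack: int) -> Tuple[int, int]:
        """Rewrite a client packet so it fits the server-side connection."""
        return (seq + self.c2s_delta) % MOD, (ack - self.s2c_delta) % MOD

    def server_to_client(self, seq: int, ack: int) -> Tuple[int, int]:
        """Rewrite a server packet so it fits the client-side connection."""
        return (seq + self.s2c_delta) % MOD, (ack - self.c2s_delta) % MOD
```

Because the offsets are constant for the life of the binding, the rewrite is a pair of additions per packet and needs no per-connection buffering.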
27
Operation of Content-aware
Distributor
[Figure: message sequence among Client, Layer-7 Switch, and Server]
(1) Pre-fork connection, between switch and server:
SYN(PISN); SYN(SISN), ACK(PISN+1); HTTP KeepAlive(PISN+1), ACK(SISN+1)
Other labels on the pre-forked connection: Data(SSN'), ACK(PSN), ACK(SSN=SSN'+x+1)
(2) Client connection setup, between client and switch:
SYN(CISN); SYN(DISN), ACK(CISN+1); ACK(DISN+1)
(3) Client sends HTTP request:
HTTP request(CISN+1), ACK(DISN+1)
Connection binding (rewrite packet):
HTTP request(PSN), ACK(SSN), Option(bind)
(4) Response:
Data(SSN), ACK(PSN+len+1), rewritten as Data(DISN+1), ACK(CISN+len+1)
ACK(DISN+len+1), rewritten as ACK(SSN+len+1)
End of data, FIN; ACK
Connection reuse
28
Layer-7 one-way architecture
29
Layer-7 one-way mechanisms
TCP handoff
The switch hands off the TCP connection endpoint to the
server
Requires OS changes on both the switch and the server
TCP connection hop
Software-based proprietary solution that
encapsulates the IP packet and sends it to
the server
30
Layer-7 one-way mechanisms
Migrate the created TCP connection from the
switch to the back-end server
Create a TCP connection at the back-end without going
through the TCP three-way handshake
Retrieve the state of an established connection and
destroy the connection without going through the normal
message handshake required to close a TCP connection
Once the connection is handed off to the back-
end server, the switch must forward packets from
the client to the appropriate back-end server
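The handoff steps above can be modeled at a very high level as moving a connection-state record from the switch to the back end. This is a toy sketch, not a real TCP implementation: the `ConnState` fields, class names, and method names are all hypothetical, and an actual handoff requires kernel support on both machines.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Client = Tuple[str, int]  # (client IP, client port)


@dataclass(frozen=True)
class ConnState:
    client: Client
    snd_nxt: int  # next sequence number this endpoint will send
    rcv_nxt: int  # next sequence number expected from the client


class Backend:
    def __init__(self) -> None:
        self.connections: Dict[Client, ConnState] = {}
        self.received: List[Tuple[Client, bytes]] = []

    def install(self, state: ConnState) -> None:
        """Create the endpoint from migrated state; no three-way handshake."""
        self.connections[state.client] = state

    def deliver(self, client: Client, data: bytes) -> None:
        if client not in self.connections:
            raise KeyError("no connection installed for this client")
        self.received.append((client, data))


class Switch:
    def __init__(self) -> None:
        self.connections: Dict[Client, ConnState] = {}
        self.forwarding: Dict[Client, Backend] = {}

    def establish(self, client: Client, snd_nxt: int, rcv_nxt: int) -> None:
        """State left behind once the switch has accepted the client."""
        self.connections[client] = ConnState(client, snd_nxt, rcv_nxt)

    def hand_off(self, client: Client, backend: Backend) -> None:
        """Retrieve the state, drop it locally without a close handshake,
        install it on the back end, and remember where to forward."""
        state = self.connections.pop(client)
        backend.install(state)
        self.forwarding[client] = backend

    def on_client_packet(self, client: Client, data: bytes) -> None:
        """After handoff the switch only forwards client packets."""
        self.forwarding[client].deliver(client, data)
```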
31
Summary
So far, we have discussed:
32