Performance Issues of Web Services
CSCI 8710
November 29-30, 2006
Kraemer
Web Services
Services available via the Internet that
complete tasks or conduct transactions.
Self-contained, modular applications that
can be described, published, and invoked
over the Internet.
Can be automatically invoked by
application programs.
Web Services
May be invoked at one site or may combine
results of several services executed at
different sites.
Performance concerns
differ from standard C/S
May involve both web service processing
and network delays
May be accessed by wide variety of devices
-- desktop computers, PDAs, mobile
phones, other servers
Access via wireless communication
networks: dynamic connectivity, low
bandwidth, high latency
Performance concerns
differ from standard C/S
Unpredictable nature of requests
Highly bursty
Varies with geographical location of clients, day
of week, time of day
Highly variable size of requested objects
“Robot” access
Autonomous software agents that can consume
significant amounts of system resources
Types of servers
providing Web Services
Web servers
Transaction servers
Proxy servers
Cache servers
Wireless gateway servers
Mirror servers
Common problems
Insufficient bandwidth at peak times
Overloaded servers
Uneven server loads
Delivery of dynamic content
Shortage of connections between
application servers and database servers
Failure of third-party servers
Delivery of multi-media content
Example:
Bill Paying Service
Portal offers bill paying service
Customers can pay variety of bills through
the service
Uses services provided by others:
Debit authorization (100 tps capability)
Electronic funds transfer
Customer authentication
Example:
Bill Paying Service
Portal B is bill paying service
Treat overall web service as ‘system’
Treat component services as ‘devices’
What is the capacity of B, given that the debit
authorization service can support 100 tps and that
each payment transaction requires 2 visits to the
debit authorization service?
Forced Flow Law: Xi = Vi * X0
100 = 2 * X0
X0 = 50 tps
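The capacity calculation above can be sketched in a few lines of Python. This is only an illustration of the Forced Flow Law; the function name `system_capacity` and its parameter names are made up for this sketch and do not come from the slides.

```python
def system_capacity(device_capacity_tps, visits_per_transaction):
    """Forced Flow Law, Xi = Vi * X0, solved for the system throughput X0."""
    if visits_per_transaction <= 0:
        raise ValueError("visits_per_transaction must be positive")
    return device_capacity_tps / visits_per_transaction


# Debit authorization: 100 tps capacity, 2 visits per payment transaction.
portal_capacity_tps = system_capacity(100, 2)
print(portal_capacity_tps)  # 50.0
```

If capacities and visit counts were known for the other component services, the smallest of the resulting bounds would be the capacity of the portal.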
Web server elements
HTML and XML
Most documents on the Web written using HTML
“markup language”
Most consist of text and inline images
Can also include other multimedia objects
Generates multiple requests: for document and for
each inline image -- single click by user may
generate series of requests
XML uses tags and attributes to define/delimit
data
Application must interpret meaning of the tags
Hardware and Operating
System
Hardware view: performance a function of:
Number and speed of processors
Amount of main memory
Bandwidth and storage capacity of disk
subsystem
Bandwidth of the NIC
OS considerations:
Performance, scalability, reliability, robustness
Content
Performance affected by:
Content size
Content structure
Hyperlinks
Popularity of content
Perception of
Performance
User view:
Fast response time; no connections refused
Management view:
High throughput; high availability
Need to have quantitative measurements
that describe behavior of Web service
Metrics
Two most important:
Response time -- seconds
Throughput -- http_ops/sec, also bits/sec
Other metrics
Hit
any connection to a web site, including in-line requests
and errors
difficult to compare across sites
Visit
Series of page requests by a user at a single site
Inter-request times < timeout_value
Session
Series of consecutive and related requests made during a
single visit
Inter-request times < timeout_value
Other metrics
User-perceived response time
Set of geographically distributed agents poll the WS
Error rate
Increase indicates degrading performance
Examples:
Overflow of pending connection queue
For streaming services:
Jitter
Startup latency
Most common
measurements of Web
service performance
End-to-end response time
Site response time
Throughput (req/sec)
Throughput (Mbps)
Errors/sec
Visitors/day
Unique visitors/day
Example - Travel Agency
Monitor for 30 minutes:
9000 HTTP requests
Three types of objects delivered:
Html pages (30%, avg. size 11,200 bytes)
Images (65%, avg. size 17,200 bytes)
Video clips (5%, avg. size 439,000 bytes)
What is the throughput?
9000 requests/1800 sec = 5 req/sec
What is the throughput in Kbps?
Throughput in Kbps?
Xr = (total_req * class% * avg. size * 8 bits/byte)/(time * 1024 bits/Kb)
Xhtml = (9000 * 0.30 * 11,200 * 8)/(1800 * 1024) = 131.25
Ximage = (9000 * 0.65 * 17,200 * 8)/(1800 * 1024) = 436.72
Xvideo = (9000 * 0.05 * 439,000 * 8)/(1800 * 1024) = 857.42
X0 = 131.25 + 436.72 + 857.42
X0 = 1425.39 Kbps
To support the Web traffic, the network connection
should be at least a T1 line (1.544 Mbit/s ).
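The travel agency calculation can be checked with a short Python sketch. It assumes 1 Kb = 1024 bits, which is the convention that reproduces the figures on the slide; the function and variable names are invented for this illustration.

```python
BITS_PER_BYTE = 8
BITS_PER_KBIT = 1024  # assumption: the slide's results match 1 Kb = 1024 bits


def class_throughput_kbps(total_requests, fraction, avg_size_bytes, seconds):
    """Throughput contributed by one class of objects, in Kbps."""
    bits = total_requests * fraction * avg_size_bytes * BITS_PER_BYTE
    return bits / seconds / BITS_PER_KBIT


# class name -> (share of requests, average size in bytes)
classes = {
    "html": (0.30, 11_200),
    "image": (0.65, 17_200),
    "video": (0.05, 439_000),
}

per_class = {
    name: class_throughput_kbps(9000, share, size, 1800)
    for name, (share, size) in classes.items()
}
total_kbps = sum(per_class.values())

for name, kbps in per_class.items():
    print(f"{name}: {kbps:.2f} Kbps")
print(f"total: {total_kbps:.2f} Kbps")
```

Video clips are only 5% of the requests but account for well over half of the bits delivered.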
QoS indicators for
Web Services
Response time
Availability
Percentage of time a service is ‘live’ (serving customer
requests)
Reliability
Probability that WS will perform in satisfactory manner
for a given period of time under specified operating and
load conditions
Predictability
Cost
Input data needed to
monitor QoS
Traffic
Performance
Usage patterns
Knowledge of average and peak load
Where are the delays?
Four categories:
DNS lookup phase
TCP connection set-up phase
Server execution time
Network time
DNS lookup phase
Browser converts server name in URL into an IP
address to establish the TCP connection
If server name can’t be resolved by local cache,
send query to higher-level DNS server
For leading e-commerce sites, avg. lookup times
range from 0.01 to 0.11 sec. Fastest sites achieve 0.001
sec.
Anatomy of a Web
transaction
Browser
Network
Server
Anatomy of a Web
Transaction:
the Browser
User clicks on hyperlink; requests document
Client (browser) checks local cache for document;
in case of hit:
returns document; user response time R’Browser,hit*
In case of miss
Browser asks DNS to map server hostname to IP address
Client opens a TCP connection to the server defined by the
URL of the link
Client sends an HTTP request to the server
Browser formats and displays document and renders images
Returned document is stored in browser cache
User response time: R’Browser,miss*
Anatomy of a Web
Transaction:
the Network
Imposes delays in delivering info from
client to server (R’N1) and from server to
client (R’N2).
Delays a function of components on path
between them:
Modems, routers, comm links, bridges, relays
R’Network
= total time HTTP request spends in the network
= R’N1 + R’N2
Anatomy of a Web
transaction:
the Server
request arrives from client
server parses the request according to the HTTP protocol
server executes requested method (GET, HEAD, etc.)
if GET
server looks up file in its document tree by using the file system;
file may be in cache or on disk
server reads contents of file from disk or cache and writes it
to network port
when file send is complete, close the connection (if nonpersistent HTTP)
R’server = time spent in execution of HTTP request
includes service time and waiting time at the server
Anatomy of a Web
transaction
If document not found in client’s cache:
response time is sum of residence time at all resources
Rmiss = R’Browser, miss + R’Network + R’Server
If a hit
Rhit = R’Browser, hit
Typically:
Rhit << Rmiss
Average response time, R, over NT requests (pc = fraction of requests served from the browser cache):
R = pc * Rhit + (1 - pc) * Rmiss
Example
User wants to analyze impact of local
cache size of browser on Web response
time perceived by user
20% of requests serviced by local cache with
R=400 msec
R for remotely serviced requests = 3 sec
Previous expts. indicate that 3x cache size
results in hit rate of 45%
R_orig=0.20 * 0.4 + 0.80 * 3.0 = 2.48 sec
R_new = 0.45 * 0.4 + 0.55 * 3.0 = 1.83 sec
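The cache-size example can be reproduced with the weighted-average formula for response time. The sketch below is only an illustration; `avg_response_time` and its argument names are hypothetical.

```python
def avg_response_time(hit_ratio, r_hit, r_miss):
    """Average response time: hits and misses weighted by their share of requests."""
    return hit_ratio * r_hit + (1 - hit_ratio) * r_miss


r_orig = avg_response_time(0.20, 0.4, 3.0)  # current cache, 20% hit rate
r_new = avg_response_time(0.45, 0.4, 3.0)   # 3x cache, 45% hit rate

print(f"{r_orig:.2f} sec -> {r_new:.2f} sec")
```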
Bottlenecks
bottleneck = the component that limits
system performance
Need to identify the bottleneck to improve
performance
Example
home user
takes too long to download medium-size page
(avg. size 20KB)
considering upgrading to processor w/2X faster
CPU
How will this affect response time?
Example, continued
Assume:
R’network = 7.5 sec
R’server = 3.6 sec
R’Browser, miss = 0.3 sec
R = R’network + R’server +R’Browser, miss
R = 7.5 sec + 3.6 sec + 0.3 sec = 11.4 sec
Rnew = 7.5 + 3.6 + 0.15 = 11.25 sec
not much difference … CPU not the bottleneck
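The same reasoning can be turned into a small what-if helper. It assumes, as the slide does, that a 2X faster CPU halves the browser residence time and leaves the other components unchanged; the dictionary keys and function name are invented for this sketch.

```python
# Residence times in seconds, taken from the example.
residence = {"network": 7.5, "server": 3.6, "browser": 0.3}


def response_after_speedup(residence, component, factor):
    """Total response time if one component becomes `factor` times faster."""
    return sum(
        t / factor if name == component else t
        for name, t in residence.items()
    )


baseline = sum(residence.values())
bottleneck = max(residence, key=residence.get)

print(f"baseline: {baseline:.2f} sec, largest component: {bottleneck}")
for name in residence:
    print(f"2X faster {name}: {response_after_speedup(residence, name, 2):.2f} sec")
```

Halving the largest component (the network) would save far more time than halving the browser time.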
Example
Pharma co. plans intranet for training and
display of images of molecules
training sessions have 100 people
assume 80% active at any one time
Each user performs avg. of 100 ops/hour
Each op requests avg. of 5 images
Avg. size of requested image is 25600 bytes
What is minimum bandwidth of network
connection to image server?
Example, continued
100 * 0.80 * 100 ops/hour * 5 images/op * 25600
bytes/image * 8 bits /byte * 1 hr/3600 sec
(100 * 0.80 * 100 * 5 * 25600 * 8)/3600 = 2.28 Mbps
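A minimal sketch of the bandwidth estimate, keeping every factor of the product explicit. The function and parameter names are assumptions made for this illustration, and 1 Mbps is taken as 1,000,000 bits/sec.

```python
def required_bandwidth_bps(users, active_fraction, ops_per_hour,
                           objects_per_op, object_bytes):
    """Average bandwidth in bits/sec needed to deliver the requested objects."""
    bits_per_hour = (users * active_fraction * ops_per_hour
                     * objects_per_op * object_bytes * 8)
    return bits_per_hour / 3600  # 3600 seconds per hour


bw_bps = required_bandwidth_bps(100, 0.80, 100, 5, 25_600)
print(f"{bw_bps / 1e6:.2f} Mbps")
```

This is an average rate; a real link would need headroom for bursts.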
Web Infrastructure
Web infrastructure
Three major delay sources:
“last mile”
Link between end user and phone company switch, or
DSL or cable connection to service provider
ISPs
Recently, more bandwidth added
Improvements via caching, load balancing, more
servers
‘backbone’ of network
Collection of interconnected network providers
Connect to each other to exchange traffic (peering)
Public peering: at major interconnection points (NAPs,
network access points)(MAEs, Metropolitan Access
Points)
Delays may occur at peering points
Basic Components
Servers
Browsers
Firewalls
protect data, programs, and computers on private
network from the uncontrolled activities of untrusted
users and software on other computers
Screens network traffic going through it, using
Software, network hardware, computers
Potential performance bottleneck
Proxy, Cache, Mirror
Techniques for improving web performance
and security
Try to reduce
access time to web documents
Network bandwidth required for doc xfers
Demand on servers w/ very popular docs
Proxy server
Special type of web server that acts as an agent:
server to the client, client to the server
Accepts requests from clients, forwards them to
web servers
Receives responses from remote servers, forwards
them back to the client
Originally designed to provide web access for users
on private networks who had to go through a
firewall
Proxy server
Can be configured to cache relayed responses
Benefits:
Improves access speed by bringing data closer to
consumer
Cuts down on network traffic
Reduces server load
Increases availability in the web
Problems:
Ensuring that cached docs are up-to-date
What’s worth caching? For how long?
Caching
Used in the Web:
Client-side, at the browser
In the network, a caching proxy
Evaluating caching effectiveness:
Hit ratio = requests_satisfied/total_requests
Byte hit ratio = hit ratio weighted by doc size
Data transferred = bytes xferred/time
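The difference between hit ratio and byte hit ratio is easiest to see on a small request log. The log below is hypothetical data made up for this sketch, as is the function name `cache_ratios`.

```python
def cache_ratios(requests):
    """requests: iterable of (size_bytes, served_from_cache) pairs.

    Returns (hit ratio, byte hit ratio).
    """
    requests = list(requests)
    hits = sum(1 for _, hit in requests if hit)
    hit_bytes = sum(size for size, hit in requests if hit)
    total_bytes = sum(size for size, _ in requests)
    return hits / len(requests), hit_bytes / total_bytes


# Hypothetical log: three small documents served from cache, one large miss.
log = [(1_000, True), (1_000, True), (1_000, True), (7_000, False)]
hit_ratio, byte_hit_ratio = cache_ratios(log)
print(hit_ratio, byte_hit_ratio)  # 0.75 0.3
```

A cache can satisfy most requests and still save only a modest share of the bytes when the misses are the large documents.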
Example
Manager wants to install caching proxy server on
corporate intranet w/ > 2000 users
Use for 6 months -> then evaluate
Consider two cases:
Cache holds small documents, avg. size 4800 bytes, hit
ratio 60%
Cache holds medium documents, avg. size 32500 bytes,
hit ratio 20%
Monitor for one hour, observe 28800 requests
Cache efficiency
Saved_BW =
(num_req * hit_ratio * avg_size)/time
Saved_BW_small =
(28800 * 0.60 * 4800 * 8)/3600 sec = 184Kbps
Saved_BW_med =
(28800 * 0.20 * 32500*8)/3600 = 416 Kbps
Holding larger documents can save more
BW
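The saved-bandwidth formula can be written as a small function. The sketch uses 1 Kb = 1000 bits, which is what reproduces the 184 Kbps and 416 Kbps figures; the names are hypothetical.

```python
def saved_bandwidth_kbps(num_requests, hit_ratio, avg_size_bytes, seconds):
    """Bandwidth no longer needed upstream because hits are served by the cache."""
    bits_saved = num_requests * hit_ratio * avg_size_bytes * 8
    return bits_saved / seconds / 1000  # assumption: 1 Kb = 1000 bits


small = saved_bandwidth_kbps(28_800, 0.60, 4_800, 3_600)
medium = saved_bandwidth_kbps(28_800, 0.20, 32_500, 3_600)
print(f"small docs: {small:.0f} Kbps, medium docs: {medium:.0f} Kbps")
```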
Mirroring
Replicating site content at other servers
Requires:
Regular updates
DNS to direct browsers to secondary sites when
primary is busy
Goals:
Increase availability
Balance server load
Thus increasing quality of service
Example
Manufacturing co., employee portal, too
slow for European users
Idea: install mirror site in Paris
What are the bandwidth savings?
Example: Mirror site in
Paris
Current avg. BW is 35 Mbps
40% of load from Europe
42% of traffic could be served from caching
Cacheable amount: 35 * 0.42 = 14.7Mbps
Estimate cache hit ratio at 38%
Saved_BW = 14.7 Mbps * 0.38 = 5.6 Mbps
40% of traffic from Europe, so:
5.6 * 0.40 = 2.24 Mbps could be served from cache in Paris
6.4% savings on current BW usage at server
improvement in perceived response time for European users
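The chain of percentages in the Paris example can be followed step by step in Python. The variable names are invented for this sketch. Because it does not round the intermediate 5.6 Mbps, the final numbers differ slightly from the slide in the third digit.

```python
total_bw_mbps = 35.0
cacheable_share = 0.42   # share of traffic that could be cached
hit_ratio = 0.38         # estimated cache hit ratio
europe_share = 0.40      # share of load coming from Europe

cacheable_mbps = total_bw_mbps * cacheable_share   # about 14.7
saved_mbps = cacheable_mbps * hit_ratio            # about 5.6
paris_mbps = saved_mbps * europe_share             # about 2.2
savings_pct = 100 * paris_mbps / total_bw_mbps     # about 6.4

print(f"served from Paris: {paris_mbps:.2f} Mbps ({savings_pct:.1f}% of current BW)")
```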
Content Delivery
Networks (CDN)
cache or replicate content as needed to
meet demands from clients over the Web
coordinated caching systems implemented
through proprietary networks and data
centers
employ a DNS-redirecting mechanism
tries to assign best location from which to
serve the requested content
Content Delivery
Networks (CDN)
DNS-redirecting mechanism:
client requests URL; browser generates a DNS request for
the IP address corresponding to the domain name in the
URL
CDN controls the DNS service for this domain name
CDN answers the DNS request with the IP address of a
selected server rather than the IP address of the original server
uses a routing function to select “best” server:
client location, id of requested content, load of CDN
network and servers, proximity of CDN servers to client are
all considered
CDN should provide:
scalability, high availability, manageability, performance
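One way to picture the routing function is as a cost minimization over the CDN servers that hold the requested content. The sketch below is purely illustrative: the `EdgeServer` fields, the cost formula, the weight, and the addresses (taken from documentation address ranges) are all assumptions, not how any particular CDN works.

```python
from dataclasses import dataclass


@dataclass
class EdgeServer:
    name: str
    ip: str
    distance_ms: float      # estimated round-trip time to the client
    load: float             # utilization between 0.0 and 1.0
    content_ids: frozenset  # ids of the content replicated on this server


def select_server_ip(servers, content_id, origin_ip, load_weight=100.0):
    """Return the IP to place in the DNS answer for this client and content."""
    candidates = [
        s for s in servers
        if content_id in s.content_ids and s.load < 1.0
    ]
    if not candidates:
        return origin_ip  # fall back to the original server
    # Hypothetical cost: proximity plus a penalty for heavily loaded servers.
    best = min(candidates, key=lambda s: s.distance_ms + load_weight * s.load)
    return best.ip


servers = [
    EdgeServer("paris", "192.0.2.10", 10.0, 0.90, frozenset({"video-1"})),
    EdgeServer("frankfurt", "192.0.2.20", 25.0, 0.20, frozenset({"video-1"})),
]
print(select_server_ip(servers, "video-1", "198.51.100.1"))  # 192.0.2.20
```

In this made-up case the nearer server loses because it is almost saturated, which is the kind of trade-off the routing function has to weigh.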
The WAP Infrastructure
WAP = Wireless Application Protocol
architecture + set of protocols for wireless
devices to access Web services at regular Web
sites
wireless device communicates with WAP
gateway, over wireless network
WAP gateway communicates with servers
The WAP Infrastructure
Docs for wireless devices written in form of
XML known as WML (wireless markup
language)
can also use WMLscript
WML docs
structured as set of “cards”, units of user
interaction
deck = set of cards
users navigate between cards
The WAP Infrastructure
WML decks + WMLScripts
stored in regular web servers on internet
retrieved by WAP gateway via HTTP
Web server response is binary encoded by WAP
gateway and sent to wireless device via
lightweight protocols
designed to minimize BW requirements
WAP protocol stack
Server Architectures
Web Server
Application Server
Transaction and Database Server
Streaming Server
Multi-tier Architecture
Web Server
listens for HTTP requests
establishes requested connection
sends requested file
returns to listening mode
can handle more than one request at a time
fork a copy of the HTTP process for each request
multi-threaded HTTP program
pool of running processes
Dynamic content
can use client-side or server-side programs
can improve performance by pushing to
client-side
Application Server
software that handles all application
operations between browser-based
customers and back-end databases
receive client request
execute business logic, interacting with
transaction and/or DB servers
can be implemented in many ways:
CGI scripts, FastCGIs, server-applications,
server-side scripts
Transaction and
Database Server
Transaction Processing (TP) monitor
provides:
an application programming interface
a set of program development tools
a system to monitor and control execution of
transaction programs
DB server:
executes and monitors transaction processing
applications
Streaming Server
Initially, audio and video were “download
and play” technologies
Streaming media begins to play “almost”
immediately
client request arrives
server retrieves video and audio data and
begins to deliver them over the network
video and audio are compressed (MPEG, MP3)
typically have control part and data part
Example
Company plans to offer MM online training
Employee retrieves lecture of video, audio,
slides; 30 minute duration
What is the number of streaming servers
needed to serve the lecture presentation
during busiest period of the day: 4-5 pm
Example
400 employees at peak
One MM server can stream presentations to 150
viewers simultaneously
What is the average number of simultaneous
viewers during peak period?
Use Little’s Law: N = λ * R
λ = Req/time = 400 viewers/60 min
R = 30 min
N = 30 * 400/60 = 200
Need two MM servers
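The streaming-server sizing follows directly from Little's Law (N = arrival rate * residence time). A minimal sketch, with hypothetical function and variable names:

```python
import math


def avg_in_system(arrival_rate, residence_time):
    """Little's Law: average number in the system = arrival rate * time in system."""
    return arrival_rate * residence_time


arrival_rate = 400 / 60   # viewers per minute during the peak hour
lecture_minutes = 30      # each viewer stays for the whole lecture
viewers_per_server = 150

simultaneous = avg_in_system(arrival_rate, lecture_minutes)
servers_needed = math.ceil(simultaneous / viewers_per_server)

print(f"{simultaneous:.0f} simultaneous viewers, {servers_needed} servers")
```

Note that 200 is an average; the peak within the hour could be higher if arrivals are bunched.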
Multi-tier Architecture
web-based apps usually in 3-tier
architecture:
presentation layer
user interface (browser & HTML, XML, etc.)
application layer
business logic
collection of rules to implement application logic
may also contain Java applets, ActiveX controls, etc.
data service layer
persistent data
Example
application layer designed to support 400
simultaneous processes
app process:
receives client request
executes app logic, interacting with DB server
Monitoring shows:
app process executes for 150 msec between DB requests
DB server handles 550 req/sec
400 app processes running during peak period
What if ...?
the application servers are replaced by new
servers with 2X speed
Each application server characterized by Z,
“think time” – time between receiving a
reply from the DB server and submitting a
new DB request
DB layer, characterized by throughput, X,
in req/sec
R = N/X - Z
What if ...?
DB response time:
R = 400/550 – 0.15 = 577 msec = 0.577 sec
after cpu upgrade, app processing time
should be 75 msec
DB response time now:
Rnew = 400/550 – 0.075 = 652 msec = 0.652 sec
Improvement in app layer may not lead to
improvement overall
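The what-if analysis applies R = N/X - Z twice. The sketch below uses the 550 req/sec throughput that appears in the slide's own calculation and assumes, as the slide does, that the DB throughput stays the same after the upgrade. Function and argument names are invented.

```python
def db_response_time(n_processes, throughput, think_time):
    """Interactive Response Time Law: R = N/X - Z (seconds)."""
    return n_processes / throughput - think_time


r_before = db_response_time(400, 550, 0.150)  # app processes think for 150 msec
r_after = db_response_time(400, 550, 0.075)   # 2X faster app servers: 75 msec

print(f"before: {r_before * 1000:.0f} msec, after: {r_after * 1000:.0f} msec")
```

With the DB server already running at its maximum rate, faster application servers only submit requests sooner, so the time saved in the application layer reappears as extra waiting at the database.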
Dynamic Load Balancing
heavy traffic load adversely impacting
performance
add more servers
buy bigger (faster) servers
need to do cost-performance analysis
Dynamic Load Balancing
web cluster:
multiple web servers
single location addressed by one URL and a
single virtual IP address
incoming requests routed among servers in
user-transparent way
switch acts as dispatcher, mapping virtual IP
address to actual address
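The dispatcher role of the switch can be sketched as a mapping from the virtual IP to one of the real servers. The policy shown here (send each request to the server with the fewest active connections) is just one possible choice picked for illustration; the slides do not name a policy, and the class, method names, and addresses are all hypothetical.

```python
class Dispatcher:
    """Maps requests for one virtual IP onto a pool of real servers."""

    def __init__(self, virtual_ip, real_ips):
        self.virtual_ip = virtual_ip
        self.active = {ip: 0 for ip in real_ips}  # open connections per server

    def route(self, dest_ip):
        """Pick the real server for a new request sent to the virtual IP."""
        if dest_ip != self.virtual_ip:
            raise ValueError("request is not addressed to the cluster")
        target = min(self.active, key=self.active.get)  # least connections
        self.active[target] += 1
        return target

    def finish(self, real_ip):
        """Record that a connection to real_ip has closed."""
        self.active[real_ip] -= 1


cluster = Dispatcher("203.0.113.10", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
first_three = [cluster.route("203.0.113.10") for _ in range(3)]
cluster.finish("10.0.0.2")
print(first_three, cluster.route("203.0.113.10"))
```

Clients only ever see the virtual address, so servers can be added to or removed from the pool without changing the URL.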
Web cluster
Networks
Bandwidth
measures the rate at which data can be sent
through the network
usually expressed in bps
Latency
time needed for a bit (or small packet) to
travel across the network
Bandwidth for different
types of networks
Planning
Streaming service offers training videos
training session -> 15 min video at 300 Kbps
What impact if videos go to 25 min?
Service supports 35 simultaneous sessions
Average BW needed (now)
35 * 300 Kbps = 10.5 Mbps
Average number simult. sessions (now)
N = 35
N = λ * R
35 = λ * 15
λ = 35/15 .. assume this remains the same
Nnew = λ * 25 = 35/15 * 25 = 58.33
Average BW needed (new)
58.33 * 300 Kbps = 17.5 Mbps
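The planning step can be sketched as follows: derive the arrival rate from the current load with Little's Law, hold it constant, and recompute sessions and bandwidth for the longer videos. Names are hypothetical, and 1 Mbps is taken as 1000 Kbps.

```python
def plan_for_longer_videos(current_sessions, current_minutes,
                           new_minutes, kbps_per_session):
    """Return (new average simultaneous sessions, new bandwidth in Mbps)."""
    arrival_rate = current_sessions / current_minutes  # sessions started per minute
    new_sessions = arrival_rate * new_minutes          # Little's Law with new R
    return new_sessions, new_sessions * kbps_per_session / 1000


new_sessions, new_bw_mbps = plan_for_longer_videos(35, 15, 25, 300)
print(f"{new_sessions:.2f} sessions, {new_bw_mbps:.1f} Mbps")
```

Since the service currently supports 35 simultaneous sessions, the longer videos would need both more session capacity and roughly two thirds more bandwidth.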
Example
training videos, avg. size 950 MB
100 students, 80% active at one time
Each user requests 2 clips/hour
BW needed to support:
( 0.80 * 100) * 2 * (8 * 950)/3600 sec
337.7 Mbps
Need a 622 Mbps ATM network to support