Performance Issues of Web Services

CSCI 8710
November 29-30, 2006
Kraemer
Web Services
Services available via the Internet that
complete tasks or conduct transactions.
Self-contained, modular applications that
can be described, published, and invoked
over the Internet.
Can be automatically invoked by
application programs.
Web Services
May be invoked at one site or may combine
results of several services executed at
different sites.
Performance concerns
differ from standard C/S
May involve both web service processing
and network delays
May be accessed by wide variety of devices
-- desktop computers, PDAs, mobile
phones, other servers
Access via wireless communication
networks: dynamic connectivity, low
bandwidth, high latency
Performance concerns
differ from standard C/S
Unpredictable nature of requests
Highly bursty
Varies with geographical location of clients, day
of week, time of day
Highly variable size of requested objects
“Robot” access
Autonomous software agents that can consume
significant amounts of system resources
Types of servers
providing Web Services
Web servers
Transaction servers
Proxy servers
Cache servers
Wireless gateway servers
Mirror servers
Common problems
Insufficient bandwidth at peak times
Overloaded servers
Uneven server loads
Delivery of dynamic content
Shortage of connections between
application servers and database servers
Failure of third-party servers
Delivery of multi-media content
Example:
Bill Paying Service
Portal offers bill paying service
Customers can pay variety of bills through
the service
Uses services provided by others:
Debit authorization (100 tps capability)
Electronic funds transfer
Customer authentication
Example:
Bill Paying Service
• Portal B is bill paying service
• Treat overall web service as ‘system’
• Treat component services as ‘devices’
• What is the capacity of B, given that the debit authorization service can support 100 tps and that each payment transaction requires 2 visits to the debit authorization service?
• Forced Flow Law: Xi = Vi * X0
• 100 = 2 * X0
• X0 = 50 tps
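The capacity calculation above can be sketched in Python. This is a minimal sketch that assumes the debit authorization service is the only limiting device; the function name is illustrative and not from the slides.

```python
def max_system_throughput(device_capacity_tps: float, visits_per_transaction: float) -> float:
    """Forced Flow Law, Xi = Vi * X0, solved for the system throughput X0."""
    return device_capacity_tps / visits_per_transaction


# Debit authorization supports 100 tps; each payment visits it twice.
portal_capacity_tps = max_system_throughput(100, 2)
print(portal_capacity_tps)
```

If several component services had known capacities, the same function could be applied to each, and the smallest result would bound the portal.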
Web server elements
HTML and XML
• Most documents on the Web written using HTML “markup language”
• Most consist of text and inline images
• Can also include other multimedia objects
• Generates multiple requests: for document and for each inline image -- single click by user may generate series of requests
• XML uses tags and attributes to define/delimit data
• Application must interpret meaning of the tags
Hardware and Operating
System
Hardware view: performance a function of:
Number and speed of processors
Amount of main memory
Bandwidth and storage capacity of disk
subsystem
Bandwidth of the NIC
OS considerations:
Performance, scalability, reliability, robustness
Content
Performance affected by:
Content size
Content structure
Hyperlinks
Popularity of content
Perception of
Performance
User view:
Fast response time; no connections refused
Management view:
High throughput; high availability
Need to have quantitative measurements
that describe behavior of Web service
Metrics
Two most important:
Response time -- seconds
Throughput -- http_ops/sec, also bits/sec
Other metrics
• Hit
  - any connection to a web site, including in-line requests and errors
  - difficult to compare across sites
• Visit
  - Series of page requests by a user at a single site
  - Inter-request times < timeout_value
• Session
  - Series of consecutive and related requests made during a single visit
  - Inter-request times < timeout_value
Other metrics
• User-perceived response time
  - Set of geographically distributed agents poll the WS
• Error rate
  - Increase indicates degrading performance
  - Examples:
    - Overflow of pending connection queue
• For streaming services:
  - Jitter
  - Startup latency
Most common
measurements of Web
service performance
End-to-end response time
Site response time
Throughput (req/sec)
Throughput (Mbps)
Errors/sec
Visitors/day
Unique visitors/day
Example - Travel Agency
Monitor for 30 minutes:
9000 HTTP requests
Three types of objects delivered:
Html pages (30%, avg. size 11,200 bytes)
Images (65%, avg. size 17,200 bytes)
Video clips (5%, avg. size 439,000 bytes)
What is the throughput:
9000 requests/1800 sec = 5 req/sec
What is the throughput in Kbps?
Throughput in Kbps?
• Xr = (total_req * class% * avg. size * 8 bits/byte)/(time * 1024), taking 1 Kbps = 1024 bps
• Xhtml = (9000 * 0.30 * 11,200 * 8)/(1800 * 1024) = 131.25
• Ximage = (9000 * 0.65 * 17,200 * 8)/(1800 * 1024) = 436.72
• Xvideo = (9000 * 0.05 * 439,000 * 8)/(1800 * 1024) = 857.42
• X0 = 131.25 + 436.72 + 857.42
• X0 = 1425.39 Kbps
To support the Web traffic, the network connection
should be at least a T1 line (1.544 Mbit/s).
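The per-class throughput arithmetic can be checked with a short Python sketch. It assumes 1 Kbps = 1024 bps, which is what the slide's figures imply; the names below are illustrative.

```python
BITS_PER_BYTE = 8
BITS_PER_KBIT = 1024  # assumption: the slide's results imply 1 Kbps = 1024 bps


def class_throughput_kbps(total_requests, fraction, avg_size_bytes, interval_sec):
    """Throughput contributed by one object class, in Kbps."""
    bits = total_requests * fraction * avg_size_bytes * BITS_PER_BYTE
    return bits / interval_sec / BITS_PER_KBIT


# (fraction of requests, average size in bytes) per object class
classes = {"html": (0.30, 11_200), "image": (0.65, 17_200), "video": (0.05, 439_000)}

per_class_kbps = {
    name: class_throughput_kbps(9000, fraction, size, 1800)
    for name, (fraction, size) in classes.items()
}
total_kbps = sum(per_class_kbps.values())

for name, kbps in per_class_kbps.items():
    print(f"{name}: {kbps:.2f} Kbps")
print(f"total: {total_kbps:.2f} Kbps")
```

Video clips are only 5% of the requests but account for more than half of the bits, which is why the average object size per class matters more than the request count.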
QoS indicators for
Web Services
• Response time
• Availability
  - Percentage of time a service is ‘live’ (serving customer requests)
• Reliability
  - Probability that WS will perform in satisfactory manner for a given period of time under specified operating and load conditions
• Predictability
• Cost
Input data needed to
monitor QoS
Traffic
Performance
Usage patterns
Knowledge of average and peak load
Where are the delays?
Four categories:
DNS lookup phase
TCP connection set-up phase
Server execution time
Network time
DNS lookup phase
• Browser converts server name in URL into an IP address to establish the TCP connection
• If server name can’t be resolved by local cache, send query to higher-level DNS server
• For leading e-commerce sites, avg. lookup times are between 0.01 and 0.11 sec. Fastest sites achieve 0.001 sec.
Anatomy of a Web
transaction
Browser
Network
Server
Anatomy of a Web
Transaction:
the Browser
• User clicks on hyperlink; requests document
• Client (browser) checks local cache for document
• In case of hit:
  - returns document; user response time R’Browser,hit*
• In case of miss:
  - Browser asks DNS to map server hostname to IP address
  - Client opens a TCP connection to the server defined by the URL of the link
  - Client sends an HTTP request to the server
  - Browser formats and displays document and renders images
  - Returned document is stored in browser cache
  - User response time: R’Browser,miss*
Anatomy of a Web
Transaction:
the Network
Imposes delays in delivering info from
client to server (R’N1) and from server to
client (R’N2).
Delays a function of components on path
between them:
Modems, routers, comm links, bridges, relays
R’Network
= total time HTTP request spends in the network
= R’N1 + R’N2
Anatomy of a Web
transaction:
the Server
• request arrives from client
• server parses the request according to the HTTP protocol
• server executes requested method (GET, HEAD, etc.)
• if GET
  - server looks up file in its document tree by using the file system; file may be in cache or on disk
  - server reads contents of file from disk or cache and writes it to network port
• when file transfer is complete, close the connection (if nonpersistent HTTP)
• R’server = time spent in execution of HTTP request
  - includes service time and waiting time at the server
Anatomy of a Web
transaction
• If document not found in client’s cache:
  - response time is sum of residence time at all resources
  - Rmiss = R’Browser,miss + R’Network + R’Server
• If a hit
  - Rhit = R’Browser,hit
• Typically:
  - Rhit << Rmiss
• Average response time, R, over NT requests, where pc is the fraction of requests served from the browser cache:
  - R = pc * Rhit + (1 - pc) * Rmiss
Example
User wants to analyze impact of local
cache size of browser on Web response
time perceived by user
20% of requests serviced by local cache with
R=400 msec
R for remotely serviced requests = 3 sec
Previous expts. indicate that 3x cache size
results in hit rate of 45%
R_orig=0.20 * 0.4 + 0.80 * 3.0 = 2.48 sec
R_new = 0.45 * 0.4 + 0.55 * 3.0 = 1.83 sec
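The cache-size comparison can be reproduced with a few lines of Python. This sketch only restates the weighted average R = pc * Rhit + (1 - pc) * Rmiss; the function name is an illustrative choice.

```python
def average_response_time(hit_ratio, r_hit_sec, r_miss_sec):
    """Average response time as a hit/miss weighted mean, in seconds."""
    return hit_ratio * r_hit_sec + (1.0 - hit_ratio) * r_miss_sec


r_orig = average_response_time(0.20, 0.4, 3.0)  # current cache, 20% hits
r_new = average_response_time(0.45, 0.4, 3.0)   # 3x cache, 45% hits

print(f"R_orig = {r_orig:.2f} sec, R_new = {r_new:.2f} sec")
```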
Bottlenecks
bottleneck = the component that limits
system performance
Need to identify the bottleneck to improve
performance
Example
home user
takes too long to download medium-size page
(avg. size 20KB)
considering upgrading to a computer w/ 2X faster
CPU
How will this affect response time?
Example, continued
Assume:
R’network = 7.5 sec
R’server = 3.6 sec
R’Browser, miss = 0.3 sec
R = R’network + R’server +R’Browser, miss
R = 7.5 sec + 3.6 sec + 0.3 sec = 11.4 sec
Rnew = 7.5 + 3.6 + 0.15 = 11.25 sec
not much difference … CPU not the bottleneck
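A short Python sketch of the same what-if. It assumes, as the slide does, that the browser residence time is all CPU time and halves with a 2X faster CPU; treating the largest residence time as the bottleneck is a simplification for this example.

```python
# Residence times in seconds, taken from the slide
residence = {"network": 7.5, "server": 3.6, "browser": 0.3}


def total_response_time(times):
    """Response time as the sum of residence times at all resources."""
    return sum(times.values())


# Assumption: a 2X faster client CPU halves only the browser time.
upgraded = dict(residence, browser=residence["browser"] / 2)

r_before = total_response_time(residence)
r_after = total_response_time(upgraded)
largest = max(residence, key=residence.get)

print(f"before: {r_before:.2f} sec, after: {r_after:.2f} sec")
print(f"largest contributor: {largest}")
```

The upgrade removes 0.15 sec out of 11.4 sec, while the network accounts for about two thirds of the total.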
Example
Pharma co. plans intranet for training and
display of images of molecules
training sessions have 100 people
assume 80% active at any one time
Each user performs avg. of 100 ops/hour
Each op requests avg. of 5 images
Avg. size of requested image is 25600 bytes
What is minimum bandwidth of network
connection to image server?
Example, continued
100 users * 0.80 * 100 ops/hour * 5 images/op * 25600
bytes/image * 8 bits/byte * 1 hr/3600 sec
= (100 * 0.80 * 100 * 5 * 25600 * 8)/3600 = 2.28 Mbps
Web infrastructure
Three major delay sources:
“last mile”
Link between end user and phone company switch, or
DSL or cable connection to service provider
ISPs
Recently, more bandwidth added
Improvements via caching, load balancing, more
servers
‘backbone’ of network
Collection of interconnected network providers
  - Connect to each other to exchange traffic (peering)
  - Public peering: at major interconnection points (NAPs, network access points; MAEs, Metropolitan Area Exchanges)
  - Delays may occur at peering points
Basic Components
• Servers
• Browsers
• Firewalls
  - protect data, programs, and computers on private network from the uncontrolled activities of untrusted users and software on other computers
  - Screen network traffic going through them, using software, network hardware, computers
  - Potential performance bottleneck
Proxy, Cache, Mirror
Techniques for improving web performance
and security
Try to reduce
access time to web documents
Network bandwidth required for doc xfers
Demand on servers w/ very popular docs
Proxy server
• Special type of web server that acts as an agent: server to the client, client to the server
• Accepts requests from clients, forwards them to web servers
• Receives responses from remote servers, forwards them back to the client
• Originally designed to provide web access for users on private networks who had to go through a firewall
Proxy server
• Can be configured to cache relayed responses
• Benefits:
  - Improves access speed by bringing data closer to consumer
  - Cuts down on network traffic
  - Reduces server load
  - Increases availability in the web
• Problems:
  - Ensuring that cached docs are up-to-date
  - What’s worth caching? For how long?
Caching
Used in the Web:
Client-side, at the browser
In the network, a caching proxy
Evaluating caching effectiveness:
Hit ratio = requests_satisfied/total_requests
Byte hit ratio = hit ratio weighted by doc size
Data transferred = bytes xferred/time
Example
• Manager wants to install caching proxy server on corporate intranet w/ > 2000 users
• Use for 6 months -> then evaluate
• Consider two cases:
  - Cache holds small documents, avg. size 4800 bytes, hit ratio 60%
  - Cache holds medium documents, avg. size 32500 bytes, hit ratio 20%
• Monitor for one hour, observe 28800 requests
Cache efficiency
Saved_BW =
(num_req * hit_ratio * avg_size * 8 bits/byte)/time
Saved_BW_small =
(28800 * 0.60 * 4800 * 8)/3600 sec = 184 Kbps
Saved_BW_med =
(28800 * 0.20 * 32500 * 8)/3600 sec = 416 Kbps
Holding larger documents can save more
BW
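The two cache configurations can be compared with a short Python sketch. It assumes 1 Kbps = 1000 bps, which matches the 184 and 416 Kbps figures on the slide; the function name is illustrative.

```python
def saved_bandwidth_kbps(num_requests, hit_ratio, avg_size_bytes, interval_sec):
    """Bandwidth no longer fetched from origin servers thanks to cache hits, in Kbps."""
    saved_bits = num_requests * hit_ratio * avg_size_bytes * 8
    return saved_bits / interval_sec / 1000


saved_small = saved_bandwidth_kbps(28_800, 0.60, 4_800, 3600)
saved_medium = saved_bandwidth_kbps(28_800, 0.20, 32_500, 3600)

print(f"small docs:  {saved_small:.0f} Kbps")
print(f"medium docs: {saved_medium:.0f} Kbps")
```

The lower hit ratio still wins on saved bandwidth because each hit avoids transferring a much larger document, which is the distinction between hit ratio and byte hit ratio.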
Mirroring
Replicating site content at other servers
Requires:
Regular updates
DNS to direct browsers to secondary sites when
primary is busy
Goals:
Increase availability
Balance server load
Thus increasing quality of service
Example
Manufacturing co., employee portal, too
slow for European users
Idea: install mirror site in Paris
What are the bandwidth savings ?
Example: Mirror site in
Paris
• Current avg. BW is 35 Mbps
• 40% of load from Europe
• 42% of traffic could be served from caching
  - Cacheable amount: 35 * 0.42 = 14.7 Mbps
• Estimate cache hit ratio at 38%
  - Saved_BW = 14.7 Mbps * 0.38 = 5.6 Mbps
• 40% of traffic from Europe, so:
  - 5.6 * 0.40 = 2.24 Mbps could be served from cache in Paris
  - 2.24/35 = 6.4% savings on current BW usage at server
  - improvement in perceived response time for European users
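The chain of percentages in the mirror-site estimate can be sketched in Python. The variable names are illustrative; the slide rounds the intermediate result to 5.6 Mbps, so its 2.24 Mbps differs slightly from the unrounded value computed here.

```python
current_bw_mbps = 35.0
cacheable_fraction = 0.42   # share of traffic that could be served from a cache
cache_hit_ratio = 0.38      # estimated hit ratio
europe_fraction = 0.40      # share of load coming from Europe

cacheable_mbps = current_bw_mbps * cacheable_fraction
saved_mbps = cacheable_mbps * cache_hit_ratio
paris_mbps = saved_mbps * europe_fraction
savings_pct = 100 * paris_mbps / current_bw_mbps

print(f"served from Paris: {paris_mbps:.2f} Mbps ({savings_pct:.1f}% of current BW)")
```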
Content Delivery
Networks (CDN)
cache or replicate content as needed to
meet demands from clients over the Web
coordinated caching systems implemented
through proprietary networks and data
centers
employ a DNS-redirecting mechanism
tries to assign best location from which to
serve the requested content
Content Delivery
Networks (CDN)
• DNS-redirecting mechanism:
  - client requests URL; browser generates a DNS request for the IP address corresponding to the domain name in the URL
  - CDN controls the DNS service for this domain name
  - CDN answers DNS requests with the IP address of a selected server rather than the IP address of the original server
  - uses a routing function to select “best” server:
    - client location, id of requested content, load of CDN network and servers, proximity of CDN servers to client are all considered
• CDN should provide:
  - scalability, high availability, manageability, performance
The WAP Infrastructure
WAP = Wireless Application Protocol
architecture + set of protocols for wireless
devices to access Web services at regular Web
sites
wireless device communicates with WAP
gateway, over wireless network
WAP gateway communicates with servers
The WAP Infrastructure
Docs for wireless devices written in form of
XML known as WML (wireless markup
language)
can also use WMLscript
WML docs
structured as set of “cards”, units of user
interaction
deck = set of cards
users navigate between cards
The WAP Infrastructure
WML decks + WMLScripts
stored in regular web servers on internet
retrieved by WAP gateway via HTTP
Web server response is binary encoded by WAP
gateway and sent to wireless device via
lightweight protocols
designed to minimize BW requirements
WAP protocol stack
Server Architectures
Web Server
Application Server
Transaction and Database Server
Streaming Server
Multi-tier Architecture
Web Server
• listens for HTTP requests
• establishes requested connection
• sends requested file
• returns to listening mode
• can handle more than one request at a time
  - fork a copy of the HTTP process for each request
  - multi-threaded HTTP program
  - pool of running processes
Dynamic content
can use client-side or server-side programs
can improve performance by pushing to
client-side
Application Server
software that handles all application
operations between browser-based
customers and back-end databases
receive client request
execute business logic, interacting with
transaction and/or DB servers
can be implemented in many ways:
CGI scripts, FastCGIs, server-applications,
server-side scripts
Transaction and
Database Server
Transaction Processing (TP) monitor
provides:
an application programming interface
a set of program development tools
a system to monitor and control execution of
transaction programs
DB server:
executes and monitors transaction processing
applications
Streaming Server
Initially, audio and video were “download
and play” technologies
Streaming media begins to play “almost”
immediately
client request arrives
server retrieves video and audio data and
begins to deliver them over the network
video and audio are compressed (MPEG, MP3)
typically have control part and data part
Example
Company plans to offer MM online training
Employee retrieves lecture of video, audio,
slides; 30 minute duration
What is the number of streaming servers
needed to serve the lecture presentation
during busiest period of the day: 4-5 pm
Example
• 400 employees at peak
• One MM server can stream presentations to 150 viewers simultaneously
• What is the average number of simultaneous viewers during peak period?
• Use Little’s Law: N = λ * R
• λ = Req/time = 400 viewers/60 min
• R = 30 min
• N = 30 * 400/60 = 200
• Need two MM servers (200/150 = 1.33, rounded up)
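The server count follows from Little's Law, N = λ * R. A minimal Python sketch, with illustrative names; note that it sizes for the average number of viewers, so momentary peaks above the average are not covered.

```python
import math


def average_in_system(arrival_rate, residence_time):
    """Little's Law: N = lambda * R (rate and time must use the same time unit)."""
    return arrival_rate * residence_time


arrival_rate_per_min = 400 / 60   # 400 viewers arrive during the peak hour
lecture_minutes = 30              # each viewer stays for the whole lecture

simultaneous_viewers = average_in_system(arrival_rate_per_min, lecture_minutes)
servers_needed = math.ceil(simultaneous_viewers / 150)  # 150 viewers per server

print(f"{simultaneous_viewers:.0f} viewers on average, {servers_needed} servers")
```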
Multi-tier Architecture
web-based apps usually in 3-tier
architecture:
presentation layer
user interface (browser & HTML, XML, etc.)
application layer
business logic
  - collection of rules to implement application logic
  - may also contain Java applets, ActiveX controls, etc.
data service layer
persistent data
Example
• application layer designed to support 400 simultaneous processes
• app process:
  - receives client request
  - executes app logic, interacting with DB server
• Monitoring shows:
  - app process executes for 150 msec between DB requests
  - DB server handles 550 req/sec
  - 400 app processes running during peak period
What if??
the application servers are replaced by new
servers with 2X speed
Each application server characterized by Z,
“think time” – time between receiving a
reply from the DB server and submitting a
new DB request
DB layer, characterized by throughput, X,
in req/sec
R = N/X - Z
What if ...?
DB response time:
R = 400/550 – 0.15 = 577 msec = 0.577 sec
after CPU upgrade, app processing time
should be 75 msec
DB response time now:
Rnew = 400/550 – 0.075 = 652 msec = 0.652 sec
Improvement in app layer may not lead to
improvement overall
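The what-if can be reproduced with R = N/X - Z. This sketch assumes, as the slide's calculation does, that the DB server stays saturated at 550 req/sec after the upgrade; the function name is illustrative.

```python
def db_response_time(n_processes, db_throughput, think_time_sec):
    """R = N/X - Z, with think time Z in seconds."""
    return n_processes / db_throughput - think_time_sec


# Assumption: DB throughput stays at 550 req/sec in both cases.
r_before = db_response_time(400, 550, 0.150)
r_after = db_response_time(400, 550, 0.075)   # app servers 2X faster

print(f"before: {r_before * 1000:.0f} msec, after: {r_after * 1000:.0f} msec")
```

With N and X fixed, the cycle time N/X is unchanged, so the 75 msec saved in the application layer reappears as extra waiting at the DB server.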
Dynamic Load Balancing
heavy traffic load adversely impacting
performance
add more servers
buy bigger (faster) servers
need to do cost-performance analysis
Dynamic Load Balancing
web cluster:
multiple web servers
single location addressed by one URL and a
single virtual IP address
incoming requests routed among servers in
user-transparent way
switch acts as dispatcher, mapping virtual IP
address to actual address
Web cluster
Networks
Bandwidth
measures the rate at which data can be sent
through the network
usually expressed in bps
Latency
time needed for a bit (or small packet) to
travel across the network
Bandwidth for different
types of networks
Planning
• Streaming service offers training videos
• training session -> 15 min video at 300 Kbps
• What impact if videos go to 25 min?
• Service supports 35 simultaneous sessions
• Average BW needed (now)
  - 35 * 300 Kbps = 10.5 Mbps
• Average number simult. sessions (now)
  - N = 35
  - N = λ * R
  - 35 = λ * 15
  - λ = 35/15 sessions/min .. assume this remains the same
  - Nnew = λ * 25 = 35/15 * 25 = 58.33
• Average BW needed (new)
  - 58.33 * 300 Kbps = 17.5 Mbps
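The planning estimate can be sketched in Python. It assumes the session arrival rate stays the same when the videos get longer, as stated on the slide, and takes 1 Mbps = 1000 Kbps; the function name is illustrative.

```python
def plan_streaming(sessions_now, duration_now_min, duration_new_min, stream_kbps):
    """Return (new average sessions, new average bandwidth in Mbps) via Little's Law."""
    arrival_rate = sessions_now / duration_now_min   # lambda = N / R
    sessions_new = arrival_rate * duration_new_min   # N_new = lambda * R_new
    bandwidth_mbps = sessions_new * stream_kbps / 1000
    return sessions_new, bandwidth_mbps


sessions_new, bandwidth_new_mbps = plan_streaming(35, 15, 25, 300)
print(f"{sessions_new:.2f} sessions, {bandwidth_new_mbps:.1f} Mbps")
```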
Example
training videos, avg. size 950 MB
100 students, 80% active at one time
Each user requests 2 clips/hour
BW needed to support:
( 0.80 * 100) * 2 * (8 * 950)/3600 sec
337.7 Mbps
Need a 622 ATM network to support