Protocols and Interaction Models for Web Services
CSCI 8710
Fall 2006
Outline
Networks
Client/Server
Peer-to-Peer
Web Service Protocols
Networks
Originated with ARPANET
Packet-switched
Started in late 1960s
ARPA - Advanced Research Projects Agency (now
known as DARPA)
Then: about 10 nodes (~100 million today)
Goal: resource sharing
Result: reached goal + demonstrated
importance of networks as tool for
communication and interaction via email
Networks
Kleinrock at UCLA
Did much of the early work in queueing models
of networks, measurement and management of
networks and network protocols
Networks
1983: ARPANET split into two networks:
MILNET (military purposes)
ARPANET (reduced version)
Term “Internet” first introduced
-- came into being in 1983
Today’s Internet
Worldwide collection of interconnected
WANs
Characterized by:
IP - networking protocol
TCP - process-to-process protocol
Types of Networks
WANs
LANs
LAN-to-WAN connection
Home-to-WAN connection
WANs
Wide-area network
City-wide, multi-city, country-wide, continent …
Uses packet switching:
Messages transmitted between hosts are broken down
into chunks called packets with some max size
Packet header has routing and sequencing info
Consists of:
Packet switches or routers
High speed links connecting the routers
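A minimal Python sketch of the packet idea on this slide. The header fields (`dst`, `seq`, `total`) and the 8-byte maximum payload are assumptions chosen for illustration, not a real WAN packet format.

```python
import random
from dataclasses import dataclass

MAX_PAYLOAD = 8  # assumed maximum packet payload, in bytes


@dataclass
class Packet:
    dst: str        # routing info: destination host (hypothetical field)
    seq: int        # sequencing info: position of this chunk in the message
    total: int      # number of packets that make up the message
    payload: bytes


def packetize(message: bytes, dst: str) -> list[Packet]:
    """Break a message into packets no larger than MAX_PAYLOAD."""
    chunks = [message[i:i + MAX_PAYLOAD] for i in range(0, len(message), MAX_PAYLOAD)]
    return [Packet(dst, seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets: list[Packet]) -> bytes:
    """Rebuild the message using the sequence numbers in the headers."""
    ordered = sorted(packets, key=lambda p: p.seq)
    if len(ordered) != ordered[0].total:
        raise ValueError("missing packets")
    return b"".join(p.payload for p in ordered)


packets = packetize(b"resource sharing over a WAN", dst="hostB")
random.shuffle(packets)  # packets may arrive out of order
message = reassemble(packets)
```

The receiver recovers the original message even after the shuffle, because the order is carried in each header rather than implied by arrival time.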
Routers
Communication computers that:
Store incoming packets
Examine headers
Look up routing tables
Decide next router to send packet to
Place packet on output queue for selected link
Known as “store and forward”
Technologies used to
build WANs
X.25 - a standard of the ITU (International
Telecommunication Union)
ISDN - Integrated Services Digital Network
Service offered by telephone companies;
integrates voice and data over ordinary
telephone lines
Frame Relay
A high-speed WAN service offered by long-distance carriers
Technologies used to
build WANs (cont’d)
SMDS - Switched Multi-megabit Data
Service
Another high-speed WAN service offered by
long-distance carriers
ATM - Asynchronous Transfer Mode
Packet-switched technology that uses small
fixed-size packets (53 bytes), called cells, to
provide fast switching to voice, video, and data
over WANs
LANs
Typically confined to a building or set of
closely located buildings
Both wired and wireless
Most popular:
10-Mbps Ethernet
100-Mbps Ethernet
4- or 16-Mbps Token Ring
100-Mbps FDDI
Ethernet
Invented by Metcalfe in the early 1970s
Has a bus topology:
[Figure: computers attached through network interface cards (NICs) to a shared bus]
Ethernet
Computers connect to a shared coaxial
cable through NICs
Packets transmitted by one NIC can be
received by all others: broadcast
communication
But, since packets have destination addresses,
only destination NIC will (typically) copy the
packet to the computer’s main memory
Ethernet
All NICs can try to talk at once
which reminds EK of a faculty meeting ;^)
No central coordinator
CSMA/CD is used:
Carrier Sense Multiple Access with Collision Detection
NIC that wants to transmit “listens” to see if a
transmission is in progress (carrier sense)
If so, it waits
But two may start “talking” at about the same time
(collision)
Collisions are detected and NICs stop “talking”, wait for
a randomly selected time period (so they don’t just
collide again), and then try to retransmit
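A simplified sketch of the random wait after a collision. The slide only says the wait is "randomly selected"; drawing uniformly from a fixed range of `max_slots` slots is an assumption made here (real Ethernet uses binary exponential backoff, where the range grows after each collision).

```python
import random


def resolve_collision(stations, max_slots=8, rng=None):
    """Repeat random backoff rounds until one station picks the earliest slot alone.

    Returns (winner, rounds). max_slots is an assumed backoff range.
    """
    rng = rng or random.Random()
    rounds = 0
    while True:
        rounds += 1
        # Every collided station waits a randomly selected number of slots.
        waits = {s: rng.randrange(max_slots) for s in stations}
        earliest = min(waits.values())
        first = [s for s, w in waits.items() if w == earliest]
        if len(first) == 1:
            # One station starts alone; the others sense the carrier and wait.
            return first[0], rounds
        # Two or more picked the same earliest slot: they collide again.


winner, rounds = resolve_collision(["A", "B", "C"], rng=random.Random(1))
```

If every station waited the same fixed time, the loop would never end; the randomness is what breaks the tie.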
Ethernet
What happens as number of nodes and/or
traffic increases?
probability of collision increases
Network throughput decreases
Because more bandwidth spent on collisions and
retransmissions
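A rough numeric illustration of the trend above. The model is an assumption, not CSMA/CD itself: time is divided into slots, and each of `n` stations transmits in a slot independently with probability `p`. A slot is useful only when exactly one station transmits.

```python
def slot_outcomes(n, p):
    """Return (P[success], P[collision]) for one slot with n stations,
    each transmitting independently with probability p (assumed model)."""
    idle = (1 - p) ** n
    success = n * p * (1 - p) ** (n - 1)
    return success, 1 - idle - success


for n in (2, 5, 10, 20, 50):
    success, collision = slot_outcomes(n, p=0.1)
    print(f"n={n:2d}  useful slots={success:.2f}  collisions={collision:.2f}")
```

In this model the collision share grows steadily with `n`, while the useful share rises at first and then falls once the channel is overloaded.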
Token Ring
Invented at IBM Research Labs
Based on a ring topology:
[Figure: computers attached through network interface cards (NICs) to a ring]
Token Ring
Access to ring is controlled by a token
(special bit pattern that circulates in the
ring)
NIC with token can transmit
If NIC has nothing to transmit, just passes
the token
Token Ring
Sender (NIC with token) inserts the bits
representing its packet into the ring
Packet goes around the ring, is copied by
the NIC of the destination address
Packet flows back around to sender; sender
removes packet and performs error
checking (compares received packet with
sent packet)
Token Ring
As more stations are added to the token
ring …
Delay in obtaining token increases
Token must circulate through more NICs
Probability that token is used by other NICs increases
FDDI
Uses optical fibers and token-passing
Differs in that it uses 2 rings
[Figure: computers attached through network interface cards (NICs) to two rings]
FDDI
Benefits of 2nd ring?
Data flows in opposite directions in the 2 rings
If a station fails, the hardware can reconfigure
the ring and turn it into a single functioning
ring by bypassing the malfunctioning station
LANs: limits on size
Two types of limits:
Physical limits
Performance limits
Examples:
Ethernet physical limits:
Cable no more than 500 m in length
Minimum separation of 3 m between stations
Managing performance
limits on LANs
Number of stations may be limited because
of adverse impact of additional stations on
overall performance
Approach:
Can divide larger LAN into LAN segments with
fewer stations each
Segments joined by connecting devices such as
routers and bridges
Stations that communicate frequently should be
in same segment
Wireless LANs
Stations communicate via RF (radio
frequency)
Modulation of transmitted wave is interpreted
as sequence of 0s and 1s
IEEE standard for wireless LANs is the IEEE 802.11
protocol
Can transmit data at 1 or 2 Mbps depending on
underlying modulation technology
IEEE 802.11 protocol
Provides a carrier sense signal that
indicates if a transmission is in progress
Data sent by one station can be received by
all stations in the coverage area
Subject to the “hidden terminal” problem
Happens when walls or other structures
obstruct the RF signals
Station C may hear both A and B, while A and B
cannot hear one another
IEEE 802.11 protocol
Within their cell, or basic service set (BSS), stations can communicate with one another and with an access point (AP)
Through the AP, stations may communicate with stations in another BSS
Stations may also form an ad-hoc network (without an AP)
[Figure: two BSSs, each with its own AP]
IEEE 802.11 protocol:
handling interference
Uses CSMA/CA
Carrier Sense Multiple Access/Collision Avoidance
If channel is sensed idle for time equal to DIFS
(distributed inter frame space), station may transmit
Receiver of a correct frame then sends ack frame to
sender after short time (SIFS = short interframe space)
If channel is busy:
sender defers access, listens again
If quiet for DIFS, transmits after random backoff time expires
The random backoff prevents all waiting stations from sending at the same
time
Doesn’t detect collisions, tries to avoid
Transmitted frame contains transmission duration; others
know how long to wait
IEEE 802.11:
handling the hidden
station problem
Two stations (A and B) hidden from one
another may transmit to same station (C)
Can use RTS (request to send) and CTS (clear to
send) exchange of frames before actual
transmission
B will “hear” C send the CTS to A and will not
send.
LAN-to-WAN connection
LANs usually connect to WANs through
dedicated leased lines at T1 (1.544 Mbps)
or T3 (45 Mbps) speeds
LANs may be of any type
Example:
3 LANs: 1 FDDI, 1 Ethernet, 1 Token Ring
Each connects to Frame Relay WAN through
router and T1 line
Home to WAN
connection
Many alternatives:
Dialup modem, 14.4 - 56 Kbps
simple, cheap
ISDN Basic Rate Interface (BRI)
Dialup digital modem
Speed up to 128 Kbps
ISDN Primary Rate Interface (PRI)
1.544 Mbps
Leased T1 line
1.544 Mbps
Home-WAN connection,
continued
High Bit Rate Digital Subscriber Line (HDSL)
1.544 Mbps
Asymmetric Digital Subscriber Line (ADSL)
640 Kbps outbound
6 Mbps inbound
Good for web access (http requests are small, returned
images, videos, etc. may be large)
Cable modems
Cable is shared; actual bandwidth seen by customer
depends on load on network
Most cable modems are asymmetric
Typical speeds 1- 10 Mbps downstream, 128 Kbps
upstream
Protocols
Purpose
IP
TCP
Protocols
Protocol (in this context):
A set of rules governing communication between two
computers or two processes over a computer network
Consists of functions/rules for:
Addressing
Routing
Together ensure that message from A to B arrives at B
Error detection
Error recovery
Sequence control
Together handle situation in which messages from A are lost
or corrupted due to noise or network failures
Flow control
To handle situation in which A sends at faster rate than B can
consume
Protocols
Connectionless
Messages from A to B are independent of one
another; may arrive at destination in order
different from transmission order
Think of mailing off a batch of postcards that
together contain the content of a novel
Good when data to be exchanged fit into maximum
data unit (all fits on one postcard)
Protocols
Connection-oriented
Used when messages that are much larger than
the maximum data unit are transmitted
Sequencing and data recovery important
Think of making a phone call: a connection is
set up, and the channel remains open for
transmission until you disconnect
Protocol specification
Syntax
Specifies the types of messages that can be sent, the
format of those messages, and the meaning of each field
in the message
Semantics
Specifies the actions taken by each entity when specific
events occur
Example: when a message arrives, when a message times
out, etc.
Protocol specification
ISO (International Organization for Standardization)
defined a seven-layer model, the Reference
Model for Open Systems Interconnection
1. Physical
2. Data link
3. Network
4. Transport
5. Session
6. Presentation
7. Application
(ISO) OSI
Each entity at layer n communicates only with
remote nth-layer entities
Layer n uses local services provided by layer n-1
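[Figure: two hosts, each with an N-th layer above an (N-1)th layer; the two N-th layers are linked by the N-th layer protocol, and the (N-1)th layers are linked through the network]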
Protocol Layers
Data exchanged between nth-layer entities
have to be:
physically processed by layers n to 1 at the
sending computer
Transported through the network
Moved from layer 1 to n at the receiving end
Protocol Layers
Each entity at layer n exchanges a Protocol
Data Unit (PDU) with a remote layer n
entity
PDU has:
Layer n data
Layer n header
The layer n PDU becomes layer (n-1) data:
| Layer (n-1) header | Layer n header | Layer n data |
                     |<------ layer (n-1) data ----->|
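A minimal sketch of the nesting described on this slide: each lower layer treats the PDU from the layer above as opaque data and prepends its own header. The text tags used as headers are placeholders; real headers are binary structures.

```python
def encapsulate(data: bytes, headers: list[bytes]) -> bytes:
    """Wrap data going down the stack: each lower layer prepends its header."""
    pdu = data
    for header in headers:  # ordered from layer n down to layer 1
        pdu = header + pdu  # the layer n PDU becomes layer (n-1) data
    return pdu


def decapsulate(pdu: bytes, headers: list[bytes]) -> bytes:
    """Unwrap at the receiver: strip headers from layer 1 up to layer n."""
    for header in reversed(headers):
        if not pdu.startswith(header):
            raise ValueError("unexpected header")
        pdu = pdu[len(header):]
    return pdu


# Placeholder headers, highest layer first (hypothetical tags).
stack = [b"[TCP]", b"[IP]", b"[ETH]"]
frame = encapsulate(b"GET /index.html", stack)
print(frame)  # b'[ETH][IP][TCP]GET /index.html'
```

The outermost header belongs to the lowest layer, so the receiver strips headers in the opposite order from the sender.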
TCP/IP
IP = Internet Protocol
a network layer protocol
TCP = Transmission Control Protocol
A transport layer protocol
Connection-oriented
UDP = User Datagram Protocol
A transport layer protocol
A connectionless protocol
Together: TCP/IP protocol suite; forms the core of
the internet
On top of TCP:
HTTP- hypertext transfer protocol (web)
FTP - file transfer protocol
SMTP - simple mail transfer protocol
Telnet - an interactive login protocol
On top of UDP:
RPC - remote procedure call
NFS - network file system - runs on top of RPC
DNS - Domain Name System
SNMP - Simple Network Management
Protocol
Internet Protocol (IP)
Specifies:
the formats of packets sent across the
Internet,
the mechanisms used to forward these packets
through a collection of networks
Routers from source to destination
Internet Protocol (IP)
Every host connected to the internet has a
unique address: an IP address
A 32-bit number
Represented by a dotted notation: 129.192.4.5,
for example
Each of the four numbers represents the value
of 8 bits in the address
Divided into prefix and suffix
Prefix: indicates the network
Suffix: host within the network
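A short sketch of the dotted notation and the prefix/suffix split, using Python's standard `ipaddress` module. The 16-bit prefix length is an assumption for this example; the slide does not say where the split falls for 129.192.4.5.

```python
import ipaddress

addr = ipaddress.IPv4Address("129.192.4.5")
as_int = int(addr)                      # the address as one 32-bit number
octets = [(as_int >> shift) & 0xFF for shift in (24, 16, 8, 0)]

PREFIX_BITS = 16                        # assumed prefix length for this example
suffix_bits = 32 - PREFIX_BITS
prefix = as_int >> suffix_bits                  # network part
suffix = as_int & ((1 << suffix_bits) - 1)      # host part within the network
hosts_per_network = 2 ** suffix_bits            # suffix values available

print(octets)           # [129, 192, 4, 5]
print(prefix, suffix)   # 33216 1029
```

Moving bits from the suffix to the prefix allows more networks but fewer hosts in each one.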
Internet Protocol (IP)
Number of bits allocated to prefix
determines number of unique network
numbers
Number of bits allocated to suffix
determines number of hosts per network
Currently, IPv4 uses 32-bit address field
But … may be approaching limits of number
of servers to be on …
IPv6 uses 128 bits
Internet Protocol (IP)
IP datagram: the data unit transported by
IP
IP is connectionless and
can “lose” datagrams
can deliver datagrams out-of-order (may travel
to destination by different routes)
is known as “best effort” service
Internet Protocol (IP)
Header is at least 20 bytes long (20 bytes when no options are present)
4 bytes for IP address of source
4 bytes for IP address of destination
Performs routing of datagrams from source to
destination
IP implementation at router maintains an in-memory
routing table; used to search for next
router or host to which to forward the datagram.
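A minimal sketch of a routing table lookup. The table entries and next-hop names are hypothetical, and choosing the longest matching prefix is the usual IP forwarding rule rather than something stated on the slide.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop).
ROUTING_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "router-default"),
    (ipaddress.ip_network("129.192.0.0/16"), "router-a"),
    (ipaddress.ip_network("129.192.4.0/24"), "router-b"),
]


def next_hop(destination: str) -> str:
    """Pick the most specific (longest) prefix that contains the destination."""
    dst = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTING_TABLE if dst in net]
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


print(next_hop("129.192.4.5"))   # router-b
print(next_hop("10.0.0.1"))      # router-default
```

A linear scan is fine for three entries; production routers use specialized structures for the same lookup.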
Transmission Control
Protocol (TCP)
Provides a:
Connection-oriented
Reliable
Flow-controlled
End-to-end
communication service between processes
residing at hosts connected through a
network
Transmission Control
Protocol (TCP)
Guarantees that data is delivered and is in order
Provides full-duplex communication
Both ends of the connection can communicate
simultaneously
Provides a stream interface
Accepts a continuous stream of bytes from the
application to be sent through the connection
Transmission Control
Protocol (TCP)
PDU exchanged at the TCP level is called a
segment
Header is 20 bytes long
Segments reside within IP datagrams
Connection must be established before
data can be exchanged:
Three-way handshake is mechanism for
establishing connections
TCP: three-way
handshake
host A establishes connection with host B
A sends SYN (synchronization) segment to B (#1)
B replies with a SYN segment that also acknowledges A’s SYN (SYN+ACK) (#2); places A in queue of
incomplete connections
A sends ACK to B (#3 of 3-way handshake)
Connection complete when B receives ACK
Data may now be exchanged in both directions
Note: Denial of Service attack:
fills queue of incomplete connections (A never sends the
ACK) -- host can’t accept new connections
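A toy model of host B's side of the handshake, showing why a full queue of incomplete connections blocks new clients. The `Listener` class, its `backlog` limit, and the client names are hypothetical; a real TCP stack keeps far more state per connection.

```python
class Listener:
    """Toy model of host B's side of the three-way handshake."""

    def __init__(self, backlog: int):
        self.backlog = backlog          # assumed size limit of the queue
        self.incomplete = set()         # SYN received, ACK not yet received
        self.established = set()

    def on_syn(self, client: str) -> str:
        if len(self.incomplete) >= self.backlog:
            return "DROPPED"            # queue full: cannot accept new connections
        self.incomplete.add(client)
        return "SYN+ACK"                # segment #2

    def on_ack(self, client: str) -> bool:
        if client not in self.incomplete:
            return False
        self.incomplete.remove(client)
        self.established.add(client)    # connection complete
        return True


b = Listener(backlog=2)
b.on_syn("A")          # segment #1 arrives
b.on_ack("A")          # segment #3 arrives; A is now established

# Attack sketch: SYNs whose senders never send the final ACK.
b.on_syn("attacker-1")
b.on_syn("attacker-2")
reply = b.on_syn("legitimate-client")
print(reply)           # DROPPED
```

The completed handshake frees its queue slot; the two abandoned ones never do, so the third SYN is refused.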
TCP - closing the
connection
TCP connection closing is half-close
Host closing connection indicates that host won’t send
more data but is still willing to receive
Host A to host B
Host A sends FIN to host B
B replies with ACK
B can still send to A
To completely close the connection
Host B sends FIN to host A
A replies with ACK
So, total of 4 segments to close the connection
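A sketch of the half-close using Python's standard `socket` module over the loopback interface. The names `a` and `b` stand for hosts A and B; running both ends in one process, and the payloads, are assumptions made to keep the example self-contained.

```python
import socket


def read_until_eof(sock: socket.socket) -> bytes:
    """Read until the peer signals it will send no more data."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:           # empty read: the peer's FIN has arrived
            return b"".join(chunks)
        chunks.append(chunk)


server = socket.create_server(("127.0.0.1", 0))   # port 0: let the OS pick
a = socket.create_connection(server.getsockname())
b, _ = server.accept()

a.sendall(b"request")
a.shutdown(socket.SHUT_WR)      # A half-closes: no more data from A (FIN)
request = read_until_eof(b)     # B sees A's data, then end-of-stream

b.sendall(b"reply")             # B can still send to A
b.close()                       # B closes its direction too (FIN)
reply = read_until_eof(a)

a.close()
server.close()
```

After `shutdown(SHUT_WR)`, A can no longer send but still receives B's reply, which is the half-close behavior the slide describes.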
TCP
Error control:
Handled via ACKs, timeouts, and retransmission
Flow control:
Implemented via a sliding window mechanism
Window = max num bytes sent before an ACK is
received
Window size limited by
Buffer size at receiver
Network congestion perceived by sender
TCP flow control
Connection first established
Receiver advertises Wm - its maximum window
size
Sender’s window size can’t exceed this or
receiver will have buffer overflow and packets
will be lost
Network congestion causes some packets to be
dropped at some router or to not be ack’d
before a timeout occurs
Sender then reduces window size to Wc , which
reduces its transmission rate and attempts to mitigate
network congestion
TCP flow control
Two phases:
Slow start
Wc is initialized to 1, increased by 1 for each ACK
received (note ACK count is cumulative)
Wc = 1, 1 packet sent, ACK’d in 1 RTT -> Wc now 2
Wc = 2, 2 packets sent, ACK’d in 1 RTT ->Wc now 4
Wc = 4, 4 packets sent, ACK’d in 1 RTT -> Wc now 8
… doubles every RTT up to max of Wm
Congestion avoidance
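A small sketch of the slow start arithmetic above: adding 1 to Wc per ACK doubles the window every RTT. It assumes one ACK per packet and no loss; the function name and parameters are illustrative.

```python
def slow_start(w_max: int, rtts: int) -> list[int]:
    """Return the congestion window Wc (in packets) at the start of each RTT."""
    wc = 1
    history = [wc]
    for _ in range(rtts):
        acks = wc                     # every packet sent this RTT is ACK'd
        for _ in range(acks):
            wc = min(wc + 1, w_max)   # +1 per ACK, capped at Wm
        history.append(wc)
    return history


print(slow_start(w_max=64, rtts=5))   # [1, 2, 4, 8, 16, 32]
```

Despite the name, the growth is exponential; "slow" refers to starting from a window of 1.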
TCP flow control
Network congestion detected by
Receipt of a duplicate ACK (receiver received
an out-of-sequence segment)
Timeout at the sender
Response:
Set Wssthr (slow start threshold window size) from the
current window size (standard TCP uses half of it)
Reduce Wc
TCP flow control
TCP actually has different versions.
TCP Reno (a common version) reduction of
window size upon congestion works like this:
If duplicate ACKs are received (three, in Reno): Wc divided by two,
enter congestion avoidance phase
If a timeout occurs, Wc set to 1; go back to slow
start
When Wc reaches Wssthr during slow start, switch
to congestion avoidance
TCP flow control
Congestion avoidance phase:
Wc increased by 1/Wc for every ACK received
X_TCP, throughput of a TCP connection,
measured in segments per second:
Decreases with RTT
Decreases with probability packets are dropped
Increases with Wm, measured in segments
Decreases with T0, TCP timeout value
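A per-RTT sketch of how Wc might evolve under the Reno rules on these slides. Two assumptions go beyond the slides: the threshold is set to half the window at the time of loss (with a floor of 2), and the per-ACK increase of 1/Wc is approximated as +1 per RTT. The `events` list is hypothetical input, one entry per RTT.

```python
def reno(events, w_max=64):
    """Trace Wc per RTT. events[i] is None, "dup_ack", or "timeout"."""
    wc, ssthresh = 1.0, float(w_max)
    trace = [wc]
    for event in events:
        if event == "timeout":
            ssthresh = max(wc / 2, 2)   # assumption: threshold = half the window
            wc = 1.0                    # back to slow start
        elif event == "dup_ack":
            ssthresh = max(wc / 2, 2)
            wc = ssthresh               # Wc halved (floor of 2); congestion avoidance
        elif wc < ssthresh:
            wc = min(wc * 2, ssthresh)  # slow start: doubles every RTT
        else:
            wc = wc + 1                 # Wc ACKs x (1/Wc) each = +1 per RTT
        wc = min(wc, w_max)
        trace.append(wc)
    return trace


events = [None] * 4 + ["dup_ack"] + [None] * 2 + ["timeout"] + [None] * 3
print(reno(events))
# [1.0, 2.0, 4.0, 8.0, 16.0, 8.0, 9.0, 10.0, 1.0, 2.0, 4.0, 5.0]
```

The trace shows the familiar sawtooth: fast growth, a halving on duplicate ACKs, linear growth afterward, and a restart from 1 on timeout.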
TCP performance and
limitations …
… some derivation of performance
formulas, to be performed when EK returns
…
Client/Server
Client/Server paradigm
Server types
HTTP
Client/Server Paradigm
Client process:
Runs on desktop or user workstation and provides
GUI code for data capture and display
Makes requests for specific services to be
performed by one or more server processes,
usually located at remote machines
Executes a portion of the application code
Client/Server Paradigm
Server process
Executes a set of functionally-related services
that usually require a specialized
hardware/software component
Never initiates a message exchange with any
client; a passive entity that listens to client
requests, executes them, and replies to clients
Usually runs on a machine that is faster and has
more main memory and disk space than a client
machine
Client-Server
interaction protocol
Request-reply protocol
Clients send requests
Servers reply to client requests
May run on TCP, UDP, or other
Client-Server
Server may receive many requests
Forms a queue of requests at server
Serving only one request at a time may
Under-utilize machine
Limit server throughput
Increase response time to clients
Most servers create multiple processes or threads to
handle the queue of incoming requests
As number of threads increases, response time will initially
decrease (clients don’t need to wait as long) and then go up
or flatten out as the multiple threads contend for resources
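A small sketch of the effect of worker threads on a queue of requests, using the standard `concurrent.futures` module. The 50 ms `sleep` is a stand-in for I/O wait; the request count and thread counts are arbitrary choices for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request: int) -> str:
    time.sleep(0.05)              # stand-in for I/O wait (disk, database)
    return f"reply-{request}"


def serve(requests, threads: int):
    """Serve all queued requests with a fixed number of worker threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        replies = list(pool.map(handle, requests))
    return replies, time.perf_counter() - start


queue = range(8)
replies_1, elapsed_1 = serve(queue, threads=1)   # one request at a time
replies_4, elapsed_4 = serve(queue, threads=4)   # four requests in flight
print(f"1 thread: {elapsed_1:.2f}s   4 threads: {elapsed_4:.2f}s")
```

Because the simulated work is pure waiting, extra threads help here almost linearly; with CPU-bound or contended work the gain would flatten out, as the slide notes.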
Client-Server
Two-tier architecture
Client runs: GUI + application logic
Server runs: SQL server
Three-tier architecture
Client runs: GUI
Application server runs: application logic (and
acts as client to…)
SQL server
Fat clients v. thin
clients
“Fat” clients
incorporate more of the transaction logic than
thin clients
tend to require fewer interactions with server
cost: higher computing requirements for client
Client-Server: caching
Cache = copy of data stored “closer” to
client
client may keep cached copy of file
server may keep main memory cache of
recently/frequently accessed files
web server may cache popular docs
use of caches can improve performance
and increase system scalability
Client-server: caching
cache hit – data of interest found at the
cache
cache miss – not found
Benefits: better performance
Costs: need to maintain consistency of
cache, extra processing time for cache
misses
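A minimal cache sketch that counts hits and misses. The least-recently-used eviction policy, the `BlockCache` name, and the `fetch` callback are assumptions for illustration; the slides do not name a replacement policy.

```python
from collections import OrderedDict


class BlockCache:
    """Small cache in front of a slower fetch function (LRU is an assumption)."""

    def __init__(self, fetch, capacity: int):
        self.fetch = fetch              # called only on a cache miss
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = self.misses = 0

    def read(self, key):
        if key in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(key)        # mark as most recently used
            return self.blocks[key]
        self.misses += 1                        # miss: extra lookup, then fetch
        value = self.fetch(key)
        self.blocks[key] = value
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict least recently used
        return value


cache = BlockCache(fetch=lambda block: f"data-{block}", capacity=2)
for block in [1, 2, 1, 3, 2]:
    cache.read(block)
print(cache.hits, cache.misses)   # 1 4
```

Only the second read of block 1 is a hit; block 2 has been evicted by the time it is read again, which shows how a small cache limits the hit ratio.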
Caching - example
File server receives requests to read 8-KB
file blocks at a rate of 900 req/sec
What is the effect at the server if 30% of
requests generate a cache hit at the client?
(1-0.30) * 900 req/sec = 630 req/sec at server
What if 25% of the requests that reach the
server can be satisfied from server’s main
memory cache?
(1-0.25) * 630 req/sec = 472.5 req/sec
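The same arithmetic as a short Python sketch. The function name is illustrative, and the final line (disk traffic in KB/sec) is a derived figure added here, not one given on the slide.

```python
def rate_after_cache(rate: float, hit_ratio: float) -> float:
    """Requests/sec that get past a cache with the given hit ratio."""
    return (1 - hit_ratio) * rate


arriving = 900.0                                  # req/sec issued by clients
at_server = rate_after_cache(arriving, 0.30)      # about 630 req/sec reach the server
past_memory = rate_after_cache(at_server, 0.25)   # about 472.5 req/sec miss both caches
kb_per_sec = past_memory * 8                      # 8-KB blocks: about 3780 KB/sec
```

The two hit ratios compound: together the caches absorb (1 - 0.70 * 0.75) = 47.5% of the original request stream.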
Server types
file servers
database servers
application servers
groupware servers
object servers
Web servers
software servers
File servers
provide networked computers with access
to a shared file system
Example: NFS (Network File System)
can use UDP (usually) or TCP
Client requests:
look up directory
retrieve file attributes
read and write blocks from files
Database servers
provide access to one or more shared
databases
Client requests:
SQL statements
Server response:
list of records
Application servers
provide access to remote procedures
invoked through the Remote Procedure Call
(RPC) mechanism
typically implement business logic & make SQL
calls to backend DB engine
Transaction Processing
Monitors (TPMs)
perform load balancing among several
servers that implement the same service
Groupware Servers
provide access to unstructured and semi-structured
info such as text, images, mail,
BBoards, workflows
Object servers
support remote invocation of methods in
support of distributed object-oriented
application development
ORB = Object Request Broker – the “glue”
between clients and remote objects
Software servers
used to provide executables to Network
Computers (NCs), which do not have hard
drives
Web servers
provide access to documents, images,
sound, executables, and downloadable
applications through HTTP
HTML/HTTP
HTML = markup language for documents:
formatting, links to other documents, links
to inline images, video, software, etc.
inline images impact performance:
single click -> multiple requests
HTTP
application-level protocol, runs on top of TCP
simple request-response interaction: “Web
transaction”
map server name to IP address
establish TCP connection with server
transmit request
receive response
close TCP connection
in HTTP 1.1 the connection remains open for embedded
images
HTTP request
request includes:
action (GET, HEAD, PUT, POST)
URL that identifies info requested
other: type of doc client will accept,
authentication, payment authorization, etc.
HTTP response
status line (success or failure)
meta-info about object returned and info
requested
file or output generated by server-side app
(CGI script, for example)
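A sketch of what the request and response look like on the wire. The host `www.example.com`, the path, and the sample response are made up for illustration; the parser handles only this simple case and is not a full HTTP implementation.

```python
def build_request(method: str, path: str, host: str, accept: str = "text/html") -> bytes:
    """Build a minimal HTTP/1.1 request: action, URL path, and headers."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        f"Accept: {accept}",     # type of document the client will accept
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a response into status code, headers (meta-info), and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), headers, body


request = build_request("GET", "/index.html", "www.example.com")
# Hand-written sample response, not captured from a real server.
sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
status, headers, body = parse_response(sample)
```

The blank line (`\r\n\r\n`) separates the status line and meta-information from the returned object in both directions.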
HTTP
“stateless”
single request/response, no continuing
conversation
servers don’t have to keep track of clients and
their histories
adverse effect on performance
new connection established for every request
in HTTP 1.0 – new connection for document + 1 for
every image on the page
HTTP 1.1
persistent connection – leaves the TCP
connection open between consecutive
operations
avoids many RTT delays
supports “pipeline of requests”
can send multiple requests without waiting for
a response
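A back-of-the-envelope count of round trips for a page with embedded images. The model is an assumption: connection setup costs 1 RTT, each request/response costs 1 RTT, objects are fetched one at a time with no parallel connections, and transmission time is ignored.

```python
def rtts_http10(images: int) -> int:
    """New connection per object: 1 RTT to connect + 1 RTT per request/response."""
    return 2 * (1 + images)


def rtts_persistent(images: int) -> int:
    """One connection, requests sent one after another."""
    return 1 + (1 + images)


def rtts_pipelined(images: int) -> int:
    """One connection; all image requests sent without waiting for replies."""
    return 1 + 1 + (1 if images else 0)


for n in (0, 4, 10):
    print(n, rtts_http10(n), rtts_persistent(n), rtts_pipelined(n))
# 0 2 2 2
# 4 10 6 3
# 10 22 12 3
```

Under these assumptions the persistent connection saves one RTT per image, and pipelining makes the total independent of the number of images.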
Peer-to-Peer Model
used in sharing files, disk space, even
computing cycles
two styles
meta-data server
more messages
purely distributed
more scalable
Web Service Protocols
Intro
SOAP
WSDL
UDDI
Web Service
business functionality exposed by a
company, for the purpose of allowing
another company or software program to
use the service
Components:
service provider
service registry
service requester
WSDL
Web services description language
XML format
describes services as endpoints operating on
messages containing either document-oriented
or procedure-oriented information
describes what service does, where it is, how
to invoke it
UDDI
Universal Description, Discovery, and
Integration
for finding services that meet requirements