Protocols and Interaction Models for Web Services

CSCI 8710
Fall 2006
Outline
Networks
Client/Server
Peer-to-Peer
Web Service Protocols
Networks
Originated with ARPANET
Packet-switched
Started in late 1960s
ARPA - Advanced Research Projects Agency (now
known as DARPA)
Then: about 10 nodes (~100 million today)
Goal: resource sharing
Result: reached goal + demonstrated
importance of networks as tool for
communication and interaction via email
Networks
Kleinrock at UCLA
Did much of the early work in queueing models
of networks, measurement and management of
networks and network protocols
Networks
1983: ARPANET split into two networks:
MILNET (military purposes)
ARPANET (reduced version)
Term “Internet” first introduced
 -- came into being in 1983
Today’s Internet
Worldwide collection of interconnected
WANs
Characterized by:
IP - networking protocol
TCP - process-to-process protocol
Types of Networks
WANs
LANs
LAN-to-WAN connection
Home-to-WAN connection
WANs
 Wide-area network
 City-wide, multi-city, country-wide, continent …
 Uses packet switching:
 Messages transmitted between hosts are broken down
into chunks called packets with some max size
 Packet header has routing and sequencing info
 Consists of:
 Packet switches or routers
 High speed links connecting the routers
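The packet-switching idea above can be sketched in a few lines: a message is cut into packets of some maximum size, each header carries routing and sequencing info, and the receiver uses the sequence numbers to rebuild the message. This is only an illustration; the field names (`dest`, `seq`, `payload`) and the 8-byte maximum are assumptions, not part of any real packet format.

```python
import random

MAX_PAYLOAD = 8  # hypothetical maximum payload per packet, in bytes


def packetize(message: bytes, dest: str, max_payload: int = MAX_PAYLOAD) -> list[dict]:
    """Split a message into packets whose headers carry routing and sequencing info."""
    return [
        {"dest": dest, "seq": seq, "payload": message[start:start + max_payload]}
        for seq, start in enumerate(range(0, len(message), max_payload))
    ]


def reassemble(packets: list[dict]) -> bytes:
    """Rebuild the message by sorting on the sequence number in each header."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)


packets = packetize(b"Messages are broken into packets", dest="host-B")
random.shuffle(packets)  # packets may arrive in a different order
message = reassemble(packets)
print(len(packets), message)
```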
Routers
Communication computers that:
Store incoming packets
Examine headers
Look up routing tables
Decide next router to send packet to
Place packet on output queue for selected link
Known as “store and forward”
Technologies used to
build WANs
X.25 - a standard of the ITU (International
Telecommunication Union)
ISDN - Integrated Services Digital Network
Service offered by telephone companies;
integrates voice and data over ordinary
telephone lines
Frame Relay
A high-speed WAN service offered by long-distance carriers
Technologies used to
build WANs (cont’d)
SMDS - Switched Multi-megabit Data
Service
Another high-speed WAN service offered by
long-distance carriers
ATM - Asynchronous Transfer Mode
Packet-switched technology that uses small
fixed-size packets (53 bytes), called cells, to
provide fast switching to voice, video, and data
over WANs
LANs
Typically confined to a building or set of
closely located buildings
Both wired and wireless
Most popular:
10-Mbps Ethernet
100-Mbps Ethernet
4- or 16-Mbps Token Ring
100-Mbps FDDI
Ethernet
Invented by Metcalfe in the early 1970s
Has bus topology:
[Figure: computers attached through network interface cards (NICs) to a shared bus]
Ethernet
Computers connect to a shared coaxial
cable through NICs
Packets transmitted by one NIC can be
received by all others: broadcast
communication
But, since packets have destination addresses,
only destination NIC will (typically) copy the
packet to the computer’s main memory
Ethernet
 All NICs can try to talk at once
 which reminds EK of a faculty meeting ;^)
 No central coordinator
 CSMA/CD is used:
 Carrier Sense Multiple Access with Collision Detection
 NIC that wants to transmit “listens” to see if a
transmission is in progress (carrier sense)
 If so, it waits
 But two may start “talking” at about the same time
(collision)
 Collisions are detected and NICs stop “talking”, wait for
a randomly selected time period (so they don’t just
collide again), and then try to retransmit
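A minimal sketch of the random wait after a collision. The slide only says the period is randomly selected; the rule used here is the truncated binary exponential backoff of standard Ethernet, where the range of possible waits doubles with each successive collision. The function name and the seeded generator are just for illustration.

```python
import random

SLOT_TIME_US = 51.2  # slot time of 10-Mbps Ethernet, in microseconds


def backoff_delay_us(collisions: int, rng: random.Random) -> float:
    """Pick a random wait after the given number of successive collisions."""
    exponent = min(collisions, 10)        # the range stops growing after 10 collisions
    slots = rng.randrange(2 ** exponent)  # uniform in 0 .. 2**exponent - 1
    return slots * SLOT_TIME_US


rng = random.Random(0)  # seeded only so that the run is repeatable
for n in range(1, 5):
    print(n, backoff_delay_us(n, rng))
```

Because two colliding NICs draw their waits independently, they are unlikely to pick the same slot again, and the chance shrinks as the range widens.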
Ethernet
What happens as number of nodes and/or
traffic increases?
 probability of collision increases
Network throughput decreases
Because more bandwidth spent on collisions and
retransmissions
Token Ring
Invented at IBM Research Labs
Based on a ring topology:
[Figure: computers attached through network interface cards (NICs) to a ring]
Token Ring
Access to ring is controlled by a token
(special bit pattern that circulates in the
ring)
NIC with token can transmit
If NIC has nothing to transmit, just passes
the token
Token Ring
Sender (NIC with token) inserts the bits
representing its packet into the ring
Packet goes around the ring, is copied by
the NIC of the destination address
Packet flows back around to sender; sender
removes packet and performs error
checking (compares received packet with
sent packet)
Token Ring
As more stations are added to the token
ring …
Delay in obtaining token increases
Token must circulate through more NICs
Probability that token is used by other NICs increases
FDDI
Uses optical fibers and token-passing
Differs in that it uses 2 rings
[Figure: computers attached through network interface cards (NICs) to two rings]
FDDI
Benefits of 2nd ring?
Data flows in opposite directions in the 2 rings
If a station fails, the hardware can reconfigure
the ring and turn it into a single functioning
ring by bypassing the malfunctioning station
LANs: limits on size
Two types of limits:
Physical limits
Performance limits
Examples:
Ethernet physical limits:
Cable no more than 500 m in length
Minimum separation of 3 m between stations
Managing performance
limits on LANs
Number of stations may be limited because
of adverse impact of additional stations on
overall performance
Approach:
Can divide larger LAN into LAN segments with
fewer stations each
Segments joined by connecting devices such as
routers and bridges
Stations that communicate frequently should be
in same segment
Wireless LANs
Stations communicate via RF (radio
frequency)
Modulation of transmitted wave is interpreted
as sequence of 0s and 1s
IEEE standard for wireless LANs is the IEEE 802.11
protocol
Can transmit data at 1 or 2 Mbps depending on
underlying modulation technology
IEEE 802.11 protocol
Provides a carrier sense signal that
indicates if a transmission is in progress
Data sent by one station can be received by
all stations in the coverage area
Subject to the “hidden terminal” problem
Happens when walls or other structures
obstruct the RF signals
Station C may hear both A and B, but A and B may not
be able to hear one another
IEEE 802.11 protocol
 Within their cells, or basic service sets (BSS), stations can
communicate with one another and with an access point (AP)
 Through the AP, stations may communicate with stations in
another BSS
 Stations may also form an ad-hoc network (without an AP)
[Figure: two BSSs, each with its own AP]
IEEE 802.11 protocol:
handling interference
 Uses CSMA/CA
 Carrier Sense Multiple Access/Collision Avoidance
 If channel is sensed idle for time equal to DIFS
(distributed inter frame space), station may transmit
 Receiver of a correct frame then sends ack frame to
sender after short time (SIFS=short interframe spacing)
 If channel is busy:
 sender defers access, listens again
 If quiet for DIFS, xmits after random backoff time expires
 Random backoff prevents all waiting stations from sending at the
same time
 Doesn’t detect collisions, tries to avoid
 Transmitted frame contains transmission duration; others
know how long to wait
IEEE 802.11:
handling the hidden
station problem
Two stations (A and B) hidden from one
another may transmit to same station (C)
Can use RTS (request to send) and CTS (clear to
send) exchange of frames before actual
transmission
B will “hear” C send the CTS to A and will not
send.
LAN-to-WAN connection
LANs usually connect to WANs through
dedicated leased lines at T1 (1.544 Mbps)
or T3 (45 Mbps) speeds
LANs may be of any type
Example:
3 LANs: 1 FDDI, 1 Ethernet, 1 Token Ring
Each connects to Frame Relay WAN through
router and T1 line
Home to WAN
connection
Many alternatives:
Dialup modem, 14.4 - 56 Kbps
 simple, cheap
ISDN Basic Rate Interface (BRI)
Dialup digital modem
Speed up to 128 Kbps
ISDN Primary Rate Interface (PRI)
1.544 Mbps
Leased T1 line
1.544 Mbps
Home-WAN connection,
continued
 High Bit Rate Digital Subscriber Line (HDSL)
 1.544 Mbps
 Asymmetric Digital Subscriber Line (ADSL)
 640 Kbps outbound
 6 Mbps inbound
 Good for web access (HTTP requests are small; returned
images, videos, etc. may be large)
 Cable modems
 Cable is shared; actual bandwidth seen by customer
depends on load on network
 Most cable modems are asymmetric
 Typical speeds 1- 10 Mbps downstream, 128 Kbps
upstream
Protocols
Purpose
IP
TCP
Protocols
 Protocol (in this context):
 A set of rules governing communication between two
computers or two processes over a computer network
 Consists of functions/rules for:
 Addressing
 Routing
 Together ensure that message from A to B arrives at B
 Error detection
 Error recovery
 Sequence control
 Together handle situation in which messages from A are lost
or corrupted due to noise or network failures
 Flow control
 To handle situation in which A sends at faster rate than B can
consume
Protocols
Connectionless
Messages from A to B are independent of one
another; may arrive at destination in order
different from transmission order
Think of mailing off a batch of postcards that
together contain the content of a novel
Good when data to be exchanged fit into maximum
data unit (all fits on one postcard)
Protocols
Connection-oriented
Used when messages that are much larger than
the maximum data unit are transmitted
Sequencing and data recovery important
Think of making a phone call: a connection is
set up, and the channel remains open for
transmission until you disconnect
Protocol specification
 Syntax
 Specifies the types of messages that can be sent, the
format of those messages, and the meaning of each field
in the message
 Semantics
 Specifies the actions taken by each entity when specific
events occur
 Example: when a message arrives, when a message times
out, etc.
Protocol specification
 ISO (International Organization for Standardization)
defined a seven-layer model, the Reference
Model for Open Systems Interconnection
1. Physical
2. Data link
3. Network
4. Transport
5. Session
6. Presentation
7. Application
(ISO) OSI
 Each entity at layer n communicates only with
remote nth-layer entities
 Layer n uses local services provided by layer n-1
[Figure: N-th layer entities on two hosts exchange the N-th layer protocol; each uses its local (N-1)th layer, and the lowest layers are joined by the network]
Protocol Layers
Data exchanged between nth-layer entities
have to be:
 physically processed by layers n to 1 at the
sending computer
Transported through the network
Moved from layer 1 to n at the receiving end
Protocol Layers
Each entity at layer n exchanges a Protocol
Data Unit (PDU) with a remote layer n
entity
PDU has:
Layer n data
Layer n header
The layer n PDU becomes layer (n-1) data:
| Layer (n-1) header | Layer n header | Layer n data |
                     <------- layer (n-1) data ------->
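The nesting of PDUs can be sketched as follows: on the way down the stack each layer prepends its own header, and whatever it received from the layer above is treated as opaque data. The bracketed header strings are placeholders chosen for readability, not real header formats.

```python
def encapsulate(data: bytes, headers: list[bytes]) -> bytes:
    """Wrap data going down the stack; headers are listed from layer n downward."""
    pdu = data
    for header in headers:
        pdu = header + pdu  # the PDU from the layer above becomes this layer's data
    return pdu


def decapsulate(pdu: bytes, headers: list[bytes]) -> bytes:
    """Strip headers going up the stack at the receiver, lowest layer first."""
    for header in reversed(headers):
        if not pdu.startswith(header):
            raise ValueError("unexpected header")
        pdu = pdu[len(header):]
    return pdu


headers = [b"[TCP]", b"[IP]", b"[ETH]"]  # placeholder headers, top layer first
frame = encapsulate(b"hello", headers)
print(frame)                        # b'[ETH][IP][TCP]hello'
print(decapsulate(frame, headers))  # b'hello'
```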
TCP/IP
 IP = Internet Protocol
 a network layer protocol
 TCP = Transmission Control Protocol
 A transport layer protocol
 Connection-oriented
 UDP = User Datagram Protocol
 A transport layer protocol
 A connectionless protocol
 Together: TCP/IP protocol suite; forms the core of
the Internet
On top of TCP:
HTTP- hypertext transfer protocol (web)
FTP - file transfer protocol
SMTP - simple mail transfer protocol
Telnet - an interactive login protocol
On top of UDP:
RPC - remote procedure call
NFS - network file system - runs on top of RPC
DNS - Domain Name System
SNMP - Simple Network Management
Protocol
Internet Protocol (IP)
Specifies:
 the formats of packets sent across the
Internet,
 the mechanisms used to forward these packets
through a collection of networks
Routes from source to destination
Internet Protocol (IP)
Every host connected to the internet has a
unique address: an IP address
A 32-bit number
Represented by a dotted notation: 129.192.4.5,
for example
Each of the four numbers represents the value
of 8 bits in the address
Divided into prefix and suffix
Prefix: indicates the network
Suffix: host within the network
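A short sketch using Python's standard `ipaddress` module to take the example address apart. The address is the one on the slide; the 16-bit prefix length is an assumption made only so that there is a prefix/suffix boundary to show.

```python
import ipaddress

# Address from the slide; the /16 prefix length is an assumption.
iface = ipaddress.ip_interface("129.192.4.5/16")

as_int = int(iface.ip)  # the 32-bit number behind the dotted notation
octets = [(as_int >> shift) & 0xFF for shift in (24, 16, 8, 0)]

prefix = iface.network.network_address        # network part of the address
suffix = as_int & int(iface.network.hostmask)  # host part within the network

print(octets)  # [129, 192, 4, 5]
print(prefix)  # 129.192.0.0
print(suffix)  # 1029, i.e. 4 * 256 + 5
```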
Internet Protocol (IP)
Number of bits allocated to prefix
determines number of unique network
numbers
Number of bits allocated to suffix
determines number of hosts per network
Currently, IPv4 uses 32-bit address field
But … may be approaching limits of number
of servers to be on …
IPv6 uses 128 bits
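The trade-off between prefix and suffix bits is simple arithmetic, sketched below. The 16-bit split is an arbitrary example, and the counts are raw address counts (real networks reserve a few addresses in each range).

```python
def split_counts(prefix_bits: int, total_bits: int = 32) -> tuple[int, int]:
    """Return (number of network numbers, addresses per network)."""
    return 2 ** prefix_bits, 2 ** (total_bits - prefix_bits)


networks, per_network = split_counts(16)
print(networks, per_network)  # 65536 65536
print(2 ** 32)                # 4294967296 addresses with 32 bits
print(2 ** 128 // 2 ** 32 == 2 ** 96)  # True: 128 bits give 2**96 times as many
```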
Internet Protocol (IP)
IP datagram: the data unit transported by
IP
 IP is connectionless and
 can “lose” datagrams
 can deliver datagrams out-of-order (may travel
to destination by different routes)
 is known as “best effort” service
Internet Protocol (IP)
 Header is at least 20 bytes long (exactly 20 without options)
 4 bytes for IP address of source
 4 bytes for IP address of destination
 Performs routing of datagrams from source to
destination
 IP implementation at router maintains an in-memory
routing table; used to search for next
router or host to which to forward the datagram.
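A minimal sketch of a routing-table lookup. The slide does not say how the table is searched; this example assumes the usual longest-prefix-match rule, where the most specific matching entry wins. The table entries and router names are made up for illustration.

```python
import ipaddress

# Hypothetical routing table: destination prefix -> next hop.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "router-default",
    ipaddress.ip_network("129.192.0.0/16"): "router-A",
    ipaddress.ip_network("129.192.4.0/24"): "router-B",
}


def next_hop(destination: str) -> str:
    """Return the next hop for the most specific prefix containing the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


print(next_hop("129.192.4.5"))  # router-B
print(next_hop("129.192.9.9"))  # router-A
print(next_hop("10.0.0.1"))     # router-default
```

A linear scan is fine for three entries; real routers use specialized data structures for the same lookup.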
Transmission Control
Protocol (TCP)
Provides a:
Connection-oriented
Reliable
Flow-controlled
End-to-end
communication service between processes
residing at hosts connected through a
network
Transmission Control
Protocol (TCP)
Guarantees that data is delivered, and delivered in order
Provides full-duplex communication
Both ends of the connection can communicate
simultaneously
Provides a stream interface
Accepts a continuous stream of bytes from the
application to be sent through the connection
Transmission Control
Protocol (TCP)
PDU exchanged at the TCP level is called a
segment
Header is at least 20 bytes long (20 without options)
Segments reside within IP datagrams
Connection must be established before
data can be exchanged:
Three-way handshake is mechanism for
establishing connections
TCP: three-way
handshake
 host A establishes connection with host B
 A sends SYN (synchronization) segment to B (#1)
 B replies with a SYN+ACK segment (#2); places A in queue of
incomplete connections
 A sends ACK to B (#3 of 3-way handshake)
 Connection complete when B receives ACK
 Data may now be exchanged in both directions
 Note: Denial of Service attack:
 fills queue of incomplete connections (A never sends the
ACK) -- host can’t accept new connections
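The three segments can be sketched as plain data to show how the sequence and acknowledgment numbers line up. B's reply in step 2 carries both its own SYN and an ACK of A's SYN, and each ACK number is the other side's sequence number plus one. The `Segment` class and the initial sequence numbers 100 and 300 are illustrative assumptions; a real TCP stack picks its own initial sequence numbers and performs this exchange inside the operating system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(isn_a: int, isn_b: int) -> list[Segment]:
    """Return the three segments; isn_a and isn_b are the initial sequence numbers."""
    syn = Segment("SYN", seq=isn_a)                             # 1: A -> B
    syn_ack = Segment("SYN+ACK", seq=isn_b, ack=syn.seq + 1)    # 2: B -> A
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)  # 3: A -> B
    return [syn, syn_ack, ack]


segments = three_way_handshake(isn_a=100, isn_b=300)
for segment in segments:
    print(segment)
```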
TCP - closing the
connection
 TCP connection closing is half-close
 Host closing connection indicates that host won’t send
more data but is still willing to receive
 Host A to host B
 Host A sends FIN to host B
 B replies with ACK
 B can still send to A
 To completely close the connection
 Host B sends FIN to host A
 A replies with ACK
 So, total of 4 segments to close the connection
TCP
Error control:
Handled via ACKs, timeouts, and retransmission
Flow control:
Implemented via a sliding window mechanism
Window = max num bytes sent before an ACK is
received
Window size limited by
 Buffer size at receiver
 Network congestion perceived by sender
TCP flow control
Connection first established
Receiver advertises Wm - its maximum window
size
Sender’s window size can’t exceed this or
receiver will have buffer overflow and packets
will be lost
Network congestion causes some packets to be
dropped at some router or to not be ack’d
before a timeout occurs
Sender then reduces window size to Wc , which
reduces its transmission rate and attempts to mitigate
network congestion
TCP flow control
Two phases:
Slow start
Wc is initialized to 1, increased by 1 for each ACK
received (note ACK count is cumulative)
Wc = 1, 1 packet sent, ACK’d in 1 RTT -> Wc now 2
Wc = 2, 2 packets sent, ACK’d in 1 RTT ->Wc now 4
Wc = 4, 4 packets sent, ACK’d in 1 RTT -> Wc now 8
… doubles every RTT up to max of Wm
Congestion avoidance
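The slow-start doubling can be reproduced with a small loop. This is a sketch under the slide's assumptions: no losses, every packet sent in one RTT is ACK'd within that RTT, and the window is counted in packets. The function name and the value 64 for Wm are arbitrary.

```python
def slow_start(w_max: int, rtts: int) -> list[int]:
    """Return Wc at the start of each RTT, assuming no loss."""
    wc = 1
    history = []
    for _ in range(rtts):
        history.append(wc)
        acks = wc                   # every packet sent in this RTT is ACK'd
        wc = min(wc + acks, w_max)  # +1 per ACK received, capped at Wm
    return history


print(slow_start(w_max=64, rtts=6))  # [1, 2, 4, 8, 16, 32]
```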
TCP flow control
Network congestion detected by
Receipt of a duplicate ACK (receiver received
an out-of-sequence segment)
Timeout at the sender
Response:
Save current value of window size to Wssthr (slow
start threshold window size)
Reduce Wc
TCP flow control
TCP actually has different versions.
TCP Reno (a common version) reduction of
window size upon congestion works like this:
If duplicate ACK received: Wc divided by two,
enter congestion avoidance phase
If a timeout occurs, Wc set to 1; go back to slow
start
When Wc reaches Wssthr during slow start, switch
to congestion avoidance
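The reaction to congestion can be written as one small function. It follows the simplified description on these slides (save the current window as Wssthr, then halve Wc or reset it to 1); deployed TCP implementations differ in details, for example in how the threshold is chosen. The function and event names are hypothetical.

```python
def on_congestion(wc: float, event: str) -> tuple[float, float, str]:
    """Return (new Wc, Wssthr, next phase) for a congestion event."""
    w_ssthr = wc  # save the current window size, as the slides describe
    if event == "duplicate_ack":
        return max(wc / 2, 1.0), w_ssthr, "congestion avoidance"
    if event == "timeout":
        return 1.0, w_ssthr, "slow start"
    raise ValueError(f"unknown event: {event}")


print(on_congestion(16, "duplicate_ack"))  # (8.0, 16, 'congestion avoidance')
print(on_congestion(16, "timeout"))        # (1.0, 16, 'slow start')
```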
TCP flow control
Congestion avoidance phase:
Wc increased by 1/Wc for every ACK received
XTCP, throughput of a TCP connection,
measured in segments per second:
Decreases with RTT
Decreases with probability packets are dropped
Increases with Wm, measured in segments
Decreases with T0, TCP timeout value
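The congestion-avoidance rule (Wc increased by 1/Wc for every ACK) works out to roughly one extra segment per RTT, since about Wc ACKs arrive in each RTT. The loop below is a sketch of that arithmetic, assuming no losses and one ACK per segment in the window; the function name is made up.

```python
def congestion_avoidance(wc: float, rtts: int) -> list[float]:
    """Apply Wc += 1/Wc per ACK, with about Wc ACKs arriving per RTT."""
    history = [wc]
    for _ in range(rtts):
        for _ in range(int(wc)):  # one ACK per segment in the window
            wc += 1.0 / wc
        history.append(wc)
    return history


for value in congestion_avoidance(8.0, 4):
    print(round(value, 2))
```

Compared with slow start, which doubles the window every RTT, this growth is linear.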
TCP performance and
limitations …
… some derivation of performance
formulas, to be performed when EK returns
…
Client/Server
Client/Server paradigm
Server types
HTTP
Client/Server Paradigm
Client process:
Runs on desktop or user workstation and provides
GUI code for data capture and display
Makes requests for specific services to be
performed by one or more server processes,
usually located at remote machines
Executes a portion of the application code
Client/Server Paradigm
Server process
Executes a set of functionally-related services
that usually require a specialized
hardware/software component
Never initiates a message exchange with any
client; a passive entity that listens to client
requests, executes them, and replies to clients
Usually runs on a machine that is faster and has
more main memory and disk space than a client
machine
Client-Server
interaction protocol
Request-reply protocol
Clients send requests
Servers reply to client requests
May run on TCP, UDP, or other
Client-Server
 Server may receive many requests
 Forms a queue of requests at server
 Serving only one request at a time may
 Under-utilize machine
 Limit server throughput
 Increase response time to clients
 Most servers create multiple processes or threads to
handle the queue of incoming requests
 As number of threads increases, response time will initially
decrease (clients don’t need to wait as long) and then go up
or flatten out as the multiple threads contend for resources
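A minimal sketch of a server working through its request queue with a pool of threads, using Python's standard `concurrent.futures`. The `handle` function is a stand-in whose sleep mimics time spent waiting on I/O; the request count, sleep time, and pool sizes are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request_id: int) -> str:
    """Stand-in for request processing; the sleep mimics waiting on I/O."""
    time.sleep(0.05)
    return f"reply-{request_id}"


def serve(requests: list[int], workers: int) -> tuple[list[str], float]:
    """Process the queued requests with a pool of worker threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        replies = list(pool.map(handle, requests))
    return replies, time.perf_counter() - start


queue = list(range(8))
for workers in (1, 4):
    replies, elapsed = serve(queue, workers)
    print(workers, "workers:", f"{elapsed:.2f} s")
```

With one worker the requests are served one at a time; with several, the waits overlap, which is the initial drop in response time the slide mentions. The later flattening from resource contention does not show up in a toy example like this.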
Client-Server
Two-tier architecture
Client runs: GUI + application logic
Server runs: SQL server
Three-tier architecture
Client runs: GUI
Application server runs: application logic (and
acts as client to…)
SQL server
Fat clients v. thin
clients
“Fat” clients
incorporate more of the transaction logic than
thin clients
tend to require fewer interactions with server
cost: higher computing requirements for client
Client-Server: caching
Cache = copy of data stored “closer” to
client
client may keep cached copy of file
server may keep main memory cache of
recently/frequently accessed files
web server may cache popular docs
use of caches can improve performance
and increase system scalability
Client-server: caching
cache hit – data of interest found at the
cache
cache miss – not found
Benefits: better performance
Costs: need to maintain consistency of
cache, extra processing time for cache
misses
Caching - example
File server receives requests to read 8-KB
file blocks at a rate of 900 req/sec
What is the effect at the server if 30% of
requests generate a cache hit at the client?
(1-0.30) * 900 req/sec = 630 req/sec at server
What if 25% of the requests that reach the
server can be satisfied from server’s main
memory cache?
(1-0.25) * 630 req/sec = 472.5 req/sec
Server types
file servers
database servers
application servers
groupware servers
object servers
Web servers
software servers
File servers
provide networked computers with access
to a shared file system
Example: NFS (Network File System)
can use UDP (usually) or TCP
Client requests:
look up directory
retrieve file attributes
read and write blocks from files
Database servers
provide access to one or more shared
databases
Client requests:
SQL statements
Server response:
list of records
Application servers
provide access to remote procedures
invoked through the Remote Procedure Call
(RPC) mechanism
typically implement business logic & make SQL
calls to backend DB engine
Transaction Processing
Monitors (TPMs)
perform load balancing among several
servers that implement the same service
Groupware Servers
provide access to unstructured and semi-structured
info such as text, images, mail,
BBoards, workflows
Object servers
support remote invocation of methods in
support of distributed object-oriented
application development
ORB = Object Request Broker – the “glue”
between clients and remote objects
Software servers
used to provide executables to Network
Computers (NCs), which do not have hard
drives
Web servers
provide access to documents, images,
sound, executables, and downloadable
applications through HTTP
HTML/HTTP
HTML = markup language for documents:
formatting, links to other documents, links
to inline images, video, software, etc.
inline images impact performance:
single click -> multiple requests
HTTP
 application-level protocol, runs on top of TCP
 simple request-response interaction: “Web
transaction”
 map server name to IP address
 establish TCP connection with server
 transmit request
 receive response
 close TCP connection
 in HTTP 1.1 the connection remains open for embedded
images
HTTP request
request includes:
action (GET, HEAD, PUT, POST)
URL that identifies info requested
other: type of doc client will accept,
authentication, payment authorization, etc.
HTTP response
status line (success or failure)
meta-info about object returned and info
requested
file or output generated by server-side app
(CGI script, for example)
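To make the request and response pieces concrete, the sketch below builds a minimal HTTP/1.1 request by hand and parses a canned response. Nothing is sent over the network; the host name, path, and response text are made-up examples, and the parser handles only this simple case.

```python
def build_request(method: str, path: str, host: str, accept: str = "text/html") -> bytes:
    """Build a minimal HTTP/1.1 request: request line, headers, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",  # action plus the URL path being requested
        f"Host: {host}",
        f"Accept: {accept}",          # type of document the client will accept
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a response into status code, header fields (meta-info), and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), headers, body


request = build_request("GET", "/index.html", "www.example.com")
canned = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
status, headers, body = parse_response(canned)
print(request)
print(status, headers, body)
```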
HTTP
“stateless”
single request/response, no continuing
conversation
servers don’t have to keep track of clients and
their histories
adverse effect on performance
new connection established for every request
in HTTP 1.0 – new connection for document + 1 for
every image on the page
HTTP 1.1
persistent connection – leaves the TCP
connection open between consecutive
operations
avoids many RTT delays
supports “pipeline of requests”
can send multiple requests without waiting for
a response
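A rough count of what persistent connections save. Each new TCP connection needs a handshake, which takes about one RTT before the request can even be sent, so the number of connections is a simple lower bound on the setup delay. The sketch assumes the HTTP 1.0 client fetches everything sequentially and the HTTP 1.1 client reuses a single connection; the page with 10 images is a made-up example.

```python
def tcp_connections(images: int, persistent: bool) -> int:
    """Connections needed to fetch one document plus its inline images."""
    return 1 if persistent else 1 + images


def setup_rtts(images: int, persistent: bool) -> int:
    """RTTs spent on handshakes alone, at about one RTT per new connection."""
    return tcp_connections(images, persistent)


print(setup_rtts(10, persistent=False))  # 11
print(setup_rtts(10, persistent=True))   # 1
```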
Peer-to-Peer Model
used in sharing files, disk space, even
computing cycles
two styles
meta-data server
more messages
purely distributed
more scalable
Web Service Protocols
Intro
SOAP
WSDL
UDDI
Web Service
business functionality exposed by a
company, for the purpose of allowing
another company or software program to
use the service
Components:
service provider
service registry
service requester
WSDL
Web services description language
XML format
describes services as endpoints operating on
messages containing either document-oriented
or procedure-oriented information
describes what service does, where it is, how
to invoke it
UDDI
Universal Description, Discovery, and
Integration
for finding services that meet requirements