Transcript cap1-2003

Content
• Introduction
• TCP Clients
• Iterative TCP Servers
• Concurrent TCP Servers
• UDP
• Multicasting
• http://www.dcc.uchile.cl/~nbaloian/tampere
Evaluation
• 4 Homeworks
• Final Exam
Introduction
October 2003
Why distributed systems
- Share resources
- Let people communicate
- Performance, scalability
- Fault tolerant systems
We already know how computers communicate, but...
... how do programs communicate?
PROG1
PROG2
They need to establish a protocol!
- Who sends the data first
- What kind of data
- How to react to the data
Every layer has the illusion of talking to
the same one located at the other host
[Figure: A CLIENT and A SERVER (port 4444) connected at successive layers: read/write sequence at the top, then UDP or TCP communication, then Internet frames and addresses, and electric pulses at the bottom]
Decisions when Developing a
Distributed System
• Which service from the transport layer are we
going to use (TCP, UDP, or a middleware)
• Software architecture: replicated, centralized
• Communications architecture: centralized,
networked
• Server design: concurrent or iterative, stateless or stateful
• Etc.
Internet: two different ways to deliver a
message to another application
Application programmers choose between them according to their needs
UDP (User Datagram Protocol): like writing a letter
TCP or UDP
Nowadays there is a lot of middleware
which makes distributed programming
much easier
Libraries for distributed
programming (middleware)
RPC, CORBA, RMI
Why Client/Server ?
It is a communication protocol model (listener/caller)
• TCP/IP does not provide any mechanism to start a program on a
computer when a message arrives. A program must already be
running BEFORE the message arrives in order to establish
communication (daemons).
• Is there really no other means to communicate?
– Multicasting (but the sender does not know who is receiving, so
there is no dialogue)
• What are the protocol ports of a server machine?
– A port is a virtual address inside the machine at which a server listens
for client requests for a certain service. Most Unix machines
have “well known ports”, each associated with a server program
that provides a service through a protocol. Both the port number
and the protocol should be well known.
The client-server paradigm
(do you remember the WEB?)
[Figure: the web client program sends a request through THE INTERNET to the web server program, which holds the web resources and sends back an answer]
1- The server opens a channel and starts listening for requests.
2- A client that knows the server's address sends a request and waits for the answer.
3- The server analyses the request and answers properly according to the protocol. This may involve the reading of a file.
[Figure, repeated for each step: A SERVER with its web resources and A CLIENT, connected through THE INTERNET]
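The three steps above can be sketched with Python's standard socket module. This is only an illustration under stated assumptions: the one-line request protocol, the in-memory RESOURCES table and the function names are hypothetical, and the server handles a single request on the loopback interface.

```python
import socket
import threading

# Hypothetical "web resources"; a real server would probably read files.
RESOURCES = {"index.html": "<h1>hello</h1>"}


def serve_one_request(listener):
    # Step 3: analyse the request and answer according to the protocol
    # (here: one line with a resource name, answered by one line).
    conn, _addr = listener.accept()
    with conn, conn.makefile("r") as reader:
        name = reader.readline().strip()
        answer = RESOURCES.get(name, "NOT FOUND")
        conn.sendall((answer + "\n").encode())


def fetch(host, port, name):
    # Step 2: the client knows the address, sends a request, waits for the answer.
    with socket.create_connection((host, port)) as sock:
        sock.sendall((name + "\n").encode())
        with sock.makefile("r") as reader:
            return reader.readline().strip()


def demo():
    # Step 1: the server opens a channel and starts listening.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_one_request, args=(listener,))
    worker.start()
    try:
        return fetch("127.0.0.1", port, "index.html")
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

Note that the server must already be listening before the client connects, which is exactly the point made about daemons.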
The Client-Server Model
[Figure: two clients and Server 1, Server 2 and Server 3, linked by invocation and result arrows]
Services Provided by Multiple
Servers
[Figure: two clients connected to Server 1, Server 2 and Server 3]
Proxy servers & caches
[Figure: two clients reach Server 1 and Server 2 through a proxy/cache]
Peer-to-peer Applications
[Figure: three peers, each combining application + coordination]
Communication Architectures for
Distributed Applications
• Servers as Clients
– Programs do not behave as pure servers or as pure clients. For
example, a file server can ask another computer for a timestamp to
register the last change of a file.
– When every application must behave as client and server at the same
time, we can organize the communication in two basic ways:
• Every application opens a communication channel with every other
application (network configuration): P2P applications
• There is a communication server and every application opens one
communication channel with it (star configuration): multiple chat
servers.
Network communication
architecture
• Every application opens an exclusive channel with each other
application present in the session
• There may be up to n*(n-1)/2 channels open for n applications
• Advantages:
– It avoids bottlenecks in the communications
• Drawbacks:
– Every application must be aware of all the others taking part in the session
– Keeping the session consistent is more complicated when
applications enter and leave it
Star communication architecture
• The applications open a channel with the server and send their
communication requests to it. The server takes care of forwarding
them to their final destination.
• There are up to n channels open for n applications
• Advantages:
– The communication parameters are easier to manage
– Applications entering and leaving the session are easier to
handle
• Drawbacks:
– The server can get overloaded
– The channels may get overloaded.
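To compare the channel counts of the two communication architectures (up to n*(n-1)/2 channels for the network configuration, up to n for the star), a small calculation helps. The function names below are illustrative only.

```python
def network_channels(n):
    # Every pair of applications shares one exclusive channel.
    return n * (n - 1) // 2


def star_channels(n):
    # Every application opens a single channel with the server.
    return n


if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        print(n, network_channels(n), star_channels(n))
```

The network configuration grows quadratically with the number of applications, while the star grows linearly but concentrates all traffic on the server.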
Replicated Architectures
• Every participant has a copy of the application and the
data
• The modifications (data) are distributed to all participants
in some way
• Synchronization is normally achieved by distributing the
events, not the state of the data
• Problems with latecomers
• Communication architecture may be of the star or the
network type
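Synchronization by events can be sketched as follows: every replica applies the same sequence of events to its own copy of the data. The Replica class, the (key, value) event format and the replay-based handling of latecomers are assumptions made for this sketch, not something prescribed by the slides.

```python
class Replica:
    """One participant holding its own copy of the data."""

    def __init__(self):
        self.data = {}
        self.log = []  # events applied so far, kept for latecomers

    def apply(self, event):
        key, value = event
        self.data[key] = value
        self.log.append(event)


def broadcast(event, replicas):
    # Distribute the event (not the whole state) to every participant.
    for replica in replicas:
        replica.apply(event)


def join_late(source):
    # One possible answer to the latecomer problem: replay the event log
    # of an existing replica into the newcomer.
    newcomer = Replica()
    for event in source.log:
        newcomer.apply(event)
    return newcomer


if __name__ == "__main__":
    a, b = Replica(), Replica()
    broadcast(("title", "draft"), [a, b])
    broadcast(("title", "final"), [a, b])
    c = join_late(a)
    print(a.data, b.data, c.data)
```

The sketch assumes that all replicas receive the events in the same order; without that guarantee the copies can diverge.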
Replicated Architecture
[Figure: several applications (Appl), each with its own copy of the Data and its own view]
Semi-replicated Architectures
• Data are kept centralized by a single application
• Every client maintains its own up-to-date view of the data
• There is a single data model, while the views and
controllers are replicated
• Permits the use of different interfaces (browser)
• Synchronization by events or by state
• Communication architecture normally centralized (the data
are located at the server)
Semi-replicated Architecture
[Figure: semi-replicated architecture]
Centralized Architecture
• Data and view are maintained centrally
• Every client has a graphic server for displaying the view
• Synchronization by state (the view)
• Communications architecture centralized
• It causes heavy data traffic over the network (the
whole view is transmitted)
• Such systems are frequently general-purpose (like NetMeeting)
Full centralized Architecture
[Figure: clients exchange view / commands with the central application]
Implementation of Communications
in a TCP/IP Network
• At a low level (the future “assembly language of communications”?)
- Based on the “sockets” & “ports” abstractions
- Originally developed for BSD UNIX but now present in almost all
systems (UNIX, LINUX, Macintosh OS, Windows)
- The destination of a message is determined by the computer’s IP
number and the port number
- Every machine has 2**16 ports
- The origin of the message is also a socket, but most of the time its port
number is not important
- Ports are associated with services (programs)
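A minimal sketch of the socket and port abstractions, using Python's standard socket module. The helper names are illustrative; the loopback address and the use of port 0 (asking the operating system for any free port) are choices made for this example.

```python
import socket

MAX_PORTS = 2 ** 16  # port numbers range from 0 to 65535


def open_bound_socket(port=0):
    # A socket address is the pair (IP number, port number).
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", port))
    return sock


def demo():
    with open_bound_socket() as sock:
        ip, port = sock.getsockname()
        return ip, port


if __name__ == "__main__":
    print(demo(), "out of", MAX_PORTS, "possible port numbers")
```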
The 3 basic communication forms
• UDP communication reflects almost exactly what really happens
over the internet. An application sends a packet through a
socket addressed to a certain IP number and port. There
should be another application on that host listening for
packets coming to that port (which is agreed beforehand)
• TCP simulates a data flow. A client must establish a
connection with the server before it starts
sending/receiving data. The server must be waiting for such a
request.
• Multicast fits well for group communication when the
group is not well defined beforehand (spontaneous
networking). It is also based on sending UDP packets,
but all “interested” applications may receive them. It does not
require a central server
Protocols for communication
• Every service is normally identified by a port
– Web: HTTP (port 80)
– Mail: SMTP (port 25)
– File transfer: FTP (port 21)
– Remote login: SSH (port 22), Telnet (port 23)
• Servers with/without Connection
– connectionless style: UDP
– connection-oriented style: TCP
The channel which server and client
use to communicate (either in TCP or
UDP) is called a SOCKET
When a server wants to start listening it must create a socket
bound to a port. The port is specified with a number.
[Figure: host www.thisserver.jp runs A SERVER 1 on port 4444, A SERVER 2 on port 3333 and A SERVER 3 on port 5555]
If a client wants to communicate with server 1, it should contact
computer www.thisserver.jp through port 4444
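The idea that several servers on one host are told apart only by their port can be sketched as below. The sketch runs everything on the loopback address and lets the operating system choose free ports instead of the 4444, 3333 and 5555 of the slide; the function names and the trivial "answer with your own name" protocol are assumptions for illustration.

```python
import socket
import threading


def start_server(name):
    # Create a socket bound to a port and start listening on it.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: the OS picks a free port
    listener.listen()
    port = listener.getsockname()[1]

    def answer_once():
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(name.encode())  # tell the client who answered
        listener.close()

    thread = threading.Thread(target=answer_once)
    thread.start()
    return port, thread


def ask(port):
    # The client selects the server purely by its port number.
    with socket.create_connection(("127.0.0.1", port)) as sock:
        with sock.makefile("rb") as reader:
            return reader.read().decode()  # read until the server closes


def demo():
    names = ("SERVER 1", "SERVER 2", "SERVER 3")
    servers = {name: start_server(name) for name in names}
    replies = {name: ask(port) for name, (port, _thread) in servers.items()}
    for _port, thread in servers.values():
        thread.join()
    return replies


if __name__ == "__main__":
    print(demo())
```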
UDP: communication with datagrams
DATAGRAM: an independent, self-contained message sent over
the internet whose arrival, arrival time and content are not
guaranteed (like regular mail in some countries....)
Once a server is listening, the client should create a datagram
with the server’s address, the port number and the message
[Figure: A CLIENT on www.waseda2.jp prepares a datagram holding the address www.waseda1.jp, the port 4444 and the message; A SERVER is listening on port 4444 of www.waseda1.jp]
Sending datagrams with UDP protocol
Then it should open a socket and send the datagram
to the internet. The “routing algorithm” will find the
way to the target computer
[Figure: A CLIENT on www.waseda2.jp opens a socket on port 3333 and sends the datagram towards A SERVER on port 4444 of www.waseda1.jp]
Sending datagrams with UDP protocol
Before the datagram leaves the client, it is stamped with the
address of the originating computer and the port number of
the sending socket
[Figure: the datagram travels from A CLIENT (www.waseda2.jp, port 3333) to A SERVER (www.waseda1.jp, port 4444)]
Sending datagrams with UDP protocol
After the datagram is sent, the client computer may
start listening on the port created for sending the
datagram if an answer from the server is expected
Sending datagrams with UDP protocol
The server can extract the client’s address and port
number to create another datagram with the answer
[Figure: A SERVER (www.waseda1.jp, port 4444) builds a datagram with the answer, addressed to A CLIENT (www.waseda2.jp, port 3333)]
Sending datagrams with UDP protocol
Finally it sends the datagram with the answer to the “client”.
When a datagram is sent there is no guarantee that it will arrive
at its destination. If you want reliable communication you
should provide a checking mechanism, or use ...
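The datagram exchange described in these slides can be sketched in a single Python process. This is a hedged illustration: both sockets live on the loopback interface, the upper-casing "service" and the function name are made up, and a timeout stands in for a real checking mechanism (on a real network the datagram could simply be lost).

```python
import socket


def udp_round_trip(message):
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        # The server listens on a port (chosen by the OS for this sketch).
        server.bind(("127.0.0.1", 0))
        server_addr = server.getsockname()
        server.settimeout(2.0)
        client.settimeout(2.0)

        # The client builds a datagram: destination address, port and message.
        # On sending, the OS gives the client socket a port of its own.
        client.sendto(message.encode(), server_addr)

        # The server extracts the client's address and port from the datagram...
        data, client_addr = server.recvfrom(1024)
        # ...and uses them to address the answer.
        server.sendto(data.upper(), client_addr)

        # The client listens on the same port it used for sending.
        answer, _sender = client.recvfrom(1024)
        same_port = client_addr[1] == client.getsockname()[1]
        return answer.decode(), same_port


if __name__ == "__main__":
    print(udp_round_trip("hello"))
```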
TCP: communication with data flow
With TCP a communication channel is built between both
computers and reliable communication is
established over it. This allows sending
a data flow rather than datagrams.
[Figure: A CLIENT (www.waseda2.jp, port 3333) contacts A SERVER (www.waseda1.jp, port 4444)]
TCP: communication with data flow
After the client contacts the server, a reliable channel is
established. After this, client and server may begin
sending data through this channel. The other side should be
reading this data: they need a protocol!
[Figure: A CLIENT (www.waseda2.jp, port 3333) and A SERVER (www.waseda1.jp, port 4444) exchange a flow of “bla bla bla” over the channel]
TCP: How is reliability achieved ?
The internet itself works only with the datagram paradigm. Internet
frames may “get lost” (be destroyed), so for every frame delivered
carrying a part of the data flow there is a confirmation!
Sending
bla bla bla
Sending 1st bla
Ack 1st bla
Sending 2nd bla
Ack 2nd bla
Sending 3rd bla
Ack 3rd bla
What if a message gets lost?
The sender waits a certain amount of time. If it does not receive any
confirmation it sends the message again.
Sending
bla bla bla
Sending 1st bla
Ack 1st bla
Sending 2nd bla
LOST !!!
No confirmation !!!
Sending 2nd bla again
Ack 2nd bla
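The retransmission rule can be illustrated with a small simulation. It is a simplified stop-and-wait model, not TCP's actual implementation: losses are scripted through the hypothetical lost_attempts parameter, only lost data segments are modelled (a lost confirmation would also require the receiver to discard duplicates), and no real timer is involved.

```python
def stop_and_wait(segments, lost_attempts):
    """Send segments one at a time, resending each until it is acknowledged.

    lost_attempts is a set of (segment_index, attempt_number) pairs that the
    simulated network drops; attempts are numbered from 1.
    """
    log = []
    delivered = []
    for index, segment in enumerate(segments):
        attempt = 1
        while True:
            log.append(f"send {index + 1} (attempt {attempt})")
            if (index, attempt) in lost_attempts:
                # No confirmation arrives, so the sender times out and resends.
                log.append(f"timeout waiting for ack {index + 1}")
                attempt += 1
                continue
            delivered.append(segment)
            log.append(f"ack {index + 1}")
            break
    return delivered, log


if __name__ == "__main__":
    # Reproduce the slide: the first transmission of the 2nd "bla" is lost.
    delivered, log = stop_and_wait(["bla", "bla", "bla"], {(1, 1)})
    print("\n".join(log))
```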
The Window for improving efficiency
The transmitter keeps a set (window) of packets that have been sent but not yet acknowledged
Sending 1st bla
Sending 2nd bla
Sending 3rd bla
Ack 1st bla
Ack 2nd bla
Ack 3rd bla
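The effect of the window can be sketched with another small simulation. It assumes a loss-free network in which confirmations arrive in order, and the function and parameter names are illustrative; it only shows how many packets may be in flight before the sender has to wait.

```python
from collections import deque


def sliding_window(segments, window):
    """Return the send/ack trace for a sender allowed `window` unacked packets."""
    if window < 1:
        raise ValueError("window must be at least 1")
    log = []
    in_flight = deque()  # indices of packets sent but not yet acknowledged
    next_to_send = 0
    while next_to_send < len(segments) or in_flight:
        # Keep sending while the window is not full.
        while next_to_send < len(segments) and len(in_flight) < window:
            in_flight.append(next_to_send)
            log.append(f"send {next_to_send + 1}")
            next_to_send += 1
        # The confirmation for the oldest outstanding packet arrives.
        acked = in_flight.popleft()
        log.append(f"ack {acked + 1}")
    return log


if __name__ == "__main__":
    print(sliding_window(["bla", "bla", "bla"], window=3))
    print(sliding_window(["bla", "bla", "bla"], window=1))
```

With a window of 3 the trace matches the slide (three sends, then three confirmations); with a window of 1 it degenerates into sending one packet and waiting for its confirmation each time.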
TCP or UDP Protocol:
decision at the transport level
• What does it mean for the programmer/designer:
– By choosing one or the other protocol for establishing a connection
between machines the programmer/designer decides about the
reliability and speed of the communication.
• TCP provides high reliability: data are only sent once the connection
has been established. An underlying protocol is responsible for retransmitting,
ordering, and eliminating duplicate packets
• UDP reflects just what the internet does with the packets: best effort
delivery, no checking.
– Also the programming style is quite different:
• With TCP the data are sent as a flow (of bytes, in principle) which can be
written and read as if they were stored in a file.
• With UDP the programmer must assemble the packet and send it to the
internet without knowing if it will arrive at its intended destination
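The file-like programming style of TCP can be sketched with Python's socket.makefile, which wraps a connected socket in an object with write and read methods. Both ends live in one process on the loopback interface for brevity; tcp_pair and demo are hypothetical helper names.

```python
import socket


def tcp_pair():
    # Build an established TCP connection between two local sockets.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _addr = listener.accept()
    listener.close()
    return client, server_side


def demo():
    client, server_side = tcp_pair()
    with client, server_side:
        # Write to the connection as if it were a file.
        with client.makefile("w") as out:
            out.write("first line\n")
            out.write("second ")
            out.write("line\n")
        client.shutdown(socket.SHUT_WR)  # signal "end of file" to the reader
        # Read from the connection as if it were a file.
        with server_side.makefile("r") as inp:
            return inp.read().splitlines()


if __name__ == "__main__":
    print(demo())
```

Three write calls arrive as two lines: the flow does not preserve the boundaries of individual writes, which is why both sides need a protocol (here, newline-terminated lines) to delimit messages.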
When to use one or another
• Considerations
– TCP imposes a much higher load on the network than UDP (almost 6
times)
– We can expect high packet loss when the information travels through
many routers.
– Inside a LAN, UDP communication may be reliable if there is not much
traffic, although with some congestion we can expect some packets to
be lost even inside the LAN
• In general, it is recommended, especially for beginners (but also for
skilled programmers), to use only TCP to develop distributed
applications. Not only is it more reliable, but the programming style is
also simpler. UDP is normally used if the application needs to
implement hardware-supported broadcasting or multicasting, or if the
application cannot tolerate the overhead of TCP
When should programmers use UDP
or TCP?
- TCP generates 6 times more traffic than UDP
- It is also slower to send and receive the messages
UDP
- incomplete information is acceptable
- fast
- data valid only for a very short period of time
- history not important
TCP
- reliable
- complete
- data valid for a certain period of time
- no need for speed
Mark with a + the applications to use
TCP and with a = those to use UDP
E-Mail
Video conference
Web server and client
Stock values every 5 seconds
Temperature every second