Transcript cap1-2006

Advanced Java Programming:
Programming of distributed
applications using TCP/IP
Tokyo, Jan. Feb. 2006
Nelson Baloian, Roberto Konow
Content
0. Introduction (concepts of distributed systems)
1. TCP/IP client & server programming
– Client programming:
– a simple client (date, echo)
– a pop3 client
– an SMTP client
2. Server programming (and their clients)
– simple client-server example with serialization
– file servers: simple whole-file iterative server (not secure)
– simple whole-file robust server
– simple whole-file concurrent server
– stateless random-access file server
– TCP/IP chat with awareness
– a simple extensible web server
– parallel downloading techniques
– awareness in a TCP/IP peer-to-peer environment and the latecomers problem
3. UDP programming
- simple UDP client-server example
- a "ping" program
- multicasting
- multicasting chat
- awareness in a multicasting environment
- broadcasting vs. multicasting
4. RMI client-server programming
- a simple example will be used to show: rmiregistry, concurrency,
automatic stub distribution
- a sequential file server with state
- automatic teller machine example
- RMI-based chat with awareness
5. Introduction to servlets
- principles, parameters (from request and parameter file)
- using forms
- implementing state with cookies/sessions
Evaluation
• 3 homeworks
• Final examination
Why distributed systems ?
- Share resources (25 years ago)
- Let people communicate (now)
- Performance, scalability (always)
- Fault tolerant systems (always)
Which distributed programs
do I use daily ?
1- ICQ
2- email
3- p2p file sharing
4- web browser-server
5- database software
6- file server
Can we deduce how they were
developed ?
1- Programming language and resources used
2- Connection style
3- Communications architecture
4- Software architecture
5- Server design (if any)
What is the INTERNET ?
Internet : two different ways to deliver a
message to another application
Application programmers decide on this according to their needs
UDP (User Datagram Protocol): like writing a letter
TCP or UDP
Every layer has the illusion of talking to
the same one located at the other host
[Figure: layered view of a client and a server (port 4444): read/write sequence, UDP or TCP communication, Internet frames and addresses, electric pulses]
Implementation of Communications
in a TCP/IP Network
• At a low level (the future “assembler of communications”?)
• Based on the “sockets” & “ports” abstractions
• Originally developed for BSD UNIX but now present in almost all
systems (UNIX, Linux, Mac OS, Windows)
• The destination of a message is determined by the computer’s IP
number and the port number
• Every machine has 2**16 ports
• The origin of the message is also a socket, but most of the time the
port number is not important
• Ports are associated with services (programs)
The 3 basic communication forms
• UDP communication reflects almost exactly what really happens
over the internet. An application sends a packet through a
socket addressed to a certain IP number and port. There
should be another application on that host listening for
packets arriving at that port (which is agreed beforehand)
• TCP simulates a data flow. A client must establish a
connection with the server before it starts
sending/receiving data. The server must be waiting for such a
request.
• Multicast fits well for group communication when the
group is not well defined beforehand (spontaneous
networking). It is also based on sending UDP packets,
but all “interested” applications may receive them. It does not
require a central server
Protocols for communication
• Every service is normally identified by a port
– Web: HTTP (port 80)
– Mail: SMTP (port 25)
– File transfer: FTP (port 21)
– Remote login: SSH (port 22), Telnet (port 23)
• Servers with/without Connection
– connectionless style: UDP
– connection-oriented style TCP
The channel which server and client
use to communicate (either in TCP or
UDP) is called a SOCKET
When a server wants to start listening it must create a socket
bound to a port. The port is specified with a number.
[Figure: the host www.informatik.de runs server 1 on port 4444, server 2 on port 3333 and server 3 on port 5555]
If a client wants to communicate with server 1, it should contact the
computer www.informatik.de through port 4444
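The idea of a server socket bound to a port can be sketched in Java as follows. This is only a minimal illustration: the class name PortBinding and the helper canBind are made up for this example, and port 0 is used so that the operating system picks a free port instead of the fixed numbers (4444, 3333, 5555) shown on the slide.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortBinding {
    /** Tries to bind a listening socket to the given port; returns true if that worked. */
    static boolean canBind(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false; // typically: the port is already used by another service
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0)) {
            int port = server.getLocalPort();
            System.out.println("Server is listening on port " + port);
            // While this server holds the port, a second server normally cannot bind to it.
            System.out.println("Second bind possible? " + canBind(port));
        }
    }
}
```

A real server would use an agreed port number so that clients know where to connect.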
UDP: communication with datagrams
DATAGRAM: an independent, self-contained message sent over
the internet whose arrival, arrival time and content are not
guaranteed (like regular mail in some countries....)
Once a server is listening, the client should create a datagram
with the server’s address, port number and, the message
[Figure: a server at www.informatik.de listens on port 4444; a client at www.waseda2.jp builds a datagram containing the server address, port 4444 and the message]
Sending datagrams with UDP protocol
Then it should open a socket and send the datagram
to the internet. The “routing algorithm” will find the
way to the target computer
[Figure: the client at www.waseda2.jp opens a socket (port 3333) and sends the datagram towards the server at www.informatik.de, port 4444]
Sending datagrams with UDP protocol
Before the datagram leaves the client, it is stamped with the
address of the originating computer and the port number of
the sending socket
[Figure: the datagram travels from the client at www.waseda2.jp (port 3333) to the server at www.informatik.de (port 4444)]
Sending datagrams with UDP protocol
After the datagram is sent, the client computer may
start listening on the port created for sending the
datagram if an answer from the server is expected
[Figure: the client at www.waseda2.jp listens on port 3333 while the server at www.informatik.de (port 4444) receives the datagram]
Sending datagrams with UDP protocol
The server can extract the client’s address and port
number to create another datagram with the answer
[Figure: the server at www.informatik.de (port 4444) builds an answer datagram addressed to the client at www.waseda2.jp, port 3333]
Sending datagrams with UDP protocol
Finally it sends the datagram with the answer to the “client”.
When a datagram is sent there is no guarantee that it will arrive
at the destination. If you want reliable communication you
should provide a checking mechanism, or use ...
[Figure: the answer datagram travels from the server at www.informatik.de (port 4444) to the client at www.waseda2.jp (port 3333)]
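The datagram exchange described on these slides can be sketched in Java with DatagramSocket and DatagramPacket. This is a minimal sketch under some assumptions: client and server both run on the local machine (loopback address), the operating system chooses the port numbers instead of 4444 and 3333, and the class name UdpEchoDemo and the method roundTrip are invented for the example. The "server" simply sends the received message back.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class UdpEchoDemo {
    /** Sends one datagram to a local echo server and returns the text of the answer. */
    static String roundTrip(String message) throws Exception {
        InetAddress local = InetAddress.getLoopbackAddress();
        try (DatagramSocket server = new DatagramSocket(0, local);
             DatagramSocket client = new DatagramSocket()) {

            Thread serverThread = new Thread(() -> {
                try {
                    byte[] buf = new byte[1024];
                    DatagramPacket request = new DatagramPacket(buf, buf.length);
                    server.receive(request); // blocks until a datagram arrives
                    // The sender's address and port are taken from the received datagram.
                    DatagramPacket answer = new DatagramPacket(
                            request.getData(), request.getLength(),
                            request.getAddress(), request.getPort());
                    server.send(answer);
                } catch (Exception e) {
                    // socket closed or I/O error: nothing to answer in this sketch
                }
            });
            serverThread.start();

            // The client builds a datagram with server address, port and message.
            byte[] data = message.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(data, data.length, local, server.getLocalPort()));

            // Then it listens on the same socket for the answer (give up after 2 s).
            client.setSoTimeout(2000);
            byte[] buf = new byte[1024];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            client.receive(reply);
            serverThread.join();
            return new String(reply.getData(), 0, reply.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Answer: " + roundTrip("hello"));
    }
}
```

The timeout is the only protection against a lost datagram here; UDP itself does not resend anything.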
TCP: communication with data flow
With TCP, a communication channel is built between both
computers and reliable communication is established
over it. This allows sending
a data flow rather than datagrams.
[Figure: the client at www.waseda2.jp (port 3333) contacts the server at www.informatik.de (port 4444)]
TCP: communication with data flow
After the client contacts the server, a reliable channel is
established. After this, client and server may begin
sending data through this channel. The other side should be
reading this data: they need a protocol!
[Figure: the client at www.waseda2.jp (port 3333) and the server at www.informatik.de (port 4444) exchange a data flow ("bla bla bla") over the channel]
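A TCP exchange over such a channel might look like this in Java. It is a sketch, not a complete server: both sides run on the local machine, the port is chosen by the operating system, and the "protocol" is an assumption made for this example (one text line as request, one line starting with "echo: " as answer). The names TcpEchoDemo and roundTrip are hypothetical.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class TcpEchoDemo {
    /** Connects to a local one-shot server, sends one line and returns the answer line. */
    static String roundTrip(String line) throws Exception {
        InetAddress local = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, local)) {

            Thread serverThread = new Thread(() -> {
                // accept() waits until a client establishes the connection.
                try (Socket conn = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(conn.getOutputStream(), StandardCharsets.UTF_8), true)) {
                    String request = in.readLine(); // read from the flow like from a file
                    out.println("echo: " + request);
                } catch (Exception e) {
                    // connection problem: the client will see the stream end
                }
            });
            serverThread.start();

            try (Socket client = new Socket(local, server.getLocalPort());
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8))) {
                out.println(line);              // write into the data flow
                String answer = in.readLine();  // wait for the server's line
                serverThread.join();
                return answer;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("bla bla bla"));
    }
}
```

Both sides must agree on who writes first and where a message ends (here: the end of a line); that agreement is the protocol.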
TCP: How is reliability achieved ?
The internet itself works only with the datagram paradigm. Internet
frames may “get lost” (be destroyed), so for every frame delivered
carrying a part of the data flow there is a confirmation!
Sending bla bla bla
Sending 1st bla
Ack 1st bla
Sending 2nd bla
Ack 2nd bla
Sending 3rd bla
Ack 3rd bla
What if a message gets lost ?
The sender waits a certain amount of time. If it does not receive a
confirmation, it sends the message again.
Sending bla bla bla
Sending 1st bla
Ack 1st bla
Sending 2nd bla
LOST !!!
No confirmation !!!
Sending 2nd bla again
Ack 2nd bla
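The send / wait / resend cycle above can be imitated with a small simulation. This is not how TCP is implemented; it is a simplified sketch in which no real network is used and the loss of a frame is decided by a set of transmission attempt numbers (an assumption made so that the run is repeatable). The names StopAndWaitDemo and send are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

class StopAndWaitDemo {
    /**
     * Sends the frames one at a time and waits for each confirmation.
     * Transmission attempts are counted 1, 2, 3, ...; an attempt whose number
     * is in lostAttempts is treated as lost, so the frame is sent again.
     * Returns a log of what happened.
     */
    static List<String> send(List<String> frames, Set<Integer> lostAttempts) {
        List<String> log = new ArrayList<>();
        int attempt = 0;
        for (int i = 0; i < frames.size(); i++) {
            boolean acknowledged = false;
            while (!acknowledged) {
                attempt++;
                log.add("send " + i + ":" + frames.get(i));
                if (lostAttempts.contains(attempt)) {
                    log.add("timeout " + i); // no confirmation arrived in time
                } else {
                    log.add("ack " + i);
                    acknowledged = true;
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // The second transmission attempt (the 2nd "bla") is lost, as on the slide.
        List<String> log = send(Arrays.asList("bla", "bla", "bla"), Collections.singleton(2));
        for (String entry : log) {
            System.out.println(entry);
        }
    }
}
```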
The Window for improving efficiency
The transmitter keeps a set (window) of sent but not yet acknowledged packets, so it does not wait for each confirmation before sending the next packet
Sending 1st bla
Sending 2nd bla
Sending 3rd bla
Ack 1st bla
Ack 2nd bla
Ack 3rd bla
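The window idea can also be shown with a small simulation. As before, this is a simplified sketch and not the real TCP algorithm: there is no network and no loss, acknowledgements arrive in order, and the names SlidingWindowDemo and send are made up. The point is only that up to windowSize frames may be "in flight" at once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class SlidingWindowDemo {
    /** Sends frameCount frames with at most windowSize unacknowledged frames outstanding. */
    static List<String> send(int frameCount, int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be at least 1");
        }
        List<String> log = new ArrayList<>();
        Deque<Integer> unacknowledged = new ArrayDeque<>();
        int next = 0;
        while (next < frameCount || !unacknowledged.isEmpty()) {
            // Keep sending while the window still has room.
            while (next < frameCount && unacknowledged.size() < windowSize) {
                unacknowledged.addLast(next);
                log.add("send " + next);
                next++;
            }
            // The oldest outstanding frame is confirmed; the window slides by one.
            log.add("ack " + unacknowledged.removeFirst());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println("window 3: " + send(3, 3));
        System.out.println("window 1: " + send(3, 1));
    }
}
```

With a window of 1 the behaviour falls back to sending one frame and waiting for its confirmation.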
TCP or UDP Protocol:
decision at the transport level
• What does it mean for the programmer/designer:
– By choosing one or the other protocol for establishing a connection
between machines, the programmer/designer decides on the
reliability and speed of the communication.
• TCP provides high reliability: data are only sent once the connection
has been established. An underlying protocol is responsible for retransmitting,
ordering, and eliminating duplicate packets
• UDP reflects just what the internet does with the packets: best-effort
delivery, no checking.
– The programming style is also quite different:
• With TCP the data is sent as a flow (of bytes, in principle) which can be
written and read as if it were stored in a file.
• With UDP the programmer must assemble the packet and send it to the
internet without knowing whether it will arrive at its intended destination
When to use one or another
• Considerations
– TCP imposes a much higher load on the network than UDP (almost 6
times)
– We can expect high packet loss when the information travels through
many routers.
– Inside a LAN, UDP communication may be reliable if there is not much
traffic, although with some congestion we can expect some packets to
be lost even inside the LAN
• In general, it is recommended, especially for beginners (but also for
skilled programmers), to use only TCP to develop distributed
applications. Not only is it more reliable, but the programming style is
also simpler. UDP is normally used if the application needs to
implement hardware-supported broadcasting or multicasting, or if the
application cannot tolerate the overhead of TCP
When should programmers use UDP
or TCP ?
- TCP generates 6 times more traffic than UDP
- It is also slower at sending and receiving messages
UDP
- not complete info
- fast
- valid in a very short
period of time
- history not important
TCP
- Reliable
- Complete
- Valid in a certain
period of time
- No need of speed
Mark with a + the applications to use
TCP and with a = those to use UDP
E-Mail
Video conference
Web server and client
Stock values every 5 seconds
Temperature every second
Nowadays there is a lot of middleware
which makes distributed programming
much easier
Libraries for distributed
programming (middleware)
RPC, CORBA, RMI
Goals of the Middleware
• Provide a framework that makes the
development of distributed systems easier
• Hide (encapsulate) communication details
• Make distributed programming similar to
local programming
• Standardization of communication protocols
and data formats
• This help does not come for free !!!
The client-server paradigm
(do you remember the WEB ?)
[Figure: the web client program sends a request through THE INTERNET to the web server program, which uses its web resources and sends back an answer]
1- The server opens a channel and
starts listening for requests.
[Figure: step 1, the server with its web resources waits for requests coming from THE INTERNET; the client has not contacted it yet]
2- A client that knows the server sends a
request and waits for the answer
[Figure: step 2, the request travels from the client through THE INTERNET to the server]
3- The server analyses the request and
answers properly according to the
protocol. This may involve
reading a file.
[Figure: step 3, the server reads from its web resources and the answer travels through THE INTERNET back to the client]
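Step 3 (analysing the request and answering according to the protocol) can be sketched without any networking. The protocol below is a toy one invented for this example, not HTTP: a request is the text "GET name", the answer is "OK content" or an "ERROR ..." line, and the web resources are held in a Map instead of files. The names ToyWebProtocol and handle are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class ToyWebProtocol {
    /** Analyses one request line and builds the answer according to the toy protocol. */
    static String handle(String request, Map<String, String> resources) {
        if (request == null || !request.startsWith("GET ")) {
            return "ERROR bad request";
        }
        String name = request.substring(4).trim();
        String content = resources.get(name); // a real server might read a file here
        if (content == null) {
            return "ERROR not found: " + name;
        }
        return "OK " + content;
    }

    public static void main(String[] args) {
        Map<String, String> resources = new HashMap<>();
        resources.put("index.html", "<html>hello</html>");
        System.out.println(handle("GET index.html", resources));
        System.out.println(handle("GET missing.html", resources));
        System.out.println(handle("HELLO", resources));
    }
}
```

In a real server this method would be called with the line read from the client's socket, and its result would be written back on the same connection.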
Why Client/Server ?
It is a communication protocol model (listener/caller)
• TCP/IP does not provide any mechanism to start running
a program on a computer when a message arrives. A program must
be executing BEFORE the message arrives in order to establish a
communication (daemons).
• Is there really no other means of communication ?
– Multicasting (but the sender does not know who is receiving, and
in this case there is no dialogue)
• Most programs do not act as pure servers or clients
– It is very frequent for the server of a certain program to act as a
client of another
– Sometimes a group of programs are clients and servers of each
other at the same time!
The Client-Server Model
[Figure: clients and servers 1, 2 and 3 connected by "invocation" and "result" arrows]
Services Provided by Multiple
Servers
[Figure: two clients using a service provided jointly by servers 1, 2 and 3]
Proxy servers & caches
[Figure: two clients reach servers 1 and 2 through a proxy/cache]
Peer-to-peer Applications
[Figure: three peers, each consisting of application + coordination, connected to each other]
Communication Architectures for
Distributed Applications
• Servers as Clients
– Programs do not behave as pure servers or as pure clients. For
example, a file server can ask another computer for a timestamp to
register the last change of a file.
– When all applications must behave as client and server
at the same time, we can organize the communication in two basic ways:
• Every application can open a communication channel with each other
application (network configuration): P2P applications
• There is a communication server and all applications open one
communication channel with it (star configuration): multiple chat
servers.
Network communication
architecture
• Every application opens an exclusive channel with each other
application present in the session
• There may be up to n*(n-1)/2 channels open for n applications
• Advantages:
– It avoids bottlenecks in the communications
• Drawbacks:
– All applications must be aware of all others taking part in the session
– Keeping consistency is more complicated when
applications enter and leave the session
Star communication architecture
• The applications open a channel with the server and send their
communication requests to the server. This server takes the
message and forwards it to its final destination
• There are up to n channels open for n applications
• Advantages:
– The communication parameters are easier to manage
– The arrival and departure of applications is easier to
handle
• Drawbacks:
– The server can get overloaded
– The channels may get overloaded.
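The channel counts of the two architectures (up to n*(n-1)/2 for the network configuration, up to n for the star) can be compared with a few lines of Java. The class and method names are made up for this illustration.

```java
class ChannelCount {
    /** Network configuration: one channel between every pair of applications. */
    static long network(int n) {
        return (long) n * (n - 1) / 2;
    }

    /** Star configuration: one channel from each application to the server. */
    static long star(int n) {
        return n;
    }

    public static void main(String[] args) {
        int[] sizes = {2, 5, 10, 50};
        for (int n : sizes) {
            System.out.println(n + " applications: network = " + network(n)
                    + " channels, star = " + star(n) + " channels");
        }
    }
}
```

For 10 applications this gives 45 channels against 10, which shows how fast the network configuration grows.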
Replicated Architectures
• Every participant has a copy of the application and the
data
• The modifications (data) are distributed to all participants
in some way
• Synchronization is normally achieved by distributing the
events, not the state of the data
• Problems with latecomers
• Communication architecture may be that of a star or
network type
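Synchronization by events and the latecomers problem can be illustrated with a small in-memory sketch. Everything here is an assumption made for the example: the classes Replica and Session, the use of text lines as data, and the choice of replaying a stored event history as one possible way of helping latecomers. No network is involved.

```java
import java.util.ArrayList;
import java.util.List;

class ReplicatedDemo {
    /** One participant: holds its own copy of the data (here, a list of text lines). */
    static class Replica {
        final List<String> data = new ArrayList<>();

        void apply(String event) {
            data.add(event); // every replica applies the same event to its own copy
        }
    }

    /** Distributes each event to all current replicas and remembers it for latecomers. */
    static class Session {
        private final List<Replica> replicas = new ArrayList<>();
        private final List<String> history = new ArrayList<>();

        void join(Replica replica, boolean replayHistory) {
            if (replayHistory) {
                for (String event : history) {
                    replica.apply(event); // bring the latecomer up to date
                }
            }
            replicas.add(replica);
        }

        void publish(String event) {
            history.add(event);
            for (Replica replica : replicas) {
                replica.apply(event);
            }
        }
    }

    public static void main(String[] args) {
        Session session = new Session();
        Replica early = new Replica();
        session.join(early, false);
        session.publish("line 1");

        Replica lateWithoutReplay = new Replica();
        Replica lateWithReplay = new Replica();
        session.join(lateWithoutReplay, false);
        session.join(lateWithReplay, true);
        session.publish("line 2");

        System.out.println("early:               " + early.data);
        System.out.println("late, no replay:     " + lateWithoutReplay.data);
        System.out.println("late, with replay:   " + lateWithReplay.data);
    }
}
```

The replica that joins late without replay ends up with different data from the others, which is the latecomers problem.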
Replicated Architecture
[Figure: several applications, each with its own copy of the data and its own view]
Semi-replicated Architectures
• Data are kept centralized by a single application
• Every client maintains its own up-to-date view of the data
• There is a single data model, while the views and
controllers are replicated
• Permits the use of different interfaces (browser)
• Synchronization by events or by state
• Communication architecture is normally centralized (the data
are located at the server)
Semi-replicated Architecture
[Figure: semi-replicated architecture; the diagram shows three boxes labelled "Data"]
Centralized Architecture
• Data and view are kept centralized
• Every client has a graphic server for displaying the view
• Synchronization by state (the view)
• Communication architecture is centralized
• It causes heavy data traffic over the network (the
whole view is transmitted)
• Such systems are frequently general-purpose (like NetMeeting)
Fully centralized Architecture
[Figure: clients exchange view / commands with the central application]
Web-based Systems
[Figure: components of a web-based system connected through THE INTERNET: web server, servlet, web service, EJB, DBMS]
1- What is the WEB ?
2- What are Web-based systems ?
3- Why Web-based systems ?
4- Which are the most used Java-based resources ?
Development and execution
of stand-alone programs
1. Write source code
2. Compile it with javac
3. Run it with the JVM (java)
[Figure: MyProg.java (Java source code) -> Java compiler (javac) -> MyProg.class (Java class file) -> Java VM (java)]
Applets
[Figure: MyApplet.java (Java source code) is compiled to MyApplet.class (Java class file); Pagina.html contains an applet tag that refers to it]
Pagina.html
<applet code=MyApplet.class >
<parameters>
</applet>
[Figure: the browser sends GET Pagina.html; the server returns Pagina.html and then MyApplet.class]
Servlets
[Figure: MyServlet.java (Java source code) is compiled to MyServlet.class (Java class file); the client sends GET MyServlet, MyServlet.class runs on the server and HTML is returned]
Java Script
The code is written inside the HTML page
Html & Script
<script language="JavaScript">
the code
</script>
Java program
running on the client
J2EE
[Figure: J2EE architecture with a web server (Servlet, JSP), an application server (J2EE beans) and a DBMS; the client contacts a Servlet or JSP and receives a response]