Transcript Document
Welcome
to the first session of the course
Integratie van Software Systemen (Integration of Software Systems)
for Information Science students.
Required prior knowledge:
Gedistribueerde Software Systemen (Distributed Software Systems).
Communication via:
Blackboard, email,
http://www.niii.ru.nl/R.deVries/iss.html
July 20, 2015
Integratie van Software Systemen, lecture 1
1
Integratie van Software Systemen
Information Science
Jan Tretmans, René de Vries
Middleware, Distributed Programming, Java
Books: Distributed Computing, Liu
Web Services, Alonso et al.
Also recommended: IT Architectures and Middleware:
Strategies for Building Large, Integrated Systems, Peter
Bye, Chris Britton, and papers
Weekly small self-study exercises
Exam + 2 larger practical assignments and lecture,
in groups of 2 (precondition)
Topics
Synchronization, Paradigms for Distributed
Computing, Sockets
Client-Server, Multicasting, Distributed
Objects
RMI, XML, CGI, HTTP, CORBA, .NET,
SOAP, Web services, Applets, Servlets, Mobile
Agents, …
Architecture of integrated software
systems
Distributed computing
Mei-Ling Liu
Professor
Department of Computer Science
California Polytechnic University
United States of America
Systems and Distribution
A system is a black box that is capable of
providing a service to its user by means of
interaction.
A distributed system is a regularly interacting
or independent group of items forming a
unified system.
Distributed system, distributed computing
Early computing was performed on a single
processor. Uni-processor computing can be
called centralized computing.
A distributed system is a collection of
independent computers or processes,
interconnected via a network, capable of
collaborating on a task.
Distributed computing is computing
performed in a distributed system.
Distributed Systems
Figure: workstations, a local network, the Internet, and a network host.
Examples of Distributed systems
Network of workstations (NOW): a group of
networked personal workstations connected to
one or more server machines.
The Internet
An intranet: a network of computers and
workstations within an organization,
segregated from the Internet via a protective
device (a firewall).
Example of a large-scale distributed
system – eBay (Source: Los Angeles Times.)
Computers in a Distributed System
Workstations: computers used by end-users to
perform computing
Server machines: computers which provide
resources and services
Personal Assistance Devices: handheld
computers connected to the system via a
wireless communication link.
Centralized vs. Distributed Computing
Figure: centralized computing (terminals and a mainframe computer) compared with distributed computing (workstations and network hosts joined by network links).
Monolithic mainframe applications vs. distributed applications
The monolithic mainframe application architecture:
Separate, single-function applications, such as order-entry or billing
Applications cannot share data or other resources
Developers must create multiple instances of the same functionality
(service).
Proprietary (user) interfaces
The distributed application architecture:
Integrated applications
Applications can share resources
A single instance of functionality (service) can be reused.
Common user interfaces
Why distributed computing?
Economics: distributed systems allow the
pooling of resources, including CPU cycles,
data storage, input/output devices, and
services.
Reliability: distributed systems allow
replication of resources and/or services, thus
reducing service outage due to failures.
The Internet has become a universal platform
for distributed computing.
Distributed Computing
Strengths
The affordability of computers and availability of network
access
Resource sharing
Scalability
Fault Tolerance
Weaknesses
Multiple Points of Failure: the failure of one or more
participating computers, or one or more network links, can
spell trouble.
Security Concerns: In a distributed system, there are more
opportunities for unauthorized attack.
Introductory Basics: three areas
Some of the notations and concepts from
these areas will be employed from time to
time in the presentations for this course:
Software engineering
Operating systems
Networks.
Software Engineering Basics
Procedural versus Object-oriented Programming
In building network applications, there are two
main classes of programming languages:
procedural languages and object-oriented
languages.
Procedural languages, with the C language being the primary example,
use procedures (functions) to break down the complexity of the tasks
that an application entails.
Object-oriented languages, exemplified by Java, use objects to
encapsulate the details. Each object simulates an object in real life,
carrying state data as well as behaviors.
• State data are represented as instance data.
• Behaviors are represented as methods.
UML Class Diagram Notations
Basic UML Class Diagram Notations
A class or interface is drawn as a box with three compartments:
the interface/class name
attributes (name: type)
operations (method names)
Attributes are static or instance variables or constants;
operations are static or instance methods.
NOTE: The shape, the style of the line (dashed or
solid), the direction of the arrow, and the shape of the
arrowheads (pointed, hollow, or solid) are significant.
The Architecture of Distributed Applications
Presentation
Application (Business) logic
Services
Operating Systems Basics
Operating systems basics
A process consists of an executing program,
its current values, state information, and the
resources used by the operating system to
manage its execution.
A program is an artifact constructed by a
software developer; a process is a dynamic
entity which exists only when a program is
run.
Process State Transition Diagram
Figure: simplified finite state diagram for a process's lifetime.
States: start, ready, running, blocked, terminated.
Transitions: queued, dispatch, waiting for event, event completion, exit.
Java processes
There are three types of Java program: applications,
applets, and servlets. Each is written as a class.
A Java application program has a main method, and is
run as an independent (standalone) process.
An applet does not have a main method, and is run
using a browser or the appletviewer.
A servlet does not have a main method, and is run in the
context of a web server.
A Java program is compiled into bytecode, a
universal object code. When run, the bytecode is
interpreted by the Java Virtual Machine (JVM).
Concurrent Processing
On modern day operating systems, multiple processes
appear to be executing concurrently on a machine by
timesharing resources.
Figure: timesharing of a resource. Processes P1 through P4 take turns using the resource over time.
Concurrent processing within a process
It is often useful for a process to have parallel threads of
execution, each of which timeshares the system resources in
much the same way as concurrent processes.
A parent process may spawn child processes.
A process may spawn child threads.
Figure: concurrent processing within a process. A parent process with its child processes, and a process whose main thread runs alongside child thread 1 and child thread 2.
Java threads
The Java Virtual Machine allows an application to have multiple
threads of execution running concurrently.
Java provides a Thread class:
public class Thread
extends Object
implements Runnable
When a Java Virtual Machine starts up, there is usually a single thread
(which typically calls the method named main of some designated
class). The Java Virtual Machine continues to execute threads until
either of the following occurs:
The exit method of class Runtime has been called and the security
manager has permitted the exit operation to take place.
All threads have terminated, either by returning from the call to the run
method or by throwing an exception that propagates beyond the run
method.
Two ways to create a new thread of execution
Using a subclass of the Thread class
Using a class that implements the Runnable
interface
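The two ways listed above can be sketched side by side. This is a minimal illustration, not taken from the slides: the names ThreadCreationDemo, Worker, Task, and runBoth are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ThreadCreationDemo {
    // Shared, thread-safe log so that both threads can record that they ran.
    static final List<String> log = new CopyOnWriteArrayList<>();

    // Way 1: subclass Thread and override run().
    static class Worker extends Thread {
        @Override
        public void run() {
            log.add("subclass");
        }
    }

    // Way 2: implement Runnable and hand the object to a Thread.
    static class Task implements Runnable {
        @Override
        public void run() {
            log.add("runnable");
        }
    }

    // Starts one thread of each kind and waits for both to finish.
    static List<String> runBoth() {
        log.clear();
        Thread t1 = new Worker();
        Thread t2 = new Thread(new Task());
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        // The order of the two entries may vary from run to run.
        System.out.println(runBoth());
    }
}
```

Implementing Runnable leaves the class free to extend something other than Thread, which is often the reason to prefer the second way.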
Thread-safe Programming
When two threads independently access and update
the same data object, such as a counter, as part of
their code, the updating needs to be synchronized.
(See next slide.)
Because the threads are executed concurrently, it is
possible for one of the updates to be overwritten by
the other due to the sequencing of the two sets of
machine instructions executed on behalf of the two
threads.
To protect against the possibility, a synchronized
method can be used to provide mutual exclusion.
Race Condition
Figure: two executions, shown over time, in which concurrent processes or threads 1 and 2 each increment a shared counter.
Execution without overlap; this execution results in the value 2 in the counter:
thread 1: fetch value in counter and load into a register
thread 1: increment value in register
thread 1: store value in register to counter
thread 2: fetch value in counter and load into a register
thread 2: increment value in register
thread 2: store value in register to counter
Interleaved execution; this execution results in the value 1 in the counter:
thread 1: fetch value in counter and load into a register
thread 2: fetch value in counter and load into a register
thread 1: increment value in register
thread 2: increment value in register
thread 1: store value in register to counter
thread 2: store value in register to counter
Synchronized method in a thread
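// Each thread runs update(), which is synchronized so that only one
// thread at a time can read, increment, and write back the shared count.
class SomeThread3 implements Runnable {
    static int count = 0;   // shared by all threads

    SomeThread3() {
        super();
    }

    public void run() {
        update();
    }

    // static synchronized: the lock is on the SomeThread3 class itself
    static public synchronized void update() {
        int myCount = count;
        myCount++;
        count = myCount;
        System.out.println("count=" + count +
            "; thread count=" + Thread.activeCount());
    }
}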
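A small driver makes the effect of the synchronized method visible. This is a sketch under assumptions: the class CounterDemo and the method runThreads are hypothetical names, and the update body mirrors the slide's fetch, increment, store sequence.

```java
class CounterDemo {
    static int count = 0;

    // Only one thread at a time can execute this static synchronized method,
    // so the fetch / increment / store sequence cannot interleave.
    static synchronized void update() {
        int myCount = count;
        myCount++;
        count = myCount;
    }

    // Runs nThreads threads, each calling update() perThread times,
    // and returns the final value of the counter.
    static int runThreads(int nThreads, int perThread) {
        count = 0;
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    update();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10000)); // prints 40000
    }
}
```

Without the synchronized keyword on update(), some increments could be lost and the final value could fall below 40000.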
Network Basics
Network standards and protocols
On public networks such as the Internet, it is
necessary for a common set of rules to be
specified for the exchange of data.
Such rules, called protocols, specify such
matters as the formatting and semantics of
data, flow control, and error correction.
Software can share data over the network
using network software which supports a
common set of protocols.
The network architecture
Network hardware transfers electronic signals, which
represent a bit stream, between two devices.
Modern day network applications require an application
programming interface (API) which masks the underlying
complexities of data transmission.
A layered network architecture allows the functionalities
needed to mask the complexities to be provided
incrementally, layer by layer.
Actual implementation of the functionalities may not be
clearly divided by layer.
The OSI seven-layer network architecture
Figure: two hosts, each with the same seven layers, from top to bottom:
application layer
presentation layer
session layer
transport layer
network layer
data link layer
physical layer
Network Architecture
The division of the layers is conceptual: the
implementation of the functionalities need not
be clearly divided as such in the hardware and
software that implements the architecture.
The conceptual division serves at least two
useful purposes:
1. Systematic specification of protocols: it allows protocols to be
specified systematically.
2. Conceptual data flow: it allows programs to be
written in terms of logical data flow.
The TCP/IP Protocol Suite
The Transmission Control Protocol/Internet Protocol suite is a set of
network protocols which supports a four-layer network architecture.
It is currently the protocol suite employed on the Internet.
Figure: the Internet network architecture. Two hosts, each with the same four layers, from top to bottom:
Application layer
Transport layer
Internet layer
Physical layer
The TCP/IP Protocol Suite -2
The Internet layer implements the Internet
Protocol, which provides the functionalities
for allowing data to be transmitted between
any two hosts on the Internet.
The Transport layer delivers the transmitted
data to a specific process running on an
Internet host.
The Application layer supports the
programming interface used for building a
program.
Network Resources
Network resources are resources available to the
participants of a distributed computing community.
Network resources include hardware such as
computers and equipment, and software such as
processes, email mailboxes, files, web documents.
An important class of network resources is
network services such as the World Wide Web
and file transfer (FTP), which are provided by
specific processes running on computers.
Identification of Network Resources
One of the key challenges in distributed
computing is the unique identification of
resources available on the network, such as email mailboxes, and web documents.
Addressing an Internet Host
Addressing a process running on a host
Email Addresses
Addressing web contents: URL
Addressing an Internet Host
The Internet Topology
Figure: the Internet topology model, showing Internet hosts, subnets, and the Internet backbone.
The Internet Topology
The Internet consists of a hierarchy of
networks, interconnected via a network
backbone.
Each network has a unique network address.
Computers, or hosts, are connected to a
network. Each host has a unique ID within
its network.
Each process running on a host is associated
with zero or more ports. A port is a logical
entity for data transmission.
The Internet addressing scheme
In IP version 4, each address is 32 bits long.
The address space accommodates 2^32 (about 4.3 billion) addresses in total.
Addresses are divided into 5 classes (A through E), identified by the leading bits of byte 0:
Address class       Leading bits   Remaining bits (through byte 3)
class A address     0              network address, host portion
class B address     1 0            network address, host portion
class C address     1 1 0          network address, host portion
multicast address   1 1 1 0        multicast group
reserved address    1 1 1 1 0      reserved
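The class of an IPv4 address can be read off the leading bits of its first byte. The sketch below is illustrative only: Ipv4ClassDemo and addressClass are hypothetical names, and the sketch treats every first byte of the form 1111xxxx as reserved (class E), which is slightly broader than the 1 1 1 1 0 pattern on the slide.

```java
class Ipv4ClassDemo {
    // Returns the class letter implied by the leading bits of the first byte
    // of a dotted-decimal IPv4 address.
    static char addressClass(String dotted) {
        int first = Integer.parseInt(dotted.split("\\.")[0]);
        if (first < 0 || first > 255) {
            throw new IllegalArgumentException("not a byte value: " + dotted);
        }
        if ((first & 0x80) == 0x00) return 'A'; // 0xxxxxxx
        if ((first & 0xC0) == 0x80) return 'B'; // 10xxxxxx
        if ((first & 0xE0) == 0xC0) return 'C'; // 110xxxxx
        if ((first & 0xF0) == 0xE0) return 'D'; // 1110xxxx (multicast)
        return 'E';                             // 1111xxxx (reserved, by assumption)
    }

    public static void main(String[] args) {
        System.out.println(addressClass("129.65.242.5")); // prints B
        System.out.println(addressClass("127.0.0.1"));    // prints A
    }
}
```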
The Internet Address Scheme - 3
For human readability, Internet addresses are
written in a dotted decimal notation:
nnn.nnn.nnn.nnn, where each nnn group is a decimal value
in the range of 0 through 255
# Internet host table (found in /etc/hosts file)
127.0.0.1 localhost
129.65.242.5 falcon.csc.calpoly.edu falcon loghost
129.65.241.9 falcon-srv.csc.calpoly.edu falcon-srv
129.65.242.4 hornet.csc.calpoly.edu hornet
129.65.241.8 hornet-srv.csc.calpoly.edu hornet-srv
129.65.54.9 onion.csc.calpoly.edu onion
129.65.241.3 hercules.csc.calpoly.edu hercules
IP version 6 Addressing Scheme
Each address is 128 bits long.
There are three types of addresses:
Unicast: An identifier for a single interface.
Anycast: An identifier for a set of interfaces
(typically belonging to different nodes). A packet
sent to an anycast address is delivered to one of
the interfaces identified by that address.
Multicast: An identifier for a set of
interfaces (typically belonging to different
nodes). A packet sent to a multicast
address is delivered to all interfaces
identified by that address.
The Domain Name System (DNS)
For user friendliness, each Internet address is mapped
to a symbolic name, using the DNS, in the format of:
<computer-name>.<subdomain hierarchy>.<organization>.<sector name>{.<country code>}
e.g., www.csc.calpoly.edu.us
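Figure: the DNS name tree. Below the root are the top-level domains: com, edu, gov, net, org, and mil (in the U.S.), and the country codes. Below a top-level domain come the organization, its subdomains, and finally the host name.
A top-level domain name has to be applied for.
Subdomain hierarchy and names are assigned
by the organization.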
The Domain Name System
For network applications, a domain name must be
mapped to its corresponding Internet address.
Processes known as domain name system servers
provide the mapping service, based on a
distributed database of the mapping scheme.
The mapping service is offered by thousands of
DNS servers on the Internet, each responsible for a
portion of the name space, called a zone. A
server that has access to the DNS information
(zone file) for a zone is said to have authority for
that zone.
Top-level Domain Names
.com: For commercial entities, which anyone, anywhere in the
world, can register.
.net: Originally designated for organizations directly involved
in Internet operations. It is increasingly being used by
businesses when the desired name under "com" is already
registered by another organization. Today anyone can
register a name in the .net domain.
.org: For miscellaneous organizations, including non-profits.
.edu: For four-year accredited institutions of higher learning.
.gov: For US Federal Government entities
.mil: For US military
Country codes: For individual countries, based on the codes of the
International Organization for Standardization (ISO). For example, nl for The
Netherlands, ca for Canada, and jp for Japan.
Domain Name Hierarchy
Figure: the domain name tree. The root domain (.) branches into the country-code domains (.au ... .ca ... .us ... .zw) and the domains .com, .gov, .edu, .mil, .net, and .org. Under .edu are organizations such as ucsb.edu and calpoly.edu, each with its own subdomains (for example cs, ece, csc, ee, english, wireless).
Name lookup and resolution
If a domain name is used to address a host, its
corresponding IP address must be obtained for the
lower-layer network software.
The mapping, or name resolution, must be
maintained in some registry.
For runtime name resolution, a network service is
needed; a protocol must be defined for the naming
scheme and for the service.
Examples:
The DNS service supports the DNS; the Java RMI
registry supports RMI object lookup; JNDI (Java Naming and
Directory Interface) is an API for looking up network services.
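From a Java program, runtime name resolution is typically a single library call. A minimal sketch, assuming the hypothetical names NameLookupDemo and resolve; the result for a real host name depends on the resolver configuration of the machine it runs on.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class NameLookupDemo {
    // Maps a host name (or a literal address) to a textual IP address.
    // Returns null when the name cannot be resolved.
    static String resolve(String name) {
        try {
            return InetAddress.getByName(name).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // A literal address is only validated, not looked up.
        System.out.println(resolve("127.0.0.1"));
        // A symbolic name is passed to the system resolver (hosts file, DNS).
        System.out.println(resolve("localhost"));
    }
}
```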
Addressing a process running on a host
Logical Ports
Figure: host A and host B, connected through the Internet. Each host runs processes, and a process is attached to a port.
Each host has 65536 ports.
Well Known Ports
Each Internet host has 2^16 (65,536) logical
ports. Port 0 is reserved, so each usable port is identified by a number
between 1 and 65535, and can be allocated to
a particular process.
Port numbers between 1 and 1023 are reserved
for processes which provide well-known
services such as finger, FTP, HTTP, and
email.
Well-known ports
Assignment of some well-known ports
Protocol             Port   Service
echo                 7      IPC testing
daytime              13     provides the current date and time
ftp                  21     file transfer protocol
telnet               23     remote, command-line terminal session
smtp                 25     simple mail transfer protocol
time                 37     provides a standard time
finger               79     provides information about a user
http                 80     web server
RMI Registry         1099   registry for Remote Method Invocation
special web server   8080   web server which supports servlets, JSP, or ASP
Choosing a port to run your program
For a programming exercise: when a port is
needed, choose a random number above the
well-known ports: 1,024 to 65,535.
If you are providing a network service for the
community, then arrange to have a port
assigned to and reserved for your service.
Addressing a Web Document
The Uniform Resource Identifier (URI)
Resources to be shared on a network need to
be uniquely identifiable.
On the Internet, a URI is a character string
which allows a resource to be located.
There are two types of URIs:
URL (Uniform Resource Locator) points to a
specific resource at a specific location
URN (Uniform Resource Name) points to a
specific resource at a nonspecific location.
URL
A URL has the format of:
protocol://host address[:port]/directory path/file name#section
A sample URL:
http://www.csc.calpoly.edu:8080/~mliu/CSC369/hw.html#hw1
protocol of server: http
host name: www.csc.calpoly.edu
port number of server process: 8080
directory path: /~mliu/CSC369/
file name: hw.html
section name: hw1
Other protocols that can appear in a URL are:
file, ftp, gopher, news, telnet, WAIS
More on URL
The path in a URL is relative to the document
root of the server. Often, a user’s document
root is ~/www.
A URL may appear in a document in a
relative form:
<a href="another.html">
and the actual URL referred to will be
another.html preceded by the protocol,
hostname, and directory path of the document.
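Both the URL format and the relative form can be tried out with the standard java.net.URI class. This is a sketch, not part of the slides: UrlPartsDemo and resolveRelative are hypothetical names, and the sample URL is the one used earlier in this lecture.

```java
import java.net.URI;

class UrlPartsDemo {
    // The sample URL from the lecture, without the stray spaces around '#'.
    static final URI SAMPLE =
        URI.create("http://www.csc.calpoly.edu:8080/~mliu/CSC369/hw.html#hw1");

    // Resolves a relative reference against the sample document's URL.
    static String resolveRelative(String relative) {
        return SAMPLE.resolve(relative).toString();
    }

    public static void main(String[] args) {
        System.out.println(SAMPLE.getScheme());   // http
        System.out.println(SAMPLE.getHost());     // www.csc.calpoly.edu
        System.out.println(SAMPLE.getPort());     // 8080
        System.out.println(SAMPLE.getPath());     // /~mliu/CSC369/hw.html
        System.out.println(SAMPLE.getFragment()); // hw1
        // http://www.csc.calpoly.edu:8080/~mliu/CSC369/another.html
        System.out.println(resolveRelative("another.html"));
    }
}
```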
Break
Interprocess Communications
A. Process Communication Methods
B. Marshalling
C. Protocols
A. Process Communication Methods
Operating systems provide facilities for interprocess
communications (IPC), such as message queues,
semaphores, and shared memory.
Distributed computing systems make use of these
facilities to provide an application programming
interface which allows IPC to be programmed at a
higher level of abstraction.
Distributed computing requires information to be
exchanged among independent processes.
IPC – unicast and multicast
In distributed computing, two or more processes
engage in IPC in a protocol agreed upon by the
processes. A process may be a sender at some points
during a protocol, a receiver at other points.
When communication is from one process to a single
other process, the IPC is said to be a unicast. When
communication is from one process to a group of
processes, the IPC is said to be a multicast, a topic
that we will explore in a later chapter.
Unicast vs. Multicast
Figure: in a unicast, process P1 sends a message m to a single process P2. In a multicast, P1 sends m to each process in a group (P2, P3, P4, ...).
Interprocess Communications in Distributed Computing
Figure: Process 1, the sender, passes data to Process 2, the receiver.
Operations provided in an
archetypal Interprocess Communications API
• Receive ( [sender], message storage object)
• Connect (sender address, receiver address), for
connection-oriented communication.
• Send ( [receiver], message)
• Disconnect (connection identifier), for
connection-oriented communication.
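The four operations map naturally onto stream sockets. The sketch below runs both sides in one program over the loopback interface; IpcOperationsDemo, exchange, and the "ack: " reply format are assumptions made for the illustration, not part of any standard API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class IpcOperationsDemo {
    // Sends one line to a receiver thread and returns the receiver's reply.
    static String exchange(String message) {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket listener = new ServerSocket(0)) {
            Thread receiver = new Thread(() -> {
                try (Socket conn = listener.accept();              // connect (accepting side)
                     BufferedReader in = new BufferedReader(
                         new InputStreamReader(conn.getInputStream()));
                     PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                    String received = in.readLine();               // receive (blocks)
                    out.println("ack: " + received);               // send
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }                                                  // disconnect on close
            });
            receiver.start();

            String reply;
            try (Socket conn = new Socket(InetAddress.getLoopbackAddress(),
                                          listener.getLocalPort()); // connect
                 PrintWriter out = new PrintWriter(conn.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                     new InputStreamReader(conn.getInputStream()))) {
                out.println(message);                              // send
                reply = in.readLine();                             // receive (blocks)
            }                                                      // disconnect on close
            receiver.join();
            return reply;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exchange("hello")); // prints "ack: hello"
    }
}
```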
Interprocess Communication in basic HTTP
Figure: a web browser process and a web server process exchange an HTTP request and an HTTP response. Each process performs its operations in order, with data flowing between the two.
Web server operations:
S1: accept connection
S2: receive (request)
S3: send (response)
S4: disconnect
Web browser operations:
C1: make connection
C2: send (request)
C3: receive (response)
C4: disconnect
Event Synchronization
Interprocess communication requires that the
two processes synchronize their operations:
one side sends, then the other receives until all
data has been sent and received.
Ideally, the send operation starts before the
receive operation commences.
In practice, the synchronization requires
system support.
Synchronous vs asynchronous communication
The IPC operations may provide synchronous
communication using blocking.
A blocking operation issued by a process will block
further processing of the process until the operation
is fulfilled.
Alternatively, IPC operations may be asynchronous
or nonblocking.
An asynchronous operation issued by a process will
not block further processing of the process.
Instead, the process is free to proceed with its
processing, and may optionally be notified by the
system when the operation is fulfilled.
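The difference between a blocking and a nonblocking receive can be tried out inside one program. In this sketch a BlockingQueue stands in for the IPC channel, which is an assumption made for illustration; BlockingDemo and demo are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BlockingDemo {
    // Returns "<result of nonblocking receive>,<result of blocking receive>".
    static String demo() {
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(1);

        // Nonblocking receive: nothing has been sent yet, so poll() returns
        // null immediately and the caller is free to carry on.
        String early = channel.poll();

        // The sender delivers its message only after a short delay.
        Thread sender = new Thread(() -> {
            try {
                Thread.sleep(100);
                channel.put("m");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();

        try {
            // Blocking receive: take() suspends the caller until a message arrives.
            String late = channel.take();
            sender.join();
            return early + "," + late;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "null,m"
    }
}
```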
Synchronous send and receive
Figure: synchronous send and receive between process 1, running on host 1, and process 2, running on host 2. One process starts a blocking receive and is suspended until data arrives. The other starts a blocking send and is suspended until the IPC facility provides an acknowledgement that the data was received; then the blocking send returns and the blocking receive ends.
This is fine if you want to be sure the message is delivered,
but you may have to wait.
Asynchronous send and synchronous receive
Figure: asynchronous send and synchronous receive. One process starts a blocking receive and is suspended until the receive returns; the other issues a nonblocking send and continues its execution.
The message is received only if the receiver is not too
late, unless the system is buffering.
Synchronous send and Async. Receive – 1
Figure: synchronous send and asynchronous receive, scenario A. One process issues a blocking send and is suspended until the IPC facility provides a transparent acknowledgement; the other issues a nonblocking receive.
The message is received only if the receiver is not
too early, unless the process is polling.
Synchronous send and Async. Receive – 2
No show unless you are polling all the time.
Figure: synchronous send and asynchronous receive, scenario B. The nonblocking receive is issued and returns immediately; the blocking send is issued and blocks indefinitely.
Synchronous send and Async. Receive - 3
Figure: synchronous send and asynchronous receive, scenario C. The nonblocking receive is issued and returns immediately. The blocking send is issued and is suspended until the IPC facility provides a transparent acknowledgement. The receiving process is notified of the arrival of data.
... or somehow the process is notified when the message is there.
Asynchronous send and asynchronous receive
Figure: asynchronous send and asynchronous receive. The nonblocking receive is issued and returns immediately; the nonblocking send is issued; the receiving process is notified of the arrival of data.
Again, polling or notification is required.
The perils of blocking
Blocking operations issued in the wrong sequence can
cause deadlocks.
Deadlocks should be avoided. Alternatively, timeouts can
be used to detect deadlocks.
Indefinite blocking due to a deadlock
Figure: two processes, each shown as executing and then blocked.
Process 1 issues "receive from process 2" and is blocked pending data from process 2.
Process 2 issues "receive from process 1" and is blocked pending data from process 1.
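The situation in the figure, and the timeout remedy, can be imitated with two threads. This is a sketch under assumptions: queues stand in for the IPC channels, and DeadlockTimeoutDemo, run, and receiveThenSend are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class DeadlockTimeoutDemo {
    // Each "process" first waits to receive from the other and only then
    // sends, so neither can ever make progress. The timeout on the receive
    // lets each side detect this instead of blocking forever.
    static boolean[] run(long timeoutMillis) {
        BlockingQueue<String> toP1 = new LinkedBlockingQueue<>();
        BlockingQueue<String> toP2 = new LinkedBlockingQueue<>();
        boolean[] timedOut = new boolean[2];

        Thread p1 = new Thread(() -> timedOut[0] = receiveThenSend(toP1, toP2, timeoutMillis));
        Thread p2 = new Thread(() -> timedOut[1] = receiveThenSend(toP2, toP1, timeoutMillis));
        p1.start();
        p2.start();
        try {
            p1.join();
            p2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return timedOut;
    }

    // Returns true if the receive timed out before any message arrived.
    static boolean receiveThenSend(BlockingQueue<String> inbox,
                                   BlockingQueue<String> outbox,
                                   long timeoutMillis) {
        try {
            String m = inbox.poll(timeoutMillis, TimeUnit.MILLISECONDS); // receive with timeout
            if (m == null) {
                return true;      // nothing arrived: report the suspected deadlock
            }
            outbox.put("reply");  // send (never reached in this demo)
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return true;
        }
    }

    public static void main(String[] args) {
        boolean[] r = run(200);
        System.out.println(r[0] + " " + r[1]); // prints "true true"
    }
}
```

Reordering the operations so that one side sends before it receives removes the deadlock altogether; the timeout only detects it.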
B. Marshalling
Data transmitted on the network is a binary stream.
An interprocess communication system may provide the
capability to allow data representation to be imposed on
the raw data.
Because different computers may have different internal
storage format for the same data type, an external
representation of data may be necessary.
Data marshalling is the process of
(i) flattening a data structure, and
(ii) converting the data to an external representation.
Some well-known external data representation schemes are:
Sun XDR (External Data Representation)
ASN.1 (Abstract Syntax Notation)
XML (Extensible Markup Language)
Data Encoding
Data encoding schemes, in decreasing level of abstraction, with sample standards:
application specific data encoding language: XML (Extensible Markup Language)
general data encoding language: ASN.1 (Abstract Syntax Notation)
network data encoding standard: Sun XDR (External Data Representation)
XML
XML is a text-based markup language that is the
standard for data interchange on the Web.
XML has syntax analogous to HTML.
Unlike HTML, XML tags tell you what the data
means, rather than how to display it.
Example:
<message>
<to>[email protected]</to>
<from>[email protected]</from>
<subject>XML Is Really Cool</subject>
<text> How many ways is XML cool? Let me count the ways...
</text>
</message>
An example of Data Marshalling
Figure: host A marshals the data items "This is a test.", 1.2, 7.3, and -1.5 into a bit stream (110011 ... 10000100 ...), and host B unmarshals it.
Marshalling on host A:
1. flattening of structured data items
2. converting data to external (network) representation
Unmarshalling on host B:
1. convert data to internal representation
2. rebuild data structures.
Conversion between external and internal representation
is not required
- if the two sides are of the same host type;
- if the two sides negotiate at connection.
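The two marshalling steps and their inverse can be sketched with Java's data streams, which write values in a fixed big-endian layout regardless of the host. This byte layout is one possible external representation chosen for illustration, not XDR, ASN.1, or XML; MarshalDemo, marshal, and unmarshal are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class MarshalDemo {
    // Flattens a string and an array into one byte sequence.
    static byte[] marshal(String text, double[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(text);           // 2-byte length, then the characters
            out.writeInt(values.length);  // array length, so the receiver can rebuild it
            for (double v : values) {
                out.writeDouble(v);       // 8 bytes each, big-endian
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Converts back to internal representation and rebuilds the data structures.
    // Returns { String text, double[] values }.
    static Object[] unmarshal(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String text = in.readUTF();
            double[] values = new double[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readDouble();
            }
            return new Object[] { text, values };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = marshal("This is a test.", new double[] { 1.2, 7.3, -1.5 });
        System.out.println(wire.length + " bytes on the wire"); // 45 bytes on the wire
        Object[] back = unmarshal(wire);
        System.out.println(back[0]); // This is a test.
    }
}
```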
C. Protocols
Data marshalling is at its simplest when the data
exchanged is a stream of characters, or text.
Exchanging data in text has the additional
advantage that the data can be easily parsed in a
program and displayed for human perusal. Hence
it is a popular practice for protocols to exchange
requests and responses in the form of character strings. Such protocols are said to be text-based.
Many popular network protocols, including FTP
(File Transfer Protocol), HTTP, and SMTP (Simple
Mail Transfer Protocol), are text-based.
Event Diagram
Figure: event diagram for a protocol. Over time, request 1, response 1, request 2, and response 2 pass between Process 1 and Process 2. The diagram marks interprocess communication, execution flow, and the periods in which a process is blocked.
Sequence Diagram
Figure: sequence diagram. Request 1, response 1, request 2, and response 2 pass between Process A and Process B as interprocess communication.
Protocol
In a distributed application, two processes
perform interprocess communication in a
mutually agreed upon protocol.
The specification of a protocol should
include
i. the sequence of data exchange, which can be
described using an event diagram.
ii. the specification of the format of the data exchanged
at each step.
HTTP: A sample protocol
The HyperText Transfer Protocol is a
protocol for a process (the browser) to obtain
a document from a web server process.
It is a request/response protocol: a browser
sends a request to a web server process,
which replies with a response.
The Basic HTTP protocol
Figure: the web browser sends a request to the web server, which returns a response.
request is a message in 3 parts:
- <command> <document address> <HTTP version>
- an optional header
- optional data for CGI data using post method
response is a message consisting of 3 parts:
- a status line of the format <protocol><status code><description>
- header information, which may span several lines;
- the document itself.
We will explore HTTP in detail later this quarter.
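Because the protocol is text-based, the first line of a request and the status line of a response can be handled with plain string operations. A minimal sketch, assuming the hypothetical names HttpMessageDemo, request, and parseStatusLine; it builds a request without header or data and does not open a network connection.

```java
class HttpMessageDemo {
    // Builds a request consisting of the line
    // <command> <document address> <HTTP version>, followed by an empty
    // line that marks the end of the (here empty) header.
    static String request(String command, String documentAddress, String version) {
        return command + " " + documentAddress + " " + version + "\r\n\r\n";
    }

    // Splits a status line into protocol, status code, and description.
    // The limit of 3 keeps a description such as "Not Found" in one piece.
    static String[] parseStatusLine(String line) {
        return line.split(" ", 3);
    }

    public static void main(String[] args) {
        System.out.print(request("GET", "/~mliu/", "HTTP/1.0"));
        String[] parts = parseStatusLine("HTTP/1.1 200 OK");
        System.out.println(parts[0]); // HTTP/1.1
        System.out.println(parts[1]); // 200
        System.out.println(parts[2]); // OK
    }
}
```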
A sample HTTP session
Script started on Tue Oct 10 21:49:28 2000
9:49pm telnet www.csc.calpoly.edu 80
Trying 129.65.241.20...
Connected to tiedye2-srv.csc.calpoly.edu.
Escape character is '^]'.
HTTP request:
GET /~mliu/ HTTP/1.0
HTTP response status line:
HTTP/1.1 200 OK
HTTP response header:
Date: Wed, 11 Oct 2000 04:51:18 GMT
Server: Apache/1.3.9 (Unix) ApacheJServ/1.0
Last-Modified: Tue, 10 Oct 2000 16:51:54 GMT
ETag: "1dd1e-e27-39e3492a"
Accept-Ranges: bytes
Content-Length: 3623
Connection: close
Content-Type: text/html
Document content:
<HTML>
<HEAD>
<TITLE> Mei-Ling L. Liu's Home Page
</TITLE>
</HEAD>
<BODY bgcolor=#ffffff>
…
IPC paradigms and implementations
Paradigms of IPC of different levels of abstraction have
evolved, with corresponding implementations.
IPC paradigms, in decreasing level of abstraction, with example IPC implementations:
remote procedure/method: Remote Procedure Call (RPC), Java RMI
socket API: Unix socket API, Winsock
data transmission: serial/parallel communication
Evolution of IPC paradigms
Client-server: Socket API, remote method invocation
Distributed objects
Object broker: CORBA
Network service: Jini
Object space: JavaSpaces
Mobile agents
Message oriented middleware (MOM): Java Message Service
Collaborative applications
Summary
Liu, Chapter 1:
Distributed computing
Basics:
operating system (concurrency)
network (OSI 7-layer architecture, connection-oriented, naming)
software engineering (UML, three-layered architecture)
Liu, Chapter 2:
Process Communication Methods
Marshalling
Protocols
Study chapters 1 and 2 and do the
exercises: Ch. 1: 1a, 1b, 3, 4; Ch. 2: 1, 2, 3, 5, 7 and 8.