Languages and Compilers
(SProg og Oversættere)
Concurrency and distribution
Bent Thomsen
Department of Computer Science
Aalborg University
With acknowledgement to John Mitchell whose slides this lecture is based on.
1
Concurrency, distributed computing, the Internet
Traditional view:
Let the OS deal with this
=> It is not a programming language issue!
End of Lecture
Wait-a-minute …
Maybe “the traditional view” is getting out of date?
2
Languages with concurrency constructs
• Simula
• Modula-3
• Occam
• Concurrent Pascal
• Ada
• Linda
• CML
• Facile
• JoCaml
• Java
• C#
• …
• Maybe the “traditional view” was always out of date?
3
Categories of Concurrency:
1. Physical concurrency - Multiple independent processors (multiple threads of control)
• Uni-processor with I/O channels (multi-programming)
• Multiple CPU (parallel programming)
• Network of uni- or multi-CPU machines (distributed programming)
2. Logical concurrency - The appearance of physical concurrency is presented by time-sharing one processor (software can be designed as if there were multiple threads of control)
• Concurrency as a programming abstraction
Def: A thread of control in a program is the sequence of program
points reached as control flows through the program
4
Introduction
• Reasons to Study Concurrency
1. It involves a different way of designing software that can be
very useful—many real-world situations involve concurrency
– Control programs
– Simulations
– Client/Servers
– Mobile computing
2. Computers capable of physical concurrency are now widely
used
– High-end servers
– Game consoles
– Grid computing
5
The promise of concurrency
• Speed
– If a task takes time t on one processor, shouldn’t it take time
t/n on n processors?
• Availability
– If one process is busy, another may be ready to help
• Distribution
– Processors in different locations can collaborate to solve a
problem or work together
• Humans do it so why can’t computers?
– Vision, cognition appear to be highly parallel activities
6
Challenges
• Concurrent programs are harder to get right
– Folklore: Need an order of magnitude speedup (or more) to
be worth the effort
• Some problems are inherently sequential
– Theory – circuit evaluation is P-complete
– Practice – many problems need coordination and
communication among sub-problems
• Specific issues
– Communication – send or receive information
– Synchronization – wait for another process to act
– Atomicity – do not stop in the middle and leave a mess
7
Why is concurrent programming hard?
• Nondeterminism
– Deterministic: two executions on the same input always
produce the same output
– Nondeterministic: two executions on the same input may
produce different output
• Why does this cause difficulty?
– May be many possible executions of one system
– Hard to think of all the possibilities
– Hard to test program since some may occur infrequently
8
Traditional C Library for concurrency
System Calls
- fork( )
- wait( )
- pipe( )
- write( )
- read( )
Examples
9
Process Creation
fork()
NAME
fork() – create a new process
SYNOPSIS
#include <sys/types.h>
#include <unistd.h>
pid_t fork(void);
RETURN VALUE
success
in the parent: the child's pid
in the child: 0
failure
-1 (no child is created; errno is set)
10
fork() - program structure
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    pid_t pid;
    if ((pid = fork()) > 0) {
        /* parent: pid holds the child's process ID */
    }
    else if (pid == 0) {
        /* child */
    }
    else {
        /* cannot fork */
        perror("fork");
        exit(1);
    }
    exit(0);
}
11
wait() system call
wait() - suspend the calling process until one of its child processes
terminates
SYNOPSIS
#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *stat_loc);
PARAMETER
stat_loc points to an int that receives the child's termination status;
it may be NULL if the status is not needed
RETURN VALUE
success- pid of the terminated child
failure- -1 and errno is set
12
wait() - program structure
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
    pid_t childPID;
    if ((childPID = fork()) == 0) {
        /* child */
    }
    else {
        /* parent: block until the child terminates (status not needed) */
        wait(NULL);
    }
    exit(0);
}
13
pipe() system call
pipe() - create a read-write pipe that may later be used to
communicate with a process we'll fork off.
SYNOPSIS
#include <unistd.h>
int pipe(int pfd[2]);
PARAMETER
pfd is an array of 2 integers that will be used to
save the two file descriptors used to access the pipe
RETURN VALUE:
0 – success;
-1 – error.
14
pipe() - structure
/* first, define an array to store the two file descriptors */
int pipes[2];
/* now, create the pipe */
int rc = pipe(pipes);
if (rc == -1) {
    /* pipe() failed */
    perror("pipe");
    exit(1);
}
If the call to pipe() succeeded, a pipe will be created, pipes[0] will
contain the number of its read file descriptor, and pipes[1] will
contain the number of its write file descriptor.
15
write() system call
write() – write data to a file or other object identified
by a file descriptor.
SYNOPSIS
#include <unistd.h>
ssize_t write(int fildes, const void *buf, size_t nbyte);
PARAMETER
fildes is the file descriptor,
buf is the base address of the area of memory that data is
copied from,
nbyte is the amount of data to copy
RETURN VALUE
The number of bytes actually written, or -1 on error (errno is set). A
value smaller than nbyte means only part of the data was written, and
the caller must write the rest.
16
read() system call
read() – read data from a file or other object identified by a file
descriptor
SYNOPSIS
#include <unistd.h>
ssize_t read(int fildes, void *buf, size_t nbyte);
ARGUMENT
fildes is the file descriptor,
buf is the base address of the memory area into which the
data is read,
nbyte is the maximum amount of data to read.
RETURN VALUE
The number of bytes actually read, 0 at end of file, or -1 on error.
The file offset is advanced by the number of bytes read.
17
Solaris 2 Synchronization
• Implements a variety of locks to support multitasking,
multithreading (including real-time threads), and multiprocessing.
• Uses adaptive mutexes for efficiency when protecting data from
short code segments.
• Uses condition variables and readers-writers locks when longer
sections of code need access to data.
• Uses turnstiles to order the list of threads waiting to acquire either
an adaptive mutex or reader-writer lock.
18
Windows 2000 Synchronization
• Uses interrupt masks to protect access to global
resources on uniprocessor systems.
• Uses spinlocks on multiprocessor systems.
• Also provides dispatcher objects, which may act as
either mutexes or semaphores.
• Dispatcher objects may also provide events. An event
acts much like a condition variable.
19
Basic question
• Maybe the library approach is not such a good idea?
• How can programming languages make concurrent and
distributed programming easier?
20
Language support for concurrency
• Helps promote good software engineering
• Allows the programmer to express solutions closer
to the problem domain
• No need to juggle several programming models
(Hardware, OS, library, …)
• Makes invariants and intentions more apparent (part of
the interface and/or type system)
• Allows the compiler much more freedom to choose
different implementations
• Base the programming language constructs on a well-understood
formal model => formal reasoning may be
less hard and the use of tools may be possible
21
What could languages provide?
• Abstract model of system
– abstract machine => abstract system
• Example high-level constructs
– Communication abstractions
• Synchronous communication
• Buffered asynchronous channels that preserve msg order
– Mutual exclusion, atomicity primitives
• Most concurrent languages provide some form of locking
• Atomicity is more complicated, less commonly provided
– Process as the value of an expression
• Pass processes to functions
• Create processes as the result of a function call
22
Basic issue: conflict between processes
• Critical section
– Two processes may access shared resource
– Inconsistent behavior if two actions are interleaved
– Allow only one process in critical section
• Deadlock
– Process may hold some locks while awaiting others
– Deadlock occurs when no process can proceed
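One common way to rule out the deadlock described on this slide is to make every process acquire its locks in the same global order. The sketch below is only an illustration under that assumption: the Account type, its id field, and the transfer method are hypothetical names, and the two accounts are assumed to be distinct.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockOrderingDemo {
    // Hypothetical account type, introduced only for this illustration.
    static final class Account {
        final int id;                               // defines the global lock order
        final ReentrantLock lock = new ReentrantLock();
        long balance;
        Account(int id, long balance) { this.id = id; this.balance = balance; }
    }

    // Always lock the account with the smaller id first, so two opposite
    // transfers can never each hold one lock while waiting for the other.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // Runs many opposite transfers concurrently and returns the total balance.
    static long run() {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10_000; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.balance + b.balance;
    }

    public static void main(String[] args) {
        System.out.println("total = " + run());
    }
}
```

If transfer instead locked `from` before `to` unconditionally, the two threads could each hold one lock and wait forever for the other, which is exactly the situation the slide calls deadlock.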
23
Concurrency
• Def: A task is disjoint if it does not communicate with
or affect the execution of any other task in the
program in any way
• Task communication is necessary for synchronization
– Task communication can be through:
1. Shared nonlocal variables
2. Parameters
3. Message passing
24
Synchronization
• Kinds of synchronization:
1. Cooperation
– Task A must wait for task B to complete some specific
activity before task A can continue its execution e.g., the
producer-consumer problem
2. Competition
– When two or more tasks must use some resource that cannot
be simultaneously used e.g., a shared counter
– Competition synchronization is usually provided by mutually exclusive
access (approaches are discussed later)
25
Design Issues for Concurrency:
1. How is cooperation synchronization provided?
2. How is competition synchronization provided?
3. How and when do tasks begin and end execution?
4. Are tasks statically or dynamically created?
5. Are there any syntactic constructs in the language?
6. Are concurrency constructs reflected in the type system?
26
Concurrent Pascal: cobegin/coend
• Limited concurrency primitive
• Example
x := 0;
cobegin
begin x := 1; x := x+1 end;
begin x := 2; x := x+1 end;
coend;
print(x);
[Diagram: after x := 0, the two sequential blocks (x := 1; x := x+1) and (x := 2; x := x+1) execute in parallel; print(x) runs once both have finished]
Atomicity at level of assignment statement
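Which values can print(x) show in the cobegin example? Assuming, as the slide states, that each assignment statement is atomic, a short sketch can enumerate every interleaving of the two blocks. The class and method names are hypothetical and exist only for this illustration.

```java
import java.util.Set;
import java.util.TreeSet;

class InterleavingDemo {
    // Executes step `step` (0 or 1) of block A or B on x and returns the new x.
    // Block A: x := 1; x := x+1      Block B: x := 2; x := x+1
    static int apply(char block, int step, int x) {
        if (step == 1) return x + 1;
        return block == 'A' ? 1 : 2;
    }

    // Explores every interleaving; a and b count the steps each block has run.
    static void explore(int a, int b, int x, Set<Integer> results) {
        if (a == 2 && b == 2) {
            results.add(x);
            return;
        }
        if (a < 2) explore(a + 1, b, apply('A', a, x), results);
        if (b < 2) explore(a, b + 1, apply('B', b, x), results);
    }

    // Returns the set of values print(x) could output, starting from x := 0.
    static Set<Integer> possibleOutputs() {
        Set<Integer> results = new TreeSet<>();
        explore(0, 0, 0, results);
        return results;
    }

    public static void main(String[] args) {
        System.out.println(possibleOutputs()); // prints [2, 3, 4]
    }
}
```

Even this four-statement program has three possible results, which illustrates why nondeterministic programs are hard to test exhaustively.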
27
Mutual exclusion
• Sample action
procedure sign_up(person)
begin
number := number + 1;
list[number] := person;
end;
• Problem with parallel execution
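cobegin
sign_up(fred);
sign_up(bill);
coend;
– Both calls may increment number before either stores its person; both then write the same slot, so one entry is lost and one slot stays empty
[Diagram: the list, holding the entries bob, bill, fred]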
28
Locks and Waiting
<initialize concurrency control>
cobegin
begin
<wait>
sign_up(fred); // critical section
<signal>
end;
begin
<wait>
sign_up(bill); // critical section
<signal>
end;
coend;
Need atomic operations to implement wait
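The wait/signal pattern around sign_up can be sketched in Java, where a synchronized method supplies both steps. This is an illustration rather than the slide's own code: the class name, the capacity parameter, and the generated person names are assumptions.

```java
class SignUpDemo {
    private final String[] list;
    private int number = 0;

    SignUpDemo(int capacity) {
        list = new String[capacity + 1]; // slot 0 stays unused, mirroring the 1-based list
    }

    // synchronized plays the role of <wait> ... <signal>: the increment and the
    // store form one critical section, so no other sign-up can interleave.
    synchronized void signUp(String person) {
        number = number + 1;
        list[number] = person;
    }

    // Counts the slots that actually hold a person.
    synchronized int filledSlots() {
        int filled = 0;
        for (int i = 1; i <= number; i++) {
            if (list[i] != null) filled++;
        }
        return filled;
    }

    // Runs `threads` threads that each sign up `perThread` people.
    static int run(int threads, int perThread) {
        SignUpDemo roster = new SignUpDemo(threads * perThread);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) roster.signUp("person-" + id + "-" + i);
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return roster.filledSlots();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // prints 4000: no sign-up is lost
    }
}
```

Removing the synchronized keyword from signUp reintroduces the race: entries can then be overwritten or slots skipped, and the count may fall below 4000.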
29
Mutual exclusion primitives
• Atomic test-and-set
– Instruction atomically reads and writes some location
– Common hardware instruction
– Combine with busy-waiting loop to implement mutex
• Semaphore
– Avoid busy-waiting loop
– Keep queue of waiting processes
– Scheduler has access to semaphore; process sleeps
– Disable interrupts during semaphore operations
• OK since operations are short
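A minimal sketch of the first primitive, a mutex built from atomic test-and-set plus a busy-waiting loop. It assumes Java's AtomicBoolean.getAndSet as the test-and-set operation; the SpinLock class is a hypothetical name. For the blocking alternative, java.util.concurrent.Semaphore offers acquire and release operations that put waiting threads to sleep instead of spinning.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinLockDemo {
    // Hypothetical spin lock, introduced only for this illustration.
    static final class SpinLock {
        private final AtomicBoolean held = new AtomicBoolean(false);

        void lock() {
            // getAndSet is the atomic test-and-set: it writes true and returns
            // the previous value. The lock is ours only if it was false before.
            while (held.getAndSet(true)) {
                Thread.onSpinWait(); // busy-wait hint, available since Java 9
            }
        }

        void unlock() {
            held.set(false);
        }
    }

    // Increments a shared counter from several threads under the spin lock.
    static int run(int threads, int perThread) {
        SpinLock lock = new SpinLock();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    try {
                        counter[0]++;       // critical section
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```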
30
Monitor
Brinch-Hansen, Dahl, Dijkstra, Hoare
• Synchronized access to private data. Combines:
– private data
– set of procedures (methods)
– synchronization policy
• At most one process may execute a monitor procedure at a
time; this process is said to be in the monitor.
• If one process is in the monitor, any other process that calls
a monitor procedure will be delayed.
• Modern terminology: synchronized object
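As a small illustration of the monitor idea in its modern "synchronized object" form, the sketch below combines private data, a procedure, and a synchronization policy in one class. TicketDispenser is a hypothetical example, not taken from the slides.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class MonitorDemo {
    // Hypothetical monitor: private data plus synchronized procedures.
    static final class TicketDispenser {
        private int next = 0;                 // private data, reachable only via take()

        // At most one thread at a time executes inside the monitor.
        synchronized int take() {
            return next++;
        }
    }

    // Several threads draw tickets; returns how many distinct tickets were issued.
    static int distinctTickets(int threads, int perThread) {
        TicketDispenser dispenser = new TicketDispenser();
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) seen.add(dispenser.take());
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctTickets(4, 1000)); // prints 4000: no ticket issued twice
    }
}
```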
31
OCCAM
• Program consists of processes and channels
• Process is code containing channel operations
• Channel is a data object
• All synchronization is via channels
• Formal foundation based on CSP
32
Channel Operations in OCCAM
• Read data item D from channel C
– C ? D
• Write data item Q to channel C
– C ! Q
• If reader accesses channel first, wait for writer,
and then both proceed after transfer.
• If writer accesses channel first, wait for reader,
and both proceed after transfer.
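The rendezvous behaviour of such a channel can be approximated in Java with java.util.concurrent.SynchronousQueue, whose put and take each block until the other side arrives. This is an analogy under that assumption, not OCCAM itself; the class and method names are hypothetical.

```java
import java.util.concurrent.SynchronousQueue;

class RendezvousDemo {
    // Sends one value through an unbuffered channel and returns what the reader got.
    static int transferOnce(int value) {
        SynchronousQueue<Integer> channel = new SynchronousQueue<>();

        // Writer: blocks in put() until a reader is ready (like C ! Q).
        Thread writer = new Thread(() -> {
            try {
                channel.put(value);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        try {
            // Reader: blocks in take() until a writer is ready (like C ? D).
            int received = channel.take();
            writer.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transferOnce(42)); // prints 42
    }
}
```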
33
Tasking in Ada
• Declare a task type
• The specification gives the entries
– task type T is
entry Put (data : in Integer);
entry Get (result : out Integer);
end T;
• The entries are used to access the task
34
Declaring Task Body
• Task body gives actual code of task
• task body T is
x : integer; -- local per thread declaration
begin
…
accept Put (data : in Integer) do
…
end Put;
…
end T;
35
Creating an Instance of a Task
• Declare a single task
• X : T;
• or an array of tasks
• P : array (1 .. 50) of T;
• or a dynamically allocated task
• type T_Access is access T; -- "at" is a reserved word in Ada, so AT cannot name a type
• P : T_Access;
…
P := new T;
36
Task execution
• Each task executes independently, until
– an accept call
• wait for someone to call entry, then proceed with
rendezvous code, then both tasks go on their way
– an entry call
• wait for addressed task to reach corresponding accept
statement, then proceed with rendezvous, then both tasks
go on their way.
37
More on the Rendezvous
• During the Rendezvous, only the called task executes,
and data can be safely exchanged via the entry
parameters
• If accept does a simple assignment, we have the
equivalent of a simple CSP channel operation, but there
is no restriction on what can be done within a
rendezvous
38
Termination of Tasks
• A task terminates when it reaches the end of the begin-end code of its body.
• Tasks may either be very static (create at start of
execution and never terminate)
• Or very dynamic, e.g. create a new task for each new
radar trace in a radar system.
39
The Delay Statement
• Delay statements temporarily pause a task
– delay xyz
• where xyz is an expression of type Duration; causes
execution of the thread to be delayed for (at least) the given
amount of time
– delay until tim
• where tim is an expression of type Time; causes execution
of the thread to be delayed until (at the earliest) the given
time
40
Selective Accept
• Select statement allows a choice of actions
• select
accept entry1 (…) do .. end;
or
when bla => accept entry2 (…);
or
delay ...ddd...;
end select;
– Take whichever open entry arrives first, or if none arrives by
end of delay, do …ddd… stmts.
41
Timed Entry Call
• Timed Entry call allows timeout to be set
• select
entry-call-statement
or
delay xxx;
…
end select
– We try to do the entry call, but if the task won’t accept in xxx
time, then do the delay stmts.
42
Java Concurrency
• Threads
– Create process by creating thread object
• Communication
– shared variables
– method calls
• Mutual exclusion and synchronization
– Every object has a lock (inherited from class Object)
• synchronized methods and blocks
– Synchronization operations (inherited from class Object)
• wait : pause current thread (releasing the object's lock) until another thread calls notify or notifyAll
• notify : wake up one waiting thread; notifyAll wakes up all of them
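The wait and notify operations listed above are typically used inside synchronized methods with a guard loop. The sketch below shows a one-slot buffer between a producer and a consumer; Slot and sumThroughSlot are hypothetical names used only for illustration.

```java
class WaitNotifyDemo {
    // Hypothetical one-slot buffer guarded by the object's own lock.
    static final class Slot {
        private Integer item;                 // null means the slot is empty

        synchronized void put(int value) throws InterruptedException {
            while (item != null) wait();      // wait until the slot is empty
            item = value;
            notifyAll();                      // wake up a waiting consumer
        }

        synchronized int take() throws InterruptedException {
            while (item == null) wait();      // wait until the slot is full
            int value = item;
            item = null;
            notifyAll();                      // wake up a waiting producer
            return value;
        }
    }

    // The producer sends 1..n through the slot; the caller sums what it receives.
    static long sumThroughSlot(int n) {
        Slot slot = new Slot();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) slot.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        long sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += slot.take();
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumThroughSlot(100)); // prints 5050
    }
}
```

The guard is a while loop rather than an if because a woken thread must recheck its condition before proceeding.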
43
Java Threads
• Thread
– Set of instructions to be executed one at a time, in a specified
order
• Java thread objects
– Object of class Thread
– Methods inherited from Thread:
• start : method called to spawn a new thread of control;
causes VM to call run method
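• suspend : freeze execution (deprecated because it is deadlock-prone)
• interrupt : set the thread's interrupt flag; a thread blocked in sleep, wait, or join receives an InterruptedException
• stop : forcibly cause thread to halt (deprecated because it is unsafe)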
44
Example subclass of Thread
class PrintMany extends Thread {
    private String msg;
    public PrintMany (String m) {msg = m;}
    public void run() {
        try { for (;;){ System.out.print(msg + " ");
                sleep(10);
            }
        } catch (InterruptedException e) {
            return;
        }
    }
}
(inherits start from Thread)
45
Interaction between threads
• Shared variables
– Two threads may assign/read the same variable
– Programmer responsibility
• Avoid race conditions by explicit synchronization!!
• Method calls
– Two threads may call methods on the same object
• Synchronization primitives
– Each object has internal lock, inherited from Object
– Synchronization primitives based on object locking
46
Synchronization example
• Objects may have synchronized methods
• Can be used for mutual exclusion
– Two threads may share an object.
– If one calls a synchronized method, this locks object.
– If the other calls a synchronized method on same object, this
thread blocks until object is unlocked.
47
Synchronized methods
• Marked by keyword
public synchronized void commitTransaction(…) {…}
• Provides mutual exclusion
– At most one synchronized method can be active per object at a time
– Unsynchronized methods can still be called
• Programmer must be careful
• Not part of method signature
– sync method equivalent to unsync method with body
consisting of a synchronized block
– subclass may replace a synchronized method with
unsynchronized method
48
Join, another form of synchronization
• Wait for thread to terminate
class Future extends Thread {
private int result;
public void run() { result = f(…); }
public int getResult() { return result;}
}
…
Future t = new Future();
t.start(); // start new thread
…
t.join(); x = t.getResult(); // wait and get result
49
Aspects of Java Threads
• Portable since part of language
– Easier to use in basic libraries than C system calls
– Example: garbage collector is separate thread
• General difficulty combining serial/concur code
– Serial to concurrent
• Code for serial execution may not work in concurrent sys
– Concurrent to serial
• Code with synchronization may be inefficient in serial
programs (10-20% unnecessary overhead)
• Abstract memory model
– Shared variables can be problematic on some implementations
50
C# Threads
• Basic thread operations
– Any method can run in its own thread
– A thread is created by creating a Thread object
– Creating a thread does not start its concurrent execution; it
must be requested through the Start method
– A thread can be made to wait for another thread to finish with
Join
– A thread can be suspended with Sleep
– A thread can be terminated with Abort
51
C# Threads
• Synchronizing threads
– The Interlocked class
– The lock statement
– The Monitor class
• Evaluation
– An advance over Java threads, e.g., any method can run its
own thread
– Thread termination cleaner than in Java
– Synchronization is more sophisticated
52
Polyphonic C#
• An extension of the C# language with new concurrency
constructs
• Based on the join calculus
– A foundational process calculus like the π-calculus but better suited to
asynchronous, distributed systems
• A single model which works both for
– local concurrency (multiple threads on a single machine)
– distributed concurrency (asynchronous messaging over LAN or WAN)
• It is different
• But it’s also simple – if Mort can do any kind of concurrency, he
can do this
53
In one slide:
• Objects have both synchronous and asynchronous methods.
• Values are passed by ordinary method calls:
– If the method is synchronous, the caller blocks until the method returns some result
(as usual).
– If the method is async, the call completes at once and returns void.
• A class defines a collection of chords (synchronization patterns), which define
what happens once a particular set of methods have been invoked. One method
may appear in several chords.
– When pending method calls match a pattern, its body runs.
– If there is no match, the invocations are queued up.
– If there are several matches, an unspecified pattern is selected.
– If a pattern containing only async methods fires, the body runs in a new thread.
54
Extending C# with chords
• Classes can declare methods using generalized
chord-declarations instead of method-declarations.
chord-declaration ::= method-header [ & method-header ]* body
method-header ::= attributes modifiers [return-type | async] name (parms)
• Interesting well-formedness conditions:
1. At most one header can have a return type (i.e. be synchronous).
2. Inheritance restriction.
3. “ref” and “out” parameters cannot appear in async headers.
55
Concurrent ML
• Threads
– New type of entity
• Communication
– Synchronous channels
• Synchronization
– Channels
– Events
• Atomicity
– No specific language support
56
Threads
• Thread creation
– spawn : (unit -> unit) -> thread_id
• Example code
CIO.print "begin parent\n";
spawn (fn () => (CIO.print "child 1\n";));
spawn (fn () => (CIO.print "child 2\n";));
CIO.print "end parent\n"
• Result
[Diagram: "begin parent" is printed first; "child 1", "child 2", and "end parent" may then appear in any order]
57
Channels
• Channel creation
– channel : unit -> 'a chan
• Communication
– recv : 'a chan -> 'a
– send : ( 'a chan * 'a ) -> unit
• Example
ch = channel();
spawn (fn()=> … <A> … send(ch,0); … <B> …);
spawn (fn()=> … <C> … recv ch; … <D> …);
• Result
[Diagram: <A> and <C> run independently; the send/recv pair then synchronizes the two threads; <B> and <D> run after it]
58
CML programming
• Functions
– Can write functions : channels -> threads
– Build concurrent system by declaring channels and “wiring
together” sets of threads
• Events
– Delayed action that can be used for synchronization
– Powerful concept for concurrent programming
• Sample Application
– eXene – concurrent uniprocessor window system
59
A CML implementation (simplified)
• Use queues with side-effecting functions
datatype 'a queue = Q of {front: 'a list ref, rear: 'a list ref}
fun queueIns (Q(…)) = (* insert into queue *)
fun queueRem (Q(…)) = (* remove from queue *)
• And continuations
val enqueue = queueIns rdyQ
fun dispatch () = throw (queueRem rdyQ) ()
fun spawn f = callcc (fn parent_k =>
( enqueue parent_k; f (); dispatch()))
Source: Appel, Reppy
60
Language issues in client/server programming
• Communication mechanisms
– RPC, Remote Objects, SOAP
• Data representation languages
– XDR, ASN.1, XML
• Parsing and deparsing between internal and external
representation
• Stub generation
61
Client/server example
• A major task of most clients is to interact with a human user and a remote server.
• The basic organization of the X Window System
62
Client-Side Software for Distribution Transparency
• A possible approach to transparent replication of a remote object
using a client-side solution.
63
The Stub Generation Process
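[Diagram: the stub generator reads the interface specification and produces the client stub, the server stub, and a common header. The client source and client stub are compiled and linked with the RPC library into the client program; the server source and server stub are compiled and linked with the RPC library into the server program.]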
64
RPC and the OSI Reference Model
Application Layer
Presentation Layer (XDR)
Session Layer (RPC)
Transport Layer (UDP)
65
Representation
• Data must be represented in a meaningful format.
• Methods:
– Sender or Receiver makes right (NDR).
• Network Data Representation (NDR).
• Transmit architecture tag with data.
– Represent data in a canonical (or standard) form
• XDR
• ASN.1
• Note – these are languages, but traditional DS
programmers don’t like programming languages, except C
66
XDR - eXternal Data Representation
• XDR is a universally used standard from Sun Microsystems
used to represent data in a network canonical (standard) form.
• A set of conversion functions are used to encode and decode
data; for example, xdr_int( ) is used to encode and decode
integers.
• Conversion functions exist for all standard data types
– Integers, chars, arrays, …
• For complex structures, RPCGEN can be used to generate
conversion routines.
67
RPC Example
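[Diagram: RPCGEN reads date.x and generates date.h, date_clnt.c, date_svc.c, and date_xdr.c. gcc builds the program client from client.c, date_clnt.c, and date_xdr.c; gcc builds the program date_svc from date_proc.c, date_svc.c, and date_xdr.c. Both link against the RPC library (-lnsl).]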
68
XDR Example
#include <rpc/xdr.h>
..
XDR sptr; // XDR stream handle
XDR *xdrs; // Pointer to the XDR stream handle
char buf[BUFSIZE]; // Buffer to hold XDR data
xdrs = (&sptr);
xdrmem_create(xdrs, buf, BUFSIZE, XDR_ENCODE);
..
int i = 256;
xdr_int(xdrs, &i);
printf("position = %u. \n", xdr_getpos(xdrs));
[Diagram: xdrs points to sptr, which describes the stream over buf]
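To make the canonical form concrete: XDR encodes an integer as four bytes in big-endian order. The sketch below reproduces that layout with java.nio.ByteBuffer; it does not use the Sun RPC library, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class XdrIntDemo {
    // Encodes an int the way XDR lays it out: 4 bytes, most significant first.
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }

    // Decodes 4 big-endian bytes back into an int.
    static int decodeInt(byte[] encoded) {
        return ByteBuffer.wrap(encoded).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] encoded = encodeInt(256);
        System.out.println(Arrays.toString(encoded)); // prints [0, 0, 1, 0]
        System.out.println(decodeInt(encoded));       // prints 256
    }
}
```

The encoded length of 4 corresponds to the stream position that the C example reports after one xdr_int call.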
69
Abstract Syntax Notation 1 (ASN.1)
• ASN.1 is a formal language that has two features:
– a notation used in documents that humans read
– a compact encoded representation of the same information used in communication
protocols.
• ASN.1 uses a tagged message format:
– < tag (data type), data length, data value >
• Simple Network Management Protocol (SNMP) messages are encoded using ASN.1.
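The tag, length, value layout can be illustrated by encoding a single INTEGER. The sketch assumes the Basic Encoding Rules (BER), which SNMP uses, and covers only the short length form; the class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.util.Arrays;

class BerIntegerDemo {
    // Encodes a value as a BER INTEGER: tag 0x02, one length byte, then the
    // value as minimal big-endian two's complement.
    static byte[] encodeInteger(long value) {
        byte[] content = BigInteger.valueOf(value).toByteArray(); // at most 8 bytes for a long
        byte[] encoded = new byte[2 + content.length];
        encoded[0] = 0x02;                     // tag: universal INTEGER
        encoded[1] = (byte) content.length;    // length (short form, below 128)
        System.arraycopy(content, 0, encoded, 2, content.length); // value
        return encoded;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encodeInteger(256))); // prints [2, 2, 1, 0]
    }
}
```

Unlike XDR's fixed four bytes, the tagged format is self-describing and uses only as many value bytes as the number needs.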
70
Distributed Objects
• CORBA
• Java RMI
• SOAP and XML
71
Distributed Objects
Proxy and Skeleton in Remote Method
Invocation
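[Diagram: on the client, object A calls the proxy for B; the request passes through the client's remote reference module and communication module to the server's communication module and remote reference module, then to the skeleton and dispatcher for B's class, which invokes remote object B; the reply returns along the same path]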
72
CORBA
• Common Object Request Broker Architecture
• An industry standard developed by OMG to help in distributed
programming
• A specification for creating and using distributed objects
• A tool for enabling multi-language, multi-platform communication
• A CORBA-based system is a collection of objects that isolates the
requestors of services (clients) from the providers of services
(servers) by an encapsulating interface
73
CORBA objects
They are different from typical programming objects in
three ways:
• CORBA objects can run on any platform
• CORBA objects can be located anywhere on the
network
• CORBA objects can be written in any language that has
IDL mapping.
74
[Diagram: two hosts, each with a Client and an Object Implementation attached through IDL interfaces to a local ORB; the two ORBs are connected by the NETWORK]
A request from a client to an Object implementation within a network
75
IDL (Interface Definition Language)
• CORBA objects have to be specified with interfaces (as
with RMI) defined in a special definition language IDL.
• The IDL defines the types of objects by defining their
interfaces and describes interfaces only, not
implementations.
• From IDL definitions an object implementation tells its
clients what operations are available and how they should
be invoked.
• Some programming languages have an IDL mapping (C, C++,
Smalltalk, Java, Lisp)
76
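[Diagram: the IDL compiler translates the IDL file into a client stub file and a server skeleton file; the client implementation uses the stub, the object implementation uses the skeleton, and both communicate through the ORB]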
77
The IDL compiler
• It will accept as input an IDL file written using any text editor
(fileName.idl)
• It generates the stub and the skeleton code in the target
programming language (ex: Java stub and C++ skeleton)
• The stub is given to the client as a tool to describe the server
functionality, the skeleton file is implemented at the server.
78
IDL Example
module katytrail {
module weather {
struct WeatherData {
float temp;
string wind_direction_and_speed;
float rain_expected;
float humidity;
};
typedef sequence<WeatherData> WeatherDataSeq;
interface WeatherInfo {
WeatherData get_weather(
in string site
);
WeatherDataSeq find_by_temp(
in float temperature
);
};
79
IDL Example Cont.
interface WeatherCenter {
void register_weather_for_site (
in string site,
in WeatherData site_data
);
};
};
};
Both interfaces will have Object Implementations.
A different type of Client will talk to each of the
interfaces.
The Object Implementations can be done in one
of two ways. Through Inheritance or through
a Tie.
80
Stubs and Skeletons
• In terms of CORBA development, the stubs and skeleton files are
standard in terms of their target language.
• Each file exposes the same operations specified in the IDL file.
• Invoking an operation on the stub file will cause the method to be
executed in the skeleton file
• The stub file allows the client to manipulate the remote object
with the same ease with which a local file is manipulated
81
Java RMI
• Overview
– Supports remote invocation of Java objects
– Key: Java Object Serialization
Stream objects over the wire
– Language specific
• History
– Goal: RPC for Java
– First release in JDK 1.0.2, used in Netscape 3.01
– Full support in JDK 1.1, intended for applets
– JDK 1.2 added persistent reference, custom protocols, more support for
user control.
82
Java RMI
• Advantages
– True object-orientation: Objects as arguments and values
– Mobile behavior: Returned objects can execute on caller
– Integrated security
– Built-in concurrency (through Java threads)
• Disadvantages
– Java only
• Advertises support for non-Java
• But this is external to RMI – requires Java on both sides
83
Java RMI Components
• Base RMI classes
– Extend these to get RMI functionality
• Java compiler – javac
– Recognizes RMI as integral part of language
• Interface compiler – rmic
– Generates stubs from class files
• RMI Registry – rmiregistry
– Directory service
• RMI Run-time activation system – rmid
– Supports activatable objects that run only on demand
84
RMI Implementation
[Diagram: on the client host, the client object in its Java Virtual Machine calls the stub; on the server host, the skeleton in its Java Virtual Machine forwards the call to the remote object]
85
Java RMI Object Serialization
• Java can send object to be invoked at remote site
– Allows objects as arguments/results
• Mechanism: Object Serialization
– Object passed must implement java.io.Serializable
– Provides methods to translate object to/from byte stream
• Security issues:
– Ensure object not tampered with during transmission
– Solution: Class-specific serialization
Throw it on the programmer
86
Building a Java RMI Application
• Define remote interface
– Extend java.rmi.Remote
• Create server code
– Implements interface
– Creates security manager, registers with registry
• Create client code
– Define object as instance of interface
– Lookup object in registry
– Call object
• Compile and run
– Run rmic on compiled classes to create stubs
– Start registry
– Run server then client
87
Parameter Passing
• Primitive types
– call-by-value
• Remote objects
– call-by-reference
• Non-remote objects
– call-by-value
– use Java Object Serialization
88
Java Serialization
• Writes object as a sequence of bytes
• Writes it to a Stream
• Recreates it on the other end
• Creates a brand new object with the old data
• Objects can be transmitted using any byte stream
(including sockets and TCP).
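The steps above can be sketched with an in-memory byte stream standing in for the network. The Point class and the helper names are hypothetical; a real RMI call performs the equivalent steps internally for non-remote arguments.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationDemo {
    // Hypothetical value class; it must implement Serializable to be written.
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Writes the object as a sequence of bytes.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Recreates an object from the bytes, as the receiving side would.
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // True if the round trip yields a brand new object carrying the old data.
    static boolean roundTripMakesCopy() {
        try {
            Point original = new Point(3, 4);
            Point copy = (Point) fromBytes(toBytes(original));
            return copy != original && copy.x == 3 && copy.y == 4;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripMakesCopy()); // prints true
    }
}
```

Because the receiver gets a copy, changes it makes are not seen by the sender, which is why non-remote objects behave as call-by-value.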
89
Codebase Property
• Stub classpaths can be confusing
– 3 VMs, each with its own classpath
– Server vs. Registry vs. Client
• The RMI class loader always loads stubs from the
CLASSPATH first
• Next, it tries downloading classes from a web server
– (but only if a security manager is in force)
• java.rmi.server.codebase specifies which web server
90
CORBA vs. RMI
• CORBA was designed for language independence whereas
RMI was designed for a single language where objects run
in a homogeneous environment
• CORBA interfaces are defined in IDL, while RMI
interfaces are defined in Java
• CORBA objects are not garbage collected because they are
language independent and they have to be consistent with
languages that do not support garbage collection, on the
other hand RMI objects are garbage collected
automatically
91
SOAP Introduction
• SOAP is a simple, lightweight, text-based protocol
• SOAP is an XML-based protocol (XML encoding)
• SOAP is a remote procedure call protocol, not completely object-oriented
• SOAP can be bound to any transport protocol
SOAP is a simple, lightweight protocol with a minimal set of rules for
invoking remote services using XML data representation and HTTP as the
wire protocol.
• Main goal of the SOAP protocol – interoperability
[Diagram: SOAP connecting MainFrame, Windows, Unix, and ECommerce systems]
• SOAP does not specify any advanced distributed services.
92
Why SOAP – What’s wrong with existing distributed technologies
• Platform and vendor dependent solutions
(DCOM – Windows) (CORBA – ORB vendors) (RMI – Java)
• Different data representation schemes
(CDR – NDR)
• Complex client side deployment
• Difficulties with firewall
Firewalls allow only specific ports (e.g. port 80), but DCOM and
CORBA assign port numbers dynamically.
• In short, these distributed technologies do not communicate easily
with each other because of lack of standards between them.
93
Base Technologies – HTTP and XML
• SOAP uses the existing technologies, invents no new
technology.
• XML and HTTP are accepted and deployed in all
platforms.
• Hypertext Transfer Protocol (HTTP)
– HTTP is a very simple, text-based protocol.
– HTTP layers request/response communication over TCP/IP.
HTTP supports a fixed set of methods such as GET and POST.
– Client / Server interaction
• Client requests to open connection to server on default port number
• Server accepts connection
• Client sends a request message to the server
• Server processes the request
• Server sends a reply message to the client
• Connection is closed
– HTTP servers are scalable, reliable and easy to administer.
• SOAP can be bound to any protocol – HTTP, SMTP, FTP
94
Extensible Markup Language (XML)
• XML is a platform-neutral data representation format.
• HTML combines data and representation, but XML contains just
structured data.
• XML contains no fixed set of tags and users can build their own
customized tags.
<student>
<full_name>Bhavin Parikh</full_name>
<email>[email protected]</email>
</student>
• XML is platform and language independent.
• XML is text-based and easy to handle and it can be easily
extended.
95
Architecture diagram
1. Client calls remote service using SOAP
2. Client can use a proxy object to hide all SOAP details
3. SOAP Listener can be implemented as ASP, JSP, CGI or SERVLET
4. Mapping tool maps SOAP request to remote service
SOAP = HTTP + XML + RPC
or
SOAP = HTTPS + XML + RPC
[Diagram components: Client Application (COM client, CORBA client, or Java RMI client), Proxy Object, Web Services Description Language, XML Parser, and SOAP Library on the client side; SOAP Request and SOAP Response between the client and the HTTP Server; SOAP Listener, SOAP Library, XML Parser, Mapping Tool, and Server Application (COM object, CORBA object, or RMI object) on the server side. The client calls either directly or through the proxy.]
96
Parsing XML Documents
• Remember: XML is just text
• Simple API for XML (SAX) Parsing
– SAX is typically most efficient
– No in-memory representation is built!
• Left to the developer
• Document Object Model (DOM) Parsing
– “Parsing” is not the fundamental emphasis.
– A “DOM Object” is a representation of the XML document as
an in-memory tree.
97
Parsing: Examples
• SaxParseExample
– “Callback” functions to process Nodes
• DomParseExample
– Use of JAXP (Java API for XML Parsing)
• Implementations can be ‘swapped’, such as replacing
Apache Xerces with Sun Crimson.
– JAXP does not include some ‘advanced’ features that may be
useful.
– SAX used behind the scenes to create object model
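The two parsing styles can be compared on a cut-down version of the student document shown earlier, using the JAXP classes bundled with the JDK. This is a sketch, not the SaxParseExample or DomParseExample programs named on the slide; the class, method, and constant names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class XmlParseDemo {
    static final String XML =
        "<student><full_name>Bhavin Parikh</full_name></student>";

    // SAX: the parser calls back for each element; nothing is kept in memory
    // unless the handler stores it, as done here with the element names.
    static List<String> saxElementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        names.add(qName);
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    // DOM: the whole document becomes a tree that can be queried afterwards.
    static String domFullName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("full_name").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(saxElementNames(XML)); // prints [student, full_name]
        System.out.println(domFullName(XML));     // prints Bhavin Parikh
    }
}
```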
98
Languages for distributed computing
• Motivation
– Why all the fuss about language and platform independence?
• It is extremely inefficient to parse/deparse to/from
external/internal representation
• 95% of all computers run Windows anyway
• There is a JVM for almost any processor you can think of
• Few programmers master more than one programming
language anyway
– Develop a coherent programming model for all aspects of an
application
99
Facile Programming Language
• Integration of Multiple Paradigms
– Functions
– Types/complex data types
– Concurrency
– Distribution/soft real-time
– Dynamic connectivity
• Implemented as extension to SML
• Syntax for concurrency similar to CML
100
101
Facile implementation
• Pre-emptive scheduler implemented at the lowest level
– Exploiting CPS translation => state characterised by the set of
registers
• Garbage collector used for linearizing data structures
• Lambda level code used as intermediate language when
shipping data (including code) in heterogeneous
networks
• Native representation is shipped when possible
– i.e. same architecture and within same trust domain
• Possibility to mix between interpretation or JIT
depending on usage
102
Conclusion
• Concurrency may be an order of magnitude more
difficult to handle
• Programming language support for concurrency may
help make the task easier
• Which concurrency constructs to add to the language is
still a very active research area
• If you add concurrency constructs, be sure you base them
on a formal model!
103