Distributed Systems Major Design Issues
DM Rasanjalee Himali
CSc8320 – Advanced Operating Systems (SECTION 2.6)
FALL 2009
The Basics
A distributed system consists of concurrent
processes accessing distributed resources.
Resources are shared through message
passing in a network environment that may be
unreliable and contain untrusted components.
1. Object Models and Naming Schemes
2. Distributed Coordination
3. Interprocess Communication
4. Distributed Resources
5. Fault Tolerance and Security
Objects in a Computer System:
◦ Ex:
Processes, data files, memory, devices, processors, networks
◦ Each object is represented by its set of allowable operations
◦ Physical details of the object are transparent to other objects
Object Servers:
◦ An object server is the process that manages the object
◦ Objects are encapsulated in servers
◦ The only visible entities in the system are servers
◦ Ex:
process servers, file servers, memory servers etc.
◦ A client is a null server that accesses the object server
Identifying Server:
◦ To contact a server, the server must be identifiable.
◦ Three identification methods:
1. Identification by name
2. Identification by physical or logical address
3. Identification by service that servers provide
1. Identification by name
◦ Names are generally assumed to be unique
◦ But multiple addresses may exist for the same server, and an address needs to change if the server moves
◦ Names are more intuitive than addresses
2. Identification by physical or logical address
◦ Name to logical address mapping is done by the name server in the OS.
◦ Logical address to physical address mapping is a network service
◦ The PORT used by many systems is a logical address.
◦ Associating more than one port with a server provides multiple entry points to the server
3. Identification by service that servers provide
◦ Multiple servers can share the same port
◦ This can be used for service identification in a distributed system.
◦ The client is only interested in the requested service
◦ Who provides the service is irrelevant
◦ Multiple servers can provide the same service
◦ This approach is critical to implementing an autonomous system.
◦ A resolution protocol is needed to translate service to server
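The identification methods above can be pictured as lookup tables kept by a name server. The following is only a minimal sketch: the class `NameServer`, its method names, and the sample server names and port numbers are hypothetical and do not come from the slides.

```python
# Hypothetical sketch: resolving a server by name or by the service it provides.
class NameServer:
    def __init__(self):
        self.name_to_ports = {}     # server name -> logical addresses (ports)
        self.service_to_names = {}  # service -> names of servers offering it

    def register(self, name, port, service):
        """Record that server `name` offers `service` on logical address `port`."""
        self.name_to_ports.setdefault(name, []).append(port)
        providers = self.service_to_names.setdefault(service, [])
        if name not in providers:
            providers.append(name)

    def resolve_name(self, name):
        """Name -> logical addresses; several ports mean several entry points."""
        return list(self.name_to_ports.get(name, []))

    def resolve_service(self, service):
        """Service -> some (name, port); the client does not care which server."""
        for name in self.service_to_names.get(service, []):
            ports = self.name_to_ports.get(name)
            if ports:
                return name, ports[0]
        raise LookupError(f"no server provides {service!r}")


ns = NameServer()
ns.register("fs1", 2049, "file")   # assumed example values
ns.register("fs1", 2050, "file")   # second port: another entry point to fs1
ns.register("fs2", 2049, "file")   # a second server offering the same service
```

Here `ns.resolve_name("fs1")` returns both ports of one server, while `ns.resolve_service("file")` returns whichever provider the resolution step picks first. Mapping a logical address to a physical one is left to the network and is not modelled.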
Object models and naming:
◦ Must be addressed early in the system design, as
many things depend on the naming scheme
◦ Ex:
Structure of the system
Management of the namespace
Name resolution
Access methods
Interacting concurrent processes require coordination
to achieve synchronization.
Types of Synchronization Requirements:
◦ In general there are three types of synchronization requirements:
1. Barrier Synchronization
◦ A set of processes or events must reach a common synchronization
point before they can continue
2. Condition coordination
◦ A process or event must wait for a condition that will be set
asynchronously by other interacting processes to maintain some
ordering of execution
3. Mutual Exclusion
◦ Concurrent processes must have mutual exclusion when accessing a
critical shared resource
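The three requirements can be seen side by side in a small sketch. It uses threads on one machine from Python's standard `threading` module as a stand-in for distributed processes, which is an assumption made for brevity: in a real distributed system there is no shared memory, and the same effects have to be achieved by message passing.

```python
import threading

N = 3
barrier = threading.Barrier(N)   # barrier synchronization
ready = threading.Event()        # condition coordination
lock = threading.Lock()          # mutual exclusion
arrived = []                     # workers that have reached the barrier
seen_after_barrier = []          # arrivals each worker observes after the barrier
counter = 0                      # the critical shared resource


def worker(i):
    global counter
    with lock:
        arrived.append(i)
    barrier.wait()               # nobody continues until all N have arrived
    ready.wait()                 # wait for a condition set asynchronously elsewhere
    with lock:                   # one worker at a time touches shared state
        seen_after_barrier.append(len(arrived))
        counter += 1


threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
ready.set()                      # the main thread sets the awaited condition
for t in threads:
    t.join()
```

Every worker that gets past the barrier sees all `N` arrivals, and the lock ensures that the increments of `counter` are not lost.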
Synchronization implies the need for knowledge
of state information about other processes
Problems with Synchronization:
1. Complete state information is difficult to obtain
◦ Ex:
no shared memory environment
◦ Solution:
Use message passing to convey state information
2. Inaccurate or incomplete information
◦ Ex:
message transfer delays
◦ Solution:
Use a centralized coordinator role that moves from one process to another
(no single point of failure)
3. Deadlock of Processes
Interacting processes can lead to deadlock
Deadlock: circular waiting of processes
Problem:
Sometimes it is not practical to implement deadlock prevention or avoidance
strategies in a distributed system
Solution:
Detect and recover from deadlocks
Problem:
Detection of deadlocks in a distributed system is non-trivial (because the global state
of the system is not available)
Who should initiate the detection algorithm?
How can the algorithm be implemented in a distributed fashion by message
passing?
Who should be the victim to abort in order to resolve the deadlock?
How can the victim be recovered?
The efficiency of deadlock resolution and recovery seems more important than that of
detection
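Since deadlock is circular waiting, detection amounts to finding a cycle in a wait-for graph. The sketch below assumes the whole graph has already been collected at one site, which is exactly the part that is hard in a distributed system. The function names and the "abort the cheapest process" victim policy are illustrative assumptions, not a method given in the slides.

```python
def find_deadlock(wait_for):
    """Return the processes on one cycle of the wait-for graph, or None.

    wait_for maps each process to the processes it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {p: WHITE for p in wait_for}
    path = []

    def visit(p):
        color[p] = GREY
        path.append(p)
        for q in wait_for.get(p, ()):
            state = color.get(q, WHITE)
            if state == GREY:             # q is on the current path: a cycle
                return path[path.index(q):]
            if state == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        path.pop()
        color[p] = BLACK
        return None

    for p in list(wait_for):
        if color[p] == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None


def choose_victim(cycle, cost):
    """Assumed policy: abort the process whose rollback is cheapest."""
    return min(cycle, key=lambda p: cost[p])


graph = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"], "P4": ["P1"]}
cycle = find_deadlock(graph)
victim = choose_victim(cycle, {"P1": 5, "P2": 1, "P3": 9})
```

In this example `P4` waits on a deadlocked process but is not itself part of the cycle, so it is never considered as a victim.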
Distributed solutions to synchronization and
deadlock problems:
◦ Use partial global state for decision making
Many applications do not need absolute global
knowledge of the system
◦ Exchange of local knowledge among cooperating
sites
Communication:
◦ Most important issue in any distributed system
◦ In operating systems, interaction between processes and information flow between
objects depend on communication
◦ Message passing is the only means of communication in a distributed
system
◦ Goal:
Have transparency in communication by providing higher level
communication methods that hide the physical details of the message
passing
◦ Two concepts are used to achieve this goal:
Client/Server model
Remote Procedure Calls (RPC)
Client/Server model:
◦ Programming paradigm for structuring processing in distributed
systems
◦ All system interactions are viewed as a pair of message
exchanges
The client process sends a request to the server
The server responds with a reply message
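The request/reply pair can be sketched with two message queues. This is an illustration only: the in-process `queue.Queue` stands in for the network, and the message layout `(operation, arguments, reply queue)` and the `"add"` operation are assumptions made for the example.

```python
import queue
import threading

requests = queue.Queue()             # stands in for the server's network endpoint


def server():
    """Serve requests until a None shutdown message arrives."""
    while True:
        msg = requests.get()
        if msg is None:
            break
        op, args, reply_to = msg
        if op == "add":
            reply_to.put(("ok", sum(args)))
        else:
            reply_to.put(("error", f"unknown operation {op!r}"))


def call(op, *args):
    """Client side: send one request message, then block for the reply message."""
    reply_to = queue.Queue(maxsize=1)
    requests.put((op, args, reply_to))
    return reply_to.get(timeout=5)


t = threading.Thread(target=server)
t.start()
result = call("add", 2, 3)           # request/reply pair that succeeds
bad = call("mul", 2, 3)              # request the server does not understand
requests.put(None)                   # shut the server down
t.join()
```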
Remote Procedure Calls:
◦ Client/Server request/reply message exchange is represented as
a procedure call in programming languages
◦ RPC: Procedure call to a remote server
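A rough sketch of how a client stub can make the message exchange look like a procedure call. The names `Stub`, `handle_request` and `PROCEDURES`, and the JSON message format, are hypothetical. The "transport" here is a direct function call standing in for the network.

```python
import json


def add(a, b):
    return a + b


PROCEDURES = {"add": add}            # procedures exported by the (assumed) server


def handle_request(raw):
    """Server side: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(raw)
    procedure = PROCEDURES.get(request["proc"])
    if procedure is None:
        return json.dumps({"error": "no such procedure"})
    return json.dumps({"result": procedure(*request["args"])})


class Stub:
    """Client side: turns attribute calls into request messages."""

    def __init__(self, transport):
        self._transport = transport  # callable: request string -> reply string

    def __getattr__(self, name):
        def remote_call(*args):
            raw = json.dumps({"proc": name, "args": list(args)})
            reply = json.loads(self._transport(raw))
            if "error" in reply:
                raise RuntimeError(reply["error"])
            return reply["result"]
        return remote_call


remote = Stub(handle_request)
total = remote.add(2, 3)             # reads like a local procedure call
```

The caller writes `remote.add(2, 3)` and never sees the marshalled messages, which is the transparency RPC aims for.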
Multicast and Broadcast:
◦ Client/Server and RPC are unicast (point-to-point)
◦ Notion of “groups” is inherent to distributed systems
◦ Processes cooperate in group activities
◦ Group communication in distributed systems is logical multicast
(perhaps without broadcasting hardware)
◦ Communication needs to go through several layers of protocols and be
propagated to a number of physically distributed nodes.
◦ Thus it is more susceptible to failures in the system
◦ Reliable and atomic group broadcast remains an open issue in
distributed systems
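A small sketch of logical multicast built from repeated unicasts, showing why atomic delivery is difficult: if one member is unreachable, some members have the message and others do not. The class `Group` and its `down` set are hypothetical names used only for this illustration.

```python
class Group:
    """Logical multicast implemented as one unicast per group member."""

    def __init__(self):
        self.inboxes = {}            # member -> messages delivered so far
        self.down = set()            # members that are currently unreachable

    def join(self, member):
        self.inboxes[member] = []

    def unicast(self, member, message):
        """Point-to-point send; returns False if the member cannot be reached."""
        if member in self.down:
            return False
        self.inboxes[member].append(message)
        return True

    def multicast(self, message):
        """Send to every member; delivery is neither reliable nor atomic."""
        delivered, failed = [], []
        for member in self.inboxes:
            if self.unicast(member, message):
                delivered.append(member)
            else:
                failed.append(member)
        return delivered, failed


g = Group()
for name in ("A", "B", "C"):
    g.join(name)
g.down.add("B")
delivered, failed = g.multicast("m1")
```

After this call, A and C hold `m1` while B does not, so the group is left in an inconsistent state unless a further protocol retries or undoes the delivery.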
The only resources needed for computation are data and
processing capacity
Data:
may reside physically in distributed memory or secondary
storage
Processing Capacity:
Aggregate processing power of all processors
Goal:
Achieve transparency in allocating processing capacity to
processes (distributing processes/load to the processors)
Static Load Distribution:
Also called multiprocessor scheduling
Goal:
minimize completion time of a set of related processes
Issue:
Effect of communication overhead on the design of scheduling strategies
Dynamic Load Distribution:
Also called load sharing
Goal:
Maximize utilization of a set of processors
Issue:
Process migration
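The two goals can be contrasted with a toy sketch. The static part uses the longest-processing-time-first heuristic, one common choice swapped in here for illustration. It ignores communication overhead and precedence between processes, which is exactly the issue the slide raises. The dynamic part simply picks the least loaded processor. All names and numbers are assumptions.

```python
import heapq


def static_schedule(durations, n_processors):
    """Assign all jobs up front: longest job first, to the least loaded processor."""
    heap = [(0, p) for p in range(n_processors)]       # (current load, processor)
    assignment = {p: [] for p in range(n_processors)}
    for job, duration in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(job)
        heapq.heappush(heap, (load + duration, p))
    completion_time = max(load for load, _ in heap)
    return assignment, completion_time


def least_loaded(loads):
    """Dynamic load sharing: place (or migrate) a process on the idlest processor."""
    return min(loads, key=loads.get)


assignment, completion_time = static_schedule({"a": 4, "b": 3, "c": 3, "d": 2}, 2)
target = least_loaded({"n1": 5, "n2": 2, "n3": 7})
```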
Distributed Shared Memory:
Transparent memory system
Assume data resides in distributed memory modules
Present single shared memory view of physically distributed
memories
Goal:
Maximize transparency
Other issues (for distributed file systems & distributed
shared memory):
Sharing & replication of data
Need protocols to maintain consistency & coherency of data
Existence of replicas should be transparent to the user
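One very simple consistency protocol is write-all/read-any: every write updates all replicas, so a read can be served by any of them and the user never sees which copy answered. The sketch below assumes no failures and no concurrent writers. The class `ReplicatedObject` is a hypothetical name.

```python
class ReplicatedObject:
    """Write-all / read-any replication, assuming no failures or concurrent writes."""

    def __init__(self, n_replicas):
        self.replicas = [{"version": 0, "value": None} for _ in range(n_replicas)]
        self._next_read = 0

    def write(self, value):
        """Update every replica with the same new version."""
        version = self.replicas[0]["version"] + 1
        for replica in self.replicas:
            replica["version"] = version
            replica["value"] = value

    def read(self):
        """Serve the read from any replica (round-robin here)."""
        replica = self.replicas[self._next_read % len(self.replicas)]
        self._next_read += 1
        return replica["value"]

    def coherent(self):
        """True when all replicas hold the same version and value."""
        return all(r == self.replicas[0] for r in self.replicas)


obj = ReplicatedObject(3)
obj.write("v1")
reads = [obj.read() for _ in range(3)]   # one read from each replica
```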
Distributed systems are vulnerable to failures and
security threats
Failures:
Faults due to unintentional intrusion
Security Violations:
Faults due to intentional intrusion
Dependable Distributed System:
Fault tolerant system
System faults are transparent to the user
Solution for Failures:
Redundancy in the system:
Is an inherent property of distributed systems as data and resources can be
replicated
Rollback:
Recovery from failures requires rolling back the execution of the failed process
and of other affected processes
The execution state must be saved for rollback recovery (a difficult task in
distributed systems)
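Rollback for a single process can be sketched as saving and restoring checkpoints of its state. The class `Process` and the bank-balance state are hypothetical. Coordinating the rollback of the other affected processes, the hard part in a distributed system, is not shown.

```python
import copy


class Process:
    """A process whose execution state can be checkpointed and rolled back."""

    def __init__(self, state):
        self.state = state
        self._checkpoints = []

    def checkpoint(self):
        """Save a private copy of the current execution state."""
        self._checkpoints.append(copy.deepcopy(self.state))

    def rollback(self):
        """Restore the most recently saved state after a failure."""
        if not self._checkpoints:
            raise RuntimeError("no checkpoint to roll back to")
        self.state = copy.deepcopy(self._checkpoints[-1])


p = Process({"balance": 100})
p.checkpoint()
p.state["balance"] -= 30             # work done after the checkpoint
p.rollback()                         # a failure is assumed here; the work is undone
```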
Solution for Security:
Issues:
Trustworthiness of the communicating processes
Confidentiality and integrity of messages & data
Authentication & Authorization
Solutions:
Authentication: clients, servers & messages must be authenticated
Authorization: access control across a physical network with heterogeneous
components under different administrative units, using different security
models
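Message authentication and integrity can be illustrated with a keyed hash (HMAC) from Python's standard library. This is a sketch under the assumption that client and server already share a secret key. The key and message values are made up, and key distribution and authorization are not covered.

```python
import hashlib
import hmac

KEY = b"shared-secret-for-demo"      # assumed to be known only to client and server


def sign(key, message):
    """Compute an authentication tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Check the tag in constant time; False if the message or tag was altered."""
    return hmac.compare_digest(sign(key, message), tag)


message = b"read /home/report.txt"
tag = sign(KEY, message)
ok = verify(KEY, message, tag)
tampered_ok = verify(KEY, b"write /home/report.txt", tag)
```

A receiver that holds the same key accepts the original message and rejects the altered one. Without the key, an intruder cannot produce a valid tag.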
Related Work
Peer-to-Peer Networks:
◦ A distributed network architecture composed of participants that make
a portion of their resources available directly to their peers,
without intermediary network hosts or servers.
◦ Peers are both suppliers and consumers of resources
◦ Research:
Security and privacy in P2P systems
Resource discovery/management in
P2P systems
Peer-to-Peer Search
BFS – Breadth First Search
(-) sacrifices performance and network utilization for simplicity
(+) guarantees high hit rates at the expense of a large number of
messages
Random BFS
(-) the RBFS algorithm is probabilistic, and a query might not
reach some large network segments
(+) does not require global knowledge
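The trade-off between the two searches can be sketched on a toy overlay. With `fraction=1.0` each peer forwards the query to all neighbours (BFS flooding). With a smaller fraction it forwards to a random subset (Random BFS). The function name, the `ttl` hop limit, the `fraction` parameter and the four-node graph are assumptions made for illustration.

```python
import math
import random
from collections import deque


def search(graph, start, holders, ttl, fraction=1.0, seed=None):
    """Flood a query up to ttl hops; return (peers with a hit, messages sent)."""
    rng = random.Random(seed)
    visited = {start}
    frontier = deque([(start, 0)])
    hits, messages = [], 0
    while frontier:
        node, depth = frontier.popleft()
        if node in holders:
            hits.append(node)
        if depth == ttl:
            continue                             # hop limit reached: do not forward
        neighbours = list(graph[node])
        k = len(neighbours) if fraction >= 1.0 else math.ceil(fraction * len(neighbours))
        for peer in rng.sample(neighbours, k):   # all neighbours, or a random subset
            messages += 1
            if peer not in visited:
                visited.add(peer)
                frontier.append((peer, depth + 1))
    return hits, messages


overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
bfs_hits, bfs_messages = search(overlay, "A", {"D"}, ttl=2)
rbfs_hits, rbfs_messages = search(overlay, "A", {"D"}, ttl=2, fraction=0.5, seed=1)
```

On this overlay, full BFS always finds the item on D but sends more messages. Random BFS sends fewer messages and may or may not reach D, depending on the random choices.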
Future Work
Develop a model for P2P Search
Bayesian Inferencing
Value of Information
Extend P2P search for P2P Web Search
Most centralized Web search engines currently find it increasingly hard to
keep up with the growth in information needs
Local & decentralized global directory
Semantic P2P Overlay Networks
Node connections are influenced by content / existence of
multiple overlay networks based on content
Dynamic restructuring of overlay
References
Randy Chow, Theodore Johnson, “Distributed Operating Systems & Algorithms”, Addison Wesley, 1997
Arturo Crespo, Hector Garcia-Molina, “Semantic Overlay Networks for P2P Systems”, 2002
Christos Gkantsidis, Milena Mihail, Amin Saberi, “Random walks in peer-to-peer networks: algorithms and evaluation”, 2006
www.en.wikipedia.com