topic 1x - Lightweight OCW University of Palestine

Download Report

Transcript topic 1x - Lightweight OCW University of Palestine

Topic 1: Characterization Of
Distributed Systems
Dr. Ayman Srour
Faculty of Applied Engineering and Urban Planning
University of Palestine
Outline
• 1.1: What is a Distributed System
• 1.2: Examples of Distributed Systems
• 1.3: Challenges
• 1.4:Basic Design Issues
• 1.5:Design Requirements
• 1.6: Summary
1.1 What is distributed systems?
• Definition: A distributed system is one in which
components located at networked computers
communicate and coordinate their actions only
by passing messages.
• This definition leads to the following
characteristics of distributed systems:
•
•
•
concurrency of components
lack of a global clock
independent failures of components
Centralized System Characteristics
• One component with non-autonomous parts
• Component shared by users all the time
• All resources accessible
• Software runs in a single process
• Single point of control
• Single point of failure
Distributed System Characteristics
• Multiple autonomous components
• Components are not shared by all users
• Resources may not be accessible
• Software runs in concurrent processes on different
processors
• Multiple points of control
• Multiple points of failure
Selected application domains and associated networked
applications
Finance and commerce
eCommerce e.g. Amazon and eBay,
PayPal, online banking and trading
The information society
Web information and search engines, ebooks,
Wikipedia; social networking: Facebook and
MySpace.
Creative industries and
entertainment
online gaming, music and film in the
home, user-generated content, e.g.
YouTube, Flickr
Healthcare
health informatics, on online patient
records, monitoring patients
Education
e-learning, virtual learning environments;
distance learning
Transport and logistics
GPS in route finding systems, map
services: Google Maps, Google Earth
Science
The Grid as an enabling technology for
collaboration between scientists
Environmental management
sensor technology to monitor
1.2 Examples of Distributed
Systems
• Local Area Network and Intranet
• Database Management System
• Automatic Teller Machine Network
• Web search
• Portable and handheld devices
• Internet/World-Wide Web
1.2.1 Local Area Network
1.2.1 Local Area Network
• A local area network consists of a number of different
computers. Workstations and Personal computers
provide the front-end for network users. Different
servers provide shared services
• One or several network file servers provide data storage
services. Any workstation and PC may henceforth store
files on disks maintained by these file servers.
• A local name server maps machine names to IP
addresses, user names to user ids and group names to
group ids. Any machine can request a service to resolve
a certain name.
• Local area networks can be connected together to form
Intranets
1.2.1 Local Area Network
• One or several print servers control the access to shared
printers. Workstations and PCs have the server printing
jobs for them.
• Another component provides a gateway to the wide
area network
• As a user you need not be aware which machine
provides which service.
• Name servers provide the information regarding
machine (by IP address),user names (by user ids), and
group name (by group ids).
• Local area networks can be connected together to form
Intranets
1.2.2 Database Management
System
1.2.2 Database Management
System
• Different client applications want to access and update
shared data in a database.
• Client applications might be banking systems, realestate agencies, airline ticket reservation systems
accessing data like balances of bank accounts, details of
property that are for sale, or airfares and aircraft
reservation data.
• The database is physically distributed over several
processors to take advantage of local data accesses for
increased performance of client applications.
• Data may be replicated to reduce the impact of failures
of a processor and/or the network.
1.2.2 Database Management
System
• Each processor runs a database monitor that
implements the mapping between the database
seen by clients and the physical database stored on
the different processors.
• Database monitors have to cooperate with each
other to implement client accesses to remote data,
updates of replicated data and concurrency control.
• The physical distribution of data is therefore
transparent to clients.
1.2.3 Automatic Teller Machine
Network
1.2.3 Automatic Teller Machine
Network
• An automatic teller machine network enables bank
customers to withdraw cash from their bank account.
• Banks and building societies maintain large networks of
teller machines.
• Customer have high security, privacy and reliability
requirements
• Customers may want to withdraw cash from their
account through a ´foreign´ teller machine.
• copy of the account database and can replace the main
computer within seconds.
1.2.3 Automatic Teller Machine
Network
• front-end computer controls one or several tellers. It
o transfers withdrawal requests to the computer of the
account holder´s bank,
o awaits the bank granting the request, and
o therefore has to be interoperable with heterogeneous
computer systems (Hang Seng Bank may have
different account management systems than
HongKong Bank and Bank of China).
• Each bank has fault-tolerant systems to quickly recover
from failures of their account holding computers. An
example is the ´Hot standby´ computer which maintains
a copy of the account database and can replace the
main computer within seconds.
1.2.4 Web Search
• The global number of searches has risen to over 10
billion per calendar month.
• Web search engine should index the entire
contents of the WWW. i.e., web pages, multimedia
sources, books and so on….
• Its very complex task need sophisticated processing
on huge DB.
• Google presents the largest and more complex DS
infrastructure to support web search and other
application and services (i.e. google earth…)
1.2.5 Portable and handheld
devices
1.2.6 Internet/World-Wide Web
• The modern Internet is a vast interconnected collection of computer networks
of many different types (WiFi, WiMAX, 3.0 and 4.0 G,…)
• Internet is the largest distributed system in the world. It enables users, wherever
they are, to make use of services such as the World Wide Web, email and file
transfer.
• Internet Service
• Providers (ISPs) are companies that provide broadband links and other types of
connection to individual users and small organizations and typically protected by
firewalls.
• The role of a firewall is to protect an intranet by preventing unauthorized
messages from leaving or entering.
• The intranets are linked together by backbones. A backbone is a network link
with a high transmission capacity, employing satellite connections, fibre optic
cables and other high-bandwidth circuits.
• World Wide Web is the largest software application running on Internet. It
becomes the most popular distributed software application ever created.
Internet
intranet
ISP
backbone
satellite link
desktop computer:
server:
network link:
1.3 Challenges
• What are we trying to achieve when we construct a
distributed system?
• Certain common characteristics can be used to assess
distributed systems:
o
o
o
o
o
o
o
Heterogeneity.
Openness.
Security.
Scalability.
Failure Handling
Concurrency
Transparency
1.3.1 Heterogeneity
• Variety and differences in
– Networks
– Computer hardware
– Operating systems
– Programming languages
– Implementations by different developers
• Middleware as software layers to provide a programming abstraction as
well as masking the heterogeneity of the underlying networks,
hardware, OS, and programming languages (e.g., CORBA).
• CORBA provides remote object invocation, which allows an object in a
program running on one computer to invoke a method of an object in a
program running on another computer.
• Mobile Code to refer to code that can be sent from one computer to
another and run at the destination (e.g., Java applets and Java virtual
machine).
1.3.2 Openness
• Openness is concerned with extensions and
improvements of distributed systems.
• Detailed interfaces of components need to be
published.
• New components have to be integrated with
existing components.
• Differences in data representation of interface
types on different processors (of different vendors)
have to be resolved.
1.3.3 Security
• In a distributed system, clients send requests to access
data managed by servers, resources in the networks
o
o
Doctors requesting records from hospitals
Users purchase products through electronic commerce
• Many needs for security exist for distributed systems
for data secrecy and personal privacy.
• Security is required for
o
o
Concealing the contents of messages: security and privacy
Identifying a remote user or other agent correctly: authentication
• New challenges:
o
o
Denial of service attack
Security of mobile code
1.3.4 Scalability
• Centralized systems often create bottlenecks as soon as
a certain number of users are reached.
• Distributed systems can be built in a way that these
bottlenecks are avoided.
• Adaptation of distributed systems to
o accommodate more users
o respond faster (this is the hard one)
• Usually done by adding more and/or faster processors
to accommodate new users.
• Components should not need to be changed when
scale of a system increases.
• Design components to be scalable
Growth of the Internet
(computers and web servers)
1.3.5 Failure Handling (Fault
Tolerance)
• Hardware, software and networks fail!
• They fail either because of software errors, failures in the
supporting infrastructure (power supplyor air-conditioning),
mis-use of their users or just because of aging hardware.
• The average life time of hard disks are between two and five
years, much less than the average life-time of a distributed
system.
• Distributed systems must maintain availability even at low
levels of hardware/software/network reliability.
• Fault tolerance is achieved by
o recovery
o redundancy (hardware, software and data)- to decreases the time that
isneeded after a failure to bring a system up again.
1.3.6 Concurrency
• Components in distributed systems are executed in
concurrent processes. There may be many different people
at different teller machines. Likewise, there are many
different users working in a local area network.
• Components access and update shared resources (e.g.
variables, databases, device drivers). There may be many
different people at different teller machines. Likewise, there
are many different users working in a local area network.
• Integrity of the system may be violated if concurrent
updates are not coordinated.
o Lost updates
o Inconsistent analysis
o I.e. debit balance/ credit balance
1.3.6 Concurrency
• As an example for a lost update, consider that you withdraw 50 dollars. This requires the bank´s
account database to compute
• If a clerk in the bank credits a check of 100 dollars the following computation has to be done:
• If these two modifications to your account are done concurrently the integrity of the account
data may be violated in two ways:
1. your debit may not be recorded (bad luck for the bank) if the
schedule is (Op1, Op3, Op2, Op4).
2. the credit of your check may not be recorded (bad luck for you) if
the schedule is (Op3, Op1, Op4, Op2).
• These situations have by all means to be avoided.
• Concurrency control facilities (such as locking) are needed in almost any concurrent system.
1.3.7 Transparency
• Transparency is defined as the concealment from the user and the
application programmer of the separation of components in a
distributed system.
• Distributed systems should be perceived by users and application
programmers as a whole rather than as a collection of cooperating
components.
• For the user point of view: the complexity of distributed systems should
be hidden from their users. They should not have to be aware whether
the system they are using is centralized or distributed. Thus, it is
transparent for the user that s/he is using a distributed system.
• To make life easier for an application programmer, s/he should also not
have to be aware that s/he is using distributed components.
• For the administrator point of view: this is not true. For them, it may
well be important (e.g. during load balancing) to know which
component resides on which machine.
1.3.7 Transparency
• According To The Advanced Network Systems
Architecture (ANSA) and International Organization for
Standardization’s Reference Model for Open
Distributed Processing (RM-ODP) there is eight forms of
transparency:
o
o
o
o
o
o
o
o
Access transparency
Location transparency
Concurrency transparency
Replication transparency
Failure transparency
Mobility transparency
Performance transparency
Scaling transparency
1.3.7.1 Access Transparency
• Access transparency enables local and remote resources to
be accessed using identical operations.
• Examples of access transparency are:
• File system operations in NFS: Users of UNIX NFS can use the
same commands for copying, moving and deleting files
regardless whether the accessed files are local or remote. Or
Application programmers can use the same library calls to
manipulate files on NFS
• Navigation in the Web: Users of a web browser can navigate to
another page by clicking on a hyperlink, regardless whether the
hyperlink leads to a local or a remote page.
• SQL Queries: Programmers of a database application can use
the same SQL commands regardless whether they are
accessing a local or a remote database in a distributed
relational database management system.
1.3.7.2 Location Transparency
• Location transparency enables resources to be accessed without
knowledge of their physical or network location
• Examples of location transparency are:
• File system operations in NFS: Users of the network file system can
access files by name and do not need to know whether the file
resides on a local or a remote disk.
• Pages in the Web: Users of a Web browser need not be aware where
the page physically resides. They can access initially pages by a URL
and then can navigate further by URLs that are embedded in the web
page.
• Tables in distributed databases: Programmers of a relational
database application do not need to worry where the tables
physically reside. They can access tables by table name and need not
worry about where the table is physically located. Their local
database monitor will translate the names into physical location and
have remote monitors transferring tables if a remote table should be
accessed.
1.3.7.3 Concurrency Transparency
• Enables several processes to operate concurrently using shared
resources without interference between them.
• Examples of location transparency are:
• NFS: Multiple users can access and update files on the same file
system and they do not know about each other.
Concurrency is, however, not transparent for an application
programmer using the file system. To avoid lost updates and
inconsistent analysis, s/he has to explicitly lock files.
• Automatic teller machine network: Users of an ATM need not be
aware of the fact that other customers are using tellers at the same
time and that bank clerks may be concurrently manipulating account
balances.
• Database management system: Programmers of relational database
applications typically need not worry about concurrency, but integrity
against concurrent updates is typically preserved by the database
management system (e.g. by two-phase locking).
1.3.7.4 Replication Transparency
• Enables multiple instances of information objects to be used to increase
reliability and performance without knowledge of the replicas by users or
application programs.
• Replication is the duplication of data on other hosts.
• Replication is used to increase the reliability of data accesses as well as the
performance with which data is accessed and updated.
• Replication transparency denotes the fact that neither users nor application
programmers have to be aware about the replication of data.
• Examples of location transparency are:
• Distributed DBMS: Tables in a distributed relational database may be replicated.
However neither users, nor application programmers are aware that the tables are
replicated and updates have to be propagated to the other replicas as well.
• Mirroring Web Pages.: Often Web pages are replicated to increase performance of
their retrieval and to have them available also in the presence of network failures. The
SuperJanet gateway, i.e., replicates pages from the US that are frequently accessed.
Replication, however, is transparent for both Web surfers and Web page designers. A
Web surfer does not see that the page is not being brought over the Atlantic (S/he may
be surprised by the speed, however). A Web page designer can still refer to the US URL
and need not take the mirror site into account.
1.3.7.5 Failure Transparency
• enables the concealment of faults, allowing users and
application programs to complete their tasks despite the failure
of hardware or software components..
• Failure transparency is rather difficult to achieve. It involves
complete fault recovery.
• Examples of location transparency are:
• Database Management System: consider the distributed database
again. If the database has kept local replicates of remote data, users
can continue to use the database, even though the remote data
monitors have failed. The local data monitor has to detect the failure
of remote monitors. Updates of local data then have to be buffered in
the local replicate, inconsistencies have to be temporarily tolerated
(as multiple sites may temporarily buffer updates). After the remote
monitor has come up again, the buffered updates have to be
incorporated into the remote databases and inconsistencies (if any)
have to be reconciled..
1.3.7.6 Mobility Transparency
• Allows the movement of resources and clients within a system
without affecting the operation of users or programs.
• Migration denotes the fact that software and/or data is moved to
other processors.
• Migration is transparent to users and application programmers if
they do not have to be aware of the fact that software and/or
data has moved.
• Migration transparency is dependent on location transparency.
• Examples of location transparency are:
• NFS: If UP decides to move file systems of the NFS to a different disk,
you will not recognize that.
• Web Pages: If UP moves the department Web site to a different
location in the file system, you will not recognize that because the
URL http://www.up.edu.ps will be interpreted by the local http
daemon.
1.3.7.7 Performance Transparency
• Allows the system to be reconfigured to improve performance
as loads vary.
• Performance Transparency denotes the fact that users and
application programmers are not aware as to how the
performance that a distributes system has is actually achieved.
• Examples of location transparency are:
• Distributed make : There is a variant of make that is capable of
performing jobs (e.g. compiling a source module) on remote
machines. Then complex systems can be compiled much quicker. It
not only considers the different processors and their capabilities, but
also their actual load. If it can choose from a set of processors, it will
delegate it to the fastest processor that has the lowest load. In this
way it achieves an even better performance. Programmers using
make, however, do not see or choose which machine performs which
job. The way how the actual performance is achieved is transparent
for them.
1.3.7.8 Scaling Transparency
• Allows the system and applications to expand in scale without change
to the system structure or the application algorithms.
• Scalability denotes the fact that the distributed system can be adjusted
to accommodate a growing load / number of users.
• Scaling the distributed system up is transparent if users and application
programmers do not have to be changed.
• Examples of location transparency are:
• World-Wide-Web: New Web sites can be added to the Internet, thus scaling
the Internet up without existing sites having to change their set-up. Or new
network connections can be added in the Internet or existing connections are
replaced with higher bandwidth connections to improve throughput. Existing
Web sites do not have to be changed to benefit from this improvement.
• Distributed Database: new hosts can be added to accommodate parts of the
database. The allocation tables maintained by database monitors will have to
be adjusted. Existing database schemas and applications, however, need not
be changed.
1.3.7.8 Scaling Transparency
• Allows the system and applications to expand in scale without change
to the system structure or the application algorithms.
• Scalability denotes the fact that the distributed system can be adjusted
to accommodate a growing load / number of users.
• Scaling the distributed system up is transparent if users and application
programmers do not have to be changed.
• Examples of location transparency are:
• World-Wide-Web: New Web sites can be added to the Internet, thus scaling
the Internet up without existing sites having to change their set-up. Or new
network connections can be added in the Internet or existing connections are
replaced with higher bandwidth connections to improve throughput. Existing
Web sites do not have to be changed to benefit from this improvement.
• Distributed Database: new hosts can be added to accommodate parts of the
database. The allocation tables maintained by database monitors will have to
be adjusted. Existing database schemas and applications, however, need not
be changed.
1.3.8 Quality of service (QoS)
• QoS of system refers to the quality aspects (non-functional
properties) that are satisfactory to some user.
• The requirements include performance and network related
QoS attributes such as reliability, scalability, exception
handling, accuracy, integrity, accessibility, availability,
interoperability security and others.
• e.g., Web Service QoS
1.3.8 Quality of service (QoS)
Web service QoS attributes description
QoS attribute
Description
Response time indicates the delay time in seconds that
Response Time
measure the time between sending the requests and receiving
the results.
Cost
Availability
Cost indicates the prices or the amount of money the service
consumer should pay to use or invoke the services.
Availability
indicates
the
probabilities
of
the
service
availability.
Reliability indicates the quality of the Web service in terms of
Reliability
functionality that is determining the correctness and
consistency of the Web service functionality.
Reputation
Reputation indicates the trustworthiness.
1.4 Basic Design issues
• Software structure
• Tiered architecture
• System architecture
1.4.1 Software Structure
1.4.1 Software Structure
• A platform for distributed systems and applications consists of the
lowest-level hardware and software layers. These low-level layers
provide services to the layers above them, which are implemented
independently in each computer, bringing the system’s programming
interface up to a level that facilitates communication and coordination
between processes.
• Middleware is a a layer of software whose purpose is to mask
heterogeneity and to provide a convenient programming model to
application programmers.
• Middleware is represented by processes or objects in a set of computers
that interact with each other to implement communication and
resource-sharing support for distributed applications.
• Support RMI; communication between a group of processes;
notification of events; he replication of shared data objects and so on..
1.4.2 Tiered Architecture
1.4.2 Tiered Architecture
• Tiered architectures are complementary to layering. Whereas layering
deals with the vertical organization of services into layers of abstraction,
tiering is a technique to organize functionality of a given layer and place
this functionality into appropriate servers and, as a secondary
consideration, on to physical nodes.
• the presentation logic, which is concerned with handling user
interaction and updating the view of the application as presented to the
user;
• the application logic, which is concerned with the detailed applicationspecific processing associated with the application (also referred to as
the business logic, although the concept is not limited only to business
applications);
• the data logic, which is concerned with the persistent storage of the
application, typically in a database management system.
1.4.3 System Architectures
• Client-server
• Peer-to-peer
• Services provided by multiple servers
• Proxy servers and caches
• Mobile code and mobile agents
• Thin clients and mobile devices
1.4.3.1 Client-server
1.4.3.1 Client-server
• This is the architecture that is most often cited when distributed
systems are discussed.
• It is historically the most important and remains the most widely
employed.
• client processes interact with individual server processes to access the
shared resources.
• Servers may in turn be clients of other servers, e.g., a web server is
often a client of a local file server that manages the files in which the
web pages are stored. Web servers and most other Internet services are
clients of the DNS service, which translates Internet domain names to
network addresses.
1.4.3.2 Peer-to-peer
1.4.3.2 Peer-to-peer
• In this architecture all of the processes involved in a task or activity play
similar roles, interacting cooperatively as peers without any distinction
between client and server processes or the computers on which they
run.
• All participating processes run the same program and offer the same set
of interfaces to each other.
• Week scalability: The centralization of service provision and
management implied by placing a service at a single address does not
scale well beyond the capacity of the computer that hosts the service
and the bandwidth of its network connections.
• The hardware capacity and operating system functionality of today’s
desktop computers exceeds that of yesterday’s servers
• The aim of the peer-to-peer architecture is to exploit the resources
(both data and hardware) in a large number of participating computers
for the fulfilment of a given task or activity.
• Example: BitTorrent file-sharing system
1.4.3.2 Peer-to-peer
• Applications are composed of large numbers of peer processes running
on separate computers and the pattern of communication between
them depends entirely on application requirements.
• A large number of data objects are shared, an individual computer holds
only a small part of the application database, and the storage,
processing and communication loads for access to objects are
distributed across many computers and network links.
• Each object is replicated in several computers to further distribute the
load and to provide resilience in the event of disconnection of individual
computers.
• The need to place individual objects and retrieve them and to maintain
replicas amongst many computers renders this architecture
substantially more complex than the client-server architecture.
1.4.3.3 Services provided by
multiple servers
1.4.3.3 Services provided by
multiple servers
• Services may be implemented as several server processes in separate
host computers interacting as necessary to provide a service to client
processes.
• The servers may partition the set of objects on which the service is
based and distribute those objects between themselves.
Example: The Web provides a common example of partitioned data in
which each web server manages its own set of resources. A user can
employ a browser to access a resource at any one of the servers.
• They may maintain replicated copies of them on several hosts.
Example : Sun Network Information Service (NIS), is used to enable all the
computers on a LAN to access the same user authentication data when users
log in
1.4.3.4 Web Proxy Server
1.4.3.4 Web Proxy Server
• Web proxy servers provide a shared cache of web resources for the client
machines at a site or across several sites.
• Caching: A cache is a store of recently used data objects that is closer to one
client or a particular set of clients than the objects themselves.
• When a new object is received from a server it is added to the local cache store,
replacing some existing objects if necessary.
• Web browsers maintain a cache of recently visited web pages and other web
resources in the client’s local file system, using a special HTTP request to check
with the original server that cached pages are up-todate before displaying them
• The purpose of proxy servers is to increase the availability and performance of
the service by reducing the load on the wide area network and web servers.
• The purpose of proxy servers is to increase the availability and performance of
the service by reducing the load on the wide area network and web servers.
• Proxy servers can take on other roles; for example, they may be used to access
remote web servers through a firewall.
1.4.3.5 Web Applets
1.4.3.5 Web Applets
• Applets are a well-known and widely used example of mobile code – the
user running a browser selects a link to an applet whose code is stored
on a web server; the code is downloaded to the browser and runs there.
• advantage of running the downloaded code locally is that it can give
good interactive response since it does not suffer from the delays or
variability of bandwidth associated with network communication.
• Mobile code is a potential security threat to the local resources in the
destination computer. Therefore browsers give applets limited access to
local resources.
1.4.3.6 Thin Clients and Compute
Servers
1.4.3.6 Thin Clients and Compute
Servers
• The trend in distributed computing is towards moving complexity away from the
end-user device towards services in the Internet.
• This is most apparent in the move towards cloud computing and tiered
architectures.
• It enables access to sophisticated networked services, with few assumptions or
demands on the client device.
• The advantage of this approach is that potentially simple local devices
(including, for example, smart phones,..) can be significantly enhanced with a
plethora of networked services and capabilities.
• The Disadvantage of the thin client architecture is in highly interactive graphical
activities such as CAD and image processing, where the delays experienced by
users are increased to unacceptable levels by the need to transfer image and
vector information between the thin client and the application process, due to
both network and operating system latencies.
1.5 Summary
• Definitions of distributed systems and comparisons
to centralized systems.
• The characteristics of distributed systems.
• The eight forms of transparency.
• The basic design issues.
• Read Chapter 1 and Chapter 2 of the textbook.