Session-12 - Lyle School of Engineering
Download
Report
Transcript Session-12 - Lyle School of Engineering
Advanced Computer
Architecture
CSE 8383
April 24, 2008
Session 12
Computer Science and Engineering
Contents
Message Passing Systems (Chapters 5 & 7)
Communication Patterns
Network Computing
Client/Server System
Clusters
Grid
Interconnection Networks
Computer Science and Engineering
Message Passing Mechanisms
Message Format
Message arbitrary number of fixed length
packets
Packet basic unit containing destination
address. Sequence number is needed
A packet can further be divided into flits
(flow control digits)
Routing and sequence occupy header flit
Computer Science and Engineering
Message, Packets, Flits
Message
Packet
Destination
Data flit
Sequence
Computer Science and Engineering
Store and Forward Routing
Packets are the basic units of information
flow
Each node uses a packet buffer
A packet is transferred from S to D
through a sequence of intermediate
nodes
Channel and buffer must be available
Computer Science and Engineering
Wormhole Routing
Flits are the basic units of information flow
Each node uses a flit buffer
Flits are transferred from S to D through a
sequence of intermediate routers in order
(Pipeline)
Can be visualized as a railroad train
Flits from different packets cannot be mixed
up
Computer Science and Engineering
Latency Analysis
L packet length (in bits)
W Channel bandwidth
(bits/sec)
D Distance (number of hops)
F flit length (in bits)
Computer Science and Engineering
Store and Forward Latency
L
W
L
W
D
L
W
TSF
Computer Science and Engineering
WH Latency
L
W
D
TW
Computer Science and Engineering
Latency Analysis
L packet length (in bits)
W Channel bandwidth (bits/sec)
D Distance (number of hops)
F flit length (in bits)
TSF = D * L/W
TWH = L/W + D* F/W L/W if L>>F
(independent of D)
Computer Science and Engineering
Communication Patterns
Point to Point 1 - 1
Multicast 1 - n
Broadcast 1 - all
Conference n - n
Computer Science and Engineering
Routing potential problems
Deadlock:
When 2 messages, each is holding the resources required
by the other in order to move, both messages will be
blocked (cyclic dependency for resources)
Straightforward solution (but inefficient) is rerouting
Another solution is avoidance of occurrence of deadlock
using a strict monotonic order of network resources
Channel dependency graph (CDG) is a technique for
developing a deadlock-free routing algorithm.
Computer Science and Engineering
A 4-node network and its CDGs
c1
1
0
c4 c6
3
c5
c7
c1
c2
c3
c4
c5
c6
c7
c8
c8 c2
2
c3
(a) A 4-node network
c1
c5
(b) Channel dependency graph (CDG)
c2
c6
c3
c7
c4
c8
(c) CDG for a deadlock-free version of the network
Computer Science and Engineering
Livelock:
A message goes around the network and never
reaches its destination
It results from using adaptive routing algorithms
with dynamic injection, where nodes inject their
messages in the network at arbitrary times
Policies to avoid livelock are based on assigning a
priority to a message injected to the network:
Messages are routed according to their priorities
Once a message is injected, only a finite number of
messages will be injected with higher or equal priority.
Computer Science and Engineering
Starvation:
A node suffers from starvation if it has a message
to inject into the network but is never allowed to
do so.
The simplest policy to avoid starvation is to allow
each node to have an injection queue that
competes with the queues of the incoming links
to the same node.
The main disadvantage is that a node with a high
message injection rate can slow down all the other
nodes in the network.
Computer Science and Engineering
Routing Efficiency
Two Parameters
Channel Traffic (number of channels
used to deliver the message involved)
Communication Latency (distance)
Computer Science and Engineering
Multicast on a mesh (5 unicasts)
Traffic ?
Latency ?
Computer Science and Engineering
Multicast on a mesh (multicast pattern 1)
Traffic ?
Latency ?
Computer Science and Engineering
Multicast on a mesh (multicast pattern 2)
Traffic ?
Latency ?
Computer Science and Engineering
Broadcast (tree structure)
3
2
3
4
2
1
2
3
1
2
1
Computer Science and Engineering
Message Passing in PVM (Revisit)
Sending Task
User
application
1
3
Daemon
User
application
Library
4
2
Receiving Task
5
Library
8
6
7
Daemon
Computer Science and Engineering
Standard PVM asynchronous
communication
A sending task issues a send command (point 1)
The message is transferred to the daemon
(point 2)
Control is returned to the user application
(points 3 & 4)
The daemon will transmit the message on the
physical wire sometime after returning control
to the user application (point 3)
Computer Science and Engineering
Standard PVM asynchronous
communication (cont.)
The receiving task issues a receive command
(point 5) at some other time
In the case of a blocking receive, the receiving
task blocks on the daemon waiting for a
message (point 6). After the message arrives,
control is returned to the user application
(points 7 & 8)
In the case of a non-blocking receive, control is
returned to the user application immediately
(points 7 & 8)
Computer Science and Engineering
Send (3 steps)
1.
2.
3.
A send buffer must be initialized
The message is packed into the buffer
The completed message is sent to its
destination(s)
Computer Science and Engineering
Receive (2 steps)
1.
2.
The message is received
The received items are
unpacked
Computer Science and Engineering
Message Buffers
Buffer Creation (before packing)
Bufid = pvm_initsend(encoding_option)
Bufid = pvm_mkbuf(encoding_option)
Encoding option
0
1
2
Meaning
XDR
No encoding
Leave data in place
Computer Science and Engineering
Message Buffers (cont.)
Data Packing
pvm_pk*()
pvm_pkstr() – one argument
pvm_pkstr(“This is my data”);
Others – three arguments
1.
2.
3.
Pointer to the first item
Number of items to be packed
Stride
pvm_pkint(my_array, n, 1);
Packing functions can be called multiple
times to pack data into a single message
Computer Science and Engineering
Sending a message
Point to point (one receiver)
info = pvm_send(tid, tag)
broadcast (multiple receivers)
info = pvm_mcast(tids, n, tag)
info = pvm_bcast(group_name, tag)
Pack and Send (one step)
info = pvm_psend(tid, tag, my_array, length, data type)
Computer Science and Engineering
Receiving a message
Blocking
bufid = pvm_recv(tid, tag)
-1 wild card in either tid or tag
Nonblocking
bufid = pvm_nrecv(tid, tag)
bufid = 0 (no message was received)
Timeout
bufid = pvm_trecv(tid, tag, timeout)
bufid = 0 (no message was received)
Computer Science and Engineering
Different Receive in PVM
Time
Funciton
is called
Pvm_recv()
wait
Pvm_nrecv()
Continue
execution
Time is
expired
Message
arrival
Pvm_trecv()
wait
Resume
execution
Resume
execution
Blocking
Non-blocking
Timeout
Computer Science and Engineering
Data unpacking
pvm_upk*()
pvm_upkstr() – one argument
pvm_upkstr(string);
Others – three arguments
1.
2.
3.
Pointer to the first item
Number of items to be unpacked
Stride
pvm_upkint(my_array, n, 1);
Computer Science and Engineering
Networks Computing
Four categories
WAN
MAN
LAN
SAN
Internet
TCP/IP
Computer Science and Engineering
Other Network technologies
Fast Ethernet and Gigabit Ethernet
The Fiber Distributed Data Interface (FDDI)
High-Performance Parallel Interface (HIPPI)
Asynchronous Transfer Mode (ATM)
Scalable Coherent Interface (SCI)
Computer Science and Engineering
10Gbps
SCI
HiPPI
1000Mbps
1000 Base T
ATM
FDDI
100Mbps
100 Base T
10Mbps
10 Base T
SAN
LAN
MAN
WAN
A representation of network technologies
Computer Science and Engineering
Client/Server Systems
Client
Client
Server
Server Threads
Interconnection
Network
Computer Science and Engineering
Sockets
Sockets are used to provide the capability of making
connections from one application running on one
machine to another running on a different machine.
Once a socket is created, it can be used to wait for an
incoming connection (passive socket) or can be used
to initiate connection (active socket).
Client
Serve
r
A Socket Connection
Computer Science and Engineering
A Client Server Framework for Parallel
Applications
Server 1
Server 2
Server 3
Slaves (Workers)
Server n
Interconnection
Network
Client
Master (Supervisor)
Computer Science and Engineering
Computer Clusters
Advances in commodity processors
and network technology
Network of PCs and workstations
connected via LAN or WAN forms a
Parallel System
Compete favorably
(cost/performance)
Computer Science and Engineering
Cluster Architecture
Programming Environment
Middleware
OS
M
C
P
OS
I/O
M
C
OS
I/O
P
Interconnection
Network
M
C
I/O
P
Home cluster
Computer Science and Engineering
Grids
Geographically
distributed platforms.
Internet
Dependable, consistent,
pervasive, and
inexpensive access to
high end computing.
Computer Science and Engineering
Interconnection Networks
Ethernet
A packet-switched LAN technology.
All hosts connected to an Ethernet receive every
transmission, making it possible to broadcast a packet
to all hosts at the same time.
Ethernet uses a distributed access control scheme
called Carrier Sense Multiple Access with Collision
Detect (CSMA/CD).
Each computer connected to an Ethernet network is
assigned a unique 48-bit address known as its
Ethernet address, also called the media access control
address, (MAC).
Computer Science and Engineering
Switches
A n1 x n2 switch consists of:
n1 input ports
n2 output ports
Links connecting each input to every output
Control logic to select a specific connection
Internal buffers
The connections between input ports and output ports
may be:
One-to-one (point-to-point)
One-to-many (multicast or broadcast)
Many-to-one: may cause conflicts at the output ports and
needs arbitration.
Computer Science and Engineering
When only one-to-one connections are allowed, the
switch is called crossbar.
An n x n crossbar switch can establish n! connections.
If we allow both one-to-one as well as one-to-many in
an n x n switch, the number of connections that can be
established is nn.
(We discussed this before, remember?)
Computer Science and Engineering
Routing can be achieved using 2 mechanisms:
Source-path: the entire path to the destination is stored in the
packet header at the source location.
Table-based: the switch must have a complete routing table
that determines the corresponding port for each destination.
5
0
6
Dest-id
Port 0
Port 4
Port 0
Port 4
Port 1
Port 5
Port 1
Port 5
Port 2
Port 6
Port 2
Port 6
Port 3
Port 7
id
6
Port 3
Port 7
Routing table
Source-path Routing versus Table-based Routing
Computer Science and Engineering
Myrinet Clos network
Myrinet is a high-performance, packet communication
and switching technology.
Myrinet switches are multiple-port components that
route a packet entering on an input channel of a port
to the output channel of the port selected by the
packet.
Computer Science and Engineering
Myrinet Clos network
Network Spine
Clos “Spreader” Network
Connects Spine (upper 8 switches) to Leaves (16 lower switches)
128 Hosts
128-host Clos Network using 16-port Myrinet Switch
Computer Science and Engineering
Myrinet Clos network
Network Spine
2 links each
64 Hosts
64-host Clos Network using 16-port Myrinet Switch (Each line represents 2 links)
Computer Science and Engineering
Myrinet Clos network
Network Spine
4 links each
32 hosts
32-host Clos Network using 16-port Myrinet Switch (Each line represents 4 links)
Computer Science and Engineering
The Quadrics network (QsNet)
Consists of 2 hardware building blocks
A programmable network interface called Elan:
connects the Quadrics network to a processing node containing
one or more CPUs
Elan provides substantial local processing power to implement
high-level message passing protocols (ex: MPI).
High-bandwidth, low-latency communication switch called
Elite:
QsNet connects Elite switches in a quaternary fat-tree topology.
Computer Science and Engineering
The Quadrics network (QsNet)
Processing Nodes
Computer Science and Engineering