CIS6930: Advanced Topics in Networking
• Storage area network and system area network
(both abbreviated SAN)
– What are they?
– Network requirements
– Hardware/software issues
– References:
• Ulf Troppens, Rainer Erkens, and Wolfgang Muller, “Storage
Networks Explained: Basics and Application of Fibre Channel
SAN, NAS, iSCSI and InfiniBand”, John Wiley & Sons, 2004.
• W. J. Dally and B. Towles, “Principles and Practices of
Interconnection Networks”, Morgan Kaufmann, 2004.
• Ajay V. Bhatt, “Creating a Third Generation I/O Interconnect,”
available at http://www.express-lane.org
• Storage area network (SAN):
– Server-centric IT architecture: storage devices
exist only in relation to the servers they are attached to
– Storage-centric IT architecture: SCSI cables
are replaced by a network (storage is now
independent of servers).
• Storage area network (SAN) requirement:
– Serial transmission for high speed and long
distance
– Low transmission errors
– Low delay of transmitted data
• Accessing networked storage should feel like using a local disk
• Low delay is a relative term:
– The disk subsystem itself has a latency of around 1ms – 10ms.
– The communication protocol should place little
load on the CPU.
• Current Storage area network (SAN)
technology (IBM):
– Fibre Channel
– TCP/IP + Gigabit Ethernet (iSCSI)
– InfiniBand
• System area network: a network with high
bandwidth and low latency that serves as
a connection between computers in a
distributed computer system.
• Why system area network:
– Historically, a system area network came with a
particular parallel machine (supercomputer, e.g. Cray
T3D, Cray T3E, SGI Origin 2000, IBM SP, Thinking
Machines CM-5, Intel Paragon)
• The network is very expensive due to low volume
• CPU is two generations behind
– A more cost-effective way to build these systems is to
decouple the processor technology from the networking
technology.
– This makes it possible to form clusters of workstations with
off-the-shelf system area network technology, which are cheaper
than traditional supercomputers.
– Current system area networks:
• Myrinet, Quadrics, Infiniband
• System area network requirement:
– Low latency and high bandwidth at the
application level.
• Not just at the hardware level
• Not just at the system level
– Implication:
• Hardware, network interface, software messaging
layer should work together to achieve the goal.
– Infiniband is designed as both storage area
network and system area network.
• Hardware issues:
– High speed links:
• Infiniband: 2.5Gbps = 250MBps, 10Gbps = 1GBps,
30Gbps = 3GBps
• Fibre channel: 100MBps, 200MBps, 400MBps,
1GBps.
• Myrinet: up to 9.6Gbps
• As a reference, the PCI bus: roughly 100MBps
– NIC may need to attach to the memory bridge
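The Infiniband pairs above (signalling rate versus byte rate) are consistent with 8b/10b line coding, in which every 8 data bits travel as 10 bits on the wire. That coding ratio, the 1x/4x/12x link-width labels, and the helper name `payload_mbytes_per_sec` are assumptions added here for illustration; they are not stated on the slide. A minimal sketch of the arithmetic:

```python
def payload_mbytes_per_sec(signal_gbps, data_bits=8, line_bits=10):
    """Usable payload rate in MB/s for a serial link with block line coding.

    Assumes data_bits of payload are carried in every line_bits on the wire
    (8b/10b by default, which is an assumption, not taken from the slide).
    """
    payload_bits_per_sec = signal_gbps * 1e9 * data_bits / line_bits
    return payload_bits_per_sec / 8 / 1e6  # bits -> bytes -> megabytes

# Link widths are labelled 1x/4x/12x here purely for readability.
for width, gbps in (("1x", 2.5), ("4x", 10.0), ("12x", 30.0)):
    rate = payload_mbytes_per_sec(gbps)
    print(f"Infiniband {width}: {gbps} Gbps -> {rate:.0f} MBps")
```

Under that assumption, 2.5 Gbps works out to 250 MBps, 10 Gbps to 1000 MBps, and 30 Gbps to 3000 MBps.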
• A typical PC: [figure not reproduced]
• A workstation connected to a system area
network: [figure not reproduced]
• When the number of end points is large,
multiple switches will be needed.
• Topology
• Switching
• Routing
• Topology
– Static arrangement of channels and nodes in an
interconnection network
– Trade-off between cost and performance
• Cost: the number and complexity of chips, density
and length of the interconnections, etc.
• Performance:
– Bandwidth and latency: also depend on factors other
than topology
– Topology performance metrics: Bisection bandwidth,
diameter, nodal degree, channel load
• A cut of a network is a set of channels whose removal
partitions the set of all nodes into two disjoint sets.
• A bisection of a network is a cut that partitions the
network nodes into two roughly equal halves.
• The bisection bandwidth of a network is the
minimum bandwidth over all bisections of the
network.
• The diameter of a network is the largest minimal
hop count over all pairs of nodes.
• Under a particular traffic pattern, the channel that
carries the largest fraction of traffic determines the
maximum channel load of the topology.
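The diameter and bisection definitions above can be checked on a small example. The sketch below is illustrative only: the function names and the 8-node ring are made up for this example, links are treated as undirected, and the bisection is found by brute force over all even splits, which is practical only for very small networks. Multiplying the link count by the per-link bandwidth would give the bisection bandwidth.

```python
from collections import deque
from itertools import combinations

def hop_counts(adj, src):
    """Breadth-first search: minimal hop count from src to every node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Largest minimal hop count over all pairs of nodes."""
    return max(max(hop_counts(adj, s).values()) for s in adj)

def bisection_links(adj):
    """Minimum number of links cut over all splits into two halves."""
    nodes = sorted(adj)
    half = len(nodes) // 2
    best = None
    for group in combinations(nodes, half):
        inside = set(group)
        # Count links with exactly one endpoint inside the chosen half.
        cut = sum(1 for u in inside for v in adj[u] if v not in inside)
        best = cut if best is None else min(best, cut)
    return best

# Hypothetical 8-node bidirectional ring.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
print(diameter(ring), bisection_links(ring))
```

For the 8-node ring this prints `4 2`: the farthest node is four hops away, and any split into two halves must cut at least two links.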
• Example topologies:
– Regular or irregular
– Regular topologies are mostly derived from two
main families: butterflies (k-ary n-flies) or tori
(k-ary n-cubes)
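As a concrete instance of the torus family, a k-ary n-cube has n dimensions with k nodes along each dimension and wraparound links. The following is a minimal sketch of how such a topology could be generated; the function name `torus` and the coordinate-tuple node labels are choices made for this example.

```python
from itertools import product

def torus(k, n):
    """Adjacency lists of a k-ary n-cube (n dimensions, k nodes per dimension)."""
    adj = {}
    for node in product(range(k), repeat=n):
        neighbors = set()
        for dim in range(n):
            for step in (-1, 1):
                coord = list(node)
                coord[dim] = (coord[dim] + step) % k  # wraparound link
                neighbors.add(tuple(coord))
        neighbors.discard(node)  # only relevant in the degenerate case k = 1
        adj[node] = sorted(neighbors)
    return adj

net = torus(4, 2)
print(len(net), len(net[(0, 0)]))
```

For a 4-ary 2-cube this yields 16 nodes, each with 4 neighbors. With k = 2 the two directions in a dimension reach the same node, so the degree drops to n.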
• Switching: how a packet passes through a switch
– Message/packet/flit
• Traditional scheme: store-and-forward
– Time = H (S + P/B)
• Cut-through switch:
– Forward to the next link after the header flit is
received. Stop only when the next hop buffer is
not available.
– Time = H S + P/B. When H S << P/B, the time
is nearly independent of the number of hops.
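The two latency formulas can be compared numerically. The slides do not define the symbols, so this sketch assumes the usual reading: H is the hop count, S the per-switch delay, P the packet size, and B the link bandwidth. The numeric values below are illustrative, not measurements.

```python
def store_and_forward(hops, switch_delay, packet_bits, bandwidth_bps):
    """Time = H (S + P/B): the whole packet is received at every hop."""
    return hops * (switch_delay + packet_bits / bandwidth_bps)

def cut_through(hops, switch_delay, packet_bits, bandwidth_bps):
    """Time = H S + P/B: the packet is serialized only once."""
    return hops * switch_delay + packet_bits / bandwidth_bps

P = 8 * 1024   # 1 KB packet in bits (illustrative)
B = 1e9        # 1 Gbps link (illustrative)
S = 100e-9     # 100 ns per switch (illustrative)

for H in (1, 4, 16):
    saf_us = store_and_forward(H, S, P, B) * 1e6
    ct_us = cut_through(H, S, P, B) * 1e6
    print(f"H={H:2d}  store-and-forward={saf_us:7.2f} us  cut-through={ct_us:6.2f} us")
```

With these values, going from 1 to 16 hops raises the store-and-forward time from about 8.3 us to about 132.7 us, while the cut-through time rises only to about 9.8 us.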
• Wormhole routing:
– Cut-through switches still allocate buffer to
packets. May require a large amount of buffers
– Wormhole routing only allocates buffer for one
flit for each packet.
– Latency is the same as cut-through switching.
– When the packet is blocked, the whole flit “train”
is blocked, occupying links.
• Solution: add more virtual channels.
• The deadlock problem in wormhole routing:
– Needs a deadlock-free routing scheme to select the
right path
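One standard way to reason about this deadlock problem is the channel dependency graph: a packet holding one channel while requesting the next creates a dependency between them, and a cycle of such dependencies means packets can wait on each other forever. The sketch below is an illustration under that model; the function names and the two small route sets are made up for this example.

```python
def channel_dependencies(routes):
    """Map each channel (u, v) to the channels a packet may request while holding it."""
    deps = {}
    for path in routes:
        channels = list(zip(path, path[1:]))
        for ch in channels:
            deps.setdefault(ch, set())
        for held, wanted in zip(channels, channels[1:]):
            deps[held].add(wanted)
    return deps

def has_cycle(deps):
    """Depth-first search for a cycle in the dependency graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {ch: WHITE for ch in deps}

    def visit(ch):
        color[ch] = GREY
        for nxt in deps[ch]:
            if color[nxt] == GREY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[ch] = BLACK
        return False

    return any(color[ch] == WHITE and visit(ch) for ch in deps)

# 4-node ring: every node sends two hops clockwise, using the wraparound links.
ring_routes = [[i, (i + 1) % 4, (i + 2) % 4] for i in range(4)]
# Same nodes, but no route crosses the wraparound link.
line_routes = [[0, 1, 2], [1, 2, 3]]

print(has_cycle(channel_dependencies(ring_routes)))
print(has_cycle(channel_dependencies(line_routes)))
```

The ring routes produce a cyclic dependency (the first line printed is `True`), while the routes that avoid the wraparound link do not (`False`).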
• Cut-through switches and wormhole switches
are widely used in system area networks
– Routing in such systems is an issue.
– Shortest path routing may result in deadlock.
– Deadlock-free routing:
• Basic idea: assign a fixed priority to each channel and use
channels only in order of increasing priority.
• Example: up/down routing
• Up/down routing:
– Select a node as the root
– Build a spanning tree from the root
– Nodes are partitioned into layers based on the position
in the spanning tree
– The channel from a lower layer node to a higher layer
node is the up link, the channel from a higher layer
node to a lower layer node is a down link, channels
between nodes in the same layer are marked as up or
down link based on the node number
– In a valid route, an up channel cannot follow a down
channel.
– There exists at least one valid path between each pair of
nodes.
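The up/down steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the spanning tree is built breadth-first, that a "higher layer" means closer to the root, and that ties within a layer are broken by treating the smaller node number as higher. The function names and the 5-switch network are hypothetical.

```python
from collections import deque

def levels_from_root(adj, root):
    """Layer of each node in a breadth-first spanning tree (root is layer 0)."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level

def is_up(level, u, v):
    """True if channel u -> v heads toward the root (ties broken by node number)."""
    return (level[v], v) < (level[u], u)

def up_down_route(adj, level, src, dst):
    """Shortest route that never takes an up channel after a down channel."""
    start = (src, False)  # state: (node, has a down channel been used?)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        node, went_down = state
        if node == dst:
            path = []
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return path[::-1]
        for nxt in adj[node]:
            up = is_up(level, node, nxt)
            if went_down and up:
                continue  # an up channel after a down channel is not allowed
            nxt_state = (nxt, went_down or not up)
            if nxt_state not in parent:
                parent[nxt_state] = state
                queue.append(nxt_state)
    return None

# Hypothetical 5-switch network with switch 0 as the root.
adj = {0: [1, 2], 1: [0, 3, 4], 2: [0, 4], 3: [1, 4], 4: [1, 2, 3]}
level = levels_from_root(adj, root=0)
print(up_down_route(adj, level, 3, 2))
```

In this example the call prints `[3, 1, 0, 2]`. The two-hop path 3, 4, 2 is shorter, but it takes the down channel 3 to 4 followed by the up channel 4 to 2, so it is rejected. This shows that up/down routes are not always minimal and tend to pass through the root.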
• Problems with deadlock free routing:
– Load balancing is a problem: traffic is not
evenly distributed
– The non-adaptive version of the deadlock-free
routing scheme is also a problem
• How to map the routes in order to get good
performance (metrics: maximum channel load?)
• More on the problem to be discussed later.
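To make the maximum channel load metric concrete, the sketch below counts how many routes cross each directed channel and reports the largest count. It assumes every route carries one unit of traffic; the function name and both route sets are hypothetical and chosen only to show that the choice of routes changes the metric.

```python
from collections import Counter

def channel_loads(routes):
    """Count how many routes use each directed channel (u, v)."""
    load = Counter()
    for path in routes:
        for channel in zip(path, path[1:]):
            load[channel] += 1
    return load

# Two hypothetical ways to route the same three source/destination pairs.
via_root = [[3, 1, 0, 2], [4, 1, 0, 2], [1, 0, 2]]
spread_out = [[3, 1, 0, 2], [4, 2], [1, 0, 2]]

for name, routes in (("via root", via_root), ("spread out", spread_out)):
    print(name, max(channel_loads(routes).values()))
```

Sending everything through switch 0 gives a maximum channel load of 3, whereas moving one flow onto the direct link from 4 to 2 lowers it to 2.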
• Hardware/software codesign and software
API issues:
– What functionality should be implemented in
the hardware?
• E.g. adaptive routing may imply out-of-order packet delivery
– Chien’04 paper gives good answers to some of
these questions.