
Reduced Communication Protocol for Clusters
Clunix Inc.
Donghyun Kim
2000.9
Introduction
■ Communication subsystem performance is determined by the following
• Transmission speed of the physical network
• I/O handling capability
• Overhead of the communication protocol
■ Communication over traditional protocols is the bottleneck of parallel systems
• Myrinet with TCP/IP is not fast.
• Fine-grained or communication-intensive applications perform poorly
Introduction – cont’d
■ A high proportion of applications do not need very complicated communication functions
• Shown by both practice and theoretical analysis
Overheads analysis of traditional protocols
■ Overheads of traditional protocols
• Context switching time
• Data copying time
  - Between user space and system space, and between adjacent protocol layers
• Time for data partitioning, reassembly, and analysis
• Time to transmit packet headers
• Time for routing, connection maintenance, traffic control, error detection, recovery, and buffer management
Overheads analysis of traditional protocols - cont’d

End-to-end latency L and bandwidth W modeling
• Assumptions: homogeneous, low network traffic

$$L = T(0) \text{ or } T(1), \qquad W = \frac{n_{\max}}{T(n_{\max})} \tag{1}$$

$$T(n) = T_0(n) + 2\left(\tau + \sum_{i=1}^{m} T_i(n)\right) \tag{2}$$

T(n): transmission time for n bytes
n_max: maximum packet length of the communication subsystem
m: number of protocol layers
T_i(n): processing time of the i-th protocol layer
(T_0(n): physical network transmission time)
Overheads analysis of traditional protocols - cont’d

$$n_i = n_{i+1} + \left(\left\lfloor \frac{n_{i+1}}{\pi_i} \right\rfloor + 1\right)\rho_i, \qquad 1 \le i < m \tag{3}$$

$$T_i(n) = \tau_i + \frac{n_{i+1}}{\omega} + \left\lfloor \frac{n_{i+1}}{\pi_i} \right\rfloor T_{i-1}(\pi_i + \rho_i) + T_{i-1}(n_{i+1} \bmod \pi_i) \tag{4}$$

$$T_0(n) = \frac{n}{\omega_0} \tag{5}$$

τ: context switching time
ω: memory bandwidth
ω_0: physical network transmission bandwidth
π_i: maximum packet length of the i-th layer
ρ_i: packet header length of the i-th layer
n_i: data length at the i-th layer
τ_i: calling expense of the i-th layer (routing, traffic control, error detection, buffer management, connection maintenance)
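The model can be tried out numerically. The sketch below evaluates Eqs. (1), (2) and (5) in Python. It rests on assumptions: the function names and all parameter values are hypothetical, and the per-layer time is simplified to a calling expense plus one memory copy, leaving out the fragmentation terms of Eq. (4). Units are bytes, seconds, and bytes per second.

```python
def physical_time(n, omega0):
    """Eq. (5): time to move n bytes over the physical network."""
    return n / omega0


def layer_time(n, tau_i, omega):
    """Simplified per-layer cost: calling expense plus one memory copy.

    Assumption: the fragmentation terms of Eq. (4) are left out.
    """
    return tau_i + n / omega


def transmission_time(n, tau, tau_layers, omega, omega0):
    """Eq. (2): physical time plus the factor-2 protocol cost."""
    protocol = sum(layer_time(n, tau_i, omega) for tau_i in tau_layers)
    return physical_time(n, omega0) + 2 * (tau + protocol)


def latency(tau, tau_layers, omega, omega0):
    """Eq. (1): latency taken as the time to send a 1-byte message."""
    return transmission_time(1, tau, tau_layers, omega, omega0)


def bandwidth(n_max, tau, tau_layers, omega, omega0):
    """Eq. (1): bandwidth in bytes/s for maximum-length packets."""
    return n_max / transmission_time(n_max, tau, tau_layers, omega, omega0)


# Hypothetical parameters, not measurements from the slides.
params = dict(tau=50e-6, tau_layers=[100e-6, 100e-6], omega=50e6, omega0=1.25e6)
print(f"L = {latency(**params) * 1e6:.1f} us")
print(f"W = {bandwidth(1500, **params) * 8 / 1e6:.2f} Mbps")
```

Even in this simplified form, the fixed per-layer costs dominate the latency for small messages, which is the effect the analysis points at.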
Overheads analysis of traditional protocols - cont’d
■ Analytical and testing results

Protocol layer | Analytical L (µs) | Analytical W (Mbps) | Testing L (µs) | Testing W (Mbps)
TCP            | 1350              | 8.5                 | 1450           | 8.6
UDP            | 1110              | 9.5                 | 1150           | 9.5
DLPI           | 450               | 10.0                | 650            | 10.0

■ Testing conclusions
• Using protocol layers above IP adds very large overhead
• Memory-to-memory copying cannot be neglected
  - If the transmission bandwidth is the same as the memory bandwidth, the data-copying term (n_{i+1}/ω) becomes a bigger problem
Design Strategies for RCP
• Support reliable, synchronous, and asynchronous communications
• Implement reliable broadcast and multicast directly on the physical layer
• Place the protocol below the IP layer
  - Above the physical or data link layer
• Avoid data copying as far as possible
• If possible, avoid buffer management by using hardware buffering
• Run the protocol entirely in user space
  - In the form of libraries
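The "avoid data copying" strategy can be illustrated with a user-space sketch. The Python below is not RCP code; it is a stand-in, and the function name `partition` and the sizes are hypothetical. It splits a message into packet-sized segments using `memoryview` slices, which reference the original buffer instead of copying it.

```python
def partition(buffer, max_len):
    """Yield consecutive views of at most max_len bytes over buffer.

    Slicing a memoryview references the original memory, so no payload
    bytes are copied while splitting.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    view = memoryview(buffer)
    for offset in range(0, len(view), max_len):
        yield view[offset:offset + max_len]


message = bytearray(b"abcdefghij")        # hypothetical 10-byte payload
segments = list(partition(message, 4))
print([bytes(s) for s in segments])        # bytes() copies only for display
```

Because each segment shares memory with `message`, a write to the buffer is visible through the segment, which is a quick way to confirm that no copy was made.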
Implementation of RCP
■ OSI-DLPI version
• Standard, physical-device-independent data link layer interface
  - The same program can be written for different machines and network devices
■ Myrinet version
■ Provides a user interface similar to TCP sockets
Implementation of RCP – cont’d
■ RCP supports unicast, broadcast, and multicast
■ RCP addressing
• Unique source/destination identified by hostname + port number
• Static address configuration
■ Supports heterogeneous machines
■ No connection maintenance or error detection
• Assumes that the underlying network is reliable
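Static addressing by hostname + port could look like the Python sketch below. Everything in it is an assumption for illustration: the table `STATIC_HOSTS`, the host names, the link-layer addresses, the 16-bit port range, and the function `resolve` are hypothetical and do not come from the RCP implementation.

```python
# Hypothetical static configuration: hostname -> link-layer address.
STATIC_HOSTS = {
    "node1": "02:00:00:00:00:01",
    "node2": "02:00:00:00:00:02",
}


def resolve(hostname, port, hosts=STATIC_HOSTS):
    """Map (hostname, port) to (6-byte link-layer address, port)."""
    if hostname not in hosts:
        raise KeyError(f"{hostname} is not in the static configuration")
    if not 0 <= port <= 0xFFFF:          # 16-bit port range is an assumption
        raise ValueError("port out of range")
    address = bytes.fromhex(hosts[hostname].replace(":", ""))
    return address, port


print(resolve("node2", 7))
```

With a fixed table there is no address resolution traffic at run time, which fits a protocol that sits directly above the data link layer.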
Implementation of RCP – cont’d
■ Sequencing control, traffic control
• Sliding-window algorithm + selective retransmission
• Window size is adjusted according to retransmission frequency
■ Fast-Adapt and Slow-Recover algorithm
• Very efficient traffic control
■ Data partitioning and packaging algorithm
• Almost no data copying; works in user space
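The slides do not spell out the Fast-Adapt and Slow-Recover rule. One possible reading is sketched below in Python: halve the window as soon as the retransmission rate of a round exceeds a threshold, and grow it by one packet per clean round. The class name, the threshold, and the halving and increment steps are all assumptions, not the RCP algorithm itself.

```python
class WindowController:
    """Adjusts a sliding-window size from the observed retransmission rate."""

    def __init__(self, initial=8, minimum=1, maximum=64, threshold=0.1):
        self.window = initial
        self.minimum = minimum
        self.maximum = maximum
        self.threshold = threshold      # tolerated retransmission rate

    def update(self, sent, retransmitted):
        """Update the window after a round in which `sent` packets went out."""
        rate = retransmitted / sent if sent else 0.0
        if rate > self.threshold:
            # Fast adapt: cut the window in half right away.
            self.window = max(self.minimum, self.window // 2)
        else:
            # Slow recover: grow by a single packet per clean round.
            self.window = min(self.maximum, self.window + 1)
        return self.window


controller = WindowController()
for sent, retransmitted in [(8, 4), (4, 0), (5, 0)]:   # hypothetical rounds
    print(controller.update(sent, retransmitted))
```

Shrinking quickly and growing slowly keeps the sender from repeatedly overrunning a congested receiver while still returning to full speed once losses stop.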
RCP Testing results
[Figures: Bandwidth (W) and Latency (L)]
Conclusions and future issues
■ RCP design considerations
• How to reduce the overheads
  - Over-complicated protocol processing
  - Context switching
  - Overhead of data copying
• How to use the transmission control functions supported by hardware
  - To reduce the protocol processing
■ Future work
• To guarantee the quality of the communication.