CA*net 4 International Grid Testbed


CA*net 4 International Grid
Testbed
Wade Hong
Carleton University
May 27, 2003
The Proposal
To develop and deploy a persistent experimental
infrastructure network testbed, in collaboration
with international participants, to advance the
realization and utilization of increasingly high
bandwidth-delay product networks, specifically as
offered bandwidth approaches 10 Gbps and beyond.
Motivation
• primary motivation was to provide an experimental
infrastructure for the developers of the user control interfaces
and frameworks for the CA*net 4 Lightpath Cross-Connect
devices to test end-to-end lightpath provisioning
• an equally important and complementary purpose for the
International Grid Testbed is to provide a sandbox for
investigating the effective utilization of the established end-to-end long distance high bandwidth lightpaths
Background
• ATLAS Canada Lightpath bulk data transfer trial over high
speed long distance networks at the iGrid 2002 conference
• Yotta Yotta high bandwidth utilization trial at CANARIE
8th Advanced Networks Workshop
• it became evident that, to continue the research initiated,
persistent network resources and infrastructure were
necessary
R&E Networks for Next
Generation Science
• some computing challenges
– data intensiveness of experiments
– complexity of the data
– global extent of the collaborations
• emerging technologies
– grid computing
– experimental infrastructure networks
• circuit switched, hybrid networks, e2e lightpaths
The ATLAS Experiment
• largest collaboration ever undertaken in the physical
sciences, with 2000 physicists participating from
more than 150 institutions and laboratories in 34
countries
• main physics goals: Higgs Boson discovery
• data volumes rising to the order of Exabytes (10^18
Bytes) within a decade
The ATLAS Experiment
Canada
ATLAS Canada Lightpath Data
Transfer Trial
• Canadian HEP participated at iGrid 2002 in Amsterdam last fall
• Core Team
– Wade Hong, Corrie Kost, Steve McDonald, Bryan Caron
• Demonstrated a manually provisioned “e2e” lightpath
– longest single hop network
• Transferred 1 TB of ATLAS MC data generated in Canada from
TRIUMF to CERN
• Tested alpha 10GbE technology and channel bonding
• Established a new benchmark for high performance disk to disk
throughput over a high bandwidth delay product network
e2e Lightpaths
• core design principle of CA*net 4
– dedicated point to point pipes
• ultimately give control of lightpath creation,
teardown and routing to the end user
– hence, “Customer Empowered Networks”
• provide a flexible infrastructure for emerging
intensive grid applications
• inherent QoS, security, and high performance
e2e Lightpath
• e2e lightpaths manually provisioned today
• frameworks to enable provisioning of e2e lightpaths being
developed under CA*net 4 Directed Research Program
High Bandwidth Delay Product
e2e lightpaths
• can create high bandwidth long distance e2e
lightpaths
– problem is how to effectively utilize bandwidth
– TCP is not well suited
• bulk data transfers
– OS kernels, disk subsystems, network interfaces,
TCP/IP stack must be optimized in order to achieve
extremely high throughput
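A rough sketch of why TCP struggles on such a path: the bandwidth-delay product gives the amount of data that must be in flight to keep the pipe full, and a fixed window caps a single stream at window / RTT. The round-trip time of 162 ms is an assumption taken from the traceroute shown later in these slides, and the 64 KiB window is an illustrative value for an untuned stack, not a measured setting from the trial.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on single-stream throughput for a fixed window size."""
    return window_bytes * 8 / rtt_s


RTT_S = 0.162  # assumed round-trip time in seconds (about 162 ms)

if __name__ == "__main__":
    for gbps in (1, 10):
        mib = bdp_bytes(gbps * 1e9, RTT_S) / 2**20
        print(f"{gbps:>2} Gbps at {RTT_S * 1000:.0f} ms RTT needs about {mib:.0f} MiB in flight")

    # Illustrative untuned window of 64 KiB (an assumption, not a trial setting)
    limit = window_limited_bps(64 * 1024, RTT_S)
    print(f"64 KiB window caps one stream at about {limit / 1e6:.1f} Mbps")
```

Under these assumptions a 1 Gbps path needs roughly 20 MB of window, which is why kernel buffers and the TCP/IP stack have to be tuned together with the disks and network interfaces.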
The Network
TRIUMF Local Loop
• only one SM fiber pair available between TRIUMF and
the CA*net 4 ONS 15454 at the BCnet POP
CERN Local Loop
• receive host located at the CERN IXP
The End Points
TRIUMF
CERN
We are live continent to continent!
• e2e lightpath up and running Sept 20 21:45 CET
– iGrid 2002 starts Sept 23
traceroute to cern-10g (192.168.2.2), 30 hops max, 38 byte packets
1 cern-10g (192.168.2.2) 161.780 ms 161.760 ms 161.754 ms
Results
• transfer rates from TRIUMF to CERN using the normal routed
path yielded 3.4 Mbps with ftp
– clearly not high performance over the routed path through UBC campus
• transferred 1TB in under 3 hrs using bbftp (IN2P3) and
Tsunami protocol (Indiana University) over e2e lightpath
– sustained transfers of over 1.2 Gbps disk to memory
– sustained transfers of 825 Mbps disk to disk
• created longest known single hop network (16,000 km)
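As a back-of-the-envelope check on these figures, the time to move 1 TB can be computed at each quoted rate. This is only a sketch: it assumes a decimal terabyte (10^12 bytes) and a constant rate for the whole transfer, which a real transfer would not hold.

```python
def transfer_seconds(size_bytes: float, rate_bps: float) -> float:
    """Time to move size_bytes at a constant rate of rate_bps."""
    return size_bytes * 8 / rate_bps


TB = 1e12  # assumed decimal terabyte

if __name__ == "__main__":
    # Rates quoted on the slide, in Mbps
    cases = [("ftp over routed path", 3.4), ("disk to disk over lightpath", 825)]
    for label, mbps in cases:
        secs = transfer_seconds(TB, mbps * 1e6)
        print(f"{label}: {secs / 3600:.1f} hours ({secs / 86400:.1f} days)")
```

At 825 Mbps the result is a little under 3 hours, consistent with the slide; at 3.4 Mbps the same terabyte would take several weeks.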
Sunday Nite Summaries
Comparative Results
Tool                       Transferred   Average    Max Avg
wuftp 100 MbE              600 MB        3.4 Mbps
wuftp 10 GbE               6442 MB       71 Mbps
iperf                      275 MB        940 Mbps
pftp (single stream)       600 MB        532 Mbps
bbftp (10 streams)         1.4 TB        666 Mbps   737 Mbps (13)
Tsunami - disk to disk     0.5 TB        700 Mbps   825 Mbps
Tsunami - disk to memory   12 GB         > 1 Gbps   1136 Mbps
Sept 26 2002
CA*net 4 International Grid
Testbed
• Initially, a few Canadian sites
– UVic, TRIUMF, UofA, UofT, Carleton U,
CANARIE, UdeM
• CERN
• future sites
– UIC/OMNInet, Ultralight
Participants
• principal participants
– Carleton U
– UofA
– TRIUMF
– UVic
• collaborative participants
– CERN
– UIC
– SURFnet
– Indiana University
– ORANs - BCnet, Netera, ORION, RISQ
CA*net 4 International Grid Testbed
General Objectives
• create a facility where the user controlled interfaces and
frameworks for the CA*net 4 Lightpath Cross-Connect
Devices can be tested and demonstrated safely in a real world
environment, following thorough ``in-vitro'' testing.
• develop and adapt grid applications which are designed to
interact with a LightPath Grid Service which treats networks
and network elements as grid resources which can be
reserved, concatenated, consumed and released.
• in parallel with grid enabling applications, tune applications
for performance over a single hop end-to-end lightpath or a
routed multi-hop end-to-end lightpath.
• characterize the performance of bulk data transfer over an
end-to-end lightpath.
General Objectives
• investigate and test protocols for high speed long distance
optical networks to effectively utilize the available
bandwidth
• investigate and test emerging technologies and their impact:
10 GbE, RDMA/IP, FC/IP, serial SCSI, HyperSCSI over
long distance ethernet, etc.
• collaborate with the EU ESTA project which is developing
10 GbE equipment with CERN, industrial and academic
partners
• collaborate with other international efforts such as
DataTag, EU DataGrid, CERN OpenLab, Ultralight, etc
• test interoperability of emerging 10 GbE networking
equipment
Specific Projects
• projects that have emerged or are planned for the
testbed include
– Bulk Data Transfer
– Grid Canada Testbed
– ATLAS Remote Real Time Processing Farms
– ATLAS Forward Calorimeter TestBeam Streaming
– s2io collaboration
– RDMA Trials
• more projects will emerge over the lifetime of the
testbed
Bulk Data Transfer
• continue work initiated with the iGrid 2002 trial
• attempt to understand bottlenecks limiting effective use of
high bandwidth delay networks for data intensive transfers
• with the newer beta Intel 10GbE NICs, it is expected that
we would obtain 4 times the performance of the alpha NICs
• continue working with hybrid protocols such as Tsunami
• investigate emerging bulk transfer protocols
• explore Web100 and Net100 to further understand TCP and
UDP performance by instrumenting the IP stack
• investigate zerocopy and IPv6
Grid Canada Testbed Over
End-to-End Lightpaths
• Grid Canada testbed has been established across Canada
primarily running ATLAS and Babar Monte Carlo
generation using the Globus Toolkit
• performance issues stemming from internal campus
problems have been identified - it is proposed to investigate
performance of the testbed with sites meshed with end-to-end lightpaths
ATLAS Remote Real Time
Processing Farms
• collaborative effort to investigate and exploit long distance,
high bandwidth, Trans-Atlantic links between CERN and
Canada with particular emphasis on remote real time
processing farms for the ATLAS experiment
• similar efforts are underway between CERN and European
sites using a traditional packet routed IP network, however,
as part of the IGT, an end-to-end lightpath would be
established between CERN and Canadian sites
• measurement station developed at CERN installed in
Canada, at the University of Alberta
• measurement and optimization of data transfers using TCP
and other protocols; data sources and sinks would be high
end PCs supporting GE and 10 GE NICs
ATLAS Remote Real Time
Processing Farms
• testing real applications, for example, remote processing of
data from the ATLAS test beam
• investigate passively monitoring 10 GbE e2e lightpath with
OC192mon
• initial planned activity period is 2 years
ATLAS Forward Calorimeter
TestBeam Data Streaming
• ATLAS Canada will be involved in an FCAL testbeam,
generating a large amount of data to be sent back to
participating sites - Carleton U, UofT
• estimated data rate of 30 Gbytes/day
• explore scheduling an e2e lightpath to stream the data back
to Canada
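To see why a scheduled lightpath might suit this data better than a permanent one, the estimated daily volume can be turned into an average rate and into a burst duration. This is a sketch under assumptions: gigabytes are taken as 10^9 bytes, and the 700 Mbps burst rate is borrowed from the Tsunami disk-to-disk average in the earlier comparison table rather than from any measurement on this project.

```python
SECONDS_PER_DAY = 86_400


def average_rate_bps(bytes_per_day: float) -> float:
    """Average rate needed if the data were streamed evenly over a day."""
    return bytes_per_day * 8 / SECONDS_PER_DAY


def burst_seconds(bytes_per_day: float, burst_rate_bps: float) -> float:
    """Time to send one day's data in a single burst at burst_rate_bps."""
    return bytes_per_day * 8 / burst_rate_bps


DAILY_BYTES = 30e9    # estimated 30 Gbytes/day, assuming decimal gigabytes
BURST_RATE_BPS = 700e6  # assumed rate, taken from the earlier Tsunami average

if __name__ == "__main__":
    print(f"average rate: {average_rate_bps(DAILY_BYTES) / 1e6:.2f} Mbps")
    minutes = burst_seconds(DAILY_BYTES, BURST_RATE_BPS) / 60
    print(f"one day's data at 700 Mbps: about {minutes:.1f} minutes")
```

Under these assumptions a day of test beam data would occupy a lightpath for only a few minutes, so the path could be reserved for a short window and then released.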
s2io Collaboration
• test s2io’s new TCP assist 10GbE NICs over a high speed
long distance e2e lightpath
• s2io will expose all tunable parameters through APIs - it
will be very interesting to experiment with and
characterize these parameters to understand performance
and fully utilize the offered bandwidth
• further experiments to be developed collaboratively
RDMA Trials
• s2io's next generation 10 GbE NICs will support RDMA/IP
by mid 2004; an interesting demo for iGrid 2004 would be
data intensive transfers over a 10 Gbps e2e lightpath
• many other vendors like Extreme Networks looking at
RDMA/IP
“A private railroad car is not an acquired taste.
One takes to it immediately.”