Transcript Slide 1

DataTAG project
Status & Perspectives
Olivier MARTIN - CERN
GNEW’2004 workshop
15 March 2004, CERN, Geneva
Presentation outline
 Project overview
 Testbed characteristics and evolution
 Major networking achievements
 Where are we?
 Lambda Grids
 Networking testbed requirements
 Acknowledgements
 Conclusions
DataTAG Mission
TransAtlantic Grid
 EU  US Grid network research
 High Performance Transport protocols
 Inter-domain QoS
 Advance bandwidth reservation
 EU  US Grid Interoperability
 Sister project to EU DataGRID
Project partners
http://www.datatag.org
Funding agencies
Cooperating Networks
EU collaborators
 Brunel University
 CERN
 CLRC
 CNAF
 DANTE
 INFN
 INRIA
 NIKHEF
 PPARC
 UvA
 University of Manchester
 University of Padova
 University of Milano
 University of Torino
 UCL
US collaborators
 ANL
 Northwestern University
 Caltech
 UIC
 Fermilab
 University of Chicago
 FSU
 University of Michigan
 Globus
 Indiana
 SLAC
 Wisconsin
 StarLight
Workplan
 WP1: Establishment of a high performance intercontinental Grid testbed (CERN)
 WP2: High performance networking (PPARC)
 WP3: Bulk data transfer validations and application performance monitoring (UvA)
 WP4: Interoperability between Grid domains (INFN)
 WP5 & WP6: Dissemination and project management (CERN)
DataTAG/WP4 framework and relationships
[Diagram: WP4 framework and relationships, linking HEP applications and other experiments, integration work, and interoperability standardization through the HICB/HIJTB.]
Testbed evolution
 The DataTAG testbed evolved from a simple 2.5 Gb/s Layer 3 testbed (Sept. 2002) into an extremely rich multi-vendor 10 Gb/s Layer 2/Layer 3 testbed (Sept. 2003)
 Alcatel, Chiaro, Cisco, Juniper, PRocket
 Exclusive access to the testbed is granted through an advance testbed reservation application (a minimal sketch follows below)
 Direct extensions to Amsterdam UvA/SURFnet (10G) & Lyon INRIA/VTHD (2.5G)
 Layer 2 extension to INFN/CNAF over GEANT & GARR using Juniper’s CCC
 Layer 2 extension to the OptIPuter project at UCSD (University of California San Diego) through Abilene and CENIC under way
 1st L2/L3 transatlantic testbed with native 10 Gigabit Ethernet access
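The slides do not describe how the reservation application works, so the following is a minimal Python sketch of the core check such a tool needs: exclusive access means no two bookings may overlap in time. All names are hypothetical.

```python
from datetime import datetime

# Hypothetical sketch of an advance testbed-reservation check:
# a slot is granted only if it overlaps no existing booking.
class ReservationBook:
    def __init__(self):
        self.bookings = []  # (start, end, owner)

    def request(self, start: datetime, end: datetime, owner: str) -> bool:
        for s, e, _ in self.bookings:
            if start < e and s < end:  # standard interval-overlap test
                return False
        self.bookings.append((start, end, owner))
        return True

book = ReservationBook()
print(book.request(datetime(2003, 9, 1, 8), datetime(2003, 9, 1, 12), "CERN"))     # True
print(book.request(datetime(2003, 9, 1, 10), datetime(2003, 9, 1, 14), "Caltech")) # False: clashes
```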
DataTAG testbed phase 1 (2.5 Gb/s)
[Diagram: Geneva-Chicago testbed with Linux PCs at both ends. Equipment: Cisco ONS 15454s, Alcatel 7770s (r06gva, r06chi), Alcatel 1670s, Juniper M10s (r05gva, r05chi), Cisco 7609 (r04chi), Extreme 5i (s01chi). Circuits: STM-16 (T-Systems), STM-16 (Colt, backup + projects), STM-16 to VTHD/INRIA (FranceTelecom), STM-64 wave (GC). Peer networks: GEANT, SURFnet, CESNET, CNAF. 1G Ethernet links.]

DataTAG testbed phase 2 (10 Gb/s), simplified
[Diagram: Geneva-Chicago testbed over a 10 Gb/s optical wave (T-Systems), with Linux PCs at both ends. Equipment: Juniper T320s, Cisco 7606 and 7609, Juniper M10, Alcatel 7770, StarLight Force10, StarLight Cisco 6509. Links: 1G/10G Ethernet, 2.5G STM-16, 10G STM-64. Peer networks: GEANT, Abilene, VTHD/INRIA. Last update: 2003-09-09.]
DataTAG testbed
Alcatel
Chiaro
Cisco
Juniper
PRocket
Main networking achievements (1)
 Internet land speed records have been beaten one after the other by the DataTAG project partners and/or teams closely associated with DataTAG:
 ATLAS Canada lightpath experiments during iGrid 2002 (Gigabit Ethernet) and Telecom World 2003 (10 Gigabit Ethernet, aka WAN PHY)
 New Internet2 Land Speed Record (I2 LSR) by the NIKHEF/Caltech team (SC2002)
 FAST, GridDT, HS-TCP, Scalable TCP experiments (DataTAG partners & Caltech)
 Intel 10GigE tests between CERN (Geneva) and SLAC (Sunnyvale) (CERN, Caltech, Los Alamos National Laboratory, SLAC)
 2.38 Gb/s sustained rate, single flow, 1 TB in one hour
 I2 LSR awarded during the Internet2 Spring Member Meeting (April 2003)
ATLAS Canada Lightpath trials
TRIUMF (Vancouver) & CERN (Geneva) through Amsterdam (NetherLight)
“A full Terabyte of real data was transferred at rates equivalent to a full CD (680 MB) in under 8 seconds and a DVD in under 1 minute” (Wade Hong et al., 09/2002)
Subsequent 10GigE WAN PHY experiments during Telecom World 2003 brought effective data transfer rates below one second per CD!
10GigE Data Transfer Trial
On Feb. 27-28, 2003, a terabyte of data was transferred in 3700 seconds by S. Ravot of Caltech between the Level3 PoP in Sunnyvale, near SLAC, and CERN, through the TeraGrid router at StarLight, from memory to memory, with a single TCP/IPv4 stream. This achievement translates to an average rate of 2.38 Gb/s (using large windows and 9 kB “jumbo frames”). It beat the former record by a factor of ~2.5 and used the 2.5 Gb/s link at 99% efficiency.
Huge distributed effort: 10-15 highly skilled people monopolized for several weeks!
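The figures above can be cross-checked. The exact payload and round-trip time are not on the slide, so roughly 1.1 TB and ~180 ms (Geneva-Sunnyvale) are assumed below for illustration:

```python
# Sanity check of the quoted 2.38 Gb/s average rate, with assumed
# payload (~1.1 TB) and RTT (~180 ms, Geneva-Sunnyvale).
payload_bits = 1.1e12 * 8
duration_s = 3700
rate_bps = payload_bits / duration_s
print(f"average rate: {rate_bps / 1e9:.2f} Gb/s")        # ~2.38 Gb/s

# A single stream at that rate needs a TCP window of at least the
# bandwidth-delay product, hence the "large windows" mentioned above.
rtt_s = 0.180
bdp_bytes = rate_bps * rtt_s / 8
print(f"required window: {bdp_bytes / 2**20:.0f} MiB")   # ~51 MiB
```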
10G DataTAG testbed extension to Telecom World 2003 and Abilene/CENIC
On September 15, 2003, the DataTAG project was the first transatlantic testbed offering direct 10GigE access, using Juniper’s VPN layer 2/10GigE emulation.
Sponsors: Cisco, HP, Intel, OPI (Geneva’s Office for the Promotion of Industries & Technologies), Services Industriels de Geneve, Telehouse Europe, T-Systems.
Main networking achievements (2)
 The latest IPv4 & IPv6 I2LSRs were awarded, live from the Internet2 Fall Member Meeting in Indianapolis, to Caltech & CERN during Telecom World 2003:
 May 6, 2003: 987 Mb/s single TCP/IPv6 stream
 October 1, 2003: 5.44 Gb/s single TCP/IPv4 stream between Geneva and Chicago: 1.1 TB in 26 minutes, or one 680 MB CD in 1 second
 More records have been established by Caltech & CERN since then (the bandwidth-distance metric behind them is sketched below):
 November 6, 2003: 5.64 Gb/s single TCP/IPv4 stream between Geneva and Los Angeles (CENIC PoP) across DataTAG and Abilene
 November 11, 2003: 4 Gb/s single TCP/IPv6 stream between Geneva and Phoenix (Arizona) through Los Angeles
 February 24, 2004: 6.25 Gb/s with 9 streams for 638 seconds, i.e. half a terabyte transferred between CERN in Geneva and the CENIC PoP in Los Angeles across DataTAG and Abilene
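The I2LSR is scored as a bandwidth-distance product. A small illustration of the metric; the route lengths used here are rough assumptions, not the officially measured path lengths:

```python
# I2LSR metric: throughput times route distance.
# Distances below are rough assumptions for illustration only.
def lsr_terabit_meters_per_s(rate_gbps: float, distance_km: float) -> float:
    return rate_gbps * 1e9 * distance_km * 1e3 / 1e12

print(lsr_terabit_meters_per_s(5.44, 7_000))   # Geneva-Chicago, Oct 2003: ~38,000
print(lsr_terabit_meters_per_s(5.64, 11_000))  # Geneva-Los Angeles, Nov 2003: ~62,000
```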
Internet2 Land Speed Record history (IPv4 & IPv6)
[Charts: evolution of the I2LSR from March 2000 to November 2003, for IPv4 and IPv6, in terabit-meters/second and in Gb/s.]
[Chart: impact of a single multi-Gb/s flow on the Abilene backbone.]
Significance of I2LSRs to the Grid?
 Essential to establish the feasibility of multi-Gigabit/second single-stream IPv4 & IPv6 data transfers:
 Over dedicated testbeds in a first phase
 Then across academic & research backbones
 Last but not least, across campus networks
 Disk to disk rather than memory to memory
 Study impact of high performance TCP over disk servers
 Next steps:
 Above 6 Gb/s expected soon between CERN and Los Angeles (Caltech/CENIC PoP) across DataTAG & Abilene
 Goal is to reach 10 Gb/s with new PCI Express buses
 Study alternatives to standard TCP (Reno), whose slow loss recovery is sketched below:
 Non-TCP transport (Tsunami, SABUL/UDT)
 HS-TCP, Scalable TCP, H-TCP, FAST, Grid-DT, Westwood+, etc.
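One reason alternatives are needed: after a single loss, standard Reno halves its congestion window and regains it at only about one segment per round trip. A quick illustration with assumed numbers (10 Gb/s target, 1500 B MSS, 120 ms transatlantic RTT):

```python
# Reno's additive recovery after one loss, with assumed parameters.
rate_bps = 10e9      # target rate
mss_bytes = 1500     # standard Ethernet MSS (approximation)
rtt_s = 0.120        # assumed transatlantic round-trip time

cwnd_segments = rate_bps * rtt_s / (8 * mss_bytes)  # ~100,000 segments
recovery_rtts = cwnd_segments / 2                   # +1 segment per RTT
print(f"window: {cwnd_segments:,.0f} segments")
print(f"recovery after one loss: {recovery_rtts * rtt_s / 60:.0f} minutes")
# ~100 minutes: a single packet loss costs well over an hour of full rate.
```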
Main networking achievements (3)
 QoS
[Diagram: QoS test setup with a Juniper M10 in Geneva; AF and BE traffic classes carried over layer 2 VLANs; IP QoS configured on a 1 GE bottleneck.]
 Advance bandwidth reservation (a hypothetical admission-check sketch follows below)
 GARA extensions
 AAA extensions
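GARA's and the AAA server's actual interfaces are not shown on the slide; below is a purely hypothetical sketch of the admission test behind advance bandwidth reservation: a request is accepted only if, at every instant it covers, the overlapping reservations fit within link capacity.

```python
# Hypothetical sketch (not GARA's or AAA's real API): admit an advance
# bandwidth reservation only if overlapping bookings never exceed the
# link capacity.
LINK_CAPACITY_MBPS = 1000
reservations = []  # (start_hour, end_hour, mbps)

def admit(start, end, mbps):
    # Load only rises at interval starts, so checking the new start and
    # every overlapping reservation start covers all load maxima.
    points = {start} | {s for s, _, _ in reservations if start <= s < end}
    for t in points:
        load = mbps + sum(r for s, e, r in reservations if s <= t < e)
        if load > LINK_CAPACITY_MBPS:
            return False
    reservations.append((start, end, mbps))
    return True

print(admit(9, 12, 600))   # True:  link is free
print(admit(10, 11, 500))  # False: 600 + 500 Mb/s exceeds capacity
print(admit(12, 14, 800))  # True:  no overlap with the first booking
```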
Where are we?
 The DataTAG project came at exactly the right time:
 Back in late 2000, 2.5 Gb/s looked futuristic
 10GigE, especially host interfaces, did not really exist
 However, it was already very clear that the standard TCP stack (Reno/NewReno) was problematic
 Much hope was placed on autotuning (Web100/Net100) & ECN/RED-like solutions
 Actual bit error rates of transatlantic circuits were over-estimated (see the loss-rate arithmetic below)
 Much better shape than expected on over-provisioned R&D backbones such as Abilene, CANARIE, GEANT
 For how long?
 One of the strongest proofs provided by DataTAG is the extreme vulnerability of production R&D backbones in the presence of high performance flows (i.e. 10GigE or even less)
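The bit-error-rate point can be quantified with the well-known Mathis et al. approximation for steady-state Reno throughput, rate ≈ (MSS/RTT) · C/√p with C ≈ 1.22. Inverting it gives the loss rate a flow can tolerate at a target rate; the RTT and MSS below are assumed values:

```python
# Loss rate tolerable by a Reno flow at a target rate, from the
# Mathis et al. approximation: rate ~ (MSS / RTT) * C / sqrt(p).
C = 1.22
mss_bits = 1460 * 8   # assumed MSS
rtt_s = 0.120         # assumed transatlantic RTT
target_bps = 10e9

p = (C * mss_bits / (rtt_s * target_bps)) ** 2
print(f"max tolerable loss rate: {p:.1e}")
# ~1.4e-10: essentially loss-free circuits are required, which is why
# the actual bit error rates of transatlantic links mattered so much.
```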
Where are we (cont.)?
 For many years the Wide Area Network has been the bottleneck; this is no longer the case in many countries, thus making the deployment of data-intensive Grid infrastructure, in principle, possible, e.g. EGEE, the DataGrid successor
 Recent I2LSR records show, for the first time ever, that the network can be truly transparent and that throughput is only limited by the end hosts and/or campus network infrastructures
 The challenge has shifted from getting adequate bandwidth to deploying adequate LANs and cybersecurity infrastructure, as well as making effective use of it!
 Non-trivial transport protocol issues still need to be resolved
 The only encouraging sign is that this is now widely recognized
 But we are still quite far from converging on a practical solution
Layer 1/2/3 networking (1)
 Conventional layer 3 technology is no longer fashionable because of:
 High associated costs, e.g. 200-300 kUSD for a 10G router interface
 Implied use of shared backbones
 The use of layer 1 or layer 2 technology is very attractive because it helps to solve a number of problems, e.g.
 The 1500-byte Ethernet frame size limit (layer 1), as quantified below
 Protocol transparency (layer 1 & layer 2)
 Minimum functionality, hence, in theory, much lower costs (layers 1 & 2)
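The frame-size point is about per-packet overhead: interrupt and lookup costs scale with packet rate, which jumbo frames cut by a factor of six. A quick illustration (header overhead ignored for simplicity):

```python
# Packet rates at 10 Gb/s for standard vs jumbo Ethernet frames,
# ignoring header overhead for simplicity.
for frame_bytes in (1500, 9000):
    pps = 10e9 / (frame_bytes * 8)
    print(f"{frame_bytes} B frames: {pps / 1e3:,.0f} kpps")
# 1500 B -> ~833 kpps; 9000 B -> ~139 kpps
```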
Layer 1/2/3 networking (2)
 « Lambda Grids » are becoming very popular:
 Pros:
 Circuit-oriented model like the telephone network, hence no need for complex transport protocols
 Lower equipment costs (i.e. « in theory » a factor of 2 or 3 per layer)
 The concept of a dedicated end-to-end lightpath is very elegant
 Cons:
 « End to end » is still very loosely defined, i.e. site to site, cluster to cluster, or really host to host?
 Higher circuit costs, scalability, additional middleware to deal with circuit set-up/tear-down, etc.
 Extending dynamic VLAN functionality to the campus network is a potential nightmare!
« Lambda Grids »: what does it mean?
 Clearly different things to different people, hence the « apparently easy » consensus!
 Conservatively, on-demand « site to site » connectivity:
 Where is the innovation?
 What does it solve in terms of transport protocols?
 Where are the savings?
 Fewer interfaces needed (customer), but more standby/idle circuits needed (provider)
 Economics from the service provider vs the customer perspective? Traditionally, switched services have been very expensive:
 Usage vs flat charge
 Break-even, switched vs leased, at a few hours/day (see the toy computation below)
 Why would this change?
 If there are no savings, why bother?
 More advanced: cluster to cluster
 Implies even more active circuits in parallel
 Even more advanced: host to host
 All optical
 Is it realistic?
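A toy version of the switched-vs-leased break-even question raised above. Both prices are invented placeholders; only the shape of the argument matters:

```python
# Toy break-even computation, switched (usage-charged) vs leased
# (flat-rate) circuits. Prices are invented placeholders.
leased_per_month = 100_000   # assumed flat monthly charge
switched_per_hour = 1_000    # assumed usage charge

breakeven_hours_per_day = leased_per_month / (switched_per_hour * 30)
print(f"break-even at {breakeven_hours_per_day:.1f} hours/day of use")
# ~3.3 h/day with these numbers: consistent with the "few hours/day"
# observation above, so anything beyond light use favours leasing.
```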
Networking testbed requirements
 Multi-vendor
 Unless a particular research group is specifically interested in the behaviour of TCP in the presence of out-of-order packets, running high performance TCP tests across a Juniper M160 backbone is pretty useless
 Achievable IPv6 performance varies widely between vendors
 MPLS & QoS implementations also vary widely
 Interoperability
 Dynamic
 Implies manpower & money
 Partitionable
 Reservation application
 Reconfigurable
 Avoid manual recabling; implies an electronic or optical switch/patch panel
 Extensible
 Extensions to other networks
 Implies collaboration
 Not limited to network equipment; must also include high performance servers, disks & NICs
 Coordination with other testbeds
Acknowledgements
 The project would not have accumulated so many successes without the active participation of our North American colleagues, in particular:
 Caltech/DoE
 University of Illinois/NSF
 iVDGL
 StarLight
 Internet2/Abilene
 CANARIE
 and our European sponsors and colleagues as well, in particular:
 European Union’s IST programme
 DANTE/GEANT
 GARR
 SURFnet
 VTHD
 The GNEW’2004 workshop is yet another example of successful collaboration between Europe and the USA