APM - DataTAG


DataTAG overview
Summary

Why DataTAG?

DataTAG project

Test-bed extensions

General information

Open DataTAG

Network map

(some) Research topics

(a lot of) Issues

Conclusion and acknowledgements
3 February 2003
APM meeting - Barcelona
Paolo Moroni (Slide 2)
Background for DataTAG

High-energy physicists are building the LHC
(start-up in 2007?) at CERN: an unprecedented
amount of data will have to be analyzed and CERN
alone will not have enough computing resources

The planned computing model is geographically
distributed (GRID)

The EU-DataGrid project is addressing the
middleware problem

Reliable, advanced networking is needed
underneath

At least part of the GRID traffic will not look like
any of today’s commodity IP traffic

A lot of bandwidth will be necessary, but bandwidth
alone won’t be enough
Addressing the problem

Buy (a lot of) bandwidth

Buy (expensive) network equipment

Address security issues without compromising
performance (firewalls)

Tune some more or less obvious TCP parameters

Design the network for reliability and performance

Plan for interconnection with key research
networks
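One of the "more or less obvious" TCP parameters is the socket buffer size, which is usually sized to the bandwidth-delay product (BDP) of the path. The sketch below is only an illustration: the 1 Gbit/s rate and the 120 ms round-trip time are assumed figures, not values from the slides, and `bdp_bytes` is a hypothetical helper name.

```python
import socket


def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> int:
    """Bandwidth-delay product in bytes: data in flight needed to fill the path."""
    return round(bandwidth_bps * rtt_s / 8)


# Assumed figures for illustration only (not taken from the slides).
ASSUMED_BANDWIDTH_BPS = 1e9   # 1 Gbit/s end to end
ASSUMED_RTT_S = 0.120         # 120 ms round-trip time

target = bdp_bytes(ASSUMED_BANDWIDTH_BPS, ASSUMED_RTT_S)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # Request send and receive buffers of one BDP each.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, target)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, target)
except OSError as exc:
    # Some systems reject requests above their configured maximum.
    print(f"buffer request refused: {exc}")

# The operating system decides what it actually grants.
granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.close()

print(f"requested {target} bytes, granted snd={granted_snd} rcv={granted_rcv}")
```

The granted sizes can differ from the request because they depend on system-wide limits, so those limits usually have to be raised as well before a long, fast path can be filled.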
Why network research?

Need to enhance data transport protocols (TCP)

Need to measure the perceived application
performance (end-to-end)

Need to try the GRID software in a WAN
environment and at very high speed (Gigabit or
more)

Need to test the compatibility between EU and US
GRIDs (middleware integration)

Need to test network technology with enough
speed and enough features to support the planned
GRID workload (end-to-end inter-domain QoS)

Considerable risk of breaking something: production
networks can help with some (but not all) of the
above needs
DataTAG project

Full project title: “Research and technological
development for a transatlantic GRID”

IST project (EU funded), supported by the NSF
and the DoE (Caltech)

Partners: PPARC (UK), INRIA (FR), University of
Amsterdam (NL), INFN (IT) and CERN (CH)

Researchers also from Caltech, SLAC and Canada

Test-bed kernel: transatlantic STM-16 (T-Systems)
between Geneva (CERN) and Chicago (StarLight),
with interconnected workstations on each side
Test-bed extensions

Amsterdam-Geneva (SARA-CERN): STM-64 from
SURFnet (Global Crossing)

Lyon-Geneva: STM-16 from VTHD (France
Telecom)

CH backup access STM-16 to GEANT (COLT)

Chicago-Sunnyvale: STM-64 from TeraGrid
(Level3)

Back-to-back GbE to Canarie in Chicago

Back-to-back 10GbE to TeraGrid and Abilene in
Chicago
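For readers less familiar with SDH naming, the STM-16 and STM-64 circuits listed above can be converted to line rates: an STM-N signal runs at N times the STM-1 rate of 155.52 Mbit/s. The helper name `stm_line_rate_mbps` below is hypothetical, and the figures are gross line rates, so the usable payload is somewhat lower.

```python
STM1_MBPS = 155.52  # SDH STM-1 line rate in Mbit/s


def stm_line_rate_mbps(n: int) -> float:
    """Gross line rate of an STM-N signal in Mbit/s (N x STM-1)."""
    return n * STM1_MBPS


# STM levels used for the test-bed circuits on this slide.
for level in (16, 64):
    print(f"STM-{level}: {stm_line_rate_mbps(level):.2f} Mbit/s")
```

This shows why an STM-16 is commonly quoted as "2.5 Gbit/s" and an STM-64 as "10 Gbit/s".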
General information

2-year project: 2002 and 2003

Dedicated staff (recruited on project budget)

Part-time staff, shared with other activities

Open to cooperation with other projects: EU-DataGrid,
GEANT, Abilene, TeraGrid, NetherLight, etc.

Typical EU work package structure: quarterly
reports, deliverables at fixed deadlines, periodic
reviews by external inspectors. A very formal, heavy
and structured framework, but effective at avoiding
project drift, delays and waste
Open DataTAG

DataTAG is open to cooperation with other
research projects

Proposals for additional activity on the
DataTAG test-bed are welcome

One requirement: ongoing work must not
be affected (no overbooking)

The current schedule is already relatively
busy (both EU and US activities)
DataTAG Network map

[Figure: DataTAG network map, Chicago and Geneva sites, last update 20021204. Labels recoverable from the diagram:
Routers: R04chi-Cisco7609, R05chi-JuniperM10, R06chi-Alcatel7770, R04gva-Cisco7606, R05gva-JuniperM10, R06gva-Alcatel7770, ar3-chicago-Cisco7606, Cernh4-Cisco7609, Cernh7-Cisco7609, Teragrid JuniperT640
Workstations: w01chi to w06chi, v10chi to v13chi, w01gva to w06gva
Other equipment: Alcatel 1670, ONS15454, Extreme Summit5i, Extreme Summit1i, Cisco5505-management, Cisco2950-management
Links: Stm64(L3), Stm16(DTag), Stm16(GC), Stm16 (FranceTelecom), Stm16(Swisscom), Stm16(Colt) backup+projects, Stm4(DTag), 10GE, 1GE and multiple-1GE bundles (2x1GE, 4x1GE, 8x1GE, 10x1GE), Vlan4, Vlan5, Vlan7
External networks: SURFNET, ABILENE, CANARIE, VTHD/INRIA, SUNNYVALE, SWITCH, GEANT, GARR/CNAF, CERN External Network]
DataTAG Routing map

[Figure: DataTAG routing map, Chicago and Geneva sites, last update 20021204.
Legend: management addresses and path to reach them from the internet; DataTAG test-bed addresses; path of the CCC tunnel from CNAF; path between the PC farms in Chicago and Geneva; path between Sunnyvale and the PC farm in Geneva
Address blocks shown: 192.91.236.0/23, 192.91.238.0/23, 192.91.246.192/26, 192.91.244.0/27
Routers: R04chi, R05chi, R06chi, R04gva, R05gva, R06gva, ar3-chicago, Cernh4, Cernh7, Teragrid JuniperT640
Workstations: w01chi to w06chi, v10chi to v13chi, w01gva to w06gva
Management switches: Cisco5505-management, Cisco2950-management
External networks: SURFNET, ABILENE, CANARIE, VTHD/INRIA, SUNNYVALE, SWITCH, GEANT, GARR/CNAF, CERN External Network]
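To give a feel for the size of the address blocks printed on the routing map, here is a minimal sketch using Python's standard `ipaddress` module. It only does arithmetic on the four prefixes as they appear on the map. Which prefix serves which purpose (management, test-bed and so on) is not recoverable from this transcript and is not assumed here.

```python
import ipaddress

# Prefixes copied from the routing map.
PREFIXES = [
    "192.91.236.0/23",
    "192.91.238.0/23",
    "192.91.246.192/26",
    "192.91.244.0/27",
]

networks = [ipaddress.ip_network(p) for p in PREFIXES]

# Size and first/last address of each block.
for net in networks:
    print(f"{net}: {net.num_addresses} addresses ({net[0]} to {net[-1]})")

total = sum(net.num_addresses for net in networks)
print(f"total: {total} addresses")

# Adjacent blocks that can be summarised under a shorter prefix are merged.
collapsed = list(ipaddress.collapse_addresses(networks))
print("summarised:", ", ".join(str(net) for net in collapsed))
```

The two /23 blocks are adjacent and aligned, so they summarise into a single /22.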
(some) Research topics (I)

Linux kernel tuning for high performance: for
example, 8 Terabytes in 24 hours, memory-to-memory
(achieved by S. Ravot, Caltech)

Bulk file transfer (Terabyte disk-to-disk, at 2
Gbps, achieved by Canadian researchers between
TRIUMF and CERN)

TCP stack improvements (things get worse with
longer RTT)

Application-level performance measurement

GRID middleware interoperability between EU and
US
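The figures on this slide can be put in perspective with some simple arithmetic. The sketch below makes several assumptions that are not in the slides: decimal terabytes, exactly one terabyte for the disk-to-disk transfer, and a textbook model of standard TCP congestion avoidance (window halved on a loss, then grown by about one segment per round trip). The 1 Gbit/s rate, the two RTT values and the 1460-byte segment size are illustrative, and the function names are hypothetical.

```python
TB = 10**12  # decimal terabyte (assumption)


def average_rate_bps(n_bytes: float, seconds: float) -> float:
    """Average throughput in bit/s for n_bytes moved in the given time."""
    return n_bytes * 8 / seconds


def aimd_recovery_s(capacity_bps: float, rtt_s: float, mss_bytes: int = 1460) -> float:
    """Approximate time for standard TCP to regain full rate after one loss.

    The window that fills the path is W = capacity * RTT / (8 * MSS) segments.
    After a loss it drops to W/2 and grows by one segment per RTT, so the
    recovery takes about W/2 round trips.
    """
    window_segments = capacity_bps * rtt_s / (8 * mss_bytes)
    return (window_segments / 2) * rtt_s


# 8 Terabytes in 24 hours, memory-to-memory.
mem_to_mem_bps = average_rate_bps(8 * TB, 24 * 3600)
print(f"8 TB in 24 h: {mem_to_mem_bps / 1e6:.2f} Mbit/s on average")

# One terabyte at a sustained 2 Gbit/s.
disk_seconds = 1 * TB * 8 / 2e9
print(f"1 TB at 2 Gbit/s: {disk_seconds / 60:.1f} minutes")

# Same assumed capacity, short versus long round-trip time.
for rtt in (0.012, 0.120):
    print(f"RTT {rtt * 1000:.0f} ms: ~{aimd_recovery_s(1e9, rtt):.1f} s to recover")
```

In this model the recovery time grows with the square of the RTT, which is one way to read "things get worse with longer RTT".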
(some) Research topics (II)

10 Gb tests (10GbE and transatlantic link upgrade,
tentatively in September 2003) at layer 2 and
layer 3

Multi-vendor equipment (Alcatel 1670 and 7770,
Cisco 760x, Juniper M10, Extreme Summit
switches)

QoS tests, advance reservation

Optical networking: validation of equipment

Direct access (hardware and software) to the
equipment is essential for most activities
Issues (I)

Electrical power in Chicago (now fixed)

Broken network hardware (mainly 10 GbE cards,
but not only)

Broken or hung PCs

Router interfaces disabled

Never enough workstations available for testing

Reservation software: never sophisticated enough

Network topology: each research group has
different requirements

Alcatel 1670: needs STM-16 reconfiguration
Issues (II)

Cisco 760x components (un)availability

Demo workshops (iGRID2002, SC2002, …): nice to
see, but organizational nightmares

KPNQwest collapse (re-procurement via T-Systems)

Routing: interconnections with production networks,
with external test networks, even with the commodity
Internet, plus management access everywhere

Furthermore, routing is open for experiments
(research groups require enabled access to the
routers)
Issues (III)

A lot of enthusiasm and interest, unfortunately not
always supported by adequate planning

When adequate planning is there, it is ignored
(because of the enthusiasm, of course)

Network planning is not straightforward: mix
of test and production

Partner coordination is even less straightforward:
mix of shared and private resources
Issues (IV)

Some JunOS features were discovered the hard
way

Same for some IOS “features”

DNS, security, access control, management
servers, VLANs, WWW site, IP addressing (>100
addresses assigned), etc.: all need planning, work
and maintenance

PoP management (ran out of rack space) +
installation issues (mainly, but not only, Alcatel)

OOB access in Chicago, to recover from
configuration mistakes (this works only sometimes
with PCs)
Conclusion and acknowledgements

DataTAG is for testing what would often be
useful to do but may not get done because it is too
risky, or too expensive, or too complicated, or
simply because you cannot afford a test laboratory

DataTAG is open to external collaborations, as far
as the current test-bed workload allows

Many thanks to DANTE/GEANT and DoE/Caltech,
who are actively supporting the project
http://www.datatag.org
Thank you