
Services to the US Tier-1 Sites
LHCOPN
April 4th, 2006
Joe Metzger
[email protected]
ESnet Engineering Group
Lawrence Berkeley National Laboratory
1
Outline
• Next Generation ESnet
  – Requirements
  – Architecture
  – Studying Architectural Alternatives
  – Reliability
  – Connectivity
  – 2010 Bandwidth and Footprint Goal
• ESnet Circuit Services
  – OSCARS
  – LHCOPN Circuits
    • BNL
    • FERMI
2
Next Generation ESnet
• Current IP backbone contract expires at the end of 2007
  – Backbone circuits
  – Hub colocation space
  – Some site access circuits
• Acquisition
  – Background research in progress
• Implementation
  – Major changes may happen in 2007
• No negative LHC impact
  – Should not change primary LHCOPN paths
  – May change/improve some US Tier 1 to US Tier 2 paths
3
Next Generation ESnet Requirements
• Greater reliability
  – Multiple connectivity at several levels
    • Two backbones: production IP and Science Data Network (SDN)
    • Redundant site access links
    • Redundant, high bandwidth US and international R&E connections
  – Continuous, end-to-end monitoring to anticipate problems and assist in debugging distributed applications
• Connectivity
  – Footprint to reach major collaborators in the US, Europe, and Asia
  – Connections to all major R&E peering points
  – Initial build-out that satisfies near-term LHC connectivity requirements
• More bandwidth
  – Multiple lambda based network – SDN
  – Scalable bandwidth
  – Initial build-out that satisfies near-term LHC bandwidth requirements
4
Next Generation ESnet Architecture
• Main architectural elements and the rationale for each element
1) A high-reliability IP core (e.g. the current ESnet core) to address
  – General science requirements
  – Lab operational requirements
  – Backup for the SDN core
  – Vehicle for science services
  – Full service IP routers
2) Metropolitan Area Network (MAN) rings to provide
  – Dual site connectivity for reliability
  – Much higher site-to-core bandwidth
  – Support for both production IP and circuit-based traffic
  – Multiple connections between the SDN and IP cores
2a) Loops off of the backbone rings to provide
  – Dual site connections where MANs are not practical
3) A Science Data Network (SDN) core for
  – Provisioned, guaranteed bandwidth circuits to support large, high-speed science data flows
  – Very high total bandwidth
  – Multiple connections to MAN rings for protection against hub failure (see the connectivity sketch after this list)
  – Alternate path for production IP traffic
  – Less expensive routers/switches
  – Initial configuration targeted at LHC, which is also the first step toward the general configuration that will address all SC requirements
  – Can meet other unknown bandwidth requirements by adding lambdas
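To make the "protection against hub failure" point above concrete, here is a toy reachability check; the topology, node names, and links are invented for illustration and are not ESnet's actual footprint. A site dual-attached through a MAN ring to both the IP core and the SDN core stays reachable when any single hub fails.

```python
# Toy connectivity check: a site dual-attached via a MAN ring to both the IP
# core and the SDN core stays reachable when any single hub fails.
# The topology, node names, and links below are invented for illustration.
from collections import deque

EDGES = [
    ("lab", "man-node-a"), ("lab", "man-node-b"),             # dual site attachment
    ("man-node-a", "ip-hub-1"), ("man-node-b", "sdn-hub-1"),  # MAN ring touches both cores
    ("ip-hub-1", "ip-hub-2"), ("sdn-hub-1", "sdn-hub-2"),     # core links
    ("ip-hub-2", "remote-peer"), ("sdn-hub-2", "remote-peer"),
]


def reachable(edges, src, dst, removed=frozenset()):
    """Breadth-first search over whatever topology survives the failure."""
    adj = {}
    for a, b in edges:
        if a in removed or b in removed:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Fail each hub in turn; the lab should still reach the remote peer every time.
for hub in ("ip-hub-1", "sdn-hub-1", "ip-hub-2", "sdn-hub-2"):
    print(f"{hub} down: lab reachable = {reachable(EDGES, 'lab', 'remote-peer', {hub})}")
```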
5
ESnet Target Architecture: High-reliability IP Core
[Map: the high-reliability IP core spanning Sunnyvale, Denver, Albuquerque, San Diego, LA, New York, and Washington DC; legend marks IP core hubs, SDN hubs, primary DOE Labs, and possible hubs.]
6
ESnet Target Architecture: Metropolitan Area Rings
[Map: metropolitan area rings added to the core footprint (Sunnyvale, Denver, Albuquerque, San Diego, LA, New York, Washington DC); legend marks IP core hubs, SDN hubs, primary DOE Labs, and possible hubs.]
7
ESnet Target Architecture: Loops Off the IP Core
[Map: loops off the backbone (e.g. toward CERN) added to the core footprint; legend marks IP core hubs, SDN hubs, primary DOE Labs, and possible hubs.]
8
ESnet Target Architecture: Science Data Network
[Map: the Science Data Network core over the same footprint (Sunnyvale, Denver, Albuquerque, San Diego, LA, New York, Washington DC); legend marks IP core hubs, SDN hubs, primary DOE Labs, and possible hubs.]
9
ESnet Target Architecture: IP Core + Science Data Network Core + Metro Area Rings
[Map: combined view of the production IP core, the SDN core, metropolitan area rings, loops off the backbone, and international connections at several hubs; legend marks IP core hubs, SDN hubs, primary DOE Labs, possible hubs, and 10-50 Gbps circuits for the production IP core, Science Data Network core, metropolitan area networks, and international connections.]
10
Studying Architectural Alternatives
• ESnet has considered a number of technical variations that could result from the acquisition process
• Dual Carrier Model
  – One carrier provides IP circuits, a second provides SDN circuits
  – Physically diverse hubs, fiber, and conduit
    • Diverse fiber routes in some areas
• Single Carrier Model
  – One carrier provides both SDN and IP circuits
  – Use multiple smaller rings to improve reliability in the face of partition risks
    • In the event of a dual cut, fewer sites are isolated because of the richer cross connections (see the sketch after this list)
    • Multiple lambdas also provide some level of protection
  – May require additional engineering effort, colo space, and equipment to meet the reliability requirements
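The dual-cut claim above can be illustrated with a small enumeration; the ring topologies below are invented for illustration, not the proposed footprint. For every possible simultaneous cut of two links, it counts how many sites are stranded away from the largest surviving piece: a plain ring strands up to half the sites, while the same ring with two cross connections strands at most one.

```python
# Toy comparison (invented topologies): after every possible simultaneous cut
# of two links, how many sites end up isolated from the largest surviving
# piece? Richer cross connections leave fewer sites stranded.
from itertools import combinations


def components(nodes, edges):
    """Return the connected components of an undirected graph."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def worst_isolation(nodes, edges):
    """Largest number of sites cut off from the biggest surviving piece, over all dual cuts."""
    worst = 0
    for cut in combinations(edges, 2):
        remaining = [e for e in edges if e not in cut]
        comps = components(nodes, remaining)
        worst = max(worst, len(nodes) - max(len(c) for c in comps))
    return worst


SITES = [f"s{i}" for i in range(8)]
ring = [(SITES[i], SITES[(i + 1) % 8]) for i in range(8)]
# The same ring with two cross connections, effectively several smaller rings.
ring_with_cross_connects = ring + [(SITES[0], SITES[4]), (SITES[2], SITES[6])]

print("plain ring, worst dual cut isolates:",
      worst_isolation(SITES, ring), "sites")
print("ring with cross connects, worst dual cut isolates:",
      worst_isolation(SITES, ring_with_cross_connects), "sites")
```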
11
Dual Carrier Model
[Map: the SDN core and the IP core provided by two different carriers over the same footprint (Sunnyvale, Denver, Albuquerque, San Diego, LA, New York, Washington DC); legend marks IP core hubs, SDN hubs, primary DOE Labs, possible hubs, and 10-50 Gbps circuits for the production IP core, Science Data Network core, and metropolitan area networks.]
12
Single Carrier Model
[Diagram: a single carrier provides both cores as different lambdas on the same fiber, built from multiple smaller rings spanning Seattle, Boise, Sunnyvale, Denver, Kansas City, Chicago, Cleveland, New York, Washington DC, Atlanta, Jacksonville, Albuquerque, San Diego, and San Antonio. At router+switch sites an IP core router and an SDN core switch connect MAN rings, sites, and peers; switch-only sites carry the SDN core. The legend distinguishes the lambda used for the IP core from the lambdas used for the SDN core.]
13
Reliability
• Reliability within ESnet
  – Robust architecture with redundant equipment to reduce or eliminate the risk of single or multiple failures
• End-to-End Reliability
  – Close planning collaboration with national and international partners
  – Multiple distributed connections with important national and international R&E networks
  – Support for end-to-end measurement and monitoring across multiple domains (perfSONAR)
    • Collaboration between ESnet, GEANT, Internet2, and European NRENs
    • Building measurement infrastructure for use by other monitoring and measurement tools
14
Connectivity
[Map: the target IP core and SDN core with international peerings (CANARIE/Canada, Asia-Pacific, GLORIAD, CERN, GEANT/Europe, Australia, and AMPATH/South America) and high speed cross connects with Abilene gigapops and international peers; legend marks IP core hubs, SDN hubs, primary DOE Labs, and 10-50 Gbps circuits for the production IP core, Science Data Network core, metropolitan area networks, and international connections.]
15
ESnet 2007 SDN+MANs Upgrade Increment
[Map: the 2007 SDN and MAN build-out over NLR, with hubs and sub-hubs at Seattle, Portland, Boise, Sunnyvale, LA, San Diego, Phoenix, Ogden, Denver, Albuquerque, El Paso/Las Cruces, Tulsa, KC, Dallas, San Antonio, Houston, Baton Rouge, Pensacola, Jacksonville, Atlanta, Raleigh, Wash DC, Pittsburgh, Cleveland, Chicago, and NYC, plus CERN-1/2/3 and GÉANT-1/2 connections; legend marks ESnet IP core hubs, IP core sub-hubs, SDN/NLR switch/router hubs, SDN/NLR switch hubs, new hubs, NLR PoPs, the ESnet Science Data Network core (10G/link), CERN/DOE supplied links (10G/link), and international IP connections (10G/link).]
16
ESnet 2008 SDN+MANs Upgrade Increment
[Map: the 2008 increment over the same footprint as 2007, now including PPPL, GA, and an ORNL-ATL link; legend is the same as for the 2007 map.]
17
ESnet 2009 SDN+MANs Upgrade Increment
[Map: the 2009 increment over the same footprint as 2008; legend is the same as for the 2007 map.]
18
ESnet 2010 SDN+MANs Upgrade Increment
(Up to nine rings can be supported with the hub implementation)
[Map: the 2010 increment over the same footprint; legend additionally marks SDN links added since the last presentation to DOE, along with ESnet IP core hubs, SDN/NLR switch/router hubs, SDN/NLR switch hubs, new hubs, NLR PoPs, the ESnet Science Data Network core (10G/link), CERN/DOE supplied links (10G/link), and international IP connections (10G/link).]
19
Bandwidth and Footprint Goal – 2010
(160-400 Gbps in 2011 with equipment upgrade)
[Map: 2010 target bandwidths: Science Data Network core at 30 Gbps, IP core at 10 Gbps, metropolitan area rings at 20+ Gbps, and 30 Gbps to CERN, with peerings to Canada (CANARIE), Asia-Pacific, GLORIAD, Australia, Europe (GEANT), and South America (AMPATH), plus high speed cross connects with I2/Abilene; legend marks IP core hubs, SDN hubs, primary DOE Labs, possible new hubs, the production IP core, SDN core, MANs, and international connections.]
20
OSCARS: Guaranteed Bandwidth Virtual Circuit Service
• ESnet On-demand Secured Circuits and Advanced Reservation System (OSCARS)
• To ensure compatibility, the design and implementation are done in collaboration with the other major science R&E networks and end sites
  – Internet2: Bandwidth Reservation for User Work (BRUW)
  – GEANT: Bandwidth on Demand (GN2-JRA3), Performance and Allocated Capacity for End-users (SA3-PACE), and Advance Multi-domain Provisioning System (AMPS), which extends to NRENs
  – BNL: TeraPaths - A QoS Enabled Collaborative Data Sharing Infrastructure for Peta-scale Computing Research
  – GA: Network Quality of Service for Magnetic Fusion Research
  – SLAC: Internet End-to-end Performance Monitoring (IEPM)
  – USN: Experimental Ultra-Scale Network Testbed for Large-Scale Science
• Development of a common code base
• Its current phase is a research project funded by the Office of Science, Mathematical, Information, and Computational Sciences (MICS) Network R&D Program
• A prototype service has been deployed as a proof of concept (a sketch of what a reservation request might look like follows below)
  – To date, more than 20 accounts have been created for beta users, collaborators, and developers
  – More than 100 reservation requests have been processed
  – BRUW interoperability tests successful
  – DRAGON interoperability tests planned
  – GEANT (AMPS) interoperability tests planned
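As a rough illustration of the kind of request an advance-reservation service like OSCARS handles, here is a minimal sketch; the field names, endpoints, and VLAN labels are hypothetical and do not reflect the actual OSCARS interface or message format.

```python
# Hypothetical sketch of an OSCARS-style advance reservation for a guaranteed
# bandwidth circuit. Field names, endpoints, and VLAN labels are illustrative
# placeholders, not the actual OSCARS interface.
from dataclasses import dataclass, asdict
from datetime import datetime, timedelta
import json


@dataclass
class CircuitReservation:
    src: str              # source edge interface/VLAN (placeholder naming)
    dst: str              # destination edge interface/VLAN (placeholder naming)
    bandwidth_mbps: int   # guaranteed bandwidth to reserve
    start: str            # ISO 8601 start of the reservation window
    end: str              # ISO 8601 end of the reservation window
    description: str


def build_request(src: str, dst: str, mbps: int, hours: int, note: str) -> dict:
    """Assemble an advance-reservation request for a guaranteed-bandwidth circuit."""
    start = datetime.utcnow() + timedelta(minutes=10)
    end = start + timedelta(hours=hours)
    resv = CircuitReservation(src, dst, mbps, start.isoformat(), end.isoformat(), note)
    return asdict(resv)


if __name__ == "__main__":
    # Example: reserve 2 Gbps for 6 hours between two hypothetical edge points.
    request = build_request("fnal-mr1:vlan-a", "chi-sl-sdn1:vlan-a", 2000, 6,
                            "Tier-1 transfer window (illustrative)")
    print(json.dumps(request, indent=2))
```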
21
ESnet Virtual Circuit Service Roadmap
[Timeline, 2005 through 2008, from initial production service to full production service:]
• Dedicated virtual circuits
• Dynamic provisioning of Multi-Protocol Label Switching (MPLS) circuits (Layer 3)
• Dynamic virtual circuit allocation
• Interoperability between VLANs and MPLS circuits (Layer 2 & 3)
• Generalized MPLS (GMPLS)
• Interoperability between GMPLS circuits, VLANs, and MPLS circuits (Layer 1-3)
22
ESnet Portions of LHCOPN Circuits
• Endpoints are VLANs on a trunk
  – BNL and FERMI will see 3 Ethernet VLANs from ESnet
  – CERN will see 3 VLANs on both interfaces from USLHCnet
• Will be dynamic Layer 2 circuits using AToM (a sketch of the circuit composition follows below)
  – Virtual interfaces on the ends will be tied to VRFs
  – VRFs for each circuit will be tied together using an MPLS LSP or LDP
  – Manually configured
    • Dynamic provisioning of circuits with these capabilities is on the OSCARS roadmap for 2008
• USLHCnet portion will be static initially
  – They may explore using per-VLAN spanning tree
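The sketch below models how each circuit described above is composed: a VLAN endpoint on a trunk at each end, a VRF bound to each virtual interface, and an MPLS LSP tying the two VRFs together. This is not ESnet provisioning code; the VLAN IDs, VRF names, LSP name, and router pairing are placeholders.

```python
# Illustrative model of how each LHCOPN circuit is composed: a VLAN endpoint on
# a trunk at each end, a VRF bound to each virtual interface, and an MPLS LSP
# tying the two VRFs together. All identifiers below are placeholders.
from dataclasses import dataclass


@dataclass
class Endpoint:
    router: str   # edge router carrying the trunk
    vlan: int     # Ethernet VLAN presented on the trunk
    vrf: str      # VRF the virtual interface is placed in


@dataclass
class Layer2Circuit:
    name: str
    a_end: Endpoint
    z_end: Endpoint
    lsp: str      # MPLS LSP (or LDP-signaled path) joining the two VRFs


# Hypothetical example: one of the three VLANs handed to a Tier-1 site.
primary = Layer2Circuit(
    name="Tier-1 primary (illustrative)",
    a_end=Endpoint(router="site-edge-router", vlan=101, vrf="lhcopn-primary"),
    z_end=Endpoint(router="esnet-edge-router", vlan=101, vrf="lhcopn-primary"),
    lsp="site-to-exchange-lsp",
)

print(f"{primary.name}: VLAN {primary.a_end.vlan} on {primary.a_end.router} "
      f"<- {primary.lsp} -> VLAN {primary.z_end.vlan} on {primary.z_end.router}")
```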
23
Physical Connections
[Diagram (JCM, version 6.0, March 29th): physical connections for the LHCOPN circuits, grouped by owner and location. BNL: bnl-mr1 (Cisco 6500). ESnet New York: aoa-mr1 (Cisco 6500) and aoa-cr1 (Juniper T320), with MANLAN (Cisco 6500), CANARIE, and GEANT. FERMI: fnal-mr1 (Cisco 6500) and fnal-rt1 (M20). ESnet Chicago: chi-cr1 (T320), chi-sl-sdn1 (T320), and chi-sl-mr1 (Cisco 6500), with Starlight (Force 10 E1200). USLHCnet: e600nyc, e600chi, e600gva1, and e600gva2 (Force 10 E600). CERN: r01ext and r02ext (Cisco 6500), r01lcg and ro2lcg (Force 10 E1200).]
24
BNL LHCOPN Circuits
[Diagram (JCM, version 6.0, March 29th): the BNL primary circuit, 1st and 2nd backup circuits, and a routed path of last resort, running from bnl-mr1 across the LIMAN and the ESnet backbone to the USLHCnet E600s and the CERN routers.]
• The green and yellow circuits are carried across aoa-cr1 until aoa-mr1 is deployed.
• Circuits are carried across the LIMAN in MPLS LSPs and can use either path.
• The circuit carried in an LSP from BNL to Starlight can go either way around the LIMAN and either way around the ESnet backbone between NYC and CHI. It will not go through AOA once the 60 Hudson hub is installed (FY07 or FY08).
25
FERMI LHCOPN Circuits
[Diagram (JCM, version 6.0, March 29th): the FERMI primary circuit, 1st and 2nd backup circuits, and a routed path of last resort, running from fnal-mr1 through the Chicago hubs and Starlight to the USLHCnet E600s and the CERN routers.]
• Only one path across the ESnet links between fnal-mr1 and Starlight is shown for the green and yellow paths, to simplify the diagram. However, they will be riding MPLS LSPs and will automatically fail over to any other available links.
• The red path will also fail over to other circuits in the Chicago area, or to other paths between Chicago and New York, if available.
• Note: the green primary circuit will be 10 Gbps across ESnet. The red path is shared with other production traffic.
26
Outstanding Issues
• Is a single point of failure at the Tier 1 edges a reasonable long-term design?
• Bandwidth guarantees in outage scenarios
  – How do the networks signal to the applications that something has failed?
  – How do sites sharing a link during a failure coordinate BW utilization?
• What expectations should be set for fail-over times?
  – Should BGP timers be tuned?
• We need to monitor the backup paths' ability to transfer packets end-to-end to ensure they will work when needed.
  – How are we going to do it? (One possible approach is sketched below.)
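One possible approach to the last question, sketched under assumptions: periodically probe a test address that is reachable only via each backup path and raise an alarm when probes stop getting through. The target host names, probe method, and interval are placeholders, not an agreed design.

```python
# Sketch of a backup-path liveness monitor: periodically send lightweight
# probes to test hosts reachable only via the backup paths and alert when
# they stop getting through. Host names and thresholds are hypothetical.
import subprocess
import time

BACKUP_PATH_TARGETS = {
    "BNL 1st backup": "test-host-bnl-backup.example.org",
    "FNAL 1st backup": "test-host-fnal-backup.example.org",
}
PROBE_INTERVAL_SECONDS = 300


def probe(host: str) -> bool:
    """Return True if a single ICMP echo to the host succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "5", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def main() -> None:
    while True:
        for name, host in BACKUP_PATH_TARGETS.items():
            if not probe(host):
                # In practice this would feed an alarm or monitoring system.
                print(f"ALERT: backup path '{name}' failed probe to {host}")
        time.sleep(PROBE_INTERVAL_SECONDS)


if __name__ == "__main__":
    main()
```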
27