Transcript 20050718-ESNet-Johnston

The Evolution of ESnet
Joint Techs Summary
William E. Johnston
ESnet Manager and Senior Scientist
Lawrence Berkeley National Laboratory
([email protected], www.es.net)
1
What Does ESnet Provide?
• The purpose of ESnet is to support the science missions of the Department of Energy's Office of Science, as well as other parts of DOE (mostly NNSA). To this end ESnet provides
  o Comprehensive physical and logical connectivity
    - High bandwidth access to DOE sites and DOE's primary science collaborators – the Research and Education institutions in the US, Europe, Asia Pacific, and elsewhere
    - Full access to the global Internet for DOE Labs (160,000 routes from 180 peers at 40 peering points)
  o An architecture designed to move huge amounts of data between a small number of sites that are scattered all over the world
  o Full ISP services
2
ESnet Provides High-Speed Internet Connectivity to DOE Facilities and Collaborators, Summer 2005
Map: the ESnet IP core (Qwest, packet over SONET optical ring and hubs) and the ESnet Science Data Network (SDN) core connect 42 end user sites – Office of Science sponsored (22), NNSA sponsored (12), joint sponsored (3), laboratory sponsored (6), and other sponsored (NSF LIGO, NOAA) – along with commercial and R&E peering points, high-speed peering points with Internet2/Abilene, and international connections (CA*net4, GEANT, SINet (Japan), GLORIAD, Kreonet2, TANet2, Singaren, MREN, StarTap, and CERN/LHCnet, which is in part DOE funded).
Link legend: 10 Gb/s SDN core, 10 Gb/s IP core, 2.5 Gb/s IP core, MAN rings (≥ 10 Gb/s), OC12 ATM (622 Mb/s), OC12 / GigEthernet, OC3 (155 Mb/s), 45 Mb/s and less.
DOE Office of Science Drivers for Networking
• The large-scale science that is the mission of the Office of Science is dependent on networks for
  o Sharing of massive amounts of data
  o Supporting thousands of collaborators world-wide
  o Distributed data processing
  o Distributed simulation, visualization, and computational steering
  o Distributed data management
• These issues were explored in two Office of Science workshops that formulated networking requirements to meet the needs of the science programs (see refs.)
4
Evolving Quantitative Science Requirements for Networks
Science Area (Nuclear Physics and Supercomputing were not considered in the workshop) | Today: End2End Throughput | 5 years: End2End Documented Throughput Requirements | 5-10 years: End2End Estimated Throughput Requirements | Remarks
High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | high bulk throughput
Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | high bulk throughput
SNS NanoScience | Not yet started | 1 Gb/s | 1000 Gb/s + QoS for control channel | remote control and time critical throughput
Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.198 Gb/s (500 MB / 20 sec. burst) | N x 1000 Gb/s | time critical throughput
Astrophysics | 0.013 Gb/s (1 TBy/week) | N*N multicast | 1000 Gb/s | computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TBy/day) | 100s of users | 1000 Gb/s + QoS for control channel | high throughput and steering
5
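As a quick check on the units in the table above, the steady-state rates follow directly from the data volumes quoted in parentheses. A minimal sketch of that arithmetic in Python (the conversion constants are standard; the volumes are the table's own figures):

# Convert the data volumes quoted in the table into average rates in Gb/s.
def gbps(total_bytes: float, seconds: float) -> float:
    """Average rate in Gb/s for moving total_bytes bytes in the given time."""
    return total_bytes * 8 / seconds / 1e9

TB = 1e12            # 1 terabyte in bytes
DAY = 86_400         # seconds per day
WEEK = 7 * DAY

print(f"1 TBy/week        ~ {gbps(TB, WEEK):.3f} Gb/s")   # ~0.013 (Astrophysics row)
print(f"1 TBy/day         ~ {gbps(TB, DAY):.3f} Gb/s")    # ~0.093 (Genomics row, listed as 0.091)
print(f"500 MB in 20 sec. ~ {gbps(500e6, 20):.3f} Gb/s")  # ~0.200 (Fusion row, listed as 0.198)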
Observed Drivers for the Evolution of ESnet
ESnet is currently transporting about 530 Terabytes/mo., and this volume is increasing exponentially.
Chart: ESnet Monthly Accepted Traffic, Feb. 1990 – May 2005 (TBytes/Month).
6
Observed Drivers for the Evolution of ESnet
ESnet traffic has increased by 10X every 46 months, on average,
since 1990
Chart (log scale): ESnet Monthly Accepted Traffic through May 2005 (TBytes/Month), with 10X milestones at Aug. 1990, Oct. 1993 (39 months later), Jul. 1998 (57 months later), and Dec. 2001 (42 months later); exponential fit R² = 0.9903.
7
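The "10X every 46 months" trend corresponds to roughly 5% compound growth per month. A small sketch of that arithmetic, projecting forward from the ~530 TBytes/month level of May 2005 (the projection horizon is arbitrary and purely illustrative):

# Historical trend from the chart: traffic grows 10X every ~46 months.
monthly_factor = 10 ** (1 / 46)     # ~1.051, i.e. about 5.1% growth per month
print(f"implied monthly growth: {(monthly_factor - 1) * 100:.1f}%")

current_tb_per_month = 530          # approximate May 2005 level
for years in (1, 2, 5):
    projected = current_tb_per_month * monthly_factor ** (12 * years)
    print(f"~{projected:,.0f} TBytes/month after {years} year(s) at the historical rate")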
ESnet Science Traffic
• The top 100 ESnet flows consistently account for 25% - 40% of ESnet's monthly total traffic – these are the result of DOE's Office of Science large-scale science projects
  o The top 100 flows are 100-150 Terabytes out of about 550 Terabytes
  o The other 60-75% of the ESnet monthly traffic is in 6,000,000,000 flows (see the rough per-flow arithmetic below)
• As LHC (the CERN high energy physics accelerator) data starts to move, the large science flows will increase a lot (200-2000 times)
  o Both LHC US tier 1 data centers are at DOE Labs – Fermilab and Brookhaven
    - All of the data from the two major LHC experiments – CMS and Atlas – will be stored at these centers for analysis by groups at US universities
8
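A rough sense of scale for the per-flow arithmetic referenced above, using only the approximate figures on this slide (so the results are order-of-magnitude at best):

# Approximate figures from this slide (all values are rough monthly numbers).
total_tb = 550                  # ~550 TBytes/month total accepted traffic
top100_tb = (100 + 150) / 2     # the top 100 flows carry 100-150 TBytes/month
other_flows = 6_000_000_000     # the remainder is spread across ~6 billion flows

avg_top_flow_tb = top100_tb / 100
avg_other_flow_kb = (total_tb - top100_tb) * 1e12 / other_flows / 1e3

print(f"average top-100 flow:   ~{avg_top_flow_tb:.2f} TBytes/month")   # ~1.25 TBytes/month
print(f"average remaining flow: ~{avg_other_flow_kb:.0f} KB")           # ~70 KB per flow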
Source and Destination of the Top 30 Flows, Feb. 2005
Chart: bar chart of monthly volume (0-12 Terabytes/Month) for each flow, categorized as DOE Lab-International R&E, Lab-U.S. R&E (domestic), Lab-Lab (domestic), or Lab-Comm. (domestic).
- SLAC (US) → RAL (UK)
- Fermilab (US) → WestGrid (CA)
- SLAC (US) → IN2P3 (FR)
- LIGO (US) → Caltech (US)
- SLAC (US) → Karlsruhe (DE)
- LLNL (US) → NCAR (US)
- SLAC (US) → INFN CNAF (IT)
- Fermilab (US) → MIT (US)
- Fermilab (US) → SDSC (US)
- Fermilab (US) → Johns Hopkins
- Fermilab (US) → Karlsruhe (DE)
- IN2P3 (FR) → Fermilab (US)
- LBNL (US) → U. Wisc. (US)
- Fermilab (US) → U. Texas, Austin (US)
- BNL (US) → LLNL (US)
- BNL (US) → LLNL (US)
- Fermilab (US) → UC Davis (US)
- Qwest (US) → ESnet (US)
- Fermilab (US) → U. Toronto (CA)
- BNL (US) → LLNL (US)
- BNL (US) → LLNL (US)
- CERN (CH) → BNL (US)
- NERSC (US) → LBNL (US)
- DOE/GTN (US) → JLab (US)
- U. Toronto (CA) → Fermilab (US)
- NERSC (US) → LBNL (US)
- NERSC (US) → LBNL (US)
- NERSC (US) → LBNL (US)
- NERSC (US) → LBNL (US)
- CERN (CH) → Fermilab (US)
9
DOE Science Requirements for Networking
1) Network bandwidth must increase substantially, not just in the backbone but all the way to the sites and the attached computing and storage systems
2) A highly reliable network is critical for science – when large-scale experiments depend on the network for success, the network must not fail
3) There must be network services that can guarantee various forms of quality-of-service (e.g., bandwidth guarantees) and provide traffic isolation
4) A production, extremely reliable, IP network with Internet services must support the process of science
10
Strategy For The Evolution of ESnet
A three part strategy for the evolution of ESnet:
1) Metropolitan Area Network (MAN) rings to provide
   - dual site connectivity for reliability
   - much higher site-to-core bandwidth
   - support for both production IP and circuit-based traffic
2) A Science Data Network (SDN) core for
   - provisioned, guaranteed bandwidth circuits to support large, high-speed science data flows
   - very high total bandwidth
   - multiply connecting MAN rings for protection against hub failure
   - alternate path for production IP traffic
3) A high-reliability IP core (e.g. the current ESnet core) to address
   - general science requirements
   - Lab operational requirements
   - backup for the SDN core
   - vehicle for science services
11
Strategy For The Evolution of ESnet:
Two Core Networks and Metro. Area Rings
Map: the production IP core (Qwest) and the Science Data Network core (SDN, NLR circuits) run between hubs at Sunnyvale, LA, San Diego, Albuquerque, El Paso, New York, and Washington, DC, with Metropolitan Area Rings connecting the primary DOE Labs, Lab supplied links, and international connections to CERN, GEANT (Europe), Asia-Pacific, and Australia. Legend: IP core hubs, SDN/NLR hubs, primary DOE Labs, new hubs.
First Two Steps in the Evolution of ESnet
1) The SF Bay Area MAN will provide to the five OSC Bay Area sites
   o Very high speed site access – 20 Gb/s
   o Fully redundant site access
2) The first two segments of the second national 10 Gb/s core – the Science Data Network – will be San Diego to Sunnyvale to Seattle
13
ESnet SF Bay Area MAN Ring (Sept., 2005)
• 2 λs (2 x 10 Gb/s channels) in a ring configuration, delivered as 10 GigEthernet circuits – ~10-50X current site bandwidth
   λ1: production IP
   λ2: SDN/circuits
   λ3: future
   λ4: future
• Will be used as a 10 Gb/s production IP ring and 2 x 10 Gb/s paths (for circuit services) to each site
• Dual site connection (independent "east" and "west" connections) to each site
• Qwest contract signed for two lambdas 2/2005, with options on two more
• Project completion date is 9/2005
Diagram: the SF Bay Area MAN ring (Qwest circuits, 10 Gb/s optical channels) links LBNL, NERSC, the Joint Genome Institute, LLNL, SNLL, and SLAC via the Qwest/ESnet and Level 3 hubs; also shown are NASA Ames, the DOE Ultra Science Net (research net), SDN links to Seattle (NLR) and San Diego, and IP core links to Chicago (Qwest) and El Paso.
SF Bay Area MAN – Typical Site Configuration
Diagram: at each site an ESnet 6509 switch terminates the "west" and "east" λ1/λ2 connections of the SF Bay Area MAN. It delivers 0-10 Gb/s of drop-off IP traffic to the site LAN (n x 1 GE or 10 GE) and 0-20 Gb/s of VLAN traffic to the site (1 or 2 x 10 GE, provisioned circuits via VLANs), while carrying 0-10 Gb/s of pass-through IP traffic and 0-10 Gb/s of pass-through VLAN traffic around the ring. The switch uses 24 x 1 GE line cards and 4 x 10 GE line cards (using 2 ports max. per card).
15
Evolution of ESnet – Step One:
SF Bay Area MAN and West Coast SDN
Map: the same national view as in the strategy slide – the production IP core (Qwest), the Science Data Network core (SDN, NLR circuits), Metropolitan Area Rings, Lab supplied links, and international connections to CERN, GEANT (Europe), Asia-Pacific, and Australia – with the SF Bay Area MAN and the west coast SDN segments marked "in service by Sept., 2005" and the remainder marked "planned". Legend: IP core hubs, SDN/NLR hubs, primary DOE Labs, new hubs.
Evolution – Next Steps
• ORNL 10G circuit to Chicago
• Chicago MAN
  o IWire partnership
• Long Island MAN
  o Try and achieve some diversity in NYC by including a hub at 60 Hudson as well as 32 AoA
• More SDN segments
• Jefferson Lab via MATP and VORTEX
17
New Network Services
• New network services are also critical for ESnet to meet the needs of large-scale science
• The most important new network service is dynamically provisioned virtual circuits that provide
  o Traffic isolation
    - will enable the use of high-performance, non-standard transport mechanisms that cannot co-exist with commodity TCP-based transport (see, e.g., Tom Dunigan's compendium, http://www.csm.ornl.gov/~dunigan/netperf/netlinks.html)
  o Guaranteed bandwidth
    - the only way we currently have to address deadline scheduling – e.g. where fixed amounts of data have to reach sites on a fixed schedule so that the processing does not fall so far behind that it could never catch up – very important for experiment data analysis (see the sketch below)
18
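The deadline-scheduling requirement reduces to a simple relation: the guaranteed rate must be at least the data volume divided by the time remaining before the deadline. A minimal sketch (the 10 TB / 24 hour figures are a hypothetical example, not from the talk):

# Minimum guaranteed bandwidth needed to move a fixed amount of data by a deadline.
def required_gbps(data_terabytes: float, hours_until_deadline: float) -> float:
    bits = data_terabytes * 1e12 * 8
    seconds = hours_until_deadline * 3600
    return bits / seconds / 1e9

# Hypothetical example: 10 TB of experiment data must arrive within 24 hours.
print(f"{required_gbps(10, 24):.2f} Gb/s sustained")   # ~0.93 Gb/s

Best-effort IP gives no such floor on a congested path, which is why a guaranteed-bandwidth virtual circuit is needed when the analysis schedule cannot be allowed to slip.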
OSCARS: Guaranteed Bandwidth Service
• Testing OSCARS Label Switched Paths (MPLS-based virtual circuits) – an illustrative reservation sketch follows this slide
  o (update in the panel discussion)
• A collaboration with the other major science R&E networks to ensure compatible services (so that virtual services can be set up end-to-end across ESnet, Abilene, and GEANT)
  o code is being jointly developed with Internet2's Bandwidth Reservation for User Work (BRUW) project – part of the Abilene HOPI (Hybrid Optical-Packet Infrastructure) project
  o Close cooperation with the GEANT virtual circuit project ("lightpaths" – Joint Research Activity 3 project)
19
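The talk does not specify the OSCARS interface beyond "MPLS-based virtual circuits", but the reservation sketch referenced above would, at minimum, carry the endpoints, the guaranteed rate, and the time window. The field and router names below are hypothetical and do not reflect the actual OSCARS or BRUW API:

# Hypothetical shape of a guaranteed-bandwidth circuit reservation (illustrative only).
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class CircuitReservation:
    source: str             # ingress site/router (hypothetical name)
    destination: str        # egress site/router (hypothetical name)
    bandwidth_gbps: float   # guaranteed rate for the label switched path
    start: datetime
    end: datetime

request = CircuitReservation(
    source="fnal-rt1",
    destination="cern-lhcnet-rt1",
    bandwidth_gbps=2.0,
    start=datetime(2005, 9, 1, 0, 0),
    end=datetime(2005, 9, 1, 0, 0) + timedelta(hours=12),
)
print(request)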
Federated Trust Services
• Remote, multi-institutional identity authentication is critical for distributed, collaborative science in order to permit sharing of computing and data resources and other Grid services
• Managing cross-site trust agreements among many organizations is crucial for authentication in collaborative environments
  o ESnet assists in negotiating and managing the cross-site, cross-organization, and international trust relationships to provide policies that are tailored to collaborative science
• The form of the ESnet trust services is driven entirely by the requirements of the science community and direct input from the science community
20
ESnet Public Key Infrastructure
• ESnet provides Public Key Infrastructure and X.509 identity certificates that are the basis of secure, cross-site authentication of people and Grid systems
• These services (www.doegrids.org) provide
  o Several Certification Authorities (CAs) with different uses and policies that issue certificates after validating the request against policy
• This service was the basis of the first routine sharing of HEP computing resources between the US and Europe
21
ESnet Public Key Infrastructure
• The ESnet root CA is kept off-line in a vault
• Subordinate CAs are kept in locked, alarmed racks in an access-controlled machine room and have dedicated firewalls
• CAs with different policies as required by the science community:
  o DOEGrids CA has a policy tailored to accommodate international science collaboration
  o NERSC CA policy integrates CA and certificate issuance with NIM (NERSC user accounts management services)
  o FusionGrid CA supports the FusionGrid roaming authentication and authorization services, providing complete key lifecycle management
Diagram: the ESnet root CA with subordinate CAs – DOEGrids CA, NERSC CA, FusionGrid CA, and others.
22
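For illustration of how a relying party uses the certificates issued by these CAs, a minimal verification sketch using the modern Python "cryptography" package (which post-dates this talk); the file names are hypothetical and an RSA-signed certificate is assumed:

# Check that a user certificate was issued by the expected CA and is still valid.
from datetime import datetime
from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import padding

ca_cert = x509.load_pem_x509_certificate(open("doegrids-ca.pem", "rb").read())
user_cert = x509.load_pem_x509_certificate(open("user-cert.pem", "rb").read())

# 1) The issuer of the user certificate must match the CA's subject name.
assert user_cert.issuer == ca_cert.subject, "not issued by this CA"

# 2) The CA's public key must verify the signature over the to-be-signed portion
#    of the user certificate (RSA with PKCS#1 v1.5 padding assumed here).
ca_cert.public_key().verify(
    user_cert.signature,
    user_cert.tbs_certificate_bytes,
    padding.PKCS1v15(),
    user_cert.signature_hash_algorithm,
)

# 3) The certificate must be within its validity period.
now = datetime.utcnow()
assert user_cert.not_valid_before <= now <= user_cert.not_valid_after, "expired"
print("certificate chains to the CA and is currently valid")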
DOEGrids CA (one of several CAs) Usage Statistics
Chart: cumulative number of certificates and requests per month, Jan. 2003 – Jan. 2005 (user certificates, service certificates, expired (+revoked) certificates, total certificates issued, total certificate requests; y-axis 0 to 5250).
Production service began in June 2003.
User Certificates: 1386
Service Certificates: 2168
Host/Other Certificates: 15
Internal PKI SSL Server certificates: 36
Total No. of Certificates: 3569
Total No. of Requests: 4776
* FusionGRID CA certificates not included here.
* Report as of Jan 11, 2005
23
DOEGrids CA Usage – Virtual Organization Breakdown
DOEGrids CA Statistics (Total Certs 3569):
- ANL: 4.3%
- DOESG: 0.5%
- ESG: 1.0%
- ESnet: 0.6%
- FusionGRID: 7.4%
- *Others: 38.9%
- *iVDGL: 17.9%
- LBNL: 1.8%
- NERSC: 4.0%
- LCG: 0.3%
- NCC-EPA: 0.1%
- FNAL: 8.6%
- PNNL: 0.6%
- PPDG: 13.4%
- ORNL: 0.7%
(* DOE-NSF collab.)
24
North American Policy Management Authority
• The Americas Grid Policy Management Authority
• An important step toward regularizing the management of trust in the international science community
• Driven by European requirements for a single Grid Certificate Authority policy representing scientific/research communities in the Americas
• Investigate cross-signing and CA hierarchy support for the science community
• Investigate alternative authentication services
• Peer with the other Grid Regional Policy Management Authorities (PMAs)
  o European Grid PMA [www.eugridpma.org]
  o Asia Pacific Grid PMA [www.apgridpma.org]
• Started in Fall 2004 [www.TAGPMA.org]
• Founding members
  o DOEGrids (ESnet)
  o Fermi National Accelerator Laboratory
  o SLAC
  o TeraGrid (NSF)
  o CANARIE (Canadian national R&E network)
25
Federated Crypto Token Services
• Strong authentication is needed to reduce the risk of identity theft
  o Identity theft was the mechanism of the successful attacks on US supercomputers in spring 2004
• RADIUS Authentication Fabric pilot project (RAF)
  o For enabling strong, cross-site authentication
  o Cryptographic tokens (e.g. RSA SecurID cards) are effective, but every site uses a different approach
    - ESnet has developed a federation service for crypto tokens (SecurID, CRYPTOCard, etc.)
26
ESnet RADIUS Authentication Fabric
• What is the RAF?
  o Access to an application (e.g. system login) is based on authentication info. provided by the token and the user's home site identity
  o A hierarchy of RADIUS servers that
    - route authentication queries from an application (e.g. a login process) at one site to a One-Time Password (OTP) service at the user's home site
    - the home site can then authenticate the user
    - outsourcing the routing reduces inter-site connection management from O(n²) to O(n) (see the sketch below)
  o A collection of cross-site trust agreements
27
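A minimal sketch of the routing idea behind the RAF, as referenced above: the fabric maps the user's home realm to that site's OTP service, so each site maintains one connection to the hierarchy rather than one per peer site. The realm names and server addresses below are hypothetical:

# Realm-based routing in a RADIUS hierarchy: an authentication request tagged with
# the user's home realm is forwarded to that site's One-Time Password service.
REALM_ROUTES = {
    "lbl.gov": "otp.lbl.gov",       # hypothetical OTP servers
    "fnal.gov": "otp.fnal.gov",
    "bnl.gov": "otp.bnl.gov",
}

def route_auth_request(username: str) -> str:
    """Return the home-site OTP server that should validate this one-time password."""
    _user, _, realm = username.partition("@")    # e.g. "alice@lbl.gov"
    if realm not in REALM_ROUTES:
        raise ValueError(f"no trust agreement on file for realm {realm!r}")
    return REALM_ROUTES[realm]

# A login process at one site asks the fabric to authenticate a visiting LBNL user:
print(route_auth_request("alice@lbl.gov"))       # -> otp.lbl.gov

With n participating sites, each site peers once with the hierarchy (O(n) links in total) instead of maintaining n-1 pairwise connections (O(n²)), which is the reduction noted on the slide.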
References – DOE Network Related Planning Workshops
1) High Performance Network Planning Workshop, August 2002
   http://www.doecollaboratory.org/meetings/hpnpw
2) DOE Science Networking Roadmap Meeting, June 2003
   http://www.es.net/hypertext/welcome/pr/Roadmap/index.html
3) DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003
   http://www.csm.ornl.gov/ghpn/wk2003
4) Science Case for Large Scale Simulation, June 2003
   http://www.pnl.gov/scales/
5) Workshop on the Road Map for the Revitalization of High End Computing, June 2003
   http://www.cra.org/Activities/workshops/nitrd
   http://www.sc.doe.gov/ascr/20040510_hecrtf.pdf (public report)
6) ASCR Strategic Planning Workshop, July 2003
   http://www.fp-mcs.anl.gov/ascr-july03spw
7) Planning Workshops – Office of Science Data-Management Strategy, March & May 2004
   http://www-conf.slac.stanford.edu/dmw2004
28