Transcript Summary

The Evolution of ESnet
(Summary)
William E. Johnston
ESnet Manager and Senior Scientist
Lawrence Berkeley National Laboratory
Summary – 1
• ESnet’s mission to support the large-scale science of the DOE Office of Science results in a unique network
  o The top 100 data flows each month account for about 25-40% of the total monthly network traffic – that is, 100-150 Terabytes out of about 450 Terabytes (450,000,000 Megabytes)
  o These top 100 flows represent massive data flows from science experiments to analysis sites and back
• At the same time ESnet supports all of the other DOE collaborative science and the Lab operations
  o The other 60-75% of the ESnet monthly traffic is in 6,000,000,000 flows
Summary – 2
• ESnet must have an architecture that provides very high capacity and high reliability at the same time
  o Science demands both
  o Lab operations demand high reliability
• To meet the challenge of DOE science ESnet has developed a new architecture that has two national core rings and many metropolitan area rings
  o One national core is specialized for massive science data
  o The other core is for general science and Lab operations
  o Each core is designed to provide backup for the other
  o The metropolitan rings reliably connect the Labs at high speed to the two cores and connect the cores together
Summary – 3
• ESnet is designed, built, and operated as a collaboration between ESnet and the DOE science community and the Labs
  o ESnet planning, configuration, and even operation have input and active participation from the DOE science community and the DOE Labs
• ESnet also provides a collection of value-added services (“science services”) that support the process of DOE collaborative science
  o National and international trust management services for strong user authentication across all DOE science collaborators (Lab, US university, and international research and education institutions)
  o Audio and video conferencing that can be scheduled worldwide
Summary – 4
• Taken together, these points demonstrate that ESnet is an evolving creation that is uniquely tailored to meet the needs of the large-scale science of the Office of Science
  o This is not a network that can be purchased from a commercial telecom – and if it were, it would be very expensive
  o A very specialized set of services has been combined into a unique facility to support the Office of Science mission
DOE Office of Science Drivers for Networking
• The large-scale science that is the mission of the Office of Science is dependent on networks for
  o Sharing of massive amounts of data
  o Supporting thousands of collaborators world-wide
  o Distributed data processing
  o Distributed simulation, visualization, and computational steering
  o Distributed data management
• These issues were explored in two Office of Science workshops that formulated networking requirements to meet the needs of the science programs (see refs.)
Science Requirements for Networking
• August, 2002 Workshop, organized by the Office of Science
  o Mary Anne Scott (Chair), Dave Bader, Steve Eckstrand, Marvin Frazier, Dale Koelling, Vicky White
  o Workshop Panel Chairs: Ray Bair, Deb Agarwal, Bill Johnston, Mike Wilde, Rick Stevens, Ian Foster, Dennis Gannon, Linda Winkler, Brian Tierney, Sandy Merola, and Charlie Catlett
• The network and middleware requirements to support DOE science were developed by the OSC science community representing the major DOE science disciplines:
  o Climate simulation
  o Spallation Neutron Source facility
  o Macromolecular Crystallography
  o High Energy Physics experiments
  o Magnetic Fusion Energy Sciences
  o Chemical Sciences
  o Bioinformatics
  (The major supercomputing facilities and Nuclear Physics were considered separately)
• Conclusions: the network is essential for
  o long term (final stage) data analysis and collaboration
  o “control loop” data analysis (influence an experiment in progress)
  o distributed, multidisciplinary simulation
• Available at www.es.net/#research
Evolving Quantitative Science Requirements for Networks
Science areas considered in the workshop (not Nuclear Physics and Supercomputing), with end-to-end throughput today, the documented 5-year requirement, and the estimated 5-10 year requirement:
• High Energy Physics: 0.5 Gb/s today; 100 Gb/s in 5 years; 1000 Gb/s in 5-10 years – high bulk throughput
• Climate (Data & Computation): 0.5 Gb/s today; 160-200 Gb/s in 5 years; N x 1000 Gb/s in 5-10 years – high bulk throughput
• SNS NanoScience: not yet started today; 1 Gb/s in 5 years; 1000 Gb/s + QoS for control channel in 5-10 years – remote control and time critical throughput
• Fusion Energy: 0.066 Gb/s (500 MB/s burst) today; 0.198 Gb/s (500 MB / 20 sec. burst) in 5 years; N x 1000 Gb/s in 5-10 years – time critical throughput
• Astrophysics: 0.013 Gb/s (1 TBy/week) today; N*N multicast in 5 years; 1000 Gb/s in 5-10 years – computational steering and collaborations
• Genomics Data & Computation: 0.091 Gb/s (1 TBy/day) today; 100s of users in 5 years; 1000 Gb/s + QoS for control channel in 5-10 years – high throughput and steering
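The sustained-rate figures quoted in the table follow from the data volumes given in parentheses. A minimal conversion sketch (our arithmetic, not from the slides; 1 TByte is taken as 10^12 bytes and the helper name is ours):

```python
# Convert a data volume per time period into an average rate in Gb/s
# (an illustration of the parenthetical figures in the table above).

def volume_to_gbps(bytes_total, seconds):
    return bytes_total * 8 / seconds / 1e9

TBY = 1e12  # 1 TByte, taken here as 10^12 bytes
WEEK, DAY = 7 * 24 * 3600, 24 * 3600

print(f"Astrophysics, 1 TBy/week: {volume_to_gbps(1 * TBY, WEEK):.3f} Gb/s")  # ~0.013
print(f"Genomics,     1 TBy/day:  {volume_to_gbps(1 * TBY, DAY):.3f} Gb/s")   # ~0.093 (table: 0.091)
print(f"Fusion, 500 MB / 20 sec:  {volume_to_gbps(500e6, 20):.3f} Gb/s")      # ~0.200 (table: 0.198)
```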
Observed Drivers for the Evolution of ESnet (March, 2005)
ESnet is currently transporting about 430 Terabytes/mo. (= 430,000 Gigabytes/mo. = 430,000,000 Megabytes/mo.), and this volume is increasing exponentially
[Chart: ESnet monthly accepted traffic, Feb. 1990 – Feb. 2005, in TBytes/month, on a linear scale from 0 to 500]
Observed Drivers for the Evolution of ESnet
ESnet traffic has increased by 10X every 46 months, on average,
since 1990
[Chart: ESnet monthly accepted traffic through March, 2005, in TBytes/month on a log scale (0.1 to 1000). Successive 10X increases are marked at Aug. 1990 to Oct. 1993 (39 months), Oct. 1993 to Jul. 1998 (57 months), and Jul. 1998 to Dec. 2001 (42 months); an exponential fit to the data gives R² = 0.99]
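The 10X-per-46-months observation translates directly into an annual growth factor and a doubling time. A minimal sketch of that conversion (our arithmetic, not from the slides):

```python
import math

# Observed: ESnet accepted traffic has grown 10X every 46 months, on average, since 1990.
months_per_10x = 46

annual_growth = 10 ** (12 / months_per_10x)        # ~1.8x per year, i.e. roughly 80%/yr
doubling_months = months_per_10x * math.log10(2)   # ~13.8 months to double

print(f"Implied annual growth factor: {annual_growth:.2f}x")
print(f"Implied doubling time: {doubling_months:.1f} months")
```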
ESnet Science Traffic
• Since SLAC- and FNAL-based high energy physics experiment data analysis started, the top 100 ESnet flows have consistently accounted for 25% - 40% of ESnet’s monthly total traffic
  o Much of this data goes to sites in Europe for analysis
• As LHC (the CERN high energy physics accelerator) data starts to move, the large science flows will increase dramatically (200-2000 times)
  o Both of the US LHC tier 1 data centers are at DOE Labs – Fermilab and Brookhaven
    - All of the data from the two major LHC experiments – CMS and Atlas – will be stored at these centers for analysis by groups at US universities
[Chart: Source and Destination of the Top 30 Flows, Feb. 2005, in Terabytes/month. Flows are grouped as DOE Lab-International R&E, Lab-U.S. R&E (domestic), Lab-Lab (domestic), and Lab-Commercial (domestic); the largest are high energy physics flows from SLAC and Fermilab to European R&E sites (RAL, IN2P3, Karlsruhe, INFN CNAF)]
Enabling Future OSC Science:
ESnet’s Evolution over the Next 5-10 Years
• Based both on the
  o projections of the science programs
  o changes in observed network traffic and patterns over the past few years
  it is clear that the network must evolve substantially in order to meet the needs of OSC science
DOE Science Requirements for Networking - 1
The primary network requirements to come out of the Office of Science workshops were:
1) Network bandwidth must increase substantially, not just in the backbone but all the way to the sites and the attached computing and storage systems
  o The 5 and 10 year bandwidth requirements mean that the network bandwidth has to almost double every year (a rough check of this arithmetic is sketched below)
  o Upgrading ESnet to accommodate the anticipated increase from the current 100%/yr traffic growth to 300%/yr over the next 5-10 years is priority number 7 out of 20 in DOE’s “Facilities for the Future of Science – A Twenty Year Outlook”
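As a rough check of the “almost double every year” statement, the sketch below (our own arithmetic, using the High Energy Physics row of the requirements table: 0.5 Gb/s today, 100 Gb/s documented at 5 years, 1000 Gb/s estimated at 5-10 years, taken here at 10 years) computes the implied annual growth factors:

```python
# Rough check (our arithmetic, not from the slides): what annual growth factor
# is implied by the workshop's end-to-end throughput requirements?

today_gbps = 0.5         # High Energy Physics today (from the requirements table)
five_year_gbps = 100.0   # documented 5-year requirement
ten_year_gbps = 1000.0   # estimated 5-10 year requirement, taken at 10 years

growth_5yr = (five_year_gbps / today_gbps) ** (1 / 5)    # ~2.9x per year
growth_10yr = (ten_year_gbps / today_gbps) ** (1 / 10)   # ~2.1x per year

print(f"Implied annual growth, 5-year requirement:  x{growth_5yr:.1f}")
print(f"Implied annual growth, 10-year requirement: x{growth_10yr:.1f}")
# The 10-year figure works out to roughly doubling every year; the documented
# 5-year requirement is steeper still, closer to tripling per year.
```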
DOE Science Requirements for Networking - 2
2) A highly reliable network is critical for science – when large-scale experiments depend on the network for success, the network must not fail
3) There must be network services that can guarantee various forms of quality-of-service (e.g., bandwidth guarantees) – one common mechanism for this is sketched after this list
4) A production, extremely reliable, IP network with Internet services must support the process of science
  o This network must have backup paths for high reliability
  o This network must be able to provide backup paths for large-scale science data movement
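To make the bandwidth-guarantee idea concrete, here is a generic token-bucket sketch – one common building block behind such guarantees. This is not an ESnet service or interface; the class, parameter values, and names are illustrative only:

```python
import time

class TokenBucket:
    """Generic token-bucket shaper: admit traffic up to a sustained rate with a
    bounded burst. Shown only to illustrate the idea of a bandwidth guarantee."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, packet_bytes):
        now = time.monotonic()
        # Refill tokens according to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True   # packet conforms to the guaranteed rate
        return False      # packet exceeds the guarantee; queue or mark it

# Example: a hypothetical 1 Gb/s guarantee with a 10 MB burst allowance.
bucket = TokenBucket(rate_bytes_per_s=125_000_000, burst_bytes=10_000_000)
print(bucket.allow(1_500))  # a 1500-byte packet is admitted
```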
ESnet Evolution
• With the old architecture (in place until 2004) ESnet cannot meet the new requirements
• The current core ring cannot handle the anticipated large science data flows at affordable cost
• The current point-to-point tail circuits to sites are neither reliable nor scalable to the required bandwidth
[Diagram: the old architecture – a single ESnet core ring with hubs at New York (AOA), Washington, DC (DC), Atlanta (ATL), El Paso (ELP), and Sunnyvale (SNV), with the DOE sites attached by point-to-point tail circuits]
ESnet’s Evolution – The Network Requirements
• Based on the growth of DOE large-scale science, and the resulting needs for remote data and experiment management, the architecture of the network must change in order to support the general requirements of both
  1) High-speed, scalable, and reliable production IP networking for
    - University and international collaborator and general science connectivity
    - Highly reliable site connectivity to support Lab operations
    - Global Internet connectivity
  2) High bandwidth data flows of large-scale science
    - Very high-speed network connectivity to specific sites
    - Scalable, reliable, and very high bandwidth site connectivity
    - Provisioned circuits with guaranteed quality of service (e.g. dedicated bandwidth) and traffic isolation
ESnet’s Evolution – The Network Requirements
• In order to meet these requirements, the capacity and connectivity of the network must increase to provide
  o Fully redundant connectivity for every site
  o High-speed access to the core for every site
    - at least 20 Gb/s, generally, and 40-100 Gb/s for some sites
  o 100 Gbps national core/backbone bandwidth by 2008, in two independent backbones
Wide Area Network Technology
[Diagram: an ESnet site connects through its site IP “gateway” router and site LAN, across the site – ESnet network policy demarcation (“DMZ”), to the ESnet border router; a “tail circuit” (“local loop”) then carries the traffic to the ESnet hub router (e.g. Sunnyvale, Chicago, NYC, Washington, Atlanta, Albuquerque), which connects at 10GE to the ESnet core routers on an optical fiber ring]
o The tail circuit usually uses SONET data framing or Ethernet data framing
o “DWDM” (Dense Wavelength Division Multiplexing) provides the circuits – today typically 64 x 10 Gb/s optical channels per fiber, with the channels (referred to as “lambdas”) usually used in bi-directional pairs
o At the hubs, the lambda (optical) channels are converted to electrical channels
o A ring topology network is inherently reliable – all single point failures are mitigated by routing traffic in the other direction around the ring (illustrated in the sketch below)
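The ring-reliability point can be made concrete with a small sketch (ours, not ESnet software; the hub names and the pure-Python connectivity check are illustrative only): removing any single link from a ring still leaves every hub reachable by routing the other way around.

```python
# Minimal illustration (not ESnet software): a ring survives any single link cut
# because traffic can be routed the other way around the ring.

from itertools import combinations

# Hypothetical ring of ESnet-like hubs (names for illustration only)
hubs = ["Sunnyvale", "Albuquerque", "El Paso", "Atlanta", "Washington", "NYC", "Chicago"]

def ring_links(nodes):
    """Links of a ring: each node connects to the next, and the last wraps to the first."""
    return {frozenset((nodes[i], nodes[(i + 1) % len(nodes)])) for i in range(len(nodes))}

def connected(nodes, links):
    """Simple graph search: is every node reachable from the first one?"""
    seen, frontier = {nodes[0]}, [nodes[0]]
    while frontier:
        current = frontier.pop()
        for link in links:
            if current in link:
                (other,) = link - {current}
                if other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return seen == set(nodes)

links = ring_links(hubs)

# Cut each link in turn: the ring stays connected after any single failure.
for cut in links:
    assert connected(hubs, links - {cut})

# Cutting any two links partitions the ring, which is why the two national cores
# and the MAN rings are designed to back each other up.
assert not any(connected(hubs, links - set(pair)) for pair in combinations(links, 2))

print("Any single link failure leaves all", len(hubs), "hubs connected.")
```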
ESnet Strategy For A New Architecture
A three part strategy for the evolution of ESnet:
1) Metropolitan Area Network (MAN) rings to provide
  - dual site connectivity for reliability
  - much higher site-to-core bandwidth
  - support for both production IP and circuit-based traffic
2) A Science Data Network (SDN) core for
  - provisioned, guaranteed-bandwidth circuits to support large, high-speed science data flows
  - very high total bandwidth
  - multiply connecting the MAN rings for protection against hub failure
  - an alternate path for production IP traffic
3) A high-reliability IP core (e.g. the current ESnet core) to address
  - general science requirements
  - Lab operational requirements
  - backup for the SDN core
  - a vehicle for science services
ESnet Target Architecture:
IP Core + Science Data Network + MANs
[Map: the target architecture. Two national cores – the ESnet IP core (Qwest) and the ESnet Science Data Network core (SDN, on NLR circuits) – with IP core and SDN/NLR hubs at Sunnyvale, LA, San Diego, Albuquerque (ALB), El Paso (ELP), Atlanta (ATL), Washington, DC, and New York, plus possible new hubs. Metropolitan Area Rings connect the primary DOE Labs to the cores, and Lab-supplied international connections reach CERN, GEANT (Europe), Asia-Pacific, and Australia]
First Two Steps in the Evolution of ESnet
1) The SF Bay Area MAN will provide to the five OSC Bay Area sites
  o Very high speed site access – 20 Gb/s
  o Fully redundant site access
2) The first two segments of the second national 10 Gb/s core – the Science Data Network – will be San Diego to Sunnyvale to Seattle
Science Data Network – Step One:
SF BA MAN and West Coast SDN
[Map: the same target-architecture map as above, highlighting the SF Bay Area MAN and the West Coast segments of the Science Data Network core (NLR circuits)]
ESnet SF Bay Area MAN Ring (Sept., 2005)
• 2 λs (2 x 10 Gb/s channels) in a ring configuration, delivered as 10 Gigabit Ethernet circuits
  o λ1 carries production IP and λ2 carries SDN/circuit traffic; λ3 and λ4 are reserved for future use
• Dual site connection (independent “east” and “west” connections) to each site
• Will be used as a 10 Gb/s production IP ring and 2 x 10 Gb/s paths (for circuit services) to each site
• Qwest contract signed for two lambdas 2/2005, with options on two more
• Project completion date is 9/2005
[Diagram: the SF Bay Area MAN ring (Qwest circuits) connecting the Joint Genome Institute, LBNL, NERSC, LLNL, SNLL, SLAC, and NASA Ames to the Qwest/ESnet hub and a Level 3 hub, with connections to the ESnet IP core ring (Qwest circuits, toward Chicago and El Paso), the ESnet SDN core (NLR circuits, toward Seattle, Chicago, and LA/San Diego), and the DOE Ultra Science Net]
References – DOE Network Related Planning Workshops
1) High Performance Network Planning Workshop, August 2002
   http://www.doecollaboratory.org/meetings/hpnpw
2) DOE Science Networking Roadmap Meeting, June 2003
   http://www.es.net/hypertext/welcome/pr/Roadmap/index.html
3) DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003
   http://www.csm.ornl.gov/ghpn/wk2003
4) Science Case for Large Scale Simulation, June 2003
   http://www.pnl.gov/scales/
5) Workshop on the Road Map for the Revitalization of High End Computing, June 2003
   http://www.cra.org/Activities/workshops/nitrd
   http://www.sc.doe.gov/ascr/20040510_hecrtf.pdf (public report)
6) ASCR Strategic Planning Workshop, July 2003
   http://www.fp-mcs.anl.gov/ascr-july03spw
7) Planning Workshops – Office of Science Data-Management Strategy, March & May 2004
   http://www-conf.slac.stanford.edu/dmw2004