
ESnet Status Update
ESCC
July 18, 2007
William E. Johnston
ESnet Department Head and Senior
Scientist
Energy Sciences Network
Lawrence Berkeley National Laboratory
[email protected], www.es.net
This talk is available at www.es.net/ESnet4
Networking for the Future of Science
1
DOE Office of Science and ESnet – the ESnet Mission
• ESnet's primary mission is to enable the large-scale science that is the mission of the Office of
Science (SC) and that depends on:
– Sharing of massive amounts of data
– Supporting thousands of collaborators world-wide
– Distributed data processing
– Distributed data management
– Distributed simulation, visualization, and computational steering
– Collaboration with the US and International Research and Education community
• ESnet provides network and collaboration services
to Office of Science laboratories and many other
DOE programs in order to accomplish its mission
2
Talk Outline
I. Current Network Status
II. Planning and Building the Future Network ESnet4
III. Science Collaboration Services - 1. Federated
Trust
IV. Science Collaboration Services - 2. Audio,
Video, Data Teleconferencing
3
ESnet3 Today Provides Global High-Speed Internet Connectivity
for DOE Facilities and Collaborators (Early 2007)
[Network map, early 2007: the ESnet IP core (packet over SONET optical ring and hubs) and the
ESnet Science Data Network (SDN) core connect 42 end user sites - Office of Science sponsored (22),
NNSA sponsored (13), joint sponsored (3), other sponsored (NSF LIGO, NOAA), and laboratory
sponsored (6) - along with commercial peering points, specific R&E network peers, other R&E
peering points, and high-speed peering points with Internet2/Abilene. International peers include
Japan (SINet), Australia (AARNet), Canada (CA*net4), Taiwan (TANet2, ASCC), GLORIAD (Russia,
China), Korea (Kreonet2), GÉANT (France, Germany, Italy, UK, etc.), Russia (BINP), and CERN
(USLHCnet, DOE+CERN funded). Link legend: international (high speed), 10 Gb/s SDN core, 10 Gb/s
IP core, 2.5 Gb/s IP core, MAN rings (≥ 10 Gb/s), lab supplied links, OC12 ATM (622 Mb/s),
OC12 / GigEthernet, OC3 (155 Mb/s), 45 Mb/s and less.]
ESnet Availability
ESnet Availability 8/2006 through 7/2007
With a goal of “5 nines” for the large science Labs it becomes clear
that ESnet will have to deploy dual routers at the site and core-core
attachment points in order to avoid down time due to router
reloads/upgrades.
[Bar chart: outage minutes per site for 8/2006 through 7/2007, with reference lines at
“3 nines” (>99.5%), “4 nines” (>99.95%), and “5 nines” (>99.995%), and the dually connected
sites marked. Measured availability by site:]
Lamont 99.624, NOAA 99.782, Ames-Lab 99.851, OSTI 99.868, ORAU 99.878, INL 99.883,
DOE-GTN 99.889, DOE-NNSA 99.918, NREL 99.941, Bechtel 99.948, Pantex 99.949, Allied 99.951,
SNLA 99.955, LANL 99.955, DOE-ALB 99.955, BNL 99.970, GA 99.973, PPPL 99.982, JLab 99.982,
MIT 99.985, LLNL-DC 99.986, LANL-DC 99.986, JGI 99.990, ANL 99.990, BJC 99.991, Yucca 99.993,
SRS 99.995, IARC 99.995, FNAL 99.995, LLNL 99.996, PNNL 99.997, LBL 99.997, SNLL 99.998,
NERSC 99.998, SLAC 100.000, ORNL 100.000
Note: These availability measures are only for ESnet infrastructure; they do not include
site-related problems. Some sites, e.g. PNNL and LANL, provide circuits from the site to an
ESnet hub, and therefore the ESnet-site demarc is at the ESnet hub (there is no ESnet
equipment at the site). In this case, circuit outages between the ESnet equipment and the
site are considered site issues and are not included in the ESnet availability metric.
5
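The downtime arithmetic behind these targets is worth making explicit. The following Python
sketch is illustrative only (it is not from the slides); it converts the availability
thresholds used in the chart into the outage minutes they allow per year, which shows why
router reloads/upgrades alone can exhaust a "5 nines" budget at a singly connected site.

    # Illustrative only: convert an availability target into the maximum
    # outage minutes allowed per year.
    MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

    def allowed_outage_minutes(availability_percent: float) -> float:
        """Maximum downtime per year (minutes) for a given availability."""
        return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)

    for label, pct in [("3 nines", 99.5), ("4 nines", 99.95), ("5 nines", 99.995)]:
        print(f"{label} ({pct}%): {allowed_outage_minutes(pct):.1f} minutes/year")

    # Output:
    # 3 nines (99.5%): 2628.0 minutes/year
    # 4 nines (99.95%): 262.8 minutes/year
    # 5 nines (99.995%): 26.3 minutes/year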
Peering Issues
•
ESnet has experienced congestion at both the West Coast
and mid-West Equinix commercial peering exchanges
6
Commercial Peers Congestion Issues: Temporary Changes and Long-term Fixes
•
The OC3 connection between paix-pa-rt1 and snv-rt1 was very congested, with
peaks clipped for most of the day.
– Temporary mitigation
• Temporarily forcing West coast Level3 traffic to eqx-chicago - Traffic is now only clipped (if at
all) at the peak of the day
– Long term solution
• Establish new Level3 peering at eqx-chicago (7/11/07)
• Working on establishing a second peering with Global Crossing
• Upgrade current loop (OC3) and fabric (100 Mbps) to 1 Gbps
•
Congestion to AT&T
– Long term solution
• Upgraded AT&T peering at eqx-sanjose from OC3 to OC12 (3/15/07)
– Established OC12 peering with AT&T at eqx-ashburn (1/29/07) and
eqx-chicago (07/11/07)
•
The Equinix shared fabric at eqx-ashburn is congested
– Long term solution
• New Level3 peering at eqx-chicago has helped to relieve congestion
– Additional mitigation
• Third peering with Google at eqx-chicago, third peering with Yahoo at eqx-chicago
– Future mitigation
• Establish a second peering with Global Crossing at eqx-chicago
• Upgrade equinix-sanjose and equinix-ashburn fabrics connections from 100Mb/s to 1Gbps
II.
Planning and Building the Future Network - ESnet4
•
Requirements are primary drivers for ESnet – science
focused
•
Sources of Requirements
1. Office of Science (SC) Program Managers
•
The Program Offices Requirements Workshops
– BES completed
– BER in July, 2007
– Others to follow at the rate of 3 a year
2. Direct gathering through interaction with science users of the network
•
Example case studies (updated 2005/2006)
– Magnetic Fusion
– Large Hadron Collider (LHC)
– Climate Modeling
– Spallation Neutron Source
3. Observation of the network
•
Requirements aggregation
– Convergence on a complete set of network requirements
8
1. Basic Energy Sciences (BES) Network Requirements
Workshop
•
Input from BES facilities, science programs and sites
– Light Sources
– SNS at ORNL, Neutron Science program
– Nanoscience Centers
– Combustion Research
– Computational Chemistry
– Other existing facilities (e.g. National Center for Electron
Microscopy at LBL)
– Facilities currently undergoing construction (e.g. LCLS at
SLAC)
Workshop Process
•
Three inputs
– Discussion of Program Office – goals, future projects, and
science portfolio
– Discussions with representatives of individual programs
and facilities
– Group discussion about common issues, future
technologies (e.g. detector upgrades), etc.
•
Additional discussion – ESnet4
– Architecture
– Deployment schedule
– Future services
10
BES Workshop Findings (1)
•
BES facilities are unlikely to provide the magnitude of load
that we expect from the LHC
– However, significant detector upgrades are coming in the next 3 years
– LCLS may provide significant load
– SNS data repositories may provide significant load
– Theory and simulation efforts may provide significant load
•
Broad user base
– Makes it difficult to model facilities as anything other than point
sources of traffic load
– Requires wide connectivity
•
Most facilities and disciplines expect significant increases in
PKI service usage
11
BES Workshop Findings (2)
•
Significant difficulty and frustration with moving data sets
– Problems deal with moving data sets that are small by HEP’s
standards
– Currently many users ship hard disks or stacks of DVDs
•
Solutions
– HEP model of assigning a group of skilled computer people to address
the data transfer problem does not map well onto BES for several
reasons
• BES is more heterogeneous in science and in funding
• User base for BES facilities is very heterogeneous and this results in a
large number of sites that must be involved in data transfers
• It appears that this is likely to be true of the other Program Offices
→ (1A) ESnet action item – build a central web page for disseminating
information about data transfer tools and techniques
→ Users also expressed interest in a blueprint for a site-local
BWCTL/PerfSONAR service
12
2. Case Studies For Requirements
• Advanced Scientific Computing Research (ASCR)
– NERSC
– NLCF
• Basic Energy Sciences
– Advanced Light Source (Macromolecular Crystallography)
– Chemistry/Combustion
– Spallation Neutron Source
• Biological and Environmental
– Bioinformatics/Genomics
– Climate Science
• Fusion Energy Sciences
– Magnetic Fusion Energy/ITER
• High Energy Physics
– LHC
• Nuclear Physics
– RHIC
13
(2A) Science Networking Requirements Aggregation Summary
Science Drivers (Science Areas / Facilities) | End2End Reliability | Connectivity | Today End2End Bandwidth | 5 years End2End Bandwidth | Traffic Characteristics | Network Services
Magnetic Fusion Energy | 99.999% (Impossible without full redundancy) | DOE sites, US Universities, Industry | 200+ Mbps | 1 Gbps | Bulk data; Remote control | Guaranteed bandwidth; Guaranteed QoS; Deadline scheduling
NERSC and ACLF | - | DOE sites, US Universities, International, Other ASCR supercomputers | 10 Gbps | 20 to 40 Gbps | Bulk data; Remote control; Remote file system sharing | Guaranteed bandwidth; Guaranteed QoS; Deadline scheduling; PKI / Grid
NLCF | - | DOE sites, US Universities, Industry, International | Backbone bandwidth parity | Backbone bandwidth parity | Bulk data; Remote file system sharing | -
Nuclear Physics (RHIC) | - | DOE sites, US Universities, International | 12 Gbps | 70 Gbps | Bulk data | Guaranteed bandwidth; PKI / Grid
Spallation Neutron Source | High (24x7 operation) | DOE sites | 640 Mbps | 2 Gbps | Bulk data | -
Science Network Requirements Aggregation Summary
Science Drivers (Science Areas / Facilities) | End2End Reliability | Connectivity | Today End2End Bandwidth | 5 years End2End Bandwidth | Traffic Characteristics | Network Services
Advanced Light Source | - | DOE sites, US Universities, Industry | 1 TB/day (300 Mbps) | 5 TB/day (1.5 Gbps) | Bulk data; Remote control | Guaranteed bandwidth; PKI / Grid
Bioinformatics | - | DOE sites, US Universities | 625 Mbps (12.5 Gbps in two years) | 250 Gbps | Bulk data; Remote control; Point-to-multipoint | Guaranteed bandwidth; High-speed multicast
Chemistry / Combustion | - | DOE sites, US Universities, Industry | - | 10s of Gigabits per second | Bulk data | Guaranteed bandwidth; PKI / Grid
Climate Science | - | DOE sites, US Universities, International | - | 5 PB per year (5 Gbps) | Bulk data; Remote control | Guaranteed bandwidth; PKI / Grid
Immediate Requirements and Drivers
Science Drivers (Science Areas / Facilities) | End2End Reliability | Connectivity | Today End2End Bandwidth | 5 years End2End Bandwidth | Traffic Characteristics | Network Services
High Energy Physics (LHC) | 99.95+% (Less than 4 hrs/year) | US Tier1 (FNAL, BNL), US Tier2 (Universities), International (Europe, Canada) | 10 Gbps | 60 to 80 Gbps (30-40 Gbps per US Tier1) | Bulk data; Coupled data analysis processes | Guaranteed bandwidth; Traffic isolation; PKI / Grid
(2B) The Next Level of Detail: LHC Tier 0, 1, and 2
Connectivity Requirements Summary
[Map: LHC Tier 0, 1, and 2 connectivity across North America - CERN links (CERN-1/2/3) via
USLHCNet and GÉANT-1/2, TRIUMF (Atlas T1, Canada) via CANARIE, BNL (Atlas T1) and FNAL
(CMS T1) on the ESnet IP core and ESnet SDN, with virtual circuits and Internet2 / RONs
reaching the Tier 2 sites. Legend: USLHC nodes, Internet2/GigaPoP nodes, ESnet IP core hubs,
ESnet SDN/NLR hubs, Tier 1 Centers, Tier 2 Sites, cross connects ESnet - Internet2.]
• Direct connectivity T0-T1-T2
• USLHCNet to ESnet to Abilene
• Backup connectivity
• SDN, GLIF, VCs
(2C) The Next Level of Detail:
LHC ATLAS Bandwidth Matrix as of April 2007
Site A | Site Z | ESnet A | ESnet Z | A-Z 2007 Bandwidth | A-Z 2010 Bandwidth
CERN | BNL | AofA (NYC) | BNL | 10Gbps | 20-40Gbps
BNL | U. of Michigan (Calibration) | BNL (LIMAN) | Starlight (CHIMAN) | 3Gbps | 10Gbps
BNL | Boston University, Harvard University (Northeastern Tier2 Center) | BNL (LIMAN) | Internet2 / NLR Peerings | 3Gbps | 10Gbps
BNL | Indiana U. at Bloomington, U. of Chicago (Midwestern Tier2 Center) | BNL (LIMAN) | Internet2 / NLR Peerings | 3Gbps | 10Gbps
BNL | U. of Texas Arlington, U. Oklahoma Norman, Langston University (Southwestern Tier2 Center) | BNL (LIMAN) | Internet2 / NLR Peerings | 3Gbps | 10Gbps
BNL | Tier3 Aggregate | BNL (LIMAN) | Internet2 / NLR Peerings | 5Gbps | 20Gbps
BNL | TRIUMF (Canadian ATLAS Tier1) | BNL (LIMAN) | Seattle | 1Gbps | 5Gbps
LHC CMS Bandwidth Matrix as of April 2007
Site A | Site Z | ESnet A | ESnet Z | A-Z 2007 Bandwidth | A-Z 2010 Bandwidth
CERN | FNAL | Starlight (CHIMAN) | FNAL (CHIMAN) | 10Gbps | 20-40Gbps
FNAL | U. of Michigan (Calibration) | FNAL (CHIMAN) | Starlight (CHIMAN) | 3Gbps | 10Gbps
FNAL | Caltech | FNAL (CHIMAN) | Starlight (CHIMAN) | 3Gbps | 10Gbps
FNAL | MIT | FNAL (CHIMAN) | AofA (NYC) / Boston | 3Gbps | 10Gbps
FNAL | Purdue University | FNAL (CHIMAN) | Starlight (CHIMAN) | 3Gbps | 10Gbps
FNAL | U. of California at San Diego | FNAL (CHIMAN) | San Diego | 3Gbps | 10Gbps
FNAL | U. of Florida at Gainesville | FNAL (CHIMAN) | SOX / Ultralight at Starlight | 3Gbps | 10Gbps
FNAL | U. of Nebraska at Lincoln | FNAL (CHIMAN) | Starlight (CHIMAN) | 3Gbps | 10Gbps
FNAL | U. of Wisconsin at Madison | FNAL (CHIMAN) | Starlight (CHIMAN) | 3Gbps | 10Gbps
FNAL | Tier3 Aggregate | FNAL (CHIMAN) | Internet2 / NLR Peerings | 5Gbps | 20Gbps
18
Large-Scale Data Analysis Systems (Typified by the LHC)
have Several Characteristics that Result in
Requirements for the Network and its Services
• The systems are data intensive and high-performance, typically
moving terabytes a day for months at a time
• The systems are high duty-cycle, operating most of the day for months at
a time in order to meet the requirements for data movement
• The systems are widely distributed – typically spread over continental
or inter-continental distances
• Such systems depend on network performance and availability, but
these characteristics cannot be taken for granted, even in well run
networks, when the multi-domain network path is considered
• The applications must be able to get guarantees from the network that
there is adequate bandwidth to accomplish the task at hand
• The applications must be able to get information from the network
that allows graceful failure and auto-recovery and adaptation to
unexpected network conditions that are short of outright failure
This slide drawn from [ICFA SCIC]
Enabling Large-Scale Science
•
These requirements are generally true for
systems with widely distributed components to be
reliable and consistent in performing the
sustained, complex tasks of large-scale science
•
(2D) Networks must provide communication capability
that is service-oriented: configurable, schedulable,
predictable, reliable, and informative – and the
network and its services must be scalable
20
3. Observed Evolution of Historical ESnet Traffic Patterns
[Chart: ESnet Monthly Accepted Traffic, January 2000 – June 2007, in terabytes/month,
broken out into the top 100 site-to-site workflows and remaining traffic (site-to-site
workflow data not available for the earlier part of the period). ESnet total traffic passed
2 Petabytes/mo about mid-April, 2007.]
• ESnet is currently transporting more than 1 petabyte (1000 terabytes) per month
• More than 50% of the traffic is now generated by the top 100 sites →
large-scale science dominates all ESnet traffic
ESnet Traffic has Increased by
10X Every 47 Months, on Average, Since 1990
[Log plot of ESnet Monthly Accepted Traffic, January 1990 – June 2007, in terabytes/month.
Milestones: Aug. 1990, 100 MBy/mo; Oct. 1993, 1 TBy/mo (38 months later); Jul. 1998,
10 TBy/mo (57 months); Nov. 2001, 100 TBy/mo (40 months); Apr. 2006, 1 PBy/mo (53 months).]
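To make the trend concrete, the short Python sketch below (illustrative only, not from the
slides) extrapolates monthly traffic volume using the observed 10x-per-47-months growth,
starting from the roughly 2 PB/month measured in mid-2007; it reproduces the "about 10x in
four years" expectation used on the next slide.

    # Illustrative only: project traffic assuming the historical trend of a
    # 10x increase roughly every 47 months continues.
    def projected_traffic(current_tb_per_month: float, months_ahead: float,
                          tenfold_period_months: float = 47.0) -> float:
        """Extrapolate monthly traffic volume along the observed exponential trend."""
        return current_tb_per_month * 10 ** (months_ahead / tenfold_period_months)

    baseline_tb = 2000.0  # ~2 PB/month in mid-2007
    for years in (1, 2, 4):
        tb = projected_traffic(baseline_tb, months_ahead=12 * years)
        print(f"+{years} yr: ~{tb / 1000:.1f} PB/month")

    # Approximate output:
    # +1 yr: ~3.6 PB/month
    # +2 yr: ~6.5 PB/month
    # +4 yr: ~21.0 PB/month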
Requirements from Network Utilization Observation
• (3A) In 4 years, we can expect a 10x increase in traffic
over current levels without the addition of
production LHC traffic
– Nominal average load on busiest backbone links is ~1.5
Gbps today
– In 4 years that figure will be ~15 Gbps based on current
trends
• Measurements of this type are science-agnostic
– It doesn’t matter who the users are, the traffic load is
increasing exponentially
– Predictions based on this sort of forward projection tend
to be conservative estimates of future requirements
because they cannot predict new uses
23
Requirements from Traffic Flow Observations
•
(3B)
Most of ESnet science traffic has a source or sink outside
of ESnet
– Drives requirement for high-bandwidth peering
– Reliability and bandwidth requirements demand that peering be
redundant
– Multiple 10 Gbps peerings today, must be able to add more
bandwidth flexibly and cost-effectively
– Bandwidth and service guarantees must traverse R&E peerings
• Collaboration with other R&E networks on a common framework is
critical
• Seamless fabric
•
(3C)
Large-scale science is now the dominant user of the
network
– Satisfying the demands of large-scale science traffic into the future
will require a purpose-built, scalable architecture
– Traffic patterns are different than commodity Internet
24
Summary of All Requirements To-Date
Requirements from SC Programs:
1A) Provide “consulting” on system / application network tuning
Requirements from science case studies:
2A) Build the ESnet core up to 100 Gb/s within 5 years
2B) Deploy network to accommodate LHC collaborator footprint
2C) Implement network to provide for LHC data path loadings
2D) Provide the network as a service-oriented capability
Requirements from observing traffic growth and change trends in the
network:
3A) Provide 15 Gb/s core within four years and 150 Gb/s core within eight
years
3B) Provide a rich diversity and high bandwidth for R&E peerings
3C) Economically accommodate a very large volume of circuit-like traffic
25
→ ESnet4 - The Response to the Requirements
I) A new network architecture and implementation strategy
• Provide two networks: IP and circuit-oriented Science Data Network
• Reduces cost of handling high bandwidth data flows
– Highly capable routers are not necessary when every packet goes to the same place
– Use lower cost (factor of 5x) switches to route the packets
• Rich and diverse network topology for flexible management and high
reliability
• Dual connectivity at every level for all large-scale science sources and sinks
• A partnership with the US research and education community to build a
shared, large-scale, R&E managed optical infrastructure
• a scalable approach to adding bandwidth to the network
• dynamic allocation and management of optical circuits
II) Development and deployment of a virtual circuit service
• Develop the service cooperatively with the networks that are intermediate
between DOE Labs and major collaborators to ensure end-to-end
interoperability
III) Develop and deploy service-oriented, user-accessible network
monitoring systems
IV) Provide “consulting” on system / application network performance
tuning
26
ESnet4
•
Internet2 has partnered with Level 3 Communications Co.
and Infinera Corp. for a dedicated optical fiber infrastructure
with a national footprint and a rich topology - the “Internet2
Network”
– The fiber will be provisioned with Infinera Dense Wave Division
Multiplexing equipment that uses an advanced, integrated optical-electrical design
– Level 3 will maintain the fiber and the DWDM equipment
– The DWDM equipment will initially be provisioned to provide 10 optical
circuits (lambdas – λs) across the entire fiber footprint (80 λs is max.)
•
ESnet has partnered with Internet2 to:
– Share the optical infrastructure
– Develop new circuit-oriented network services
– Explore mechanisms that could be used for the ESnet Network
Operations Center (NOC) and the Internet2/Indiana University NOC to
back each other up for disaster recovery purposes
27
ESnet4
•
ESnet will build its next generation IP network and
its new circuit-oriented Science Data Network
primarily on the Internet2 circuits (λs) that are
dedicated to ESnet, together with a few National
Lambda Rail and other circuits
– ESnet will provision and operate its own routing and
switching hardware that is installed in various commercial
telecom hubs around the country, as it has done for the
past 20 years
– ESnet’s peering relationships with the commercial
Internet, various US research and education networks,
and numerous international networks will continue and
evolve as they have for the past 20 years
28
Internet2 and ESnet Optical Node
[Diagram: a combined Internet2 / ESnet optical node. The ESnet IP core router (M320) and the
ESnet SDN core switch (T640) connect to the ESnet metro-area networks, and direct optical
connections reach RONs. Support devices at the node provide measurement, out-of-band access,
monitoring, and security. A Ciena CoreDirector handles grooming and (in the future)
dynamically allocated and routed waves, and the Infinera DTN terminates the Internet2/Level3
national optical infrastructure fiber (east, west, north/south). Network testbeds attach
through various equipment and experimental control plane management systems, with future
access to the control plane.]
29
ESnet Metropolitan Area Network Ring Architecture for High Reliability Sites
[Diagram: ESnet metropolitan area network ring architecture for high reliability sites. The
MAN fiber ring is provisioned with 2-4 x 10 Gbps channels initially, with expansion capacity
to 16-64. ESnet MAN switches with independent port cards supporting multiple 10 Gb/s line
interfaces deliver SDN circuits and ESnet-managed virtual circuits to site systems and the
site gateway router, while the ESnet production IP service reaches the site edge router and
site LAN. The ring closes on the ESnet IP core hub (IP core router) and the ESnet SDN core
hub (SDN core switches, east and west), where US LHCnet switches also attach; ESnet-managed
virtual circuit services can additionally be tunneled through the IP backbone.]
30
ESnet 3 Backbone as of January 1, 2007
[Map of the ESnet 3 backbone. Legend: future ESnet hubs, ESnet hubs; 10 Gb/s SDN core (NLR),
10/2.5 Gb/s IP core (Qwest), MAN rings (≥ 10 Gb/s), lab supplied links.]
31
ESnet 4 Backbone as of April 15, 2007
[Map of the ESnet 4 backbone, showing Boston and Cleveland hubs. Legend: future ESnet hubs,
ESnet hubs; 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (Qwest), 10 Gb/s IP core (Level3),
10 Gb/s SDN core (Level3), MAN rings (≥ 10 Gb/s), lab supplied links.]
32
ESnet 4 Backbone as of May 15, 2007
[Map of the ESnet 4 backbone, showing Boston and Cleveland hubs. Legend as in the preceding
backbone maps.]
33
→ ESnet 4 Backbone as of June 20, 2007
[Map of the ESnet 4 backbone, showing Boston, Cleveland, Kansas City, and Houston hubs.
Legend as in the preceding backbone maps.]
34
ESnet 4 Backbone Target August 1, 2007
[Map of the ESnet 4 backbone target, showing Boston, Cleveland, Kansas City, Los Angeles, and
Houston hubs; the Denver-Sunnyvale-El Paso ring was installed July 16, 2007. Legend as in the
preceding backbone maps.]
35
ESnet 4 Backbone Target August 30, 2007
[Map of the ESnet 4 backbone target, adding a Boise hub to the Boston, Cleveland, Kansas
City, Los Angeles, and Houston hubs. Legend as in the preceding backbone maps.]
36
ESnet 4 Backbone Target September 30, 2007
[Map of the ESnet 4 backbone target, showing Boston, Boise, Cleveland, Kansas City, Los
Angeles, and Houston hubs. Legend as in the preceding backbone maps.]
37
ESnet4 Roll Out
ESnet4 IP + SDN Configuration, mid-September, 2007
All circuits are 10Gb/s, unless noted.
[Map: the ESnet4 IP + SDN hubs and backbone segments across the national footprint, with
Internet2 circuit numbers in parentheses. Legend: ESnet IP switch/router hubs, ESnet IP
switch only hubs, ESnet SDN switch hubs, layer 1 optical nodes at eventual ESnet Points of
Presence, layer 1 optical nodes not currently in ESnet plans, lab sites; ESnet IP core,
ESnet Science Data Network core, ESnet SDN core (NLR links), lab supplied links, LHC related
links, MAN links, international IP connections.]
ESnet4 Metro Area Rings, 2007 Configurations
[Map: the 2007 ESnet4 metro area ring configurations overlaid on the national backbone - the
Long Island MAN (32 AoA NYC, BNL, USLHCNet), West Chicago MAN (600 W. Chicago, FNAL, ANL,
USLHCNet), San Francisco Bay Area MAN (LBNL, SLAC, JGI, NERSC, LLNL, SNLL), Atlanta MAN
(ORNL, Nashville, 56 Marietta (SOX), 180 Peachtree, Wash. DC), and Newport News - Elite
(MATP, JLab, ELITE, ODU, Wash. DC). All circuits are 10Gb/s. Legend as in the preceding
maps.]
39
→ Note that the major ESnet sites are now directly on the ESnet
“core” network
– e.g. the bandwidth into and out of FNAL is equal to, or greater than, the ESnet core
bandwidth
[Map: the ESnet4 configuration with the metro area rings (Long Island MAN with BNL and
USLHCNet, West Chicago MAN with Starlight, FNAL, ANL, and USLHCNet, the Bay Area MAN,
Atlanta MAN, and Newport News - Elite) attached directly to the core, and the number of
10 Gb/s lambdas (3-5) on each backbone segment; Internet2 circuit numbers are shown in
parentheses. Legend as in the preceding maps.]
The Evolution of ESnet Architecture
[Diagram: the evolution of the ESnet architecture, showing ESnet sites, ESnet hubs / core
network connection points, metro area rings (MANs), other IP networks, circuit connections
to other science networks (e.g. USLHCNet), and independent redundancy (TBD).]
ESnet to 2005:
• A routed IP network with sites singly attached to a national core ring (the ESnet IP core)
ESnet from 2006-07:
• A routed IP network with sites dually connected on metro area rings or dually connected
directly to the core ring
• A switched network - the ESnet Science Data Network (SDN) core - providing virtual circuit
services for data-intensive science
• Rich topology offsets the lack of dual, independent national cores
41
ESnet 4 Factoids as of July 16, 2007
•
Installation to date:
– 10 new 10Gb/s circuits
– ~10,000 Route Miles
– 6 new hubs
– 5 new routers, 4 new switches
• Total of 70 individual pieces of equipment shipped
– Over two and a half tons of electronics
– 15 round trip airline tickets for our install team
• About 60,000 miles traveled so far….
• 6 cities
– 5 Brazilian Bar-B-Qs/Grills sampled
42
Typical ESnet 4 Hub
[Photo: equipment in a typical ESnet 4 hub - OWAMP time source, power controllers, secure
terminal server, peering router, 10G performance tester, M320 router, and 7609 switch.]
43
(2C) Aggregate Estimated Link Loadings, 2007-08
[Map: aggregate estimated link loadings for 2007-08 on the ESnet4 topology. Committed
bandwidth figures (in Gb/s, ranging from 2.5 to 13) annotate individual backbone segments,
and existing site-supplied circuits are marked. Legend as in the preceding maps.]
44
(2C) ESnet4 2007-8 Estimated Bandwidth Commitments
[Map: ESnet4 2007-8 estimated bandwidth commitments (in Gb/s) overlaid on the metro area
rings and backbone, including the CERN/USLHCNet connections into 600 W. Chicago / Starlight
(CMS, FNAL) and 32 AoA NYC (BNL). All circuits are 10Gb/s. Legend as in the preceding maps.]
45
Are These Estimates Realistic? YES!
FNAL Outbound CMS Traffic
Max= 1064 MBy/s (8.5 Gb/s), Average = 394 MBy/s (3.2 Gb/s)
46
ESnet4 IP + SDN, 2008 Configuration
[Map: the ESnet4 IP + SDN configuration for 2008, showing the number of 10 Gb/s lambdas
(1-2) planned on each backbone segment, with Internet2 circuit numbers in parentheses.
Legend as in the preceding maps.]
47
Estimated ESnet4 2009 Configuration
(Some of the circuits may be allocated dynamically from a shared pool.)
[Map: the estimated ESnet4 configuration for 2009, showing 1-3 10 Gb/s lambdas per backbone
segment, with Internet2 circuit numbers in parentheses. Legend as in the preceding maps.]
(2C) Aggregate Estimated Link Loadings, 2010-11
[Map: aggregate estimated link loadings for 2010-11, with committed bandwidths (in Gb/s,
values up to 50 on the busiest segments) and 4-5 lambdas per backbone segment. Legend as in
the preceding maps.]
49
(2C) ESnet4 2010-11 Estimated Bandwidth Commitments
[Map: ESnet4 2010-11 estimated bandwidth commitments (in Gb/s) for the LHC Tier1 sites
(FNAL via 600 W. Chicago / Starlight for CMS, BNL via 32 AoA NYC) and the CERN/USLHCNet
links, overlaid on a core with 4-5 lambdas per segment. Legend as in the preceding maps.]
50
ESnet4 IP + SDN, 2011 Configuration
[Map: the ESnet4 IP + SDN configuration for 2011, with 3-5 10 Gb/s lambdas per backbone
segment and Internet2 circuit numbers in parentheses. Legend as in the preceding maps.]
51
ESnet4 Planned Configuration
Core networks: 40-50 Gbps in 2009-2010, 160-400 Gbps in 2011-2012
[Map: the planned ESnet4 configuration. The core network fiber path is ~14,000 miles /
24,000 km, with the IP core and Science Data Network core rings interconnecting hubs
including Seattle, Sunnyvale, LA, San Diego, Boise, Denver, Albuquerque, Tulsa, Boston,
New York, Washington DC, and Jacksonville (segment lengths noted on the map include
1625 miles / 2545 km and 2700 miles / 4300 km). International connections: Canada (CANARIE),
Asia-Pacific, CERN (30 Gbps, two paths), GLORIAD (Russia and China), Europe (GEANT),
Australia, and South America (AMPATH). Legend: IP core hubs, SDN (switch) hubs, primary DOE
Labs, high speed cross-connects with Internet2/Abilene, possible hubs; production IP core
(10 Gbps), SDN core (20-30-40 Gbps), MANs (20-60 Gbps) or backbone loops for site access,
international connections.]
52
ESnet Virtual Circuit Service
•
Traffic isolation and traffic engineering
– Provides for high-performance, non-standard transport mechanisms
that cannot co-exist with commodity TCP-based transport
– Enables the engineering of explicit paths to meet specific
requirements
• e.g. bypass congested links, using lower bandwidth, lower latency paths
•
Guaranteed bandwidth (Quality of Service (QoS))
– User specified bandwidth
– Addresses deadline scheduling
• Where fixed amounts of data have to reach sites on a fixed schedule,
so that the processing does not fall far enough behind that it could never
catch up – very important for experiment data analysis
• Secure
– The circuits are “secure” to the edges of the network (the site
boundary) because they are managed by the control plane of the
network which is isolated from the general traffic
•
Provides end-to-end connections between Labs and
collaborator institutions
53
Virtual Circuit Service Functional Requirements
• Support user/application VC reservation requests
– Source and destination of the VC
– Bandwidth, start time, and duration of the VC
– Traffic characteristics (e.g. flow specs) to identify traffic designated for the VC
• Manage allocations of scarce, shared resources
– Authentication to prevent unauthorized access to this service
– Authorization to enforce policy on reservation/provisioning
– Gathering of usage data for accounting
• Provide circuit setup and teardown mechanisms and security
– Widely adopted and standard protocols (such as MPLS and GMPLS) are well
understood within a single domain
– Cross domain interoperability is the subject of ongoing, collaborative
development
– secure end-to-end connection setup is provided by the network control plane
• Enable the claiming of reservations
– Traffic destined for the VC must be differentiated from “regular” traffic
• Enforce usage limits
– Per VC admission control polices usage, which in turn facilitates guaranteed
bandwidth
– Consistent per-hop QoS throughout the network for transport predictability
54
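The reservation parameters listed above map naturally onto a small request record. The
Python sketch below is illustrative only - the class, field names, and the
submit_reservation call are hypothetical and are not the OSCARS/ESnet API - but it shows the
kind of information a user or application supplies when asking for a virtual circuit.

    # Illustrative sketch only -- the names here are hypothetical; they simply
    # mirror the reservation parameters listed on the previous slide.
    from dataclasses import dataclass
    from datetime import datetime, timedelta

    @dataclass
    class CircuitReservation:
        source: str            # source endpoint of the VC
        destination: str       # destination endpoint of the VC
        bandwidth_mbps: int    # requested guaranteed bandwidth
        start_time: datetime   # when the circuit should become active
        duration: timedelta    # how long the circuit is needed
        flow_spec: dict        # traffic characteristics identifying VC traffic

    def submit_reservation(req: CircuitReservation, user_token: str) -> str:
        """Hypothetical submission step: authentication, authorization, and
        policy checks would be applied before a reservation ID is returned."""
        assert user_token, "requests must be authenticated"
        # ... web-services call to the reservation system would go here ...
        return "reservation-id-placeholder"

    req = CircuitReservation(
        source="fnal-gateway.example.org",
        destination="cern-gateway.example.org",
        bandwidth_mbps=3000,
        start_time=datetime(2007, 9, 1, 0, 0),
        duration=timedelta(hours=12),
        flow_spec={"protocol": "tcp", "dst_port": 5000},
    )
    print(submit_reservation(req, user_token="example-token"))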
OSCARS Overview
On-demand Secure Circuits and Advance Reservation System
[Diagram: OSCARS provides guaranteed bandwidth virtual circuit services through three
cooperating functions, accessed via Web Services APIs:]
• Path Computation – topology, reachability, constraints
• Scheduling – AAA, availability
• Provisioning – signaling, security, resiliency/redundancy
55
The Mechanisms Underlying OSCARS
• Based on source and sink IP addresses, the route of the LSP between ESnet border routers is
determined using topology information from OSPF-TE. The path of the LSP can be explicitly
directed to take the SDN network.
• On the SDN Ethernet switches all traffic is MPLS switched (layer 2.5), which stitches
together the VLANs.
• On ingress to ESnet, packets matching the reservation profile are filtered out (i.e. policy
based routing), policed to the reserved bandwidth, and injected into an LSP.
• MPLS labels are attached to packets from the source, and the packets are placed in a
separate high-priority interface queue to ensure guaranteed bandwidth; regular production
traffic uses the standard, best-effort queue.
[Diagram: a source and sink connected across SDN switches (VLAN 1/2/3 stitched by a label
switched path, with RSVP and MPLS enabled on internal interfaces) and IP links, showing the
per-interface high-priority and best-effort queues.]
56
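As a purely conceptual illustration of the ingress behavior described above (this is not
router code or ESnet configuration), the sketch below models the classification step:
packets matching a reservation's flow spec are policed to the reserved rate and sent to the
high-priority queue, while everything else stays in the best-effort queue.

    # Conceptual model only -- mirrors the ingress classification/policing
    # described on the previous slide.
    RESERVED_RATE_BPS = 3_000_000_000  # reserved bandwidth for the circuit (3 Gb/s)

    def matches_flow_spec(packet: dict, flow_spec: dict) -> bool:
        """Policy-based routing match: does this packet belong to the reservation?"""
        return all(packet.get(k) == v for k, v in flow_spec.items())

    def classify(packet: dict, flow_spec: dict, bytes_sent_this_second: int) -> str:
        """Return the queue a packet is placed in at the ESnet ingress."""
        if matches_flow_spec(packet, flow_spec):
            # Police to the reserved bandwidth: traffic within the reservation
            # is labeled (MPLS) and placed in the high-priority queue.
            if (bytes_sent_this_second + packet["size"]) * 8 <= RESERVED_RATE_BPS:
                return "high-priority (LSP)"
            return "exceeds reservation (policed)"
        return "standard best-effort"

    flow_spec = {"src": "10.0.0.1", "dst": "10.1.0.1", "dst_port": 5000}
    pkt = {"src": "10.0.0.1", "dst": "10.1.0.1", "dst_port": 5000, "size": 1500}
    print(classify(pkt, flow_spec, bytes_sent_this_second=0))  # high-priority (LSP)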
Environment of Science is Inherently Multi-Domain
•
End points will be at independent institutions – campuses or
research institutes - that are served by ESnet, Abilene,
GÉANT, and their regional networks
– Complex inter-domain issues – typical circuit will involve five or more
domains - of necessity this involves collaboration with other networks
– For example, a connection between FNAL and DESY involves five
domains, traverses four countries, and crosses seven time zones
FNAL (AS3152)
[US]
GEANT (AS20965)
[Europe]
ESnet (AS293)
[US]
DESY (AS1754)
[Germany]
DFN (AS680)
[Germany]
57
Interdomain Virtual Circuit Reservation Control Flow
Progress!
58
OSCARS Status Update
•
ESnet Centric Deployment
– Prototype layer 3 (IP) guaranteed bandwidth virtual circuit service deployed in ESnet
(1Q05)
– Layer 2 (Ethernet VLAN) virtual circuit service under development
•
Inter-Domain Collaborative Efforts
– Terapaths
• Inter-domain interoperability for layer 3 virtual circuits demonstrated (3Q06)
• Inter-domain interoperability for layer 2 virtual circuits under development
– HOPI/DRAGON
• Inter-domain exchange of control messages demonstrated (1Q07)
• Initial integration of OSCARS and DRAGON has been successful (1Q07)
– DICE
• First draft of topology exchange schema has been formalized (in collaboration with NMWG)
(2Q07), interoperability test scheduled for 3Q07
• Drafts on reservation and signaling messages under discussion
– UVA
• Integration of Token based authorization in OSCARS under discussion
•
Measurements
– Hybrid dataplane testing with ESnet, HOPI/DRAGON, USN, and Tennessee Tech
(1Q07)
•
Administrative
– Vangelis Haniotakis (GRNET) has taken a one-year sabbatical position with ESnet to
work on interdomain topology exchange, resource scheduling, and signalling
59
→ Monitoring Applications Move Networks Toward
Service-Oriented Communications Services
• perfSONAR is a global collaboration to design, implement and deploy a
network measurement framework.
– Web Services based Framework
• Measurement Archives (MA)
• Measurement Points (MP)
• Lookup Service (LS)
• Topology Service (TS)
• Authentication Service (AS)
– Some of the currently Deployed Services
• Utilization MA
• Circuit Status MA & MP
• Latency MA & MP
• Bandwidth MA & MP
• Looking Glass MP
• Topology MA
– This is an Active Collaboration
• The basic framework is complete
• Protocols are being documented
• New Services are being developed and deployed.
perfSONAR Collaborators
• ARNES
• Belnet
• CARnet
• CESnet
• Dante
• University of Delaware
• DFN
• ESnet
• FCCN
• FNAL
• GARR
• GEANT2
• Georgia Tech
• GRNET
• Internet2
• IST
• POZNAN Supercomputing Center
• Red IRIS
• Renater
• RNP
• SLAC
• SURFnet
• SWITCH
• Uninett
* Plus others who are contributing, but haven’t added
their names to the list on the WIKI.
61
perfSONAR Deployments
16+ different networks have deployed at least 1 perfSONAR
service (Jan 2007)
62
→ E2Emon - a perfSONAR application
•
E2Emon provides end-to-end path status in a
service-oriented, easily interpreted way
– a perfSONAR application used to monitor the LHC paths
end-to-end across many domains
– uses perfSONAR protocols to retrieve current circuit status
every minute or so from MAs and MPs in all the different
domains supporting the circuits
– is itself a service that produces Web based, real-time
displays of the overall state of the network, and it
generates alarms when one of the MPs or MAs reports link
problems.
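A minimal sketch of the polling pattern just described is shown below. It is illustrative
only: the URLs and JSON status format are hypothetical, and the real E2Emon speaks the
perfSONAR MA/MP protocols rather than this simplified interface. The logic, however, is the
one described above - fetch each domain's view of its segment roughly once a minute and
raise an alarm if any segment is down.

    # Minimal sketch of an E2Emon-style polling loop (hypothetical endpoints).
    import json
    import time
    import urllib.request

    # One (hypothetical) status endpoint per domain supporting the circuit.
    DOMAIN_STATUS_URLS = {
        "ESnet": "https://ma.example-esnet.net/circuit/CERN-LHCOPN-FNAL-001",
        "USLHCNet": "https://mp.example-uslhcnet.net/circuit/CERN-LHCOPN-FNAL-001",
        "CERN": "https://ma.example-cern.ch/circuit/CERN-LHCOPN-FNAL-001",
    }

    def fetch_segment_status(url: str) -> str:
        """Fetch one domain's view of its segment; returns 'up' or 'down'."""
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp).get("status", "unknown")

    def end_to_end_status() -> str:
        """The end-to-end path is up only if every domain reports its segment up."""
        segments = {d: fetch_segment_status(u) for d, u in DOMAIN_STATUS_URLS.items()}
        bad = [d for d, s in segments.items() if s != "up"]
        if bad:
            print(f"ALARM: segment(s) down in {', '.join(bad)}")
            return "down"
        return "up"

    if __name__ == "__main__":
        while True:               # poll roughly once a minute, as E2Emon does
            print("E2E status:", end_to_end_status())
            time.sleep(60)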
E2Emon: Status of E2E link CERN-LHCOPN-FNAL-001
E2Emon generated view of the data for one OPN link [E2EMON]
64
→ Path Performance Monitoring
•
Path performance monitoring needs to provide
users/applications with the end-to-end, multi-domain
traffic and bandwidth availability
– should also provide real-time performance such as path
utilization and/or packet drop
•
Multiple path performance monitoring tools are in
development
– One example – Traceroute Visualizer [TrViz], developed by
Joe Metzger, ESnet – has been deployed at about 10 R&E
networks in the US and Europe that have at least some of
the required perfSONAR MA services to support the tool
65
Traceroute Visualizer
•
Forward direction bandwidth utilization on application path
from LBNL to INFN-Frascati (Italy)
– traffic shown as bars on those network device interfaces that have an
associated MP service (the first 4 graphs are normalized to 2000 Mb/s, the
last to 500 Mb/s); link capacity is also provided
1 ir1000gw (131.243.2.1)
2 er1kgw
3 lbl2-ge-lbnl.es.net
4 slacmr1-sdn-lblmr1.es.net (GRAPH OMITTED)
5 snv2mr1-slacmr1.es.net (GRAPH OMITTED)
6 snv2sdn1-snv2mr1.es.net
7 chislsdn1-oc192-snv2sdn1.es.net (GRAPH OMITTED)
8 chiccr1-chislsdn1.es.net
9 aofacr1-chicsdn1.es.net (GRAPH OMITTED)
10 esnet.rt1.nyc.us.geant2.net (NO DATA)
11 so-7-0-0.rt1.ams.nl.geant2.net (NO DATA)
12 so-6-2-0.rt1.fra.de.geant2.net (NO DATA)
13 so-6-2-0.rt1.gen.ch.geant2.net (NO DATA)
14 so-2-0-0.rt1.mil.it.geant2.net (NO DATA)
15 garr-gw.rt1.mil.it.geant2.net (NO DATA)
16 rt1-mi1-rt-mi2.mi2.garr.net
17 rt-mi2-rt-rm2.rm2.garr.net (GRAPH OMITTED)
18 rt-rm2-rc-fra.fra.garr.net (GRAPH OMITTED)
19 rc-fra-ru-lnf.fra.garr.net (GRAPH OMITTED)
20
21 www6.lnf.infn.it (193.206.84.223) 189.908 ms 189.596 ms 189.684 ms
66
III. Federated Trust Services – Support for Large-Scale Collaboration
•
Remote, multi-institutional, identity authentication is critical
for distributed, collaborative science in order to permit
sharing widely distributed computing and data resources, and
other Grid services
•
Public Key Infrastructure (PKI) is used to formalize the
existing web of trust within science collaborations and to
extend that trust into cyber space
– The function, form, and policy of the ESnet trust services are driven
entirely by the requirements of the science community and by direct
input from the science community
• International scope trust agreements that encompass many
organizations are crucial for large-scale collaborations
– ESnet has led in negotiating and managing the cross-site, cross-organization,
and international trust relationships to provide policies
that are tailored for collaborative science
→ This service, together with the associated ESnet PKI service, is the
basis of the routine sharing of HEP Grid-based computing resources
between US and Europe
67
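As a hedged illustration of how Grid services lean on this PKI (this is not ESnet or Grid
middleware code; the file names are hypothetical and the example uses the third-party
Python "cryptography" package), the sketch below checks that a user certificate was issued
by a trusted CA such as the DOEGrids CA before granting access to a shared resource.

    # Illustrative sketch only: real Grid middleware performs full chain/path
    # validation, revocation checks, and policy evaluation beyond this.
    from datetime import datetime, timezone
    from cryptography import x509
    from cryptography.hazmat.primitives.asymmetric import padding

    def issued_by_trusted_ca(user_cert_pem: bytes, ca_cert_pem: bytes) -> bool:
        user_cert = x509.load_pem_x509_certificate(user_cert_pem)
        ca_cert = x509.load_pem_x509_certificate(ca_cert_pem)

        # The issuer named in the user certificate must be the CA's subject.
        if user_cert.issuer != ca_cert.subject:
            return False

        # The certificate must be within its validity period.
        now = datetime.now(timezone.utc).replace(tzinfo=None)
        if not (user_cert.not_valid_before <= now <= user_cert.not_valid_after):
            return False

        # The CA's public key must verify the signature on the user certificate
        # (RSA with PKCS#1 v1.5 is assumed here for simplicity).
        try:
            ca_cert.public_key().verify(
                user_cert.signature,
                user_cert.tbs_certificate_bytes,
                padding.PKCS1v15(),
                user_cert.signature_hash_algorithm,
            )
        except Exception:
            return False
        return True

    if __name__ == "__main__":
        with open("user-cert.pem", "rb") as f, open("doegrids-ca.pem", "rb") as g:
            print("trusted:", issued_by_trusted_ca(f.read(), g.read()))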
DOEGrids CA (Active Certificates) Usage Statistics
[Chart: the number of active DOEGrids certificates (active user certificates, active service
certificates, and total active certificates) by month from January 2003 through mid-2007,
growing steadily since the start of production service; an annotation marks the point at
which the US LHC ATLAS project adopted the ESnet CA service.]
Production service began in June 2003
* Report as of July 5, 2007
68
DOEGrids CA Usage - Virtual Organization Breakdown
DOEGrids CA Statistics (6982 certificates):
FNAL 23.7%, **OSG 22.9%, PPDG 22.2%, iVDGL 13.0%, *Others 11.0%, ANL 1.9%, NERSC 1.3%,
LCG 1.2%, ESG 0.8%, LBNL 0.7%, ORNL 0.6%, FusionGRID 0.5%, ESnet 0.2%, PNNL 0.0%
* DOE-NSF collab. & Auto renewals
** OSG includes (BNL, CDF, CMS, CompBioGrid, DES, DOSAR, DZero, Engage, Fermilab, fMRI,
GADU, geant4, GLOW, GPN, GRASE, GridEx, GROW, GUGrid, i2u2, iVDGL, JLAB, LIGO, mariachi,
MIS, nanoHUB, NWICG, OSG, OSGEDU, SBGrid, SDSS, SLAC, STAR & USATLAS)
69
DOEGrids CA Adopts Red Hat CS 7.1
•
Motivation: SunOne CMS (Certificate Management System)
went End-of-Life 2 years ago
– RH CS is a continuation of the original CA product and development
team, and is fully supported by Red Hat
•
Transition was over a year in negotiation, development, and
testing
• 05 July 2007: Transition from SunONE CMS 4.7 to Red Hat
• Major transition - Minimal outage of 7 hours
• Preserved important assets
– Existing DOEGrids signing key
– Entire history: over 35000 data objects transferred
– UI (for subscriber-users and operators)
•
New CA software will allow ESnet to develop more useful
applications and interfaces for the user communities
70
IV. ESnet Conferencing Service (ECS)
• An ESnet Science Service that provides audio, video, and
data teleconferencing service to support human collaboration
of DOE science
– Seamless voice, video, and data teleconferencing is important for
geographically dispersed scientific collaborators
– Provides the central scheduling essential for global collaborations
– ESnet serves more than a thousand DOE researchers and
collaborators worldwide
• H.323 (IP) videoconferences (4000 port hours per month and rising)
• audio conferencing (2500 port hours per month) (constant)
• data conferencing (150 port hours per month)
• Web-based, automated registration and scheduling for all of these
services
– Very cost effective (saves the Labs a lot of money)
71
ESnet Collaboration Services (ECS)
[Diagram: ECS architecture. Audio & data side: six T1s from ESnet feed a Sycamore Networks
DNX, the production Latitude M3 audio bridge, and the production Web Latitude collaboration
server (Dell). Video conferencing side: three Codian MCUs, a Codian ISDN gateway (2 PRIs),
and a Radvision gatekeeper with gatekeeper neighbors, institutional gatekeepers, and the GDS
North American root, serving H.323 endpoints over the Internet and ISDN.]
72
ECS Video Collaboration Service
• High Quality videoconferencing over IP and ISDN
• Reliable, appliance based architecture
• Ad-Hoc H.323 and H.320 multipoint meeting creation
• Web Streaming options on 3 Codian MCU’s using Quicktime
or Real
• 3 Codian MCUs with Web Conferencing Options
• 120 total ports of video conferencing (40 ports per MCU)
• 384k access for video conferencing systems using ISDN protocol
• Access to audio portion of video conferences through the Codian ISDN Gateway
73
ECS Voice and Data Collaboration
• 144 usable ports
– Actual conference ports readily available on the system.
• 144 overbook ports
– Number of ports reserved to allow for scheduling beyond the number of
conference ports readily available on the system.
• 108 Floater Ports
– Designated for unexpected port needs.
– Floater ports can float between meetings, taking up the slack when an extra
person attends a meeting that is already full and when ports that can be
scheduled in advance are not available.
• Audio Conferencing and Data Collaboration using Cisco MeetingPlace
• Data Collaboration = WebEx style desktop sharing and remote viewing of
content
• Web-based user registration
• Web-based scheduling of audio / data conferences
• Email notifications of conferences and conference changes
• 650+ users registered to schedule meetings (not including guests)
74
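The interaction of the usable, overbook, and floater port pools described above can be
sketched in a few lines. This is a conceptual model only, not the Cisco MeetingPlace
scheduling logic; the numbers come from the slide.

    # Conceptual sketch of the ECS port pools (not MeetingPlace code).
    USABLE_PORTS = 144     # conference ports readily available on the system
    OVERBOOK_PORTS = 144   # extra reservations allowed beyond the usable pool
    FLOATER_PORTS = 108    # held back for unexpected attendees in live meetings

    def can_schedule(requested_ports: int, already_reserved: int) -> bool:
        """A new reservation is accepted while total reservations stay within the
        usable + overbook allowance (floaters cannot be scheduled in advance)."""
        return already_reserved + requested_ports <= USABLE_PORTS + OVERBOOK_PORTS

    def can_admit_extra_attendee(ports_in_use: int, floaters_in_use: int) -> bool:
        """An unexpected attendee joins on a floater port once the system is full."""
        if ports_in_use < USABLE_PORTS:
            return True
        return floaters_in_use < FLOATER_PORTS

    print(can_schedule(20, already_reserved=270))                 # False: 290 > 288
    print(can_admit_extra_attendee(ports_in_use=144, floaters_in_use=10))  # True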
ECS Service Level
• ESnet Operations Center is open for service 24x7x365.
• A trouble ticket is opened within 15 to 30 minutes and
assigned to the appropriate group for investigation.
• Trouble ticket is closed when the problem is resolved.
• ECS support is provided Monday to Friday, 8AM to 5 PM
Pacific Time excluding LBNL holidays
– Reported problems are addressed within 1 hour from receiving
a trouble ticket during ECS support period
– ESnet does NOT provide a real time (during-conference) support
service
75
Typical Problems Reported to ECS Support
Video Conferencing
• User E.164 look up
• Gatekeeper registration problems – forgotten IP address or user network problems
• Gateway Capacity for ISDN service expanded to 2 full PRI’s = 46 x 64kbps chs
• For the most part, problems are with user-side network and systems configuration.
Voice and Data Collaboration
• Scheduling Conflicts and Scheduling Capacity has been addressed by expanding
overbooking capacity to 100% of actual capacity
• Future equipment plans will allow for optimal configuration of scheduling
parameters.
• Browser compatibility with Java based data sharing client – users are advised to
test before meetings
• Lost UserID and/or passwords
Words of Wisdom
We advise users that at least two actions must be taken in advance of conferences
to reduce the likelihood of problems:
A) testing of the configuration to be used for the audio, video and data conference.
B) appropriate setup time must be allocated BEFORE the conference to ensure
punctuality and correct local configuration. (at least 15 min recommended)
76
Real Time ECS Support
•
A number of user groups have requested “real-time”
conference support (monitoring of conferences while
in-session)
•
Limited Human and Financial resources currently
prohibit ESnet from:
A) Making real time information available to the public on the
systems status (network, ECS, etc.). This information is
available only on some systems to our support personnel
B) 24x7x365 real-time support
C) Addressing simultaneous trouble calls as in a real time
support environment.
• This would require several people addressing multiple problems
simultaneously
77
Real Time ECS Support
•
Proposed solution
– A fee-for-service arrangement for real-time conference support
– Such an arrangement could be made by contracting directly with
TKO Video Communications, ESnet’s ECS service provider
– Service offering would provide:
• Testing and configuration assistance prior to your conference
• Creation and scheduling of your conferences on ECS Hardware
• Preferred port reservations on ECS video and voice systems
• Connection assistance and coordination with participants
• Endpoint troubleshooting
• Live phone support during conferences
• Seasoned staff and years of experience in the video conferencing
industry
• ESnet community pricing at $xxx per hour (Commercial Price: $yyy/hr)
78
Summary
• ESnet is currently satisfying its mission by enabling SC
science that is dependent on networking and distributed,
large-scale collaboration:
“The performance of ESnet over the past year has been
excellent, with only minimal unscheduled down time. The
reliability of the core infrastructure is excellent.
Availability for users is also excellent” - DOE 2005 annual
review of LBL
• ESnet has put considerable effort into gathering
requirements from the DOE science community, and has
a forward-looking plan and expertise to meet the five-year
SC requirements
– A Lehman review of ESnet (Feb, 2006) has strongly endorsed the
plan presented here
79
References
1. High Performance Network Planning Workshop, August 2002
– http://www.doecollaboratory.org/meetings/hpnpw
2. Science Case Studies Update, 2006 (contact [email protected])
3. DOE Science Networking Roadmap Meeting, June 2003
– http://www.es.net/hypertext/welcome/pr/Roadmap/index.html
4. Science Case for Large Scale Simulation, June 2003
– http://www.pnl.gov/scales/
5. Planning Workshops-Office of Science Data-Management Strategy, March & May 2004
– http://www-conf.slac.stanford.edu/dmw2004
6. For more information contact Chin Guok ([email protected]). Also see
– http://www.es.net/oscars
[LHC/CMS] http://cmsdoc.cern.ch/cms/aprom/phedex/prod/Activity::RatePlots?view=global
[ICFA SCIC] “Networking for High Energy Physics.” International Committee for Future
Accelerators (ICFA), Standing Committee on Inter-Regional Connectivity (SCIC),
Professor Harvey Newman, Caltech, Chairperson.
– http://monalisa.caltech.edu:8080/Slides/ICFASCIC2007/
[E2EMON] Geant2 E2E Monitoring System – developed and operated by JRA4/WI3, with
implementation done at DFN
– http://cnmdev.lrz-muenchen.de/e2e/html/G2_E2E_index.html
– http://cnmdev.lrz-muenchen.de/e2e/lhc/G2_E2E_index.html
[TrViz] ESnet PerfSONAR Traceroute Visualizer
– https://performance.es.net/cgi-bin/level0/perfsonar-trace.cgi
80
And Ending on a Light Note….
NON SEQUITUR - BY WILEY
81