Dynamic Circuit Services in US LHCNet
Artur Barczyk, Caltech
Joint Techs Workshop, Honolulu, 01/23/2008
US LHCNet Overview
Mission-oriented network: provide trans-Atlantic network infrastructure to support the US LHC program
Four PoPs: CERN, Starlight (→ Fermilab), Manlan (→ Brookhaven), SARA
2008: 30 (40) Gbps trans-Atlantic bandwidth (roadmap: 80 Gbps by 2010)
[Map: US LHCNet trans-Atlantic links between CERN, Starlight, Manlan and SARA]
Large Hadron Collider @ CERN
Start in 2008
pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1
27 km tunnel in Switzerland & France
6000+ physicists & engineers, 250+ institutes, 60+ countries
Experiments: ATLAS, CMS, ALICE, LHCb
Physics: Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, … and the Unexpected
Challenges: analyze petabytes of complex data cooperatively; harness global computing, data & network resources
The LHC Data Grid Hierarchy
CERN/Outside Ratio ~1:4; T0/(T1)/(T2) ~1:2:2
~40% of Resources in Tier2s
US T1s and T2s Connect to US LHCNet PoPs
[Hierarchy diagram: Online system/Tier0 at CERN to Tier1s at 10 – 40 Gbps, e.g. a German T1 via GEANT2 + NRENs and the BNL T1 via USLHCNet + ESnet; Tier1 to Tier2 links at 10 Gbps]
Outside/CERN Ratio Larger; Expanded Role of Tier1s & Tier2s: Greater Reliance on Networks
Emerging Vision: A Richly Structured, Global Dynamic System
The Roles of Tier Centers
11 Tier1s, over 100 Tier2s
→ LHC Computing will be more dynamic & network-oriented
Tier 0 (CERN): prompt calibration and alignment; reconstruction; store complete set of RAW data
Tier 1: reprocessing; store part of processed data
Tier 2: Monte Carlo production; physics analysis
Tier 3: physics analysis
The role of each tier defines the dynamism of the data transfers, and with it the requirements for Dynamic Circuit Services in US LHCNet
CMS Data Transfer Volume (May – Aug. 2007)
10 PetaBytes transferred over 4 months = 8.0 Gbps average (15 Gbps peak)
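A quick sanity check on the quoted average, assuming 1 PB = 10^15 bytes and roughly 120 days for May through August:
10 × 10^15 bytes × 8 bit/byte ÷ (120 × 86400 s) ≈ 7.7 Gbps, consistent with the 8.0 Gbps figure.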
End-system capabilities growing
40 G in / 40 G out; 88 Gbps peak; 80+ Gbps sustainable for hours, storage-to-storage
Managed Data Transfers
The scale of the problem and the capabilities of the end-systems require a managed approach with scheduled data transfer requests
The dynamism of the data transfers defines the requirements for scheduling:
Tier0 → Tier1, linked to the duty cycle of the LHC
Tier1 → Tier1, whenever data sets are reprocessed
Tier1 → Tier2, distribute data sets for analysis
Tier2 → Tier1, upload MC-produced data
Transfer classes: fixed allocation, preemptible transfers, best effort (see the sketch below)
All of this will happen "on demand", driven by the experiments' Data Management systems
Priorities and preemption: use LCAS to squeeze low(er)-priority circuits
Interact with end-systems: verify and monitor capabilities
Needs to work end-to-end: collaboration in GLIF, DICE
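To make the request/class/priority model concrete, here is a minimal Python sketch; TransferRequest, preemption_candidates and the field names are hypothetical, not the actual US LHCNet/VINCI interfaces:

```python
from dataclasses import dataclass
from enum import Enum


class TransferClass(Enum):
    """The three transfer classes named on the slide."""
    FIXED_ALLOCATION = 1   # guaranteed bandwidth, not preemptible
    PREEMPTIBLE = 2        # may be squeezed (e.g. via LCAS) by higher priorities
    BEST_EFFORT = 3        # uses whatever capacity is left over


@dataclass
class TransferRequest:
    """Hypothetical request record; field names are illustrative only."""
    source: str              # e.g. "Tier0-CERN"
    destination: str         # e.g. "Tier1-FNAL"
    size_gbytes: float       # "Transfer N Gigabytes from A to B ..."
    target_rate_gbps: float  # requested throughput R1
    transfer_class: TransferClass
    priority: int = 0        # larger value = more important


def preemption_candidates(active, new_request):
    """Active transfers a scheduler could squeeze to admit a higher-priority
    request; only the preemptible class is eligible."""
    return [t for t in active
            if t.transfer_class is TransferClass.PREEMPTIBLE
            and t.priority < new_request.priority]
```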
Managed Network Services: Operations Scenario
Receive request, check capabilities, schedule network resources
"Transfer N Gigabytes from A to B with target throughput R1"
Authenticate, authorize, prioritize
Verify end-host rate capability R2 (achievable rate)
Schedule bandwidth B > R2; estimate time to complete T(0)
Schedule path with priorities P(i) on segments S(i)
Check progress periodically (see the control-loop sketch below)
Compare rate R(t) to R2; update the time to complete T(i) and compare it with T(i-1)
Trigger on behaviours requiring further action
Errors (e.g. segment failure)
Performance issues (e.g. poor progress, channel underutilized, long waits)
State changes (e.g. a new high-priority transfer submitted)
Respond dynamically to match policies and optimize throughput
Change channel size(s)
Build alternative path(s)
Create new channel(s) and squeeze others in the same class
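The periodic check can be pictured as a small control loop. The following Python sketch is illustrative only; measure_rate and on_poor_progress are placeholder callbacks standing in for the real monitoring and policy services:

```python
import time


def monitor_transfer(size_gbytes, r2_gbps, measure_rate, on_poor_progress,
                     check_period_s=60):
    """Toy control loop: compare the achieved rate R(t) with the verified
    end-host capability R2, keep an updated time-to-complete estimate T(i),
    and call a policy callback when progress degrades. Names are illustrative."""
    remaining_gb = size_gbytes
    prev_eta_s = None
    while remaining_gb > 0:
        time.sleep(check_period_s)
        rate_gbps = measure_rate()                         # R(t)
        remaining_gb -= rate_gbps / 8.0 * check_period_s   # Gbps -> GBytes moved
        if remaining_gb <= 0:
            break                                          # transfer complete
        eta_s = remaining_gb * 8.0 / max(rate_gbps, 1e-6)  # T(i)
        degraded = ((prev_eta_s is not None and eta_s > prev_eta_s)
                    or rate_gbps < 0.5 * r2_gbps)
        if degraded:
            # Policy response: resize the channel, build an alternative path,
            # or squeeze other circuits in the same class.
            on_poor_progress(rate_gbps, eta_s)
        prev_eta_s = eta_s


# Example with stubbed-in measurement and policy callbacks:
# monitor_transfer(5000, 9.0, lambda: 7.5,
#                  lambda rate, eta: print(f"slow: {rate} Gbps, ETA {eta:.0f} s"))
```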
Managed Network Services: End-System Integration
Integration of network services and end-systems is required for a robust end-to-end production system
Requires an end-to-end view of the network and the end-systems, with real-time monitoring
Robust, real-time and scalable messaging infrastructure
Information extraction and correlation, e.g. network state, end-host state, transfer queue state
Obtained via interactions between the network services and the end-host agents (EHAs)
Provide sufficient information for decision support
Cooperation of EHAs and network services (an example state report follows below)
Automate some operational decisions using accumulated experience
Increase the level of automation to respond to increases in usage, the number of users, and competition for scarce network resources
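As an illustration of the kind of information an EHA could feed into decision support, field names and format below are assumptions, not the actual MonALISA/EHA schema:

```python
import json
import socket
import time


def end_host_report(transfer_queue_depth, disk_read_gbps, nic_speed_gbps):
    """Hypothetical end-host agent (EHA) status message; the real EHA runs on
    the MonALISA framework and uses its own schema."""
    return json.dumps({
        "host": socket.gethostname(),
        "timestamp": time.time(),
        "nic_speed_gbps": nic_speed_gbps,          # raw interface capability
        "disk_read_gbps": disk_read_gbps,          # storage-to-network limit
        "transfer_queue_depth": transfer_queue_depth,
        "achievable_rate_gbps": min(nic_speed_gbps, disk_read_gbps),  # ~ R2
    })


print(end_host_report(transfer_queue_depth=3, disk_read_gbps=6.0,
                      nic_speed_gbps=10.0))
```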
Lightpaths in US LHCNet domain
Dynamic setup and reservation of lightpaths has been successfully demonstrated by the VINCI project (Virtual Intelligent Networks for Computing Infrastructures in Physics) controlling the optical switches
[Diagram: VINCI control plane managing the optical data plane]
Planned Interfaces
Most, if not all, LHC data transfers will cross more than one domain
E.g. to transfer data from CERN to Fermilab: CERN → US LHCNet → ESnet → Fermilab
VINCI Control Plane for intra-domain provisioning, DCN (DICE/GLIF) IDC for inter-domain provisioning (a stitching sketch follows below)
[Interface diagram labels: I-NNI: VINCI (custom) protocols; UNI: DCN IDC? LambdaStation? TeraPaths? VINCI custom protocol, client = EHA; E-NNI / UNI: Web Services (DCN IDC)]
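Purely as a sketch of the multi-domain picture, and not the VINCI or DCN IDC APIs, the end-to-end request could be decomposed into per-domain segments like this (domain border names are invented placeholders):

```python
from dataclasses import dataclass


@dataclass
class Segment:
    domain: str
    ingress: str
    egress: str
    bandwidth_gbps: float


def plan_interdomain_circuit(domains, bandwidth_gbps):
    """One segment per domain on the path; US LHCNet's own segment would be
    provisioned by VINCI, the others requested through the DCN IDC. Border
    names are synthesized placeholders, not real demarcation points."""
    segments = []
    for i, dom in enumerate(domains):
        ingress = f"border({domains[i - 1]}/{dom})" if i > 0 else "source"
        egress = (f"border({dom}/{domains[i + 1]})"
                  if i < len(domains) - 1 else "destination")
        segments.append(Segment(dom, ingress, egress, bandwidth_gbps))
    return segments


# Example: CERN -> US LHCNet -> ESnet -> Fermilab at 10 Gbps
for seg in plan_interdomain_circuit(["CERN", "US LHCNet", "ESnet", "Fermilab"], 10.0):
    print(seg)
```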
Protection Schemes
Mesh protection at Layer 1
US LHCNet links are assigned to primary users:
CERN – Starlight for CMS
CERN – Manlan for ATLAS
In case of a link failure, we cannot blindly use bandwidth belonging to the other collaboration
Carefully choose protection links, e.g. use the indirect path CERN – SARA – Manlan (see the sketch below)
Designated Transit Lists (DTLs) and DTL-Sets
High-level protection features implemented in VINCI:
Re-provision lower-priority circuits
Preemption, LCAS
Needs to work end-to-end: collaboration in GLIF, DICE
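A toy illustration of the constraint above: the link ownership follows the slide, while the candidate-path check and the Starlight – Manlan entry are invented for illustration and are not VINCI's actual DTL machinery:

```python
# Link ownership as described on the slide; a protection path for one
# collaboration should avoid the other collaboration's primary link.
LINKS = {
    ("CERN", "Starlight"): "CMS",
    ("CERN", "Manlan"): "ATLAS",
    ("CERN", "SARA"): None,          # treated as shared here (assumption)
    ("SARA", "Manlan"): None,        # treated as shared here (assumption)
    ("Starlight", "Manlan"): None,   # hypothetical link, for illustration only
}

# Candidate protection routes for the CERN-Manlan (ATLAS) link, written as
# simple designated transit lists (DTLs).
CANDIDATE_DTLS = [
    [("CERN", "Starlight"), ("Starlight", "Manlan")],  # rides CMS's primary link
    [("CERN", "SARA"), ("SARA", "Manlan")],            # indirect path via SARA
]


def acceptable(dtl, collaboration):
    """Reject any DTL that uses a link whose primary user is a different
    collaboration."""
    for hop in dtl:
        owner = LINKS.get(hop) or LINKS.get((hop[1], hop[0]))
        if owner not in (None, collaboration):
            return False
    return True


print([dtl for dtl in CANDIDATE_DTLS if acceptable(dtl, "ATLAS")])
# -> keeps only the CERN-SARA-Manlan route
```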
Basic Functionality to Date
Semi-automatic intra-domain circuit provisioning
Bandwidth adjustment (LCAS)
End-host tuning by the End-Host Agent
End-to-end monitoring
Pre-production (R&D) setup:
Local domain: routing of private IP subnets onto tagged VLANs (see the sketch below)
Core network (TDM): VLAN-based virtual circuits
[Testbed diagram: Ultralight routers, US LHCNet Ciena CoreDirectors, high-performance servers]
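For illustration only (subnet and VLAN numbers are made up, not the actual US LHCNet configuration), mapping private transfer subnets onto tagged VLANs might look like:

```python
import ipaddress

# Hypothetical mapping: each private transfer subnet rides its own tagged
# VLAN, which the TDM core carries as a dedicated virtual circuit.
SUBNET_TO_VLAN = {
    ipaddress.ip_network("10.1.1.0/24"): 3001,   # e.g. circuit towards Starlight
    ipaddress.ip_network("10.1.2.0/24"): 3002,   # e.g. circuit towards Manlan
}


def vlan_for(address):
    """Return the VLAN tag whose subnet contains the given host address."""
    ip = ipaddress.ip_address(address)
    for subnet, vlan in SUBNET_TO_VLAN.items():
        if ip in subnet:
            return vlan
    return None


print(vlan_for("10.1.2.17"))   # -> 3002
```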
MonALISA: Monitoring the US LHCNet Ciena CDCI Network
[Screenshot: MonALISA view of the US LHCNet PoPs: CERN (Geneva), SARA, Starlight, Manlan]
Roadmap Ahead
The current capabilities include:
End-to-end monitoring
Intra-domain circuit provisioning
End-host tuning by the End-Host Agent
Towards a production system (intra-domain):
Integrate the existing end-host agent, monitoring and measurement services
Provide a uniform user/application interface
Integration with the experiments' Data Management Systems
Automated fault handling
Priority-based transfer scheduling
Include Authentication, Authorisation and Accounting (AAA)
Towards a production system (inter-domain):
Interface to the DCN IDC
Work with DICE and GLIF on the IDC protocol specification
Topology exchange, routing, end-to-end path calculation
Extend the AAA infrastructure to multi-domain
Summary and Conclusions
Movement of LHC data will be highly dynamic
Following the LHC data grid hierarchy
Different data sets (size, transfer speed and duration), different priorities
Data Management requires network awareness
Guaranteed bandwidth end-to-end (storage-system to storage-system)
End-to-end monitoring including the end-systems
We are developing the intra-domain control plane for US LHCNet
VINCI project, based on the MonALISA framework
Many services and agents are already developed or in an advanced state
Use Internet2's IDC protocol for inter-domain provisioning
Collaboration with Internet2, ESnet, LambdaStation, TeraPaths on end-to-end circuit provisioning