20080311-LHC_Community_BCP


US LHC Tier-2
Network
Performance
BCP
LHC Community Network
Performance
Recommended BCP
Eric Boyd
Deputy Technology Officer
Internet2
Mar-3-08
Recap
•At the November 2007 LHC OPN meeting,
the group asked Internet2 and ESnet to
work on a straw man “Best Practices
Guide” for deploying perfSONAR
What
•Straw man recommendation from US
perfSONAR participants to US ATLAS and US
CMS sites
•Working on a set of recommendations to help
the US LHC community better react to
network performance problems
•Plan to develop these recommendations with
the Internet2 HENP-SIG, the US ATLAS and
US CMS communities, participants from a
BNL/FNAL sponsored workshop this spring,
as well as anyone else interested in
developing a best practices guide
Recommended Goals
1. Characterize and track network connectivity and
performance to important peer sites
2. Characterize and quantify network performance
problems
3. Differentiate between application and network
performance problems
4. Differentiate between local and remote network
problems
5. Identify, understand and respond effectively to
changes in the underlying network
Recommended Primary Use Cases
• End scientist attempting to determine why
data transfers to her lab are not fast enough
• Site validating/debugging transfers to/from
other sites
• Site validating/debugging transfers to/from
end scientist
Recommended Approach: Network
Performance Troubleshooting
•End-to-End network performance analysis
• TCP transfer throughput (reported by application/end-user)
• Identify where transfer is limited
• Application related problems
• Network end system problems (NDT)
• Network path problems (perfSONAR OWAMP, perfSONAR
BWCTL)
•Network Performance Analysis Methodology
• Problem identification
• Step-by-step remediation of the detected problems
• Packet trace analysis as last resort
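The "identify where transfer is limited" step above can be sketched as a simple triage function. This is a minimal illustration only, not part of NDT or perfSONAR: the `Measurements` fields, the `locate_limit` name, and the 80% tolerance and 0.1% loss thresholds are all assumptions chosen for the example. It assumes the site has already collected an application-reported rate, an end-system test result, and path test results by whatever means it uses.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    """Hypothetical inputs; field names are illustrative, not a tool's API."""
    app_throughput_mbps: float   # rate reported by the application or end user
    host_throughput_mbps: float  # end-system test result (an NDT-style test)
    path_throughput_mbps: float  # achievable throughput between measurement hosts
    path_loss_rate: float        # one-way loss fraction on the path (0.0 to 1.0)


def locate_limit(m, expected_mbps, tolerance=0.8, loss_threshold=0.001):
    """Return a coarse label for where a transfer appears to be limited.

    The thresholds are assumptions for this sketch, not recommended values.
    """
    target = expected_mbps * tolerance
    if m.app_throughput_mbps >= target:
        return "no problem detected"
    # Path tests between dedicated measurement hosts exclude the application
    # and the production end systems, so a poor result points at the network.
    if m.path_throughput_mbps < target or m.path_loss_rate > loss_threshold:
        return "network path"
    # Path looks healthy; check whether the end system can fill it.
    if m.host_throughput_mbps < target:
        return "end system"
    # Path and end system both look healthy, so suspect the application.
    return "application"


if __name__ == "__main__":
    sample = Measurements(100.0, 950.0, 950.0, 0.0)
    print(locate_limit(sample, expected_mbps=1000.0))
```

A label from a function like this is only a starting point for the step-by-step remediation listed above; packet trace analysis remains the last resort.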
Recommended Infrastructure
•Tools and archives will be made
available with the perfSONAR
infrastructure
•New deployments will be found using
the perfSONAR Lookup Service
•New tools can be integrated into the
infrastructure at any time
Basic Strategy
Each site (T0, T1, T2, …) acting independently:
•Exposes active measurement targets to
support/control other sites' tests to them
•Performs active tests to other participants
•Collects and exposes passive metrics (SNMP,
sFlow, etc.) using pS archives
•Collects results from active tests and exposes
metrics using pS archives
Any participant:
• Can then use analysis tools to interact with any
available archives to examine performance
problems
Analysis of Strategy
•Success of strategy scales with the
degree of participation (Metcalfe’s Law)
•New tools and analysis can be phased
into the infrastructure as they become
available
• Analysis that is specific to this community
can be integrated into the infrastructure
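One way to see the scaling claim above is to count how many site pairs can test to each other as participation grows. The sketch below is a rough proxy only: it assumes every participating site can test to every other participant, and the function name `measurable_pairs` is invented for the illustration.

```python
def measurable_pairs(sites: int) -> int:
    """Number of distinct site pairs among `sites` participants: n(n-1)/2."""
    return sites * (sites - 1) // 2


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        # Doubling participation roughly quadruples the measurable pairs.
        print(n, measurable_pairs(n))
```

Going from 10 to 20 participating sites raises the count from 45 to 190 pairs, which is why each additional participant adds more coverage than the previous one.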
Site Participation Levels
No Participation (Or Worse):
• Hostile: firewalls (blocked ICMP)
• Non-cooperative: no tools, no data
Limited Partner:
• Willing target: daemons installed
Active Partner:
• Participant: daemons installed, active testing to peers
• Data Provider: passive/active test results shared
RECOMMENDED: Limited participation (T3s) or active
participation (T1s and T2s)
Site Involvement Levels
•Not interested
•Hands-off
• Delegate participation to a 3rd Party
•Hands-on (any subset)
• Manage hardware
• Install software
• Manage software
• Manage data collection
• Decide testing strategy
• Decide data access policy
Site Deployment Options
Target Options
• Knoppix install
• Tool installation
– owampd/bwctld
Very limited configuration necessary; once tools are
installed, very little maintenance is required
Active Partner Options
• Knoppix install
– Add perfSONAR
• Tool installation
– owampd/bwctld
• perfSONAR (CPAN install)
More extensive configuration
Identify important services to your site, monitor to those sites
Initial Useful Metrics and Tools
Network Path characteristics
•Round trip time (perfSONAR PingER)
•Routers along the paths (traceroute)
•Path utilization/capacity (perfSONAR SNMPMA)
•One-way delay, delay variance (perfSONAR
OWAMP)
•One-way packet drop rate (perfSONAR
OWAMP)
•Packet reordering (perfSONAR OWAMP)
•Achievable throughput (perfSONAR BWCTL)
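To make the one-way metrics above concrete, the sketch below derives delay, delay variation, drop rate, and reordering from a list of probe records. It is not the OWAMP implementation or its output format: the `(seq, send_ts, recv_ts)` record layout and the `summarize` function are assumptions for illustration, the sample data is made up, and the delays are only meaningful if sender and receiver clocks are synchronized.

```python
from statistics import mean, pstdev


def summarize(probes):
    """Summarize one-way probe records.

    probes: list of (seq, send_ts, recv_ts) tuples with timestamps in seconds;
    recv_ts is None for a probe that never arrived. Hypothetical layout.
    """
    received = [p for p in probes if p[2] is not None]
    delays = [recv - send for _, send, recv in received]

    # Walk packets in arrival order; a packet counts as reordered if a
    # higher sequence number has already arrived before it.
    reordered = 0
    highest = -1
    for seq, _, _ in sorted(received, key=lambda p: p[2]):
        if seq < highest:
            reordered += 1
        else:
            highest = seq

    return {
        "sent": len(probes),
        "loss_rate": 1 - len(received) / len(probes) if probes else None,
        "mean_delay": mean(delays) if delays else None,
        "delay_stdev": pstdev(delays) if delays else None,
        "reordered": reordered,
    }


if __name__ == "__main__":
    # Made-up sample: probe 2 is lost, probe 1 arrives after probe 3.
    sample = [
        (0, 0.0, 0.010),
        (1, 0.1, 0.35),
        (2, 0.2, None),
        (3, 0.3, 0.31),
    ]
    print(summarize(sample))
```

Standard deviation is used here as one simple measure of delay variation; other definitions (for example, percentile spreads) are equally valid choices.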
Plan forward
• Specific analysis methodology will be
developed with the community of users.
(methods must match usage patterns)
• Specific metrics and tools will be
recommended based on needs of
methodology