WP7 Web sites
Network Monitoring and GridPP
Richard Hughes-Jones, University of Manchester
6 November 2001
DataGrid WP7
MB – NG
DataTAG
GridPP
GridPP Collaboration Meeting Nov 2001
R. Hughes-Jones Manchester
DataGrid WP7 Networking
Active and proceeding well, Meetings:
Oxford DataGrid meeting Jul 01
CERN 18 Sep 01
Frascati DataGrid meeting Oct 01
Provisioning, Reports on
DataGrid network requirements and current infrastructure
Use of IP ports for TestBed1
Monitoring (Robin Tasker)
High Performance High Throughput (Richard Hughes-Jones)
QoS and Bandwidth Reservation (Tiziana Ferrari)
Security (Dave Kelsey)
Network Monitoring
Several tools in test – plugged into a coherent structure:
PingER, RIPE one way times, iperf, UDPmonRE, rTPL, GridFTP,
and NWS prediction engine
continuous tests for last month to selected sites:
DL Man RL UCL CERN Lyon Bologna SARA NBI SLAC …
Discussions this week at WP7 in Amsterdam
The aims of monitoring for the Grid:
to inform Grid applications, via the middleware, of the current status of the
network – input for resource broker and scheduling
to identify fault conditions in the operation of the Grid
to understand the instantaneous, day-to-day, and month-by-month
behaviour of the network – provide advice on configuration etc.
Report written on LDAP scheme for publishing the network
information
Network Monitoring Architecture
[Diagram: network monitoring architecture]
Diagram labels: LDAP Schema; Grid Apps; GridFTP
Monitoring tools: PingER (RIPE TTB), iperf, rTPL, NWS, etc.
Local Network Monitoring: Store & Analysis of Data (Access)
Monitor process to push metrics
Backend LDAP script to fetch metrics
Local LDAP Server
Grid Application access via LDAP Schema to
- monitoring metrics;
- location of monitoring data.
Access to current and historic data and metrics via the Web, i.e. WP7 NM Pages; access to metric forecasts
Robin Tasker
How do the Grid apps access the
metrics of network monitoring?
What is the RTT viewed from UCL to RAL and QMW?
Query an LDAP server that makes use of an
LDAP Schema containing the ObjectClass RTT
to find out.
How?
Proposed tree here (diagram labels, root first):
o=grid
ou=DataGrid
ou=uk
dc=ucl  dc=rl  dc=qmw
cn=netmon
rou=rl  rou=qmw
hn=host1  hn=host2
and find the details from the LDAP Schema
for publication of the RTT metric
service=netmon, dc=ucl, ou=UK, ou=DataGrid, o=Grid
rou=ral
objectclass=networkmonitorHost
objectclass=networkmonitorRTT
objectclass=networkmonitorThroughput
objectclass=networkmonitorLoss
rou=qmw
objectclass=networkmonitorHost
objectclass=networkmonitorRTT
objectclass=networkmonitorThroughput
objectclass=networkmonitorLoss
PingER RTT metric
rTPL RTT metric
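The lookup described above can be sketched as a search base plus a filter. This is only an illustration, not the WP7 implementation: it assumes each remote site (rou=...) sits directly beneath the service=netmon entry of the local site, as the example DN on the slide suggests, and the function name rtt_query is hypothetical. The slide does not say which attribute carries the RTT value, so the sketch stops at selecting the entry.

```python
def rtt_query(local_site, remote_site):
    """Return (search_base, search_filter) for the RTT entry of one site pair."""
    # Base DN modelled on the slide's example:
    #   service=netmon, dc=ucl, ou=UK, ou=DataGrid, o=Grid
    # Placing rou=<remote> directly under it is an assumption.
    base = ", ".join([
        f"rou={remote_site}",   # remote end of the measurement
        "service=netmon",
        f"dc={local_site}",     # site that runs the monitor
        "ou=UK",
        "ou=DataGrid",
        "o=Grid",
    ])
    # Object class name taken from the slide.
    search_filter = "(objectclass=networkmonitorRTT)"
    return base, search_filter


# "What is the RTT viewed from UCL to RAL and QMW?"
for remote in ("ral", "qmw"):
    base, search_filter = rtt_query("ucl", remote)
    print(base, search_filter)
```

The two strings could then be handed to any LDAP client as the search base and filter.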
to allow a request to return a valid metric
[Diagram: path of a query through the LDAP server]
1) Query server (Slapd frontend)
2) Sends query to backend (Ftree backend)
3) Check datafile for metric update (Backend Scripts)
4) Read new metric from logfile
5) Reply to query (backend to frontend)
6) Reply to query (frontend to client)
Datafiles: PingER, IperfER, rTPL. Each tool is run periodically and generates its data file.
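Steps 3 to 5 of the query path can be sketched as below. This is a minimal illustration under stated assumptions, not the actual backend script: the class name MetricBackend is hypothetical, and the data file layout (one "timestamp value" pair per line, newest last) is invented for the example because the slides do not describe the real PingER, IperfER or rTPL file formats.

```python
import os


class MetricBackend:
    """Hypothetical backend that serves the newest value from a tool's data file."""

    def __init__(self, datafile):
        self.datafile = datafile
        self._signature = None   # (mtime, size) seen at the last read
        self._latest = None      # most recent (timestamp, value) read

    def query(self):
        # Step 3: check whether the data file changed since the last query.
        st = os.stat(self.datafile)
        signature = (st.st_mtime_ns, st.st_size)
        if signature != self._signature:
            # Step 4: read the new metric (assumed format: "timestamp value").
            with open(self.datafile) as f:
                lines = [line for line in f if line.strip()]
            if lines:
                timestamp, value = lines[-1].split()
                self._latest = (int(timestamp), float(value))
            self._signature = signature
        # Step 5: reply to the frontend with the current metric.
        return self._latest
```

Re-reading the file only when it has changed keeps repeated queries cheap while the measurement tool, which runs periodically, has not yet produced a new value.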
Data: RIPE 1-way & TCP throughput
[Plot: RIPE 1-way time (ms), Sara RAL]
[Plot: RIPE 1-way time (ms), RAL Sara]
[Plot: TCP Iperf + prediction (Mbit/s), UCL Sara]
Date label: 20 Oct 01
Data: Ping & UDP throughput
[Plot: PingER rtt (ms), dl to cern, 1000 byte packet]
[Plot: UDPmon throughput (Mbit/s), man to cern, 300 * 1400 byte frames]
Other labels: Forecast; From 20 Oct 01
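As a worked illustration of the UDP throughput figure, the rate for a burst follows from the amount of data and the time taken to receive it. The sketch below is an assumption about the arithmetic, not a description of how UDPmon itself computes its result: the function name is hypothetical, the elapsed time is made up, and protocol header overhead is ignored.

```python
def udp_throughput_mbit(frames, frame_bytes, elapsed_s):
    """User-data throughput in Mbit/s for a burst received in elapsed_s seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return frames * frame_bytes * 8 / elapsed_s / 1e6


# 300 frames of 1400 bytes, as on the slide; the 40 ms receive time is invented.
print(udp_throughput_mbit(300, 1400, 0.040))
```

With these inputs the burst carries 3.36 Mbit, so a 40 ms receive time corresponds to about 84 Mbit/s.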
High Performance High Throughput
Document produced outlining the tests to be made:
Understanding the end system HW,
best way to monitor traffic and protocol packets,
Type and effect of background traffic,
Throughput vs rtt, throughput vs window-size,
Throughput using multiple TCP streams, and effect on the Network
GridFTP at Gigabit,
Effect of different TCP stacks,
Use of non-TCP protocols, and effect on the Network
Show and tell demos
Discussions this week at WP7 in Amsterdam
Links to MB-NG, DataTAG, GGF/IETF
QoS and BW Reservation
Proposed workplan presented at Oxford WP7 meeting
Identification of traffic classes and middleware components
requiring QoS
• Discussion with other WorkPackages – definition of application
requirements
• GridFTP packet loss
• Interactive applications packet loss, delay, jitter
• Piloting of IP Premium (GEANT and NRNs)
Study of traffic differentiation and traffic engineering techniques:
• Layer 2 MPLS VPNs (CISCO and Juniper)
• LAN & WAN QoS
• Traffic classification, marking, policing, congestion control,
queue scheduling, traffic aggregation
Demonstration of QoS & Bandwidth Reservation
Document in production outlining the tests to be made
MB - NG
E-science core project
Project to investigate and pilot:
end-to-end traffic engineering and management over multiple
administrative domains: MPLS in the core, DiffServ at the edges.
Managed bandwidth and Quality-of-Service provision. (Robin T)
High performance high bandwidth data transfers. (Richard HJ)
Demonstrate end-to-end network services to CERN using Dante
EU-DataGrid and the US DataTAG.
Status: approved with requested funds; start on 1 Dec 2001
Partners: CISCO, CLRC, Manchester, UCL, UKERNA plus
Lancaster and Southampton (IPv6)
A technical meeting held – draft spec. of the HW required
Would like to use real Grid traffic, maybe:
CDF UCL-RAL
BaBar Man-RAL
MB-NG SuperJANET Testbed
[Diagram: testbed topology]
Sites labelled: Manc, MCC, Leeds, RAL / UKERNA (twice), ULCC, UCL
SJ4 Dev C-PoPs: Warrington, Reading, London
SuperJANET4 Production Network
Link legend: Gigabit Ethernet, 2.5 Gbit POS, 10 Gbit POS
DataTAG The EU DataTAG project
EU Transatlantic Gigabit project.
Status: approved for 4 M ECU; start on 1 Dec 2001
Partners: CERN/PPARC/INFN/UvA. IN2P3 sub-contractor
The main foci are:
Grid Network Research including:
Provisioning (CERN)
Investigations of high performance data transport (PPARC)
End-to-end inter-domain QoS + BW / network resource reservation
Bulk data transfer and monitoring (UvA)
Interoperability between Grids in Europe and the US
PPDG, GriPhyN, DTF, iVDGL (USA)
DataTAG project
[Diagram: DataTAG connectivity]
Networks labelled: NL SURFnet, UK SuperJANET4, IT GARR-B, GEANT, CERN, NewYork, STAR-LIGHT, STAR-TAP, MREN, ESNET, Abilene
2.5 Gbit lambda between CERN and Starlight
POS 2nd half 2002 – WDM later
Olivier Martin
DataTAG
Single stream vs Multiple streams
effect of a single packet loss (e.g. link error, buffer overflow)
[Figure: throughput (Gbps) against time after a single packet loss, for different numbers of streams; recovery intervals marked T and 2T]
Streams/Throughput values as extracted: 10, 5, 1, 7.5, 4.375, 2, 9.375
Averages labelled on the plot: Avg. 7.5 Gbps, Avg. 4.375 Gbps, Avg. 3.75 Gbps
T = 2.37 hours! (RTT = 200 msec, MSS = 1500 B)
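The recovery time quoted on this slide can be reproduced approximately with textbook TCP congestion avoidance, in which the window halves on a loss and then grows by one segment per round trip. The sketch below is a reconstruction under assumptions, not the calculation behind the original figure: the 5 Gbps stream rate and the 1460-byte payload (a 1500-byte packet minus 40 bytes of headers) are chosen because they land close to the quoted numbers, and the function name is hypothetical.

```python
def aimd_recovery(rate_bps, rtt_s, mss_bytes):
    """Time and mean rate for one TCP stream to regain rate_bps after one loss.

    Assumes textbook congestion avoidance: the window halves on the loss,
    then grows by one segment per round trip.
    """
    # Window (in segments) needed to fill the pipe at the full rate.
    window_segments = rate_bps * rtt_s / (8 * mss_bytes)
    # Half the window was lost and is regained at one segment per round trip.
    round_trips = window_segments / 2
    recovery_s = round_trips * rtt_s
    # The rate ramps linearly from 50% to 100%, so the mean is 75%.
    mean_rate_bps = 0.75 * rate_bps
    return recovery_s, mean_rate_bps


seconds, mean_rate = aimd_recovery(5e9, 0.2, 1460)
print(f"{seconds / 3600:.2f} hours, mean {mean_rate / 1e9:.2f} Gbps")
```

Under these assumptions the stream needs about 2.38 hours to recover and averages 3.75 Gbps meanwhile, in line with the "T = 2.37 hours" and "Avg. 3.75 Gbps" labels. Using 1500 bytes per segment instead gives about 2.31 hours.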
GGF
Grid High-Performance Networking RG
Talk on Lambda Networking Research, Cees de Laat, UvA
BW reservation & Gigabit Tests ATLAS at Michigan, E. Myers
Charter fully discussed
Network monitoring WG
Much in common with WP7; two-way exchange of techniques.
GridFTP
UK participation in protocols and α-testing with the Globus
developers.
LDAP Schema for network status
Discussions with Globus
Focus on The UK and GridPP
GridPP + DataGRID + DataTAG + MB-NG + GGF will
collaborate closely.
GridPP has UK specific issues and includes experiments at:
BaBar with links to SLAC
CDF and D0 working at Fermi
UKQCD UKDMC (dark matter) MINOS
PPNCG is a natural forum for UK GridPP network matters
Recent throughput problems, perceived as transatlantic,
traced to on-campus bottlenecks.
The PPNCG strongly encourage good liaison between the
HEP teams and the campus networking groups.
Compiled a picture of the access BW to the HEP sites:
Connectivity of UK Grid Sites.
Columns: site; BW to campus; then BW to site, limit and notes as given on the slide.

Site         BW to campus      BW to site, limit, notes
Glasgow      1G                100M 30M?
Edinburgh    1G                100M
Lancaster    155M              100M; move to c&nlman at 155Mbit
Durham       155M              ??100M
Manchester   1G                100M; 1 G soon
Sheffield    155M              ??100M
Liverpool    155M              100M; 4*155M soon. To hep ?
Cambridge    1G                16M?
DL           155M              100M
UCL          155M              1G 30M?
Birmingham   622M              ?? 100M
Oxford       622M              100M
IC           155M              34M then 1G to Hep
RAL          622M              100M; Gig on site soon
QMW          155M              ??
Swansea      155M              100M
Brunel       155M              ??
Bristol      622M              100M
Portsmouth   155M              100M
RHBNC        34M, 155M soon    ?? 100M
Southampton  155               100M
Sussex       155               100M
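The point of compiling these figures is that the slowest segment on the path bounds what a transfer can achieve, which is how problems perceived as transatlantic came to be traced to on-campus bottlenecks. A minimal sketch of that reasoning follows; the helper names are hypothetical, and reading the Glasgow entries as campus, site and limit (with the question mark dropped) is an assumption about the table.

```python
UNITS = {"M": 1e6, "G": 1e9}


def to_bps(text):
    """Convert a table entry such as '155M' or '1G' to bit/s."""
    return float(text[:-1]) * UNITS[text[-1]]


def bottleneck_bps(*segments):
    """The slowest segment bounds the achievable end-to-end rate."""
    return min(to_bps(segment) for segment in segments)


# Glasgow: 1G to campus, 100M to site, 30M limit (question mark dropped).
print(bottleneck_bps("1G", "100M", "30M") / 1e6, "Mbit/s")
```

On this reading a 1G campus connection does not help a site that sits behind a 30M limit.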
Ptolemy simulation of the Grid (1)
Ptolemy - a discrete event simulation tool from Berkeley
Simulation based on flows from “DataGrid-7-D7.1019-1-0-NetworkRequirements.doc”
Predicts the flow between Institutes and across SuperJANET
Paul Mealor
Ptolemy simulation of the Grid (2)
[Diagram: simulated topology]
Nodes labelled: CERN, RAL, Bristol, Liverpool, SuperJANET, Lancaster, Birmingham, Tier 3, London, Man, Glasgow & Edinburgh, (Imperial)
Paul Mealor
Don’t Forget Involvement with:
UKQCD UKDMC (dark matter) MINOS
AstroGRID
AccessGRID
Grids for High Performance Computer Centres –
Edinburgh Manchester
Lambda Switching Projects
…
European Access
PPNCG monitoring shows packet
loss and increased rtt from the UK
to Europe.
Due to packet loss at the TEN155gw.ja.net
Start of problems 12 Oct
10-20 % loss seen by 31 Oct
The 155 Mbit line to Dante running
at ~130 Mbit
A reconfiguration on Fri 2 Nov helped.
TEN155 contract ends 1 Dec;
replaced by Dante 2.5 Gbit access link.