
INFN-T1 site report
Andrea Chierici
On behalf of INFN-T1 staff
HEPiX Fall 2016
Outline
• Facilities
• Network
• Data Management and Storage
• Farming
• Evolution of monitoring
Facilities
Activities
• BMS switch
– From TAC Vista to SBO by Schneider Electric → DONE
• Chillers substitution: a more efficient solution, using a new
configuration to bypass several architectural constraints.
– Preliminary draft → DONE
– Executive project → PENDING
– Implementation → scheduled for next year
• Clean room for the tape library: the amount of dust in the
room is causing problems with the drives. The best solution
seems to be building a clean room.
– Preliminary draft → DONE
– Implementation → scheduled for next year
BMS Upgrade
• The old BMS was TAC Vista:
– Phased out
– Many cons: difficult to edit, no compatibility with open protocols, GUI
based on Java, ...
• Adopted Schneider StruxureWare™ Building Operation software (SBO):
– Possible to re-use the same “sensors” and “collectors”, minimizing
hardware substitution costs (only 3 new Automation Servers required)
– Full compatibility with Modbus TCP/IP, serial and TAC Vista LonWorks
networks
– The web station user interface requires just a standard browser (no Java
or other plug-ins)
– Runs on mobile devices
– Open to standard protocols (e.g. web services)
– Time saving: 8 weeks to complete the migration
Chillers substitution
The current chiller plant is far from efficient. The choice made in
2007 to place the chillers in a confined room (level -1) was forced
by the constraints of the available space. Now the adoption of
Turbocor™ compressors makes it easier to implement long-distance
remote condensing units, thus bypassing these architectural
constraints and increasing energy savings by up to 50%.

Proposal:
• Three chillers (N+1) of 700 kW each
• Turbocor compressors
• Remote condensation
• New “green” refrigerant (HFO1234ze)
Tape Library
The amount of dust in the room is causing problems with the drives
and tapes. Separating, cleaning and pressurizing the library
environment, and finely filtering the incoming air, should address
the problem at its source.

Proposal:
• One air handler with high-efficiency filters, a centrifugal fan
and a bypass for the minimum air-flow requirements.
• Dust-free materials for the “yellow” wall.
Network
WAN@CNAF (before 26 July)

[Diagram: WAN connectivity of the CNAF Tier-1. A 40 Gb physical link
(4x10 Gb), shared by LHCOPN and LHCONE, runs from the Tier-1
NEXUS/Cisco 7600 routers to GARR (Bo1, Mi1); 20 Gb/s are dedicated to
general IP connectivity. LHC OPN peers include KR-KISTI, RRC-KI and
JINR; LHC ONE reaches the main Tier-2s, IN2P3, RAL, SARA, PIC,
TRIUMF, BNL, FNAL, TW-ASGC and NDGF.]
WAN@CNAF (since 26 July)

[Diagram: same topology as above; the LHC OPN share of the 40 Gb
physical link (4x10 Gb) towards CERN was upgraded, while the total
physical bandwidth shared by LHCOPN and LHCONE stayed unchanged.]
Network usage trends
• 26 July: upgraded the bandwidth on the LHC OPN towards CERN,
leaving the total physical bandwidth untouched (4x10 Gb/s shared by
OPN+ONE).
• It was immediately clear that the main bandwidth request is on the
LHC OPN.

[Plots: OPN+ONE daily, OPN yearly, OPN daily and ONE daily traffic.
The yearly OPN plot shows saturation after the CNAF-CERN (OPN)
upgrade to 4x10 Gb/s of 26 July 2016, visible from around 15
September.]
WAN@CNAF (OPN+ONE upgrade 40→60 Gb/s)

[Diagram: same topology as above; the physical link shared by LHCOPN
and LHCONE upgraded from 40 Gb (4x10 Gb) to 60 Gb (6x10 Gb), with 20
Gb/s still dedicated to general IP connectivity.]
Physical Tier-1 network upgrade to 60 Gb/s
• 26 September: upgraded the Tier-1 physical WAN bandwidth
(LHC-OPN/ONE) from 4x10 Gb/s to 6x10 Gb/s.
• 3 October: reached 60 Gb/s (see the capacity sketch below).

[Plots: OPN+ONE daily and weekly statistics, OPN daily, ONE daily.
On the OPN, 90% of the peak traffic was XRootD towards WNs from CERN
machines; the traffic on ONE was generated mainly by Tier-2s.]
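As a rough sanity check of the figures above, here is a
back-of-the-envelope sketch (not from the original slides) of what
the upgraded link can carry at full saturation, ignoring protocol
overhead:

```python
# Back-of-the-envelope check of the upgraded WAN capacity
# (assumes ideal saturation; real links carry protocol overhead).

LINKS = 6                # physical 10 Gb/s links after the 26-Sep upgrade
LINK_GBPS = 10           # per-link capacity in gigabits per second

aggregate_gbps = LINKS * LINK_GBPS               # 60 Gb/s, reached on 3 Oct
aggregate_GBps = aggregate_gbps / 8              # 7.5 gigabytes per second
daily_volume_TB = aggregate_GBps * 86400 / 1000  # ~648 TB/day at full load

print(f"{aggregate_gbps} Gb/s = {aggregate_GBps} GB/s "
      f"= {daily_volume_TB:.0f} TB/day")
```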
Medium term evolution
• GARR should be able to provide 100 Gb/s links starting from May
2017
• We are purchasing 2x100 Gb/s cards to be placed in the new core
switch (already acquired, currently in burn-in)
Data Management and Storage
Storage Resources
• 23 PB (net) of disk
‒ GPFS 4.1
‒ 3–4 PB per file system
• 32 PB on tape
‒ TSM 7.2
‒ 17 tape drives (T10KD)
• High-density installations:
‒ 2015 tender: 10 PB (net) in 3 racks
• Decreased number of servers
‒ 150 → 100
• 2016 tender: 3 PB
‒ Huawei OceanStor 6800 V3
Storage servers consolidation
• Introduced an FDR InfiniBand infrastructure
‒ 56 Gbit/s per port
‒ 2 switches (36 ports each)
‒ SAN-like fabric configuration (fully redundant)
• 4x10 Gbit Ethernet bonding
‒ 3x10 Gbit currently in use (limited by the number of 10 Gbit
ports currently available on the core switch); see the bandwidth
sketch below
• 12 I/O servers for 10 PB of usable capacity
‒ 1420 TB for every pair of I/O servers
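To put these numbers side by side, a small sketch (not from the
slides) comparing the per-server Ethernet front-end with the
InfiniBand back-end; it assumes ideal line rates and ignores protocol
overhead and bonding-hash imbalance:

```python
# Rough per-server bandwidth comparison for the consolidated I/O servers.

IB_FDR_GBPS = 56          # FDR InfiniBand, one port, back-end fabric
ETH_BOND_LINKS = 4        # 10 Gbit Ethernet links in the bond
ETH_LINKS_IN_USE = 3      # limited by free 10 Gbit ports on the core switch

eth_now_gbps = ETH_LINKS_IN_USE * 10     # 30 Gb/s usable today
eth_full_gbps = ETH_BOND_LINKS * 10      # 40 Gb/s once all ports are cabled

# The Ethernet front-end, not the InfiniBand back-end, is the bottleneck:
print(f"front-end now: {eth_now_gbps} Gb/s, full bond: {eth_full_gbps} Gb/s, "
      f"IB back-end: {IB_FDR_GBPS} Gb/s per port")
```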
ATLAS@CNAF
Tape servers consolidation
• To reduce TSM license costs we moved each (big) experiment
configuration from active/active to active/stand-by
• The mean daily transfer rate of each server is hitting 600 MB/s
(see the sketch below)
‒ Each tape server is capable of handling 800 MB/s of inbound and
outbound traffic simultaneously
‒ The theoretical limit is set by the single FC8 connection
towards the Tape Area Network
‒ Some inefficiency in selecting the migration candidates prevents
us from reaching a mean daily transfer rate of 800 MB/s
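For context, a small sketch (not from the slides) of where the
800 MB/s ceiling comes from and what a 600 MB/s mean rate amounts to
per day; the encoding arithmetic is standard for 8 Gb/s Fibre
Channel:

```python
# An 8 Gb/s Fibre Channel (FC8) link runs at 8.5 Gbaud with 8b/10b
# encoding, leaving ~850 MB/s of raw payload per direction before
# framing overhead; the usable figure is commonly quoted as 800 MB/s.

FC8_GBAUD = 8.5                      # line rate, gigabaud
ENCODING_EFFICIENCY = 8 / 10         # 8b/10b: 8 data bits per 10 line bits

payload_MBps = FC8_GBAUD * 1e9 * ENCODING_EFFICIENCY / 8 / 1e6  # ~850

mean_MBps = 600                      # observed mean daily rate per server
daily_TB = mean_MBps * 86400 / 1e6   # ~51.8 TB moved per server per day

print(f"FC8 payload = {payload_MBps:.0f} MB/s; "
      f"600 MB/s = {daily_TB:.1f} TB/day")
```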
Farming
Computing resources
Farm power: 200k HS06
– 2016 tender still to be delivered
– 2016 tender: Huawei X6800
• 255 x dual Xeon E5-2618L v4, 128 GB RAM
• Should provide 400 HS06 each (see the sketch below)
– We continuously get requests for extra pledges, so it is difficult
to switch off old nodes
• But we decommissioned many racks this year
– New hardware for the virtualization infrastructure
• Managed by oVirt
• New VMware contract at a special discounted price
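From the figures quoted above, the tender's contribution to the farm
works out as follows (a quick arithmetic sketch, not from the
slides):

```python
# Contribution of the 2016 tender to the farm, from the slide's figures.
NODES = 255            # Huawei X6800 nodes, dual Xeon E5-2618L v4
HS06_PER_NODE = 400    # expected HEP-SPEC06 score per node

tender_hs06 = NODES * HS06_PER_NODE   # 102,000 HS06
share = tender_hs06 / 200_000         # vs. the 200k HS06 farm power
print(f"{tender_hs06} HS06 = {share:.0%} of the current farm power")
```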
Extending the farm
• We successfully extended our farm using 2 different approaches:
– An Italian cloud provider, sharing VMware resources
– 20k HS06 (now increasing due to a request for extra pledges by
Virgo) provided by the ReCaS computing center in Bari (built
thanks to a collaboration between INFN and the University of Bari)
• Running mainly CMS and ATLAS multi-core jobs
• Many issues were faced, but it is now in production
• Other tests are foreseen with Microsoft (through Azure) and
another Italian cloud provider
• See my presentation on Thursday
Other activities
• Updated the batch system to LSF 9
– But we are currently testing HTCondor as a substitute (a
submission sketch follows below)
• Next week: HTCondor workshop at CNAF
• Provisioning: Puppet/Foreman is the provisioning system for CNAF
– Only the CEs still run with Quattor
• Implementing an OpenStack pilot
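As an illustration of the kind of test submission involved, a minimal
sketch using the HTCondor Python bindings; the job description is a
hypothetical placeholder, not a CNAF configuration:

```python
# Minimal sketch of submitting a test job through the HTCondor Python
# bindings (requires a working HTCondor installation).
import htcondor

# Describe a trivial vanilla-universe job (placeholder values).
sub = htcondor.Submit({
    "executable": "/bin/sleep",
    "arguments": "60",
    "output": "test.$(ClusterId).out",
    "error": "test.$(ClusterId).err",
    "log": "test.$(ClusterId).log",
})

schedd = htcondor.Schedd()            # talk to the local schedd
with schedd.transaction() as txn:     # queue one instance of the job
    cluster_id = sub.queue(txn)
print(f"submitted cluster {cluster_id}")
```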
Evolution of monitoring
Monitoring
• Complete refactoring of the monitoring and alarm tools across all
CNAF functional units
• Past: Nagios, Lemon, home-made probes and sensors, legacy UIs
• Future: central infrastructure, Sensu+Uchiwa, InfluxDB and
Grafana, community probes and sensors (home-made where necessary);
a minimal check sketch follows below
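To illustrate what a "home-made probe" looks like in this stack, a
minimal sketch of a Sensu-compatible check (Sensu reuses the Nagios
exit-code convention); the threshold values and monitored path are
assumptions:

```python
#!/usr/bin/env python
# Minimal sketch of a Sensu-compatible check. Exit-code convention:
# 0 = OK, 1 = WARNING, 2 = CRITICAL. Thresholds and path are assumed.
import os
import sys

WARN, CRIT = 0.80, 0.90          # filesystem-usage thresholds (assumed)
PATH = "/"                       # filesystem to check (assumed)

st = os.statvfs(PATH)
used = 1 - st.f_bavail / float(st.f_blocks)

if used >= CRIT:
    print("CheckDisk CRITICAL: %s %.0f%% used" % (PATH, used * 100))
    sys.exit(2)
elif used >= WARN:
    print("CheckDisk WARNING: %s %.0f%% used" % (PATH, used * 100))
    sys.exit(1)
print("CheckDisk OK: %s %.0f%% used" % (PATH, used * 100))
sys.exit(0)
```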
Status
• Setup ready for production.
• About 1500 servers monitored.
• All the infrastructure is managed by Puppet.
• Separate environments for each CNAF functional unit, but a single
infrastructure.
• Future activities
– Complete the porting of probes and sensors.
– Monitor the data center networks.
– Optimizations.
– Scale components if necessary.
– Decommission Nagios and Lemon.