Sheffield Site Report

HEP Computing Status
Sheffield University
Matt Robinson
Paul Hodgson
Andrew Beresford
Interactive Cluster
• 30 self-built Linux boxes
• AMD Athlon XP CPUs, 256/512 MB RAM
• OS: Scientific Linux 3.0.3
• 100 Mbit network
• NIS for authentication; NFS-mounted /home etc.
• System install using kickstart + post-install scripts
• Separate backup machine
• 15 laptops, mostly dual boot
• Some Macs and one Windows box
• 3 disk servers mounted as /data1, /data2, etc. (a few TB)
Batch Cluster
• 100-CPU farm, Athlon XP 2400/2800
• OS: Scientific Linux 3.0.3
• NFS-mounted /home and /data
• OpenPBS batch system for job submission
• Gigabit backbone with 100 Mbit to worker nodes
• Disk server provides 1.3 TB as /data (RAID 5)
• Entire cluster assembled in-house from OEM components for less than 50k
• Hard part was finding an air-conditioned room with sufficient power
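As an illustration of job submission on an OpenPBS farm like the one above, the following is a minimal sketch that builds the text of a PBS job script. The job name, command, and resource values are hypothetical placeholders, not the site's actual settings; only standard PBS directives (`-N`, `-l`, `-j`) and the `PBS_O_WORKDIR` variable are used.

```python
def make_pbs_script(job_name, command, walltime="01:00:00", nodes=1):
    """Return the text of a minimal PBS job script.

    All argument values are illustrative assumptions, not site defaults.
    """
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",           # job name shown by qstat
        f"#PBS -l nodes={nodes}",        # number of nodes requested
        f"#PBS -l walltime={walltime}",  # wall-clock limit, HH:MM:SS
        "#PBS -j oe",                    # merge stderr into stdout
        "cd $PBS_O_WORKDIR",             # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: the command and input path are placeholders.
script = make_pbs_script("g4sim", "./run_sim /data/input.dat")
print(script)
```

The resulting text would be saved to a file and handed to `qsub`; reading input from the NFS-mounted /data area matches the layout described above.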
Cluster Usage
Software
• PAW, CERNLIB etc.
• Geant4
• ROOT
• Atlas 10.0.1
• FLUKA
• ANSYS, LS-DYNA
Comments - Issues
• Have tightened up security in the last year
• Strict firewall policy, limited machine exemptions
• Blocking scripts prevent ssh access after 3 authentication failures within 1 hour
• Cheap disks allow construction of large disk arrays
• Very happy with SL3 for desktop machines
• Use FC3 for laptops (2.6 kernel)
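The ssh blocking rule above (3 authentication failures within 1 hour) can be sketched as a sliding-window counter. This is not the site's actual script, only a minimal illustration; the class name, method name, and the assumption that timestamps arrive in non-decreasing order are all hypothetical.

```python
from collections import defaultdict, deque

MAX_FAILURES = 3        # threshold from the slide
WINDOW_SECONDS = 3600   # 1 hour, from the slide


class FailureTracker:
    """Track failed logins per host over a sliding time window (hypothetical helper)."""

    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # host -> timestamps of recent failures

    def record_failure(self, host, timestamp):
        """Record one failure; return True if the host should now be blocked.

        Assumes timestamps (in seconds) are non-decreasing for each host.
        """
        recent = self._failures[host]
        recent.append(timestamp)
        # Drop failures that have fallen out of the window.
        while recent and timestamp - recent[0] >= self.window:
            recent.popleft()
        return len(recent) >= self.max_failures


tracker = FailureTracker()
# Three failures from the same (made-up) address within 20 minutes.
events = [("10.0.0.5", 0), ("10.0.0.5", 600), ("10.0.0.5", 1200)]
blocked = [tracker.record_failure(host, t) for host, t in events]
print(blocked)
```

A real deployment would feed this from the ssh authentication log and act on the result with a firewall or TCP-wrapper rule; those parts are deliberately left out.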
The Sheffield LCG Cluster
Division of Hardware
• 162 x AMD Opteron 250 (2.4 GHz)
• 4 GB RAM/box (2 GB/CPU)
• 72 GB U320 10K RPM local SCSI disk
• Currently running 32-bit SL 3.0.3 for maximum compatibility with the grid
• ~2.5 TB storage for experiments
• Middleware: 2.4.0
• Probably the most purple cluster in the grid
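The per-box figures above imply some aggregate numbers. The short calculation below derives them, assuming every box has the same configuration (two CPUs, 4 GB RAM, one 72 GB disk); that uniformity is an assumption, not something the slide states.

```python
# Figures taken from the slide.
CPUS = 162
RAM_PER_BOX_GB = 4
RAM_PER_CPU_GB = 2
DISK_PER_BOX_GB = 72

# Derived values, assuming identical boxes.
cpus_per_box = RAM_PER_BOX_GB // RAM_PER_CPU_GB   # 4 GB/box at 2 GB/CPU
boxes = CPUS // cpus_per_box
total_ram_gb = boxes * RAM_PER_BOX_GB
total_local_disk_gb = boxes * DISK_PER_BOX_GB

print(f"{boxes} boxes, {total_ram_gb} GB RAM, {total_local_disk_gb} GB local disk")
```

Under that assumption the cluster comes to 81 dual-CPU boxes with 324 GB of RAM in total.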
Looking Sinister
Status
Usage so far
• We can take quite a bit more.
Monitoring
• Ganglia with modified web frontend to present queue information
Installation
• Service nodes connected to VPN and Internet
• PXE installation via VPN allows complete control of dhcpd and named
• Red Hat kickstart + post-install script
• ssh servers not exposed
• R-GMA always the hardest part
• Stumbled across routing rules.
• WN install takes about 30 minutes; can do up to 40 simultaneously
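The install figures above (about 30 minutes per worker node, up to 40 at once) give a rough estimate for reinstalling a whole farm. The sketch below assumes installs run in back-to-back waves with no overlap, and the node count in the example is a hypothetical input, not a number from the slide.

```python
import math


def install_time_minutes(nodes, per_wave=40, minutes_per_wave=30):
    """Estimate total reinstall time, assuming non-overlapping waves of installs."""
    waves = math.ceil(nodes / per_wave)
    return waves * minutes_per_wave


# Hypothetical example: a farm of 81 worker nodes.
estimate = install_time_minutes(81)
print(f"about {estimate} minutes")
```

With 81 nodes this gives three waves, or about an hour and a half.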
Matt Robinson:
Future plans
• Keep up with middleware updates
• Increase available storage as required in ~3-4 TB steps
• Fix things that break
• Try not to mess anything up by screwing around
• Look toward operating with a 64-bit OS