Network Architecture
Gary Buhrmaster
ST&E Readiness Review
May 14th, 2007
Work supported by U. S. Department of Energy contract DE-AC03-76SF00515
Network Philosophy
Support getting the science done (safely)
Simplicity (where possible)
The science is the thing
Limit vendors, technologies used
Leverage existing SCCS staff expertise
Redundancy (where appropriate)
SCCS is not staffed for 24/7 coverage
“Throwing smart (dedicated) people at issues” works as long as you do not throw them too often
Overview
SLAC administers globally routed network space
134.79.0.0/16 “SLAC” address space
Visitor and RAS subnets
IPv6 (test) subnet
A number of internal private subnets for control systems, isolated systems, batch farms
Accelerator, SSRL, IR2, SCCS
Overview
Hardware
Vendors: Cisco, Nokia
~300 Layer 2 (capable) devices
~50 Layer 3 (capable) devices
~20 Enforcement (firewall/filter) devices
Many devices are categorized as more than one
“swouters”/“frankenrouters” (not all swouters are used as both L2 and L3)
what is an InfiniBand “switch”? (it has routing in it…)
Misc. appliances (WLSE (HP), EndRun)
~15 support systems (logging, monitoring, etc.)
Sun/Dell – systems managed by the systems group
Overview
Physical instantiation
~70 buildings
Some buildings have numerous switches (some none)
klystron gallery, computer center, SSRL
~200 VLANs
Switched network design
Some buildings have multiple subnets/vlans
Some vlans are in multiple buildings, some in only one
Some in only one switch
router-to-router connections, SPAN monitoring…
Some internally used by devices
Staffing
Network Engineering
Manage/Configure/Monitor network devices
Five FTEs
Network Research
Primarily research activities
But operationally focused (not just blue sky), which is leveraged to support SLAC and HEP/BES activities (especially WAN performance issues)
Staffing (outside of Network group)
Network Operations
Reports to SCCS Operations
Physical installation/support
Five FTEs
Netops also coordinates with CEF staff and contractors for some installations (cable pullers, bulk fiber installation and termination, etc.)
Staffing (outside of Network group)
Security group
Responsible for overall security policies and approvals
Apply approved policies to the Cisco enforcement devices
Windows group
Apply approved policies to the Checkpoint enforcement devices
Systems group
Maintain the Unix network support systems
SLAC Speak
IFZ – Internet Free Zone
At least some part of every network is blocked from offsite network access
Printers, batch nodes, network devices, “problematic” devices (e.g. SBCs/IOCs)
SFZ – SLAC Free Zone
Some special networks (controls) are accessible only from their local networks
IR2, MCC
SLAC Speak
RouterBlock
Layer 3 forwarding and uRPF blocking (advertise the /32 addresses into the routing table to null-route the device at the router(s)); a sketch follows this list
EPN – “Extremely Private Network”
Elevated level protections (the “PII” place)
EPN(1) (original design), EPN2 (revised design)
CANDO – Computer And Network Database in Oracle (?)
Database of record for IP addresses/systems
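
To make the RouterBlock mechanism concrete, a minimal Python sketch of this style of null routing; the helper name, the emitted IOS-style syntax, and the address are illustrative assumptions, not the actual RouterBlock tooling:

import ipaddress

def routerblock_lines(host_ip):
    """Emit an IOS-style static route that null-routes one host.

    The /32 route to Null0 discards traffic *to* the host; with uRPF
    ("ip verify unicast source reachable-via rx") enabled on ingress
    interfaces, packets *from* the host then fail the reverse-path
    check and are dropped as well.
    """
    addr = ipaddress.ip_address(host_ip)  # validates the address
    return [f"ip route {addr} 255.255.255.255 Null0"]

print("\n".join(routerblock_lines("134.79.1.23")))  # placeholder address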
Big (dense) Picture
But still simplified
[Diagram: “Internet”, border, and core, with the visitor, IPv6, net mgmt, IR2, farm, net rsch, VPN, infrastructure, campus, SSRL, “special”, BSD/EPN, and MCC networks]
Drill down (Layer 3 view)
Network segmentation
Enclaves
Functional/Physical
SLAC, accelerator….
research yard, visitor network, DECnet
Performance/Availability
batch farm, network research
IPv6 Network
[Diagram: rtr-ipv6 with links to ESnet and BAMAN, serving the IPv6 network and a WWW server]
Dipping a toe in the (IPv6) water
it’s cold and lonely there
External to SLAC network
One web server
was originally proposed to be named VVVVVV
Visitor (& RAS) Network
[Diagram: visitor network connected to ESnet and BAMAN]
External to SLAC network (no trust)
Wireless access is only on visitor network
Client only support (block servers)
Border Network
[Diagram: border router with links to ESnet, BAMAN, Stanford, CENIC, and Internet2]
Border enforcement device is a filtering router
ACLs block ports <1024 (except to allowed hosts), and various special ports (X11, NetBus, Back Orifice, …)
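
A minimal Python model of the border filter decision described above; the allowed-host addresses and the exact port numbers are placeholders, not the actual border ACL:

import ipaddress

# Hypothetical allow-list: offsite traffic to privileged ports is only
# permitted toward registered servers (web, mail, Kerberos, ...).
ALLOWED_HOSTS = {
    ipaddress.ip_address("134.79.16.9"),   # placeholder web server
    ipaddress.ip_address("134.79.16.25"),  # placeholder mail server
}
BLOCKED_SPECIAL_PORTS = {6000, 12345, 31337}  # X11, NetBus, Back Orifice

def border_permits(dst_ip, dst_port):
    """Model of the border router's inbound filter decision."""
    if dst_port in BLOCKED_SPECIAL_PORTS:
        return False                      # always-blocked "special" ports
    if dst_port < 1024:
        return ipaddress.ip_address(dst_ip) in ALLOWED_HOSTS
    return True                           # high ports pass by default

assert border_permits("134.79.16.9", 80)      # allowed web server
assert not border_permits("134.79.99.1", 22)  # privileged port, not exempt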
Infrastructure Services
[Diagram: “Nethub/IFZ/IFZ-Lite” server networks in B050 (2nd floor), attached to the SLAC network]
Centrally administered servers
Unix & Windows infrastructure services – DNS, Kerberos, AFS, AD, file servers, web services, email, …
IFZ where possible
Most exceptions to the port <1024 filters are to these servers (web, email, Kerberos)
Campus
[Diagram: campus distribution layer feeding access switches in many buildings]
Most staff/engineers/scientists are connected to one of the “PUB” networks
Legacy workgroup allocations (based on “yellow cable”) have changed to physical location allocations (trying to avoid flat-earth operations)
Farm
“Farm” (batch systems) networks
[Diagram: farm networks attached to the SLAC network and campus]
Batch resources for scientific discovery
Most resources are IFZ
Exceptions for external data transfer systems and scientific login systems
Many resources are policy-limited (i.e. via netgroups) to be used only from other batch systems
Different Availability/Performance needs
BaBar / IR2
[Diagram: IR2 networks, with links to the farm and MCC]
IR2 has four subnets
One public general purpose subnet, one IFZ subnet (local compute farm), one SFZ subnet (dedicated SBCs and detector subsystems) with EPICS gateway, and isolated device control
Intention is that these networks/systems can operate independently from SCCS
Accelerator (MCC)
[Diagram: accelerator networks attached to the SLAC network, with a link to IR2]
Accelerator network has four subnets
One public general purpose subnet (slclavc), two “SLAC free” subnets (leb, slcc) for control systems, and one isolated subnet (pep)
Use of multi-homed control systems (VMS) for access to isolated network devices
Intention is that these networks/systems can operate independently from SCCS
Network Management
[Diagram: network management and monitoring networks attached to the SLAC network]
Network monitoring and configuration management (BAM – Backup and Monitoring)
SNMP (via ACLs on network devices) only responds to requests from the management network hosts; a sketch follows this list
ACLs protect appliances/APs (bastion hosts)
Systems are limited access
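
A minimal Python model of the SNMP restriction above; the management subnet address is a placeholder, since the real BAM addressing is not given here, and on the devices themselves this is an ACL rather than code:

import ipaddress

# Placeholder management subnet (not the actual BAM addressing).
MGMT_NET = ipaddress.ip_network("192.0.2.0/24")

def snmp_request_allowed(src_ip):
    """Model of the per-device SNMP ACL: only hosts on the dedicated
    management subnet may poll the network devices."""
    return ipaddress.ip_address(src_ip) in MGMT_NET

assert snmp_request_allowed("192.0.2.50")        # management host: allowed
assert not snmp_request_allowed("134.79.1.10")   # anything else: dropped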
Network Research
[Diagram: research network attached to the SLAC network]
Network Research activities
Isolated to allow local experimentation
ex: tsunami multicast
Systems are maintained the same as other systems on site
Systems are limited login, sponsored users
SSRL
SSRL manages their own network equipment and configurations, including their own firewall implementations to protect their control and experimental systems
A later presentation will discuss SSRL
BSD (EPN(1))
[Diagram: rtr-bsdnet connecting the SLAC net to the bsd, bsd-dmz, and bsd-epn subnets]
EPN(1)
Air Gap possibility
Extensive filtering
Users access PeopleSoft via Citrix
More details in later presentation
EPN2
[Diagram: DMZ and backend networks attached to the SLAC network]
Revised approach based on the new PeopleSoft architecture
Multiple DMZ nets (web servers), backend nets (app servers, DBs)
In reality, collapsed firewalls
Details in later presentation
VPN
[Diagram: VPN servers attached to the SLAC network]
VPN (GRE/IPSEC) only to official servers
Windows PPTP/L2TP VPN server
Discouraged (use Citrix where possible)
Firewall/filters
Block RPC, NFS, CIFS except to approved servers, plus NetBus, Back Orifice, etc.
“Special” subnet(lets)
[Diagram: specially protected subnets attached to the SLAC network]
A few networks specially protected due to inability to maintain the systems, or certified configurations
Ex: GLAST Clean Room, PCD, HVAC
Group responsible for equipment purchase, SCCS maintains the devices/configurations
Procedures/Policies
Device connection policy
Network equipment
Devices need to be in CANDO
Users are not to install switches/routers/hubs
Wireless
No wireless on the SLAC networks
Devices installed/coordinated by SCCS
Network protections
Dedicated subnet for network management
Network devices are IFZ
SNMP restricted to network management subnet
SSH on all but a few legacy devices
Finally got funding to upgrade the last few
Disable ports not allocated on switches (an audit sketch follows this list)
No devices on the native 802.1Q VLAN
WLSE used for rogue access point detection
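
A minimal audit sketch for the port-disable rule above, assuming a hypothetical parsed view of switch ports (name, whether CANDO allocates it, whether it is shut down); the data format is illustrative, not a real config parser:

def unallocated_ports_enabled(ports):
    """Return ports that are not allocated but still enabled -- these
    violate the 'disable ports not allocated' rule."""
    return [p["name"] for p in ports
            if not p["allocated"] and not p["shutdown"]]

ports = [
    {"name": "Fa0/1", "allocated": True,  "shutdown": False},
    {"name": "Fa0/2", "allocated": False, "shutdown": True},
    {"name": "Fa0/3", "allocated": False, "shutdown": False},  # violation
]
print(unallocated_ports_enabled(ports))  # ['Fa0/3']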
Network protections
Restricted physical access to “core” devices
(Building 050 OmniLock door access)
Routing/switching best practices
no ip unreachables, BGP passwords, scheduler allocate, no source route, …
Strong working relationship with upstreams
Network Intrusion Detection
Primarily log and netflow based
Central logging and analysis
“Significant” events cause paging
Netflow detects many scanners (and P2P); see the sketch after this list
Collected for both internal and external traffic
“scanning” detection catches (SMTP) bots in “real time”
And the occasional “special” user
Extremely useful for incident analysis
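
A minimal Python sketch of the fan-out heuristic behind netflow-based scanner detection: flag sources that touch an unusually large number of distinct destinations. The flow-record shape and threshold are placeholders, not the production collectors or tuning:

from collections import defaultdict

def find_scanners(flows, fanout_threshold=100):
    """Flag sources with unusually high destination fan-out -- the
    classic netflow signature of a scanner or an (SMTP) bot spraying
    connections.

    `flows` is an iterable of (src_ip, dst_ip, dst_port) tuples.
    """
    fanout = defaultdict(set)
    for src, dst, dport in flows:
        fanout[src].add((dst, dport))
    return {src for src, peers in fanout.items()
            if len(peers) >= fanout_threshold}

# Example: one host probing port 25 across many addresses stands out.
flows = [("192.0.2.7", f"134.79.{i}.{j}", 25)
         for i in range(4) for j in range(30)]
print(find_scanners(flows))  # {'192.0.2.7'}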
Discussion?
Obligatory final slide to avoid “End of slide show” artifact