
Network Architecture
Gary Buhrmaster
ST&E Readiness Review
May 14th, 2007
Work supported by U. S. Department of Energy contract DE-AC03-76SF00515
Network Philosophy

- Support getting the science done (safely)
  - The science is the thing
- Simplicity (where possible)
  - Limit vendors, technologies used
  - Leverage existing SCCS staff expertise
- Redundancy (where appropriate)
  - SCCS is not staffed for 24/7 coverage
  - “Throwing smart (dedicated) people at issues” works as long as you do not throw them too often
Overview

- SLAC administers globally routed network space
  - 134.79.0.0/16 “SLAC” address space
  - Visitor and RAS subnets
  - IPv6 (test) subnet
- A number of internal private subnets for control systems, isolated systems, batch farms
  - Accelerator, SSRL, IR2, SCCS
Overview

- Hardware
  - Vendors: Cisco, Nokia
  - ~300 Layer 2 (capable) devices
  - ~50 Layer 3 (capable) devices
  - ~20 enforcement (firewall/filter) devices
  - Many devices are categorized as more than one
    - swouters/frankenrouters (not all swouters are used as L2/3)
    - What is an InfiniBand “switch”? (it has routing in it…)
  - Misc. appliances (WLSE (HP), EndRun)
  - ~15 support systems (logging, monitoring, etc.)
    - Sun/Dell – systems managed by the Systems group
Overview

- Physical instantiation
  - ~70 buildings
    - Some buildings have numerous switches (some none)
      - klystron gallery, computer center, SSRL
  - ~200 VLANs
    - Switched network design
    - Some buildings have multiple subnets/VLANs
    - Some VLANs are in multiple buildings, some in only one
      - Some in only one switch
        - router-to-router connections, span monitoring…
    - Some internally used by devices
Staffing

- Network Engineering
  - Manage/configure/monitor network devices
  - Five FTEs
- Network Research
  - Primarily research activities
  - But operationally focused (not just blue sky), which is leveraged to support SLAC and HEP/BES activities (especially WAN performance issues)
Staffing (outside of Network group)

- Network Operations
  - Reports to SCCS Operations
  - Physical installation/support
  - Five FTEs
  - NetOps also coordinates with CEF staff and contractors for some installations (cable pullers, bulk fiber installation and termination, etc.)
Staffing (outside of Network group)

- Security group
  - Responsible for overall security policies and approvals
  - Apply approved policies to the Cisco enforcement devices
- Windows group
  - Apply approved policies to the Checkpoint enforcement devices
- Systems group
  - Maintain the Unix network support systems
SLAC Speak

- IFZ – Internet Free Zone
  - At least some part of every network is blocked from offsite network access
    - Printers, batch nodes, network devices, “problematic” devices (e.g. SBCs/IOCs)
- SFZ – SLAC Free Zone
  - Some special networks (controls) are accessible only from their local networks
    - IR2, MCC
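The IFZ/SFZ zone rules above amount to a simple source-address policy check. A minimal sketch, using hypothetical subnets (the real allocations live in CANDO, not here):

```python
from ipaddress import ip_address, ip_network

# Hypothetical ranges for illustration only.
SLAC_NET = ip_network("134.79.0.0/16")      # onsite address space
LOCAL_CONTROLS = ip_network("10.1.0.0/24")  # imaginary SFZ controls subnet

def allowed(src: str, dst_zone: str) -> bool:
    """Apply the IFZ/SFZ rules to a source address.

    IFZ ("Internet Free Zone"): reachable only from onsite addresses.
    SFZ ("SLAC Free Zone"): reachable only from the local controls subnet.
    """
    src_ip = ip_address(src)
    if dst_zone == "IFZ":
        return src_ip in SLAC_NET        # offsite access is blocked
    if dst_zone == "SFZ":
        return src_ip in LOCAL_CONTROLS  # even other SLAC subnets are blocked
    return True                          # unzoned: no extra restriction

assert allowed("134.79.1.10", "IFZ")      # onsite host may reach an IFZ device
assert not allowed("8.8.8.8", "IFZ")      # offsite host may not
assert not allowed("134.79.1.10", "SFZ")  # onsite but not local: still blocked
```

The key distinction: IFZ tests for "anywhere onsite," while SFZ tests for "the one local network."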
SLAC Speak

- RouterBlock
  - Layer 3 forward and uRPF blocking (advertise the /32 addresses into the routing table to null route the device at the router(s))
- EPN – “Extremely Private Network”
  - Elevated level protections (the “PII” place)
  - EPN(1) (original design), EPN2 (revised design)
- CANDO – Computer And Network Database in Oracle (?)
  - Database of record for IP addresses/systems
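RouterBlock's two drop paths can be modeled in a few lines. This is an illustrative sketch of the mechanism (the null-routed address is invented), not the router implementation:

```python
from ipaddress import ip_address

# Hypothetical blocked host; in practice each blocked /32 is advertised
# into the routing table and null-routed at the routers.
NULL_ROUTED = {ip_address("134.79.99.7")}

def routerblock(src: str, dst: str) -> str:
    """Model RouterBlock's two drop paths at a router.

    Forward blocking: traffic *to* a null-routed address follows the
    /32 route to null and is dropped.  uRPF blocking: traffic *from*
    a null-routed address fails the reverse-path check (its source
    "routes" to null) and is also dropped.
    """
    if ip_address(dst) in NULL_ROUTED:
        return "drop (forward: destination null-routed)"
    if ip_address(src) in NULL_ROUTED:
        return "drop (uRPF: source fails reverse-path check)"
    return "forward"
```

Advertising a single /32 thus blocks the device in both directions at every router that carries the route.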
Big (dense) Picture

But still simplified

[Diagram: overall topology – the “Internet” at the Border, a Core, and attached enclaves: IPv6, visitor, Net Mgmt, IR2, Farm, Net rsch, VPN, Infra, Campus, SSRL, “Special”, BSD, EPN, MCC]
Drill down (Layer 3 view)

- Network segmentation
  - Enclaves
    - Functional/Physical
      - SLAC, accelerator…
      - research yard, visitor network, DECnet
    - Performance/Availability
      - batch farm, network research
IPv6 Network

[Diagram: rtr-ipv6 connecting the IPv6 network to ESnet and the BAMAN, with one WWW server]

- Dipping a toe in the (IPv6) water
  - It’s cold and lonely there
- External to SLAC network
- One web server
  - Was originally proposed to be named VVVVVV
Visitor (& RAS) Network

[Diagram: visitor network connected to ESnet and the BAMAN]

- External to SLAC network (no trust)
- Wireless access is only on the visitor network
- Client-only support (block servers)
Border Network

[Diagram: border router connected to ESnet, the BAMAN, Stanford, CENIC, and Internet2]

- Border enforcement device is a filtering router
  - ACLs block ports <1024 (except to allowed hosts), and various special ports (X, NetBus, Back Orifice, …)
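The border filtering policy can be sketched as a small decision function. The host and port lists below are invented for illustration; the real exception list lives in the border router's ACLs:

```python
# Hypothetical exception lists, for illustration only.
ALLOWED_HOSTS = {"134.79.16.9"}          # e.g. a central mail/web server
ALWAYS_BLOCKED = {6000, 12345, 31337}    # X11, NetBus, Back Orifice

def border_permits(dst_host: str, dst_port: int) -> bool:
    """Inbound policy as described on the slide: ports <1024 are
    blocked except to allowed hosts, and a few well-known
    remote-display/backdoor ports are blocked unconditionally."""
    if dst_port in ALWAYS_BLOCKED:
        return False
    if dst_port < 1024:
        return dst_host in ALLOWED_HOSTS
    return True

assert border_permits("134.79.16.9", 25)       # SMTP to an allowed host
assert not border_permits("134.79.1.1", 25)    # SMTP to any other host
assert not border_permits("134.79.16.9", 31337)  # backdoor port: never
```

Note the ordering: the unconditional blocks are evaluated before the allowed-host exception, so an allowed host still cannot receive the special ports.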
Infrastructure Services

[Diagram: “Nethub/IFZ/IFZ-Lite” servers in B050 (2nd floor), attached to the SLAC network]

- Centrally administered servers
- Windows/Unix infrastructure services
  - Unix & Windows infrastructure – DNS, Kerberos, AFS, AD, file servers, web services, email, …
  - IFZ where possible
  - Most exceptions to the port <1024 filters are to these servers (web, email, Kerberos)
Campus

[Diagram: campus distribution, with access switches in many buildings]

- Most staff/engineers/scientists are connected to one of the “PUB” networks
  - Legacy workgroup allocations (based on “yellow cable”) have changed to physical-location allocations (trying to avoid flat-earth operations)
Farm

[Diagram: “Farm” networks (batch systems) attached to the SLAC network]

- Batch resources for scientific discovery
- Most resources are IFZ
  - Exceptions for external data transfer systems, and scientific login systems
- Many resources are (policy (i.e. netgroup)) limited to be used only from other batch systems
- Different availability/performance needs
BaBar / IR2

[Diagram: IR2 networks, adjacent to the Farm and MCC]

- IR2 has four subnets
  - One public general-purpose subnet, one IFZ subnet (local compute farm), one SFZ subnet (dedicated SBCs and detector subsystems) with EPICS gateway, and isolated device control
- Intention is that these networks/systems can operate independently from SCCS
Accelerator (MCC)

[Diagram: accelerator networks attached to the SLAC network, adjacent to IR2]

- Accelerator network has four subnets
  - One public general-purpose subnet (slclavc), two “slac free” subnets (leb, slcc) for control systems, and one isolated subnet (pep)
- Use of multi-homed control systems (VMS) for access to isolated network devices
- Intention is that these networks/systems can operate independently from SCCS
Network Management

[Diagram: network management and monitoring networks attached to the SLAC network]

- Network monitoring and configuration management (BAM – Backup and Monitoring)
  - SNMP (via ACLs on network devices) only responds to requests from the management network hosts
  - ACLs protect appliances/APs (bastion hosts)
  - Systems are limited access
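The SNMP restriction above is a one-line source check on each device. A sketch with an invented management subnet (the actual range is site-specific):

```python
from ipaddress import ip_address, ip_network

# Hypothetical management subnet, for illustration only.
MGMT_NET = ip_network("192.168.100.0/24")

def snmp_accepts(requester: str) -> bool:
    """ACL-style check: a network device answers SNMP only when the
    request originates from a management-network host; everything
    else is silently dropped."""
    return ip_address(requester) in MGMT_NET

assert snmp_accepts("192.168.100.5")     # management host: answered
assert not snmp_accepts("134.79.1.10")   # ordinary campus host: dropped
```

Because the check is on the device itself, even an onsite host that knows the community string gets no response unless it sits on the management subnet.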
Network Research

[Diagram: research network attached to the SLAC network]

- Network Research activities
- Isolated to allow local experimentation
  - ex: tsunami multicast
- Systems are maintained the same as other systems on site
- Systems are limited login, sponsored users
SSRL

- SSRL manages their own network equipment and configurations, including their own firewall implementations to protect their control and experimental systems
- A later presentation will discuss SSRL
BSD (EPN(1))

[Diagram: rtr-bsdnet connecting the SLAC net to the bsd, bsd-epn, and bsd-dmz subnets]

- EPN(1)
  - Air gap possibility
  - Extensive filtering
  - Users access PeopleSoft via Citrix
- More details in later presentation
EPN2

[Diagram: DMZ and backend networks attached to the SLAC network]

- Revised approach based on the new PS architecture
  - Multiple DMZ nets (web servers), backend nets (app servers, DBs)
  - In reality, collapsed firewalls
- Details in later presentation
VPN

[Diagram: VPN servers attached to the SLAC network]

- VPN (GRE/IPsec) only to official servers
- Windows PPTP/L2TP VPN server
  - Discouraged (use Citrix where possible)
- Firewall/filters
  - Block RPC, NFS, CIFS except to approved servers, & NetBus, Back Orifice, etc.
“Special” subnet(lets)

[Diagram: several specially protected subnets attached to the SLAC network]

- A few networks specially protected due to inability to maintain the systems, or certified configurations
  - Ex: GLAST Clean Room, PCD, HVAC
  - Group responsible for equipment purchase; SCCS maintains the devices/configurations
Procedures/Policies

- Device connection policy
  - Devices need to be in CANDO
- Network equipment
  - Users are not to install switches/routers/hubs
- Wireless
  - No wireless on the SLAC networks
  - Devices installed/coordinated by SCCS
Network protections

- Dedicated subnet for network management
  - Network devices are IFZ
  - SNMP restricted to the network management subnet
  - SSH on all but a few legacy devices
    - Finally got funding to upgrade the last few
- Disable ports not allocated on switches
- No devices on the native .1q VLAN
- WLSE used for rogue access point detection
Network protections

- Restricted physical access to “core” devices
  - (Building 050 OmniLock door access)
- Routing/switching best practices
  - no ip unreachables, BGP passwords, scheduler allocate, no source route, …
- Strong working relationship with upstreams
Network Intrusion Detection

- Primarily log and netflow based
- Central logging and analysis
  - “Significant” events cause paging
- Netflow detects many scanners (and P2P)
  - Collected for both internal and external traffic
  - “Scanning” detection catches (SMTP) bots in “real time”
    - And the occasional “special” user
  - Extremely useful for incident analysis
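The netflow-based scan detection described above boils down to a fan-out heuristic: a host that contacts an unusually large number of distinct destinations is likely a scanner or bot. A minimal sketch (the threshold is illustrative, not the site's tuned value):

```python
from collections import defaultdict

def find_scanners(flows, threshold=100):
    """Flag sources touching many distinct (dst, port) pairs -- the
    fan-out signature that catches scanners and SMTP bots in near
    real time.

    `flows` is an iterable of (src, dst, dst_port) tuples, as might
    be distilled from netflow records.
    """
    fanout = defaultdict(set)
    for src, dst, dst_port in flows:
        fanout[src].add((dst, dst_port))
    return {src for src, targets in fanout.items() if len(targets) >= threshold}
```

An SMTP bot stands out because it opens port-25 flows to hundreds of distinct mail servers, while a normal client talks to a handful of hosts; the same records can later be replayed for incident analysis.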
Discussion?
Obligatory final slide to avoid “End of slide show” artifact