Building Reliable, Secure and Manageable Substation Communications

Building
Reliable, Secure and Manageable
Substation Communications
Dragan Dokic | CCIE, CISSP, MCSE
Introduction - Experience
• Dragan Dokic | President, Summit Energy Tech
• Focus on utility sector
– Infrastructure systems management
– Custom business systems software development
• 16 years of experience in IT industry
• 10 years in utility sector
– Managed network operations for PNGC Power
[Portland, OR] from September 2002 to October 2011
– Presentation focuses on lessons learned in field
network reliability, security and manageability from
this experience
Introduction
• PNGC’s 2001 – 2011 field network
– 92 office, substation and repeater sites at 11
distribution utilities in Oregon, Idaho
• System mission
– Gather real-time load data 24/7 for power
scheduling operation in Portland
– Support local utility SCADA/AMI/Site Security
operations
PNGC Power WAN – July 2011
[Map: labeled sites include Toledo, OR; Boardman, OR; Junction City, OR; Lewiston, ID; Malta, ID; The Moon]
Areas of Focus
Reliability
Security
Manageability
Presentation available for download at
in the Events section
Reliability – Network Design
• Keys to success
– Diversity in media
• Combine land lines, fixed wireless [private/public], mobile wireless and
satellite
– Diversity in providers
• Local and national
– Dynamic Routing [OSPF]
• Routers exchange knowledge of local network with neighboring routers
• Enterprise grade routers / switches a requirement
• Perfect world configuration
– Private wired/wireless ‘island’ with two Internet gateways
using distinct media and distinct providers
[Diagram: connectivity overview, showing primary and backup routers]
[Diagram: link cost overview, showing primary and backup paths]
Link cost calculation
• Sub A -> Main Office via satellite tunnel: 3+1=4
• Sub A -> Main Office via 900 MHz+DSL tunnel: 1+1+1=3
Open Shortest Path
• Link cost via satellite tunnel [4] is higher than via DSL tunnel [3]; therefore, packets will traverse the 900 MHz/DSL tunnel in normal operation
Normal Operation
• Open Shortest Path from substation A to Main Office
• Open Shortest Path from substation B to Main Office
Link down operation
• If the DSL tunnel is down [X], packets will traverse the satellite tunnel: Sub A -> Main Office
• If the DSL tunnel is down [X], packets will traverse the satellite tunnel: Sub B -> Main Office
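The cost comparison above can be sketched as a small shortest-path calculation. This is a minimal illustration, not the actual PNGC topology: the node names and the placement of each hop are assumptions, and only the per-path costs [3+1 and 1+1+1] come from the slides. OSPF uses Dijkstra's algorithm for its shortest-path calculation, which is what the sketch runs.

```python
import heapq

# Hypothetical topology: node names and hop placement are assumptions.
# Only the path costs (satellite 3+1, 900 MHz/DSL 1+1+1) come from the slides.
LINKS = {
    ("sub_a", "sub_b"): 1,                 # 900 MHz radio hop
    ("sub_b", "primary_router"): 1,        # DSL tunnel
    ("primary_router", "main_office"): 1,
    ("sub_a", "backup_router"): 3,         # satellite tunnel
    ("backup_router", "main_office"): 1,
}

def shortest_path(links, src, dst):
    """Return (total_cost, path) for the lowest-cost route, or None if unreachable."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None

# Normal operation: all links up.
normal = shortest_path(LINKS, "sub_a", "main_office")

# Link down operation: remove the DSL tunnel and recompute.
dsl_down = {k: v for k, v in LINKS.items() if k != ("sub_b", "primary_router")}
failover = shortest_path(dsl_down, "sub_a", "main_office")

print(normal)    # cost 3 via the 900 MHz/DSL path
print(failover)  # cost 4 via the satellite tunnel
```

In a real deployment the routers recompute this on their own whenever a neighbor reports a link state change; no operator action is needed.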
Questions?
Security – Overview
• Wireless link encryption
• Function specific VLANs
• No default routes!
Wireless Link Encryption
• Media device level [e.g. Radio, Modem]
– WEP, WPA, WPA2
• Routing device level [e.g. Cisco 891 router]
– IPSEC
• End device level [e.g. DIGI TS4 port server]
– SSL
At what level to secure data?
Security - Wireless Link Encryption
[continued]
• Most secure option?
– Use all three if management overhead is not an issue
• Most efficient but secure enough option?
– Use routing device site-to-site VPN capabilities
– Advantages:
• Support for best commercially available security
technologies [e.g., AES-256]
• Comprehensive change logging capabilities
• Standardized configuration throughout the system [less
management overhead]
Security – Function Specific VLANs
• Define VLANs per business function
– SCADA, AMI, Security System, Wireless, VOIP, Network Mgmt.
• Firewall traffic between VLANs on need-to-access basis
– E.g., prevent personnel who attach to the substation wireless VLAN
[to access documentation stored on a server at the main office]
from accessing recloser controls in the SCADA VLAN
• Reliability advantages
– Non-critical VLANs [e.g. AMI, security] can be shut down
automatically/remotely if link quality is too poor to carry all
traffic, but good enough to carry SCADA
[Diagram: one VLAN per business function]
[Diagram: high-speed link outage scenario]
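The need-to-access rule and the link-outage behavior can be sketched as a default-deny allow-list plus a shed list. This is a rough illustration only: the VLAN names, the allowed flows, and the choice of which VLANs count as non-critical are all assumptions, and a real system would express the same policy in router/firewall configuration rather than Python.

```python
# Hypothetical allow-list of (source VLAN, destination VLAN) pairs.
# Anything not listed here is denied.
ALLOWED_FLOWS = {
    ("network_mgmt", "scada"),
    ("network_mgmt", "ami"),
    ("wireless", "office_docs"),   # field staff reading documentation
}

# Hypothetical set of VLANs that may be shut down on a degraded link.
NON_CRITICAL_VLANS = {"ami", "security_system", "voip", "wireless"}

def is_allowed(src_vlan, dst_vlan):
    """Default deny: traffic inside one VLAN passes, inter-VLAN traffic needs an entry."""
    return src_vlan == dst_vlan or (src_vlan, dst_vlan) in ALLOWED_FLOWS

def vlans_to_keep(all_vlans, link_is_degraded):
    """Keep every VLAN on a healthy link; drop non-critical ones on a degraded link."""
    if not link_is_degraded:
        return list(all_vlans)
    return [v for v in all_vlans if v not in NON_CRITICAL_VLANS]

site_vlans = ["scada", "ami", "security_system", "wireless", "voip", "network_mgmt"]
print(is_allowed("wireless", "office_docs"))
print(is_allowed("wireless", "scada"))
print(vlans_to_keep(site_vlans, link_is_degraded=True))
```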
Security – No Default Route!
• Do not use default routes through service provider-supplied gateways
• Define a single host route back to the main office, then
establish default route through VPN tunnel
• This is the most effective method to prevent attacks
sourced from the Internet
• Always use in conjunction with regular firewall configuration
lists [not a substitute!]
[Diagram: less secure configuration, showing provider gateway]
[Diagram: more secure configuration, showing provider gateway]
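The difference between the two configurations can be shown with a longest-prefix-match lookup. This is a sketch under stated assumptions: the addresses are documentation ranges chosen for illustration, and the next-hop labels `provider_gateway` and `vpn_tunnel` are hypothetical names, not real interface names.

```python
import ipaddress

def lookup(routing_table, destination):
    """Longest-prefix match: return the next hop of the most specific matching route."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routing_table if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Hypothetical addresses (documentation ranges).
MAIN_OFFICE_VPN_ENDPOINT = "203.0.113.10"
SOME_INTERNET_HOST = "192.0.2.55"

# Less secure: everything follows the provider's gateway.
less_secure = [
    (ipaddress.ip_network("0.0.0.0/0"), "provider_gateway"),
]

# More secure: one host route to the main office, default route through the VPN tunnel.
more_secure = [
    (ipaddress.ip_network(MAIN_OFFICE_VPN_ENDPOINT + "/32"), "provider_gateway"),
    (ipaddress.ip_network("0.0.0.0/0"), "vpn_tunnel"),
]

print(lookup(less_secure, SOME_INTERNET_HOST))
print(lookup(more_secure, SOME_INTERNET_HOST))
print(lookup(more_secure, MAIN_OFFICE_VPN_ENDPOINT))
```

With the second table, the only destination reachable directly through the provider is the main office VPN endpoint; all other traffic, including replies to unsolicited Internet hosts, is sent into the tunnel.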
Questions?
Manageability - Overview
• Tools – network management systems
• Addressing – developing a scheme
• Watchdog system – preventing lockout
Manageability – Tools
• Network Management Systems [NMS]
• Protocols used
• SNMP, Syslog, ICMP, HTTP
• Applications
• PRTG
• Solarwinds Syslog
Manageability – Tools
[continued]
• How to collect data? Push vs. Pull
– Pull: Poll devices using SNMP/HTTP/ICMP at regular
intervals [e.g., every
– Push: Devices send data per defined event triggers
– SNMP traps
– Syslog messages
• What data to collect?
– Availability [ping]
– Network utilization
– Input voltages
– RSSI [radio link quality]
Manageability – Tools
[continued]
• Pull example:
– 5 minute SNMP poll of UPS for input voltage
– If voltage drops below threshold of 108VAC for a duration
of time longer than 5 minutes, an alert will be triggered by
NMS [e-mail, text message, event log]
– But what if voltage drops for 2 minutes only in between
polls? You may not know it even happened.
• Push comes to rescue:
– UPS sends SNMP trap to NMS as soon as voltage drops
below 108VAC
– Alert is triggered by NMS when trap is received
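The gap between polls can be made concrete with a small simulation. This is a simplified sketch: readings are one per minute and invented for illustration, the poll simply checks the reading at each 5 minute mark, and the trap is modeled as firing on each crossing below the threshold. It does not talk to a real UPS or use an SNMP library.

```python
THRESHOLD_VAC = 108
POLL_INTERVAL_MIN = 5

def polled_alerts(voltage_by_minute):
    """Pull: minutes at which a scheduled poll sees a reading below the threshold."""
    return [
        t for t in range(0, len(voltage_by_minute), POLL_INTERVAL_MIN)
        if voltage_by_minute[t] < THRESHOLD_VAC
    ]

def trap_alerts(voltage_by_minute):
    """Push: minutes at which the device would send a trap (each drop below threshold)."""
    alerts = []
    previously_ok = True
    for t, volts in enumerate(voltage_by_minute):
        below = volts < THRESHOLD_VAC
        if below and previously_ok:
            alerts.append(t)
        previously_ok = not below
    return alerts

# Invented data: 15 minutes at 120 VAC with a 2 minute sag between polls.
readings = [120] * 15
readings[6] = readings[7] = 100

print(polled_alerts(readings))  # the sag at minutes 6-7 falls between polls
print(trap_alerts(readings))    # the trap fires when the sag starts
```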
Paessler PRTG – Screen shot
Solarwinds Kiwi Syslog – Screen shot
Manageability – Addressing
• Develop consistent scheme to use system wide
• Recommended private range: 10.0.0.0/8
– First octet: same for entire system
– Second octet: site ID [e.g. 8=Springfield Sub]
– Third octet: business function ID [e.g., 4=AMI]
– Fourth octet: device itself [e.g., Collector #1]
[Diagram: address layout with subnet mask 255.255.255.0; 1st octet ‘fixed’, 2nd octet = site ID, 3rd octet = VLAN/business function, 4th octet = device]
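The octet scheme can be sketched as two small helpers. The site and function IDs [8 and 4] come from the examples above; using 1 as the device number for Collector #1 is an assumption, as are the function names.

```python
import ipaddress

SYSTEM_OCTET = 10  # first octet, fixed for the entire system

def _check_octet(name, value):
    if not 0 <= value <= 255:
        raise ValueError(f"{name} must be between 0 and 255, got {value}")

def vlan_subnet(site_id, function_id):
    """Return the /24 [mask 255.255.255.0] for one business function at one site."""
    _check_octet("site_id", site_id)
    _check_octet("function_id", function_id)
    return ipaddress.ip_network(f"{SYSTEM_OCTET}.{site_id}.{function_id}.0/24")

def device_address(site_id, function_id, device_id):
    """Return the address of one device inside that /24."""
    _check_octet("device_id", device_id)
    subnet = vlan_subnet(site_id, function_id)
    return subnet.network_address + device_id

# Springfield Sub [site 8], AMI [function 4], Collector #1 [device 1, assumed].
collector = device_address(8, 4, 1)
print(collector)
print(collector in vlan_subnet(8, 4))
```

A consistent scheme like this lets an operator read the site and function straight off any address seen in a log or alert.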
Manageability – Addressing
[continued]
• Large network?
– Group sites by region using second octet
– Allows for address summarization if needed.
• Example:
– Eastern division region:
• 10.64-127.0.0
• Summary address: 10.64.0.0/10
– Western division region:
• 10.128-191.0.0
• Summary address: 10.128.0.0/10
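The summary addresses can be checked with Python's standard `ipaddress` module. The sketch below treats each site as a /16 [one value of the second octet], which is an assumption made for illustration, and collapses each region's range into its summary.

```python
import ipaddress

def summarize_region(first_site, last_site):
    """Collapse the per-site /16 blocks 10.<first>..10.<last> into summary networks."""
    site_blocks = [
        ipaddress.ip_network(f"10.{site}.0.0/16")
        for site in range(first_site, last_site + 1)
    ]
    return list(ipaddress.collapse_addresses(site_blocks))

eastern = summarize_region(64, 127)
western = summarize_region(128, 191)

print(eastern)
print(western)
```

The ranges summarize to a single prefix each because they start on a multiple of 64 and span exactly 64 values of the second octet; a range such as 10.60-123 would collapse into several prefixes instead.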
Manageability – Watchdog System
• General concept
– Reboot key remote communications devices if
connectivity to central site is interrupted
• Benefit
– Prevent unnecessary site visits due to
• Operator error
• Device lock-up [e.g., buggy firmware, heat issues]
Manageability – Watchdog System
[continued]
• Hardware requirements:
– SNMP-capable switched PDU with task scheduling and
delayed power cycling command capabilities
– Example: APC AP7900 8-port 15A PDU
• Software capability requirements:
– Centralized command override mechanism using NMS
– Send SNMP ‘Set’ to cancel pending power cycling
command
Manageability – Watchdog System
Example
• ‘Delayed’ power cycle schedule is defined on PDU:
– Outlets to power cycle: 1,2 [e.g., radio, router]
– Frequency: 60 minutes
– Command execute delay: 30 minutes
• Network management system running at main office sends an
SNMP delayed power-cycle command cancel message
– Frequency: every 5 minutes
• Process
– If delayed power cycle cancel command cannot reach the PDU at least
one time during the 30 minute reboot delay period, outlets 1 and 2
will be power cycled and communication will (hopefully!) be restored
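The watchdog timing can be sketched as a minute-by-minute simulation. This models only the logic described above, using the 60, 30 and 5 minute values from the example; it sends no SNMP and does not reflect the actual APC command set. The order of events within a single minute [arm, then cancel, then execute] is an assumption, and the sketch does not model the link recovering as a result of the reboot.

```python
ARM_EVERY_MIN = 60       # PDU schedules a delayed power cycle this often
EXECUTE_DELAY_MIN = 30   # delay before a scheduled power cycle executes
CANCEL_EVERY_MIN = 5     # NMS sends a cancel message this often

def simulate(link_up, total_minutes):
    """Return the minutes at which outlets are power cycled.

    link_up is a function taking a minute and returning True if the
    cancel message from the NMS can reach the PDU at that minute.
    """
    reboots = []
    execute_at = None  # minute at which the pending power cycle will run
    for t in range(total_minutes):
        if t % ARM_EVERY_MIN == 0:
            execute_at = t + EXECUTE_DELAY_MIN
        if t % CANCEL_EVERY_MIN == 0 and link_up(t):
            execute_at = None  # cancel received, pending command cleared
        if execute_at is not None and t >= execute_at:
            reboots.append(t)
            execute_at = None
    return reboots

# Healthy link: every pending power cycle is cancelled.
print(simulate(lambda t: True, 240))

# Hypothetical outage from minute 58 up to minute 200.
print(simulate(lambda t: not (58 <= t < 200), 240))
```

In the outage case the power cycles fire at minutes 90 and 150; the one scheduled at minute 180 is cancelled once the link returns at minute 200.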
Questions?
Thank you!