Overview - How to Build a NOC

How to Build a NOC
Quilt NOC Workshop Panel Discussion:
Indiana University/Global NOC
WiscNet
Pacific Northwest GigaPop
October 3, 2007
7/8/2015
How to Build a NOC
Customers and Expectations
• Who are your customers and what are their expectations/SLA’s?
• Campus, University System, StateNet, GigaPoP, RON, National Backbone, International Connections
• 24x7, Business Hours, Best Effort
• Problem Resolution, Triage, Problem Identification, Service Desk
9/11/2007
How to Build a NOC
Supported Services
• In addition to networking, what other
services does your organization support?
• Computer Operations
• Support Center
• Security Response
• Grid Computing
• Consulting
How to Build a NOC
Monitoring and Troubleshooting
• How large and complex is your network?
– Large and Simple, or Small and Complex
• Types of Networking Supported
– Optical DWDM
– Layer 2 – Switch Network
– Layer 3 – Routed Network
• How many hats being worn? Incestuous operational and support relationships?
How to Build a NOC
Monitoring and Troubleshooting
• What level of troubleshooting and/or monitoring will your NOC do?
– Troubleshooting is based on delegation of responsibility within a NOC organization and/or other related NOC’s
– Monitoring is based on SLA’s and health of the network & services provided
How to Build a NOC
Monitoring and Troubleshooting
• How will you communicate outages and planned work to customers?
– Phone
– Email / Listserv
– Web page announcements
– RSS
How to Build a NOC
Staffing
• Staffing requirements due to SLA's or after
hour service response policies
• 24x7, Business Hours, Best Effort
How to Build a NOC
Staffing
• Service Hours - Hours of coverage? Not all
NOCs need to be 7x24x365, but what
about holidays? Weekends? On-call?
How to Build a NOC
Staffing
• What level of staff needs to be present, and
when?
• Tier One: Service Desk (Call Center), Customer
Service, Problem Assessment, Network
Knowledgeable
• Tier Two: Engineering, Problem Resolution,
Perform Maintenance
• Tier Three: Advanced Engineering, Complex
Problem Resolution, Escalation Point, Network
Planning
How to Build a NOC
Staffing
• Means of responding to issues when NOC
is not staffed 24x7?
• Another group within the organization answers phone and email, watches monitoring, and contacts the on-call staff
• Monitoring sends a message directly to the on-call pager
• Outsource after-hours coverage
How to Build a NOC
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
How to Build a NOC
Organizational Structure
• Escalation practices and policies
• When to move a ticket to an escalation group or person within an organization
• When to inform key personnel within organization or network supported about outages/problems
How to Build a NOC
Organizational Structure
• Writing/updating procedures, training manuals, etc.
• Who is charged with this? When is it accomplished?
• NOC personnel in conjunction with their other responsibilities
• Dedicated resources
How to Build a NOC
NOC Location
• What is your facility like?
• Does your facility have any unique or
particular advantages?
• How do you want to arrange your
staff? Separate offices? "War room"?
How to Build a NOC
NOC Funding
• How is your organization funded
• University funds
• State appropriations
• GigaPoP / RON revenue
• Contracts, grants
How to Build a NOC
Tools
• How will you track trouble tickets?
• Enterprise-wide systems shared at the university or state-wide level
• Proprietary system supported by the NOC and/or other related support groups
• Commercial application or Homegrown
How to Build a NOC
Tools
• How will you track customer information? (Database needs, CRM?)
• Ticketing system
• Database
• Web or Wiki information repository
How to Build a NOC
Tools
• How will you monitor and troubleshoot? Tools, specifically.
• Network monitoring system like Nagios, WhatsUp Gold, HP OpenView
• Weather Maps
• MRTG
How to Build a NOC
Tools
• Are you writing any of your own tools?
• Who will maintain your applications?
How to Build a NOC
Reporting
• What are the key metrics for a NOC?
• How will you measure these?
• Uptime availability
• Nodes monitored
• Trouble tickets
• Phone calls
• Emails
How to Build a NOC
NOC Evolution
• What factors have determined operational
changes for your organization - new
services, expanded hours,
increased number of customers, new
equipment types, deeper skill level
How to Build a NOC
Building a NOC
Indiana University/Global NOC
Case Study
Steve Peck
October 3, 2007
7/8/2015
How to Build a NOC
Customers and Expectations
• Who are your customers and what are their expectations/SLA’s
• Indiana University
• Indiana GigaPoP & IP Grid
• I-Light (state of Indiana Higher Ed)
• Internet2
• National LambdaRail
• CIC OmniPoP
How to Build a NOC
IU/Global NOC
Customers and Expectations
• Who are your customers and what are their expectations/SLA’s
• TransPAC2
• AMPATH
• MAN LAN
• HOPI
• Connecticut Education Network (pending)
• OneNet (consulting)
How to Build a NOC
IU/Global NOC
Supported Services
• In addition to networking, what other
services does your organization support?
• REN-ISAC (Security service)
• Open Science Grid (Grid monitoring
service)
How to Build a NOC
IU/Global NOC
Monitoring and Troubleshooting
• How large and complex is your network? YES!!!
• Types of Networking Supported
– Optical DWDM
– Layer 2 – Switch Network
– Layer 3 – Routed Network
• How many hats being worn? Yes!
• Incestuous operational and support relationships? Yes!
How to Build a NOC
IU/Global NOC
Monitoring and Troubleshooting
• What level of troubleshooting and/or monitoring will your NOC do?
– Within the division of work between our Service Desk and Network Engineering groups, our NOC is able to perform all levels of troubleshooting and monitoring. This ranges from simple layer 3 connections to complex DWDM systems.
How to Build a NOC
IU/Global NOC
Monitoring and Troubleshooting
• How will you communicate outages and planned work to customers?
– Email / Listserv
– Web page announcements
– RSS Feeds
– Web and iCalendar based Maintenance/Outage Calendars
– Phone (in limited circumstances)
How to Build a NOC
IU/Global NOC
Staffing
• Service Hours - Hours of coverage? Not all
NOCs need to be 7x24x365, but what
about holidays? Weekends? On-call?
• Service Desk: 24x7x365
• Engineering: Business Hours & On-Call
• Systems Engineering: Business Hours &
On-Call
How to Build a NOC
IU/Global NOC
Staffing
• Staffing requirements due to SLA's or after
hour service response policies
• Service Desk: 24x7x365
• Engineering: Business Hours & On-Call
• Systems Engineering: Business Hours &
On-Call
How to Build a NOC
IU/Global NOC
Staffing
• What level of staff needs to be present, and
when?
• Service Desk: 24x7x365
• Engineering: Business Hours & On-Call
• Systems Engineering: Business Hours &
On-Call
How to Build a NOC
IU/Global NOC
Staffing
• Means of responding to issues when NOC
is not staffed 24x7?
• We have on-call rotation for Engineering and
System Engineering groups, as well as
Service Desk Supervisors.
How to Build a NOC
IU/Global NOC
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
Service Desk
• 2 Shift Supervisors (Day & Night shifts)
• 5 Senior Technicians (at least one on every shift)
• 13 Technicians (including hourly staff)
• 5 Off Front-Line support personnel
How to Build a NOC
IU/Global NOC
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
Network Engineering
• 17 Network Engineers
– Network Engineering Team
– Network Planning Team
Systems Engineering
• 7 Systems Engineers (+1 open position)
– Application Developers
– System Administrators
How to Build a NOC
IU/Global NOC
Organizational Structure
• Escalation practices and policies
• Service Desk has 15 minutes to assess problem or outage before escalating to Engineering.
• Standard “escalation” processes for outages and problems (immediate, 1 hour, 4 hours, 12 hours, etc.)
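The timing rules on this slide can be sketched in a few lines of Python. This is an illustration only, not the GRNOC's actual process or tooling: the function names are hypothetical, and only the intervals the slide lists are encoded (the "etc." is left open).

```python
from datetime import timedelta

# Intervals taken from the slide; anything beyond 12 hours is not specified there.
ASSESSMENT_WINDOW = timedelta(minutes=15)
ESCALATION_MILESTONES = [
    timedelta(0),        # immediate
    timedelta(hours=1),
    timedelta(hours=4),
    timedelta(hours=12),
]


def must_escalate_to_engineering(elapsed):
    """True once the Service Desk's 15-minute assessment window is used up."""
    return elapsed >= ASSESSMENT_WINDOW


def milestones_reached(elapsed):
    """Return every escalation milestone that has already passed."""
    return [m for m in ESCALATION_MILESTONES if elapsed >= m]
```

For an outage that has lasted five hours, `milestones_reached(timedelta(hours=5))` returns the immediate, 1-hour, and 4-hour milestones, so a ticketing system could use it to decide which notifications are due.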
How to Build a NOC
IU/Global NOC
Organizational Structure
• Writing/updating procedures, training manuals, etc.
• NOC personnel in conjunction with their other responsibilities (Service Desk & Engineering)
• Dedicated resources were recently hired to focus on the internal documentation environment
How to Build a NOC
IU/Global NOC
NOC Location
• What is your facility like?
• State of the art
• Does your facility have any unique or particular
advantages?
• Showpiece for tours, on the edge of downtown
Indianapolis, close to State Capitol
• How do you want to arrange your staff? Separate
offices? "War room"?
• War room (for most part), plus offices for
appropriate staff
How to Build a NOC
IU/Global NOC
NOC Funding
• How is your organization funded
• Contracts, grants
• University funds
• State appropriations
• GigaPoP revenue
How to Build a NOC
IU/Global NOC
Tools
• How will you track trouble tickets?
• Footprints ticketing system (manufactured by Numara) for all GRNOC networks & projects
• Peregrine for all IU campus related tickets
How to Build a NOC
IU/Global NOC
Tools
• How will you track customer information? (Database needs, CRM?)
• Ticketing system
• Great Database (developed in-house)
• Web or Wiki information repository
How to Build a NOC
IU/Global NOC
Tools
• How will you monitor and troubleshoot? Tools, specifically.
• Monitoring: AlertMon homegrown web based alert interface links Nagios to Footprints ticketing system
• Visualization: Weather Maps, Utilization Graphs (SNAPP)
• Management: GRNOC Database and linked systems (RADIUS, DNS, etc.)
• Other special-purpose tools (examples: Spanning Tree state map, Juniper Firewall Filter Grapher, Syslog Analysis Scripts, prefix list diff checker, etc.)
How to Build a NOC
IU/Global NOC
Tools
• Are you writing any of your own tools?
• Yes. Large deployment of custom developed and open source software with a sprinkle of commercial software.
• Who will maintain your applications?
• Systems Engineering Team: Software Developers and System Administrators.
How to Build a NOC
IU/Global NOC
Tools: Monitoring
• AlertMon: “Big Board” front-end to monitoring system
• Nagios: open source network monitoring system with
custom-developed Plug-ins and monitoring agents.
Monitors a variety of services: BGP session status, Interface up/down, IS-IS Adjacency, Router/Switch CPU load, MSDP status, etc.
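Nagios plugins report state through their exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of output, optionally followed by performance data after a `|`. The sketch below shows what a custom CPU-load check might look like under that convention. It is not one of the GRNOC's plugins: the function names and thresholds are hypothetical, and the step that would poll the router or switch is left out, so the reading is passed in as an argument.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_cpu_load(load_percent, warn=70.0, crit=90.0):
    """Map a CPU load reading to a Nagios state and status line.

    The thresholds are illustrative defaults, not values from the slides.
    """
    if load_percent is None or not 0 <= load_percent <= 100:
        return UNKNOWN, "CPU UNKNOWN - no valid reading"
    perfdata = f"load={load_percent}%;{warn};{crit}"
    if load_percent >= crit:
        return CRITICAL, f"CPU CRITICAL - load {load_percent}% | {perfdata}"
    if load_percent >= warn:
        return WARNING, f"CPU WARNING - load {load_percent}% | {perfdata}"
    return OK, f"CPU OK - load {load_percent}% | {perfdata}"


def main(argv):
    """Read one load value from argv, print the status line, return the exit code."""
    try:
        reading = float(argv[1])
    except (IndexError, ValueError):
        reading = None  # missing or malformed input is reported as UNKNOWN
    code, message = check_cpu_load(reading)
    print(message)
    return code
```

A plugin script built on this would end with `sys.exit(main(sys.argv))` so that Nagios sees the state as the process exit code.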
How to Build a NOC
IU/Global NOC
Tools: Monitoring
Nagios Monitoring System
How to Build a NOC
IU/Global NOC
Tools: Weathermaps
“Mini Maps” embedded in NOC web sites
How to Build a NOC
IU/Global NOC
Tools: Weathermaps
How to Build a NOC
IU/Global NOC
Tools: Management
GRNOC Database
• Network Management System including:
– Contact Management
– Device Management
• Inventory
• RADIUS-based authentication
• DNS record generation
• Configuration archiving
• Tied to utilization measurement system
– Circuit Management (Layer 0, Layer 1, Layer 2)
– IP Address Management
– Services Management
How to Build a NOC
IU/Global NOC
Tools: GRNOC DB
How to Build a NOC
IU/Global NOC
Tools: GRNOC DB
How to Build a NOC
IU/Global NOC
Reporting
• What are the key metrics for a NOC?
• How will you measure these?
• Uptime availability
• Services monitored
• Trouble tickets
• Phone calls
• Emails
How to Build a NOC
IU/Global NOC
NOC Evolution
• What factors have determined operational
changes for your organization - new
services, expanded hours,
increased number of customers, new
equipment types, deeper skill level
• Yes! Optical networking and dynamic circuit switching have made the most difference
How to Build a NOC
IU/Global NOC
Building a NOC
WiscNet
Case Study
Kevin Schmidtke
October 3, 2007
7/8/2015
How to Build a NOC
Supported Services
• In addition to networking, what other
services does your organization support?
– Service & System support, providing support for all servers, mainframes and applications serving UW-Madison as well as some UW System-wide applications.
– All NOC staff are cross trained to provide both
network support as well as server and application
support
How to Build a NOC
WiscNet
NOC Funding
• How is your organization funded
– Budget is completely revenue based
– Service support (monitoring, troubleshooting and
incident management) and server hosting fees
account for 64% of total budget/revenue.
– Network support accounts for 36% of budget/revenue
• WiscNet (44%), UW Madison campus (41%) and BOREAS
Net (15%)
How to Build a NOC
WiscNet
Customers and Expectations
• Who are your customers and what are their expectations/SLA’s
– UW Madison Campus
– WiscNet
– BOREAS Net
• No defined SLA’s for any of our customers
• Customers expect maximum up-time, efficient incident management, effective change management at a reasonable price.
How to Build a NOC
WiscNet
Monitoring and Troubleshooting
• How large and complex is your network?
– UW Madison Campus
• Cisco based network reaching 80 campus buildings
• Managing/monitoring over 1,140 Cisco 3750 switches
• Managing/monitoring over 40,000 ports across campus
• Managing/monitoring 1,930 wireless access points across campus
How to Build a NOC
WiscNet
Monitoring and Troubleshooting
• How large and complex is your network?
– WiscNet
• State-wide backbone connecting 500+ members
• 20+ local telcos provide circuits to members throughout the state.
• Members include all UW campuses and institutions across the state, all state technical colleges, all private colleges in the state, the majority of K-12 school districts and some libraries across the state.
How to Build a NOC
WiscNet
Monitoring and Troubleshooting
• How large and complex is your network?
– BOREAS Net
• Regional Optical Network
• WiscNet is the operator
• A collaboration of four major research institutions in the upper Midwest: Iowa State University, the University of Iowa, the University of Minnesota, and the University of Wisconsin-Madison.
How to Build a NOC
WiscNet
Monitoring and Troubleshooting
• What level of troubleshooting and/or monitoring will your NOC do?
– Node up/node down
– Network Closet environmental alarms
– UPS alarms
– Backbone error monitoring
– Access Switch port errors
– Bandwidth threshold monitoring
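Bandwidth threshold monitoring is commonly derived from two successive readings of an interface octet counter. The sketch below shows that arithmetic; it is not WiscNet's tooling. The function names, the 80% threshold, and the assumption of a 64-bit counter that may wrap once between samples are all illustrative.

```python
COUNTER64_MAX = 2**64  # assumed counter width; 32-bit counters would use 2**32


def utilization_percent(octets_before, octets_after, interval_s, link_bps):
    """Average link utilization between two octet-counter samples."""
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    # Modulo arithmetic handles a single counter wrap between samples.
    delta_octets = (octets_after - octets_before) % COUNTER64_MAX
    bits_per_second = delta_octets * 8 / interval_s
    return 100.0 * bits_per_second / link_bps


def over_threshold(util_percent, threshold_percent=80.0):
    """True when utilization meets or exceeds the (assumed) alarm threshold."""
    return util_percent >= threshold_percent
```

For example, 3,750,000,000 octets transferred in a 300-second polling interval on a 1 Gb/s link works out to 100 Mb/s, or 10% utilization.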
How to Build a NOC
WiscNet
Monitoring and Troubleshooting
• How will you communicate outages and planned work to customers?
– Outage pages for incidents/unplanned outages
– E-mail communications for changes/planned outages
• In house tools used to determine sites affected and build custom e-mail list.
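The slides do not describe how the in-house tools work. As a rough sketch of the idea, a tool could map affected devices to sites and sites to contacts. Everything below is hypothetical: the function, the data layout, and the device, site, and address names are invented for illustration.

```python
def build_notification_list(affected_devices, sites_by_device, contacts_by_site):
    """Return (affected sites, e-mail recipients), both sorted and de-duplicated."""
    sites = set()
    for device in affected_devices:
        sites.update(sites_by_device.get(device, ()))
    recipients = set()
    for site in sites:
        recipients.update(contacts_by_site.get(site, ()))
    return sorted(sites), sorted(recipients)


# Made-up inventory for illustration only.
SITES_BY_DEVICE = {
    "core-sw-1": ["site-a", "site-b"],
    "edge-rtr-2": ["site-b", "site-c"],
}
CONTACTS_BY_SITE = {
    "site-a": ["netadmin@site-a.example"],
    "site-b": ["it@site-b.example", "netadmin@site-a.example"],
    "site-c": ["ops@site-c.example"],
}
```

Calling `build_notification_list(["core-sw-1"], SITES_BY_DEVICE, CONTACTS_BY_SITE)` yields the two sites behind that switch and a recipient list in which the shared contact appears only once.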
How to Build a NOC
WiscNet
Staffing
• Service Hours
– NOC level 1 staffed 24X7X365 including all
holidays
– Level 2 engineering on-call during off-shift and
holidays
– Remote ‘hands and eyes’ on call for UW Campus and BOREAS Net
How to Build a NOC
WiscNet
Staffing
• What level of staff needs to be present, and
when?
– A minimum of two NOC level 1 support
personnel staffed at all times except holidays;
only one NOC level 1 support personnel staffed
on holidays.
– Level 2 network support (Operational
Engineers) and Network engineers available
during the day.
How to Build a NOC
WiscNet
Staffing
• After hour service response policies
– Varies for each of the networks depending on
on-call or best effort support model
– Immediate phone support expected for on-call
staff
– If on-call staff presence is necessary, one hour
on-site response is expected
How to Build a NOC
WiscNet
Organizational Structure
• Escalation practices and policies
– Escalation policies and procedures defined in an online knowledge database
– Policies and procedures differ from network-to-network
– On-call and best effort support models in place
How to Build a NOC
WiscNet
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
– UW Campus
• NOC Level 1 on site at all times
• Operation Engineers/Level 2 on-site during the day, on call after hours
• Network Engineers on-site during the day, best effort after hours
How to Build a NOC
WiscNet
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
– WiscNet
• NOC Level 1 on-site at all times
• Network Engineers on site during the day, best effort after hours
• Tech Support (service support) on site during the day, best effort after hours
• Telco providers available at all times
• Remote tech contact is best effort
How to Build a NOC
WiscNet
Organizational Structure
• What staffing tiers/hierarchy will you have for support? Techs? Leads? NEs?
– BOREAS Net
• NOC Level 1 on-site at all times
• Network Engineers on site during the day, best effort after hours
• Level 3 Communications available at all times
How to Build a NOC
WiscNet
Organizational Structure
• Writing/updating procedures, training manuals, etc.
– Shared responsibility between NOC staff, Tech Support and Network Engineering
– All support/training documents stored in an online knowledge database
– Ability to share documents between instances of the knowledge database
– A defined lifecycle for each article: draft, team review, production, update/review, etc.
How to Build a NOC
WiscNet
NOC Location
• What is your facility like?
– Large control center facility. Network support
and service support duties performed in same
area
– War room and the primary workstation in control
center available to manage major incidents.
Primary workstation in control center is
preferred.
– Network engineers and tech support staff in
adjacent campus buildings.
How to Build a NOC
WiscNet
Tools
• How do you track customer information? (Database needs, CRM?)
– All customer information (support, configuration, contact, etc.) is maintained in database tools developed in-house.
– Different networks use different tools
– Customer information can be searched by site id, circuit id, vlan, IP address, etc.
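A lookup by site id, circuit id, VLAN, or IP address could be sketched as follows. This is a hypothetical illustration, not WiscNet's database tools: the record fields, identifiers, and address blocks are invented, and a real tool would query a database rather than a list in memory.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    site_id: str
    circuit_id: str
    vlan: int
    prefix: str  # customer address block in CIDR notation


def find_customers(records, site_id=None, circuit_id=None, vlan=None, ip=None):
    """Return the records that match every criterion that was supplied."""
    address = ipaddress.ip_address(ip) if ip is not None else None
    matches = []
    for rec in records:
        if site_id is not None and rec.site_id != site_id:
            continue
        if circuit_id is not None and rec.circuit_id != circuit_id:
            continue
        if vlan is not None and rec.vlan != vlan:
            continue
        if address is not None and address not in ipaddress.ip_network(rec.prefix):
            continue
        matches.append(rec)
    return matches


# Made-up records using documentation address space.
RECORDS = [
    CustomerRecord("SITE-001", "CKT-1001", 210, "192.0.2.0/25"),
    CustomerRecord("SITE-002", "CKT-1002", 220, "192.0.2.128/25"),
]
```

Searching by IP address matches the record whose address block contains it, so `find_customers(RECORDS, ip="192.0.2.130")` returns the SITE-002 record.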
How to Build a NOC
WiscNet
Tools
• How will you monitor and troubleshoot? Tools, specifically.
– UW Madison campus – HP Network Node Manager, syslogs and custom monitoring tools.
– WiscNet – HP Network Node Manager, syslogs and custom monitoring tool.
– BOREAS Net – Infinera Digital Network Administrator
How to Build a NOC
WiscNet
Tools
• Are you writing any of your own tools?
– None of our support tools are written at the NOC. The NOC identifies the need for new tools or modifications to existing tools and the network engineers do the development.
• Who will maintain your applications?
– The tools are maintained by the Network Engineering staff
How to Build a NOC
WiscNet
Tools
• How do you track trouble tickets?
– Incidents for all supported networks are tracked through Clarify
– Each network has a unique product name
– The subject line includes the site name and/or device name
– Weekly reports are generated for each of the networks and distributed
How to Build a NOC
WiscNet
Reporting
• What are the key metrics for a NOC?
– Incident management
• Detection
• Troubleshooting
• Communication
• Escalation
• How will you measure these?
– Post incident review
– Customer feedback
– Problem management
How to Build a NOC
WiscNet
NOC Evolution
• What factors have determined operational
changes for your organization?
– New technologies & network configurations
– Changes in support models dictated by
campus
– Newly configured networks
– Incident tracking trends and analysis.
How to Build a NOC
WiscNet
Building a NOC
Pacific Northwest GigaPop
Case Study
Linda Hornung
October 3, 2007
7/8/2015
How to Build a NOC
Customers and Expectations
• Our customers are in the Pacific
Northwest and Pacific Rim.
• We track customer information in a
database designed and maintained locally.
• Time zones have driven our need for 7x24.
• We cover 15 time zones which also makes
scheduling planned maintenance difficult.
How to Build a NOC
PNWGP
Customers and Expectations
• Escalation policies also drive our need for
on-call schedules and on-site personnel.
• Our customers expect 48 hours' notice prior to work, unless it is an emergency.
• Outages are communicated via an
application we have built locally.
• We also want to know when our
customers have planned work.
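The 48-hour notice rule is simple to check mechanically. The sketch below is an illustration, not PNWGP's locally built application. The function name is hypothetical, and it assumes both timestamps are timezone-aware so the comparison is unambiguous for customers spread across many time zones.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_NOTICE = timedelta(hours=48)  # from the slide


def notice_is_sufficient(announced_at, work_starts_at, emergency=False):
    """Check the 48-hour notice rule; emergency work is exempt."""
    if emergency:
        return True
    return work_starts_at - announced_at >= REQUIRED_NOTICE


# Example timestamps in UTC (made up for illustration).
announced = datetime(2007, 10, 1, 9, 0, tzinfo=timezone.utc)
window_ok = datetime(2007, 10, 3, 9, 0, tzinfo=timezone.utc)
window_short = datetime(2007, 10, 3, 8, 0, tzinfo=timezone.utc)
```

Here `notice_is_sufficient(announced, window_ok)` is true (exactly 48 hours), while `window_short` fails the check unless the work is flagged as an emergency.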
How to Build a NOC
PNWGP
Customers and Expectations
• Expectations are included in the contract
and located on the PNWGP web page.
• We prefer to have the NOC as the primary
customer contact point for our organization
in order to maintain quality of service when
a ticket moves between groups.
How to Build a NOC
PNWGP
Supported Services
• In addition to network monitoring for
PNWGP, our NOC monitors:
• UW campus connectivity, including two
hospitals (both layer 2 and layer 3, and
wired as well as wireless, many UPSs),
• Approximately 500 sites throughout WA
state for the K-20 network,
• Pacific Wave, and Transit Rail.
How to Build a NOC
PNWGP
Supported Services
• We do not do any host, server, or
customer application monitoring.
• We do not do desktop support.
• We do not reset passwords or arrange
services such as email and web.
• We do not monitor data center security
alarms such as for fire or flood.
How to Build a NOC
PNWGP
Monitoring and Troubleshooting
• PNWGP: approx. 20 customers, 5 node sites, MPLS, BGP, VPN
• Campus: 115,000 connected devices, ~100 remote locations, ~7,000 wireless APs, ~4,000 layer 2 switches, ~150 routers.
How to Build a NOC
PNWGP
Monitoring and Troubleshooting
• Over 450 Washington state K-20 sites monitored.
• Transit Rail, approx. 5 node sites, BGP and IS-IS status.
How to Build a NOC
PNWGP
Monitoring and Troubleshooting
• PNWGP troubleshoots with the customer on routing problems, latency, and loss of connectivity.
• IS-IS and IGP status is monitored.
• We manage DWDM, SONET, and MPLS circuits.
• Complexity is increased with escalation paths being different depending on what isn’t working.
How to Build a NOC
PNWGP
Monitoring and Troubleshooting
• Outages and planned events are sent via email announcement in a standard format.
• We include the date/time of the work or when the outage began.
• If the customer’s connectivity is entirely down, we also call or page them.
• Updates are sent at predefined intervals for large events, or when we have a change in status.
How to Build a NOC
PNWGP
Staffing
• We are 7x24 with full-time staff.
• Weekends we only have one person
covering each day, so vacations and sick
time are problems.
• Holidays are covered by on-site and on-call staff.
• On-call consists of a 7-day period and
rotates among all NOC staff on a regular
schedule.
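A rotation like this can be computed from a start date and an ordered staff list. The sketch below is a minimal illustration with hypothetical names. It assumes a fixed order with no swaps, holidays, or secondary on-call, which a real schedule would need to handle.

```python
from datetime import date

ROTATION_DAYS = 7  # each on-call period lasts 7 days, per the slide


def on_call_for(day, staff, rotation_start):
    """Return who is on call on `day`, rotating through `staff` in order."""
    if not staff:
        raise ValueError("staff list is empty")
    if day < rotation_start:
        raise ValueError("day is before the rotation started")
    periods_elapsed = (day - rotation_start).days // ROTATION_DAYS
    return staff[periods_elapsed % len(staff)]


# Made-up staff list and start date for illustration.
STAFF = ["analyst-a", "analyst-b", "analyst-c"]
START = date(2007, 10, 1)
```

With three people, `on_call_for(date(2007, 10, 8), STAFF, START)` returns the second person, and the first person comes up again on 22 October.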
How to Build a NOC
PNWGP
Staffing
• Tier 1 are student staff, also called
Network Analysts.
• Tier 2 are full-time staff, most are titled
Network Specialists.
• Tier 3 are full-time staff, titled Network
Engineers.
How to Build a NOC
PNWGP
Staffing
• We advertise to all customers an on-site
7x24 staff for immediate response to
outages.
• Our SLAs indicate when a problem should be escalated beyond the on-call staff to a manager, director, and higher at any time of the day or week.
How to Build a NOC
PNWGP
Staffing
• Day shift M-F requires that we have engineering (Tier 3) staff in the building.
• Off-hours we have Tier 2 staff always
present in the building. They will escalate
as necessary, either due to outage
severity or complexity, or at a customer’s
request.
How to Build a NOC
PNWGP
Staffing
• To maximize staffing efficiencies, 2nd and
3rd shift personnel report to Computer
Operations managers, rather than the
NOC manager.
• These staff provide all of the same support
and have the same training/access as the
daytime Tier 2 staff in the NOC.
How to Build a NOC
PNWGP
Staffing
• We have on-call lists with a primary and
secondary person for backup when NOC
staff is not on-site.
• A separate call list exists for escalation to
managers.
• Engineering and other service groups are
also available on-call 7x24x365.
How to Build a NOC
PNWGP
Organizational Structure
• PNWGP has recently adopted a tiered
approach for staffing.
• NOC Network Engineers are available after 6pm M-F and weekends only via on-call.
• One manager, reporting to the Director of
Operations.
How to Build a NOC
PNWGP
Organizational Structure
• Our escalation practices and policies are
based on length or severity of outage.
• At predetermined intervals additional
management levels are notified of severe
outages in order to help with escalation at
other organizations (telcos), or to keep
peers updated at the affected sites.
How to Build a NOC
PNWGP
Organizational Structure
• Many people in the NOC work to provide
and update training materials; SMEs write
procedures.
• We use a wiki to maintain our
documentation.
• We were able to have a tech writing expert
give us training on writing effective
procedural documents and re-organize our
wiki.
How to Build a NOC
PNWGP
NOC Location
• Currently, our NOC has some single-occupant offices and some multiple-occupant offices (2-3). We are all on the same floor.
• We find that being located near the
Network Engineering team is quite helpful
for urgent escalations.
• Next year we will be moving into a new
location and we do not yet have details on
what that will look like.
How to Build a NOC
PNWGP
NOC Location
• Consider lighting and noise control with
shared offices.
• How many monitors will each person
need? Will you use a large central monitor
for some things?
• Provide an impromptu meeting space for
collaboration on big events.
How to Build a NOC
PNWGP
NOC Location
• Conference bridges greatly enhance
collaboration across geographic distances
whether working on outages or events.
How to Build a NOC
PNWGP
NOC Funding
• UW funding comes primarily from the state taxpayers of Washington and UW Medical Centers.
• Funding also comes from the PNWGP and
customers and WA state K-20 network.
• If you recharge you will need a business
and billing model.
• Or will you use time and materials?
How to Build a NOC
PNWGP
Tools
• A very useful tool is live chat or IM for
coordinating efforts no matter where your
office is.
• Our customer information is tracked in a
home-grown database which has grown
and morphed over a dozen years.
• New needs such as SLAs and layer 1 info
now require significant investment in
upgrades.
How to Build a NOC
PNWGP
Tools
• Our monitoring system – surprise – is also
“homegrown”.
• We monitor interface state and IP
reachability; performance and protocol
state connectivity will soon be integrated
into our “event system” (NMS)
• Automated tools can page the appropriate
group to notify them of outages or
threshold conditions.
How to Build a NOC
PNWGP
Tools
• We have a separate Tools team (with 10
staff members) who design, write,
implement, and maintain tools.
• This allows us to have full-featured and
robust tools.
• One trade-off is fewer “one-off” tools for
specific or isolated issues.
How to Build a NOC
PNWGP
Tools
• PNWGP uses Request Tracker, RT, an
open-source application to track trouble
tickets.
• Weekly reports are generated for our
Directors by sector, severity, and type.
• Monthly reports are generated by sector
for billing purposes.
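Reports of this kind boil down to counting tickets along a few dimensions. The sketch below shows that aggregation on plain dictionaries. It does not use Request Tracker's own interfaces, and the field values and sample tickets are made up for illustration.

```python
from collections import Counter


def summarize_tickets(tickets):
    """Count tickets by sector, by severity, and by type."""
    return {
        field: Counter(ticket[field] for ticket in tickets)
        for field in ("sector", "severity", "type")
    }


# Made-up tickets; a real report would read these from the ticketing system.
TICKETS = [
    {"sector": "campus", "severity": "high", "type": "outage"},
    {"sector": "campus", "severity": "low", "type": "request"},
    {"sector": "K-20", "severity": "high", "type": "outage"},
]
```

`summarize_tickets(TICKETS)["sector"]` gives the per-sector counts that a monthly billing report would start from, and the `severity` and `type` counters cover the weekly breakdown.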
How to Build a NOC
PNWGP
Reporting
• Key metrics we track include:
1. Ticket numbers by sector for billing
2. Phone call volumes
3. Duration of outages
4. Root Cause Analysis for high-impact events
• Outage time is measured by duration of the customer impact.
• After Action Review and Follow-up is conducted for serious events.
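Measuring outage time by the duration of customer impact can be sketched as follows. This is an illustration, not PNWGP's reporting code. The function names are hypothetical, the availability formula is the conventional one (one minus outage time over period length), and impacts are assumed not to overlap and to fall inside the reporting period.

```python
from datetime import datetime, timedelta


def total_outage(impacts):
    """Sum customer-impact durations given (start, end) pairs."""
    total = timedelta(0)
    for start, end in impacts:
        if end < start:
            raise ValueError("impact ends before it starts")
        total += end - start
    return total


def availability_percent(impacts, period_start, period_end):
    """Availability over a reporting period, assuming non-overlapping
    impacts that fall entirely inside the period."""
    period = period_end - period_start
    if period <= timedelta(0):
        raise ValueError("empty reporting period")
    return 100.0 * (1 - total_outage(impacts) / period)


# Made-up example: one impact of 7 h 12 min in a 30-day period.
IMPACTS = [(datetime(2007, 10, 5, 0, 0), datetime(2007, 10, 5, 7, 12))]
PERIOD = (datetime(2007, 10, 1), datetime(2007, 10, 31))
```

In this example the outage is 7.2 hours out of 720, so `availability_percent(IMPACTS, *PERIOD)` comes to 99%.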
How to Build a NOC
PNWGP
Reporting
• Monthly report is emailed to the customer
for traffic sent to/from their site.
• Our internal reporting includes “operational
impacts” to groups under our main
organization.
• How do you measure your NOC’s
success? Response times? Reduced
calls?
How to Build a NOC
PNWGP
NOC Evolution
• Factors that have determined operational
changes for our organization have been
increased size, complexity and number of
networks monitored;
• Need to respond to outages 24 hours/day
with on-site personnel rather than paging;
• Skill and responsibility levels have
increased significantly, and continue to do
so.
How to Build a NOC
PNWGP
Panel Contact Information
Linda Hornung
Pacific Northwest GigaPop
[email protected]
Steve Peck
Indiana University/Global NOC
[email protected]
Kevin Schmidtke
WiscNet
[email protected]