Connect. Communicate. Collaborate
SETTING UP AND
OPERATING A PERT
COURSE OBJECTIVES
By the end of this course you will be able to:
• Describe the rationale for and the structure and ethos of a
federated PERT (Performance Enhancement and Response
Team).
• Set up local PERTs within a federated PERT structure.
• Describe the methodologies used to investigate performance
issues.
• Use a variety of tools to investigate performance issues.
You will also have the opportunity to provide feedback about
the evolution of the PERT and its systems.
2
COURSE OUTLINE (1)
Administrative Session
• Module 1 – Overview: Why a federated PERT?
• Module 2 – The Federated PERT: Structure, Workflow and
Procedures
• Module 3 – Setting up a PERT
• Module 4 – PERT Systems and Procedures
• Module 5 – Feedback on the Federated PERT
3
COURSE OUTLINE (2)
First Technical Session
• Module 6 – Network Performance and User Expectations
• Module 7 – Bulk Transfers Under TCP
4
COURSE OUTLINE (3)
Second Technical Session
• Module 8 – Investigative Tools
• Module 9 – The Methodology of Performance Issue
Investigation
• Module 10 – How Middleboxes Impact Performance
5
Module 1: Overview: Why a Federated
PERT?
MOTIVATION FOR THE PERT
Why have a Performance Enhancement and Response Team (PERT)?
• Historically, long-distance circuits (the ‘wide-area’) were the
network’s bottleneck.
• Recently, long-distance circuit capacity has greatly increased.
• Performance bottlenecks can now occur anywhere in end-to-end
system:
• Local Area Network (LAN)
• Wide Area Network (WAN)
• End-system (application, operating system, hardware)
• Increasingly difficult for end-users to diagnose network performance
issues
7
PURPOSE OF THE PERT
To enable end-users to enjoy optimal performance for
networked applications by
• Providing them with services:
• Advice on tuning end-system hardware and operating systems
• Help in identifying and fixing backbone bottlenecks
• Promoting awareness of realistic performance expectations.
8
ORIGINS OF THE PERT
PERT first proposed in 2001 at Hawaii Internet2 meeting
• Support structure for performance issues with networked
applications
• Based on successful Computer Emergency Response Team
(CERT) structure
Idea not pursued further by Internet2
December 2002: DANTE and group of NRENs decide to set up pilot
PERT for year 4 of GÉANT project
• GÉANT PERT initially just a mailing list
• GÉANT PERT offered ‘best effort’ responses only
• Freeware issue tracker (‘RoundUp’) was added
9
TOWARDS GN2
GÉANT PERT was successful
PERT service planned for GN2 project
GN2 PERT to be ‘evolution’ of GÉANT PERT
• Offered ‘guaranteed’ service
• Guarantee was to investigate, not necessarily to resolve, issues
• 9 NRENs agreed to participate
• CARnet, CESNET, FCCN, GARR, Hungarnet, PSNC, RedIRIS,
RENATER and SWITCH
• Permanently staffed during working hours
• Virtual team of network engineers from the 9 participating NRENs
10
THE PERT IN THE CONTEXT OF THE GN2 PROJECT
How does the PERT fit into the GN2 project?
• It is part of GN2 Service Activity 3 (SA3, Introduction of Multi-Domain Services)
• Initially there were five related SA3 work items:
• WI2 (DANTE) Define and establish the PERT
• WI3 (SWITCH) Deliver PERT documentation
• WI4 (PSNC) Deliver PERT Ticket System (PTS)
• WI5 (PSNC) Deliver PERT Troubleshooting Procedures
• WI6 (DANTE) Day to Day PERT Operations
11
ORGANISATION OF THE GN2 PERT (1)
The participating NRENs provide ‘Duty Case Managers’
(DCMs) on a rotating weekly basis.
• DCM responsible for:
• Opening new cases as help is requested; must respond to a new
request within two working hours
• Progressing existing cases
• DCM encouraged to become Special Case Manager (SCM)
for cases they have opened or become involved in
• Share joint responsibility with DCMs
• Increases continuity
12
ORGANISATION OF THE GN2 PERT (2)
DCMs and SCMs are supported by Subject Matter Experts
(SEs).
• SEs work on a purely voluntary basis
PERT Ticket System (PTS), developed by PSNC, holds case
information
PERT Knowledge Base holds case results and other advice
and information about network performance issues
• PERT Knowledge Base is a TWiki website maintained by
SWITCH
13
GN2 PERT POLICY
End-users can self-register on PTS and submit requests for
help through it.
Response time is two working hours
All education and research users are eligible for PERT
assistance,
• But NRENs and pan-European research projects are given
preferential treatment when demand is high.
14
THE PERT AND THE NETWORK OPERATIONS CENTRE
(NOC)
What is the difference between PERT and NOC cases?
• Root cause of NOC case usually failure of own or
neighbouring network equipment
• Relatively straightforward to investigate and easy to diagnose
• Resolution is urgent
• Lines of responsibility are clear
• Root cause of PERT cases can be anywhere on end-to-end
path
• Often cause is in the end-systems
• Difficult to investigate and diagnose
• Less urgent to resolve
• Lines of responsibility not always clear
15
FUTURE OF THE PERT
GN2 PERT funded by GN2 project (due to complete in 2008 /
2009)
Discussion of future of GN2 PERT began at second GN2
technical workshop in June 2006
De-Centralised PERT identified as way forward:
• Function should be ‘embedded’ in all network support
organisations
16
DE-CENTRALISING THE PERT (1)
WI12, led by DANTE, created in September 2006
• Purpose: study ways of de-centralising the PERT
• Produced GN2 Year 3 deliverable, ‘Description of a Decentralised PERT’ in March 2007
• ‘straw-man’ for design of future PERT
• Produced by DANTE, SWITCH, PSNC and RedIRIS
17
DE-CENTRALISING THE PERT (2)
Three main options considered:
• Disband PERT
• Centralised PERT
• De-Centralised PERT
18
OPTION 1: DISBAND THE PERT
Disbanding the PERT was quickly discounted:
• Current participants recognised value of the PERT
• PERT seen as worthy of continuation
19
OPTION 2: CENTRALISED PERT
The centralised PERT would effectively have continued the
current policy.
• Advantages:
• Single entity – easy to organise and manage
• Single PTS – cases easy to find and track
• Disadvantages:
• Difficult and expensive to scale
• Difficult to discuss issues in local languages
20
OPTION 3: DE-CENTRALISED PERT - ADVANTAGES
The de-centralised PERT would be a distributed PERT,
without hierarchy. Each organisation would implement its
own PERT.
• Advantages:
• Scalable
• Local languages and time-zones easier to cater for
• Encourages uptake of PERT concept
21
OPTION 3: DE-CENTRALISED PERT - DISADVANTAGES
Disadvantages:
• Harder to allocate responsibility for multi-domain cases
• No central ticketing system
• Harder to find and track cases
• Duplication of information possible
• End-users only supported if their organisation implements a
PERT
22
THE FEDERATED PERT
The federated PERT will combine the best features of the
centralised and de-centralised PERT.
• Consists of:
• Well-funded and resourced central PERT
• National, regional and perhaps campus PERTs
• Should encourage spread of PERT as a NOC function
• Central PERT available to help users whose organisation
does not have its own PERT
23
ORGANISATION OF THE FEDERATED PERT
The Federated PERT will be made up of:
• A central PERT
• Provided by DANTE
• Staffed by dedicated engineers
• National (NREN) PERTs
• Regional PERTs
• For those countries whose networks are regionalised
• Campus and Institution PERTs
24
PERT BASICS
Expected that in most cases, the PERT will be part of the
NOC
No organisation should have to recruit extra personnel to staff
the PERT.
Advantages of setting up a PERT:
• Performance problems dealt with more quickly and efficiently
• Support from better resourced / more experienced PERTs,
e.g. DANTE, SWITCH PERT, etc.
25
Module 2: The Federated PERT: Structure,
Workflow and Procedures
TYPES OF PERT PROCEDURE
PERT procedures fall into two categories:
• Operational procedures
• For investigating PERT issues
• Embody best practice
• Mainly optional
• Management procedures
• For setting up and managing a PERT
• Specify how PERTs should interact with customers and other PERTs
• Mainly mandatory
• PERTs must comply to join federated structure
27
CREATION OF MANAGEMENT PROCEDURES
How are the management procedures being devised?
• Some procedures evolved from first discussions about post-GN2 PERT (June 2006)
• Some procedures presented in ‘Description of a Decentralised PERT’ deliverable (March 2007)
• Some procedures formulated recently following GN2-SA3
mailing list discussions
All procedures presented today are drafts.
We expect and welcome suggestions for changes.
28
TYPES OF PERT (1)
Campus or Institutional PERTs
• May have some networking expertise, but likely to have only one
or two staff working “as and when”
• Unlikely to have extensive international contacts
• May be poorly-placed to resolve multi-domain networking issues
Regional PERTs
• May have more networking expertise
• May have more staff and contacts
29
TYPES OF PERT (2)
NREN PERTs
• Likely to have considerable networking expertise
• May be well resourced and may have extensive international
contacts
Central GN2 PERT
• Extensive networking expertise and contacts
• One or two full-time staff
30
PERT TERMINOLOGY
Local PERT
• The PERT closest to the user
• The PERT that an issue is first reported to
• Could be a campus or institutional PERT, but may also be a
regional or NREN PERT
Parent PERT
• A PERT that has associated PERTs below it in the federated
structure
• E.g. an NREN PERT that supports regional PERTs
Child PERT
• A PERT that has a parent
• E.g. a campus PERT that is supported by an NREN PERT
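The parent and child relationships above can be pictured as a simple tree. The following is a minimal sketch only; the class name `Pert`, the helper `escalation_path` and the example PERT names are assumptions made for illustration, not part of any PERT system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Pert:
    """One PERT in the federated structure (hypothetical model)."""
    name: str
    # repr=False avoids an endless parent <-> child loop when printing.
    parent: Optional["Pert"] = field(default=None, repr=False)
    children: List["Pert"] = field(default_factory=list)

    def add_child(self, child: "Pert") -> "Pert":
        # Link both directions so parent and child know about each other.
        child.parent = self
        self.children.append(child)
        return child


def escalation_path(local: Pert) -> List[str]:
    """Return the PERT names from the local PERT up to the top of the tree."""
    path = []
    node: Optional[Pert] = local
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path


# Illustrative names only.
central = Pert("Central PERT")
nren = central.add_child(Pert("NREN X PERT"))
campus = nren.add_child(Pert("Campus A PERT"))

print(escalation_path(campus))
# ['Campus A PERT', 'NREN X PERT', 'Central PERT']
```

Here the campus PERT is the local PERT and a child of the NREN PERT, and the NREN PERT is both a parent (of the campus PERT) and a child (of the central PERT).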
31
PERT FUNDAMENTALS
Key to the federated PERT is communication.
• WHO to contact.
• HOW to contact them.
Parent and child PERTs must inform each other of
organisational changes.
32
ADMINISTRATIVE STRUCTURE
[Diagram: tree with the GN2 Central PERT at the top; NREN PERTs (X, Y and Z) below it; regional PERTs (Region 1 and Region 2) below the NREN PERTs; and campus PERTs (Campus A to H) at the bottom.]
33
REGISTER OF PERTS (1)
Each PERT must make a simple statement of its capabilities:
• Hours of operation
• Response time (SLA)
• Contact information
• Etc.
Each parent PERT must maintain a register of any ‘children’
that it has, including their statement of capabilities
• E.g. A regional PERT should keep a register of any campus
or institutional PERTs under it.
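As a rough illustration of such a register, the sketch below stores one statement of capabilities per child PERT. The field names, the class `PertRegister` and the example values (including the `.example` address) are assumptions for this sketch; the slides do not prescribe a format.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Capabilities:
    """Statement of capabilities; field names are assumptions for this sketch."""
    hours_of_operation: str
    response_time: str
    contact: str


class PertRegister:
    """Register that a parent PERT keeps of its child PERTs (hypothetical)."""

    def __init__(self) -> None:
        self._children: Dict[str, Capabilities] = {}

    def register(self, pert_name: str, capabilities: Capabilities) -> None:
        # Registering again replaces the old entry, e.g. after an
        # organisational change reported by the child PERT.
        self._children[pert_name] = capabilities

    def contact_for(self, pert_name: str) -> str:
        # Raises KeyError if the PERT is not in the register.
        return self._children[pert_name].contact


register = PertRegister()
register.register(
    "Campus A PERT",
    Capabilities("Mon-Fri 09:00-17:00", "3 working days", "pert@campus-a.example"),
)
print(register.contact_for("Campus A PERT"))
# pert@campus-a.example
```

A plain wiki page or spreadsheet would serve equally well; the point is only that each child has one current entry that can be looked up quickly.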
34
REGISTER OF PERTS (2)
The PERT registers:
• Should be made available to the PERT community.
• Should speed up the process of contacting the right person in
the course of a PERT investigation.
35
LOCAL LIST OF CONTACTS
You should also hold a local list of contacts in neighbouring networks.
[Diagram: your PERT at the centre, linked to its parent PERT, its child PERTs, neighbouring NOCs and the federated PERTs.]
36
TRANSFERRING PERT CASES
A child PERT needs to decide whether or not to transfer a
PERT case to its parent
• Directly after the basic case information is collected
• At appropriate points during the investigation
The decision will depend upon:
• The scope and nature of the issue
• Examples: single or multi-domain, straight-forward or complicated
• The resource available
37
CASE MANAGEMENT AND INVESTIGATION
When a PERT case is transferred, a better-placed PERT
becomes responsible for managing the case to resolution.
Transfers can be from:
• Child to parent, or
• Parent to child.
A PERT can also ask another PERT to assist in investigating
a case without transferring management responsibilities.
• Example: information is needed relating to a specific domain
or a specialised subject
38
TRANSFERRING AND MANAGING CASES
[Diagram with three levels: Central (GÉANT2), National (NREN) and Campus. A slow connection between users at two campuses crosses the network; the labelled steps are open case, transfer case and manage case.]
39
TRACKING CASES AND RECORDING INFORMATION
Each PERT should:
• Track cases and capture case status via:
• A ticket system
• Bulletin board
• Wiki
• Email (for very small PERTs only)
• Implement a method of recording information
• E.g. Lessons learned from cases
• Could be done via a wiki
• Should ideally be possible to retrieve information via search facility
40
FEDERATED PERT WORKFLOW
[Flowchart; boxes and decisions listed in reading order]
• Start: user reports issue
• Local PERT collects basic information and opens case
• Decision: is this PERT best placed to manage the issue?
• If no: transfer to parent PERT, which opens the case and investigates
• If yes: continue investigation
• Decision: is the issue resolved?
• If yes: record and communicate outcome (End)
• If no: is this PERT best placed to manage the issue? If not, should / can we transfer the issue to the parent PERT?
• The receiving PERT repeats the same loop (continue investigation; is the issue resolved?) and may transfer the issue to a child PERT
• End: record and communicate outcome
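One possible reading of the workflow, written as a small loop. This is a simplified sketch under stated assumptions: the function `run_workflow` and its callback parameters are hypothetical, the "transfer to child PERT" branch is left out, and a round limit is added so the example always terminates.

```python
from typing import Callable, List, Optional


def run_workflow(
    pert: str,
    parent_of: Callable[[str], Optional[str]],
    best_placed: Callable[[str], bool],
    investigate: Callable[[str], bool],
    max_rounds: int = 10,
) -> List[str]:
    """Return a log of the steps taken for one case (hypothetical model)."""
    log = [f"{pert}: collect basic information and open case"]
    for _ in range(max_rounds):
        if not best_placed(pert):
            parent = parent_of(pert)
            if parent is not None:
                # Another PERT is better placed: hand over case management.
                log.append(f"{pert}: transfer to {parent}")
                pert = parent
                log.append(f"{pert}: open case and investigate")
                continue
        # Either this PERT is best placed, or there is no parent to transfer to.
        if investigate(pert):
            log.append(f"{pert}: record and communicate outcome")
            return log
        log.append(f"{pert}: continue investigation")
    log.append(f"{pert}: still open after {max_rounds} rounds")
    return log


# Illustrative set-up: the NREN PERT is best placed and resolves the issue.
parents = {"Campus A PERT": "NREN X PERT", "NREN X PERT": None}
steps = run_workflow(
    "Campus A PERT",
    parent_of=parents.get,
    best_placed=lambda p: p == "NREN X PERT",
    investigate=lambda p: True,
)
for step in steps:
    print(step)
```

With these inputs the campus PERT opens the case, transfers it to the NREN PERT, and the NREN PERT investigates and records the outcome.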
FORMALISING FEDERATED PERT PROCEDURES
How will the draft procedures for the federated PERT be
formalised?
• We would like your feedback later today
• You can use the [email protected] mailing list to propose
and review new and changed procedures
• All NREN Access Port Managers were invited to subscribe to the list
• All others involved in the PERT are also invited to join:
– Mail [email protected]
42
TOWARDS THE FEDERATED PERT
The federated PERT is due to come into operation in
September 2008.
The policy for the federated PERT is a GN2 deliverable. It
will be presented to the GN2 Executive committee for
approval in June 2008.
43
Module 3: Setting up a PERT
PERT ROLES AND RESPONSIBILITIES (1)
At a minimum, the following roles are required:
• Administrator
• A named person, responsible for communicating PERT contact details
to parent PERT and users
• Likely to be a part-time role or best efforts
• Technician(s)
• Responsible for receiving, investigating and, if necessary, escalating
cases
• Part-time or full-time role or best efforts
45
PERT ROLES AND RESPONSIBILITIES (2)
In addition to an administrator and technicians, NREN and
Regional PERTs are strongly recommended to have:
• PERT Manager
• Named individual
• Has overall responsibility for the PERT
• Point of escalation
• Part-time role
• The manager may also be the technician and / or the administrator
• Deputy PERT Manager
• Responsible for the PERT in the manager’s absence
46
SKILLS AND EXPERIENCE REQUIRED (1)
Technical group:
• Qualification and / or experience in network management
• Good knowledge of TCP/IP
• Knowledge of Ethernet and / or other relevant data-link /
physical layer protocols
• Good knowledge of own network's topology, policies and
configuration
47
SKILLS AND EXPERIENCE REQUIRED (2)
PERT Manager / Deputy Manager
• Managerial / supervisory skills
Administrator
• Good communication skills
• Co-located, or in regular contact with, rest of team
All team members
• Competent written and spoken English
• Required for national PERTs; encouraged for other PERTs
48
RECOMMENDED TRAINING AND DEVELOPMENT
The following training and development is recommended:
• Setting up and Operating a PERT (workshop)
• Frequency depends upon demand and resource
• ‘In-house’ training
• Informal, NREN-led training
• Self-training
• Using the PERT knowledge base, and reading the articles and papers
it recommends
• Discussion groups:
• [email protected] for all network performance issues
49
MINIMUM REQUIREMENTS FOR NEW PERTS (1)
Register with the parent PERT, specifying:
• Full, official name of the organisation
• Description of organisation’s purpose or link to web-site
• Proposed PERT name
• Administrative point of contact details:
• Name, address, email address, telephone number
• Technical team’s contact details
• PERT’s normal working hours and public holidays
– Continued on next slide
50
MINIMUM REQUIREMENTS FOR NEW PERTS (2)
Register with the parent PERT, specifying:
• Internet footprint
• List of ASes (and / or a whois AS-macro) and assigned address
ranges.
• The primary language and other supported languages
• English must be supported for a national or international PERT
Local list of points of contact for neighbouring networks
The PERT may also optionally specify any special skills or
services that it can offer (e.g. interpretation of pcap files)
51
MINIMUM REQUIREMENTS FOR NEW PERTS (3)
Register with the parent PERT, specifying:
• The PERT’s grade, dependent upon response time to new
issues
• Grade 1 PERT: not more than 7 working days
• Grade 2 PERT: not more than 5 working days
• Grade 3 PERT: not more than 3 working days
• Grade 4 PERT: not more than 1 working day
• Grade 5 PERT: not more than 4 working hours
• Grade 6 PERT: not more than 2 working hours
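The grade list maps naturally onto a lookup table. A minimal sketch, assuming a hypothetical helper `describe_grade`; only the grade numbers and response times come from the list above.

```python
# Grade -> maximum response time to new issues, as listed on the slide.
GRADE_RESPONSE = {
    1: (7, "working days"),
    2: (5, "working days"),
    3: (3, "working days"),
    4: (1, "working day"),
    5: (4, "working hours"),
    6: (2, "working hours"),
}


def describe_grade(grade: int) -> str:
    """Return a one-line description of a PERT grade (hypothetical helper)."""
    if grade not in GRADE_RESPONSE:
        raise ValueError(f"grade must be between 1 and 6, got {grade}")
    amount, unit = GRADE_RESPONSE[grade]
    return f"Grade {grade} PERT: responds to new issues within {amount} {unit}"


print(describe_grade(6))
# Grade 6 PERT: responds to new issues within 2 working hours
```

Note that a higher grade number means a shorter response time.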
– Continued on next slide
52
PROVISIONAL STATUS
A new PERT can choose provisional status.
Provisional PERTs are the same as normal PERTs, except:
• Their reaction times (as per grade) are targets, not
commitments.
• They may choose not to publicise their existence to their
users.
53
DESIGN YOUR PERT (1)
Choose what size to make your PERT, depending on:
• Size of your organisation
• Scope of network
• Existing skills available
• Availability of new or re-allocated resource
• No need to allocate new resource
• Basic PERT should be able to operate within existing budgets
• Choose a name
• Should reflect the scope of your PERT’s responsibilities
54
DESIGN YOUR PERT (2)
Identify PERT non-working days
Choose your normal hours of cover
Choose your grade (response time)
• Choose from grades 1 to 6
• Grade 1 = not more than 7 working days
• Grade 6 = not more than 2 working hours
Decide whether you will be provisional
55
BUILD YOUR PERT (1)
Appoint the PERT administrator
• Should be a named individual
Appoint the technical group
Appoint a PERT Manager and deputies if required
56
BUILD YOUR PERT (2)
Determine technical team contact details
• Group mail address
• Telephone
Determine public contact details
• E-mail
• Web-form
• Telephone
57
BUILD YOUR PERT (3)
Determine what, if any, network monitoring systems or tools
will be available to:
• Parent / child PERTs
• PERTs in general
• The public at large
Choose case tracking system for managing open cases
Choose case history system for recording lessons learned
58
CONTACT YOUR PARENT PERT (1)
(Usually a parent PERT will contact the child PERT first)
Administrator should give the parent PERT the following
information:
• Administrator's name, address and telephone contact details
• The organisation’s full official name
• A brief description of its purpose
• The proposed PERT name
• Name and contact details of the PERT manager and deputies
59
CONTACT YOUR PARENT PERT (2)
Administrator should give the parent PERT the following
information:
• Contact email address and telephone number of the
technical team (preferably contact details for individuals)
• The PERT’s hours of operation
• The PERT’s internet footprint
• The primary language and other supported languages
• Any special skills or abilities the PERT has to offer, including
tools
60
RAISING THE PROFILE OF YOUR PERT
Publicise your PERT
• Make sure your organisation’s helpdesk are aware of the
PERT and know that network performance issues should be
routed to it.
• Contact key users
• E.g. International GRID projects
• Create a web page, advertising your services and displaying
contact information
• Place links on other relevant web-pages
61
MANAGING USER EXPECTATIONS
Issues may be ‘simple’. Examples:
• Ethernet duplex mismatch
• TCP buffers
• Sub-optimal routing
However, once ‘simple’ solutions are ruled out, the only
solution left may be to upgrade hardware or software.
This may, or may not, be possible.
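To make the "TCP buffers" example concrete: a single TCP connection can send at most one window of data per round trip, so the buffer needed to fill a path is its bandwidth multiplied by its round-trip time. The sketch below works through that arithmetic; the 1 Gbit/s link and 100 ms round-trip time are assumed figures chosen for illustration, and the function names are hypothetical.

```python
def bdp_bytes(bandwidth_mbit_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    # Mbit/s * ms = 1000 bits; divide by 8 to get bytes.
    return bandwidth_mbit_s * rtt_ms * 1000 // 8


def window_limited_mbit_s(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound on throughput when at most one window is sent per round trip."""
    bits_per_second = window_bytes * 8 * 1000 / rtt_ms
    return bits_per_second / 1e6


# Assumed path: 1 Gbit/s with a 100 ms round-trip time.
print(bdp_bytes(1000, 100))
# 12500000

# 65,535 bytes is the largest TCP window without the window scale option.
print(f"{window_limited_mbit_s(65535, 100):.2f} Mbit/s")
# 5.24 Mbit/s
```

On this assumed path, an end-system limited to a 65,535-byte window could reach only about 5 Mbit/s, however fast the network is, which is why buffer tuning is often among the first things to check.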
62
Module 4: PERT Systems and Procedures
THE PERT KNOWLEDGE BASE
Purpose - To act as a first point of reference for network
performance issues
Objectives
• Provide overview of network performance factors
• Give guidance on optimising networked systems
• Offer practical advice on using common monitoring and
investigative tools
• Host links to other relevant websites, including source
material
64
PERT KNOWLEDGE BASE OVERVIEW
Delivered as SA3 Work Item 3 (led by SWITCH)
A TWiki site, hosted by SWITCH
URL http://kb.pert.geant2.net
Growing in popularity
65
PERT KNOWLEDGE BASE VISITS
66
PERT KNOWLEDGE BASE FRONT PAGE
67
PERT KNOWLEDGE BASE CONTENTS
Performance Basics, and Metrics
Network and application protocols (particularly TCP)
PERT Tools
• General purpose tools (ping, shell commands)
• Measurement tools (traceroute-like, active, passive)
• NREN tools and statistics
System tuning (application, end-system, network)
Networking Technologies and "evil middle boxes"
PTS Case Histories and other Case Studies
68
SEARCHING THE PERT KNOWLEDGE BASE
Enabled for Google indexing
Uses Google custom search
69
CONTRIBUTING TO THE PERT KNOWLEDGE BASE
All are encouraged to contribute to the PERT KB
Simply select 'Register' from the front page, left hand side
Editing is simple…
• Existing pages or text can be used as templates
• Full version history is stored – no risk of unrecoverable errors
70
THE FUTURE PERT KNOWLEDGE BASE
Central PERT Knowledge Base, maintained by SWITCH,
likely to remain
National PERTs are welcome to set up local PERT
Knowledge Bases in native languages
• Should focus on relevant local user issues
• Should not attempt to translate the whole of the central PERT
Knowledge Base
71
DEMONSTRATIONS
Demonstrations:
• Searching the PERT Knowledge Base
• Registering on the PERT Knowledge Base
• Adding and Amending information
72
THE PERT TICKET SYSTEM
The central GN2 PERT will use the central PERT Ticket
System (PTS)
Other PERTs may use their own ticketing systems, for
example:
• Existing NOC ticketing systems
• Cheap and quick to implement
• May adversely affect incident closure statistics, since PERT cases
often have long durations
• New proprietary or open source system
73
THE PURPOSE OF THE PERT TICKET SYSTEM
The purpose of the PERT Ticket System (PTS) is:
• To provide a simple and secure method for users to open and
follow PERT cases.
• To maximise the efficiency of the subsequent PERT
investigation.
74
PERT TICKET SYSTEM BASICS
The current PERT Ticket system:
• Is a bespoke development by PSNC.
• Full title: ‘PSNC-PTS’
• Is a web-based application (no client software is necessary).
• Features secure access by username and password or X509
certificate.
• Holds ticket details specific to IP network performance
issues.
PTS v2 just released
75
PERT TICKET SYSTEM USERS
Anyone can create a PERT account for themselves and report a
performance issue.
Primary users (NRENs, international R&E projects) can
request X509 certificates for simplified access.
PERT staff (Case Managers and Subject Matter Experts) can
view and update all cases.
PERT / PTS Managers can edit user accounts and update the
CM duty roster.
76
PTS TICKET LIFECYCLE (1)
User creates new ticket
• User prompted for detailed technical information, but details
are not compulsory
• State - "waiting for acknowledgement"
PERT Case Manager acknowledges ticket
PERT engineers (and potentially end-users) add notes to the
ticket
• Notes can be text and / or attachments
• State - "waiting for PERT / user / third party action"
77
PTS TICKET LIFECYCLE (2)
Once problem is resolved
• PERT CM adds a resolution note
• State "Resolution Proposed"
Once resolution is approved
• State "Closed"
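The ticket lifecycle on these two slides can be read as a small state machine. The sketch below is an illustration, not the PTS implementation: the state names are taken from the slides, while the event names and the `advance` function are assumptions.

```python
from enum import Enum


class State(Enum):
    WAITING_FOR_ACK = "waiting for acknowledgement"
    WAITING_FOR_ACTION = "waiting for PERT / user / third party action"
    RESOLUTION_PROPOSED = "Resolution Proposed"
    CLOSED = "Closed"


# Event names are hypothetical labels for the steps described on the slides.
TRANSITIONS = {
    (State.WAITING_FOR_ACK, "acknowledge"): State.WAITING_FOR_ACTION,
    (State.WAITING_FOR_ACTION, "add_note"): State.WAITING_FOR_ACTION,
    (State.WAITING_FOR_ACTION, "propose_resolution"): State.RESOLUTION_PROPOSED,
    (State.RESOLUTION_PROPOSED, "approve_resolution"): State.CLOSED,
}


def advance(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} not allowed in state {state.value!r}"
        ) from None


# A new ticket starts in "waiting for acknowledgement".
state = State.WAITING_FOR_ACK
for event in ["acknowledge", "add_note", "propose_resolution", "approve_resolution"]:
    state = advance(state, event)
print(state.value)
# Closed
```

The slides do not say what happens if a proposed resolution is not approved, so no such transition is modelled here.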
78
PTS VERSION 2: ADVANCED FEATURES
Secret notes
• For the secure storage of passwords
Categorised and prioritised notes
• For quick filtering
Web Service
• PTSv2 offers a Web Service for the submission of tickets (in
the appropriate XML format)
79
THE FUTURE OF THE PERT TICKET SYSTEM
The PSNC-PTS will be used by the central PERT for the rest
of GN2.
Depending on interest levels:
• PTS may be offered to national PERTs for their own use
• PTS may be supported and further developed
80
ACTIVITY
Exercise
• Handling PERT Tickets in PTS
81
Module 5: Feedback on the Federated PERT
ACTIVITY
Feedback
• Please fill in a feedback form
83