Network monitoring approaches (2)


Monitoring network services
Pavle Vuletić
SGA-2 JRA2T4 Task Leader, GÉANT Project
JRA2T4 face-to-face meeting in Poznan
November 30th 2016
Networks ∙ Services ∙ People
www.GÉANT .org
Network service key performance metrics
• MEF metrics (10.3)
  • One-way Frame Delay
  • One-way Mean Frame Delay
  • One-way Frame Delay Range
  • One-way Inter-Frame Delay Variation
  • One-way Frame Loss Ratio
  • One-way Availability
  • One-way Resiliency
  • One-way Group Availability
• All these metrics can be obtained from an owamp-like tool or by simply timestamping frames at both ends and comparing the timestamps
• Y.1540 (IP) metrics
  • IP packet transfer delay
    • Mean, min, max
  • End-to-end 2-point IP packet delay variation
  • IP packet error ratio
  • IP packet loss ratio
  • Spurious IP packet rate
  • IP packet reordered rate
  • IP packet severe loss block ratio
  • IP packet duplicate ratio
  • Replicated IP packet ratio
  • Capacity metrics
    • Capacity, transferred bits, available bandwidth, section capacity, variability of capacity
  • IP service availability
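The "timestamping and comparing" idea behind the one-way metrics above can be illustrated with a minimal Python sketch. It assumes synchronised clocks at both ends and per-frame sequence numbers; the function name and dictionary layout are hypothetical. The delay variation and range here are simplified averages and extremes, not the percentile-based definitions of MEF 10.3.

```python
from statistics import mean

def one_way_metrics(sent, received):
    """Compute simplified one-way metrics.

    sent / received: dicts mapping frame sequence number -> timestamp in
    microseconds (integers). Clocks are assumed to be synchronised.
    """
    # Delays of delivered frames only, in sending order.
    delays = [received[seq] - sent[seq] for seq in sorted(sent) if seq in received]
    lost = len(sent) - len(delays)
    # Simplified inter-frame delay variation: absolute delay difference
    # between consecutive delivered frames.
    ifdv = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return {
        "mean_delay": mean(delays) if delays else None,
        "min_delay": min(delays) if delays else None,
        "max_delay": max(delays) if delays else None,
        "delay_range": max(delays) - min(delays) if delays else None,
        "mean_ifdv": mean(ifdv) if ifdv else None,
        "loss_ratio": lost / len(sent) if sent else None,
    }
```

The same function works whether the timestamps come from an active probe or from passively captured service traffic, as long as both ends agree on how a frame is identified.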
Network monitoring approaches (1)
• Passive (SNMP, reading from NE, reading from EMS)
⁺ Suitable for capacity, used bandwidth and packet error metrics (read from devices)
⁺ Suitable in single-domain environment
⁺ No additional traffic, no (significant) interference with the other network traffic
⁺ Support for fault localization
⁻ Not suitable for delay/jitter/loss metrics
⁻ Problems in multiple domains
⁻ Problems in multi-vendor environments
⁻ Problems with services which dynamically change path (e.g. MPLS-based VPNs)
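As an illustration of the "used bandwidth" metric read from devices, the sketch below derives an average bit rate from two samples of an interface octet counter (for example the IF-MIB object ifHCInOctets). It is only the arithmetic: how the samples are fetched is left out, and the function names are hypothetical.

```python
COUNTER64_MODULUS = 2 ** 64  # ifHCInOctets is a 64-bit counter; use 2**32 for ifInOctets

def used_bandwidth_bps(octets_t0, octets_t1, interval_s, modulus=COUNTER64_MODULUS):
    """Average bit rate between two samples of an octet counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Modulo arithmetic copes with a single counter wrap between the samples.
    delta_octets = (octets_t1 - octets_t0) % modulus
    return delta_octets * 8 / interval_s

def utilisation(octets_t0, octets_t1, interval_s, link_speed_bps,
                modulus=COUNTER64_MODULUS):
    """Fraction of the link speed used during the polling interval."""
    return used_bandwidth_bps(octets_t0, octets_t1, interval_s, modulus) / link_speed_bps
```

Because the result is an average over the polling interval, short bursts are invisible, which is one reason this approach does not yield delay, jitter or loss figures.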
Network monitoring approaches (2)
• Active (injecting special purpose network traffic)
⁺ Suitable for delay/jitter/loss metrics
⁺ Suitable for monitoring in multiple domains
⁺ No problems with dynamic path changing
⁻ Not suitable for capacity and available bandwidth monitoring (very intrusive, and the results are not reliable)
⁻ Injected traffic might not have the same conditions as the monitored service traffic
⁻ Not suitable for chained services and fault localization
• Can be done:
  • From NE – there are methods only for specific network services (e.g. 802.1ag Connectivity Fault Management - CFM)
  • From dedicated external devices – OK for all services except p2p services (no place to inject traffic)
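A minimal sketch of what "special purpose" probe traffic can carry: each probe holds a sequence number and a send timestamp, and the receiver classifies arrivals as lost, duplicated or reordered. The payload layout and function names are assumptions made for illustration and do not follow any particular protocol such as OWAMP or 802.1ag.

```python
import struct
import time

# Hypothetical probe payload: 32-bit sequence number + 64-bit send time (ns),
# both in network byte order.
PROBE_FORMAT = "!IQ"

def build_probe(seq, now_ns=None):
    """Serialise a probe; the current time is used if now_ns is not given."""
    if now_ns is None:
        now_ns = time.time_ns()
    return struct.pack(PROBE_FORMAT, seq, now_ns)

def parse_probe(payload):
    """Return (sequence number, send time in ns) from a probe payload."""
    size = struct.calcsize(PROBE_FORMAT)
    return struct.unpack(PROBE_FORMAT, payload[:size])

def summarise(received_seqs, sent_count):
    """Classify arrivals, given sequence numbers in arrival order."""
    seen = set()
    duplicates = 0
    reordered = 0
    highest = -1
    for seq in received_seqs:
        if seq in seen:
            duplicates += 1      # the same probe arrived again
            continue
        seen.add(seq)
        if seq < highest:
            reordered += 1       # arrived after a later-sent probe
        else:
            highest = seq
    return {"lost": sent_count - len(seen),
            "duplicates": duplicates,
            "reordered": reordered}
```

The probes still have to be sent with the same encapsulation and QoS marking as the monitored service; otherwise, as noted above, they may not experience the same conditions.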
Network monitoring approaches (3)
• Out-of-band/Network visibility
⁺ Became popular recently (Brocade Packet Brokers and Visibility Manager, Ixia IxVision architecture - taps and packet brokers, Accedian smart SFPs and Packet Brokers, ...)
⁺ Allows various types of analyses (performance, security, per flow, per service instance, ...)
⁺ Allows all types of performance metrics for all types of services (just filter on the appropriate field in the header)
⁺ Enables fault localization
⁺ There are virtual taps for "inside data centre" monitoring
⁻ Multiple copies of tapped traffic have to be transported to a central facility – smart sampling is required if the central facility is far from the taps
⁻ Not very suitable for WANs – how to transport tapped traffic without creating a copy of the existing network? Target use: data centres, security verification, mobile network monitoring.
⁻ Privacy issues!!!
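To make "just filter on the appropriate field in the header" concrete, here is a minimal sketch that extracts the 802.1Q VLAN ID and the outermost MPLS label from a raw Ethernet frame and matches them against a service. It assumes at most one VLAN tag and does not handle stacked tags or multiple labels; the function names are hypothetical.

```python
import struct

ETHERTYPE_VLAN = 0x8100   # IEEE 802.1Q tag
ETHERTYPE_MPLS = 0x8847   # MPLS unicast

def classify_frame(frame):
    """Return (vlan_id, mpls_label) for an Ethernet frame; either may be None."""
    vlan_id = None
    mpls_label = None
    offset = 12  # skip destination and source MAC addresses
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if ethertype == ETHERTYPE_VLAN:
        tci, ethertype = struct.unpack_from("!HH", frame, offset)
        vlan_id = tci & 0x0FFF      # low 12 bits of the tag control information
        offset += 4
    if ethertype == ETHERTYPE_MPLS:
        (entry,) = struct.unpack_from("!I", frame, offset)
        mpls_label = entry >> 12    # top 20 bits of the label stack entry
    return vlan_id, mpls_label

def matches_service(frame, vlan_id=None, mpls_label=None):
    """True if the frame carries the requested VLAN ID and/or MPLS label."""
    found_vlan, found_label = classify_frame(frame)
    return ((vlan_id is None or found_vlan == vlan_id)
            and (mpls_label is None or found_label == mpls_label))
```

A capture point applying such a filter only forwards frames of the monitored service instance, which reduces both the volume sent to the central facility and the exposure of unrelated traffic.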
What have we done so far?
• Agreed to try the network visibility approach as the one which supports fault localization and the widest set of services
• Proved in GTS that this is doable for VLAN- and MPLS-based network services
• Understood key architecture elements:
  • Packet capturers and filters
  • Domain controllers/capture aggregators
  • Inter-domain exchange
  • Analysis modules
• And now we have to design these elements.
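One way an analysis module could use captures from two capture points is sketched below: the same packet is recognised at both points by a hash of its bytes, and the timestamp difference gives the delay of the segment between them. This is an illustrative assumption, not the design agreed in the task; it presumes synchronised capture clocks and packets that are byte-identical at both points (in practice mutable fields such as TTL would have to be excluded from the hash).

```python
import hashlib

def packet_key(packet_bytes):
    """Identify a packet by a hash of its bytes."""
    return hashlib.sha256(packet_bytes).hexdigest()

def segment_delays(captures_a, captures_b):
    """Match packets seen at capture points A and B.

    captures_*: lists of (timestamp_us, packet_bytes).
    Returns (delays_us, missing_at_b): delays for packets seen at both
    points, and the number of packets seen at A but never at B.
    """
    seen_at_b = {packet_key(pkt): ts for ts, pkt in captures_b}
    delays = []
    missing = 0
    for ts_a, pkt in captures_a:
        ts_b = seen_at_b.get(packet_key(pkt))
        if ts_b is None:
            missing += 1                # candidate loss on segment A -> B
        else:
            delays.append(ts_b - ts_a)
    return delays, missing
```

Repeating the comparison for each pair of adjacent capture points along a service path shows on which segment delay grows or packets disappear, which is the fault localization property that motivated the visibility approach.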
What is on the market?
• Similar systems appeared recently: we are on the right track, but will we be able to make something more cost-effective than those solutions?
• Some examples:
  • Ixia IxVision
  • Brocade Visibility architecture
  • Accedian FlowBroker
Ixia IxVision
• Taps, Packet Brokers and Analysis tools
Ixia Taps
• Passive
• Do not modify traffic
• Decrease optical signal strength
• High density: in chassis, for data centre monitoring
• Prices:
  • Ixia Flex Tap 10G - 809$
  • Ixia Flex Tap 40G LC - 1499$
  • Ixia Flex Tap 100G - 629$ (LC)
Ixia Virtual taps
• Virtual taps (phantom taps) for inter-VM tapping for VMware ESXi, Microsoft Hyper-V, KVM
Ixia Packet Brokers
• Take tapped traffic, filter it and distribute it to a set of recipients
  • Network monitoring
  • Security
  • Forensics
• Various sizes and port densities
• Customizable for various purposes (filters)
Ixia Use Cases – Network Security
Ixia Use Case – Proactive monitoring
• Continuous SLA and Experience Validation
• Typical deployments consist of
  • software and/or hardware active endpoints,
  • emulated application traffic,
  • a simple web-based management and monitoring interface
• XRPi Probe – based on Raspberry Pi
• e2e tests, application tests
Brocade visibility architecture
Brocade packet brokers
• Target use case – mobile (5G) networks
• SDN-based control of packet brokers
• Customizable, ...
Accedian Flow Broker Architecture
• Accedian has its own performance modules - packet capturers (smart SFPs, NIDs) which do the timestamping and filtering (packet slicing)
• VCX controls the capturers, sets filters and gets the captured traffic
• Brokered flows are sent on for further analysis, depending on the purpose
• Brokered flows << 10% of the original bandwidth
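The effect of packet slicing on the volume of brokered traffic can be estimated with simple arithmetic. The sketch below is a hypothetical helper; the slice length, the per-frame metadata size and the frame size mix are assumptions for illustration, not figures from Accedian.

```python
def sliced_fraction(frame_sizes, slice_bytes, metadata_bytes=0):
    """Fraction of the original byte volume left after slicing every frame.

    frame_sizes: sizes of the captured frames in bytes.
    slice_bytes: number of leading bytes kept from each frame.
    metadata_bytes: assumed per-frame overhead (e.g. a timestamp) added by
    the capturer.
    """
    original = sum(frame_sizes)
    if original == 0:
        return 0.0
    kept = sum(min(size, slice_bytes) + metadata_bytes for size in frame_sizes)
    return kept / original
```

For full-size 1500-byte frames cut to their first 64 bytes, the fraction is 64/1500, a little over 4%. Traffic dominated by small frames benefits far less from slicing, so filtering on the monitored service matters as well.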
Accedian use cases and filtering
• Accedian use cases:
  • Video over LTE
  • Video QoE
  • Financial Compliance and Trade flow analysis
  • Security and Policy
New activity within JRA2
• E-Line service: multi-domain Ethernet-based p2p link
• Configuration (openNSI-based C&A system) – T1
• User portal, service & resource inventory – T2
• Monitoring service – T4
• Deadline – March 2017
• Playground – GTS testbed
Monitoring any line deployed with openNSA
Goals of this f2f meeting
• Agree on the JRA2T4 strategy
• Position our task against the existing tools from the market
• Develop everything or buy something and develop the rest?
• Our solution has to be leaner, smarter, faster and less intrusive than the competition, well tailored for our environment, integrated with C&A tools...
• (we always have our unique multi-domain environment which is not covered by those tools, but it would be better if we had other reasons as well – cost effectiveness, specific service monitoring, smart chained service monitoring etc.)
• Get the whole team on the same page on all elements of the architecture
• Set the path for the architecture elements
• Define deadlines for the milestone - architecture
The Agenda
• Day 1
  • 09:00 - 09:15 PV: Welcome, Agenda
  • 09:15 - 09:45 PV: Summary of T4 goals, work done so far, key performance metrics, existing commercial tools
  • 09:45 - 13:00 All: Discussion about the T4 strategy and final product (coffee break around 10:30)
  • (13:00 - 13:45 Lunch)
  • 13:45 - 14:30 HW: Network tap - filter architecture
  • 14:30 - 15:15 PM: Domain collector architecture
  • (15:15 - 15:30 Coffee break)
  • 15:30 - 16:15 DS, TA: Correlation of the measurement data with network topology
  • 16:15 - 17:00 KG: Correlation of the measurement data with the service information
• Day 2
  • 09:00 - 12:30
    • PMV Architecture - decisions and plan for the Milestone
    • Coffee break around 10:30
Thank you. Any questions?
© GÉANT Limited on behalf of the GN4 Phase 1 project.
The research leading to these results has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 691567 (GN4-1).
MA conf (1)
⁺ Smart and configurable taps and filters
⁺ Switches already in place
⁻ Change to the topology, SDN switch failures, lower availability of the service
MA conf (2)
⁺ No changes to the topology, taps are passive
⁺ All packets captured & smart filtering
⁻ A lot of additional equipment
MA conf (3)
⁺ Only taps added
⁺ No changes to the topology
⁻ Number of server ports in domains with a lot of STPs and SDPs
MA conf (4)
⁺ No additional equipment
⁻ Lost packets, additional burden for switch, suitable for low traffic services