diploma presentation-v.0.62

DIPLOMA THESIS
Service Level Agreement
A Service Level Agreement (SLA) is a formal definition of the
relationship between a service provider and its customer.
An SLA can be defined and used in any industry. It specifies what the
customer can expect from the provider, the obligations of both the
customer and the provider, the performance, availability and security
objectives of the service, and the procedures to be followed to ensure
compliance with the SLA.
Service level agreements are often used when corporations outsource
functions considered outside the scope of their own core competencies
to third-party service providers.
Service Level Agreement
A service level agreement would typically contain the following information:
A description of the nature of service to be provided
The expected performance level of the service, specifically its reliability and
responsiveness
The procedure for reporting problems with the service
The time-frame for response and problem resolution
The process for monitoring and reporting the service level
The consequences for the service provider not meeting its obligations
Escape clauses and constraints
Not every SLA contains all of these components, but a good SLA gives an
overview of the different things that can go wrong with the provided service
and attempts to cover those situations in the agreement.
Network Performance Metrics
• One-Way Delay (OWD)
• Round-Trip Time (RTT)
• Delay Variation or "Jitter" and RTT Variation
• Packet Loss – interface errors and drops
• Maximum Transmission Unit (MTU)
• Path MTU
• Link Utilization – IP bandwidth utilization and achievable TCP throughput
One-Way Delay
per-hop One-Way Delay
per-link delay:
• propagation delay
• serialization delay
per-node delay:
• forwarding delay
• queuing delay
• additional delays
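As a rough illustration of how these components add up for a single hop, the sketch below computes serialization and propagation delay and sums them with the per-node terms. The function names, the signal speed of 2e8 m/s (a common approximation for optical fibre) and the example link values are assumptions made for illustration; they do not come from the presentation.

```python
# Illustrative per-hop one-way delay model; all names and defaults are assumptions.

def serialization_delay_s(packet_bytes: int, link_bps: float) -> float:
    """Time needed to clock every bit of one packet onto the link."""
    return packet_bytes * 8 / link_bps


def propagation_delay_s(distance_m: float, signal_speed_mps: float = 2.0e8) -> float:
    """Time for the signal to travel the link; 2e8 m/s approximates fibre."""
    return distance_m / signal_speed_mps


def per_hop_delay_s(packet_bytes, link_bps, distance_m,
                    forwarding_s=0.0, queuing_s=0.0):
    """Sum of the per-link and per-node components for a single hop."""
    per_link = (propagation_delay_s(distance_m)
                + serialization_delay_s(packet_bytes, link_bps))
    per_node = forwarding_s + queuing_s
    return per_link + per_node


if __name__ == "__main__":
    # 1500-byte packet, 100 Mbit/s link, 100 km of fibre, idle router.
    print(per_hop_delay_s(1500, 100e6, 100_000))
```

With these example values the serialization delay is 120 microseconds and the propagation delay is 500 microseconds, so on a long link propagation dominates unless queues build up.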
Delay Variation (“Jitter”)
Delay Variation
• Describes the level of disturbance of packet arrival times
• Comparison to an “ideal” pattern
IP Delay Variation Metric (IPDV) (RFC 3393)
• Delays for equally sized packets
• Delay depends on packet size due to serialization delay
• Critical for real-time applications (audio/video)
Caused by:
• Queuing on routers (especially on CPU-based router architectures)
• Collision avoidance (shared Ethernet)
• Link-level retransmission (802.11 wireless LANs)
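A minimal sketch of the delay variation idea: RFC 3393 defines IPDV for a pair of packets as the difference of their one-way delays, and a common choice is to pair consecutive packets. The function name and the sample delays below are assumptions for illustration only.

```python
# Sketch: IPDV between consecutive, equally sized packets.
# The function name and sample values are illustrative assumptions.

def ipdv(delays_ms):
    """Return d[i+1] - d[i] for each pair of consecutive one-way delays."""
    return [later - earlier for earlier, later in zip(delays_ms, delays_ms[1:])]


if __name__ == "__main__":
    one_way_delays_ms = [10.0, 12.0, 11.0, 15.0]
    print(ipdv(one_way_delays_ms))
```

A positive value means the later packet took longer than the one before it; a stream with constant delay yields all zeros regardless of how large the delay is.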
Round-Trip Time & Packet Loss
Round-Trip Time (RTT)
• The sum of the A-to-B one-way delay, the B-to-A one-way delay,
and the time B takes to respond to A.
Packet Loss
One-way Packet Loss Metric for IPPM (RFC 2680)
Caused by:
• Congestion: severe congestion overflows queues and leads to packet
drops (gradual or in bursts).
• Errors: corruption, packets modified in transit (noisy lines etc.),
checksum failure on the receiving end.
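One simple way to turn a received packet stream into loss figures is to compare the sequence numbers that arrived with the number of packets sent. The sketch below assumes the sender numbers its packets from 0; the function name and the returned keys are hypothetical.

```python
# Sketch: one-way loss and duplicate counts from received sequence numbers.
# Assumes packets are numbered 0..sent_count-1; names are illustrative.

def loss_summary(sent_count, received_seqs):
    """Count lost and duplicated packets for one measurement session."""
    if sent_count <= 0:
        raise ValueError("sent_count must be positive")
    unique = set(received_seqs)
    lost = sent_count - len(unique)
    duplicates = len(received_seqs) - len(unique)
    return {
        "lost": lost,
        "duplicates": duplicates,
        "loss_ratio": lost / sent_count,
    }


if __name__ == "__main__":
    # Packets 3 and 8 never arrive; packet 2 arrives twice.
    print(loss_summary(10, [0, 1, 2, 2, 4, 5, 6, 7, 9]))
```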
Packet Reordering, MTU & Performance
Packet Reordering
Caused by:
• Alternative routes
• Router internal parallelism
• Packet size
Maximum Transmission Unit (MTU)
Common MTUs:
• 1500 bytes (Ethernet, 802.11 WLAN)
• 4470 bytes (FDDI, common default for POS and serial links)
• 9000 bytes (Internet2 and GÉANT convention, limit of some Gigabit Ethernet adapters)
• 9180 bytes (ATM, SMDS)
Path MTU
• The MTU supported by a path
• The minimum of the MTUs of the links along the path
Performance
• TCP / STCP applications may see a performance impact
• Real-time media applications experience more serious problems
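The path MTU definition can be written down directly as a minimum over the link MTUs. The sketch below also derives a TCP maximum segment size from an MTU, assuming 20-byte IPv4 and TCP headers without IP options; the function names and the example path are illustrative assumptions.

```python
# Sketch: path MTU as the minimum link MTU, plus the resulting TCP MSS.
# Header sizes assume IPv4 without IP options; names are illustrative.

def path_mtu(link_mtus):
    """The smallest MTU of any link on the path limits the whole path."""
    return min(link_mtus)


def tcp_mss(mtu, ip_header=20, tcp_header=20, tcp_options=0):
    """Payload bytes left per segment after IP and TCP headers."""
    return mtu - ip_header - tcp_header - tcp_options


if __name__ == "__main__":
    mtu = path_mtu([9000, 4470, 1500, 9000])
    print(mtu, tcp_mss(mtu), tcp_mss(mtu, tcp_options=12))
```

With 12 bytes of TCP options (the padded timestamp option) a 1500-byte MTU leaves 1448 bytes per segment, which matches the MSS that iperf reports for an Ethernet path.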
Results Requirements
• Be able to monitor the services deployed
  • IPv4/IPv6.
  • Multicast/unicast.
  • IP QoS.
  • VPN/point-to-point connections.
• Emulate behavior close to that of the application in use.
• Different tools are used within each network
  • Need to abstract the data from the type of measurement tool used
    through a well-defined interface.
  • Interoperability between tools.
Measurement Tools
• Traceroute-like Tools:
  • traceroute, MTR, PingPlotter, lft, tracepath, traceproto
• Bandwidth Measurement Tools:
  • pchar, Iperf, bwctl, Netperf, RUDE/CRUDE, ttcp, NDT, DSL Reports
• Active Measurement Boxes:
  • DFN/GEANT2 HADES (formerly IPPM)
  • RIPE TTM
  • RENATER QoSMetrics
• Passive Measurement Tools:
  • SNMP Device Polling: MRTG, Cricket
  • NetFlow-based: flow-tools etc.
  • Packet Capture and Analysis Tools: tcpdump, Wireshark/Ethereal, jnettop
Measurement Tools
• OWD, OWPL, IPDV, traceroute: DFN IPPM
  • IPv4, IPv6, IP QoS, on-demand.
  • But also RIPE TTM for IPv4 and IPv6. http://www-win.rrze.uni-erlangen.de/ippm/
• TCP/UDP throughput: I2 BWCTL/iperf
  • IPv4, IPv6, on-demand. http://abilene.internet2.edu/observatory/data-views.html
• IP link utilization, link capacity, interface errors, interface drops: from existing DB.
  • IPv4, IPv6, (multicast?)
  • On-demand.
• NetFlow: under investigation
  • IPv4, IPv6.
  • Info (working document): http://monstera.man.poznan.pl/wiki/index.php/JRA1_D3.4_Flow_Monitoring
• Packet capture tools: HW: 10 Gbps DAG cards, SW: Scampi framework.
  • Info (working document): http://monstera.man.poznan.pl/wiki/index.php/Passive
• FYI: Global Performance Measurement Points directory
  • Info: http://e2epi.internet2.edu/pipes/pmp/pmp-dir.html
PerfSONAR System
• perfSONAR (Performance focused Service Oriented Network monitoring ARchitecture) system
• Is a joint effort of GÉANT2-JRA1, Internet2, and ESnet
• The solution is deployed and further elaborated in
  • European Research Backbone GÉANT
  • Connected European National Research and Education Networks
  • Internet2’s Abilene network
  • ESnet (Energy Sciences Network in the US)
  • RNP (Brazilian NREN)
• Open source development also for other interested networks
• Name reflects the choice of Service Oriented Architecture
PerfSONAR System
The Choice of Service Oriented Architecture
• Reasons for Service Oriented Architecture in the middle layer (“Service Layer”):
  • Large task can be split into independent “services”
  • Can be developed separately
  • Easier to maintain afterwards
  • Services can be added/dropped at runtime
  • Flexibility of deployment (e.g. NREN may use GEANT Lookup Service to advertise services)
  • Different implementations possible (e.g. using different programming languages)
PerfSONAR System
Services
• Measurement Point Service (MP)
• Measurement Archive Service (MA)
• Lookup Service (LS)
  • Allows the client to discover the existing services and other LS services.
  • Dynamic: services register themselves with the LS and advertise their capabilities;
    they can also leave, or be removed if a service goes down.
• Authentication Service (GN2-JRA5)
  • Authentication functionality for the framework
  • Users can have several roles; authorisation is based on the user's roles.
  • Trust relationship between networks
• Transformation Service
  • Transforms the data (aggregation, concatenation, correlation, translation, etc.).
• Topology Service
  • Makes the network topology information available to the framework.
  • Finds the closest MP, provides topology information for visualisation tools
• Resource Protection Service
  • Arbitrates the consumption of limited resources
[Figure: BWCTL test scenarios: general case; local bwctld unavailable]
• What Is It?
  • A resource allocation and scheduling daemon for arbitration of iperf tests
  • BWCTL controls the throughput tests by adding resource allocation and
    scheduling policy controls.
• Problem Statement
  • Users want to verify available bandwidth from their site to another.
• Methodology
  • Verify available bandwidth from each endpoint to points in the
    middle to determine the problem area.
• Implementation
  • Applications
    • bwctld daemon
    • bwctl client
  • Built upon protocol abstraction library
  • Supports one-off applications
BWCTL
Throughput Measurement
• Metrics
  • Throughput (Mbps)
• Parameters
  • "interval": the report interval (bwctl option -i)
  • "protocol": either udp or tcp, default is tcp (bwctl option -u for udp)
  • "bufferSize": size of read/write buffer (bwctl option -l)
  • "windowSize": size of TCP window / UDP socket receive buffer (bwctl option -w)
  • "duration": duration of test, default is 10 seconds (bwctl option -t)
  • "bandwidth": limits UDP send rate (bwctl option -b)
  • "ToS": specifies the ToS byte (bwctl option -S)
  • "login": if authentication is needed; "password": ditto
• Methods
  • On-demand testing with PHP-based BWCTL client (web GUI)
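The parameter list above maps naturally onto a command line. The sketch below only builds the argument list and does not run anything. The option letters -i, -u, -l, -w, -t, -b and -S are the ones named on the slide; using -c for the receiving host, the function name, and its keyword arguments are assumptions for illustration.

```python
# Sketch: translate the slide's parameters into a bwctl argument list.
# "-c <receiver>" and all Python names here are assumptions, not from the slides.

def build_bwctl_args(receiver, duration=10, interval=None, udp=False,
                     buffer_size=None, window_size=None,
                     bandwidth=None, tos=None):
    """Return an argv-style list; nothing is executed here."""
    args = ["bwctl", "-c", receiver, "-t", str(duration)]
    if interval is not None:
        args += ["-i", str(interval)]      # report interval
    if udp:
        args.append("-u")                  # default protocol is TCP
    if buffer_size is not None:
        args += ["-l", str(buffer_size)]   # read/write buffer size
    if window_size is not None:
        args += ["-w", str(window_size)]   # TCP window / UDP receive buffer
    if bandwidth is not None:
        args += ["-b", str(bandwidth)]     # UDP send rate limit
    if tos is not None:
        args += ["-S", str(tos)]           # ToS value
    return args


if __name__ == "__main__":
    print(" ".join(build_bwctl_args("147.102.13.77", interval=1)))
```

Building a list rather than a single string keeps each value a separate argument, which avoids shell quoting problems if the list is later handed to a process launcher.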
[ 15] local 147.102.13.77 port 5001 connected with 147.102.13.75 port 5001
[ ID] Interval       Transfer     Bandwidth
[ 15] 0.0-10.0 sec 116957184 Bytes 93343208 bits/sec
[ 15] MSS size 1448 bytes (MTU 1500 bytes, ethernet)
bwctl: stop_exec: 3469448542.020009
[ 5] local 147.102.13.75 port 5001 connected with 147.102.13.77 port 5001
[ ID] Interval       Transfer     Bandwidth
[ 5] 0.0-10.0 sec 116957184 Bytes 93538771 bits/sec
[ 5] MSS size 1448 bytes (MTU 1500 bytes, ethernet)
bwctl: stop_exec: 3469448541.018142
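To store results like these in a database, the report lines first have to be parsed. The sketch below extracts the interval, transferred bytes and bandwidth from lines in the format shown above. The regular expression is written against this sample output only, and the function name and dictionary keys are hypothetical.

```python
# Sketch: parse iperf summary lines as printed in the sample output above.
# The pattern is fitted to that sample; names are illustrative assumptions.
import re

REPORT = re.compile(
    r"\[\s*(\d+)\]\s+([\d.]+)-\s*([\d.]+)\s+sec\s+(\d+)\s+Bytes\s+(\d+)\s+bits/sec"
)


def parse_iperf_report(text):
    """Return one dict per summary line found in the text."""
    results = []
    for match in REPORT.finditer(text):
        stream, start, end, nbytes, bps = match.groups()
        results.append({
            "stream": int(stream),
            "start_s": float(start),
            "end_s": float(end),
            "bytes": int(nbytes),
            "mbps": int(bps) / 1e6,   # bits/sec converted to Mbit/s
        })
    return results


if __name__ == "__main__":
    sample = "[ 15] 0.0-10.0 sec 116957184 Bytes 93343208 bits/sec\n"
    print(parse_iperf_report(sample))
```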
OWAMP
• What Is It?
  • OWD or One-Way PING
  • A control protocol
  • A test protocol
  • A sample implementation of both
• Why the OWAMP protocol?
  • Find problems in the network
  • Congestion usually happens in one direction first…
  • Routing (asymmetric, or just changes)
  • SNMP polling intervals mask high queue levels that active probes
    can show
  • There have been many implementations to do one-way delay over the
    years (Surveyor, RIPE…)
  • The problem has been interoperability.
  • http://www.ietf.org/internet-drafts/draft-ietf-ippmowdp-014.txt
OWAMP Control protocol
• Supports authentication and authorization
• Used to configure tests
• Endpoint controlled port numbers
• Extremely configurable send schedule
• Configurable packet sizes
• Used to start/stop tests
• Used to retrieve results
• Provisions for dealing with partial session results
OWAMP Test protocol
• Packets can be “open”, “authenticated”, or “encrypted”
Sample Implementation
Applications
• owampd daemon
• owping client
Built upon protocol abstraction library
• Supports one-off applications
• Allows authentication/policy hooks to be incorporated
--- owping statistics from [dhcp-75.netmode.ece.ntua.gr]:59382 to [147.102.13.77]:35770 ---
SID: 93660d4dcecb984136ad1d045d58ef75
first: 2009-12-10T17:54:42.364
last: 2009-12-10T17:54:52.998
100 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = -1.31/-1.2/-0.642 ms, (err=2.86 ms)
one-way jitter = 0.1 ms (P95-P50)
TTL not reported
no reordering
--- owping statistics from [147.102.13.77]:56641 to [dhcp-75.netmode.ece.ntua.gr]:51684 ---
SID: 93660d4bcecb98413cf85be0ccbf222f
first: 2009-12-10T17:54:42.386
last: 2009-12-10T17:54:53.041
100 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 1.72/2.1/13.9 ms, (err=2.86 ms)
one-way jitter = 6.3 ms (P95-P50)
TTL not reported
no reordering
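The negative delays in the first direction are consistent with a clock offset between the two hosts that is within the reported error estimate: an offset adds to the measured delay in one direction and subtracts from it in the other. Summing both directions cancels the offset, so the two medians give a rough round-trip estimate. The sketch below parses the statistics shown above; the regular expressions are fitted to this sample output, and the function name and keys are hypothetical.

```python
# Sketch: parse owping summaries and estimate RTT from the two directions.
# Patterns are fitted to the sample output above; names are assumptions.
import re

LOSS = re.compile(r"(\d+) sent, (\d+) lost")
DELAY = re.compile(
    r"one-way delay min/median/max = "
    r"(-?[\d.]+)/(-?[\d.]+)/(-?[\d.]+) ms, \(err=([\d.]+) ms\)"
)


def parse_owping(text):
    """Return one dict per direction, in the order they appear."""
    sessions = []
    for loss, delay in zip(LOSS.finditer(text), DELAY.finditer(text)):
        sent, lost = (int(g) for g in loss.groups())
        d_min, d_med, d_max, err = (float(g) for g in delay.groups())
        sessions.append({
            "sent": sent, "lost": lost,
            "min_ms": d_min, "median_ms": d_med, "max_ms": d_max,
            "err_ms": err,
        })
    return sessions


def rtt_estimate_ms(sessions):
    """Sum of the median delays of both directions; clock offset cancels."""
    return sum(s["median_ms"] for s in sessions)


if __name__ == "__main__":
    sample = (
        "100 sent, 0 lost (0.000%), 0 duplicates\n"
        "one-way delay min/median/max = -1.31/-1.2/-0.642 ms, (err=2.86 ms)\n"
        "100 sent, 0 lost (0.000%), 0 duplicates\n"
        "one-way delay min/median/max = 1.72/2.1/13.9 ms, (err=2.86 ms)\n"
    )
    print(round(rtt_estimate_ms(parse_owping(sample)), 3))
```

For the run above the two medians sum to about 0.9 ms, a plausible figure for two hosts on the same campus network, whereas each one-way median taken alone is dominated by the clock offset.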
Implementation
• The BWCTL and OWAMP daemons (bwctld and owampd) are set up to run
constantly in the background, listening for and accepting incoming measurement
connections.
• cron schedules measurements with owping and bwctl against a
specific target host every 5 minutes.
• Measurement data are collected and stored in an RRD database and in a MySQL
database.
• A graphical user interface served by Apache Tomcat shows the latest
measurement results and lets the user select the measurement date dynamically.
• The behavior prediction algorithm of RRDTool is used to predict future
measurement behavior and to help ensure SLA conformance.
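To give a feel for prediction-based SLA checking, the sketch below forecasts the next sample with Holt's double exponential smoothing and compares it with a threshold. This is a simplified stand-in written for illustration: it is not RRDTool's code, it ignores the seasonal component that RRDTool's Holt-Winters forecasting includes, and the smoothing constants, the function names and the 5 ms threshold are all assumptions.

```python
# Sketch: trend-based forecast plus SLA threshold check.
# Simplified Holt smoothing (no seasonality); NOT RRDTool's implementation.
# Smoothing constants and the threshold are illustrative assumptions.

def holt_forecast(series, alpha=0.5, beta=0.3, steps=1):
    """Forecast `steps` samples ahead from a list of past measurements."""
    if len(series) < 2:
        raise ValueError("need at least two samples")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + steps * trend


def predicts_sla_violation(series, threshold, steps=1):
    """True if the forecast exceeds the agreed maximum (e.g. delay in ms)."""
    return holt_forecast(series, steps=steps) > threshold


if __name__ == "__main__":
    delays_ms = [2.0, 2.5, 3.1, 3.4, 4.0]   # one sample per 5-minute run
    print(holt_forecast(delays_ms), predicts_sla_violation(delays_ms, 5.0))
```

A check like this could run after each scheduled measurement so that a rising trend is flagged before the agreed limit is actually crossed.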