What is QoS?

CAS CS 835
QoS Networking Seminar
What to expect?
• Discuss latest developments and research issues,
current and possible future QoS/CoS technologies
• Queue management, traffic monitoring/shaping,
admission control, routing, …
• Focus on Internet extensions and QoS-sensitive
protocols/applications
• Integrated Services and RSVP
• Differentiated Services
• Multi Protocol Label Switching
• Traffic Engineering (or QoS Routing)
What to expect?
• Proposal:
pre-proposal, full proposal, presentation
• Paper presentation
• 2 quizzes
• Active participation
What is QoS?
• A broad definition: methods for differentiating traffic
and services
• To some, introducing an element of predictability and
consistency into a highly variable best-effort network
• To others, obtaining higher network throughput while
maintaining consistent behavior
• Or, offering resources to high-priority service classes at
the expense of lower-priority classes (conservation law)
• Or, matching network resources to application demands
What is Quality? Service? Why Integrated?
• Quality encompasses data loss, induced delay or latency,
consistency in delays (jitter), efficient use of network
resources, …
• Service means end-to-end communication between
applications (e.g., audio, video, Web browsing), a
protocol type (e.g., TCP, UDP), …
• A single network is better --- unused capacity is available
to others, one network to manage
• Better to give each user/traffic what it wants
• QoS mechanisms for unfairness: managing queuing
behavior, shaping traffic, controlling admission, routing, …
• Engineering the network is hard, and over-engineering it
is expensive (especially for long-distance links)
Why the recent interest in QoS?
• No longer enough for an ISP to keep traffic flowing, links
up, routing stable --- offer QoS to be more competitive
• QoS is becoming more cost-effective with improved
implementation of differentiation tools (classification,
queue-manipulation, …), more reliable tools for
measuring delivered QoS
• Hard to dimension the network as it is becoming
increasingly hard (if not impossible) to predict traffic
patterns (e.g., 80 local/20 remote rule no longer reliable,
now mostly reversed 25/75)
• If you throw more bandwidth at the network, there will be
more demand (a “vicious cycle”)
Applications of a QoS Network
• Real-time: voice, video, emergency control, stock quotes ...
• Non-real-time (or best-effort): telnet, ftp, …
• Real-time:
- hard with deterministic or guaranteed QoS: no loss, packet
delay less than deadline, difference in delays of 2 packets
less than jitter bound, …
Note: reducing jitter reduces buffers needed to absorb delay
variation at receiving host
- soft with statistical or probabilistic QoS: no more than x%
of packets lost or experience delay greater than deadline, …
• Best-effort traffic does not have such timing constraints
Why end-to-end control not enough?
• Problem: with common FCFS schedulers at routers, delay and delay
variance increase very rapidly with load
• For an M/M/1 model:
average delay = 1 / [ServiceRate - ArrivalRate]
= 1 / [ServiceRate (1 - Load)]
delay variance = 1 / [ServiceRate^2 (1 - Load)^2]
• As load increases, buffer overflows and router starts dropping packets
• Solution: TCP reduces load (slow start and congestion avoidance
algorithm)
• 2 TCP users on different hosts sharing the same bottleneck may get
different shares of the bandwidth (uncontrolled unfairness) ==>
users should not trust the network
• Some users may not “play by the rules” and reduce their sending rates
upon congestion, i.e., sources that are not TCP-friendly, such as voice
or video applications over UDP ==> the network should not trust users
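The blow-up of delay and delay variance with load can be seen by evaluating the M/M/1 formulas above directly. A minimal sketch (function names and the 1000 packets/sec service rate are illustrative):

```python
# Sketch: how M/M/1 average delay and delay variance grow with load.
# service_rate in packets/sec; load = ArrivalRate / ServiceRate.

def mm1_delay(service_rate, load):
    """Mean sojourn time: 1 / (ServiceRate * (1 - Load))."""
    return 1.0 / (service_rate * (1.0 - load))

def mm1_delay_variance(service_rate, load):
    """Sojourn time in M/M/1 is exponential, so variance = mean**2."""
    return 1.0 / (service_rate * (1.0 - load)) ** 2

for load in (0.5, 0.9, 0.99):
    print(load, mm1_delay(1000.0, load), mm1_delay_variance(1000.0, load))
```

Going from 50% to 99% load multiplies the average delay by 50 and the variance by 2500, which is why FCFS alone cannot keep delay predictable at high load.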
How to provide guarantees?
• Some form of resource reservation within the network;
hosts can’t “hide” delays
• Challenge: do it asynchronously (i.e., as needed), don’t
reserve peak rate for a voice connection/flow active 30% of
the time
• Want to support as many real-time flows as possible to
increase revenue
• Key question:
Can the network admit a new flow at its requested QoS,
without violating QoS of existing flows?
• A flow has to specify its QoS (Rspec) and also its traffic
pattern (Tspec)
Contract or SLA
• Service Level Agreement between client (subscriber) and
network (provider): the network keeps its promise as long as
the flow conforms to the traffic specification
• The network must monitor/police/shape incoming traffic
• The shape is important: e.g., a gigabit network contracting
with a 100 Mbps flow. There is a big difference between sending
one 100 Mb packet every second and sending a 1 Kb packet every
10 microsec.
Traffic Shaping
• Leaky bucket:
- Data packets leak from a bucket of depth “sigma” at a rate
“rho”.
- Network knows that the flow will never inject traffic at a
rate higher than “rho” --- incoming traffic is bounded
- Easy to implement
- Easy for the network to police
- Accommodates fixed-rate flows well (“rho” set to the rate),
but does not accommodate variable-rate (bursty) flows
unless “rho” is set to the peak rate, which is wasteful
Token Bucket
• Allows “bounded” burstiness
• Tokens generated at rate “rho” are placed in bucket with
depth “sigma”
• An arriving packet has to remove token(s) to be allowed
into the network
• A flow can never send more than [sigma + rho * t] over an
interval of time t
• Thus, the long-term average rate will not exceed rho
• More accurate characterization of bursty flows means better
management of network resources
• Easy to implement, but harder to police
• Can add a peak-rate leaky bucket after a token bucket
Token Bucket
• Example:
Flow A always sends at 1 MBps
==> rho = 1 MBps, sigma = 1 byte
Flow B sends at 0.5 MBps for 2 sec, then at 2
MBps for 1 sec, then repeats
==> rho = 1 MBps, sigma = 1 MB
We could also use this Tspec for Flow A, but that would
be an inaccurate characterization
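The sigma/rho characterizations above can be checked with a minimal token-bucket conformance test. This is a sketch in 1-second steps; the function name and the byte-per-step traffic patterns are illustrative:

```python
# Sketch: token-bucket (sigma, rho) conformance check in 1-second steps.
# traffic: bytes the flow sends in each step; rho in bytes/sec.

MB = 1_000_000

def conforms(traffic, rho, sigma):
    tokens = sigma                              # bucket starts full
    for sent in traffic:
        available = tokens + rho                # tokens arriving during this step
        if sent > available:
            return False
        tokens = min(sigma, available - sent)   # unused tokens capped at sigma
    return True

flow_a = [1 * MB] * 6                           # steady 1 MB/s
flow_b = [0.5 * MB, 0.5 * MB, 2 * MB] * 2      # 0.5 MB/s for 2 s, 2 MB/s for 1 s
print(conforms(flow_a, rho=1 * MB, sigma=1))        # True
print(conforms(flow_b, rho=1 * MB, sigma=1 * MB))   # True
print(conforms(flow_b, rho=1 * MB, sigma=1))        # False: burst needs the depth
```

Flow B's 2 MB burst fits exactly because it can spend the 1 MB of depth saved during the slow period plus the 1 MB generated during the burst second, matching the [sigma + rho * t] bound.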
Link Scheduling
• Challenge: Tspec may change at exit of scheduler (e.g., FCFS)
because of interactions among flows
• FCFS increases worst-case delay and jitter, so we admit fewer flows
• Solution: non-FCFS scheduler that isolates different flows or
classes of flows (hard, soft, best-effort)
• Emulate TDM without wasting bandwidth
• Virtual Clock provides flows with throughput guarantees and
isolation
• Idea: use logical (rather than real) time
• AR: average data rate required by a flow
• When a packet of size P arrives, VC = VC + P/AR
• Stamp packet with VC
• Transmit packets in increasing order of time stamps
• If a flow is assigned twice the AR, it will get double the rate
Virtual Clock
• If buffer is full, the packet with largest timestamp is
dropped
• Problem: A flow can save up credits and use them to
bump other flows in the future
• Fix: when a packet arrives, catch up with real time first
VC = MAX (real time, VC)
VC = VC + P/AR
• Also, if AI is averaging interval, upon receiving AR*AI
bytes on this flow, if VC > real time + Threshold, then
send advisory to source to reduce its rate
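The stamping rule above, with the catch-up fix, can be sketched as follows. Rates, packet sizes, and packet names are illustrative:

```python
# Sketch: Virtual Clock stamping with the "catch up with real time" fix.
# One virtual clock per flow; the link serves packets in stamp order.

class VirtualClock:
    def __init__(self, avg_rate):
        self.ar = avg_rate              # AR: average rate of the flow (bytes/s)
        self.vc = 0.0

    def stamp(self, size, now):
        self.vc = max(now, self.vc)     # a flow cannot bank credit while idle
        self.vc += size / self.ar       # VC = VC + P / AR
        return self.vc

# Flow 1 is assigned twice the rate of flow 2; all packets arrive at t = 0.
f1, f2 = VirtualClock(avg_rate=2000.0), VirtualClock(avg_rate=1000.0)
stamps = []
for flow, size, name in ((f1, 1000, 'a'), (f2, 1500, 'b'),
                         (f1, 1000, 'c'), (f2, 1500, 'd')):
    stamps.append((flow.stamp(size, now=0.0), name))
order = [name for _, name in sorted(stamps)]
print(order)    # ['a', 'c', 'b', 'd']: the higher-rate flow is served faster
```

Because flow 1's stamps advance at half the increment of flow 2's, both of its packets are transmitted before flow 2's second packet, which is the throughput guarantee and isolation the slide describes.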
WFQ
• WFQ provides isolation and delay guarantees
• FQ simulates fair bit-by-bit RR by assigning packets
priority based on finishing times under bit-by-bit RR
- E.g. Flow 1 packets of size 5 and 8, Flow 2 packet of size
10: size 5 first, then size 10, then size 8
• Round number increases at a rate inversely proportional to
number of currently active flows
• On packet arrival: recompute round number, compute finish
number, insert in priority queue, if no space drop packet
with largest finish number (max-min fairness)
• Approximation error bounded by max_pkt_size / capacity
• WFQ can assign different weights to different flows
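The finish-number computation can be sketched for the case where all packets arrive at once (round number R = 0), reproducing the slide's example. Function and parameter names are illustrative:

```python
# Sketch: FQ finish numbers when all packets arrive together (R = 0).
# packets: list of (flow, size); optional per-flow weights for WFQ.

def service_order(packets, weights=None):
    last_finish = {}
    stamped = []
    R = 0.0                                  # round number at arrival time
    for flow, size in packets:
        w = (weights or {}).get(flow, 1.0)
        f = max(R, last_finish.get(flow, 0.0)) + size / w
        last_finish[flow] = f
        stamped.append((f, flow, size))
    return [(flow, size) for f, flow, size in sorted(stamped)]

# Flow 1: packets of size 5 and 8; Flow 2: packet of size 10.
order = service_order([(1, 5), (1, 8), (2, 10)])
print(order)    # [(1, 5), (2, 10), (1, 8)]
```

The finish numbers are 5, 13, and 10 respectively, so the size-5 packet goes first, then the size-10, then the size-8, as in the example.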
Computing Deterministic Delay Bounds
• If flow is shaped by a token bucket (sigma, rho), all routers
along the “h”-hop path employ WFQ schedulers, and the
flow is assigned a rate of rho at each hop, then end-to-end
delay of a packet is bounded by:
(sigma / rho) + (h * max_pkt_size / rho) +
total approximation error + total propagation delay
• A flow is totally isolated, even if other traffic is not shaped
at all
• Cons: bandwidth and delay are coupled
• WFQ does not bound jitter
• A non-work-conserving scheduler (e.g., Stop-and-Go, Jitter-EDD)
may be used to bound jitter
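The bound above is easy to evaluate numerically. A sketch, where all the parameter values are made up for illustration and the approximation error is taken as max_pkt_size / capacity per hop:

```python
# Sketch: numeric WFQ end-to-end delay bound for a (sigma, rho) flow:
#   sigma/rho + h * max_pkt/rho + total approx error + total prop delay

def wfq_delay_bound(sigma, rho, hops, max_pkt, capacity, prop_delay):
    return (sigma / rho
            + hops * max_pkt / rho
            + hops * max_pkt / capacity     # total approximation error
            + prop_delay)                   # total propagation delay

# 16 KB bucket, 1 MB/s reserved rate, 5 hops, 1500-byte packets,
# 1 Gb/s (125 MB/s) links, 10 ms total propagation delay
bound = wfq_delay_bound(sigma=16_000, rho=1_000_000, hops=5,
                        max_pkt=1500, capacity=125_000_000, prop_delay=0.010)
print(round(bound, 6), "seconds")   # 0.03356 seconds
```

Note how the first two terms dominate and both scale with 1/rho: this is the bandwidth-delay coupling listed as a con, since getting a smaller delay bound forces a larger reserved rate.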
Earliest Due Date (EDD)
• Unlike WFQ, delay bound is independent of bandwidth
requirement
• Packet with earliest deadline selected
• EDD prescribes how to assign deadlines to packets
• Deadline = expected arrival time + delay bound
• “Expected” arrival time is the time the packet should
arrive according to traffic specification (to deal with
bunching of packets downstream)
• Delay bound is the smallest feasible bound, computed
assuming worst-case arrival pattern from all flows
[Figure: EDD example with two flows, Xmin1 = 10 and Xmin2 = 4]
Bounding jitter
• Assume we want to eliminate all jitter
• We can achieve this by making the network look like a
constant delay line
• Idea: At the entry of each switch/router, have a regulator
that absorbs delay variations by delaying a packet that
arrived ahead of its local deadline at previous switch
• Traffic characteristics are preserved as traffic passes through
the network
• Schedulers with regulators are called “rate-controlled”
schedulers
• Reduce burstiness within network, thus less buffers needed
• Average packet delay is higher than with work-conserving
schedulers, but that’s fine for hard real-time traffic
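The per-hop regulator can be sketched very simply, assuming each packet carries how far ahead of its local deadline it left the previous switch (names and numbers are illustrative):

```python
# Sketch: regulator for a rate-controlled scheduler. A packet that left
# the previous switch early is held for that long before being scheduled,
# restoring the flow's original spacing.

def release_times(arrivals, ahead):
    """arrivals[i]: arrival time at this switch; ahead[i]: how early
    packet i finished at the previous switch."""
    return [a + h for a, h in zip(arrivals, ahead)]

# Two packets bunched together because the first one got lucky upstream:
print(release_times([10.0, 10.5], [2.0, 0.0]))   # [12.0, 10.5]
```

Holding the early packet until t = 12.0 re-creates the spacing the traffic had at the network entry, which is why traffic characteristics are preserved hop by hop.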
Statistical/soft/predictive QoS
• Goal: bound the tail of the distribution of the performance
measure (e.g., delay)
• Not a good idea to use worst-case delay bounds since
very few packets (or none!) will actually experience
this worst-case delay
• Computing statistical bounds (e.g., using effective
bandwidths) is usually approximate and often
conservative
• FIFO+ attempts to reduce worst-case delay and jitter
using minimal isolation (and maximal statistical gain)
• At each router, a packet is assigned lower priority if it
left previous routers ahead of measured average delay,
and higher priority if behind average delay
Admission Control with Statistical Guarantees
• Key idea (law of large numbers): as capacity increases,
number of flows that can be supported increases, the
probability that a significant number of flows burst
simultaneously decreases, so the sum of peak rates can
be higher than the available capacity
• As the number of flows increases, the capacity allocated
to each flow is “effectively” its average rate
• Put in enough buffers to make probability of loss low
Measurement-based Admission
• Key assumption: past behavior is a good indicator of
the future
• User tells peak rate
• If peak rate + measured average rate < capacity, admit
• Over time, new call becomes part of average
• Can afford to make some errors for predictive (or
controlled load) service
• Obviously, can admit more calls than admission at
peak rate
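The admission test above is a one-liner. A sketch with illustrative names and made-up rates:

```python
# Sketch: measurement-based admission test. measured_avg is the measured
# average rate of already-admitted flows; the new flow declares its peak.

def admit(peak_rate, measured_avg, capacity):
    """Admit if the new flow's peak plus the measured average fits."""
    return peak_rate + measured_avg < capacity

# 100 Mb/s link currently carrying a measured average of 60 Mb/s:
print(admit(peak_rate=30e6, measured_avg=60e6, capacity=100e6))   # True
print(admit(peak_rate=50e6, measured_avg=60e6, capacity=100e6))   # False
```

Admitting against the measured average rather than the sum of declared peaks is what lets this scheme carry more calls than peak-rate admission, at the cost of occasional errors that predictive/controlled-load service tolerates.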
Summary
• Performance guarantees can be achieved by combining
traffic shaping and scheduling schemes
• How good are the bounds? Loose or tight?
• How easy is it to implement these schemes?
• What kind of guarantees do they provide?
• How good is the utilization of the network?
• How do clients signal their QoS requirements?
• What is the best path to route a flow?
• How to achieve QoS for multicast flows and with
mobility?
RSVP ReSerVation Protocol
• Designed to handle multicast flows, where multiple receivers may
have different capabilities
• RSVP is receiver-initiated, i.e. receiver generates reservation
• A PATH message travels downstream installing path state; a RESV
message makes the reservation (travels back upstream as far as necessary)
• RSVP supports merging of reservations
• RSVP is decoupled from routing
• PATH and RESV messages periodically sent to refresh state before
timeout (i.e. soft state)
• Multiple senders can share the same reservation
• In contrast, ATM signaling (Q.2931) is sender-initiated, coupled with
routing, uses hard state (i.e. explicit delete), and handles only uniform
QoS
Where to Implement Policies in a Network Topology?
Or, QoS Architectures
• Key to scaling is to maintain levels of hierarchy: core, distribution,
and access
• Link speeds should be faster toward the core of the network
• Access routers generally do not have to handle high packet switching
rates and thus can do complex traffic identification, classification,
policing and marking
• The overhead of implementing QoS policies in the core would affect a
large amount of traffic
• Access routers can mark the IP precedence field in the packet header
• Core routers simply map precedence to queuing or drop actions
• This is similar to the IETF DiffServ architecture (as opposed to the
“less scalable” per-flow IETF IntServ architecture)
Tension: Scalability versus QoS
• Scalability calls for aggregating flows into precedence
classes
• Also, aggregating destinations results in fewer routing
entries, but less granularity of routing decisions
• Performing disaggregation gives finer granularity at a
higher cost
• E.g., more detailed routing advertisements can allow more
granular choice of paths that better satisfy QoS
requirements
• OSPF can advertise multiple costs per link, and compute
multiple shortest paths
• Or, loose source route through a particular service
provider, or multiple addresses/names per host
Other QoS Requirements/Issues
• Pricing/Billing
- can shift demand to off-peak times
- charge more during peak hours
- rational users help the network by helping themselves
• Privacy
• Reliability and availability
• Operating systems support
- reduce costs of data copying, interrupts, …
- real-time CPU scheduling, ...