Endsystem_v2 - Washington University in St. Louis

Endsystem Support for
Network Virtualization
Fred Kuhns
Washington
WASHINGTON UNIVERSITY IN ST LOUIS
Overview
• Context
• Endsystem networking model
• Protocol instances: user or kernel space
– pros and cons
– explore user space protocols
– propose kernel level model
Fred Kuhns - 4/8/2016
Context: Virtual (Diversified) Networking
[Figure labels: substrate link, substrate router, virtual router, virtual link, virtual end-system]
Simulates Star Topology for Substrate Links
[Figure: hosts attached through VLANX1, VLANX2, …, VLANXN across an ethernet switched LAN to vNetX VR1]
Internetworking over a diversified network
Ethernet example:
• VLANs are used to provide the equivalent of a virtualized “wire” connecting an endsystem to a specific access router.
• All vnets on an endsystem share a common VLAN.
• Use priority queuing (802.1P/Q) to isolate vnet traffic.
• Use admission control (static or dynamic) to provide bandwidth guarantees to vnet traffic.
• The substrate layer on endsystems enforces per-VLAN and per-vnet bandwidth constraints.
• Each host-to-substrate-router connection is assigned a distinct VLAN, so N hosts imply N VLANs on the ethernet.
• An alternative is to define one VLAN tree for each protocol suite (i.e. vnet).
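The VLAN-per-host and 802.1P/Q priority points above both rest on the 4-byte 802.1Q tag carried in each ethernet frame. A minimal sketch of packing and unpacking that tag follows. The function names and the VLAN ID and priority values are illustrative assumptions, not part of the slides; the field layout (TPID 0x8100, then a 3-bit priority, a 1-bit DEI and a 12-bit VLAN ID) is the standard one.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def pack_vlan_tag(vlan_id: int, priority: int, dei: int = 0) -> bytes:
    """Build the 4-byte tag: TPID, then PCP (3 bits), DEI (1 bit), VID (12 bits)."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority (PCP) must fit in 3 bits")
    tci = (priority << 13) | ((dei & 1) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def unpack_vlan_tag(tag: bytes) -> tuple[int, int, int]:
    """Return (vlan_id, priority, dei) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci & 0xFFF, tci >> 13, (tci >> 12) & 1


# Hypothetical example: host 1 uses VLAN 101 and sends vnet traffic at priority 5.
tag = pack_vlan_tag(vlan_id=101, priority=5)
```

A switch that honours the priority bits can then place frames from the vnet in a higher-priority queue than untagged or priority-0 traffic on the same LAN.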
vnetX traffic uses high priority queues
[Figure: Ethernet hub with High and Low priority TX queues on each port; vNetX VR1 attached]
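The high and low TX queues in the figure can be sketched as a two-queue output port. The slide does not say which scheduling discipline is used, so strict priority is assumed here; the class and method names are hypothetical.

```python
from collections import deque


class PriorityPort:
    """One output port with a high and a low priority TX queue.

    Strict priority is an assumption: the low queue is served only
    when the high queue is empty.
    """

    def __init__(self):
        self.high = deque()
        self.low = deque()

    def enqueue(self, frame, is_vnet: bool):
        # vnetX traffic goes to the high priority queue, everything else to low.
        (self.high if is_vnet else self.low).append(frame)

    def dequeue(self):
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None  # nothing to transmit


port = PriorityPort()
port.enqueue("best-effort-1", is_vnet=False)
port.enqueue("vnetX-1", is_vnet=True)
port.enqueue("best-effort-2", is_vnet=False)
order = [port.dequeue() for _ in range(4)]
```

Strict priority isolates vnet traffic from best-effort load, but it can starve the low queue, which is one reason admission control on the high priority class matters.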
Substrate Link as a VLAN Tree
[Figure labels: ethernet switched LAN, VLANX]
• One VLAN is used for all virtual net traffic
to/from a substrate router.
Multiple Substrate Links
[Figure: ethernet switched LAN carrying three VLANs: VLANdgram, VLANmed, VLANhigh]
• Three VLANs are used for all virtual net traffic to/from a substrate router.
• Corresponds to 3 substrate links:
1. Low priority: default for best-effort traffic
2. Medium priority: for virtual nets with soft performance requirements (average bandwidth)
3. High priority: for isochronous or low-delay, interactive applications
Multiple vNets per Host
[Figure: endpoints labeled “ether addr/vlan” on vlan 1, vlan 2 and vlan 3 of an ethernet LAN; the substrate router filters on ethernet address and vlan membership; VLIs identify the virtual links]
• Substrate link: serves to connect an endsystem to a substrate router. Virtualization of a physical cable or wire. A packet enters one end, exits the other and is opaque within. Simplex or Duplex?
• Substrate interface: (need better term?) endsystem abstraction representing a substrate link.
– Ethernet: <interface, VLAN, dest>.
– Could be an IP tunnel.
– Not required to be point-to-point.
• Virtual link: represents the logical interconnection of adjacent network nodes for a given protocol suite.
– Point-to-point. Simplex or Duplex?
• Virtual interface: endsystem abstraction representing one end of a virtual link. The substrate defines the mechanism for multiplexing onto a common substrate link, for example a virtual link identifier (VLI) in a substrate header. Simplex or Duplex?
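The four definitions above map naturally onto two small data structures plus a demultiplexer keyed by VLI. The sketch below is one possible reading, not the author's design: the class names, the 16-bit VLI field and the header layout are all assumptions made for illustration, since the slides do not define the substrate header format.

```python
import struct
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SubstrateInterface:
    """Endsystem view of a substrate link; on Ethernet: <interface, VLAN, dest>."""
    device: str    # e.g. "eth0"
    vlan: int
    dest_mac: str  # ethernet address of the substrate router


@dataclass
class VirtualInterface:
    """One end of a virtual link, multiplexed onto a substrate link by its VLI."""
    name: str
    vli: int
    substrate: SubstrateInterface
    received: list = field(default_factory=list)

    def encapsulate(self, payload: bytes) -> bytes:
        # Hypothetical substrate header: a 16-bit VLI in network byte order.
        return struct.pack("!H", self.vli) + payload


class SubstrateDemux:
    """Delivers packets arriving on a substrate link to the matching virtual interface."""

    def __init__(self):
        self._by_vli = {}

    def attach(self, vif: VirtualInterface):
        self._by_vli[vif.vli] = vif

    def deliver(self, packet: bytes) -> bool:
        if len(packet) < 2:
            return False  # too short to carry a VLI: drop
        (vli,) = struct.unpack("!H", packet[:2])
        vif = self._by_vli.get(vli)
        if vif is None:
            return False  # unknown VLI: drop
        vif.received.append(packet[2:])
        return True


link = SubstrateInterface("eth0", vlan=101, dest_mac="00:11:22:33:44:55")
vint0 = VirtualInterface("vint0", vli=1, substrate=link)
vint1 = VirtualInterface("vint1", vli=2, substrate=link)
demux = SubstrateDemux()
demux.attach(vint0)
demux.attach(vint1)
delivered = demux.deliver(vint0.encapsulate(b"hello"))
```

Both virtual interfaces share one substrate interface here, which mirrors the statement that all vnets on an endsystem can share a common VLAN while remaining separable by VLI.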
Multiple next hop VRs?
[Figure: Host A on vnetX connects over an ethernet switched LAN to vNetX VR1, VR2 and VR3 through VLANXA1, VLANXA2 and VLANXA3]
• Not a fundamental part of the model, but it is consistent with the current model used for TCP/IP in endsystems.
• Allows us to implement TCP/IP as a virtual net protocol and not change the basic model.
TCP/IP as an Example Protocol
IP Route Table:
destination prefix    gateway
192.168.12.0/24       *
0.0.0.0               192.168.12.254
…
Interface entries (column labels as extracted: virtual interface, substrate interface, address):
eth0     ARP
vint0    (eth0, VLAN)    VLI,dst
[Figure: vint0 layered on a standard ethernet interface (eth0 + VLANX); LL Info = SR1 addr + VLI; ethernet device directly connected to VLANX on an ethernet LAN; substrate router SR1; frame fields: ethernet dest. addr, VLAN, VLI, IP]
Substrate Interface:
– Ethernet interface. Destination address by ARP.
– Directly connected: destination IP address + ARP = enet addr
– Gateway: (Gateway’s IP + ARP = enet addr) + VLAN
Virtual Interface:
– Directly connected: Not used, model only for internetworking
– Gateway: VLI assigned by substrate.
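The route table above can be exercised with a small longest-prefix-match lookup. This is a sketch under stated assumptions: the `0.0.0.0` row is read as the default route `0.0.0.0/0`, the directly connected prefix is assumed to use `eth0`, and the gateway route is assumed to use `vint0`. The `Route` type and `lookup` function are hypothetical names.

```python
import ipaddress
from typing import NamedTuple, Optional


class Route(NamedTuple):
    prefix: ipaddress.IPv4Network
    gateway: Optional[ipaddress.IPv4Address]  # None means directly connected ("*")
    interface: str


# Assumed reading of the slide's table; interface assignments are illustrative.
ROUTES = [
    Route(ipaddress.ip_network("192.168.12.0/24"), None, "eth0"),
    Route(ipaddress.ip_network("0.0.0.0/0"),
          ipaddress.ip_address("192.168.12.254"), "vint0"),
]


def lookup(dst: str) -> tuple[str, str]:
    """Return (interface, next-hop address) using longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in ROUTES if addr in r.prefix]
    if not matches:
        raise LookupError(f"no route to {dst}")
    best = max(matches, key=lambda r: r.prefix.prefixlen)
    # Directly connected: resolve the destination itself; otherwise the gateway.
    next_hop = best.gateway if best.gateway is not None else addr
    return best.interface, str(next_hop)


local = lookup("192.168.12.7")
remote = lookup("10.1.2.3")
```

In the model on the slide, the next-hop address returned for the direct case would be resolved by ARP, while the gateway case would resolve to the substrate router's ethernet address plus the VLI assigned by the substrate.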
OS Kernel Block Diagram
[Figure: OS kernel block diagram. Labels, top to bottom: User Space (Applications); AST Processing; File Interface (ops, FS management, open files, buffer cache); Socket Interface (TCP module with TCP1, TCP2, …, TCPn; UDP; RAW IP; IP; routes; TC/qdisc); Basic I/O Interface; Device independent I/O; hardware independent layer (task management, util tasks, scheduler, SW int (AST), callback, poll, callout Q, clock handler, process accounting, scheduling, time management); hardware dependent layer (device drivers: uart, timer, ethernet eth0 with txqueue and rxqueue; Interrupt Processing; OS ISR demux; System Exception handlers; configuration: registers, MMU (TLB, cache, VM), bus and peripherals); Hardware; HW interrupt/Exception]
User or kernel Space protocols?
• Each has pros and cons
• User space protocols:
– easier to implement and debug
– easier to introduce new protocols (not tightly dependent on the socket layer knowing about the new protocol)
– easier to isolate and protect protocols and apps from each other (leverage the process model)
• Kernel level protocols:
– easier to integrate into the existing framework (simplifies support for system interface functions like select/poll)
– simplifies intra-protocol security and protection (since the protocol runs within the trusted kernel)
– simplifies kernel demultiplexing to the correct protocol context (endpoint)
– increased efficiency
User Space Protocol Implementation
• Uncommon outside of the high-performance community, which wants zero-copy and specialized demux keys.
• Problems: asynchronous processing, life cycle, authentication and demultiplexing to endpoints
– latency in delivering packets (e.g. acks) to user space
– increased overhead in per-packet processing before a drop/keep decision is made
– processing received acks
– timeouts and retransmissions
– establishing connections and security: snooping, masquerading
– supporting select and poll
– protocols where the connection may outlive the process (TCP’s TIME_WAIT)
– global routing and address resolution tables
– global connection tables
• need to know what other ports are being used (locally)
• accepting/rejecting new connections
Assumptions
• Assumptions:
– Applications using different VNs (or no VN) will need
to communicate using the various IPC mechanisms
– We want to manage all aspects of Network I/O but not
the use of other traditional resources (memory, files etc)
– CPU, memory and interface bandwidth controlled at
the virtual net granularity
– within a VN, implementers should have the mechanisms to support QoS and security
– simple mechanism for adding new protocols/VNs
User Space Protocols
• Chandramohan A. Thekkath, Thu D. Nguyen, Evelyn Moy, Edward D. Lazowska, Implementing network protocols at user level, IEEE/ACM Transactions on Networking (TON), v.1 n.5, p.554-565, Oct. 1993
• Chris Maeda, Brian Bershad, Protocol Service Decomposition for High-Performance Networking, Proceedings of the 14th ACM Symposium on Operating Systems Principles, December 1993, pp. 244-255
• Aled Edwards, Steve Muir, Experiences implementing a high performance TCP in user-space, Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication, p.196-205, 1995
• Kieran Mansley, Engineering a User-Level TCP for the CLAN Network, Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence: experience, lessons, implications, pp. 228-236, 2003
User-space protocols: Global Issues
• Routing: direct packets to/from the correct endpoint/interface
– How is traffic demultiplexed and sent to the correct endpoint/process?
• In-kernel filters
– Where are the routing tables and how are they maintained?
• route fixed when connection established, or located in shared memory
• Control (using IPv4 as an example):
– Address resolution protocols/tables?
– Other control protocols. For example ICMP, IGRP, others?
– Where are the routing protocols implemented?
• Management:
– Must manage a protocol's namespace (for example, port numbers in IPv4).
– Common programming technique: allow the protocol instance to select the local address part
• specify port = 0 and addr = 0, then the implementation will assign correct values
– Passive connect model?
• In IPv4 a server listens on a port (host:port:proto) for a connection request. To establish a connection, a port number unique to the endsystem is assigned and a new socket is allocated.
– Socket-oriented system calls must be supported. On UNIX, non-blocking I/O with select and poll must be supported.
– Connection lifetime may outlast the process.
• For example TCP TIME_WAIT, or simply waiting for a final ack or resending if no ack is received.
• Security: we must provide sufficient mechanisms for protocol developers
– implementations must be able to guard against masquerading and eavesdropping
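The "specify port = 0" technique under Management is what the standard sockets API already does for IPv4, and a user-space protocol would need an equivalent service from whoever owns its namespace. A minimal sketch using Python's `socket` module follows; the helper name `bind_ephemeral` is hypothetical, and the loopback address is chosen only to keep the example local.

```python
import socket


def bind_ephemeral(host: str = "127.0.0.1") -> tuple[str, int]:
    """Bind a UDP socket with port 0 and report the port the system assigned."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind((host, 0))  # port 0 asks the implementation to choose a free port
        addr, port = s.getsockname()
        return addr, port


addr, port = bind_ephemeral()
```

The assigned port is unique on the endsystem only because a single authority (here the kernel) tracks every port in use, which is the global state a user-space design has to recreate.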
User Space: Configurations
• Given these global issues there are two likely configurations:
– all traffic passes through a common protocol daemon in user space
– a control daemon implements a basic set of control functions while a user library implements the majority of data path functions
– prior work has shown the latter approach to be superior.
• Having all traffic pass through a common protocol daemon implies at least one extra copy operation (kernel -> daemon -> user process).
• A better solution is for a daemon to insert relatively simple packet filters in the kernel for established connections; the filters direct incoming packets to endpoints and filter outgoing packets from them.
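The filter-based configuration can be sketched as a table that only the control daemon updates and that the data path consults on every packet. Everything below is an illustrative assumption: the slides do not specify the filter format, so a (VLI, local, remote) tuple with an optional wildcard remote is used, and the names `Filter` and `FilterTable` are hypothetical.

```python
from typing import NamedTuple, Optional


class Filter(NamedTuple):
    vli: int
    local: str             # local address part, e.g. "hostA:5000"
    remote: Optional[str]  # None matches any remote (unconnected endpoint)


class FilterTable:
    """Stand-in for per-connection kernel filters inserted by the control daemon."""

    def __init__(self):
        self._filters = {}

    def insert(self, flt: Filter, endpoint: str):
        self._filters[flt] = endpoint

    def remove(self, flt: Filter):
        self._filters.pop(flt, None)

    def demux_incoming(self, vli: int, dst: str, src: str) -> Optional[str]:
        # Prefer an exact connection match, then fall back to a local-only filter.
        for key in (Filter(vli, dst, src), Filter(vli, dst, None)):
            if key in self._filters:
                return self._filters[key]
        return None  # no filter matched: drop

    def allow_outgoing(self, endpoint: str, vli: int, src: str, dst: str) -> bool:
        # An endpoint may only send packets that match one of its own filters.
        for flt, owner in self._filters.items():
            if (owner == endpoint and flt.vli == vli and flt.local == src
                    and flt.remote in (None, dst)):
                return True
        return False


table = FilterTable()
table.insert(Filter(7, "hostA:5000", "hostB:80"), "proc-1")
table.insert(Filter(7, "hostA:6000", None), "proc-2")
```

Checking outgoing packets against the sender's own filters is what prevents one process from masquerading as another endpoint, and the data path never passes through the daemon, so the extra copy is avoided.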
User-Space: Passive Open
[Figure: vnetX application with protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; vnet demux and connection filters; ethernet]
0. listen/accept (passive open)
1. connection request (in)
2. ack (out)
3. insert incoming and outgoing filters for vnetX connection
4. new connection
5. data, established connections (data copy)
Outgoing: compare against connection specific outgoing filter.
Incoming: use VLI to access incoming filters and use to demux to filter set and/or socket.
User-Space: Active Open
[Figure: vnetX application with protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; vnet demux and connection filters; ethernet]
0. connect
1. connection request (out)
2. ack (in)
3. insert incoming and outgoing filters for vnetX connection
4. new connection
5. data, established connections (data copy)
Outgoing: compare against connection specific outgoing filter.
Incoming: use VLI to access incoming filters and use to demux to filter set and/or socket.
User-Space: Datagram (Connectionless)
Daemon fills in local address and binds to socket. No restrictions on destination.
[Figure: vnetX application with protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; vnet demux and connection filters; ethernet]
0. open(any)
1. insert incoming and outgoing filters for vnetX connection
2. new connection (local address)
3. data, established connections (data copy)
Outgoing: compare against “connection” specific outgoing filter.
Incoming: use VLI to access incoming filters and use to demux to socket. In this case only the local part is used.
User-Space: Datagram (Connectionless)
Daemon fills in both local and destination addresses. Destination restricted.
[Figure: vnetX application with protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; vnet demux and connection filters; ethernet]
0. open(local and remote addr)
1. insert incoming and outgoing filters for vnetX connection
2. new connection (local and remote)
3. data, established connections (data copy)
Outgoing: compare against “connection” specific outgoing filter.
Incoming: use VLI to access incoming filters and use to demux to socket.
User-Space: App exits
TCP enters TIME_WAIT after close.
[Figure: vnetX application with protocol library; vnetX control daemon (namespace, lifecycle, connections); socket layer; vnet demux; connection filters (labeled “drop”); ethernet]
1. connection close (out)
2. ack (in/out)
3. remove filters
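Because the connection can outlive the application, the control daemon is the natural owner of its remaining lifetime. The sketch below is a toy model of that idea: the class name, the explicit `now` clock argument and the 60 second wait are assumptions chosen for illustration, not values from the slides.

```python
class ControlDaemon:
    """Toy model of a control daemon that owns connection state after the app exits."""

    TIME_WAIT_SECONDS = 60.0  # illustrative value, not taken from the slides

    def __init__(self):
        self.filters = set()  # connections that currently have filters installed
        self.time_wait = {}   # connection -> time at which the wait expires

    def open(self, conn):
        self.filters.add(conn)

    def app_closed(self, conn, now: float):
        # Steps 1-2: close and acks are exchanged; the connection then lingers.
        if conn in self.filters:
            self.time_wait[conn] = now + self.TIME_WAIT_SECONDS

    def tick(self, now: float):
        # Step 3: once the wait expires, remove filters so later packets are dropped.
        for conn in [c for c, t in self.time_wait.items() if t <= now]:
            del self.time_wait[conn]
            self.filters.discard(conn)

    def accepts(self, conn) -> bool:
        return conn in self.filters


daemon = ControlDaemon()
daemon.open("hostA:5000<->hostB:80")
daemon.app_closed("hostA:5000<->hostB:80", now=0.0)
daemon.tick(now=59.0)
still_open = daemon.accepts("hostA:5000<->hostB:80")
daemon.tick(now=60.0)
after_expiry = daemon.accepts("hostA:5000<->hostB:80")
```

Passing the clock in explicitly keeps the sketch deterministic; a real daemon would drive `tick` from a timer.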
Extensible protocol frameworks in the kernel
• Herbert Bos, Bart Samwel, Safe Kernel
Programming in the OKE, Proceedings of the
fifth IEEE Conference on Open Architectures and
Network Programming, June 2002
OKE
• Context: For performance reasons it is useful to permit third parties to load optimized modules into the kernel.
• Problem: Third-party code is untrusted, so loading it into the kernel will compromise system security and reliability. Could use a safe execution environment like Java, but that incurs expensive runtime checks.
• Solution: Create a set of mechanisms and policies to permit non-root users to safely load untrusted application modules into kernel space with minimal impact on runtime performance.
– Safety: use a trusted compiler to enforce policies (constraints). The constraints are designed to ensure the untrusted module will not adversely affect the kernel (core and loadable modules) or unrelated processes.
– User privileges: vary enforced constraints based on user privileges (customizable language)
– Termination: well-defined termination boundaries to protect system state
– Enforcement: static and dynamic checks; language extensions
– Ease of use: familiar development environment using Cyclone (type-safe C extension) and kernel module.
• Contribution: definition of a safe kernel programming environment that meets competing needs:
– performance
– safety
– ease of use
– hosted in a commodity OS
Considerations
• Identified areas where modules may impact system
behavior
1. Program correctness: language restrictions for safety and to enforce coding conventions
2. Memory access: static and dynamic enforcement of
memory access rules
3. Kernel module access: static and dynamic enforcement
of kernel module (interface) access restrictions
4. Resource usage: Bounded (deterministic or limited)
Pushing protocols into the Kernel
• Positives:
– All the issues associated with user-space protocols simply go away: tables are global and have the lifetime of the kernel
– Performance, efficiency, existing code base
– Enhances intra-Protocol security
– Simplifies integration with existing network I/O subsystems and
interfaces
• Negatives:
– Isolation: More difficult to isolate system from protocol
instances. Inter-protocol isolation difficult.
– Security: Proving trust/security more difficult
– Implementation and debugging more difficult in kernel
Kernel-Space Protocols
[Figure (marked “Rework!” on the slide): kernel-space protocols. Labels: Application(s) in User Space using /dev/protoX, /dev/vnet, vnet:ep, tcp:port, udp:port, rawIP; File Interface (ops, FS management, open files, buffer cache); I/O Interface; Socket Interface and Socket I/O Interface with PF_VNET (vnet ops, vnet Proto state tables, …) and PF_INET (TCP/IP: TCP1, TCP2, …, TCPn, UDP, RAW IP, IP, routes, route to interface); SW Interrupt; HW Interrupt; ethernet; vnet Demux; VLAN; eth device driver; eth0; Hardware; HW interrupt/Exception]