Piglet: An Operating System for Network Appliances

LiNK: An Operating System Architecture
for Network Processors
Steve Muir, Jonathan Smith
Princeton University, University of Pennsylvania
([email protected], [email protected])
The Network Processor environment
• Many types of network device include a processor
• High performance host NICs e.g., gigabit cards
• Remote management (ILO) devices
• Router line cards
• Virtual network devices (more on this later...)
Example applications
• High-speed packet processing
• Traffic shaping
• Firewall/intrusion detection
• Monitoring/logging
• Remote management
Why a network processor OS?
• Device-independent portability layer
– run the same application on diverse network processors
• Hardware abstraction
– hide details of network processor environment from app.
• Isolation between components
– protect services and drivers from bugs
• Multitasking
– simplifies the programming task
• Library of common functionality
– memory management, profiling, logging, etc.
LiNK: The Lightweight Network Kernel
• Simple, event-driven main loop
– interrupt handlers scheduled as events
– synchronous processing reduces complexity of kernel
• Network service components
– take advantage of simple, uniform application structure
• Posted service requests
– efficient communication between LiNK and clients
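As an illustration of the event-driven main loop described above, here is a minimal sketch in Python. It is not LiNK's code: the names `EventLoop`, `raise_interrupt` and `on_rx` are hypothetical. It assumes only what the slide states, namely that interrupt handlers are scheduled as events and processed synchronously.

```python
from collections import deque


class EventLoop:
    """Toy single-threaded loop: an 'interrupt' only enqueues an event."""

    def __init__(self):
        self.queue = deque()
        self.log = []  # order in which handlers actually ran

    def raise_interrupt(self, name, handler):
        # No work happens at interrupt time; the handler is just scheduled.
        self.queue.append((name, handler))

    def run(self):
        # Handlers run one at a time, each to completion, in arrival order,
        # so no handler ever observes another one half-finished.
        while self.queue:
            name, handler = self.queue.popleft()
            self.log.append(name)
            handler(self)


def on_rx(loop):
    # A handler may schedule follow-up work; it runs later in the same loop.
    loop.raise_interrupt("tx-done", lambda l: None)


loop = EventLoop()
loop.raise_interrupt("rx", on_rx)
loop.raise_interrupt("timer", lambda l: None)
loop.run()
```

Because every handler runs to completion before the next starts, the sketch needs no locks around its queue. That is one reading of "synchronous processing reduces complexity of kernel".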
Network Service Components
• Fundamental structure shared by all applications
• Three elements:
1. Transmit filter – add packet headers, packet scheduling
2. Receive handler – generate response packets
3. Timeout callback – flush caches, packet retransmit
• Service functions scheduled by kernel
– functions run to completion or explicit yield
• Many services fit into this model
– ARP, ICMP, traffic shaping
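The three-element structure above can be sketched as a small Python class. This is a hypothetical example, not a LiNK service: `EchoService`, its header bytes and its cache policy are assumptions chosen only to show one transmit filter, one receive handler and one timeout callback side by side.

```python
class EchoService:
    """Hypothetical service exposing the three elements named on the slide."""

    def __init__(self, header=b"HDR"):
        self.header = header
        self.cache = {}  # packet -> time it was last seen

    def transmit_filter(self, payload):
        # Transmit filter: add a packet header to outgoing data.
        return self.header + payload

    def receive_handler(self, packet, now):
        # Receive handler: record the packet and generate a response packet.
        self.cache[packet] = now
        return self.transmit_filter(b"reply:" + packet)

    def timeout_callback(self, now, max_age=10):
        # Timeout callback: flush cache entries older than max_age.
        stale = [k for k, seen in self.cache.items() if now - seen > max_age]
        for k in stale:
            del self.cache[k]
        return len(stale)


svc = EchoService()
response = svc.receive_handler(b"ping", now=0)
flushed = svc.timeout_callback(now=20)
```

Each method is short and returns without blocking, which matches the slide's requirement that service functions run to completion or yield explicitly.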
Posted Service Request object
Posted Service Request processing
• Client posts to shared object - asynchrony
• Kernel polls and processes - concurrency
• Chaining required for efficiency
• Speculation and reference arguments
• Events for completion and failure notification
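The posting and polling steps above can be sketched as follows. This is a minimal Python model under stated assumptions: `SharedQueue`, `client_post`, `kernel_poll` and the `add`/`div` operations are hypothetical names, and an in-process deque stands in for the shared object.

```python
from collections import deque


class SharedQueue:
    """Stands in for the shared object visible to both client and kernel."""

    def __init__(self):
        self.requests = deque()  # written by the client, drained by the kernel
        self.events = deque()    # written by the kernel, read by the client


def client_post(shared, op, *args):
    # Posting only enqueues the request; the client does not wait for it.
    shared.requests.append((op, args))


def kernel_poll(shared, ops):
    # The kernel drains whatever has been posted and reports each outcome
    # as a completion or failure event.
    while shared.requests:
        op, args = shared.requests.popleft()
        try:
            shared.events.append(("complete", op, ops[op](*args)))
        except Exception as exc:
            shared.events.append(("failed", op, type(exc).__name__))


ops = {"add": lambda a, b: a + b, "div": lambda a, b: a // b}
shared = SharedQueue()
client_post(shared, "add", 2, 3)
client_post(shared, "div", 1, 0)
kernel_poll(shared, ops)
```

The client learns about the failed division only through the event queue, never through a return value, so posting stays asynchronous.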
Speculation
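[Figure: a chain of posted requests. Two Alloc_Mem requests (argument 12) and a Make_Frameset request each carry Cap_Ref_New; Make_Frameset takes two <Reference> arguments, and Set_Peer takes a <Reference> and 192.168.0.1. Arrows are labelled "Data Dependency".]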
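One way to read the speculation slide is that a client posts a whole chain at once, naming results that do not exist yet, and the kernel resolves those names as it goes. The Python sketch below illustrates that reading only. `Ref`, `process_chain` and the bodies given to `Alloc_Mem`, `Make_Frameset` and `Set_Peer` are assumptions, not LiNK's actual semantics.

```python
class Ref:
    """Client-chosen name for a result that does not exist yet."""

    def __init__(self, name):
        self.name = name


def process_chain(chain, ops):
    results, events = {}, []
    for op, out, args in chain:
        # A request whose reference names a missing (failed) result fails too.
        missing = [a.name for a in args
                   if isinstance(a, Ref) and a.name not in results]
        if missing:
            events.append(("failed", op))
            continue
        resolved = [results[a.name] if isinstance(a, Ref) else a for a in args]
        try:
            results[out] = ops[op](*resolved)
            events.append(("complete", op))
        except Exception:
            events.append(("failed", op))
    return results, events


# Hypothetical stand-ins for the operations named on the slide.
ops = {
    "Alloc_Mem": lambda size: bytearray(size),
    "Make_Frameset": lambda a, b: [a, b],
    "Set_Peer": lambda frames, addr: {"frames": frames, "peer": addr},
}
chain = [
    ("Alloc_Mem", "m1", [12]),
    ("Alloc_Mem", "m2", [12]),
    ("Make_Frameset", "fs", [Ref("m1"), Ref("m2")]),
    ("Set_Peer", "conn", [Ref("fs"), "192.168.0.1"]),
]
results, events = process_chain(chain, ops)
```

The client posts all four requests without waiting for any result, speculating that each earlier step will succeed. If an allocation fails, every request that depends on it fails as well.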
Virtual Network Processors
• Multi-core processors are cheaply available
– multi-threaded and/or multi-core CPUs
• Virtualisation technology has matured too
– low overhead, enhanced by hardware support
• Use a processing element as a network processor
• A perfect prototyping and development environment
– testing and debugging on real hardware is hard
• But maybe something more...
Current Implementation
• Hybrid Linux/LiNK system
– Linux as host operating system, LiNK as kernel module
– LiNK accessible to Linux as standard ethernet device
• LiNK provides user-space network subsystem
– Linux provides filesystem, processes, scheduling, VM, etc.
• Simple way to prototype new NP feature
– e.g., a direct user-space interface to the (virtual) NP
Evaluating the Virtual NIC
• Ported the Flash webserver to Linux+LiNK
• Provided TCP/IP protocol stack as user-space library
– webserver used unmodified
• WebStone 2.5 HTTP benchmark
– simulates realistic workload with multiple clients
– small number of files, representative size and distribution
• Compare performance of Linux and Linux+LiNK
Evaluation: Communication Overhead
[Figure: Distribution of Round-Trip Times. x-axis: Round-Trip Time/us (0 to 600); y-axis: Frequency (0 to 60000); series: linux-64, linux-256, linux-1024, LiNK-64, LiNK-256, LiNK-1024.]
Evaluation: Flash performance
[Figure: Webserver throughput. x-axis: Server CPUs (0 to 5); y-axis: Throughput/Mb/s (0 to 120); series: LiNK (curl), LiNK (flash), Linux (curl), Linux (flash); two data points are marked "(projected)".]
Conclusions
• Network processors need operating systems
• The Lightweight Network Kernel is one alternative
– Simple, specialised structure
– Asynchronous, high performance communication
• The future: virtual network processors
Questions/comments
Evaluation: Network Throughput
[Figure: Network throughput for multiple TCP connections. x-axis: Concurrent connections (0 to 5); y-axis: Mean aggregate throughput (Mb/s) (0 to 120); series: Piglet, Linux 2.0.36, Linux 2.2.14.]
Evaluation: Polling Performance
Application         CPU Utilisation   Idle Polling Period/s   Throughput/Mb/s
Ttcp x1             35%               9.0                     64.3
Ttcp x2             45%               10.5                    65.6
Ttcp x3             51%               11.0                    68.8
Ttcp x2 (2 loads)   91%               15.0                    105.0
Ping                45%               9.0                     N/A
Evaluation: Polling Scalability
[Figure: Polling time against clients. x-axis: Number of clients (0 to 60); y-axis: Polling time/us (0 to 35).]
Related Work
• Network device polling
– Click modular router [Kohler] - 4x performance increase
– Scout/IXP1200 router [Peterson] - similar
• Parallel network protocol stacks [Bjornberg, Naburn]
– processor-per-packet scales well for simple protocols
– complex protocols => severe lock contention
• Network appliance optimisations
– I/O Lite - unified buffer management [Pai]
– Soft timers - polling in an interrupt-driven kernel [Aron]
Future Work
• Responsiveness of polling
– dynamic specialisation e.g., run-time code generation
• Scalability
– how many processors can Piglet support?
– how many applications/devices?
• Alternative applications for Piglet
– a network processor operating system, e.g., for the IXP1200