Containing the Hype
Kavita Agarwal, Bhushan Jain, Don Porter
OSCAR Lab
Computer Science Department
Stony Brook University
VMs and Containers
[Diagram: apps running in VMs on a guest OS atop the host OS (hypervisor), vs. apps running in containers that share the host OS, process tree, NIC, and mount points]
• VMs – Pro: strong isolation, compatibility; Con: high memory and start-up overhead
• Containers – Pro: lightweight; Con: weaker security and compatibility
• The two overlap when the same OS runs in host and guest – the common cloud environment
Hyping the Containers
• “VMs have 50x slower start-up time than containers”
  – Felter et al., IBM Tech Report
• “Containers can operate with densities approaching 100x what is possible with traditional virtualization”
  – Dr. James Bottomley, CTO, Server Virtualization, Parallels
• “Container-based LXD achieves 14.5x greater density than KVM”
  – Canonical Ltd. study
Reproducible, but with unfair configurations
Contributions
• Quantify the gap in start-up latency and density
  – Make fair comparisons using known optimizations
  – Effect of density on CPU and I/O scalability [See Paper]
  – Comparison with single-application containers [See Paper]
  – Effect of different kernel versions on density [See Paper]
• Analyze VM memory usage
Experimental Setup
• Host: Ubuntu 14.10 server, kernel 3.16
  – 4-core, 3.4 GHz Intel i7 CPU
  – 4 GB RAM, 250 GB 7200 RPM ATA disk
  – qemu-kvm version 2.1.0
  – LXC version 1.1.0
• Idle KVM guest: Ubuntu 14.10 server, kernel 3.16
  – 1 virtual CPU, 1 GB RAM
  – 20 GB virtual disk image
• Idle LXC guest: 1 CPU, 256 MB cgroups memory limit (see the sketch below)
Idle guests are used to measure infrastructure overhead
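The 256 MB cap on the LXC guest comes from the memory cgroup. A minimal sketch, assuming cgroup v1 on the host and a hypothetical container name, of applying the same cap directly through sysfs (LXC 1.x can also set this via lxc.cgroup.memory.limit_in_bytes in the container config):

```python
# Sketch: cap a running LXC guest at 256 MB via the cgroup v1 memory controller.
# The path layout and the container name "idle-guest" are assumptions, not from the talk.
CONTAINER = "idle-guest"
LIMIT_BYTES = 256 * 1024 * 1024  # 256 MB, matching the setup above

path = f"/sys/fs/cgroup/memory/lxc/{CONTAINER}/memory.limit_in_bytes"
with open(path, "w") as f:       # requires root on the host
    f.write(str(LIMIT_BYTES))
```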
Outline
• Compare Guest Start-up Latency
• Measure Memory Footprint
• Analyze the VM Memory Usage
Start-up Latency of Guests
• Low start-up time lets providers provision for average instead of peak load
  – Reduces resource wastage without missing SLOs, which increases profit

Virtualization Technology   Time to Boot (sec)
KVM                         10.342
LXC                          0.200

51x VM start-up – reproduces the result from the IBM Tech Report
The Real Gap in Start-up Time
• Checkpoint a booted VM, restore the checkpoint on demand (see the sketch below)
  – Works if running the same kernel on the same hardware

Virtualization Technology   Time to Boot (sec)   Time to Restore (sec)
KVM                         10.342               1.192
LXC                          0.200               0.192
                            (51x VM start-up)    (6x VM start-up)

The start-up gap (6x) is smaller than expected
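One way to reproduce the restore-on-demand measurement is via the libvirt Python bindings; a minimal sketch, with a hypothetical domain name and checkpoint path:

```python
import time
import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("idle-guest")        # hypothetical domain name

# Checkpoint: write RAM and device state to disk; the VM stops running.
dom.save("/var/tmp/idle-guest.sav")

# Restore on demand instead of a cold boot, and time it.
start = time.time()
conn.restore("/var/tmp/idle-guest.sav")
print(f"restore took {time.time() - start:.3f} s")

conn.close()
```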
Outline
• Compare Guest Start-up Latency
• Measure Memory Footprint
• Analyze the VM Memory Usage
Memory Footprint Analysis
[Chart: average memory footprint (PSS, in MB) vs. number of guests (1-32), for KVM and LXC]
• Memory deduplication (KSM) shrinks the VM footprint from 235 MB to 194 MB
• Hype (measured with 1 guest): 14.5x density advantage for containers
• Reality (at scale): 11x
• Asymptotic incremental cost per VM: 91 MB
In the asymptote, the VM memory footprint is about half its single-guest value
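The per-guest footprint above is measured as PSS, which charges each shared page 1/N to each of the N processes mapping it, so per-guest figures add up correctly across deduplicated memory. A minimal sketch, with hypothetical PIDs, of gathering it from /proc:

```python
def pss_mb(pid: int) -> float:
    """Sum the Pss fields in /proc/<pid>/smaps and return megabytes."""
    total_kb = 0
    with open(f"/proc/{pid}/smaps") as f:
        for line in f:
            if line.startswith("Pss:"):
                total_kb += int(line.split()[1])   # field is in kB
    return total_kb / 1024.0

# Hypothetical qemu-kvm PIDs, one per VM
qemu_pids = [2314, 2397]
print(f"{sum(pss_mb(p) for p in qemu_pids):.1f} MB total PSS")
```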
Outline
• Compare Guest Start-up Latency
• Measure Memory Footprint
• Analyze the VM Memory Usage
KVM Memory Usage (After KSM Dedup)
[Chart: breakdown of the 91 MB total incremental memory per VM]
• Guest RAM: 66 MB
  – Guest file-backed: 55 MB (opportunity)
  – Guest anonymous: 11 MB
• Qemu devices: 23 MB (opportunity)
• EPT table: 2 MB
Reducing usage in these sections can reduce the VM footprint
Is KSM Missing any Duplicates?
• Snapshot the complete host memory
  – Find duplicates among VM and host page frames
• KSM only targets anonymous memory

Maximum Deduplication Potential (KSM only dedups the Anon/Anon column)
                         Anon/Anon   File/File   Anon/File   Total
Within the VM            28 MB        0.5 MB      0.5 MB     29 MB
Between VM and Host      18 MB        2.0 MB     12.0 MB     32 MB
Between 2 VMs            48 MB        8.0 MB      3.0 MB     59 MB
Total                    94 MB       10.5 MB     15.5 MB    120 MB

KSM dedups all 94 MB of anonymous/anonymous duplicates; the remaining 26 MB is missed opportunity
KSM for file pages can reduce the incremental 91 MB to 65 MB
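The deduplication bound above comes from scanning page frames for identical contents. A minimal sketch of that idea, assuming a raw snapshot of host physical memory has already been saved to a file (the filename is hypothetical):

```python
import hashlib
from collections import Counter

PAGE_SIZE = 4096
counts = Counter()

# Hash every 4 KB page frame in the snapshot and count identical contents.
with open("host-mem.raw", "rb") as f:          # hypothetical snapshot file
    while True:
        page = f.read(PAGE_SIZE)
        if len(page) < PAGE_SIZE:
            break
        counts[hashlib.sha1(page).digest()] += 1

# Every extra copy of a page is memory a perfect deduplicator could reclaim.
dup_pages = sum(n - 1 for n in counts.values() if n > 1)
print(f"dedupable: {dup_pages * PAGE_SIZE / 2**20:.1f} MB")
```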
Qemu Emulated Devices
• Remove unused qemu-emulated devices
  – VMs in a cloud interact only over the network
  – Audio, video, and extra USB devices can be removed safely
  – This further reduces the incremental cost by 5 MB
Removing unused devices further reduces the incremental cost
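A minimal sketch of launching a trimmed-down guest: qemu's -nodefaults suppresses the default device set, and only a virtio disk and NIC are added back. The image path and memory size are placeholders, not the talk's configuration:

```python
import subprocess

cmd = [
    "qemu-system-x86_64",
    "-enable-kvm",
    "-m", "1024",
    "-nodefaults",                      # do not instantiate qemu's default devices
    "-nographic", "-vga", "none",       # no emulated video output
    "-drive", "file=guest.img,if=virtio,format=raw",
    "-netdev", "user,id=net0",
    "-device", "virtio-net-pci,netdev=net0",
]
subprocess.run(cmd, check=True)
```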
How small can we make VMs?
Incremental VM memory footprint   =  91 MB
Extend KSM to file pages          = -26 MB
Remove unused devices             = - 5 MB
Effective VM memory footprint     =  60 MB
We can reduce the incremental VM cost by 1/3
Conclusion
• Containers are definitely lighter-weight than VMs
– VMs may be necessary for security isolation and compatibility
• Gap between VMs and Containers is not as wide as hyped
• Most fair comparison:
  – VMs: 6x start-up time
  – Containers: 11x memory density
Simple approaches can reduce the incremental VM footprint by 1/3
BACKUP
Containers
[Diagram: apps in containers sharing the host OS and hardware page tables; a compromised app can exploit the wide syscall interface, giving a shared vulnerability and limited functionality]
• cgroups: resource isolation
• Namespaces: private copies of kernel objects (process tree, NIC, mount points)
• Shared OS
Lightweight and efficient. Low security and functionality.
Virtual Machines
[Diagram: each app runs on a complete legacy guest OS inside its own VM, with extended page tables over the hardware page tables; an exploited syscall interface stays isolated to one VM, and guests reach the host OS (hypervisor) only through a narrow hypercall interface]
Hardware-assisted security. High memory and start-up overhead.
Effect of Density on CPU and I/O Workloads
[Charts: SPEC CPU 2006 462.libquantum execution time (sec) and Filebench I/O performance (ops/sec) vs. number of guests (1-32), for KVM and LXC]
Both degrade dramatically and comparably
But, VMs can do Better for I/O
• Optimizations to reduce VM exits on I/O
  – Pass-through I/O
  – Push I/O drivers into the application
• ELI [1] directly maps devices into guests using a shadow IDT
  – Achieves 97-98% of bare-metal performance
• Arrakis [2] and IX [3] push I/O drivers into the application
  – Achieve 2.3x-3.9x the throughput of Linux
[1] A. Gordon et al. "ELI: Bare-Metal Performance for I/O Virtualization." 17th ASPLOS '12, 411-422.
[2] S. Peter et al. "Arrakis: The Operating System is the Control Plane." 11th OSDI '14, 1-16.
[3] A. Belay et al. "IX: A Protected Dataplane Operating System for High Throughput and Low Latency." 11th OSDI '14, 49-65.
Direct-to-application I/O can improve VM performance