Backdoor - Computer Science - Research Areas
Building Reliable Services Using
Backdoors
Stephen Smaldone
Department of Computer Science
Rutgers University
Frustration Scalability
[Figure: Service.com as a planetary-scale service on the Internet, subject to attacks and failure; sites at 9:00pm EST, 2:00am GMT, 11:00am JST]
Human operators, phone calls and emails hard to scale
Cost of ownership dramatically exceeds cost of systems
The Dream: A Defensive Architecture
[Figure: BD-equipped nodes behind gateways, connected over a private network, facing Internet attacks and failure; sites at 9:00pm EST, 2:00am GMT, 11:00am JST]
Possible Healing Actions
Refresh the state (reboot)
Destructive and Disruptive
Repair the state (continue)
Recover the state (transfer)
How to access the memory of the failed
system when the OS is “hung”?
The Motivating Philosophy
Something is better than nothing
Faster is better than slower
Repairing state faster than repairing software
It is hard to corrupt or stop an outsider
Save application state if possible
Remote healing better than self-healing
Attackers and faults are becoming “smarter”
Try “holistic” approach if nothing else
The Backdoor (BD)
Backdoor:
a hidden software or hardware mechanism, usually created for
testing and troubleshooting
--American National Standard for Telecommunications
Backdoor Design Principles
1. Availability
BD must be highly available (even when OS is not)
2. Non-intrusiveness
BD operations must not involve local OS (zero-overhead monitoring)
3. Integrity
OS cannot alter BD execution or modify the result
of a BD operation
4. Responsiveness
A BD operation cannot be delayed indefinitely
Possible Backdoor Implementations
A programmable network interface (I-NIC)
A virtual machine over a VMM
Our current prototype is on Myrinet
Work in progress over Xen
IBM’s Remote Supervisor Adapter?
HP’s Remote Management Adapter?
Backdoor as building block
Remote Healing Systems
A computer system monitors/repairs/recovers the
state of a remote system through the backdoor
Backdoor is controlled by the remote OS
Defensive Architectures
Backdoors are programmed to execute defensive
tasks, stand-alone or cooperatively over a private
network
Standalone backdoor
Outline
Introduction
Backdoor Idea
Remote Healing
Defensive Architectures
Conclusions
Remote Healing
Backdoor prototyped on I-NIC (Myrinet)
Remote Repair of OS State
Remote Recovery for Cluster-Based Internet
Servers
Backdoor on I-NIC
[Figure: a system with CPU and Mem, where the NIC is the “front door” and the I-NIC is the backdoor attached to the private network]
Backdoor provides an alternative access to system
memory without involving local CPU/OS
Private network over a specialized interconnect, VPN, or
even over a phone link!
A Remote Healing Architecture
[Figure: a monitor system and a target system, each with CPU, Mem, I/O, and a BD]
Backdoors use Remote Memory
Communication
[Figure: the monitor CPU reaches target memory through the two BDs (NIC CPU): RemoteRead for monitoring, RemoteRead/Write for recovery and repair]
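The two operations named on this slide can be pictured with a small simulation. This is only a sketch: `SimulatedBackdoor`, `remote_read`, `remote_write`, and the counter layout are hypothetical names chosen for illustration, and a `bytearray` stands in for the target's physical memory that the real prototype reaches through the NIC.

```python
import struct


class SimulatedBackdoor:
    """Stands in for a backdoor exposing a target's memory (hypothetical)."""

    def __init__(self, memory_size):
        self.memory = bytearray(memory_size)  # the target system's memory

    def remote_read(self, offset, length):
        # Monitoring path: copy bytes out without involving the target CPU/OS.
        if offset < 0 or offset + length > len(self.memory):
            raise ValueError("read outside target memory")
        return bytes(self.memory[offset:offset + length])

    def remote_write(self, offset, data):
        # Repair/recovery path: overwrite bytes in the target's memory.
        if offset < 0 or offset + len(data) > len(self.memory):
            raise ValueError("write outside target memory")
        self.memory[offset:offset + len(data)] = data


COUNTER_FORMAT = "<Q"   # assumed layout: one little-endian 64-bit counter
COUNTER_OFFSET = 128    # assumed address of the counter in target memory

target = SimulatedBackdoor(memory_size=4096)
target.remote_write(COUNTER_OFFSET, struct.pack(COUNTER_FORMAT, 41))
raw = target.remote_read(COUNTER_OFFSET, struct.calcsize(COUNTER_FORMAT))
(counter,) = struct.unpack(COUNTER_FORMAT, raw)
```

The monitor only ever sees raw bytes, so both sides have to agree on the layout of anything placed in the shared region.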
Remote OS Locking
Implemented by a BD-OS protocol
Two functions
Provides exclusive access to target OS data for state
repairing
Enforces fail-stop model in the recovery case to
avoid the consequences of false positives in failure
detection
Can be avoided?
Yes for monitoring
OS Support for Remote Healing
Monitoring and Failure Detection
Sensor Box: system health indicators (sensors) provided
by the target OS in its local memory
Sensors: <UniqueID, Type, Threshold, Value>
Repairing
Externalized State: OS state data that the BD can read
Remote Access Hooks: OS control data that the BD
can write to perform repairing actions
Recovery
Continuation Box: fine-grain OS and application
checkpoint state that the BD can transfer between
systems to migrate running applications
Sensor Box (SB)
Collection of health indicators (sensors) in the
target OS memory
<ID, Type, Threshold, Value>
Sensor Type    Threshold
Progress       Update deadline
Level          Max/Min value
Pressure       Max number of events
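One way to read the sensor tuple and the three sensor types is sketched below. The field names, the `is_unhealthy` helper, and the example values are assumptions made for illustration; the slides only give the tuple <ID, Type, Threshold, Value> and the meaning of each threshold.

```python
from dataclasses import dataclass
from enum import Enum


class SensorType(Enum):
    PROGRESS = "progress"   # threshold = update deadline (seconds)
    LEVEL = "level"         # threshold = (min, max) allowed value
    PRESSURE = "pressure"   # threshold = max number of events


@dataclass
class Sensor:
    unique_id: int
    sensor_type: SensorType
    threshold: object
    value: float


def is_unhealthy(sensor, seconds_since_update=0.0):
    """Return True when the sensor's reading violates its threshold."""
    if sensor.sensor_type is SensorType.PROGRESS:
        # A progress sensor is judged by how long ago it last changed.
        return seconds_since_update > sensor.threshold
    if sensor.sensor_type is SensorType.LEVEL:
        low, high = sensor.threshold
        return not (low <= sensor.value <= high)
    if sensor.sensor_type is SensorType.PRESSURE:
        return sensor.value > sensor.threshold
    raise ValueError(f"unknown sensor type: {sensor.sensor_type}")


# Hypothetical sensor box contents.
sensor_box = [
    Sensor(1, SensorType.PROGRESS, threshold=0.5, value=1042),
    Sensor(2, SensorType.LEVEL, threshold=(64, 1024), value=32),
    Sensor(3, SensorType.PRESSURE, threshold=100, value=7),
]
```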
Failure Detection using Sensor Box
Target OS updates progress sensors in SB
continuously
Monitoring thread reads SB periodically and
checks counters
Failure = counter stalled beyond its deadline
False positive rate vs. detection latency tradeoff
[Figure: the target OS updates Sensor Box entries (<Timer interrupts>, <Context switches>, <NIC interrupts>, …) that the monitor reads through the backdoor]
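The detection rule and its tradeoff can be illustrated with a small simulation. The sketch below assumes the monitor reads one progress counter every 100 ms; the class name, the read traces, and the deadlines are made up for the example and are not measurements from the prototype.

```python
class ProgressMonitor:
    """Flags a failure when a progress counter stalls beyond its deadline."""

    def __init__(self, deadline_ms):
        self.deadline_ms = deadline_ms
        self.last_value = None
        self.last_change_ms = None

    def observe(self, value, now_ms):
        """Record one periodic read of the counter; return True on failure."""
        if self.last_value is None or value != self.last_value:
            self.last_value = value
            self.last_change_ms = now_ms
            return False
        return (now_ms - self.last_change_ms) > self.deadline_ms


def detection_time(reads, deadline_ms):
    """Return the time of the first failure report, or None if there is none."""
    monitor = ProgressMonitor(deadline_ms)
    for now_ms, value in reads:
        if monitor.observe(value, now_ms):
            return now_ms
    return None


# (time in ms, counter value): the counter stops advancing after 300 ms.
hung_reads = [(0, 10), (100, 20), (200, 30), (300, 40),
              (400, 40), (500, 40), (600, 40), (700, 40)]

# A slow but healthy system: the counter advances every 300 ms.
slow_reads = [(0, 10), (100, 10), (200, 10), (300, 20),
              (400, 20), (500, 20), (600, 30)]
```

With these traces, a 150 ms deadline reports the hang at 500 ms but also flags the slow system at 200 ms, while a 250 ms deadline avoids the false positive at the cost of reporting the hang at 600 ms.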
Monitoring and Detection Using BD
[Figure: the monitor's BD gives it a remote view of the Sensor Box in target memory, used for detection]
Diagnosis and Repairing
Diagnosis
Inspect live OS data structures in target’s memory
(through the externalized state)
Identify damaged OS state (e.g. resource exhaustion
due to memory hogging processes)
Repairing
Modify target OS memory (through remote access
hooks) to correct damaged state (e.g. remove a
memory-hogging process by “injecting” a kill
signal into its process control block)
Diagnosis Using BD
[Figure: the monitor's BD reads the externalized state in target memory to build a fine-grained diagnosis view]
Repair Using BD
[Figure: the monitor's BD writes through a repair hook in target memory to correct the state]
Case Study: Repairing OS State
Damaged OS state: resource exhaustion,
corrupted data structures, compromised OS,
etc.
Resource exhaustion
Attack, overload, system misconfiguration,
programming error
Repairing cannot rely on local resources
Two examples
Fork bomb
Memory hog
Case Study: Memory Hog
Program allocates memory in an infinite loop
Both memory and swap space are occupied by
the memory hog
System is inaccessible from console or the
network
Cannot spawn new processes
Cannot handle interrupts
Local daemons cannot repair system
Remote Repairing in Case of Memory Hogging
Monitoring
Pressure sensor signals when severe low memory
condition is detected
Diagnosis
Target externalizes process table and process memory
usage statistics
Monitoring thread identifies the culprit
Repairing
Monitoring thread kills culprit by remotely posting a
SIGKILL
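The three steps above can be sketched against a simulated process table. Everything about the layout is an assumption: the field names, the choice of "largest resident memory" as the culprit rule, and the pending-signal bitmask are illustrative and not the modified FreeBSD kernel's actual structures.

```python
SIGKILL = 9  # conventional signal number on Unix-like systems


def find_culprit(process_table):
    """Diagnosis: pick the process using the most memory (assumed rule)."""
    return max(process_table, key=lambda proc: proc["resident_kb"])


def post_signal(process, signal_number):
    """Repair: mark the signal pending in the process's control block."""
    process["pending_signals"] |= 1 << (signal_number - 1)


def repair_memory_hog(pressure_value, pressure_threshold, process_table):
    """Return the pid that was signalled, or None if no repair was needed."""
    if pressure_value <= pressure_threshold:
        return None  # monitoring: the pressure sensor has not fired
    culprit = find_culprit(process_table)
    post_signal(culprit, SIGKILL)
    return culprit["pid"]


# Hypothetical externalized process table read through the backdoor.
table = [
    {"pid": 101, "resident_kb": 2_000, "pending_signals": 0},
    {"pid": 202, "resident_kb": 900_000, "pending_signals": 0},
    {"pid": 303, "resident_kb": 15_000, "pending_signals": 0},
]
killed = repair_memory_hog(pressure_value=12, pressure_threshold=5,
                           process_table=table)
```

In the real system the write to the process control block would go through a remote access hook, and the target kernel would deliver the signal once it next runs.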
Prototype
BD implemented on Myrinet LanaiX NIC
Modified firmware and low level GM library
Modified FreeBSD 4.8 kernel
Experimental setup
Dell PowerEdge 2600 servers with 2.4 GHz dual
Intel Xeon, 1GB RAM, 2GB swap, Myrinet LanaiX
NIC
Benchmark: simple counting program with fixed
number of iterations
Effectiveness of Remote Repairing
[Figure: execution time (s), 0 to 20, versus number of memory hog processes, 0 to 16, for the impaired system and with remote repair]
Repairing Timeline
[Figure: timeline, 0 to 3 s, of memory pressure, marking detection, diagnosis and repair, end of remote repair, and local cleanup of damaged state]
Remote Healing
Backdoor prototype using Myrinet
Remote Repair of OS State
Remote Recovery for Cluster-based Internet
Servers
Clusters with BD Network
[Figure: cluster nodes labeled M, P, I/O, T, and BD, joined by an interconnect]
Cluster-based Internet Services with BD network
[Figure: three clients, three servers, and three monitors]
Continuation Box (CB)
Idea
Define per client-session state (OS and application)
Transfer client sessions from the failed system to other
systems in the cluster running the same server
application
CB encapsulates the state of a client session
associated with a server application (possibly
multi-process)
OS state (data in transit through IPC channels)
application-specific state (periodically
exported/checkpointed by the application)
Continuation Box Extraction
[Figure: the Continuation Box is extracted through the BDs from the memory of the crashed victim machine (CPU, OS) into the memory of the healthy recovery machine as recovered state]
Client-Session Continuation Box for Multi-Process Servers
[Figure: Client 1 and Client 2 reach a server made of Process 1 and Process 2 (linked by IPC) over TCP/IP; CB1 and CB2 each hold app. state and comm. state]
Continuation Box API
create_cb for a client session
export application state to CB
associate I/O channel with the CB
open_cb given an I/O channel
import application state from CB
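A minimal in-memory sketch of the five calls listed above follows. The class and method names are hypothetical (`export_state` and `import_state` are used because `import` is a Python keyword), and a dictionary stands in for the kernel memory where the prototype keeps each Continuation Box.

```python
class ContinuationBoxes:
    """In-memory stand-in for the Continuation Box calls (hypothetical)."""

    def __init__(self):
        self._state = {}      # cb id -> last exported application state
        self._channels = {}   # I/O channel name -> cb id
        self._next_id = 1

    def create_cb(self, session_channel):
        """Create a CB for a client session, keyed by its connection."""
        cb_id = self._next_id
        self._next_id += 1
        self._state[cb_id] = None
        self._channels[session_channel] = cb_id
        return cb_id

    def export_state(self, cb_id, state):
        """Checkpoint application state into the CB."""
        self._state[cb_id] = state

    def associate(self, cb_id, channel):
        """Tie another I/O channel (e.g. an IPC pipe) to the same CB."""
        self._channels[channel] = cb_id

    def open_cb(self, channel):
        """Find the CB for a channel; None if the channel is unknown."""
        return self._channels.get(channel)

    def import_state(self, cb_id):
        """Return the last exported state, or None if nothing was exported."""
        return self._state.get(cb_id)


boxes = ContinuationBoxes()
cb = boxes.create_cb("client-socket-7")
boxes.export_state(cb, {"file_name": "a.txt", "offset": 4096})
boxes.associate(cb, "ipc-pipe-3")
helper_cb = boxes.open_cb("ipc-pipe-3")   # what a second process would call
```

Associating the IPC channel is what lets a second process of a multi-process server find the session's CB from the channel it is handed.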
Changes to make Server Recoverable
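while (cid = accept()) {
    cbid = create_cb(cid)
    /* resume a migrated session if the CB already holds state */
    if (import(cbid, &{file_name, offset}) == NULL) {
        /* new session: read the request and start from the beginning */
        receive(cid, file_name)
        offset = 0
    }
    fd = open(file_name)
    seek(fd, offset)
    while (read(fd, block, size) != EOF) {
        send(cid, block, size)
        offset += size
        /* checkpoint progress after every block sent */
        export(cbid, {file_name, offset})
    }
}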
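The effect of the export/import calls in this loop can be tried out with a small simulation. It is a sketch under simplifying assumptions: a dictionary stands in for the session's Continuation Box, a list stands in for the client connection, and the simulated crash only happens between blocks, never between a send and its export.

```python
def serve(cb, file_bytes, block_size, sent, crash_after=None):
    """Send file_bytes in blocks, exporting the offset after each send.

    `cb` is a dict standing in for the session's continuation box and
    `sent` collects the blocks the client has received. Returns True when
    the transfer completes and False when the simulated crash hits.
    """
    offset = cb.get("offset", 0)          # import: resume point, or 0 if new
    blocks_sent = 0
    while offset < len(file_bytes):
        if crash_after is not None and blocks_sent == crash_after:
            return False                  # simulated crash mid-transfer
        block = file_bytes[offset:offset + block_size]
        sent.append(block)                # send
        offset += len(block)              # advance by the bytes actually sent
        cb["offset"] = offset             # export
        blocks_sent += 1
    return True


data = bytes(range(10))
cb, received = {}, []
finished = serve(cb, data, 4, received, crash_after=1)   # first server crashes
resumed = serve(cb, data, 4, received)                   # recovery server
```

The second call starts from the exported offset, so the client receives every block exactly once even though two different server instances produced them.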
State Synchronization Problem
Application state (SB_APP) updated only upon export
OS state (SB_IO) updated continuously by the OS kernel
How to synchronize the two components of the CB?
[Figure: export and import of application state (A1, A2) against the SB_APP and SB_IO components, while the OS keeps updating SB_IO]
CB-based Recovery
Log-based rollback recovery
OS keeps communication logs (send/receive)
restores server state with respect to a client
0-copy using the communication buffers
After migration, OS replays send/receive
operations from logs
transparent to server and client applications
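The replay idea can be sketched for a simple request/reply session. The sketch assumes a deterministic application that produces exactly one reply per request, which the slides do not state; `SessionLog`, `recover`, and the running-total handler are hypothetical names used only to show how replayed receives rebuild state while already-delivered sends are suppressed.

```python
class SessionLog:
    """Per-session communication log kept alongside the exported state."""

    def __init__(self):
        self.app_state = None
        self.received = []    # requests delivered since the last export
        self.sent_count = 0   # replies sent since the last export

    def on_receive(self, message):
        self.received.append(message)

    def on_send(self, message):
        self.sent_count += 1

    def on_export(self, state):
        # The exported state already reflects everything logged so far.
        self.app_state = state
        self.received.clear()
        self.sent_count = 0


def recover(log, handle):
    """Replay logged receives on the recovery node.

    `handle(state, message)` returns `(new_state, reply)`. Replies the
    client already received are suppressed; the rest are returned.
    """
    state = log.app_state
    owed = []
    suppressed = 0
    for message in log.received:
        state, reply = handle(state, message)
        if suppressed < log.sent_count:
            suppressed += 1       # client already has this reply
        else:
            owed.append(reply)
    return state, owed


def running_total(state, number):
    total = state + number
    return total, total


log = SessionLog()
log.on_export(10)       # last checkpoint: total is 10
log.on_receive(5)
log.on_send(15)         # reply for 5 reached the client
log.on_receive(7)       # crash before the reply for 7 was sent
state, owed = recover(log, running_total)
```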
Backdoors Prototype
Myrinet LanaiX NIC as backdoor
Modified FreeBSD kernel
in-kernel remote read/write operations
Sensor Box, Continuation Box
Modified server applications
Apache, Flash, Icecast, JBoss
Case Study: A Multi-tier Auction Service
Front-End (FE)
Apache web server
Middle Tier (MT)
JBoss app. server
Back-End
MySQL DB server
Recoverable RUBiS
Experimental Evaluation
Experimental setup
Dell PowerEdge 2600 servers, 2.4 GHz dual Intel Xeon,
1GB RAM, 1Gb Ethernet
Workload modeled after TPC-W
Fault injection in FE and MT nodes
synthetic freeze, emulated freeze by remote OS locking,
bugs inserted in network drivers
Evaluation
Low overhead under load
Recovery is fast
Low Overhead under Load
[Figure: throughput in requests/min, 0 to 8,000, versus number of clients, 20 to 1,100, for Base, Recoverable FE, and Recoverable FE+MT]
Recovery is Fast
[Figure: timeline, 0 to 30 ms, from failure through detection (detection latency) and CB import to the end of recovery (recovery latency)]
Outline
Introduction
Backdoor Idea
Remote Healing Experience
Defensive Architectures
Conclusions
Autonomous Backdoor
BD is programmed to execute defensive tasks, then “sealed”
Defensive Architecture Hierarchy
Defensive Computer Architecture (DCA)
Individual computers equipped with BD
BD performs local defensive tasks (e.g. OS state inspection)
Defensive Network Architecture (DNA)
Cluster nodes equipped with BDs connected over a high-speed
private network
BDs perform defensive tasks cooperatively (e.g. OS integrity
checking, continuous remote logging)
Defensive Inter-Network Architecture (DINA)
Loosely coupled DNAs connected over the Internet or other
networks
DNAs cooperate (e.g. early warnings of virus attacks)
Defensive Inter-Network Architecture over PlanetLab (new project)
[Figure: BD-equipped nodes behind gateways, connected over a private network, facing Internet attacks and failure; sites at 9:00pm EST, 2:00am GMT, 11:00am JST]
Local Memory Inspection
(Work in Progress)
Orion - Holistic Approach to System Failure Prediction
Identify kernel memory update patterns and correlate
them to predict unstable system states
Related Work
DEC WRL Titan system [’86]
Recoverable OS subsystems
Rio reliable file cache [Chen ’96]
Recovery Box [Baker ’92]
Defensive Programming [Qie ’03]
Nooks [Swift ’04]
Recovery Oriented Computing [Patterson ’02]
Microreboot [Candea ’04]
TCP Connection Failover [Snoeren ’01, Sultan ’01,
Alvisi ’01, Koch ’03, Mishra ’03, Zagorodnov ’03]
Automatic repair of data structures [Demsky ’03]
K42 [Soules ’03]
Hypervisor-based fault tolerance [Bressoud ’95]
Conclusions
The Backdoor is a promising building block for
remote healing and defensive architectures
Feasibility studies for Remote Repairing and
Remote Recovery using I-NIC-based Backdoor
prototype
Current work includes Defensive Architectures
and Orion
People and Money Behind Backdoors
Liviu Iftode
Florin Sultan
Aniruddha Bohra
Pascal Gallard (INRIA/IRISA, France)
Iulian Neamtiu (University of Maryland)
Yufei Pan
Arati Baliga
Tzvika Chumash
NSF CAREER CCR-0133366
Thank You!
http://discolab.rutgers.edu/bda
Yes, BD Security! (work in progress)
BD under OS control
Access to remote memory controlled through
memory registration (established at the initialization
time)
Voting scheme for remote writes (delayed writes)
BDs monitor each other and the integrity of their OSes
Autonomous BD
OS cannot access BD memory after initialization
(possible with PCI Express)
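The registration check and the voting scheme for remote writes might combine as sketched below. The slides do not say how votes are counted, so the strict-majority rule, the function names, and the region list are all assumptions made for illustration.

```python
def within_registered(regions, offset, length):
    """True if [offset, offset + length) lies inside one registered region."""
    return any(start <= offset and offset + length <= start + size
               for start, size in regions)


def apply_voted_write(memory, regions, offset, data, votes):
    """Apply a delayed remote write only if it is allowed and approved.

    `regions` lists (start, size) pairs registered at initialization and
    `votes` maps each voting BD to True (approve) or False (reject).
    """
    if not within_registered(regions, offset, len(data)):
        return False
    approvals = sum(1 for approved in votes.values() if approved)
    if 2 * approvals <= len(votes):   # assumed rule: strict majority
        return False
    memory[offset:offset + len(data)] = data
    return True


memory = bytearray(64)
regions = [(16, 16)]   # hypothetical registered region: bytes 16..31

accepted = apply_voted_write(memory, regions, 16, b"\x01\x02",
                             {"bd1": True, "bd2": True, "bd3": False})
rejected = apply_voted_write(memory, regions, 20, b"\xff",
                             {"bd1": True, "bd2": False, "bd3": False})
outside = apply_voted_write(memory, regions, 40, b"\xff",
                            {"bd1": True, "bd2": True, "bd3": True})
```

Holding the write until the votes are in is what makes it a delayed write: a single compromised or mistaken monitor cannot modify the target on its own.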
Local Memory Inspection
(Work in Progress)
Kernel Integrity Monitoring & Healing
Search for kernel rootkits in
individual kernel functions
kernel tables, e.g. the syscall table
dynamic structures, e.g. the process table
Repair the kernel when compromised
Replace tampered tables with clean versions.
Replace corrupt versions of kernel functions with clean
ones.
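The check-and-replace steps for kernel tables and functions can be sketched as follows. This is an illustration under assumptions: the syscall table is modeled as a name-to-address dictionary, function bodies are compared by SHA-256 digest, and all names and addresses are invented for the example.

```python
import hashlib


def tampered_entries(table, clean_table):
    """Names whose live entry differs from the clean reference copy."""
    return [name for name, address in clean_table.items()
            if table.get(name) != address]


def function_is_intact(code, clean_digest):
    """Compare a kernel function's bytes against a known-good digest."""
    return hashlib.sha256(code).hexdigest() == clean_digest


def repair_table(table, clean_table):
    """Replace tampered entries with clean ones; return what was fixed."""
    damaged = tampered_entries(table, clean_table)
    for name in damaged:
        table[name] = clean_table[name]
    return damaged


# Hypothetical clean syscall table and a live copy with one hooked entry.
clean = {"read": 0x1000, "write": 0x1040, "open": 0x1080}
live = dict(clean)
live["open"] = 0xDEAD0000
repaired = repair_table(live, clean)

# Hypothetical function body and its reference digest.
clean_code = b"\x55\x48\x89\xe5"
clean_digest = hashlib.sha256(clean_code).hexdigest()
```

The sketch only compares entries that exist in the clean copy, so it would not notice entries a rootkit adds; dynamic structures such as the process table need a different kind of check.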
Holistic Approach to System Failure Prediction
Identify kernel memory update patterns and correlate
them to predict unstable system states