Transcript Talk - MIT

Glavlit: Preventing Exfiltration at Wire Speed
ГУОГТП
Nabil Schear†*, Carmelo Kintana†, Qing Zhang†, Amin Vahdat†
†Department of Computer Science and Engineering,
University of California at San Diego
*Los Alamos National Laboratory
Hotnets V - Irvine, CA - November 30, 2006

Information Leaks and Exfiltration

- Exfiltration: a type of information leak; the malicious theft of valuable information
- Leaks affect customer confidence, regulatory compliance, profits, etc.
- Leaks are inevitable
  - Targeted attacks, insiders, accidents, etc.
- Goal: Minimize leaks NO MATTER how or why they happen.

How Does Data Get Out? Accidentally

[Figure: Protected Network (Web servers, User Workstations, Email server) and External Network separated by a Boundary. Labels: Email, File.]

- Didn't know the file was sensitive, or
- An honest mistake

What about Malicious Exfiltration?

[Figure: Protected Network (Web servers, User Workstations, Email server) and External Network separated by a Boundary. Labels: File.]

- Attacker, malware, or insider uses an existing Web server

More Malicious Leaks

[Figure: Protected Network (Web servers, User Workstations, Email server) and External Network separated by a Boundary. Labels: HTTP, File2, File.]

- Attacker uses a hidden channel in the protocol to encode sensitive data

Previous Solutions: Policy and Stand-alone Private LAN

[Figure: Protected Network (Web servers, User Workstations, Email server) and External Network separated by a Boundary.]

- Expensive
- Difficult to enforce
- Granularity too coarse
- Hard to use

Previous Solutions: Packet Filter (Firewall) and Passive Monitoring

[Figure: Protected Network (Web servers, User Workstations, Email server) and External Network separated by a Boundary. Labels: Firewall, Analysis / Audit.]

Packet Filter (Firewall):
- High speed limits analysis complexity
- Works on packets, not files

Passive Monitoring:
- Can't actively stop leaks in progress

Previous Solutions: Proxies

[Figure: Protected Network (Web servers, User Workstations, Email server) and External Network separated by a Boundary. Labels: Proxy (two).]

- High overhead
- Difficult and complicated to configure

Our Solution: Glavlit

Decouple vetting from verification

[Figure: Protected Network (Web servers, User Workstations) and External Network separated by a Boundary. Labels: Guard, Warden, File.]

Guard:
- Transparent
- High speed
- Actively stops leaks

Warden:
- Arbitrary and powerful analysis
- Off the critical network path

Our Solution: Glavlit

Mitigate covert channels in the application layer protocol

[Figure: Protected Network (Web servers, User Workstations) and External Network separated by a Boundary. Labels: Guard, Warden, HTTP, File2.]

- Prevents a subset of covert channels
- Limits bandwidth of others

What is Glavlit?

- Prevents unauthorized release from HTTP servers while allowing authorized data to pass unhindered

Key Contributions:
1) Ensure that only authorized objects cross the network boundary in payload
2) Mitigate a class of covert channels in application layer protocols

Enforces complex exit policy:
- Operates at granularity of whole files
- Covers wide range of threats
- Does not depend on host security
- Only trust the Warden and Guard

Glavlit is NOT…

- Just a firewall
- For outgoing HTTP browser requests
- Designed to prevent leaks from covert channels below layer 7
- Capable of stopping ALL potential covert channels
  - In general this is intractable

Two Complementary Techniques for Mitigating Leaks

1) Content Control
- Hash network content against a known list of good, releasable data

2) HTTP Protocol Channel Mitigation
- Restrict the HTTP RFC and parse the protocol for syntactic correctness
- Check field values for semantic validity
- Enforce ordering and normalize timing

Vetting at the Warden

- Vetting: authoritative review to decide if an object (a file) is ok to release
  - Arbitrarily complex and time-consuming
- Warden performs arbitrary vetting process

[Figure: Content Provider, File, Warden, Guard. Labels: Vetting Complete, File Approved.]

Vetting at the Warden

- Generates signatures
  - Split the file into 1KB chunks
  - Calculate secure hash of each chunk
  - Collect file metadata
- Share table of signatures for vetted objects with Guard

[Figure: Content Provider, File, Warden, Signatures, Guard.]
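
The signature steps above can be sketched roughly as follows. This is a minimal illustration, not the Warden's actual code: SHA-256 stands in for the unspecified "secure hash", the record layout and the names `make_signature` and `build_table` are hypothetical, and the table key uses the 256-byte prefix hash that the Guard verification slide mentions.

```python
import hashlib

CHUNK_SIZE = 1024   # 1KB chunks, as stated on the slide
KEY_PREFIX = 256    # prefix length used for lookup on the verification slide

def make_signature(data: bytes, last_modified: str = "") -> dict:
    """Build a hypothetical signature record for one vetted file."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return {
        "key": hashlib.sha256(data[:KEY_PREFIX]).hexdigest(),
        "chunk_hashes": [hashlib.sha256(c).hexdigest() for c in chunks],
        "length": len(data),             # file metadata (assumed fields)
        "last_modified": last_modified,
    }

def build_table(files: dict) -> dict:
    """Map lookup key -> signature for every vetted file (name -> bytes)."""
    table = {}
    for data in files.values():
        sig = make_signature(data)
        table[sig["key"]] = sig
    return table
```

Two files that share the same first 256 bytes would collide in this table; a fuller design would have to keep a list of candidates per key.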

Verification at the Guard

- Verification: ensure that an object crossing the network boundary is pre-vetted

1) Locate object within network stream
2) Look up object in signature table based upon hash of first 256 bytes of the file
3) Verify file content
   - Hash and check each chunk
   - Packets can egress as soon as all their chunks are verified
   - Can actively stop invalid data by dropping packets and injecting TCP RESET packets
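
Steps 2 and 3 might look like the sketch below. It is an assumption-laden illustration: SHA-256 stands in for the unspecified hash, `sign` is a hypothetical stand-in for the Warden's record, and the function checks a complete body at once, whereas the Guard described on the slide releases packets as soon as their chunks verify.

```python
import hashlib

CHUNK_SIZE = 1024   # chunk size from the vetting slide
KEY_PREFIX = 256    # lookup prefix length from this slide

def sha(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()

def sign(data: bytes) -> dict:
    """Stand-in for the Warden's signature record (hypothetical layout)."""
    return {"length": len(data),
            "chunk_hashes": [sha(data[i:i + CHUNK_SIZE])
                             for i in range(0, len(data), CHUNK_SIZE)]}

def verify_stream(body: bytes, table: dict) -> bool:
    """Return True if the body matches a vetted object chunk by chunk."""
    sig = table.get(sha(body[:KEY_PREFIX]))          # step 2: lookup by prefix hash
    if sig is None or sig["length"] != len(body):
        return False
    for n, expected in enumerate(sig["chunk_hashes"]):   # step 3: check each chunk
        chunk = body[n * CHUNK_SIZE:(n + 1) * CHUNK_SIZE]
        if sha(chunk) != expected:
            # A real Guard would drop the packets and inject a TCP RESET here.
            return False
    return True
```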

Need an In-order TCP Stream

- How to verify data in lost, retransmitted, or out-of-order packets?
- Keep a sliding window of packet content and a cache for old packets

[Figure: Packet Cache and Packet Header Queue. Labels: TCP/IP Header (repeated), Pending Data, Unused Buffer Space, Send.]
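
One way to picture the sliding window is a small reassembly buffer keyed by TCP sequence number. The class below is a hypothetical sketch, not the Guard's data structure: it ignores sequence-number wraparound, and it simply discards retransmitted bytes rather than comparing them against cached packet content.

```python
class StreamWindow:
    """Reorders TCP segments by sequence number (hypothetical helper)."""

    def __init__(self, initial_seq: int, window: int = 65536):
        self.next_seq = initial_seq   # next byte expected in order
        self.window = window          # how far ahead segments may be buffered
        self.pending = {}             # seq -> payload of out-of-order segments

    def receive(self, seq: int, payload: bytes) -> bytes:
        """Accept one segment; return the bytes that are now in order."""
        if seq + len(payload) <= self.next_seq:
            return b""                          # pure retransmission of old data
        if seq - self.next_seq > self.window:
            return b""                          # too far ahead: not buffered
        self.pending.setdefault(seq, payload)
        out = bytearray()
        while True:
            ready = [s for s in self.pending if s <= self.next_seq]
            if not ready:
                break
            for s in ready:
                data = self.pending.pop(s)
                fresh = data[self.next_seq - s:]   # skip bytes already delivered
                out += fresh
                self.next_seq += len(fresh)
        return bytes(out)
```

The returned bytes form the in-order stream that chunk hashing needs; segments that arrive early wait in `pending` until the gap before them is filled.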

Protocol Channels

- Protocol Channel
  - Unauthorized communication channel
  - Present in L7 protocol or its operation
- Channel Carrier
  - Cover data holding the channel
- Types of carriers in protocol channels
  - Structured
  - Unstructured

Structured Protocol Channels

- Attackers can encode data in structured protocol fields in an HTTP response

HTTP/1.1 200 OK
Date: Thu, 23 Nov 2006 03:45:23 GMT
Server: Apache
Last-Modified: Fri, 10 Mar 2006 05:56:06 GMT
Accept-Ranges: bytes
Content-Length: 255 (shown altered to 254)
Connection: close
Content-Type: text/html; charset=UTF-8
Credit-Card-Num: 1234-5678-9012-3456

- Key Insight: most fields are verifiable

Verifying Structured Data

- Does it look right? (Syntactic)
  - Check syntax against restricted RFC specification
  - Pre-specified headers and order
- Does it make sense? (Semantic)
  - Check against corresponding request
  - Restrict server responses to aid verification
  - Check metadata against Warden Info
    - Content-Length, Last-Modified, etc.
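
As a rough illustration of the two checks, the sketch below validates a response head against a fixed header whitelist and order (syntactic) and against Warden metadata (semantic). The whitelist, the fixed order, the restriction to a `200 OK` status line, and the `warden_meta` layout are all assumptions made for the example; the slides do not give the actual restricted profile.

```python
import re

# Assumed whitelist and order; the real restricted profile is not given.
ALLOWED_ORDER = ["Date", "Server", "Last-Modified", "Accept-Ranges",
                 "Content-Length", "Connection", "Content-Type"]
HEADER_RE = re.compile(r"^([A-Za-z-]+): (.+)$")

def check_response(head: str, warden_meta: dict) -> list:
    """Return a list of violations for one response head."""
    lines = head.strip().split("\r\n")
    problems = []
    if lines[0] != "HTTP/1.1 200 OK":
        problems.append("unexpected status line")
    names, values = [], {}
    for line in lines[1:]:
        m = HEADER_RE.match(line)
        if not m:
            problems.append(f"malformed header: {line!r}")
            continue
        names.append(m.group(1))
        values[m.group(1)] = m.group(2)
    # Syntactic: only pre-specified headers, at most once each, in fixed order.
    if names != [h for h in ALLOWED_ORDER if h in names]:
        problems.append("unexpected, repeated, or misordered header")
    # Semantic: metadata must match what the Warden recorded for the object.
    for field in ("Content-Length", "Last-Modified"):
        if field in warden_meta and values.get(field) != str(warden_meta[field]):
            problems.append(f"{field} does not match Warden metadata")
    return problems
```

Applied to the example response from the previous slide, a check like this would flag both the altered Content-Length and the extra Credit-Card-Num header.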

Unstructured Carriers

- Attackers can also encode information in network order or timing
  - Correlate request/response pairs to enforce ordering
  - Actively alter timing behavior by delaying server responses
  - Model server response behavior and block deviations
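
A minimal sketch of the first two ideas, under stated assumptions: `release_time` delays each response to the next multiple of a fixed quantum (one possible timing policy, not necessarily the one Glavlit uses), and `OrderEnforcer` forwards responses only in the order their requests were seen. Both names and the request-id scheme are hypothetical.

```python
import math
from collections import deque

def release_time(ready_at: float, quantum: float = 0.25) -> float:
    """Delay a response to the next multiple of `quantum` seconds."""
    return math.ceil(ready_at / quantum) * quantum

class OrderEnforcer:
    """Releases responses only in the order their requests arrived."""

    def __init__(self):
        self.expected = deque()   # request ids, oldest first
        self.held = {}            # responses that arrived ahead of their turn

    def request(self, req_id):
        self.expected.append(req_id)

    def response(self, req_id, body):
        """Return the list of responses that may be forwarded now."""
        self.held[req_id] = body
        out = []
        while self.expected and self.expected[0] in self.held:
            out.append(self.held.pop(self.expected.popleft()))
        return out
```

Quantizing release times removes fine-grained timing differences at the cost of added delay, and a response with no matching request is never forwarded by this enforcer.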


Evaluation Setup

- How fast is Glavlit verification relative to:
  - Direct connection
  - Linux software bridge
  - Glavlit Guard with verification off
    - No hashing or protocol parsing
    - TCP reassembly and packet forwarding only

[Figure: Apache 2.2.2 Web Server, Gigabit Ethernet, Linux Host Running Guard at the Network Boundary, Gigabit Ethernet, Custom HTTP Client.]

System Throughput

[Figure: system throughput plot.]

Evaluation Discussion

- Guard and Web server both pay the price for more connections on small files
- Per-connection overhead reduces performance for small files (~50%)
  1) Parsing
  2) TCP Connection/Stream/State Allocation
  3) pcap and libnet kernel switching overhead
- For common Web files (~10KB+), performance is comparable to direct connect and Linux kernel bridge
- Total request latency NOT affected

Conclusions

- Content control prevents information that is not explicitly allowed from exiting
  - Prevents inadvertent disclosure
- Protocol Channel Mitigation prevents many channels and limits others
  - Raises the bar for attackers wanting to steal valuable data
- Performance overhead acceptable in untuned prototype
- FIRST system to actively limit application layer covert channels
Thank you
QUESTIONS?
Author Contact Info
{nschear, ckintana, qzhang, vahdat}@cs.ucsd.edu

Guard CPU Usage

[Figure: Guard CPU usage plot.]

Guard No-Verify CPU Usage

[Figure: Guard CPU usage plot with verification off.]

Verifying Dynamically Generated Content

- Goal: Leverage static content verification as much as possible
  - Rolling checksum (a la rsync)
  - Rabin fingerprints for variable-sized chunks
  - High-speed analysis engine for mismatch regions
  - Self-describing templates
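
For readers unfamiliar with the rsync-style rolling checksum named above, here is a small sketch of the idea: a weak checksum over a fixed-size window that can be updated in constant time as the window slides by one byte. The function names are illustrative, and this shows the general technique only, not how Glavlit would apply it to dynamic content.

```python
M = 1 << 16   # modulus used by the rsync-style weak checksum

def weak_checksum(block: bytes):
    """Compute the (a, b) pair for a block from scratch."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b

def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide a window of length n one byte to the right in O(1)."""
    a2 = (a - out_byte + in_byte) % M
    b2 = (b - n * out_byte + a2) % M
    return a2, b2
```

Because the rolled value equals the value recomputed from scratch at every offset, a verifier could search a dynamic page for previously vetted blocks at any byte alignment and then confirm candidates with a strong hash.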


Related Work

- Content Control
  - Commercial Solutions (Entrust, Fidelis, Vontu, PortAuthority)
- Covert Channels
  - Web Tap, Eraser, Infranet
  - Detection of Layer 3 and 4 Channels (NUSHU, Loki, etc.)
    - Murdoch et al., Fisk et al., Tumoian et al.
- Vetting Review Tools
  - Wetstone StegoSuite
  - Los Alamos National Lab - File Scrub

Future Work

- Dynamic Content
  - Fuzzy fingerprint matching
  - Self-describing Web language (JWig)
- Support More Protocols
  - SMTP, IM, etc.
- SSL Traffic Support
- More tuning for better performance
  - Possible hardware acceleration?