The Devil and Packet Trace Anonymization

Authors: Ruoming Pang, Mark Allman, Vern Paxson, Jason Lee
Princeton University; International Computer Science Institute; Lawrence Berkeley National Laboratory (LBNL)
Publication: Computer Communication Review, January 2006.
Presenter: Radha V. Maldhure
AGENDA
ANONYMIZATION
PROBLEM WITH CURRENT TECHNIQUES
USE OF ANONYMIZATION
PAPER’S CONTENTS
METHODOLOGY
ANONYMIZATION POLICY
INFORMATION LOSS
VALIDATION
CONCLUSION
CONTRIBUTIONS
WEAKNESSES
SUGGESTIONS
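INTRODUCTION
[Diagram: data (e.g. packet traces) is released; a researcher uses the released data to improve or develop, while an attacker uses it to attack. In the second panel, anonymization is applied to the data before it is released to the researcher and the attacker.]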
ANONYMIZATION
o Releasing network measurement data to the research community
o Publishing traces requires a balance between the security needs of the organization and research usefulness
o Example: “tcpdpriv” removes TCP options from traces, which prevents physical fingerprinting but also removes their research value
[Diagram: balance between security needs and research usefulness]
PROBLEM WITH CURRENT TECHNIQUES
Existing publicly released traces have the following problems:
• No careful guidance on anonymization policy for public release
• No tool that adapts to a particular policy
• Example: NLANR’s PMA packet traces
USE OF ANONYMIZATION
Some uses of anonymized traces:
• Studying a web site's performance and availability
• Understanding the Internet’s structure and behavior
PAPER’S CONTENTS
o Arrives at an acceptable anonymization policy
o Presents a tool, “tcpmkpub”, that implements the suggested transformations
o Provides meta-data about each trace for analysis
METHODOLOGY
Precise method for anonymization
[Diagram labels: concerns for appearing traffic; purpose of transform; policy decisions; anonymization tool]
Example Specification
Specification of IP header anonymization: [figure not recovered]
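The idea of specifying anonymization field by field can be illustrated with a small sketch. This is not the specification shown on the slide or tcpmkpub's own syntax; the table `IP_HEADER_POLICY`, the actions `keep` and `zero`, and the function `apply_policy` are all hypothetical names, and the choice of action per field is only an example.

```python
# Hypothetical per-field policy for the fixed 20-byte IPv4 header.
# Each entry is (field name, size in bytes, action). The actions chosen
# here are illustrative and are NOT the paper's policy.

def keep(value: bytes) -> bytes:
    """Leave the field untouched."""
    return value

def zero(value: bytes) -> bytes:
    """Blank the field; stands in for a real transformation."""
    return bytes(len(value))

IP_HEADER_POLICY = [
    ("version_ihl", 1, keep),
    ("tos", 1, keep),
    ("total_length", 2, keep),
    ("identification", 2, keep),
    ("flags_fragment", 2, keep),
    ("ttl", 1, keep),
    ("protocol", 1, keep),
    ("checksum", 2, zero),   # placeholder: would be recomputed later
    ("src_addr", 4, zero),   # placeholder: would be remapped
    ("dst_addr", 4, zero),   # placeholder: would be remapped
]

def apply_policy(header: bytes, policy=IP_HEADER_POLICY) -> bytes:
    """Walk the header field by field and apply each field's action."""
    expected = sum(size for _name, size, _action in policy)
    if len(header) != expected:
        raise ValueError(f"expected {expected} header bytes, got {len(header)}")
    out = bytearray()
    offset = 0
    for _name, size, action in policy:
        out += action(header[offset:offset + size])
        offset += size
    return bytes(out)
```

Keeping the policy in a table separates the policy decisions from the code that walks the packet, so a different site could swap in its own actions without touching `apply_policy`.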
ANONYMIZATION POLICY
• Focuses on traces that include only packet headers
• Presents one possible policy, not a definitively correct policy
• It is crucial to prevent users of the trace files from determining:
  o identities of specific hosts
  o identities of internal hosts, such that a map could be constructed of which hosts support which services
  o security practices of the organization
Protocol Stack
Application Layer: FTP / Telnet / SNMP / DNS
Transport Layer: TCP / UDP
Internet Layer: IP / ARP / ICMP / IGMP
Network Interface Layer: Ethernet / ATM / FR
CHECKSUMS
Reason to anonymize:
Checksums in the traces are re-calculated for two reasons:
• The original checksum gives away information about the content of the data even when the application data is removed
• Researchers should still be able to determine whether the original checksum was valid
Way to anonymize: original checksum Co, calculated checksum Cc
• Replace Co by Cc
• Insert “1” into the appropriate checksum field to mark a packet as having failed its checksum
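The checksum rule above can be sketched for the IPv4 header checksum. This is a minimal illustration under stated assumptions, not tcpmkpub's code: it assumes that Cc is computed over the anonymized header and is written only when Co was valid, and the names `internet_checksum`, `rewrite_ip_checksum`, and `BAD_CHECKSUM_MARK` are hypothetical. It does not handle the case where a valid recomputed checksum happens to equal the marker value.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

BAD_CHECKSUM_MARK = 1  # value written when the original checksum failed

def _zero_checksum_field(header: bytes) -> bytes:
    """IPv4 header with its checksum field (bytes 10-11) set to zero."""
    return header[:10] + b"\x00\x00" + header[12:]

def rewrite_ip_checksum(original: bytes, anonymized: bytes) -> bytes:
    """Return the anonymized header with its checksum field rewritten."""
    co = int.from_bytes(original[10:12], "big")
    original_valid = internet_checksum(_zero_checksum_field(original)) == co
    if original_valid:
        # Assumption: Cc is the checksum of the anonymized header.
        new_value = internet_checksum(_zero_checksum_field(anonymized))
    else:
        new_value = BAD_CHECKSUM_MARK
    return anonymized[:10] + new_value.to_bytes(2, "big") + anonymized[12:]
```

A reader of the released trace can then recompute the checksum of each header: a match means the original checksum was valid, and the marker value means it was not.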
NETWORK INTERFACE LAYER: Ethernet Address
Reason to anonymize:
• Ethernet addresses are distinct to individual NICs
• Can be used by an attacker to uncover the actions of a given user
Way to anonymize:
• Three different methods of randomizing Ethernet addresses:
  o Scrambling the entire 6-byte address
  o Scrambling only the lower 3 bytes of the address
  o Scrambling the lower 3 and upper 3 bytes independently
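The three scrambling options can be sketched as follows. The slide does not say how the scrambling is done, so a truncated keyed HMAC is swapped in here as a stand-in; the function names and the secret `key` are hypothetical. A truncated hash is not guaranteed to be one-to-one, and this sketch does not treat broadcast or multicast addresses specially.

```python
import hashlib
import hmac

def _scramble(part: bytes, key: bytes, label: bytes) -> bytes:
    """Keyed pseudorandom replacement of `part` with the same length.

    Stand-in technique: truncated HMAC-SHA256. Collisions are possible,
    so this is not a true permutation of the address space.
    """
    digest = hmac.new(key, label + part, hashlib.sha256).digest()
    return digest[:len(part)]

def scramble_whole(mac: bytes, key: bytes) -> bytes:
    """Option 1: scramble the entire 6-byte address."""
    return _scramble(mac, key, b"whole")

def scramble_lower(mac: bytes, key: bytes) -> bytes:
    """Option 2: keep the upper 3 bytes, scramble only the lower 3."""
    return mac[:3] + _scramble(mac[3:], key, b"lower")

def scramble_halves(mac: bytes, key: bytes) -> bytes:
    """Option 3: scramble the upper and lower 3 bytes independently."""
    return _scramble(mac[:3], key, b"upper") + _scramble(mac[3:], key, b"lower")
```

The options trade privacy against research value: the second exposes the upper 3 bytes of every address unchanged, while the third hides them but still maps addresses that share an upper half to outputs that share an upper half.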
INTERNET LAYER: IP Address
Reason to anonymize:
• An attacker who knows a user’s IP address can obtain an accounting of that user’s activities
• An attacker can plan an attack using information about the services running on the host
Way to anonymize:
• Remap addresses differently based on the type of address
• Multicast addresses are preserved in the anonymized trace
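Remapping by address type can be sketched as below. Only the multicast rule comes from the slide; the split into internal and external addresses, the `internal_net` parameter, the `key`, and the use of a truncated keyed HMAC are assumptions made for illustration. The stand-in mapping does not preserve prefixes, can collide, and can produce addresses in special ranges.

```python
import hashlib
import hmac
import ipaddress

def remap_ipv4(addr: str, key: bytes, internal_net: str = "10.0.0.0/8") -> str:
    """Remap one IPv4 address according to its type (illustrative only)."""
    ip = ipaddress.IPv4Address(addr)
    if ip.is_multicast:
        # Multicast addresses are preserved in the anonymized trace.
        return str(ip)
    # Assumption: internal and external addresses get separate mappings.
    if ip in ipaddress.IPv4Network(internal_net):
        kind = b"internal"
    else:
        kind = b"external"
    digest = hmac.new(key, kind + ip.packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))
```

Because the mapping is keyed and deterministic, the same host maps to the same pseudonym throughout a trace, which keeps per-host traffic patterns analyzable without exposing the real address.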
TRANSPORT LAYER: TCP/UDP
Reason to anonymize: Not given
Way to anonymize:
• Preserves port numbers and sequence numbers, but not timestamps
• Timestamps are transformed into separate monotonically increasing counters
• Research use: the uniqueness and transmission order of segments are preserved
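Turning timestamps into counters can be sketched as a two-pass renumbering. The slide does not say what "separate" refers to; this sketch assumes one counter per sending host. The function name and the input format are hypothetical, and 32-bit timestamp wraparound is ignored.

```python
from collections import defaultdict

def renumber_timestamps(packets):
    """Replace TCP timestamp values with per-host counters.

    packets: list of (source_host, timestamp_value) in trace order.
    Returns the same list with each value replaced by its rank among
    the distinct values seen from that host (1, 2, 3, ...).
    """
    # Pass 1: collect the distinct timestamp values for each host.
    seen = defaultdict(set)
    for host, ts in packets:
        seen[host].add(ts)
    # Rank the values: equal timestamps share a counter value, and
    # larger timestamps get larger counter values.
    rank = {
        host: {ts: i + 1 for i, ts in enumerate(sorted(values))}
        for host, values in seen.items()
    }
    # Pass 2: rewrite every packet.
    return [(host, rank[host][ts]) for host, ts in packets]
```

The counters keep uniqueness and relative order but drop the actual clock values and the size of the gaps between them.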
INFORMATION LOSS
• The effectiveness in preserving information is checked by analyzing the original and anonymized traces
• Two tools for analysis: “tcpsum” and “p0f”
• tcpsum: crunches each TCP connection in the trace; used to find the number of packets and bytes sent in each direction
  o Except for IP addresses, the results of crunching the original and transformed traces matched
  o No value lost in the transformation
• p0f: the presenter could not follow what the authors tried to explain
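The per-connection comparison described for tcpsum can be sketched as follows. This is not tcpsum itself; the function `summarize` and the tuple layout of the input are hypothetical. It counts packets and bytes in each direction per connection and leaves IP addresses out of the summary, so an original and an anonymized trace can be compared directly.

```python
from collections import Counter

def summarize(packets):
    """Summarize connections while ignoring IP addresses.

    packets: iterable of (src, sport, dst, dport, payload_len).
    Returns a Counter of tuples
    (sport, dport, pkts_fwd, bytes_fwd, pkts_rev, bytes_rev),
    where "forward" is the direction of the first packet seen.
    """
    conns = {}
    for src, sport, dst, dport, length in packets:
        forward = (src, sport, dst, dport)
        reverse = (dst, dport, src, sport)
        if reverse in conns:
            stats = conns[reverse]
            stats[2] += 1
            stats[3] += length
        else:
            stats = conns.setdefault(forward, [0, 0, 0, 0])
            stats[0] += 1
            stats[1] += length
    return Counter((key[1], key[3], *stats) for key, stats in conns.items())
```

If `summarize(original) == summarize(anonymized)` holds, the transformation changed nothing that this summary measures.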
VALIDATION
• Need to validate that the information intended to be masked was indeed transformed or left out of the anonymized trace
• Two ad hoc validations:
  o Inspected the log created by “tcpmkpub”, which flags all unexpected aspects of a packet trace
  o Used “ipsumdump” to dump TCP options: picked the timestamps, sorted and verified them; timestamp re-numbering appears accurate
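One way a check of the timestamp re-numbering could look is sketched below. It is not the authors' procedure: it assumes access to pairs of original and renumbered values for a single host, and the function name is hypothetical.

```python
def renumbering_preserves_order(pairs):
    """Check a timestamp renumbering for one host.

    pairs: iterable of (original_ts, renumbered_ts).
    Returns True when every original value maps to exactly one counter
    value and larger originals map to strictly larger counters.
    """
    mapping = {}
    for original, renumbered in pairs:
        if mapping.setdefault(original, renumbered) != renumbered:
            return False  # one original value mapped to two counters
    ordered = [mapping[original] for original in sorted(mapping)]
    return all(a < b for a, b in zip(ordered, ordered[1:]))
```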
CONTRIBUTIONS
• Enumerated and explored the devil-ish details in preparing packet traces
• Developed a framework for implementing an anonymization policy, along with the tool “tcpmkpub”
• Sets a framework for future work on packet trace anonymization
WEAKNESSES
• No timing information for analyzing TCP dynamics
• Preserving port numbers may lead to identification of a particular machine
• No performance analysis
SUGGESTIONS
• Needs to deal with different protocols at each layer of the protocol stack
• Should present a performance analysis that indicates the tool’s efficiency in terms of:
  o maintaining security needs
  o preserving research value
QUESTIONS?