Harnessing the Most to find the Least

Hilarie Orman
Purple Streak, Inc.
And consulting to
PnP Networks
Eitan Fenson, Rich Howard, Phil Straw
Collaboration for the Common Good
• People like to donate CPU cycles
– Breaking a cipher (DES)
– Factoring large numbers
– Data sifting for extraterrestrial intelligence
• People like to protect their computers
– Viruses
– Trojan Horses
• People should like to donate CPU cycles to search
for secure application software configurations
• Disclaimer: we haven’t started this yet!
The Search for Extraterrestrial Security Configurations
• SETI@home uses thousands of volunteer computers to
mine astrophysical signal data
• Easy to sign up and get an assignment
• Can we use this approach to discover how to
securely configure our computers?
Least Privilege
• No more capability than is necessary to get the job
done
• Classic failures surround Unix and root privileges
• Examples:
– File permissions: read but not write
– Temporary files: readable by owner only
– Subjobs only if content and application are trusted
• A multi-dimensional min/max problem
– Too little privilege → too little functionality
– Too much privilege → too little security
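
The min/max trade-off above can be illustrated with a minimal sketch. It assumes a hypothetical oracle `job_works` that reports whether the application still functions under a given privilege set; the privilege names are invented for illustration. Greedy removal yields a set from which nothing more can be dropped, which is not necessarily the smallest possible set.

```python
from typing import Callable, FrozenSet, Set


def minimize_privileges(
    granted: Set[str],
    job_works: Callable[[FrozenSet[str]], bool],
) -> FrozenSet[str]:
    """Greedily drop privileges one at a time while the job still works."""
    current = set(granted)
    for priv in sorted(granted):  # fixed order keeps the result reproducible
        trial = frozenset(current - {priv})
        if job_works(trial):
            current = set(trial)  # the privilege was not needed
    return frozenset(current)


# Hypothetical application: it needs exactly these two privileges.
REQUIRED = {"read:/etc/app.conf", "write:/tmp"}


def job_works(privs: FrozenSet[str]) -> bool:
    return REQUIRED <= privs


granted = {"read:/etc/app.conf", "write:/tmp", "write:/etc", "net:connect"}
minimal = minimize_privileges(granted, job_works)
print(sorted(minimal))
```

In practice the oracle is the hard part: "the job still works" has to be judged from observed behavior rather than a known requirement set.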
How to Rank Privileges?
• Strict ordering: Administrator trumps user*, root
trumps user*
• Subsets: (read, write) trumps (read)
• Set size: (execute, *) trumps (execute, /bin)
• Visibility: writing to the network trumps writing to
hard drive
• Information flow: “create executable with A
permissions” and “A permissions allow network
server connections” leads to “proprietary data
release”
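
The subset and set-size rankings above amount to a partial order: some privilege sets are simply incomparable. A minimal sketch follows, assuming a hypothetical encoding in which a privilege set maps an action to a path scope and "*" means any path. The names `scope_covers`, `covers`, and `trumps` are illustrative only.

```python
from typing import Dict

# Hypothetical encoding: action -> scope, where "*" means "any path".
Privilege = Dict[str, str]


def scope_covers(big: str, small: str) -> bool:
    """True if scope `big` includes everything in scope `small`."""
    if big == "*":
        return True
    if small == "*":
        return False
    big = big.rstrip("/")
    return small == big or small.startswith(big + "/")


def covers(a: Privilege, b: Privilege) -> bool:
    """True if `a` grants every action in `b` over at least the same scope."""
    return all(
        action in a and scope_covers(a[action], scope)
        for action, scope in b.items()
    )


def trumps(a: Privilege, b: Privilege) -> bool:
    """Strictly more privilege: `a` covers `b`, but `b` does not cover `a`."""
    return covers(a, b) and not covers(b, a)


print(trumps({"read": "*", "write": "*"}, {"read": "*"}))  # subsets
print(trumps({"execute": "*"}, {"execute": "/bin"}))       # set size
print(trumps({"read": "*"}, {"write": "*"}))               # incomparable
```

Visibility and information-flow rankings do not fit this simple containment test; they would need extra rules on top of it.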
Negative Information
• If the privilege levels are too high, what goes
wrong?
– Privilege escalation
– Unauthorized information use
– Resource misappropriation
• Detection methods:
– Virus scanning
– Intrusion detection software
– Environment monitoring (storage side-effects)
– Execution monitoring (writing files in system areas,
network access, etc.)
– Anything unusual
Learning from Event Records
• Collect application privilege information
– Configuration files, registry settings, observed usage
• Collect monitored data
– Watchers monitor task lists, new files, network
connections, etc.
• Anonymize and index it
• Learning
– Cluster
– Min/max
• Distribute recommended privileges for common
usage patterns
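
The learning step above can be sketched very roughly. This is an assumption-laden illustration, not the proposed system: grouping by a pre-assigned usage pattern stands in for real clustering, the record fields are hypothetical, and "flagged" stands for any of the detection methods having fired. The recommendation is the smallest privilege set that covers all unflagged observed usage.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical event records contributed by volunteer machines.
records = [
    {"app": "editor", "pattern": "local-edit",
     "used": {"read:home", "write:home"}, "flagged": False},
    {"app": "editor", "pattern": "local-edit",
     "used": {"read:home"}, "flagged": False},
    {"app": "editor", "pattern": "local-edit",
     "used": {"read:home", "net:connect"}, "flagged": True},
]


def recommend(event_records: List[dict]) -> Dict[Tuple[str, str], Set[str]]:
    """Union the privileges seen in clean records, per (app, usage pattern)."""
    recommended: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for rec in event_records:
        if rec["flagged"]:
            continue  # negative information: do not learn from bad runs
        recommended[(rec["app"], rec["pattern"])] |= rec["used"]
    return dict(recommended)


print(recommend(records))
```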
Large-Scale Architecture
• Distributed P2P database: Volunteer machines
contribute their own, anonymized event records
• Higher tier of P2P “Planners” develop data mining
tasks and assign them to volunteers
• Volunteers retrieve required database records and
crunch the data
• Higher tier analyzes results and finds optimal
configuration sets
• Publish results on webpage or in P2P system
Collaborative Black-box Execution Monitoring
Application Name
Version
Resource
Parameters
Action/Event
Result
Summary
Anonymized ID
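
One possible shape for the record above, as a sketch only. The field names follow the slide; the types and the use of a keyed hash (HMAC-SHA256 with a secret that never leaves the machine) to derive the anonymized ID are assumptions, and the sample values are invented.

```python
import hashlib
import hmac
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EventRecord:
    application_name: str
    version: str
    resource: str
    parameters: str
    action: str          # the "Action/Event" field
    result: str
    summary: str
    anonymized_id: str


def anonymize(machine_id: str, local_secret: bytes) -> str:
    """Derive a stable pseudonym; the secret stays on the local machine."""
    digest = hmac.new(local_secret, machine_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical observation from a watcher process.
record = EventRecord(
    application_name="editor",
    version="1.0",
    resource="/tmp/scratch",
    parameters="mode=0600",
    action="create-file",
    result="ok",
    summary="temp file readable by owner only",
    anonymized_id=anonymize("machine-1234", b"local-secret"),
)
print(asdict(record))
```

A keyed hash keeps IDs consistent across records from one machine without exposing the machine identifier. Resource paths and parameters may still leak identifying details and would need their own scrubbing.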
Upper Tier Analysis and Computation Plan
Cluster Analysis of Summaries
Assignments for parcels of
machine learning from
database portions
Distributed Learning
Fetch database records
Work assignment
Database pieces
Algorithm
Report station
Learned quantum
Application
Profile
Work assignment
Database pieces
Algorithm
Report station
Parameter/value
Resource/value
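
The flow above (work assignment, database pieces, algorithm, report station) can be sketched in a single process. Everything here is an assumption for illustration: the "algorithm" is plain counting of (parameter or resource, value) pairs, a "learned quantum" is one volunteer's partial count, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Record = Tuple[str, str]  # (parameter or resource, value)


def make_assignments(db: List[Record], n_volunteers: int) -> List[List[Record]]:
    """Planner: split the database into one piece per volunteer."""
    return [db[i::n_volunteers] for i in range(n_volunteers)]


def volunteer_learn(piece: List[Record]) -> Counter:
    """Volunteer: run the assigned algorithm on its database piece."""
    return Counter(piece)


def merge_quanta(quanta: List[Counter]) -> Counter:
    """Report station: combine the learned quanta into one profile."""
    profile: Counter = Counter()
    for quantum in quanta:
        profile.update(quantum)
    return profile


database = [
    ("tmp_file_mode", "0600"),
    ("tmp_file_mode", "0600"),
    ("tmp_file_mode", "0644"),
    ("network", "none"),
    ("network", "none"),
]
pieces = make_assignments(database, n_volunteers=2)
profile = merge_quanta([volunteer_learn(p) for p in pieces])
print(profile.most_common())
```

Counting merges cleanly because addition is associative; a real learning algorithm would need a comparable way to combine partial results.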
Research Questions
• Can we get enough information from
configurations and monitoring to do this?
– Fine-grained (system call) monitoring necessary?
– Is there enough “ground truth” to learn?
• Will the learning algorithms find useful optimal
points?
• Can we distribute the learning algorithm over
thousands of machines? Will the resulting traffic
create hot spots?
• Are the learning algorithms vulnerable to
manipulation?
What Other Uses?
• Grass roots health: information, trends, treatments,
outcomes
• Whole world online realtime mapping project:
coordinated GPS, webcams, photos
• Genealogy through DNA matching (be careful about
what you wish for!)