Transcript lecture_18

Web Security
Privacy
CS 136
Computer Security
Peter Reiher
December 1, 2011
CS 136, Fall 2011
Lecture 18
Page 1
Web Security
• Lots of Internet traffic is related to the
web
• Much of it is financial in nature
• Also lots of private information flow
around web applications
• An obvious target for attackers
The Web Security Problem
• Many users interact with many servers
• Most parties have little other relationship
• Increasingly complex things are moved via the
web
• No central authority
• Many developers with little security experience
• Many critical elements originally designed with no
thought to security
• Sort of a microcosm of the overall security
problem
Aspects of the Web Problem
Who Are We Protecting?
• The server
– From the client
• The client
– From the server
• The clients
– From each other
What Are We Protecting?
• The client’s private data
• The server’s private data
• The integrity (maybe secrecy) of their
transactions
• The client and server’s machines
• Possibly server availability
– For particular clients?
Some Real Threats
• Buffer overflows and other
compromises
– Client attacks server
• SQL injection
– Client attacks server
• Malicious downloaded code
– Server attacks client
More Threats
• Cross-site scripting
– Clients attack each other
• Threats based on non-transactional
nature of communication
– Client attacks server
• Denial of service attacks
– Threats on server availability
(usually)
Compromise Threats
• Much the same as for any other
network application
• Web server might have buffer
overflow
– Or other remotely usable flaw
• Not different in character from any
other application’s problem
– And similar solutions
What Makes It Worse
• Web servers are complex
• They often also run supporting code
– Which is often user-visible
• Large, complex code base is likely to
contain such flaws
• Nature of application demands
allowing remote use
Solution Approaches
• Patching
• Use good code base
• Minimize code that the server executes
• Maybe restrict server access
– When that makes sense
• Lots of testing and evaluation
– Many tools for web server evaluation
SQL Injection Attacks
• Many web servers have backing
databases
– Much of their information stored in
database
• Web pages are built (in part) based on
queries to database
– Possibly using some client input . . .
SQL Injection Mechanics
• Server plans to build a SQL query
• Needs some data from client to build it
– E.g., client’s user name
• Server asks client for data
• Client, instead, provides a SQL fragment
• Server inserts it into planned query
– Leading to a “somewhat different” query
An Example
"select * from mysql.user
where username = '" . $uid . "' and
password = password('" . $pwd . "');"
• Intent is that user fills in his ID and
password
• What if he fills in something else?
' or 1=1; -- '
What Happens Then?
• $uid has the string substituted, yielding
"select * from mysql.user
where username = '' or 1=1; -- '' and
password = password('" . $pwd . "');"
• The where clause evaluates to true for every row
– Since 1 does indeed equal 1
– And -- comments out rest of line
• If script uses truth of statement to determine
valid login, attacker has logged in
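The substitution above can be reproduced in a minimal sketch using Python's standard sqlite3 module. The users table, the vulnerable_login function, and the sample credentials are assumptions made for illustration only. The password() call from the slide is left out because SQLite does not provide it.

```python
import sqlite3

# Hypothetical in-memory table standing in for the slide's user table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'correct-horse')")


def vulnerable_login(uid, pwd):
    # Unsafe on purpose: client input is pasted directly into the query text.
    query = (
        "SELECT * FROM users WHERE username = '" + uid
        + "' AND password = '" + pwd + "'"
    )
    # Any returned row is treated as a successful login.
    return len(conn.execute(query).fetchall()) > 0


# A wrong password is rejected, as intended.
honest_failure = vulnerable_login("alice", "wrong-password")
# The injected fragment turns the condition into "... or 1=1" and comments
# out the password check, so every row matches.
injected_login = vulnerable_login("' or 1=1; -- ", "anything")
```

In this sketch honest_failure is False while injected_login is True, even though the attacker supplied no valid user name or password.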
Basis of SQL Injection Problem
• Unvalidated input
• Server expected plain data
• Got back SQL commands
• Didn’t recognize the difference and went ahead
• Resulting in arbitrary SQL query being sent
to its database
– With its privileges
Solution Approaches
• Carefully examine all input
– To filter out injected SQL
• Use database access controls
– Of limited value
• Randomization of SQL keywords
– Making injected SQL meaningless
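A rough sketch of the first approach, examining all input, follows. It also uses bound query parameters, which the slide does not name. They are included here as a widely used complementary technique, not as the lecture's own recommendation. The users table, the USERNAME_PATTERN policy, and the checked_login function are hypothetical, and passwords are stored in plain text only to keep the sketch short.

```python
import re
import sqlite3

# Hypothetical in-memory table; plain-text passwords are a simplification.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'correct-horse')")

# Assumed policy: user names are 1-32 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,32}")


def checked_login(uid, pwd):
    # Examine the input first and reject anything outside the allow-list.
    if USERNAME_PATTERN.fullmatch(uid) is None:
        return False
    # Placeholders keep the values as data; they are never parsed as SQL.
    rows = conn.execute(
        "SELECT 1 FROM users WHERE username = ? AND password = ?",
        (uid, pwd),
    ).fetchall()
    return len(rows) > 0
```

With this version, the fragment ' or 1=1; -- ' fails the allow-list check when sent as a user name and is compared as an ordinary string when sent as a password.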
Malicious Downloaded Code
• The web relies heavily on downloaded code
– Full languages and scripting languages
– Mostly scripts
• Instructions downloaded from server to
client
– Run by client on his machine
– Using his privileges
• Without defense, script could do anything
Types of Downloaded Code
• Java
– Full programming language
• Scripting languages
– JavaScript
– VBScript
– ECMAScript
– XSLT
Solution Approaches
• Disable scripts
– Not very popular
• Use secure scripting languages
– Also not popular
– Particularly with code writers
• Isolation mechanisms
– VM or application-based
• Vista mandatory access control
Cross-Site Scripting
• XSS
• Many sites allow users to upload information
– Blogs, photo sharing, Facebook, etc.
– Which gets permanently stored
– And displayed
• Attack based on uploading a script
• Other users inadvertently download it
– And run it . . .
The Effect of XSS
• Arbitrary malicious script executes on
user’s machine
• In context of his web browser
– At best, runs with privileges of the
site storing the script
– Often likely to run at full user
privileges
Why Is XSS Common?
• Use of scripting languages widespread
– For legitimate purposes
• Most users leave them enabled in
browser
• Only a question of getting user to run
your script
– Often only requires fetching URL
Typical Effects of XSS Attack
• Most commonly used to steal personal
information
– That is available to legit web site
– User IDs, passwords, credit card
numbers, etc.
• Such information often stored in
cookies at client side
Solution Approaches
• Don’t allow uploading of scripts
– Usually by carefully analyzing
uploaded data
• Provide some form of protection in
browser
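One common server-side way to keep uploaded scripts from running is to escape uploaded text before displaying it, so the browser shows any markup as text and does not execute it. The sketch below uses html.escape from Python's standard library. The render_comment function and the sample upload are hypothetical, and escaping is only one possible form of the careful analysis the slide mentions.

```python
import html


def render_comment(uploaded_text):
    # Escape <, >, &, and quotes so uploaded markup is shown, not executed.
    return "<p>" + html.escape(uploaded_text, quote=True) + "</p>"


# A hypothetical malicious upload that tries to read the viewer's cookies.
rendered = render_comment("<script>steal(document.cookie)</script>")
```

After escaping, rendered contains &lt;script&gt; in place of a live script tag.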
Exploiting Statelessness
• HTTP is designed to be stateless
• But many useful web interactions are
stateful
• Various tricks used to achieve statefulness
– Usually requiring programmers to
provide the state
– Often trying to minimize work for the
server
A Simple Example
• Web sites are set up as graphs of links
• You start at some predefined point
– A top level page, e.g.
• And you traverse links to get to other pages
• But HTTP doesn’t “keep track” of where
you’ve been
– Each request is simply the name of a link
Why Is That a Problem?
• What if there are unlinked pages on the
server?
• Should a user be able to reach those
merely by naming them?
• Is that what the site designers
intended?
A Concrete Example
• The ApplyYourself system
• Used by colleges to handle student
applications
• For example, by Harvard Business
School in 2005
• Once all admissions decisions made,
results available to students
What Went Wrong?
• Pages representing results were created as
decisions were made
• Stored on the web server
– But not linked to anything, since results
not yet released
• Some applicants figured out how to craft
URLs to access their pages
– Finding out early if they were admitted
The Core Problem
• No protocol memory of what came before
• So no protocol way to determine that
response matches request
• Could be built into the application that
handles requests
• But frequently isn’t
– Or is wrong
Solution Approaches
• Get better programmers
– Or better programming tools
• Back end system that maintains and
compares state
• Front end program that observes
requests and responses
– Producing state as a result
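A back end that maintains and compares state might look roughly like the sketch below, applied to the unlinked results pages discussed earlier. Everything here is hypothetical (the DecisionPage record, the PAGES store, the URL paths, and the handle_request function). It is not a description of how ApplyYourself worked or was fixed. The point is only that the server consults its own state on every request and does not assume an unlinked page is unreachable.

```python
from dataclasses import dataclass


@dataclass
class DecisionPage:
    applicant_id: str
    body: str
    released: bool = False  # set only when results are officially published


# Hypothetical store of result pages, keyed by URL path.
PAGES = {
    "/results/1001": DecisionPage("1001", "Admitted"),
}


def handle_request(path, session_applicant_id):
    """Return (status, body) using server-side state, not link structure."""
    page = PAGES.get(path)
    if page is None:
        return 404, "Not found"
    # The page exists, but existing on the server is not permission to read it.
    if not page.released or page.applicant_id != session_applicant_id:
        return 403, "Forbidden"
    return 200, page.body
```

Until released is set, even the right applicant naming the right URL gets a 403, which is the check the unlinked pages lacked.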
Conclusion
• Web security problems not inherently
different from general software security
• But generality, power, ubiquity of the
web make them especially important
• Like many other security problems,
constrained by legacy issues
Privacy
• Data privacy issues
• Network privacy issues
• Some privacy solutions
What Is Privacy?
• The ability to keep certain information
secret
• Usually one’s own information
• But also information that is “in your
custody”
• Includes ongoing information about
what you’re doing
Privacy and Computers
• Much sensitive information currently
kept on computers
– Which are increasingly networked
• Often stored in large databases
– Huge repositories of privacy time
bombs
• We don’t know where our information
is
Privacy and Our Network Operations
• Lots of stuff goes on over the Internet
– Banking and other commerce
– Health care
– Romance and sex
– Family issues
– Personal identity information
• We used to regard this stuff as private
– Is it private any more?
Threats to Computer Privacy
• Cleartext transmission of data
• Poor security allows remote users to access
our data
• Sites we visit can save information on us
– Multiple sites can combine information
• Governmental snooping
• Location privacy
• Insider threats in various places
Some Specific Privacy Problems
• Poorly secured databases that are remotely
accessible
– Or are stored on hackable computers
• Data mining by companies we interact with
• Eavesdropping on network communications
by governments
• Insiders improperly accessing information
• Cell phone/mobile computer-based location
tracking
Data Privacy Issues
• My data is stored somewhere
– Can I control who can use it/see it?
• Can I even know who’s got it?
• How do I protect a set of private data?
– While still allowing some use?
• Will data mining divulge data “through
the back door”?
Personal Data
• Who owns data about you?
• What if it’s really personal data?
– Social security number, DoB, your DNA
record?
• What if it’s data someone gathered about
you?
– Your Google history or shopping records
– Does it matter how they got it?
Protecting Data Sets
• If my company has (legitimately) a
bunch of personal data,
• What can I/should I do to protect it?
– Given that I probably also need to
use it?
• If I fail, how do I know that?
– And what remedies do I have?
Options for Protecting Data
• Careful system design
• Limited access to the database
– Networked or otherwise
• Full logging and careful auditing
• Using only encrypted data
– Must it be decrypted?
– If so, how to protect the data and the
keys?
Data Mining and Privacy
• Data mining allows users to extract
models from databases
– Based on aggregated information
• Often data mining allowed when direct
extraction isn’t
• Unless handled carefully, attackers can
use mining to deduce record values
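A small sketch of how aggregate-only access can still leak a single record: two permitted aggregate queries that differ by one person reveal that person's value. The SALARIES table and the total_salary query interface are hypothetical, chosen only to illustrate the deduction.

```python
# Hypothetical salary table; only aggregate queries are meant to be allowed.
SALARIES = {"ann": 90000, "bob": 72000, "cat": 81000}


def total_salary(excluding=()):
    # An "aggregate-only" interface: it returns a sum, never a single record.
    return sum(v for name, v in SALARIES.items() if name not in excluding)


# Two permitted aggregate queries that differ by exactly one person...
everyone = total_salary()
everyone_but_bob = total_salary(excluding=("bob",))

# ...whose difference is that person's supposedly hidden value.
deduced_bob_salary = everyone - everyone_but_bob
```

Neither query returns an individual record, yet deduced_bob_salary equals Bob's stored salary.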
Insider Threats and Privacy
• Often insiders need access to private
data
– Under some circumstances
• But they might abuse that access
• How can we determine when they
misbehave?
• What can we do?
Network Privacy
• Mostly issues of preserving privacy of
data flowing through network
• Start with encryption
– With good encryption, data values
not readable
• So what’s the problem?
Traffic Analysis Problems
• Sometimes desirable to hide that
you’re talking to someone else
• That can be deduced even if the data
itself cannot
• How can you hide that?
– In the Internet of today?
Location Privacy
• Mobile devices often communicate
while on the move
• Often providing information about
their location
– Perhaps detailed information
– Maybe just hints
• This can be used to track our
movements
Implications of Location Privacy Problems
• Anyone with access to location data
can know where we go
• Allowing government surveillance
• Or a private detective following your
moves
• Or a maniac stalker figuring out where
to ambush you . . .
Some Privacy Solutions
• The Scott McNealy solution
– “Get over it.”
• Anonymizers
• Onion routing
• Privacy-preserving data mining
• Preserving location privacy
• Handling insider threats via optimistic
security
Anonymizers
• Network sites that accept requests of
various kinds from outsiders
• Then submit those requests
– Under their own or fake identity
• Responses returned to the original
requestor
• A NAT box is a poor man’s
anonymizer
The Problem With Anonymizers
• The entity running it knows who’s who
• Either can use that information himself
• Or can be fooled/compelled/hacked to
divulge it to others
• Generally not a reliable source of real
anonymity
Onion Routing
• Meant to handle issue of people
knowing who you’re talking to
• Basic idea is to conceal sources and
destinations
• By sending lots of crypto-protected
packets between lots of places
• Each packet goes through multiple
hops
A Little More Detail
• A group of nodes agree to be onion
routers
• Users obtain crypto keys for those
nodes
• Plan is that many users send many
packets through the onion routers
– Concealing who’s really talking
Sending an Onion-Routed Packet
• Encrypt the packet using the
destination’s key
• Wrap that with another packet to
another router
– Encrypted with that router’s key
• Iterate a bunch of times
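The wrapping steps above can be sketched as follows. This is a toy illustration, not a real onion routing implementation. The toy_cipher function is a deliberately simple and insecure XOR stream cipher standing in for real encryption, the keys are assumed to be shared symmetric secrets, and the names wrap and peel and the JSON layer format are all hypothetical.

```python
import hashlib
import json


def toy_cipher(key: bytes, data: bytes) -> bytes:
    # Toy XOR stream cipher (NOT secure); applying it twice with the same
    # key returns the original bytes, so it both encrypts and decrypts.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message: bytes, destination: str, route: list, keys: dict) -> bytes:
    # Innermost layer: the message encrypted with the destination's key.
    onion = toy_cipher(keys[destination], message)
    next_hop = destination
    # Add one layer per router, last router first, so the first router's
    # layer ends up on the outside.
    for router in reversed(route):
        layer = json.dumps({"next": next_hop, "payload": onion.hex()}).encode()
        onion = toy_cipher(keys[router], layer)
        next_hop = router
    return onion


def peel(onion: bytes, key: bytes):
    # A router removes its own layer and learns only the next hop.
    layer = json.loads(toy_cipher(key, onion).decode())
    return layer["next"], bytes.fromhex(layer["payload"])
```

Each router that calls peel with its own key sees only the name of the next hop and an opaque payload. Only the destination's key recovers the message itself.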
In Diagram Form
[Figure: source, destination, and onion routers]
What’s Really in the Packet
Delivering the Message
What’s Been Achieved?
• No unauthorized party read the message
• Nobody knows who sent the message
– Except the receiver
• Nobody knows who received the
message
– Except the sender
• Assuming you got it all right
Issues for Onion Routing
• Proper use of keys
• Traffic analysis
• Overheads
– Multiple hops
– Multiple encryptions
Privacy-Preserving Data Mining
• Allow users access to aggregate
statistics
• But don’t allow them to deduce
individual statistics
• How to stop that?
Approaches to Privacy for Data Mining
• Perturbation
– Add noise to sensitive value
• Blocking
– Don’t let aggregate query see sensitive
value
• Sampling
– Randomly sample only part of data
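The three approaches can be sketched on a toy data set as below. The RECORDS list, the opted_out flag used to decide which values are blocked, and the three function names are assumptions for illustration. Real systems choose noise levels and sample sizes much more carefully.

```python
import random

# Hypothetical records; "salary" is the sensitive field.
RECORDS = [
    {"name": "ann", "salary": 90000, "opted_out": False},
    {"name": "bob", "salary": 72000, "opted_out": True},
    {"name": "cat", "salary": 81000, "opted_out": False},
    {"name": "dan", "salary": 65000, "opted_out": False},
]


def perturbed_mean(records, noise_scale, rng):
    # Perturbation: add random noise to each sensitive value before averaging.
    noisy = [r["salary"] + rng.gauss(0, noise_scale) for r in records]
    return sum(noisy) / len(noisy)


def blocked_mean(records):
    # Blocking: the aggregate never sees values marked as off limits.
    visible = [r["salary"] for r in records if not r["opted_out"]]
    return sum(visible) / len(visible)


def sampled_mean(records, sample_size, rng):
    # Sampling: answer from a random subset rather than the full data.
    sample = rng.sample(records, sample_size)
    return sum(r["salary"] for r in sample) / len(sample)
```

A caller would pass a random.Random instance as rng, for example perturbed_mean(RECORDS, 500, random.Random()). Each function trades some accuracy in the aggregate for uncertainty about any individual value.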
Preserving Location Privacy
• Can we prevent people from knowing
where we are?
• Given that we carry mobile
communications devices
• And that we might want location-specific
services ourselves
Location-Tracking Services
• Services that get reports on our mobile
device’s position
– Probably sent from that device
• Often useful
– But sometimes we don’t want them
turned on
• So, turn them off then
But . . .
• What if we turn it off just before
entering a “sensitive area”?
• And turn it back on right after we
leave?
• Might someone deduce that we spent
the time in that area?
• Very probably
Handling Location Inferencing
• Need to obscure that a user probably
entered a particular area
• Can reduce update rate
– Reducing certainty of travel
• Or bundle together areas
– Increasing uncertainty of which was
entered
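Both ideas can be sketched in a few lines. The function names, the representation of position fixes as a list, and the grid cell size are assumptions for illustration. Snapping coordinates to a coarse grid is just one simple way to bundle areas together.

```python
def reduce_update_rate(fixes, keep_every):
    # Report only every keep_every-th position fix.
    return fixes[::keep_every]


def bundle_area(lat, lon, cell_degrees):
    # Snap a position to a coarse grid cell so that many distinct places
    # are reported as the same area.
    return (int(lat // cell_degrees), int(lon // cell_degrees))
```

With a cell size of 0.1 degrees, for example, two nearby positions such as (34.0689, -118.4452) and (34.0722, -118.4441) fall in the same cell, so an observer cannot tell which of them was visited.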
Conclusion
• Privacy is a difficult problem in
computer systems
• Good tools are lacking
– Or are expensive/cumbersome
• Hard to get cooperation of others
• Probably an area where legal
assistance is required