Protecting Against Web Application Injection with

Preventing Web Application Injections with Complementary Character Coding
Raymond Mui
Phyllis Frankl
Polytechnic Institute of NYU
Presented at ESORICS 2011
Supported by NSF and CATT; Patent Pending
Web Application Injection Attacks
• Malicious user inputs cause unintended
executions of commands
• Caused by improper input sanitization
• SQL injection and cross-site scripting rank
among top application security threats
(OWASP Top 10)
Example: Vulnerable PHP program
<?php
// Unsanitized user inputs
$message = $_POST['message'];
$username = $_POST['username'];
…
// welcome the user
if (isset($username)) { echo "Welcome $username <br />"; }
// insert new message
if (isset($message)) {
    $query = "insert into messages values('$username', '$message')";
    mysql_query($query);
}
…
// display all messages except the ones from admin
$query = "select * from messages where not (user = 'admin')";
$result = mysql_query($query);
echo '<br /><b>Your messages:</b><br/>';
while ($row = mysql_fetch_assoc($result)) {
    if ($row['user'] == $username) echo "you ";
    else echo "{$row['user']} ";
    echo "wrote: {$row['message']}";
}
…
?>
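Normal Use
• Alice submits user = Alice, message = Hello
• Web server / PHP interpreter sends to DBMS: insert into messages values('Alice','hello');
• The messages table stores (user, message) = (Alice, Hello)
• When Bonnie loads the page, the server runs select * from messages … and returns <html> … Alice wrote Hello …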
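SQL Injection
• Alice submits user = Alice, message = hello'); drop table messages; --
• Web server / PHP interpreter sends to DBMS: insert into messages values('Alice','hello'); drop table messages; --');
• The DBMS inserts (Alice, Hello), then executes the injected drop table statement; the trailing -- comments out the leftover ');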
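A minimal Python sketch of why the query breaks, mirroring the string interpolation in the PHP example. The function name build_insert is made up for illustration; nothing here talks to a real database.

```python
def build_insert(username: str, message: str) -> str:
    """Mirror the PHP example: paste the inputs straight into the SQL text."""
    return f"insert into messages values('{username}', '{message}')"


# Normal use: the message stays inside the string literal.
print(build_insert("Alice", "hello"))

# Attack: the quote in the input closes the literal early, so the rest of
# the input is parsed as SQL instead of data.
print(build_insert("Alice", "hello'); drop table messages; --"))
```

The second call shows the input escaping its string literal, which is exactly the step the rest of the deck sets out to prevent.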
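Persistent Cross-Site Scripting
• Alice submits user = Alice, message = <script>…</script>
• Web server / PHP interpreter sends to DBMS: insert into messages values('Alice','<script> …');
• The messages table stores (user, message) = (Alice, <script> …)
• When Bonnie loads the page, the server runs select * from messages … and returns <html> … Alice wrote <script>…</script> …
• Bonnie's browser / JavaScript engine executes the script with the privileges of the origin site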
Injection Attack Defenses
• Input sanitization
• Blacklist / whitelist
• In research
– Dynamic tainting
– Static analysis
– Model checking
– Instruction randomization
– Machine learning
–…
[DT intro.]
Weaknesses of Current Approaches to
Dynamic Tainting
• Overhead
– Code instrumentation
– Storage and propagation of taint data
– Sink checking
• Requires detailed knowledge of context at
taint sinks:
– SQL syntax (for particular SQL dialect)
• Taint propagation cannot cross component
boundaries
– Either the entire database is tainted or it is not
– Persistent XSS
Our Approach:
Complementary Character Coding
• Main idea
– Turn dynamic tainting into a character coding
• Free taint storage
• Free taint propagation through execution
• Taint propagation across components
– Between application and database
– Between client and server over HTTP
• Complement Aware Components
– Safe execution of unsanitized code against injection attacks
– Backwards compatibility through HTTP content negotiation
Complementary Character Coding
• Two versions of every character
– Each character gets two code points instead of one
– Standard characters
– Complement characters
• Two flavors
– Complementary ASCII
– Complementary Unicode
Complementary Unicode
• Unicode
– Current version 6.0
– Less than 25% code space used or reserved
• Allows possibility of having more than two versions of each character
– Future work
Complementary ASCII
• Standard characters
– Values 0 – 127
– Same as standard ASCII characters
• Complement characters
– Values 128 – 255
• Bit layout (example byte 0100 0011)
– High-order bit is the taint bit: 0 for standard, 1 for complement
– Remaining seven bits are the data bits (100 0011)
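The bit layout above can be sketched in a few lines of Python. This is an illustration under the stated layout (taint bit = high-order bit), not the prototype's code; the helper names to_complement, to_standard, and is_complement are made up.

```python
TAINT_BIT = 0x80  # high-order bit: 0 = standard, 1 = complement
DATA_MASK = 0x7F  # low seven bits: the ASCII value


def to_complement(data: bytes) -> bytes:
    """Mark every byte as untrusted by setting the taint bit."""
    if any(b & TAINT_BIT for b in data):
        raise ValueError("input must be plain 7-bit ASCII")
    return bytes(b | TAINT_BIT for b in data)


def to_standard(data: bytes) -> bytes:
    """Clear the taint bit, leaving the seven data bits unchanged."""
    return bytes(b & DATA_MASK for b in data)


def is_complement(byte: int) -> bool:
    """True when the byte lies in the complement range 128-255."""
    return bool(byte & TAINT_BIT)


tainted = to_complement(b"<")
print(format(tainted[0], "08b"))  # 10111100
print(to_standard(tainted))       # b'<'
```

Because the taint lives inside the character itself, no side table of taint metadata is needed.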
Complementary Character Coding:
Comparison Functions
• Value Comparison
– A standard character is equal to its complement
– Convert to standard character, and then compare all the bits
• Full Comparison
– Standard and complement versions of same character are not equal
– Compare all the bits
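The two comparison functions might look like this for complementary ASCII. A sketch only: value_equal and full_equal are illustrative names, and the high-order bit is assumed to be the taint bit.

```python
DATA_MASK = 0x7F  # low seven bits carry the character value


def value_equal(a: bytes, b: bytes) -> bool:
    """Value comparison: convert both sides to standard, then compare."""
    return bytes(x & DATA_MASK for x in a) == bytes(y & DATA_MASK for y in b)


def full_equal(a: bytes, b: bytes) -> bool:
    """Full comparison: all eight bits must match, taint bit included."""
    return a == b


standard = b"drop"
complement = bytes(c | 0x80 for c in standard)

print(value_equal(standard, complement))  # True
print(full_equal(standard, complement))   # False
```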
Dynamic Tainting with
Complementary Character Coding
• Encode untrusted user inputs with complement
characters
– Explicitly converted by the server on entry
• Encode trusted developer code with standard
characters
• Value comparison during execution
– Functionality remains the same
– Automatic taint propagation by execution
– Taint propagation over database and HTTP
• Each complement aware component has complete
picture of taint status during parsing
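A small sketch of why propagation is "automatic": ordinary byte concatenation keeps each character's taint bit, so a query assembled from developer text and user input still shows which part came from where. The helpers taint and show_taint are hypothetical, and the input is assumed to be ASCII.

```python
TAINT_BIT = 0x80


def taint(text: str) -> bytes:
    """Encode untrusted input as complement characters (assumes ASCII)."""
    return bytes(b | TAINT_BIT for b in text.encode("ascii"))


def show_taint(data: bytes) -> str:
    """Return one marker per byte: 'T' for tainted, '.' for trusted."""
    return "".join("T" if b & TAINT_BIT else "." for b in data)


# Developer-written query text stays in standard characters.
template_head = b"insert into messages values('"
template_tail = b"')"

# The server converts the request parameter on entry.
message = taint("hi")

# Plain concatenation: no instrumentation, yet the taint travels along.
query = template_head + message + template_tail
print(show_taint(query))
```

Only the two bytes of the message are marked; every byte of the template remains trusted.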
Complement Aware Components
and Security Policy ( chart )
• Allowed token set
– Specified by each component individually for parsing
– Defines tokens allowed to contain untrusted characters
• Default policy
– Allowed token set = {numbers, string literals}
– Prevents all possible injections
• May be too restrictive for web browsers
• More permissive policies
– Browsers could allow tainted formatting tags
– Allowed token set = {numbers, string literals, <b>, <i>, etc.}
• Enforcement
– Match tokens in allowed token set with value comparison
– Everything else (forbidden tokens) is matched with full comparison
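One way to picture enforcement is a single matching routine that picks the comparison function per token. This is a sketch, not the prototype's parser: matches, strip_taint, and the example policy set are all assumptions made for illustration.

```python
TAINT_BIT = 0x80
DATA_MASK = 0x7F


def taint(data: bytes) -> bytes:
    """Encode bytes as complement characters."""
    return bytes(b | TAINT_BIT for b in data)


def strip_taint(data: bytes) -> bytes:
    """Convert every byte to its standard character."""
    return bytes(b & DATA_MASK for b in data)


def matches(candidate: bytes, token: bytes, allowed: set) -> bool:
    """Match input text against a grammar token under the policy.

    Tokens in the allowed set use value comparison, so tainted copies
    still match. Every other token uses full comparison, so a copy that
    contains any complement character fails to match.
    """
    if token in allowed:
        return strip_taint(candidate) == token
    return candidate == token


browser_policy = {b"<b>", b"</b>"}  # hypothetical permissive policy

print(matches(taint(b"<b>"), b"<b>", browser_policy))            # True
print(matches(taint(b"<script>"), b"<script>", browser_policy))  # False
print(matches(b"<script>", b"<script>", browser_policy))         # True
```

The last call shows that developer-written (standard) tags are unaffected; only tainted copies of forbidden tokens stop matching.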
Example: Vulnerable PHP program
<?php
…
// Untrusted inputs converted into complement characters by server
$message = $_POST['message'];
$username = $_POST['username'];
…
// welcome the user
if (isset($username)) {
    echo "Welcome $username <br />";
}
// insert new message
if (isset($message)) {
    $query = "INSERT INTO messages VALUES('$username', '$message')";
    mysql_query($query);
} …
// display all messages except the ones from admin
// (value comparison used by DBMS here)
$query = "select * from messages where not (user = 'admin')";
$result = mysql_query($query);
echo '<br /><b>Your messages:</b>';
while ($row = mysql_fetch_assoc($result)) {
    // value comparison used by PHP interpreter here
    if ($row['user'] == $username) echo "you ";
    else echo "{$row['user']} ";
    echo "wrote: {$row['message']}";
} …
?>
SQL Injection with Complement Aware DBMS
• Alice submits user = Alice, message = hello'); drop table messages; --
• Web server / PHP interpreter sends to DBMS: insert into messages values('Alice','hello'); drop table messages; --');
• Red (on the slide) denotes complement characters: the user-supplied parts of the query
• A complement ' does not match a standard ', ; does not match ;, ) does not match ), drop does not match drop, etc.
• So the DBMS stores the literal hello'); drop … in the message column rather than dropping the table.
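A toy scanner makes the "does not match" step concrete. It is a sketch under simplifying assumptions (complementary ASCII, a hypothetical read_string_literal helper, no escape handling) and is not how the modified MySQL parser is written.

```python
TAINT_BIT = 0x80
QUOTE = ord("'")


def taint(text: str) -> bytes:
    """Encode untrusted input as complement characters (assumes ASCII)."""
    return bytes(b | TAINT_BIT for b in text.encode("ascii"))


def read_string_literal(query: bytes, start: int):
    """Return (literal_bytes, index_after_closing_quote).

    The delimiter is matched with full comparison, so only a standard
    quote opens or closes the literal. A complement quote is just content.
    """
    if query[start] != QUOTE:
        raise ValueError("expected a standard opening quote")
    end = query.index(bytes([QUOTE]), start + 1)
    return query[start + 1:end], end + 1


user = taint("Alice")
message = taint("hello'); drop table messages; --")
query = b"insert into messages values('" + user + b"','" + message + b"');"

first, pos = read_string_literal(query, query.index(b"'"))
second, pos = read_string_literal(query, pos + 1)  # pos + 1 skips the comma

# Show the stored value in readable form by clearing the taint bit.
print(bytes(b & 0x7F for b in second).decode("ascii"))
print(query[pos:])  # b');'
```

The whole attack string ends up inside the second literal, and the only SQL left after it is the developer's own closing `);`.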
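Persistent Cross-site scripting attack
• Alice submits user = Alice, message = <script>…</script>
• Web server / PHP interpreter sends to DBMS: insert into messages values('Alice','<script> …');
• The messages table stores (user, message) = (Alice, <script> …), still in complement characters
• When Bonnie loads the page, the server runs select * from messages … and returns <html> … Alice wrote <script>…</script> …
• The complement <script> does not match the standard <script>, etc., so the browser displays the characters rather than executing the script.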
More permissive browser security policy: Allowed token set includes boldface tags
• Alice submits user = Alice, message = <b>Hello</b>
• Web server / PHP interpreter sends to DBMS: insert into messages values('Alice','<b>Hello</b>');
• The messages table stores (user, message) = (Alice, <b>Hello</b>)
• When Bonnie loads the page, the server runs select * from messages … and returns <html> … Alice wrote <b>Hello</b> …
• Bonnie's browser / JavaScript engine applies a policy with allowed token set {<b>, </b>, …}
• Boldface tags are matched with value comparison, so the browser renders Hello in bold.
Backwards Compatibility
• Take advantage of HTTP content negotiation
mechanism
• Web browsers identify themselves through the Accept-Charset header
• Complement aware browser
– Send output in complementary character coding
• Non-complement aware browser
– Route output through a filter that acts as a complement
aware browser
• Apply security policy (e.g. default policy)
• Convert output into format specified by Accept-Charset
header
• Extra overhead
• Overhead gradually decreases as more people upgrade to complement aware browsers
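A rough sketch of the negotiation and the filter, with two loud assumptions: the charset label "x-complementary-ascii" is invented for illustration (the slides do not name one), and HTML-escaping tainted characters is just one simple way to apply the default policy before converting to a standard charset.

```python
import html

TAINT_BIT = 0x80
DATA_MASK = 0x7F


def filter_for_legacy_browser(page: bytes) -> str:
    """Apply the default policy, then emit plain ASCII HTML.

    Trusted (standard) characters pass through unchanged. Untrusted
    (complement) characters are reduced to their standard value and
    HTML-escaped, so they can only ever be rendered as text.
    """
    out = []
    for byte in page:
        ch = chr(byte & DATA_MASK)
        out.append(html.escape(ch) if byte & TAINT_BIT else ch)
    return "".join(out)


def respond(page: bytes, accept_charset: str):
    """Pick the output path from the Accept-Charset request header."""
    # "x-complementary-ascii" is a made-up label, not a registered charset.
    if "x-complementary-ascii" in accept_charset.lower():
        return page  # complement aware browser: send output as is
    return filter_for_legacy_browser(page)


page = b"Alice wrote " + bytes(b | TAINT_BIT for b in b"<script>")
print(respond(page, "utf-8"))  # Alice wrote &lt;script&gt;
```

The filter path is where the extra overhead comes from: every response for a legacy browser is scanned once more on the server side.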
Prototype Implementation
• Done in complementary ASCII
• LAMP (Linux Apache MySQL PHP)
• Firefox
• Default policy
• Backwards compatible with standard browsers
• Customized security policies through defined allowed token sets
• Enough to run proof-of-concept experiments
Experimental Evaluation
• Evaluation objectives
– Effectiveness
– Possible Defects
– Overhead
• Benchmarks
– SQL Injection Application Testbed (Halfond et al.)
  • ATTACK set
  • LEGIT set
– ARDILLA (Kiezun et al.)
  • Generated using automated technique
  • SQL injection, reflected XSS, and persistent XSS
Benchmarks
LOC: lines of code
Results: Effectiveness
• Ran ATTACK set from SQL Injection Application Testbed using a script
– Checked database logs for SQL injection
• Manually executed ARDILLA test cases
• Found no signs of injections
Results: Possible Defects
• Set up original and complement aware web server with identical initial environments
• Ran LEGIT set from SQL Injection Application Testbed on both
• Compared output produced by both versions
• Resulting web pages identical by value comparison
Overhead Evaluation
[Bar chart: time (seconds) per application (Bookstore, Classifieds, Empldir, Events, Portal) for three configurations: Original, Without filter, With filter]
• Ran LEGIT set in SQL Injection Application Testbed and compared average over 100 runs
• Worst-case overhead less than 2%
Conclusion and Future Work
• Complementary character coding
– Low overhead character level taint tracking
– Taint propagation across component boundaries
• Complement aware components
– Safe execution of unsanitized code against injection attacks
– Backwards compatibility with current browsers
• Future Work
– Implement complementary Unicode
– Explore other applications of complementary character coding
– Web standard
Questions?
Dynamic Tainting Propagation
• Taint Source
– External data that should be initialized as tainted
– e.g.: GET, POST, reading from DB or file…
• Propagation
– The tainted state of variables propagates via assignment or mathematical operation
• Taint Sink
– Points where sensitive operations are performed and the taint status of variables is checked
– e.g.: SQL query, HTML output…
Dynamic Tainting Propagation ( cont. )
• Example (ref)
– Assume a is tainted
– x is marked as tainted, since x is affected directly by a
– Namely, a propagates its taint state to x
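The slide's own example code is not in this transcript, so the following is a stand-in sketch of conventional dynamic tainting, where a taint flag is stored next to each value and copied on every operation. The Tainted class and the values 5 and 10 are made up.

```python
class Tainted:
    """A value paired with a taint flag (conventional dynamic tainting)."""

    def __init__(self, value, tainted=False):
        self.value = value
        self.tainted = tainted

    def __add__(self, other):
        # The result is tainted if either operand is tainted.
        other_value = other.value if isinstance(other, Tainted) else other
        other_taint = other.tainted if isinstance(other, Tainted) else False
        return Tainted(self.value + other_value, self.tainted or other_taint)


a = Tainted(5, tainted=True)  # taint source, e.g. a request parameter
b = Tainted(10)               # trusted constant
x = a + b                     # x is affected directly by a

print(x.value, x.tainted)     # 15 True
```

This per-value bookkeeping is the storage and propagation overhead that complementary character coding avoids by folding the flag into the character itself.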
Dynamic Tainting Analysis
• Procedure
– Instrumentation
  • Add specific code at source and sink
– Testing
– Reporting
  • Generate a vulnerability report via the log file or exception thrown during execution
Default Policy
• Allowed token set: string literal, number
– Untrusted data can be included
– Use value comparison to match token
• Forbidden token set: forbidden tokens
– Trusted tokens only
– Use full comparison to match token
Default Policy
• Example
– Allowed token set: string literal, number
– Forbidden token set: < (00111100)
– Standard <: 0011 1100
– Converted to complementary form: 1011 1100
