Extending Microsoft’s Phoenix Framework

A software analysis framework
built on Phoenix

Matt Miller

Leviathan Security Group

Metasploit Framework

Uninformed Journal

Not a static analysis expert 

Cthulhu software analysis framework

Very high-level architectural overview

Interesting features

Case study

Software optimization and analysis

Basis for future Microsoft compilers and tools

Robust and extensible architecture
◦ Plugins
◦ Phases

Check out Richard Johnson’s talk to learn more

RDK/SDK not yet completely solidified
◦ Encapsulation can help here

API is feature rich but verbose
◦ No simplified wrapper

No solution for large-scale analysis
◦ LTCG is not enough

Software analysis framework

Hobby project started in June 2006

Written in C#

Currently around 28 KLOC

Simplified Programming Interface
◦ Simple and extensible API
◦ Fundamental independence

Large-scale analysis
◦ Modeling behavior of large systems
◦ Pie in the sky: Windows Vista 

Research Sandbox
◦ A playground for experimentation
◦ Phoenix can also be used directly for this purpose

[Figure: architecture overview. Labels: Fundamentals, Phoenix, IDA, DB, Analysis Engine, Peons, Control Flow, Data Flow, Analysis, Rendering, Tools]

Uses a fundamental to load assemblies

Runs phases
◦ Import
◦ Analyze
◦ Render

Peons register to be notified on certain events
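
The phase and notification model above can be sketched in a few lines. This is a minimal Python illustration, not Cthulhu's actual C# API: the `AnalysisEngine` class, its `register` and `run` methods, and the handler signature are all hypothetical names chosen for the example. Only the three phase names come from the slide.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Phase names taken from the slide; everything else here is hypothetical.
PHASES = ("import", "analyze", "render")


class AnalysisEngine:
    """Runs the phases in order and notifies the peons registered for each."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def register(self, phase: str, handler: Callable[[str], None]) -> None:
        """A peon asks to be notified when `phase` runs."""
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        self._handlers[phase].append(handler)

    def run(self, assembly: str) -> None:
        """Fire every phase, in order, for one loaded assembly."""
        for phase in PHASES:
            for handler in self._handlers[phase]:
                handler(assembly)


log: List[str] = []
engine = AnalysisEngine()
# Registration order does not matter; phase order does.
engine.register("render", lambda asm: log.append(f"render {asm}"))
engine.register("import", lambda asm: log.append(f"import {asm}"))
engine.register("analyze", lambda asm: log.append(f"analyze {asm}"))
engine.run("WebService.dll")
print(log)
```

The point of the design is that the engine knows nothing about what a peon does; it only knows which phase the peon cares about.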

[Figure: Importing. Components: Phoenix Fundamental, DB, Analysis Engine, Importing Peons (Basic Types, Control Flow, Data Flow)]
1. Load Assembly
2. Assembly Loaded
3. Import Event
4. Normalize Information
5. Import Event

[Figure: Analyzing. Components: Database Fundamental, DB, Analysis Engine, Analyzing Peons (Path Discovery, Leak Check)]
1. Load Assembly
2. Denormalize Assembly Information
3. Assembly Loaded
4. Analysis Event
5. Normalize and Denormalize Information
6. Analysis Event

[Figure: Rendering. Components: DB, Analysis Engine, Rendering Peons (Console, GUI), Store]
1. Render
2. Denormalize
3. Display Output


Extensible and flexible way to represent binary information

May be used to support large-scale analysis
◦ Hundreds of modules
◦ More work needs to be done

Performance overhead is non-trivial
◦ Processing time can be high
◦ Volatile memory usage can be kept low

Simplified API

Version-independent modeling

Conceptual modeling

Abstract classes provide fundamental independence

[Figure: abstract classes (Assembly, Module, Data Type, Method, …) with concrete implementations for Phoenix and DB, each providing Assembly, Module, Data Type, and Method]
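
The idea of fundamental independence can be illustrated with a small sketch. It is written in Python for brevity and does not reproduce Cthulhu's C# classes: `Fundamental`, `PhoenixFundamental`, `DatabaseFundamental`, and `count_methods` are hypothetical names, and both backends return canned data instead of talking to Phoenix or a real database.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Fundamental(ABC):
    """Hypothetical abstract source of program elements."""

    @abstractmethod
    def method_names(self, assembly: str) -> List[str]:
        """Return the names of the methods defined in `assembly`."""


class PhoenixFundamental(Fundamental):
    """Stand-in for a backend that would query Phoenix; returns canned data."""

    def method_names(self, assembly: str) -> List[str]:
        return ["ExecuteCommand"] if assembly == "WebService.dll" else []


class DatabaseFundamental(Fundamental):
    """Stand-in for a backend that reads previously imported information."""

    def __init__(self, rows: Dict[str, List[str]]) -> None:
        self._rows = rows

    def method_names(self, assembly: str) -> List[str]:
        return list(self._rows.get(assembly, []))


def count_methods(fundamental: Fundamental, assembly: str) -> int:
    # Analysis code depends only on the abstract interface, so it runs
    # unchanged against either concrete implementation.
    return len(fundamental.method_names(assembly))


from_phoenix = count_methods(PhoenixFundamental(), "WebService.dll")
from_db = count_methods(
    DatabaseFundamental({"WebService.dll": ["ExecuteCommand"]}), "WebService.dll"
)
print(from_phoenix, from_db)
```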

Modeling version-independent relationships between software elements in the database

Appropriate versions can be selected at analysis time

void CallExitProcess()
{
    ExitProcess(0);
}

[Figure: CallExitProcess 1 makes a call to the version-independent kernel32!ExitProcess, which is associated with the distinct versions ExitProcess 1, ExitProcess 2, ExitProcess 3, and ExitProcess 4]
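
One way to picture version-independent modeling is a call edge that targets a concept instead of a specific build, with the concrete version chosen later. The sketch below is an assumption about the shape of such a model, not the actual database schema: `MethodVersion`, `MethodConcept`, `CallEdge`, and `select` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class MethodVersion:
    """One concrete version of a method, e.g. ExitProcess 3."""
    name: str
    version: int


@dataclass
class MethodConcept:
    """Version-independent element that the distinct versions hang off."""
    name: str
    versions: Dict[int, MethodVersion] = field(default_factory=dict)

    def add_version(self, version: int) -> MethodVersion:
        concrete = MethodVersion(self.name, version)
        self.versions[version] = concrete
        return concrete

    def select(self, version: int) -> MethodVersion:
        """Pick the appropriate version at analysis time."""
        return self.versions[version]


@dataclass
class CallEdge:
    """A call from a concrete caller to a version-independent callee."""
    caller: MethodVersion
    callee: MethodConcept


exit_process = MethodConcept("kernel32!ExitProcess")
for number in (1, 2, 3, 4):
    exit_process.add_version(number)

call = CallEdge(MethodVersion("CallExitProcess", 1), exit_process)
resolved = call.callee.select(3)
print(resolved)
```

Because the edge stores the concept, importing a fifth version of `ExitProcess` would not require touching any existing call edge.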

[Figure: Finding inter-component data flow paths. The Universe contains the VPN Client and VPN Server components; labeled elements: Device Driver (vpn.sys), Daemon (daemon.exe), User Interface (vpngui.exe, dialogs.dll)]

Web Services is a simple remoting interface
◦ Clients invoke methods hosted on a web server
◦ Server handles requests and provides responses

Problematic for static analysis
◦ Clients pass data to the server indirectly (network)
◦ Limits the scope at which analysis can be performed

Let’s walk through an example
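using System.Diagnostics;    // Process
using System.Web.Services;   // [WebService], [WebMethod]

[WebService]
public class WebService
{
    [WebMethod]
    public void ExecuteCommand(string command)
    {
        // Launches a process directly from caller-supplied data.
        Process.Start(command);
    }
}

Simple web service that invokes a process using the supplied command string
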
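using System.Web.Services;            // [WebServiceBinding]
using System.Web.Services.Protocols;  // SoapHttpClientProtocol, [SoapDocumentMethod]

[WebServiceBinding]
public class WebClient : SoapHttpClientProtocol
{
    [SoapDocumentMethod]
    public void ExecuteCommand(string command)
    {
        // Forwards the argument to the remote ExecuteCommand web method.
        Invoke("ExecuteCommand",
            new object[] { command });
    }
}

Simple web client that wraps the invocation of the web service method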

To illustrate a relationship, the client invocation and server method must be bridged

Bridging can take a few different forms
◦ Automatic detection of relationships
◦ Manual description of relationships

Bridging is an abstract concept though
◦ How do we make it concrete?

A concrete relationship can be shown by linking formal parameters

[Figure: within the Web Application, the Web Client (WebClient.dll, type WebClient) and the Web Service (WebService.dll, type WebService) each contain an ExecuteCommand method whose Enter Block holds fin(0). The formal parameter fin(ExecuteCommand, 0) of WebClient is linked to fin(ExecuteCommand, 0) of WebService]
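
Linking formal parameters can be sketched as adding an edge between two parameter descriptors. The Python below is an illustration only: `FormalIn`, `bridge`, and `auto_bridge` are hypothetical names, and the name-and-index matching rule used for automatic detection is a naive assumption, not the heuristic Cthulhu uses.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class FormalIn(NamedTuple):
    """Formal input parameter, written fin(method, index) on the slides."""
    owner: str   # declaring type, e.g. "WebClient"
    method: str
    index: int


Edge = Tuple[FormalIn, FormalIn]


def bridge(edges: Set[Edge], client: FormalIn, service: FormalIn) -> None:
    """Manual description: data bound to `client` reaches `service`."""
    edges.add((client, service))


def auto_bridge(edges: Set[Edge],
                client_formals: Iterable[FormalIn],
                service_formals: Iterable[FormalIn]) -> int:
    """Naive automatic detection: pair formals sharing method name and index."""
    service_list = list(service_formals)
    added = 0
    for client in client_formals:
        for service in service_list:
            if (client.method, client.index) == (service.method, service.index):
                edges.add((client, service))
                added += 1
    return added


client_param = FormalIn("WebClient", "ExecuteCommand", 0)
service_param = FormalIn("WebService", "ExecuteCommand", 0)

manual_edges: Set[Edge] = set()
bridge(manual_edges, client_param, service_param)

auto_edges: Set[Edge] = set()
found = auto_bridge(auto_edges, [client_param], [service_param])
print(found, auto_edges == manual_edges)
```

Once the edge exists, later data flow analysis can treat the network hop as an ordinary definition-to-use relationship.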



Describing indirect relationships improves the quality of analysis information

Widens the scope for control flow and data flow analysis

The Path Discovery peon can help illustrate this

Designed to find reachable flow paths
◦ From a set of sources
◦ To a set of sinks
◦ Within a set of target assemblies

Current restrictions
◦ Requires the database fundamental
◦ Only operates on data flow information
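
The inputs listed above (sources, sinks, target assemblies) map naturally onto a graph search. The following Python sketch uses a plain breadth-first search over definition-to-use edges; it is not the Path Discovery implementation. The function `find_path`, the graph contents, and the names `Page.Handle` and `WebApp.dll` are hypothetical, and the rule that only intermediate nodes must belong to a target assembly is an assumption.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def find_path(def_use: Dict[str, Set[str]],
              sources: Set[str],
              sinks: Set[str],
              targets: Set[str],
              assembly_of: Dict[str, str]) -> Optional[List[str]]:
    """Breadth-first search for one source-to-sink data flow path.

    Intermediate nodes must live in a target assembly (an assumption of
    this sketch); sources and sinks may live anywhere.
    """
    queue = deque([source] for source in sorted(sources))
    seen = set(sources)
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in sinks:
            return path
        for nxt in sorted(def_use.get(node, ())):
            if nxt in seen:
                continue
            if nxt not in sinks and assembly_of.get(nxt) not in targets:
                continue
            seen.add(nxt)
            queue.append(path + [nxt])
    return None


# Hypothetical method-level def-use edges, including a bridged client/service hop.
DEF_USE = {
    "HttpRequest.get_QueryString": {"Page.Handle"},
    "Page.Handle": {"WebClient.ExecuteCommand"},
    "WebClient.ExecuteCommand": {"WebService.ExecuteCommand"},
    "WebService.ExecuteCommand": {"Process.Start"},
}
ASSEMBLY_OF = {
    "Page.Handle": "WebApp.dll",
    "WebClient.ExecuteCommand": "WebClient.dll",
    "WebService.ExecuteCommand": "WebService.dll",
}
SOURCES = {"HttpRequest.get_QueryString"}
SINKS = {"Process.Start"}

full = find_path(DEF_USE, SOURCES, SINKS,
                 {"WebApp.dll", "WebClient.dll", "WebService.dll"}, ASSEMBLY_OF)
partial = find_path(DEF_USE, SOURCES, SINKS,
                    {"WebApp.dll", "WebClient.dll"}, ASSEMBLY_OF)
print(full)
print(partial)
```

Dropping `WebService.dll` from the target set removes the only route to the sink, which is why the second search finds nothing.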

Command Injection represents one type of security flaw found in managed applications

This can happen when user-controlled data is used in conjunction with launching a process

For example, data passing…
◦ From HttpRequest.get_QueryString
◦ To Process.Start

This should be easy to detect, right?



Finding data flow paths from get_QueryString to Start can be problematic

Lowest level data flow information is conveyed with respect to instructions

What if hundreds of assemblies are being analyzed?
◦ Not enough physical memory!

Path Discovery makes use of generalized data flow relationships
◦ Block-tier, method-tier, type-tier, etc…

Reachable paths are identified using a simple algorithm
◦ Progressive Qualified Elaboration (PQE)

PQE is designed to reduce the amount of analysis information that must be considered

Reachable paths are progressively found between source and sink flow descriptors within a set of target assemblies

Source flow descriptor

Tier          Information
Component     fout(Undefined)
Assembly      fout(System.Web)
Data Type     fout(System.Web.HttpRequest)
Method        fout(get_QueryString, 0)
Basic Block   fout(get_QueryString, 0)
Instruction   fout(get_QueryString, 0)

Sink flow descriptor

Tier          Information
Component     fin(Undefined)
Assembly      fin(System)
Data Type     fin(System.Dia…Process)
Method        fin(Start, 0)
Basic Block   fin(Start, 0)
Instruction   fin(Start, 0)
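
The per-tier descriptors in the table can be read as one descriptor generalized to each level. The Python sketch below reproduces the source descriptor rows; the `FlowDescriptor` class and its `at_tier` method are hypothetical, and only the tier names and the printed forms come from the slide.

```python
from dataclasses import dataclass

# Tier names as listed on the slide, coarsest first.
TIERS = ("Component", "Assembly", "Data Type", "Method", "Basic Block", "Instruction")


@dataclass(frozen=True)
class FlowDescriptor:
    """Hypothetical descriptor for a value flowing out of (fout) or into (fin) a method."""
    direction: str   # "fout" or "fin"
    assembly: str
    data_type: str
    method: str
    index: int
    component: str = "Undefined"

    def at_tier(self, tier: str) -> str:
        """Render the descriptor as it would appear at the given tier."""
        if tier == "Component":
            inner = self.component
        elif tier == "Assembly":
            inner = self.assembly
        elif tier == "Data Type":
            inner = self.data_type
        elif tier in ("Method", "Basic Block", "Instruction"):
            inner = f"{self.method}, {self.index}"
        else:
            raise ValueError(f"unknown tier: {tier}")
        return f"{self.direction}({inner})"


source = FlowDescriptor("fout", "System.Web", "System.Web.HttpRequest",
                        "get_QueryString", 0)
rows = [(tier, source.at_tier(tier)) for tier in TIERS]
for tier, text in rows:
    print(f"{tier:<13} {text}")
```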

Suppose there is some code in the web client that does the following
◦ client.ExecuteCommand(request.QueryString[x]);

Bridging makes it possible to show a complete data flow path from get_QueryString to Start

Let’s see how we get there using PQE
◦ PQE starts from a macro-tier, such as the component tier

Data flow Def-Use relationships between components
◦ Interpretation: in at least one situation, v uses data defined by u

Data flow Def-Use relationships between assemblies

Data flow Def-Use relationships between data types

Data flow Def-Use relationships between methods

Data flow Def-Use relationships between blocks

Data flow Def-Use relationships between instructions
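
The tier-by-tier walk above can be approximated in code. The slides do not give the PQE algorithm itself, so the Python below is one possible reading: at each tier, keep only the nodes that lie on some source-to-sink path, then elaborate only the children of those nodes at the next tier. The functions `reachable`, `reverse`, `on_paths`, and `pqe`, and the `WebApp`, `Logging`, `Page.Handle`, and `Logger.Write` nodes are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]


def reachable(graph: Graph, starts: Set[str], allowed: Set[str]) -> Set[str]:
    """Nodes reachable from `starts` while staying inside `allowed`."""
    seen: Set[str] = set()
    stack = [s for s in starts if s in allowed]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(n for n in graph.get(node, ()) if n in allowed)
    return seen


def reverse(graph: Graph) -> Graph:
    """Flip every def-use edge so the sinks can be searched backwards."""
    flipped: Graph = {}
    for u, vs in graph.items():
        for v in vs:
            flipped.setdefault(v, set()).add(u)
    return flipped


def on_paths(graph: Graph, sources: Set[str], sinks: Set[str],
             allowed: Set[str]) -> Set[str]:
    """Nodes that lie on at least one source-to-sink path."""
    return reachable(graph, sources, allowed) & reachable(reverse(graph), sinks, allowed)


def pqe(tiers: List[Graph], parents: List[Dict[str, str]],
        sources: List[Set[str]], sinks: List[Set[str]]) -> Set[str]:
    """Elaborate from the coarsest tier to the finest, pruning as we go."""
    qualified = None
    for i, graph in enumerate(tiers):
        nodes = set(graph) | {v for vs in graph.values() for v in vs}
        if qualified is not None:
            # Only consider elements whose parent qualified at the coarser tier.
            nodes = {n for n in nodes if parents[i].get(n) in qualified}
        qualified = on_paths(graph, sources[i], sinks[i], nodes)
        if not qualified:
            return set()
    return qualified


assembly_tier: Graph = {
    "System.Web": {"WebApp"}, "WebApp": {"WebClient", "Logging"},
    "WebClient": {"WebService"}, "WebService": {"System"},
}
method_tier: Graph = {
    "get_QueryString": {"Page.Handle"},
    "Page.Handle": {"WebClient.ExecuteCommand", "Logger.Write"},
    "WebClient.ExecuteCommand": {"WebService.ExecuteCommand"},
    "WebService.ExecuteCommand": {"Start"},
}
method_parent = {
    "get_QueryString": "System.Web", "Page.Handle": "WebApp",
    "Logger.Write": "Logging", "WebClient.ExecuteCommand": "WebClient",
    "WebService.ExecuteCommand": "WebService", "Start": "System",
}
result = pqe([assembly_tier, method_tier], [{}, method_parent],
             [{"System.Web"}, {"get_QueryString"}], [{"System"}, {"Start"}])
print(sorted(result))
```

In this toy input the `Logging` assembly never reaches the sink, so its methods are discarded before the method tier is searched. That pruning is the saving the slides attribute to PQE.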

A complete data flow path is identified

Data flows across an indirect boundary

Without bridging, it would not be possible to
seamlessly perform this analysis
◦ This means the security issue would be missed

Note that the security issue exists in the web
service independent of the web client
◦ Example was meant to show simple indirect data flow

Import and analyze large data sets
◦ All PE modules from Windows Vista?

Improve database performance
◦ Optimization work has not started yet
◦ It is currently very slow

Implement additional peons
◦ Leak Check

And the list goes on…

Phoenix is an exciting project

Software analysis is fun & challenging

Hopefully the database stuff pans out 

Questions?