Reverse Engineering

Software Maintenance and Evolution (JMSE-SM&E)
Unit 7: Reverse Engineering
Prof. Mohammad A. Mikki
Gaza, Palestine, 5.8.2014
Tempus
Joint Master in Software Engineering
Software Maintenance and Evolution
Objectives/ILOs
 Define and get an overview of reverse engineering
 Understand the relationship between reverse engineering and
re-engineering
 Understand the issues and complete a reverse engineering
project
 Be able to analyse reverse engineering options
 Evaluate different reverse engineering strategies
 Differentiate between different levels of analysis in reverse
engineering
 Use static and dynamic analysis techniques of reverse
engineering
 Differentiate between the analysis of source code and the analysis of binaries.
List of Topics
 Overview
 Reverse Engineering Process
 Reverse Engineering Methodology
 System Level Design
 Data reverse engineering
 Source code vs. binaries
 Static and Dynamic Analysis
 Reverse engineering tools
 Example on reverse engineering
Reverse Engineering Definition
Reverse engineering is the process of analysing a system to identify
its components and their interrelationships with the objective to
create representations of the system in another form or at a higher
level of abstraction.
Reverse Engineering Concepts
 Reverse engineering often precedes re-engineering.
 It may be part of a re-engineering process but it may also be
used to re-specify a system for re-implementation.
 Reverse engineering is sometimes done as an independent
activity.
Reverse Engineering Concepts (Cont.)
 Reverse engineering can be done at many levels.
 Reverse engineering belongs to Software Maintenance.
 Reverse engineering is a systematic methodology for
analyzing the design of an existing system, either as an
approach to study the design or as a prerequisite for redesign.
Reverse Engineering Objectives
 Determine how a product works.
 Learn the ideas and technology that were used in
developing the product.
 Identify the system's components and their
interrelationships.
 Create representations of the system in another form or at
a higher level of abstraction.
Why Do We Need Reverse Engineering?
 You have the following problem:
- By accident, you delete a source code file.
- How do you recover it?
 Answer:
- Use reverse engineering.
Why Do We Need Reverse Engineering?
(Cont.)
 Discovery of abstraction
Figure: Built System -> Reverse Engineering -> Abstract System
Benefits of Reverse Engineering
 Savings in maintenance cost
 Improvement of system’s quality
 Competitive advantages
 Facilitation of software reuse
Relation With Forward Engineering
 Reverse engineering is the opposite of forward engineering.
 In forward engineering, a system is designed and built to meet its
specified objectives using basic, already designed components or
subsystems.
 In reverse engineering, the system is already built and the engineers
have to find out how it is designed, e.g., which components are used to
build it.
Relation With Forward Engineering (Cont.)
Figure: Forward engineering (implementation) leads from the abstract system to the system to be built; reverse engineering (abstraction) leads from the built system back to the abstract system.
Reverse Engineering Phases
 Phase 1: Collecting information by studying the existing system
- Through tools such as parsers, debuggers, profilers, and
event recorders
 Phase 2: Abstracting information
- Finding out the high-level components which compose
the system
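The two phases can be illustrated with a minimal Java sketch. It assumes the standard java.lang.reflect API as the information-collecting tool (the slide itself names parsers, debuggers, profilers, and event recorders) and uses a very simple abstraction, grouping public methods by name. The names StructureSummary and summarize are hypothetical and chosen only for this illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

class StructureSummary {

    // Phase 1: collect raw facts (the methods declared by a type).
    // Phase 2: abstract them into a coarser view
    //          (public method name -> number of declarations with that name).
    static Map<String, Integer> summarize(Class<?> type) {
        Map<String, Integer> byName = new TreeMap<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers())) {
                byName.merge(m.getName(), 1, Integer::sum);
            }
        }
        return byName;
    }

    public static void main(String[] args) {
        // Any loaded class can be inspected; Runnable is used because it is small.
        System.out.println(summarize(Runnable.class));
    }
}
```

Real reverse engineering tools collect far more than method names, but the split between gathering facts and condensing them into a higher-level view is the same.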
Reverse Engineering Process [1]
Figure: System to be re-engineered -> Automated analysis and Manual annotation -> System information store -> Document generation -> Program structure diagrams, Data structure diagrams, Traceability matrices
Reverse Engineering Methodology [4]
 Re-documentation and/or document generation
 Recovery of design approach and design details at any
level of abstraction
 Identifying reusable components and components that
need restructuring
 Recovering business rules
 Understanding high-level system description.
Reverse Engineering Methodology
(Cont.) [2]
 The steps in the figure are explained in the following slides.
Figure: Reverse Engineering Methodology [3]
Reverse Engineering:
1. Investigation, Prediction and Hypothesis
2. Concrete Experience: Function & Form
Modeling & Analysis:
3. Design Models
4. Design Analysis
Redesign:
5. Parametric Redesign
6. Adaptive Redesign
7. Original Redesign
1. Investigation, Prediction and
Hypothesis [2]
 Develop black box model
 Use / Experience product
 List assumed working principles
 Perform economic feasibility of redesign
 State process description or activity diagram
2. Concrete Experience: Function and
Form [2]
 Plan and execute product disassembly
 Group defined systems and components together
 Experiment with product components
 Develop free body diagrams
 Identify function sharing and compatibility
 Transform to engineering specification and metrics
3. Design Models [2]
 Identify actual physical principles
 Constantly consider the customer
 Create engineering models and metric ranges
 Alternatively or concurrently build prototype to test
parameters
4. Design Analysis [2]
 Calibrate model
 Create engineering analysis, simulation or optimization
 Create experiment and testing procedures
5. Parametric Redesign [2]
 Optimize design parameters
 Perform sensitivity analysis and tolerance design
 Build and test prototype
6. Adaptive Redesign [2]
 Recommend new subsystems
 Search for inventive solutions
 Analyze force flows and component combinations
 Build and test prototype
7. Original Redesign [2]
 Develop new functional structure
 Choose alternatives
 Verify design concepts
 Build and test prototype
Abstraction Levels of System Design
To handle the complexity of system design, the system is
described at different levels of abstraction (top-down design
approach):
1. System Level Design: Defines system partition into sub-systems
and their interface.
2. Sub-system level design (recursive task): For each sub-system,
define its sub-systems and their interface
System Level Design
 System level design defines the components, modules, parts, interfaces,
structure, and architecture of the system so that it satisfies its requirements.
 System level design is required by reverse engineering
System Structure (Architecture)
System is composed of:
 Components
- Parts or modules of the system.
- They are the subsystem components.
- They define the system at component-level
 Connections
- between components
- Specify how components interface and communicate
System Interaction with Environment
 System is modeled as a black box.
 System gets input from the environment.
 System produces output to the environment.
Figure: Input -> System -> Output
System Level Design Example: TV
Figure: The TV as a black box.
Inputs: Power; TV Signal; User Choices (Volume, Frequency)
Function: Convert Signal To Sound At Desired Level + Convert Frequency to Video Channel
Outputs: Heat & Noise; Sound + Video; Status Indications (Volume, Frequency)
Data Reverse Engineering
 Data reverse engineering focuses on data and data relationships for two types of data:
- data structures within programs
- databases
Data Reverse Engineering [3]
Figure: Data reverse engineering levels.
Physical schema (data, schema catalog, code, documentation) -> Analysis -> logical schema -> Abstraction -> conceptual schema (OO model: objects, associations, inheritance, ...; keys; optimizations; ...)
Participants: domain expert, developer, reengineer
Related activities shown in the figure: extension, migration, wrapping, integration, distribution, ...
Data Reverse Engineering Example
 Convert text data files into relational databases (DBs),
e.g., converting the following flat text data file into a
relational DB table.
Column name    Width
First name     25
Last name      25
Id             9
Street         25
City           20
Country        10
Birth Date     8
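A minimal sketch of the first step of such a conversion: splitting one fixed-width line into named fields using the column widths from the table above. The class name FixedWidthRecord, the sample record, and the 8-character date layout (YYYYMMDD) are assumptions made for illustration; the slide does not specify the file's contents or the target database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FixedWidthRecord {
    // Column names and widths taken from the table above.
    static final String[] NAMES =
        {"First name", "Last name", "Id", "Street", "City", "Country", "Birth Date"};
    static final int[] WIDTHS = {25, 25, 9, 25, 20, 10, 8};

    // Cuts the line at the fixed column boundaries and trims the padding.
    // Lines shorter than the full record yield empty strings for the missing columns.
    static Map<String, String> parse(String line) {
        Map<String, String> row = new LinkedHashMap<>();
        int start = 0;
        for (int i = 0; i < NAMES.length; i++) {
            int end = Math.min(start + WIDTHS[i], line.length());
            String field = start < line.length() ? line.substring(start, end) : "";
            row.put(NAMES[i], field.trim());
            start += WIDTHS[i];
        }
        return row;
    }

    public static void main(String[] args) {
        // Made-up sample record, padded to the column widths.
        String line = String.format("%-25s%-25s%-9s%-25s%-20s%-10s%-8s",
            "Ada", "Lovelace", "123456789", "12 St James Square",
            "London", "UK", "18151210");
        System.out.println(parse(line));
    }
}
```

Each resulting map corresponds to one row of the relational table; recovering keys and relationships between tables is the harder part of data reverse engineering and is not covered by this sketch.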
Source Code vs. Binaries for SE
 What is source code:
- It is the code of a program written in a high-level
programming language.
- Contains variable declarations, instructions, functions, loops,
and other statements that tell the program how to function.
- Comments could be added to explain code sections.
 What is binary code:
- It is the simplest form of computer code or programming
data.
- It is represented entirely by a binary system of digits
consisting of a string of consecutive zeros and ones.
- Binary code is often associated with machine code
Source Code vs. Binaries for SE (Cont.)
[3]
Using source code
 better form of representation
 not always possible
 result depends on the parser (notable differences)
Using binaries
 faster information collection (e.g. Java byte code)
 legality issues
Usage of Binaries (reverse engineering,
decompilation, disassembly) [3]
 Recovery of lost source code
 Migration of applications to a new hardware platform
 Translation of code written in obsolete languages that are no longer
supported by current compiler tools
 Determination of the existence of viruses or malicious code in
the program
 Recovery of someone else's source code (e.g., to determine
an algorithm)
A Decompilation Example [3]
public class MyTest {
    // This is a silly program.
    public static void main(String[] args) {
        int myInt1 = 1;
        int myInt2 = 2;
        for (int i = 1; i < 10; i++) {
            for (int j = 2; j < 8; j++)
                myInt1++;
            myInt2 = myInt2 + myInt1;
        }
        System.out.println("myInt1 is " + myInt1 + " and myInt2 is"
            + myInt2);
    }
}
-> Compiled with Sun’s javac compiler and decompiled with DJ
Java Decompiler; the result is shown on the next slide.
A Decompilation Example (Cont.) [3]
import java.io.PrintStream;
public class MyTest {
    public MyTest() { }
    public static void main(String args[]) {
        int i = 1;
        int j = 2;
        for(int k = 1; k < 10; k++) {
            for(int l = 2; l < 8; l++)
                i++;
            j += i;
        }
        System.out.println("myInt1 is " + i + " and myInt2 is"
            + j);
    }
}
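The decompiler loses the comment and the local variable names but keeps the logic. The following sketch, which is not part of the original example, places both loop bodies side by side and checks that they compute the same values. DecompileCheck and its method names are hypothetical.

```java
import java.util.Arrays;

class DecompileCheck {
    // Loop as written in the original MyTest source.
    static int[] original() {
        int myInt1 = 1;
        int myInt2 = 2;
        for (int i = 1; i < 10; i++) {
            for (int j = 2; j < 8; j++)
                myInt1++;
            myInt2 = myInt2 + myInt1;
        }
        return new int[] {myInt1, myInt2};
    }

    // Loop as emitted by the decompiler: names differ, behavior does not.
    static int[] decompiled() {
        int i = 1;
        int j = 2;
        for (int k = 1; k < 10; k++) {
            for (int l = 2; l < 8; l++)
                i++;
            j += i;
        }
        return new int[] {i, j};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(original()));
        System.out.println(Arrays.equals(original(), decompiled()));
    }
}
```

Both methods return 55 and 281, so the decompiled program prints the same line as the original.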
Static and Dynamic Models [3]
 Static Model: Finding out the static structure, architecture
- code (using a parser)
- Documents
- Interviews
 Dynamic Model: Finding out the run-time behavior of
software
- Debugger
- Profiler
- Source code instrumentation
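As a rough illustration of collecting dynamic information without a debugger or profiler, the sketch below wraps an object in a standard java.lang.reflect.Proxy that records every interface method call before delegating it. This is one possible instrumentation technique, not the one used in [3]; CallRecorder and record are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class CallRecorder {
    // Returns a proxy for the given interface that appends the name of each
    // invoked method to the trace and then forwards the call to the target.
    @SuppressWarnings("unchecked")
    static <T> T record(T target, Class<T> iface, List<String> trace) {
        InvocationHandler handler = (proxy, method, args) -> {
            trace.add(method.getName());
            return method.invoke(target, args);
        };
        return (T) Proxy.newProxyInstance(
            CallRecorder.class.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        CharSequence observed = record("hello", CharSequence.class, trace);
        observed.length();
        observed.charAt(0);
        System.out.println(trace); // the recorded run-time events
    }
}
```

The recorded trace is raw dynamic information; it becomes a dynamic model only after it is abstracted, for example into a sequence diagram.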
Static Model Visualization [3]
 Class diagrams
 Hierarchical graphs
Dynamic Models Visualization [3]
 Scenarios
- (sequence diagrams)
 State diagrams
- (hierarchical) graphs
Abstracting the Static Model [3]
 Abstracting the high-level components (like
subsystems)
 The process types
- Automatic abstraction
Using the structure of the language
Using measurements
- Manual abstraction
Example: Using CodeCrawler Reverse
Engineering Tool [3]
 CodeCrawler: a reverse engineering tool that combines metrics
and graphs to visualize OO systems
 http://www.iam.unibe.ch/~lanza/codecrawler/codecrawler.html
Abstracting the Dynamic Model [3]
 Finding behavior patterns, repeating sequences of events
- E.g. initializing a dialogue
 Using static abstractions
- E.g. representing interactions between high-level software
elements in sequence diagrams
 Dynamic information is combined with the high-level static
model
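Finding repeating sequences of events can be sketched as counting fixed-length windows in a recorded event trace. The approach below is a deliberately simple assumption (overlapping windows, exact matches only), and the event names and the class TracePatterns are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TracePatterns {
    // Counts every contiguous window of the given length in the trace
    // and keeps only the windows that occur at least twice.
    static Map<List<String>, Integer> repeated(List<String> trace, int length) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + length <= trace.size(); i++) {
            List<String> window = new ArrayList<>(trace.subList(i, i + length));
            counts.merge(window, 1, Integer::sum);
        }
        counts.values().removeIf(c -> c < 2);
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical trace of a dialogue being opened twice.
        List<String> trace = Arrays.asList(
            "open", "init", "show", "close",
            "open", "init", "show", "click", "close");
        System.out.println(repeated(trace, 3));
    }
}
```

For this trace the only window of length 3 that repeats is open, init, show, which is a candidate behavior pattern such as "initializing a dialogue".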
Merging Static and Dynamic Models [3]
Merging static and dynamic information to a single view
- Directly illustrates connections between static and dynamic info
- Ensuring the quality of the view
- polymorphism (OO) may cause confusion
- building abstractions becomes cumbersome and/or requires trade-offs:
behavioral patterns <-> subsystems
- sequential information is difficult to merge to a static view
+ more information can be viewed
- the more information a view contains, the less readable it gets!
Separate dynamic and static views
- connections and correspondences between the views need to be defined
+ both static and dynamic abstractions can be built
+ static and dynamic views are separated also in forward engineering:
support for re-engineering and round-trip engineering
Analyzing the Static Model [3]
 Syntax, type checking, interfaces
 Control and data flow analysis
 Structure analysis
 Slicing and dicing (different ways to partition the software)
 Measuring the complexity
 Navigation
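Measuring complexity from source text can be approximated very crudely by counting decision points, in the spirit of cyclomatic complexity. The sketch below is an assumption-laden estimate (it scans raw text, so keywords inside comments or string literals are also counted) and is no substitute for a parser-based metric tool. ComplexityEstimate is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ComplexityEstimate {
    // Branching keywords and short-circuit operators treated as decision points.
    private static final Pattern DECISION =
        Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\|");

    // Returns the number of decision points found in the text, plus one.
    static int estimate(String source) {
        Matcher m = DECISION.matcher(source);
        int decisions = 0;
        while (m.find()) {
            decisions++;
        }
        return decisions + 1;
    }

    public static void main(String[] args) {
        String body = "for (int i=1;i<10;i++) { for (int j=2;j<8;j++) myInt1++; }";
        System.out.println(estimate(body));
    }
}
```

For the nested loops of the MyTest example this estimate is 3: two for statements plus one.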
Analysing the Dynamic Model [3]
 Object creation and related dependencies
 Dynamic binding, polymorphism
 Method calls
 Looking for dead code/reachability analysis
 Memory management
 Performance and related problems
 Concurrency
Reverse Engineering Tools [3]
 Tools supporting creation of high-level models
 Tools supporting metrics
 Forward & reverse engineering
- Re-engineering, round-trip engineering & testing
 Other tools
- Parser generators
- Design pattern recognition
RE Tool Examples [3]
 Rigi (University of Victoria, Canada)
- http://www.rigi.csc.uvic.ca/
- A research prototype that represents an open and public
domain reverse engineering tool
- User programmable
- Analysis for: C, C++, COBOL, PL/AS, LaTeX
 SNIFF+ (TakeFive Software)
- A software development environment that also provides
reverse engineering capabilities
RE Tool Examples [3]
 McCabe’s Visual Reengineering Toolset and Visual Quality
Toolset
- Various views
- Software metrics (complexity and structuredness)
shown as specific colors on the views
 Logiscope (CS Verilog)
- Reverse engineering, code testing, static and dynamic
testing, metrics
- Analysis for: C, C++, Java, Ada
 ESW (Viasoft Inc.)
- Forward and reverse engineering (maintenance), metrics,
testing
RE Tool Examples [3]
 Refine (Reasoning Systems Inc.)
- An open and programmable tool that works in the Refinery
environment
- Tools for generating source code parsing and
conversion tools
- Features for analyzing and re-engineering code
- Analysis for: Ada, C, Cobol
 Imagix4D (Imagix Corp.)
- http://www.powersoftware.com/english/im/index.html
- A closed tool that provides a large set of built-in
functionalities
- Several views (also 3D)
- Analysis for: C/C++
Tools for OO Languages [3]
Examples of tools that produce a class diagram from code
 Rational Rose (Rational Software Corp.)
 Paradigm Plus (Computer Associates International)
 OEW (Innovative Software GmbH)
 Graphical Designer (Advanced Software Technologies Inc.)
 Domain Objects (Domain Objects Inc.)
 COOL:Jex (Sterling Software Inc.)
 Fujaba (Paderborn University)
...
Example: Java Decompiler [4]
 How to view the disassembled bytecode of a .class file under
Unix/Windows with the JDK?
% javap -c <filename>
% javap -help
(to see the options)
 Java Decompilers
- "ClassCracker" http://www.pcug.org.au/~mayon/
- "DeCafe Pro" from DeCafe, France at
http://decafe.hypermart.net/index.htm
- "SourceAgain" from Ahpah corp at
http://www.ahpah.com
Example: Java Decompiler [4]
ClassCracker 2 Interface
Example: Java Decompiler [4]
 Components of ClassCracker 2
- Java decompiler: retrieves Java source code from
Java class files
- Java disassembler: produces Java Assembly Code
- A Java class file viewer: displays Java class file
structures.
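What a class file viewer displays can be hinted at with a small sketch that reads the first fields of the class file format: the magic number 0xCAFEBABE, the minor and major version, and the constant pool count. This is not how ClassCracker works internally (the slides do not say); ClassFileHeader is a hypothetical name, and the header bytes in main are hand-written rather than taken from a real class file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ClassFileHeader {
    // Reads the fixed-size start of a class file and returns
    // {minor_version, major_version, constant_pool_count}.
    static int[] read(byte[] classBytes) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();
            if (magic != 0xCAFEBABE) {
                throw new IOException("Not a Java class file");
            }
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            int constantPoolCount = in.readUnsignedShort();
            return new int[] {minor, major, constantPoolCount};
        }
    }

    public static void main(String[] args) throws IOException {
        // Hand-written header: magic, minor 0, major 52 (Java 8), pool count 16.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                         0, 0, 0, 52, 0, 16};
        int[] h = read(header);
        System.out.println("major=" + h[1] + " minor=" + h[0]
            + " constant_pool_count=" + h[2]);
    }
}
```

A full viewer would continue with the constant pool entries, access flags, fields, methods, and attributes, which is also the information a disassembler or decompiler starts from.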
Example: Java Decompiler [4]
 Features of ClassCracker 2
- Visual user interface.
- Can decompile class files within zip or jar files.
- Conversion mode (JAVA, JASM or JDUMP) is selectable.
- A batch mode allows multiple class files to be
decompiled simultaneously.
- more...
Example: Java Decompiler [4]
 ClassCracker 2 System Requirements
- All platforms (Windows/Linux/Unix)
- JDK/JRE
Do not believe it? Decompile myClass_origin.class into myClass.java,
recompile it, and compare the two class files:
% javac myClass.java
(==> myClass.class)
% diff myClass.class myClass_origin.class
Example: Java Decompiler [4]
 ClassCracker 2.0: want to try it?
- Free download at
http://www.pcug.org.au/~mayon/classcracker/ccgetdemo.html
- Only the first three methods are decoded.
 Bridge 1.0 (free)
http://www.geocities.com/SiliconValley/Bridge/8617/jad.html
References
[1] I. Sommerville, Software Engineering, Chapter 28, sixth
edition, Pearson Education, 2000.
[2] Reverse Engineering, Department of Mechanical Engineering,
The Ohio State University, 2000.
[3] J. Nummenmaa, Reverse Engineering, School of Information
Sciences, the University of Tampere,
www.sis.uta.fi/~jyrki/old_courses/se03/#material/reverseengineering.ppt
[4] S. Xu, Reverse Engineering, Computer Science, University of
Windsor, Canada, people.auc.ca/xu/present/reverse.ppt
Thank you for your attention.