assembly code to machine code

What Do I Represent?
Translators – Module Knowledge Areas
• Revisiting object code
When we disassemble code we can view the opcodes used
This is just a stage: the assembled code is typically translated into a series of hex codes
What can you say about the object code below (these are from 3 different scripts)?
Translators – Module Knowledge Areas
Locate a Python script and copy it to a clean directory
In the same directory create a new Python file
The only line you need is import filename, e.g. import tempConvert (do not include the .py extension)
Give the file a sensible name and run it
In your directory you will now see a folder called “__pycache__”
Open this. Inside you will see a file with the extension “.pyc”
Open one of the files with Sublime Text
• Take a screengrab of this and use it to discuss the link between source code and object code. Ensure you explain what is being shown.
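The steps above can also be reproduced from code. This is a minimal sketch, assuming a made-up two-line script (temp_convert.py and its contents are hypothetical stand-ins for your own file). It writes a .pyc file, prints the first bytes as hex (what you would see in an editor), and then lists the opcodes for the same source.

```python
import dis
import importlib.util
import py_compile
import tempfile
from pathlib import Path

# Hypothetical stand-in for a script such as tempConvert.py.
SOURCE = "celsius = 20\nfahrenheit = celsius * 9 / 5 + 32\n"

with tempfile.TemporaryDirectory() as workdir:
    script = Path(workdir) / "temp_convert.py"
    script.write_text(SOURCE)
    # py_compile writes the same kind of .pyc file that an import leaves in __pycache__.
    target = Path(workdir) / "temp_convert.pyc"
    pyc_path = Path(py_compile.compile(str(script), cfile=str(target)))
    pyc_bytes = pyc_path.read_bytes()

# The first 4 bytes are a "magic number" identifying the bytecode version.
magic = pyc_bytes[:4]
print("magic number:", magic.hex())
print("first 32 bytes:", pyc_bytes[:32].hex(" "))

# The same source compiled in memory, shown as opcode names.
code = compile(SOURCE, "temp_convert.py", "exec")
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)
```

The hex dump is the object code as stored on disk; the opcode names are a readable view of the same instructions. The exact names depend on your Python version.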
Translators
Translators – Module Knowledge Areas
• Types of translators and their use
• Lexical analysis
• Syntax analysis
• Code generation and optimisation
• Library routines
Translators – Module Knowledge Areas
• Types of translators and their use
• Three types of translator
• Compiler – source code to object code. Translates all the code before running
• Interpreter – source code to object code BUT translates one line at a time and then executes the line
• Assembler – assembly code to machine code
Translators – Module Learning Objectives
• describe the need for, and use of, translators to convert source code to object code
• understand the relationship between assembly language and machine code
• describe the use of an assembler in producing machine code
• describe the difference between interpretation and compilation
• describe the purpose of intermediate code in a virtual machine
• describe what happens during lexical analysis
• describe what happens during syntax analysis, explaining how errors are handled
• explain the code generation phase and understand the need for optimisation
• describe the use of library routines
Translators - Assembly
Understand the relationship between assembly language and machine code
A particular architecture (CPU) has no way to directly read source code
Each architecture has its own machine language
This means one translation of the source code cannot serve every machine: the code has to be assembled into the machine language of the target architecture
E.g. running my Python code on a 64-bit Windows machine is not the same as running the code on a 32-bit Linux machine
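To see which architecture your own code is running on, here is a short sketch using only the standard library. The variable names are illustrative, and the values printed will differ from machine to machine.

```python
import platform
import struct

system = platform.system()    # operating system name, e.g. "Windows" or "Linux"
machine = platform.machine()  # CPU architecture as reported by the OS, e.g. "x86_64"

# Size of a native pointer in this Python build, in bits (typically 32 or 64).
pointer_bits = struct.calcsize("P") * 8

print(f"{system} on {machine}, {pointer_bits}-bit Python build")
```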
Translators - Assembly
We have previously seen that one line of source code can generate many lines of object code:
Line 4 of the source code creates 2 lines of object code
Line 6 of the source code creates 8 lines of object code
This is referred to as a one-to-many relationship (one source code line can give rise to many object code lines)
The whole program generates 20 lines of assembly code
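The one-to-many relationship can be checked with Python's dis module. This is a sketch on a made-up four-line script (not the program from the slide) that counts how many bytecode instructions each source line produces. The counts vary between Python versions, so none are quoted here.

```python
import dis
from collections import Counter

# Hypothetical example script; swap in your own source text.
SOURCE = """a = 3
b = 4
total = a + b * 2
print(total)
"""

code = compile(SOURCE, "<example>", "exec")

# Map each bytecode offset that begins a source line to that line number.
line_starts = {
    offset: line
    for offset, line in dis.findlinestarts(code)
    if line is not None
}

counts = Counter()
current_line = None
for ins in dis.get_instructions(code):
    # An instruction belongs to the most recent line that started at or before it.
    current_line = line_starts.get(ins.offset, current_line)
    if current_line is not None:
        counts[current_line] += 1

for line_no, text in enumerate(SOURCE.splitlines(), start=1):
    print(f"line {line_no}: {counts[line_no]} instruction(s)  <- {text}")
```

Line 3 does more work (two operations and a store) than line 1, so it gives rise to more instructions.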
TASK – Explain what is meant by the term ‘Assembly Language’. Ensure you make a link between assembly and source code
Translators - Assembly
Assembly code has a one-to-one relationship with machine code.
In the diagram to the left there is a representation of how the language becomes less ‘friendly’ as the translation process is undertaken
TASK – Describe the three stages in the diagram above. In your description include the architecture-based reasons and human readability reasons for working in source and assembly code
Translators – Machine Code
We already know that machine code is architecture specific, e.g. 16, 32 or 64 bit
If an architecture is 32 bit this means that it has a word length of 32 bits
In the simplified model used here, this means that the address bus is 32 bits wide and each instruction is 32 bits long (real architectures can differ on both)
Complex instructions, those exceeding the word length, are split into 2 or more instructions
Each instruction has two parts – an opcode and data
Opcode – the instruction eg ADD to add a value to the ACC (accumulator)
Data – the information being manipulated eg 4 to have the instruction ADD applied to it
Translators – Assembly to Machine
We already know that assembly code has a particular structure that uses mnemonics (words that look like English) to provide simple instructions
E.g. ADD 4 to ACC is an instruction to add a value to the accumulator (notice now that we are in the territory of registers)
Instruction          Opcode    Data
MOV Value 0 to ACC   000 001   0000 0000 00
  The value to move into the ACC: 0000 0000 0000 0000
ADD 4 to ACC         000 010   0000 0001 00
ADD 5 to ACC         000 010   0000 0001 01
MUL 2 to ACC         001 000   0000 0000 10
Task – what is the above doing? In pairs agree and be prepared to explain to the group
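The bit patterns above can be reproduced with a very small assembler. This is a sketch under the slide's simplified model (a 6-bit opcode followed by 10 bits of data). The names assemble and show are made up for illustration, and only the first 16-bit word of each instruction is modelled.

```python
# Opcode values taken from the worked example above.
OPCODES = {"MOV": 0b000001, "ADD": 0b000010, "MUL": 0b001000}
DATA_BITS = 10  # bits 7-16 hold the data


def assemble(mnemonic, value):
    """Pack one mnemonic and its data value into a 16-bit machine word."""
    if not 0 <= value < (1 << DATA_BITS):
        raise ValueError("data value does not fit in 10 bits")
    return (OPCODES[mnemonic] << DATA_BITS) | value


def show(word):
    """Format a 16-bit word as 'opcode data' in binary."""
    bits = format(word, "016b")
    return bits[:6] + " " + bits[6:]


for mnemonic, value in [("MOV", 0), ("ADD", 4), ("ADD", 5), ("MUL", 2)]:
    print(f"{mnemonic} {value:<2} -> {show(assemble(mnemonic, value))}")
```

Each mnemonic maps to exactly one opcode, which is the one-to-one relationship between assembly code and machine code.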
Translators – Machine Code
KISS: Keep It Simple Son – How 2 bytes/16 bits can be characterised
Bits 1-6   Bits 7-16
Opcode     Data

Opcode    Assembly Mnemonic   Description
000 001   MOV                 Will move a value to a register
000 010   ADD                 Will add a value and store it in the ACC
000 100   SUB                 Will subtract a value and store it in the ACC
001 000   MUL                 Will multiply a value and store it in the ACC
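Going the other way, a 16-bit word can be split back into its opcode and data, then carried out on an accumulator. This is a sketch under two assumptions: MOV always targets the ACC, and the four-word program is invented for illustration (it is not the one from the earlier task).

```python
# Opcode table from above, keyed by bit pattern.
MNEMONICS = {0b000001: "MOV", 0b000010: "ADD", 0b000100: "SUB", 0b001000: "MUL"}


def decode(word):
    """Split a 16-bit word into (mnemonic, data)."""
    opcode = word >> 10         # bits 1-6
    data = word & 0b1111111111  # bits 7-16
    return MNEMONICS[opcode], data


def run(program):
    """Execute each word in turn and return the final accumulator value."""
    acc = 0
    for word in program:
        mnemonic, data = decode(word)
        if mnemonic == "MOV":
            acc = data  # assumption: MOV loads the value into the ACC
        elif mnemonic == "ADD":
            acc += data
        elif mnemonic == "SUB":
            acc -= data
        elif mnemonic == "MUL":
            acc *= data
    return acc


# Hypothetical program: MOV 3, ADD 7, SUB 2, MUL 4
PROGRAM = [
    0b000001_0000000011,
    0b000010_0000000111,
    0b000100_0000000010,
    0b001000_0000000100,
]

for word in PROGRAM:
    print(decode(word))
print("ACC =", run(PROGRAM))  # ((3 + 7) - 2) * 4 = 32
```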
Translators – Machine Code
In this example of disassembled code the offset (second column) has a particular pattern: what is it?
The second column gives the offset (position in bytes) of the instruction whose mnemonic opcode is in column 3.
Each instruction is exactly 3 bytes in size, so the offsets are 3 bytes apart
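The offset pattern can be checked on your own interpreter. This is a sketch on a made-up two-line script. The 3-byte spacing described above matches older CPython releases; since CPython 3.6 each bytecode instruction occupies 2 bytes, so the offsets you see will be even numbers instead.

```python
import dis

# Hypothetical example script.
code = compile("total = 4 + 5\nprint(total * 2)\n", "<example>", "exec")

# Byte offset of every instruction, as shown in the second column of dis output.
offsets = [ins.offset for ins in dis.get_instructions(code)]
print("offsets on this interpreter:", offsets)

# With fixed 3-byte instructions, as described above, the first five offsets are:
slide_offsets = [index * 3 for index in range(5)]
print("offsets with 3-byte instructions:", slide_offsets)
```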
Task – Disassembling assembly (go to Moodle)