11_AssemblyIntro
Introduction to Assembly Programming
Computer Architecture
Von Neumann Architecture
• Von Neumann is credited with inventing the
“Stored Program” concept where data and
instructions are stored in the same memory
– There is no clear delineation between data and
instructions in memory
• Same memory location can be data or instruction!
– The distinction is made depending on how the
microprocessor is instructed to interpret memory
contents.
• A memory location is considered an instruction if the microprocessor
fetches it for processing as an instruction.
• Interpretation of memory content as data is performed using
suitable instructions.
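As a rough illustration of the stored-program idea, the sketch below reads the same four bytes once as plain numbers and once as instructions. The opcode values and the two-byte instruction layout are made up for this example and do not correspond to any real microprocessor.

```python
# Toy memory: the same bytes can be read as data or decoded as instructions.
# The opcode values below are hypothetical, chosen only for this illustration.
memory = bytes([0x01, 0x0A, 0x02, 0x14])

HYPOTHETICAL_OPCODES = {0x01: "load", 0x02: "add"}


def as_data(mem):
    """Interpret every byte as an unsigned number."""
    return list(mem)


def as_instructions(mem):
    """Interpret byte pairs as (opcode, operand) instructions."""
    return [(HYPOTHETICAL_OPCODES.get(mem[i], "???"), mem[i + 1])
            for i in range(0, len(mem) - 1, 2)]


print(as_data(memory))          # the bytes viewed as plain numbers
print(as_instructions(memory))  # the same bytes viewed as instructions
```

Nothing in the bytes themselves says which reading is right; only the way they are fetched decides.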
Storing programs
• Executable programs are typically stored
directly in binary form
• With a mix of data and instructions based on the von
Neumann stored program concept.
• Usually viewed as hexadecimal representation
– Easy to load into memory and execute
• The drawback is that without additional knowledge
it is impossible to distinguish between data
and instructions!
– So how does a computer correctly execute a
program?
Symbol Table
• Solution: Symbol Table
– A table that specifies the list of symbols used in a program
along with their addresses
– An example symbol table is shown below:
• Symbol tables are embedded as a part of the binary file using
special file organization (standards such as .EXE etc.)
Symbol      Address (Hex)
_start      0F
r           24
elsePart    20
done        22
Need for Mnemonics
• A modern microprocessor has a large
number of instructions
• Typically more than 100 different instructions
– Developing bit patterns (series of 1s and 0s) and
hex encodings for each instruction is too
cumbersome
• Imagine writing a complex program in hex!
– BTW people used to do it!
– And they still do it in certain cases
• It is hard to develop, document, troubleshoot, and
maintain
Mnemonics
• The issues with developing programs led to
the use of mnemonics to describe instructions
• Mnemonic
– Is an easier-to-remember word that corresponds to an
operation
– Rather than remembering bit patterns you remember
words
– Examples:
OP Code     Mnemonic
101101      Add
101111      Sub
101011      And
1111111     Or
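The mapping in the table above can be sketched as a pair of lookup tables. The bit patterns are copied from the table as given; the helper names encode and decode are hypothetical.

```python
# Opcode bit patterns copied from the table; helper names are hypothetical.
OPCODES = {"Add": "101101", "Sub": "101111", "And": "101011", "Or": "1111111"}
MNEMONICS = {bits: name for name, bits in OPCODES.items()}


def encode(mnemonic):
    """Translate a mnemonic into its bit pattern."""
    return OPCODES[mnemonic]


def decode(bits):
    """Translate a bit pattern back into its mnemonic."""
    return MNEMONICS[bits]


print(encode("Sub"), int(encode("Sub"), 2))  # bit pattern and its numeric value
print(decode("101011"))
```

A person only has to remember the left-hand names; a tool can do the lookup.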
Tradeoffs with Mnemonics
• Advantages of mnemonics
– The objective is to ease representation and understanding
of machine language programs.
• Use of mnemonics was a huge shift in programming paradigm
• Not comparable to high level programming languages
– They are still prevalently used to describe instructions that
can be processed by microprocessors
• Drawbacks
– The drawback is that they require additional processing or
manual translation to binary code.
– Not standardized, even today
• Consequently different tools and processors use different
conventions and notations!
– Have to know specifics of microprocessor to use.
Commonly used mnemonics
Mnemonic    Description
Mov/Move    Copy value from one location to another.
Add         Add two values together and store result in a register.
Sub         Subtract one value from another and store result in a register.
Cmp         Compare two values and set flags to reflect result of operation.
Jmp         Branch to a given address.
JNZ         Jump if Not Zero; conditionally branch if zero flag is 0.
Muddleheaded with Mnemonics
• For instance consider the following typical
usage of the mov mnemonic:
mov reg1, reg2
– How to interpret the above representation?
• Does it mean reg1 = reg2 or
• Does it mean reg2 = reg1
– There is no correct answer!
• Both conventions are used and different tools use
different conventions
• You just have to know the specifics of the tool or
manual to figure out the correct interpretation!
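To make the ambiguity concrete, the sketch below executes the same mov line under both conventions. The function run_mov and its dest_first flag are hypothetical names for this illustration. As one well-known instance, Intel-style x86 syntax puts the destination first, while AT&T-style syntax puts the source first.

```python
def run_mov(line, regs, dest_first):
    """Execute 'mov a, b' on a register dict under the chosen convention."""
    op, operands = line.split(None, 1)
    assert op == "mov"
    a, b = [name.strip() for name in operands.split(",")]
    dest, src = (a, b) if dest_first else (b, a)
    result = dict(regs)          # leave the caller's registers untouched
    result[dest] = regs[src]
    return result


regs = {"reg1": 1, "reg2": 2}
print(run_mov("mov reg1, reg2", regs, dest_first=True))   # reg1 = reg2
print(run_mov("mov reg1, reg2", regs, dest_first=False))  # reg2 = reg1
```

The same text leaves the registers in two different states, which is why the tool's manual has to settle the question.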
Addresses in Mnemonics
• Memory addresses are used in two different
ways in programs
– Used to load or store data
• Typically variables or symbols are associated with
memory addresses in these cases
– Used as target of branch or jump instructions
• Identifies location of next instruction to be executed
Extending to handle Variables
• Mnemonics were extended to ease
representation and use of addresses for
storing and retrieving data
• By associating variables or symbols with address
locations
– Example
• Let symbol sum be associated with address C3₁₆
• To store constant value 31 into address C3₁₆, the
mnemonic could be written as:
mov $31, sum
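A minimal sketch of what this instruction does once the symbol is resolved, assuming a 256-byte memory and a hypothetical helper named mov_immediate:

```python
symbols = {"sum": 0xC3}   # symbol table: name -> address
memory = [0] * 256        # assumption: 256 one-byte memory locations


def mov_immediate(value, symbol):
    """Model 'mov $value, symbol': store a constant at the symbol's address."""
    memory[symbols[symbol]] = value


mov_immediate(31, "sum")
print(hex(symbols["sum"]), memory[0xC3])
```

The programmer writes sum; the address C3 appears only in the table.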
Extending to handle Branching
• Mnemonics were extended to ease
representation and use of addresses for
conditional and unconditional branching by
using labels.
• A label is a symbol that points to a specified address
where some instruction is stored.
– Example
• Let label “end” be associated with address 1E₁₆
• To jump to the instruction stored at 1E₁₆, the mnemonic could
be written as:
jmp end
A Program using Mnemonics
; Semicolon indicates start of comment
; Program does: if (r == 10) sum = 20
;               else sum = 30
_start:
    mov r, Reg1      ; Reg1 = r
    mov $10, Reg2    ; Reg2 = 10
    sub Reg1, Reg2   ; Reg2 = Reg2 - Reg1
    jnz elsePart     ; Jump if result is not zero (r != 10)
    mov $20, Reg0    ; Reg0 = 20
    jmp done         ; Jump to done
elsePart:
    mov $30, Reg0    ; Reg0 = 30
done:
    mov Reg0, sum    ; sum = Reg0
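The control flow of this program can be traced with a small simulator. This is only a sketch: it assumes the source-first operand order used in the comments above, models the zero flag as a boolean set by sub, and uses list indexes instead of memory addresses for labels.

```python
def run(r_value):
    """Simulate the mnemonic program for a given value of r; returns sum."""
    program = [
        ("mov", "r", "Reg1"),
        ("mov", 10, "Reg2"),
        ("sub", "Reg1", "Reg2"),
        ("jnz", "elsePart"),
        ("mov", 20, "Reg0"),
        ("jmp", "done"),
        ("mov", 30, "Reg0"),      # elsePart
        ("mov", "Reg0", "sum"),   # done
    ]
    labels = {"elsePart": 6, "done": 7}  # label -> index into program
    state = {"r": r_value, "sum": 0, "Reg0": 0, "Reg1": 0, "Reg2": 0}
    zero_flag = False
    pc = 0                               # index of the next instruction
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "mov":
            src, dst = args
            state[dst] = src if isinstance(src, int) else state[src]
        elif op == "sub":
            src, dst = args
            state[dst] -= state[src]
            zero_flag = state[dst] == 0
        elif op == "jnz":
            if not zero_flag:
                pc = labels[args[0]]
        elif op == "jmp":
            pc = labels[args[0]]
    return state["sum"]


print(run(10), run(7))
```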
Symbol Table
• An example symbol table is shown below:
– Addresses calculated assuming that
• Program is stored consecutively starting with address 0F₁₆
• Each instruction occupies 2 bytes in memory.
Symbol      Address (Hex)
_start      0F
r           24
elsePart    20
done        22
Assembly Language
• High level mnemonic representation using
variables and labels is called assembly
language
• Typically includes a few additional program constructs
for defining variables and constants.
– Commonly used for device level programming
• Developing operating systems
• Developing device drivers
• Ultra-optimizing programs for specific hardware
• Tapping into special features of the processor
• Developing embedded programs that don’t rely on OS
or other environments (like JVM)
• For timing critical operations and for real time systems
Advantages of Assembly
• Maximum power & control on hardware
• Can’t get closer to hardware than this using software!
– Enables development of ultra-optimized code
• Tailored for a specific microprocessor
• Code can be compacted
– Can tap into special features of the
microprocessor
– Directly interface with devices and peripherals
– Provides accurate control over timing
• Needed for interfacing with high speed devices like:
network, hard disk, USB, CD-ROM, DVD-ROM, Video
Card, FireWire
Disadvantages of Assembly
• Least portable
– Typically tailored for specific microprocessor &
operating system
– However, x86 is most dominant these days. So
portability issues are typically myths.
• Hard to develop, troubleshoot, and maintain
– Depends on how you develop the code
– You can spend a long time troubleshooting Java too
• Requires “Out of the Box” thinking
– Most programmers have problems with this part.
Need for Assembler
• Ultimately the microprocessor requires
instructions to be stored in binary format that
can be easily processed.
– Similar to the bits we used in our data path
• However, assembly language programs are
not in binary!
– Need a tool to convert assembly to pure binary
Assembler
• Assembler is a program that converts
assembly code to binary
• It is essentially a compiler just like the Java compiler
– Handles address allocation for variables and
labels
– Generates symbol table for the program
– Includes other constructs to make program
development easier
• Like including or importing other assembly source
codes and libraries
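The label-address bookkeeping an assembler performs can be sketched as a single pass over the source. The function first_pass, the four-line program, the start address 10 (hex), and the fixed 2-byte instruction size are all assumptions made for this illustration; real assemblers also handle variable-size instructions and data definitions.

```python
def first_pass(lines, start_address, instruction_size=2):
    """Assign an address to every label; returns a symbol table dict."""
    symbols = {}
    address = start_address
    for line in lines:
        code = line.split(";", 1)[0].strip()  # drop comments and whitespace
        if not code:
            continue
        if code.endswith(":"):
            symbols[code[:-1]] = address      # label names the next instruction
        else:
            address += instruction_size       # an instruction occupies memory
    return symbols


# Hypothetical program used only to exercise the sketch.
source = """
loop:
    sub Reg1, Reg2   ; Reg2 = Reg2 - Reg1
    jnz loop
finish:
    mov Reg2, sum
""".splitlines()

table = first_pass(source, start_address=0x10)
for name, address in table.items():
    print(f"{name:10} {address:02X}")
```

A second pass would then replace each label in jmp and jnz operands with the address recorded here.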
Using an assembler
• Text Editor (Notepad)
– Produces the Assembly Source (text file saved on disk)
• Assembler (Software)
– Compiles source to binary
– Produces the Binary (object code, file on disk)
– The binary has a special file format (similar to GIF, JPEG etc.) to
contain instructions, data, and the symbol table.
• Linker (Software)
– Combines object codes & libraries (other binaries & libraries on
disk) into a single binary
– Produces the Executable Binary (file on disk)
How does binary get into memory?
• The generated binary code needs to be placed in
memory for the data path to process!
• This job is typically done by the operating system
– A special piece of the operating system called the loader
does the job.
• Loader requires the executable to be in a specific file format
– Loader reads the executable, places it in memory, and
sets the address of the next instruction (the program counter) to
the first instruction in the loaded program!
– The program then starts running
• The program typically uses the operating system services (via
library calls) to ease performing specific operations (like input and
output)
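The loader's two jobs, copying the program into memory and pointing the processor at its first instruction, can be sketched as follows. The dictionary layout with load_address, entry_offset, and image is a made-up stand-in for a real executable file format, and the byte values are arbitrary.

```python
def load(executable, memory):
    """Copy the code image into memory and return the initial program counter."""
    base = executable["load_address"]
    image = executable["image"]
    memory[base:base + len(image)] = image   # place the program in memory
    return base + executable["entry_offset"]  # address of the first instruction


memory = bytearray(256)  # assumption: 256 bytes of read-write memory
executable = {           # hypothetical, simplified executable
    "load_address": 0x0F,
    "entry_offset": 0,
    "image": bytes([0xA1, 0xB2, 0xC3]),
}

pc = load(executable, memory)
print(hex(pc), memory[0x0F:0x12].hex())
```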
How does the OS get into memory?
• The OS is just another program that is run by
the microprocessor. How does the OS get
into memory?
– It is loaded by another program called the Basic
Input Output System or BIOS
• BIOS actually loads only a small program called the
boot loader.
• The boot loader then loads the whole OS into memory.
• How does the BIOS get into memory?
– It is hardwired to the data path using non-volatile
Read-Only Memory (ROM)
What is ROM
• ROM or Read Only Memory
• Can be programmed to hold fixed data
• Writing to it is reserved or restricted
– Has similar configuration to Read-Write Memory
• Uses address bus to randomly select location to read
– So it is a Random Access Memory (RAM)
• Places the byte requested on the data bus.
– Manufactured using a different process
• A variety of ROM modules are available but most
commonly used ones are called Flash memory
– Flash is an EEPROM (Electrically Erasable and
Programmable Read Only Memory)
– Flash enables updating the BIOS using software, without removing
the ROM chip from the computer
Example ROM & RAM Layout
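[Figure: an 8-bit Data Bus and an 8-bit Address Bus connect to the ROM (BIOS) and to the Read-Write Memory, each of which has an enable (EN) input.]
• The most significant bit of the address bus drives the enable inputs.
– If this bit is 1, the ROM is enabled for reading.
– Otherwise the Read-Write Memory is enabled!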
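The enable rule in this layout can be sketched as a one-line address decoder. The 8-bit address width is an assumption read from the bus labels, and select_chip is a hypothetical name.

```python
ADDRESS_BITS = 8  # assumption: 8-bit address bus


def select_chip(address):
    """Return which memory the most significant address bit enables."""
    msb = (address >> (ADDRESS_BITS - 1)) & 1
    return "ROM" if msb == 1 else "RAM"


for address in (0x0F, 0x7F, 0x80, 0xFF):
    print(f"{address:02X} -> {select_chip(address)}")
```

Under this assumption the lower half of the address space (00 to 7F) reaches read-write memory and the upper half (80 to FF) reaches the ROM.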
BIOS Microprocessor Interface
• Microprocessors are designed to generate a
specific initial starting address
– The first instruction to be executed when the
processor is turned on by applying power to it
– The logic circuit is designed to map this address
into the ROM BIOS’s first instruction
• The BIOS runs and loads the boot loader from disk
into specific memory locations in read-write memory
• The BIOS then jumps to the first instruction in the boot
loader program which causes boot loader to run.
• The boot loader is software that in turn loads the actual
OS into read-write memory