Basic Concepts
Introduction
Chapter 1
What is Assembly Language?
Data Representation
Table 1. Software Hierarchy Levels
Level | Description
Application Program | Software designed for a particular class of applications.
High-Level Language (HLL) | Programs are compiled into either assembly language or machine language. E.g. C++, Pascal, Java, Visual Basic, etc.
Operating System | Contains procedures that can be called from programs written in either high-level language or assembly language. This system may also contain an application programming interface (API).
Assembly Language (ASM) | Uses instruction mnemonics that have a one-to-one correspondence with machine language.
Machine Language (ML) | Numeric instructions and operands that can be stored in memory and directly executed by the computer processor.
What is Assembly Language?
A low-level, processor-specific programming language designed to match the processor’s machine instruction set
Each assembly language instruction matches exactly one machine language instruction
Here we study Intel’s 80x86 (and Pentium) processors
Why learn Assembly Language?
To learn how high-level language code gets
translated into machine language
i.e.: to learn the details hidden in HLL code
To learn the computer’s hardware
by direct access to memory, video controller,
sound card, keyboard…
To speed up applications
direct access to hardware (ex: writing directly to
I/O ports instead of doing a system call)
well-written ASM code can be faster and smaller: rewrite the critical
areas of code in ASM
Assembly Language Applications
Application programs are rarely written completely
in assembly language
only time-critical parts are written in ASM
Ex: an interface subroutine (called from HLL
programs) is written in ASM for direct hardware
access
Ex2: device drivers (called from the OS)
ASM is often used for embedded systems (programs
stored in PROM chips)
computer cartridge games, microcontrollers
(automobiles, industrial plants...),
telecommunication equipment…
Very fast and compact but processor-specific
Table 2. Comparison of Assembly Language and High-Level Languages
Type of Application | High-Level Language | Assembly Language
Business application software for a single platform. | Formal structures make it easy to organize and maintain. | No formal structure.
Hardware device driver. | Awkward coding techniques required. | Hardware access is straightforward and simple.
Business application for multiple platforms. | Portable. | Difficult to maintain.
Embedded systems and computer games requiring direct hardware access. | Produces too much executable code, and may not run efficiently. | Ideal, because the executable code is small and runs quickly.
Machine Language
An assembler is a program that converts
ASM code into machine language code:
mov al,5 (Assembly Language)
1011000000000101 (Machine Language)
the most significant byte is the opcode for “move into register AL”
the least significant byte is for the operand “5”
Directly programming in machine language
offers no advantage (over Assembly)...
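As an illustration of the encoding above, the following Python sketch splits the 16-bit pattern into its two bytes. The variable names are chosen here for illustration only and are not part of the original slides.

```python
# Split the 16-bit machine-language pattern for "mov al,5" into two bytes.
machine_word = 0b1011000000000101

opcode = machine_word >> 8       # most significant byte
operand = machine_word & 0xFF    # least significant byte

print(f"opcode  = {opcode:08b}b = {opcode:02X}h")   # opcode  = 10110000b = B0h
print(f"operand = {operand:08b}b = {operand}")      # operand = 00000101b = 5
```

The opcode byte B0h stands for “move into register AL”, and the operand byte holds the value 5.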
Binary Numbers/Storage Size
Binary numbers are used to store both code and data
On Intel’s x86:
byte = 8 bits (smallest addressable unit)
word = 2 bytes
doubleword = 2 words
quadword = 2 doublewords
Data Representation
Even if we know that a block of memory
contains data, to obtain its value we need
to choose an interpretation
Ex: memory content “0100 0001” can either represent:
the number 2^6 + 1 = 65
or the ASCII code of character “A”
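A small Python sketch of the same idea: one bit pattern, two interpretations. The variable names are illustrative assumptions.

```python
# The same byte, read once as a number and once as an ASCII character.
byte = 0b01000001

as_number = byte            # numeric interpretation
as_character = chr(byte)    # ASCII interpretation

print(as_number)      # 65
print(as_character)   # A
```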
Data Representation
Number Systems
Binary/Octal/Decimal/Hexadecimal
Converting between various number systems
Signed/Unsigned Interpretation
Two’s Complement
Addition/Subtraction
Character Storage
Number Systems
A written number is meaningful only with respect to a base
To tell the assembler which base we use:
Hexadecimal 25 is written as 25h
Octal 25 is written as 25o or 25q
Binary 1010 is written as 1010b
Decimal 1010 is written as 1010 or 1010d
You are supposed to know how to convert from one
base to another (see appendix A)
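The suffix rules above can be sketched in Python as follows. The function name `parse_literal` is hypothetical, and the sketch handles only the suffixes listed on this slide; it is not how any particular assembler is implemented.

```python
def parse_literal(text):
    """Return the value of a literal written with a radix suffix.

    Handles only the suffixes listed above: h, o, q, b, d, or none.
    """
    radix_by_suffix = {"h": 16, "o": 8, "q": 8, "b": 2, "d": 10}
    suffix = text[-1].lower()
    if suffix in radix_by_suffix:
        return int(text[:-1], radix_by_suffix[suffix])
    return int(text, 10)  # no suffix: decimal


print(parse_literal("25h"))    # 37
print(parse_literal("25q"))    # 21
print(parse_literal("1010b"))  # 10
print(parse_literal("1010"))   # 1010
```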
Binary Numbers
Digits are 1 and 0
1 = true
0 = false
MSB – most significant bit
LSB – least significant bit
Bit numbering: in 1011001010011100, the MSB (leftmost) is bit 15 and the LSB (rightmost) is bit 0
Converting between various number systems
Converting Binary to Decimal
Converting Decimal to Binary
Converting Binary to Hexadecimal
Converting Hexadecimal to Decimal
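The four conversions listed above can be sketched by hand in Python. The function names are illustrative assumptions; Python's built-in `int(text, base)`, `bin()` and `hex()` do the same job, but the explicit loops show the positional arithmetic.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_decimal(bits):
    """Accumulate the value of a binary string, left to right."""
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)
    return value


def decimal_to_binary(value):
    """Repeatedly divide by 2; the remainders are the bits, LSB first."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        digits.append(str(value % 2))
        value //= 2
    return "".join(reversed(digits))


def binary_to_hex(bits):
    """Pad to a multiple of 4 bits, then map each 4-bit group to a hex digit."""
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[binary_to_decimal(group)] for group in groups)


def hex_to_decimal(digits):
    """Accumulate the value of a hexadecimal string, left to right."""
    value = 0
    for digit in digits.upper():
        value = value * 16 + HEX_DIGITS.index(digit)
    return value


print(binary_to_decimal("01000001"))        # 65
print(decimal_to_binary(65))                # 1000001
print(binary_to_hex("1011001010011100"))    # B29C
print(hex_to_decimal("25"))                 # 37
```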
Signed and Unsigned Interpretation
When a memory block contains a number, to obtain its value we must choose either:
the signed interpretation: in that case the most significant bit (msb) represents the sign
Positive number (or zero) if msb = 0
Negative number if msb = 1
the unsigned interpretation: in that case all the bits are used to represent a magnitude (i.e. a positive number, or zero)
Signed Integers
The highest bit indicates the sign: 1 = negative, 0 = positive
1111 0110 (sign bit = 1): Negative
0000 1010 (sign bit = 0): Positive
If the highest digit of a hexadecimal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D
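The sign test can be sketched in Python by isolating the highest bit. The function name `is_negative` and the default width of 8 bits are assumptions made for this illustration.

```python
def is_negative(pattern, bits=8):
    """True if the most significant bit of a `bits`-wide pattern is 1."""
    return ((pattern >> (bits - 1)) & 1) == 1


print(is_negative(0b11110110))  # True
print(is_negative(0b00001010))  # False

# The hexadecimal examples from the text all have a highest digit > 7.
for pattern in (0x8A, 0xC5, 0xA2, 0x9D):
    print(f"{pattern:02X}h", is_negative(pattern))
```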
Two’s Complement Notation
Used to represent negative numbers
The two’s complement of a positive number X, denoted by NEG(X), is obtained by complementing all its bits and adding 1
NEG(X) = NOT(X) + 1
Ex: NEG(10) = NOT(10) + 1
= NOT(0000 1010b) + 1
= (1111 0101b) + 1 = 1111 0110b
= -10
It follows that X + NEG(X) = 0
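A minimal Python sketch of NEG(X) = NOT(X) + 1, assuming an 8-bit width. Python integers are unbounded, so the result is masked to 8 bits; the name `neg` is chosen for this illustration.

```python
def neg(x, bits=8):
    """Two's complement of x within a fixed width: NOT(x) + 1."""
    mask = (1 << bits) - 1
    return ((~x & mask) + 1) & mask


print(f"{neg(10):08b}")              # 11110110
print((10 + neg(10)) & 0xFF)         # 0, i.e. X + NEG(X) = 0 in 8 bits
```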
Forming the Two's Complement
Negative numbers are stored in two's
complement notation
Represents the additive inverse
Note that 00000001 + 11111111 = 00000000 (the carry out of the most significant bit is discarded)
Binary Subtraction
To perform the difference X - Y, the machine executes the addition X + NEG(Y)
Ex: 00001100 – 00000011 is computed as
  00001100
+ 11111101
  00001001
Practice: Subtract 0101 from 1001.
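The 8-bit example above (00001100 minus 00000011) can be sketched in Python as an addition of the two's complement. The names `BITS`, `MASK` and `subtract` are assumptions for this illustration.

```python
BITS = 8
MASK = (1 << BITS) - 1


def subtract(x, y):
    """Compute x - y as x + NEG(y), keeping only the low 8 bits."""
    neg_y = ((~y & MASK) + 1) & MASK
    return (x + neg_y) & MASK   # the carry out of bit 7 is discarded


result = subtract(0b00001100, 0b00000011)
print(f"{result:08b}")  # 00001001
```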
Maximum and Minimum Values
The msb of a signed number is used for its sign
fewer bits are left for its magnitude
Ex: for a signed byte
smallest non-negative = 0000 0000b = 0
largest positive = 0111 1111b = 127
largest negative = 1111 1111b = -1
smallest negative = 1000 0000b = -128
Ranges of Unsigned Integers
Standard sizes (in bits):
byte = 8
word = 16
doubleword = 32
quadword = 64
An unsigned integer of n bits ranges from 0 to 2^n - 1
Practice: What is the largest unsigned integer that may be stored in 20 bits?
Ranges of Signed Integers
The highest bit is reserved for the sign. This limits the range: a signed integer of n bits ranges from -2^(n-1) to 2^(n-1) - 1
Practice: What is the largest positive value that may be stored in 20 bits?
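The ranges for the standard sizes can be tabulated with a short Python sketch. The function names are illustrative; the formulas are the usual ones for unsigned and two's complement integers of n bits.

```python
def unsigned_range(bits):
    """Smallest and largest unsigned values that fit in `bits` bits."""
    return 0, 2**bits - 1


def signed_range(bits):
    """Smallest and largest two's complement values that fit in `bits` bits."""
    return -(2**(bits - 1)), 2**(bits - 1) - 1


for name, bits in [("byte", 8), ("word", 16), ("doubleword", 32), ("quadword", 64)]:
    print(name, unsigned_range(bits), signed_range(bits))
```

Calling either function with `bits=20` is one way to check an answer to the practice questions.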
Signed/Unsigned Interpretation (again)
To obtain the value of a number we need to choose an interpretation
Ex: memory content 1111 1111 can either represent:
-1 if a signed interpretation is used
255 if an unsigned interpretation is used
Only the programmer can provide an interpretation of the content of memory
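Python's `int.from_bytes` makes the choice of interpretation explicit through its `signed` argument. This sketch reads the same byte both ways; the variable names are assumptions.

```python
# One byte of memory content: 1111 1111
raw = bytes([0b11111111])

unsigned_value = int.from_bytes(raw, byteorder="little", signed=False)
signed_value = int.from_bytes(raw, byteorder="little", signed=True)

print(unsigned_value)  # 255
print(signed_value)    # -1
```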
Character Storage Systems
Character sets
Standard ASCII (0 – 127)
Extended ASCII (0 – 255)
ANSI (0 – 255)
Unicode (0 – 65,535 in its 16-bit form)
Null-terminated String
Array of characters followed by a null byte
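A null-terminated string can be sketched in Python with a `bytes` buffer. The helper name `read_null_terminated` and the sample buffer are hypothetical.

```python
def read_null_terminated(buffer):
    """Return the ASCII text stored before the first null byte."""
    end = buffer.index(0)            # position of the terminating null byte
    return buffer[:end].decode("ascii")


# "HELLO" followed by a null byte; whatever comes after it is ignored.
memory = b"HELLO\x00leftover"
print(read_null_terminated(memory))  # HELLO
```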
ASCII vs Extended ASCII
The ASCII code (from 00h to 7Fh)
Only codes from 20h to 7Eh represent printable characters. The rest are control codes (used for printing, transmission…).
Extended ASCII character set (codes 80h to FFh)
Varies from one system to another
MS-DOS usage: for accented characters, Greek symbols and some graphic characters
The ASCII character set
CR = “carriage return” (MS-DOS: move to beginning of line)
LF = “line feed” (MS-DOS: move directly one line below)
SPC = “blank space”
Text Files
These are files containing only ASCII characters
But different conventions are used for indicating an “end-of-line”
MS-DOS: <CR>+<LF>
UNIX: <LF>
Mac (classic Mac OS): <CR>
This is at the origin of many problems encountered during transfers of text files from one system to another
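One common way to avoid such problems is to convert every end-of-line convention to a single one. The Python sketch below converts to the UNIX convention; the function name `normalize_newlines` is an assumption.

```python
def normalize_newlines(data):
    """Convert <CR>+<LF> and lone <CR> line endings to <LF>."""
    # Replace the two-byte sequence first so its <CR> is not converted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


mixed = b"dos line\r\nold mac line\runix line\n"
print(normalize_newlines(mixed))
# b'dos line\nold mac line\nunix line\n'
```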
Strings and numbers
A string is stored as an array of characters
A 1-byte ASCII code is stored for each char
Hence, we can either store the number 123 in
numerical form or as the string “123”
The string form is best for display
The numerical form is best for computations
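The difference between the two forms can be seen by looking at the bytes each one occupies. This is a Python sketch with illustrative variable names.

```python
number = 123
text = "123"

numeric_form = number.to_bytes(1, byteorder="little")  # one byte: 7Bh
string_form = text.encode("ascii")                     # three bytes: 31h 32h 33h

print(numeric_form.hex())   # 7b
print(string_form.hex())    # 313233

# Computation needs the numerical form; display needs the string form.
print(int(text) + 1)        # 124
print(str(number + 1))      # 124
```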