ARM Assembly Language - CSE @ IITD

PowerPoint Slides
Computer Organisation and Architecture
Smruti Ranjan Sarangi,
IIT Delhi
Chapter 4 ARM Assembly Language
PROPRIETARY MATERIAL. © 2014 The McGraw-Hill Companies, Inc. All rights reserved. No part of this PowerPoint slide may be displayed, reproduced or distributed in any form or by any means, without the prior written permission of the publisher, or used beyond the limited distribution to teachers and educators permitted by McGraw-Hill for their individual course preparation. PowerPoint Slides are being provided only to authorized professors and instructors for use in preparing for classes using the affiliated textbook. No other use or distribution of this PowerPoint slide is permitted. The PowerPoint slide may not be sold and may not be distributed or be used by any student or any other third party. No part of the slide may be reproduced, displayed or distributed in any form or by any means, electronic or otherwise, without the prior written permission of McGraw Hill Education (India) Private Limited.

These slides are meant to be used along with the book: Computer Organisation and Architecture, Smruti Ranjan Sarangi, McGraw-Hill, 2015.
Visit: http://www.cse.iitd.ernet.in/~srsarangi/archbooksoft.html

ARM Assembly Language
• One of the most popular RISC instruction sets in use today
• Used by licensees of ARM Limited, UK
  • ARM processors
  • Some processors by Samsung, Qualcomm, and Apple
• Highly versatile instruction set
  • Floating-point and vector (multiple operations per instruction) extensions

Outline
• Basic Instructions
• Advanced Instructions
• Branch Instructions
• Memory Instructions
• Instruction Encoding

ARM Machine Model
• 16 registers: r0 … r15
• The PC is explicitly visible
• Memory (Von Neumann Architecture)

Register  Abbrv.  Name
r11       fp      frame pointer
r12       ip      intra-procedure-call scratch register
r13       sp      stack pointer
r14       lr      link register
r15       pc      program counter

Data Transfer Instructions
• mov and mvn (move not)

Semantics            Example       Explanation
mov reg, (reg/imm)   mov r1, r2    r1 ← r2
                     mov r1, #3    r1 ← 3
mvn reg, (reg/imm)   mvn r1, r2    r1 ← ∼r2
                     mvn r1, #3    r1 ← ∼3

Arithmetic Instructions
• add, sub, rsb (reverse subtract)

Semantics                 Example          Explanation
add reg, reg, (reg/imm)   add r1, r2, r3   r1 ← r2 + r3
sub reg, reg, (reg/imm)   sub r1, r2, r3   r1 ← r2 - r3
rsb reg, reg, (reg/imm)   rsb r1, r2, r3   r1 ← r3 - r2

Example
Write an ARM assembly program to compute: 4 + 5 - 19. Save the result in r1.
Answer: Simple yet suboptimal solution.
mov r1, #4
mov r2, #5
add r3, r1, r2
mov r4, #19
sub r1, r3, r4
Optimal solution.
mov r1, #4
add r1, r1, #5
sub r1, r1, #19

Logical Instructions
• and, eor (exclusive or), orr (or), bic (bit clear)

Semantics                 Example          Explanation
and reg, reg, (reg/imm)   and r1, r2, r3   r1 ← r2 AND r3
eor reg, reg, (reg/imm)   eor r1, r2, r3   r1 ← r2 XOR r3
orr reg, reg, (reg/imm)   orr r1, r2, r3   r1 ← r2 OR r3
bic reg, reg, (reg/imm)   bic r1, r2, r3   r1 ← r2 AND (∼r3)

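Example
Write an ARM assembly program to compute NOT (A OR B), where A and B are 1-bit Boolean values. Assume that A = 0 and B = 1. Save the result in r0.
Answer:
mov r0, #0x0        /* r0 = A = 0 */
orr r0, r0, #0x1    /* r0 = A OR B */
mvn r0, r0          /* r0 = NOT (A OR B); the 1-bit result is the least significant bit */
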
Multiplication Instruction
• smull and umull produce a 64-bit result, held in a pair of registers (r1: upper 32 bits, r0: lower 32 bits in the examples)

Semantics                  Example                Explanation
mul reg, reg, reg          mul r1, r2, r3         r1 ← r2 × r3
mla reg, reg, reg, reg     mla r1, r2, r3, r4     r1 ← r2 × r3 + r4
smull reg, reg, reg, reg   smull r0, r1, r2, r3   r1:r0 ← r2 × r3 (signed, 64-bit result)
umull reg, reg, reg, reg   umull r0, r1, r2, r3   r1:r0 ← r2 × r3 (unsigned, 64-bit result)

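The difference between smull and umull can be sketched in C by widening the operands to 64 bits before multiplying. This is only an illustrative model: the type reg_pair and the function names are hypothetical and do not come from the slides.

```c
#include <stdint.h>

/* Hypothetical model of the register pair written by smull/umull. */
typedef struct {
    uint32_t lo; /* lower 32 bits (r0 in the slide's example) */
    uint32_t hi; /* upper 32 bits (r1 in the slide's example) */
} reg_pair;

/* Split a 64-bit product into its two 32-bit halves. */
static reg_pair split64(uint64_t product) {
    reg_pair p;
    p.lo = (uint32_t)(product & 0xFFFFFFFFu);
    p.hi = (uint32_t)(product >> 32);
    return p;
}

/* Model of umull: both operands are treated as unsigned. */
reg_pair umull_model(uint32_t a, uint32_t b) {
    return split64((uint64_t)a * (uint64_t)b);
}

/* Model of smull: both operands are treated as signed two's complement. */
reg_pair smull_model(int32_t a, int32_t b) {
    int64_t product = (int64_t)a * (int64_t)b;
    return split64((uint64_t)product);
}
```

The same bit pattern gives different upper halves: 0xFFFFFFFF is -1 for smull_model but 4294967295 for umull_model.
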
Example
Compute 12^3 + 1, and save the result in r3.
Answer:
/* load test values */
mov r0, #12
mov r1, #1
/* perform the computation */
mul r4, r0, r0 @ 12*12
mla r3, r4, r0, r1 @ 12*12*12 + 1

Shifter Operands
Generic format:
reg1, {lsl | lsr | asr | ror} {#shift_amt | reg2}

Examples (5-bit illustration):
Input   Operation   Result
10110   lsl #1      01100
10110   lsr #1      01011
10110   asr #1      11011
10110   ror #1      01011

Examples of Shifter Operands
Write ARM assembly code to compute: r1 = r2 / 4.
Answer:
mov r1, r2, asr #2
Write ARM assembly code to compute: r1 = r2 + r3 × 4.
Answer:
add r1, r2, r3, lsl #2
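The four shift types can be modelled in C on 32-bit values. The sketch below is illustrative only: the function names are hypothetical, and each function assumes a shift amount strictly between 0 and 32. Note that asr rounds toward minus infinity, so for a negative r2 that is not a multiple of 4, asr #2 gives a different answer from C's truncating division.

```c
#include <stdint.h>

/* Hypothetical models of the shift types, valid for 0 < n < 32. */

/* Logical shift left: zeros enter from the right. */
uint32_t lsl32(uint32_t x, unsigned n) { return x << n; }

/* Logical shift right: zeros enter from the left. */
uint32_t lsr32(uint32_t x, unsigned n) { return x >> n; }

/* Arithmetic shift right: copies of the sign bit enter from the left. */
uint32_t asr32(uint32_t x, unsigned n) {
    uint32_t shifted = x >> n;
    if (x & 0x80000000u) {
        shifted |= ~(0xFFFFFFFFu >> n); /* set the top n bits */
    }
    return shifted;
}

/* Rotate right: bits leaving on the right re-enter on the left. */
uint32_t ror32(uint32_t x, unsigned n) {
    return (x >> n) | (x << (32u - n));
}
```

For example, asr32 applied to the bit pattern of -7 with n = 2 yields the bit pattern of -2, whereas -7 / 4 in C is -1.
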
Compare Instructions
Semantics            Example      Explanation
cmp reg, (reg/imm)   cmp r1, r2   Set flags after computing (r1 - r2)
cmn reg, (reg/imm)   cmn r1, r2   Set flags after computing (r1 + r2)
tst reg, (reg/imm)   tst r1, r2   Set flags after computing (r1 AND r2)
teq reg, (reg/imm)   teq r1, r2   Set flags after computing (r1 XOR r2)

• These instructions set the flags of the CPSR register
  • CPSR (Current Program Status Register)
  • N (negative), Z (zero), C (carry), V (overflow)
  • If we need to borrow a bit in a subtraction, we set C to 0, otherwise we set it to 1.

Instructions with the 's' suffix
• Compare instructions are not the only instructions that set the flags.
• We can add an s suffix to regular ALU instructions to set the flags.
  • An instruction with the 's' suffix sets the flags in the CPSR register.
  • adds (add and set the flags)
  • subs (subtract and set the flags)

Instructions that use the Flags
• add and subtract instructions that use the value of the carry flag

Semantics           Example          Explanation
adc reg, reg, reg   adc r1, r2, r3   r1 = r2 + r3 + Carry Flag
sbc reg, reg, reg   sbc r1, r2, r3   r1 = r2 - r3 - NOT(Carry Flag)
rsc reg, reg, reg   rsc r1, r2, r3   r1 = r3 - r2 - NOT(Carry Flag)

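64 bit addition using 32 bit registers
Add two 64-bit (long) values stored in the register pairs (r2, r1) and (r4, r3), where r1 and r3 hold the lower 32 bits. The result is placed in (r6, r5).
Answer:
adds r5, r1, r3
adc r6, r2, r4
The adds instruction adds the values in r1 and r3 and sets the carry flag. The adc (add with carry) instruction then adds r2, r4, and the value of the carry flag. This is the same as carrying into the next column in ordinary addition.
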
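The carry propagation can be checked with a small C model. This is a sketch under the assumption that r1 and r3 hold the lower words and r2 and r4 the upper words; the type sum64 and the function add64_model are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical model of the result pair: lo plays the role of r5, hi of r6. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} sum64;

sum64 add64_model(uint32_t r1, uint32_t r2, uint32_t r3, uint32_t r4) {
    sum64 s;
    /* adds r5, r1, r3: add the low words; the carry is 1 if the sum wrapped around */
    s.lo = r1 + r3;
    uint32_t carry = (s.lo < r1) ? 1u : 0u;
    /* adc r6, r2, r4: add the high words plus the carry flag */
    s.hi = r2 + r4 + carry;
    return s;
}
```

Adding 0x00000001FFFFFFFF and 0x0000000000000001 this way gives lo = 0 and hi = 2, that is, 0x0000000200000000.
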
Simple Branch Instructions
• b (unconditional branch)
• b<code> (conditional branch)

Semantics   Example    Explanation
b label     b .foo     Jump unconditionally to label .foo
beq label   beq .foo   Branch to .foo if the last flag-setting instruction has resulted in an equality (Z flag is 1)
bne label   bne .foo   Branch to .foo if the last flag-setting instruction has resulted in an inequality (Z flag is 0)

Branch Conditions
Number  Suffix  Meaning                               Flag State
0       eq      equal                                 Z = 1
1       ne      not equal                             Z = 0
2       cs/hs   carry set / unsigned higher or equal  C = 1
3       cc/lo   carry clear / unsigned lower          C = 0
4       mi      negative / minus                      N = 1
5       pl      positive or zero / plus               N = 0
6       vs      overflow                              V = 1
7       vc      no overflow                           V = 0
8       hi      unsigned higher                       (C = 1) ∧ (Z = 0)
9       ls      unsigned lower or equal               (C = 0) ∨ (Z = 1)
10      ge      signed greater than or equal          N = V
11      lt      signed less than                      N ≠ V
12      gt      signed greater than                   (Z = 0) ∧ (N = V)
13      le      signed less than or equal             (Z = 1) ∨ (N ≠ V)
14      al      always                                –
15      –       reserved                              –

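To see how a compare instruction and a condition suffix work together, the sketch below models cmp in C and evaluates a few conditions from the resulting flags. It assumes the standard ARM definitions, in which the signed conditions compare N with V so that they remain correct when the subtraction overflows. The type flags and all function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of the four CPSR condition flags. */
typedef struct { bool n, z, c, v; } flags;

/* Model of "cmp a, b": compute a - b and derive the flags. */
flags cmp_model(uint32_t a, uint32_t b) {
    uint32_t diff = a - b;
    flags f;
    f.n = (diff >> 31) != 0;  /* result is negative */
    f.z = (diff == 0);        /* result is zero */
    f.c = (a >= b);           /* 1 means that no borrow was needed */
    /* signed overflow: a and b differ in sign, and the result's sign differs from a's */
    f.v = (((a ^ b) & (a ^ diff)) >> 31) != 0;
    return f;
}

bool cond_hi(flags f) { return f.c && !f.z; }          /* unsigned higher */
bool cond_ls(flags f) { return !f.c || f.z; }          /* unsigned lower or equal */
bool cond_ge(flags f) { return f.n == f.v; }           /* signed >= */
bool cond_lt(flags f) { return f.n != f.v; }           /* signed <  */
bool cond_gt(flags f) { return !f.z && f.n == f.v; }   /* signed >  */
bool cond_le(flags f) { return f.z || f.n != f.v; }    /* signed <= */
```

Comparing 0x80000000 (the most negative signed value) with 1 gives N = 0 and V = 1: the lt condition holds for the signed interpretation, while hi holds for the unsigned interpretation of the same bits.
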
Example
Write an ARM assembly program to compute the factorial of a positive number (> 1) stored in r0. Save the result in r1.
Answer:
ARM assembly
mov r1, #1          /* prod = 1 */
mov r3, #1          /* idx = 1 */
.loop:
mul r1, r3, r1      /* prod = prod * idx */
cmp r3, r0          /* compare idx with the input (num) */
add r3, r3, #1      /* idx++ */
bne .loop           /* loop condition */

Branch and Link Instruction
• We use the bl instruction for a function call

Semantics   Example   Explanation
bl label    bl .foo   (1) Jump unconditionally to the function at .foo
                      (2) Save the next PC (PC + 4) in the lr register

Example
Example of an assembly program with a function call.
C
int foo() {
    return 2;
}
void main() {
    int x = 3;
    int y = x + foo();
}
ARM assembly
foo:
mov r0, #2
mov pc, lr
main:
mov r1, #3          /* x = 3 */
bl foo              /* invoke foo */
/* y = x + foo() */
add r2, r0, r1

The bx Instruction
Semantics   Example   Explanation
bx reg      bx r2     Jump unconditionally to the address contained in the register r2

• This is the preferred method to return from a function.
  • Instead of: mov pc, lr
  • Use: bx lr

Example
foo:
mov r0, #2
bx lr
main:
mov r1, #3 /* x = 3 */
bl foo /* invoke foo */
/* y = x + foo() */
add r2, r0, r1
Conditional Variants of Normal Instructions
• Normal Instruction + <condition>
  • Examples: addeq, subne, addmi, subpl
• Also known as predicated instructions
• If the condition is true
  • Execute the instruction normally
• Otherwise
  • Do not execute it at all

Write a program in ARM assembly to count the number of 1s in a 32 bit number stored in r1. Save
the result in r4.
Answer:
mov r2, #1 /* idx = 1 */
mov r4, #0 /* count = 0 */
/* start the iterations */
.loop:
/* extract the LSB and compare */
and r3, r1, #1
cmp r3, #1
/* increment the counter */
addeq r4, r4, #1
/* prepare for the next iteration */
mov r1, r1, lsr #1
add r2, r2, #1
/* loop condition */
cmp r2, #32
ble .loop
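For comparison, the same bit-counting loop can be written in C, one statement per assembly instruction. This is an illustrative sketch: the function name count_ones is hypothetical, and the variables are named after the registers they mirror.

```c
#include <stdint.h>

/* Mirrors the assembly: r1 = value, r2 = idx, r3 = extracted bit, r4 = count. */
uint32_t count_ones(uint32_t r1) {
    uint32_t r4 = 0;                          /* count = 0 */
    for (uint32_t r2 = 1; r2 <= 32; r2++) {   /* idx = 1 .. 32 */
        uint32_t r3 = r1 & 1u;                /* and r3, r1, #1 */
        if (r3 == 1u) {                       /* cmp r3, #1 */
            r4 += 1;                          /* addeq r4, r4, #1 */
        }
        r1 >>= 1;                             /* mov r1, r1, lsr #1 */
    }
    return r4;
}
```

The if statement plays the role of the predicated addeq: the increment happens only when the preceding comparison found equality.
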
Basic Load Instruction
• ldr r1, [r0]
  • r1 ← mem[r0]
[Figure: ldr r1, [r0], showing memory and the register file (r0, r1)]

Basic Store Instruction
• str r1, [r0]
  • mem[r0] ← r1
[Figure: str r1, [r0], showing memory and the register file (r0, r1)]

Memory Instructions with an Offset
• ldr r1, [r0, #4]
  • r1 ← mem[r0 + 4]
• ldr r1, [r0, r2]
  • r1 ← mem[r0 + r2]

Table of Load/Store Instructions
• Note the base-scaled-index addressing mode

Semantics                        Example                    Explanation               Addressing Mode
ldr reg, [reg]                   ldr r1, [r0]               r1 ← [r0]                 register-indirect
ldr reg, [reg, imm]              ldr r1, [r0, #4]           r1 ← [r0 + 4]             base-offset
ldr reg, [reg, reg]              ldr r1, [r0, r2]           r1 ← [r0 + r2]            base-index
ldr reg, [reg, reg, shift imm]   ldr r1, [r0, r2, lsl #2]   r1 ← [r0 + (r2 << 2)]     base-scaled-index
str reg, [reg]                   str r1, [r0]               [r0] ← r1                 register-indirect
str reg, [reg, imm]              str r1, [r0, #4]           [r0 + 4] ← r1             base-offset
str reg, [reg, reg]              str r1, [r0, r2]           [r0 + r2] ← r1            base-index
str reg, [reg, reg, shift imm]   str r1, [r0, r2, lsl #2]   [r0 + (r2 << 2)] ← r1     base-scaled-index

Example with Arrays
C
void addNumbers(int a[100]) {
    int idx;
    int sum = 0;
    for (idx = 0; idx < 100; idx++){
        sum = sum + a[idx];
    }
}
Answer:
ARM assembly
/* base address of array a in r0 */
mov r1, #0                  /* sum = 0 */
mov r2, #0                  /* idx = 0 */
.loop:
ldr r3, [r0, r2, lsl #2]    /* r3 = a[idx] */
add r2, r2, #1              /* idx++ */
add r1, r1, r3              /* sum += a[idx] */
cmp r2, #100                /* loop condition */
bne .loop

Advanced Memory Instructions
• Consider an array access again
  • ldr r3, [r0, r2, lsl #2] /* access array */
  • add r2, r2, #1 /* increment index */
• Can we fuse both into one instruction?
  • ldr r3, [r0], r2, lsl #2
  • Equivalent to:
    • r3 = [r0]
    • r0 = r0 + (r2 << 2)
• This is the post-indexed addressing mode

Pre-Indexed Addressing Mode
• Consider
  • ldr r0, [r1, #4]!
• This is equivalent to:
  • r0 ← mem[r1 + 4]
  • r1 ← r1 + 4
• Post-indexed and pre-indexed addressing are similar to i++ and ++i in Java/C/C++

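The order of the memory access and the base update in the two modes can be sketched in C. The model below is an illustration under stated assumptions: memory is a small word array, addresses are word-aligned byte offsets into it, and the function names ldr_post and ldr_pre are hypothetical.

```c
#include <stdint.h>

/* Post-indexed load, e.g. "ldr rd, [rn], #offset":
   read from the old base address, then update the base. */
uint32_t ldr_post(const uint32_t *mem, uint32_t *base, uint32_t offset) {
    uint32_t value = mem[*base / 4]; /* rd <- mem[rn] */
    *base += offset;                 /* rn <- rn + offset */
    return value;
}

/* Pre-indexed load with write-back, e.g. "ldr rd, [rn, #offset]!":
   update the base first, then read from the new address. */
uint32_t ldr_pre(const uint32_t *mem, uint32_t *base, uint32_t offset) {
    *base += offset;                 /* rn <- rn + offset */
    return mem[*base / 4];           /* rd <- mem[rn + offset] */
}
```

With a memory holding the words 10, 20, 30, 40 and a base of 0, ldr_post with offset 4 returns 10 while ldr_pre returns 20; both leave the base at 4.
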
Example with Arrays
C
void addNumbers(int a[100]) {
    int idx;
    int sum = 0;
    for (idx = 0; idx < 100; idx++){
        sum = sum + a[idx];
    }
}
Answer:
ARM assembly
/* base address of array a in r0 */
mov r1, #0          /* sum = 0 */
add r4, r0, #400    /* set r4 to address of a[100] */
.loop:
ldr r3, [r0], #4    /* r3 = a[idx]; advance r0 to the next element */
add r1, r1, r3      /* sum += a[idx] */
cmp r0, r4          /* loop condition */
bne .loop

Memory Instructions in Functions
• stmfd → spill a set of registers
• ldmfd → restore a set of registers

Instruction                      Semantics
ldmfd sp!, {list of registers}   Pop the stack and assign values to registers in ascending order. Update sp.
stmfd sp!, {list of registers}   Push the registers on the stack in descending order. Update sp.

Example
Write a function in C and implement it in ARM assembly to compute x^n, where x and n are natural numbers. Assume that x is passed through r0, n through r1, and the return value is passed back to the original program via r0.
Answer:
ARM assembly
power:
cmp r1, #0              /* compare n with 0 */
moveq r0, #1            /* return 1 */
bxeq lr                 /* return */
stmfd sp!, {r4, lr}     /* save r4 and lr */
mov r4, r0              /* save x in r4 */
sub r1, r1, #1          /* n = n - 1 */
bl power                /* recursively call power */
mul r0, r4, r0          /* power(x,n) = x * power(x,n-1) */
ldmfd sp!, {r4, pc}     /* restore r4 and return */

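The exercise asks for a C function, but only the assembly appears in this transcript. A possible C counterpart of the recursive assembly routine is sketched below; it is reconstructed from the assembly and its comments, and is not necessarily the author's version.

```c
/* power(x, n) = 1 if n == 0, otherwise x * power(x, n - 1) */
unsigned int power(unsigned int x, unsigned int n) {
    if (n == 0) {
        return 1;                 /* cmp r1, #0 ; moveq r0, #1 ; bxeq lr */
    }
    return x * power(x, n - 1);   /* bl power ; mul r0, r4, r0 */
}
```

In the assembly, x has to be kept in r4 across the recursive call because r0 is overwritten by the return value; r4 and lr are saved on the stack so that each level of the recursion can restore them.
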
Generic Format
Field   cond    type
Bits    32-29   28-27
Width   4       2

• cond → instruction condition (eq, ne, …)
• type → instruction type

Data Processing Instructions
Field   cond    00      I    opcode   S    rs      rd      shifter operand/immediate
Bits    32-29   28-27   26   25-22    21   20-17   16-13   12-1
Width   4       2       1    4        1    4       4       12

• Data processing instruction type: 00
• I → Immediate bit
• opcode → Instruction code
• S → 'S' suffix bit (for setting the CPSR flags)
• rs, rd → source register, destination register

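The field layout can be exercised with a small C decoder. This is a sketch: the type dp_fields and the helper bits are hypothetical names, and bit positions are numbered from 1 as on the slide. The sample word 0xE2811005 in the usage note is, to the best of my knowledge, the standard ARM encoding of add r1, r1, #5; the opcode value 0100 for add comes from the ARM reference, not from these slides.

```c
#include <stdint.h>

/* Hypothetical container for the fields of a data processing instruction. */
typedef struct {
    uint32_t cond;    /* bits 32-29 */
    uint32_t type;    /* bits 28-27 (00 for data processing) */
    uint32_t i;       /* bit 26 */
    uint32_t opcode;  /* bits 25-22 */
    uint32_t s;       /* bit 21 */
    uint32_t rs;      /* bits 20-17 */
    uint32_t rd;      /* bits 16-13 */
    uint32_t operand; /* bits 12-1 */
} dp_fields;

/* Extract `width` bits whose lowest bit sits at 1-indexed position `low`. */
static uint32_t bits(uint32_t word, unsigned low, unsigned width) {
    return (word >> (low - 1u)) & ((1u << width) - 1u);
}

dp_fields decode_dp(uint32_t word) {
    dp_fields f;
    f.cond    = bits(word, 29, 4);
    f.type    = bits(word, 27, 2);
    f.i       = bits(word, 26, 1);
    f.opcode  = bits(word, 22, 4);
    f.s       = bits(word, 21, 1);
    f.rs      = bits(word, 17, 4);
    f.rd      = bits(word, 13, 4);
    f.operand = bits(word, 1, 12);
    return f;
}
```

Decoding 0xE2811005 yields cond = 14 (al), type = 0, I = 1, opcode = 4, S = 0, rs = 1, rd = 1 and a 12-bit operand of 5.
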
Encoding Immediate Values
• ARM has 12 bits for immediates
• What do we do with 12 bits?
  • It is not 1 byte, nor is it 2 bytes
• Let us divide 12 bits into two parts
  • 8 bit payload + 4 bit rot

Encoding Immediates - II
• The real value of the immediate is equal to: payload ror (2 * rot)

Field   rot   payload
Width   4     8

• The programmer/compiler writes an assembly instruction with an immediate: e.g. 4
• The assembler converts it into a 12 bit format (if it is possible to do so)
• The processor expands 12 bits → 32 bits

Encoding Immediates - III
• Explanation of encoding the immediate in layman's terms
  • The payload is an 8 bit quantity
  • A number is a 32 bit quantity
  • We can set 8 contiguous bits in the 32 bit number while specifying an immediate
  • The starting point of this sequence of bits needs to be an even number such as 0, 2, 4, ...

Examples
Encode the decimal number 42.
Answer:
42 in the hex format is 0x2A, or alternatively 0x 00 00 00 2A. There is no right rotation involved. Hence, the immediate field is 0x02A.
Encode the number 0x2A 00 00 00.
Answer:
The number is obtained by right rotating 0x2A by 8 places. Note that we need to right rotate by 4 places for moving a hex digit one position to the right. Since the rotation amount is 2 * rot, we divide 8 by 2 to get rot = 4. Thus, the encoding of the immediate is 0x42A.

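The rule payload ror (2 * rot) can be turned into a small C encoder and decoder. This is a sketch, not the assembler's actual algorithm: the function names are hypothetical, and the encoder simply tries all 16 rotation values and returns the first one that fits.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n places (n is reduced modulo 32). */
static uint32_t rotate_right(uint32_t x, unsigned n) {
    n &= 31u;
    return n == 0 ? x : (x >> n) | (x << (32u - n));
}

/* Expand a 12-bit immediate field (4-bit rot, 8-bit payload) to 32 bits. */
uint32_t decode_imm12(uint32_t field) {
    uint32_t rot = (field >> 8) & 0xFu;
    uint32_t payload = field & 0xFFu;
    return rotate_right(payload, 2u * rot);
}

/* Find a 12-bit field for `value`; return -1 if it cannot be encoded. */
int encode_imm12(uint32_t value) {
    for (uint32_t rot = 0; rot < 16; rot++) {
        /* Undo the right rotation: rotating right by 32 - 2*rot rotates left by 2*rot. */
        uint32_t payload = rotate_right(value, 32u - 2u * rot);
        if (payload <= 0xFFu) {
            return (int)((rot << 8) | payload);
        }
    }
    return -1;
}
```

The two worked examples come out as expected: 42 encodes to 0x02A and 0x2A000000 encodes to 0x42A. A value such as 0x101 cannot be encoded, because its two set bits do not fit in any 8-bit window.
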
Encoding the Shifter Operand
(a) Shift by an immediate
Field   shift imm   shift type   0   rt
Bits    12-8        7-6          5   4-1
Width   5           2            1   4

(b) Shift by a register
Field   shift reg   shift type   1   rt
Bits    12-9        7-6          5   4-1
Width   4           2            1   4

(c) Shift type
lsl   00
lsr   01
asr   10
ror   11

Load/Store Instructions
Field   cond    01      I P U B W L   rs      rd      shifter operand/immediate
Bits    32-29   28-27   26-21         20-17   16-13   12-1
Width   4       2       6             4       4       12

• Memory instruction type: 01
• rs, rd, shifter operand
  • Connotation remains the same
• Immediates are not in the (rot + payload) format: they are standard 12 bit unsigned numbers

I, P, U, B, W, and L bits
Bit   Value   Semantics
I     0       last 12 bits represent an immediate value
I     1       last 12 bits represent a shifter operand
P     0       post-indexed addressing
P     1       pre-indexed addressing
U     0       subtract offset from base
U     1       add offset to base
B     0       transfer word
B     1       transfer byte
W     0       do not use pre or post indexed addressing
W     1       use pre or post indexed addressing
L     0       store to memory
L     1       load from memory

Branch Instructions
Field   cond    101     L    offset
Bits    32-29   28-26   25   24-1
Width   4       3       1    24

• L bit → Link bit
• offset → branch offset (in number of words, similar to SimpleRisc)

Branch Instructions - II
• What does the processor do?
  • Expands the offset to 32 bits (with proper sign extension)
  • Shifts it to the left by 2 bits (because the offset is in terms of memory words)
  • Adds it to PC + 8 to generate the branch target
• Why PC + 8?
  • Read chapter 9

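The three steps can be written out as a short C function. This is a sketch; the function name branch_target and its parameter names are hypothetical, with pc standing for the address of the branch instruction itself.

```c
#include <stdint.h>

/* pc: address of the branch instruction; offset24: the 24-bit offset field. */
uint32_t branch_target(uint32_t pc, uint32_t offset24) {
    uint32_t offset = offset24 & 0x00FFFFFFu;
    if (offset & 0x00800000u) {
        offset |= 0xFF000000u;   /* step 1: sign-extend to 32 bits */
    }
    offset <<= 2;                /* step 2: convert words to bytes */
    return pc + 8u + offset;     /* step 3: add to PC + 8 (wraps modulo 2^32) */
}
```

Under this model, a branch whose target is its own address needs an offset field of 0xFFFFFE, which is -2 words: PC + 8 - 8 = PC.
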
THE END