Powerpoint Slides - University of San Francisco

A Dynamic Visualization of
Core-2 Duo Interrupts
Allan B. Cruse
University of San Francisco
17 September 2009
Organization of this Talk
• This talk is about “using the computer to study the computer”
Its General Thesis: Dynamic Visualization
Its Specific Application: Core-2 Duo Interrupts
• The focus here will be on using a recent version of the Linux
operating system on your personal computer, where you can
use “free” utilities -- and can exercise a few “root” privileges
‘Dynamic Visualization’
• This refers to a type of computer program
which can show us some volatile aspects
of a machine’s inner workings in real time
• Without the ability to “see” these activities
we are left simply to try and imagine what
they are like -- never quite feeling certain
• Such visualizations often reveal aspects
which our textbooks forgot to mention!
UNIX’s ‘top’ utility
This program lets a user see some volatile information from inside the kernel;
however, it’s not really being displayed in real time since it only gets updated
about once every 3 seconds – still it does help us to understand ‘timesharing’.
The software ‘architecture’
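[Diagram: an application program (e.g., ‘top’) and the shared function libraries (e.g., ‘printf()’) reside in user space (restricted privileges); the Linux operating system resides in kernel space (unrestricted privileges). The labels ‘call’/‘ret’ mark transfers between the application and the libraries, ‘syscall’/‘sysret’ mark transfers between user space and the kernel, and ‘in’/‘out’ mark the kernel’s access to the hardware devices.]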
The software ‘architecture’
[Diagram: the same layout as on the previous slide, now with an LKM added inside kernel space alongside the Linux operating system.]
LKM = Linux Kernel Module
Formatting screen-output
• The Linux operating system is written in C
-- plus some “inline” assembly language,
and a large portion of the shared libraries
and standard system utilities are in C/C++
• Using functions like ‘printf()’ you can very
quickly write C/C++ code that will output
data values in a humanly-readable form
• There’s a similar kernel function: ‘printk()’
The ‘teletype’ model
• But the usual way C/C++ programs output
text to consoles (or to desktop windows) is
unsuitable for doing dynamic visualizations
• It’s based on a software emulation of early
teletype terminals, where a new character
gets added at the end of earlier text, until
the screen fills and is then scrolled upward
• A cursor blinks where the next text will go
Non-canonical I/O
• Our ‘dynamic’ visualizations will require us
to employ ‘full-screen’ terminal-output and
‘un-buffered’ terminal-input, with no ‘echo’
of keystrokes, and with no flashing cursor
or upward-scrolling screen
[Screen illustration: the text ‘Hello, world’ followed by a cursor]
ANSI ‘escape’ codes
• There are standard control-codes we can
use in ‘printf()’ statements to achieve our
‘draw anywhere’ and ‘hide cursor’ goals:
printf( "\e[H\e[J" );    // erase the entire display
printf( "\e[?25l" );     // make the cursor disappear
// draw a textstring at the center of the 80-by-25 screen
int row = 12, column = 40;
printf( "\e[%d;%dH%s", row, column, "Hello" );
fflush( stdout );        // flush the output buffer
‘struct termios’
• We can use the ‘tcgetattr()’ and ‘tcsetattr()’
library functions to install changes to the
data that controls our terminal’s behavior
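[Diagram: ‘tcgetattr()’ copies the terminal’s ‘struct termios’ from kernel space into a ‘struct termios’ in user space; ‘tcsetattr()’ copies a modified ‘struct termios’ from user space back into kernel space]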
Our ‘visualization’ application
• It consists of two ‘pieces’ of code:
– a user-program, written in C++
– a kernel-module, written in C
// activity.cpp (compiled using g++)
#include <stdio.h>
#include <termios.h>
int main( void )
{
}

// activity.c (compiled using mmake)
#include <linux/module.h>
#include <linux/proc_fs.h>
#include <asm/uaccess.h>
int init_module( void )
{
}
void cleanup_module( void )
{
}
What are ‘interrupts’?
• Normally a CPU inside our computer is
fetching and executing our sequence of
program-instructions from a contiguous
region of the system’s main memory
• But occasionally some peripheral device
undergoes a change in its state, and our
system needs to take note of it promptly
• An ‘interrupt’ is the signaling of that event
to the CPU so it’s dealt with appropriately
System-component overview
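[Block diagram of the system’s components: the Core-2 Duo (CPU 0 and CPU 1), Graphics, DRAM, the MCH, and the ICH, plus the peripheral devices: USB Hub, Ethernet, Real-Time Clock, Audio, Camera, Printer, Disk Drives, CD/DVD, Keyboard, and Mouse]
MCH = Memory Controller Hub
ICH = I/O Controller Hub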
Some examples
• The system needs to take note promptly when:
– your keyboard has a key that’s been pressed
– your mouse has been moved, or ‘clicked’
– your network controller has received a packet
– your internal clock’s time has been advanced
– your disk controller has finished saving a file
– your printer’s running low on paper or toner
– your application’s ‘time-slice’ has expired
• Each of these needs a different CPU response
Interrupt ‘handlers’
• Your operating system includes all of the
code-fragments for responding to any of
the events that could ‘interrupt’ the CPU
• They’re called ‘Interrupt Service Routines’,
and the addresses of their entry-points are
stored within an array, known as the IDT,
whose location and size are held in a CPU
register that’s dedicated to that purpose
ISRs, IDT and the IDTR
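[Diagram: Main Memory holds the Interrupt Service Routines (aka ‘interrupt handlers’), e.g. ‘isr_kbd: ... iret’, ‘isr_prn: ... iret’, ‘isr_rtc: ... iret’, ‘isr_dvd: ... iret’, along with the Interrupt Descriptor Table; the Central Processing Unit holds the registers EFLAGS, EIP, ESP, EBP, EAX, EBX, ... and the IDTR, which locates the Interrupt Descriptor Table]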
‘Gate’ descriptors
• The Interrupt Descriptor Table (IDT) has
room for up to 256 entries (called ‘gates’):
Each gate is shown as two 32-bit rows:
bytes 7,6,5,4:  offset[31..16] | type and access attributes | reserved (=0)
bytes 3,2,1,0:  code-segment selector | offset[15..0]
The peculiar arrangement of this information, in which the 32-bit offset’s value
is split into a pair of non-adjacent 16-bit fields, is due to the history of Intel’s
earlier processor-architecture and its commitment to ‘backward compatibility’
Some questions…
• A typical PC doesn’t have 256 peripherals
attached to it! So, is the IDT-array larger
than is really necessary?
• The interruption-requests coming from the
various devices will be occurring at times
that application-programs cannot predict!
So how often will this be happening, and
how likely is it to ‘degrade’ performance?
Let’s take a look
• Our ‘dynamic visualization’ will let us view
all the various interrupt occurrences -- as
they are happening (i.e., in ‘real time’)
This is a static screenshot of our interrupt activity ‘visualization’
Our source-code (6 pages)
activity.c
activity.cpp
Our SMP version
The ‘enhanced’ version of our visualization shows separate interrupt-counters
for each of the two ‘logical’ processors inside our Core-2 Duo Linux platform
APIC components
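[Block diagram: the same components as in the system-component overview (Core-2 Duo, Graphics, MCH, DRAM, ICH, USB Hub, Ethernet, Real-Time Clock, Audio, Camera, Printer, Disk Drives, CD/DVD, Keyboard, Mouse), now showing a Local-APIC inside each of CPU 0 and CPU 1, and the I/O-APIC inside the ICH]
Each CPU contains its own Local-APIC with a ‘processor-ID’ register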
The I/O Controller Hub contains the so-called I/O-APIC whose registers
control routing of Interrupt Request signals (IRQs) to specific interrupts
(INTs) on one or more of the logical CPUs that are present in a system
Linux’s interrupt assignments
The kernel developers frequently make changes to their interrupt routing
for uniprocessor versus multiprocessor platforms, or 32-bit versus 64-bit
[Chart: a 16-by-16 grid of the 256 interrupt vectors, with rows labeled 0x00 through 0xF0 and columns labeled 0 through F, marked according to this legend:]
– Reserved by Intel
– Not in use with SMP
– Legacy PIC interrupts
– Assigned for use by IO-APIC interrupts or ‘Message-Signaled’ interrupts (or unused)
– Local-APIC interrupts
…but usage is usually documented in Linux’s <asm/irq_vectors.h> kernel header
Writing LKM code
• The first two chapters of this O’Reilly book
teach how to write Linux kernel modules
• You can use kernel ‘helper’ functions --
– to allocate kernel memory: kmalloc()
– to create ‘pseudo’ files: create_proc_entry()
– to execute code on another CPU: smp_call_function()
– to insert “inline” assembly language statements: asm()
Our module’s organization
[Diagram of the module’s layout, from top to bottom:]
Our module’s ‘global’ data: *oldidt, *newidt, ..., n_interrupts[ 256 ], original_isr[ 256 ]
several module helper-functions…
Our module’s ‘payload’ of file-operation methods: my_open(), my_read(), my_write()
The pseudo-file’s struct of method-pointers: my_fops{ }
Our module’s two required administrative functions: init_module(), cleanup_module()
The module includes 256 ISR “hooks” written in assembly language
We ‘hook’ every ISR
• We build a new Interrupt Descriptor Table
whose entries point to our own set of very
short interrupt-handler routines: each will
simply increment a counter, then transfer
control to the usual Linux interrupt-handler
• This is NOT a new idea! It’s been used in
PCs for at least thirty years, although typically
only one or two interrupts got ‘hooked’
How ‘hooking’ works
Our substitute ISR will increment the count of its previous interrupts…
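[Diagram: the IDTR, which formerly located the original IDT (whose entries lead to the original ISR, ‘isr: ... iret’), now locates the substitute IDT, whose entries lead to a substitute ISR (‘isr: ... ret’) that updates n_interrupts[ … ] before control reaches the original ISR]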
…then the substitute ISR puts the address of the original ISR on top of its stack,
so that it can transfer control there merely by executing a ‘ret’ instruction
…so, use a ‘repeat-macro’ ☻
We really didn’t want to type in the code for over two-hundred
separate assembly language routines…

        .text
        .type   isr_entry, @function
        .align  16
isr_entry:
        n = 0
        .rept   256
        pushl   $n                      # push this entry's interrupt number
        jmp     ahead
        .align  16
        n = n+1
        .endr
ahead:
        push    %ebp
        mov     %esp, %ebp
        push    %eax
        push    %ebx

        mov     4(%ebp), %ebx           # fetch the interrupt number
        incl    n_interrupts(, %ebx, 4) # increment its counter
        mov     original_isr(, %ebx, 4), %eax
        mov     %eax, 4(%ebp)           # replace it with the original ISR's address

        pop     %ebx
        pop     %eax
        pop     %ebp
        ret                             # transfer to the original ISR
Compiling and Installing
• Compiling a kernel module for Linux 2.6.x
is inherently complicated, so we wrote a
utility (‘mmake.cpp’) that does it easily:
$ ./mmake activity.c
• Installing the compiled ‘kernel object’ in a
running kernel is a step that normally will
require ‘root’ privileges:
# /sbin/insmod activity.ko
Some exercises to try
• Can you modify our LKMs (‘activity.c’ and
‘smpwatch.c’) to use with a 64-bit kernel?
• Can you see what changes will be needed
if you want a ‘dynamic visualization’ of all
the interrupts on a multiprocessor platform
with more than two CPUs? (Core-2 Quad)
• Can you imagine other kinds of ‘dynamic
visualizations’ that would be enlightening?
Website resources
• You can download the source-code for all
the demos discussed during this talk from
this website:
<http://cs.usfca.edu/~cruse>
• You can obtain the newest versions of the
Linux kernel source-code from this site:
<http://www.kernel.org>