I/O devices

download report

Transcript I/O devices

Lecture 13 Architectural Support for OS
Peng Liu
[email protected]
Last time in Lecture 12
• Virtual Memory
– Large Address Space for Each Executing Program
– Memory Management for Multiple Programs
Virtual Memory in a Nutshell Review
• Use hard disk (or Flash) as a large storage for data of all programs
– Main memory (DRAM) is a cache for the disk
– Managed jointly by hardware and the operating system (OS)
• Each running program has its own virtual address space
– Address space as shown in previous figure
– Protected from other programs
• Frequently-used portions of virtual address space copied to DRAM
– DRAM = physical address space
– Hardware + OS translate virtual addresses (VA) used by
program to physical addresses (PA) used by the hardware
– Translation enables relocation (DRAM disk) & protection
Terminology for Virtual Memory Review
• Virtual memory used DRAM as a cache for disk
• New terms
– VM block is called a “page”
• The unit of data moving between disk and DRAM
• It is typically larger than a cache block (e.g., 4KB or 16KB)
• Virtual and physical address spaces can be divided up to virtual
pages and physical pages (e.g., contiguous chunks of 4KB)
– VM miss is called a “page fault”
• If the valid bit for a virtual page is off, a page fault occurs.
• Once the operating system gets control, it must find the page in the
next level of the hierarchy and decide where to place the requested
page in main memory.
Page Faults (Similar to “Cache Misses”) Review
• What if an object is on disk rather than in memory?
– Page table entry indicates virtual address not in memory
– OS exception handler invoked to move data from disk into memory
• OS has full control over placement
• Full-associativity to minimize future misses
Before fault
Page Table
After fault
Page Table
Using Separate Addresses Spaces Per
• Each program has its own virtual address space and own page table
– Addresses 0x400000 from different programs can map to
different locations or same location as desired
– OS control how virtual pages as assigned to physical
Translation Process Explained Review
• Valid page
– Check access rights (R,W,X) against access type
• Generate physical address if allowed
• Generate a protection fault (exception) if illegal access
• Invalid page
– Page is not currently mapped and a page fault is generated
• Faults are handled by the operating system
– Sometimes due to a program error => program terminated
• E.g. accessing out of the bounds of array
– Sometimes due to “caching” => refill & restart
• Desired data or code available on disk
• Space allocated in DRAM, page copied from disk, page table updated
• Replacement may be needed
VM: Replacement and Writes Review
• To reduce page fault rate, OS uses least-recently used
(LRU) replacement
– Reference bit (aka use bit) in PTE set to 1 on access to page
– Periodically cleared to 0 by OS
– A page with reference bit = 0 has not been used recently
• Disk writes take millions of cycles
Block at once, not individual locations
Write through is impractical
Use write-back
Dirty bit in PTE set when page is written
Fast Translation Using a TLB Review
Recap: Virtual Memory
• Use hard disk ( or Flash) as large storage for data of all programs
– Main memory (DRAM) is a cache for the disk
– Managed jointly by hardware and the operating system (OS)
• Each running program has its own virtual address space
– Address space as shown in previous figure
– Protected from other programs
• Frequently-used portions of virtual address space copied to DRAM
– DRAM = physical address space
– Hardware + OS translate virtual addresses (VA) used by
program to physical addresses (PA) used by the hardware
– Translation enables relocation & protection
Page Table Issues & Solutions
• Page table access latency
– Use TLB to cache page table entries close to processor
– Limit size of L1 cache to overlap TLB & cache accesses
• TLB coverage
– Larger pages
– Multi-level TLBs
• Page table size
– Multi-level page tables
Architectural Support Operating Systems (OS)
• Operating system (OS)
– Manages hardware resources
• Processor, main memory (DRAM), I/O devices
– Provides virtualization
• Each program thinks it has exclusive access to resources
– Provides protection, isolation, and sharing
• Between user programs and between programs and OS
• Examples of operating systems
– Windows (XP, Vista, Win7), MacOS, Linux, BSD Unix, Solaris..
– Symbian, Windows Mobile, Android
• Definition: a process is an instance of a running program
• Process provides each program with two key abstractions
– Logical control flow
• Illusion of exclusive use of the processor
• Private set of register values (including PC)
– Private address space
• Illusion of exclusive use of main memory of “infinte” size
• How are these illusion maintained?
– Process execution is interleaved (multitasking)
• On available processors
– Address space is managed by virtual memory system
Execution of Processes
• Each process has its own logical flow control
– A & B are concurrent processes, B & C are sequential
– Control flows for concurrent processes are physically
disjoint in time
– However, can think of concurrent processes are running in
Process Address Space
• Each process has its own
private address space
Kernel (OS) memory
(code, data, heap, stack)
invisible to
user code
User stack
(created at runtime)
$sp (stack pointer)
• Use processes cannot access
top region of memory
– Used by OS kernel
– The kernel is shared core of
the OS
run-time heap
(managed by malloc)
Read/write segment
Read-only segment
Loaded from the
executable file
Supporting Concurrent Processes
through Context Switching
• Processes are managed by the OS kernel
– Important: the kernel is not a separate process, but rather
runs as part of some user process
• Control flow passes from one process to another via a
context switch
• Time between context switches 10-20ms
HW Support for the OS
• Mechanisms to protect OS from processes
– Modes + virtual memory
• Mechanisms to switch control flow between OS –
– System calls + exceptions
• Mechanisms to protect processes from each other
– Virtual memory
• Mechanisms to interact with I/O devices
– Primarily memory-mapped I/O
Hardware Modes
(Protecting OS from Processes)
• 2 modes are needed, but some architectures have more
• User mode: Used to run users processes
– Accessible instructions: user portion of the ISA
• The instructions we have seen so far
– Accessible state: user registers and memory
• But virtual memory translation is always on
• Cannot access EPC, Cause, … registers
– Exceptions and interrupts are always on
• Kernel mode: used to run (most of) the kernel
– Accessible instructions: the whole ISA
• User + privileged instructions
– Accessible state: everything
– Virtual memory, exceptions, and interrupts may be turned off
Altering the Control Flow: System Calls
• So far, we have branches and jumps
– They implement control flows within a program
• Expected switch to the OS: system call instruction (syscall)
– A jump (or function call) to OS code
• E.g., in order to perform a disk access
• Also similar to a program-initiated exception
– Switches processor to kernel mode & disables interrupts
– Jump to a pre-defined place in the kernel
• Returning from a syscall: use the eret instruction (privileged)
– Switch to user mode & enable interrupts
– Jump to program counter indicated by EPC
Altering Flow Control: Exceptions & Interrupts
• Exceptions & interrupts implement unexpected switch to
– Exceptions: due to a problem in the program
• E.g., divide by zero, illegal instructions, page fault
– Interrupts: due to an external event
• I/O event, hitting clt+c, periodic timer interrupt
• Exceptions & interrupts operate similar to system calls
Set EPC & Cause registers
Switch to kernel mode, turn off interrupts
Jump to predefined part of the OS
Return to user program using eret instruction
Dealing with Switches to the OS
• A syscall, exception, or interrupt transfers control to the
OS kernel
• Kernel action (stating in the exception handling code)
– Examine issue & correct if possible
– Return to the same process or to another process
A Simple Exception Handler
• Can’t survive nested exceptions
– Since does not re-enable interrupts
• Does no use any user registers
– No need to save them
Exception Example #1
• Memory Reference
– User writes to memory location
– That virtual page is currently on disk
– OS must load page in memory
• Updated page table & TLB
– Return to faulting instruction
– Successful on second try
Exception Examples #2
• Memory Reference
– User writes to memory location
– Address is not valid or wrong access rights
– OS decides to terminate the process
• Sends SIGSEG signal to user process
• User process exits with “segmentation fault”
– OS switches to another active process
Using Virtual Memory & Exceptions: Page
• Example: Unix fork() system call
– Creates a 2nd process which is a clone of the current one
– How can we avoid copying all the address space
• Expensive in time and space
• Idea: share pages in physical memory
– Each process gets a separate page table (separate virtual
– But both PTs point to the same physical pages
– Saves time and space on fork()
• Same approach to support any type of sharing
– Two processes running same program
– Two processes that share data
Page Sharing
Diversity of Devices
• I/O devices can be characterized by
– Behavior: input, output, storage
– Partner: human or machine
– Data rate: bytes/sec, transfers/sec
I/O System Design Goals
• Performance measures
– Latency (response time)
– Throughput (bandwidth)
• Dependability is important
– Resilience in the face of failures
– Particularly for storage devices
• Expandability
• Computer classes
– Desktop: response time and diversity of devices
– Server: throughput, expandability, failure resilience
– Embedded: cost and response time
Throughput vs. Response time
• Throughput
– Aggregate measure of amount of data moved per unit time,
averaged over a window
• Measure in bytes/sec or transfers/sec
– Sometimes referred to as bandwidth
• Examples: Memory bandwidth, disk bandwidth
• Response time
– Response time to do a single I/O operation
• Measured in seconds or cycles
– Sometimes referred to as latency
• Example: Write a block of bytes to disk
• Example: Send a data packet over the network
Throughput – Response Time Tradeoffs
• Throughput is the number of tasks completed by server
per unit time
• Response time is the elapsed time between tasks
entering queue and tasks completed by the server
• Tradeoff between throughput and response time
– Highest possible throughput is when the server is always
busy and the queue is never empty
– Fastest response time is when the queue is empty and the
server is idle when the task arrives
Magnetic Hard Disks
• Characteristics
– Long term, nonvolatile storage
– Large, inexpensive, but slow (mechanical device)
• Usage
– Virtual memory (swap area)
– File system
Disk Storage
• Nonvolatile, rotating magnetic storage
Hard Disk
• Basic operation
– Rotating platter coated with magnetic surface
– Moving read/write head used to access the disk
• Features of hard disks
Platters are rigid (ceramic or metal)
High density since head can be controlled more precisely
High data rate with higher rotational speed
Can include multiple platters
• Incredible improvements
Examples of I/O device performance being technology driven
Capacity: 2x every year
Transfer rate: 1.4x every year
Price approaching 1$/GB = 109 bytes
Hard Disk Organization
• >=1 platter per disk with disk heads on both sides
• Platters are divided in concentric tracks
• Each track is divided in sectors
– Unit of transfer to/from disk (i.e., disk block)
– Some disks have a constant number of sectors per track
– Others keep constant bit density which places more sectors
on outer track
• A common track across multiple platters is referred to as
Measuring Disk Access Time
• Each read or write has three major components
– Seek time is the time to position the arm over the proper
– Rotational latency is the wait for the desired sector to rotate
under the read/write head
– Transfer time is the time required to transfer a block of bits
(sector) under the read/write head
• Note that these represent only the “raw performance” of
the disk
– Other components: I/O bus, controllers, other caches,
interleaving, etc.
Seek Time
• Seek time is the time to position the arm over the proper
• Average seek time is the time it takes to move the
read/write head from its current position to a track on the
• Industry definition is that seek time is the time for all
possible seeks divided by the number of possible seeks
– In practice, locality reduces this to 25-33% of this number
– Note that some manufacturers report minimum seek times
rather average seek times
Rotational Latency
• Rotational latency is the time spent waiting for the desired sector to
rotate under the read/write head
• Based upon the rotational speed, usually measured in revolutions
per minute (RPM)
• Average rotational latency
– Average rotational latency = 0.5 rotation / RPM
• Example: 7200 RPM
AverageRotationalLatency 
 4.2ms
7200 RPM 7200 RPM /(60sec/ min)
Transfer Time
• Transfer time is the time required to transfer a block of
• A factor of the transfer size, rotational speed, and
recording density
– Transfer size is usually a sector
• Most drives today use caches to help buffer the effects of
seek time and rotational latency
Typical Hard Drive
• Typical hard disk drive
Rotation speed:3600,5200,7200, or 15000 RPM
Tracks per surface:500-2000 tracks
Sectors per track:32-128 sectors
Sectors size: 512B-1024KB
Minimum seek time is often approximately 0.1ms
Average seek time is often approximately 5-10ms
Access time is often approximately 9-10ms
Transfer rate is often 50-200MB/s
Average Access Example
• Consider the Seagate 36.4GB Ultra SCSI
Rotation speed:10000RPM
Sector size: 512B
Average seek time : 5.7 ms
Transfer rate: 24.5 MB/s
Controller overhead of 1 ms
• What is the average read time?
I/O System Design
• Satisfying latency requirement
– For time-critical operations
– If system is unloaded
• Add up latency of components
• Maximizing throughput at steady state (loaded system)
– Find “weakest link” (lowest-bandwidth component)
– Configure to operate at its maximum bandwidth
– Balance remaining components in the system
• A bus is a shared communication link that connects
multiple devices
• Single set of wires connects multiple “subsystems” as
opposed to a point to point link which only two
components together
• Wires connect in parallel, so 32 bit bus has 32 wires of
• Advantages
– Broadcast capability of shared communication link
– Versatility
• New device can be added easily
• Peripherals can be moved between computer systems that use the same bus
– Low Cost
• A single set of wires is shared multiple ways
• Disadvantages
– Communication bottleneck
• Bandwidth of bus can limit the maximum I/O throughput
– Limited maximum bus speed
• Length of the bus
• Number of devices on the bus
• Need to support a range of devices with varying latencies and transfer rates
Bus Organization
• Bus Components
– Control Lines
• Signal begin and end of transactions
• Indicate the type of information on the data line
– Data Lines
• Carry information between source and destination
• Can include data, addresses, or complex commands
• Processor-memory bus or front-side bus or system bus
– Short, high-speed bus
– Connects memory and processor directly
– Designed to match the memory system and achieve the maximum memoryto-processor bandwidth (cache transfers)
– Designed specifically for a given processor/memory system (proprietary)
• I/O Bus (or peripheral bus)
Usually long and slow
Connect devices to the processor-memory bus
Must match a wide range of I/O device performance characteristics
Industry standard
Synchronous versus Asynchronous
• Synchronous Bus
– Includes a clock in control lines
– Fixed protocol for communication relative to the clock
– Advantages
• Involves very little logic and can therefore run very fast
– Disadvantages
• Every decision on the bus must run at the same clock rate
• To avoid clock skew, bus cannot be long if it is fast
– Processor-memory bus
• Asynchronous Bus
No clock control line
Can easily accommodate a wide range of devices
No clock skew problems, so bus can be quite long
Requires handshaking protocol
Increasing Bus Bandwidth
• Several factors account for bus bandwidth
– Wider bus width
• Increasing data bus width => more data per bus cycle
• Cost: More bus lines
– Separate address and data lines
• Address and data can be transmitted in one bus cycle if separate
address and data lines are available
• Costs: More bus lines
Increasing Bus Bandwidth
• Several factors account for bus bandwidth
– Block transfers
Transfer multiple words in back-to-back bus cycles
Only one address needs to be sent at the start
Bus is not released until the last word is transferred
Costs: Increased complexity and increased response time for
pending requests
Increasing Bus Bandwidth
– Spilt transaction “pipelining the bus”
• Free the bus during time between request and data transfer
• Costs: Increased complexity and higher potential latency
Spilt transaction bus with separate address and data wires
Accessing the Bus
• How is the bus reserved by a device that wishes to use it?
• Master-slave arrangement
– Only the bus master can control access to the bus
– The bus master initiates and controls all bus requests
– A slave responds to read and write requests
• A simple system
– Processor is the only bus master
– All bus requests must be controlled by the processor
– Major drawback is the processor must be involved in every
Multiple Masters
• With multiple masters, arbitration must be used so that
only one device is granted access to the bus at a given
• Arbitration
– The bus master wanting to use the bus asserts a bus request
– The bus master cannot use the bus until the request is
– The bus master must signal the arbiter when finished using
the bus
• Bus arbitration goals
– Bus priority – highest priority device should be serviced first
– Fairness – Lowest priority devices should not starve
Centralized Parallel Arbitration
• Advantages
– Centralized control where all devices submit request
– Any fair priority scheme can be implemented (FCFS, roundrobin)
• Disadvantages
– Potential bottleneck at central arbiter
Operating System Tasks
• The OS is a resource manager and acts ad the interface
between I/O hardware and programs that request I/O
• Several important characteristics of the I/O system:
– I/O system is shared by multiple programs
– I/O systems often use interrupts to notify CPU
• Interrupts = externally generated “exceptions”
• Typically “Input available” or “Output complete” messages
• OS handles interrupts by transferring control to kernel mode
– Low-level control of an I/O device is complex
• Managing a set of concurrent events
• Requirements for correct device control are very detailed
• I/O device drivers are the most common area for bugs in an OS!
OS Communication
• The operating system should prevent user program from
communicating with I/O device directly
– Must protect I/O resources to keep sharing fair
– Protection of shared I/O resources cannot be provided if user
programs could perform I/O directly
• Three types of communication are required
– OS must be able to give commands to I/O devices
– I/O device must be able to notify OS when I/O device has
completed and operation or has encountered an error
– Data must be transferred between memory and an I/O device
I/O Commands:
A method for addressing a device
• Memory-mapped I/O:
– Portions of the address space are assigned to each I/O
– I/O addresses correspond to device registers
– User programs prevented from issuing I/O operations directly
since I/O address space is protected by the address
translation mechanism
I/O Commands
• I/O devices are managed by I/O controller hardware
– Transfers data to/from device
– Synchronizes operation with software
• Command registers
– Cause device to do something
• Status registers
– Indicate what the device is doing and occurrence of errors
• Data registers
– Write: transfer data to a device
– Read: transfer data from a device
Communicating with the CPU
• Method #1: Polling
– I/O device places information in a status register
– The OS periodically checks the status register
– Whether polling is used dependent upon whether the device
can initiate I/O independently
• For instance, a mouse works well since it has a fixed I/O rate and
initiates its own data (whenever it is moved)
• For others, such as disk access, I/O only occurs under the control of
the OS, so we poll only when the OS knows it is active
– Advantages
• Simple to implement
• Processor is in control and does the work
– Disadvantage
• Polling overhead and data transfer consume CPU time
Polling and Programmed I/O
I/O Notification
• Method #2: I/O Interrupt
– When an I/O device needs attention, it interrupts the
– Interrupt must tell OS about the event and which device
• Using “cause” register(s): Kernel “asks” what interrupted
• Using vectored interrupts: A different exception handler for each
– I/O interrupts are asynchronous events, and happen anytime
• Processor waits until current instruction is completed
– Interrupts may have different priorities
– Advantages: Execution is only halted during actual transfer
– Disadvantage: Software overhead of interrupt processing
I/O Interrupts
Data Transfer
• The third component to I/O communication is the transfer
of data from the I/O device to memory (or vice versa)
• Simple approach: “Programmed” I/O
– Software on the processor moves all data between memory
addresses and I/O addresses
– Simple and flexible, but wastes CPU time
– Also, lots of excess data movement in modern systems
• Mem->Network->CPU->Network->graphics
• So need a solution to allow data transfer to happen
without the processor’s involvement
Delegating I/O : DMA
• Direct Memory Access (DMA)
– Transfer blocks of data to or from memory without CPU
– Communication coordinated by the DMA controller
• DMA controllers are integrated in memory or I/O controller chips
– DMA controller acts as a bus master, in bus-based systems
• DMA steps
– Processor sets up DMA by supplying
• Identify of the device and the operation (read/write)
• The memory address for source/destination
• The number of bytes to transfer
– DMA controller starts the operation by arbitrating for the bus
and then starting the transfer when the data is ready
– Notify the processor when the DMA transfer is complete or
on error
• Usually using an interrupt
DMA Problems: Virtual vs. Physical Addresses
• If DMA uses physical addresses
– Memory access across physical page boundaries may not
correspond to contiguous virtual pages (or even the same
• Solution1: <1 page per DMA transfer
• Solution: chain a series of 1-page requests provided by the OS
– Single interrupt at the end of the last DMA request in the
• Solution2: DMA engine uses virtual addresses
– Multi-page DMA requests are now easy
– A TLB is necessary for the DMA engine
• For DMA with physical addresses: pages must be pinned in DRAM
– OS should not page to disks pages involved with pending I/O
I/O Summary
• I/O performance has to take into account many variables
– Response time and throughput
– CPU, memory, bus, I/O device
– Workload e.g. transaction processing
• I/O devices also span a wide spectrum
– Disks, graphics, and networks
• Buses
– Bandwidth
– Synchronization
– Transactions
• OS and I/O
– Communication: Polling and Interrupts
– Handling I/O outside CPU: DMA