CS 5204 Operating Systems Fall 2005
Download
Report
Transcript CS 5204 Operating Systems Fall 2005
CS 5204
Operating Systems
Review Basic Concepts
Godmar Back
Announcements
• Stay tuned for reading list finalization
• (Don’t start picking topic yet, wait for
Piazza announcement)
CS 5204 Fall 2013
Why are OS interesting?
• OS are “magic”
– Most people don’t understand them – including
sysadmins and computer scientists!
• OS are incredibly complex systems
– “Hello, World” – program really 1 million lines of code
• Studying OS is learning how to deal with
complexity
– Abstractions (+interfaces)
– Modularity (+structure)
– Iteration (+learning from experience)
CS 5204 Fall 2013
What does an OS do?
• Software layer that sits
between applications
and hardware
• Performs services
– Abstracts hardware
– Provides protection
– Manages resources
CS 5204 Fall 2013
gcc
csh
X11
Operating System
Hardware
CPU Memory Network Disk
OS vs Kernel
• Can take a wider view or a narrower definition what an
OS is
• Wide view: Windows, Linux, Mac OSX are operating
systems
– Includes system programs, system libraries, servers, shells, GUI
etc.
• Narrow definition:
– OS often equated with the kernel.
– The Linux kernel; the Windows executive – the special piece of
software that runs with special privileges and actually controls
the machine.
• In this class, usually mean the narrow definition.
• In real life, always take the wider view. (Why?)
CS 5204 Fall 2013
Evolution of OS
• OSs as a library
– Abstracts away hardware, provide neat
interfaces
• Makes software portable; allows software evolution
– Single user, single program computers
• No need for protection: no malicious users, no
interactions between programs
– Disadvantages of uniprogramming model
• Expensive
• Poor utilization
CS 5204 Fall 2013
Evolution of OS (II)
• Invent multiprogramming
– First multi-programmed batch systems, then timesharing systems
• Idea:
– Load multiple programs in memory
– Do something else while one program is waiting, don’t
sit idle (see next slide)
• Complexity increases:
– What if programs interfere with each other (wild
writes)
– What if programs don’t relinquish control (infinite loop)
CS 5204 Fall 2013
Single Program vs Multiprogramming
CS 5204 Fall 2013
Protection
• Multiprogramming requires isolation
• OS must protect/isolate applications from each
other, and OS from applications
• This requirement is absolute
• Three techniques
– Preemption
– Interposition
– Privilege
CS 5204 Fall 2013
Protection #1: Preemption
• Resource can be given to program and access
can be revoked
– Example: CPU, Memory, Printer, “abstract” resources:
files, sockets
• CPU Preemption using interrupts
– Hardware timer interrupt invokes OS, OS checks if
current program should be preempted, done every
4ms in Linux
– Solves infinite loop problem!
• Q.: Does it work with all resources equally?
CS 5204 Fall 2013
Protection #2: Interposition
• OS hides the hardware
• Application have to go through OS to
access resources
• OS can interpose checks:
– Validity (Address Translation)
– Permission (Security Policy)
– Resource Constraints (Quotas)
CS 5204 Fall 2013
Protection #3: Privilege
• Two fundamental modes:
– “kernel mode” – privileged
• aka system, supervisor or monitor mode
• Intel calls its PL0, Privilege Level 0 on x86
– “user mode” – non-privileged
• PL3 on x86
• Bit in CPU – controls operation of CPU
– Protection operations can only
be performed in kernel mode.
Example: hlt
– Carefully control transitions
between user & kernel mode
CS 5204 Fall 2013
int main()
{
asm(“hlt”);
}
OS as a Resource Manager
• OS provides illusions, examples:
– every program is run on its own CPU
– every program has all the memory of the
machine (and more)
– every program has its own I/O terminal
• “Stretches” resources
– Possible because resource usage is bursty,
typically
• Increases utilization
CS 5204 Fall 2013
Resource Management (2)
• Multiplexing increases complexity
• Car Analogy (by Rosenblum):
– Dedicated road per car would be incredibly inefficient, so cars
share freeway. Must manage this.
– (abstraction) different lanes per direction
– (synchronization) traffic lights
– (increase capacity) build more roads
• More utilization creates contention
–
–
–
–
(decrease demand) slow down
(backoff/retry) use highway during off-peak hours
(refuse service, quotas) force people into public transportation
(system collapse) traffic jams
CS 5204 Fall 2013
Resource Management (3)
• OS must decide who gets to use what
resource
• Approach 1: have admin (boss) tell it
• Approach 2: have user tell it
– What if user lies? What if user doesn’t know?
• Approach 3: figure it out through feedback
– Problem: how to tell power users from
resource hogs?
CS 5204 Fall 2013
Goals for Resource Management
• Fairness
– Assign resources equitably
• Differential Responsiveness
– Cater to individual applications’ needs
• Efficiency
– Maximize throughput, minimize response
time, support as many apps as you can
• These goals are often conflicting.
– All about trade-offs
CS 5204 Fall 2013
Summary: Core OS Functions
• Hardware abstraction through interfaces
• Protection:
– Preemption
– Interposition
– Privilege (user/kernel mode)
• Resource Management
– Virtualizing of resources
– Scheduling of resources
CS 5204 Fall 2013
Concurrency
• Review of basic concepts
• Process Management as OS responsibility
– process vs thread abstraction
• Synchronization Issues:
– mutual exclusion & race conditions
– deadlock & starvation
• Implementing processes & threads
• Programming models for communication
– threads vs events
CS 5204 Fall 2013
Definition: Process/Thread
• Process
– “program in execution”
– resources: CPU context, memory maps, open files,
privileges, ….; isolated
• Threads
– CPU context (state + stack); not isolated
• “thread” is a historically recent term
– Threads used to be called “processes”
• Q: what primitives does an OS need to provide?
CS 5204 Fall 2013
Processes vs Threads
• Processes execute concurrently
and share resources:
– Files on disk; Open files (via inherited
file descriptors); Terminal, etc.
– Kernel-level concurrency
P1
P2
P3
Kernel
• but do not (usually) share any of their memory
– cannot share data easily by exchanging pointers to it
• Threads are separate logical flows of control (with
separate stacks!) that share memory and can refer
to same data
– Different models and variations exist
– Application-level concurrency
CS 5204 Fall 2013
Context Switching
• Historical motivation for processes was
introduction of multi-programming:
– Load multiple processes into memory, and switch to
another process if current process is (momentarily)
blocked
– This required protection and isolation between these
processes, implemented by a privileged kernel: dualmode operation.
• Time-sharing: switch to another process
periodically to make sure all processes make
equal progress
• Switch between processes is called a context
switch
CS 5204 Fall 2013
Dual-Mode Operation
• Two fundamental modes:
– “kernel mode” – privileged
• aka system, supervisor or monitor mode
• Intel calls its PL0, Privilege Level 0 on x86
– “user mode” – non-privileged
• PL3 on x86
• Bit in CPU – controls operation of CPU
– Privileged operations can only
be performed in kernel mode.
Example: hlt
– Must carefully control transition
between user & kernel mode
CS 5204 Fall 2013
int main()
{
asm(“hlt”);
}
Mode Switching
• User Kernel mode
– For reasons external or internal to CPU
• External (aka hardware) interrupt:
– timer/clock chip, I/O device, network card, keyboard, mouse
– asynchronous (with respect to the executing program)
• Internal interrupt (aka software interrupt, trap, or exception)
– are synchronous
– can be intended (“trap”): for system call (process wants to enter
kernel to obtain services)
– or unintended (usually): (“fault/exception”) (division by zero,
attempt to execute privileged instruction in user mode, memory
access violation, invalid instruction, alignment error, etc.)
• Kernel User mode switch on iret instruction
CS 5204 Fall 2013
A Context Switch Scenario
Timer interrupt: P1 is preempted,
context switch to P2
I/O device interrupt:
P2’s I/O complete
switch back to P2
Process 1
Process 2
user mode
kernel mode
Kernel
System call: (trap):
P2 starts I/O operation, blocks
context switch to process 1
CS 5204 Fall 2013
Timer interrupt: P2 still has
time left, no context switch
System Calls
User processes access kernel services by
trapping into the kernel, executing kernel
code to perform the service, then returning –
very much like a library call.
Unless the system call cannot complete
immediately, this does not involve a context
switch.
Process 1
user mode
kernel mode
Kernel
Kernel’s System Call Implementation
CS 5204 Fall 2013
Kernel
Threads
Most OS support kernel threads that never run in
user mode – these threads typically perform book
keeping or other supporting tasks. They do not
service system calls or faults.
Process 1
Process 2
user mode
kernel mode
Kernel
Kernel Thread
Careful: “kernel thread” not the same as
kernel-level thread (KLT) – more on KLT later
CS 5204 Fall 2013
Reasoning about Processes:
Process States
RUNNING
Process
must wait
for event
Scheduler
picks process
Process
preempted
BLOCKED
READY
Event arrived
• Only 1 process (per CPU) can be in RUNNING state
CS 5204 Fall 2013
Process Creation
• Two common paradigms:
– Cloning vs. spawning
• Cloning: (Unix)
– “fork()” clones current process
– child process then loads new program
• Spawning: (Windows)
– “exec()” spawns a new process with new program
• Difference is whether creation of new process
also involves a change in program
CS 5204 Fall 2013
fork()
#include <unistd.h>
#include <stdio.h>
int
main()
{
int x = 1;
if (fork() == 0) {
// only child executes this
printf("Child, x = %d\n", ++x);
} else {
// only parent executes this
printf("Parent, x = %d\n", --x);
}
Child, x = 2
Exiting with x = 2
Parent, x = 0
Exiting with x = 0
// parent and child execute this
printf("Exiting with x = %d\n", x);
return 0;
}
CS 5204 Fall 2013
The fork()/join() paradigm
• After fork(), parent & child
execute in parallel
Parent:
fork()
– Unlike a fork in the road, here we
take both roads
• Used in many contexts
• In Unix, ‘join()’ is called wait()
• Purpose:
– Launch activity that can be done in
parallel & wait for its completion
– Or simply: launch another program
and wait for its completion (shell
does that)
CS 5204 Fall 2013
Parent
process
executes
Child
process
executes
Child
process
exits
Parent:
join()
OS notifies
fork()
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
int main(int ac, char *av[])
{
pid_t child = fork();
if (child < 0)
perror(“fork”), exit(-1);
if (child != 0) {
printf ("I'm the parent %d, my child is %d\n",
getpid(), child);
wait(NULL); /* wait for child (“join”) */
} else {
printf ("I'm the child %d, my parent is %d\n",
getpid(), getppid());
execl("/bin/echo", "echo", "Hello, World", NULL);
}
}
CS 5204 Fall 2013
User View of Threads
• Unix/C:
– fork()/wait() vs pthread_create()/pthread_join()
• Java:
– new Thread()
– Thread.start()/join()
• See also
[Boehm PLDI 2005]
Runnable r = new Runnable() {
public void run() {
/* body */
}
};
Thread t = new Thread(r);
t.start(); // concurrent execution starts
// main
t.join(); // concurrent execution ends
CS 5204 Fall 2013
Threading APIs
• How are threads embedded in a
language/environment?
• POSIX Threads Standard (in C)
–
–
–
–
pthread_create(), pthread_join()
Uses function pointer to denote start of new control flow
Largely retrofitted in Unix world
Needed to define interaction of signals and threads
• Java/C#
– Thread.start(), Thread.join()
– Java: Using “Runnable” instance
– C#: Uses “ThreadStart” delegate
CS 5204 Fall 2013
Example pthread_create/join
static void * test_single(void *arg)
{
// this function is executed by each thread, in parallel
}
Use Default Attributes –
could set stack addr/size
here
pthread_t threads[NTHREADS];
int i;
for (i = 0; i < NTHREADS; i++)
if (pthread_create(threads + i, (const pthread_attr_t*)NULL,
test_single, (void*)i) == -1)
{ printf("error creating pthread\n"); exit(-1); }
/* Wait for threads to finish. */
for (i = 0; i < NTHREADS; i++)
pthread_join(threads[i], NULL);
2nd arg could receive exit
status of thread
CS 5204 Fall 2013
Java Threads Example
public class JavaThreads {
public static void main(String []av) throws Exception {
Thread [] t = new Thread[5];
for (int i = 0; i < t.length; i++) {
final int tnum = i;
Runnable runnable = new Runnable() {
public void run() {
System.out.println("Thread #"+tnum);
Threads implements Runnable –
could also have subclassed
}
Thread & overridden run()
};
t[i] = new Thread(runnable);
t[i].start();
}
Thread.join() can throw
for (int i = 0; i < t.length; i++)
InterruptedException – can be
used to interrupt thread waiting to
t[i].join();
join via Thread.interrupt
System.out.println("all done");
}
}
CS 5204 Fall 2013
import java.util.concurrent.*;
public class FixedThreadPool
{
public static void main(String []av) throws Exception {
ExecutorService ex = Executors.newFixedThreadPool(4);
final int N = 4;
Future<?> f[] = new Future<?>[N];
Tasks must implement “Callable”
for (int i = 0; i < N; i++) {
– like Runnable except returns
result.
final int j = i;
f[i] = ex.submit(new Callable<String>() {
public String call() {
return "Future #" + j + " brought to you by “ + Thread.currentThread();
}
});
}
System.out.println("Main thread: " + Thread.currentThread());
Java
Threadpools
for (int i = 0; i < N; i++)
System.out.println(f[i].get());
get() waits for the execution of
call() to be completed (if it hasn’t
already) and returns result
ex.shutdown();
}
}
CS 5204 Fall 2013
Explicit Threads vs. Pools
• Overhead:
– Startup overhead per thread relatively high (between 1e4
& 1e5 cycles); pools amortize
• There is no point in having more threads than there
are physical cores
– Compete for available CPUs
– Unless some subset is blocked on I/O or other conditions
• Still, sizing of pools that maximizes throughput can be
challenging
– “cachedThreadPool” creates thread whenever needed, but
reuses existing ones that are idle
– “fixedThreadPool” - # of threads fixed
– Can implement custom policies
CS 5204 Fall 2013
Aside: Hybrid Models
• The “threads share everything” +
“processes share nothing” mantra does
not always hold
• Hybrids:
– WEAVES allows groups of threads to define
their own namespace, so they only share data
they want
– Java multitasking systems (KaffeOS, MVM):
multiple “processes” may share same address
space
CS 5204 Fall 2013
IMPLEMENTATION
CS 5204 Fall 2013
Implementing Threads
• Issues:
–
–
–
–
–
Who maintains thread state/stack space?
How are threads mapped onto CPUs?
How is coordination/synchronization implemented?
How do threads interact with I/O?
How do threads interact with existing APIs such as
signals?
– How do threads interact with language runtimes (e.g.,
GCs)?
– How do terminate threads safely?
CS 5204 Fall 2013
Cooperative Multi-threading
• Special case of user-level threading
– Is easy to implement using ‘swapcontext’ – see next slide
• Support multiple logical execution flows
– Each needs own stack so has its own (procedure-) local
(automatic) variables
– But share address space so shares heap, global vars (all
kinds: global, global static, local static)
• In cooperative multi-threading, a context switch can occur
only if a thread voluntarily offers the CPU to (any) other thread
(“yield”); later resumes and returns
– Can build resource abstractions on top where threads yield if
they cannot obtain the abstracted resource
• This is called a “non-preemptive” model
• If yield is directed (“yield to x”) this model is called “coroutines”
CS 5204 Fall 2013
Cooperative Multithreading via ‘swapcontext’
static char stack[2][65536];
// a stack for each coroutine
static ucontext_t coroutine_state[2]; // container to remember context
// switch current coroutine (0 -> 1 -> 0 -> 1 ...)
static inline void
yield_to_next(void)
{
static int current = 0;
static void
coroutine(int coroutine_number)
int prev = current;
{
int next = 1 - current;
int i;
for (i = 0; i < 5; i++) {
current = next;
printf("Coroutine %d counts i=%d (&i=%p)\n",
swapcontext(&coroutine_state[prev],
coroutine_number, i, &i);
&coroutine_state[next]);
yield_to_next();
}
}
}
CS 5204 Fall 2013
Cooperative Multi-threading (cont’d)
• Advantages:
– Requires no OS support
– Context switch very fast (usually involves only saving calleesaved regs + stack pointer)
– Reduced potential for certain types of race conditions
• E.g., i++ will never be interrupted
– Used in very high-performance server designs & discrete
event simulation
• Disadvantages
– OS sees only one thread in process, system calls block entire
process (if they block)
– Cannot make use of multiple CPUs/cores
– Cannot preempt infinitely-looping or uncooperative threads
• (though can fake preemption in just-in-time compiled languages by
letting compiler insert periodic checks)
CS 5204 Fall 2013
Use Cases for
App-level Concurrency
• Overlap I/O with computation
– E.g. file sharing program downloads, checksums, and
saves/repairs files simultaneously
• Parallel computation
– Use multiple CPUs
• Retain interactivity while background activity is
performed
– E.g., still serve UI events while printing
• Handling multiple clients in network server apps
• By and large, these are best handled with a
preemptive, and (typically) kernel-level multithreading model
CS 5204 Fall 2013
Example Use: Threads in Servers
CS 5204 Fall 2013
Preemptive Multi-Threading
• Don’t require the explicit, programmer-inserted
“yield” call to switch between threads
• “Switch” mechanism can be implemented at userlevel or kernel-level
– User-level threads: can be built using signal handlers
(e.g. SIGVTALRM)
• Requires advanced file descriptor manipulation techniques to
avoid blocking entire process in system call
– Kernel-level threads: natural extension for what OS
already does when switching between processes
• Integrated with system calls – only current thread blocks
– Hybrids
• Kernel-level threads is the dominant model today
CS 5204 Fall 2013
Kernel-level Threads 1:1 Model
Threading Models
Hybrid, so-called M:N model:
User-level Threads 1:N Model
Source: Solaris documentation (left), Stallings (right)
CS 5204 Fall 2013
Threading Implementations Overview
Does OS know
about threads
within a
process?
Do blocking
syscalls (e.g.,
read()) block
entire
process?
Can threads
be preempted
even if they
don’t call
yield()?
Can multiple
cores/CPUs be
utilized?
Yes
No
Yes
Yes
Preemptive
Kernel-level
Threads *
No
No
Yes
No
Preemptive
user-level
threads
No
Yes
No
No
Cooperative
user-level
threads
CS 5204 Fall 2013
* Most
common
model
today
Threading Models
• Linux, Windows, Solaris 10 or later, OSX: use
1:1 model with kernel-level threads. OS
manages threads in each process
– threads in Java/C#, etc. are typically mapped to
kernel-level threads
• Solaris (pre-10), Windows “fibers”: provide
M:N model
– Attempted to obtain “best of both worlds” – turned
out to be difficult in practice
• User-level Threads
– used mainly in special/niche applications today
CS 5204 Fall 2013
Managing Stack Space
• Stacks require continuous virtual
address space
– virtual address space fragmentation
(example 32-bit Linux)
stack1
guard
• What size should stack have?
– How to detect stack overflow?
– Ignore vs. software vs. hardware
• Related: how to implement
– Get local thread id “pthread_self()”
– Thread-local Storage (TLS)
CS 5204 Fall 2013
stack2
guard
On Termination
• If you terminate a thread, how will you
clean up if you have to terminate it?
• Strategies:
– Avoid shared state
where possible
– Disable termination
– Use cleanup handlers
try/finally,
pthread_cleanup
Queue q1, q2; // shared
thread_body() {
while (!done)
(true) { {
Packet p = q1.dequeue();
q2.enqueue(p);
}
}
CS 5204 Fall 2013
SYNCHRONIZATION
CS 5204 Fall 2013
Synchronization
• Access to resources must be protected
• Race Condition problem
– Definition
– Approaches for detecting them
– Static vs dynamic
CS 5204 Fall 2013
Critical Section Problem
• Many algorithms known
– purely software-based (Dekker’s, Peterson’s
algorithm) vs. hardware-assisted (disable irqs,
test-and-set instructions)
• Criteria for good algorithm:
– mutual exclusion
– progress
– bounded waiting
while (application hasn’t exited) {
enter critical section
inside critical section
exit critical section
in remainder section
}
CS 5204 Fall 2013
Synchronization Abstractions
• Atomic primitives
– e.g. Linux kernel “atomic_inc()”
• Dijkstra’s semaphores
– P(s) := atomic { while (s<=0) /* no op */; s--; }
– V(s) := atomic { s++; }
– Q: what’s wrong with this implementation?
• Binary semaphores, locks, mutexes
– Difference between mutex & semaphore
CS 5204 Fall 2013
Expressing Critical Sections
pthread_mutex_t m;
…
pthread_mutex_lock(&m);
/* in critical section */
synchronized (object) {
/* in critical section */
if (*) {
pthread_mutex_unlock(&m);
return;
}
pthread_mutex_unlock(&m);
if (*) {
return;
}
}
Pthreads/C vs Java
CS 5204 Fall 2013
Expressing Critical Sections
pthread_mutex_t m;
…
pthread_mutex_lock(&m);
/* in critical section */
synchronized (object) {
/* in critical section */
if (*) {
pthread_mutex_unlock(&m);
return;
}
pthread_mutex_unlock(&m);
if (*) {
return;
}
}
Pthreads/C vs Java
Note benefits of language support
CS 5204 Fall 2013
Monitors (Hoare)
Enter
• Data Type:
• “Monitor Invariant”
Wait
Signal
Wait
Signal
Region of mutual exclusion
– internal, private data
– public methods
wrapped by Enter/Exit
– wait/signal methods
Exit
CS 5204 Fall 2013
Expressing Monitors
pthread_mutex_t m;
pthread_cond_t c;
…
pthread_mutex_lock(&m);
/* in critical section */
synchronized (object) {
/* in critical section */
while (somecond != true)
pthread_cond_wait(&c, &m);
while (somecond != true) {
object.wait();
}
pthread_mutex_unlock(&m);
}
pthread_mutex_lock(&m);
/* in critical section */
pthread_cond_signal(&c, &m);
pthread_mutex_unlock(&m);
synchronized (object) {
/* in critical section */
object.notify();
}
See also Java’s insecure parallelism [Per Brinch Hansen 1999]
CS 5204 Fall 2013
Deadlock
pthread_mutex_t A;
pthread_mutex_t B;
…
pthread_mutex_lock(&A);
pthread_mutex_lock(&B);
…
pthread_mutex_unlock(&B);
pthread_mutex_unlock(&A);
pthread_mutex_lock(&B);
pthread_mutex_lock(&A);
…
pthread_mutex_unlock(&A);
pthread_mutex_unlock(&B);
A
B
Thread 1
Thread 2
CS 5204 Fall 2013
Reusable vs. Consumable Resources
• Distinguish two types of resources when discussing
deadlock
• A resource:
– “anything a process needs to make progress”
• (Serially) Reusable resources (static, concrete, finite)
– CPU, memory, locks
– Can be a single unit (CPU on uniprocessor, lock), or multiple
units (e.g. memory, semaphore initialized with N)
• Consumable resources (dynamic, abstract, infinite)
– Can be created & consumed: messages, signals
• Deadlock may involve reusable resources or
consumable resources
CS 5204 Fall 2013
Deadlocks, more formally
• 4 necessary conditions
– Mutual Exclusion
– Hold and Wait
– No Preemption
– Circular Wait
• Q.: what are strategies
to detect/break/avoid
deadlocks?
CS 5204 Fall 2013
R1
P1
R3
P2
R2
P3
P4
R4
Resource Allocation Graph
R → P Assignment
P → R Request
Strategies for dealing with
Deadlock
• Deadlock Prevention
– Remove a necessary condition
• Deadlock Avoidance
– Can’t remove necessary condition, so avoid
occurrence of deadlock – maybe by clever
resource scheduling strategy
• Deadlock Recovery
• Deadlock vs. Starvation
CS 5204 Fall 2013
Example: x86
• Nonpreemptive = C calling conventions:
– Caller-saved: eax, ecx, edx + floating point
– Callee-saved: ebx, esi, edi, esp
• ebp, eip for a jmpbuf size of 6*4 = 24 bytes
• Preemptive = save entire state
– All registers + 108 bytes for floating point context
• Note: context switch cost = save/restore state
cost + scheduling overhead + lost locality cost
CS 5204 Fall 2013