process - University of Limerick

Download Report

Transcript process - University of Limerick

CS4023 – Operating Systems
(week 4)
Processes and Concurrency
Dr. Atif Azad
[email protected]
1
Acknowledgement
• Significant material in this set of lectures has been
borrowed from:
– http://www.cs.berkeley.edu/~kubitron/courses/cs162/.
Copyright © 2010 UCB
– http://www.cs.berkeley.edu/~kubitron/cs19424/index_lectures.html. Copyright © 2013 UCB
– http://www-inst.eecs.berkeley.edu/~cs162/fa14 .
Copyright © 2014 David Culler. Copyright ©2014 UCB.
– http://www.os-book.com . © 2014 Silberschatz et al.
– Dr Patrick Healy at CSIS Department, University of
Limerick.
2
Review – Week 3
• Operating System Services
• System Calls
– Provide links to Kernel Services (Resource
Allocator)
– Implementation
– Viewing System Calls (strace)
• Policy vs Mechanisms
– Policy: what needs to be done.
– Mechanism: how it should be done.
• Structure of Operating Systems
3
What happens during execution?
Addr 232-1
R0
…
R31
F0
…
F30
PC
Fetch
Exec
Thread of Execution
• Execution sequence:
–
–
–
–
–
–
Fetch Instruction at PC
Decode
Execute (possibly using registers)
Write results to registers/memory
PC = Next Instruction(PC)
Repeat
…
Data1
Data0
Inst237
Inst236
…
Inst5
Inst4
Inst3
Inst2
Inst1
Inst0
PC
PC
PC
PC
Addr 0
• Register: small, very fast memory for quick access to frequently used information
• Program Counter (PC): A special register; holds address of current/next instruction
OS Bottom Line: Run Programs
(passive) Program
Executable
foo.c
instructions
instructions
Load &
Execute
compiler
editor
Program Source
int main()
{ … ;
}
Process (active)
Memory
0x000…
data
data
heap
a.out
stack
• Program (passive) becomes alive to be a Process/job.
– Load instruction and data segments of executable file
into memory
– Create data structures in memory (stack and heap)
– “Transfer control to it”
PC:
– Provide services to it
– While protecting OS and it
• One program can be several processes?
– Consider multiple users executing the same
program
OS
0xFFF…
registers
Processor
Program Address Space
Process
0x000…
Code Segment
instruction
Static Data
heap
int array[6];
//Global data
float globalF=3.0f;
int main(){
int i=20;
float w=2;
int a; char c;
stack
0xFFF…
//dynamic allocation
char* x=malloc( i );
//like new Object();
int i; flot w; chr *x
func();
PC:
SP:
Processor
• What’s in the code segment? Data?
registers
• What’s in the heap segment?
– How is it allocated? How big?
• What’s in the stack segment?
//local data
}
void func(int a){
char c=‘a’;
printf(“func”);
}
– How is it allocated? How big is it?
• What if an instruction accesses data outside heap/stack?
Program Address Space
Stack and Heap
0x000…
Code Segment
instruction
Static Data
heap
stack:
int array[6];
//Global data
float globalF=3.0f;
int main(){
int i=20;
float w=2;
int a; char c;
//dynamic allocation
char* x=malloc( i );
//like new Object();
int i; flot w; chr *x
0xFFF…
func();
PC:
SP:
free(x)
Processor
• When func exits, stack decreases.
registers
• When main exits, stack finishes.
–
–
–
–
Heap still has memory allocated
But nothing points to it.
Needs to be free()d
Otherwise, heap may run out.
//local data
}
void func(int a){
char c=‘a’;
printf(“func”);
}
Recall Key Concept: Address Space
• Program operates in an address space that is
distinct from the physical memory space of
the machine
0x000…
Processor
Memory
translator
0xFFF…
A simple address translation: Base & Bound
code
0000…
code
0000…
Static Data
Static Data
heap
heap
stack
Base Address
stack
1000…
code
Program
address
Static Data
heap
Bound
<
1100…
• Can the pgm touch OS?
• Can it touch other pgms?
9/3/14
1000…
stack
1100…
FFFF…
Concurrency
• “Thread” of execution
– Independent Fetch/Decode/Execute loop
– Operating in some Address space
– Single-Threaded processes for this lecture. (also called
“heavy weight” process)
• Uniprogramming: one thread at a time
–
–
–
–
MS/DOS, early Macintosh, Batch processing
Easier for operating system builder
Get rid of concurrency by defining it away
Does this make sense for personal computers?
• Multiprogramming: more than one thread at a time
– Multics, UNIX/Linux, OS/2, Windows NT/2000/XP, Mac OS X
The Basic Problem of Concurrency
• The basic problem of concurrency involves resources:
– Hardware: single CPU, single RAM, single I/O devices
– Multiprogramming API: users think they have exclusive access
to shared resources
• OS Has to coordinate all activity
– Multiple users, I/O interrupts, …
– How can it keep all these things straight?
• Basic Idea:
– Represent a running program as a process
– Once we know what a process looks like, define ways to
multiplex (interleave) them
• Dijkstra did this for the “THE system”
– Few thousand lines vs 1 million lines in OS 360 (1K bugs)
Multiprogramming – Many Processes
Proc
1
Proc
2
…
code
Proc
n
0000…
Static Data
heap
OS
stack
sysmode
1
code
Base
1000 …
0000…
Static Data
Bound
1100…
FFFF…
heap
oldPC
xxxx…
PC
stack
0000 1234
code
regs
00FF…
1000…
1100…
3000…
Static Data
…
heap
stack
3080…
Multiprogramming – Interrupt
(kernel code)
Proc
1
Proc
2
…
code
Proc
n
0000…
Static Data
heap
OS
stack
sysmode
0
code
Base
1000 …
0000…
Static Data
Bound
1100 …
FFFF…
heap
oldPC
0000 1234
PC
IntrpVector[i]
code
regs
•
00FF…
How to save
registers and set up
system stack?
stack
1000…
1100…
3000…
Static Data
…
heap
stack
3080…
Multiprogramming – Process 2
Proc
1
Proc
2
…
code
Proc
n
0000…
Static Data
heap
OS
stack
sysmode
1
code
1000 …
1100 …
0000 1234
Base
3000 …
0000…
Static Data
Bound
0080 …
FFFF…
heap
oldPC
xxxx xxxx
PC
regs
00FF…
•
stack
000 0248
code
regs
00D0…
How to save
registers and set up
system stack?
1000…
1100…
3000…
Static Data
…
heap
stack
3080…
14
Concurrency: illusion of many processors
CPU1 CPU2 CPU3
Shared Memory
CPU1 CPU2 CPU3 CPU1 CPU2
Time
• Concurrency provides the illusion of multiple processors?
– Multiplex in time!
• Each process is like a virtual “CPU”; needs a structure to hold:
– Program Counter (PC), Stack Pointer (SP)
– Registers (Integer, Floating point, others…?)
• Q: What structure holds this information: PCB. (coming next)
• How switch from one CPU to the next?
– Save PC, SP, and registers in current state block
– Load PC, SP, and registers from new state block
• What triggers switch?
– Timer, voluntary yield, I/O, other things
• Q: what states can a process go to after a switch?
Diagram of Process State
• As a process executes, it changes state
– new: The process is being created
– ready: The process is waiting to run
– running: Instructions are being executed
– waiting: Process waiting for some event to occur
– terminated: The process has finished execution
Multiplexing with Process Control Blocks (PCB)
• The current state of process held in a
process control block (PCB):
– This is a “snapshot” of the execution and
protection environment
– Only one PCB active at a time
• PCB also notes multiplexing process
info :
– CPU time to different processes
(Scheduling):
• Only one process “running” at a time
• Give more time to important processes
– Resources held by Process):
• Memory Mapping: based and bound registers
• I/O info: opened files, network connections
etc.
Process
Control
Block
Process Representation in Linux
Represented by the C structure
task_struct
pid t_pid; /* process identifier */
long state; /* state of the process */
unsigned int time_slice /* scheduling information */
struct task_struct *parent; /* this process’s parent */
struct list_head children; /* this process’s children */
struct files_struct *files; /* list of open files */
struct mm_struct *mm; /* address space of this process */
CPU Switch From Process to Process
• This is also called a “context switch”
• Code executed in kernel above is overhead
– Overhead sets minimum practical switching time
Context Switch in Detail
• A context switch is the switching of the CPU from one
process or thread to another.
• In detail, kernel does the following:
– Suspend one process and store its state as a PCB
– retrieve the PCB for the next process from memory and restore
it in the CPU’s registers
– Start executing the new process as pointed by its program
counter (PC)
– The more complex the OS and the PCB  the longer the
context switch
• Time dependent on hardware support
– Some hardware provides multiple sets of registers per CPU 
multiple contexts loaded at once
20
Operations on Processes
• Operating systems must provide
mechanisms for:
– process creation,
– process termination,
Process Creation
• Parent process create children processes,
which, in turn create other processes,
forming a tree of processes
• Generally, process identified and
managed via a process identifier (pid)
• Resource sharing options
– Parent and children share all resources
– Children share subset of parent’s resources
– Parent and child share no resources
• Execution options
– Parent and children execute concurrently
– Parent waits until children terminate
A Tree of Processes in Linux
init
pid = 1
login
pid = 8415
khelper
pid = 6
bash
pid = 8416
ps
pid = 9298
emacs
pid = 9204
sshd
pid = 3028
kthreadd
pid = 2
pdflush
pid = 200
sshd
pid = 3610
tcsch
pid = 4005
• Run pstree command. Can’t see init? Instead see systemd?
• http://www.pcworld.com/article/2841873/meet-systemd-the-controversialproject-taking-over-a-linux-distro-near-you.html
Process Creation (Cont.)
• Address space
– Child duplicate of parent
– Child has a program loaded into it
• UNIX examples
– fork() system call creates new process
– exec() system call used after a fork() to
replace the process’ memory space with a
new program
C Program Forking Separate Process
??
Parent Process
Child Process
Process Termination
• Process executes last statement and then asks the
operating system to delete it using the exit() system call.
– Parent gets the child’s exit status via wait()
– Process’ resources are deallocated by operating system
Parent
Child
pid_t pid;
exit(0);
int status;
pid = wait(&status);
printf(“%d”, status);
• Parent may terminate the execution of children processes
using the abort() system call. Some reasons for doing so:
– Child has exceeded allocated resources
– Task assigned to child is no longer required
– The parent is exiting and the operating systems does not
allow a child to continue if its parent terminates
Process Termination
• Some operating systems do not allow child to exist if its
parent has terminated. If a process terminates, then all its
children must also be terminated.
– cascading termination. All children, grandchildren, etc. are
terminated.
– The termination is initiated by the operating system.
• The parent process may wait for termination of a child
process by using the wait()system call. The call returns
status information and the pid of the terminated process
•
pid = wait(&status);
• If no parent waiting (did not invoke wait()) process is a
zombie
• If parent terminated without invoking wait , process is an
orphan
Multiprocess Architecture –
Chrome Browser
• Many web browsers ran as single process
– If one web site causes trouble, entire browser can
hang or crash
• Google Chrome Browser is multi-process with 3
different types of processes:
– Browser process manages user interface, disk and
network I/O
– Renderer process renders web pages, deals with
HTML, Javascript. A new renderer created for each
website opened
• Runs in sandbox restricting disk and network I/O,
minimizing effect of security exploits
– Plug-in process for each type of plug-in
Resource Duplication in Child Process
• Child gets a copy of parent’s resources
• Can involve a lot of overhead
– Copying several GB of data
– Only to throw that memory space away and load a
new address space with exec()?
• Solution:
– Copy on write
29
Copy On Write
Code Segments
Data Copies
//Parent code segment (a)
string i = “Good”;
//Child1 Code Segment (b)
{
printf(“in child 1”);
}
//Child2 Code Segment (c)
{
printf(“in child 2”);
i = i+ “Bye”;
}
a
Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305
RWCString is class implemented using a Copy-on-Write technique
“Good”
Copy On Write
Code Segments
Data Copies
//Parent code segment (a)
string i = “Good”;
//Child1 Code Segment (b)
{
printf(“in child 1”);
}
//Child2 Code Segment (c)
{
printf(“in child 2”);
i = i+ “Bye”;
}
a
b
Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305
RWCString is class implemented using a Copy-on-Write technique
“Good”
Copy On Write
Code Segments
Data Copies
//Parent code segment (a)
string i = “Good”;
//Child1 Code Segment (b)
{
printf(“in child 1”);
}
//Child2 Code Segment (c)
{
printf(“in child 2”);
i = i+ “Bye”;
}
a
b
c
Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305
RWCString is class implemented using a Copy-on-Write technique
“Good”
Copy On Write
Code Segments
Data Copies
//Parent code segment (a)
string i = “Good”;
//Child1 Code Segment (b)
{
printf(“in child 1”);
}
//Child2 Code Segment (c)
{
printf(“in child 2”);
i = i+ “Bye”;
}
a
b
“Good”
c
“Good Bye”
Read section 9.3 of the book SGG:
The page containing
the data to be written to is copied and then amended.
Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305
RWCString is class implemented using a Copy-on-Write technique
Short Term
Scheduler
Process Scheduling
Long Term
Scheduler?
Degree of
multiprogramming
• PCBs move from queue to queue as they change state
– Decisions about which order to remove from queues are
Scheduling decisions
– Many algorithms possible (few weeks from now)
Interprocess Communication
• Processes within a system may be independent or
cooperating
• Reasons for cooperating processes:
– Information sharing (Copy and Paste)
– Computation speedup (parallel work)
– Modularity
• Cooperating processes need interprocess
communication (IPC)
– Link: Interprocess communication on Windows
• Two models of IPC
– Shared memory
– Message passing
Communications Models
(a) Message passing.
(b) shared memory.
Which is potentially faster?
Shared Memory: Producer-Consumer
Metaphor
• An area of memory shared among the
processes that wish to communicate
• Paradigm for cooperating processes:
producer process produces information
that is consumed by a consumer process
– Issue: synchronisation
• The consumer should not read before the producer
has produced;
• producer should not over-produce.
Synchronisation
In == out  empty
in+1 == out  full
Consumer
out
Shared Buffer
in
Producer
• Producer/Consumer may wait:
– Either for the space/content to be available
– Or CPU time (assume a single core: only one process at a time)
38
•
Bounded-Buffer – Shared-Memory Solution
Shared data
#define BUFFER_SIZE 10
typedef struct {
. . .
} item;
item buffer[BUFFER_SIZE];
int in = 0;
int out = 0;
item next_produced;
while (true) {
/* produce an item in next produced */
while (((in + 1) % BUFFER_SIZE) == out)
; /* do nothing */
buffer[in] = next_produced;
in = (in + 1) % BUFFER_SIZE;
}
•
Solution is correct, but can only use BUFFER_SIZE-1 elements
Bounded Buffer – Consumer
item next_consumed;
while (true) {
while (in == out)
; /* do nothing */
next_consumed = buffer[out];
out = (out + 1) % BUFFER_SIZE;
/* consume the item in next consumed */
}
Synchronisation covered in detail in Chapter 5 of SGG
IPC POSIX Producer
IPC POSIX Consumer
Interprocess Communication –
Message Passing
• Mechanism for processes to communicate and to synchronize
their actions
• Message system – processes communicate with each other
without resorting to shared variables
• IPC facility provides two operations:
– send(message)
– receive(message)
• The message size is either fixed or variable
• If processes P and Q wish to communicate, they need to:
– Establish a communication link between them
– Exchange messages via send/receive
Message Passing (Cont.)
• Implementation of communication link
– Physical:
• Shared memory
• Hardware bus
• Network
– Logical:
• Direct or indirect
• Synchronous or asynchronous
Indirect Communication
• Messages are directed and received from mailboxes
(also referred to as ports)
– Each mailbox has a unique id
– Processes can communicate only if they share a mailbox
• Properties of communication link
– Link established only if processes share a common
mailbox
– A link may be associated with many processes
– Each pair of processes may share several communication
links
– Link may be unidirectional or bi-directional
Synchronization
Message passing may be either blocking or non-blocking
Blocking is considered synchronous
Blocking send -- the sender is blocked until the message is
received
Blocking receive -- the receiver is blocked until a message is
available
Non-blocking is considered asynchronous
Non-blocking send -- the sender sends the message and
continue
Non-blocking receive -- the receiver receives:
A valid message, or
Null message
• Different combinations possible
– If both send and receive are blocking, we have a rendezvous
Communications in Client-Server Systems
•
•
•
•
Pipes
Sockets
Remote Procedure Calls
Remote Method Invocation (Java)
Pipes
• Acts as a conduit allowing two processes to
communicate
• Issues:
– Is communication unidirectional or bidirectional?
– In the case of two-way communication, is it half or fullduplex?
– Must there exist a relationship (i.e., parent-child)
between the communicating processes?
– Can the pipes be used over a network?
• Ordinary pipes – cannot be accessed from outside the
process that created it. Typically, a parent process
creates a pipe and uses it to communicate with a child
process that it created.
• Named pipes – can be accessed without a parent-child
relationship.
Ordinary Pipes
• Ordinary Pipes allow communication in standard producerconsumer style
• Producer writes to one end (the write-end of the pipe)
• Consumer reads from the other end (the read-end of the pipe)
• Ordinary pipes are therefore unidirectional
• Require parent-child relationship between communicating processes
• Windows calls these anonymous pipes
• See Unix and Windows code samples in textbook
Sockets
•
A socket is defined as an endpoint for communication
•
•
•
Concatenation of IP address and port
IP: four numbers between 0 – 255: e.g. 161.25.19.8
port a number included at start of message packet to differentiate
network services on a host.
•
The socket 161.25.19.8:1625 refers to port 1625 on host 161.25.19.8
•
Communication consists between a pair of sockets
•
All ports below 1024 are well known, used for standard services.
– Assigned by Internet Assigned Numbers Authority
•
Special IP address 127.0.0.1 (loopback) to refer to system on which
process is running
Socket Communication
Sockets in Java
• Three types of sockets
– Connection-oriented
(TCP)
– Connectionless (UDP)
– MulticastSocket
class– data can be
sent to multiple
recipients
• Consider this “Date”
server:
• See Fig. 3.22 in the
book for the client
code.
Lab Sample code
• FILE* outfile = fopen(“copy.txt”, “w”);
• while ( (c=getc(infile)) !=EOF){
– putc(c,outfile);
• }
• No check on the system call’s return value
• Can crash your program
– Can bring your entire system down
– Can cost millions to your business
• Only going to deduct 1 mark for such mistakes.
53
Midterm Exam
• Exam on Wednesday, 12 October at 1100 hrs
at FB028.
• Special Needs: email Farshad Toosi for
separate seating.
• Everything covered including non-optional
extra reading material
54