CS4023 * Operating Systems

Download Report

Transcript CS4023 * Operating Systems

CS4023 – Operating Systems
(week 5)
Communication models in processes
Threads
Dr. Atif Azad
[email protected]
1
Acknowledgement
• Significant material in this set of lectures has
been borrowed from:
– http://www.cs.berkeley.edu/~kubitron/courses/cs162
/. Copyright © 2010 UCB
– http://cs162.eecs.berkeley.edu. Copyright © 2014
David Culler. Copyright ©2014 UCB.
– http://www.os-book.com . © 2014 Silberschatz et al.
– Dr Patrick Healy at CSIS Department, University of
Limerick.
2
A simple address translation: Base & Bound
code
0000…
code
0000…
Static Data
Static Data
heap
heap
stack
Base Address
stack
1000…
code
Program
address
Static Data
heap
Bound
<
1100…
• Can the pgm touch OS?
• Can it touch other pgms?
9/3/14
1000…
stack
1100…
FFFF…
Concurrency
• “Thread” of execution
– Independent Fetch/Decode/Execute loop
– Operating in some Address space
– Single-Threaded processes for this lecture. (also called “heavy
weight” process)
• Uniprogramming: one thread at a time
–
–
–
–
MS/DOS, early Macintosh, Batch processing
Easier for operating system builder
Get rid of concurrency by defining it away
Does this make sense for personal computers?
• Multiprogramming: more than one thread at a time
– Multics, UNIX/Linux, OS/2, Windows NT/2000/XP, Mac OS X
• ManyCore  Multiprogramming, right?
The Basic Problem of Concurrency
• The basic problem of concurrency involves resources:
– Hardware: single CPU, single DRAM, single I/O devices
– Multiprogramming API: users think they have exclusive access
to shared resources
• OS Has to coordinate all activity
– Multiple users, I/O interrupts, …
– How can it keep all these things straight?
• Basic Idea:
– abstract the notion of an executing program (process
representation)
– Then, worry about multiplexing these abstract beings
• Dijkstra did this for the “THE system”
– Few thousand lines vs 1 million lines in OS 360 (1K bugs)
Multiprogramming – Many Processes
Proc
1
Proc
2
…
code
Proc
n
0000…
Static Data
heap
OS
stack
sysmode
1
code
Base
1000 …
0000…
Static Data
Bound
1100…
FFFF…
heap
uPC
xxxx…
PC
stack
0000 1234
code
regs
00FF…
1000…
1100…
3000…
Static Data
…
heap
stack
3080…
Multiprogramming – Interrupt
(kernel code)
Proc
1
Proc
2
…
code
Proc
n
0000…
Static Data
heap
OS
stack
sysmode
0
code
Base
1000 …
0000…
Static Data
Bound
1100 …
FFFF…
heap
uPC
PC
0000 1234
IntrpVector[i]
code
regs
•
00FF…
How to save
registers and set up
system stack?
stack
1000…
1100…
3000…
Static Data
…
heap
stack
3080…
Multiprogramming – Process 2
Proc
1
Proc
2
…
code
Proc
n
0000…
Static Data
heap
OS
stack
sysmode
1
code
1000 …
1100 …
0000 1234
Base
3000 …
0000…
Static Data
Bound
0080 …
FFFF…
heap
uPC
PC
regs
00FF…
•
xxxx xxxx
stack
000 0248
code
regs
00D0…
How to save
registers and set up
system stack?
1000…
1100…
3000…
Static Data
…
heap
stack
3080…
8
Concurrency: illusion of many processors
CPU1 CPU2 CPU3
Shared Memory
CPU1 CPU2 CPU3 CPU1 CPU2
Time
• Concurrency provides the illusion of multiple processors?
– Multiplex in time!
• Each process is like a virtual “CPU”; needs a structure to hold:
– Program Counter (PC), Stack Pointer (SP)
– Registers (Integer, Floating point, others…?)
• Q: What structure holds this information: PCB. (coming next)
• How switch from one CPU to the next?
– Save PC, SP, and registers in current state block
– Load PC, SP, and registers from new state block
• What triggers switch?
– Timer, voluntary yield, I/O, other things
• Q: what states can a process go to after a switch?
Diagram of Process State
• As a process executes, it changes state
– new: The process is being created
– ready: The process is waiting to run
– running: Instructions are being executed
– waiting: Process waiting for some event to occur
– terminated: The process has finished execution
Multiplexing with Process Control Blocks (PCB)
• The current state of process held in a
process control block (PCB):
– This is a “snapshot” of the execution and
protection environment
– Only one PCB active at a time
• PCB also notes multiplexing process
info :
– CPU time to different processes
(Scheduling):
• Only one process “running” at a time
• Give more time to important processes
– Resources held by Process):
• Memory Mapping: based and bound registers
• I/O info: opened files, network connections
etc.
Process
Control
Block
Process Representation in Linux
Represented by the C structure
task_struct
pid t_pid; /* process identifier */
long state; /* state of the process */
unsigned int time_slice /* scheduling information */
struct task_struct *parent; /* this process’s parent */
struct list_head children; /* this process’s children */
struct files_struct *files; /* list of open files */
struct mm_struct *mm; /* address space of this process */
CPU Switch From Process to Process
• This is also called a “context switch”
• Code executed in kernel above is overhead
– Overhead sets minimum practical switching time
Context Switch in Detail
• A context switch is the switching of the CPU from one
process or thread to another.
• In detail, kernel does the following:
– Suspend one process and store its state as a PCB
– retrieve the PCB for the next process from memory and restore
it in the CPU’s registers
– Start executing the new process as pointed by its program
counter (PC)
– The more complex the OS and the PCB  the longer the
context switch
• Time dependent on hardware support
– Some hardware provides multiple sets of registers per CPU 
multiple contexts loaded at once
14
Operations on Processes
• Operating systems must provide
mechanisms for:
– process creation,
– process termination,
Process Creation
• Parent process create children processes,
which, in turn create other processes,
forming a tree of processes
• Generally, process identified and
managed via a process identifier (pid)
• Resource sharing options
– Parent and children share all resources
– Children share subset of parent’s resources
– Parent and child share no resources
• Execution options
– Parent and children execute concurrently
– Parent waits until children terminate
A Tree of Processes in Linux
init
pid = 1
login
pid = 8415
khelper
pid = 6
bash
pid = 8416
ps
pid = 9298
emacs
pid = 9204
sshd
pid = 3028
kthreadd
pid = 2
pdflush
pid = 200
sshd
pid = 3610
tcsch
pid = 4005
• Run pstree command. Can’t see init? Instead see systemd?
• http://www.pcworld.com/article/2841873/meet-systemd-the-controversialproject-taking-over-a-linux-distro-near-you.html
Process Creation (Cont.)
• Address space
– Child duplicate of parent
– Child has a program loaded into it
• UNIX examples
– fork() system call creates new process
– exec() system call used after a fork() to
replace the process’ memory space with a
new program
C Program Forking Separate Process
??
Parent Process
Child Process
Process Termination
• Process executes last statement and then asks the
operating system to delete it using the exit()
system call.
– Returns status data from child to parent (via wait())
– Process’ resources are deallocated by operating
system
• Parent may terminate the execution of children
processes using the abort() system call. Some
reasons for doing so:
– Child has exceeded allocated resources
– Task assigned to child is no longer required
– The parent is exiting and the operating systems does
not allow a child to continue if its parent terminates
Process Termination
• Some operating systems do not allow child to exist if its
parent has terminated. If a process terminates, then all its
children must also be terminated.
– cascading termination. All children, grandchildren, etc. are
terminated.
– The termination is initiated by the operating system.
• The parent process may wait for termination of a child
process by using the wait()system call. The call returns
status information and the pid of the terminated process
•
pid = wait(&status);
• If no parent waiting (did not invoke wait()) process is a
zombie
• If parent terminated without invoking wait , process is an
orphan
Multiprocess Architecture –
Chrome Browser
• Many web browsers ran as single process (some still
do)
– If one web site causes trouble, entire browser can hang or
crash
• Google Chrome Browser is multiprocess with 3
different types of processes:
– Browser process manages user interface, disk and
network I/O
– Renderer process renders web pages, deals with HTML,
Javascript. A new renderer created for each website
opened
• Runs in sandbox restricting disk and network I/O, minimizing
effect of security exploits
– Plug-in process for each type of plug-in
Resource Duplication in Child Process
• Child gets a copy of parent’s resources
• Can involve a lot of overhead
– Copying several GB of data
– Only to throw that memory space away and load a
new address space with exec()?
• Solution:
– Copy on write
23
Copy On Write
Code Segments
Data Copies
//Parent code segment (a)
string i = “Good”;
//Child1 Code Segment (b)
{
printf(“in child 1”);
}
//Child2 Code Segment (c)
{
printf(“in child 2”);
i = i+ “Bye”;
}
a
Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305
RWCString is class implemented using a Copy-on-Write technique
“Good”
Copy On Write
Code Segments
Data Copies
//Parent code segment (a)
string i = “Good”;
//Child1 Code Segment (b)
{
printf(“in child 1”);
}
//Child2 Code Segment (c)
{
printf(“in child 2”);
i = i+ “Bye”;
}
a
b
Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305
RWCString is class implemented using a Copy-on-Write technique
“Good”
Copy On Write
Code Segments
Data Copies
//Parent code segment (a)
string i = “Good”;
//Child1 Code Segment (b)
{
printf(“in child 1”);
}
//Child2 Code Segment (c)
{
printf(“in child 2”);
i = i+ “Bye”;
}
a
b
c
Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305
RWCString is class implemented using a Copy-on-Write technique
“Good”
Copy On Write
Code Segments
Data Copies
//Parent code segment (a)
string i = “Good”;
//Child1 Code Segment (b)
{
printf(“in child 1”);
}
//Child2 Code Segment (c)
{
printf(“in child 2”);
i = i+ “Bye”;
}
a
b
“Good”
c
“Good Bye”
Read section 9.3 of the book SGG:
The page containing
the data to be written to is copied and then amended.
Adapted from: http://www.slideshare.net/k.alroumi/copy-on-write-179305
RWCString is class implemented using a Copy-on-Write technique
Short Term
Scheduler
Process Scheduling
Long Term
Scheduler?
Degree of
multiprogramming
• PCBs move from queue to queue as they change state
– Decisions about which order to remove from queues are
Scheduling decisions
– Many algorithms possible (few weeks from now)
Interprocess Communication
• Processes within a system may be independent or cooperating
• Cooperating process can affect or be affected by other processes,
including sharing data
• Reasons for cooperating processes:
–
–
–
–
Information sharing (Copy and Paste)
Computation speedup (parallel work)
Modularity
Convenience
• Cooperating processes need interprocess communication (IPC)
– Link: Interprocess communication on Windows
• Two models of IPC
– Shared memory
– Message passing
Communications Models
(a) Message passing.
(b) shared memory.
Which is potentially faster?
Shared Memory: Producer-Consumer
Metaphor
• An area of memory shared among the
processes that wish to communicate
• Paradigm for cooperating processes:
producer process produces information
that is consumed by a consumer process
– Issue: synchronisation
• The consumer should not read before the producer
has produced;
• producer should not over-produce.
Synchronisation
In == out  empty
in+1 == out  full
Consumer
out
Shared Buffer
in
Producer
• Producer/Consumer may wait:
– Either for the space/content to be available
– Or CPU time (assume a single core: only one process at a time)
32
•
Bounded-Buffer – Shared-Memory Solution
Shared data
#define BUFFER_SIZE 10
typedef struct {
. . .
} item;
item buffer[BUFFER_SIZE];
int in = 0;
int out = 0;
item next_produced;
while (true) {
/* produce an item in next produced */
while (((in + 1) % BUFFER_SIZE) == out)
; /* do nothing */
buffer[in] = next_produced;
in = (in + 1) % BUFFER_SIZE;
}
•
Solution is correct, but can only use BUFFER_SIZE-1 elements
Bounded Buffer – Consumer
item next_consumed;
while (true) {
while (in == out)
; /* do nothing */
next_consumed = buffer[out];
out = (out + 1) % BUFFER_SIZE;
/* consume the item in next consumed */
}
Synchronisation covered in detail in Chapter 5 of SGG
IPC POSIX Producer
IPC POSIX Consumer
Interprocess Communication –
Message Passing
• Mechanism for processes to communicate and to synchronize
their actions
• Message system – processes communicate with each other
without resorting to shared variables
• IPC facility provides two operations:
– send(message)
– receive(message)
• The message size is either fixed or variable
• If processes P and Q wish to communicate, they need to:
– Establish a communication link between them
– Exchange messages via send/receive
Message Passing (Cont.)
• Implementation of communication link
– Physical:
• Shared memory
• Hardware bus
• Network
– Logical:
• Direct or indirect
• Synchronous or asynchronous
Indirect Communication
• Messages are directed and received from mailboxes
(also referred to as ports)
– Each mailbox has a unique id
– Processes can communicate only if they share a mailbox
• Properties of communication link
– Link established only if processes share a common
mailbox
– A link may be associated with many processes
– Each pair of processes may share several communication
links
– Link may be unidirectional or bi-directional
Synchronization
Message passing may be either blocking or non-blocking
Blocking is considered synchronous
Blocking send -- the sender is blocked until the message is
received
Blocking receive -- the receiver is blocked until a message is
available
Non-blocking is considered asynchronous
Non-blocking send -- the sender sends the message and
continue
Non-blocking receive -- the receiver receives:
A valid message, or
Null message
• Different combinations possible
– If both send and receive are blocking, we have a rendezvous
Communications in Client-Server Systems
•
•
•
•
Pipes
Sockets
Remote Procedure Calls
Remote Method Invocation (Java)
Pipes
• Acts as a conduit allowing two processes to
communicate
• Issues:
– Is communication unidirectional or bidirectional?
– In the case of two-way communication, is it half or fullduplex?
– Must there exist a relationship (i.e., parent-child)
between the communicating processes?
– Can the pipes be used over a network?
• Ordinary pipes – cannot be accessed from outside the
process that created it. Typically, a parent process
creates a pipe and uses it to communicate with a child
process that it created.
• Named pipes – can be accessed without a parent-child
relationship.
Ordinary Pipes
• Ordinary Pipes allow communication in standard producerconsumer style
• Producer writes to one end (the write-end of the pipe)
• Consumer reads from the other end (the read-end of the pipe)
• Ordinary pipes are therefore unidirectional
• Require parent-child relationship between communicating processes
• Windows calls these anonymous pipes
• See Unix and Windows code samples in textbook
Sockets
•
A socket is defined as an endpoint for communication
•
•
•
Concatenation of IP address and port
IP: four numbers between 0 – 255: e.g. 161.25.19.8
port a number included at start of message packet to differentiate
network services on a host.
•
The socket 161.25.19.8:1625 refers to port 1625 on host 161.25.19.8
•
Communication consists between a pair of sockets
•
All ports below 1024 are well known, used for standard services.
– Assigned by Internet Assigned Numbers Authority
•
Special IP address 127.0.0.1 (loopback) to refer to system on which
process is running
Socket Communication
Sockets in Java
• Three types of sockets
– Connection-oriented
(TCP)
– Connectionless (UDP)
– MulticastSocket
class– data can be
sent to multiple
recipients
• Consider this “Date”
server:
• See Fig. 3.22 in the
book for the client
code.
Lab Sample code
• FILE* outfile = fopen(“copy.txt”, “w”);
• while ( (c=getc(infile)) !=EOF){
– putc(c,outfile);
• }
• No check on the system call’s return value
• Can crash your program
– Can bring your entire system down
– Can cost millions to your business
– An awkward meeting with the boss
• Only going to deduct 1 mark for such mistakes.
47
Midterm Exam
• Exam on Monday, 19 October at 1600 hrs at
CSG001.
• Special Needs: email Farshad Toosi for
separate seating.
• Everything covered including extra reading
material
• More instructions later on the class website.
48