fork() and exec()
CS 201
Computer Systems Programming
Chapter 8
“Unix fork(), execve()”
Herbert G. Mayer, PSU CS
Status 11/17/2013
Syllabus
Process, Thread, Hyperthread
Unix Process
Interrupt
Command ps
Background Process
Command fork()
fork() Sample
Command execve()
execve() Sample
References
Process – SW View
[Bryant, 16] “A process is the operating system’s
abstraction of a running program.”
[Silberschatz, 10] “A process can be thought of as a
program in execution …”
[Silberschatz, 90] “Informally, a process is a
program in execution. … A process is more than
the program code. It also includes the current
activity, as represented by the value of the
program counter and the contents of the
processor’s registers.”
[Tanenbaum, 72] “A process is just an executing
program, including the current values of the program
counter, registers, and variables.”
Thread – SW View
[Bryant, 947] “A thread is a logical flow that runs in the
context of a process. Each thread has its own thread context,
including a unique integer thread ID, stack, stack pointer,
program counter, general-purpose registers, and condition
codes. All threads running in a process share the entire
virtual address space of that process.”
[Silberschatz, 103] “A thread, sometimes called a
lightweight process, is a basic unit of CPU utilization,
and consists of a program counter, a register set, and a
stack space. It shares with peer threads its code section,
data section, and operating-system resources such as
open files and signals.”
[Tanenbaum, 81] “A thread has a program counter that
keeps track of which instruction to execute next. It has
registers, which hold its current working variables. It has a
stack, which contains the execution history, with one frame
for each procedure called, but not yet returned from. ... A
thread must execute in some process”
Thread Creation
Who makes threads?
Programmer understands which portions of a C program depend on,
and which are independent of, other parts
of the same program
Microsoft and Intel compilers provide tools to support this analysis
Some C and C++ compilers provide directives
Directives can be hidden as comments, to be transparent to
other compilers
See lit ref [9] for Intel’s Parallel Studio, which helps programmers
eliminate threading errors in C and C++ programs
Or see Intel’s [10] [12] to help programmers “to thread” SW
Or Microsoft Multi-Threading support for existing C and C++
source code [11]
Hyperthread – HW View
A processor may be single-core (UP), or have multiple
cores, the latter meaning: all processor resources are
replicated; e.g. Intel Core 2 Duo
Or a processor may be single-core, hyperthreaded, e.g.
Intel Pentium 4e. Hyperthreaded means that only CPU
registers and APIC are replicated, but not ALU units, such as
integer unit, floating-point unit, branch unit, caches, etc.
Or a processor may be multi-core, hyperthreaded, in which
case each of several real cores has a hyperthread twin (or
more), sharing the real core’s ALU with all hyperthreads, but
each hyperthread has own register + APIC; e.g. Intel Core i7
Hyperthreading is an old idea, proposed decades ago by
Digital Equipment Corp. (DEC), implemented first in silicon
by Intel in 2002 on Xeon® server + Pentium® 4 desktop CPUs
Hyperthread is an overloaded term, referring to the reduced
silicon core as well as to the active SW thread executing on it
A better name would be Hypothread, since it is a thread-subset,
but Hyperthread IS THE accepted technical term
Hyperthread – Per Intel Website
Ability of processor to run concurrent threads quickly
Some microprocessor hardware replication that creates the
illusion to SW of Dual Processor (DP)
Yet such HW with some resource replication is NOT a true
dual-core silicon implementation
The execution unit is still shared between multiple threads
Effect of Hyperthreading on Xeon® Processor:
Average CPU utilization increases to ~50%, up from ~35%
for a typical uni-processor
Up to ~30% performance gain for some applications with the
same processor frequency
Hyperthreading Technology Results:
1. More performance with enabled applications
2. Better responsiveness with existing applications
Hyperthread – Per Intel Website
[Figure: block diagram of one hyperthreaded processor. Two Architecture State blocks, each with its own Adv. Programmable Interrupt Control, sit on one shared Processor Execution Resource with On-Die Caches, connected to the System Bus]
Almost two Logical Processors
Architecture state (registers) and APIC* replicated
Shares execution units, caches, branch prediction, control logic and buses
*APIC: Advanced Programmable Interrupt Controller. Handles interrupts sent to a specified logical processor
Hyperthread
Hyperthreaded core replicates in silicon all CPU resources
essential to switching fast from one thread to another
Those replicated resources are registers and the APIC. ALU
units are not replicated on a hyperthread core
SW hyperthreads are also threads, but execute concurrently
on HW hyperthread cores; switch is efficient due to available
registers; ALU-sharing is still necessary: Only a single ALU!
This costs ~5% more HW (silicon) than a single core, but can
gain up to 30% performance improvement; great ROI!
Delta Between Process & Thread
Process is an OS-centric view of a running program.
Due to the meaning of “running”, all machine
resources are necessary to execute a process
A thread is a subset of a process, not necessarily a
proper subset
One purpose for threading a process is to allow
continued execution of that one process, even when
some part of it has to wait; e.g. one thread waits for
an IO operation to finish, but another thread can
continue, being independent of the IO result
Another purpose for having threads execute a process is to
speed up overall execution. Possible, if multiple
threads of the same process are sufficiently data-independent
to progress concurrently; never
simultaneously on a single core!
A threaded process can sometimes execute faster
on a single core due to reduced waits
Delta Between Thread & Hyperthread
A hyperthreaded core almost creates illusion of a
multi-core CPU, though there is only 1 complete CPU
Enables concurrent execution on a uni-processor,
when one process thread stalls and another thread is
ready; switch to the other thread is cheap, since the
other register set already has proper state –except
the first time around
But hyperthreading (or threading) per se never allows
parallel execution; only a multi-core architecture
does
A System Programmer must know: on an MP OS not yet
tuned for hyperthreading, it is best to disable hyperthread
scheduling; else under the right (i.e. wrong)
circumstances performance degrades
This is the case when the “next core to be scheduled”
regularly happens to be the hyperthreaded subset-core,
leaving some other real core idle, while such a real core
could work instead of the hyperthread
Unix Process
Under Unix, a process is an instance of running a
program. If you execute the ancient editor ed and
your colleague does too, then there are 2 --very
similar-- processes running
All user a.out programs and all Unix commands,
when running, become processes; commands ll
and g++ my_prog.cpp create 2 different processes
Processes are visible via ps command, and even
that command is a process in its own right
Issue the command ps, for process status, and you
see your current processes, plus the ps process itself
Issue ps -a, and you see a very detailed list,
including sleeping processes
Issue the command man ps for your education
Unix Process
When a command is issued, Unix starts a new
process and suspends the issuing process,
generally the C-shell, until the new child process
completes
Exception: background processes with &
Unix identifies every process by a Process
Identification Number (PID) assigned at initiation
See getpid() function calls below
Unix is a timesharing system, which grants each
process time-slices
Allocation of time-slices has to be fair, so that one
long process cannot make other, shorter ones, wait
for extended periods (no starvation!)
Also, time-slicing has to be efficient, lest too much
processing time migrates into overhead, like a
typical state government
Interrupt
Def: Program interrupt is a transparent, non-scheduled change in execution flow with specific
cause outside the program, treated by an interrupt
handler, ending up at the original program again
Is unpredictable: Programmer does not know that,
when, why, where interrupt happens
Is unexpected: Programmer does not know when
interrupt happens, or whether it happens
Cannot be pinpointed (i.e. coded): Programmer
does not know where such an interrupt happens
Cause is known, passed to interrupt handler by
HW, yet the SW program (i.e. process) does not
know a-priori that it will happen; can be some
external event like power-outage, or time-slice
consumption (timer interrupt), numeric error
Interrupt
After handling, execution continues at place after the
interrupt
Challenge: if interrupt happens during execution of
some long instruction, say move of a large portion of
memory by a byte-move instruction
Challenge caused by general need of handling
interrupt swiftly, more swiftly than the time needed
for some long machine instructions!
Interrupt handled in a way that the program never
knows it was interrupted: transparent
Except that execution ends up slower than expected;
slower than if the interrupt had not happened
Note: x86 INT instruction is not an interrupt, though
it is named a “software interrupt” instruction; it is
totally predictable and locatable!
Command ps
Command ps without any option shows all
processes with the same controlling terminal and
same user id as the invoker of the ps command
Complex command with numerous options!
PID identifies the process, TT the controlling
terminal, S the state, and TIME the CPU time
consumed for that process
[process] ps
  PID TT      S  TIME COMMAND
 8687 pts/33  O  0:00 ps        -- O running
19212 pts/33  S  0:00 -csh      -- S sleeping
Command ps
The command ps -a prints information about all
active processes; can result in a long list, e.g.:
ps -a | more
  PID TT       S  TIME COMMAND
12229          Z  0:00
16386          Z  0:00
  523 console  S  0:00 /usr/lib/saf/ttymon -g -d /dev/console -l console -m ld
19668 pts/1    S  0:00 -tcsh
19691 pts/1    S  0:00 tcsh
19705 pts/1    S  0:07 pine
22394 pts/2    S  0:00 -bash
22412 pts/2    S  0:06 screen
. . .
Command ps
To abort, AKA kill, any process in Unix whose PID
is known, issue the command kill with argument
-9; the 9 is the number of signal SIGKILL, meaning “death”, see below:
kill -9 19186
Which in this particular case was the instructor’s
remote shell log-in to PSU’s computer, and as a
result he had to log in again from home
Process status Z means “zombie”, typically a child
process that has terminated, but the parent
process still needs to know its termination status
Killing a Z process has no effect, and rightly so; else
the parent would never know the status of the terminated
child; see [8] for detail
Background Process &
Some processes may run forever
Others in your environment may run for a limited
time but for quite long
What if you do not wish to wait for long process
completion before continuing with other work?
Possible in Unix with background processes
initiated by the & command modifier, such as:
run_big &
Which executes the long running program
run_big in the background; making progress
whenever CPU cycles are available
Your real interactive work may proceed in
parallel; on a UP, as implied here, not truly in parallel but concurrently!
Background Process &
What happens if this program generates output,
such as messages to stderr?
These would be interspersed with other output
generated concurrently, causing confusion!
To avoid mixing of messages, stderr can be
redirected to a separate file using the C-shell operator >&
g++ big_program.cpp >& my_errors &
. . . does accomplish that
It redirects via >& all output from stdout and stderr to a new
file, named: my_errors
. . . and then runs in the background, caused by &
So neither the messages nor the (long running)
background process hold you up
Command fork()
Unix fork() creates a new process by cloning the
current process issuing the fork() command
Returns 0 to child, and returns child’s PID to parent
Peculiar about a child process: it shares all
attributes with the forking parent process, except
the process id AKA PID, the parent PID AKA PPID,
locks and a few attributes preventing infinite
spawning!
To compile the fork() command in your C program,
#include <unistd.h>
Code and data space of child and parent process are
shared for reading only; when the child needs to
write (stack, heap) it receives its own copy with the
new modifications; AKA copy-on-write (<- Final!)
Spawning a new process via fork() creates a
second exit() action; i.e. both parent and child
need to exit eventually
Command fork()
Confusing about fork(): it creates one new process
by cloning once, but exiting twice!
Child can inquire about its own PID via getpid( )
A child process created via fork() resumes at the return from fork();
it does not repeat the fork() that created it, thus preventing infinite spawning!!
But child processes do execute other fork()
instructions, if also present in the parent program
Important: child and parent run concurrently
If parent and child had 2 processors, they could both run in
parallel, but the relative speeds shall be arbitrary! Make no
assumptions about their speeds!!
Hence on a UP these 2 executions are arbitrarily interleaved
E.g. outputs can be arbitrarily mixed, though strictly
sequential for parent and strictly sequential for the child
A sample shown next:
fork() Sample1 in C
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#define DEAD -1
// no shared global data, no shared data on heap!
void fork_sample1()
{ // fork_sample1
    int local = 1;       // just some data to track: which process?
    pid_t pid = fork();  // from now on: 2 processes
    switch ( pid ) {
        case 0:
            // this is thread through child process
            printf( "++local value in child = %d\n", ++local );
            break;
        case DEAD:
            // fork() failed: no child was created, still in the parent
            printf( "<><> local in dead process = %d\n", local );
            break;
        default:
            // clearly thread through parent process; pid is the child's PID
            printf( "parent pid = %d, --local = %d\n", pid, --local );
    } //end switch
    // strange? switch statement has multiple clauses executed! By design!
    printf( "Ending process %d\n", pid );
} //end fork_sample1
fork() Sample1 Output
[process] a.out
parent pid = 22853, --local = 0
Ending process 22853
++local value in child = 2
Ending process 0
[process]
• Note that local is 2, and not 1 in child process, and it
happens to be executed after parent; arbitrarily!
• So you infer: child has its own copy. It does not share
local with parent; else it would be 1, since parent
decreased local to 0
• Students: Are other outputs possible? How many?
fork() Sample2 in C++
#include <iostream>
#include <unistd.h>  // make fork() available
using namespace std;
#define DEAD -1      // define DEAD as before
// Bryant & O'Hallaron's "Computer Systems", chapter 8
void fork_sample2()
{ // fork_sample2
    switch ( fork() ) {
        case 0:
            // child process
            cout << 'C';
            break;
        case DEAD:
            // fork() failed: no child was created
            cout << " <><> DEAD process?";
            break;
        default:
            // parent process
            cout << 'P';
    } //end switch
    // in all cases, indicate end by emitting 'E'
    cout << 'E';
} //end fork_sample2
fork() Sample2 Output
[process] a.out
PECE[process]        note no carriage return: no endl!
• Would CEPE be possible?
• Would EECP be possible?
• Would CPEE be possible?
• Would PECE be possible?
• Would ECPE be possible?
fork() Sample3 in C++
#include <iostream>
#include <unistd.h>
using namespace std;
int main( void )
{ // main
    int number = 0;  // Note "number" on stack
    if ( 0 == fork() ) {
        cout << "PID: " << getpid()
             << " child process number = "
             << ++number << endl;
    } //end if
    cout << "PID: " << getpid()
         << " exiting with number = "
         << --number << endl;
    return 0;
} //end main
// how many output lines, students?
// Which will be the different values for "number"?
fork() Sample3 Output
[process] a.out
PID: 5587 exiting with number = -1
PID: 5588 child process number = 1
PID: 5588 exiting with number = 0
fork() Sample4 –Getting Interesting
#include <iostream>
#include <unistd.h>
using namespace std;
int main()
{ // main
    int number = 100;  // Note "number" on stack
    if ( 0 == fork() ) {  // first child creation
        cout << "PID: " << getpid() << " 1st child, number = "
             << ++number << endl;
    } //end if
    if ( 0 == fork() ) {  // second child creation
        cout << "PID: " << getpid() << " 2nd child. PPID: "
             << getppid() << ". ++number = " << ++number << endl;
    } //end if
    cout << "PID: " << getpid()
         << " exiting with number = " << --number << endl;
    return 0;
} //end main
fork() Sample4 Output
[process] a.out
PID: 4432 exiting with number = 99
PID: 4433 1st child, number = 101
PID: 4434 2nd child. PPID: 4432. ++number = 101
PID: 4434 exiting with number = 100
PID: 4433 exiting with number = 100
[process] PID: 4435 2nd child. PPID: 4433. ++number = 102
PID: 4435 exiting with number = 101
[process]            had to enter CR-LF
fork() Sample5 –Interesting
#include <iostream>
#include <cstdlib>   // exit()
#include <unistd.h>
using namespace std;
void fork_sample()
{ // fork_sample
    pid_t pid1 = fork();
    pid_t pid2 = fork();
    pid_t pid3 = fork();
    cout
        << " pid1 = " << pid1
        << " pid2 = " << pid2
        << " pid3 = " << pid3
        << endl;
} //end fork_sample
int main()
{ // main
    fork_sample();
    exit( 0 );
} //end main
fork() Sample5 Output
[process] a.out
pid1 = 24583 pid2 = 24584 pid3 = 24585
pid1 = 24583 pid2 = 24584 pid3 = 0
pid1 = 24583 pid2 = 0 pid3 = 24587
pid1 = 0 pid2 = 24586 pid3 = 24588
[process] pid1 = 24583 pid2 = 0 pid3 = 0
pid1 = 0 pid2 = 24586 pid3 = 0
pid1 = 0 pid2 = 0 pid3 = 0
pid1 = 0 pid2 = 0 pid3 = 24589
had to enter CR-LF
[process]
fork() Sample6
#include <stdio.h>   // printf()
#include <stdlib.h>  // exit()
#include <unistd.h>
void fork_sample()
{ // fork_sample
    pid_t pid1 = fork();  // 1
    pid_t pid2 = fork();  // 2
    pid_t pid3 = fork();  // 3
    pid_t pid4 = fork();  // 4
    printf( "I am %5d, parent= %5d, "
            "pid1= %5d, pid2= %5d, pid3= %5d, pid4= %5d\n",
            getpid(), getppid(), pid1, pid2, pid3, pid4 );
} //end fork_sample
int main()
{ // main
    fork_sample();
    exit( 0 );
} //end main
fork() Sample6 Output
I am 22255, parent= 21528, pid1= 22256, pid2= 22257, pid3= 22258, pid4= 22260
I am 22260, parent= 22255, pid1= 22256, pid2= 22257, pid3= 22258, pid4=     0
I am 22256, parent= 22255, pid1=     0, pid2= 22259, pid3= 22261, pid4= 22264
I am 22258, parent= 22255, pid1= 22256, pid2= 22257, pid3=     0, pid4= 22265
I am 22264, parent= 22256, pid1=     0, pid2= 22259, pid3= 22261, pid4=     0
I am 22257, parent= 22255, pid1= 22256, pid2=     0, pid3= 22263, pid4= 22268
I am 22261, parent= 22256, pid1=     0, pid2= 22259, pid3=     0, pid4= 22266
I am 22265, parent= 22258, pid1= 22256, pid2= 22257, pid3=     0, pid4=     0
I am 22259, parent= 22256, pid1=     0, pid2=     0, pid3= 22262, pid4= 22267
I am 22266, parent= 22261, pid1=     0, pid2= 22259, pid3=     0, pid4=     0
I am 22263, parent= 22257, pid1= 22256, pid2=     0, pid3=     0, pid4= 22270
I am 22268, parent= 22257, pid1= 22256, pid2=     0, pid3= 22263, pid4=     0
I am 22269, parent= 22262, pid1=     0, pid2=     0, pid3=     0, pid4=     0
I am 22270, parent= 22263, pid1= 22256, pid2=     0, pid3=     0, pid4=     0
I am 22267, parent= 22259, pid1=     0, pid2=     0, pid3= 22262, pid4=     0
I am 22262, parent= 22259, pid1=     0, pid2=     0, pid3=     0, pid4= 22269
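fork() Sample6 Graph
[Figure: process tree of the Sample6 run; each node shows the last two digits of a PID]
Shell Process: 28
Parent Process: 55
4 children of Parent: 56, 57, 58, 60
Further descendants: 59, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70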
Command execve()
So why does the fork() command exist, if all it
does is: make a copy of the current process?
fork() is just the first step of creating new
processes, including the execution of any Unix
commands, but also user programs, such as a.out
Therefore, code and data space have to be replaced
to spawn off a new, separate process
Unix command execve() accomplishes this
A clever resource saving: pages in memory are not
duplicated for new process; only modified pages
are: AKA copy-on-write
Side-effect of any fork() command is eventual
execution of two returns or exits, namely of parent
and child process; correspondingly, execve()
command creates no new return or exit
execve() Sample
#include <iostream>
#include <cstdlib>   // exit()
#include <unistd.h>  // fork(), execve()
using namespace std;
void fork_exec_sample( char * argv[], char * envp[] )
{ // fork_exec_sample
    pid_t pid = fork();  // parent + child exist now
    if ( 0 == pid ) {    // strange order of: 0 == . . .
        // child process
        cout << "run 'herb.o' program." << endl;
        // execve() does no PATH search: give the complete path,
        // or a path relative to the current directory
        if ( execve( "/u/herb/progs/process/herb.o", argv, envp ) < 0 )
            cout << "Aborting " << endl;
        exit( 0 );
    } //end if
    cout << "Created process " << pid << endl;
} //end fork_exec_sample
int main( int argc, char * argv[], char * envp[] )
{ // main
    fork_exec_sample( argv, envp );
    return 0;
} //end main
execve() Sample, uses herb.o
// new program = process to be initiated by execve()
// source named: herb.cpp
#include <stdio.h>
int main()
{ // main
printf( "<> SUCCESS <> You executed Herb\n" );
return 0;  // exits main()
} //end main
1. Compile this, and rename: a.out --> herb.o
2. Move herb.o into directory /u/herb/progs/process
[process] a.out
Created process 21010        students: why printed only once?
run 'herb.o' program.
[process] <> SUCCESS <> You executed Herb
References
1. IBM literature about Unix 8: http://www.ibm.com/developerworks/aix/library/au-speakingunix8/
2. Computer Systems, a Programmer’s Perspective, Bryant and O’Hallaron, Prentice Hall © 2011, 2nd ed.
3. Modern Operating Systems, Andrew S. Tanenbaum, Prentice Hall © 2011, second ed.
4. Operating System Concepts, Silberschatz and Galvin, Addison Wesley © 1998, fifth ed.
5. Posix Thread: https://computing.llnl.gov/tutorials/pthreads/#Thread
6. Thread, Hyperthread, Process: http://decipherinfosys.com/HyperthreadedDualCore.pdf
7. Intel Hyperthreading: http://www.intel.com/technology/itj/2002/volume06issue01/vol6iss1_hyper_threading_technology.pdf
8. Zombie process: http://en.wikipedia.org/wiki/Zombie_process
9. Eliminate threading errors: http://download-software.intel.com/en-us/sites/default/files/eliminate-threading-errors_studioxe-evalguide.pdf
10. Intel Timebase Utility for multi-threading and concurrency: http://software.intel.com/en-us/forums/topic/304224
11. MS multi-threading: http://msdn.microsoft.com/en-us/library/vstudio/172d2hhw.aspx
12. Threading help: http://software.intel.com/en-us/articles/automatic-parallelization-with-intel-compilers
13. Threading instructions from Intel: http://software.intel.com/en-us/articles/intel-guide-for-developing-multithreaded-applications