fork() and exec()

CS 201
Computer Systems Programming
Chapter 8
“Unix fork(), execve()”
Herbert G. Mayer, PSU CS
Status 11/17/2013
Syllabus
 Process, Thread, Hyperthread
 Unix Process
 Interrupt
 Command ps
 Background Process
 Command fork()
 fork() Sample
 Command execve()
 execve() Sample
 References
Process – SW View
 [Bryant, 16] “A process is the operating system’s
abstraction of a running program.”
 [Silberschatz, 10] “A process can be thought of as a
program in execution …”
 [Silberschatz, 90] “Informally, a process is a
program in execution. … A process is more than
the program code. It also includes the current
activity, as represented by the value of the
program counter and the contents of the
processor’s registers.”
 [Tanenbaum, 72] “A process is just an executing
program, including the current values of the program
counter, registers, and variables.”
Thread – SW View
 [Bryant, 947] “A thread is a logical flow that runs in the
context of a process. Each thread has its own thread context,
including a unique integer thread ID, stack, stack pointer,
program counter, general-purpose registers, and condition
codes. All threads running in a process share the entire
virtual address space of that process.”
 [Silberschatz, 103] “A thread, sometimes called a
lightweight process, is a basic unit of CPU utilization,
and consists of a program counter, a register set, and a
stack space. It shares with peer threads its code section,
data section, and operating-system resources such as
open files and signals.”
 [Tanenbaum, 81] “A thread has a program counter that
keeps track of which instruction to execute next. It has
registers, which hold its current working variables. It has a
stack, which contains the execution history, with one frame
for each procedure called, but not yet returned from. ... A
thread must execute in some process”
Thread Creation
 Who makes threads?
 The programmer understands which portions of a C program
depend on, and which ones are independent of, other parts
of the same program
 MS and Intel compilers provide tools to support this analysis
 Some C and C++ compilers provide directives
 Directives can be hidden as comments, to be transparent to
other compilers
 See lit ref [9] for Intel’s Parallel Studio, helping the programmer
eliminate threading errors in C and C++ programs
 Or see Intel’s [10] [12] to help the programmer “to thread” SW
 Or Microsoft Multi-Threading support for existing C and C++
source code [11]
Hyperthread – HW View
 A processor may be single-core (UP), or have multiple
cores, the latter meaning: all processor resources are
replicated; e.g. Intel Core 2 Duo
 Or a processor may be single-core, hyperthreaded, e.g.
Intel Pentium 4e. Hyperthreaded means that only CPU
registers and APIC are replicated, but not ALU units, such as
integer unit, floating-point unit, branch unit, caches, etc.
 Or a processor may be multi-core, hyperthreaded, in which
case each of several real cores has a hyperthread twin (or
more), sharing the real core’s ALU with all hyperthreads, but
each hyperthread has own register + APIC; e.g. Intel Core i7
 Hyperthreading is an old idea, proposed decades ago by
Digital Equipment Corp. (DEC), implemented first in silicon
by Intel in 2002 on Xeon® server + Pentium® 4 desktop CPUs
 Hyperthread is an overloaded term, referring to the reduced
Silicon core, as well as the active SW thread executing on it
 Should be named: Hypothread  since it is a thread-subset,
but Hyperthread IS THE accepted technical term
Hyperthread – Per Intel Website
 Ability of processor to run concurrent threads quickly
   Some microprocessor hardware replication that creates the
illusion to SW of Dual Processor (DP)
   Yet such HW with some resource replication is NOT a true
dual-core silicon implementation
   The execution unit is still shared between multiple threads
 Effect of Hyperthreading on Xeon® Processor:
   Average CPU utilization increases to ~50%, up from ~35%
for a typical uni-processor
   Up to ~30% performance gain for some applications with the
same processor frequency
 Hyperthreading Technology Results:
1. More performance with enabled applications
2. Better responsiveness with existing applications
Hyperthread – Per Intel Website
 Almost two Logical Processors
 Architecture state (registers) and APIC* replicated
 Shares execution units, caches, branch prediction, control logic
and buses
[Figure: two logical processors, each with its own Architecture State and Adv. Programmable Interrupt Control, sharing the On-Die Caches and the Processor Execution Resource, attached to the System Bus]
*APIC: Advanced Programmable Interrupt Controller. Handles
interrupts sent to a specified logical processor
Hyperthread
 Hyperthreaded core replicates in silicon all CPU resources
essential to switching fast from one thread to another
 Those replicated resources are registers and the APIC. ALU
units are not replicated on a hyperthread core
 SW hyperthreads are also threads, but execute concurrently
on HW hyperthread cores; switch is efficient due to available
registers; ALU-sharing is still necessary: Only a single ALU!
 This costs ~5% more HW (silicon) than a single core, but can
gain up to 30% performance improvement; great ROI!
Delta Between Process & Thread
 Process is an OS-centric view of a running program.
Due to the meaning of “running”, all machine
resources are necessary to execute a process
 A thread is a subset of a process, not necessarily a
proper subset
 One purpose for threading a process is to allow
continued execution of that one process, even when
some part of it has to wait; e.g. one thread waits for
an IO operation to finish, but another thread can
continue, being independent of the IO result
 Another purpose for having threads execute a process is to
speed up overall execution. This is possible if multiple
threads of the same process are sufficiently data-independent
to progress concurrently; never simultaneously on a single core!
 A threaded process can sometimes execute faster
on a single-core unit due to reduced waits
Delta Between Thread & Hyperthread
 A hyperthreaded core almost creates illusion of a
multi-core CPU, though there is only 1 complete CPU
 Enables concurrent execution on a uni-processor,
when one process thread stalls and another thread is
ready; switch to the other thread is cheap, since the
other register set already has proper state –except
the first time around
 But hyperthreading (or threading) per se never allows
parallel execution; only a multi-core architecture
does
 A System Programmer must know: On MP OS not yet
tuned for hyperthreading: best disable hyperthread
scheduling; else under the right (i.e. wrong)
circumstances performance degradation results

 This is the case when the “next core to be scheduled”
happens to be regularly the hyperthreaded subset-core,
leaving some other real core idle, while such a real core
could work instead of the hyperthread
Unix Process
 Under Unix, a process is an instance of running a
program. If you execute the ancient editor ed and
your colleague does too, then there are 2 --very
similar-- processes running
 All user a.out programs and all Unix commands,
when running, become processes; commands ll
and g++ my_prog.cpp create 2 different processes
 Processes are visible via ps command, and even
that command is a process in its own right
 Issue the command ps, for process status, and you
see your current processes, plus the ps process itself
 Issue ps -a, and you see a very detailed list,
including sleeping processes
 Issue the command man ps for your education
Unix Process
 When a command is issued, Unix starts a new
process and suspends the issuing process,
generally the C-shell, until the new child process
completes
 Exception: background processes with &
 Unix identifies every process by a Process
Identification Number (PID) assigned at initiation
 See getpid() function calls below
 Unix is a timesharing system, which grants each
process time-slices
 Allocation of time-slices has to be fair, so that one
long process cannot make other, shorter ones, wait
for extended periods (no starvation!)
 Also, time-slicing has to be efficient, lest too much
processing time migrates into overhead, like a
typical state government 
Interrupt
 Def: Program interrupt is a transparent, non-scheduled
change in execution flow with specific
cause outside the program, treated by an interrupt
handler, ending up at the original program again
 Is unpredictable: Programmer does not know that,
when, why, where interrupt happens
 Is unexpected: Programmer does not know when
interrupt happens, or whether it happens
 Cannot be pinpointed (i.e. coded): Programmer
does not know where such an interrupt happens
 Cause is known, passed to interrupt handler by
HW, yet the SW program (i.e. process) does not
know a-priori that it will happen; can be some
external event like power-outage, or time-slice
consumption (timer interrupt), numeric error
Interrupt
 After handling, execution continues at place after the
interrupt
 Challenge: if interrupt happens during execution of
some long instruction, say move of a large portion of
memory by a byte-move instruction
 Challenge caused by general need of handling
interrupt swiftly, more swiftly than the time needed
for some long machine instructions!
 Interrupt handled in a way that the program never
knows it was interrupted: transparent
 Except that execution ends up slower than expected;
slower than if the interrupt had not happened
 Note: x86 INT instruction is not an interrupt! Though
it is named a “software interrupt” instruction, it is
totally predictable, locatable!
Command ps
 Command ps without any option shows all
processes with the same controlling terminal and
same user id as the invoker of the ps command
 Complex command with numerous options!
 PID identifies the process, TT the controlling
terminal, S the state, and TIME the CPU time
consumed for that process
[process] ps
  PID TT      S  TIME COMMAND
 8687 pts/33  O  0:00 ps        -- O running
19212 pts/33  S  0:00 -csh      -- S sleeping
Command ps
The command ps -a prints information about all
active processes; can result in a long list, e.g.:
ps -a | more
  PID TT       S  TIME COMMAND
12229          Z  0:00
16386          Z  0:00
  523 console  S  0:00 /usr/lib/saf/ttymon -g -d /dev/console -l console -m ld
19668 pts/1    S  0:00 -tcsh
19691 pts/1    S  0:00 tcsh
19705 pts/1    S  0:07 pine
22394 pts/2    S  0:00 -bash
22412 pts/2    S  0:06 screen
. . .
Command ps
 To abort, AKA kill, any process in Unix whose PID
is known, issue the command kill with argument
-9, which sends signal 9 (SIGKILL), a signal the
process cannot catch or ignore; see below:
kill -9 19186
 Which in this particular case was the instructor’s
remote shell log-in to PSU’s computer, and as a
result he had to log in again  from home
 Process status Z means “zombie”, typically a child
process that has terminated, but the parent
process still needs to know its termination status
 Killing a Z process has no effect; good so, else
parent would never know status of terminated
child; see [8] for detail
Background Process &
 Some processes may run forever
 Others in your environment may run for a limited
time but for quite long
 What if you do not wish to wait for long process
completion before continuing with other work?
 Possible in Unix with background processes
initiated by the & command modifier, such as:
run_big &
 Which executes the long running program
run_big in the background; making progress
whenever CPU cycles are available
 Your real interactive work may proceed in
parallel; on a UP, as implied here, not in parallel but concurrently!
Background Process &
 What happens if this program generates output,
such as messages to stderr?
 These would be interspersed with other output
generated concurrently, causing confusion!
 To avoid mixing of messages, stderr can be
redirected to a separate file using the C-shell operator >&
g++ big_program.cpp >& my_errors &
 . . . does accomplish that
 It redirects via >& all output from stderr, together with
stdout, to a new file, named: my_errors
 . . . and then runs in the background, caused by &
 So neither the messages nor the (long running)
background process hold you up
Command fork()
 Unix fork() creates a new process by cloning the
current process issuing the fork() command
 Returns 0 to child, and returns child’s PID to parent
 Peculiar about a child process: it inherits a copy of all
attributes of the forking parent process, except
the process id AKA PID, the parent PID AKA PPID,
locks and a few other attributes
 To use fork() in your C program,
#include <unistd.h>
 Code and data space of child and parent process are
shared for reading only; when either process needs to
write (stack, heap, data) it receives its own copy of the
affected page; AKA copy-on-write (<- Final!)
 Spawning a new process via fork() creates a
second exit() action; i.e. both parent and child
need to exit eventually
Command fork()
 Confusing about fork(): it creates one new process
by cloning once, but exiting twice!
 Child can inquire about its own PID via getpid( )
 A child process created via fork() does not repeat the
fork() call that created it; it resumes right after that
call returns, thus preventing infinite spawning!!
 But child processes do execute other fork()
instructions that follow, if also present in the parent program
 Important: child and parent run concurrently
   If parent and child had 2 processors, they could both run in
parallel, but the relative speeds shall be arbitrary! Make no
assumptions about their speeds!!
   Hence on a UP these 2 executions are arbitrarily interleaved
   E.g. outputs can be arbitrarily mixed, though strictly
sequential for parent and strictly sequential for the child
 A sample shown next:
fork() Sample1 in C
#include <stdio.h>      // printf()
#include <sys/types.h>  // pid_t
#include <unistd.h>     // fork()
#define DEAD -1         // fork() returns -1 on failure

// no shared global data, no shared data on heap!
void fork_sample1()
{ // fork_sample1
    int local = 1;       // just some data to track: which process?
    pid_t pid = fork();  // from now on: 2 processes
    switch ( pid ) {
    case 0:
        // this is the thread through the child process
        printf( "++local value in child = %d\n", ++local );
        break;
    case DEAD:
        // fork() failed: no child was created, we are still in the parent
        printf( "<><> local in dead process = %d\n", local );
        break;
    default:
        // clearly the thread through the parent process; pid holds the child's PID
        printf( "parent pid = %d, --local = %d\n", (int) pid, --local );
    } //end switch
    // strange? switch statement has multiple clauses executed! By design!
    printf( "Ending process %d\n", (int) pid );
} //end fork_sample1
fork() Sample1 Output
[process] a.out
parent pid = 22853, --local = 0
Ending process 22853
++local value in child = 2
Ending process 0
[process]
•
Note that local is 2, and not 1 in child process, and it
happens to be executed after parent; arbitrarily!
•
So you infer: child has its own copy. It does not share
local with parent; else it would be 1, since parent
decreased local to 0
•
Students: Are other outputs possible? How many?
fork() Sample2 in C++
#include <iostream>   // cout
#include <unistd.h>   // make fork() available
using namespace std;
#define DEAD -1       // define DEAD as before

// Bryant & O'Hallaron's "Computer Systems", chapter 8
void fork_sample2()
{ // fork_sample2
    switch ( fork() ) {
    case 0:
        // child process
        cout << 'C';
        break;
    case DEAD:
        // fork() failed; no child exists
        cout << " <><> DEAD process?";
        break;
    default:
        // parent process
        cout << 'P';
    } //end switch
    // in all cases, indicate end by emitting 'E'
    cout << 'E';
} //end fork_sample2
fork() Sample2 Output
[process] a.out
PECE[process]     <- note no carriage return: no endl!
• Would CEPE be possible?
• Would EECP be possible?
• Would CPEE be possible?
• Would PECE be possible?
• Would ECPE be possible?
fork() Sample3 in C++
#include <iostream>   // cout, endl
#include <unistd.h>   // fork(), getpid()
using namespace std;

int main( void )
{ // main
    int number = 0;   // Note "number" on stack
    if ( 0 == fork() ) {
        cout << "PID: " << getpid()
             << " child process number = "
             << ++number << endl;
    } //end if
    cout << "PID: " << getpid()
         << " exiting with number = "
         << --number << endl;
    return 0;
} //end main
// how many output lines, students?
// Which will be the different values for "number"?
fork() Sample3 Output
[process] a.out
PID: 5587 exiting with number = -1
PID: 5588 child process number = 1
PID: 5588 exiting with number = 0
fork() Sample4 –Getting Interesting
#include <iostream>   // cout, endl; <iostream.h> is pre-standard
#include <unistd.h>   // fork(), getpid(), getppid()
using namespace std;
int main()
{ // main
int number = 100; // Note “number” on stack
if ( 0 == fork() ) { // first child creation
cout << "PID: " << getpid() << " 1st child, number = "
<< ++number << endl;
} //end if
if ( 0 == fork() ) { // second child creation
cout << "PID: " << getpid() << " 2nd child. PPID: "
<< getppid() << ". ++number = " << ++number << endl;
} //end if
cout << "PID: " << getpid()
<< " exiting with number = " << --number << endl;
return 0;
} //end main
fork() Sample4 Output
[process] a.out
PID: 4432 exiting with number = 99
PID: 4433 1st child, number = 101
PID: 4434 2nd child. PPID: 4432. ++number = 101
PID: 4434 exiting with number = 100
PID: 4433 exiting with number = 100
[process] PID: 4435 2nd child. PPID: 4433. ++number = 102
PID: 4435 exiting with number = 101
[process]     <- had to enter CR-LF
fork() Sample5 –Interesting
#include <cstdlib>    // exit()
#include <iostream>   // cout, endl
#include <unistd.h>   // fork()
using namespace std;
void fork_sample()
{ // fork_sample
pid_t pid1 = fork();
pid_t pid2 = fork();
pid_t pid3 = fork();
cout
<< " pid1 = " << pid1
<< " pid2 = " << pid2
<< " pid3 = " << pid3
<< endl;
} //end fork_sample
int main()
{ // main
fork_sample();
exit( 0 );
} //end main
fork() Sample5 Output
[process] a.out
pid1 = 24583 pid2 = 24584 pid3 = 24585
pid1 = 24583 pid2 = 24584 pid3 = 0
pid1 = 24583 pid2 = 0 pid3 = 24587
pid1 = 0 pid2 = 24586 pid3 = 24588
[process] pid1 = 24583 pid2 = 0 pid3 = 0
pid1 = 0 pid2 = 24586 pid3 = 0
pid1 = 0 pid2 = 0 pid3 = 0
pid1 = 0 pid2 = 0 pid3 = 24589
[process]     <- had to enter CR-LF
fork() Sample6
#include <stdio.h>    // printf()
#include <stdlib.h>   // exit()
#include <unistd.h>   // fork(), getpid(), getppid()

void fork_sample()
{ // fork_sample
    pid_t pid1 = fork();   // 1
    pid_t pid2 = fork();   // 2
    pid_t pid3 = fork();   // 3
    pid_t pid4 = fork();   // 4
    printf( "I am %5d, parent= %5d, "
            "pid1= %5d, pid2= %5d, pid3= %5d, pid4= %5d\n",
            getpid(), getppid(), pid1, pid2, pid3, pid4 );
} //end fork_sample

int main()
{ // main
    fork_sample();
    exit ( 0 );
} //end main
fork() Sample6 Output
I am 22255, parent= 21528, pid1= 22256, pid2= 22257, pid3= 22258, pid4= 22260
I am 22260, parent= 22255, pid1= 22256, pid2= 22257, pid3= 22258, pid4=     0
I am 22256, parent= 22255, pid1=     0, pid2= 22259, pid3= 22261, pid4= 22264
I am 22258, parent= 22255, pid1= 22256, pid2= 22257, pid3=     0, pid4= 22265
I am 22264, parent= 22256, pid1=     0, pid2= 22259, pid3= 22261, pid4=     0
I am 22257, parent= 22255, pid1= 22256, pid2=     0, pid3= 22263, pid4= 22268
I am 22261, parent= 22256, pid1=     0, pid2= 22259, pid3=     0, pid4= 22266
I am 22265, parent= 22258, pid1= 22256, pid2= 22257, pid3=     0, pid4=     0
I am 22259, parent= 22256, pid1=     0, pid2=     0, pid3= 22262, pid4= 22267
I am 22266, parent= 22261, pid1=     0, pid2= 22259, pid3=     0, pid4=     0
I am 22263, parent= 22257, pid1= 22256, pid2=     0, pid3=     0, pid4= 22270
I am 22268, parent= 22257, pid1= 22256, pid2=     0, pid3= 22263, pid4=     0
I am 22269, parent= 22262, pid1=     0, pid2=     0, pid3=     0, pid4=     0
I am 22270, parent= 22263, pid1= 22256, pid2=     0, pid3=     0, pid4=     0
I am 22267, parent= 22259, pid1=     0, pid2=     0, pid3= 22262, pid4=     0
I am 22262, parent= 22259, pid1=     0, pid2=     0, pid3=     0, pid4= 22269
fork() Sample6 Graph
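[Figure: process tree for Sample6; nodes are labeled with the last two digits of each PID. The Shell Process (28) starts the Parent Process (55), which has 4 children: 56, 57, 58, 60]
Per the parent column of the Sample6 output: 56 has children 59, 61, 64; 57 has 63, 68; 58 has 65; 59 has 62, 67; 61 has 66; 62 has 69; 63 has 70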
Command execve()
 So why does the fork() command exist, if all it
does is: make a copy of the current process?
 fork() is just the first step of creating new
processes, including the execution of any Unix
commands, but also user programs, such as a.out
 Therefore, code and data space have to be replaced
to spawn off a new, separate process
 Unix command execve() accomplishes this
 A clever resource saving: pages in memory are not
duplicated for new process; only modified pages
are: AKA copy-on-write
 Side-effect of any fork() command is eventual
execution of two returns or exits, namely of parent
and child process; correspondingly, execve()
command creates no new return or exit
execve() Sample
#include <cstdlib>    // exit()
#include <iostream>   // cout, endl
#include <unistd.h>   // fork(), execve()
using namespace std;

void fork_exec_sample( char * argv[], char * envp[] )
{ // fork_exec_sample
    pid_t pid = fork();   // parent + child exist now
    if ( 0 == pid ) {     // strange order of: 0 == . . .
        // child process
        cout << "run 'herb.o' program." << endl;
        // execve() does not search PATH: must yield complete path;
        // giving relative path also OK
        if ( execve( "/u/herb/progs/process/herb.o", argv, envp ) < 0 )
            cout << "Aborting " << endl;
        exit( 0 );        // reached only if execve() failed
    } //end if
    cout << "Created process " << pid << endl;
} //end fork_exec_sample

int main( int argc, char * argv[], char * envp[] )
{ // main
    fork_exec_sample( argv, envp );
    return 0;
} //end main
execve() Sample, uses herb.o
// new program = process to be initiated by execve()
// source named: herb.cpp
#include <stdio.h>
int main()
{ // main
printf( "<> SUCCESS <> You executed Herb\n" );
return 0;   // exits main()
} //end main
1. Compile this, and rename: a.out --> herb.o
2. Move herb.o into directory /u/herb/progs/process
[process] a.out
Created process 21010     <- students: why printed only once?
run 'herb.o' program.
[process] <> SUCCESS <> You executed Herb
References
1. IBM literature about Unix 8: http://www.ibm.com/developerworks/aix/library/au-speakingunix8/
2. Computer Systems, a Programmer’s Perspective, Bryant and O’Hallaron, Prentice Hall © 2011, 2nd ed.
3. Modern Operating Systems, Andrew S. Tanenbaum, Prentice Hall © 2011, second ed.
4. Operating System Concepts, Silberschatz and Galvin, Addison Wesley © 1998, fifth ed.
5. Posix Thread: https://computing.llnl.gov/tutorials/pthreads/#Thread
6. Thread, Hyperthread, Process: http://decipherinfosys.com/HyperthreadedDualCore.pdf
7. Intel Hyperthreading: http://www.intel.com/technology/itj/2002/volume06issue01/vol6iss1_hyper_threading_technology.pdf
8. Zombie process: http://en.wikipedia.org/wiki/Zombie_process
9. Eliminate threading errors: http://download-software.intel.com/en-us/sites/default/files/eliminate-threading-errors_studioxe-evalguide.pdf
10. Intel Timebase Utility for multi-threading and concurrency: http://software.intel.com/en-us/forums/topic/304224
11. MS multi-threading: http://msdn.microsoft.com/en-us/library/vstudio/172d2hhw.aspx
12. Threading help: http://software.intel.com/en-us/articles/automatic-parallelization-with-intel-compilers
13. Threading instructions from Intel: http://software.intel.com/en-us/articles/intel-guide-for-developing-multithreaded-applications
