PID

Linux Operating System
許 富 皓
1
NameSpace
2
Namespace [Michael Kerrisk]
Currently, Linux implements six different
types of namespaces.
 The CLONE_NEW* identifiers listed in
parentheses are the names of the
constants used to identify namespace
types when employing the namespace-related
APIs (clone(), unshare(), and setns()).

3
Six Linux Namespaces






Mount namespaces (CLONE_NEWNS, Linux 2.4.19)
UTS namespaces (CLONE_NEWUTS, Linux 2.6.19)
IPC namespaces (CLONE_NEWIPC, Linux 2.6.19)
PID namespaces (CLONE_NEWPID, Linux 2.6.24)
Network namespaces (CLONE_NEWNET, Linux 2.6.29)
User namespaces (CLONE_NEWUSER, Linux 3.8)
4
Goals of Namespace (1) [Michael Kerrisk]

The purpose of each namespace is to
wrap a particular global system resource
in an abstraction that makes it appear to
the processes within the namespace that
they have their own isolated instance of
the global resource.
5
Goals of Namespace (2) [Michael Kerrisk]

One of the overall goals of namespaces is
to support the implementation of containers,
a tool for lightweight virtualization (as well
as other purposes) that provides a group of
processes with the illusion that they are the
only processes on the system
6
PID Namespace [Michael Kerrisk]
The global resource isolated by PID
namespaces is the process ID number space.
 This means that processes in different PID
namespaces can have the same process ID.
 PID namespaces are used to implement
containers that can be migrated between host
systems while keeping the same process IDs
for the processes inside the container.

7
Process PID [Michael Kerrisk]
As with processes on a traditional Linux (or
UNIX) system, the process IDs within a PID
namespace are unique, and are assigned
sequentially starting with PID 1.
 Likewise, as on a traditional Linux system,
PID 1—the init process—is special: it is
the first process created within the
namespace, and it performs certain
management tasks within the namespace.

8
Creation of a New PID Namespace [Michael Kerrisk]

A new PID namespace is created by calling
clone() with the CLONE_NEWPID flag.
child_pid = clone(childFunc, child_stack,
                  CLONE_NEWPID | SIGCHLD, argv[1]);
9
PID Namespace Hierarchy[Michael Kerrisk]

PID namespaces form a hierarchy:
 A process can "see" only those processes
contained in its own PID namespace and in the
child namespaces nested below that PID
namespace.
 If the parent of the child created by clone() is
in a different namespace, the child cannot "see"
the parent; therefore, getppid() reports the
parent PID as being zero.
10
PID Namespace Hierarchy [text book]
11
/proc/PID Directory[Michael Kerrisk]

Within a PID namespace, the /proc/PID directories
show information only about
 processes
within that PID namespace
or
 processes within one of its descendant namespaces.
12
Mount a proc filesystem[Michael Kerrisk]


However, in order to make the /proc/PID directories
that correspond to a PID namespace visible, the proc
filesystem ("procfs" for short) needs to be mounted
from within that PID namespace.
From a shell running inside the PID namespace (perhaps
invoked via the system() library function), we can do
this using a mount command of the following form:
# mount -t proc proc /mount_point
13
Nested PID Namespaces[Michael Kerrisk]
PID namespaces are hierarchically nested in
parent-child relationships.
 Within a PID namespace, it is possible to see

 all
other processes in the same namespace,
as well as
 all processes that are members of descendant
namespaces.
14
“See” a Process [Michael Kerrisk]

Here, "see" means being able to make
system calls that operate on specific PIDs.
 e.g.,

using kill() to send a signal to process.
Processes in a child PID namespace cannot
see processes that exist (only) in the parent
PID namespace (or further removed
ancestor namespaces).
15
PID returned by getpid() [Michael Kerrisk]
A process will have one PID in each of the
layers of the PID namespace hierarchy
starting from the PID namespace in which
it resides through to the root PID
namespace.
 Calls to getpid() always report the PID
associated with the namespace in which
the process resides

16
Traditional init Process and Signals

The traditional Linux init process is treated
specially with respect to signals.
 The only signals that can be delivered to init are
those for which the process has established a
signal handler.
 All other signals are ignored.
 This prevents the init process, whose presence
is essential for the stable operation of the system,
from being accidentally killed, even by the
superuser.
17
The init Processes of Namespaces and Signals

PID namespaces implement some analogous
behavior for the namespace-specific init process.
 Other
processes in the namespace (even privileged
processes) can send only those signals for which the init
process has established a signal handler.
 Note that (as for the traditional init process) the kernel
can still generate signals for the PID
namespace init process in all of the usual circumstances

e.g.,

hardware exceptions,
terminal-generated signals such as SIGTTOU,

and expiration of a timer.

18
Signals from Ancestor Namespaces
Signals can be sent to the PID
namespace init process by processes in
ancestor PID namespaces.
 Again, only the signals for which
the init process has established a
handler can be sent, with two exceptions:

 SIGKILL
and
 SIGSTOP.
19
init Process and SIGKILL and SIGSTOP
When a process in an ancestor PID
namespace sends SIGKILL or
SIGSTOP to the init process, the signal is
forcibly delivered (and can't be caught).
 The SIGSTOP signal stops the init
process; SIGKILL terminates it.

20
Termination of init Processes

Since the init process is essential to the
functioning of the PID namespace, if the
init process is terminated by
SIGKILL (or it terminates for any other
reason), the kernel terminates all other
processes in the namespace by sending
them a SIGKILL signal.
21
Connection between Processes
and Namespaces
struct nsproxy *nsproxy;
22
Definition of struct nsproxy
struct nsproxy {
    atomic_t count;
    struct uts_namespace *uts_ns;
    struct ipc_namespace *ipc_ns;
    struct mnt_namespace *mnt_ns;
    struct pid_namespace *pid_ns;
    struct net *net_ns;
};
A nsproxy is shared by processes which share all namespaces.
As soon as a single namespace is cloned or unshared, the nsproxy
is copied.
23
struct nsproxy
A structure to contain pointers to all per-process
namespaces: fs (mount), uts, network, ipc, etc.
 'count' is the number of processes
holding a reference.

24
Initial Global Namespace
struct nsproxy init_nsproxy = {
.count = ATOMIC_INIT(1),
.uts_ns = &init_uts_ns,
#if defined(CONFIG_POSIX_MQUEUE)|| defined(CONFIG_SYSVIPC)
.ipc_ns = &init_ipc_ns,
#endif
.mnt_ns = NULL,
.pid_ns = &init_pid_ns,
#ifdef CONFIG_NET
.net_ns = &init_net,
#endif
};
25
Process Identification Number
Unix processes are always assigned a
number to uniquely identify them in their
namespace.
 This number is called the process
identification number or PID for short.
 Each process generated with fork or
clone is automatically assigned a new
unique PID value by the kernel.

26
Process ID

PIDs are numbered sequentially in each PID namespace:
the PID of a newly created process is normally the PID of
the previously created process increased by one.

Of course, there is an upper limit on the PID values; when
the kernel reaches such limit, it must start recycling the
lower, unused PIDs.

By default, the maximum PID number is PID_MAX_DEFAULT - 1
(32,767, or 4,095 with CONFIG_BASE_SMALL). The limit can be
raised, but never beyond PID_MAX_LIMIT - 1.
27
Maximum PID Number
#define PAGE_SHIFT 12
#define PAGE_SIZE (1UL << PAGE_SHIFT)
#define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000)
#define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \
    (sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT))
P.S.: With 4 KB pages, PID_MAX_LIMIT is equal to 2^15 (32,768) or, when sizeof(long) > 4, 2^22 (4,194,304).
#define PIDMAP_ENTRIES ((PID_MAX_LIMIT + 8*PAGE_SIZE - 1)/PAGE_SIZE/8)
P.S.: PIDMAP_ENTRIES is accordingly equal to 1 or 2^7 (128).
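The arithmetic behind these macros can be checked in user space. The following is only a sketch: the function names pid_max_default, pid_max_limit, and pidmap_entries are hypothetical, and the configuration values (CONFIG_BASE_SMALL, the page size, sizeof(long)) are passed in as parameters instead of coming from a kernel configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space mirror of the PID_MAX_DEFAULT macro. */
unsigned long pid_max_default(int base_small)
{
    return base_small ? 0x1000UL : 0x8000UL;
}

/* Mirror of PID_MAX_LIMIT; long_size stands in for sizeof(long). */
unsigned long pid_max_limit(int base_small, unsigned long page_size,
                            size_t long_size)
{
    if (base_small)
        return page_size * 8;
    return long_size > 4 ? 4UL * 1024 * 1024 : pid_max_default(base_small);
}

/* Mirror of PIDMAP_ENTRIES: number of pages needed for one bit per PID. */
unsigned long pidmap_entries(unsigned long limit, unsigned long page_size)
{
    assert(page_size > 0);
    return (limit + 8 * page_size - 1) / page_size / 8;
}
```

With a 4 KB page, one page holds 32,768 bits, so a limit of 2^15 needs a single page and a limit of 2^22 needs 128 pages.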
28
PIDs in PID Namespaces
Namespaces add some additional
complexity to how PIDs are managed.
 PID namespaces are organized in a
hierarchy.

29
A Process May Have Multiple PIDs
When a new namespace is created, all PIDs
that are used in this namespace are visible to
the parent namespace, but the child namespace
does not see PIDs of the parent namespace.
 However this implies that some processes are
equipped with more than one PID, namely,
one per namespace they are visible in. This
must be reflected in the data structures.

30
Global IDs
Global IDs are identification numbers that
are valid within the kernel itself and in the
initial global namespace.
 For each ID type, a given global identifier
is guaranteed to be unique in the whole
system.

31
Local IDs
Local IDs belong to a specific namespace
and are not globally valid.
 For each ID type, they are valid within the
namespace to which they belong, but
identifiers of identical type may appear
with the same ID number in a different
namespace.

32
Global PID and TGID

The global PID and TGID are directly
stored in the task_struct, namely, in
the elements pid and tgid:
typedef int __kernel_pid_t;
typedef __kernel_pid_t pid_t;
struct task_struct {
    ...
    pid_t pid;
    pid_t tgid;
    ...
};
33
PIDs and Processes

Linux associates a different PID with each
process or lightweight process in the
system.
 As
we shall see later in this chapter, there is a
tiny exception on multiprocessor systems.

This approach allows the maximum
flexibility, because every execution
context in the system can be uniquely
identified.
34
Threads in the Same Group Must
Have a Common PID

On the other hand, Unix programmers
expect threads in the same group to
have a common PID.
 For
instance, it should be possible to send a
signal specifying a PID that affects all threads
in the group.
 In fact, the POSIX 1003.1c standard states
that all threads of a multithreaded application
must have the same PID.
35
Thread Group
To comply with POSIX 1003.1c standard,
Linux makes use of thread groups.
 The identifier shared by the threads is the
PID of the thread group leader , that is,
the PID of the first lightweight process in
the group.
 The thread group ID of a thread group is
called TGID.

36
Process Groups

Modern Unix operating systems introduce the
notion of process groups to represent a job
abstraction.
 For example,
 in order to execute the command line:
$ ls | sort | more
a shell that supports process groups, such as bash, creates
a new group for the three processes corresponding to ls,
sort, and more.
 In this way, the shell acts on the three processes as if they
were a single entity (the job, to be precise).
37
Process Groups [waikato]
One important feature is that it is possible
to send a signal to every process in the
group.
 Process groups are used

 for
distribution of signals,
and
 by terminals to arbitrate requests for their
input and output.
Process Groups [waikato]

Foreground Process Groups
 A foreground process has read and write access to
the terminal.
 Every process in the foreground receives the SIGINT (^C),
SIGQUIT (^\), and SIGTSTP (^Z) signals.

Background Process Groups
 A background process does not have read access to
the terminal.
 If a background process attempts to read from its
controlling terminal, its process group will be sent a
SIGTTIN.
Group Leaders and Process Group IDs
Each process descriptor includes a field
containing the process group ID.
 Each group of processes may have a
group leader, which is the process whose
PID coincides with the process group ID.

40
Creation of a New Process Group
[Bhaskar]




A newly created process is initially inserted into
the process group of its parent.
The shell after doing a fork would explicitly call
setpgid to set the process group of the child.
The process group is explicitly set for purposes
of job control.
When a command is given at the shell prompt,
that process or processes (if there is piping) is
assigned a new process group.
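The step the shell performs after fork can be imitated with a few lines of C. This is a minimal sketch, assuming a POSIX system; the function name child_gets_own_process_group is hypothetical, and a real shell would do more work (for example, handing the terminal to the new group).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child and move it into a new process group of its own,
 * the way a job-control shell does for the first process of a job.
 * Returns 1 if the child became the leader of a new group,
 * 0 if it did not, and -1 if fork() or waitpid() failed.
 */
int child_gets_own_process_group(void)
{
    pid_t parent_pgid = getpgrp();
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        /* setpgid(0, 0): the caller joins a group whose ID is its own PID. */
        if (setpgid(0, 0) == -1)
            _exit(2);
        int ok = getpgrp() == getpid() && getpgrp() != parent_pgid;
        _exit(ok ? 0 : 1);
    }

    int status;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Before the setpgid call the child is still in its parent's group; afterwards its process group ID equals its own PID, which makes it the group leader.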
41
Login Sessions
Modern Unix kernels also introduce login
sessions.
 Informally, a login session contains all
processes that are descendants of the
process that has started a working session
on a specific terminal -- usually, the first
command shell process created for the
user.

42
Login Sessions vs. Process Groups


All processes in a process group must be in the
same login session.
A login session may have several process
groups active simultaneously;
 one
of these process groups is always in the
foreground, which means that it has access to the
terminal.
 The other active process groups are in the
background.


When a background process tries to access the terminal, it
receives a SIGTTIN or SIGTTOU signal.
In many command shells, the internal commands bg and fg
can be used to put a process group in either the background
or the foreground.
43
Representation of a PID Namespace
struct pid_namespace {
struct kref kref;
struct pidmap pidmap[PIDMAP_ENTRIES];
int last_pid;
unsigned int nr_hashed;
struct task_struct *child_reaper;
struct kmem_cache *pid_cachep;
unsigned int level;
struct pid_namespace *parent;
:
struct user_namespace *user_ns;
struct work_struct proc_work;
kgid_t pid_gid;
int hide_pid;
int reboot; /* group exit code if this pidns was rebooted */
unsigned int proc_inum;
};
44
child_reaper Field
Every PID namespace is equipped with a
process that assumes the role taken by
init in the global picture.
 One of the purposes of init is to call
wait4 for orphaned processes, and this
must likewise be done by the init
process of the namespace.
 A pointer to the task structure of this
process is stored in child_reaper.

45
parent Field
parent is a pointer to the parent namespace,
and level denotes the depth in the
namespace hierarchy.
 The initial namespace has level 0, any children
of this namespace are in level 1, children of
children are in level 2, and so on.
 Counting the levels is important because IDs
in higher levels must be visible in lower levels.

46
pidmap Field
struct pidmap {
atomic_t nr_free;
void *page;
};
#define PIDMAP_ENTRIES ((PID_MAX_LIMIT + 8*PAGE_SIZE - 1)/PAGE_SIZE/8)
struct pid_namespace {
:
struct pidmap pidmap[PIDMAP_ENTRIES];
:
}
47
PID bitmap [1][2][3]
To keep track of which PIDs have been
allocated and which are still free, the
kernel uses a large bitmap in which each
PID is identified by a bit.
 The value of the PID is obtained from the
position of the bit in the bitmap.

48
Allocate a Free PID

Allocating a free PID is then restricted
essentially to looking for the first bit in the
bitmap whose value is 0; this bit is then set
to 1.
static int alloc_pidmap(struct pid_namespace *pid_ns)
49
Free a PID

Freeing a PID can be implemented by
clearing the corresponding bit, that is,
changing it from 1 to 0.
static void free_pidmap(struct upid *upid)
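The bitmap idea can be sketched in user space as follows. This is not the kernel's alloc_pidmap or free_pidmap: the names pidmap_sketch, sketch_alloc_pid, and sketch_free_pid are hypothetical, the limit of 64 PIDs is an assumption chosen to keep the example small, and locking is left out entirely.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PID_MAX 64  /* assumed tiny limit, for illustration only */

struct pidmap_sketch {
    unsigned char bits[SKETCH_PID_MAX / CHAR_BIT]; /* one bit per PID */
    int last_pid;                                  /* most recently allocated PID */
};

/* Find the next free PID after last_pid, wrapping around; -1 if none is free. */
int sketch_alloc_pid(struct pidmap_sketch *map)
{
    for (int i = 1; i <= SKETCH_PID_MAX; i++) {
        int pid = (map->last_pid + i) % SKETCH_PID_MAX;
        if (pid == 0)
            continue;  /* PID 0 is never handed out in this sketch */
        unsigned char mask = (unsigned char)(1u << (pid % CHAR_BIT));
        if (!(map->bits[pid / CHAR_BIT] & mask)) {
            map->bits[pid / CHAR_BIT] |= mask;  /* mark the PID as in use */
            map->last_pid = pid;
            return pid;
        }
    }
    return -1;
}

/* Clear the bit so that the PID can be reused later. */
void sketch_free_pid(struct pidmap_sketch *map, int pid)
{
    assert(pid > 0 && pid < SKETCH_PID_MAX);
    map->bits[pid / CHAR_BIT] &= (unsigned char)~(1u << (pid % CHAR_BIT));
}
```

Because the search starts just after last_pid, a freed PID is not reused immediately; it is picked up again only after the counter wraps around.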
50
struct upid

struct upid represents the information
that is visible in a specific namespace.
struct upid {
/* Try to keep pid_chain in the same cacheline
as nr for find_vpid */
int nr;
struct pid_namespace *ns;
struct hlist_node pid_chain;
};
51
Fields of struct upid
nr represents the numerical value of an
ID, and ns is a pointer to the namespace
to which the value belongs.
 All upid instances are kept on a hash table
to which we will come in a moment, and
pid_chain allows for implementing hash
overflow lists with standard methods of the
kernel.

52
The Kernel-internal Representation of A PID

struct pid is the kernel-internal
representation of a PID.
struct pid
{
atomic_t count;
unsigned int level;
/* lists of tasks that use this pid */
struct hlist_head tasks[PIDTYPE_MAX];
struct rcu_head rcu;
struct upid numbers[1];
};
53
Type enum pid_type
enum pid_type
{
PIDTYPE_PID,
PIDTYPE_PGID,
PIDTYPE_SID,
PIDTYPE_MAX
};
Notice that thread group IDs are not contained
in this collection.
 This is because the thread group ID is simply
given by the PID of the thread group leader, so
a separate entry is not necessary.

54
level Field of struct pid
A process can be visible in multiple
namespaces, and the local ID in each
namespace will be different.
 level denotes in how many namespaces
the process is visible (in other words, this
is the depth of the containing namespace
in the namespace hierarchy).

55
numbers Field of struct pid

numbers contains a upid instance for
each level.
 Note
that the array consists formally of one
element, and this is true if a process is
contained only in the global namespace.
Since the element is at the end of the
structure, additional entries can be added to
the array by simply allocating more space.
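The "allocate more space" trick can be illustrated in user space. The sketch below is an assumption-laden stand-in: the types upid_flex and pid_flex and the function pid_flex_alloc are hypothetical, and it uses a C99 flexible array member where the kernel structure shown above declares numbers[1].

```c
#include <stdlib.h>

struct ns_flex;                   /* opaque stand-in for struct pid_namespace */

struct upid_flex {
    int nr;                       /* numeric ID inside one namespace */
    struct ns_flex *ns;           /* namespace the number belongs to */
};

struct pid_flex {
    unsigned int level;           /* index of the deepest namespace */
    struct upid_flex numbers[];   /* flexible array member: level + 1 entries */
};

/* Allocate a pid_flex with room for one upid per level 0..level. */
struct pid_flex *pid_flex_alloc(unsigned int level)
{
    size_t size = sizeof(struct pid_flex)
                + ((size_t)level + 1) * sizeof(struct upid_flex);
    struct pid_flex *pid = calloc(1, size);  /* zero-filled, NULL on failure */

    if (pid)
        pid->level = level;
    return pid;
}
```

A process created two levels below the initial namespace would use pid_flex_alloc(2) and then fill numbers[0] through numbers[2].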
56
Graphic Explanation of Field
struct upid numbers[]
[Figure: memory layout of struct pid. The fixed part holds count, tasks[PIDTYPE_PID], tasks[PIDTYPE_PGID], tasks[PIDTYPE_SID], and level; it is followed by numbers[0], numbers[1], and so on, each entry consisting of nr, ns, and pid_chain.]
57
tasks Field of struct pid
The definition of struct pid is headed by a
reference counter.
 tasks is an array with a list head for every ID
type. This is necessary because an ID can be
used for several processes.
 All task_struct instances that share a given
ID are linked on this list.
 PIDTYPE_MAX denotes the number of ID
types.

58
pids Field of struct task_struct

Since all task_struct structures that share
an identifier are kept on a list headed by
tasks, a list element is required in struct
task_struct:
struct task_struct {
...
/* PID/PID hash table linkage. */
struct pid_link pids[PIDTYPE_MAX];
...
};
59
struct pid_link
struct pid_link
{
struct hlist_node node;
struct pid *pid;
};
60
Graphic Explanation of Field
struct hlist_head tasks[]
[Figure: two struct pid instances (count, tasks[0] to tasks[2], level, and numbers[] entries with nr, ns, pid_chain) and two struct task_struct instances (pids[0] to pids[2], each with node and pid, plus group_leader). Each pids[i].node is linked into the list headed by tasks[i] of the struct pid that pids[i].pid points to.]
61
Create a New pid Instance

When a new process is created, a new pid instance is also created.
long do_fork(unsigned long clone_flags, unsigned long stack_start, unsigned long stack_size,
int __user *parent_tidptr, int __user *child_tidptr)
{
:
p = copy_process(clone_flags, stack_start, stack_size, child_tidptr, NULL, trace);
:
}
static struct task_struct *copy_process(unsigned long clone_flags, unsigned long stack_start,
unsigned long stack_size, int __user *child_tidptr, struct pid *pid, int trace)
{
:
if (pid != &init_struct_pid) {
retval = -ENOMEM;
pid = alloc_pid(p->nsproxy->pid_ns);
if (!pid)
goto bad_fork_cleanup_io;
}
:
}
62
pids[] Fields of a New Process [source code]

When a new process is created, the
pids[0] field of the task_struct of
the new process will be added to the linked
list headed at the tasks[0] field of the new
pid instance.
 If it is NOT a Thread Group Leader (TGL), the
pids[1] (i.e. pids[PIDTYPE_PGID]) and
pids[2] (i.e. pids[PIDTYPE_SID]) fields
of its task_struct will not be set.
63
Graphic Explanation of pids[]
Fields of a New non-TGL Process
[Figure: the struct task_struct of a new non-TGL process next to the struct task_struct of its thread group leader. Only pids[0] of the new process is linked to tasks[0] of its own struct pid; its group_leader field points to the thread group leader.]
64
Graphic Explanation of pids[]
Fields of a Session Leader Process
[Figure: the struct task_struct of a session leader (pids[0], pids[1], pids[2], group_leader) and its struct pid (count, tasks[0], tasks[1], tasks[2], level, numbers[]).]
65
Relationship between a PGL
and TGL [source code]

A Process Group Leader (PGL) must also
be a Thread Group Leader (TGL).
66
Graphic Explanation of pids[] Fields of
a New TGL Process [source code]

When a new process P1 is created by process P2 (P2
is not a TGL) and P2's Process Group Leader (PGL) is P3, the
pids[0].node field of the task_struct of P1 will be
added to the linked list headed at the tasks[0] field of the
new pid instance.
 If P1 is a TGL, the pids[1].node (i.e.
pids[PIDTYPE_PGID].node) of its task_struct will be
added to the linked list headed at the tasks[1] field of a
pid instance. That pid instance is pointed to by the
pids[PIDTYPE_PGID].pid of P3's task_struct.
 The pids[2].node of the task_struct of P1 is handled
similarly.
67
Graphic Explanation of Field
tasks[], if P2 is not a TGL
[Figure: three struct pid instances and the task_struct structures of P3, P2, and P1, showing how their pids[] entries (node, pid) are linked into the tasks[] lists when P2 is not a TGL.]
68
Graphic Explanation of Field
tasks[], if P2 is a TGL
[Figure: three struct pid instances and the task_struct structures of P3, P2, and P1, showing how their pids[] entries (node, pid) are linked into the tasks[] lists when P2 is a TGL.]
69
Linked List That Links All
Processes in a Process Group
If a process is a process group leader (P.S.: it must also be a
thread group leader), the tasks[1] field of the pid instance
pointed to by the pids[1].pid of its task_struct is the head of
the linked list that links the pids[1].node fields of the
task_struct structures of all thread group leaders in the same
process group.
[Figure: three struct pid instances and three task_struct structures; the tasks[1] field of the process group leader's struct pid heads the list.]
70
Thread Group

Processes in the same thread group are
chained together through the thread_group
field of their task_struct structures [1][2].
struct task_struct {
:
struct list_head thread_group;
:
}
71
Function attach_pid()

Suppose that a new instance of struct pid has
been allocated and set up for a given ID type. It is
attached to a task_struct structure as follows:
int fastcall attach_pid(struct task_struct *task, enum pid_type type,
struct pid *pid)
{
struct pid_link *link;
link = &task->pids[type];
link->pid = pid;
hlist_add_head_rcu(&link->node, &pid->tasks[type]);
return 0;
}
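The effect of attach_pid can be reproduced with an ordinary singly linked list. This is a simplified user-space sketch, not kernel code: all type and function names ending in _sketch are hypothetical, and the RCU-protected hlist is replaced by a plain next pointer.

```c
#include <stddef.h>

enum id_type_sketch { ID_PID, ID_PGID, ID_SID, ID_MAX };

struct pid_sketch;

struct link_sketch {
    struct link_sketch *next;            /* next task sharing the same ID */
    struct pid_sketch *pid;              /* the ID this link is attached to */
};

struct pid_sketch {
    struct link_sketch *tasks[ID_MAX];   /* one list head per ID type */
};

struct task_sketch {
    struct link_sketch pids[ID_MAX];     /* one link per ID type */
};

/* Point the task at the ID and push its link onto the ID's list. */
void attach_pid_sketch(struct task_sketch *task, enum id_type_sketch type,
                       struct pid_sketch *pid)
{
    struct link_sketch *link = &task->pids[type];

    link->pid = pid;
    link->next = pid->tasks[type];
    pid->tasks[type] = link;
}

/* Count how many tasks currently use the ID for the given type. */
int count_tasks_sketch(const struct pid_sketch *pid, enum id_type_sketch type)
{
    int n = 0;

    for (const struct link_sketch *l = pid->tasks[type]; l != NULL; l = l->next)
        n++;
    return n;
}
```

Attaching two tasks with ID_PGID to the same pid_sketch models two processes in one process group: both end up on the tasks[ID_PGID] list, with the most recently attached one at the head.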
72
attach_pid(p,PIDTYPE_PGID,pid)
[Figure: attach_pid(p, PIDTYPE_PGID, pid) sets p->pids[PIDTYPE_PGID].pid to pid and inserts p->pids[PIDTYPE_PGID].node at the head of the list pid->tasks[PIDTYPE_PGID].]
73
struct pid related Helper Functions


Obtain the pid instance associated with the
task_struct structure.
The auxiliary functions task_pid, task_tgid,
task_pgrp, and task_session are provided for
the different types of IDs.
static inline struct pid *task_pid(struct task_struct *task)
{ return task->pids[PIDTYPE_PID].pid; }
static inline struct pid *task_pgrp(struct task_struct *task)
{ return task->group_leader->pids[PIDTYPE_PGID].pid; }
74
Numerical PID related Helper
Functions (1)

Once the pid instance is available, the numerical
ID can be read off from the upid information
available in the numbers array in struct pid.
pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
{
struct upid *upid;
pid_t nr = 0;
if (pid && ns->level <= pid->level) {
upid = &pid->numbers[ns->level];
if (upid->ns == ns)
nr = upid->nr;
}
return nr;
}
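The level check and the namespace comparison can be tried out in user space. The sketch below copies the logic of pid_nr_ns onto simplified structures; the names ns_demo, upid_demo, pid_demo, and pid_nr_ns_demo are hypothetical, and the fixed array size of 3 is an assumption made to keep the example short.

```c
#include <stddef.h>

struct ns_demo {
    unsigned int level;            /* depth in the namespace hierarchy */
    struct ns_demo *parent;        /* NULL for the initial namespace */
};

struct upid_demo {
    int nr;                        /* local ID in namespace ns */
    struct ns_demo *ns;
};

struct pid_demo {
    unsigned int level;            /* level of the namespace the process lives in */
    struct upid_demo numbers[3];   /* assumed maximum depth for this sketch */
};

/* Return the ID of pid as seen from ns, or 0 if pid is not visible there. */
int pid_nr_ns_demo(const struct pid_demo *pid, const struct ns_demo *ns)
{
    int nr = 0;

    if (pid != NULL && ns->level <= pid->level) {
        const struct upid_demo *upid = &pid->numbers[ns->level];

        /* Same level is not enough: it must be the very same namespace. */
        if (upid->ns == ns)
            nr = upid->nr;
    }
    return nr;
}
```

A process that lives in a level-1 namespace has an ID at level 0 and at level 1. Asking from a sibling namespace at level 1, or from a deeper namespace, yields 0.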
75
Numerical PID related Helper
Functions (2)
pid_vnr returns the local PID seen from
the namespace to which the ID belongs.
 pid_nr obtains the global PID as seen from
the init process.
 Both rely on pid_nr_ns and automatically
select the proper level:
 0 for the global PID, and
 pid->level for the local one.
76
pid_t pid_nr_ns(struct pid *pid,
struct pid_namespace *ns)
[Figure: pid_nr_ns(pid, ns) uses ns->level to index pid->numbers[], checks that the ns field of that entry matches ns, and returns its nr.]
77
Return Value of the System Call
getpid( )


The getpid( ) system call returns the value
of TGID relative to the current process
instead of the value of PID, so all the threads
of a multithreaded application share the same
identifier.
Most processes belong to a thread group
consisting of a single member; as thread
group leaders, they have the TGID equal to
the PID, thus the getpid( ) system call
works as usual for this kind of process.
78
Return Value of the System Call
getpid( )
pid_t pid_vnr(struct pid *pid)
{
return pid_nr_ns(pid, task_active_pid_ns(current));
}
static inline pid_t task_tgid_vnr(struct task_struct *tsk)
{
return pid_vnr(task_tgid(tsk));
}
SYSCALL_DEFINE0(getpid)
{
return task_tgid_vnr(current);
}
79
Graphic Explanation of getpid( )
[Figure: getpid() follows current->group_leader to the struct pid of the thread group leader; pid_nr_ns then selects numbers[level], where level is the level of the current PID namespace, and returns its nr.]
80
pid_hash Hash Table

Hash table pid_hash is used to find the
pid instance that belongs to a numeric
PID value in a given namespace.
static struct hlist_head *pid_hash;
81
Size of pid_hash

pid_hash is used as an array of
hlist_head.

The number of elements is determined by
the RAM configuration of the machine and
lies between 2^4 = 16 and 2^12 = 4,096 (it
seems that the size used in kernel 3.9 is
16 [1][2]).
82
pidhash_init

pidhash_init computes the appropriate size
and allocates the required storage.
void __init pidhash_init(void)
{
    unsigned int i, pidhash_size;

    pid_hash = alloc_large_system_hash("PID", sizeof(*pid_hash), 0, 18,
                                       HASH_EARLY | HASH_SMALL,
                                       &pidhash_shift, NULL, 0, 4096);
    pidhash_size = 1U << pidhash_shift;
    for (i = 0; i < pidhash_size; i++)
        INIT_HLIST_HEAD(&pid_hash[i]);
}
83
Hash Function pid_hashfn
static unsigned int pidhash_shift = 4;
#define pid_hashfn(nr, ns) \
hash_long((unsigned long)nr + (unsigned long)ns, pidhash_shift)

hash_long returns a value between 0 and 15, if the
size of the hash table is 16.
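How a (nr, ns) pair ends up as a bucket index can be sketched with a generic multiplicative hash. The code below is an assumption, not the kernel's hash_long: the function names and the multiplier (a commonly used 64-bit golden-ratio constant) are chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplicative hash: keep the top `bits` bits of val * multiplier. */
unsigned int hash_sketch(uint64_t val, unsigned int bits)
{
    /* Assumed constant, roughly 2^64 divided by the golden ratio. */
    const uint64_t multiplier = 0x9E3779B97F4A7C15ULL;

    assert(bits >= 1 && bits <= 32);
    return (unsigned int)((val * multiplier) >> (64 - bits));
}

/* Combine the numeric ID and the namespace address, as pid_hashfn does. */
unsigned int pid_bucket_sketch(int nr, const void *ns, unsigned int shift)
{
    return hash_sketch((uint64_t)nr + (uint64_t)(uintptr_t)ns, shift);
}
```

With shift set to 4, every result lies between 0 and 15, so it can index a table of 16 list heads. Adding the namespace address means that the same numeric PID in two namespaces usually lands in different buckets.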
84
Add a upid Instance (or numbers[]
field) into the Hash Table pid_hash
struct pid *alloc_pid(struct pid_namespace *ns)
{
struct pid *pid;
enum pid_type type;
int i, nr;
struct pid_namespace *tmp;
struct upid *upid;
:
for ( ; upid >= pid->numbers; --upid) {
hlist_add_head_rcu(&upid->pid_chain,&pid_hash[pid_hashfn(upid->nr, upid->ns)]);
upid->ns->nr_hashed++;
}
:
}
85
Graphic Explanation of pid_hash
[Figure: the pid_hash array (pid_hash[0] to pid_hash[15]). Each bucket heads a chain of upid entries linked through pid_chain, and each upid entry (nr, ns, pid_chain) is embedded in the numbers[] array of a struct pid.]
86
Multiple PIDs of a Process
When a new process is created, it may be
visible in multiple namespaces.
 For each of them a local PID must be
generated.
 This is handled in alloc_pid:

87
Excerpt of alloc_pid()
struct pid *alloc_pid(struct pid_namespace *ns)
{
struct pid *pid;
enum pid_type type;
int i, nr;
struct pid_namespace *tmp;
struct upid *upid;
...
tmp = ns;
for (i = ns->level; i >= 0; i--) {
nr = alloc_pidmap(tmp);
...
pid->numbers[i].nr = nr;
pid->numbers[i].ns = tmp;
tmp = tmp->parent;
}
pid->level = ns->level;
...
}
88
Set the values of field numbers[]
of struct pid
Starting at the level of the namespace in
which the process is created, the kernel
goes down to the initial, global namespace
and creates a local PID for each.
 All upid that are contained in struct
pid are filled with the newly generated
PIDs.
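The walk from the creating namespace up to the initial namespace can be imitated in user space. In this sketch, all names ending in _lvl or _demo are hypothetical, a simple counter stands in for alloc_pidmap(), and the maximum depth of 4 is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS_DEMO 4          /* assumed maximum nesting depth */

struct ns_lvl {
    unsigned int level;            /* 0 for the initial namespace */
    struct ns_lvl *parent;         /* NULL for the initial namespace */
    int last_pid;                  /* counter standing in for alloc_pidmap() */
};

struct upid_lvl {
    int nr;
    struct ns_lvl *ns;
};

struct pid_lvl {
    unsigned int level;
    struct upid_lvl numbers[MAX_LEVELS_DEMO];
};

/* Give the new process one local ID in ns and in every ancestor of ns. */
void alloc_pid_demo(struct pid_lvl *pid, struct ns_lvl *ns)
{
    struct ns_lvl *tmp = ns;

    assert(ns->level < MAX_LEVELS_DEMO);
    for (int i = (int)ns->level; i >= 0; i--) {
        pid->numbers[i].nr = ++tmp->last_pid;  /* next ID in this namespace */
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;                     /* move one level up */
    }
    pid->level = ns->level;
}
```

The first process created in a fresh level-1 namespace therefore receives ID 1 there, while in the initial namespace it receives whatever ID comes next in that namespace.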

89
Obtain the pid Instance from a
numbers[] Field
void foo(struct pid_namespace *ns)
{
    struct upid *pnr;
    :
    container_of(pnr, struct pid, numbers[ns->level]);
    :
}
[Figure: pnr points at the numbers[ns->level] entry inside a struct pid; container_of subtracts the offset of that entry to recover the address of the enclosing struct pid.]
90
Function find_pid_ns()
struct pid *find_pid_ns(int nr, struct pid_namespace *ns)
{
struct upid *pnr;
hlist_for_each_entry_rcu(pnr, &pid_hash[pid_hashfn(nr, ns)], pid_chain)
if (pnr->nr == nr && pnr->ns == ns)
return container_of(pnr, struct pid, numbers[ns->level]);
return NULL;
}
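The lookup, including the step from a upid entry back to its enclosing struct pid, can be sketched in user space. Everything here is a simplified assumption: the names ending in _h or _demo are hypothetical, the hash is a placeholder, the chain is a plain next pointer without RCU, and the pointer arithmetic in pid_of_upid spells out what container_of does with numbers[ns->level].

```c
#include <stddef.h>

#define BUCKETS_DEMO 16

struct ns_h {
    unsigned int level;
};

struct upid_h {
    int nr;
    struct ns_h *ns;
    struct upid_h *next;           /* simplified hash chain */
};

struct pid_h {
    unsigned int level;
    struct upid_h numbers[2];      /* assumed maximum depth of 2 */
};

static struct upid_h *buckets_demo[BUCKETS_DEMO];

/* Placeholder hash; the kernel uses pid_hashfn() instead. */
static unsigned int bucket_of(int nr, const struct ns_h *ns)
{
    return ((unsigned int)nr + ns->level) % BUCKETS_DEMO;
}

/* Insert one upid entry at the head of its bucket. */
void hash_upid_demo(struct upid_h *upid)
{
    unsigned int b = bucket_of(upid->nr, upid->ns);

    upid->next = buckets_demo[b];
    buckets_demo[b] = upid;
}

/* Step back from numbers[level] to the start of the enclosing pid_h. */
static struct pid_h *pid_of_upid(struct upid_h *upid, unsigned int level)
{
    return (struct pid_h *)((char *)upid
                            - offsetof(struct pid_h, numbers)
                            - level * sizeof(struct upid_h));
}

struct pid_h *find_pid_ns_demo(int nr, struct ns_h *ns)
{
    for (struct upid_h *pnr = buckets_demo[bucket_of(nr, ns)]; pnr; pnr = pnr->next)
        if (pnr->nr == nr && pnr->ns == ns)
            return pid_of_upid(pnr, ns->level);
    return NULL;
}
```

Both the number and the namespace must match, so looking up a process's local ID in the wrong namespace finds nothing.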
91
Relationships among Processes





Processes created by a program have a
parent/child relationship.
When a process creates multiple children, these
children have sibling relationships.
Several fields must be introduced in a process
descriptor to represent these relationships with
respect to a given process P.
Processes 0 and 1 are created by the kernel.
Process 1 (init) is the ancestor of all other
processes.
92
Fields of a Process Descriptor Used to
Express Parenthood Relationships (1)

real_parent:
 points to the process descriptor of the process
that created P
or
 points to the descriptor of process 1 (init) if
the parent process no longer exists.

Therefore, when a user starts a background
process and exits the shell, the background
process becomes the child of init.
93
Fields of a Process Descriptor Used to
Express Parenthood Relationships (2)

parent:
 Points to the current parent of P.
This is the process that must be signaled when the
child process terminates.
 Its value usually coincides with that of
real_parent.
 It may occasionally differ, such as when
another process issues a ptrace( ) system
call requesting that it be allowed to monitor P
(see the section "Execution Tracing" in Chapter 20).
94
Fields of a Process Descriptor Used to
Express Parenthood Relationships (3)

struct list_head children:
 The head of the list containing all children created by P.
This list is formed through the sibling field of the child
processes.
struct list_head sibling:
 The pointers to the next and previous elements in the list of the
sibling processes, those that have the same parent as P.
P.S.:
/*
 * children/sibling forms the list of my natural children
 */
struct list_head children; /* list of my children */
struct list_head sibling;  /* linkage in my parent's children list */
95
Family Relationships between
Processes
96
Example

Process P0 successively created P1, P2,
and P3. Process P3, in turn, created
process P4.
The children/sibling fields form the list of
children of P0 (the links marked in the figure).
97
Other Relationship between Processes

There exist other relationships among
processes:
 a process can be a leader of a process
group or of a login session,
 it can be a leader of a thread group, and
 it can also trace the execution of other
processes (see the section "Execution
Tracing" in Chapter 20).
98
Other Process Relationship Fields
of a Process Descriptor P (1)

struct task_struct * group_leader
 Process
descriptor pointer of the thread group
leader of P.
99
Other Process Relationship Fields
of a Process Descriptor P (2)
/*
* ptraced is the list of tasks this task is using ptrace on.
* This includes both natural children and PTRACE_ATTACH targets.
* p->ptrace_entry is p's link on the p->parent->ptraced list.
*/
struct list_head ptraced;
struct list_head ptrace_entry;
[example]: function __ptrace_link()
100