Linked Lists

Linux Operating System
許 富 皓
1
Chapter 3 Processes
2
Non-circular Doubly Linked Lists

A sequence of nodes chained together through two kinds of pointers:
 a pointer to its previous node, and
 a pointer to its subsequent node.

Each node has two links:
 one points to the previous node, or to a null value or empty list if it is the first node, and
 one points to the next node, or to a null value or empty list if it is the final node.

3
Problems with Doubly Linked Lists

The Linux kernel contains hundreds of various data structures that are linked together through their respective doubly linked lists.

Drawbacks:
 a waste of programmers' effort to implement a set of primitive operations for each list, such as
  initializing the list,
  inserting and deleting an element,
  scanning the list;
 a waste of memory to replicate the primitive operations for each different list.
4
Data Structure struct list_head


Therefore, the Linux kernel defines the struct list_head data structure, whose only fields, next and prev, represent the forward and back pointers of a generic doubly linked list element, respectively.
It is important to note, however, that the pointers in a list_head field store
 the addresses of other list_head fields
rather than
 the addresses of the whole data structures in which the list_head structure is included.
5
A Circular Doubly Linked List with
Three Elements
[Figure: data structures 1, 2, and 3 each embed a list_head field; together with a fourth, standalone list_head, their next and prev pointers chain the four list_head fields into a circle.]
6
Macro LIST_HEAD(list_name)

A new list is created by using the LIST_HEAD(list_name) macro.
 It declares a new variable named list_name of type list_head, which is a dummy first element that acts as a placeholder for the head of the new list,
and
 it initializes the prev and next fields of the list_head data structure so as to point to the list_name variable itself.
7
Code of Macro
LIST_HEAD(list_name)
struct list_head
{
struct list_head *next, *prev;
};
#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) \
struct list_head name = LIST_HEAD_INIT(name)
8
An Empty Doubly Linked List
[Figure: after LIST_HEAD(my_list), both the next and prev fields of struct list_head my_list point back to my_list itself.]
9
Relative Functions and Macros (1)
list_add(n,p)
 list_add_tail(n,p)
 list_del(p)
 list_empty(p)
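The four primitives above can be sketched in user space as follows. This is a minimal illustration that reuses the list_head layout from the earlier slide; the helper list_insert_between is a hypothetical name, and clearing the pointers in list_del is a choice made for this sketch rather than a statement about the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* User-space copy of the list_head layout shown earlier. */
struct list_head {
    struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Hypothetical helper: link n between two adjacent elements. */
static void list_insert_between(struct list_head *n,
                                struct list_head *prev,
                                struct list_head *next)
{
    next->prev = n;
    n->next = next;
    n->prev = prev;
    prev->next = n;
}

/* list_add(n,p): insert n right after p. */
static void list_add(struct list_head *n, struct list_head *p)
{
    list_insert_between(n, p, p->next);
}

/* list_add_tail(n,p): insert n right before p (the tail when p is the head). */
static void list_add_tail(struct list_head *n, struct list_head *p)
{
    list_insert_between(n, p->prev, p);
}

/* list_del(p): unlink p from its list; this sketch also clears p's pointers. */
static void list_del(struct list_head *p)
{
    p->prev->next = p->next;
    p->next->prev = p->prev;
    p->next = p->prev = NULL;
}

/* list_empty(p): true when the head p points back to itself. */
static int list_empty(const struct list_head *p)
{
    return p->next == p;
}
```

Because the head is itself a list_head, insertion and removal need no special case for an empty list or for the first and last elements.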

10
Relative Functions and Macros (2)
list_for_each(p,h)
 list_for_each_entry(p,h,m)
 list_entry(p,t,m)
  Returns the address of the data structure of type t in which the list_head field that has the name m and the address p is included.
11
Example of list_entry(p,t,m)
struct class {
    char name[20];
    char teacher[20];
    struct student_pointer *student;
    struct list_head link;
};
struct class grad_1A;
struct list_head *poi;
poi = &(grad_1A.link);
list_entry(poi, struct class, link)   /* yields &grad_1A */

Layout of grad_1A (4-byte pointers):
 name     (20 bytes)
 teacher  (20 bytes)
 student  (4 bytes)
 link     next (4 bytes), prev (4 bytes)
12
Code of list_entry
typedef unsigned int __kernel_size_t;
typedef __kernel_size_t size_t;
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#define list_entry(ptr, type, member) \
({ \
    const typeof(((type *)0)->member) *__mptr = (ptr); \
    (type *)((char *)__mptr - offsetof(type, member)); \
})
13
Explanation of list_entry(p,t,m)
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))

list_entry(poi, struct class, link) expands to:
((struct class *)((char *)(poi)-(unsigned long)(&((struct class *)0)->link)))

[Figure: layout of grad_1A with name (20 bytes), teacher (20 bytes), student (4 bytes), and link (next and prev, 4 bytes each). poi points to link; offset is the distance from the start of the structure to link; list_entry() = poi - offset, which is the start of the structure.]
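The pointer arithmetic can be tried in user space. The sketch below uses the standard offsetof from <stddef.h> in place of the hand-written null-pointer expression; struct student is a hypothetical opaque type standing in for the slide's student pointer, and class_of is an illustrative helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

struct list_head {
    struct list_head *next, *prev;
};

/* Portable form: subtract the member's offset from the member's address. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct student;   /* hypothetical opaque type, only used through a pointer */

struct class {
    char name[20];
    char teacher[20];
    struct student *student;
    struct list_head link;
};

/* Illustrative helper: recover the struct class that embeds this link. */
static struct class *class_of(struct list_head *poi)
{
    return list_entry(poi, struct class, link);
}
```

With 4-byte pointers, as on the slide, the offset of link is 20 + 20 + 4 = 44 bytes; on a 64-bit machine the compiler's padding and 8-byte pointers give a larger offset, but the macro still returns the start of the enclosing structure.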
14
hlist_head
The Linux kernel 2.6 supports another kind of doubly linked list, which mainly differs from a list_head list because it is NOT circular.
 It is mainly used for hash tables.
 The list head is stored in an hlist_head data structure, which is simply a pointer to the first element in the list (NULL if the list is empty).
15
hlist_node

Each element is represented by an hlist_node data structure, which includes
 a pointer next to the next element
and
 a pointer pprev to the next field of the previous element.

Because the list is not circular, the next field of the last element is set to NULL. The first element has no previous element, so its pprev field points to the first field of the hlist_head instead.
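A user-space sketch of the two structures and of insertion and removal may make the role of pprev clearer. It is modeled on the kernel helpers of the same names but is only an illustration. In this sketch the first element's pprev points at the head's first field, which is what lets hlist_del unlink any node, including the first, without knowing the head.

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Insert n at the front of the list headed by h. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink n: overwrite whatever pointed at n with n's successor. */
static void hlist_del(struct hlist_node *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;     /* cleared in this sketch to mark "not on a list" */
    n->pprev = NULL;
}

/* The list is empty when the head points at no element. */
static int hlist_empty(const struct hlist_head *h)
{
    return h->first == NULL;
}
```

The head costs a single pointer instead of two, which matters for hash tables with thousands of buckets.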
16
A Non-circular Doubly Linked List
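[Figure: a struct hlist_head whose first field points to a chain of struct hlist_node elements; each node has a next pointer to the following node and a pprev pointer back toward the preceding element.]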
17
Functions and Macro for
hlist_head and hlist_node

The list can be handled by means of several helper functions and macros similar to those listed earlier for list_head: hlist_add_head( ), hlist_del( ), hlist_empty( ), hlist_entry, hlist_for_each_entry, and so on.
18
The Process List
The process list is a circular doubly linked list that links the process descriptors of all existing thread group leaders:
 Each task_struct structure includes a tasks field of type list_head whose prev and next fields point, respectively, to the previous and to the next task_struct element's tasks field.
19
The Head of the Process List
 The head of the process list is the init_task task_struct descriptor; it is the process descriptor of the so-called process 0 or swapper (see the section "Kernel Threads" later in this chapter).
 The tasks.prev field of init_task points to the tasks field of the process descriptor inserted last in the list.

20
Code for init_task
struct task_struct init_task = INIT_TASK(init_task);
#define INIT_TASK(tsk) \
{ \
    .state = 0, \
    .thread_info = &init_thread_info, \
    .usage = ATOMIC_INIT(2), \
    .flags = 0, \
    .lock_depth = -1, \
    .prio = MAX_PRIO-20, \
    .static_prio = MAX_PRIO-20, \
    .policy = SCHED_NORMAL, \
    .cpus_allowed = CPU_MASK_ALL, \
    .mm = NULL, \
    .active_mm = &init_mm, \
    .run_list = LIST_HEAD_INIT(tsk.run_list), \
    : \
    : \
}
21
Insert and Delete a Process
Descriptor from the Process List
The SET_LINKS and REMOVE_LINKS
macros are used to insert and to remove a
process descriptor in the process list,
respectively.
 These macros also take care of the
parenthood relationship of the process
(see the section "How Processes Are
Organized" later in this chapter).

22
Scans the Whole Process List with
Macro for_each_process
#define for_each_process(p) \
    for (p = &init_task; (p = list_entry((p)->tasks.next, \
                struct task_struct, tasks)) != &init_task; )


The macro starts by moving PAST init_task
to the next task and continues until it reaches
init_task again (thanks to the circularity of
the list).
At each iteration, the variable p passed as the
argument of the macro contains the address of
the currently scanned process descriptor, as
returned by the list_entry macro.
23
Example


Macro for_each_process scans the whole thread group leader list.
The macro is the loop control statement, after which the kernel programmer supplies the loop body.
e.g.
counter = 1; /* for init_task */
for_each_process(t)
{
    if (t->state == TASK_RUNNING)
        ++counter;
}
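The same loop can be exercised outside the kernel. In the sketch below, task_struct is a hypothetical stand-in trimmed to three fields, init_task is a plain static variable, and add_task and count_running are illustrative helpers; only the for_each_process macro itself follows the slide.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define TASK_RUNNING 0

/* Hypothetical, heavily trimmed stand-in for the kernel's task_struct. */
struct task_struct {
    int pid;
    long state;
    struct list_head tasks;
};

/* Stand-in for init_task: its tasks field acts as the head of the list. */
static struct task_struct init_task = {
    0, TASK_RUNNING, { &init_task.tasks, &init_task.tasks }
};

#define for_each_process(p) \
    for (p = &init_task; (p = list_entry((p)->tasks.next, \
                struct task_struct, tasks)) != &init_task; )

/* Illustrative helper: link t just before init_task, i.e. at the tail. */
static void add_task(struct task_struct *t)
{
    struct list_head *head = &init_task.tasks;

    t->tasks.prev = head->prev;
    t->tasks.next = head;
    head->prev->next = &t->tasks;
    head->prev = &t->tasks;
}

/* Count runnable tasks; init_task is counted up front because the
 * macro starts past it and stops when it comes around to it again. */
static int count_running(void)
{
    struct task_struct *t;
    int counter = (init_task.state == TASK_RUNNING) ? 1 : 0;

    for_each_process(t) {
        if (t->state == TASK_RUNNING)
            ++counter;
    }
    return counter;
}
```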
24
The Lists of TASK_RUNNING
Processes – in Early Linux Version




When looking for a new process to run on a
CPU, the kernel has to consider only the
runnable processes (that is, the processes in
the TASK_RUNNING state).
Earlier Linux versions put all runnable processes
in the same list called runqueue.
Because it would be too costly to maintain the
list ordered according to process priorities, the
earlier schedulers were compelled to scan the
whole list in order to select the "best" runnable
process.
Linux 2.6 implements the runqueue differently.
25
The Lists of TASK_RUNNING
Processes – in Linux Version 2.6



Linux 2.6 achieves the scheduler speedup by splitting the runqueue into many lists of runnable processes, one list per process priority.
Each task_struct descriptor includes a
run_list field of type list_head.
If the process priority is equal to k (a value
ranging between 0 and 139), the run_list field
links the process descriptor into the list of
runnable processes having priority k.
26
runqueue in a Multiprocessor
System

Furthermore, on a multiprocessor
system, each CPU has its own runqueue,
that is, its own set of lists of processes.
27
Trade-off of runqueue

runqueue is a classic example of making a data structure more complex to improve performance:
 to make scheduler operations more efficient, the runqueue list has been split into 140 different lists!
28
The Main Data Structures of a runqueue
The kernel must preserve a lot of data for
every runqueue in the system.
 The main data structures of a runqueue
are the lists of process descriptors
belonging to the runqueue.
 All these lists are implemented by a single
prio_array_t (= struct
prio_array ) data structure.

29
struct prio_array
struct prio_array
{
    unsigned int nr_active;
    unsigned long bitmap[BITMAP_SIZE];
    struct list_head queue[MAX_PRIO];
};
 nr_active: the number of process descriptors linked into the lists.
 bitmap: a priority bitmap; each flag is set if and only if the corresponding priority list is not empty.
 queue: the 140 heads of the priority lists.
30
The prio and array Field of a
Process Descriptor
The prio field of the process descriptor
stores the dynamic priority of the
process.
 The array field is a pointer to the
prio_array_t data structure of its
current runqueue.

 P.S.:
Each CPU has its own runqueue.
31
Scheduler-related Fields of a Process
Descriptor
[Figure: a prio_array_t with fields nr_active, bitmap[5], and queue[140]. Each queue[x] entry heads a list of task_struct descriptors linked through the prev and next pointers of their run_list fields; each descriptor also holds its prio value and an array pointer back to the prio_array_t.]
32
Function enqueue_task(p,array)

The enqueue_task(p,array)
function inserts a process descriptor
into a runqueue list; its code is
essentially equivalent to:
list_add_tail(&p->run_list, &array->queue[p->prio]);
__set_bit(p->prio, array->bitmap);
array->nr_active++;
p->array = array;
33
Function dequeue_task(p,array)

Similarly, the dequeue_task(p,array)
function removes a process descriptor
from a runqueue list.
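Both operations can be sketched together in user space. The code below is an illustration under stated assumptions: task_struct is trimmed to the three scheduler fields shown earlier, BITMAP_SIZE is computed with a local formula that merely guarantees enough bits, and init_array and test_bit are helpers written for this sketch. The dequeue logic mirrors the enqueue steps in reverse and clears the bitmap flag only when the priority list becomes empty.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_PRIO 140
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
/* Assumption for this sketch: enough longs to hold MAX_PRIO + 1 bits. */
#define BITMAP_SIZE ((MAX_PRIO + 1 + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct list_head { struct list_head *next, *prev; };

struct prio_array {
    unsigned int nr_active;
    unsigned long bitmap[BITMAP_SIZE];
    struct list_head queue[MAX_PRIO];
};

/* Hypothetical descriptor holding only the scheduler-related fields. */
struct task_struct {
    int prio;
    struct list_head run_list;
    struct prio_array *array;
};

/* Make every priority list empty and clear all flags. */
static void init_array(struct prio_array *a)
{
    size_t i;

    a->nr_active = 0;
    for (i = 0; i < BITMAP_SIZE; i++)
        a->bitmap[i] = 0;
    for (i = 0; i < MAX_PRIO; i++)
        a->queue[i].next = a->queue[i].prev = &a->queue[i];
}

static int test_bit(int nr, const unsigned long *map)
{
    return (int)((map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
}

static void enqueue_task(struct task_struct *p, struct prio_array *array)
{
    struct list_head *head = &array->queue[p->prio];

    /* Equivalent of list_add_tail(&p->run_list, head). */
    p->run_list.prev = head->prev;
    p->run_list.next = head;
    head->prev->next = &p->run_list;
    head->prev = &p->run_list;

    array->bitmap[p->prio / BITS_PER_LONG] |= 1UL << (p->prio % BITS_PER_LONG);
    array->nr_active++;
    p->array = array;
}

static void dequeue_task(struct task_struct *p, struct prio_array *array)
{
    struct list_head *head = &array->queue[p->prio];

    array->nr_active--;
    /* Equivalent of list_del(&p->run_list). */
    p->run_list.prev->next = p->run_list.next;
    p->run_list.next->prev = p->run_list.prev;
    /* Clear the flag only when this priority list became empty. */
    if (head->next == head)
        array->bitmap[p->prio / BITS_PER_LONG] &=
            ~(1UL << (p->prio % BITS_PER_LONG));
}
```

Keeping the bitmap in sync with the lists is what allows the scheduler to find the highest-priority non-empty list with a bit search instead of scanning 140 list heads.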
34
Relationships among Processes





Processes created by a program have a
parent/child relationship.
When a process creates multiple children, these
children have sibling relationships.
Several fields must be introduced in a process
descriptor to represent these relationships with
respect to a given process P.
Processes 0 and 1 are created by the kernel.
Process 1 (init) is the ancestor of all other
processes.
35
Fields of a Process Descriptor Used to
Express Parenthood Relationships (1)

real_parent:
 points to the process descriptor of the process that created P
or
 points to the descriptor of process 1 (init) if the parent process no longer exists.

Therefore, when a user starts a background process and exits the shell, the background process becomes the child of init.
36
Fields of a Process Descriptor Used to
Express Parenthood Relationships (2)

parent:
 Points to the current parent of P
  this is the process that must be signaled when the child process terminates.
  its value usually coincides with that of real_parent.
 It may occasionally differ, such as when another process issues a ptrace( ) system call requesting that it be allowed to monitor P.
  see the section "Execution Tracing" in Chapter 20.
37
Fields of a Process Descriptor Used to
Express Parenthood Relationships (3)

struct list_head children:
 The head of the list containing all children created by P.
 This list is formed through the sibling field of the child processes.

struct list_head sibling:
 The pointers to the next and previous elements in the list of the sibling processes, those that have the same parent as P.

P.S.:
/* children/sibling forms the list of my children plus the tasks I'm ptracing. */
struct list_head children; /* list of my children */
struct list_head sibling; /* linkage in my parent's children list */
38
The children Field of a Parent Process Points to
the sibling Field of a Child Process
#define add_parent(p, parent) \
    list_add_tail(&(p)->sibling, &(parent)->children)
=========================================================
#define SET_LINKS(p) \
do { \
    if (thread_group_leader(p)) \
        list_add_tail(&(p)->tasks, &init_task.tasks); \
    add_parent(p, (p)->parent); \
} while (0)
39
Iterate over a Process’s Children

Similarly, it is possible to iterate over a
process's children with
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

struct task_struct *task;
struct list_head *list;
list_for_each(list, &current->children)
{
    task = list_entry(list, struct task_struct, sibling);
    /* task now points to one of current's children */
}
40
Example

Process P0 successively created P1, P2,
and P3. Process P3, in turn, created
process P4.
The children/sibling fields form the list of children of P0 (the links marked in the original figure).
41
Process Groups

Modern Unix operating systems introduce the
notion of process groups to represent a job
abstraction.
 For example,
 in order to execute the command line:
$ ls | sort | more
a shell that supports process groups, such as bash, creates
a new group for the three processes corresponding to ls,
sort, and more.
 In this way, the shell acts on the three processes as if they
were a single entity (the job, to be precise).
42
Process Groups [waikato]
 One important feature is that it is possible to send a signal to every process in the group.
 Process groups are used
  for distribution of signals,
 and
  by terminals to arbitrate requests for their input and output.
Process Groups [waikato]

Foreground Process Groups
 A foreground process has read and write access to the terminal.
 Every process in the foreground receives the SIGINT (^C), SIGQUIT (^\), and SIGTSTP signals.

Background Process Groups
 A background process does not have read access to the terminal.
 If a background process attempts to read from its controlling terminal, its process group will be sent a SIGTTIN.
Group Leaders and Process Group
IDs
Each process descriptor includes a field
containing the process group ID.
 Each group of processes may have a
group leader, which is the process whose
PID coincides with the process group ID.

45
Creation of a New Process Group
[Bhaskar]




A newly created process is initially inserted into the process group of its parent.
After a fork, the shell explicitly calls setpgid to set the process group of the child.
The process group is explicitly set for purposes of job control.
When a command is given at the shell prompt, the resulting process, or processes if there is piping, is assigned a new process group.
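The fork-then-setpgid sequence can be sketched with the POSIX calls fork, setpgid, getpgrp, and waitpid. The function name spawn_in_new_group is hypothetical, and a real shell does considerably more (terminal handover, signal setup); this only shows a child becoming the leader of a fresh process group.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child and move it into a new process group whose ID equals the
 * child's PID. Returns 1 if the child saw itself as group leader, 0 if
 * it did not, and -1 on error.
 */
static int spawn_in_new_group(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: call setpgid too, so the outcome does not depend on
         * whether the parent or the child runs first. */
        setpgid(0, 0);
        _exit(getpgrp() == getpid() ? 0 : 1);
    }
    /* Parent, playing the shell: put the child in its own group.
     * This may fail harmlessly if the child has already exited. */
    setpgid(pid, pid);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling setpgid in both parent and child is the usual way to avoid a race: whichever runs first establishes the group, and the second call is a no-op or a harmless failure.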
46
Login Sessions
Modern Unix kernels also introduce login
sessions.
 Informally, a login session contains all
processes that are descendants of the
process that has started a working session
on a specific terminal -- usually, the first
command shell process created for the
user.

47
Login Sessions vs. Process Groups


All processes in a process group must be in the same login session.
A login session may have several process groups active simultaneously;
 one of these process groups is always in the foreground, which means that it has access to the terminal.
 The other active process groups are in the background.
  When a background process tries to access the terminal, it receives a SIGTTIN or SIGTTOU signal.
In many command shells, the internal commands bg and fg can be used to put a process group in either the background or the foreground.
48
Other Relationship between Processes

There exist other relationships among processes:
 a process can be a leader of a process group or of a login session,
 it can be a leader of a thread group, and
 it can also trace the execution of other processes (see the section "Execution Tracing" in Chapter 20).
49
Other Process Relationship Fields
of a Process Descriptor P (1)

struct task_struct * group_leader
 Process descriptor pointer of the group leader of P.

signal->pgrp
 PID of the group leader of P.

pid_t tgid
 PID of the thread group leader of P.

signal->session
 PID of the login session leader of P.
50
Other Process Relationship Fields
of a Process Descriptor P (2)

struct list_head ptrace_children
 The head of a list containing all children of P being traced by a debugger.

struct list_head ptrace_list
 The pointers to the next and previous elements in the real parent's list of traced processes (used when P is being traced).
51
More about ptrace_list and
ptrace_children[1]
/*
ptrace_list/ptrace_children forms
the list of my children that were stolen by a
ptracer.
*/
52
More about ptrace_list and
ptrace_children[2]
void __ptrace_link(task_t *child, task_t *new_parent)
{
    if (!list_empty(&child->ptrace_list))
        BUG();
    if (child->parent == new_parent)
        return;
    list_add(&child->ptrace_list, &child->parent->ptrace_children);
    REMOVE_LINKS(child);
    child->parent = new_parent;
    SET_LINKS(child);
}
53
The pidhash Table and Chained Lists

In several circumstances, the kernel must be able to derive the process descriptor pointer corresponding to a PID.
 This occurs, for instance, in servicing the kill( ) system call.
  When process P1 wishes to send a signal to another process, P2, it invokes the kill( ) system call specifying the PID of P2 as the parameter.
  The kernel derives the process descriptor pointer from the PID and then extracts the pointer to the data structure that records the pending signals from P2's process descriptor.
54
Multiple Hash Tables
The process descriptor includes fields that
represent different types of PID, and each
type of PID requires its own hash table.
 Due to the above reason, FOUR hash
tables have been introduced to derive the
process descriptor pointer
corresponding to a PID.

55
The Four Hash Tables and Corresponding
Fields in the Process Descriptor
Hash Table Type   Field Name   Description
PIDTYPE_PID       pid          PID of the process
PIDTYPE_TGID      tgid         PID of thread group leader process
PIDTYPE_PGID      pgrp         PID of the group leader process
PIDTYPE_SID       session      PID of the session leader process
56
Array pid_hash

The four hash tables are dynamically
allocated during the kernel initialization
phase, and their addresses are stored in
the pid_hash array.
static struct hlist_head *pid_hash[PIDTYPE_MAX];
[Figure: the pid_hash array, whose entries are of type struct hlist_head *.]
57
Initialization of the Four Hash
Tables -- pidhash_init(void)
for (i = 0; i < PIDTYPE_MAX; i++)
{
    pid_hash[i] = alloc_bootmem(pidhash_size * sizeof(*(pid_hash[i])));
    if (!pid_hash[i])
        panic("Could not alloc pidhash!\n");
    for (j = 0; j < pidhash_size; j++)
        INIT_HLIST_HEAD(&pid_hash[i][j]);
}
58
The Four Initialized Hash Tables
[Figure: each struct hlist_head * entry of pid_hash points to its own array of struct hlist_head elements, indexed [0] to [2047].]
59
PID and Table Index

The PID is transformed into a table index using the pid_hashfn macro, which expands to:
#define pid_hashfn(x) hash_long((unsigned long) x, pidhash_shift)

The pidhash_shift variable stores the length in bits of a table index (11, in our example).
The hash_long( ) function is used by many hash functions; on a 32-bit architecture it is essentially equivalent to:
unsigned long hash_long(unsigned long val, unsigned int bits)
{
    unsigned long hash = val * 0x9e370001UL;
    return hash >> (32 - bits);
}
Because in our example pidhash_shift is equal to 11, pid_hashfn yields values ranging between 0 and 2^11 - 1 = 2047.
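To try the hash on a machine where unsigned long is wider than 32 bits, the 32-bit behaviour can be reproduced with uint32_t. The names hash_long32 and PIDHASH_SHIFT are assumptions of this sketch; the multiplier and the shift come from the code above.

```c
#include <assert.h>
#include <stdint.h>

#define PIDHASH_SHIFT 11   /* width of a table index in bits, as in the example */

/* 32-bit variant of hash_long(): multiply, then keep the top `bits` bits. */
static uint32_t hash_long32(uint32_t val, unsigned int bits)
{
    uint32_t hash = val * UINT32_C(0x9e370001);   /* wraps modulo 2^32 */

    return hash >> (32 - bits);
}

/* Map a PID to an index in the range 0..2047. */
static uint32_t pid_hashfn(uint32_t pid)
{
    return hash_long32(pid, PIDHASH_SHIFT);
}
```

With these definitions, pid_hashfn(2890) and pid_hashfn(29384) both give index 199, and pid_hashfn(29385) gives index 1465. Counting from zero, those are the 200th and the 1,466th table elements.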
60
Colliding



As every basic computer science course
explains, a hash function does not always
ensure a one-to-one correspondence between
PIDs and table indexes.
Two different PIDs that hash into the same table
index are said to be colliding.
Linux uses chaining to handle colliding PIDs;
each table entry is the head of a doubly linked
list of colliding process descriptors.
61
Colliding Example

The following figure illustrates a PID hash table with two lists.
The processes having PIDs 2,890 and 29,384 hash into the 200th
element of the table, while the process having PID 29,385 hashes
into the 1,466th element of the table.
62
Properties of Data Structures Used
in the PID Hash Tables


The data structures used in the PID hash tables must keep track of the relationships between the processes.
As an example, suppose that the kernel must retrieve all processes belonging to a given thread group, that is, all processes whose tgid field is equal to a given number.
 Looking in the hash table for the given thread group number returns just one process descriptor, that is, the descriptor of the thread group leader.
 To quickly retrieve the other processes in the group, the kernel must maintain a list of processes for each thread group.
The same situation arises when looking for the processes
 belonging to a given login session
or
 belonging to a given process group.
63
Properties of a Process
Descriptor’s pids Field
The PID hash tables' data structures solve
all these problems, because they allow the
definition of a list of processes for any PID
number included in a hash table.
 The core data structure is an array of four
pid structures embedded in the pids field
of the process descriptor.

64
struct pid
struct pid
{
    int nr;
    struct hlist_node pid_chain;
    struct list_head pid_list;
};

nr:
 The PID number.
pid_chain:
 The links to the next and previous elements in the hash chain list.
pid_list:
 The head of the per-PID list.
65
The PID Hash Tables
66
pidhash-related Functions







do_each_task_pid(nr, type, task)
while_each_task_pid(nr, type, task)
find_task_by_pid_type(type, nr)
find_task_by_pid(nr)
attach_pid(task, type, nr)
detach_pid(task, type)
next_thread(task)
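How a lookup walks a hash chain can be sketched in user space. This is a deliberately simplified illustration, not the kernel's implementation: it keeps a single table instead of four, omits the pid_list field and the type argument, and uses the hypothetical names sketch_attach_pid and sketch_find_task_by_pid so they are not mistaken for the functions listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIDHASH_SHIFT 11
#define PIDHASH_SIZE  (1u << PIDHASH_SHIFT)

struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Simplified struct pid: only the hash-chain part is kept. */
struct pid {
    int nr;
    struct hlist_node pid_chain;
};

/* Hypothetical trimmed descriptor with a single PID type. */
struct task_struct {
    struct pid pid;
};

static struct hlist_head pid_hash[PIDHASH_SIZE];   /* zero-initialized: all chains empty */

static unsigned int pid_hashfn(int nr)
{
    return ((uint32_t)nr * UINT32_C(0x9e370001)) >> (32 - PIDHASH_SHIFT);
}

/* Record nr in the task and push the task onto the front of its chain. */
static void sketch_attach_pid(struct task_struct *task, int nr)
{
    struct hlist_head *h = &pid_hash[pid_hashfn(nr)];
    struct hlist_node *n = &task->pid.pid_chain;

    task->pid.nr = nr;
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Walk the chain for nr's bucket and return the matching task, if any. */
static struct task_struct *sketch_find_task_by_pid(int nr)
{
    struct hlist_node *n;

    for (n = pid_hash[pid_hashfn(nr)].first; n; n = n->next) {
        struct pid *pid =
            (struct pid *)((char *)n - offsetof(struct pid, pid_chain));

        if (pid->nr == nr)
            return (struct task_struct *)
                ((char *)pid - offsetof(struct task_struct, pid));
    }
    return NULL;
}
```

Colliding PIDs such as 2890 and 29384 end up on the same chain, so the comparison of nr inside the loop is what tells them apart.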
67