The kernel’s task list
Introduction to process descriptors
and their related data-structures
for Linux kernel version 2.6.10
Multi-tasking
• Modern operating systems allow multiple
users to share a computer’s resources
• Users are allowed to run multiple tasks
• The OS kernel must protect each task
from interference by other tasks, while
allowing every task to take its turn using
some of the processor’s available time
Stacks and task-descriptors
• To manage multitasking, the OS needs to
use a data-structure which can keep track
of every task’s progress and usage of the
computer’s available resources (physical
memory, open files, pending signals, etc.)
• Such a data-structure is called a ‘process
descriptor’ – every active task needs one
• Every task needs its own ‘private’ stack
What’s on a program’s stack?
Upon entering ‘main()’:
• A program’s exit-address is on user stack
• Command-line arguments on user stack
• Environment variables are on user stack
During execution of ‘main()’:
• Function parameters and return-addresses
• Storage locations for ‘automatic’ variables
Entering the kernel…
A user process enters ‘kernel-mode’:
• when it decides to execute a system-call
• when it is ‘interrupted’ (e.g. by the timer)
• when ‘exceptions’ occur (e.g. divide by 0)
Switching to a different stack
• Entering kernel-mode involves not only a
‘privilege-level transition’ (from level 3 to
level 0), but also a stack-area ‘switch’
• This is necessary for robustness:
e.g., the user-mode stack might be exhausted
• This is desirable for security:
e.g., privileged data might otherwise be accessible from user-mode
What’s on the kernel stack?
Upon entering kernel-mode:
• task’s registers are saved on kernel stack
(e.g., address of task’s user-mode stack)
During execution of kernel functions:
• Function parameters and return-addresses
• Storage locations for ‘automatic’ variables
Supporting structures
• So every task, in addition to having its own
code and data, will also have a stack-area
that is located in user-space, plus another
stack-area that is located in kernel-space
• Each task also has a process-descriptor
which is accessible only in kernel-space
A task’s virtual-memory layout
[Diagram: kernel space (privilege-level 0) holds the process descriptor and kernel-mode stack; user space (privilege-level 3) holds the user-mode stack-area and the task’s code and data]
Something new in 2.6
• Linux uses part of a task’s kernel-stack
page-frame to store ‘thread information’
• The thread-info includes a pointer to the
task’s process-descriptor data-structure
[Diagram: the task’s kernel-stack and the task’s thread-info share one kernel page-frame; the thread-info points to the task’s process-descriptor (struct task_struct)]
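The arrangement above can be modelled in ordinary user-space C as a union that overlays the thread-info on the low end of the stack area. This is only a sketch: the 4096-byte size is an assumption, and every name with a `demo_` or `DEMO_` prefix is a stand-in invented here, not the kernel's real definition.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_THREAD_SIZE 4096   /* assumed size of the kernel-stack area */

struct demo_task_struct {       /* stand-in for the process descriptor */
    int pid;
};

struct demo_thread_info {       /* stand-in: only the field we need here */
    struct demo_task_struct *task;
};

/* The thread-info and the stack share one area: the thread-info sits at
   the lowest address, and the stack grows down toward it from the top. */
union demo_thread_union {
    struct demo_thread_info thread_info;
    unsigned char stack[DEMO_THREAD_SIZE];
};

/* Bytes left for the stack itself once the thread-info is accounted for. */
size_t demo_stack_room(void)
{
    return sizeof(union demo_thread_union) - sizeof(struct demo_thread_info);
}
```

Because the two share one area, a larger thread-info leaves less room for the stack.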
Tasks have ‘states’
• From kernel-header: <linux/sched.h>
• #define TASK_RUNNING            0
• #define TASK_INTERRUPTIBLE      1
• #define TASK_UNINTERRUPTIBLE    2
• #define TASK_ZOMBIE             4
• #define TASK_STOPPED            8
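A small user-space sketch shows how such state values might be turned into readable names, and why the non-zero values being distinct powers of two is convenient. The numeric values are copied from this slide; the `demo_` functions and the name strings are hypothetical helpers, not part of the kernel.

```c
#include <assert.h>
#include <string.h>

/* State values as listed on this slide. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_ZOMBIE           4
#define TASK_STOPPED          8

/* Map a state value to a short name (the strings are our own choice). */
const char *demo_state_name(long state)
{
    switch (state) {
    case TASK_RUNNING:          return "running";
    case TASK_INTERRUPTIBLE:    return "interruptible";
    case TASK_UNINTERRUPTIBLE:  return "uninterruptible";
    case TASK_ZOMBIE:           return "zombie";
    case TASK_STOPPED:          return "stopped";
    default:                    return "unknown";
    }
}

/* Each non-zero state is a separate bit, so one bitwise AND can test
   whether a task is in any of several states at once. */
int demo_state_in(long state, long mask)
{
    return (state & mask) != 0;
}
```

For example, `demo_state_in(state, TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)` asks whether a task is sleeping in either way. TASK_RUNNING is zero, so it has to be compared directly.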
Fields in a process-descriptor
struct task_struct
{
	volatile long		state;
	struct thread_info	*thread_info;
	unsigned long		flags;
	struct mm_struct	*mm;
	pid_t			pid;
	char			comm[16];
	/* plus many other fields */
};
Finding a task’s ‘thread-info’
• During a task’s execution in kernel-mode, it’s
very quick to find that task’s thread-info object
• Just use two assembly-language instructions:
	movl	$0xFFFFF000, %eax
	andl	%esp, %eax
Ok, now %eax = the thread-info’s base-address
There’s a macro that implements this computation
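The same computation can be written in user-space C to see what the mask does. This is a sketch only: a plain 32-bit integer stands in for %esp, and `demo_thread_info_base` is a hypothetical name, not the kernel's macro.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_STACK_MASK 0xFFFFF000u   /* the mask from the slide */

/* Clear the low 12 bits of a stack-pointer value. The result is the
   start of the 4096-byte area that contains that address. */
uint32_t demo_thread_info_base(uint32_t esp)
{
    return esp & DEMO_STACK_MASK;
}
```

With a made-up stack-pointer value of 0xC1234ABC, the function returns 0xC1234000. Any address inside the same 4096-byte area gives the same result.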
Finding the task-descriptor
• Use a macro ‘current_thread_info()’ to get
a pointer to the ‘thread_info’ structure:
struct thread_info *info = current_thread_info();
• Then one more step gets you the address
of the task’s process-descriptor:
struct task_struct *task = info->task;
• You can also use ‘current’ to perform this
two-step assignment: task = current;
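The two steps can be tried out in user space by carving an aligned "stack area" out of ordinary heap memory. Everything here is a model under stated assumptions: the 4096-byte area size, the `demo_` names, and the idea of passing the stack address as a parameter are all invented for illustration (the kernel reads the real stack pointer instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_THREAD_SIZE 4096u   /* assumed size and alignment of the area */

struct demo_task_struct { int pid; };
struct demo_thread_info { struct demo_task_struct *task; };

/* Step 1: mask any address inside the area to find the thread-info. */
struct demo_thread_info *demo_thread_info_at(const void *sp)
{
    uintptr_t base = (uintptr_t)sp & ~(uintptr_t)(DEMO_THREAD_SIZE - 1);
    return (struct demo_thread_info *)base;
}

/* Step 2: follow the thread-info's pointer to the task descriptor. */
struct demo_task_struct *demo_current(const void *sp)
{
    return demo_thread_info_at(sp)->task;
}

/* Build an aligned stack area for 'task'. Twice the needed space is
   requested so that an aligned area always fits inside the block.
   The caller frees *raw_out when finished. Returns NULL on failure. */
unsigned char *demo_setup_stack(struct demo_task_struct *task, void **raw_out)
{
    unsigned char *raw = malloc(2 * DEMO_THREAD_SIZE);
    if (raw == NULL)
        return NULL;

    uintptr_t p = (uintptr_t)raw;
    size_t skip = (DEMO_THREAD_SIZE - (p % DEMO_THREAD_SIZE)) % DEMO_THREAD_SIZE;
    unsigned char *area = raw + skip;

    ((struct demo_thread_info *)area)->task = task;  /* thread-info at base */
    *raw_out = raw;
    return area;
}
```

Any address between `area` and `area + DEMO_THREAD_SIZE - 1` leads back to the same task, which is why the lookup needs no search.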
Parenthood
• New tasks get created by calling ‘fork()’
• Old tasks get terminated by calling ‘exit()’
• When ‘fork()’ is called, two tasks return
• One task is known as the ‘parent’ process
• And the other is called the ‘child’ process
• The kernel keeps track of this relationship
A parent can have many children
• If a user task calls ‘fork()’ twice, that will
create two distinct ‘child’ processes
• These children are called ‘siblings’
• The kernel keeps track of all this with lists of pointers
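From the user side, the parent/child relationship can be seen with a short POSIX program. This is a minimal sketch: `demo_spawn_children` is a hypothetical helper, and the children do nothing except terminate.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork up to 'count' children that exit at once, then wait for them.
   Returns the number of children the parent reaped. */
int demo_spawn_children(int count)
{
    int created = 0;
    int reaped = 0;

    for (int i = 0; i < count; i++) {
        pid_t pid = fork();      /* returns twice: in parent and in child */
        if (pid < 0)
            break;               /* fork failed: stop creating children */
        if (pid == 0)
            _exit(0);            /* child: fork() returned zero */
        created++;               /* parent: fork() returned the child's pid */
    }

    /* The parent collects the exit status of each of its children. */
    while (reaped < created && wait(NULL) > 0)
        reaped++;

    return reaped;
}
```

Calling `demo_spawn_children(2)` creates two siblings that share the calling process as their parent.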
Parenthood relationships
[Diagram: parenthood links among five example tasks, P1 through P5]
See “Linux Kernel Programming”
(Chapter 3) for additional details
The kernel’s ‘task-list’
• Kernel keeps a list of process descriptors
• A ‘doubly-linked’ circular list is used
• The ‘init_task’ serves as a fixed header
• Other tasks are inserted/deleted dynamically
• Tasks have forward & backward pointers,
implemented as members of the ‘tasks’ field
• To go forward: task = next_task( task );
• To go backward: task = prev_task( task );
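The list mechanics can be sketched in user space. This is a simplified model: here the links are stored directly in a hypothetical `struct demo_task`, whereas the kernel embeds a generic list node (the ‘tasks’ field) in each descriptor and converts back with macros. All `demo_` names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct demo_task {
    int pid;
    struct demo_task *next;   /* forward link  */
    struct demo_task *prev;   /* backward link */
};

/* An empty circular list: the header points at itself both ways. */
void demo_list_init(struct demo_task *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 't' just before the header, i.e. at the tail of the list. */
void demo_list_add_tail(struct demo_task *head, struct demo_task *t)
{
    t->prev = head->prev;
    t->next = head;
    head->prev->next = t;
    head->prev = t;
}

/* Unlink 't' from whatever list it is on. */
void demo_list_del(struct demo_task *t)
{
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->next = t;
    t->prev = t;
}

/* Walk forward from the header until we come back around to it. */
int demo_count_tasks(struct demo_task *head)
{
    int n = 0;
    struct demo_task *t = head;

    do {
        n++;
        t = t->next;
    } while (t != head);
    return n;
}
```

Because the list is circular and the header is never removed, a walk can start at the header and stop when it gets back there, with no NULL checks.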
Doubly-linked circular list
[Diagram: circular list headed by init_task (pid=0); next_task links run forward to the newest task and wrap around, prev_task links run backward]
Demo
• We can write a module that lets us create
a pseudo-file (named ‘/proc/tasklist’) for
viewing the list of all currently active tasks
• Our ‘tasklist.c’ module shows the name
and process-ID of each task, along with
the task’s current state
• Use the command: $ cat /proc/tasklist
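The formatting step of such a module can be tried out in user space before touching kernel code. The sketch below is not the ‘tasklist.c’ module: it fills a buffer from an array of mock descriptors, and the `demo_` names, the output format, and the sample tasks are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct demo_task {            /* mock descriptor: three fields only */
    int  pid;
    char comm[16];
    long state;
};

/* Write one line per task into 'buf'. Stops before any line that would
   not fit, so the buffer never ends in a truncated line.
   Returns the number of characters written (excluding the final NUL). */
int demo_tasklist(char *buf, size_t size,
                  const struct demo_task *tasks, size_t n)
{
    size_t len = 0;

    if (size == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int w = snprintf(buf + len, size - len,
                         "pid=%d comm=%s state=%ld\n",
                         tasks[i].pid, tasks[i].comm, tasks[i].state);
        if (w < 0 || (size_t)w >= size - len) {
            buf[len] = '\0';  /* discard the partial line */
            break;
        }
        len += (size_t)w;
    }
    return (int)len;
}
```

In a real module the loop would walk the kernel's task list instead of an array, but the buffer handling would be much the same.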
In-class exercise #1
• Different versions of the 2.6 Linux kernel
use slightly different definitions for these
task-related kernel data-structures (e.g.,
our 2.6.10 kernel uses a smaller-sized
‘thread-info’ structure than 2.6.9 did)
• So can you write an installable kernel
module that will tell you:
– the size of a ‘task_struct’ object (in bytes)?
– the size of a ‘thread_info’ object (in bytes)?
‘Kernel threads’
• Some tasks don’t have a page-directory of
their own – because they don’t need one
• They can just ‘borrow’ the page-directory
that belongs to another task
• These ‘kernel thread’ tasks will have a
NULL value (i.e., zero) stored in the ‘mm’
field of their ‘task_struct’ descriptor
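The test itself is a single comparison, shown here on mock structures in user space. The `demo_` types are stand-ins with only the fields needed, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct demo_mm { int unused; };          /* stand-in for the memory descriptor */

struct demo_task {
    int pid;
    struct demo_mm *mm;                  /* NULL for a kernel thread */
};

/* A task with no memory descriptor of its own is a kernel thread. */
int demo_is_kernel_thread(const struct demo_task *t)
{
    return t->mm == NULL;
}
```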
In-class exercise #2
• Can you modify our ‘tasklist.c’ module so it
will display a list of only those tasks which
are kernel threads?