lesson5

The kernel’s task list
Introduction to process descriptors
and their related data-structures
for Linux kernel version 2.6.22
Multi-tasking
• Modern operating systems allow multiple
users to share a computer’s resources
• Users are allowed to run multiple tasks
• The OS kernel must protect each task
from interference by other tasks, while
allowing every task to take its turn using
some of the processor’s available time
Stacks and task-descriptors
• To manage multitasking, the OS needs to
use a data-structure which can keep track
of every task’s progress and usage of the
computer’s available resources (physical
memory, open files, pending signals, etc.)
• Such a data-structure is called a ‘process
descriptor’ – every active task needs one
• Every task needs its own ‘private’ stack
What’s on a program’s stack?
Upon entering ‘main()’:
• A program’s exit-address is on user stack
• Command-line arguments on user stack
• Environment variables are on user stack
During execution of ‘main()’:
• Function parameters and return-addresses
• Storage locations for ‘automatic’ variables
Entering the kernel…
A user process enters ‘kernel-mode’:
• when it decides to execute a system-call
• when it is ‘interrupted’ (e.g. by the timer)
• when ‘exceptions’ occur (e.g. divide by 0)
Switching to a different stack
• Entering kernel-mode involves not only a
‘privilege-level transition’ (from level 3 to
level 0), but also a stack-area ‘switch’
• This is necessary for robustness:
e.g., user-mode stack might be exhausted
• This is desirable for security:
e.g., privileged data left on a user-mode
stack might be accessible to user code
What’s on the kernel stack?
Upon entering kernel-mode:
• task’s registers are saved on kernel stack
(e.g., address of task’s user-mode stack)
During execution of kernel functions:
• Function parameters and return-addresses
• Storage locations for ‘automatic’ variables
Supporting structures
• So every task, in addition to having its own
code and data, will also have a stack-area
that is located in user-space, plus another
stack-area that is located in kernel-space
• Each task also has a process-descriptor
which is accessible only in kernel-space
A task’s virtual-memory layout
[Figure: a task’s virtual-memory layout. Kernel space (privilege-level 0) holds the process descriptor and kernel-mode stack. User space (privilege-level 3) holds the user-mode stack-area, the shared runtime-libraries, and the task’s code and data.]
The Linux process descriptor
[Figure: the ‘task_struct’ process descriptor. Each process descriptor contains many fields (shown: state, *stack, flags, *mm, exit_code, *user, pid, *files, *parent, *signal), and some are pointers to other kernel structures (mm_struct, user_struct, files_struct, signal_struct), which may themselves include fields that point to structures (e.g., the *pgd field of mm_struct points to pagedir[]).]
Something new in 2.6
• Linux uses part of a task’s kernel-stack
page-frame to store ‘thread information’
• The thread-info includes a pointer to the
task’s process-descriptor data-structure
[Figure: the task’s 8-KB, page-frame aligned kernel-stack area, with the task’s thread-info stored at its base. The thread-info points to the task’s process-descriptor (a ‘struct task_struct’).]
Tasks have ‘states’
From kernel-header: <linux/sched.h>
• #define TASK_RUNNING            0
• #define TASK_INTERRUPTIBLE      1
• #define TASK_UNINTERRUPTIBLE    2
• #define TASK_STOPPED            4
• #define TASK_TRACED             8
• #define TASK_NONINTERACTIVE     64
• #define TASK_DEAD               128
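The numeric state values are easier to read when translated back into names. The following is a minimal user-space sketch, assuming the values listed above. The helper names `task_state_name()` and `task_state_is_sleeping()` are hypothetical and are not kernel functions.

```c
/* State values as listed above for <linux/sched.h> (kernel 2.6.22). */
#define TASK_RUNNING            0
#define TASK_INTERRUPTIBLE      1
#define TASK_UNINTERRUPTIBLE    2
#define TASK_STOPPED            4
#define TASK_TRACED             8
#define TASK_NONINTERACTIVE     64
#define TASK_DEAD               128

/* Hypothetical helper: map a single 'state' value to a printable name. */
const char *task_state_name(long state)
{
    switch (state) {
    case TASK_RUNNING:          return "running";
    case TASK_INTERRUPTIBLE:    return "interruptible";
    case TASK_UNINTERRUPTIBLE:  return "uninterruptible";
    case TASK_STOPPED:          return "stopped";
    case TASK_TRACED:           return "traced";
    case TASK_NONINTERACTIVE:   return "noninteractive";
    case TASK_DEAD:             return "dead";
    default:                    return "unknown";
    }
}

/* Hypothetical helper: the nonzero states are distinct bits, so a
 * bit-mask can test for either of the two 'sleeping' states at once. */
int task_state_is_sleeping(long state)
{
    return (state & (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)) != 0;
}
```

For example, `task_state_name( 1 )` yields "interruptible", the state in which most idle tasks would be expected to appear.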
Fields in a process-descriptor
struct task_struct {
    volatile long           state;
    void                    *stack;
    unsigned long           flags;
    struct mm_struct        *mm;
    struct thread_struct    thread;
    pid_t                   pid;
    char                    comm[16];
    /* plus many other fields */
};
Finding a task’s ‘thread-info’
• During a task’s execution in kernel-mode, it’s
very quick to find that task’s thread-info object
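• Just use two assembly-language instructions:
movl $0xFFFFF000, %eax
andl %esp, %eax
Ok, now %eax = the thread-info’s base-address
(the mask 0xFFFFF000 corresponds to a 4-KB stack-area;
an 8-KB stack-area would use 0xFFFFE000 instead)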
There’s a macro that implements this computation
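The same masking step can be tried out in ordinary user-space C. This is only a sketch of the arithmetic: the helper name `stack_area_base()` and the two size constants are assumptions made for illustration, and the kernel’s actual stack-area size depends on how it was configured.

```c
#include <stdint.h>

/* Assumed stack-area sizes (hypothetical names, for illustration only). */
#define STACK_4K 4096u
#define STACK_8K 8192u

/* Hypothetical helper: round a 32-bit stack-pointer value down to the
 * base address of the stack area that contains it.  This is the same
 * AND-with-a-mask idea as the two instructions above.  'stack_size'
 * must be a power of two, so ~(stack_size - 1) clears the low bits. */
uint32_t stack_area_base(uint32_t sp, uint32_t stack_size)
{
    return sp & ~(stack_size - 1u);
}
```

With a 4-KB stack-area the mask `~(4096 - 1)` is 0xFFFFF000, so a stack-pointer of 0xC1235F00 gives base-address 0xC1235000. With an 8-KB stack-area the mask is 0xFFFFE000 and the same stack-pointer gives 0xC1234000.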
Finding task-related kernel-data
• Use a macro ‘task_thread_info( task )’ to
get a pointer to the ‘thread_info’ structure:
struct thread_info *info = task_thread_info( task );
• Then one more step gets you back to the
address of the task’s process-descriptor:
struct task_struct *task = info->task;
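The two-way linkage between a process-descriptor and its thread-info can be modeled outside the kernel. The sketch below uses mock structures with only the fields needed for the round trip. Every `mock_` name is hypothetical, and the cast in `mock_task_thread_info()` is an assumption about how such a macro could work, based on the ‘stack’ field shown earlier.

```c
#include <stddef.h>

struct mock_task;

/* Stand-in for 'struct thread_info': only the back-pointer is modeled. */
struct mock_thread_info {
    struct mock_task *task;     /* points back to the process-descriptor */
};

/* Stand-in for 'struct task_struct': only two fields are modeled. */
struct mock_task {
    void *stack;                /* base of the kernel-stack area */
    int   pid;
};

/* Mirrors the idea of task_thread_info(): the thread-info sits at the
 * base of the stack area, so the 'stack' pointer can simply be cast. */
struct mock_thread_info *mock_task_thread_info(struct mock_task *t)
{
    return (struct mock_thread_info *)t->stack;
}

/* Hypothetical setup helper: put a thread-info at the base of
 * 'stack_area' and link it with the descriptor in both directions. */
void mock_attach(struct mock_task *t, void *stack_area)
{
    struct mock_thread_info *info = stack_area;

    info->task = t;
    t->stack = stack_area;
}
```

After `mock_attach( &t, area )`, the expression `mock_task_thread_info( &t )->task` leads back to `&t`, which is the same round trip as the two statements above.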
The kernel’s ‘task-list’
• Kernel keeps a list of process descriptors
• A ‘doubly-linked’ circular list is used
• The ‘init_task’ serves as a fixed header
• Other tasks inserted/deleted dynamically
• Tasks have forward & backward pointers,
implemented as members of the ‘tasks’ field
• To go forward: task = next_task( task );
• To go backward: task = prev_task( task );
Doubly-linked circular list
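[Figure: doubly-linked circular list. The ‘init_task’ (pid=0) is the header; ‘next_task’ links lead forward through the tasks (…) to the newest task, and ‘prev_task’ links lead backward.]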
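The behavior of such a list can be explored with a small user-space model. This is a sketch under stated assumptions: the `mock_` structure and functions are hypothetical stand-ins, with plain `next` and `prev` pointers in place of the kernel’s ‘tasks’ field, and new nodes are added just before the header so that the header’s backward pointer reaches the newest node.

```c
/* Stand-in for a process descriptor with its two list pointers. */
struct mock_task {
    int pid;
    struct mock_task *next, *prev;
};

/* An empty circular list: the header points to itself both ways. */
void mock_list_init(struct mock_task *head)
{
    head->next = head->prev = head;
}

/* Insert 't' just before the header, i.e., at the tail of the list. */
void mock_list_add_tail(struct mock_task *head, struct mock_task *t)
{
    t->prev = head->prev;
    t->next = head;
    head->prev->next = t;
    head->prev = t;
}

/* Unlink 't' from whatever list it is on. */
void mock_list_del(struct mock_task *t)
{
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->next = t->prev = t;
}

/* Walk forward from the header until we arrive back at it again. */
int mock_count_tasks(struct mock_task *head)
{
    int n = 0;
    struct mock_task *t = head;

    do {
        n++;
        t = t->next;
    } while (t != head);
    return n;
}
```

The `do ... while` loop stops when the walk returns to the header, which is the same termination test a traversal starting at ‘init_task’ would need.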
Demo
• We can write a module that lets us create
a pseudo-file (named ‘/proc/tasklist’) for
viewing the list of all currently active tasks
• Our ‘tasklist.c’ module shows the name
and process-ID of each task, along with
that task’s current ‘state’ (0, 1, 2, 4, 8,…)
• Use the command: $ cat /proc/tasklist to
display a complete list of the active tasks
Maybe a big /proc file…
• We can’t know ahead of time how many
tasks are active in our system – this will
depend on many varying factors, such as
who else is logged in, which commands
have been issued, whether we’re using
text-mode console or graphical desktop
• So it’s perfectly possible our pseudo-file
might ‘overflow’ its kernel-supplied buffer!
How to avoid buffer-overflow
• Our module’s ‘get_info()’ callback-function has
four parameter-values passed to it by the kernel:
• char *buf      - address of a small kernel buffer
• char **start   - address of a pointer variable
• off_t offset   - current offset of file-pointer
• int buflen     - size of the kernel buffer
• The initial conditions are:
offset == 0 and *start == NULL
• Kernel’s behavior will vary if we modify *start
Normal case
• We expect the ‘/proc’ file to deliver a small
amount of text-data (not more than would
fit in the kernel-supplied buffer, e.g., 3KB)
• So we make no change to ‘*start’
• Then kernel will deliver the data it finds in
the buffer it had supplied to ‘get_info()’
• The kernel will not call ‘get_info()’ again
(unless our file is closed and reopened)
Alternative case
• Our ‘get_info()’ function modifies the value
of the (initially NULL) ‘*start’ pointer – for
example, maybe assigning it the address
of some buffer we’ve allocated, or even
assigning the address of the kernel-buffer:
*start = buf;
• In this case, the kernel will again call our
module’s ‘get_info()’ function, provided we
returned a nonzero function-value before!
The benefit
• Knowing about this alternative option, we
can design our ‘get_info()’ function so that
it delivers a big amount of data in several
small-size chunks, never overflowing the
size-limitations on the kernel’s buffer
• We just need to think carefully about the
differing scenarios under which ‘get_info()’
will be repeatedly called
First pass
• The value of ‘offset’ will be zero
• We set *start to a buffer-address where
we place a positive number of data-bytes
• Kernel delivers those bytes to the ‘reader’,
taking them from the *start address, then
advances the file-pointer by that amount
• Kernel calls our ‘get_info()’ again, but with
a non-zero ‘offset’ value this time!
Final time
• When our ‘get_info()’ function has finally
finished delivering all the desired data to
the file’s ‘reader’, and still we receive yet
another ‘get_info()’ call, then we simply
return a function-value equal to zero,
telling the kernel that the data has been
exhausted -- and so not to call again!
Our implementation
struct task_struct *task;   // 'global' variable: its value is remembered between calls

int my_get_info( char *buf, char **start, off_t offset, int buflen )
{
    int len = 0;

    if ( offset == 0 )          // our first time through this function
    {
        task = &init_task;      // start of circular linked-list
    }
    else if ( task == &init_task ) return 0;    // our final pass

    // put some data into the kernel-supplied buffer
    len += sprintf( buf+len, "pid=%d \n", task->pid );
    *start = buf;               // tell kernel where to find data, and to call again
    task = next_task( task );   // advance to next node of circular list
    return len;                 // and tell kernel how far to advance
}
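The calling pattern can be rehearsed in user-space before loading a module. The sketch below is a simplified model, not the kernel’s actual code: `mock_get_info()` follows the same first-pass and final-pass logic using a hypothetical array of pids, and `mock_read_all()` is an assumed stand-in for the kernel side, which keeps calling until a zero is returned.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical data standing in for the kernel's task list. */
static const int mock_pids[] = { 0, 1, 7 };
static const int mock_count = sizeof(mock_pids) / sizeof(mock_pids[0]);
static int mock_next;       /* index of the next record to deliver */

/* Same parameters as the callback above: delivers one record per call. */
int mock_get_info(char *buf, char **start, long offset, int buflen)
{
    int len;

    if (offset == 0)
        mock_next = 0;      /* first pass: rewind to the list's start */
    if (mock_next >= mock_count)
        return 0;           /* final pass: the data is exhausted */

    /* each record is far smaller than the buffer used below */
    len = snprintf(buf, buflen, "pid=%d\n", mock_pids[mock_next]);
    *start = buf;           /* data is here, and please call again */
    mock_next++;
    return len;             /* how far to advance the file-pointer */
}

/* Simplified stand-in for the kernel side: keep calling until zero is
 * returned, copying each chunk into 'out'.  Returns the total length. */
int mock_read_all(char *out, int outlen)
{
    char buf[64];           /* plays the small kernel-supplied buffer */
    long offset = 0;
    int total = 0;

    for (;;) {
        char *start = NULL;
        int n = mock_get_info(buf, &start, offset, (int)sizeof buf);

        if (n <= 0 || start == NULL)
            break;          /* nothing more to deliver */
        if (total + n >= outlen)
            break;          /* caller's buffer is full */
        memcpy(out + total, start, (size_t)n);
        total += n;
        offset += n;        /* advance the file-pointer */
    }
    out[total] = '\0';
    return total;
}
```

Each call delivers one short record, so the total output can grow well beyond the size of the small buffer without ever overflowing it.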
In-class exercise #1
• Different versions of the 2.6 Linux kernel use
slightly different definitions for the task-related
kernel data-structures (e.g., the 2.6.10 kernel
used a smaller-sized ‘thread-info’ structure than
the 2.6.9 kernel did)
• So, by using the C ‘sizeof’ operator, can you
quickly create an LKM that will show us:
– the size of a ‘task_struct’ object (in bytes)?
– the size of a ‘thread_info’ object (in bytes)?
‘Kernel threads’
• Some tasks don’t have a page-directory of
their own – because they don’t need one
• They only execute code, and access data,
that resides in the kernel’s address space
• They can just ‘borrow’ the page-directory
that belongs to another task
• These ‘kernel thread’ tasks will store the
NULL-pointer value (i.e., zero) in the ‘mm’
field of their ‘task_struct’ descriptor
In-class exercise #2
• Can you modify our ‘tasklist.c’ module so it
will display a list of only those tasks which
are ‘kernel threads’? (i.e., task->mm == 0)
• How many ‘kernel threads’ on your list?