Kernel Services System Processes in Traditional Unix Kernel

Kernel Services
CIS 657
System Processes in Traditional Unix

Three processes created at startup time

swapper
– process 0
– kernel process
– moves entire processes between main memory and secondary storage
init
– process 1
– user-mode
– administrative tasks (keeps getty running; shutdown)
– ancestor of all of your processes
– Still in FreeBSD
pagedaemon
– process 2
– kernel process
– moves parts of processes in and out

Kernel Processes in FreeBSD 5.x

idle
– Runs when there are no other ready procs
swapper
– Moves processes from secondary storage -> main memory
vmdaemon
– Moves processes from main memory -> secondary storage
pagedaemon
– Writes portions of a process’s address space to storage
pagezero
– Supplies zero-filled pages
bufdaemon
– Supplies clean buffers (by writing out dirty buffers)

Kernel Processes in FreeBSD 5.x (II)

syncer
– Ensures dirty file data written within 30 seconds
ktrace
– Logs system call trace records to a log file
vnlru
– Maintains supply of free vnodes (LRU)
random
– Seeds kernel random numbers and /dev/random
g_event
– Handles dynamic devices
g_up
– Data from device drivers -> processes
g_down
– Data from processes -> device drivers

Note on Init and Logging In

– init keeps a getty process running on each terminal port (tty)
– getty initializes the port and waits for a login name (the “login:” prompt)
– getty reads a string and execs login
– login prompts for the password, performs one-way encryption, and compares values
– if successful, login sets the user id and execs a shell

Run-Time Organization

Top Half
– per-process stack
– library of shared code
– maintains process structure (always resident)
– maintains user structure (can be swapped out)
– division between process/user dependent on memory
– never preempted for another process (but can yield the processor)
– can block interrupts by setting processor priority level (see discussion on bottom half)

Run-Time Organization II

Bottom Half
– handles hardware interrupts
– asynchronous activities (wrt Top Half)
– special kernel stack (might not be a process running)
Top and bottom half coordinate around work queues using mutexes
– top half starts I/O requests, waits for bottom half to finish

Entry Into the Kernel

Hardware interrupt
– I/O device (disks, network cards, etc.)
– clock (used for scheduling, time of day)
Hardware trap
– divide by 0, illegal memory reference
Software-initiated trap
– system call

Entry Into the Kernel II

First, kernel must save machine state
– Why?
Example sequence
– hardware switches to kernel mode
– hardware pushes onto per-process kernel stack the PC, PSW, trap info
– additional asm routine saves all other state that the hardware doesn’t
– kernel calls a C routine: the handler.

Entry Into the Kernel III

Handlers for each kind of entry:
– syscall() for a system call
– trap() for hardware traps
– interrupt handlers for devices
Each kind of handler takes specific parameters (e.g., syscall number, exception frame; or the unit number for an interrupt).

Return From Kernel

– asm routine restores registers and user stack pointer (it undoes what the companion asm routine did)
– hardware restores the stored PC, PSW, etc. (undoes what it did on the way in)
– execution returns at the next instruction in the user process

Software Interrupts

– Used as low-priority processing mechanism in the kernel
– Hardware interrupts have high priority
– Can put work in work queues (cf. network)
– When high-priority work is done, low-priority software interrupt does the rest
– might be real interrupt, might be flag checked in kernel (architecture-dependent)
– can be preempted by another hardware interrupt

Priority Levels in FreeBSD 5.2

High-priority hardware interrupt creates work for lower-priority software interrupt
– Work queues
Software interrupt routines lower priority than device drivers; higher than user processes
– Hardware interrupt > software interrupt > user

Example: Network Packets

– Device driver (hardware interrupt) takes packets from network, puts them in a work queue; controller re-enabled
– Software interrupt handler moves packets from work queues to destination processes
– Thus, delivery to processes doesn’t block packets coming in from the network
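
The hand-off described above can be sketched as a small ring buffer shared by the two halves. This is an illustrative model only, not FreeBSD code: the names pktq, driver_rx, and softintr_drain, the queue length, and the use of plain ints as "packets" are all assumptions made for the example, and the interrupt masking or locking a real kernel needs around the queue is left out.

```c
#include <stddef.h>

#define QLEN 8                  /* arbitrary capacity chosen for this sketch */

/* Hypothetical work queue shared by the driver and the software interrupt. */
struct pktq {
    int    pkts[QLEN];
    size_t head;                /* next slot to drain */
    size_t tail;                /* next slot to fill  */
    size_t count;
};

/* Hardware-interrupt side: do the minimum (enqueue) and return quickly.
 * Returns 0 on success, -1 if the queue is full and the packet is dropped. */
int driver_rx(struct pktq *q, int pkt)
{
    if (q->count == QLEN)
        return -1;
    q->pkts[q->tail] = pkt;
    q->tail = (q->tail + 1) % QLEN;
    q->count++;
    return 0;
}

/* Software-interrupt side: move up to max queued packets into out[],
 * standing in for delivery to destination processes.
 * Returns the number of packets moved. */
size_t softintr_drain(struct pktq *q, int *out, size_t max)
{
    size_t n = 0;
    while (q->count > 0 && n < max) {
        out[n++] = q->pkts[q->head];
        q->head = (q->head + 1) % QLEN;
        q->count--;
    }
    return n;
}
```

A caller would zero-initialize a struct pktq, call driver_rx() once per arriving packet, and later call softintr_drain() to empty the queue in arrival order.
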

What FreeBSD 4.4 Did for x86

– The cpl variable holds the current priority level
– Various macros are defined to set the cpl to a new level (e.g. spl0(), splx(), spltty())
– Interrupts at lower levels are masked (but not really: a pending bit is set and the majority of the work handling the interrupt is deferred)
– See /usr/src/sys/i386/isa/ipl_funcs.c
– Homework: find out (and submit) what FreeBSD 5.2.1 does (you may work with your lab partner).
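
The "masked, but not really" behaviour can be modelled in a few lines. The sketch below is a simplified teaching model and not the code in ipl_funcs.c: it assumes eight numeric levels where a higher number is more urgent, and every function name ending in _model, plus pending, handled, and NLEVELS, is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: levels 0..7, higher means more urgent. */
#define NLEVELS 8

static int      cpl = 0;            /* current priority level */
static uint32_t pending = 0;        /* bit i set: a level-i interrupt was deferred */
static int      handled[NLEVELS];   /* times each level's handler has run */

/* Service every deferred interrupt whose level is now above cpl, highest first. */
static void run_pending(void)
{
    for (int lvl = NLEVELS - 1; lvl > cpl; lvl--) {
        if (pending & (1u << lvl)) {
            pending &= ~(1u << lvl);
            handled[lvl]++;
        }
    }
}

/* Raise cpl to at least level; return the old value for a later splx_model(). */
int splraise_model(int level)
{
    int old = cpl;
    if (level > cpl)
        cpl = level;
    return old;
}

/* Restore a saved level, then run whatever was deferred while it was raised. */
void splx_model(int saved)
{
    cpl = saved;
    run_pending();
}

/* An interrupt arrives: handle it now if it is above cpl,
 * otherwise only set its pending bit and defer the real work. */
void interrupt_model(int level)
{
    if (level > cpl)
        handled[level]++;
    else
        pending |= 1u << level;
}

int handled_count(int level) { return handled[level]; }
int is_pending(int level)    { return (int)((pending >> level) & 1u); }
```

The usual pattern is s = splraise_model(n); ...critical section...; splx_model(s); so that an interrupt arriving at or below level n during the critical section is recorded and then serviced when the level drops.
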
Clock Interrupts

The system clock interrupts, or ticks, at regular intervals (usually 100 Hz).
Interrupt handler calls the hardclock() routine, which must run quickly
– running for more than one tick will miss the next interrupt, causing the time-of-day clock to skew
– lower-priority devices (network devices, disk controllers) cannot be serviced while hardclock() is running.
Non-critical clock functions handled by softclock()

The Four Clocks

– Hardclock: hardware timer, 100 Hz.
– Softclock: handles non-critical timing work
– Profclock: profiling clock (collect process performance information), 1024 Hz.
– Statclock: collects system statistics, 128 Hz.

Clock Interrupts: What hardclock() does

– check for an interval timer on the currently running process
– increment the time of day
– do the job of profclock() if there is no separate profiling clock
– do the job of statclock() if there is no separate clock for statistics gathering
– call softclock() directly if the cpl is low (saves overhead of a software interrupt that would just do that when hardclock() returns)

Statistics

Historically, hardclock() collected resource utilization statistics, and forced context switches
Problems (see McCanne & Torek)
– potential for inaccurate measurement of CPU utilization
– inaccurate profiling
Use semi-randomized sampling with a second clock (the stat clock, see statclock())
– charge the current process with a tick; if it has four, recalculate priority
– record what the system was doing at time of tick

Softclock()

Handles events in the callout queue, such as:
– timeouts (real-time timer)
– retransmits dropped network packets
– monitors some peripherals that require polling
– process scheduling
Scheduler would/should be called every second in a “perfect world;” uses interval timer to run 1 second after it last finished
Discussion: why not do scheduling in hardclock()?

Callout Queue

– A circular list of n queues; sorted by time of event (in ticks)
– Each queue sorted in time order (soonest first)
– Pointer to current queue (now) moves around circular list
– hardclock() advances pointer, and softclock runs when the lead item in the new queue has time 0.
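
A small user-space model may make the wheel concrete. It is a sketch under stated assumptions, not the kernel's implementation: entries store an absolute expiry tick rather than a countdown (so "time 0" becomes "expiry equals now"), storage comes from a fixed pool that is never recycled, and the names callout_add, hardclock_model, softclock_model, and bump are hypothetical.

```c
#include <stddef.h>

#define WHEEL_SIZE 200          /* number of queues in the circular list */
#define MAX_EVENTS 64           /* arbitrary pool size for this sketch */

struct callout {
    long when;                  /* absolute tick at which to fire */
    void (*func)(void *);       /* function to call at that tick */
    void *arg;
    struct callout *next;       /* next entry in the same queue */
};

static struct callout *wheel[WHEEL_SIZE];
static struct callout  pool[MAX_EVENTS];
static size_t          pool_used;
static long            now_tick;    /* the "now" pointer, counted in ticks */

/* Example callback: increments the int that arg points to. */
void bump(void *arg) { ++*(int *)arg; }

/* Queue func(arg) to run ticks_from_now ticks ahead (at least 1).
 * The entry goes in queue (when % WHEEL_SIZE), kept sorted soonest first. */
int callout_add(long ticks_from_now, void (*func)(void *), void *arg)
{
    if (pool_used == MAX_EVENTS || ticks_from_now < 1)
        return -1;
    struct callout *c = &pool[pool_used++];
    c->when = now_tick + ticks_from_now;
    c->func = func;
    c->arg  = arg;
    struct callout **pp = &wheel[c->when % WHEEL_SIZE];
    while (*pp != NULL && (*pp)->when <= c->when)
        pp = &(*pp)->next;
    c->next = *pp;
    *pp = c;
    return 0;
}

/* softclock() stand-in: fire every lead entry in the current queue that is due. */
static int softclock_model(void)
{
    struct callout **head = &wheel[now_tick % WHEEL_SIZE];
    int fired = 0;
    while (*head != NULL && (*head)->when == now_tick) {
        struct callout *c = *head;
        *head = c->next;
        c->func(c->arg);
        fired++;
    }
    return fired;
}

/* hardclock() stand-in: advance "now" by one tick and run the softclock
 * model only if the lead item of the new queue is due.
 * Returns the number of callouts fired on this tick. */
int hardclock_model(void)
{
    now_tick++;
    struct callout *lead = wheel[now_tick % WHEEL_SIZE];
    if (lead != NULL && lead->when == now_tick)
        return softclock_model();
    return 0;
}
```

In this model an event scheduled 2 ticks ahead and one scheduled 202 ticks ahead share a queue; the first fires on the current trip around the wheel and the second waits a full cycle, because each queue is sorted and only entries that are due get run.
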
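Example Callout Queue (200 queues in example)

[Figure: circular list of 200 queues labelled now, now + 1, now + 2, now + 3, ..., now + 199. The queues hold callouts such as f(x), g(y), f(z), and h(x), with event times including now, now + 2, and now + 200.]

Question: does f(z) happen this cycle, or next?
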
Memory Management

Two kinds of executable files in BSD Unix
– interpreted
– compiled (directly executed)
First 16 bits in a file contain a “magic number” telling what kind of file it is.
– “#!” indicates an interpreted file; interpreter must be directly executable (#!/bin/sh is the most common)
– Other magic numbers indicate whether the file can be paged and whether the text is sharable.
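
As a rough illustration of magic-number dispatch, the function below inspects the first bytes of a file image. It is a sketch, not the kernel's exec code: the enum and the name classify_magic are hypothetical, and it recognizes only two cases. Note that the ELF format identifies itself with four bytes (0x7f 'E' 'L' 'F') rather than the 16-bit magic of older formats.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum exec_kind { EXEC_UNKNOWN, EXEC_INTERPRETED, EXEC_ELF };

/* Classify a file from its first len bytes. */
enum exec_kind classify_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

    /* "#!" in the first 16 bits: the rest of the line names the interpreter. */
    if (len >= 2 && buf[0] == '#' && buf[1] == '!')
        return EXEC_INTERPRETED;

    /* ELF binaries start with the 4-byte sequence 0x7f 'E' 'L' 'F'. */
    if (len >= 4 && memcmp(buf, elf_magic, sizeof elf_magic) == 0)
        return EXEC_ELF;

    return EXEC_UNKNOWN;
}
```

A caller would read the first few bytes of the file into a buffer and pass the buffer and the byte count actually read.
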
FreeBSD Process Layout

[Figure: process address space, from high to low addresses]
0xfff00000
– Special stuff...
– User stack
– shared libs
– heap
– bss
– initialized data
– text
0x00000000

[Figure: executable file on disk]
– elf header (elf magic number)
– text
– initialized data
– symbol table

0-filled: bss, stack
most of process is demand paged into memory (Ch. 5)

What Was That Special Stuff?

[Figure: top of the process address space, from high to low addresses]
– Per-process kernel stack
– Red zone
– User area
– Ps_strings struct
– Signal code
– Env strings
– argv strings
– Env pointers
– argv pointers
– argc

Argv, argc, envp contain arguments and environment
signal code used by kernel to deliver signals
ps_strings used by ps to locate argv of process

Timing Services

Real Time: gettimeofday() returns the time since 1 Jan 1970 in UTC (the Epoch)
adjtime() allows one to tweak the clock
– keeps multiple machines “close enough”
– response to normal clock skew
– give a delta argument; speed up or slow down the counted microseconds per clock tick by 10% until delta is reached
Time is reported in microseconds
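
The 10% rule for adjtime() lends itself to a short worked calculation. The helper below only does the arithmetic implied by the slide (a 10% change in the microseconds counted per tick); it does not call adjtime() and is not a measurement of any real kernel's slew rate. The name adjtime_ticks is hypothetical.

```c
#include <stdlib.h>

/* How many clock ticks it takes to absorb delta_us microseconds when each
 * tick is counted as 10% longer or shorter than nominal.
 * Returns -1 if hz is not usable for this calculation. */
long adjtime_ticks(long delta_us, long hz)
{
    if (hz <= 0)
        return -1;
    long tick_us = 1000000L / hz;      /* nominal microseconds per tick */
    long step_us = tick_us / 10;       /* 10% correction applied per tick */
    if (step_us == 0)
        return -1;
    long remaining = labs(delta_us);   /* same effort to speed up or slow down */
    return (remaining + step_us - 1) / step_us;   /* round up to whole ticks */
}
```

For example, at 100 Hz a tick is nominally 10000 microseconds, so each tick absorbs 1000 microseconds of error; a delta of 50000 microseconds takes 50 ticks, or half a second of wall-clock time.
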
Interval Timers

Each process gets three interval timers
– real: decrements in real time; SIGALRM; run from timeout queue maintained by softclock()
– profiling: decrements only when process runs, but tracks both user and kernel-mode execution; SIGPROF; checked by profclock()
– process virtual: decrements only when the process is running; SIGVTALRM; checked by profclock()
User, Group, Other Identifiers

– User ID (uid): 32-bit identifier for all processes of each user, set by administrator
– Group ID (gid): 32-bit identifier. Many users in one group; many groups for each user
– Root: uid 0, gid 0.
– These bits are checked on file access

Permission Checks

Checked in order (F = file, P = process)
– If UID_F == UID_P, use the owner permission bits
– If UID_F != UID_P, but GID_F ∈ GID_P, then use the group permission bits
– If UID_F != UID_P, and GID_F ∉ GID_P, then use the other permission bits
Recall discussion last time of how uid, gid set on login
Can I own a file that I can’t read?
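
The ordered check can be written down directly. This is a sketch of the rule as stated on the slide, not the kernel's access routine: the name applicable_bits and its parameter list are hypothetical, and the superuser's ability to bypass the check is deliberately left out.

```c
#include <stddef.h>

/* rwx bit values within one permission triple. */
enum { PERM_R = 4, PERM_W = 2, PERM_X = 1 };

/* Return the rwx bits (0..7) that govern this process's access to a file.
 * mode holds the usual nine permission bits, e.g. 0644.
 * Exactly one triple is consulted; the checks are made in order. */
unsigned applicable_bits(unsigned mode,
                         unsigned file_uid, unsigned file_gid,
                         unsigned proc_uid,
                         const unsigned *proc_gids, size_t ngids)
{
    if (file_uid == proc_uid)
        return (mode >> 6) & 7;         /* owner bits; group and other ignored */

    for (size_t i = 0; i < ngids; i++)
        if (proc_gids[i] == file_gid)
            return (mode >> 3) & 7;     /* group bits */

    return mode & 7;                    /* other bits */
}
```

Because only the owner triple is consulted when the uids match, a file with mode 0044 is unreadable by its owner even though group members and everyone else can read it, which answers the question above: yes.
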
Rights Amplification

Users may need temporary write access on files (e.g. passwd)
setuid() does this
– changes effective user id
– real user id stays the same
– effective uid also saved
seteuid() changes only the effective user id
setgid()
– used to work like setuid()
– now just put “effective” gid into 0th element of array

Effects of Syscalls on UIDs

Action        Real  Effective  Saved
Exec-normal   R     R          R
Exec-setuid   R     S          S
Seteuid(R)    R     R          S
Seteuid(S)    R     S          S
Seteuid(R)    R     R          S
Exec-normal   R     R          R

R = Real UID, S = Special-privilege UID
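
The table can be replayed with a tiny state machine over the three ids. This is a model of the table's rows, not the kernel's credential code: struct cred here is a made-up three-field struct, and exec_normal, exec_setuid, and seteuid_model are hypothetical names. The permission rule in seteuid_model (the new id must equal the real or saved id unless the effective id is 0) is the usual one and is assumed rather than taken from the slide.

```c
/* Hypothetical three-field credential for this sketch. */
struct cred {
    unsigned real;
    unsigned effective;
    unsigned saved;
};

/* exec of an ordinary program: the effective id carries over
 * and is copied into the saved id. */
void exec_normal(struct cred *c)
{
    c->saved = c->effective;
}

/* exec of a set-user-id program owned by owner:
 * effective and saved both become the file owner. */
void exec_setuid(struct cred *c, unsigned owner)
{
    c->effective = owner;
    c->saved = owner;
}

/* seteuid(): changes only the effective id. Allowed when uid is the real
 * or saved id, or when the caller is currently the superuser.
 * Returns 0 on success, -1 if not permitted. */
int seteuid_model(struct cred *c, unsigned uid)
{
    if (uid != c->real && uid != c->saved && c->effective != 0)
        return -1;
    c->effective = uid;
    return 0;
}
```

Starting from (R, R, R), calling exec_setuid(), seteuid_model(R), seteuid_model(S), seteuid_model(R), and exec_normal() in turn reproduces the rows of the table; the saved id is what lets the process switch back to S after temporarily dropping to R.
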