W4118 Operating Systems Interrupt and System Call in Linux Instructor: Junfeng Yang

Logistics

TAs

Supreeth Subramanya
• Office Hours: M 3-5pm
• Address: CEPSR 7LW1

Yunling Wang
• Office Hours: W 1-3pm
• Address: TA room (Mudd 122A)

Heming Cui
• Office Hours: F 4-6PM
• Address: TA room (Mudd 122A)
Logistics (cont.)

Textbooks


Bookstore is working on the order
We’ve included the problem statements on the
homework 1 page
Homework 1 clarifications


Your shell should wait for the command to finish
 While a command is running, don’t prompt for or accept a new
command
 NOTE: wait for the entire pipeline to finish
When do I/O redirection and pipes conflict?
 When they tie two things to one file descriptor
• Bad: “ls > 1.txt | grep FOO”
• Bad: “ls | sort < file.txt”


Different shells handle conflicts differently

Your shell should emit an error.
• tcsh emits error. “Ambiguous output redirect.”
• bash is silent.
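
A minimal sketch of the conflict check described above, not a reference solution. It assumes a very simple grammar with no quoting or escaping, so every '|', '<' and '>' is an operator. The function name redirect_conflicts_with_pipe is made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when a redirection and a pipe claim the same descriptor:
 * '>' in any stage except the last (stdout already feeds the pipe), or
 * '<' in any stage except the first (stdin already comes from the pipe).
 * Assumption: no quoting or escaping is handled. */
bool redirect_conflicts_with_pipe(const char *cmdline)
{
    assert(cmdline != NULL);
    const char *stage = cmdline;
    bool first = true;

    for (;;) {
        const char *bar = strchr(stage, '|');
        size_t len = bar ? (size_t)(bar - stage) : strlen(stage);
        bool last = (bar == NULL);

        if (!last && memchr(stage, '>', len) != NULL)
            return true;
        if (!first && memchr(stage, '<', len) != NULL)
            return true;
        if (last)
            return false;

        stage = bar + 1;   /* move on to the next pipeline stage */
        first = false;
    }
}
```

A shell could call this before forking anything and print an error (as tcsh does) when it returns true.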
Any questions?
Last lecture

OS: event driven

Events from device: interrupt





Computer organization: CPU, device, memory, bus
CPU’s “fetch-execute” cycle
How to start this cycle: boot process
Devices need CPU’s immediate attention. How? interrupt
How it works
• PIC translates IRQs to interrupt #
• CPU looks up handler in Interrupt Descriptor Table

Traps (or Exceptions): raised inside CPU
Last lecture (cont.)

Events from application: system call

Often implemented via trap, e.g. int 0x80 in Linux
The need for protection

Dual-mode operation: user mode and kernel mode




Privileged instructions can only execute in kernel mode
Apps transit into kernel via system calls, so kernel can
validate the calls and perform privileged instructions for
them
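
A small sketch of an application entering the kernel through a system call, assuming Linux with glibc. It uses the generic syscall() wrapper rather than getpid(). Which trap instruction the wrapper executes (int 0x80 or something newer) depends on the architecture and libc, so treat that part as unspecified here. raw_getpid is a name invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our process ID by system call number.
 * The user-to-kernel mode switch happens inside syscall(). */
pid_t raw_getpid(void)
{
    long ret = syscall(SYS_getpid);
    assert(ret > 0);   /* getpid cannot fail on Linux */
    return (pid_t)ret;
}
```

The result should match what the ordinary getpid() wrapper returns, since both end up in the same kernel service.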
OS structure


Simple
Layered
Today

OS structure (cont.)


Monolithic kernel vs. microkernel
Virtual machines

Intro to Linux

Interrupts in Linux

System calls in Linux
Monolithic kernel

All OS components run in kernel mode
[Figure: APP runs in user mode; FS, Mem, and Net all run in kernel mode]

Why good?

Can be efficient. Cross-component access cheap
Why bad?

No boundaries → big, complex kernel → hard to change

Trusted computing base (TCB) is large: one error → entire
kernel crashes or is compromised
• Hard to do new stuff in OS → OS researchers unhappy
• No flexibility for apps. Hard to customize for speed
(e.g. a web server)
Microkernel


Moves as much from the kernel into “user” space
Restricted interface: no direct memory sharing
between modules; need to send messages via kernel
[Figure: APP, FS, Mem, and Net run in user mode; only the kernel runs in kernel mode]
Why good? Claimed advantages:




Extensibility: new module = new user space program/library
Flexibility: app can have own FS, Mem, Net, can make them fast
Portability: easier to port kernel to new hardware
Reliability & security: each module has its own protection domain. If a
module crashes, just restart it; it can’t affect other modules.
Microkernel (cont.)

Big thing in 90s; best people worked on microkernel


Students became top school professors
Problem: slow, too many user-kernel crossings

Can be fixed with fast IPC

However, problems remain. In the end, either
download extensions into the kernel, or merge all modules
into a library → looks like monolithic kernels, maybe
even more complicated!
Today: Windows, Linux, *BSD, MacOS, all monolithic

Some criticism of microkernels


Restricted interface → complicated implementation
• No shared state, hard to manage consistency

Reliability & security: if one key module fails, apps fail
Modules


Most microkernel advantages due to modularity
Most modern operating systems implement kernel
modules
 Use an object-oriented approach
• Function pointers in Linux: strawman OOP with C

Each talks to the others over known interfaces
• But share one protection domain, so just call function


Each is loadable as needed within the kernel
Overall, similar to microkernel, but more flexible
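
A minimal user-space sketch of the "function pointers as strawman OOP" idea. The struct fs_ops interface and both implementations are hypothetical, only loosely modelled on kernel tables such as struct file_operations. Callers go through the known interface, and because everything shares one protection domain the call is just an indirect function call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical module interface: a table of function pointers. */
struct fs_ops {
    const char *name;
    size_t (*read)(char *buf, size_t len);
};

/* Implementation 1: always fills the buffer with zero bytes. */
static size_t zero_read(char *buf, size_t len)
{
    memset(buf, 0, len);
    return len;
}

/* Implementation 2: always reports end of file. */
static size_t empty_read(char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

const struct fs_ops zero_fs  = { "zero",  zero_read  };
const struct fs_ops empty_fs = { "empty", empty_read };

/* The caller only knows the interface, not which module is behind it. */
size_t fs_read(const struct fs_ops *fs, char *buf, size_t len)
{
    assert(fs != NULL && fs->read != NULL);
    return fs->read(buf, len);
}
```

Adding a new "module" means defining another struct fs_ops instance; fs_read() does not change.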
[Figure: APP runs in user mode; FS, Mem, and Net are modules in kernel mode]
Virtual Machine

Virtual Machine Monitor (VMM): kernel that
provides hardware interface
[Figure: three VMs, each running an APP on its own OS in user mode, on top of the VMM in kernel mode]

Why good?

Isolation. Strong protection between VMs
Consolidation. One physical machine, multiple VMs
Mobility. Can move VMs around
Standardization: same hw → better system mgmt
Virtual Machine (cont)

Normal operating system environment:
• running in supervisor mode
• full access to machine state and I/O devices

Virtualized guest operating systems:
• running in user mode
• no direct access to machine state

Tasks of the virtual machine monitor:



reconciling the virtual and physical architecture
preventing virtual machines from interfering with
each other or the monitor
Do it fast? Not an easy job …
Hosted virtual machines:
VMware Desktop Products Architecture
Today

OS structure (cont.)

Intro to Linux

Interrupts in Linux

System calls in Linux
What is Linux?

A modern, open-source OS based on UNIX standards



1991: written by Linus Torvalds from scratch, 0.1 MLOC
• major design goal of UNIX compatibility
Now: many developers worldwide, 10 MLOC
Unique management model
• Distributed development, central check in

Linux distributions


Ubuntu, Debian, Fedora, Red Hat, CentOS, Slackware,
Mandrake Linux, DreamLinux, Gentoo, …
All based on the Linux kernel, with different sets of
applications, package management methods and
configurations
Linux Licensing


The Linux kernel is distributed under the GNU
General Public License (GPL), the terms of which
are set out by the Free Software Foundation
Anyone using Linux, or creating their own
derivative of Linux, may not make the derived
product proprietary; software released under the
GPL may not be redistributed as a binary-only
product
Linux kernel structure



Core + dynamically loadable modules
Modules include: device drivers, file systems, network
protocols, etc
Modules were originally developed to support the
conditional inclusion of device drivers

Early OS kernels would need to either:
• include code for all possible devices or
• be recompiled to add support for a new device


Now, Modules can be dynamically loaded and unloaded
Modules are used extensively
Linux kernel structure (cont.)
[Figure: layered view, top to bottom: Applications; System Libraries (libc); System Call Interface; I/O-related components (File Systems, Networking, Device Drivers) and process-related components (Scheduler, Memory Management, IPC), with loadable Modules; Architecture-Dependent Code; Hardware]
Linux source tree



Download: kernel.org (all releases + revision
history)
Browse: lxr.linux.no (with cross reference)
Directory structure



Public header files: include/
Each component is a subdir (e.g. mm/, ipc/, drivers/)
Usually interface + common functions + loadable
modules
Today

OS structure (cont.)

Intro to Linux

Interrupts in Linux


How interrupts are implemented in Linux, using x86 as the example
System calls in Linux
Types of Interrupts on 80386

Interrupts, asynchronous, from external devices,
not related to code running



Maskable interrupts
Nonmaskable interrupts (NMI): hardware error
Exceptions, synchronous, raised by CPU
 Processor-detected exceptions:
• Faults — correctable; offending instruction is retried
• Traps — often for debugging; instruction is not retried
• Aborts — major error (hardware failure), EIP wrong

Programmed exceptions:
• Requests for kernel intervention (software intr/syscalls)
Faults


Instruction would be illegal to execute
Examples:






Writing to a memory segment marked ‘readonly’
Reading from an unavailable memory segment
(on disk) → page fault
Executing a ‘privileged’ instruction
Detected before incrementing the IP
The causes of ‘faults’ can often be ‘fixed’
If a ‘problem’ can be remedied, then the CPU
can just resume its execution-cycle
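
A user-space sketch of "fix the cause, then retry the instruction", assuming Linux. A store to a read-only page faults; the process sees the fault as SIGSEGV. The handler here stands in for the kernel's fix-up: it makes the page writable and returns, and the same store is retried and succeeds. Calling mprotect() from a signal handler is not guaranteed async-signal-safe by POSIX; it is assumed to work here. All names are invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count;
static char *page;          /* page that starts out read-only */
static size_t page_size;

/* Fix the cause of the fault and return so the store is retried. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)page;
    if (addr < base || addr >= base + page_size)
        _exit(1);           /* not our page: a real bug, give up */
    fault_count++;
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/* Returns how many faults occurred before the store went through,
 * or -1 if the stored value could not be read back. */
int faults_before_write_succeeds(void)
{
    struct sigaction sa, old;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    page = mmap(NULL, page_size, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(page != MAP_FAILED);

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    fault_count = 0;
    sigaction(SIGSEGV, &sa, &old);

    volatile char *p = page;
    p[0] = 'x';             /* faults, handler runs, store is retried */
    int stored = (p[0] == 'x');

    sigaction(SIGSEGV, &old, NULL);   /* restore previous handler */
    munmap(page, page_size);
    return stored ? (int)fault_count : -1;
}
```

The program never re-issues the store itself; the retry comes from resuming at the faulting instruction.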
Traps

A CPU might have been programmed to
automatically switch control to a ‘debugger’
program after it has executed an instruction

That type of situation is known as a ‘trap’

It is activated after incrementing the IP
Handling Exceptions






Most error exceptions — divide by zero, invalid
operation, illegal memory reference, etc. — translate
directly into signals
This isn’t a coincidence. . .
The kernel’s job is fairly simple: send the appropriate
signal to the current process
 force_sig(sig_number, current);
That will probably kill the process, but that’s not the
concern of the exception handler
One important exception: page fault
An exception can (infrequently) happen in the kernel
 die(); // kernel oops
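
A sketch of an exception turning into a signal, assuming Linux or another POSIX system. A child process touches a page mapped with no access rights; with the default signal disposition the resulting signal kills it, and the parent reads the signal number from the wait status. signal_from_bad_access is a made-up name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of the signal that terminated the child,
 * or 0 if the child was not killed by a signal. */
int signal_from_bad_access(void)
{
    pid_t pid = fork();
    assert(pid >= 0);

    if (pid == 0) {
        size_t sz = (size_t)sysconf(_SC_PAGESIZE);
        volatile char *p = mmap(NULL, sz, PROT_NONE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            _exit(2);
        signal(SIGSEGV, SIG_DFL);   /* make sure the default action applies */
        (void)p[0];                 /* illegal memory reference */
        _exit(0);                   /* not reached if the signal arrives */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

The exception handler's part ends at sending the signal; killing the process is the signal's default action.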
Interrupt # assignment




256 possible interrupt ID numbers (0-255)
First 32 reserved by Intel for NMI and exceptions
OSes such as Linux are free to use the remaining 224
available interrupt ID numbers for their own purposes (e.g.,
for service-requests from external devices, or for other
purposes such as system-calls)
We’ve seen many examples in last lecture





0: divide-overflow fault
3: breakpoint
14: Page-Fault Exception
128: system call
Called “vector” in ULK
Interrupts in Linux
[Figure: interrupt delivery path. Labels: Memory Bus, IRQs, PIC, intr #, INTR, CPU, idtr, IDT (entries 0-255), ISR, mask points. Questions for this part: Assign IRQ to dev? IRQ to intr #?]
Assigning IRQs to Devices

IRQ assignment is hardware-dependent




Sometimes it’s hardwired, sometimes it’s set physically,
sometimes it’s programmable
PCI bus usually assigns IRQs at boot
Some IRQs are fixed by the architecture
 IRQ0: Interval timer
 IRQ2: Cascade pin for 8259A
Linux device drivers request IRQs when the device is
opened


Especially useful for dynamically-loaded drivers, such as for
USB or PCMCIA devices
Two devices that aren’t used at the same time can share an
IRQ, even if the hardware doesn’t support simultaneous
sharing
Assigning Interrupt # to IRQs


Intr #: index (0-255) into interrupt descriptor table
Intr #: usually IRQ + 32




Below 32 reserved for non-maskable intr & exceptions
Maskable interrupts can be assigned as needed
Vector 128 used for syscall
Vectors 251-255 used for Inter-Processor Interrupt (IPI)
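
The numbering rules above, written out as a small sketch. The ranges are the ones on this slide; the enum and function names are invented for the example, and the "+ 32" mapping is only the usual case, not a guarantee.

```c
#include <assert.h>

enum vector_class {
    VEC_EXCEPTION_OR_NMI,   /* 0-31, reserved by Intel */
    VEC_SYSCALL,            /* 128 */
    VEC_IPI,                /* 251-255 */
    VEC_DEVICE              /* everything else from 32 up */
};

/* Usual mapping from an IRQ line to its interrupt number. */
int irq_to_vector(int irq)
{
    assert(irq >= 0 && irq + 32 <= 255);
    return irq + 32;
}

/* Classify an interrupt number by the ranges listed on the slide. */
enum vector_class classify_vector(int vector)
{
    assert(vector >= 0 && vector <= 255);
    if (vector < 32)
        return VEC_EXCEPTION_OR_NMI;
    if (vector == 128)
        return VEC_SYSCALL;
    if (vector >= 251)
        return VEC_IPI;
    return VEC_DEVICE;
}
```

For example, the interval timer on IRQ0 lands on vector 32, the first number past the reserved range.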
Interrupts in Linux
[Figure: interrupt delivery path. Labels: Memory Bus, IRQs, PIC, intr #, INTR, CPU, idtr, IDT (entries 0-255), ISR, mask points. Question for this part: Multicore?]
Multiple Logical Processors
[Figure: multi-core CPU with CPU 0 and CPU 1, each with its own local APIC, both connected to an I/O APIC]
Advanced Programmable Interrupt Controller is needed to
perform ‘routing’ of I/O requests from peripherals to CPUs
APIC, IO-APIC, LAPIC

Advanced PIC (APIC) for SMP systems
• Used in all modern systems
• Interrupts “routed” to CPU over system bus
• IPI: inter-processor interrupt

Local APIC (LAPIC) versus “frontend” IO-APIC
• Devices connect to front-end IO-APIC
• IO-APIC communicates (over bus) with Local APIC

Interrupt routing
• Allows broadcast or selective routing of interrupts
• Ability to distribute interrupt handling load
• Routes to lowest-priority processor
  • Special register: Task Priority Register (TPR)
• Arbitrates (round-robin) if equal priority
Interrupts in Linux
[Figure: interrupt delivery path. Labels: Memory Bus, IRQs, PIC, intr #, INTR, CPU, idtr, IDT (entries 0-255), ISR, mask points. Question for this part: How to set up IDT?]
Interrupt Descriptor Table


The ‘entry-point’ to the interrupt-handler is located via
the Interrupt Descriptor Table (IDT)
IDT: “gate descriptors”


Location of handler
Descriptor Privilege Level (DPL), prevents bad access
• Software can invoke the gate only when current privilege level (CPL) <= DPL
• This is just the mode bit for protection

Gates (slightly different ways of entering kernel)
• Interrupt gate: disables further interrupts
• Trap gate: further interrupts still allowed
• Task gate: includes TSS to transfer to (used when EIP is
bad, or hardware failure)
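
A sketch of what a gate descriptor holds, assuming the 32-bit protected-mode layout: handler offset split into two 16-bit halves, a segment selector, a present bit, a 2-bit DPL and a 4-bit gate type. The struct and function names are invented, and the selector value in the usage note is arbitrary. The privilege check shown applies to software-raised interrupts (int n); hardware interrupts are not subject to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 4-bit type field values for 32-bit gates. */
enum gate_type { GATE_TASK = 0x5, GATE_INTERRUPT = 0xE, GATE_TRAP = 0xF };

/* One 8-byte IDT entry, kept as two 32-bit words. */
struct gate_desc {
    uint32_t lo;   /* selector (31:16) | handler offset 15:0 */
    uint32_t hi;   /* handler offset 31:16 | P | DPL | type */
};

struct gate_desc make_gate(uint32_t handler, uint16_t selector,
                           enum gate_type type, unsigned dpl)
{
    assert(dpl <= 3);
    struct gate_desc g;
    g.lo = ((uint32_t)selector << 16) | (handler & 0xFFFFu);
    g.hi = (handler & 0xFFFF0000u)
         | (1u << 15)                 /* present bit */
         | ((uint32_t)dpl << 13)      /* descriptor privilege level */
         | ((uint32_t)type << 8);     /* gate type */
    return g;
}

/* Reassemble the handler address from its two halves. */
uint32_t gate_handler(struct gate_desc g)
{
    return (g.hi & 0xFFFF0000u) | (g.lo & 0xFFFFu);
}

unsigned gate_dpl(struct gate_desc g)
{
    return (g.hi >> 13) & 3u;
}

/* An "int n" instruction is allowed only when CPL <= DPL. */
bool software_int_allowed(unsigned cpl, struct gate_desc g)
{
    return cpl <= gate_dpl(g);
}
```

A gate built with DPL 3 can be reached from user mode (CPL 3); one built with DPL 0 cannot, which is how the table keeps user code from invoking arbitrary kernel handlers.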
IDT Initialization

Initialized once by BIOS in real mode

Linux re-initializes during kernel init
• Must not expose kernel to user mode access
• Start by setting all descriptors to null handler ignore_int()
• Then, set up entries we handle
• E.g. arch/i386/kernel/traps.c, function trap_init()
Linux lingo

Interrupt gate = Intel interrupt, maskable or nonmaskable
• no user access (DPL = 0)
• disables interrupts when invoking handler
• E.g. set_intr_gate(2, &nmi)

System gate = Intel trap with user access (DPL = 3) and
interrupts enabled
• into (#4), bounds (#5), system call (#128)
• E.g. set_system_gate(4, &overflow)
• Sometimes want to disable interrupts for int3:
set_system_intr_gate(3, &int3)

Trap gate = Intel trap and fault, no user access (DPL = 0)
and interrupts enabled
• E.g. set_trap_gate(0, &divide_error)
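
A user-space model of the gate flavours above, to make the DPL and interrupt-enable combinations concrete. This is not kernel code: the model_ functions only mimic the names on the slide, the handlers are empty stubs, and model_trap_init() sets up just the vectors mentioned in this lecture after first pointing every entry at an "ignore" handler.

```c
#include <assert.h>
#include <stdbool.h>

#define IDT_ENTRIES 256

typedef void (*handler_fn)(void);

struct model_gate {
    handler_fn handler;
    unsigned dpl;       /* 0 = kernel only, 3 = user code may use "int n" */
    bool clears_if;     /* true = interrupts disabled on entry */
};

static struct model_gate model_idt[IDT_ENTRIES];

/* Empty stand-ins for the real handlers. */
static void ignore_int_stub(void) {}
static void divide_error_stub(void) {}
static void nmi_stub(void) {}
static void int3_stub(void) {}
static void overflow_stub(void) {}
static void system_call_stub(void) {}

static void install(int n, handler_fn h, unsigned dpl, bool clears_if)
{
    assert(n >= 0 && n < IDT_ENTRIES);
    model_idt[n].handler = h;
    model_idt[n].dpl = dpl;
    model_idt[n].clears_if = clears_if;
}

/* The four flavours: DPL 0 or 3, interrupts disabled or left enabled. */
void model_set_intr_gate(int n, handler_fn h)        { install(n, h, 0, true);  }
void model_set_trap_gate(int n, handler_fn h)        { install(n, h, 0, false); }
void model_set_system_gate(int n, handler_fn h)      { install(n, h, 3, false); }
void model_set_system_intr_gate(int n, handler_fn h) { install(n, h, 3, true);  }

void model_trap_init(void)
{
    /* Every vector gets the null handler first. */
    for (int i = 0; i < IDT_ENTRIES; i++)
        model_set_intr_gate(i, ignore_int_stub);

    /* Then the entries we actually handle. */
    model_set_trap_gate(0, divide_error_stub);
    model_set_intr_gate(2, nmi_stub);
    model_set_system_intr_gate(3, int3_stub);
    model_set_system_gate(4, overflow_stub);
    model_set_system_gate(128, system_call_stub);
}

const struct model_gate *model_gate_at(int n)
{
    assert(n >= 0 && n < IDT_ENTRIES);
    return &model_idt[n];
}
```

After model_trap_init(), only vectors 3, 4 and 128 have DPL 3, so those are the only ones user code could raise with an int instruction in this model.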
Interrupts in Linux
[Figure: interrupt delivery path. Labels: Memory Bus, IRQs, PIC, intr #, INTR, CPU, idtr, IDT (entries 0-255), ISR, mask points. Question for this part: How to load ISR?]
Loading an Interrupt handler





Hardware locates the proper gate descriptor
for this interrupt vector, and locates the new
context
Verifies Current Privilege Level (CPL) <=
Descriptor Privilege level (DPL)
Load a new stack pointer if needed
Hw saves old IP, etc on new stack
Set IP, etc to interrupt handler = invoke
handler


For interrupt gates, disable interrupts by clearing the IF bit in the
eflags register
Handler saves old CPU state on new stack
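
The hardware steps above as a simplified simulation. Assumptions: segment registers and error codes are left out, the "new stack" is a small array, and a stack switch happens whenever the interrupted code was not already at privilege level 0. The struct and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_IF (1u << 9)   /* interrupt-enable flag */

struct sim_gate {
    uint32_t handler;
    unsigned dpl;
    bool is_interrupt_gate;   /* false = trap gate */
};

struct sim_cpu {
    uint32_t eip, esp, eflags;
    unsigned cpl;
    uint32_t kernel_esp;      /* stack to switch to when entering ring 0 */
    uint32_t stack[16];       /* what was pushed, in push order */
    int depth;
};

static void push(struct sim_cpu *c, uint32_t v)
{
    assert(c->depth < 16);
    c->stack[c->depth++] = v;
    c->esp -= 4;
}

/* Returns false if the privilege check fails (real hardware would raise
 * a general-protection fault instead). */
bool sim_deliver(struct sim_cpu *cpu, const struct sim_gate *gate,
                 bool software_int)
{
    /* 1. Privilege check, only for "int n" from software. */
    if (software_int && cpu->cpl > gate->dpl)
        return false;

    uint32_t old_esp = cpu->esp;
    uint32_t old_eflags = cpu->eflags;
    uint32_t old_eip = cpu->eip;

    /* 2. Load a new stack pointer if the privilege level changes. */
    if (cpu->cpl != 0) {
        cpu->esp = cpu->kernel_esp;
        push(cpu, old_esp);
        cpu->cpl = 0;
    }

    /* 3. Save the old state on the (new) stack. */
    push(cpu, old_eflags);
    push(cpu, old_eip);

    /* 4. Interrupt gates clear IF; trap gates leave it alone. */
    if (gate->is_interrupt_gate)
        cpu->eflags &= ~EFLAGS_IF;

    /* 5. Jump to the handler. */
    cpu->eip = gate->handler;
    return true;
}
```

Returning from the handler would undo these steps in reverse: pop EIP and EFLAGS, and switch back to the saved user stack.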
Finding the Proper Handler




On modern hardware, multiple I/O devices can
share a single IRQ and hence interrupt vector
First differentiator is the interrupt vector
Multiple interrupt service routines (ISR) can
be associated with a vector
Each device’s ISR for that IRQ is called; the
determination of whether or not that device
has interrupted is device-dependent
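
A toy model of a shared IRQ line. Every ISR registered on the line is called, and each one decides for itself whether its device raised the interrupt. The fake device with a pending flag is hypothetical; real drivers would read a device status register instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SHARED 4

/* An ISR returns true if its own device had raised the interrupt. */
typedef bool (*isr_fn)(void *dev);

struct irq_line {
    isr_fn isr[MAX_SHARED];
    void *dev[MAX_SHARED];
    int count;
};

/* Add a handler to the line; fails when the line is full. */
bool irq_register(struct irq_line *line, isr_fn isr, void *dev)
{
    assert(line != NULL && isr != NULL);
    if (line->count >= MAX_SHARED)
        return false;
    line->isr[line->count] = isr;
    line->dev[line->count] = dev;
    line->count++;
    return true;
}

/* Call every ISR on the line; returns how many devices were serviced. */
int irq_dispatch(struct irq_line *line)
{
    int handled = 0;
    for (int i = 0; i < line->count; i++)
        if (line->isr[i](line->dev[i]))
            handled++;
    return handled;
}

/* Hypothetical device: "pending" stands in for a status register bit. */
struct fake_dev {
    bool pending;
    int serviced;
};

bool fake_isr(void *dev)
{
    struct fake_dev *d = dev;
    if (!d->pending)
        return false;    /* not ours, let the next ISR look */
    d->pending = false;  /* acknowledge the device */
    d->serviced++;
    return true;
}
```

If two devices share the line and only one has an interrupt pending, both ISRs run but only one reports that it handled anything.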
Next lecture

Interrupts in Linux (cont.)

System calls in Linux

Process (read OSC ch 3)