General Design Principles Notes

Operating System Design
General Design Principles
Dr. C. C. Lee
Ref: Operating System Concepts by Silberschatz…
OS Design and Structure
 OS vs Kernel (privileged part)
 Operating System Services
(Shared/Protected)
Internal Services: CPU execution scheduling, memory/I/O/file management
Interfaces: GUI, commands, system calls or library calls
System programs: invoking system calls
Others: utilities, compilers, editors, shell, misc. tools
 Operating System Design Goals
Goals vary from the user's view to the system's own view
Users want: Easy to use, reliable, fast
Systems want: Easy to implement/maintain, flexible,
reliable, efficient
Varies from system to system
Embedded Systems, Server Systems, etc.
Main goals for General-purpose Systems
Define abstractions
Data structures for processes, files, threads, signals, and the I/O model
Implement primitive operations on abstractions
Read/write files
Implemented in system calls
Ensure Isolation/Protection
Users, processes, files, virtualization
Managing hardware
Support low-level chips, interrupt controllers
Device drivers
 Main Design Issues
Start with interfaces it provides
Internal system design
Mechanism/policy
System structure and architecture
 Mechanism and Policy Separation
Purpose: Minimize change, flexibility
Mechanism: define timer struct
Policy: define time quantum value
Mechanism: define a priority array; the scheduler searches for the highest-priority entry to run
Policy: set different priorities for different users or processes
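A minimal C sketch of the separation, with illustrative names and quantum values (not taken from any particular kernel): the mechanism is the code that enforces a time slice; the policy is only the number it is handed.

/* Mechanism: preempt the running process when its time slice expires.
 * It does not decide how long the slice is. */
struct sched_params { int time_quantum_ms; };   /* the policy value lives here */

void on_timer_tick(struct sched_params *p, int *ms_run) {
    if (++(*ms_run) >= p->time_quantum_ms) {    /* slice used up? */
        *ms_run = 0;
        /* preempt(): switch to the next ready process (not shown) */
    }
}

/* Policy: choose the quantum; changing it needs no change to the mechanism. */
struct sched_params interactive = { .time_quantum_ms = 10 };
struct sched_params batch       = { .time_quantum_ms = 100 };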
 System Structure and Architecture
Layered Approach
Unix
System programs: user mode
Kernel mode
CPU/memory/process/device management,
protection, etc
Windows
System programs: user mode
Kernel mode
Kernel layer
Executive layer: Manage kernel layer objects
Hardware abstraction layer to different hardware
Monolithic Kernel (Unix, Windows…)
Single address space – direct call, fast
Problems: maintenance, flexibility, expandability, reliability, security
Microkernel (Mach)
Move as much as possible from the kernel into user space – reliability, security, extensibility
Problems – slower, due to IPC between user and kernel components in separate address spaces
Loadable kernel modules
Each core component is separate and loadable as
needed within the kernel.
Concerns: Kernel size due to module management
and complexity of kernel bootstrap
Choice of modules: how often will a module be used? Will the kernel be built for many architectures?
Hybrid Systems
Linux kernel: in kernel address space - monolithic
plus modules for dynamic loading of functionality
Windows: mostly monolithic, plus microkernel-style support for different subsystem personalities
Mac OS X: layered – Aqua UI, Cocoa programming environment, Mach microkernel, BSD Unix,
dynamically loadable modules (called kernel extensions)
Apple iOS: structured on Mac OS X, with added functionality and its own layers
Android: based on a (modified) Linux kernel; layered, with the Dalvik VM run-time
environment, libraries, and frameworks
Processes
 Process Concept
Program in execution (text, data, stack, heap)
Process Control Block (PCB)
context switch
Process states (ready, blocked, running)
 Process Scheduling
Long term scheduling: Job, Batch
Short term scheduling: CPU scheduling
Medium term scheduling: Swap in/out

 Operations on Processes
Creation/terminate/abort
 IPC
Shared memory Approach
Processes share a memory mapping (see the sketch after this list)
shmget, shmat, shmdt, shmctl
Message passing (mailbox, ports)
Various message system calls
msgget, msgctl, msgsnd, msgrcv
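A hedged sketch of the System V shared-memory calls listed above; the key and segment size are arbitrary illustration values.

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void) {
    /* Create (or locate) a 4 KB shared segment keyed by an arbitrary value. */
    int shmid = shmget((key_t)0x1234, 4096, IPC_CREAT | 0666);
    if (shmid < 0) { perror("shmget"); return 1; }

    /* Attach the segment into this process's address space. */
    char *buf = shmat(shmid, NULL, 0);
    if (buf == (char *)-1) { perror("shmat"); return 1; }

    strcpy(buf, "hello via shared memory");  /* another process attaching the
                                                same key sees this data */

    shmdt(buf);                      /* detach the segment */
    shmctl(shmid, IPC_RMID, NULL);   /* remove the segment when done */
    return 0;
}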
Threads
 Processes vs Threads
Child processes vs. multiple threads
 Multithreading, multitasking
Concurrency, multiple CPUs
Threads share the process's PCB/address space, but each has its own
program counter, stack, register state, and thread ID.
Context switch (within the same process) has minimal overhead.

User/Kernel Threads: managed by the user-level library / by the kernel
Green threads, GNU Portable Threads
Windows, Linux, Mac OS X, Solaris
 Multithreading Models (user threads -> kernel threads)
Many-to-One: blocking problem (rare now)
Solaris Green Threads, GNU Portable Threads
One-to-One: more concurrency, overhead
Windows, Linux…
Many-to-Many: more concurrency
Windows
with the ThreadFiber package
Two-Level Model (M:M, Can be bounded)
 Thread Library (API)
Entirely in user space (local call)
OR
Entirely in kernel (system call)
 Three Main Thread Libraries (APIs)
POSIX Pthreads (API):
Either user or kernel library
Specification, not implementation
Common in Unix systems, Mac OS X
Windows Threads: Win32
Kernel level library
Java threads: managed by JVM
Implemented by the underlying OS
Windows: Win32 API
UNIX: POSIX Pthreads
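A minimal Pthreads sketch for the library calls above; the worker function and its argument are illustrative.

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg) {             /* runs concurrently with main */
    printf("thread %d running\n", *(int *)arg);
    return NULL;
}

int main(void) {
    pthread_t tid;
    int id = 1;
    pthread_create(&tid, NULL, worker, &id); /* spawn one thread */
    pthread_join(tid, NULL);                 /* wait for it to finish */
    return 0;
}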
 Scheduler Activations
Scheme for communication between the user thread library and the kernel.
The kernel makes an upcall to the thread library when the current
thread is about to block, so the library can schedule another user
thread to run (on a new LWP).
CPU Scheduling
 CPU Scheduling Basics
CPU utilization with multiprogramming
CPU and I/O cycle
 CPU Scheduler
CPU scheduling occurs when a process
state changes from:
Run to wait, run to terminate (nonpreemptive)
Run to ready, wait to ready (preemptive)
 CPU Dispatcher
Switching context
Starts the newly selected process
 CPU Scheduling Optimization Principle
CPU utilization
Throughput
Turnaround time
Waiting time
Response time
 CPU (Short-term) Scheduling Algorithms
FCFS
SJF: minimum average waiting time (worked example after this list)
Shortest-Remaining-Time First
Priority-based
RR (Quantum Size? Context Switch overhead)
Multi-Level Feedback Queue (feedback via aging, quantum expiry, I/O return)
I/O-bound processes favored over CPU-bound – why? how? (feedback, aging, I/O)
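A brief worked example for the FCFS and SJF entries above (burst times chosen for illustration): three processes arrive at the same time with CPU bursts of 24, 3, and 3 ms.
FCFS in arrival order gives waiting times 0, 24, and 27 ms, so the average is (0 + 24 + 27) / 3 = 17 ms.
SJF runs the two short bursts first, giving waiting times 0, 3, and 6 ms, so the average is (0 + 3 + 6) / 3 = 3 ms.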
 Multiple-Processor Scheduling
Asymmetric Multiprocessing, SMP(common)
Processor Affinity: Soft/Hard
Load Balancing
 Real-Time Scheduling
Soft Real-time
Hard Real-time
 POSIX Real-Time Scheduling
SCHED_FIFO or SCHED_RR can be specified, e.g.:
pthread_attr_setschedpolicy(&attr, SCHED_FIFO)
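A hedged expansion of the call above into a full attribute setup; the priority value is illustrative, and real-time policies usually require appropriate privileges.

#include <pthread.h>
#include <sched.h>

void *rt_task(void *arg) { return NULL; }   /* illustrative worker */

int start_fifo_thread(void) {
    pthread_t tid;
    pthread_attr_t attr;
    struct sched_param sp = { .sched_priority = 10 };   /* illustrative value */

    pthread_attr_init(&attr);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);     /* FIFO real-time class */
    pthread_attr_setschedparam(&attr, &sp);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED); /* use attr, not parent */

    return pthread_create(&tid, &attr, rt_task, NULL);
}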
 OS Scheduling Algorithm Examples
*Solaris Scheduling (Classes)
Interrupt threads: priority 160-169
Real-time threads: 100-159
System threads: 60-99
Timeshare and others: 0-59 (dispatch table)
*Windows Scheduling (Priority Classes)
A process can have the following classes:
REALTIME_PRIORITY_CLASS, HIGH_PRIORITY_CLASS …
A thread within a given priority class has a
relative priority:
TIME_CRITICAL, HIGHEST, ABOVE_NORMAL, …
*Linux Scheduling: O(1) scheduler, CFS (covered later)
 Priority-Inversion Problem and Priority Inheritance Solution (L, M1, M2, …, H priorities)
Disk Scheduling
 Cylinders, Tracks, Sectors
 Seek Time, Rotational Delay, Transfer
Time
 Disk Arm Movement (Seek Time)
To be minimized
 Elevator Algorithm
SCAN
C-SCAN (more uniform waiting time – returns to the beginning before continuing)
LOOK (more common; goes only as far as the last request)
C-LOOK
 Algorithm Selection
SSTF is common and natural
Elevator algorithm – better for heavily loaded systems
Can be influenced by the file-allocation method: contiguous allocation keeps requests nearby
SSTF or LOOK is usually the default (SSTF sketch below)
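A small C sketch of SSTF selection for illustration; the request queue and starting head position are arbitrary example values.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int req[] = {98, 183, 37, 122, 14, 124, 65, 67};   /* pending cylinders */
    int n = 8, head = 53, moved = 0;
    int done[8] = {0};

    for (int served = 0; served < n; served++) {
        int best = -1, bestdist = 0;
        for (int i = 0; i < n; i++) {          /* pick the nearest pending request */
            if (done[i]) continue;
            int d = abs(req[i] - head);
            if (best < 0 || d < bestdist) { best = i; bestdist = d; }
        }
        moved += bestdist;
        head = req[best];
        done[best] = 1;
        printf("serve cylinder %d\n", head);
    }
    printf("total head movement: %d\n", moved);  /* 236 for these example values */
    return 0;
}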
Process (Thread) Synchronization
 Why and What ?
Parallelism/concurrency and IPC
Synchronization, Coordination
Mutual exclusion
Cooperation
 Race condition
Producer/Consumer problem
 Critical-Section (shared access)
 Solution to Critical Section Problem
Mutually Exclusive
Progress (a process cannot be delayed indefinitely if no other process is in its CS)
Bounded Waiting (a bound on how many times others may enter before a waiting process does)
 Some Software Solutions (difficulties)
Peterson’s Solution
turn and flag
Two Processes? All Conditions Met?
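A sketch of Peterson's two-process solution using the turn and flag variables; note that on modern hardware it also needs memory-ordering guarantees (volatile alone is not strictly sufficient).

/* Shared state for two processes/threads, id 0 and 1. */
volatile int flag[2] = {0, 0};   /* flag[i]: process i wants to enter */
volatile int turn = 0;           /* whose turn it is to defer to */

void enter_cs(int i) {
    int other = 1 - i;
    flag[i] = 1;                 /* announce intent */
    turn = other;                /* give priority to the other process */
    while (flag[other] && turn == other)
        ;                        /* busy-wait while the other is interested
                                    and it is the other's turn */
}

void exit_cs(int i) {
    flag[i] = 0;                 /* no longer interested */
}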
 The key is: interleaving due to
Interrupts
Preemption
Only guarantee: atomicity at the machine-instruction level
 Hardware/Architecture Solution
Uniprocessors: disable interrupts? (not usable at user level)
Does interrupt disabling scale to multiprocessor hardware?
Atomic hardware instructions (test-and-set, swap)
 Locks
Implemented using hardware instructions
Non-busy-waiting/busy-wait
Mutex, spinlock
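A minimal spinlock sketch built on an atomic test-and-set, shown here with C11 atomics as one possible realization.

#include <stdatomic.h>

atomic_flag lock = ATOMIC_FLAG_INIT;

void acquire(void) {
    /* test-and-set returns the previous value; busy-wait until it was clear */
    while (atomic_flag_test_and_set(&lock))
        ;
}

void release(void) {
    atomic_flag_clear(&lock);
}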
 Semaphores
What is it? P/V, down/up, wait()/signal()
A synchronization tool by Dijkstra for mutual exclusion and process cooperation
Implemented with lower-level primitives, i.e., machine instructions
 Binary/Counting Semaphores
Purpose?
Differences?
 Potential Deadlock/Starvation –
Semaphores, Example?
 Bounded Buffer Problem –
Traditional Producer/Consumer problem
(Binary/counting semaphores used? See the sketch below.)
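A hedged bounded-buffer sketch with POSIX semaphores; BUF_SIZE and the item type are illustrative. A counting semaphore tracks free slots, another tracks filled slots, and a mutex protects the buffer indices.

#include <pthread.h>
#include <semaphore.h>

#define BUF_SIZE 8                 /* illustrative capacity */
int buffer[BUF_SIZE];
int in = 0, out = 0;

sem_t empty_slots;                 /* counting: sem_init(&empty_slots, 0, BUF_SIZE) */
sem_t full_slots;                  /* counting: sem_init(&full_slots, 0, 0) */
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;   /* binary mutual exclusion */

void producer(int item) {
    sem_wait(&empty_slots);        /* block if no free slot */
    pthread_mutex_lock(&mutex);
    buffer[in] = item;
    in = (in + 1) % BUF_SIZE;
    pthread_mutex_unlock(&mutex);
    sem_post(&full_slots);         /* signal that an item is available */
}

int consumer(void) {
    int item;
    sem_wait(&full_slots);         /* block if the buffer is empty */
    pthread_mutex_lock(&mutex);
    item = buffer[out];
    out = (out + 1) % BUF_SIZE;
    pthread_mutex_unlock(&mutex);
    sem_post(&empty_slots);        /* signal that a slot is free */
    return item;
}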
 Readers and Writers Problem
Examples, Readers favored?
Binary/counting semaphores used?
 Condition Variables, Monitors
 Pthread Examples
Mutex locks
pthread_mutex_init/lock/unlock
Condition Variables
pthread_cond_init/signal/wait
Read-Write Locks
pthread_rwlock_init/rdlock/wrlock
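A minimal sketch of the mutex and condition-variable calls listed above; the ready flag is an illustrative piece of shared state.

#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
int ready = 0;                       /* shared state protected by m */

void waiter(void) {
    pthread_mutex_lock(&m);
    while (!ready)                   /* re-check the condition after every wakeup */
        pthread_cond_wait(&cv, &m);  /* atomically releases m while waiting */
    /* ... consume the state ... */
    pthread_mutex_unlock(&m);
}

void notifier(void) {
    pthread_mutex_lock(&m);
    ready = 1;
    pthread_cond_signal(&cv);        /* wake one waiter */
    pthread_mutex_unlock(&m);
}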
 Solaris, Windows, Linux Examples
All Synchronization tools are used
Solaris:
Adaptive mutex
Windows: Spinlocks (kernel),
Dispatcher objects (Executive)
Linux:
Sequential locks (seqlocks)
Deadlocks
 Four Necessary Conditions
Mutual Exclusion
Hold and Wait
No Preemption
Circular wait
 Detection and recovery
Detection
Cycle detection in the wait-for graph (single-instance resources)
For multiple-instance resources: simulate allocation and check whether all processes can complete
Recovery
Kill/abort processes in the cycle
 Prevention
Violate any of the necessary conditions
Mutual Exclusion > use sharable resources
Hold and Wait > all-or-none allocation
No Preemption > preempt, abort (transactions, DB)
Circular Wait > resource ordering
 Avoidance
Safe state; safety algorithm
Banker’s algorithm – uses the safety algorithm (sketch below)
Example: PPTs (7.27-7.33)
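A sketch of the safety check at the heart of the Banker's algorithm; NPROC/NRES and the matrices are placeholders for the system's current allocation state.

#include <stdbool.h>

#define NPROC 5
#define NRES  3

/* Returns true if the state (Allocation, Need, Available) is safe. */
bool is_safe(int alloc[NPROC][NRES], int need[NPROC][NRES], int avail[NRES]) {
    int work[NRES];
    bool finished[NPROC] = {false};
    for (int r = 0; r < NRES; r++) work[r] = avail[r];

    for (int count = 0; count < NPROC; ) {
        bool progress = false;
        for (int p = 0; p < NPROC; p++) {
            if (finished[p]) continue;
            bool can_run = true;                 /* Need[p] <= Work ? */
            for (int r = 0; r < NRES; r++)
                if (need[p][r] > work[r]) { can_run = false; break; }
            if (can_run) {                       /* assume p runs to completion */
                for (int r = 0; r < NRES; r++) work[r] += alloc[p][r];
                finished[p] = true;
                progress = true;
                count++;
            }
        }
        if (!progress) return false;             /* no process can finish: unsafe */
    }
    return true;                                 /* all can finish in some order */
}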
Memory Management
 Memory Allocation and Relocation Problems
Contiguous Allocation -> Fragmentation (external), Compaction?
 Paging and Page Tables Mapping
Fixed-size blocks called pages are mapped to page frames
No external fragmentation
(but minor internal fragmentation in the last page frame)
Table mapping -> effective memory access time? (worked example below)
TLB speeds this up given a good hit ratio – program locality
Why multilevel (hierarchical) page tables? – to avoid large contiguous page tables
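A hedged worked example for the effective-access-time question above (timings and hit ratio are illustrative): assume a 10 ns TLB lookup, a 100 ns memory access, and a 90% TLB hit ratio.
EAT = 0.9 × (10 + 100) + 0.1 × (10 + 100 + 100) = 99 + 21 = 120 ns
Without a TLB, every reference would take two memory accesses (page table + data), i.e. 200 ns.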
 Segmentation and paging
User (logical) view
Same as paging with internal Fragmentation
MULTICS, Intel Pentium
 Demand Paging (vs. Prepaging)
Pages loaded only as they are needed
 Page Faults and Handling
 Process Creation and Copy-on-Write
 Page Replacement (Policies)
FIFO (Belady’s Anomaly – more page faults even with more memory)
Optimal (Not realistic)
LRU (Processing Overhead: clock counter, linked list)
2nd Chance (clock): approximates LRU
If the reference bit is 0, replace it
Else if the reference bit is 1, give this page 2nd chance and
move onto next page; reset reference bit to 0
If a page is used often enough to keep its reference bit set, it will
never be replaced.
Implementation:
The clock algorithm using a circular queue
A pointer (hand on a clock) indicates which page is to be
replaced next
When a frame is needed, the pointer advances, finds a page
with a ref bit 0
As it advances, it clears the reference bits (2nd chance)
Once a victim page is found, the page is replaced, and the new
page is inserted in the circular queue in that position
It degenerates to FIFO replacement if all bits are set
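A compact C sketch of the clock implementation described above; the frame count and data layout are illustrative.

#define NFRAMES 64                      /* illustrative number of frames */

int ref_bit[NFRAMES];                   /* set by hardware on each access */
int page_in_frame[NFRAMES];             /* which page occupies each frame */
static int hand = 0;                    /* the clock hand */

/* Choose a victim frame and install new_page there (second-chance/clock). */
int replace_page(int new_page) {
    for (;;) {
        if (ref_bit[hand] == 0) {       /* not recently used: victim found */
            int victim = hand;
            page_in_frame[victim] = new_page;
            ref_bit[victim] = 1;        /* the newly loaded page counts as referenced */
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        ref_bit[hand] = 0;              /* give a second chance: clear the bit */
        hand = (hand + 1) % NFRAMES;    /* advance the hand */
    }
}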
Enhanced 2nd Chance (in addition to ref. bit - modified bit)
<0,0> neither recently referenced nor modified – best choice to replace
<0,1> not recently referenced but modified – will need to be written out
<1,0> recently referenced but clean – likely to be used again
<1,1> recently referenced and modified – likely to be used again and will need to be written
There are three steps (up to 4 loops) through the circular buffer:
(1) Cycle through and look for <0,0>. If one is found, use that page.
(2) Cycle through and look for <0,1>, setting the reference bit to zero for all
frames bypassed. Afterwards, <1,0> -> <0,0> and <1,1> -> <0,1>.
(3) If step 2 failed, all reference bits are now zero, and repeating steps 1 and 2
is guaranteed to find a frame for replacement.
 Allocation of Frames (by size, by priority, ...)
 Thrashing (excessive paging) – Min. memory needed
 Working Set, Working Set Size, Locality (in memory)
 Memory-Mapped Files: mmap() call
File I/O as memory-mapped (efficient)
Sharing
 Allocating Kernel Memory
Often allocated from a free-memory pool (contiguous)
Kernel data structures vary in size (often less than a page)
Minimize internal fragmentation; kernel memory is usually not subject to the paging system
Contiguous allocation is often required (e.g., for device I/O into kernel buffers)
Implementation: Buddy System, Slab Allocation (see Linux)
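An illustrative buddy-system walk-through (sizes chosen for illustration): a 21 KB request against a 256 KB region splits 256 KB into two 128 KB buddies, one 128 KB into two 64 KB, and one 64 KB into two 32 KB; a single 32 KB buddy is allocated (internal fragmentation of 32 - 21 = 11 KB). When it is freed, the buddies can coalesce back up to 256 KB.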
 Other Considerations
Page Size Selection
Table space
Fragmentation
I/O time
I/O interlock (lock page when in I/O)
 Operating System Examples
Windows
Demand paging with clustering
Replacement policy with working set Min/Max.
LRU-style replacement with working-set trimming (when WS-Max is exceeded or under
system memory pressure)
The VM manager periodically makes a pass through the working set of each process
and increments the age of pages that have not been marked as referenced in the
PTE since the last pass. This LRU heuristic decides which old pages to remove
during working-set trimming.
Solaris
Demand paging with a modified (two-handed) clock algorithm
Parameters: lotsfree (start paging), desfree, minfree (swap)
File-System Essentials
 Virtual File Systems (Layered File System)
 In-Memory File System Structure
Open and Read (inode in memory)
 Design Criteria of Allocation Methods
Contiguous Allocation
Simple, Random Access, Fragmentation (external), Files cannot grow
Linked Allocation (FAT)
Simple, No Random Access, No Fragmentation
Indexed Allocation
Random Access, No Fragmentation, Extra space for the index table
Combined Scheme (Unix)
Performance Considerations
File usually accessed sequentially and small? Contiguous
File usually accessed sequentially and large? Linked
File usually accessed randomly and large? Indexed
 Implementation of Free-Space management
Bit Vector
Extra space for the bit map; easy to find contiguous blocks for a file
Protect bit map on disk
Access in memory – Consistency Problem
Linked List
No waste of space, but cannot get contiguous space easily
Protect pointer to free list
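A small C sketch of bit-vector free-space management; the block count and word size are illustrative.

#include <stdint.h>

#define NBLOCKS 1024                       /* illustrative disk size in blocks */
uint32_t free_map[NBLOCKS / 32];           /* 1 bit per block: 1 = free
                                              (set all bits at format time) */

/* Find and claim the first free block; returns -1 if no block is free. */
int alloc_block(void) {
    for (int w = 0; w < NBLOCKS / 32; w++) {
        if (free_map[w] == 0) continue;    /* whole word allocated: skip quickly */
        for (int b = 0; b < 32; b++) {
            if (free_map[w] & (1u << b)) {
                free_map[w] &= ~(1u << b); /* mark as allocated */
                return w * 32 + b;         /* block number */
            }
        }
    }
    return -1;
}

void free_block(int n) {
    free_map[n / 32] |= 1u << (n % 32);    /* mark as free again */
}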
 Efficiency and Performance of File-system Design
Efficiency Issues (disk space)
Disk allocation and directory algorithms
Types of data kept in file’s directory entry
Performance Issues
Even after the basic file-system algorithms have been selected,
we can still improve performance in several ways:
Disk Cache (memory for frequently used blocks)
Free-behind and Read-ahead to optimize sequential access
Memory as virtual disk or RAM disk