The UNIX Time-Sharing System Landon Cox February 10, 2016

Multics
• Multi-user operating system
• Primary goal was to allow efficient, safe sharing between users
• Central data abstraction in Multics
• A segment
• All data was contained within a segment
• No distinction between files and memory
• Accessed through loads/stores in memory
• Think of a segment as an mmapped region of memory
Unix
• Also a multi-user operating system
• In many ways a response to the complexity of Multics
• Primary goals were “simplicity, elegance, and ease of use”
• What is the central data abstraction in Unix?
• A file
• As in Multics, hierarchical namespace
• Mapped human-readable names to data objects
• Three kinds of files
• Ordinary files
• Directories
• “Special files”
Files in Unix
• How are files read and written?
• Via explicit read/write system calls
• Requires passing a buffer between process and kernel
• In what way is this better than Multics segments?
• Much narrower interface
• Don’t have to worry about stray loads/stores
• Clean separation of ephemeral and persistent state
• What is the downside compared to segments?
• Requires extra copying
• Kernel makes a copy of the buffer in its own address space
Data-sharing tradeoffs
Share by reference (favors efficiency)
• One copy of shared data
• Only copy reference
• Changes to the shared data are global
• Corruption visible to all
Share by value (favors protection)
• Spend time creating copies
• Spend memory holding copies
• Changes to copies are local
• Corruption can be contained
Data-sharing tradeoffs
How to share by reference, value?
int P(int a){…}
void C(int x){
    int y=P(x);
}
[Spectrum: share by reference (efficiency) to share by value (protection)]
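As a minimal sketch of the two modes in C: the function names `bump_by_value` and `bump_by_reference` are hypothetical and are not from the slides. The first receives a copy of its argument, so the caller's variable cannot be changed. The second receives an address, so it can change the caller's data.

```c
#include <assert.h>
#include <stddef.h>

/* Share by value: the callee works on its own copy of the argument. */
int bump_by_value(int a) {
    a = a + 1;          /* changes only the local copy */
    return a;
}

/* Share by reference: the callee receives the address of the caller's data. */
void bump_by_reference(int *a) {
    assert(a != NULL);
    *a = *a + 1;        /* changes the caller's variable */
}
```

Calling `bump_by_value(x)` leaves `x` untouched, while `bump_by_reference(&x)` modifies it. The second form avoids a copy but exposes the caller's state to the callee.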
Data-sharing tradeoffs
What was the default sharing mode for Multics?
Share by reference (via segments)
[Spectrum: share by reference (efficiency) to share by value (protection)]
Data-sharing tradeoffs
Unix’s approach is very different
By default, share by value
Support share by reference when needed
[Spectrum: share by reference (efficiency) to share by value (protection)]
UNIX philosophy
• OS by programmers for programmers
• Support high-level languages (C and scripting)
• Make interactivity a first-order concern (via shell)
• Allow rapid prototyping
• How should you program for a UNIX system?
• Write programs with limited features
• Do one thing and do it well
• Support easy composition of programs
• Make data easy to understand
• Store data in plaintext (not binary formats)
• Communicate via text streams
[Photo: Thompson and Ritchie, Turing Award ’83]
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by "?"]
What is the core abstraction?
Communication via files
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
What is the interface?
Open: get a file reference (descriptor)
Read/Write: get/put data
Close: stop communicating
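The open/read/write/close interface can be sketched with the POSIX calls of the same names. This is a minimal illustration, not code from the lecture: the helper names `write_text` and `count_bytes` are hypothetical. Each `read` or `write` passes a user-space buffer to the kernel, which is where the extra copy comes from.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write text to path through a file descriptor; returns 0 on success. */
int write_text(const char *path, const char *text) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);  /* Open */
    if (fd < 0) return -1;
    size_t len = strlen(text), done = 0;
    while (done < len) {
        ssize_t n = write(fd, text + done, len - done);       /* Write */
        if (n < 0) { close(fd); return -1; }
        done += (size_t)n;
    }
    return close(fd);                                          /* Close */
}

/* Count the bytes in a file by reading it into a user-space buffer. */
long count_bytes(const char *path) {
    char buf[4096];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    /* Each read copies up to sizeof buf bytes from the kernel into buf. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}
```

The process never touches the file's storage directly. It only names a descriptor and a buffer, which is what keeps the interface narrow.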
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
Why is this safer than procedure calls?
Interface is narrower
Access file in a few well-defined ways
Kernel ensures things run smoothly
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
How do we transfer control to kernel?
Special system call instruction
CPU pauses process, runs kernel
Kernel schedules other process
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
Key insight:
Interface can be used for lots of things
Persistent storage (i.e., “real” files)
Devices, temporary channels (i.e., pipes)
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
Two questions
(1) How do processes start running?
(2) How do we control access to files?
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
Two questions
(1) How do processes start running?
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
Maybe P is already running?
Could just rely on kernel to start processes
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
What might we call such a process?
Basically what a server is
Process C wants to talk to a process that someone else launched
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
Not all processes should be servers
Want to launch processes on demand
C needs primitives to create P
UNIX shell
[Diagram: Shell running on top of the Kernel]
Program that runs other programs
Interactive (accepts user commands)
Essentially just a line interpreter
Allows easy composition of programs
UNIX shell
• How does a UNIX process interact with a user?
• Via standard in (fd 0) and standard out (fd 1)
• These are the default input and output for a program
• Establishes well-known data entry and exit points for a program
• How do UNIX processes communicate with each other?
• Mostly communicate with each other via pipes
• Pipes allow programs to be chained together
• Shell and OS can connect one process’s stdout to another’s stdin
• Why do we need pipes when we have files?
• Pipes create unnamed temporary buffers between processes
• Communication between programs is often ephemeral
• OS knows to garbage collect resources associated with pipe on exit
• Consistent with UNIX philosophy of simplifying programmers’ lives
UNIX shell
• Pipes simplify naming
• Program always receives input on fd 0
• Program always emits output on fd 1
• Program doesn’t care what is on the other end of fd
• Shell/OS handle input/output connections
• How do pipes simplify synchronization?
• Pipe accessed via read system call
• Read can block in kernel until data is ready
• Or can poll, checking to see if read returns enough data
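The blocking behavior described above can be sketched with `pipe`, `fork`, and `read`. This is an illustrative example under stated assumptions: the helper name `pipe_roundtrip` is hypothetical, and the message is assumed to be short enough to fit in the pipe in one write. The parent's `read` simply blocks in the kernel until the child has written data or closed its end, so no extra synchronization code is needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child writes msg into a pipe; parent reads it into out (NUL-terminated).
   Returns the number of bytes received, or -1 on error. */
long pipe_roundtrip(const char *msg, char *out, size_t cap) {
    int fds[2];                       /* fds[0] = read end, fds[1] = write end */
    assert(cap > 0);
    if (pipe(fds) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (pid == 0) {                   /* child: the writer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w < 0 ? 1 : 0);         /* exiting closes the write end */
    }

    close(fds[1]);                    /* parent only reads; needed to see EOF */
    size_t total = 0;
    while (total < cap - 1) {
        /* Blocks until data is available or every write end is closed. */
        ssize_t n = read(fds[0], out + total, cap - 1 - total);
        if (n <= 0) break;
        total += (size_t)n;
    }
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (long)total;
}
```

The pipe has no name in the file system, and the kernel reclaims it once both ends are closed.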
How kernel starts a process
1. Allocates process control block (bookkeeping data structure)
2. Reads program code from disk
3. Stores program code in memory (could be demand-loaded too)
4. Initializes machine registers for new process
5. Initializes translator data for new address space
• E.g., page table and PTBR
• Virtual addresses of code segment point to correct physical locations
6. Sets processor mode bit to “user”
7. Jumps to start of program
Need hardware support
Creating processes
• Through what commands does UNIX create processes?
• Fork: create a child process that is a copy of the parent
• Exec: initialize address space with new program
• What’s the problem with creating an exact copy of a process?
• Child needs to do something different than parent
• i.e., child needs to know that it is the child
• How does child know it is child?
• Pass in return point
• Parent returns from fork call, child jumps into other region of code
• Fork works slightly differently now
Fork
• Child can’t be an exact copy
• Is distinguished by one variable (the return value of fork)
if (fork() == 0) {
    /* child: execute new program */
} else {
    /* parent: carry on */
}
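A slightly fuller sketch of the fork/exec pattern, assuming a hypothetical helper `run_program` that is not from the slides. The return value of `fork` tells each process which role it has: the child replaces its address space with a new program via `execvp`, and the parent waits and collects the exit status.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with arguments argv (NULL-terminated) in a child process.
   Returns the child's exit status, or -1 if it could not be obtained. */
int run_program(char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;            /* fork failed: no child was created */
    if (pid == 0) {
        /* child: fork returned 0 */
        execvp(argv[0], argv);         /* returns only if exec fails */
        _exit(127);
    }
    /* parent: fork returned the child's pid */
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Using 127 as the exit code when `execvp` fails follows the shell convention for "command not found". It is a choice made for this sketch, not something the slides specify.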
Creating processes
• Why make a complete copy of parent?
• Sometimes you want a copy of the parent
• Separating fork/exec provides flexibility
• Allows child to inherit some kernel state
• E.g., open files, stdin, stdout
• Very useful for shell
• How do we efficiently copy an address space?
• Use “copy on write”
• Make copy of page table, set pages to read-only
• Only make physical copies of pages on write fault
Copy on write
[Diagram: physical memory, parent memory, child memory]
What happens if parent writes to a page?
Copy on write
[Diagram: physical memory, parent memory, child memory]
Have to create a copy of pre-write page for the child.
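Copy-on-write itself happens inside the kernel, but its visible effect can be sketched from user space. In this example (the function name `child_write_is_private` is hypothetical), the child writes to a variable after `fork`, and the parent checks that its own copy is unchanged. Whether the kernel actually used copy-on-write or an eager copy is an implementation detail that this code cannot observe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a write made by the child is invisible to the parent,
   0 if it is visible, and -1 on error. */
int child_write_is_private(void) {
    static int value;
    value = 1;                         /* same contents in both processes at fork */

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        value = 2;                     /* write goes to the child's private copy */
        _exit(value == 2 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    int child_saw_write = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return child_saw_write && value == 1;   /* parent still sees the old value */
}
```

With copy-on-write, the physical copy of the page holding `value` is made only when one side writes to it. Pages that neither process modifies are never copied.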
Alternative approach
• Windows CreateProcess
• Combines the work of fork and exec
• UNIX’s approach
• Supports arbitrary sharing between parent and child
• Windows’ approach
• Supports sharing of most common data via params
Shells (bash, explorer, finder)
• Shells are normal programs
• Though they look like part of the OS
• How would you write one?
while (1) {
    print prompt (“crocus% ”)
    ask for input (cin)                 // e.g., “ls /tmp”
    first word of input is command      // e.g., ls
    fork a copy of the current process (shell)
    if (child) {
        redirect output to a file if requested (or a pipe)
        exec new program (e.g., with argument “/tmp”)
    } else {
        wait for child to finish
        or can run child in background and ask for another command
    }
}
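The body of that loop can be sketched in C as a single function. This is a simplified illustration with assumptions stated up front: `run_line` and `MAX_ARGS` are hypothetical names, tokens are split on whitespace only (no quoting), and only a single `>` output redirection is recognized. Pipes and background jobs are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 16   /* arbitrary limit chosen for this sketch */

/* Run one command line such as "ls /tmp" or "echo hi > out.txt".
   Modifies line in place. Returns the child's exit status, or -1 on error. */
int run_line(char *line) {
    char *argv[MAX_ARGS + 1];
    char *outfile = NULL;
    int argc = 0;

    /* First word is the command; a ">" token names an output file. */
    for (char *tok = strtok(line, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
        if (strcmp(tok, ">") == 0) {
            outfile = strtok(NULL, " \t\n");
            if (outfile == NULL) return -1;
        } else if (argc < MAX_ARGS) {
            argv[argc++] = tok;
        }
    }
    if (argc == 0) return -1;
    argv[argc] = NULL;

    pid_t pid = fork();                /* fork a copy of the shell */
    if (pid < 0) return -1;
    if (pid == 0) {
        if (outfile != NULL) {         /* child: redirect stdout (fd 1) */
            int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0) _exit(126);
            close(fd);
        }
        execvp(argv[0], argv);         /* exec new program */
        _exit(127);                    /* reached only if exec fails */
    }

    int status;                        /* parent: wait for child to finish */
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

An interactive shell would call `run_line` inside a loop that prints a prompt and reads a line, for example with `fgets`. Because the redirection happens in the child between `fork` and `exec`, the launched program needs no knowledge of where its output goes.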
UNIX philosophy
[Diagram: Process C and Process P, connected through the Kernel by a File]
Two questions
(1) How do processes start running?
(2) How do we control access to files?
Access control
• Where is most trusted code located?
• In the operating system kernel
• What are the primary responsibilities of a UNIX kernel?
• Managing the file system
• Launching/scheduling processes
• Managing memory
• How do processes invoke the kernel?
• Via system calls
• Hardware shepherds transition from user process to kernel
• Processor knows when it is running kernel code
• Represents this through protection rings or mode bit
Access control
• How does kernel know if system call is allowed?
• Looks at user id (uid) of process making the call
• Looks at resources accessed by call (e.g., file or pipe)
• Checks access-control policy associated with resource
• Decides if policy allows uid to access resources
• How is a uid normally assigned to a process?
• On fork, child inherits parent’s uid
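The check described above can be sketched as a small decision function. This is a deliberately simplified model, not the real UNIX permission algorithm: the `struct resource` layout, the `PERM_*` constants, and the name `access_allowed` are all hypothetical, and groups and the superuser are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical permission bits for this sketch. */
enum { PERM_R = 4, PERM_W = 2, PERM_X = 1 };

/* Hypothetical per-resource policy: an owner plus two sets of permissions. */
struct resource {
    unsigned owner_uid;
    unsigned owner_perms;   /* bits granted to the owner */
    unsigned other_perms;   /* bits granted to every other uid */
};

/* Returns 1 if the policy on r lets uid perform every operation in wanted. */
int access_allowed(unsigned uid, const struct resource *r, unsigned wanted) {
    assert(r != NULL);
    unsigned granted = (uid == r->owner_uid) ? r->owner_perms : r->other_perms;
    return (granted & wanted) == wanted;
}
```

For example, with a file owned by uid 1000 whose policy is `{1000, PERM_R | PERM_W, PERM_R}`, uid 1000 may write but uid 1001 may only read.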
MOO accounting problem
• Multi-player game called Moo
• Want to maintain high score in a file
• Should players be able to update score?
• Yes
• Do we trust users to write file directly?
• No, they could lie about their score
[Diagram: game clients (uid x) and (uid y) write “x’s score = 10” and “y’s score = 11” directly to the high score file]
MOO accounting problem
• Multi-player game called Moo
• Want to maintain high score in a file
• Could have a trusted process update scores
• Is this good enough?
[Diagram: game clients (uid x) and (uid y) send “x’s score = 10” and “y’s score = 11” to a game server, which writes “x:10 y:11” to the high score file]
MOO accounting problem
• Multi-player game called Moo
• Want to maintain high score in a file
• Could have a trusted process update scores
• Is this good enough?
• Can’t be sure that reported score is genuine
• Need to ensure score was computed correctly
[Diagram: game client (uid x) sends “x’s score = 100” and game client (uid y) sends “y’s score = 11” to the game server, which writes “x:100 y:11” to the high score file]
Access control
• Sometimes simple inheritance of uids is insufficient
• Tasks involving management of “user id” state
• Logging in (login)
• Changing passwords (passwd)
• Where have we put management code before?
• Put it in the kernel (e.g., file system and page table code)
• Why not put login, passwd, etc. inside the kernel?
• This functionality doesn’t really require interaction with hardware
• Would like to keep kernel as small as possible
• How are “trusted” user-space processes identified?
• Run as super user or root (uid 0)
• Like a software kernel mode
• If a process runs under uid 0, then it has more privileges
Access control
• Why does login need to run as root?
• Needs to check username/password correctness
• Needs to fork/exec process under another uid
• Why does passwd need to run as root?
• Needs to modify password database (file)
• Database is shared by all users
• What makes passwd particularly tricky?
• Easy to allow process to shed privileges (e.g., login)
• passwd requires an escalation of privileges
• How does UNIX handle this?
• Executable files can have their setuid bit set
• If setuid bit is set, process inherits uid of image file’s owner on exec
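The setuid rule in the last bullet can be written down as a one-line decision. This is a sketch of the rule only, under the assumption that nothing else affects the outcome. Real kernels also consider details such as mount options and track real and effective uids separately. The function name `euid_after_exec` is hypothetical, while `S_ISUID` is the standard mode bit from `<sys/stat.h>`.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Uid a process would run under after exec'ing an image file, given the
   caller's uid and the image file's owner and mode bits. */
uid_t euid_after_exec(uid_t caller_uid, uid_t image_owner, mode_t image_mode) {
    /* setuid bit set: take the file owner's uid; otherwise keep the caller's */
    return (image_mode & S_ISUID) ? image_owner : caller_uid;
}
```

For passwd, the image is owned by root and has the setuid bit set, so an ordinary user who runs it gets a process with uid 0 that is allowed to modify the password database.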
MOO accounting problem
• Multi-player game called Moo
• Want to maintain high score in a file
• How does setuid solve our problem?
• Game executable is owned by trusted entity
• Game cannot be modified by normal users
• Users can run executable though
• High-score file is also owned by trusted entity
• This is a form of trustworthy computing
• Only trusted code can update score
• Root ownership ensures code integrity
• Untrusted users can invoke trusted code
[Diagram: shells (uid x) and (uid y) each “fork/exec game”; the resulting game clients run as (uid moo) and write “x’s score = 10” and “y’s score = 11” to the high score file (uid moo)]
Summary of UNIX
• Share-by-copy is easier for programmers
• Everything looks like a file
• Standardize interface (open, read/write, close)
• Standardize entry/exit points (stdin, stdout)
• Read in copy, work on copy, copy out results
• Try to make share-by-copy more efficient
• Use copy-on-write whenever possible
• Next time
• Sharing across machines (RPC, code offload)