Innovations in teaching OS
concepts using native NT
Arkady Retik
Program Manager
Source Asset
Management
Dave Probert
Architect
Windows Core
Kernel
Microsoft Corporation
Windows Academic Shared Source Program
Integrate Windows internals into
Operating Systems courses
Give students more real-world illustrations of the principles being taught
Achieve a better concept-to-effort
ratio for OS projects
Include examples from
the Windows kernel source code
Agenda
Program Overview
Windows OS Internals Curriculum
Resource Kit
ProjectOZ
Windows Research Kernel
Q&A
Working below ground in Windows
[Diagram: layered view of Windows. Labels, top to bottom: WEB, GUI, Applications, LOB, Services, Middleware; WinFX, Win32, POSIX; System Runtime Libraries; System Services; Net Interfaces, Protocol Stacks, Devices, File Systems; System Call Interface; I/O mgr, Processes, Object mgr, Data cache, Threads, Registry, InterProcess, Virtual Memory, Scheduler, Synchr, Security, Interrupts. Callouts: ProjectOZ, Lecture Materials, Textbooks, SOURCE.]
Partnership with Higher Education
We believe Microsoft technologies are
important to Computer Science education
ubiquitous
empowering
scalable
innovative
customer-driven features
We know Computer Science Education
is important to Microsoft
a source of the human and intellectual resources
that drive our industry
quality of education determines technical capabilities of
our customers
our partners
our employees
Windows Academic Shared Source Program
CRK: Windows Operating Systems Internals Curriculum Resource Kit (CRK). Presentation slides, experiments, labs, quizzes and assignments for introducing case studies from the Windows kernel into operating system courses. Available now.
WRK: Windows Research Kernel. The core kernel sources and binaries integrated with an environment for building and testing experimental versions of the Windows kernel for use in teaching and research. Available soon.
ProjectOZ: an operating systems project environment that uses the native kernel interfaces of Windows to provide simple, clean, user-mode abstractions of the CPU, MMU, trap mechanism, and physical memory that can be used to perform experiments in operating systems principles. Pilots this year.
CRK
CRK Authors
industry
Mark Russinovich is chief software architect and cofounder of Winternals
Software (www.winternals.com), a company that specializes in advanced
systems software for Microsoft Windows. Mark is co-author of Inside
Windows 2000, 3rd Edition (Microsoft Press) with David Solomon
and its successor, Windows Internals, 4th Edition (Microsoft Press).
Mark is a Microsoft Most Valuable Professional (MVP) and serves as
senior contributing editor for Windows IT Pro magazine where he
contributes to the Windows Power Tools column. He is also a frequent
speaker at major industry conferences such as Microsoft Tech Ed, IT
Forum, Windows IT Pro Magazine's Connections and Redmond
Magazine's TechMentor.
Mark has a B.S. from Carnegie Mellon University and a M.S. from
Rensselaer Polytechnic Institute, both in computer engineering. In 1994,
he earned a Ph.D. from Carnegie Mellon University, also in computer
engineering.
David Solomon (www.solsem.com) teaches classes on Windows kernel
internals to developers and IT professionals at companies worldwide,
including Microsoft. He is the co-author of Windows Internals, 4th edition,
the official Microsoft Press book on Windows kernel internals, as well as
the previous edition, Inside Windows 2000. David also wrote Inside
Windows NT, 2nd edition, and Windows NT for OpenVMS Professionals.
He also co-created the Windows Internals COMPLETE video series which
Microsoft licensed for worldwide internal training. David has served as
technical chair for three past Windows NT conferences and has spoken at
many TechEds and PDCs. He was a recipient of the 1993 & 2005 Microsoft
Support Most Valuable Professional (MVP) award.
academia
Andreas Polze is the Operating Systems and Middleware Professor at the Hasso-Plattner-Institute for Software Engineering at University Potsdam, Germany. He
received a doctoral degree from Freie University Berlin, Germany, in 1994 and a
habilitation degree from Humboldt University Berlin in 2001, both in computer
science. His habilitation thesis investigates Predictable Computing in Multicomputer Systems. Current research interests include Interconnecting Middleware and
Embedded Systems, Mobility and Adaptive System Configuration, and End-to-End
Service Availability for standard middleware platforms.
At University Potsdam, his current teaching activities focus on architecture of
operating systems, on component-based middleware, as well as on predictable
distributed computing. Our curriculum includes lectures that discuss operating system
issues based on standard platforms (Windows 2000/XP, Mac OS X (BSD Unix), and
Solaris) as well as on embedded systems (Windows CE, Embedded Linux).
Prof. Polze was a visiting scientist with the Dynamic Systems Unit at Software
Engineering Institute, at Carnegie Mellon University, Pittsburgh, USA, where he worked
on real-time computing on standard middleware (CORBA), and with the Real-Time
Systems Laboratory at University of Illinois, Urbana-Champaign.
What about CRK content?
• covers all OS BOK units and more (based on Windows
XP/Server 2003)
• scalable to multiple levels
• modular (can be used in whole / in part)
• case studies / compare & contrast
• Basic module provides materials to incorporate into a
complete basic level OS course of one semester in length. The
module covers the Windows OS specific topics in the core and
elective units of the OS BOK of Computing Curricula 2001.
• Advanced module provides materials to incorporate into an
advanced level OS course of one semester in length. The
module covers the Windows OS specific topics in the core and
elective units of the “CC2001” OS BOK as well as three
supplementary units.
What OS topics does the CRK cover?
a. Core topics (available now)
OS1. Overview of operating systems
OS2. Operating system principles
OS3. Concurrency
OS4. Scheduling and dispatch
OS5. Memory management
b. Elective topics
OS6. Device management
OS7. Security and protection (available now)
OS8. File systems
OS9. Real-time and embedded systems
OS10. Fault tolerance
OS11. System perf evaluation & troubleshooting
OS12. Scripting
c. Supplementary topics
13. Windows networking
14. Comparing the Linux and Windows Kernels
15. Windows – Unix Interoperability
Note: Labs and Exercises to reinforce the topics
Topics marked (available now) are at http://www.msdnaa.net/curriculum
Anything we missed?
ProjectOZ
ProjectOZ Background
Collaboration with MSR University Relations, Windows
Kernel & Architecture Team, and Source Asset Team
Goal is to provide better support for OS instruction and
research using Windows
Part of a larger program:
• Windows Research Kernel
• Curriculum Resource Kit
• Textbooks and other resources
Based on observations from SPACE research project at UC
Santa Barbara (Probert & Bruno)
Provide an alternative to Nachos
Alpha version of ProjectOZ implemented by Paul
Turner, a summer intern from University of Waterloo
OS model of processor
[Diagram: CPU, MMU, MEMORY, TRAP handler, External interrupts, RETI]
OS can only control:
• MMU (memory management unit)
• trap vector
• scheduling of external interrupts
• when it does an RETI (Return from Interrupt)
OS only regains control through trap/interrupt
SPACE
Systems Programming using Address-spaces and
Capabilities for Extensibility
– a reaction to distributed-shared virtual memory research
Key observation: extending core OS functionality difficult
because existing kernel abstractions get in the way
(i.e. threads, processes, inter-process communication)
SPACE uses lower-level abstractions:
control flow, address spaces/domains, portals
– represent hardware abstractions
i.e. CPU, MMU, trap-vectors
– then threads, processes, IPC built on top
Monolithic kernel is not necessary => fundamental
extensibility
Kernel Abstractions
[Diagram: threads grouped into Processes above the kernel; labels below the kernel: pagetable (x2), CPU (x3), MMU (x3)]
SPACE Abstractions
Space: a mapping of addresses from logical to physical
Domain: permission bit-vector on each address mapping in a Space
– Each bit-vector indexed by the current protection-mode
– (Space, mode) → Domain
Portal: entry-point in a Domain
– (currDomain, trap/interrupt) → (newDomain, newPC)
– Each portal traversal saves state and associates a token
– SPACE implementation maintains stack of tokens corresponding to nested
traversals of portals on a particular CPU
– Resume reverses portal-traversal to state at top of token stack
Two portal operations
– Suspend:
• Save state token at top of current token stack
• Create empty token stack, to be used at next portal traversal
• Pass handle on token for previous stack to routine at newPC in newDomain
– Unsuspend(token) operation:
• Takes handle to a previous token stack
• Discards current token stack (if any)
• Resumes token from top of previous stack
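The portal and token-stack operations above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the SPACE implementation: the class names (Token, Cpu, SpaceCore), the reduction of saved state to a (domain, PC) pair, and the integer handles are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """State saved by one portal traversal (reduced here to domain and PC)."""
    domain: str
    pc: int


@dataclass
class Cpu:
    domain: str
    pc: int
    tokens: list = field(default_factory=list)  # current token stack


class SpaceCore:
    """Hypothetical model of portal traversal, resume, suspend, unsuspend."""

    def __init__(self):
        self.portals = {}     # (current domain, trap) -> (new domain, new PC)
        self.parked = {}      # handle -> suspended token stack
        self.next_handle = 0

    def traverse(self, cpu, trap):
        """Save the current state as a token, then enter the new domain."""
        new_domain, new_pc = self.portals[(cpu.domain, trap)]
        cpu.tokens.append(Token(cpu.domain, cpu.pc))
        cpu.domain, cpu.pc = new_domain, new_pc

    def resume(self, cpu):
        """Reverse the most recent traversal on this CPU."""
        token = cpu.tokens.pop()
        cpu.domain, cpu.pc = token.domain, token.pc

    def suspend(self, cpu, trap):
        """Traverse a portal, park the old stack, start an empty one."""
        self.traverse(cpu, trap)
        handle = self.next_handle
        self.next_handle += 1
        self.parked[handle] = cpu.tokens
        cpu.tokens = []
        return handle  # passed to the routine at newPC in newDomain

    def unsuspend(self, cpu, handle):
        """Discard the current stack and resume the top of a parked one."""
        cpu.tokens = self.parked.pop(handle)
        self.resume(cpu)
```

A scheduler domain would keep the handles returned by suspend in its ready queue and call unsuspend to switch the CPU to another flow of control.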
Kernels out of spaces & domains
Kernel-mode memory mappings (mostly) shared in all spaces
[Diagram: space 0, space 1, and space 2, each containing a kernel-mode domain and a user-mode domain]
spaces used to build processes
Following the CPU
[Diagram: CPU 0 moving through Domains a to f by portal traversal, with suspend, unsuspend, and resume operations on token stacks T0 and T1 (stack contents shown as "a b c" and "f e d")]
Redrawing the picture
[Diagram: the same traversal on CPU 0 redrawn around a SCHEDULER in Domain d; the events start2, sleep1, sleep2, wakeup1, and wakeup2 are shown alongside suspend, unsuspend, and resume on token stacks T0 and T1]
Building SPACE on top of NT
Spaces – use NT Processes
Domains – use a Space for each domain; other than
the page permissions, the logical-to-‘physical’ mappings
are identical for domains in the same space
Physical memory – creates an NT section, and selectively
creates single ‘page’ views onto the section from each
Space/Domain (64K page size)
CPUs – each domain has an NT thread corresponding to
each logical CPU configured -- with only one thread per
CPU runnable at a time
Space implementation – space.exe, controls the
simulation, provides the space primitives such as portal
traversal, implementing CPUs and MMUs
Building SPACE on top of NT
Exceptions – space.exe establishes an exception port for
each domain, which it uses to detect exceptions (e.g.
pagefaults) and implement portal traversal.
Traps – programmatic traps in a domain are forwarded to
space.exe for portal traversal using either NT LPC or the
exception mechanism
Interrupts – space.exe interrupts CPUs by suspending the
running NT thread, and doing get/set thread context
MMU – simulated by space.exe by modifying the views
each domain has for the ‘physical memory’ section
(using NT memory management APIs)
SPACE Multi-computer
[Diagram: a network simulator connecting several space.exe instances, each controlling a group of NT processes]
Teaching Objectives
SPACE Mission:
• An exciting, innovative, productive environment for OS
instruction & research
Goals:
• Use SPACE to abstract hardware
• Let students focus on OS data structures and algorithms
• Provide a non-simulated environment for normal
execution
• Build models for I/O devices, timers, DMA
• Support both project-level and lab-level experiments
• Provide an experimental apparatus for exploring the OS
literature
Approach to OS experiments
Provide the BasicOZ environment
• SPACE core implementing SPACE abstractions
• Small vanilla OS implementation on top of SPACE
• System described by XML configuration file
• Development/measurement environment
• Tools for tracing/profiling/analyzing
• Workload/test library
• Access to native NTAPIs (?)
Student experiments improve on BasicOZ
Experiments selected to complement lectures
Some experiments progressive, others independent
Approach to OS experiments
Multiple types of experiments can be assigned
• Lab-level experiments to implement different algorithms,
make small extensions, explore performance
• Medium-level projects that do major work on a particular
subsystem
• Competitive projects where different groups implement
different algorithms and compare resulting performance
• Literature-based projects, where students implement
algorithms/solutions from published papers
• Investigations into novel algorithms and new solutions
(open-ended)
BasicOZ Environment
System calls
• implementation of basic system calls, using dynamic
allocation of stacks in 'kernel'
• token-chains provide trapframes for returning to user-mode
User-mode Threads
• no preemption, no guard pages on stack
System devices
• timer, clock, console, disk simulator, network simulator
(with fault-injection)
BasicOZ Environment
Input/output
• I/O device simulation framework
– DMA, interrupts
– simple device register operations
– simulation of IRQLs
– simulation of real device properties
Filesystem
• trivial file system
– one directory
– assumes infinite storage, contiguous allocation
– no delete or other namespace operations, no file
extension
– populated as part of system specification
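To make the restrictions of the trivial file system concrete, here is a minimal sketch assuming a single flat directory, contiguous placement, and files whose size is fixed when the system is populated. The class name TrivialFS, the 512-byte block size, and the method names are assumptions for illustration, not the BasicOZ code.

```python
class TrivialFS:
    """One directory, contiguous allocation, no delete, no file extension."""

    BLOCK = 512  # assumed block size

    def __init__(self, spec):
        self.disk = bytearray()   # grows as needed: "infinite storage"
        self.directory = {}       # name -> (first block, length in bytes)
        for name, data in spec.items():   # populated from the system spec
            self._place(name, data)

    def _place(self, name, data):
        first_block = len(self.disk) // self.BLOCK
        # Round the file up to whole blocks so the next file starts aligned.
        padded = -(-len(data) // self.BLOCK) * self.BLOCK
        self.disk += data.ljust(padded, b"\0")
        self.directory[name] = (first_block, len(data))

    def read(self, name, offset, count):
        first_block, length = self.directory[name]
        start = min(offset, length)
        end = min(offset + count, length)
        base = first_block * self.BLOCK
        return bytes(self.disk[base + start:base + end])

    def write(self, name, offset, data):
        first_block, length = self.directory[name]
        if offset + len(data) > length:
            raise ValueError("files cannot be extended")
        base = first_block * self.BLOCK + offset
        self.disk[base:base + len(data)] = data
```

Student projects such as block management, delete, or directory hierarchies would replace the single dictionary and the append-only placement.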
BasicOZ Environment
Processes
• single thread
• static executable images (no libraries or relocation)
• simple create/loadimage model (not fork/exec)
• simple virtual address management with linear freelist
Virtual Memory
• no shared virtual memory
• simple pagefile management
• pagefault handler always goes to disk
• management of physical memory with linear freelist
• artificial forcing of low-memory
• random page replacement, blocking on page writes
• fetch-from-previous-space for kernel implementation
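The baseline physical-memory policy (linear freelist plus random page replacement) is simple enough to sketch. The following is a hypothetical model, assuming frames are plain integers and that eviction only updates bookkeeping; a real pagefault handler would also write the victim page to the pagefile and unmap it.

```python
import random


class PhysicalMemory:
    """Frame allocator with a linear free list and random replacement."""

    def __init__(self, frames, seed=0):
        self.free = list(range(frames))   # linear free list of frame numbers
        self.owner = {}                   # frame -> (space, virtual page)
        self.rng = random.Random(seed)    # seeded for repeatable experiments
        self.evictions = 0

    def allocate(self, space, vpage):
        """Return a frame for (space, vpage), evicting a random page if full."""
        if not self.free:
            victim = self.rng.choice(sorted(self.owner))
            self.evict(victim)
        frame = self.free.pop(0)
        self.owner[frame] = (space, vpage)
        return frame

    def evict(self, frame):
        # Bookkeeping only; the page write to the pagefile is omitted here.
        del self.owner[frame]
        self.free.append(frame)
        self.evictions += 1
```

Replacing the random choice with LRU or clock, and comparing eviction counts on the same workload, is the kind of lab-level experiment the slides describe.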
BasicOZ Environment
Boot loader
• load kernel configuration and images
Image library
• load the segments of an executable image into an
address space
• access symbols, relocation information, headers,
import/export tables, profiling support, stacktrace
support, disassembly
Build environments
• environment for producing the 'kernel' (server)
• environment for building test programs (client)
BasicOZ Environment
Debug, test, instrumentation
• execution statistics and timing
• profiling information
• tracing (flight-data recorder)
Tests & Workloads
• library of individual applets, applications, and entire
workloads for test/evaluation/demonstration, e.g.
– multi-process, multi-thread, multi-computer loads
– demonstrate synchronization, priority inversion, scheduling
characteristics
– IPC, shared-memory
– asynchronous I/O
– client/server applications
– etc, etc, etc
Project Areas: multi-threading
Multi-threading and synchronization primitives
• use the timer to make user-mode threads preemptive
• implement a pluggable scheduler, with several different
scheduling algorithms (including priority-based)
• demonstrate race conditions, including priority inversion
• implement basic kernel-mode blocking synchronization
primitives, like semaphores and reader/writer locks
• user synchronization primitives to eliminate race
conditions
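A pluggable scheduler can be as small as two methods. The sketch below assumes a hypothetical interface (add and pick) and a simulated timer that preempts the running thread on every tick; none of these names come from ProjectOZ.

```python
from collections import deque


class RoundRobin:
    """Policy 1: ignore priority, run threads in FIFO order."""

    def __init__(self):
        self.queue = deque()

    def add(self, thread, priority=0):
        self.queue.append(thread)

    def pick(self):
        return self.queue.popleft() if self.queue else None


class PriorityScheduler:
    """Policy 2: highest priority first, round-robin within a level."""

    def __init__(self):
        self.levels = {}  # priority -> deque of ready threads

    def add(self, thread, priority=0):
        self.levels.setdefault(priority, deque()).append(thread)

    def pick(self):
        for priority in sorted(self.levels, reverse=True):
            if self.levels[priority]:
                return self.levels[priority].popleft()
        return None


def run(policy, threads, ticks):
    """Simulate timer preemption; threads is a list of (name, priority, work)."""
    remaining = {name: work for name, _, work in threads}
    priority_of = {name: prio for name, prio, _ in threads}
    for name, prio, _ in threads:
        policy.add(name, prio)
    trace = []
    for _ in range(ticks):
        current = policy.pick()
        if current is None:
            break                       # nothing left to run
        trace.append(current)           # thread runs for one tick
        remaining[current] -= 1
        if remaining[current] > 0:      # preempted: back on the ready queue
            policy.add(current, priority_of[current])
    return trace
```

Because both policies expose the same two methods, run can be handed either one without change, which is the point of making the scheduler pluggable.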
Project Areas: handles
Implement handles and file table
• provide a user-mode mechanism for referencing kernel-mode objects
• implement a way of referring to open files in the trivial file
system
• implement open/read/write/close on the file system
• experiment with ways of detecting bad closes and test
with poorly synchronized multi-processor workload
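One way to detect bad closes is to pair each table slot with a generation counter, so a stale handle no longer matches after the slot is reused. This is a sketch of that one technique, chosen for illustration; the slides do not say which approach ProjectOZ expects, and all names here are hypothetical.

```python
class BadHandle(Exception):
    """Raised when a handle is stale, already closed, or out of range."""


class HandleTable:
    def __init__(self):
        self.slots = []   # each slot is [generation, object or None]
        self.free = []    # indexes of reusable slots

    def open(self, obj):
        """Store a kernel-mode object and return an opaque user-mode handle."""
        if self.free:
            index = self.free.pop()
        else:
            index = len(self.slots)
            self.slots.append([0, None])
        slot = self.slots[index]
        slot[1] = obj
        return (index, slot[0])

    def lookup(self, handle):
        index, generation = handle
        if not 0 <= index < len(self.slots):
            raise BadHandle(handle)
        slot = self.slots[index]
        if slot[1] is None or slot[0] != generation:
            raise BadHandle(handle)
        return slot[1]

    def close(self, handle):
        self.lookup(handle)          # rejects double closes and stale handles
        index, _ = handle
        slot = self.slots[index]
        slot[0] += 1                 # invalidate every outstanding copy
        slot[1] = None
        self.free.append(index)
```

With this scheme a second close of the same handle raises BadHandle instead of silently closing whatever file later reused the slot.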
Project Areas: virtual memory
Virtual memory
• improve algorithms for managing
– physical memory
– pagefile space
– virtual addresses
• implement shared memory between processes
• implement distributed-shared-virtual-memory across a
SPACE multi-computer
Project Areas: processes
Process management
• create/destroy processes
– using fork
– using other algorithms
• build a capability-based sandbox
• build a process pool for isolating hosted code
Project Areas: I/O drivers
I/O driver
• implement IRQL-based protection of data structures
• write a traditional top-half/bottom-half I/O driver for a
simple simulated device
• add DMA
• implement asynchronous completion of I/O
Project Areas: IPC
Inter-Process (i.e. cross-domain) Communication
• simple reader/writer synchronization
• basic message-based IPC between processes
– copy-based
– shared-memory
• named IPC ports
• named pipes
• mailboxes
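The difference between copy-based and shared-memory message passing shows up as soon as the sender reuses its buffer. The sketch below is a hypothetical illustration using named ports held in a dictionary; Port, PortNamespace, and the method names are assumptions, and Python's memoryview stands in for a shared mapping.

```python
from collections import deque


class Port:
    """Bounded message queue supporting two delivery styles."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.queue = deque()

    def send_copy(self, buffer):
        """Copy-based IPC: the receiver gets a snapshot of the bytes."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(bytes(buffer))
        return True

    def send_shared(self, buffer):
        """Shared-memory IPC: the receiver sees the sender's live buffer."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(memoryview(buffer))
        return True

    def receive(self):
        return self.queue.popleft() if self.queue else None


class PortNamespace:
    """Named IPC ports: processes rendezvous on a string name."""

    def __init__(self):
        self.ports = {}

    def create(self, name, capacity=4):
        if name in self.ports:
            raise KeyError(f"port {name!r} already exists")
        self.ports[name] = Port(capacity)
        return self.ports[name]

    def connect(self, name):
        return self.ports[name]
```

The shared variant avoids a copy but needs the reader/writer synchronization listed in the first bullet, since the sender can change the data after sending.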
Project Areas: objects
Build simple kernel-level object model
• cross-domain invocation of object methods, with simple
marshalling
• build a name server
• recover from cross-domain failures
• persist objects across reboots
Project Areas: file system
File system (and volumes)
• build a more complex file system (on the simulated disk - or a USB thumbdrive)
• implement block management, directory hierarchies
• build a log-based file system
• implement namespace operations (like rename,
link/unlink) and test for race conditions
• implement a cache (either blocks or files)
• implement memory mapped files
• implement get/put file protocols (incl memory mapping)
• build a RAID layer below the file system, evaluating
robustness and performance
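For the RAID project, the core idea of single-parity redundancy fits in a short sketch: one extra disk holds the XOR of the data blocks in each stripe, so any one lost block can be rebuilt from the survivors. The class and function names are hypothetical, the disks are plain dictionaries rather than the simulated disk device, and parity is kept on a dedicated disk (RAID-4 style) for simplicity.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ParityArray:
    """N data disks plus one dedicated parity disk."""

    def __init__(self, data_disks, block_size=4):
        self.block_size = block_size
        # Each disk maps stripe number -> block; the last disk holds parity.
        self.disks = [dict() for _ in range(data_disks + 1)]

    def write_stripe(self, stripe, blocks):
        if len(blocks) != len(self.disks) - 1:
            raise ValueError("one block per data disk is required")
        if any(len(block) != self.block_size for block in blocks):
            raise ValueError("wrong block size")
        for disk, block in zip(self.disks, blocks):
            disk[stripe] = bytes(block)
        self.disks[-1][stripe] = xor_blocks(blocks)

    def read_stripe(self, stripe, failed=None):
        """Read all data blocks, rebuilding the block on a failed disk."""
        blocks = []
        for index in range(len(self.disks) - 1):
            if index == failed:
                survivors = [disk[stripe]
                             for i, disk in enumerate(self.disks)
                             if i != failed]
                blocks.append(xor_blocks(survivors))
            else:
                blocks.append(self.disks[index][stripe])
        return blocks
```

Evaluating robustness then amounts to injecting a failed disk into reads, and evaluating performance to counting the extra parity writes per stripe.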
Project Areas: security
Investigate security features
• give processes identities
• add ACLs to files/objects
• demonstrate buffer-overflow
• implement ‘applications’
• implement client/server impersonation
• implement client/server capability mechanism
Project Areas: signals/exceptions
Signals and exceptions
• deliver signals to threads
• test for race conditions
• use signals for delivery of asynchronous events (like I/O
completion)
• exception notification using signals
• exception notification using unwinding
Project Areas: networking
Networking
• using the SPACE multicomputer, build a simple network
stack
• implement sockets
• packetize streams and send between computers
• Use network unreliability feature in simulation
– implement reliable streams
– explore techniques to minimize network latencies
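Building a reliable stream on an unreliable link can start with stop-and-wait: number each packet, retransmit until it is acknowledged, and have the receiver drop duplicates. The sketch below is a single-function simulation under assumptions of my own: LossyLink is a hypothetical stand-in for the simulator's fault injection, and sender and receiver run in the same loop rather than on separate SPACE computers.

```python
import random


class LossyLink:
    """Drops each packet independently with probability loss_rate."""

    def __init__(self, loss_rate, seed=0):
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)

    def deliver(self, packet):
        return None if self.rng.random() < self.loss_rate else packet


def packetize(data, size):
    """Split a byte stream into fixed-size payloads (the last may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def send_reliably(data, link, size=4, max_tries=1000):
    """Stop-and-wait transfer; returns (received bytes, transmissions)."""
    received = bytearray()
    expected = 0          # receiver state: next sequence number to accept
    transmissions = 0
    for seq, payload in enumerate(packetize(data, size)):
        for _ in range(max_tries):
            transmissions += 1
            packet = link.deliver((seq, payload))
            if packet is None:
                continue                      # data lost: retransmit
            got_seq, got_payload = packet
            if got_seq == expected:           # in order: accept it
                received += got_payload
                expected += 1
            # Duplicates are dropped but still acknowledged.
            ack = link.deliver(got_seq)       # the ack can be lost too
            if ack == seq:
                break
        else:
            raise TimeoutError(f"packet {seq} was never acknowledged")
    return bytes(received), transmissions
```

Stop-and-wait keeps only one packet in flight, so a natural follow-on for the latency bullet is a sliding window that sends several packets before waiting for acknowledgements.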
Project Areas: basic debugging
Implement a basic debugger
• run/stop/step
• examine/modify memory
• disassemble
• set breakpoints
2005/2006 academic plans
• Initial version nearing completion (thanks Paul!)
• Start building community
• Pilot projects in China
– building on the Chinese OS principles textbook by faculty at
Peking, Tsinghua, and Behai
– considering a follow-on project book
• Will use in short-courses in Japan this year
• Talking with some U.S. schools about special topics
courses this year
• Working with faculty on proposal for internals book using
ProjectOZ as basis for experiments
• Lot of interest in Europe
That’s as far as our travels have taken us so far
Windows Research Kernel
WRK Goals
• Make it easier for faculty and students to compare &
contrast Windows to other operating systems
• Students can study source, and modify and build
projects
• Better support for research & publication based on
Windows internals
• Encourage more OS textbook and university-oriented
internals books on Windows kernel
• Simplified licensing
NTOS Kernel Sources
Based on Windows XP/SP2 and Windows x64 NTOS
• Processes, threads, LPC, VM, scheduler, object manager, I/O manager,
synchronization, worker threads, kernel memory manager, …
– most everything in NTOS except plug-and-play, power-management, and
specialized code such as the driver verifier, splash screen, branding, timebomb,
etc.
– non-kernel kernel-mode code (drivers, file systems, networking) is from the
DDK and IFSKIT
• Simplified in a few places, cleaned up comments, improved spelling
• Non-source is encapsulated in a binary library
Build and set up utilities and tools
Tools for tracing, performance monitoring, logging, debugging, etc
Packaged with
– DDK subset and documentation for working with drivers
– File system sources from IFSKIT
– VirtualPC product
– Kernel regression tests
– Documentation for Native NT API
Something over 500K lines of source
WRK licensing
Improvements over current MSR UR license:
– Faculty feel comfortable agreeing to its conditions
– Students can use in classroom environment
License type:
– Non-commercial, academic use only; allow derivative works for non-commercial purposes
Eligibility criteria:
– Available to faculty and students in colleges/universities WW
Usage scenarios:
– View, copy, reproduce, distribute within the institution
– Modify for teaching and experimentation purposes
– Produce teaching and research publications including relevant snippets
of source
• Can use in textbooks and academic publications, and community forums
• Have to perpetuate MS copyright notices
– Share derivatives within academic community
Status
• CRK:
– Core & security topics are available now
– Elective & Supplementary topics will be available by end of 2005
• ProjectOZ and WRK – we will be looking for participants in pilots and trials AY05/06
• If you are interested, contact us at compsci@microsoft.com
• More information on this and related topics
– Shared Source: http://www.microsoft.com/resources/sharedsource
– Curriculum Repository on MSDNAA: http://www.msdnaa.net/curriculum