Windows NT Internal Architecture

Windows NT Internals
David Solomon
David Solomon Expert Seminars
Microsoft Corporation
Agenda





Introduction
Tools
System Architecture
Processes and Threads
Memory Management
About The Speaker
David Solomon





14 years at Digital - the last 10 as a
developer in the VMS operating
system development group
Started Windows NT developer
training company in 1992
Author of Inside Windows NT, 2nd
edition (Microsoft Press) and
Windows NT for OpenVMS
Professionals (Digital Press)
Regular speaker at industry
conferences (WinDev, Tech•Ed,
Software Development, DECUS...)
Recipient of past Microsoft MVP
award for MSWIN32 technical support
About The Company

David Solomon Expert Seminars offers high-quality Windows developer training
Taught by well-known industry experts and authors who develop and teach their own courses
Instructors include:
  Doug Boling, Brian Catlin, Jamie Hanrahan, Jeff Prosise, Jeffrey Richter, and David Solomon
Topics include:
  Windows CE
  Windows NT Internals
  Windows NT and WDM Device Drivers
  Windows NT® Server Applications
  Win32® Programming
  Visual C++® and MFC
  COM/ActiveX® Programming
To be notified of new classes and other developments, join our e-mail interest list
Session Goals

Goals
  Explain internal architecture and operation of core Windows NT components
  Use various tools that demonstrate internal Windows NT behavior
Audience assumptions
  Familiar with basic 32-bit OS concepts
  Familiar with Win32 API (processes, threads, memory management)
Acknowledgements


Jamie Hanrahan (jeh@cmkrnl.com - www.cmkrnl.com),
co-author of the Windows NT internals seminar from
which these slides were taken
Dave Cutler, Helen Custer, John Balciunas, Lou Perazzoli,
Mark Lucovsky, Steve Wood, Tom Miller, Gary Kimura,
and Landy Wang for their support and assistance in
understanding Windows NT internals
Windows NT Architecture
[Diagram: layered architecture. User mode: system processes (Service Controller, WinLogon, Session Manager), services (Alerter, RPC, Event Logger, Replicator), user applications, and environment subsystems (Win32, POSIX, OS/2), with subsystem DLLs and NTDLL.DLL beneath them. Kernel mode: Executive API; I/O Manager, Cache Manager, file systems, Processes & Threads, Security, Virtual Memory, Win32 User/GDI, and system threads; Object management / Executive RTL; device drivers; Kernel; Hardware Abstraction Layer (HAL). Below the HAL: hardware interfaces (buses, I/O, interrupts, timers, clocks, DMA, cache control, etc.)]
Copyright by Microsoft Corporation. Used by permission.
Windows NT 5.0
Internal changes

In one sense, much is the same


Basic architecture of many
components unchanged:
Win32 subsystem, memory manager, process
model, thread scheduling, security model,
file system
But lots of additions of major
new functionality:

Active Directory, distributed security, Kerberos,
Microsoft management console, IntelliMirror™,
NTFS extensions (content indexing, quotas, reparse
points, sparse files, link tracking)
Windows NT 5.0
Internal changes

Kernel/core changes include:






I/O system (plug and play and power management)
64-bit Very Large Memory support for Alpha
Job object
Integration of Terminal Server
Comparable to level of change from 3.51 to 4.0
Also many incremental
performance improvements:

Object Manager, Memory manager (e.g., working set
management algorithms), SMP scalability…
Tools Preview

Tool                    Executable        Origin
Performance Monitor     PerfMon           Windows NT
Registry Editor         RegEdt32          Windows NT
Windows NT Diagnostics  WinMSD            Windows NT
Kernel Debugger         i386kd, alphakd   Windows NT CD \support\debug
Pool Monitor            poolmon           Windows NT CD \support\debug
Global Flags            gflags            Windows NT Resource Kit
Open Handles            oh                Windows NT Resource Kit
QuickSlice              qslice            Windows NT Resource Kit
Process Viewer          pviewer, pview    Windows NT Resource Kit; Platform SDK, VC++
Process Exploder        pview             Windows NT Resource Kit 4.0
Process Status          pstat             Windows NT Resource Kit
Pmon                    pmon              Windows NT Resource Kit
Object Viewer           WinObj            Platform SDK
Process Walker          PWalk             Platform SDK
Page Fault Monitor      PFMon             Platform SDK
Spy++                                     Visual C++
Windows NT Resource Kits

Full “Windows NT 5.0 Resource Kit”



250+ utilities
Combines what was in the 4.0 Server and
Workstation resource kits
Subset “Windows NT 5.0 Resource Kit
Support Tools”


50 utilities
Ships in \support\reskit on Windows NT CD
www.sysinternals.com

Windows NT internals articles and tools


Some examples:






Some generated using reverse engineering
(i.e., no source access)
winobj - view object manager namespace
and objects
nthandlex - show open handles by process
ntfilmon - log all file I/O operations
ntregmon - log all registry accesses
cpufrob - change thread quantum
Caveat: Most include a device driver, hence
you’re adding “trusted code”

No warranty on using these on your system!
GFLAGS (Global Flags)



Changes system-wide
or image-wide
debugging flags
Poolmon requires
“enable pool tagging”
Oh (open handles)
requires “maintain a
list of objects for
each type”
Windows NT Kernel
Debugger (1 Of 4)

Two versions:


Command line: I386KD.EXE, ALPHAKD, etc., shipped with
Windows NT

In NTcdrom:\support\debug\i386, … \debug\alpha, etc.

Select directory to match host system (where you will
run the debugger executable); select executable to
match target system (system being debugged)

Also need many DLLs from this directory

Also need symbol files from
NTcdrom:\support\debug\targetarch\symbols\ …
Extended via WinDbg shipped with Platform SDK
(part of MSDN Professional)

Provides GUI, fully-symbolic, source-level debugging

Needs same DLLs and symbol files
Windows NT Kernel
Debugger (2 Of 4)

Documentation:




Windows NT Workstation Resource Guide
(see “Windows NT Debugger”)
Windows NT Device Driver Kit (DDK)
See i386kd -?
Help within debugger: commands “?” and “!?”
and “!help”
Windows NT Kernel
Debugger (3 Of 4)

Two modes of operation:


Open a crash dump file:
C:\> set _NT_SYMBOL_PATH=ntcdrom:\support\debug\i386\symbols
C:\> i386kd -Z dumpfilename
Connect to a live system via null modem cable
(must boot target system with /DEBUG/DEBUGPORT=COMn in boot.ini)
C:\> set _NT_SYMBOL_PATH=ntcdrom:\support\debug\i386\symbols
C:\> set _NT_DEBUG_PORT=COMn
    (default COM1)
C:\> set _NT_DEBUG_BAUD_RATE=nnnnn
    (default 19200)
C:\> i386kd
[Diagram: host and target systems connected by a serial “null modem” cable (for debugger)]
Windows NT Kernel
Debuggers (4 Of 4)

Third-party product: SoftICE for
Windows NT (NuMega)



Runs on same system - i.e., doesn’t
require second system for live debugging
x86 only
See www.numega.com
Agenda



Introduction
Tools
System Architecture










Kernel Mode Environment
Executive, Kernel, HAL, Drivers
Product Packaging
System Threads
Environment Subsystems
System Service Dispatching
Process-based Windows NT code
Summary
Processes and Threads
Memory Management
Kernel Mode Versus User Mode

A processor state
  Controls access to memory
    Each memory page is tagged to show the required mode for reading and for writing
      Protects the system from the users
      Protects the user (process) from themselves
      System is not protected from system
    Code regions are tagged “no write in any mode”
  Controls ability to execute privileged instructions
A Windows NT abstraction
  Intel: Ring 0, Ring 3
  PerfMon, Processor: “Privileged Time” and “User Time”
Associated with threads
  Threads can change from user to kernel and back
  Part of saved context, along with registers, etc.
  Does not affect scheduling

Components            Access mode
Applications          User
Subsystem processes   User
Executive             Kernel
Kernel                Kernel
Drivers               Kernel
HAL                   Kernel
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode


Via the system service dispatch mechanism
Kernel-mode code runs in the context of the requesting thread
2. Interrupts from external devices




Windows NT-supplied interrupt dispatcher invokes the interrupt
service routine
ISR runs in the context of the interrupted thread (so-called
“arbitrary thread context”)
ISR often requests the execution of a “DPC routine,” which also
runs in kernel mode
Time not charged to interrupted thread
3. Dedicated kernel-mode system threads


Some threads in the system stay in kernel mode at all times
(mostly in the “System” process)
Scheduled, preempted, etc., like any other threads
Interrupt Dispatching
[Diagram: user- or kernel-mode code is interrupted; control passes in kernel mode to the interrupt dispatch routine, which calls the interrupt service routine. Note, no thread or process context switch!]
Interrupt dispatch routine
  Disable interrupts
  Record machine state (trap frame) to allow resume
  Mask equal- and lower-IRQL interrupts
  Find and call appropriate ISR
  Dismiss interrupt
  Restore machine state (including mode and enabled interrupts)
Interrupt service routine
  Tell the device to stop interrupting
  Interrogate device state, start next operation on device, etc.
  Request a DPC
  Return to caller
Interrupt Precedence Via IRQLs

IRQL = Interrupt Request Level
  The “precedence” of the interrupt with respect to other interrupts
  Different interrupt sources have different IRQLs
  Not the same as IRQ
IRQL is also a state of the processor
  Servicing an interrupt raises processor IRQL to that interrupt’s IRQL
  This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0

IRQL   Name                       Class
31     High                       Hardware interrupts
30     Power fail                 Hardware interrupts
29     Interprocessor Interrupt   Hardware interrupts
28     Clock                      Hardware interrupts
...    Device n ... Device 1      Hardware interrupts
2      Dispatch/DPC               Deferrable software interrupts
1      APC                        Deferrable software interrupts
0      Low                        Normal thread execution
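The masking rule on this slide can be modelled in a few lines. What follows is a toy sketch in Python, not kernel code. The `Processor` class, its method names, and the device level 5 are hypothetical. Only the levels 28 (Clock), 2 (Dispatch/DPC), and 0 (Low) are taken from the x86 IRQL list above.

```python
class Processor:
    """Toy model of one CPU's IRQL state (illustrative only)."""

    def __init__(self):
        self.irql = 0        # 0 = Low: normal thread execution
        self.pending = []    # requests masked at the current IRQL
        self.serviced = []   # order in which interrupts were serviced

    def request(self, level, during=()):
        """Raise an interrupt request at `level`.

        `during` lists requests that arrive while this one is being
        serviced (a stand-in for real asynchronous hardware).
        """
        if level > self.irql:
            self._service(level, during)
        else:
            self.pending.append(level)   # masked: equal or lower IRQL

    def _service(self, level, during=()):
        saved = self.irql
        self.irql = level                # servicing raises processor IRQL
        self.serviced.append(level)
        for nested in during:
            self.request(nested)
        self.irql = saved                # dismiss: restore previous IRQL
        self._deliver_pending()

    def _deliver_pending(self):
        """Service masked requests that the lowered IRQL now allows."""
        while True:
            ready = [p for p in self.pending if p > self.irql]
            if not ready:
                return
            top = max(ready)
            self.pending.remove(top)
            self._service(top)


cpu = Processor()
# A device interrupt (hypothetical level 5) is preempted by the clock (28)
# and also requests a DPC (2), which stays masked until the IRQL drops.
cpu.request(5, during=[28, 2])
print(cpu.serviced)   # [5, 28, 2]
```

The clock request runs immediately because 28 is above the device level, while the DPC request waits until the processor returns to IRQL 0.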
Alpha IRQLs

IRQL on Alpha implemented in PAL code

IRQL   Name
7      High
6      Interprocessor Interrupt
5      Clock
4      Device High
3      Device
2      Dispatch/DPC
1      APC
0      Low
DPCs (Deferred Procedure Calls)

A list of “work requests”



One queue per processor (but processors can run each others’ DPCs)
Implicitly ordered by time of request (FIFO)
Used to defer processing from higher (device) interrupt level to a
lower (dispatch) level


Used heavily for driver
“after interrupt” functions
Used for quantum end and timer expiration
[Diagram: queue head linked to a chain of DPC objects; each DPC object carries DfrdCtx, SysArg1, and SysArg2, which are passed to the driver's DPC routine]
XydriverDpcRtn(DpcObj, DfrdCtx, SysArg1, SysArg2)
{
// ...
}
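The FIFO behavior of a per-processor DPC list can be sketched as a small model. This is an illustration in Python under stated assumptions, not the kernel's data structure. `DpcObject`, `DpcQueue`, and the sample contexts are hypothetical names. The rule that an already-queued DPC object is not inserted a second time is an assumption of the sketch.

```python
from collections import deque


class DpcObject:
    """A work request: a routine plus its deferred context."""

    def __init__(self, routine, deferred_context):
        self.routine = routine
        self.deferred_context = deferred_context
        self.queued = False


class DpcQueue:
    """One processor's FIFO list of DPC work requests."""

    def __init__(self):
        self._entries = deque()

    def insert(self, dpc, sys_arg1=None, sys_arg2=None):
        # Assumption: an object that is already queued is not queued again.
        if dpc.queued:
            return False
        dpc.queued = True
        self._entries.append((dpc, sys_arg1, sys_arg2))
        return True

    def drain(self):
        """Run the queued routines in order of request."""
        while self._entries:
            dpc, sys_arg1, sys_arg2 = self._entries.popleft()
            dpc.queued = False
            dpc.routine(dpc, dpc.deferred_context, sys_arg1, sys_arg2)


calls = []


def xy_driver_dpc_rtn(dpc_obj, dfrd_ctx, sys_arg1, sys_arg2):
    # Stand-in for a driver's "after interrupt" processing.
    calls.append((dfrd_ctx, sys_arg1, sys_arg2))


queue = DpcQueue()
disk = DpcObject(xy_driver_dpc_rtn, "disk0")
net = DpcObject(xy_driver_dpc_rtn, "net0")
queue.insert(disk, "irp-1")
queue.insert(net, "packet-7")
queue.insert(disk, "irp-2")   # ignored: disk's DPC object is still queued
queue.drain()
print(calls)
```

In the real system the drain step would run at Dispatch/DPC level, after the higher-level device interrupt has been dismissed.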
Accounting For
Kernel-Mode Time




“Processor Time” = total busy
time of processor (equal to
elapsed real time - idle time)
“Processor Time” = “User
Time” + “Privileged Time”
“Privileged Time” = time
spent in kernel mode
“Privileged Time” includes:



Interrupt Time
DPC Time
Again note: interrupts and
DPCs are not charged to any
process or thread
Screen snapshot from: Programs |
Administrative Tools | Performance Monitor
click on “+” button, or select Edit | Add to chart…
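The relationships between the counters on this slide are simple arithmetic. Here is a minimal sketch in Python. The function name and the sample figures are invented for illustration, and the inputs are assumed to be times measured over one sampling interval.

```python
def processor_time_breakdown(elapsed, idle, user):
    """Return Performance Monitor style percentages for one interval.

    elapsed: real time in the interval
    idle:    time the processor spent idle
    user:    time spent in user mode
    """
    if elapsed <= 0 or not 0 <= idle <= elapsed:
        raise ValueError("need elapsed > 0 and 0 <= idle <= elapsed")
    busy = elapsed - idle                 # "Processor Time"
    if not 0 <= user <= busy:
        raise ValueError("user time cannot exceed busy time")
    privileged = busy - user              # includes interrupt and DPC time
    return {
        "Processor Time": 100.0 * busy / elapsed,
        "User Time": 100.0 * user / elapsed,
        "Privileged Time": 100.0 * privileged / elapsed,
    }


sample = processor_time_breakdown(elapsed=10.0, idle=4.0, user=2.5)
print(sample)
```

With these made-up inputs the processor is 60% busy, split into 25% user time and 35% privileged time.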
Windows NT Executive

Upper layers of operating system
Provides “generic OS” services
  Processes, threads, memory management, I/O, interprocess communication, synchronization, security
Almost completely portable C code
Exports functions (“services”) which may be invoked via user-mode APIs
  Interface is NTDLL.DLL
  E.g., Win32 ReadFile -> executive NtReadFile
Most interfaces to executive services not documented
  Used by subsystem writers
Windows NT Kernel

Abstracts differences between processor architectures
  x86 vs. Alpha, etc.
Main services
  Thread scheduling and context switching
  Generic wait operations
  Exception and interrupt dispatching
  Operating system synchronization primitives (MP and UP)
Not a classic “microkernel”
  Shares address space with rest of kernel-mode components
[Diagram: kernel source composition: machine-independent C, machine-dependent C, and assembler]
HAL - Hardware
Abstraction Layer

A separate loaded binary (c:\winnt\system32\hal.dll)
  Several different versions for different motherboards, UP vs. MP, etc.
  Installation procedure selects appropriate HAL for platform and copies to Hal.Dll on system disk
  OEM Development Kit needed to build HALs
Purpose:
  Isolate (abstract) Kernel and Executive from platform-specific details
  Present uniform model for ease of driver development
HAL abstracts:
  I/O system specifics (bus interfaces, DMA…)
  System timers, cache coherency and flushing
  SMP support, hardware interrupt priorities
HAL contains some Executive and Kernel subroutines
Sample HAL routines:
  HalGetBusData
  HalGetBusDataByOffset
  HalAssignSlotResources
  HalSetBusData
  HalSetBusDataByOffset
  HalTranslateBusAddress
  HalGetInterruptVector
  HalGetAdapter
  READ_REGISTER_ULONG
  WRITE_PORT_UCHAR
Kernel-Mode Device Drivers

Separate loadable modules (drivername.SYS)
  Linked like .EXEs
  Linked against NTOSKRNL.EXE and HAL.DLL
Only way to add “kernel extensions” or to access kernel mode system routines
Defined in registry
  Same area as Win32 services (t.b.d.)
  Differentiated by Type value
View loaded drivers with pstat.exe, drivers.exe
Several types:
  “Ordinary” hardware drivers
  File system
  NDIS miniport, SCSI miniport (linked against port drivers)
  Win32K.Sys - Windowing system
WDM (Win32 Driver Model)

Extension to Windows NT driver model to support Plug and Play and Power Management
Allows source/(x86) binary-compatible drivers across Windows 98 and Windows NT 5.0
Nontrivial additions to existing drivers:
  3 new major IRP types
  36 new minor IRPs added
  6 new miniport driver types
Supporting WDM affects every area of a driver
WDM Drivers

What’s covered in WDM:








IEEE 1394 (Firewire)
Universal Serial Bus (USB)
Audio: Speakers, microphone, CODEC
Human Interface Devices: mouse, keyboard,
monitor controls, game devices
Still Imaging: Cameras, scanners
Video Devices: Video capture, DVD
Advanced Configuration and Power Interface
(ACPI) BIOS support
Not covered by WDM:




Network
Storage
File System
Video
NTOSKRNL.EXE
[Diagram: Windows NT architecture, with system processes, services, applications, and environment subsystems in user mode above subsystem DLLs and NTDLL.DLL, and the Executive, device drivers, Kernel, and Hardware Abstraction Layer (HAL) in kernel mode; annotated with the label NtosKrnl.Exe]
Copyright by Microsoft Corporation. Used by permission.
NTOSKRNL.EXE

NTOSKRNL.EXE
  Windows NT executive and kernel
HAL.DLL
  Hardware Abstraction Layer - interface to hardware platform
BOOTVID.DLL
  Boot video driver
Naming Convention For
Internal Windows NT Routines

Two- or three-letter component code in beginning of function name

Executive
  Ex     - General executive routine
  Exp    - Executive private (not exported)
  Cc     - Cache manager
  Mm     - Memory management
  Rtl    - Run-Time Library
  FsRtl  - File System Run-Time Lib
  Ob     - Object management
  Io     - I/O subsystem
  Se     - Security
  Ps     - Process structure
  Lsa    - Security Authentication
  Zw     - File access, etc.
Kernel
  Ke     - Kernel
  Ki     - Kernel internal (not available outside the kernel)
HAL
  Hal    - Hardware Abstraction Layer
  READ_, WRITE_ - I/O port and register access
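The prefix convention lends itself to a small lookup. The sketch below, in Python, maps a routine name to the component listed on this slide. The function name `component_of` is hypothetical. The matching rule, that a component code must be followed by a capital letter (or end in an underscore), is a heuristic assumed for this example and not a documented rule.

```python
PREFIXES = {
    "Ex": "General executive routine",
    "Exp": "Executive private (not exported)",
    "Cc": "Cache manager",
    "Mm": "Memory management",
    "Rtl": "Run-Time Library",
    "FsRtl": "File System Run-Time Lib",
    "Ob": "Object management",
    "Io": "I/O subsystem",
    "Se": "Security",
    "Ps": "Process structure",
    "Lsa": "Security Authentication",
    "Zw": "File access, etc.",
    "Ke": "Kernel",
    "Ki": "Kernel internal",
    "Hal": "Hardware Abstraction Layer",
    "READ_": "I/O port and register access",
    "WRITE_": "I/O port and register access",
}


def component_of(routine_name):
    """Return the component description for a routine name, or None."""
    # Try longer codes first so "Exp" and "FsRtl" are tested before "Ex".
    for code in sorted(PREFIXES, key=len, reverse=True):
        if not routine_name.startswith(code):
            continue
        rest = routine_name[len(code):]
        # Heuristic (assumption): the code ends in "_" or is followed
        # by a capital letter, as in ExQueueWorkItem.
        if code.endswith("_") or rest[:1].isupper():
            return PREFIXES[code]
    return None


for name in ("ExQueueWorkItem", "ExpWorkerThread", "KiSystemService",
             "READ_REGISTER_ULONG", "NtReadFile"):
    print(name, "->", component_of(name))
```

Names whose prefix is not in the slide's list, such as `NtReadFile`, come back as `None` rather than being guessed.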
Multiprocessor Support

Code comprising NTOSKRNL compiled twice: once for uniprocessor, once for multiprocessor
  Avoids penalizing uniprocessor systems for added MP complexity
Two files on Windows NT media:
  UP version: NTOSKRNL.EXE
  MP version: NTKRNLMP.EXE
  Selected at installation time, but copied to NTOSKRNL
All drivers, DLLs, EXEs are built to run on MP
Upgrading from Uniprocessor to Multiprocessor
  See uptomp.exe (in Resource Kit)
  2 files replaced with different code
    NTKRNLMP.EXE replaces NTOSKRNL.EXE
    new HAL replaces HAL.DLL
  4 files replaced with same code, but modified image header
    KERNEL32.DLL, NTDLL.DLL, WINSRV.DLL, WIN32K.SYS
Identifying Your NTOSKRNL

Build numbers
  Incremented each time Windows NT is built from sources (i.e., different for beta releases)
Service packs
  Replace .EXEs (usually including NTOSKRNL), .DLLs, etc.
  Do not change Windows NT build number
Free versus Checked build
  Free = retail version; Checked = debug version
  Used primarily in driver testing
  Build number is the same
  Recompilation of system with DEBUG flag true
    Therefore a different NTOSKRNL.EXE
    Note: MP only (NTOSKRNL and NTKRNLMP.EXE identical)
Screen snapshot from: Programs | Administrative Tools | Windows NT Diagnostics
Workstation Vs Server

Core operating system executables are identical
  NTOSKRNL.EXE, HAL.DLL, xxxDRIVER.SYS, etc., (t.b.d.)
Windows NT Server a superset of Workstation
  domains, host-based RAID 5, NetWare gateway, DHCP server, WINS, DNS, full Internet Information Server…
Enterprise Server adds yet more functionality (Clusters, 3GB address space)
Terminal Server enables multi-user thin client support
MP limits: Workstation: 2 CPUs, Server: 4 CPUs, Server Enterprise: 8 CPUs
Workstation Vs Server

Registry indicates system type
  HKLM\SYSTEM\CurrentControlSet\Control\ProductOptions
    ProductType: WinNT=Workstation, ServerNT=Server not a domain controller, LanManNT=Server that is a Domain Controller
    ProductSuite: Indicates Enterprise Edition, Terminal Server…
Code in the operating system tests these values and behaves slightly differently in a few places




Licensing limits (number of processors, number
of inbound network connections, etc.)
Boot-time calculations (memory manager)
Default length of time slice
See DDK: MmIsThisAnNtasSystem
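Interpreting the two registry values comes down to a table lookup. Below is a minimal sketch in Python that takes the values as plain strings rather than reading a live registry. The function name is hypothetical, the case-insensitive comparison is an assumption, and the suite string `"Enterprise"` in the usage line is only an illustrative input.

```python
PRODUCT_TYPES = {
    "winnt": "Workstation",
    "servernt": "Server, not a domain controller",
    "lanmannt": "Server that is a domain controller",
}


def describe_product(product_type, product_suite=()):
    """Turn ProductType / ProductSuite values into a readable description."""
    # Assumption: the ProductType string is compared case-insensitively.
    role = PRODUCT_TYPES.get(product_type.lower())
    if role is None:
        raise ValueError("unrecognized ProductType: %r" % product_type)
    suites = ", ".join(product_suite)
    return "%s (%s)" % (role, suites) if suites else role


print(describe_product("LanManNT"))
print(describe_product("ServerNT", ["Enterprise"]))
```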
System Threads

Internal worker routines that need thread context
Drivers or Executive can create system threads
  Always run in kernel mode
  Usually associated with the “System” process by default
    But can be tied to any process
  Preemptible like any other thread (unless they raise IRQL to 2 or above)
Kernel mode APIs:
  PsCreateSystemThread
  PsTerminateSystemThread
  KeSetBasePriorityThread
  KeSetPriorityThread
Threads In The “System”
Process


Note CPU time is 100%
kernel mode
“Start address” is
address of thread
function




On Intel (at least):
Addresses 8xxxxxxx will
correspond to symbols in
NtosKrnl.Exe
Addresses Axxxxxxx are
routines in Win32K.Sys
Addresses Fxxxxxxx
are routines in loaded
device drivers
Screen snapshot from: Programs | Resource Kit |
Diagnostics | Process Viewer
select “System” process
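The address ranges on this slide can be turned into a quick classifier for thread start addresses. This Python sketch uses only the three ranges given above, which the slide qualifies as holding "on Intel (at least)". The function name and the sample addresses are made up.

```python
def module_for_start_address(address):
    """Guess which image a 32-bit kernel start address belongs to."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit address")
    top_digit = address >> 28          # first hex digit of the address
    if top_digit == 0x8:
        return "NtosKrnl.Exe"
    if top_digit == 0xA:
        return "Win32K.Sys"
    if top_digit == 0xF:
        return "loaded device driver"
    return "unknown"


for sample in (0x801234AB, 0xA0012345, 0xF9001000, 0x00401000):
    print(hex(sample), "->", module_for_start_address(sample))
```

Resolving an address to an actual routine name still requires the symbol files described in the Tools section.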
Threads In The
“System” Process

Memory Management
  Modified Page Writer for mapped files
  Modified Page Writer for paging files
  Balance Set Manager
  Swapper (kernel stack, working sets)
  Zero page thread (thread 0, priority 0)
Security Reference Monitor
  Command Server Thread
Network
  Redirector and Server Worker Threads
Threads created by drivers for their exclusive use
  Examples: Floppy driver, parallel port driver
Pool of Executive Worker Threads
  Used by drivers, file systems…
  Accessed via ExQueueWorkItem
Threads In System Process
(Observed on Intel Windows NT Workstation 4.0)

Routine Name                Priority  Notes
Phase1Initialization        0         First thread in life of system; becomes zero page thread
ExpWorkerThread             9-16      Pool of worker threads
MiDereferenceSegmentThread  18        Dereferences segments; also expands paging file
MiModifiedPageWriter        17        Writes modified pages to paging file
KeBalanceSetManager         16        Reclaims memory from processes, with aid of . . .
KeSwapProcessOrStack        23        Scheduled by balance set manager
FsRtlWorkerThread           16, 17    Dedicated worker threads for FSDs
SepRmCommandServerThread    15        Security Reference Monitor Command Server
MiMappedPageWriter          17        Writes modified pages to mapped files
(Win32 threads)             16        routines in Win32K.Sys (0xA0000000)
(driver threads)            various   routines in *driver.Sys (0xF0000000)
Environment Subsystems

Expose “native API”
  “Wrap” and extend Windows NT native functionality
  Interfaces to write subsystems not documented
Two main components
  Subsystem DLLs - convert documented API to native API
  Environment Subsystem Process - maintain state of client processes; implement some subsystem APIs
Three provided with Windows NT:
  Win32
  Posix
    Bare minimum Posix standards, no optional components
  OS/2
    Support for 1.x character-mode applications only
Subsystem Extensions

OS/2



Microsoft sells an add-on to the
OS/2 subsystem
Supports 1.x Presentation Manager
Posix



OpenNT from SoftWay
More-featured replacement for
Posix subsystem
www.opennt.com
Environment Subsystems

Subsystem for each .exe specified in image header
  See winnt.h
IMAGE_SUBSYSTEM_UNKNOWN      0 // Unknown subsystem
IMAGE_SUBSYSTEM_NATIVE       1 // Image doesn't require a subsystem
IMAGE_SUBSYSTEM_WINDOWS_GUI  2 // Win32 subsystem (graphical app)
IMAGE_SUBSYSTEM_WINDOWS_CUI  3 // Win32 subsystem (character cell)
IMAGE_SUBSYSTEM_OS2_CUI      5 // OS/2 subsystem
IMAGE_SUBSYSTEM_POSIX_CUI    7 // Posix subsystem
See Explorer / QuickView (right-click on .exe or .dll file)
Or \reskit\exetype image.exe
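The subsystem value can also be read directly from an image's header bytes. The Python sketch below is a minimal reader, not a substitute for exetype. The function name is hypothetical. The header offsets it uses (PE header offset stored at 0x3C, a 20-byte file header, Subsystem at offset 68 of the optional header) come from the published PE/COFF layout rather than from these slides, so treat them as assumptions of the example.

```python
import struct

SUBSYSTEM_NAMES = {
    0: "Unknown subsystem",
    1: "Native (image doesn't require a subsystem)",
    2: "Win32 subsystem (graphical app)",
    3: "Win32 subsystem (character cell)",
    5: "OS/2 subsystem",
    7: "Posix subsystem",
}


def image_subsystem(image_bytes):
    """Return the Subsystem field from the header of a PE image."""
    if image_bytes[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", image_bytes, 0x3C)
    if image_bytes[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte file header.
    optional_header = pe_offset + 4 + 20
    (subsystem,) = struct.unpack_from("<H", image_bytes, optional_header + 68)
    return subsystem


def make_fake_image(subsystem):
    """Build just enough of a header for the demo (not a loadable image)."""
    data = bytearray(0x200)
    data[0:2] = b"MZ"
    struct.pack_into("<I", data, 0x3C, 0x80)
    data[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<H", data, 0x80 + 4 + 20 + 68, subsystem)
    return bytes(data)


value = image_subsystem(make_fake_image(3))
print(value, SUBSYSTEM_NAMES.get(value, "not listed"))
```

To try it on a real file, pass the bytes read with `open(path, "rb").read()` instead of the synthetic header.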
Showing .exe Type
With QuickView

In Explorer:



Right-click on
an executable
file or .DLL
“Context menu”
appears
Select Quick
View
Environment Subsystems
Loading

Subsystems to load specified in registry:
  \SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems
Values:
  Required - list of value names for subsystems to load at boot time
  Optional - list of value names for subsystems to load when needed
  Windows  - value giving filespec of Win32 subsystem (csrss.exe)
  Kmode    - value giving filespec of Win32K.Sys (kernel-mode component of Win32)
Subsystem processes:
  csrss.exe  Win32 APIs  required  (Client Server Runtime SubSystem)
  os2ss.exe  OS/2 APIs   optional
  psxss.exe  Posix APIs  optional
Some Win32 API DLLs are in “known DLLs” registry entry:
  \SYSTEM\CurrentControlSet\Control\Session Manager\KnownDLLs
Environment Subsystems
Components

Subsystem process
  For Win32: CSRSS.EXE
API DLLs
  For Win32: Kernel32.DLL, Gdi32.DLL, User32.DLL, etc.
Kernel-mode extension to executive
  Win32 only: Win32K.SYS
Environment Subsystems
[Diagram: user mode holds system and server processes, user applications, the environment subsystems (Win32, OS/2, POSIX), subsystem DLLs, and NTDLL.DLL; kernel mode holds the Executive, Win32 User/GDI, device drivers, Kernel, and Hardware Abstraction Layer (HAL)]
Windows NT Simplified
Architecture
(3.51 and earlier)
[Diagram: user mode holds system and server processes, the environment subsystems (Win32, OS/2, POSIX), and a user application with its subsystem DLL and NTDLL.DLL; kernel mode holds the Executive (with LPC), device drivers, Kernel, and Hardware Abstraction Layer (HAL). Two numbered call paths are marked:]
1  Most Win32 Kernel APIs
2  All other Win32 APIs, including User and GDI APIs
Windows NT Simplified
Architecture
(4.0 and later)
[Diagram: user mode holds system and server processes, the environment subsystems (Win32, OS/2, POSIX), and a user application with its subsystem DLL and NTDLL.DLL; kernel mode holds the Executive (with LPC), Win32 User/GDI, device drivers, Kernel, and Hardware Abstraction Layer (HAL). Three numbered call paths are marked:]
1  Most Win32 Kernel APIs
2  Most Win32 User and GDI APIs
3  A few Win32 APIs
(Reduced) Role Of Win32
Subsystem Process







Process creation and deletion
Thread creation and deletion
Get temporary file name
Drive letters
Security checks for file
system redirector
Window management for console
(character cell) applications
Some support for 16-bit DOS applications
(NTVDM.EXE)
Invoking System Functions
From User Mode

Kernel-mode functions (“services”) are invoked from user mode
via a protected mechanism





x86: INT 2E; Alpha: SYSCALL (PALcode)
I.e., on a call to an OS service from user mode, the last thing that
happens in user mode is this “change mode to kernel” instruction
Causes an interrupt, handled by the system service dispatcher
(KiSystemService) in kernel mode
Return to user mode is done by dismissing the interrupt or exception
The desired system function is selected by the “system
service number”




Every Windows NT function exported to user mode has a
unique number
Load this number into a register (EAX on x86) just before the
“change mode” instruction
(after pushing the arguments to the service)
This number is an index into the system service dispatch table
Table gives kernel-mode entry point address and argument list
length for each exported function
Invoking System Functions
From User Mode

All validity checks are done after the user to kernel transition




KiSystemService probes argument list, copies it to kernel-mode stack,
and calls the executive or kernel routine pointed to by the table
Service-specific routine checks argument values, probes pointed-to
buffers, etc.
Once past that point, everything is “trusted”
This is safe, because:





The system service table is in kernel-protected memory; and
The kernel mode routines pointed to by the system service table are
in kernel-protected memory; therefore:
User mode code can’t supply the code to be run in kernel mode; it
can only select from among a predefined list
Arguments are copied to the kernel mode stack before
validation; therefore:
Other threads in the process can’t corrupt the arguments “out from
under” the service
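The dispatch and capture steps described on these two slides can be modelled as a table lookup followed by an argument copy. This Python sketch is a simplified model, not how KiSystemService is implemented. The service numbers 0 and 1 and the two stand-in routines are invented for the example.

```python
def nt_close(handle):
    # Stand-in for a kernel-mode service routine.
    return 0


def nt_write_file(handle, data):
    # Stand-in: reports how many bytes would be written.
    return len(data)


# One entry per exported service: (kernel-mode entry point, argument count).
# The index is the system service number; 0 and 1 are invented here.
# A tuple is used so that callers cannot add or replace entries.
SERVICE_TABLE = (
    (nt_close, 1),
    (nt_write_file, 2),
)


def ki_system_service(service_number, user_stack):
    """Model of the dispatcher: validate, capture arguments, then call."""
    if not 0 <= service_number < len(SERVICE_TABLE):
        raise LookupError("invalid system service number")
    routine, arg_count = SERVICE_TABLE[service_number]
    if len(user_stack) < arg_count:
        raise ValueError("argument list too short")
    # Copy the arguments to a private "kernel stack" before use, so later
    # changes to the caller's list cannot affect the service.
    kernel_stack = tuple(user_stack[:arg_count])
    return routine(*kernel_stack)


user_stack = [7, b"hello"]
written = ki_system_service(1, user_stack)
print(written)   # 5
```

The caller can only choose an index into the predefined table. It has no way to supply the code that runs on the other side of the transition.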
NTDLL.DLL

Setup of the service # and INT 2E are “wrapped” by small “jacket” procedures in NTDLL.DLL
  These user-mode routines have the same function names and arguments as the kernel mode routines they call
    E.g., NtWriteFile in NtDll.Dll invokes NtWriteFile in NtosKrnl.Exe
  Therefore exports of NTDLL are the “NT native API”
Entry points in NtDll.Dll are not supported or documented for use from user mode apps
  A few are documented in the DDK for call from kernel mode
  A few images that come with Windows NT are written to the “native API” exposed by NtDll.Dll (“Windows NT native images”)
  See article on www.sysinternals.com
NTDLL also contains image loader and other support functions
What about getting to USER and GDI functions in Win32K.SYS?
  System service wrapper exists in USER32.DLL, GDI32.DLL
  Does not go through NTDLL.DLL
Tracing An Example Win32 Call

User mode (U)
  Win32 application
    call WriteFile(…)
  WriteFile in Kernel32.Dll (Win32-specific)
    call NtWriteFile
    return to caller
  NtWriteFile in NtDll.Dll (used by all subsystems)
    Int 2E
    return to caller
software interrupt
Kernel mode (K)
  KiSystemService in NtosKrnl.Exe
    call NtWriteFile
    dismiss interrupt
  NtWriteFile in NtosKrnl.Exe
    do the operation
    return to caller
Source: MSJ, August 1996, page 21 (by Matt Pietrek)
Tracing An Example Win32 Call


Depends.Exe in Resource Kit and Platform SDK
Allows viewing of image->DLL relationships, imports,
and exports
Examining Symbols
In Key Images

Examine imports and exports of an .EXE down
to the OS


In Explorer, right mouse click on EXE or DLL, then
“quick view” (built in) or “View Dependencies”
(Dependency Walker tool in ResKit and Platform SDK)
Or use LINK /DUMP /EXPORTS, /IMPORTS
1. Look at imports of \winnt\system32\notepad.exe
2. Look at exports and imports of kernel32.dll

Most of the exports are documented Win32 calls
3. Look at exports and imports of ntdll.dll


None of the exports are documented
Some are the same as exports from ntoskrnl.exe,
documented in DDK, with identical
Examining Symbols
In Key Images
4. Look at exports and imports of ntoskrnl.exe



About 1000 total exported symbols
About 300 of the exported routine names are
documented in DDK
Callable only from kernel mode
5. Look at all global symbols in ntoskrnl.exe




Defined in \support\symbols\xxx\debug\exe\ntoskrnl.dbg
Quick viewer won’t display - use Kernel Debugger “x *”
with just this .dbg file loaded
About 4000 total symbols (Includes executive data cells
in addition to routines)
Exports of ntoskrnl.exe are a subset of this list
Process-Based
Windows NT Code

Pieces of Windows NT that run in separate
executables (.exe’s), in separate processes




Started by system
Not tied to a user logon
Have full process context
Three types:



Environment Subsystems (already described)
Win32 Services
System startup processes
 Note: “system startup processes” is not an
official MS-defined name
Process Creation Hierarchy



tlist.exe (from
resource kit)
tlist /t shows
creation hierarchy
Creating process
can exit, leaving
created process
running - hence this
display does not
show all creators

Explorer.exe is
actually started by
userinit.exe, which
then exits
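Why Explorer shows up without a creator can be seen by rebuilding such a tree from (process id, creator id) pairs. The following Python sketch is only a model of that idea and makes no claim about how tlist itself works. All process ids are made up, including 77 for the userinit.exe that has already exited.

```python
def creation_tree(processes):
    """Build indented tree lines from (pid, creator_pid, name) tuples."""
    names = {pid: name for pid, _, name in processes}
    children = {}
    roots = []
    for pid, creator, _ in processes:
        if creator in names and creator != pid:
            children.setdefault(creator, []).append(pid)
        else:
            roots.append(pid)      # creator has exited or is not listed

    lines = []

    def walk(pid, depth):
        lines.append("  " * depth + "%s (%d)" % (names[pid], pid))
        for child in children.get(pid, []):
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines


snapshot = [
    (2, 0, "System"),
    (20, 2, "smss.exe"),
    (30, 20, "csrss.exe"),
    (34, 20, "winlogon.exe"),
    (40, 34, "services.exe"),
    (90, 77, "explorer.exe"),   # 77 was userinit.exe, which has exited
    (95, 90, "notepad.exe"),
]
for line in creation_tree(snapshot):
    print(line)
```

Because process 77 is no longer in the snapshot, explorer.exe is printed at the top level, even though winlogon.exe started the process that created it.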
Process-Based
Windows NT Code
Win32 services

Win32 .EXEs (applications) that run independently of a
logged on user








Start at boot or logon time, survive logoff
Defined by CreateService API - view through Control Panel
See srvany.exe, sc.exe, srvinstw.exe, instsrv.exe in Resource Kit
Typically do not interact with the desktop
 Get startup configuration parameters from Registry
 Log errors to Windows NT Event Log
Use some form of IPC mechanism for client communication and control
Services will likely make use of Windows NT security impersonation
Remotely manageable (start, stop, user-defined codes)
 Server Manager allows remote control of services
 Code is the same to control services locally vs. remotely
Examples of built-in Windows NT Services

Schedule service (at command), Event Log, Remote Access Server, etc.
Life Of A Service

Install time
  Setup application tells Service Controller about the service
System boot / initialization
  SCM reads registry, starts services as directed
Management / maintenance
  Control panel can start and stop services and change startup parameters
[Diagram: Setup Application, CreateService, Registry, Control Panel, Service Controller, and Service Processes]
Where Are Services Defined?

Maintained in Windows NT Registry:
  HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services
  One key per installed service
Mandatory information kept on each service:
  Type of service (Win32, Driver…)
  Imagename of service .EXE
    NOTE: Some service .EXEs contain more than one service
  Start type (automatic, manual, or disabled)
Optional information:
  Display Name
  Dependencies
  Account and password to run under
Can store application-specific configuration parameters
  “Parameters” under service key
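The Type and Start values under each service key are small integers. This Python sketch decodes them from plain numbers rather than a live registry. The function name is hypothetical. The numeric values (0x01, 0x02, 0x10, 0x20 for Type, and 0 to 4 for Start) follow the commonly documented Win32 service constants and are not stated on the slide, so treat them as assumptions of the example.

```python
SERVICE_TYPE_BITS = {
    0x01: "Kernel driver",
    0x02: "File system driver",
    0x10: "Win32 service in its own process",
    0x20: "Win32 service sharing a process",
}

START_TYPES = {
    0: "boot",
    1: "system",
    2: "automatic",
    3: "manual",
    4: "disabled",
}


def describe_service(type_value, start_value):
    """Decode a service key's Type and Start values."""
    kinds = [name for bit, name in SERVICE_TYPE_BITS.items()
             if type_value & bit]
    if start_value not in START_TYPES:
        raise ValueError("unrecognized Start value: %r" % start_value)
    return kinds, START_TYPES[start_value]


print(describe_service(0x20, 2))
print(describe_service(0x01, 0))
```

A shared-process Type value matches the note on this slide that some service .EXEs contain more than one service.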
Process-Based
Windows NT Code
System startup processes

Separate processes loaded or started at boot time (not as services or environment subsystems)
Names of images are not in registry
  “Hardwired” in the source code
Most are Win32 executables, one (smss) is a “native image”

(Idle)
  Process id 0
  Part of the loaded system image
  Home for idle thread(s) (not a real process nor real threads)
  Called “System Process” in many displays
(System)
  Process id 2
  Part of the loaded system image
  Home for kernel-defined threads (not a real process)
  Thread 0 (routine name Phase1Initialization) launches the first “real” process, running smss.exe…
  …and then becomes the zero page thread
Process-Based Windows NT Code
System startup processes

smss.exe: Session Manager
  The first “created” process
  Takes parameters from \Registry\Machine\System\CurrentControlSet\Control\Session Manager
  Launches required subsystems (csrss) and winlogon
winlogon.exe: Logon process
  Presents first login prompt
  Presents “enter username and password” dialog
  Launches services.exe, lsass.exe, and nddeagnt.exe
  When someone logs in, launches userinit.exe
services.exe: Service Controller; also, home for many NT-supplied services
  Starts processes for services not part of services.exe (driven by \Registry\Machine\System\CurrentControlSet\Services)
lsass.exe: Local Security Authentication Server
userinit.exe: Started after logon; starts desktop (Explorer.Exe) and exits
  (hence does not show up in tlist output; Explorer appears to be an orphan)
explorer.exe and its children are the creators of all interactive apps
Agenda

Introduction
Tools
System Architecture
  Kernel Mode Environment
  Executive, Kernel, HAL, Drivers
  Product Packaging
  System Threads
  Environment Subsystems
  System Service Dispatching
  Process-based Windows NT code
  Summary
Processes and Threads
Memory Management
Four Contexts For Executing Code

Full process and thread context:
  User applications
  Win32 Services
  Environment subsystem processes
  System startup processes
Have thread context but no “real” process:
  Threads in “System” process
Routines called by other threads / processes:
  Subsystem DLLs
  Executive system services (NtReadFile, etc.)
  GDI routines in Win32K.Sys (and graphics drivers)
No process or thread context (“Arbitrary thread context”):
  Interrupt dispatching
  Device drivers
Where Is The Code?

Kernel32.Dll, Gdi32.Dll, User32.Dll
  Export Win32 entry points
NtDll.Dll
  Provides user-mode access to system-space routines
  Also contains heap manager, image loader, thread startup routine
Ntoskrnl.Exe (or Ntkrnlmp.exe)
  Executive and kernel
  Includes most routines that run as threads in “system” process
Win32K.Sys
  The loadable module that includes the now-kernel-mode Win32 code (formerly in csrss.exe)
Hal.Dll
  Hardware Abstraction Library
drivername.Sys
  Loadable kernel drivers
Processes And Threads

What is a process?
  Represents an instance of a running program
    You create a process to run a program
    Starting an application creates a process
  Primary argument to CreateProcess is image file name (or command line)
What is a thread?
  An execution context within a process
  Primary argument to CreateThread is a function entry point address
  All threads in a process share the same per-process address space
Every process starts with one thread
  Running the program’s “main” function
  Can create other threads in the same process
  Can create additional processes
[Diagram: threads inside a per-process address space, above the systemwide address space]
Tools To Examine Processes

Task Manager
Performance Monitor
pviewer.exe (pview in Platform SDK): shows processes, threads within processes, memory details
pview.exe (process explode): thread and process ACLs and tokens
tlist.exe: tlist /t shows parent/child relationships
QuickSlice (qslice.exe)
  CPU usage by process, and by thread within each process
Pulist: process user list
Vadump: dump virtual address space of a process
Tools To Examine Processes

Page fault monitor (pfmon.exe)
  Shows page fault type and origin of subject application
  Can provide data to working set tuner (part of Platform SDK)
Pstat
  pstat.exe (char mode, no icon)
  One-time snapshot of system
  Shows state of threads within all processes, with wait reasons
Kernel debugger
  Shows various internal structures
  See Windows NT® Workstation Resource Kit documentation
oh.exe (ResKit), nthandleex (www.sysinternals.com): show open handles
Ntpmon (www.sysinternals.com)
Windows NT 5.0 Job Object

New kernel object to collect a group of related processes
  CreateJobObject/OpenJobObject
System enforces job quotas and security context
  Limits: Total and current CPU time, total and active processes, per-process and per-job CPU time, min and max working set, CPU affinity, priority class
  Security limits: No administrators token, only restricted token, only specific token, filter token, no accessing windows outside the job, no reading/writing the clipboard
To examine: See new performance counters + new !job command in kernel debugger
Processes And Threads
Internal Structures

[Diagram: a process object linked to its access token, its chain of VADs (virtual address space descriptors), its handle table (pointing at objects), and its threads; an access token also appears beside the threads]

See kernel debugger commands:
  !processfields
  !threadfields
  !process
  !thread
  !tokenfields
  !token
  !handle
  !object
!processfields

Field                           Offset
Pcb:                            0x0
ExitStatus:                     0x68
LockEvent:                      0x6c
LockCount:                      0x7c
CreateTime:                     0x80
ExitTime:                       0x88
LockOwner:                      0x90
UniqueProcessId:                0x94
ActiveProcessLinks:             0x98
QuotaPeakPoolUsage[0]:          0xa0
QuotaPoolUsage[0]:              0xa8
PagefileUsage:                  0xb0
CommitCharge:                   0xb4
PeakPagefileUsage:              0xb8
PeakVirtualSize:                0xbc
VirtualSize:                    0xc0
Vm:                             0xc8
LastProtoPteFault:              0xf8
DebugPort:                      0xfc
ExceptionPort:                  0x100
ObjectTable:                    0x104
Token:                          0x108
WorkingSetLock:                 0x10c
WorkingSetPage:                 0x12c
ProcessOutswapEnabled:          0x130
ProcessOutswapped:              0x131
AddressSpaceInitialized:        0x132
AddressSpaceDeleted:            0x133
AddressCreationLock:            0x134
ForkInProgress:                 0x158
VmOperation:                    0x15c
VmOperationEvent:               0x160
PageDirectoryPte:               0x164
LastFaultCount:                 0x168
VadRoot:                        0x170
VadHint:                        0x174
CloneRoot:                      0x178
NumberOfPrivatePages:           0x17c
NumberOfLockedPages:            0x180
ForkWasSuccessful:              0x184
ExitProcessCalled:              0x186
CreateProcessReported:          0x187
SectionHandle:                  0x188
Peb:                            0x18c
SectionBaseAddress:             0x190
QuotaBlock:                     0x194
LastThreadExitStatus:           0x198
WorkingSetWatch:                0x19c
LpcPort:                        0x1a0
InheritedFromUniqueProcessId:   0x1a4
GrantedAccess:                  0x1a8
DefaultHardErrorProcessing:     0x1ac
LdtInformation:                 0x1b0
VadFreeHint:                    0x1b4
VdmObjects:                     0x1b8
ProcessMutant:                  0x1bc
ImageFileName[0]:               0x1dc
VmTrimFaultValue:               0x1ec
!threadfields

Field                           Offset
Tcb:                            0x0
CreateTime:                     0x1b0
ExitTime:                       0x1b8
ExitStatus:                     0x1c0
PostBlockList:                  0x1c4
TerminationPortList:            0x1cc
ActiveTimerListLock:            0x1d4
ActiveTimerListHead:            0x1d8
Cid:                            0x1e0
LpcReplySemaphore:              0x1e8
LpcReplyMessage:                0x1fc
LpcReplyMessageId:              0x200
Client:                         0x208
IrpList:                        0x20c
TopLevelIrp:                    0x214
ReadClusterSize:                0x21c
ForwardClusterOnly:             0x220
DisablePageFaultClustering:     0x221
DeadThread:                     0x222
HasTerminated:                  0x223
EventPair:                      0x224
GrantedAccess:                  0x228
ThreadsProcess:                 0x22c
StartAddress:                   0x230
Win32StartAddress:              0x234
LpcExitThreadCalled:            0x238
HardErrorsAreDisabled:          0x239
Looking At Waiting Threads

pstat.exe (Resource Kit)
  Shows state of every thread in every process
  But for threads that are waiting, that’s all we know…
!thread command in kernel debugger shows what a thread is waiting on
Dispatcher Objects

Any kernel object you can wait for is a “dispatcher object”
  Some exclusively for synchronization
    E.g., events, mutexes (“mutants”), semaphores, queues, timers
  Others can be waited for as a side effect of their prime function
    E.g., processes, threads, file objects
  Non-waitable kernel objects are called “control objects”
All dispatcher objects have a common header
All dispatcher objects are in one of two states
  “Signalled” versus “nonsignalled”
  When signalled, a wait on the object is satisfied
  Different object types differ in terms of what changes their state
  Wait and unwait implementation is common to all types of dispatcher objects
[Diagram: dispatcher object layout: common header (Size, Type, State, Wait listhead) followed by object-type-specific data; see \ddk\inc\ntddk.h]
Wait Blocks

Represent a thread’s reference to something it’s waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for “any” or “all”
Key denotes argument list position for WaitForMultipleObjects

[Diagram: thread objects, each with a WaitBlockList pointing at its chain of wait blocks; dispatcher objects (Size, Type, State, Wait listhead, object-type-specific data), each heading a list of the wait blocks queued on it. Wait block fields: List entry, Thread, Object, Key, Type, Next link]
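The relationship between threads, wait blocks, and dispatcher objects can be sketched as a small data model. This is a toy illustration in Python rather than the kernel's structures: the class and method names are invented for the example, and "satisfying" a wait is reduced to checking the signalled flags.

```python
WAIT_ANY, WAIT_ALL = "any", "all"


class DispatcherObject:
    """Common header: a signalled state plus a list of queued wait blocks."""

    def __init__(self, name, signalled=False):
        self.name = name
        self.signalled = signalled
        self.wait_list = []           # wait blocks queued on this object


class WaitBlock:
    """Links one waiting thread to one object it is waiting for."""

    def __init__(self, thread, obj, key, wait_type):
        self.thread = thread
        self.obj = obj
        self.key = key                # position in the caller's handle array
        self.wait_type = wait_type    # WAIT_ANY or WAIT_ALL


class Thread:
    def __init__(self, name):
        self.name = name
        self.wait_blocks = []         # chain of blocks for the current wait

    def wait_for(self, objects, wait_type):
        """Build one wait block per object, chained to this thread."""
        for key, obj in enumerate(objects):
            block = WaitBlock(self, obj, key, wait_type)
            self.wait_blocks.append(block)
            obj.wait_list.append(block)

    def satisfied_key(self):
        """Return the key that satisfies the wait, or None if still blocked."""
        if not self.wait_blocks:
            return None
        signalled = [b.key for b in self.wait_blocks if b.obj.signalled]
        if self.wait_blocks[0].wait_type == WAIT_ALL:
            return 0 if len(signalled) == len(self.wait_blocks) else None
        return signalled[0] if signalled else None
```

A thread that waits on two objects gets two wait blocks with keys 0 and 1, and each object's wait list gains one entry per waiting thread. With a wait-any, the returned key identifies which object ended the wait.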
Agenda

Introduction
Tools
System Architecture
Processes and Threads
Memory Management
  Virtual Address Space Layout
  Process Memory Usage
  Global System Cache
  System Memory Usage
4GB Virtual Address Space

2 GB per-process (00000000 - 7FFFFFFF)
  Unique per process, accessible in user or kernel mode
  Contents: .EXE code, globals, per-thread user mode stacks, process heaps, .DLL code
  Address space of one process is not directly reachable from other processes
2 GB systemwide (80000000 - FFFFFFFF)
  From 80000000: system wide, accessible only in kernel mode
    Exec, Kernel, HAL, drivers, per-thread kernel mode stacks, Win32K.Sys
  From C0000000: per process, accessible only in kernel mode
    Process page tables, hyperspace
  Up to FFFFFFFF: system wide, accessible only in kernel mode
    File system cache, paged pool, non-paged pool
  The operating system is loaded here, and appears in every process’s address space
  There is no process for “the operating system” (though there are processes that do things for the OS, more or less in “background”)
System Space Layout

x86
Start address     Contents
80000000          System code (NTOSKRNL, HAL, boot drivers); initial nonpaged pool
A0000000          System Mapped Views (e.g. WIN32K.SYS) or session space (Terminal Server only)
A4000000          Additional System PTEs (& big cache)
C0000000          Process Page Tables and Page Directory
C0400000          Hyperspace and process working set list
C0800000          Unused, No Access
C0C00000          System Working Set List
C1000000          System Cache
E1000000          Paged Pool
EB000000 (min)    System PTEs, followed by Non-Paged Pool expansion
FFBE0000          Crash dump information
FFC00000          HAL usage

Alpha AXP
Start address     Contents
80000000          System code (NTOSKRNL, HAL, boot drivers) and initial nonpaged pool
C0000000          Process Page Tables and Page Directory
C1000000          Hyperspace and process working set list
C2000000          Unused, No Access
C3000000          System Working Set List
C4000000          System Cache
DE000000          System Mapped Views (e.g. WIN32K.SYS)
E1000000          Paged Pool
EB000000 (min)    System PTEs, followed by Non-Paged Pool expansion
FDFEC000          Crash dump information & HAL usage
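As a rough illustration of reading the x86 column, the sketch below looks up which region a 32-bit address falls in. The start addresses are the ones listed on the slide; the function name is made up for this example, and the EB000000 boundary is only the minimum the slide gives, so results near it are approximate.

```python
from bisect import bisect_right

# (start address, description) pairs from the x86 column of the slide.
X86_SYSTEM_REGIONS = [
    (0x80000000, "System code (NTOSKRNL, HAL, boot drivers); initial nonpaged pool"),
    (0xA0000000, "System mapped views or session space"),
    (0xA4000000, "Additional system PTEs"),
    (0xC0000000, "Process page tables and page directory"),
    (0xC0400000, "Hyperspace and process working set list"),
    (0xC0800000, "Unused (no access)"),
    (0xC0C00000, "System working set list"),
    (0xC1000000, "System cache"),
    (0xE1000000, "Paged pool"),
    (0xEB000000, "System PTEs and nonpaged pool expansion"),  # "(min)" on the slide
    (0xFFBE0000, "Crash dump information"),
    (0xFFC00000, "HAL usage"),
]
_STARTS = [start for start, _ in X86_SYSTEM_REGIONS]


def x86_system_region(address):
    """Return the slide's description of the region containing `address`."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    index = bisect_right(_STARTS, address) - 1   # last region starting <= address
    if index < 0:
        return "User space (per-process)"
    return X86_SYSTEM_REGIONS[index][1]
```

For example, an address just above C0000000 resolves to the process page tables, while anything below 80000000 is reported as per-process user space.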
3GB Process Space Option

00000000 - BFFFFFFF
  Unique per process (= per appl.), user mode; accessible in user or kernel mode
  Contents: .EXE code, globals, per-thread user mode stacks, .DLL code, process heaps
C0000000 - FFFFFFFF
  Per process, accessible only in kernel mode: process page tables, hyperspace
  System wide, accessible only in kernel mode: Exec, kernel, HAL, drivers, etc.

Only available on x86 Server Enterprise Edition
Boot with /3GB option in BOOT.INI
Expands per-process address space
But image must be marked as “large address space aware”
Chief “loser” in system space is file system cache
A stopgap while we wait for 64-bit Windows NT (Merced and Alpha; post-Windows NT 5.0)
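The effect of the /3GB option on the user/system boundary comes down to simple arithmetic, sketched below. The function names and the two boolean parameters are hypothetical; they stand for "booted with /3GB" and "image marked large address space aware".

```python
GB = 1 << 30


def user_space_limit(boot_3gb=False, large_address_aware=False):
    """Return the first address above the user-mode part of the address space."""
    if boot_3gb and large_address_aware:
        return 3 * GB      # user space is 00000000 - BFFFFFFF
    return 2 * GB          # user space is 00000000 - 7FFFFFFF


def is_user_address(address, boot_3gb=False, large_address_aware=False):
    """True if `address` lies in the per-process user-mode range."""
    return 0 <= address < user_space_limit(boot_3gb, large_address_aware)
```

An address such as 90000000 is a system address in the default layout, and becomes a user address only when both conditions hold.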
64-bit Very Large Memory In Windows NT 5.0

Address range                            Contents
00000000 00000000 - 00000000 7FFFFFFF    2GB user space (2GB process space)
00000001 00000000 - 00000007 FFFFFFFF    28GB Large Memory Area
00000008 00000000 - FFFFFFFF 7FFFFFFF    Invalid (inaccessible) (about 1.8x10^19 bytes; not to scale!)
FFFFFFFF 80000000 - FFFFFFFF FFFFFFFF    2GB system space

Alpha Windows NT Server Enterprise Edition only
Referenced by 64-bit pointers
Cannot be paged out - must be resident at all times
Cannot be used for code, only data file mapping
New APIs: VirtualAllocVlm, MapViewOfFileVlm, Read/WriteFileVlm, Read/WriteProcessMemoryVlm, etc.
Yet another stopgap prior to 64-bit Windows NT
Application Startup Maps V.A.S. To Code On Disk

[Diagram: process address space from 00000000 to 7FFFFFFF, with regions backed by the .exe, the .dll, and the paging file]

See link /dump /headers, or QuickView for .exe’s and .dll’s
CreateFileMapping, MapViewOfFile simply make the mechanism available to application-level code
All of these files may simultaneously be mapped by other processes

Process Virtual Address Layout

Screen snapshot from: Programs | SDK Tools | Process Walker
Process | Load Process | notepad
Process Memory Usage

Working set: All the physical pages “owned” by a process
  Essentially, all the pages the process can reference without incurring a page fault
Working set limit: The maximum pages the process can own
  Upper limit on size for each process
  When limit is reached, a page must be released for every page that’s brought in (“working set replacement”)
  Maximum is calculated as (available pages - 512 pages)
  Result stored in MmMaximumWorkingSetSize

Working Set List

A FIFO list for each process
[Diagram: working set list running from newer pages to older pages; size shown by PerfMon Process “WorkingSet”]
Working Set Replacement

When working set “count” = working set size, must give up pages to make room for new pages
Page replacement is “modified FIFO”
  MP x86 and Alpha: no regard to accessed bit
  Windows NT 5.0 on uniprocessor x86 takes into account age
[Diagram: pages leaving the working set (PerfMon Process “WorkingSet”) go to the standby or modified page list]
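The replacement behaviour can be illustrated with a toy simulation. This is a sketch of plain FIFO only, under stated assumptions: it ignores the "modified" part of "modified FIFO" (no aging or accessed-bit checks), and the `WorkingSet` class and its attribute names are invented for the example. Removed pages go to a standby list if clean or a modified list if dirty, and a later fault on such a page is treated as a soft fault that pulls it back.

```python
from collections import deque


class WorkingSet:
    """Toy per-process working set with plain FIFO replacement."""

    def __init__(self, limit):
        self.limit = limit
        self.pages = deque()          # oldest page at the left
        self.dirty = set()
        self.standby = []             # clean pages removed from the working set
        self.modified = []            # dirty pages removed from the working set
        self.faults = 0

    def touch(self, page, write=False):
        """Reference a page, faulting it in (and evicting) when needed."""
        if page not in self.pages:
            self.faults += 1
            if page in self.standby:          # soft fault: still in memory
                self.standby.remove(page)
            elif page in self.modified:       # soft fault: still dirty
                self.modified.remove(page)
                write = True
            if len(self.pages) == self.limit:
                self._replace()
            self.pages.append(page)
        if write:
            self.dirty.add(page)

    def _replace(self):
        """Give up the oldest page, regardless of how recently it was used."""
        victim = self.pages.popleft()
        if victim in self.dirty:
            self.dirty.discard(victim)
            self.modified.append(victim)
        else:
            self.standby.append(victim)
```

Note that re-referencing a resident page does not move it in the list, so a frequently used page is still the first to go once it becomes the oldest. That is the weakness the age-aware variant mentioned above is meant to reduce.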
Locking Pages

Pages may be locked into the process working set
  Locked pages are guaranteed in physical memory (“resident”) when any thread in process is executing
  Win32:
    status = VirtualLock(baseAddress, size);
    status = VirtualUnlock(baseAddress, size);
  Number of lockable pages is a fraction of the maximum working set size
    Changed by SetProcessWorkingSetSize
Pages can be locked into physical memory (by drivers only)
  Pages are then immune from outswapping as well as paging
  MmProbeAndLockPages
Memory Management Information
Task manager processes tab

1  “Mem Usage” = physical memory used by process (working set size, not working set limit)
2  “VM Size” = private (not shared) committed virtual space in processes
3, 4  “Mem Usage” in status bar is total of “VM Size” column / maximum allowed, i.e., same as “commit charge” in “Performance” tab (see next slide); not same as “Mem Usage” column here!

Screen snapshot from: Task Manager | Processes tab
Memory Management Information
PerfMon - process object

1  “Working Set” = working set size (not limit)
2  “Private Bytes” = same as “VM Size” from Task Manager Processes list
6  “Virtual Bytes” = committed virtual space, including shared pages

Screen snapshot from: Performance Monitor counters from Process object
Memory Management Information
Task manager performance tab

3  “Commit charge total” = total of private (not shared) committed virtual space in all processes (i.e. total of “VM Size” from processes display)
4  “Commit charge limit” = sum of available physical memory + free space in paging file

Screen snapshot from: Task Manager | Performance tab
File System Virtual Block Cache

Shared by all file systems (local or remote)
Caches all files
  Including file system metadata files
Virtual block cache (not logical block)
  Managed in terms of blocks within files, not blocks within partition
  Uses standard Windows NT virtual memory mechanisms
  Coherency maintained between mapped files and read/write access
Virtual size: 64-512MB (960MB if large cache size set)
  In system virtual address space, so visible to all
  Divided into 256KB “views”
Cached File Operations

Open a file:
  Find an available view
  Map the first 256KB of the file into the view
Read from or write to a cached file:
  Remap as necessary to map referenced section of file into the cache
  Copy data between application buffer and cache’s virtual address space
  Actual I/O is due to paging
[Diagram: data copied between the process address space and the system address space, where views of the file are mapped]
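Because the cache maps files in 256KB views, locating the view for a given file offset is a matter of integer arithmetic. The sketch below shows that arithmetic only; the function names are invented for the example, and it assumes views are aligned on 256KB boundaries within the file.

```python
VIEW_SIZE = 256 * 1024   # each cache view maps 256KB of a file


def view_for_offset(file_offset):
    """Return (view-aligned file offset, offset within that view)."""
    if file_offset < 0:
        raise ValueError("file offset must be non-negative")
    view_base = (file_offset // VIEW_SIZE) * VIEW_SIZE
    return view_base, file_offset - view_base


def views_touched(file_offset, length):
    """List the view-aligned bases that a read or write of `length` bytes spans."""
    if length <= 0:
        return []
    first, _ = view_for_offset(file_offset)
    last, _ = view_for_offset(file_offset + length - 1)
    return list(range(first, last + VIEW_SIZE, VIEW_SIZE))
```

A 16KB read that starts 250KB into a file crosses the first view boundary, so it needs two views; this is the "remap as necessary" case above.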
Fast I/O

Fast I/O path
  Allows executive I/O APIs to access cache directly
  Bypasses file system driver
  Bypasses IRP generation, probe-and-lock of user buffer, etc.

[Diagram: layered I/O stack: I/O Subsystem API (Ntxxx); I/O Manager (Ioxxx); File System drivers (e.g. NTFS); Disk device driver; HAL I/O access routines; I/O ports and registers. The Fast I/O path leads to the Cache Manager. Driver Support Routines (Io, Ex, Ke, Mm, Hal, FsRtl, ...) sit alongside the stack]
Cache Size

Physical size: Depends on available memory
  Competes for physical memory with processes, paged pool, pageable system code
  Part of “system working set”
    Automatically expanded / shrunk by system
    Normal working set adjustment mechanisms
    Relies on Memory Manager for global memory policy
    Performance Monitor: Memory object | System cache resident bytes shows current physical space occupied by cache
  See \SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\LargeSystemCache
    Default is 0 for both Workstation and Server
    1 = favor system working set vs. process working set
    also allows cache to be >512MB virtual size
    Can modify with Control Panel->Network->Services->Server properties
Cache Functions And Control

Automatic asynchronous readahead
  Done by separate “Readahead” system thread
  64KB readaheads by default
  Predicts next read location based on history of last 3 reads
  Readahead hints can be provided to CreateFile:
    FILE_FLAG_SEQUENTIAL_SCAN does 192KB read ahead
    FILE_FLAG_RANDOM_ACCESS disables read ahead
Write-back, not write-through
  Dirty page threshold forces writing
    Small system: Physical Pages / 8; medium system: Physical Pages / 4
    Large system: add above 2 together
  “Lazy writer” thread queues 1/4 of dirty pages every second to separate “Write Behind” system thread (note, does not flush mapped files)
  Can override via CreateFile with FILE_FLAG_WRITE_THROUGH
    Or explicitly call FlushFileBuffers when you care (does flush mapped files)
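The write-back numbers above can be worked through with a short sketch. It is an arithmetic illustration only: the function names are invented, the system size is passed in as a string rather than detected, integer rounding is an assumption, and the lazy writer model assumes no new pages are dirtied while it drains.

```python
def dirty_page_threshold(physical_pages, system_size):
    """Dirty-page threshold following the small/medium/large rule above."""
    small = physical_pages // 8
    medium = physical_pages // 4
    if system_size == "small":
        return small
    if system_size == "medium":
        return medium
    if system_size == "large":
        return small + medium         # "add above 2 together"
    raise ValueError("system_size must be 'small', 'medium' or 'large'")


def lazy_writer_passes(dirty_pages):
    """Yield the pages queued on each one-second pass (1/4 of what is dirty)."""
    while dirty_pages > 0:
        batch = max(1, dirty_pages // 4)   # always make some progress
        yield batch
        dirty_pages -= batch
```

With 16,384 physical pages, the thresholds come out to 2,048, 4,096, and 6,144 pages for small, medium, and large systems. Because each pass takes only a quarter of what remains, a burst of dirty pages drains over many seconds rather than at once.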
Cache Functions And Control

Can disable cache completely on a per-file basis
  CreateFile with FILE_FLAG_NO_BUFFERING
  Requires reads/writes to be done on sector boundaries
  Buffers must be aligned in memory on sector boundaries
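The alignment rules for unbuffered I/O can be checked before a request is issued. The sketch below is a platform-neutral illustration: the helper names are invented, and the 512-byte default is only an assumption, since the real sector size has to be queried for the volume (for example through the bytes-per-sector value returned by GetDiskFreeSpace).

```python
def check_unbuffered_io(file_offset, length, buffer_address, sector_size=512):
    """Return a list of problems; an empty list means the request is aligned."""
    problems = []
    if file_offset % sector_size:
        problems.append("file offset is not on a sector boundary")
    if length % sector_size:
        problems.append("length is not a multiple of the sector size")
    if buffer_address % sector_size:
        problems.append("buffer is not sector-aligned in memory")
    return problems


def round_up_to_sector(length, sector_size=512):
    """Round a byte count up to the next whole number of sectors."""
    return -(-length // sector_size) * sector_size
```

A caller that wants to read 1,000 bytes would round the request up to 1,024 and read into a sector-aligned buffer, then use only the bytes it needs.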
System Paged Memory

Just as processes have working sets, Windows NT’s pageable system-space code and data lives in the “system working set”
  Cache is one of 4 components of “system working set”
Pageable components of system working set:
  Paged pool
  Pageable code and data in the exec
  Pageable code and data in kernel-mode drivers, Win32K.Sys, graphics drivers, etc.
  Global file system data cache
To get physical (resident) size of these with PerfMon, look at:
  Memory | Pool Paged Resident Bytes
  Memory | System Code Resident Bytes
  Memory | System Driver Resident Bytes
  Memory | System Cache Resident Bytes
  Memory | Cache bytes counter is total of these four “resident” (physical) counters (not just the cache; same as “File Cache” on Task Manager / Performance tab)
Sessions

New memory management object to support Windows NT® Server 5.0
All processes in an interactive session share a:
  Session-specific copy of Win32K.Sys
  Instance of Winlogon
  Session working set

x86
Start address     Contents
80000000          System code (NTOSKRNL, HAL, boot drivers); initial nonpaged pool
A0000000          Win32k.sys (8MB)
A0800000          Session Working Set Lists
A0C00000          Mapped Views for Session
A2000000          Paged Pool for Session
System Nonpaged Memory

Nonpageable components:
  Nonpageable parts of NtosKrnl.Exe, drivers
  Nonpaged pool (see PerfMon, Memory object: Pool nonpaged bytes)
To get size of nonpageable system code, run \ntreskit\pstat.exe & add columns 1 & 2
  Output of “drivers” (\ntreskit\drivers.exe) is similar
  Win32K.Sys is paged, even though it shows up as nonpaged

Callouts on the pstat output:
  7  non-paged code
  8  non-paged data
  9  pageable code+data
Monitoring Pool Usage

Poolmon.exe in \support\debug
Must first turn on pool tagging with gflags
“p” to toggle between nonpaged, paged pool, or both
Sorting:
  “b” to sort by total # of bytes
  “a” to sort by # of allocations
  “t” to sort by structure tag
“Free” Memory

System keeps unassigned physical pages (those not part of any working set) on five lists
  Free page list
  Modified page list
  Standby page list
  Zero page list
  Bad page list - pages that failed memory test at system startup
Managing Physical Pages

[Diagram: page movement between the Process Working Sets and the Standby, Modified, Free, Zero, and Bad page lists. Arrow labels: demand zero page faults, pages read from disk, “soft” page faults, working set replacement, modified page writer, zero page thread]
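The page movements named in the diagram can be written down as a small state machine. The arrow endpoints below are an interpretation rather than something the surviving text states outright: working set replacement sends pages to the standby or modified list, the modified page writer moves pages from modified to standby, soft faults pull pages back into a working set, and the zero page thread moves pages from free to zero. The class and the exact event strings are invented for this sketch.

```python
# (event, source, destination) triples; list names follow the slide.
TRANSITIONS = {
    ("working set replacement", "working set", "standby"),
    ("working set replacement", "working set", "modified"),
    ("modified page writer", "modified", "standby"),
    ("soft page fault", "standby", "working set"),
    ("soft page fault", "modified", "working set"),
    ("zero page thread", "free", "zero"),
    ("demand zero page fault", "zero", "working set"),
    ("page read from disk", "free", "working set"),
}


class PhysicalPage:
    """Tracks which list (or working set) a physical page currently sits on."""

    def __init__(self, state="free"):
        self.state = state

    def move(self, event, destination):
        """Apply one transition, rejecting moves the model does not allow."""
        if (event, self.state, destination) not in TRANSITIONS:
            raise ValueError(
                f"{event!r} cannot move a page from {self.state} to {destination}"
            )
        self.state = destination
```

Walking one page through zeroing, a demand-zero fault, replacement as a dirty page, a write by the modified page writer, and a soft fault returns it to a working set without any read from disk.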
Memory Management Information
Task manager performance tab

1  “Available” memory = total of free, zero, and standby lists (majority usually are standby pages)
2  “File cache” is really total physical size of pageable portions of: paged pool, NtosKrnl.Exe code and data, drivers code and data, and file system cache (same as PerfMon “cache bytes” counter)
3  “Kernel Memory Paged” is resident size of paged pool
4  “Kernel Memory Nonpaged” is actual size of nonpaged pool

Screen snapshot from: Task Manager | Performance tab
Summary: Accounting For Physical Memory Usage

Process working sets
  Perfmon: Process / Working set
  Note, shared resident pages are counted in the process working set of every process that’s faulted them in
  Hence, the total of all of these may be greater than physical memory
Nonpageable system code (NTOSKRNL + drivers, including win32k.sys & graphics drivers)
  See total displayed by DRIVERS utility in Windows NT Resource Kit
Nonpageable pool
  Perfmon: Memory / Pool nonpaged bytes
Free, zero, and standby page lists
  Perfmon: Memory / Available bytes
  Or: Task Manager / Performance tab: Physical memory: Available
Pageable, but currently-resident, system-space memory
  Perfmon: Memory / Pool paged resident bytes
  Perfmon: Memory / System cache resident bytes
  Perfmon: Memory / System code resident bytes
  Perfmon: Memory / System driver resident bytes
  Memory | Cache bytes counter is really total of these four “resident” (physical) counters
Modified, Bad page lists
  Can only see size with !memusage command in Kernel Debugger
Windows NT Internals Information Sources

Books
  Inside Windows NT (Solomon, MS Press)
  Advanced Windows (Richter, MS Press)
  Windows NT Workstation Resource Guide (MS Press)
MSDN Library
  Platform SDK API documentation
  Windows NT Device Driver Kit (DDK) documentation
  Win32 Knowledge Base - has some Windows NT internals articles
Past Windows NT conferences audio/video tapes (www.mobiletape.com)
www.sysinternals.com - Windows NT internals articles and tools
www.microsoft.com/hwdev - hardware developers and driver writers
www.microsoft.com/hwdev/ntifskit - Installable File System Developers Kit
comp.os.ms-windows.programmer.nt.kernel-mode - drivers newsgroup
www.cmkrnl.com - Windows NT device driver FAQ