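Chapter 5: Operating System Security Architecture
 Testing operating systems
 Penetration testing and tiger team analysis
 Ethical hacking: hacking done for constructive purposes
 Patching
 Can / cannot
 Reasons
 An old system is put to new uses
 The system's design did not anticipate enough
 What should the security architecture be when building a secure operating system?
 The main task of architecture design
 Requirements come from different perspectives and may conflict
 Trade-offs are needed
 The security architecture of a computer system includes
1. A detailed description of all security-relevant aspects of the system
2. A description, at a certain level of abstraction, of the relationships among the security-relevant modules
3. The basic principles that guide the design
4. The basic framework of the development process and the layered structure corresponding to that framework
 Trusted Computer System Evaluation Criteria (TCSEC)
 Common Criteria (CC)
U.S. Department of Defense Goal Security Architecture (DGSA)
 Abstract architecture
 Generic architecture
 Logical architecture
 Specific architecture
 Flask
 Flask in LSM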
Flask history
 In 1992 & 1993, researchers at the NSA and SCC worked on
the design and implementation of DTMach, an outgrowth of
the TMach project and the LOCK project.
 DTMach integrated a generalization of type enforcement, a flexible
access control mechanism, into the Mach microkernel.
 The DTMach project was continued in the DTOS project.
 The DTOS project improved upon the earlier design and
implementation work, yielding a prototype that was released
to universities for research (e.g. Secure Transactional
Resources, DX).
 After the DTOS project, a new joint effort was started by the
NSA, SCC, and the University of Utah's Flux project to
transfer the DTOS security architecture into the Fluke
research OS.
 During the integration, the architecture was enhanced to
provide better support for dynamic security policies.
 It was named Flask.
 Flask: Flux Advanced Security Kernel
 Flask was ported to:
 OSKit
 Security-Enhanced Linux
Ray Spencer et al., "The Flask Security Architecture:
System Support for Diverse Security Policies," in
Proceedings of the 8th USENIX Security Symposium,
Washington, D.C.: USENIX Association, 1999.
The Flask Security Architecture:
System Support for Diverse Security Policies
Ray Spencer (Secure Computing Corporation)
Stephen Smalley, Peter Loscocco (National Security Agency)
Mike Hibler, David Andersen, Jay Lepreau (University of Utah)
Based in part on slides by Jim Stevens
The notion of “security” in a system is defined in
terms of its security policy
A wide range of security policies exist due to the
diversity of computing environments
Operating systems must be flexible in their support for
security policies in order to accommodate this
spectrum
Supporting policy flexibility is not as simple as just
implementing multiple policies
3 Requirements of Policy Flexibility
 Support fine-grained access controls on low-level objects
 Propagate access rights according to security policy
 Deal with changes in policy over time, including
revoking previously granted permissions
 Earlier systems provided some mechanisms to implement
policy flexibility
 Previous systems failed to address all three requirements at once
 This paper describes the Flask architecture and a microkernel-based
prototype to demonstrate that policy flexibility is practical
 Flask is based on the concept of mandatory access controls
 Compare to discretionary access controls (DAC)
Policy Flexibility
 List all known security policies and define flexibility
through that list?
 Unrealistic
A better definition is needed!
It is more useful to define security policy flexibility
by viewing the computer system as an abstract state
machine with atomic state transformations
Total flexibility is achieved when the security policy
knows the entire state of the system and can affect all
operations in the system
 Allow/deny operation
 Atomically inject handler routines
 It is possible to modify the existing security policy and to
revoke any previously granted access.
Total flexibility is obviously not possible in a real system
A more realistic approach is to ask what subset of
system state and operations are relevant to security
 Flexibility of a practical system therefore depends on
how complete the set of control operations is and what
portion of the state is available to the security policy
 Granularity of the controlled operations affects the degree
of flexibility because it impacts the granularity at which
sharing can be controlled
A policy flexible system must be capable of
supporting a wide variety of security policies.
Security policies may be classified by
 The need to revoke previously granted access
 The type of input required to make access decisions
 The sensitivity of policy decisions to external factors like
history or environment
 Transitivity of access decisions
Revocation is the most difficult characteristic to support
 Security policy must deal with policy changes interleaved
with execution of controlled operations
 Interleaving must be atomic so any controlled operation has
a consistent policy
 Atomicity is difficult to achieve because access permissions
tend to migrate throughout the system
 Example: Unix write permissions on a file are only checked when the
file is opened. The granted permission is cached in the file descriptor.
Changing permissions only affects future open operations.
 Migrated permissions are common in capabilities, access rights in
page tables, open IPC connections, and other operations in progress
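The Unix example above can be reproduced in a few lines. The following is a minimal sketch, assuming a POSIX system; the function name `write_after_chmod` is made up for illustration. It opens a temporary file, removes write permission, and then writes through the descriptor that was opened earlier.

```python
import os
import stat
import tempfile


def write_after_chmod() -> bytes:
    """Open a file read/write, revoke write permission, then write anyway."""
    fd, path = tempfile.mkstemp()            # fd is open for reading and writing
    try:
        os.chmod(path, stat.S_IRUSR)         # policy change: owner may only read
        os.write(fd, b"still writable")      # succeeds: permission was checked at open()
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 100)
    finally:
        os.close(fd)
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # restore so cleanup works everywhere
        os.remove(path)


print(write_after_chmod())
```

The write goes through because the permission migrated into the open descriptor. Only a later open() would see the new mode bits.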
Must make sure entire system knows if a permission
is revoked when policy changes
 Complicated and potentially expensive
 Must identify relevant in-progress operations
Three ways to handle revocation for an in-progress operation:
 Abort and return error
 Restart operation and check permission
 Wait for operation to complete
Waiting is not safe because it does not enforce
policy and can take an unbounded amount of time
Insufficiency of Popular Mechanisms
We will take a look at:
 Capability-Based Systems
 Intercepting Requests
Capability-Based Systems
Capabilities are transferable tokens that reference an
object and access rights
 A capability is an unforgeable data structure that maps access
rights to objects
 can be passed around
 stored in kernel memory so user can only modify it using an
 like file descriptors
Example OS implementations are Hydra, KeyKOS,
EROS, SCAP, ICAP, and Trusted Mach
Capability mechanisms are poorly suited to
providing policy flexibility because they allow the
holder of the capability to control the propagation of
that capability
 Security policy MUST control propagation of access
rights to properly implement rules of security policy
 Cannot trust the capability holders to implement policy
Hydra and KeyKOS had enhancements to limit
propagation, but they were specific to certain
policies and very complex
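The propagation problem can be illustrated with a toy model. This is a sketch only: the classes `Capability` and `Subject` and the object name `secret.txt` are invented for illustration and do not model Hydra, KeyKOS, or any real system. The point is that `grant` is decided by the holder alone and never consults a policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    obj: str               # object the capability refers to
    rights: frozenset      # access rights it conveys


class Subject:
    def __init__(self, name):
        self.name = name
        self.caps = []

    def grant(self, other, cap):
        # The holder decides to propagate; no security policy is asked.
        if cap in self.caps:
            other.caps.append(cap)

    def can(self, right, obj):
        return any(c.obj == obj and right in c.rights for c in self.caps)


alice, bob = Subject("alice"), Subject("bob")
cap = Capability("secret.txt", frozenset({"read"}))
alice.caps.append(cap)
alice.grant(bob, cap)
print(bob.can("read", "secret.txt"))
```

If the policy says that bob must never read the object, nothing in this mechanism can enforce it once alice holds the capability.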
Intercepting Requests
A common approach to add security is to intercept
service requests with an additional security layer
May be done in capability or non-capability based systems
Examples: Kernel Hypervisors (not VM!), SPIN,
Lava, KeySAFE
Can work at kernel-level or user-level
 Must expose all abstractions and information flows that
the security policy wishes to control
 Requires state to be exposed to avoid redundancies and to
make sure that policy enforcement mechanisms know
what to do
 Can only affect an operation as requests pass through the layer
Related Work
 Security architecture of Flask is based on DTOS, which had
similar goals.
 Had mechanisms that were policy independent, but not rich enough
to support some policies (particularly dynamic policies)
 Used Mach microkernel design to handle revocation of memory
permissions (could not handle other permissions)
 Generalized Framework for Access Control (GFAC)
 Assumes all controlled operations are performed in same atomic
operation in which the policy is consulted
 Difficult to achieve in a practical system and primary obstacle that
Flask had to overcome
 Effectively provided immediate revocation of memory
permissions by invalidating segment descriptors
 Shows that this problem is not new
 Had a capability revocation method, but didn’t work for
migrated permissions
Flask Design and Implementation
The Flask prototype is implemented in a microkernel-based multiserver OS
 Microkernel isn’t essential though
 Only requires a reference monitor
The base system is Fluke
 Originally a capability-based system
 Modified to meet requirements of Flask architecture
In operating system architecture, a reference
monitor is a module that controls all software access
to data objects or devices. It is tamperproof, always
invoked, and small enough to be fully tested and
analyzed (verifiable). The reference monitor checks
the nature of each request against a table of
allowable access types for each process on the
system.
 Object manager – components that enforce security policy
 Security server – components that make security policy decisions
Primary Goal:
Ensure that subsystems always have a consistent
view of policy decisions regardless of how they are
made and how they change over time
Secondary goals:
application transparency, defense-in-depth, ease of
assurance, and minimal performance overhead
Flask provides three primary elements for object managers:
 Interfaces for accessing security server decisions
 Access – permission between two entities
 Labeling – specify security attributes of an object
 Polyinstantiation – which member of a set of resources should be
accessed for a particular request
 Access vector cache (AVC) to cache decisions and
minimize performance overhead
 Registration service to receive notifications when policy changes
Object managers must define:
 a mechanism to assign labels to their objects
 a control policy, which specifies how security decisions
are actually used and enforced
 handling routines that are called when policy changes
Object Labeling
All objects controlled by the security policy are
labeled with a set of security attributes, referred to
as the security context
Flask provides two data types for labeling objects
 Security contexts – variable length strings that can be
interpreted by any application or user that understands the
security policy, can contain whatever is needed by the
security policy and is therefore flexible
 Security identifiers (SIDs) – fixed size values used as references to security
contexts, created for efficiency reasons (cheaper to pass
around), security server maintains SID mappings
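The relationship between the two label types can be sketched as a two-way table kept by the security server. This is an illustration under assumptions: the class `SidTable`, its method names, and the context string format are hypothetical and are not taken from the Flask prototype.

```python
class SidTable:
    """Maps variable-length security contexts to small integer SIDs and back."""

    def __init__(self):
        self._by_context = {}
        self._by_sid = {}
        self._next_sid = 1

    def context_to_sid(self, context: str) -> int:
        sid = self._by_context.get(context)
        if sid is None:                      # first time this context is seen
            sid = self._next_sid
            self._next_sid += 1
            self._by_context[context] = sid
            self._by_sid[sid] = context
        return sid

    def sid_to_context(self, sid: int) -> str:
        return self._by_sid[sid]


table = SidTable()
sid = table.context_to_sid("alice:user_role:user_type")   # made-up context format
print(sid, table.sid_to_context(sid))
```

Object managers only store and compare the integer. Only the security server needs to interpret the string.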
General Support Mechanisms
Client and Server Identification
 IPC calls require the client and server to be identified so
the roles are known for a security decision
Caching security decisions
 Use the AVC to save security decisions because querying the
security server is expensive due to IPC and security computation
 Coherence is provided by policy change handler routines
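A minimal sketch of the caching idea follows. The names `PolicyServer`, `AccessVectorCache`, `compute_av`, and `has_perm` are illustrative assumptions, not the prototype's interface. Decisions are keyed by a (source SID, target SID, object class) triple, and the policy change handler simply flushes the cache.

```python
class PolicyServer:
    """Stand-in for the security server; counts how often it is queried."""

    def __init__(self, rules):
        self.rules = rules        # {(ssid, tsid, tclass): {"read", ...}}
        self.queries = 0

    def compute_av(self, ssid, tsid, tclass):
        self.queries += 1
        return frozenset(self.rules.get((ssid, tsid, tclass), ()))


class AccessVectorCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def has_perm(self, ssid, tsid, tclass, perm):
        key = (ssid, tsid, tclass)
        if key not in self.cache:             # miss: ask the security server once
            self.cache[key] = self.server.compute_av(*key)
        return perm in self.cache[key]

    def on_policy_change(self):
        self.cache.clear()                    # handler routine restores coherence


server = PolicyServer({(1, 2, "file"): {"read"}})
avc = AccessVectorCache(server)
avc.has_perm(1, 2, "file", "read")
avc.has_perm(1, 2, "file", "write")
print(server.queries)
```

Both checks are answered from one server query because the whole access vector for the pair is cached, not a single permission.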
Polyinstantiation Support
 Security server identifies which instantiation can be
accessed by a client
Requesting and caching security decisions in Flask
Polyinstantiation in Flask
Microkernel-Specific Features
Binds an SID to each memory segment, which is the
same SID of whatever object is stored in that
memory, and allows Flask to leverage Fluke’s
protection model
Associates a Flask permission with each memory
access mode based on the SID of the address space
and the memory segment
 Used to verify that accesses to mapped memory are
allowed by security policy
Revocation Support
 After policy change, the object manager’s behavior must
reflect the change
 Policy changes must complete in a timely manner
Three step protocol
 Security server notifies all object managers that may
previously have been exposed to revoked permissions
 Object manager updates its internal state
 Object manager notifies the security server that the
update is complete
Sequence numbers are used to synchronize policy
changes and policy decisions
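The three steps and the role of sequence numbers can be sketched as follows. This is an illustration under assumptions: the class and method names are hypothetical, and the rule "drop a decision whose sequence number is older than the latest policy change" is one plausible reading of how sequence numbers order decisions against changes, not a description of the prototype's code.

```python
class ObjectManager:
    def __init__(self):
        self.cached = set()       # permissions this manager currently relies on
        self.policy_seqno = 0     # newest policy change it has processed

    def cache_decision(self, seqno, perm):
        # A decision computed under an older policy must not be cached.
        if seqno < self.policy_seqno:
            return False
        self.cached.add(perm)
        return True

    def on_revoke(self, seqno, perm):
        self.policy_seqno = max(self.policy_seqno, seqno)
        self.cached.discard(perm)             # step 2: update internal state
        return seqno                          # step 3: report completion


class RevokingServer:
    def __init__(self, managers):
        self.managers = managers
        self.seqno = 0

    def decide(self, perm):
        return self.seqno, perm               # decisions carry the current seqno

    def revoke(self, perm):
        self.seqno += 1
        # Step 1: notify every object manager that may hold the permission.
        acks = [m.on_revoke(self.seqno, perm) for m in self.managers]
        return all(ack == self.seqno for ack in acks)


om = ObjectManager()
ss = RevokingServer([om])
stale = ss.decide("file:write")               # issued before the change
ss.revoke("file:write")
print(om.cache_decision(*stale))
```

The late-arriving decision is rejected because it was computed before the revocation, so it cannot reintroduce the revoked permission.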
Security Server
 Provide mapping from SIDs to security contexts
 Allocate SIDs for newly created objects
 Manage AVCs of object managers (with handler routines)
 Provide interface for changing policy (if needed)
 Cache computations on server side as well because
computations can get expensive
Distributed systems
 In a homogeneous policy environment, the security
server of each node merely acts as a local cache of the
environment’s policy
 To support heterogeneous policy environments, it is
desirable for each node to have its own security server
with a locally defined policy component, with some
degree of coordination at a higher level.
Flask security server is defined through a
combination of code and a policy database
 Policy database language can express many policies
 Any security policy that can be expressed through the
prototype’s policy database language may be
implemented simply by altering the policy database.
 Some policy changes may require changing the security server's code or
completely replacing the security server.
 But they never require any changes to the object managers.
The policies enforced by the prototype server were:
 Multi-level security
The policy logic for the multi-level security policy is
largely defined through the security server code, aside
from the labels themselves.
 Type enforcement
 Identity-based access control
 Role-based access control
The policy logic for the other subpolicies is primarily
defined through the policy database language.
Flexibility of Flask Prototype
Three sources of potential inflexibility
 Range of operations that system can control
 Limitation of operations that may be invoked by the
security policy (depends on object manager instances)
 Amount of state information available to security policy
for making decisions
 Limited to 2 SIDs per query in the prototype, cannot handle
 Architecture does not limit this, but changing it may be a source
of reduced performance
Overhead for labeling is about 1% compared to
Table 2 presents measurements for IPC operations
of various bit lengths
 All tests are in the AVC
Table 3 presents measurements for decision time
when decision is stored in different locations
Table 2: IPC Time
Naive: same test as Fluke, but with Flask
Client identification: modified to use Flask-specific server-side IPC to
obtain the client SID on every call
Client impersonation: uses client-side IPC to specify an effective SID for
every call
Table 3: Security Decision Time
trivSS – computation is trivial, just communication overhead
realSS – combination of computation and communication
Revocation time is shown in Table 4.
It is the most expensive operation.
Shown with a varying number of connections.
Table 4: Revocation Times
Has overhead of stopping all threads in prototype.
That is the majority of the time. Scales linearly with
number of connections after that.
Although this is expensive, policy changes are
relatively rare.
 GNU Build System (make,
gcc, ld)
 Compilation of about 8000
LOC of .c and .h files
 Also executed on FreeBSD
for comparison
Table 5: Execution Time
Flask-FFS-PM – unmodified Fluke
object managers
Memfs – memory file system (to
reduce page faults)
Hint – predetermined location in
Cache – must find decision in
Table 6: Security Decision Resolution
Invasiveness of Flask Code
Overall, Fluke components increased in size less
than 8% (see Table 7)
Kernel increased by 19%
57% of changes to process manager and 61% of
changes to kernel were “trivial”
Only extended Fluke API with security
functionality, fully backwards compatible with
 Paper provided a usable definition of policy flexibility
 Shows that pure capability based systems and intercepting
request based systems are inadequate for achieving policy flexibility
 Paper described operating system security architecture
capable of supporting a wide range of security policies
 Demonstrated practicality of architecture with prototype
microkernel-based system
 Appendix A has examples for Flask based file server,
network server, and process manager
Other Flask object managers
File Server
Network Server
Process Manager
File Server
The Flask file server provides four types of
controlled (labeled) objects: file systems,
directories, files, and file description objects.
Since file systems, directories and files are
persistent objects, their labels must also be persistent.
Figure 6: Labeling of persistent objects
Table 8: Permission requirements for relabeling a file.
Network Server
Table 9: Layered controls in the network protocol stack.
Process Manager
Current Status
 NSA implemented a Linux Security Module (LSM) called SELinux
It is an implementation of Flask
It is in the mainline kernel
Released in many distros including RHEL, Debian, etc.
Often criticized for being overly complicated to set up and
 TrustedBSD
 Part of this system is a port of SELinux extensions to FreeBSD
 TrustedDarwin is a port of TrustedBSD to the Darwin system
 Some components of TrustedBSD have spilled over into OS X, not
sure if this includes Flask implementation
R. Spencer, S. Smalley, P. Loscocco, M. Hibler, D.
Andersen, and J. Lepreau.
The Flask Security Architecture: System Support
for Diverse Security Policies.
In Proceedings of the Eighth USENIX Security
Symposium, pages 123-139, Aug. 1999.
SELinux & LSM
SELinux motivated the creation of LSM.
Separate kernel from security features in order to
minimize the impact to kernel.
LSM doesn’t provide any security but it adds
security fields to kernel and provides interfaces for
managing these fields for maintaining security
Design and Implementation
Testing and Functionality
Security is a chronic and growing problem
Linux systems do experience a large number of
software vulnerabilities
An important way to mitigate software
vulnerabilities is through effective use of access controls
 Non-DAC
But there has been no real consensus on which is
the one true access control model.
Because of this lack of consensus, there are many
patches to the Linux kernel that provide enhanced
access controls [6, 10, 11, 13, 16, 18, 23, 19, 31] but
none of them is a standard part of the Linux kernel.
The Linux Security Modules (LSM) project seeks to
solve this Tower of Babel quandary by providing a
general purpose framework for security policy modules.
 This allows many different access control models to be
implemented as loadable kernel modules, enabling
multiple threads of security policy engine development to
proceed independently of the main Linux kernel.
A number of existing enhanced access control
implementations have already been adapted to use
the LSM framework,
 POSIX.1e capabilities
 SELinux
 Domain and Type Enforcement (DTE)
The problem: Constrained Design Space
 At the 2001 Linux Kernel Summit, the NSA presented their
work on SELinux, an implementation of a flexible access
control architecture in the Linux kernel.
 Linus Torvalds appeared to accept that a general access
control framework for the Linux kernel is needed.
 However, given the many Linux kernel security projects, and
Linus’ lack of expertise in sophisticated security policy, he
preferred an approach that allowed security models to be
implemented as loadable kernel modules.
 In fact, Linus’ response provided the seeds of the LSM design.
The design of LSM was constrained by the practical
and technical concerns of both the Linux kernel
developers and the various Linux security projects.
Linus Torvalds specified that the security
framework must be:
 truly generic, where using a different security model is
merely a matter of loading a different kernel module;
 conceptually simple, minimally invasive, and efficient;
 able to support the existing POSIX.1e capabilities logic
as an optional security module.
The “LSM problem”
The “LSM problem” is to unify the functional needs
of as many security projects as possible, while
minimizing the impact on the Linux kernel.
LSM takes the approach of mediating access to the
kernel’s internal objects: tasks, inodes, open files,
etc., as shown in Figure 1.
Figure 1: LSM Hook Architecture
Why did LSM choose this approach instead of the alternatives?
 System call interposition: mediating system calls as they
enter the kernel
 Device mediation: mediating access to physical devices
 Reason: information critical to sound security policy
decisions is not available at those points
 At the system call interface, userspace data, such as a path name, has
yet to be translated to the kernel object it represents, such as an
inode. Thus, system call interposition is both inefficient and prone to
time-of-check-to-time-of-use (TOCTTOU) races
 At the device interface, some other critical information (such as the
path name of the file to be accessed) has been thrown away.
 In between is where the full context of an access request can be seen,
and where a fully informed access control decision can be made.
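The TOCTTOU hazard of checking by path name can be shown deterministically from user space. This is a sketch assuming a POSIX system with symbolic links; the function names and file names are made up for illustration, and the `between` callback stands in for a second process that rebinds the name in the gap.

```python
import os
import tempfile


def check_then_use(path, between):
    """Authorize by path name, then open by path name.

    Returns (inode that was checked, inode that was opened, data read).
    """
    if not os.access(path, os.R_OK):          # time of check: resolves the name
        raise PermissionError(path)
    checked_ino = os.stat(path).st_ino
    between()                                 # the name is rebound here
    with open(path) as f:                     # time of use: resolves it again
        return checked_ino, os.fstat(f.fileno()).st_ino, f.read()


def demo():
    with tempfile.TemporaryDirectory() as d:
        public = os.path.join(d, "public.txt")
        secret = os.path.join(d, "secret.txt")
        link = os.path.join(d, "link")
        for p, text in ((public, "public"), (secret, "secret")):
            with open(p, "w") as f:
                f.write(text)
        os.symlink(public, link)

        def swap():
            os.remove(link)
            os.symlink(secret, link)          # same name, different object

        return check_then_use(link, swap)


checked, used, data = demo()
print(checked != used, data)
```

The check was made against one inode and the read happened on another. A hook placed where the kernel already holds the inode does not have this gap, which is the argument made above for mediating at internal kernel objects.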
Figure 2: Permissive LSM hook.
Implementation Overview
Task Hooks
Program Loading Hooks
IPC Hooks
Filesystem Hooks
Network Hooks
Other Hooks
Design and Implementation Overview
The LSM kernel patch modifies the kernel in five
primary ways:
 adds opaque security fields to certain kernel data structures
 inserts calls to security hook functions at various points
within the kernel code
 adds a generic security system call
 provides functions to allow kernel modules to register
and unregister themselves as security modules,
 moves most of the capabilities logic into an optional
security module
Opaque Security Fields
The opaque security fields are void* pointers, which
enable security modules to associate security
information with kernel objects.
Table 1: Kernel data structures modified by the LSM kernel patch
and the corresponding abstract objects.
The setting of these security fields and the
management of the associated security data is
handled by the security modules.
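The division of labor around an opaque field can be mimicked in Python. This is an analogy only, under stated assumptions: `Inode` is a toy stand-in for the kernel structure, `LabelModule` is a hypothetical security module, and an attribute holding an arbitrary object plays the role of the `void*` pointer. The "kernel" side never looks inside the field.

```python
class Inode:
    """Toy kernel object with one opaque security field."""

    def __init__(self, ino):
        self.ino = ino
        self.i_security = None    # opaque to the kernel, owned by the module


class LabelModule:
    """Hypothetical module that stores a label in the opaque field."""

    def inode_alloc_security(self, inode):
        inode.i_security = {"label": "unlabeled"}

    def relabel(self, inode, label):
        inode.i_security["label"] = label

    def inode_free_security(self, inode):
        inode.i_security = None


module = LabelModule()
inode = Inode(42)
module.inode_alloc_security(inode)    # called when the object is created
module.relabel(inode, "secret")
print(inode.i_security["label"])
module.inode_free_security(inode)     # called when the object is destroyed
```

Because the field has no fixed type, two different modules can attach completely different data without any change to the object definition.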
Calls to Security Hook
Figure 3 shows the vfs_mkdir kernel function after the
LSM kernel patch has been applied.
Registering Security Modules
 Global variable: security_ops
 Provides the interface definition for the callees (the hook functions)
 Initialization of security_ops
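The registration mechanism and a hook call site can be modeled in a few lines. This is a Python analogy of the C design, not kernel code: the hook table is a global that starts out pointing at permissive default operations, a module replaces it on registration, and a `vfs_mkdir`-like function consults it after its own checks. The module `ReadOnlyTree`, the dictionary used as a directory, and the specific error values are assumptions made for illustration.

```python
import errno


class DummyOps:
    """Default operations: every hook permits the request (returns 0)."""

    def inode_mkdir(self, parent, name, mode):
        return 0


_dummy_ops = DummyOps()
security_ops = _dummy_ops             # global hook table


def register_security(ops):
    global security_ops
    if security_ops is not _dummy_ops:
        return -errno.EBUSY           # a module is already loaded (value assumed)
    security_ops = ops
    return 0


def unregister_security(ops):
    global security_ops
    if security_ops is not ops:
        return -errno.EINVAL
    security_ops = _dummy_ops
    return 0


def vfs_mkdir(parent, name, mode):
    if name in parent:                # the kernel's own checks come first
        return -errno.EEXIST
    error = security_ops.inode_mkdir(parent, name, mode)   # hook call
    if error:
        return error
    parent[name] = {}
    return 0


class ReadOnlyTree(DummyOps):
    """Hypothetical module that refuses every mkdir."""

    def inode_mkdir(self, parent, name, mode):
        return -errno.EACCES


root = {}
print(vfs_mkdir(root, "a", 0o755))
```

With no module registered the hook always returns 0, so the function behaves exactly as it did before the hook was added. Loading a different module changes the policy without touching `vfs_mkdir`.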
Task Hooks
task_struct structure
Task-related members of security_operations (Linux 2.6.26)
Wrapper definitions in security.h for each task hook call site (Linux 2.6.26)
The LSM task hooks have full task life-cycle coverage
 create() task hook
 a task can spawn children?
 alloc_security() task hook
 manage the new task’s security field.
When a task exits
kill() task hook
 the task can signal its parent?
Parent: wait() task hook
 the parent task can receive the child’s signal?
free_security() task hook
 Release the task’s security field.
 For example: setuid(2).
setuid() task hook.
post_setuid() task hook.
 getpgid()
 getscheduler()
Program Loading Hooks
The linux_binprm structure represents a new
program being loaded during an execve(2).
Privileges may need to change when a new program is executed
Hooks are used to verify a task’s ability to load a
new program and update the task’s security field.
 alloc_security(): allocates space for the opaque security field
 set_security(): sets the opaque security field
 may be called multiple times during a single execve(2)
 compute_creds() : set the new security attributes of a task
 Typically, it will calculate the task's new credentials based on
both its old credentials and the security information stored in the
linux_binprm security field.
Once the new program is loaded,
 free_security(): frees the space for the opaque security field
Filesystem Hooks
For file operations, three sets of hooks
 filesystem hooks,
 inode hooks,
 file hooks.
LSM adds a security field to each of the associated
kernel data structures:
 super block,
 inode,
 file.
IPC Hooks
 standard SysV IPC mechanisms:
 shared memory,
 semaphores,
 message queues.
 ipc_security_ops
 shm_security_ops,
 sem_security_ops,
 msg_queue_security_ops,
 msg_msg_security_ops.
Other Hooks
LSM provides two additional sets of hooks:
module hooks and a set of top-level system hooks.
 Module hooks can be used to control the kernel
operations that create, initialize, and delete kernel modules
 System hooks can be used to control system operations,
such as setting the system hostname, accessing I/O ports,
and configuring process accounting.
Performance Impact
The performance cost of the LSM framework is
critical to its acceptance;
it was a major part of the debate at the Linux 2.5
developer’s summit that spawned LSM.
Microbenchmarks & Macrobenchmarks
Compared a stock Linux kernel to one modified
with the LSM patch, but with no modules loaded
 Worst-case overhead:
 6.2% for stat(),
 6.6% for open/close,
 7.2% for file delete.
 Typical overhead:
 often 0%, ranging up to 2%
building the Linux kernel
even better:
 no measurable performance impact.
Security Impact
Does LSM provide real security value?
This can be viewed in two ways.
 First, must not create new security holes and needs to be
thorough and consistent in its coverage.
 a project from IBM
static and dynamic analysis of the LSM framework
 Second, must be general enough to support a variety of
access control models
 SELinux
 DTE Linux
 LSM port of Openwall kernel patch
 POSIX.1e capabilities
 LIDS (Linux Intrusion Detection System)
Requirements: LSM must meet two criteria:
 be relatively painless for people who don’t want it,
 be useful and effective for people who do want it.
LSM meets these criteria.
 The patch is relatively small,
 the performance data shows that the LSM patch imposes
nearly zero overhead.
 The broad suite of security products from around the
world that have been implemented for LSM shows that
the LSM API is useful and effective for developing Linux
security enhancements.
Application of the Flask architecture in Linux LSM
The end.