Title: Software based Remote Attestation: measuring integrity of user applications and kernels
Authors: Raghunathan Srinivasan¹ (corresponding author), Partha Dasgupta¹, Tushar Gohad²
Affiliation:
1. School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
2. MontaVista Software LLC
Address:
Email: raghus@asu.edu
Phone: (1) 480-965-5583
Fax: (1) 480-965-2751
Abstract:
This research describes Remote Attestation, a method for attesting the integrity of a process using a trusted remote entity. Remote attestation has mostly been implemented with hardware support. Our research focuses on implementing these techniques entirely in software, using code injection inside a running process to attest its integrity. A trusted external entity issues a challenge to the client machine, which must respond to it. The result of this challenge provides the external entity with an assurance of whether or not the software executing on the client machine is compromised. This paper also shows methods to determine the integrity of the operating system on which software based remote attestation occurs.
Keywords: Remote Attestation, Integrity Measurement, Root of Trust, Kernel
Integrity, Code Injection.
1. Introduction
Many consumers utilize security sensitive applications on a machine (PC) alongside other, vulnerable software. Malware can patch various software in the system by exploiting these vulnerabilities. A regular commodity OS consists of millions of lines of code (LOC) [1]. Device drivers usually range in size from a few lines of code to around 100 thousand lines of code (KLOC), with an average of 1 bug per device driver [2]. Another empirical study showed that bugs in the kernel may have a lifetime of nearly 1.8 years on average [3], and that there may be as many as 1000 bugs in the 2.4.1 Linux kernel. The cumulative effect of such studies is that it is difficult to prevent errors that can be exploited by malware. Smart malware can render anti-malware detection techniques ineffective by disabling them.
Hardware detection schemes are considered non-modifiable by malware. However, mass scale deployment of hardware techniques remains a challenge, and they also carry the stigma of digital rights management (DRM). Another issue with hardware measurement schemes is that software updates have to be handled such that only legitimate updates get registered with the hardware. If the hardware device offers an API to update measurements, malware can attempt to use that API to place malicious measurements in the hardware. If the hardware device is not updatable from the OS, then it has to be reprogrammed to reflect updated measurements.
Software based attestation schemes offer flexibility and can be changed quickly to reflect legitimate updates. Due to their ease of use and potential for mass scale deployment, software based attestation schemes offer significant advantages over their hardware counterparts. However, every software based attestation scheme is potentially vulnerable to some corner case attack scenario. In extreme threat model cases, and in cases where updates are rare, network administrators can switch to hardware based measurement schemes. For the general consumer, software based schemes offer a lightweight protocol that can detect intrusions prior to serious data losses.
Remote Attestation is a set of methods that allows an external trusted agent to measure the integrity of a system. Software based solutions for Remote Attestation vary in their implementation techniques. Pioneer [4], SWATT [5], Genuinity [6], and TEAS [7] are well known examples. In TEAS, the authors prove mathematically that it is highly difficult for an attacker to determine the response to every integrity challenge, provided the code for the challenge is regenerated for every instance. However, TEAS does not provide an implementation framework.
In Genuinity, a trusted authority sends executable code to the kernel on the untrusted machine, and the kernel loads the attestation code to perform the integrity measurements. Genuinity has been shown to have some weaknesses by two studies [8], [5]. However, the authors of Genuinity have since claimed that these attacks work only on the specific cases mentioned in the two works, and that regenerating the challenge at the server would render the attacks insignificant [9].
This work is quite similar to Genuinity, with certain differences in technique. Like Genuinity, this work focuses on the importance of regenerating the code that performs integrity measurement of an application on the client. We do not utilize Operating System support to load the challenge; the application itself receives the code and executes it. In addition, this paper also addresses what we term a 'redirect' attack, where an attacker may direct the challenge to a different machine.
The attestation mechanisms presented in this work use the system call interface of the client platform. Due to this, the problem of determining the integrity of an application on a client platform is split into two orthogonal problems. The first involves determining the integrity of the user application in question by utilizing system calls and software interrupts. The second is determining the integrity of the system call table, interrupt descriptors, and the Text section of the kernel that runs on the client platform. For the first problem, it is assumed that the system calls produce correct results and that rootkits are absent from the system. We assume that there may be various other user level applications on the client platform that may attempt to tamper with the execution of the challenge. For the second problem, this paper presents a scheme where an external entity can determine the state of the OS Text section, system call table, and interrupt descriptor table on the client machine. It can be noted that the external entities obtaining the integrity measure for the application and for the OS can be different.
The solution in this paper is designed to detect changes made to the code section of a process. This allows the user (Alice) to determine whether one application on the system is clean. The same technique can be extended to every application on the system to determine whether all installed applications are clean. Trent is a trusted entity who has knowledge of the structure of an un-tampered copy of the process (P) to be verified. Trent may be the application vendor, or an entity that offers attestation services for various applications. It should be noted that Trent only needs to know the contents and behavior of the clean program image of P to generate challenges. Trent provides executable code (C) to Alice (the client/end user), which Alice injects into P. C takes overlapping MD5 hashes of sub-regions of P and returns the results to Trent. Trent has to be a trusted agent, as the client downloads program code or performs certain operations based on Trent's instructions. If Trent is not trusted, then Alice cannot run the required code with certainty that it will not compromise Alice's machine (MAlice).
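The overlapping-hash step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a byte buffer stands in for P's code section, and a 64-bit FNV-1a hash substitutes for the MD5 digests the paper uses, purely to keep the sketch self-contained.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for the paper's MD5: a 64-bit FNV-1a hash (an assumption
 * made here for brevity; the real C computes overlapping MD5 digests). */
static uint64_t fnv1a(const uint8_t *buf, size_t len) {
    uint64_t h = 1469598103934665603ULL;        /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 1099511628211ULL;                  /* FNV prime */
    }
    return h;
}

/* Hash overlapping windows of the code section: each window is
 * `window` bytes long and starts `stride` bytes after the previous
 * one (stride < window, so regions overlap).  Returns the number of
 * digests written into out. */
size_t hash_regions(const uint8_t *text, size_t text_len,
                    size_t window, size_t stride,
                    uint64_t *out, size_t out_cap) {
    size_t n = 0;
    for (size_t off = 0; off + window <= text_len && n < out_cap;
         off += stride)
        out[n++] = fnv1a(text + off, window);
    return n;
}
```

Because the windows overlap, a single-byte modification of P perturbs several digests at once, which makes selective forgery of the response harder.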
C is newly generated, randomized code that executes on the user end to determine the integrity of an application on an x86 based platform. This ensures that an attacker cannot determine the results of the integrity measurement without executing C. Trent places programming constructs in C that make C difficult to execute in a sandbox or a controlled environment. A software-only protocol means that there exists opportunity for an attacker (Mallory) to forge results. The solution provided in this paper protects itself from the following attacks.
Replay attack: Mallory may provide Trent forged results by replaying the response to a previous attestation challenge. To prevent this scenario, Trent changes the operations performed in every instance of C. This is done by placing lines in the source code of C that depend on various constants; C is recompiled for every attestation request, and these constants are generated prior to compilation using random numbers. Consequently, the outputs of the measurements change with every change of constants. The code produced by Trent forces Mallory to monitor and adapt the attack to suit each challenge. We utilize the observation that program analysis of obfuscated code is complex enough to prevent attacks [7].
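The effect of the per-instance constants can be sketched as below. `K1`, `K2`, and `ROT` are hypothetical placeholders for values Trent would draw at random before each compilation of C, and the mixing function is illustrative rather than the paper's actual measurement code.

```c
#include <stdint.h>
#include <stddef.h>

/* Per-challenge constants.  In the real system Trent generates fresh
 * random values and recompiles C before every attestation request;
 * the values below are placeholders for one hypothetical instance. */
#define K1  0x9e3779b97f4a7c15ULL
#define K2  0xc2b2ae3d27d4eb4fULL
#define ROT 13

static uint64_t rotl64(uint64_t x, unsigned r) {
    return (x << r) | (x >> (64 - r));
}

/* A measurement whose output changes whenever K1, K2 or ROT change,
 * so a response replayed from an earlier challenge will not match. */
uint64_t measure(const uint8_t *buf, size_t len) {
    uint64_t h = K1;
    for (size_t i = 0; i < len; i++)
        h = rotl64(h ^ buf[i], ROT) * K2;
    return h;
}
```

Since the constants are baked into the compiled challenge, Mallory cannot reuse a cached result: the same input bytes yield a different value under the next instance of C.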
Tampering: Mallory may analyze the operations performed by the challenge in order to return forged values. Trent places dummy instructions, randomizes the locations of variables, and places some self modifying instructions to prevent static analysis of the application. It must be noted that self modifying code is normally not permitted on the Intel x86 architecture, as the code section is protected against writes. However, we use the Linux system call 'mprotect' to change the protections on the code section of the process in which C executes to allow this feature. Furthermore, Trent also maintains a time threshold by which the results are expected to be received; this reduces Mallory's window of opportunity to launch a successful attack.
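The mprotect step can be sketched as follows, assuming a Linux platform. The page rounding and RWX protections mirror what the description above requires; the real C would apply this to the page of P's code section into which it was injected, not to a test mapping.

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdint.h>
#include <unistd.h>

/* Make the page containing addr writable while keeping it executable,
 * so the challenge code can patch its own instructions.  mprotect
 * requires a page-aligned address, so round down first. */
int make_page_writable(void *addr, size_t pagesz) {
    uintptr_t page = (uintptr_t)addr & ~((uintptr_t)pagesz - 1);
    return mprotect((void *)page, pagesz,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}
```

Note that hardened kernels (e.g. with W^X enforcement) may refuse the combined write+execute protection; the paper's scheme assumes a stock Linux configuration where this call succeeds.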
Redirect: Mallory may redirect the challenge from Trent to a clean machine, or execute it in a sandbox, which would provide correct integrity values as the response to Trent. The executable code sent by Trent obtains machine identifiers to determine whether it executed on the correct machine. It also executes certain tests to determine whether it was run inside a sandbox. C communicates with Trent multiple times while executing tests on P, which makes it harder for Mallory to prevent C from executing. These techniques are discussed in detail in section 5.
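The machine-identifier check can be sketched with POSIX `getifaddrs`, assuming a Linux client with IPv4 interfaces. The paper does not specify the exact API, so this is one plausible way C could collect the addresses it reports back to Trent.

```c
#include <ifaddrs.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Collect the IPv4 address of each non-loopback interface.  C would
 * send these to Trent, who matches them against the address that
 * initiated the attestation session.  Returns the count, or -1 on
 * error. */
int collect_ipv4(char out[][INET_ADDRSTRLEN], int cap) {
    struct ifaddrs *ifa_list, *ifa;
    int n = 0;
    if (getifaddrs(&ifa_list) != 0)
        return -1;
    for (ifa = ifa_list; ifa && n < cap; ifa = ifa->ifa_next) {
        if (!ifa->ifa_addr || ifa->ifa_addr->sa_family != AF_INET)
            continue;
        if (strcmp(ifa->ifa_name, "lo") == 0)
            continue;                       /* skip loopback */
        struct sockaddr_in *sin = (struct sockaddr_in *)ifa->ifa_addr;
        inet_ntop(AF_INET, &sin->sin_addr, out[n++], INET_ADDRSTRLEN);
    }
    freeifaddrs(ifa_list);
    return n;
}
```

Collecting every configured address, rather than one, matches the multi-address extension mentioned in section 3.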
For obtaining the integrity measurement of the OS Text section, the attestation service provider Trent′ provides executable code (Ckernel) to the client OS (OSAlice). OSAlice receives the code into a kernel module and executes it. It is assumed that OSAlice has means such as digital signatures to verify that Ckernel did originate from Trent′. The details of the implementation of this scheme are in section 7.
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 describes the problem statement, threat model, and assumptions made in this solution. Section 4 describes the overall design of the system; section 5 describes the obfuscation techniques used in creating C. Section 6 describes the implementation of the application attestation system, section 7 describes the implementation of kernel runtime measurements, and section 8 concludes the paper.
2. Related Work
Code attestation involves checking whether the program code executing within a process is legitimate or has been tampered with. It has been implemented using hardware, virtual machine, and software based detection schemes. In this section we discuss these schemes, as well as the methods for program analysis and the obfuscation techniques available in the literature.
2.1 Hardware based integrity checking
Some hardware based schemes operate off the TPM chip provided by the Trusted Computing Group [10], [11], [12], while others use a hardware coprocessor that can be placed into a PCI slot of the platform [13], [14]. In schemes using the TPM chip, the kernel or an application executing on the client obtains integrity measurements and provides them to the TPM; the TPM signs the values with its private key and may forward them to an external agent for verification. The coprocessor based schemes read measurements on the machine without any assistance from the OS or the CPU on the platform, and compare the measurements to previously stored values. Hardware based schemes can allow a remote (or external) agent to verify whether the integrity of all the programs on the client machine is intact. However, hardware based schemes have a stigma of DRM attached to them, may be difficult to reprogram, and are not ideally suited for mass deployment. TPM based schemes also have little backward compatibility, in that they do not work on legacy systems which lack a TPM chip.
Integrity Measurement Architecture (IMA) [15] is a software based integrity measurement scheme that utilizes the underlying TPM on the platform. The verification mechanism does not rely on the trustworthiness of the software on the system. IMA maintains a list of hash values of all possible executable content loaded on the system. When an executable, library, or kernel module is loaded, IMA performs an integrity check prior to executing it. IMA measures values while the system is being loaded; however, it does not provide means to determine whether a program already in execution has been tampered with in memory. IMA also relies on being called by the OS when any application is loaded; it relies on kernel functions for reading the file system, and relies on the underlying TPM to maintain an integrity value over the measurement list residing in the kernel. Due to this, each new measurement added to the kernel-held measurement list requires a change to the values stored in a Platform Configuration Register (PCR) of the TPM security chip on the system.
2.2 Virtualization based Integrity checking
Virtualization implemented without hardware support has been used for security applications; this form of virtualization predates the large scale deployment of platforms containing built-in hardware support for virtualization. Terra uses a trusted virtual machine monitor (TVMM) and partitions the hardware platform into multiple virtual machines that are isolated from one another [16]. Hardware dependent isolation and virtualization are used by Terra to isolate the TVMM from the other VMs. Terra implements a scheme where potentially every class of operation is performed on a separate virtual machine (VM) on the client platform. Terra is installed in one of the VMs and is not exposed to external applications like mail, gaming, and so on. The TVMM plays the role of a host OS. The root of trust in Terra is present in the hardware TPM: the TPM takes measurements on the boot loader, which in turn takes measurements on the TVMM, and the TVMM takes measurements on the VMs prior to loading them. Terra thus relies on the underlying TPM for some measurements. Most traditional VMM based schemes are bulky and need significant resources on the platform to appear transparent to the end user; this holds true for Terra, where the authors advocate multiple virtual machines.
2.3 Integrity checking using hardware assisted virtualization
Hardware support for virtualization has recently been deployed in widely used x86 consumer platforms. Intel and AMD have introduced Intel VT-x and AMD-V, processor extensions with which a system administrator can load certain values into the hardware to set up a VMM and execute the operating system in a guest environment. The VMM runs in a mode with higher privileges than the guest OS and can therefore enforce access control between multiple guest operating systems, and also between application programs inside an OS. The system administrator can also set up events in the hardware that cause control to exit from the guest OS to the VMM in a trap-and-emulate model. The VMM can decide, based on local policy, whether to emulate or ignore the instruction.
VIS [17] is a hardware virtualization based scheme that determines the integrity of client programs connecting to a remote server. VIS contains an Integrity Measurement Module (IMM) which reads the cryptographically signed reference measurement (manifest) of a client process. VIS verifies the signature in a scheme similar to X.509 certificate verification and then takes the exact same measurements on the running client process to determine whether it has been tampered with. The OS loader may perform relocation of certain sections of the client program, in which case the IMM reverses these relocations using information provided in the manifest and then obtains the measurement values. VIS requires that the pages of the client programs are pinned in memory (not paged out). VIS restricts network access during the verification phase to prevent any malicious program from bypassing registration, and does not allow client programs unrestricted network access before they have been verified.
2.4 Software based integrity measurement schemes
Genuinity [6] implements a remote attestation system in which the client kernel initializes the attestation for a program. It receives executable code and maps it into the execution environment as directed by the trusted authority. The system maps each page of physical memory into multiple pages of virtual memory, creating a one-to-many relationship between the physical and virtual pages. The trusted external agent sends a pseudorandom sequence of addresses, and the Genuinity system then takes a checksum over the specified memory regions. Genuinity also incorporates various other values, such as the instruction and data TLB miss counts and counters for the number of branches and instructions executed. The executable code performs various checks on the client kernel and returns the results to a verified location in the kernel on the remote machine, which returns the results back to the server. The server verifies that the results are in accordance with the checks performed; if so, the client is verified. This protocol requires OS support on the remote machine for many operations, including loading the attestation code into the correct area in memory and obtaining hardware values such as TLB miss counts. Commodity OSes run many applications, and requiring OS support or a kernel module for each specific application can be considered a major overhead.
In Pioneer [4], the verification code resides on the client machine. The verifier (server) sends a random number (nonce) as a challenge to the client machine. The result returned as the response determines whether the verification code has been tampered with. The verification code then performs attestation on some entity within the machine and transfers control to it, forming a dynamic root of trust in the client machine. Pioneer assumes that the challenge cannot be redirected to another machine on a network; however, in many real world scenarios a malicious program can attempt to redirect challenges to another machine holding a clean copy of the attestation code. In its checksum procedure, Pioneer incorporates the values of the Program Counter and the Data Pointer, both of which hold virtual memory addresses. An adversary can load another copy of the client code in a sandbox-like environment and provide it the challenge; this way the adversary can obtain the results of the computation that the challenge produces and return them to the verifier. Pioneer also assumes that the server knows the exact hardware configuration of the client for performing a timing analysis, which places a restriction on the client not to upgrade or change hardware components.
In TEAS [7], the authors propose a remote attestation scheme in which the verifier generates program code to be executed by the client machine. Random code is incorporated into the attestation code to make analysis difficult for the attacker. The analysis provided by the authors proves that it is very unlikely that an attacker can clearly determine the actions performed by the verification code; however, an implementation is not described in the research.
A Java Virtual Machine (JVM) based root of trust method has also been implemented to attest code [18]. The authors implement programs in Java and modify the JVM to attest the runtime environment. However, the JVM has known vulnerabilities and is itself software operating within the operating system, and hence is not a suitable candidate for checking integrity.
SWATT [5] implements a remote attestation scheme for embedded devices. The attestation code resides on the node to be attested. The code contains a pseudorandom number generator (PRG) which receives a seed from the verifier. The attestation code includes the memory areas that correspond to the random numbers generated by the PRG as part of the measurement to be returned to the verifier. The obtained measurements are passed through a keyed MAC function; the key for each instance of the MAC operation is provided by the verifier. The problem with this scheme is that if an adversary obtains the seed and the key to the MAC function, the integrity measurements can be spoofed, as the attacker has access to the MAC function and the PRG code.
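The SWATT-style traversal described above can be sketched as follows. The LCG and the mixing step are illustrative stand-ins, not SWATT's actual PRG or keyed-MAC construction.

```c
#include <stdint.h>
#include <stddef.h>

/* SWATT-style sketch: a verifier-seeded PRG picks pseudorandom memory
 * addresses, and the bytes read there are folded into a keyed
 * accumulator.  If the attacker learns both seed and key, the same
 * loop lets them forge the result -- the weakness noted above. */
uint32_t swatt_checksum(const uint8_t *mem, size_t len,
                        uint32_t seed, uint32_t key) {
    uint32_t state = seed;
    uint32_t sum = key;
    for (int i = 0; i < 256; i++) {
        state = state * 1664525u + 1013904223u;   /* LCG step */
        size_t addr = state % len;                /* pseudorandom index */
        sum = ((sum + mem[addr]) * 0x01000193u) ^ (sum >> 16);
    }
    return sum;
}
```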
2.5 Attacks against software based attestation schemes
Genuinity has been shown to have weaknesses by two works [8], [5]. In [8] it is described that Genuinity would fail against a range of attacks known as substitution attacks. The paper suggests placing attack code on the same physical page as the checksum code. The attack code leaves the checksum code unmodified and writes itself to the zero-filled locations in the page. If the pseudorandom traversal maps into the page on which the imposter code is present, the attack code redirects the challenge to return byte values from the original code page. The authors of Genuinity countered these findings by stating that the attack scenario does not take into account the time required to extract test cases from the network, analyze them, find appropriate places to hide code, and finally produce code to forge the checksum operations [9]. The attacks were specifically constructed against one instance of the checksum generation and would require complex re-engineering to succeed against all possible test cases. This would require a large code base to perform the attack, and such a large code base would not be easy to hide.
In [5] it is suggested that Genuinity has a mobile code problem: an attacker can exploit vulnerabilities of mobile code, since code is sent over the network to be executed on the client platform. In addition, the paper also states that Genuinity reads 32 bit words when performing a checksum, and hence would be vulnerable to an attack constructed to avoid the lower 32 bits of memory regions. These two claims are countered by the authors of Genuinity [9]. The first is countered by stating that Genuinity incorporates public key signing, which prevents mobile code modifications by an attacker; the second by stating that Genuinity reads 32 bits at a time, not the lower 32 bits of an address.
A generic attack on software checksum based operations has been proposed [19]. This attack is based on installing a kernel patch that redirects data accesses of the integrity measurement code to a different page in memory containing a clean copy of the code. The attack requires installing a rootkit that changes the page table address translation routine in the OS. Although this scheme potentially defeats many software based techniques, the authors themselves note that it is difficult for this attack to work on an x86 based 64 bit machine that does not use segmentation, because the architecture does not provide the ability to use offsets for code and data segments. Moreover, an attack like this requires the installation of a kernel level rootkit that continuously redirects all read accesses to different pages in memory. The attestation scheme presented in this paper for the user application cannot defend itself against this attack; however, the scheme presented in this work to determine the integrity of the kernel is capable of detecting such modifications. In addition, Pioneer [4] suggests a workaround for this class of attacks: multiple virtual address aliases create extra entries in the page table, which will lead to the OS eventually flushing out the spurious pages.
2.6 Program analysis and code obfuscation
Program analysis requires disassembly of code and control flow graph (CFG) generation. The Linux tool 'objdump' is one of the simplest linear sweep disassemblers. It moves through the entire code once, disassembling each instruction as it is encountered. This method suffers from the weakness that it misinterprets data embedded inside instructions, so carefully constructed branch statements induce errors [20]. Linear sweep is also susceptible to the insertion of dummy instructions and self modifying code. Recursive traversal involves decoding the executable code at the target of a branch before analyzing the next executable code in the current location. This technique can be defeated by opaque predicates [21], where one target of a branch contains complex instructions that never execute [22].
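A small example of an opaque predicate, written in C for illustration: the condition always evaluates true, but a recursive-traversal disassembler must still decode the dead branch, which an obfuscator can fill with misleading bytes.

```c
#include <stdint.h>

/* Opaque predicate: x*x + x = x(x+1) is the product of two
 * consecutive integers and is therefore always even (parity is
 * preserved under unsigned overflow).  The "true" branch always
 * executes; the other branch is dead code that a static analyzer
 * must nonetheless follow. */
int opaque_branch(uint32_t x) {
    if (((x * x + x) & 1u) == 0)
        return 1;              /* real code path, always taken */
    return 0;                  /* dead path, never taken */
}
```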
CFG generation involves identifying blocks of code such that each has one entry point and only one branch instruction with target addresses. Once blocks are identified, branch targets are identified to create a CFG. Compiler optimization techniques, such as executing instructions in the delay slot of a branch, cause issues for the CFG and require iterative procedures to generate an accurate graph. The execution time of these algorithms is non-linear (O(n²)) [23].
2.7 Kernel integrity measurement schemes
An attacker can compromise any measurements taken by a user level program by installing a kernel level rootkit. The kernel provides the file system, memory management, and system calls for user applications. The remote attestation scheme as implemented in this work requires kernel support. This section describes prior work on kernel integrity measurement.
Coprocessor schemes installed in the PCI slot of the PC have been used to measure the integrity of the kernel, as mentioned in section 2.1. One scheme [13] computes the integrity of the kernel at installation time and stores this value for future comparisons. The core of the system is a coprocessor (SecCore) that performs integrity measurement of a kernel module during system boot. The kernel interrupt service routine (SecISR) performs integrity checks on a kernel checker and a user application checker. The kernel checker proceeds to attest the entire kernel .TEXT section and modules. For the machine used to build the prototype, the system determined at installation time that the .TEXT section began at virtual address 0xC0100000, corresponding to physical address 0x00100000, and measurements begin at this address.
Another work focuses on developing a framework for classifying rootkits [24]. The authors state that there are three classes of rootkits: those that modify the system call table, those that modify the targets of system calls, and those that redirect references to the system call table to a different location. A kernel level rootkit may perform these actions using the /dev/kmem device file; an example of such a rootkit is the knark rootkit [25]. The rootkit detector keeps a copy of the original System.map file and compares the current system call table's addresses with the original values. A difference between the two tables indicates system call table modification. This system of detecting changes to the system call table detected the presence of the knark rootkit, which modifies 8 system calls. The framework also detects rootkits like SucKIT [26], which overwrite kernel memory to create a fake system call table so that any user access to the system calls is redirected to the new table. The rootkit checker determines whether the current system call table starts at a location different from the original address, in which case a compromise is detected.
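The table-comparison check can be sketched as below. The table layout and addresses are illustrative, with the pristine copy standing in for the addresses recorded from System.map at install time.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t syscall_addr_t;

/* Compare the current system call table against a pristine copy.
 * Returns the index of the first modified entry, or -1 if the tables
 * match (no system-call hooking detected). */
int first_hooked_entry(const syscall_addr_t *current,
                       const syscall_addr_t *pristine, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (current[i] != pristine[i])
            return (int)i;
    return -1;
}
```

A SucKIT-style rootkit that substitutes a whole fake table would instead be caught by comparing the table's base address, as the framework above describes.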
LKIM [27] obtains hashes and contextual measurements to determine the integrity of the platform. In addition to taking hash measurements on the kernel Text section and system call table, LKIM also takes measurements on other descriptors such as inodes, executable file format handlers, Linux security module hooks, and so on. The measurements taken are defined by a set of measurement instructions. The paper states that there is no silver bullet to prevent the Linux OS from forging results, and hence proposes a hypervisor based scheme instead of a native OS scheme. The hypervisor scheme involves changing a Xen domain U to host the LKIM infrastructure; the domain hosting LKIM is given Domain 0 privileges.
3. Threat model and Assumptions
We assume that Mallory, an attacker, has complete control over the software residing on Alice's machine, and that Mallory possesses the power to start a clean copy of Alice's installed program P and execute it in a controlled environment to return results to Trent. Mallory can also attempt to redirect the challenge to another machine running a clean copy of P. We assume that Mallory will not perform transient attacks, such as patching P with malicious code at some time t and then, at time t + ∆, restoring the old instructions and removing all modifications. This is rootkit-like behavior which will not be detected by the application level remote attestation; however, a rootkit like this would be detected by the kernel level remote attestation described in section 7.
We assume that Alice will trust the code provided by Trent and allow it to execute on the machine to be verified, and that Alice has means such as certificates and digital signatures to verify that the verification code (C) has been generated by Trent. We also assume that Alice is not running MAlice behind a NAT and that the machine has only one network interface. The reason for these assumptions is that C takes measurements on MAlice to determine whether it is the same machine that contacted Trent; if MAlice were behind a NAT, Trent would see the request coming from a router while receiving measurements from MAlice. This work focuses on the general client platform where only one network interface is installed and each network interface has only one IP address associated with it. In the case that many addresses are configured on the same network interface, the code can be altered to collect all the IP addresses it reads from the interface and send them to Trent, who can parse the results to find the matching IP address.
For the user application attestation part, this work does not assume a compromised kernel. The verification code C relies on the kernel to handle the system calls executed through interrupts, and to read the file structure containing the open connections on the system. There are many system call routines in the Linux kernel, and monitoring and duplicating the results of each of these may be a difficult task for malware. Reading the port file structure also requires support from the operating system. We assume that the OS provides correct results when the contents of a directory or file are read. Without this assumption, Remote Attestation cannot be performed entirely without kernel support.
For the kernel attestation part, we assume that the kernel is compromised: system call tables may be corrupted, and malware may have changed the interrupt descriptors. Runtime code injection is performed on a kernel module to measure the integrity of the kernel. It is assumed that Alice has means such as digital certificates to determine that the injected code was generated by a trusted server. It is also assumed that the trusted server is the OS vendor or a corporate network administrator who has knowledge of the OS mappings for the client.
4. Overview of operations to be performed on Client end
If Alice could download an entire fresh copy of P every time the program had to be executed, then Remote Attestation would not be required. However, since P is an installed application, Alice will have customized profile options and saved data that would be cumbersome to recreate every time.
Alice uses P to contact Trent for a service, and Trent returns to P a challenge in the form of executable code (C). P must inject C into its virtual memory and execute it at a location specified by Trent. C computes certain measurements and communicates the integrity measurement value M1 directly to Trent. This process is depicted in Fig. 1. Trent has a local copy of P on which the same set of tests is executed to produce a value M0. Trent compares M1 and M0; if the two values are the same, then Alice is informed that P has not been tampered with. This raises the issue of verifiable code execution: Trent wants to be certain that C took its measurements on P residing inside MAlice. To provide this guarantee, C executes some further tests on MAlice and returns their results to Trent. These checks ensure that C was not bounced to another machine, and that it was not executed in a sandbox environment inside a dummy P process within MAlice.
There are many ways in which Mallory may tamper with the execution of C.
Mallory may substitute the value of M1 being sent to Trent so that there is no
evidence of any modification to P. Mallory may also have loaded another,
untampered copy of P inside a sandbox, executed C within it, and provided the
results back to Trent. Mallory may even have redirected the challenge to
another machine on the network, making it compute and send the responses back
to Trent. Without addressing these issues, it is not possible for Trent to
correctly determine whether the measurements accurately reflect the state of P
on MAlice. If Trent can determine that C executed on MAlice, and that C was
not executed in a sandbox, then Trent can produce code whose results are
difficult to guess and the
results can indicate the correct state of P. Achieving these guarantees
requires that C provide Trent with a machine identifier and a process
identifier. Trent can retain a sense of certainty that the results are genuine
by producing code that makes it difficult for Mallory to pre-compute results.
Once these factors are satisfied, Trent can determine whether P on MAlice has
been tampered with. The entire process of Remote Attestation is shown in Fig. 2.
4.1 Determining checksum and MD5 on P
C computes an MD5 hash of P to determine whether the code section has been
tampered with. Downloading MD5 code is an expensive operation as the code size
is fairly large, and the MD5 code cannot be randomized as it may lose its
properties. For these reasons, the MD5 code permanently resides in P. To
prevent Mallory from exploiting this aspect, a two-phase hash protocol is
implemented. Trent places a mathematical checksum inside C which computes the
checksum on the region of P containing the MD5 executable code along with some
other selected regions. Trent computes the results of the checksum locally and
verifies that C is returning the expected value. C proceeds with the rest of
the protocol if Trent responds in the affirmative.
Trent changes the operations of the checksum in every instance so that
Mallory cannot use prior knowledge to predict the results of the mathematical
operations. C does not take the checksums over fixed sized regions; instead
Trent divides the entire area over which checksum is taken into multiple
overlapping sub-regions, the boundaries of the sub-regions are defined inside
C by Trent by moving the data pointer back by a random number that is
generated during compilation of the C source code.
For the prototype implementation, the method used to generate the random
numbers was the 'rand' call. Since 'rand' may not be truly random, we used the
'srand' call and seeded it with the current stack pointer of the program
generating the source code. The stack of all processes is randomized using
Address Space Layout Randomization (ASLR) [28]. It can be noted that this is
not as secure as using a cryptographically secure random number generator; in
real-world applications, Trent can read random numbers from the Linux
'/dev/random' file [29].
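The boundary randomization described above can be sketched as follows. This is an illustrative sketch, not the paper's generated code: the rotate-and-xor step stands in for one operation drawn from Trent's pool, and all names and constants are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of the overlapping sub-region checksum: the region [base, base+len)
 * is walked in sub-regions of roughly 'nominal' bytes, and each new
 * sub-region starts back by a random overlap, mirroring how Trent randomizes
 * boundaries when generating C. The rotate/xor operation is a stand-in for
 * one operation from Trent's pool of checksum operations. */
uint32_t subregion_checksum(const uint8_t *base, size_t len,
                            size_t nominal, unsigned seed)
{
    uint32_t sum = 0;
    size_t start = 0;
    srand(seed);                      /* the paper seeds rand at generation */
    while (start < len) {
        size_t end = start + nominal;
        if (end > len) end = len;
        for (size_t i = start; i < end; i++)
            sum = ((sum << 5) | (sum >> 27)) ^ base[i];
        if (end == len) break;
        /* next sub-region starts back by a random overlap of 1..nominal/2 */
        size_t overlap = 1 + (size_t)rand() % (nominal / 2);
        start = end - overlap;
    }
    return sum;
}
```

Because the overlaps depend on the seed chosen at generation time, two instances of C produce different results over the same memory, which is the property the protocol relies on.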
The individual checksums are then combined and sent to Trent. This is
depicted in Fig. 3. C performs an MD5 hash on overlapping sub-regions of P
defined in a similar fashion as above. A degree of obfuscation is added by
following the procedure in Fig. 4. C initially takes the MD5 hash of the first
sub-region (H1). It then obtains the MD5 hash of the next sub-region (H2), and
concatenates the two values to produce H1H2. An MD5 hash of H1H2 is then taken
to produce H12. H12 is concatenated with H3 to produce H12H3, which is hashed
again to produce H123, and so on. This process is followed for all the
sub-regions and the final value is sent to Trent.
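The chaining procedure of Fig. 4 can be sketched as follows. A 64-bit FNV-1a hash stands in for MD5 here (MD5 is not in the C standard library; in the prototype the MD5 code resides in P), and the function names are illustrative.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Stand-in hash for MD5 in this sketch: 64-bit FNV-1a. */
static uint64_t fnv1a(const uint8_t *p, size_t n)
{
    uint64_t h = 1469598103934665603ULL;
    while (n--) { h ^= *p++; h *= 1099511628211ULL; }
    return h;
}

/* Chained hashing over k sub-regions: H1 is hashed, each subsequent
 * sub-region hash is concatenated onto the accumulator, and the
 * concatenation is hashed again, as in Fig. 4. */
uint64_t chained_hash(const uint8_t *const *regs, const size_t *lens, size_t k)
{
    uint64_t acc = fnv1a(regs[0], lens[0]);          /* H1 */
    for (size_t i = 1; i < k; i++) {
        uint64_t hi = fnv1a(regs[i], lens[i]);       /* hash of sub-region i */
        uint8_t cat[16];                             /* acc || hi */
        memcpy(cat, &acc, 8);
        memcpy(cat + 8, &hi, 8);
        acc = fnv1a(cat, 16);                        /* hash of concatenation */
    }
    return acc;                                      /* value sent to Trent */
}
```

Note that because each step folds the previous accumulator into the next hash, changing any one sub-region changes the final value, and the order of the sub-regions matters.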
Drawing inferences from executable code is considered difficult, as discussed
in section 2. Randomizing the boundary overlaps between the sub-regions makes
it difficult to predict the hash values being generated; Mallory has to
execute the code to observe the computation being performed. The checksums are
taken on overlapping sub-regions to make the prediction of results more
difficult for Mallory. This creates multiple levels of indeterminacy for an
attack to take place. Mallory has to not only predict the boundaries of the
sub-regions, but also deal with the overlap among them. Overlapping checksums
also ensure that if, by accident, the sub-regions are defined identically in
two different versions of C, the results of the computation produced by C are
still different. This also ensures that some random sections of P are present
more than once in the checksum, making it more difficult for Mallory to hide
any modifications to such regions.
An MD5 checksum has been used in this prototype even though MD5 is known to
have collisions. However, MD5 can easily be substituted with a different
hashing algorithm in a software-based attestation scheme; the same cannot be
done easily in a TPM or other hardware-based attestation scheme.
4.2 Determining process identifiers
C determines whether it was executed inside a fake process or the correct P
process by obtaining some identifiers. C determines the number of processes
having an open connection to Trent on MAlice. This is obtained by determining
the remote address and remote port combinations on each of the port
descriptors in the system. C communicates to Trent using the descriptor
provided by P and does not create a new connection. This implies that in an
ideal situation there must be only one such descriptor on the entire system,
and the process utilizing it must be the process under which C is executing.
The passing of the socket descriptor from P to C also partially addresses the
issue of redirecting the challenge to another machine. The only way for such a
connection to exist on a machine is if Trent accepted the incoming request;
otherwise the machine will not have a socket descriptor with the ability to
communicate with Trent.
If there is more than one process having such a connection then an error
message is sent to Trent. If there is only one such process, C computes its
own process id and compares the two values. If they match an affirmative
message is sent to Trent. If the values do not match then it reports an error
with an appropriate message to Trent.
4.3 Determining the Identifier for MAlice
C has to provide Trent the guarantee that it was not re-directed to another
machine and that it was not executed in a sandbox environment or pasted on
another clean copy of P within MAlice. The first is achieved by obtaining any
particular unique machine identifier. In this case the IP address of the
machine can serve as the identifier. Trent has received a request from Alice
and has access to the IP address of MAlice. If C returns the IP address of the
machine it is executing on, Trent can determine whether both are the same
machine. It can be argued that IP addresses are dynamic; however, there is
little possibility that a machine will change its IP address in the small time
window between Alice's request and the measurements being taken and provided
to Trent.
C determines the IP address of MAlice using system interrupts, and Mallory
will find it hard to tamper with the results of an interrupt. The interrupt
ensures that the address present on the network interface is correctly
reported to Trent. It can again be noted that Mallory may have changed the
address of the network interface to match that of MAlice, but as these
machines are not behind a NAT, it would be quite difficult for Mallory to
assign the identical address to another machine on an external network and
still communicate with that machine. On receiving the results of the four
tests, Trent knows that P has not been tampered with from the time of
installation to the time the verification request was sent from MAlice.
5. Design of Checksum code produced by Trent
Trent has to prevent Mallory from analyzing the operations performed by C.
Trent places a series of obfuscations inside the generated code along with a
time threshold (T) by which the response from MAlice is expected. If C does
not respond back in a stipulated period of time (allowing for network delays),
Trent will know that something went wrong at MAlice. This covers
denial-of-service attacks as well, in which case Trent informs Alice that C is
not communicating back.
Fig. 5 shows a sample snippet of the mathematical checksum code in C. The
send function used in the checksum snippet is implemented using inline ASM.
It is evident that in order to forge any results, Mallory must determine the
value of checksum2 being returned to Trent. This requires Mallory to identify
all the instructions modifying checksum2 and the locations on the stack that
it uses for computation. To prevent Mallory from analyzing the injected code,
certain obfuscations are placed in C, as discussed below:
5.1 Changing execution flow and locations of variables on stack
To prevent Mallory from utilizing knowledge about a previous instance of C
in the current test, Trent changes the checksum operations performed by
selecting mathematical operations on memory blocks from a pool of possible
operations and also changes the order of the instructions. The results of these
operations are stored temporarily in the stack. Trent changes the pointers on
the stack for all the local variables inside C for every instance. These steps
prevent Mallory from successfully launching an attack similar to those used
for HD-DVD key stealing [30, 31].
5.2 Inserting Dummy Instructions
Program analysis is a non-linear operation, as discussed in section 2. An
increase in the number of instructions that Mallory has to analyze decreases
the time window available to forge the results of these operations. Trent
inserts instructions that never execute, and also inserts operations that are
performed on MAlice but not included in the results sent back to Trent. These
additions make it difficult for Mallory to correctly analyze C within a
reasonable period of time.
5.3 Changing instructions during execution
Mallory may perform static analysis on the executable code C sent by Trent.
A good disassembler can provide significant information on the instructions
being executed, and allow Mallory to determine when system calls are made
and when function calls are made. In addition it may also allow Mallory to
see the area of code which reads memory recursively. If these tools do not
have access to the code to be executed before it actually executes, then
Mallory cannot determine the operations performed by C. Trent removes
some instructions in C while sending the code to MAlice and places code inside
C with data offsets such that during execution, this section in C changes the
modified instructions to the correct values. This way without executing C it is
difficult for Mallory to determine the exact contents of C.
6. Implementation of user application attestation
In this section the implementation of the techniques proposed in this paper
is described. All the coding was done in the C language on Intel x86
architecture machines running the Linux kernel, using the gcc compiler.
6.1 Generation of C by Trent
Trent generates C for every instance of a verification request. If Trent sent
out the same copy of the verification code each time, Mallory could gain
significant knowledge of the individual checks performed by C; by generating
new code for every instance of verification, Trent mitigates this
possibility. Trent also places obfuscations inside the code to prevent static
analysis of the executable. The operations Trent performs to obfuscate the
verification are discussed below.
6.1.1 Changing execution flow and locations of variables on stack
Changing the execution flow and the locations of variables on the stack
serves to hinder program analysis on C. The source code of C was divided into
four blocks which are independent of each other. Trent assigns randomly
generated sequence numbers to the four blocks and places them accordingly
inside the C source code. The checksum block is randomized by creating a pool
of mathematical operations that can be performed on every memory location and
selecting from this pool. The pool is created by substituting one
mathematical operation for another at the exact same location.
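The selection from the pool can be sketched as a small source generator. The pool contents, function name, and emitted source format below are illustrative assumptions, not the paper's actual generator.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative pool of interchangeable operations on the same location. */
static const char *op_pool[] = {
    "sum += block[i];",
    "sum ^= block[i];",
    "sum += (block[i] << 3) | (block[i] >> 5);",
    "sum ^= (sum << 1) ^ block[i];",
};

/* Emits one generated checksum loop per memory block into buf, picking the
 * operation for each block at random, as Trent does when producing a fresh
 * C source for each verification instance. Returns bytes written, -1 on
 * overflow. */
int emit_checksum_block(char *buf, size_t cap, int nblocks, unsigned seed)
{
    size_t off = 0;
    srand(seed);                      /* per-instance randomization */
    for (int b = 0; b < nblocks; b++) {
        const char *op = op_pool[rand() % 4];
        int n = snprintf(buf + off, cap - off,
                         "for (i = lo[%d]; i < hi[%d]; i++) { %s }\n",
                         b, b, op);
        if (n < 0 || (size_t)n >= cap - off)
            return -1;                /* buffer too small */
        off += (size_t)n;
    }
    return (int)off;
}
```

The emitted text is then compiled as part of C, so the operation actually executed differs from instance to instance even though every variant checksums the same locations.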
Once the mathematical operations are selected in the C source code, Trent
changes the sub-regions for the checksum code and the MD5 calling procedure.
This is done by replacing the numbers defining the sub-regions. C has
sub-regions defined in its un-compiled code. To randomize the sub-regions, a
pre-processor is executed on the un-compiled C source such that it changes
the numbers defining the sub-regions. The numbers are generated such that the
sub-regions overlap by a random value.
C allocates space on the local stack to store computational values. Instead
of utilizing fixed locations on the stack, Trent replaces all variables
inside C with pointers to locations on the stack. To allocate space on the
stack, Trent declares a large array of type 'char' of size N, which has
enough space to hold the contents of all the other variables simultaneously.
Trent executes a pre-processor which assigns locations to the pointers. The
pre-processor maintains a counter which starts at 0 and ends at N-1. It
randomly picks a pointer to be assigned a location, assigns it the current
value of the counter, and increments the counter by the size of the
corresponding variable. This continues until all the pointers are assigned a
location on the stack. Trent compiles the C source code to produce the
executable after placing these obfuscations.
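The pre-processor's assignment of stack slots can be sketched as follows, for a hypothetical four-variable layout; the paper's tool performs the equivalent transformation on C's source text rather than at run time.

```c
#include <stdlib.h>
#include <stddef.h>

#define NVARS 4  /* illustrative variable count */

/* Sketch of the slot-assignment pass: variables are visited in a random
 * order and packed at the running counter inside the large char array, so
 * every generated instance of C lays out its "stack" differently. */
void assign_slots(const size_t sizes[NVARS], size_t offsets[NVARS],
                  unsigned seed)
{
    int order[NVARS];
    size_t counter = 0;
    for (int i = 0; i < NVARS; i++) order[i] = i;
    srand(seed);
    for (int i = NVARS - 1; i > 0; i--) {     /* Fisher-Yates shuffle */
        int j = rand() % (i + 1);
        int t = order[i]; order[i] = order[j]; order[j] = t;
    }
    for (int i = 0; i < NVARS; i++) {         /* pack in shuffled order */
        offsets[order[i]] = counter;
        counter += sizes[order[i]];
    }
}
```

Since the counter only ever advances by the size of the variable just placed, the slots never overlap and the whole layout fits in an array of size N equal to the sum of the variable sizes.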
6.1.2 Obfuscating instructions executed
Mallory cannot obtain a control flow graph (CFG) of C or perform program
analysis on its executable code, provided the instructions being executed by
C cannot be determined. Trent changes the instructions inside the executable
code such that they cause analysis tools to produce incorrect results. C
contains a section (Crestore) which changes these modified instructions back
to their original contents when it executes. Crestore contains the offset
from the current location and the value to be placed at that offset. Trent
places the information needed to correct the modified instructions inside
Crestore. Crestore is executed prior to the other instructions inside C and
corrects the values inside the modified instructions.
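The restore pass can be sketched as a table of fixups applied to the shipped code bytes before execution; the structure and names below are illustrative assumptions, not the paper's layout.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of the Crestore idea: C is shipped with some instruction bytes
 * deliberately wrong, plus a table of (offset, correct byte) pairs. Before
 * the real work runs, the restore pass writes the correct values back, so a
 * static disassembly of the shipped bytes is misleading. */
struct fixup {
    size_t offset;   /* offset of the mangled byte within C */
    uint8_t value;   /* the correct byte to restore */
};

void crestore(uint8_t *code, const struct fixup *tab, size_t n)
{
    for (size_t i = 0; i < n; i++)
        code[tab[i].offset] = tab[i].value;  /* undo the mangling */
}
```

In the real scheme the restored bytes are executable instructions, so in practice this pass must run on a writable code page, which the injection step already arranges.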
6.2 Execution of C on Client's Machine
The executable code is received by the client's (Alice's) machine. The
received information contains the length of the code and the location where
it should be placed and executed. Normally it is not possible to introduce
new code into a process at run time; however, Alice's software (P) can use a
Linux library call to place C at the required location and execute the code.
C communicates the results of the verification back to Trent without relying
on P. The details of its execution are discussed below.
6.2.1 Injection of code by P on itself
P makes a connection request to Trent. Trent grants the request, provides the
number of bytes of challenge to be received, and follows it with the
executable code of C. Trent also sends the location inside P where C should
be placed. P receives the code and prepares the area for injection by
executing the library utility mprotect on it. The code section of a process
on the Intel x86 architecture is write-protected; mprotect changes the
protection on the specified area of the code section and allows this area to
be overwritten with new values. Once the injection is complete, P creates a
function pointer which points to the address of the location where the code
was injected and calls the function through the pointer, transferring control
to C.
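The injection step can be sketched as follows. This is a minimal illustration assuming an x86-64 Linux host, with a page-aligned heap buffer standing in for P's code section; the paper's implementation injects into P's existing code section instead.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch of the injection: make a page writable and executable with
 * mprotect, copy the received code bytes in, and transfer control through a
 * function pointer, as P does with the code received from Trent. */
int inject_and_run(const uint8_t *code, size_t len)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    void *area = NULL;
    if (posix_memalign(&area, (size_t)pagesz, (size_t)pagesz) != 0)
        return -1;
    /* open the area up for writing and execution (mprotect needs a
     * page-aligned address, hence posix_memalign above) */
    if (mprotect(area, (size_t)pagesz, PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
        return -1;
    memcpy(area, code, len);                   /* inject the received bytes */
    int (*entry)(void) = (int (*)(void))area;  /* function pointer into it  */
    return entry();                            /* transfer control          */
}
```

On hardened systems a W^X policy may refuse the PROT_WRITE|PROT_EXEC mapping, which is one practical constraint on this style of injection.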
6.2.2 Obtaining measurements on the target machine
C obtains certain identifiers on MAlice that allow Trent to verify whether it
indeed executed on the correct machine and process. These identifiers have to
be located outside the process space of P; therefore C computes the following
values to send to Trent: the IP address of MAlice, the mathematical checksum
of the MD5 code residing inside P, MD5 hash values of overlapping sub-regions
inside P, and the process state that allows C to determine whether it was
executed inside a sandbox.
The first involves identifying the machine on which it is executing. Trent
received an incoming connection from Alice, hence it is possible to keep
track of the IP address of MAlice. Although most IP addresses are dynamic,
there is little probability of an IP address changing in the small time
window between a request being sent and C taking its measurements. C does not
utilize the system call libraries to obtain values; it utilizes interrupts to
execute system calls. This involves loading the stack with the correct
operands for the system call, placing the system call number in the A
register and the operands in the other registers, and executing the interrupt
instruction. The sample code for creating a socket is shown in Fig. 6.
Reading the IP address involves creating a socket on the network interface
and obtaining the address from the socket by means of another system call,
ioctl. The obtained address is an integer which is converted to the standard
A.B.C.D format. After this, the address is sent to Trent using the send
routine inside the socketcall system call. It must be noted that the send is
done using the socket provided by P and not a new socket. This is done so
that Mallory cannot bounce C to another machine: if Mallory did that, Mallory
would have to provide an existing connection to Trent, and as connections to
any machine can exist only with Trent's knowledge, this situation cannot
arise.
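The integer-to-dotted-quad conversion mentioned above can be sketched as follows, assuming the 32-bit value has already been arranged so that its most significant byte is the first octet; the paper's code obtains the raw value from the ioctl result.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Sketch of converting the interface address (a 32-bit integer whose most
 * significant byte is the A octet) into the standard A.B.C.D string that C
 * sends back to Trent. 'out' must hold at least 16 bytes. */
void ip_to_dotted(uint32_t addr, char out[16])
{
    snprintf(out, 16, "%u.%u.%u.%u",
             (addr >> 24) & 0xFF, (addr >> 16) & 0xFF,
             (addr >> 8) & 0xFF, addr & 0xFF);
}
```

Note that the value ioctl returns is in network byte order, so on a little-endian x86 host the bytes must be swapped (or read byte-wise) before applying this shift-based formatting.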
Trent verifies the address of the machine and sends a response to C which then
proceeds to take checksum on some portions of the code and follows up with
an MD5 hash of the entire code section. As discussed in section 4.2 and 6.3,
the sub-regions are defined randomly and such that they overlap. C sends the
checksum and MD5 results to Trent utilizing the system interrupt method for
send as discussed above. C obtains the pid of the process (P0) under which it
is executing using the system interrupt for getpid. It then locates all the
remote connections established to Trent from MAlice. This is done by reading
the contents of the '/proc/net/tcp' file, which has the structure shown in
Fig. 7.
As seen in the figure, there is remote address and port information for every
connection, which allows C to identify any open connection to Trent. Once all
the connections are identified, C utilizes the inode of each socket
descriptor to locate any process utilizing it. This is done by scanning the
'/proc/<pid>/fd' folder for all the running processes on MAlice. In the ideal
situation there should be only one process id (P1) utilizing the identified
inode. If C encounters more than one such process, it sends an error message
back to Trent. Once the process id P1 is obtained, C checks whether the id P0
and the id P1 are the same. If so, C sends an affirmative to Trent. These
measurements allow Trent to be certain that C executed on P residing on
MAlice.
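The first half of this matching, recovering the remote endpoint and socket inode from a /proc/net/tcp entry, can be sketched as follows; the field layout is as documented for Linux's proc filesystem, and the function name is illustrative.

```c
#include <stdio.h>

/* Sketch of parsing one /proc/net/tcp line. The line holds (among other
 * fields, all in hex unless noted): slot, local addr:port, remote addr:port,
 * state, tx:rx queues, timer info, retransmits, uid, timeout, and the socket
 * inode. C matches the returned inode against the links found under
 * /proc/<pid>/fd to find the owning process. Returns 0 on success. */
int parse_tcp_line(const char *line, unsigned *rem_addr,
                   unsigned *rem_port, unsigned long *inode)
{
    int n = sscanf(line,
        "%*d: %*x:%*x %x:%x %*x %*x:%*x %*x:%*x %*x %*d %*d %lu",
        rem_addr, rem_port, inode);
    return (n == 3) ? 0 : -1;
}
```

A caller would iterate over the file's lines, keep the entries whose remote address:port matches Trent, and then readlink each /proc/&lt;pid&gt;/fd entry looking for "socket:[inode]".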
7. Remote kernel attestation
To measure the integrity of the kernel we implement a scheme which is
similar to the user application attestation scheme. Trent′ is a trusted server
who provides code (Ckernel) to MAlice. It is assumed that Alice has means,
such as a digital signature verification scheme, to determine whether Ckernel
was sent by Trent′. Alice receives Ckernel using a user-level application
Puser, verifies that it was sent by Trent′, and places it in the kernel of
the OS executing on MAlice. Ckernel is then executed, and it obtains
integrity measurements (Hkernel)
on the OS Text section, system call table, and the interrupt descriptors table.
Ckernel passes these results to Puser, which returns them to Trent′. If
required, Ckernel can encrypt the integrity measurement results using a
one-time pad or a simple substitution cipher; however, as the test case
generated is different in every instance, this is not a required operation.
Figure 8 depicts
this process. Trent′ also provides a kernel module Pkernel that provides ioctl
calls to Puser. As seen in figure 8a, Puser receives Ckernel from Trent′. In figure
8b, Puser forwards the code to Pkernel. It is assumed that Pkernel has the ability
to verify that the code was sent by Trent′. Pkernel places the received code in
its code section at a location specified by Trent′ and executes it. Ckernel obtains
an arithmetic and MD5 checksum on the specified regions of the kernel on
MAlice and returns the results to Puser as seen in figure 8c. Puser then forwards
the results to Trent′ who determines whether the measurements obtained from
the OS on MAlice match existing computations (figure 8d). Since Trent′ is an
OS vendor or a corporate network administrator, it can be assumed that Trent′
has local access to a pristine copy of the kernel executing on MAlice from
which to obtain the expected integrity measurement values generated by
Ckernel. Although this seems to imply that Trent′ would need unbounded memory
to keep track of every client, most OS installations are identical
off-the-shelf builds. In addition, if Trent′ is the system administrator of a
number of machines on a corporate network, Trent′ would have knowledge of the
OS on every client machine.
7.1 Implementation
The kernel attestation was implemented on an x86-based 32-bit Ubuntu 8.04
machine executing the 2.6.24-28-generic kernel. In Linux an identical copy of
the kernel is mapped into every process in the system. Since we use system
calls and software interrupts for the application attestation part, this
section describes the integrity measurement of the text section (which
contains the code for system calls and other kernel routines), the system
call table, and the interrupt descriptor table.
The /boot/System.map-2.6.24-28-generic file on the client platform was used
to locate the symbols to be used for kernel measurement. The kernel text
section started at virtual address 0xC0100000 and ended at 0xC03219CA, which
corresponded to the symbol '_etext'. The system call table was located at
0xC0326520; the next symbol in the map file was at 0xC0326B3C, a difference
of 1564 bytes. The 'arch/x86/include/asm/unistd_32.h' file for the kernel
build showed the number of system calls to be 337. Since MAlice was a 32-bit
system, the space required for the address mappings would be 1348 bytes. We
took integrity measurements from 0xC0326520 to 0xC0326B3B. The interrupt
descriptor table was located at 0xC0410000 and the next symbol at 0xC0410800,
giving the IDT a size of 2048 bytes. A fully populated IDT has 256 entries of
8 bytes each, i.e. 2 KB, which is consistent with the System.map file on the
client machine.
Trent′ also provides a kernel module (Pkernel) to the client platform, which
is installed as a device driver for a character device. Pkernel offers its
functionality through the ioctl call. Puser receives the code from the
trusted authority and opens the char device. Puser then executes an ioctl
which allows the kernel module to receive the executable code. As in the user
application attestation case, Trent′ does not send the MD5 code for every
attestation instance. Instead, the trusted authority sends driver code which
populates a data array and provides it to the MD5 code that stays resident in
Pkernel. To prevent Mallory from exploiting this, the trusted authority also
provides an arithmetic checksum computation routine which is downloaded for
every attestation instance. This provides a degree of extra unpredictability
to the results generated by the integrity measurement code.
Kernel modules are relocated when they are loaded. This means that Trent′
would not know where the MD5 code got relocated during installation of the
module. In order to execute the MD5 code, Trent′ requests the location of the
MD5 function in the kernel module from the client end. After obtaining the
address, Trent′ generates the executable code Ckernel, which has numerous
calls to the MD5 code. At generation time, the call address may not match the
actual function address at the client end. Once Ckernel is generated, the
call instructions are identified in the code and the correct target address
is patched onto each call instruction. Once this patching is done, Trent′
sends the code to the client end. The call address calculation is done as
follows:
call_target = -((address_injected_driver + call_locations[0] +
                 length_ofcall) - address_mdstring);
code_in_file[jump_locations[0] + 1] = call_target;
Ckernel is loaded into a char array code_in_file. The location where Ckernel
is to be injected is determined by Trent′ by selecting one of a number of
'nop' locations in the module; this address is termed
address_injected_driver in the above code snippet. The call location in the
generated executable code is determined by scanning the code for the presence
of the call instruction. The length of the call instruction is a constant
which depends on the architecture. Finally, the address of mdstring (the
location of the MD5 code) is obtained from the client machine as described
above. The second statement changes the code array by placing the correct
target address. This procedure is repeated for all the call instructions in
the generated code. It must be noted that Ckernel calls only the MD5 code and
no other function. If obfuscation is required, Trent′ can place some junk
function calls that are guarded by an 'if' statement; Trent′ can construct
several such 'if' statements so that they never evaluate to true. It can be
noted that even if
the client does not communicate the address of the MD5 code, Pkernel can be
designed such that the MD5 driver provided by the trusted authority and the
MD5 code reside on the same page. This means that the higher 20 bits of the
address of the MD5 code and of the downloaded code will be the same, and only
the lower 12 bits will differ. This allows Trent′ to determine where Ckernel
will reside on the client machine and to automatically calculate the target
address for the MD5 code. This is possible because the compiler fixes the
lower 12 bits of function addresses while creating a kernel module and allows
the higher 20 bits to be populated during module insertion.
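The patching rule can be sketched as follows, assuming the standard x86 near-call encoding (opcode 0xE8 followed by a 32-bit displacement relative to the end of the 5-byte instruction); variable names follow the snippet in the text where possible.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define CALL_LEN 5  /* opcode byte + rel32 operand */

/* Sketch of patching one call: the rel32 operand an x86 near call carries is
 * target - (call_site + 5), which is the negated form used in the snippet
 * above: -((address_injected_driver + call_location + length_of_call) -
 * address_mdstring). The patched bytes land right after the 0xE8 opcode. */
void patch_call(uint8_t *code_in_file, size_t call_loc,
                uint32_t address_injected_driver, uint32_t address_mdstring)
{
    uint32_t call_site = address_injected_driver + (uint32_t)call_loc;
    int32_t call_target = (int32_t)(address_mdstring - (call_site + CALL_LEN));
    memcpy(&code_in_file[call_loc + 1], &call_target, 4);
}
```

Because the displacement is relative, the same Ckernel bytes cannot simply be shipped unpatched; the operand must be recomputed for whichever 'nop' slot Trent′ selects as the injection point.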
Once the code is injected, Trent′ issues a message to the user application
requesting the kernel integrity measurements. Puser executes another ioctl
which causes Pkernel to execute the injected code. Ckernel reads various
memory locations in the kernel and passes the data to the MD5 code. The MD5
code returns the MD5 checksum value to Ckernel, which in turn returns the
value to the ioctl handler in Pkernel. Pkernel then passes the MD5 and
arithmetic checksum computations back to Puser, which forwards the results to
Trent′.
If required, the disable-interrupts instruction can be issued by Ckernel to
prevent any other process from taking hold of the processor. It must be noted
that in multi-processor systems the disable-interrupts instruction may not
prevent a second processor from tampering with the kernel integrity
measurement values. However, as the test cases differ for every attestation
instance, Mallory may not gain anything by such tampering.
8. Results
The time threshold (T) is an important parameter in this implementation. We
aim to prevent an attacker Mallory from intercepting C and providing fake
results to Trent. If T is too large then Mallory may be able to obtain some
information about the execution of C. The value of T must take into account
network delays. Network delays between cities in IP networks are of the
order of a few milliseconds [32]. Hence measuring the overall time required
for one instance of Remote Attestation and adding a few seconds to the
execution time can suffice for the value of T.
We obtained the source code for the VLC media player interface [33], removed
some sections of the interface code, and left close to 1000 lines of C code
in the program. We measured various stages of the integrity measurement
process on two pairs of machines running Ubuntu 8.04. One pair were legacy
machines with an Intel Pentium 4 processor and 1 GB of RAM; the second pair
were Intel Core 2 Quad machines with 3 GB of RAM. The quantities measured
were the time taken to generate code (including compile time), the time taken
by the server to do a local integrity check on a clean copy of the
application, and the time taken by the client to perform the integrity
measurement and send a response back to the server. To obtain an average
measurement for code generation we executed the program in a loop 1000 times
and measured the time taken using a watch. We also measured the time reported
by the system clock and found a slight variation (on the order of 1 second)
between the time perceived by the human eye using the watch and that reported
by the system clock at the end of the loop. The time taken for compiling the
freshly generated code was measured similarly. These two times are reported
in table 1.
We then executed the integrity measurement code C locally on the server and
sent it to the client for injection and execution. The time taken on the
server is the compute time the code takes to generate the integrity
measurement, as both machines in each pair had the same configuration. These
times are reported in table 2. It must be noted that the client requires a
higher threshold to report results because it has to receive the code from
the network stack, inject the code, execute it, and return the results back
through the network stack to the server; network delays also affect the time
threshold. We can see from the two tables that it takes on the order of a few
hundred milliseconds for the server to generate code, while the integrity
measurement itself is lightweight and returns results on the order of a few
milliseconds. The code generation process can therefore be viewed as the
dominant overhead. However, the server need not generate new code for every
instance of a client connection; it can generate the measurement code
periodically, say every second, and ship the same integrity measurement code
to all clients connecting within that second. This can alleviate the workload
on the server. A value for T can be suitably computed from the tables, taking
into consideration the network hops required, and be set to a value less than
5 seconds.
9. Conclusion and Future work
This paper presented a method for implementing Remote Attestation entirely in
software. We also surveyed a number of other schemes in the literature that
address the problem of program integrity checking. We reduced the window of
opportunity for the attacker Mallory to provide fake results to the trusted
authority Trent by implementing various forms of obfuscation and by providing
new executable code for every run. We implemented this scheme on the Intel
x86 architecture and set a time threshold for the response.
As future work we plan to implement this scheme using virtualization
extensions. We also plan to extend this work to determine whether the client
process continued executing after Remote Attestation was successful.
References
[1] Web link, "In brief and statistics," The H Open Source. Retrieved on
October 4, 2010. http://www.h-online.com/open/features/What-s-new-in-Linux-2-635-1047707.html?page=5
[2] T. Ball, E. Bounimova, B. Cook, V. Levin, J. Lichtenberg, C. McGarvey,
B. Ondrusek, S. K. Rajamani and A. Ustuner, "Thorough static analysis of
device drivers," ACM SIGOPS Operating Systems Review, vol. 40, pp. 73-85,
2006.
[3] A. Chou, J. Yang, B. Chelf, S. Hallem and D. Engler, "An empirical study
of operating systems errors," in Proceedings of the Eighteenth ACM
Symposium on Operating Systems Principles, 2001, pp. 73-88.
[4] A. Seshadri, M. Luk, E. Shi, A. Perrig, L. van Doorn and P. Khosla,
"Pioneer: Verifying code integrity and enforcing untampered code execution
on legacy systems," in ACM SIGOPS Operating Systems Review, 2005, pp. 1-16.
[5] A. Seshadri, A. Perrig, L. van Doorn and P. Khosla, "SWATT: SoftWare-based
ATTestation for embedded devices," in 2004 IEEE Symposium on Security
and Privacy, 2004, pp. 272-282.
[6] R. Kennel and L. H. Jamieson, "Establishing the genuinity of remote
computer systems," in Proceedings of the 12th USENIX Security Symposium,
2003, pp. 295-308.
[7] J. A. Garay and L. Huelsbergen, "Software integrity using timed executable
agents," in Proceedings of the 2006 ACM Symposium on Information,
Computer and Communications Security, 2006, pp. 189-200.
[8] U. Shankar, M. Chew and J. D. Tygar, "Side effects are not sufficient to
authenticate software," in Proceedings of the 13th USENIX Security
Symposium, 2004, pp. 89-102.
[9] R. Kennel and L. H. Jamieson, "An Analysis of proposed attacks against
GENUINITY tests," CERIAS Technical Report, Purdue University, 2004.
[10] F. Stumpf, O. Tafreschi, P. Röder and C. Eckert, "A robust integrity
reporting protocol for remote attestation," in Second Workshop on Advances
in Trusted Computing (WATC’06 Fall), 2006.
[11] R. Sailer, X. Zhang, T. Jaeger and L. van Doorn, "Design and
implementation of a TCG-based integrity measurement architecture," in
SSYM'04: Proceedings of the 13th Conference on USENIX Security
Symposium, 2004, pp. 223-228.
[12] K. Goldman, R. Perez and R. Sailer, "Linking remote attestation to secure
tunnel endpoints," in STC '06: Proceedings of the First ACM Workshop on
Scalable Trusted Computing, 2006, pp. 21-24.
[13] L. Wang and P. Dasgupta, "Coprocessor-based hierarchical trust
management for software integrity and digital identity protection," Journal of
Computer Security, vol. 16, pp. 311-339, 2008.
[14] N. L. Petroni Jr, T. Fraser, J. Molina and W. A. Arbaugh, "Copilot-a
coprocessor-based kernel runtime integrity monitor," in Proceedings of the
13th Conference on USENIX Security Symposium-Volume 13, 2004.
[15] R. Sailer. IBM research - integrity measurement architecture. Retrieved
on November 3, 2010,
http://domino.research.ibm.com/comm/research_people.nsf/pages/sailer.ima.h
tml
[16] T. Garfinkel, B. Pfaff, J. Chow, M. Rosenblum and D. Boneh, "Terra: A
virtual machine-based platform for trusted computing," ACM SIGOPS
Operating Systems Review, vol. 37, pp. 193 - 206, 2003.
[17] R. Sahita, U. Savagaonkar, P. Dewan and D. Durham, "Mitigating the
lying-endpoint problem in virtualized network access frameworks," in 18th
IFIP/IEEE International Conference on Managing Virtualization of Networks
and Services, 2007, pp. 135-146.
[18] V. Haldar, D. D. Chandra and M. M. Franz, "Semantic remote attestation:
A virtual machine directed approach to trusted computing," in USENIX
Virtual Machine Research and Technology Symposium, 2004, pp. 29-41.
[19] G. Wurster, P. C. van Oorschot and A. Somayaji, "A generic attack on
checksumming-based software tamper resistance," in 2005 IEEE Symposium
on Security and Privacy, 2005, pp. 127-138.
[20] B. Schwarz, S. Debray and G. Andrews, "Disassembly of executable
code revisited," in Proceedings of Working Conference on Reverse
Engineering, 2002, pp. 45-54.
[21] C. Collberg, C. Thomborson and D. Low, "Manufacturing cheap stealthy
opaque constructs," in Proceedings of Working Conference on Reverse
Engineering, 1998, pp. 184-196.
[22] C. Linn and S. Debray, "Obfuscation of executable code to improve
resistance to static disassembly," in Proceedings of the 10th ACM Conference
on Computer and Communications Security, 2003, pp. 290-299.
[23] K. D. Cooper, T. J. Harvey and T. Waterman, "Building a control flow
graph from scheduled assembly code."
[24] J. F. Levine, J. B. Grizzard and H. L. Owen, "Detecting and
categorizing kernel-level rootkits to aid future detection," IEEE Security &
Privacy, pp. 24-32, 2006.
[25] Web link, "Information about the knark rootkit," Retrieved on November
9 2010. http://www.ossec.net/rootkits/knark.php
[26] sd and devik, "Linux on-the-fly kernel patching without LKM," Phrack
Magazine, 2001.
[27] P. A. Loscocco, P. W. Wilson, J. A. Pendergrass and C. D. McDonell,
"Linux kernel integrity measurement using contextual inspection," in 2007
ACM Workshop on Scalable Trusted Computing, 2007, pp. 21-29.
[28] Web link, "Address space layout randomization," Retrieved on April 25,
2010. http://pax.grsecurity.net/docs/aslr.txt
[29] Web link, "Linux man pages online - kernel random number generator,"
Retrieved on August 30, 2010. http://linux.die.net/man/4/random
[30] Web link, "Hackers discover HD DVD and Blu-ray processing key - all
HD titles now exposed," Retrieved on November 3, 2009.
http://www.engadget.com/2007/02/13/hackers-discover-hd-dvd-and-blu-rayprocessing-key-all-hd-t/
[31] Web link, "Hi-Def DVD Security is bypassed," Retrieved on November
3, 2009. http://news.bbc.co.uk/2/hi/technology/6301301.stm
[32] Web link, "Global IP Network Latency," Retrieved on January 17, 2010.
http://ipnetwork.bgtmo.ip.att.net/pws/network_delay.html
[33] Web link, "VLC media player source code FTP repository," Retrieved on
February 24, 2010. http://download.videolan.org/pub/videolan/vlc/
Machine      Test generation   Compilation time   Total time
Pentium 4        12.3               320               332
Quad Core         5.2               100               105
Table 1: Average code generation time in milliseconds on server end for Intel
Pentium 4 and Core 2 Quad machines for one instance of the measurement
Machine      Server side execution time   Client side execution time
Pentium 4              0.6                           22
Quad Core              0.4                           16
Table 2: Time taken in milliseconds to compute the measurements on server
and on the remote client
Figure Captions
Figure 1: Challenge response overview
Figure 2: Protocol overview
Figure 3: Hash obtained on overlapping sub-regions. Two instances have
different sub-regions
Figure 4: Procedure for obtaining the MD5 hash of the entire code section
Figure 5: Snippet from the checksum code
Figure 6: ASM code for creating a socket
Figure 7: Contents of /proc/net/tcp file
Figure 8: Kernel remote attestation scheme
  a. User application initiates attestation request
  b. User application sends attestation code to kernel
  c. Kernel returns integrity values to user application
  d. Verification of kernel integrity by trusted server
Figures
[Fig. 1: Challenge response overview. Trent sends a request and the
measurement code C to the process P on Alice's machine; C returns
measurements and Trent produces the results.]
1. Alice -> Trent: Verification Request
2. Trent -> Alice: Inject code at location, execute it
3. C -> Trent: Machine Identifier
4. Trent -> C: Proceed
5. C -> Trent: Initial Checksum
6. Trent -> C: Proceed
7. C -> Trent: MD5 Hash of specified regions
8. Trent -> C: Proceed
9. C -> Trent: Test of correct process ID
10. Trent -> C: Proceed/Halt
Fig. 2
[Fig. 3: Hash obtained on overlapping sub-regions; the two instances use
different sub-region boundaries. Instance one computes Checksums 1-4 over
regions bounded at offsets 0, 50, 80, 150 and 200; instance two at offsets
0, 60, 110, 160 and 200.]
Fig. 3
[Fig. 4: Procedure for obtaining the MD5 hash of the entire code section.
Each region is hashed with MD5 (H1, H2, H3, ...); successive digests are
concatenated and re-hashed, H12 = MD5(H1 + H2), H123 = MD5(H12 + H3), and
so on through Region N to give the final result.]
Fig. 4
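The digest chaining of Fig. 4 can be sketched as below. To keep the example self-contained it substitutes a toy 64-bit FNV-1a hash for MD5; only the fold structure H12 = H(H1 || H2), H123 = H(H12 || H3), ... is the point, and the function names are illustrative:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for MD5 (64-bit FNV-1a); the paper's scheme uses MD5. */
static uint64_t toy_hash(const void *data, size_t len) {
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL;
    while (len--) { h ^= *p++; h *= 1099511628211ULL; }
    return h;
}

/* Fold per-region digests exactly as in Fig. 4:
 * acc = H1, then acc = H(acc || Hi) for each subsequent region. */
uint64_t chained_digest(const unsigned char *regions[],
                        const size_t lens[], size_t n) {
    uint64_t acc = toy_hash(regions[0], lens[0]);        /* H1 */
    for (size_t i = 1; i < n; i++) {
        uint64_t hi = toy_hash(regions[i], lens[i]);     /* Hi */
        uint64_t pair[2] = { acc, hi };                  /* acc || Hi */
        acc = toy_hash(pair, sizeof pair);               /* re-hash */
    }
    return acc;
}
```

Because each step re-hashes the accumulated digest together with the next region's digest, the final value depends on both the contents and the order of every region.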
{
    ……
    x = <random value>;
    a = 0;
    while (a < 400) {
        checksum1 += Mem[a];
        if ((a % 55) == 0)
            { checksum2 += checksum1 / x; }
        a++;
    }
    send checksum2;
    …..
}
Fig. 5
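A runnable rendering of the Fig. 5 loop, assuming the measured region is passed in as a byte array; the function name `region_checksum` is an assumption, and in the real code `Mem` walks the process image while `x` is a fresh random value supplied by Trent for each run:

```c
#include <stdint.h>

/* Sketch of the Fig. 5 checksum: a running byte sum (checksum1) that is
 * periodically folded into a second accumulator (checksum2) which depends
 * on the per-run random value x. */
uint32_t region_checksum(const uint8_t mem[400], uint32_t x) {
    uint32_t checksum1 = 0, checksum2 = 0;
    for (uint32_t a = 0; a < 400; a++) {
        checksum1 += mem[a];
        if ((a % 55) == 0)
            checksum2 += checksum1 / x;
    }
    return checksum2;
}
```

Since x changes on every attestation run, a precomputed reply for one challenge is useless for the next; the attacker would have to recompute over the genuine memory contents within the time threshold.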
__asm__("sub  $12, %%esp\n"      /* reserve stack space for three arguments */
        "movl $2, (%%esp)\n"     /* domain: AF_INET (2) */
        "movl $1, 4(%%esp)\n"    /* type: SOCK_STREAM (1) */
        "movl $0, 8(%%esp)\n"    /* protocol: default (0) */
        "movl $102, %%eax\n"     /* system call 102: socketcall */
        "movl $1, %%ebx\n"       /* socketcall subcode 1: SYS_SOCKET */
        "movl %%esp, %%ecx\n"    /* ecx points to the argument block */
        "int  $0x80\n"           /* trap into the kernel */
        "add  $12, %%esp\n"      /* restore the stack pointer */
        : "=a" (new_socket)      /* socket descriptor returned in eax */
);
Fig. 6
sl local_address  rem_address    st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 0100007F:1F40  00000000:0000  0A 00000000:00000000 00:00000000 00000000 0   0       5456 1 f6eb0980 299 0 0 2 -1
1: 00000000:C3A9  00000000:0000  0A 00000000:00000000 00:00000000 00000000 0   0       4533 1 f6ec0000 299 0 0 2 -1
2: 00000000:006F  00000000:0000  0A 00000000:00000000 00:00000000 00000000 0   0       4473 1 f6f60000 299 0 0 2 -1
3: 0100007F:0277  00000000:0000  0A 00000000:00000000 00:00000000 00000000 0   0       5690 1 f6ec0980 299 0 0 2 -1
4: 0100007F:0019  00000000:0000  0A 00000000:00000000 00:00000000 00000000 0   0       5358 1 f6ec04c0 299 0 0 2 -1
5: 0100007F:743A  00000000:0000  0A 00000000:00000000 00:00000000 00000000 0   0       5411 1 f6eb04c0 299 0 0 2 -1
Fig. 7
[Fig. 8: Kernel remote attestation scheme, showing userland (Puser) above
the operating system (Pkernel).
 a. The user application Puser sends the kernel attestation request to Trent′.
 b. Puser downloads the attestation code Ckernel from Trent′ and sends it to
    the kernel Pkernel.
 c. The kernel returns the integrity measurements Hkernel to Puser.
 d. Trent′ verifies the kernel integrity measurements and replies OK.]
Figure 8