Digital forensics
Thibault Debatty
Contents
1 Preamble
2 Requirements
2.1 SIFT
I Disk Forensics
3 Disk and filesystem
3.1 FAT filesystem
3.2 ext2
3.3 Chapter review
4 Disk imaging
4.1 FTK Imager
4.2 Linux
4.3 Chapter review
5 Mounting an image
5.1 Mount a dd image on Linux and SIFT
5.2 Mount an E01 image on Linux and SIFT
5.3 Windows : FTK Imager
5.4 Chapter review
6 Forensics tools
6.1 The Sleuth Kit
6.2 Chapter review
7 BitLocker drive encryption
7.1 Chapter review
II Windows forensics
8 Windows internals
8.1 Windows registry
8.2 Event Logs
8.3 Security Identifier (SID)
8.4 Chapter review
9 Windows artifacts
9.1 System information
9.2 Account usage
9.3 Application execution
9.4 File and folder opening
9.5 Deleted items
9.6 Physical location and network activity
9.7 Browser activity
9.8 External device/USB usage
10 Forensics tools
10.1 RegRipper
10.2 evtxinfo and evtxexport
10.3 Eric Zimmerman’s tools
10.4 Thumbcache viewer
10.5 Additional exercises
III Memory forensics
11 Principles of computer memory
11.1 Address space of a process
11.2 Virtualization and paging [5]
11.3 Interrupts
11.4 System calls
11.5 Chapter review
12 Windows memory
12.1 Windows architecture
12.2 Chapter review
13 Memory acquisition
14 Memory analysis
15 Linux memory forensics
IV Mobile device forensics
V Network forensics
16 Command line tricks
17 Network protocols
17.1 IPv4
17.2 TCP
17.3 ICMP
17.4 UDP
17.5 ARP
18 tcpdump
18.1 Requirements
18.2 Packet analysis
18.3 Packet capture
18.4 Chapter review
19 Wireshark
VI Cloud forensics
20 Why cloud forensics ?
VII Final words
21 Practical considerations
21.1 Preserving evidence
21.2 Time and time zones
References
Chapter 1
Preamble
Digital forensics [1] (sometimes known as digital forensic science) is a branch
of forensic science encompassing the recovery, investigation, examination and
analysis of material found in digital devices, often in relation to mobile devices
and computer crime.
Digital forensics is an extremely broad subject, for two main reasons. First,
because of the diversity of materials involved:
1. disk from a desktop computer (Windows or Mac), which can contain artifacts
or logs produced by the operating system or by the installed applications;
2. memory of a desktop computer;
3. storage or memory image of a smartphone (iPhone and Android);
4. disk or memory dump from a server;
5. traffic captured on a network;
6. logs from a cloud environment.
Second, because of the skills required from the analyst. Indeed, the process of
digital forensics directly relates to the process of intelligence analysis [2], [3]:
1. correct data collection (from disk or memory for example) requires care;
2. processing the data to extract information from the evidence requires knowing which tool(s) can be used and how;
3. analyzing the information to obtain actual intelligence (being able to explain what actually happened and why) requires understanding how the system under analysis (Windows, a network, a cloud environment) is supposed to work, so that you can interpret the information produced by your tools;
4. all steps require a good deal of critical thinking.
In the following chapters, we will try to cover the different sources of information you might encounter in a forensics investigation: Windows computer, memory dump, smartphone, network trace, cloud environment.
For each data source, we will briefly explain (or review) how it works, and we will
showcase some tools that can be used to extract information.
Chapter 2
Requirements
This book contains a lot of exercises. We strongly encourage you to do them, to practice your investigation skills and to help you understand and memorize the different techniques.
To do these exercises, you will need:
• a system with Windows installed (can be a VM), on which you have administrator rights;
• the SIFT workstation (can also be a VM, see below).
2.1
SIFT
The SIFT workstation [4] is a collection of forensic tools. It was originally created by Rob Lee, from the SANS Institute. The current version of the SIFT workstation is based on Ubuntu 20.04 and has a lot of preinstalled forensics tools like Plaso, The Sleuth Kit, RegRipper, Volatility, etc.
The default credentials of the SIFT workstation are:
• username: sansforensics
• password: forensics
The easiest way to run the SIFT workstation is to download the VM appliance
that you can find on https://www.sans.org/tools/sift-workstation/
Exercise
Download and install the SIFT workstation
Figure 2.1: SIFT workstation running in a VM
Part I
Disk Forensics
Most computer devices store permanent information on a hard drive or Solid State Drive (SSD). Hence one of the main sources of information in a digital forensics investigation is the disk of the different devices involved.
From the OS point of view, a disk is simply a bunch of numbered but unorganized sectors. To store information, the disk is usually divided into 1 or more pieces called partitions. Inside each partition, a filesystem is used to define how and where the files and folders are recorded.
In this first Part of the book, we will see:
1. how information is stored on disks (Chapter 3);
2. how to perform a correct image of a disk (Chapter 4);
3. how to mount this image in the SIFT Workstation (Chapter 5);
4. how to extract files (possibly deleted) from the image.
Chapter 3
Disk and filesystem
From the OS point of view, a disk is simply a sea of sectors, numbered starting from 0. The size of a sector is traditionally 512 bytes, or 4096 bytes for newer drives (known as Advanced Format, AF) [5], [6].
To store data efficiently, the OS must organize how files and folders are stored on the disk. Two concepts make this possible: partitions split the disk into smaller pieces (sometimes also called volumes), and filesystems define how files and folders are stored inside each volume.
3.1
FAT filesystem
The File Allocation Table (FAT) was one of the first filesystems. It is almost the
only filesystem that all operating systems can read and write. Hence despite its
age and limitations, it is still used today, mainly for USB drives.
There have been different versions of FAT. In this Section we will discuss FAT32,
the most commonly used version today.
To organize data, FAT32 divides the partition in 2 areas:
• the system area stores meta-information about the files and directories, and
about the filesystem itself;
• the data area stores the actual content of files (and directories).
The system area itself is composed of
• the boot record, that contains the jump instruction used for the system to
boot, and information about the filesystem;
• 2 copies of the file allocation table.
Figure 3.1 represents the main fields of the boot record:
• JUMP (located at 0x00, 3 Bytes) is the jump instruction for the system boot;
• OS ID (located at 0x03, 8 Bytes) shows which OS was used to format the
device;
• Sectors/FAT (0x24, 4 Bytes) indicates the number of sectors of this FAT;
• SN (0x43, 4 Bytes) is the volume serial number (not the SN of the drive);
• LABEL (0x47) is the volume label, displayed by the OS;
• FS type (0x52) is the filesystem type
Figure 3.1: FAT32 boot record
On a Linux system or SIFT workstation, you can read the boot record of an image
with
hexdump -n 96 -C <image>
Note that FAT32 uses little endian byte order, so the first byte of a multi-byte value is the least significant one.
In the example below:
• the jump instruction is EB 58 90;
• the OS ID is 6d 6b 66 73 2e 66 61 74 or mkfs.fat, which shows the
disk was formatted on a Linux system;
• the FAT occupies e8 0e 00 00 or 232 + 14 x 256 = 3816 sectors;
• the volume label is 32 47 20 20 20 20 20 20 20 20 20 which translates to 2G;
• likewise, the FS type is FAT32.
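The field offsets listed above can be decoded programmatically. The following Python sketch is only an illustration: the function name is hypothetical, and the 96-byte record is synthetic, built from the example values rather than read from a real image.

```python
import struct


def parse_fat32_boot_record(boot):
    """Decode the FAT32 boot record fields at the offsets listed in the text."""
    if len(boot) < 0x5A:
        raise ValueError("need at least the first 90 bytes of the partition")
    return {
        "jump": boot[0x00:0x03].hex(" "),
        "os_id": boot[0x03:0x0B].decode("ascii", "replace").rstrip(),
        # "<I" reads a little endian unsigned 32-bit integer
        "sectors_per_fat": struct.unpack_from("<I", boot, 0x24)[0],
        "serial": struct.unpack_from("<I", boot, 0x43)[0],
        "label": boot[0x47:0x52].decode("ascii", "replace").rstrip(),
        "fs_type": boot[0x52:0x5A].decode("ascii", "replace").rstrip(),
    }


# Synthetic record filled with the example values (assumption: not a real image).
record = bytearray(96)
record[0x00:0x03] = bytes.fromhex("eb5890")
record[0x03:0x0B] = b"mkfs.fat"
record[0x24:0x28] = bytes.fromhex("e80e0000")
record[0x47:0x52] = b"2G".ljust(11)
record[0x52:0x5A] = b"FAT32".ljust(8)

fields = parse_fat32_boot_record(bytes(record))
print(fields["os_id"], fields["sectors_per_fat"], fields["label"], fields["fs_type"])
```

To try it on a real image, the first 96 bytes of the partition can be read with open(path, "rb").read(96) and passed to the same function.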
After the boot record comes the file allocation table. This table has one entry for
each cluster in the data area. With FAT32, each entry is 32 bits (4 Bytes).
Only the lowest 28 bits are actually used for addressing; the highest 4 bits are reserved for future use. Because FAT32 uses little endian byte ordering, these reserved bits sit in the last byte of each entry, which is why the values below show a ? in place of the second-to-last hexadecimal character.
The FAT tracks which files occupy which clusters, and which clusters are free:
• if the cluster is free, the corresponding value in the FAT is 0x00 00 00 ?0;
• if the cluster is used by a file, the value in the FAT is the address of the next cluster of this file;
• if the cluster is the last cluster of a file, the value in the FAT is 0xf8 ff ff ?f.
Figure 3.2 shows an example of a FAT. It contains a single file that spans clusters 0x00 00 00 00, 0x00 00 00 01 and 0x00 00 00 03. Clusters 0x00 00 00 02 and 0x00 00 00 04 are free.
Figure 3.2: Example of a file allocation table. The green column indicates the
cluster numbers. It is actually not stored on the disk.
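Following a file through the table can be sketched in a few lines of Python. This is a simplified illustration, not a full FAT32 parser: the function names are hypothetical, and the five-entry table is a hand-built copy of the example (one file on clusters 0, 1 and 3, with clusters 2 and 4 free).

```python
import struct

ADDRESS_MASK = 0x0FFFFFFF  # only the lowest 28 bits address a cluster
FREE = 0x00000000          # value of a free cluster
END_OF_CHAIN = 0x0FFFFFF8  # this value or higher marks the last cluster of a file


def read_fat(raw):
    """Decode raw FAT bytes into a list of 28-bit entries (little endian)."""
    count = len(raw) // 4
    return [value & ADDRESS_MASK for value in struct.unpack_from(f"<{count}I", raw)]


def cluster_chain(fat, first_cluster):
    """Follow the 'next cluster' values until the end-of-chain marker."""
    chain, current = [], first_cluster
    while True:
        if current >= len(fat) or current in chain:
            raise ValueError("corrupt chain")
        chain.append(current)
        if fat[current] >= END_OF_CHAIN:
            return chain
        current = fat[current]


# Hand-built table matching the example: 0 -> 1 -> 3 (end), 2 and 4 free.
raw_fat = bytes.fromhex("01000000 03000000 00000000 f8ffff0f 00000000")
fat = read_fat(raw_fat)
file_clusters = cluster_chain(fat, 0)
free_clusters = [index for index, value in enumerate(fat) if value == FREE]
print(file_clusters, free_clusters)
```

The check on already visited clusters is a small safeguard: a damaged table can contain loops, and without it the walk would never end.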
When a file is deleted from the partition, the OS will:
• remove the corresponding entry from the directory;
• mark the corresponding clusters as free (by writing the 0x0000 0000 value
in the fat).
The actual data (located in the data area of the partition) will be left untouched, which means that data is still present on the drive and can be recovered if it has not been overwritten (see Chapter 6).
Size limitations
Because the sector count field of the boot record is 32 bits wide, the maximum size of a FAT32 partition is 2 TB for drives that use sectors of 512 Bytes, and 16 TB for drives that have sectors of 4096 Bytes.
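The two limits follow from simple arithmetic. The short Python check below assumes that the limiting factor is a 32-bit count of sectors, and expresses the result in binary units (1 TiB = 2^40 bytes); the names are illustrative.

```python
SECTOR_COUNT_BITS = 32  # assumption: sectors are counted with a 32-bit field
TIB = 2 ** 40


def max_volume_bytes(sector_size):
    """Largest volume addressable with a 32-bit sector count."""
    return (2 ** SECTOR_COUNT_BITS) * sector_size


for sector_size in (512, 4096):
    print(sector_size, max_volume_bytes(sector_size) // TIB, "TiB")
# 512 -> 2 TiB, 4096 -> 16 TiB
```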
3.2
ext2
ext2 [7], [8] is the ancestor of ext4, which is currently the default filesystem of many Linux distributions. Ext4 performs better and has additional features (like journaling) compared to ext2, but the working principles of both filesystems are the same.
In ext2, each file is described by an inode. The inode contains, among other things:
• the length of the file in bytes;
• the device ID (this identifies the device containing the file);
• the ownership of the file (UID and GID);
• the permissions associated with the file (read, write, execute);
• various timestamps (access, modification, change, deletion);
• pointers (direct or indirect) to the blocks that store the file’s contents.
3.2.1
Partition structure
When a partition is formatted with ext2, some space is reserved at the beginning
of the partition to store inodes, as illustrated on the Figure below.
3.2.2
Directory
In ext2, a directory is a special file. Its data blocks (so the content of the ‘file’) hold a list of directory entries. Each directory entry associates one file name with one inode number.
To find a file, the directory is searched front-to-back for the associated filename. For reasonable directory sizes, this is fine. But for very large directories this is inefficient. Hence one of the optimizations of ext3 is a second way of storing directories (HTree) that is more efficient than just a list of filenames.
Figure 3.3: Ext2 inode pointers
Figure 3.4: Ext2 disk organization
Because directories are actually a special kind of file, each directory is also
represented by an inode. The root directory is always stored in inode number
two, so that the OS can find it when the partition is mounted. Subdirectories are
implemented by storing the name of the subdirectory in the name field, and the
inode number of the subdirectory in the inode field.
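The lookup described above can be modelled in a few lines of Python. This is a toy in-memory model, not the on-disk ext2 layout: the inode numbers other than 2, the file names and the dictionary structure are all invented for the illustration.

```python
ROOT_INODE = 2  # the root directory is always stored in inode number two

# Hypothetical model: inode number -> directory entries, or a plain file.
inodes = {
    2: {"type": "dir", "entries": [("home", 11), ("etc", 12)]},
    11: {"type": "dir", "entries": [("notes.txt", 13)]},
    12: {"type": "dir", "entries": []},
    13: {"type": "file", "size": 42},
}


def lookup(directory_inode, name):
    """Scan the list of directory entries front to back for the given name."""
    directory = inodes[directory_inode]
    if directory["type"] != "dir":
        raise NotADirectoryError(name)
    for entry_name, inode_number in directory["entries"]:
        if entry_name == name:
            return inode_number
    raise FileNotFoundError(name)


def resolve(path):
    """Walk a path one component at a time, starting from the root inode."""
    current = ROOT_INODE
    for part in path.strip("/").split("/"):
        if part:
            current = lookup(current, part)
    return current


print(resolve("/home/notes.txt"))
```

Each path component costs one linear scan of a directory, which is why very large directories become slow with a plain list of entries.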
3.3
Chapter review
At the end of this Chapter, you should be able to answer the following questions:
• Explain how FAT32 works, and what happens when you delete a file
• Explain the role and working of ext2
Chapter 4
Disk imaging
When performing a forensics investigation, we must absolutely preserve the original evidence. For disks, this means we must avoid modifying their content. Hence, we must first create an image of the disk, and analyze that image. Depending on the situation, there are 3 ways to create a disk image:
1. using a hardware write blocker;
2. using a bootable USB stick;
3. directly on the system under investigation.
4.1
FTK Imager
FTK Imager [9] is a data preview and imaging tool used to acquire digital evidence
in a forensically sound manner by creating copies of data without changing the
original in any way. The latest version supports the AFF4 format and execution
on portable drives.
It was originally developed by AccessData, and is now maintained by Exterro. It can run on a
Windows computer, or directly from a USB drive.
With FTK Imager you can:
• Create forensic images of entire local hard drives, CDs and DVDs, thumb
drives and other USB devices or just the files and folders you need.
• Preview the contents of forensic images stored on local machines or network
drives.
• Create hashes of files to verify data integrity using either Message Digest 5
(MD5) or Secure Hash Algorithm (SHA-1).
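The same kind of integrity check can be reproduced outside FTK Imager. The sketch below uses Python's standard hashlib module to compute both digests in a single pass over a file; the function name and the block size are arbitrary choices made for the illustration.

```python
import hashlib


def hash_image(path, block_size=1024 * 1024):
    """Return the (MD5, SHA-1) hex digests of a file, read block by block."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as image:
        # Read in blocks so that large images never have to fit in memory.
        for block in iter(lambda: image.read(block_size), b""):
            md5.update(block)
            sha1.update(block)
    return md5.hexdigest(), sha1.hexdigest()
```

Computing the digests right after acquisition and again before analysis gives a simple way to show that the image has not changed in between.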
Figure 4.1: FTK Imager
Additional resources
To create a USB drive with FTK Imager: https://cylab.be/blog/180/running-and-imaging-with-ftk-imager-from-a-flash-device
FTK Imager can create images from different sources:
• Physical Drive creates an exact copy of the complete physical drive. So if the system has a 2TB hard drive with only 20GB used, the image size will be 2TB. This is the best option, as it allows recovering deleted files.
• Logical Drive creates an exact copy of a single partition. If there are deleted partitions, or data hidden outside of the main partition, we will not be able to recover that data.
• Image File creates an image from another image, which basically converts one image to another format.
• Contents of a Folder only copies the content of a folder. This means we
cannot recover deleted files.
• Fernico Device is meant for Fernico FAR archive systems.
To save the image, FTK Imager supports different formats:
• dd is a simple bit-by-bit copy of the source, with no additional information. The name originates from the dd command that you can find on Unix systems.
• E01 is also known as e01/ex01, EnCase evidence file or Expert Witness Format (EWF). It also creates a bit-by-bit copy of the source, but has additional features: encryption (AES256) and compression (LZ) of the data, header information (MD5 and SHA-1 hashing of the data, case number, evidence name and number, date and time of acquisition etc.). This should be the preferred format whenever possible.
• SMART is the format used by the SMART forensics tool from ASR Data [10].
• AFF stands for Advanced Forensics Format. It is an open source format that has features similar to E01.
• AD1 is a proprietary format from AccessData (FTK Imager) used to store the image of a folder.
Exercise
Download and install FTK Imager on your Windows machine, then use it to create an image of a USB key that you own.
4.2
Linux
On a Linux system, you can simply use the dd command to create an image of a
disk:
sudo dd if=/dev/device of=/path/to/image bs=128K
For example:
sudo dd if=/dev/sda of=~/usb.img bs=128K
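To make clear what this command does, here is a rough Python equivalent of a block-by-block copy. It is only a sketch for illustration (the function name is hypothetical, and reading a real device such as /dev/sda would still require root privileges); for real acquisitions, dd or a dedicated imaging tool remains the appropriate choice.

```python
def copy_image(source, destination, block_size=128 * 1024):
    """Copy source to destination in fixed-size blocks, like dd with bs=128K."""
    copied = 0
    with open(source, "rb") as src, open(destination, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # an empty read means the end of the source
                break
            dst.write(block)
            copied += len(block)
    return copied  # number of bytes written to the image
```

The block size only affects speed: the resulting image is the same byte sequence whatever value is used.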
4.3
Chapter review
At the end of this Chapter, you should be able to answer the following questions:
• Explain why and how we can collect a correct image of a disk.
• Explain the different possibilities for creating a disk image.
• On a Windows computer, create an image of a disk.
• On a Linux computer, create an image of a disk.
Chapter 5
Mounting an image
5.1
Mount a dd image on Linux and SIFT
As dd creates an exact byte copy of the original drive, you can simply mount the image using the mount command:
sudo mount -t <fs> -o loop,ro /path/to/image /path/to/mountpoint
For example:
sudo mkdir -p /mnt/images/01
sudo mount -t vfat -o loop,ro ~/usb.img /mnt/images/01
Exercise
Download the exercise file usb-01.img.zip from https://cylab.be/s/fFMqA
This file is a dd image of a USB drive. What is the content of the file password.txt ?
Solution
wget https://cylab.be/s/fFMqA -O usb-01.img.zip
unzip usb-01.img.zip
sudo mkdir -p /mnt/images/01
sudo mount -t vfat -o loop,ro usb-01.img /mnt/images/01
cat /mnt/images/01/password.txt
5.2
Mount an E01 image on Linux and SIFT
As mentioned earlier, the Expert Witness Format (E01) is a special format that
‘packs’ the disk image, possibly compressed, with additional information. So
mounting an E01 image on Linux or SIFT requires 2 steps:
1. use the ewfmount command to ‘expose’ the raw disk image inside the E01
container:
sudo ewfmount /path/to/image.E01 /mnt/e01
2. use mount to mount the partition.
sudo mount -o ro,loop /mnt/e01/ewf1 /mnt/windows
For example:
sudo su
ewfmount usb-02.E01 /mnt/e01
mount -o ro,loop -t vfat /mnt/e01/ewf1 /mnt/windows_mount
ls /mnt/windows_mount
Additional resources
ewfmount is already installed on SIFT. On a Debian based distribution
(like Ubuntu), you can install it with:
sudo apt install ewf-tools
Exercise
Download the exercise file usb-02.E01 from https://cylab.be/s/5Lne8
This file is an E01 image of a USB drive. What is the content of the
file password.txt ?
5.3
Windows : FTK Imager
On a Windows machine, you can use FTK Imager to mount an image.
Figure 5.1: Mounting an E01 image with FTK imager
FTK Imager offers 3 mount types:
• Physical only attaches the image as a physical drive, without actually mounting the different partitions. The content of the image cannot be viewed in Windows Explorer, but the drive can be viewed using Windows applications that perform Physical Name Querying, like partition editors. This mount type is only available with full disk images like RAW/dd and E01.
• Logical attaches the partition as a new drive, which is visible in Windows Explorer. This mode is only available for logical images that only contain the content of files and folders, like AD1.
• Physical & Logical is available for full disk images. It attaches the image
as a physical drive, and mounts the different partitions in Windows Explorer.
5.4
Chapter review
At the end of this Chapter, you should be able to:
• Mount a disk image in the SIFT workstation.
Chapter 6
Forensics tools
6.1
The Sleuth Kit
The Sleuth Kit (TSK) is a collection of command line tools for analyzing disk images and recovering deleted files. It is already installed on the SIFT workstation.
With TSK you can:
• analyze raw (dd), E01 and AFF images (see Chapter 4);
• analyze NTFS, FAT, ExFAT, Ext4, Ext3 and a lot of other file systems;
• recover the content of deleted blocks;
• create a timeline of file activity, which can be imported into a spreadsheet to create graphs and reports.
The complete list of features is available on the web site https://www.sleuthkit.org/
6.1.1
Filesystem and partition information
To list the partitions contained in an image:
mmls <image>
To display type and details about a file system:
fsstat -o <offset in sectors> <image>
Figure 6.1: Example of fsstat showing details of a FAT32 file system
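To give an idea of where such a partition listing comes from, the Python sketch below decodes a classic MBR partition table (four 16-byte entries at offset 446 of the first sector, followed by the 0x55 0xAA signature). It is a simplified illustration and not how mmls is implemented: it assumes 512-byte sectors, ignores GPT and extended partitions, and runs on a synthetic sector built for the example.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def list_mbr_partitions(first_sector):
    """Return the non-empty entries of a classic MBR partition table."""
    if len(first_sector) < 512 or first_sector[510:512] != b"\x55\xaa":
        raise ValueError("no MBR signature found")
    partitions = []
    for slot in range(4):
        entry = first_sector[446 + 16 * slot:446 + 16 * (slot + 1)]
        # status, CHS start, type, CHS end, first sector (LBA), sector count
        _status, _chs_a, ptype, _chs_b, start, count = struct.unpack("<B3sB3sII", entry)
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append({
                "slot": slot,
                "type": ptype,
                "start_sector": start,
                "sector_count": count,
                "byte_offset": start * SECTOR_SIZE,
            })
    return partitions


# Synthetic first sector: one FAT32 (type 0x0C) partition starting at sector 2048.
mbr = bytearray(512)
mbr[446:462] = struct.pack("<B3sB3sII", 0, b"\0\0\0", 0x0C, b"\0\0\0", 2048, 100000)
mbr[510:512] = b"\x55\xaa"

partitions = list_mbr_partitions(bytes(mbr))
print(partitions)
```

The start_sector value corresponds to the offset in sectors passed to fsstat with -o.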
Exercise
Download and extract usb-03.img.zip from https://cylab.be/s/LOz9Z. This image contains multiple partitions. List the different partitions and filesystems.
Solution
wget https://cylab.be/s/LOz9Z -O usb-03.img.zip
unzip usb-03.img.zip
## list the partitions
## start and end positions are indicated
## in sectors
mmls usb-03.img
### get the type of the first partition
fsstat -o 2048 usb-03.img
...
Additional resources
On a Linux system, you may also list partitions and filesystems with
fdisk -l <image>
But fdisk is usually less accurate than the TSK tools.
6.1.2
File recovery
As we have seen previously, when a file or directory is deleted, the content is
actually not erased from the disk. The data sectors are marked as free, and the
corresponding entry is removed from the index structure (inode or FAT entry).
This means that, in some cases, the content of deleted files can be recovered.
With TSK, the first step is to use fls to list deleted files:
fls -d <image>
This will show, for each file, the corresponding inode number or FAT entry number
(inum).
Next, we can use icat to extract the content of a file:
icat -r <image> <inum>
You can find an example on the Figure below.
Figure 6.2: Recover the content of a deleted file with The Sleuth Kit
Exercise
In usb-01.img, what is the content of the deleted file deleted.txt ?
Exercise
Download usb-04.E01 from https://cylab.be/s/kbcRa
The image contains 3 deleted files (PDF, DOCX, PNG). Recover the files.
Additional resources
Foremost is another tool you can use to recover deleted files:
https://cylab.be/blog/283/recovering-deleted-files-with-foremost
6.1.3
Timeline creation
As we have seen in Chapter 3, files and directories usually have timestamps associated with them. Their number and meaning depend on the file system type. For example, Ext2/3 file systems have Modified, Accessed, Changed and Deleted times. FAT stores the Written, Accessed, and Created times, although by spec the Created and Accessed times are optional and the Accessed time is only accurate to the day. NTFS has Created, Modified, Changed, and Accessed times.
The fls tool from TSK extracts this information from a disk image. The mactime tool then sorts all of the temporal data into a single timeline.
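You can run fls and save the result to a body file (a plain text file) with the following command. The -m option expects the mount point to prepend to file names (here /), and -r recurses into subdirectories:
fls -r -m / <image> > <bodyfile.txt>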
The body file format stores the following fields for each file:
MD5|name|inode|mode_as_string|UID|GID|size|atime|mtime|ctime|crtime
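A body file is easy to post-process with a script. The Python sketch below splits one line into the fields listed above, assuming that the four time fields are Unix epoch seconds (with 0 meaning "not set") and that the file name itself contains no | character. The sample line is made up for the illustration.

```python
from datetime import datetime, timezone

FIELDS = ["md5", "name", "inode", "mode", "uid", "gid",
          "size", "atime", "mtime", "ctime", "crtime"]
TIME_FIELDS = ("atime", "mtime", "ctime", "crtime")


def parse_body_line(line):
    """Split one body file line into a dict with decoded size and timestamps."""
    values = line.rstrip("\n").split("|")
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of fields")
    record = dict(zip(FIELDS, values))
    record["size"] = int(record["size"])
    for key in TIME_FIELDS:
        seconds = int(record[key])
        # Assumption: 0 means that the file system does not store this time.
        record[key] = datetime.fromtimestamp(seconds, tz=timezone.utc) if seconds else None
    return record


# Made-up sample line, not taken from a real image.
sample = "0|/password.txt|5|r/rrwxrwxrwx|0|0|12|1700000000|1700000000|0|1700000000"
record = parse_body_line(sample)
print(record["name"], record["size"], record["mtime"])
```

Converting to timezone-aware UTC values avoids mixing the time zone of the analysis machine with that of the evidence.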
Figure 6.3: The Sleuth Kit : fls
In a second step, you can use the mactime tool to create a sorted and well formatted report of disk activity. mactime has different options that you can find on the help page [11]. The simplest command to analyze a body file is:
mactime -b <bodyfile.txt>
In the output, the different activity types are identified by the letters ‘m’, ‘a’, ‘c’,
‘b’. But their meaning depends on the underlying filesystem:
File system   m               a          c              b
Ext4          Modified        Accessed   Changed        Created
Ext2/3        Modified        Accessed   Changed        N/A
FAT           Written         Accessed   N/A            Created
NTFS          File Modified   Accessed   MFT Modified   Created
UFS           Modified        Accessed   Changed        N/A
Figure 6.4: The Sleuth Kit : mactime
6.2
Chapter review
At the end of this Chapter you should be able to:
• Get information about the partitions and filesystems contained in an image
• Extract deleted files from an image
Chapter 7
BitLocker drive encryption
BitLocker [12], [13] is a full volume encryption feature included with the Pro, Enterprise and Education editions of Microsoft Windows. It is designed to protect data by providing encryption for entire volumes.
By default, it uses the AES encryption algorithm in cipher block chaining (CBC)
or XTS mode, with a 128-bit or 256-bit key. The CBC is applied to each individual
sector.
Starting with Windows Server 2012 and Windows 8, Microsoft has complemented
BitLocker with the Microsoft Encrypted Hard Drive specification, which allows
the cryptographic operations of BitLocker encryption to be offloaded to the storage
device’s hardware, thus reducing the performance impact for the user.
Three authentication mechanisms can be used to encrypt a volume:
• Transparent operation mode: This mode uses the capabilities of TPM 1.2
hardware to provide for transparent user experience. The user powers up and
logs into Windows as usual. The key used for disk encryption is encrypted by
the TPM chip and will only be released to the OS loader code if the early boot
files appear to be unmodified. The pre-OS components of BitLocker achieve
this by implementing a Static Root of Trust Measurement—a methodology
specified by the Trusted Computing Group (TCG). This mode is vulnerable
to a cold boot attack, as it allows a powered-down machine to be booted
by an attacker. It is also vulnerable to a sniffing attack, as the volume
encryption key is transferred in plain text from the TPM to the CPU during
a successful boot.
• User authentication mode: This mode requires that the user provides some
authentication to the pre-boot environment in the form of a pre-boot PIN or
password.
• USB Key Mode: The user must insert a USB device that contains a startup
key into the computer to be able to boot the protected OS. Note that this
mode requires that the BIOS on the protected machine supports the reading
of USB devices in the pre-OS environment.
Some combinations of the above authentication mechanisms are also possible:
• TPM + PIN
• TPM + PIN + USB Key
• TPM + USB Key
Also, all BitLocker combinations allow the creation of a recovery key that can
also be used to decipher a volume.
When you try to mount a BitLocker encrypted device or partition, the software you use will usually ask for the encryption password or recovery key.
7.1
Chapter review
At the end of this Chapter, you should be able to answer the following questions:
• Explain the working of BitLocker
Part II
Windows forensics
Chapter 8
Windows internals
As explained in the Preamble, performing a correct forensics analysis and using forensics tools correctly requires understanding how the investigated system is actually supposed to work. In this Chapter, we will cover some internal aspects of Windows that are of interest for a forensics investigation.
8.1
Windows registry
The Windows registry [14] is a hierarchical database of key-value pairs that stores settings and information for Windows and for applications that opt to use the registry.
In other words, the registry contains information, settings, options, and other values for programs and hardware installed on all versions of Microsoft Windows operating systems. For example, when a program is installed, a new subkey containing settings such as the program’s location, its version, and how to start the program is added to the Windows Registry.
However, it is not a requirement for Windows applications to use the Windows
Registry.
The Windows Registry stores all these settings in one logical hierarchy (like a
tree), but in multiple files (see Section 8.1.1).
Indeed, user-based registry settings are loaded from a user-specific file. This way
the registry allows multiple users to share the same machine, and also allows
programs to work for less privileged users.
Backup and restoration is also simplified as the registry can be accessed over a
network connection for remote management/support, including from scripts, using
the standard set of APIs, as long as the Remote Registry service is running and
firewall rules permit this.
Because the registry is a database, it offers improved system integrity with features
such as atomic updates. If two processes attempt to update the same registry value
at the same time, one process’s change will precede the other’s and the overall
consistency of the data will be maintained.
The registry has a hierarchical tree-like structure with 7 predefined root keys.
The main ones are:
• HKEY_LOCAL_MACHINE or HKLM stores settings that are specific to
the local computer.
• HKEY_USERS or HKU contains subkeys for each user profile on the
machine.
• HKEY_CURRENT_USER or HKCU stores settings that are specific to the
currently logged-in user. The HKEY_CURRENT_USER key is a link to
the subkey of HKEY_USERS that corresponds to the user, hence the same
information is accessible in both locations.
For example, the name of your computer is stored in the registry, at the key
HKLM\SYSTEM\CurrentControlSet\Control\ComputerName\ComputerName
On a Windows computer, you can view and modify the content of the registry with
regedit.exe, the built-in Windows Registry Editor.
Figure 8.1: Registry Editor
Exercise
On your Windows machine, launch regedit, and check the name of your computer in the Registry.
8.1.1
Hives
Even though the registry presents itself as an integrated hierarchical database,
branches of the registry are actually stored in a number of disk files called hives.
For example, individual settings for users on a system are stored in a hive (disk
file) per user. During user login, the system loads the user hive under the
HKEY_USERS key and sets the HKCU (HKEY_CURRENT_USER) symbolic
reference to point to the current user. This allows applications to store/retrieve
settings for the current user implicitly under the HKCU key.
The user-specific HKEY_CURRENT_USER user registry hive is stored in NTUSER.DAT
inside the user profile. There is one of these per user. If a user has a roaming
profile, then this file will be copied to and from a server at logout and login
respectively.
Here are the main hives:
• C:\Windows\System32\config\SAM is the Security Accounts Manager
and contains login information about the users;
• C:\Windows\System32\config\SECURITY contains security information,
and possibly passwords;
• C:\Windows\System32\config\SOFTWARE contains information about installed applications and the default Windows settings;
• C:\Windows\System32\config\SYSTEM contains information relating to
hardware and system configuration;
• C:\Users\$USER$\NTUSER.DAT contains user preferences and settings
(see above); in the registry, it is mapped to HKEY_CURRENT_USER;
• C:\Users\$USER$\AppData\Local\Microsoft\Windows\UsrClass.dat
contains information concerning User Account Control (UAC) configuration
and about GUI display for the user experience; in the registry, it is mapped
to HKEY_CURRENT_USER\Software\Classes.
Figure 8.2: Some hives on Windows 11
8.1.2
ControlSets
The control set [15] registry branch records information that is needed
to start Windows, as well as device-related information that is used to run
Windows (Windows Services).
Windows stores at least two control
sets in the registry: HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001 and
HKEY_LOCAL_MACHINE\SYSTEM\ControlSet002.
Usually, both of them have the same information. However, if a fundamental
change is made to the system such as a change of the hardware, there is the
possibility that Windows cannot boot up anymore because of a faulty entry in
the registry’s control set. Thus, only one of the copies is changed. If Windows
manages to boot up correctly, it copies the newer control set over the older so that
both are in sync again.
The registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet is just a
link to one of the two real control sets: the one that is currently loaded.
The number of the current control set is recorded in the Current value of the
registry key HKEY_LOCAL_MACHINE\SYSTEM\Select.
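As a small illustration, the number read from Current can be turned into the name of the matching control set key. This is only a minimal Python sketch: the function name is hypothetical, and the three-digit zero-padded naming is inferred from the ControlSet001 and ControlSet002 keys mentioned above.

```python
def control_set_key(current: int) -> str:
    """Return the key name for the control set number found under Select."""
    if current < 1:
        raise ValueError("control set numbers start at 1")
    # ControlSet001, ControlSet002, ...: the number is zero-padded to 3 digits
    return rf"HKEY_LOCAL_MACHINE\SYSTEM\ControlSet{current:03d}"


# Hypothetical example: Current holds the number 1
print(control_set_key(1))
```

During an offline analysis, the same number tells you which ControlSet00N branch of the SYSTEM hive to read, since the CurrentControlSet link only exists on a running system.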
Exercise
On your Windows machine, use the Registry Editor to check the
currently used ControlSet.
Figure 8.3: Registry Control Sets
8.1.3
Most Recently Used (MRU)
Some registry entries store lists of Most Recently Used (MRU) items. In these
lists, the first item in the value data is the most recently accessed and the last
item is the oldest.
8.2
Event Logs
When interesting events take place, Windows records event information in the
event logs. Since Windows Vista, these logs are stored in
C:\Windows\System32\winevt\Logs
The logs are stored in a specific format called Windows XML Event Log, with the
extension .evtx.
Figure 8.4: C:\Windows\System32\winevt\Logs
Windows also has a built-in tool called the Event Viewer that allows administrators
and users to view the event logs on a local or remote machine.
These events are categorized in 3 classes:
• System : events generated by the Windows operating system;
• Application : events generated by applications on the local machine;
• Security : events related to login attempts.
Figure 8.5: Windows Event Viewer
Each event also has an event ID. Here are a few useful examples:
• 4624 : successful logon
• 4625 : unsuccessful logon
• 4634 : logon session terminated
• 4647 : logon session terminated by user
• 4648 : user logon attempted by a user with different credentials
• 4672 : user logon with admin rights
• 4720 : user account created
The complete list of event IDs and descriptions can be found at
https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/
Exercise
On your Windows machine
1. open the Event Viewer
2. use the filtering functionality to find all failed logon attempts on
your machine
For logon events, Windows also records additional information, including the
logon type. This shows whether the logon event was caused by a direct logon
from the user (interactive logon), by the execution of a scheduled task (batch
logon), or through the network (network logon).
The main logon types are presented below:
• Interactive (2) : Console logon; RUNAS; Hardware remote control solutions (such as Network KVM or Remote Access / Lights-Out Card in
server);
• Network (3) : NET USE; RPC calls; Remote registry;
• Batch (4) : Scheduled tasks;
• Service (5) : Windows services;
• NetworkCleartext (8) : IIS Basic Auth (IIS 6.0 and newer); Windows
PowerShell with CredSSP;
• NewCredentials (9) : RUNAS /NETWORK;
• RemoteInteractive (10) : Remote Desktop (formerly known as “Terminal
Services”).
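When reviewing many exported events, it can help to translate the numeric codes automatically. The following Python sketch simply encodes the two lists above as lookup tables; the function and variable names are hypothetical, and the tables are limited to the values listed in this section.

```python
# Event IDs and logon types copied from the lists above (not exhaustive)
EVENT_IDS = {
    4624: "successful logon",
    4625: "unsuccessful logon",
    4634: "logon session terminated",
    4647: "logon session terminated by user",
    4648: "user logon attempted by a user with different credentials",
    4672: "user logon with admin rights",
    4720: "user account created",
}

LOGON_TYPES = {
    2: "Interactive",
    3: "Network",
    4: "Batch",
    5: "Service",
    8: "NetworkCleartext",
    9: "NewCredentials",
    10: "RemoteInteractive",
}


def describe_logon(event_id: int, logon_type: int) -> str:
    """Build a readable summary of a logon event."""
    event = EVENT_IDS.get(event_id, "unknown event")
    kind = LOGON_TYPES.get(logon_type, "unknown logon type")
    return f"{event_id} ({event}), logon type {logon_type} ({kind})"


print(describe_logon(4624, 10))
```

For instance, a 4624 event with logon type 10 indicates a successful Remote Desktop logon.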
Figure 8.6: Windows Event showing Logon Type
8.3
Security Identifier (SID)
A Security Identifier (SID) [16], [17] is a unique, immutable identifier of a user,
user group, or other security principal. So the SID is roughly the Windows
equivalent of the UID on a Unix system.
Unlike the username, the SID cannot be modified. This makes it possible to uniquely
identify a user, even if the username is changed.
Windows grants or denies access and privileges to resources based on access
control lists (ACLs), which use SIDs to uniquely identify users and their group
memberships. When a user logs into a computer, an access token is generated
that contains the user and group SIDs and the user privilege level. When a user requests
access to a resource, the access token is checked against the ACL to permit or
deny a particular action on a particular object.
The format of a SID can be illustrated using the following example:
S-1-5-21-3623811015-3361044348-30300820-1013
In this example:
• S : indicates that the string is a SID;
• 1 : is the version of the SID specification;
• 5 : is the identifier authority value;
• 21 : is the subauthority type. For example, 21 is a domain, and 18 is
LocalSystem;
• 3623811015-3361044348-30300820 : is the unique identifier for the
subauthority;
• 1013 : is a relative ID (RID) that identifies the user or group on this system
(in this example, in this domain).
By default, the administrator account on a system has RID 500, and the guest
account has RID 501.
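The structure described above can be sketched in a few lines of Python. This is a simplified illustration, not a complete SID parser: the Sid type and parse_sid function are hypothetical names, and the RID is simply taken to be the last number of the string.

```python
from typing import NamedTuple


class Sid(NamedTuple):
    revision: int            # version of the SID specification
    authority: int           # identifier authority value
    subauthorities: tuple    # every value after the identifier authority

    @property
    def rid(self) -> int:
        # The relative ID is the last subauthority value
        return self.subauthorities[-1]


def parse_sid(text: str) -> Sid:
    """Split a SID string such as S-1-5-21-...-1013 into its components."""
    parts = text.split("-")
    if len(parts) < 4 or parts[0] != "S":
        raise ValueError(f"not a SID: {text!r}")
    numbers = [int(part) for part in parts[1:]]
    return Sid(numbers[0], numbers[1], tuple(numbers[2:]))


sid = parse_sid("S-1-5-21-3623811015-3361044348-30300820-1013")
print(sid.revision, sid.authority, sid.subauthorities[0], sid.rid)
```

Comparing the extracted RID with 500 or 501 is a quick way to spot the default administrator and guest accounts.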
You can find your SID by typing the following command in a command prompt:
whoami /user
Figure 8.7: whoami /user
Exercise
Find your SID
You can also find the SID of all users on the system with the following command:
wmic useraccount get name,sid
8.4
Chapter review
At the end of this Chapter, you should be able to answer the following questions:
• In which directory are most registry hives located?
• Explain the working and organization of the Windows registry (include hives).
• Explain what event logs are, and give a few examples.
• Explain the use and format of Windows SID.
Chapter 9
Windows artifacts
When a computer is running, Windows will store a lot of (possibly interesting)
information in a lot of different locations. To help investigation, we can group
information and artifacts in different categories (as suggested by SANS in [18]):
1. System information;
2. Account usage;
3. Application execution;
4. File and folder opening;
5. Deleted items;
6. Physical location and network activity;
7. Browser activity;
8. External device/USB usage.
9.1
System information
Hostname
The hostname of the system is saved by Windows in registry key
SYSTEM\CurrentControlSet\Control\ComputerName\ComputerName
Timezone
The current system time zone is stored by Windows in registry key
SYSTEM\CurrentControlSet\Control\TimeZoneInformation
Additional resources
The timezone information stored in the registry is actually a reference
to an entry in tzres.dll, like @tzres.dll,-300. You can find the
string description of the entry online, for example on
https://www.win7dll.info/tzres_dll.html
Version and installation time
The information related to the currently installed version of Windows is saved in
the registry, in
SOFTWARE\Microsoft\Windows NT\CurrentVersion
Information related to the previously installed versions is stored in the registry, in
SYSTEM\Setup\Source OS
Both locations hold roughly the same keys:
• ProductName;
• BuildNumber;
• InstallTime;
• etc.
Last Shutdown Time
The time of last shutdown is stored by Windows in the ShutdownTime value of registry key
SYSTEM\CurrentControlSet\Control\Windows
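The raw shutdown time is not human readable. The sketch below assumes the value data is an 8-byte little-endian FILETIME (a count of 100-nanosecond intervals since 1601-01-01 UTC), which is how this value is commonly documented but is not stated above. The function name and the sample bytes are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100 ns ticks from this date (assumption stated in the text)
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Decode an 8-byte little-endian FILETIME into a UTC datetime."""
    (ticks,) = struct.unpack("<Q", raw)
    # One tick is 100 ns, so 10 ticks make one microsecond
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Synthetic sample bytes, not taken from a real hive
sample = struct.pack("<Q", 116444736000000000)
print(filetime_to_datetime(sample))
```

The result is expressed in UTC, so it has to be combined with the time zone artifact described earlier to obtain the local time of the shutdown.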
9.2
Account usage
These artifacts make it possible to determine which user(s) used the computer, when and
how.
User accounts
The list of user accounts is stored in the registry, in SAM\Domains\Account\Users.
The accounts are listed by relative identifiers (RID, see sec. 8.3). Each entry
shows, among other things:
• account creation time;
• last login time;
• last password change;
• login count;
• group memberships.
Figure 9.1: Windows Current Version information
Microsoft Cloud Accounts
If the user uses a Microsoft Cloud Account to log into the system, the email
address associated with the account will be stored in
SAM\Domains\Account\Users\<RID>\InternetUserName
Logon attempts
When a logon attempt is made (successful or failed), Windows will record logs in
the Security event log (see Section 8.2).
9.3
Application execution
These artifacts can show that an application was executed by the user.
Jump Lists
Windows Jump Lists were introduced in Windows 7 and give users quick access to
frequently or recently used items via the task bar. They can identify
applications in use and additional metadata about items accessed via those
applications.
Jump lists are stored in
%USERPROFILE%\AppData\Roaming\Microsoft\Windows\Recent\
AutomaticDestinations
Each jump list file is named according to an application identifier (AppID). The
list of AppIDs can be found on https://dfir.to/EZJumpList
The timestamps of a jump list file can be interpreted as follows:
• Automatic Jump List Creation Time = first time an item was added to the jump
list. Typically, the first time an object was opened by the application.
• Automatic Jump List Modification Time = last time an item was added to the jump
list. Typically, the last time the application opened an object.
Figure 9.2: Windows Jump Lists
Last visited MRU
This entry tracks applications that are currently used by the user and the directory
location for the last file accessed by the application.
For Windows XP, this entry is located in:
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
ComDlg32\LastVisitedMRU
Since Windows 7, the entry is in:
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
ComDlg32\LastVisitedPidlMRU
Figure 9.3: Last Visited MRU
Run MRU
This key stores the MRU list of commands executed using the Run dialog box:
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
RunMRU
Figure 9.4: Run dialog box
Figure 9.5: Registry entry RunMRU
UserAssist
The UserAssist key records metadata on GUI-based programs executed by the user.
The key is located in:
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
UserAssist\{GUID}\Count\
The {GUID} identifies the type of execution:
• CEBFF5CD means the executable was directly triggered;
• F4E57C4B means a shortcut was used.
The entries show the application path, and additional information depending on
the Windows version:
• last run time;
• run count;
• focus time : total time in milliseconds that the application was in focus;
• focus count : total number of times the application was re-focused on.
Remark: values are encoded using ROT-13 of character values in the ASCII range
[A-Za-z]. For example, P:\Hfref\guvon\ is the ROT-13 encoded version of
C:\Users\thiba\
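The decoding can be reproduced with the ROT-13 codec of the Python standard library. This is a minimal sketch; the function name is hypothetical.

```python
import codecs


def decode_userassist_name(encoded: str) -> str:
    """Decode a ROT-13 encoded UserAssist entry name."""
    # ROT-13 only shifts A-Z and a-z; digits, colons and backslashes are unchanged
    return codecs.decode(encoded, "rot_13")


# The example path from the remark above
print(decode_userassist_name("P:\\Hfref\\guvon\\"))
```

Since ROT-13 is its own inverse, the same function can also be used to encode a path you want to search for in a raw hive.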
Figure 9.6: UserAssist registry entries
9.4
File and folder opening
These artifacts make it possible to determine which files or folders have been opened by the
user.
Open/Save MRU
Windows stores the list of files that have been opened or saved using a Windows
shell dialog box in the Most Recently Used (MRU) registry entry.
For Windows XP, this entry is located in
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
ComDlg32\OpenSaveMRU
For Windows 7 and more recent, the entry is
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
ComDlg32\OpenSavePidlMRU
The Windows shell dialog box is the dialog box used by most common applications
to open or save a file.
Last Visited MRU
This registry entry stores the applications that are currently in use by the user
and the directory location for the last file accessed by the application.
For Windows XP, the entry is located in
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
ComDlg32\LastVisitedMRU
Since Windows 7, the entry is in
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
ComDlg32\LastVisitedPidlMRU
Recent Docs
This registry key tracks the last files and folders opened. It is used to populate
places like the “Recent” menus present in the Start menu:
NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\
RecentDocs
It is a MRU list, but with a subtlety [19]: the numbered entries are not stored in
order of use. Instead, the MRUListEx entry stores the indexes of the entries
(displayed in hexadecimal), starting with the last modified one. Hence it can be
considered as a pointer to the start of the list.
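To make the role of MRUListEx more concrete, here is a hedged Python sketch. It assumes the commonly documented layout of the value data: a sequence of 4-byte little-endian indexes, most recent first, terminated by 0xFFFFFFFF. This layout is not described above, and the function name and sample bytes are hypothetical.

```python
import struct


def parse_mrulistex(data: bytes) -> list:
    """Return the entry indexes of an MRUListEx value, most recent first."""
    order = []
    usable = len(data) // 4 * 4  # ignore trailing bytes that do not form an index
    for (index,) in struct.iter_unpack("<I", data[:usable]):
        if index == 0xFFFFFFFF:  # terminator (assumed layout)
            break
        order.append(index)
    return order


# Synthetic value data: entry 2 is the most recent, then 0, then 1
sample = struct.pack("<4I", 2, 0, 1, 0xFFFFFFFF)
print(parse_mrulistex(sample))
```

Each index returned refers to the numbered value of the same key that holds the corresponding file or folder name.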
Office Recent Files
MS Office programs track their own recent files list, to make it easier for users to
access previously opened files.
This list can be located in different locations:
NTUSER.DAT\Software\Microsoft\Office\<Version>\<AppName>\File
MRU
where version can be one of:
• 16.0 = Office 2016/2019/M365;
• 15.0 = Office 2013;
• 14.0 = Office 2010;
• 12.0 = Office 2007;
• 11.0 = Office 2003;
• 10.0 = Office XP.
or
NTUSER.DAT\Software\Microsoft\Office\<Version>\<AppName>\User
MRU\LiveId_####\File MRU
for Microsoft 365, or
NTUSER.DAT\Software\Microsoft\Office\<Version>\<AppName>\User
MRU\AD_####\File MRU
for Microsoft 365 (Azure Active Directory).
MS Word Reading Locations
Beginning with Word 2013, the last known position of the user within a Word
document is recorded in
NTUSER.DAT\Software\Microsoft\Office\<Version>\Word\Reading Locations
Shell Bags
Shell Bags [20] are used by Windows to save the view preferences (position, size
etc.) of each folder opened by the user in the File Explorer.
For each opened directory, Windows creates a ShellBag entry in
USRCLASS.DAT\Local Settings\Software\Microsoft\Windows\Shell\Bags
USRCLASS.DAT\Local Settings\Software\Microsoft\Windows\Shell\BagMRU
ShellBags can hold a lot of information about:
• local folders;
• network folders;
• removable devices;
• deleted folders;
• opened ZIP archives.
9.5
Deleted items
These artifacts make it possible to determine which files have been deleted by the user.
Recycle Bin
Like on most Operating Systems, when a user deletes a file on Windows, the file
is simply moved to a hidden system folder.
For Windows XP: C:\Recycler
Since Windows 7 : C:\$Recycle.Bin
Each user is assigned a SID sub-folder (that can be mapped to a user name by
inspecting the registry).
Since Windows 7, the files whose name starts with $I###### contain the original filename
and deletion date/time, and the files whose name starts with $R###### contain the original
deleted file contents.
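The metadata of a $I file can be decoded with a few lines of Python. The sketch below relies on a layout that is commonly documented but not described above, so treat it as an assumption: an 8-byte version, an 8-byte original file size, an 8-byte FILETIME deletion time, then the original path in UTF-16 (fixed 520 bytes in version 1, length-prefixed in version 2). The function name and the sample record are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone


def parse_dollar_i(data: bytes) -> dict:
    """Decode the metadata of a Recycle Bin $I file (assumed layout)."""
    version, size, ticks = struct.unpack_from("<QQQ", data, 0)
    if version == 2:
        # Version 2: 4-byte character count, then the UTF-16 path
        (chars,) = struct.unpack_from("<I", data, 24)
        raw_name = data[28 : 28 + 2 * chars]
    else:
        # Version 1: fixed 520-byte UTF-16 path, padded with zeros
        raw_name = data[24 : 24 + 520]
    name = raw_name.decode("utf-16-le").split("\x00", 1)[0]
    # FILETIME: 100 ns ticks since 1601-01-01 UTC
    deleted = datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ticks // 10)
    return {"version": version, "size": size, "deleted": deleted, "name": name}


# Synthetic version 2 record, built only for this example
path = "C:\\Users\\thiba\\secret.txt\x00"
sample = struct.pack("<QQQI", 2, 1234, 116444736000000000, len(path)) + path.encode("utf-16-le")
print(parse_dollar_i(sample))
```

The matching $R file shares the same suffix as the $I file, which links the recovered metadata to the deleted content.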
Windows Search Database
Windows Search indexes more than 900 file types, including email and file metadata, allowing users to search based on keywords. It contains extensive file
metadata and even partial file content.
Since Windows 7, the database is saved in
C:\ProgramData\Microsoft\Search\Data\Applications\Windows\Windows.edb
Thumbs.db
This hidden database file is created in directories where images were viewed as
thumbnails. In some cases, the thumbnail of deleted pictures is kept in Thumbs.db.
Since Windows 10, the thumbnail cache is centralized in
%USERPROFILE%\AppData\Local\Microsoft\Windows\Explorer
Shell Bags
In some cases, shell bags can be used to show that a deleted folder existed (see Section 9.4).
9.6
Physical location and network activity
(p.m.)
9.7
Browser activity
(p.m.)
Figure 9.7: Thumbnails cache on Windows 11
9.8
External device/USB usage
(p.m.)
Chapter 10
Forensics tools
10.1
RegRipper
RegRipper is a Perl script that dumps the content of a registry hive file
into readable text. It is preinstalled on the SIFT workstation.
Figure 10.1: RegRipper
Additional resources
On the SIFT workstation, RegRipper has a bug. You can find the fix at
https://cylab.be/blog/287/sift-workstation-fix-rippl-error-globalsymbol-plugindir-requires-explicit-package-name
RegRipper actually relies on plugins to parse the content of the registry hives.
You can list the available plugins with:
rip.pl -l
Next to each plugin (between the square brackets), you will find the profiles that
use this plugin. A profile is simply a predefined group of plugins.
Figure 10.2: RegRipper plugins and profiles
You can analyze a hive using a specific profile with:
rip.pl -r <hive> -f <profile>
Or, to run only a single plugin:
rip.pl -r <hive> -p <plugin>
The most commonly used profiles are:
• sam : for SAM hives
• system : for SYSTEM hives
• ntuser : for NTUSER.DAT
And here are a few interesting plugins:
• timezone
• compname
• shutdown
• networklist
• shares
• unreadmail
• recentdocs
• typedpaths
• userassist
Exercise
User profiling
Download and extract hives-01.zip from https://cylab.be/s/Q2zQ0
Use RegRipper to parse the registry hives, and answer the following
questions:
1. What is the RID of IEuser ?
2. To which groups does this user belong?
3. When was the last time this account logged into the system?
4. When was the password changed?
5. What is the login count?
Solution
This kind of information is stored in the SAM hive:
rip.pl -f sam -r SAM
Exercise
System profiling
Using hives-01.zip, use RegRipper to parse the registry hives, and
answer the following questions:
1. Which ControlSet is used by this system?
2. What is the timezone of the system?
3. What is the name of the machine?
4. When was the last shutdown (in UTC)?
Solution
This kind of information is stored in the SYSTEM hive:
rip.pl -f system -r SYSTEM
Exercise
Networks
5. What networks was the machine connected to?
6. What is the last network used by the machine?
7. When was the last connection to this network?
8. What type of network is this?
9. Are there any shared drives on this machine?
Exercise
User activity
Using hives-01.zip, use RegRipper to parse registry hives and answer the following questions:
1. What is the mail client used by the user IEUser (if any)?
2. What is the last .zip file opened by the user, and when (in
UTC)?
3. What is the last .txt file opened by the user, and when (in
UTC)?
4. What is the last .pdf file opened by the user, and when (in
UTC)?
5. What is the last folder opened by the user, and when (in UTC)?
6. Is there any evidence of the user searching for specific paths in
Windows (TypedPaths registry key)?
7. List the last programs executed by the user.
Solution
This kind of information is stored in the NTUSER.DAT hive of each
user:
rip.pl -f ntuser -r NTUSER.DAT
10.2
evtxinfo and evtxexport
evtxinfo and evtxexport are part of the Libevtx project [21]. This library
can access and dump the content of Windows XML Event Log (EVTX)
databases.
These tools are already installed on the SIFT machine. They can be used on the
command line with:
evtxinfo <file.evtx>
evtxexport <file.evtx>
Exercise
Download and extract eventlogs-01.zip from https://cylab.be/s/Gwsxl
Use evtxinfo to analyze the Security event log and answer the
following questions:
1. What is the version of the Event Log?
2. What is the number of records in the Event Log?
3. Is there anything suspicious about the Event Log information?
Solution
evtxinfo Security.evtx
Exercise
Use evtxexport to dump the content of the Security event log, and
answer the following questions:
1. What is the last event logged in the Security log, and what is the
meaning (according to the event identifier)?
2. Which account is related to the last event?
3. Is there anything interesting about how this account was used?
Check the preceding events.
10.3
Eric Zimmerman’s tools
Eric Zimmerman is a forensics specialist who wrote a collection of analysis tools:
https://ericzimmerman.github.io/
The tools run on Windows and are thus not installed on the SIFT workstation.
Additional resources
You can find installation instructions for Eric Zimmerman’s tools at:
https://cylab.be/blog/290/install-eric-zimmermans-forensics-toolkit
10.3.1
Registry Explorer
Registry Explorer can load hives and lets you explore the registry.
In the right pane (Values), the top row can be used to filter values.
10.3.2
LNK files
10.4
Thumbcache viewer
Thumbcache Viewer is a small utility for inspecting and visualizing the
content of thumbcache files (Thumbs.db). You can download the tool from
https://thumbcacheviewer.github.io/
As you may notice on the screenshot above, the path of the file is not saved in the
cache. Thumbnails are only identified by a hash. This means we must re-compute
the hash of each image on the device to find the image corresponding to the
thumbnail.
Additional resources
You can find the details of the hash algorithm at
https://www.swiftforensics.com/2012/06/windows-7-thumbcache-hash-algorithm.html
Exercise
Download and extract the thumbcache database from https://cylab.be/s/XaVkj
Analyze the images. Is there a suspicious image?
10.5
Additional exercises
Download and extract case20-disk-01.zip from https://cylab.be/s/j8RIq
Answer the following questions. For each question you must be able to give an
answer, and briefly explain the tool or technique used to find the answer. . .
User profiling
1. Which accounts are active on this computer?
2. Do they have administrator rights?
3. Can you find traces of failed login attempts?
System profiling
1. What is the hostname of the computer?
2. What is the timezone of the computer?
3. What is the installed version of Windows?
4. When was this installed?
5. When was the last shutdown?
Executed applications
1. According to userassist, which applications have been executed by
TimmersVic?
Part III
Memory forensics
Chapter 11
Principles of computer memory
In this Chapter, we will cover some principles that are common to (almost) all
operating systems regarding memory organization.
11.1
Address space of a process
The address space of a process contains all of the memory state of the running
program. For example, the code of the program (the instructions) has to live in
memory somewhere, and thus it is in the address space. The program, while it
is running, uses a stack to keep track of where it is in the function call chain as
well as to allocate local variables and pass parameters and return values to and
from routines. Finally, the heap is used for dynamically-allocated, user-managed
memory, such as that you might receive from a call to malloc() in C or new in an
object-oriented language such as C++ or Java. Of course, there are other things in
there too (e.g., statically-initialized variables), but for now let us just assume those
three components: code, stack, and heap.
In Figure 11.1, we have a tiny address space (only 16KB). The program code
lives at the top of the address space (starting at 0 in this example, and is packed
into the first 1K of the address space).
Code is static (and thus easy to place in memory), so we can place it at the top
of the address space and know that it won’t need any more space as the program
runs.
Next, we have the two regions of the address space that may grow (and shrink)
while the program runs. Those are the heap (at the top) and the stack (at the
bottom). We place them like this because each wishes to be able to grow, and by
putting them at opposite ends of the address space, we can allow such growth:
they just have to grow in opposite directions. The heap thus starts just after the
code (at 1KB) and grows downward in the figure, toward higher addresses (say
when a user requests more memory via malloc()); the stack starts at 16KB and
grows upward, toward lower addresses (say when a user makes a
procedure call). However, this placement of stack and heap is just a convention;
you could arrange the address space in a different way if you’d like (as we’ll see
later, when multiple threads co-exist in an address space, no nice way to divide
the address space like this works anymore, alas).
Figure 11.1: An example address space of a process
Of course, when we describe the address space, what we are describing is the
abstraction that the OS is providing to the running program.
11.2
Virtualization and paging [5]
From the perspective of memory, early machines didn’t provide much of an
abstraction to users. Basically, the physical memory of the machine consisted of
64KB reserved for the Operating System, and there would be one single process
(program) using the rest of the available memory.
When computers evolved to run multiple processes at the same time,
OS developers had to implement techniques to run these processes
without having to modify the programs themselves. This was implemented using
2 mechanisms: memory virtualization and paging.
With virtualization, each process running on the computer ‘thinks’ it is using the
complete available address space, starting at address 0. This is called the virtual
memory. The code and data of the process are actually saved in a different place in
physical memory.
To use available physical memory in the most effective way, most modern operating systems
implement paging: the virtual address space of each process and the physical memory
are both split into fixed-size (typically 4096 bytes) units called pages.
Figure 11.2: Paging
To record where each virtual page of the address space is placed in physical
memory, the operating system keeps a per-process data structure known as
a page table. The major role of the page table is to store address translations
for each of the virtual pages of the address space, thus letting us know where in
physical memory they live.
When a process tries to access a memory address, for example to read or write a
variable, the virtual address must be translated into a physical memory address.
This is the task of the Memory Management Unit (MMU), and it takes 3 steps:
1. the page table of the process is found thanks to the Page Map Address
Register (PMAR);
2. the virtual address is split between page number (20 bits) and offset (12 bits);
3. the physical start address of the page is found in the page table, and the
offset is added.
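Steps 2 and 3 can be sketched in Python as follows. This is a simplified model and not how an MMU is actually implemented: the page table is represented as a plain dictionary, and the addresses and function name are hypothetical.

```python
OFFSET_BITS = 12              # 12-bit offset
PAGE_SIZE = 1 << OFFSET_BITS  # 4096 bytes per page


def translate(virtual_address: int, page_table: dict) -> int:
    """Translate a 32-bit virtual address into a physical address."""
    page_number = virtual_address >> OFFSET_BITS   # upper 20 bits
    offset = virtual_address & (PAGE_SIZE - 1)     # lower 12 bits
    if page_number not in page_table:
        raise LookupError(f"page {page_number:#x} is not mapped")
    # The table gives the physical start address of the page; add the offset
    return page_table[page_number] + offset


# Hypothetical page table: virtual page number -> physical start address
page_table = {0x00000: 0x0003A000, 0x00001: 0x00012000}
print(hex(translate(0x00001ABC, page_table)))
```

Here the virtual address 0x1ABC falls in page 1 with offset 0xABC, so it maps to 0x12000 + 0xABC.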
Figure 11.3: Memory address translation
To avoid wasting available memory, shared libraries are handled in a special
way: for all processes, they are mapped to the same physical memory addresses.
This way shared libraries need not be duplicated in memory. This is also the case
for some OS memory areas that must be accessible by user processes. This is
illustrated in Figure 11.5.
11.3
Interrupts
The x86 has 4 protection levels, numbered 0 (most privilege) to 3 (least privilege).
In practice, most operating systems use only 2 levels: 0 and 3, which are then
called kernel mode and user mode, respectively. The current privilege level with
which the x86 executes instructions is stored in the %cs register, in the field Current
Privilege Level (CPL).
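On x86, the CPL occupies the two least significant bits of %cs, so it can be extracted with a simple mask. The Python sketch below illustrates this; the function names and the selector values are hypothetical examples.

```python
def current_privilege_level(cs: int) -> int:
    """Extract the CPL from a %cs selector value (two lowest bits)."""
    return cs & 0b11


def mode_name(cs: int) -> str:
    cpl = current_privilege_level(cs)
    if cpl == 0:
        return "kernel mode"
    if cpl == 3:
        return "user mode"
    return f"protection level {cpl}"


# Hypothetical selector values
for cs in (0x08, 0x1B):
    print(hex(cs), mode_name(cs))
```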
For example, to make a system call on the x86 architecture (see below), a program
invokes the int n instruction, where n specifies the index of the interrupt in the
interrupt descriptor table (IDT).
Figure 11.4: Page number and offset
Figure 11.5: Shared libraries
When a software interrupt [22] occurs, the x86 CPU first saves the current context
on the kernel stack.
Figure 11.6: Interrupt table
In the original 8086 processor (and all x86 processors in Real Mode), the Interrupt Vector Table (IVT) controlled the flow into an Interrupt Service Routine
(ISR) . The IVT started at memory address 0x00, and could go as high as 0x3FF,
for a maximum number of 256 ISRs (ranging from interrupt 0 to 255). Each entry
in the IVT contained 2 words of data: A value for the Instruction Pointer (IP)
and a value for the Code Segment (CS) (in that order). For example, let’s say
that we have the following interrupt:
int 14h
When we trigger the interrupt, the processor goes to entry number 20 in the IVT
(14h = 20). Since each entry takes 2 words (4 bytes), this entry is located at
address 20 x 4 = 0x50.
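The lookup in the IVT can be simulated in Python. This sketch follows the description above (4 bytes per entry, IP first, then CS) and adds one fact that is not stated above: in real mode, the address of the handler is computed as CS x 16 + IP. The function names and the handler values are hypothetical.

```python
import struct


def ivt_entry_address(interrupt: int) -> int:
    """Return the memory address of an IVT entry."""
    if not 0 <= interrupt <= 255:
        raise ValueError("interrupt number must be between 0 and 255")
    return interrupt * 4  # each entry holds 2 words (IP, then CS) = 4 bytes


def handler_address(memory, interrupt: int) -> int:
    """Read IP and CS from the IVT and compute the real-mode handler address."""
    ip, cs = struct.unpack_from("<HH", memory, ivt_entry_address(interrupt))
    return (cs << 4) + ip  # real-mode segment:offset translation


# Simulated first 1 KB of memory, with a hypothetical handler for int 14h
memory = bytearray(0x400)
struct.pack_into("<HH", memory, ivt_entry_address(0x14), 0x1234, 0xF000)
print(hex(ivt_entry_address(0x14)), hex(handler_address(memory, 0x14)))
```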
The processor then loads %eip and %cs. This will modify the privilege level (if
required) and start the execution of the interrupt handler.
At the end of the interrupt handler, the values saved during the int instruction
are popped from the stack, and the CPU resumes execution at the saved %eip.
Since the 286 (but extended on the 386), interrupts may be managed by a table
in memory called the Interrupt Descriptor Table (IDT). The IDT only comes into
play when the processor is in protected mode. Much like the IVT, the IDT contains
a list of pointers to the ISRs.
The following assembly structure represents an IDT entry:
struc idt_entry_struct
    base_low:  lower 16 bits of the ISR address
    sel:       code segment selector of the ISR
    always0:   not used (always 0)
    flags:     DPL and other flags
    base_high: upper 16 bits of the ISR address
endstruc
where DPL is the Descriptor Privilege Level.
IVT and IDT are installed in memory by the OS during the boot process.
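To make the layout of an IDT entry more concrete, the sketch below packs one entry with Python's struct module. It assumes the usual 32-bit protected mode layout of 8 bytes (16-bit base_low, 16-bit selector, 8-bit zero, 8-bit flags, 16-bit base_high), with the DPL stored in bits 5 and 6 of the flags byte. The function names and the example values are hypothetical.

```python
import struct


def pack_idt_entry(handler: int, selector: int, flags: int) -> bytes:
    """Build the 8 bytes of a 32-bit IDT entry (assumed layout)."""
    base_low = handler & 0xFFFF           # lower 16 bits of the ISR address
    base_high = (handler >> 16) & 0xFFFF  # upper 16 bits of the ISR address
    # Order: base_low, sel, always0, flags, base_high
    return struct.pack("<HHBBH", base_low, selector, 0, flags, base_high)


def dpl(flags: int) -> int:
    """Extract the DPL (bits 5 and 6) from the flags byte."""
    return (flags >> 5) & 0b11


# Hypothetical ISR address, code segment selector and flags
entry = pack_idt_entry(0x00123456, 0x08, 0x8E)
print(entry.hex(), dpl(0x8E))
```

Splitting the handler address in two halves is the reason why base_low and base_high must be recombined when reading an IDT from a memory dump.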
11.4
System calls
As we have seen in the previous Section, most modern processors define two
different privilege levels under which software may be executed: a user-mode
program is hence limited to its own address space so that it cannot access or modify
other running programs or the operating system itself, and is usually prevented
from directly manipulating hardware devices (e.g. the frame buffer or network
devices).
However, many normal applications obviously need access to these components.
Hence the operating system (kernel), that executes at the highest privilege, defines
system calls to provide well defined, safe implementations for such operations.
Hence system calls [23] can roughly be defined as functions that:
• are implemented in the operating system,
• are executed in kernel mode but
• can be executed by user programs.
Implementing system calls requires a control transfer from user space to kernel
space, which involves some sort of architecture-specific feature. A typical way to
implement this is to use a software interrupt or trap. Interrupts transfer control
to the operating system kernel so software simply needs to set up some register
with the system call number needed, and execute the software interrupt.
This is the only technique provided for many RISC processors, but CISC architectures such as x86 support additional techniques. For example, the x86 instruction
set contains the instructions SYSCALL/SYSRET and SYSENTER/SYSEXIT
(these two mechanisms were independently created by AMD and Intel, respectively, but in essence they do the same thing). These are “fast” control transfer
instructions that are designed to quickly transfer control to the kernel for a system
call without the overhead of an interrupt.
11.5
Chapter review
At the end of this Chapter, you should be able to answer the following questions:
• Explain the address space of a single process
• Explain how and why we need memory virtualization and paging
• Explain how interrupts work
• Explain what a system call is, and how it can be implemented using interrupts
Chapter 12
Windows memory
In this Chapter, we will cover the internals of Windows in a little more detail,
and how this OS manages memory and processes.
12.1
Windows architecture
Like many other Operating Systems, Windows uses only 2 of the protection levels
available on the x86 architecture: user mode and kernel mode.
Windows also has a monolithic kernel: the bulk of the OS and device driver code
shares the same kernel-mode protected memory space. This means that any OS
component or device driver can potentially corrupt data being used by other OS
components (intentionally or not). However, Microsoft tries to address
this through attempts to strengthen and validate components that can be loaded
in protected memory. For example, Windows device drivers must be signed by
Microsoft before they can be loaded.
The simplified architecture of Windows is represented in the Figure below.
The Hyper-V hypervisor at the bottom is a feature introduced with Windows 8.
This component runs with the same CPU privilege (0) as the rest of the kernel,
but because it uses specialized CPU virtualization instructions (VT-x on Intel
and SVM on AMD), it can isolate itself from the rest of the kernel, while also
monitoring it.
Service processes are processes that host Windows services such as the Task
Scheduler or the Print Spooler.
System processes are specific system processes like the logon process and the
Session Manager.
Figure 12.1: Simplified Windows Architecture
Environment subsystems support different OS environments (or personalities) presented to the user and programmer. This feature is, for example,
responsible for the Windows Subsystem for Linux (WSL).
The subsystem DLLs are an abstraction layer between the API presented to (and
used by) application developers, and the internal native calls implemented mostly
in Ntdll.dll. The subsystem DLLs are well documented and stable, while the
Ntdll.dll calls are not documented and are subject to regular changes.
The device drivers include hardware drivers that translate user I/O function calls
into specific hardware device I/O requests, and non-hardware device drivers, such
as file system and network protocol implementations.
The Hardware Abstraction Layer (HAL) isolates the kernel and device drivers
from platform specific hardware differences (like differences between motherboards).
12.1.1
Environment subsystems
The role of an environment subsystem is to expose some subset of the base
Windows executive system services to applications. Each subsystem can provide
access to different subsets of the native services and system calls in Windows.
Each executable image (.exe) is bound to only one subsystem. When the image is
run, the process creation code examines the subsystem type field in the image
header so that it can notify the proper subsystem of the new process.
When Windows boots, the subsystems are started by the Session Manager
(Smss.exe) process. The subsystem startup information is stored under
the registry key HKLM\SYSTEM\CurrentControlSet\Control\Session
Manager\SubSystems. The Required value lists the subsystems that load when
the system boots.
The Windows value contains the file specification of the (classical) Windows subsystem: csrss.exe, which stands for Client/Server Runtime Subsystem.
12.2
Chapter review
At the end of this Chapter, you should be able to answer the following questions:
• Explain the architecture of windows, and the role of the different components.
Chapter 13
Memory acquisition
Chapter 14
Memory analysis
Chapter 15
Linux memory forensics
Part IV
Mobile device forensics
Part V
Network forensics
A lot of our digital activity involves network usage. Hence part of a digital
forensics investigation involves the analysis of network traffic.
Compared to Windows forensics investigation, network forensics analysis presents
2 additional challenges:
1. Network forensics requires preparation: unlike a Windows system, a router or a
switch does not store logs or statistics about transmitted network traffic.
Hence a logging or recording infrastructure must be put in place before the
incident actually happens. Otherwise there is simply nothing to analyze.
2. The amount of data to store and process may become huge.
For example, a single 1 Gb/s network link used at 100% will generate daily:
$$\frac{1 \times 10^{9}\ \text{bit/s} \times 3600\ \text{s/h} \times 24\ \text{h/day}}{8 \times 10^{9}\ \text{bit/GB}} = 10800\ \text{GB/day}$$
That is roughly 10 TB of data, for a single link (and in a single direction).
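The same estimate can be reproduced for other link speeds with shell integer arithmetic. This is only a sketch: the function name daily_volume_gb is an illustrative choice, and 1 GB is taken as 10^9 Bytes, as in the computation above.

```shell
# Daily volume (in GB, with 1 GB = 10^9 Bytes) of a link used at 100%.
# daily_volume_gb is a hypothetical helper name, not an existing command.
daily_volume_gb() {
  link_bps=$1   # link speed in bit/s
  echo $(( link_bps * 3600 * 24 / 8 / 1000000000 ))
}

daily_volume_gb 1000000000   # 1 Gb/s link
daily_volume_gb 100000000    # 100 Mb/s link
```

The first call prints 10800 and the second 1080, so even a 100 Mb/s link needs about 1 TB of storage per day and per direction.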
This means that if a forensics investigation must be performed over full packet
capture network data, the investigator will need large enough storage space, and
appropriate tools to process this large amount of data.
Wireshark is probably the most used tool to analyze network captures (and the
easiest). However, it is only able to open relatively small files (roughly 500 MB
at most). Hence the first step of an investigation will be to use command line tools
to pre-process the data.
So in the next Chapters we will first review some command line tricks and commands that will be useful for network traffic analysis. Then we will give a review
of the main network protocols that you may encounter. And in Chapter 18 we will
show how tcpdump can be used to preprocess the data.
Chapter 16
Command line tricks
Analyzing network traffic requires using and combining command line tools!
This Chapter contains a lot of exercises. To do the exercises yourself, you will
need:
• a Linux system
• the file lorem.txt
cat
Reads the content of a (text) file
$ cat lorem.txt
wc
Counts the number of lines, words and characters of a file
$ wc lorem.txt
39 1939 13222 lorem.txt
To show only the number of lines :
$ wc -l lorem.txt
piping
Redirects the output of one command to the input of another
$ cat lorem.txt | wc
Exercise
Show only the number of lines in lorem.txt
Solution
$ cat lorem.txt | wc -l
cut
Extracts the given field(s) from each line.
$ cut -d <delimiter> -f <field(s)>
$ cut -d ' ' -f 2,4-6
Exercise
In lorem.txt, show only the first word of each line.
Solution
$ cat lorem.txt | cut -d ' ' -f 1
sort
Sort the lines.
For numeric values:
$ sort -n
For human readable values (1M, 1G):
$ sort -h
uniq
Filter adjacent identical lines (used for logs deduplication)
$ cat uniq.txt
line a
line a
line b
line a
line b
$ cat uniq.txt | uniq
line a
line b
line a
line b
To show count:
$ uniq -c
$ cat uniq.txt | uniq -c
2 line a
1 line b
1 line a
1 line b
Often used together with sort:
$ cat uniq.txt | sort | uniq -c
3 line a
2 line b
or
$ cat uniq.txt | sort | uniq -c | sort -n
2 line b
3 line a
Exercise
In lorem.txt, what is the first word of each line that appears most
often?
Solution
$ cat lorem.txt | cut -d ' ' -f 1 | sort | uniq -c \
| sort -n
Only show repeated lines:
$ uniq -d
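The sort | uniq -c | sort -n idiom can be tried without any input file. The sketch below wraps it in a small function (most_frequent is a made-up name) that keeps only the most frequent line together with its count. The input is the content of uniq.txt from the example above.

```shell
# Print the most frequent line of standard input, preceded by its count.
# most_frequent is a hypothetical helper name.
most_frequent() {
  sort | uniq -c | sort -n | tail -n 1 \
    | awk '{ count = $1; $1 = ""; sub(/^ +/, ""); print count, $0 }'
}

printf 'line a\nline a\nline b\nline a\nline b\n' | most_frequent
```

This prints "3 line a". The awk step only strips the padding that uniq -c puts in front of the count.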
grep
Search for a string or pattern in each line
E.g
$ grep USB /var/log/syslog
Defining a pattern
Quantifiers:
• * : 0 or more
• ? : 0 or 1
• + : at least 1
E.g.
Lo+rem matches Lorem and Loorem and Looorem and . . .
$ grep Lo+rem lorem.txt
Classes represent different possible characters:
• . represents any character (1 instance)
$ grep "L.rem" lorem.txt
• \s : whitespace
• \S : non-whitespace
• \d : digit (GNU grep only understands \d and \D with grep -P)
• \D : non-digit
Combining classes and quantifiers
E.g: .* represents 0 or more instances of any character
$ grep "L.*m" lorem.txt
Some special characters must be escaped with \ :
• \/
• \\
• \*
• \^
• \%
Exercise
Which pattern will match Lorem ipsum ?
a. Lorem\sipsum
b. Lorem\Sipsum
c. Lorem*ipsum
Exercise
Which (simplified) pattern can be used to match a URL ?
a. https?:\/\/\s*
b. http.://\s*
c. https?:\/\/\S*
d. https?://\S*
Solution
https?:\/\/\S*
BUT matching every valid URL is much harder than this, see:
https://mathiasbynens.be/demo/url-regex
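To see the simplified pattern at work, and one of its limits, it can be applied to a made-up sentence. This sketch assumes a grep that supports -o (GNU, BSD and BusyBox grep do) and writes \S as the POSIX class [^[:space:]]. The helper name extract_urls and the sample text are illustrative only.

```shell
# Extract URL-like strings using the simplified pattern from the exercise.
# [^[:space:]] is the POSIX spelling of \S; extract_urls is a made-up name.
extract_urls() {
  grep -oE 'https?://[^[:space:]]*'
}

printf 'see https://example.org/a?b=1 and http://example.com/x.\n' | extract_urls
```

The second match is "http://example.com/x." with the final dot of the sentence included, which illustrates why such a simple pattern is only a first approximation.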
Case insensitive :
$ grep -i
Count the number of matching lines
$ grep -c
Figure 16.1: regex101.com lets you test your patterns
Invert match (show lines NOT matching pattern)
$ grep -v
Show context (print N lines before and after each match)
$ grep -C <N>
Exercise
Info regarding your CPU is kept in /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
...
How would you automatically (in a script) compute the number of
vcores ?
Solution
$ grep -c "processor" /proc/cpuinfo
!! Some regex meta-characters are not supported by grep by default:
• + : 1 or more
• ? : 0 or 1
• ( and ) : group
• | : or
• {} : range specifier e.g: a{4}
To use them: grep -E or egrep
Exercise
Compare the result of
• grep Lo+rem lorem.txt and
• grep -E Lo+rem lorem.txt
zgrep
Runs grep on compressed files
$ zgrep "UFW BLOCK" /var/log/syslog.2.gz
head & tail
• head shows the first 10 lines
• tail shows the last 10 lines
$ head lorem.txt
$ tail -2 lorem.txt
tail
To keep showing appended data as the file grows (follow):
$ tail -f /var/log/syslog
time
Runs a command and shows resource usage (running time)
$ time <my heavy command>
$ time grep usb /var/log/syslog
By default, shows:
• real : wall clock execution time
• user : CPU time spent in user mode
• sys : CPU time spent in kernel mode
Many others available (check man time for the list)
Chapter 17
Network protocols
Before we start using tcpdump to perform actual network traffic analysis, we give
here a quick review of the main network protocols we will encounter. For each
protocol we quickly list and explain the main header fields.
We also use the offset notation, because this notation is also used by tcpdump
and Berkeley Packet Filters (BPF). For example ip[9] is the 10th Byte of the IP
header (offset of 9 Bytes).
17.1
IPv4
Internet Protocol version 4 [24] is the protocol that defines and enables internetworking at the internet layer of the Internet Protocol Suite. It uses a logical
addressing system and performs routing, which is the forwarding of packets from
a source host to the next router that is one hop closer to the intended destination
host on another network.
Figure 17.1: IPv4 Header
• IP Header Length:
– expressed in words of 4 Bytes
– minimum : 5
• TOS / Differentiated Services Byte:
– 6 bits Differentiated Services
– 2 bits Explicit Congestion Notification (ECN)
• Total length:
– in Bytes
• Fragmentation:
– X : Reserved (evil bit)
– D : Do not fragment
– M : More fragments
– Offset : position of this fragment in the original packet (in units of 8 Bytes)
Protocol ip[9]:
• 1 : ICMP
• 2 : IGMP
• 6 : TCP
• 17 : UDP
• 41 : IPv6
• 47 : GRE
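As an illustration of the fragmentation fields, the 16-bit word at ip[6:2] can be decoded with shell arithmetic: the 3 upper bits are the flags (reserved, D, M) and the 13 lower bits are the offset, counted in units of 8 Bytes. This is a sketch with made-up sample values, and decode_frag is a hypothetical helper name.

```shell
# Decode the 16-bit word found at ip[6:2]:
# bit 0x8000 = reserved, 0x4000 = Do not fragment, 0x2000 = More fragments,
# lower 13 bits = fragment offset in units of 8 Bytes.
decode_frag() {
  word=$(( $1 ))
  df=$(( (word & 0x4000) >> 14 ))
  mf=$(( (word & 0x2000) >> 13 ))
  offset_bytes=$(( (word & 0x1fff) * 8 ))
  echo "DF=$df MF=$mf offset=$offset_bytes"
}

decode_frag 0x4000   # unfragmented packet with "Do not fragment" set
decode_frag 0x20b9   # a middle fragment
```

The second call reports MF=1 and an offset of 1480 Bytes (185 x 8), the typical position of the second fragment of a packet split on a 1500 Byte MTU link.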
17.2
TCP
The Transmission Control Protocol [25] provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on
hosts communicating via an IP network.
Figure 17.2: TCP Header
Common ports:
• 20 : ftp-data
• 21 : ftp
• 22 : SSH
• 23 : telnet
• 25 : smtp
• 43 : whois
• 53 : dns
• 80 : http
• 110 : pop3
• 143 : IMAP
• 443 : https
• 1433 : MS SQL
• 3128 : Squid HTTP Proxy
• 3306 : MySQL
Sequence number :
• random initial value
• incremented by the number of payload Bytes sent (SYN and FIN each count as one)
ACK number :
• next expected sequence number
HL : Header Length:
• expressed in words of 4 Bytes
• minimum : 5
R : Reserved
Flags tcp[13]
0x   :  8   4   2   1   8   4   2   1
flag : CWR ECE URG ACK PSH RST SYN FIN
CWR and ECE are used for Explicit Congestion Notification (ECN).
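To get used to reading the flags Byte, the following sketch translates a value of tcp[13] into flag names, walking the bits from 0x80 (CWR) down to 0x01 (FIN). The function name decode_tcp_flags is an assumption made for this example.

```shell
# Translate the value of tcp[13] into the names of the flags that are set.
# decode_tcp_flags is a hypothetical helper name.
decode_tcp_flags() {
  byte=$(( $1 ))
  out=""
  mask=128                      # 0x80, the CWR bit
  for name in CWR ECE URG ACK PSH RST SYN FIN; do
    if [ $(( byte & mask )) -ne 0 ]; then
      out="$out $name"
    fi
    mask=$(( mask / 2 ))        # move to the next lower bit
  done
  echo "${out# }"               # drop the leading space
}

decode_tcp_flags 0x02   # SYN
decode_tcp_flags 0x12   # ACK SYN
decode_tcp_flags 0x18   # ACK PSH
```

For instance 0x12 is 0x10 + 0x02, hence ACK and SYN: the second packet of the TCP handshake.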
17.3
ICMP
The Internet Control Message Protocol [26] is a supporting protocol in the Internet protocol suite. It is used by network devices, including routers, to send
error messages and operational information indicating success or failure when
communicating with another IP address. For example, an error is indicated when a
requested service is not available or when a host or router could not be reached.
The main type and code combinations are listed in the table below:
Figure 17.3: ICMP
Type  Code  Description
0     0     Echo reply
3     0     Network unreachable
3     1     Host unreachable
3     2     Protocol unreachable
3     3     Port unreachable
5     0     Network redirect
8     0     Echo request
11    0     TTL exceeded
17.4
UDP
The User Datagram Protocol [27] is one of the core communication protocols of
the Internet protocol suite used to send messages (transported as datagrams in
packets) to other hosts on an Internet Protocol (IP) network.
Unlike TCP, UDP uses a simple connectionless communication model with a
minimum of protocol mechanisms. UDP provides checksums for data integrity,
and port numbers for addressing different functions at the source and destination
of the datagram. It has no handshaking dialogues and thus exposes the user’s
program to any unreliability of the underlying network; there is no guarantee of
delivery, ordering, or duplicate protection.
Figure 17.4: UDP Header
Here are the port numbers used by some protocols that run on top of UDP:
• 67 and 68 : DHCP
• 123 : NTP
• 137 and 138 : netbios
• 161 and 162 : snmp
• 514 : syslog
17.5
ARP
The Address Resolution Protocol [28] is a communication protocol used for
discovering the link layer address, such as a MAC address, associated with a given
internet layer address, typically an IPv4 address.
Figure 17.5: ARP
Type:
• hardware type 1 : Ethernet
• protocol type 0x0800 : IPv4
Opcode:
• 1 : request
• 2 : response
Address size:
• IPv4 : 4 Bytes
• Ethernet : 6 Bytes
Chapter 18
tcpdump
• Captures and analyzes network traffic
• Suitable for LARGE captures (unlike Wireshark)
18.1
Requirements
This Chapter contains a lot of exercises to get you familiar with tcpdump. We
strongly encourage you to do the exercises by yourself. To do so, you will need a
Linux system, with tcpdump installed, and the following exercise files:
• treasurehunt_fw_eth1.pcap : https://cylab.be/s/uHlYB
• lotsofweb.pcap : https://cylab.be/s/2FnZC
18.2
Packet analysis
tcpdump -r <file>
Do NOT perform address conversion (DNS):
tcpdump -n -r <file>
• speeds up processing
• avoids detection by the attacker (reverse DNS lookups generate network traffic)
On some systems, you have to run tcpdump as superuser:
sudo tcpdump ...
Exercise
What is the wall clock time required to analyze the file lotsofweb.pcap:
1. with address conversion?
2. without address conversion?
Solution
1. time tcpdump -r lotsofweb.pcap
2. time tcpdump -n -r lotsofweb.pcap
18.2.1
Display options
Show only the n first packets:
tcpdump -r <file> -c <n>
tcpdump -n -r lotsofweb.pcap -c 12 | tail -1
Figure 18.1: Default tcpdump display format
By default:
• packet time (no date) in local (capture) TZ
• L3 protocol
• source IP.port > destination IP.port
For TCP:
• flags : . = ACK, S = SYN, R = RST, ...
• relative sequence number : computed next sequence
• relative ack
• window, length
• L7 protocol
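The field layout described above can be explored without a capture file, by splitting a hand-written line with cut. The sample line below is made up (addresses, ports and values are illustrative), and src_ip is a hypothetical helper name.

```shell
# A hand-written line in tcpdump's default format (all values are made up).
line='10:15:01.123456 IP 172.16.16.128.1606 > 74.125.95.104.80: Flags [S], seq 0, win 8192, length 0'

# Keep the third space-separated field (source IP.port),
# then its first four dot-separated parts (the IPv4 address).
src_ip() {
  cut -d ' ' -f 3 | cut -d '.' -f 1-4
}

echo "$line" | cut -d ' ' -f 3   # source IP.port
echo "$line" | src_ip            # source IP only
```

Because the port is appended to the address with a dot, a second cut on '.' is needed to drop it.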
Exercise
In lotsofweb.pcap :
1. make the list of source IP addresses
2. which IP address sent most packets?
Solution
1. $ tcpdump -n -r lotsofweb.pcap | cut -d ' ' -f 3 \
| cut -d '.' -f 1-4 | sort | uniq
2. $ tcpdump -n -r lotsofweb.pcap | cut -d ' ' -f 3 \
| cut -d '.' -f 1-4 | sort | uniq -c | sort -n
18.2.2
Output formatting
Show the L2 header (including Ethernet MAC addresses):
$ tcpdump -e
Exercise
Using the 12th packet of lotsofweb.pcap,
1. what is the MAC address of the default gateway?
2. what is the manufacturer of this router?
Figure 18.2: TCPDUMP showing L2 headers
Solution
1. $ tcpdump -r lotsofweb.pcap -e -c 12
2. check https://macvendors.com
or:
$ curl https://api.macvendors.com/`tcpdump -n \
-r lotsofweb.pcap -c 12 -e | tail -1 \
| cut -d ' ' -f 2`
Timestamps
• -t : no timestamps
• -tt : unix timestamps
• -ttt : delta with previous packet
• -tttt : date and time in local TZ
• -ttttt : relative to first packet
Exercise
At what date and time (UTC/GMT) was the first packet received ?
Solution
$ tcpdump -n -r lotsofweb.pcap -c 1 -tt
Then convert the unix timestamp, for example on https://www.epochconverter.com/
or:
$ TZ="UTC" date -d @`tcpdump -n -r lotsofweb.pcap \
-c 1 -tt | cut -d ' ' -f 1`
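The conversion step can be checked on a known value. The sketch below assumes GNU date (for the -d @<seconds> syntax); to_utc is a made-up function name and 1234567890 is just a convenient sample timestamp.

```shell
# Convert a unix timestamp to a UTC date and time (GNU date assumed).
# to_utc is a hypothetical helper name.
to_utc() {
  TZ=UTC date -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

to_utc 0            # start of the unix epoch
to_utc 1234567890   # a sample timestamp
```

The two calls print 1970-01-01 00:00:00 and 2009-02-13 23:31:30. Setting TZ explicitly matters: without it, date uses the time zone of the analysis machine.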
Sequence numbers
-S : show real (absolute) sequence numbers
Exercise
In lotsofweb.pcap, what is the real sequence number of the 20th packet ?
Solution
$ tcpdump -n -r lotsofweb.pcap -c 20 -S | tail -1
Exercise
In lotsofweb.pcap, what is the most often appearing sequence number ?
Solution
$ tcpdump -n -r lotsofweb.pcap -S | cut -d ' ' -f 9 \
| cut -d ':' -f 1 | cut -d ',' -f 1 | sort -n | uniq -c \
| sort -n
18.2.3
Anomaly detection
Example: 3-sigma rule of thumb
• compute average value mu
• compute standard deviation sigma
• trigger an alert if value > mu + 3 sigma
Why ?
• if value is distributed according to a normal distribution,
• about 99.7% of values lie within mu ± 3 sigma (so only about 0.13% exceed mu + 3 sigma)
• about 99.99% of values lie within mu ± 4 sigma
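A minimal sketch of this rule with awk, reading one numeric value per line (for example packet counts per minute). The function name three_sigma_outliers and the sample data are assumptions made for illustration; mu and sigma are computed over all the values read, including the outliers themselves.

```shell
# Print the input values that exceed mu + 3 sigma.
# three_sigma_outliers is a hypothetical helper name.
three_sigma_outliers() {
  awk '
    { v[NR] = $1; sum += $1; sumsq += $1 * $1 }
    END {
      if (NR == 0) exit
      mu = sum / NR
      var = sumsq / NR - mu * mu      # population variance
      if (var < 0) var = 0            # guard against rounding errors
      sigma = sqrt(var)
      for (i = 1; i <= NR; i++)
        if (v[i] > mu + 3 * sigma) print v[i]
    }'
}

# 20 "normal" measurements of 10, followed by one spike of 100
{
  i=0
  while [ "$i" -lt 20 ]; do echo 10; i=$(( i + 1 )); done
  echo 100
} | three_sigma_outliers
```

Only the spike (100) is printed. With very few samples a single outlier inflates sigma so much that it can never exceed the threshold, so the rule needs a reasonable amount of baseline data.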
Figure 18.3: Empirical detection rule
Figure 18.4: This is sometimes called AI powered security
Figure 18.5: Azure Sentinel
18.2.4
Output formatting : packet inspection
• -v : more details, like IP fragmentation (+), TTL and options (but multiline)
• -vv : even more details
• -vvv : more (too many) details
Figure 18.6: Analysis of fragmented packets with TCPDUMP
Packet payload
• -x : hexadecimal
• -xx : hexadecimal, with L2 (ethernet) headers
• -X : hexadecimal and ASCII
• -XX : hexadecimal and ASCII, with L2 headers
18.2.5
Filtering
• tcpdump filtering relies on libpcap and Berkeley Packet Filters (BPF)
• also used by some other tools
• Byte level (limited protocol analysis)
basic format:
protocol[offset:length] filter
examples:
• $ tcpdump -n -r lotsofweb.pcap 'ip[9] = 1' : IP protocol 1 (ICMP)
• $ tcpdump -n -r lotsofweb.pcap 'tcp[2:2] = 80' : destination port 80
operators
• =
• !=
• > >= ...
• and or not
Exercise
In treasurehunt_fw_eth1.pcap
• count the number of icmp packets
• which IP sent most icmp packets?
Solution
• $ tcpdump -n -r treasurehunt_fw_eth1.pcap 'ip[9] = 1' | wc -l
• $ tcpdump -n -r treasurehunt_fw_eth1.pcap 'ip[9] = 1' \
| cut -d ' ' -f 3 | sort | uniq -c | sort -n
Syntactic sugar (macros)
• port 22 : destination or source port 22
• src host 172.16.16.128
• src net not 172.16
Available macros:
• host
• src host
• dst host
• net
• src net
• dst net
• port
• src port
• dst port
• icmp
• tcp
• udp
Exercise
In treasurehunt_fw_eth1.pcap
• count the number of echo request (ping) packets
Solution
$ tcpdump -n -r treasurehunt_fw_eth1.pcap \
'icmp[0] = 8 and icmp[1] = 0' | wc -l
Exercise
In treasurehunt_fw_eth1.pcap, how many echo request packets are
coming from a public (external) IP?
Solution
$ tcpdump -n -r treasurehunt_fw_eth1.pcap \
'icmp[0] = 8 and icmp[1] = 0 and src net not 192.168' | wc -l
Exercise
In treasurehunt_fw_eth1.pcap, which IP sent most echo request
packets?
Solution
$ tcpdump -n -r treasurehunt_fw_eth1.pcap \
'icmp[0] = 8 and icmp[1] = 0' | cut -d ' ' -f 3 | sort | uniq -c \
| sort -n
Write filtered packets to another file:
-w <file>
Useful for analysis with Wireshark!
Exercise
In treasurehunt_fw_eth1.pcap:
• how many packets are DNS queries?
• extract these packets to a pcap file
• open your pcap file with wireshark to check your answer
Solution
$ tcpdump -n -r treasurehunt_fw_eth1.pcap \
'udp and dst port 53' -w dns.pcap
18.2.6
Filtering : bitmasks
What if the field is shorter than a Byte ?
E.g. IP header length (IHL), TCP flags etc. . .
1. use a bitmask to set irrelevant bits to 0
2. test resulting value
E.g. find packets that have IP options
Find packets where IHL is > 5
ip[0] contains :
• 4-bit IP version
• 4-bit IHL
ip[0]   : ? ? ? ? a b c d
bitmask : 0 0 0 0 1 1 1 1 (0x0f)
result  : 0 0 0 0 a b c d
test    : > 5
Final filter :
(ip[0] & 0x0f) > 5
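The effect of the mask can be verified with shell arithmetic on sample values of ip[0]. This is a sketch: has_ip_options is a made-up name, 0x45 is the first Byte of a plain IPv4 header (version 4, IHL 5) and 0x46 is a header carrying 4 Bytes of options.

```shell
# Apply the 0x0f mask to the first Byte of an IPv4 header and test the IHL.
# has_ip_options is a hypothetical helper name.
has_ip_options() {
  first_byte=$(( $1 ))
  ihl=$(( first_byte & 0x0f ))      # keep only the 4 lower bits
  if [ "$ihl" -gt 5 ]; then
    echo "IHL=$ihl ($(( ihl * 4 )) Bytes): options present"
  else
    echo "IHL=$ihl ($(( ihl * 4 )) Bytes): no options"
  fi
}

has_ip_options 0x45
has_ip_options 0x46
```

Without the mask, the version nibble would be part of the comparison: 0x45 is 69 in decimal, so the naive test ip[0] > 5 would match every IPv4 packet.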
Exercise
In lotsofweb.pcap :
• how many packets have IP options ?
• extract these packets to a new pcap
• open your pcap with wireshark to check which option is set
Solution
$ tcpdump -n -r lotsofweb.pcap '(ip[0] & 0x0f) > 5' \
-w ip_options.pcap
Filtering TCP flags
0x      :  8   4   2   1   8   4   2   1
tcp[13] : CWR ECE URG ACK PSH RST SYN FIN
ONLY SYN flag
tcp[13] = 0x02
AT LEAST SYN flag
tcp[13] : CWR ECE URG ACK PSH RST SYN FIN
mask    :  0   0   0   0   0   0   1   0
result  :  0   0   0   0   0   0  SYN  0
test    :  0   0   0   0   0   0   1   0
(tcp[13] & 0x02) = 0x02
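The difference between the "only" and "at least" tests can be tried on a few sample values of tcp[13]. In this sketch, only_flags and at_least_flags are hypothetical helper names that mirror the two filter forms above.

```shell
# "only": the flags Byte equals the mask exactly.
only_flags()     { [ $(( $1 )) -eq $(( $2 )) ]; }
# "at least": all bits of the mask are set, other bits are ignored.
at_least_flags() { [ $(( $1 & $2 )) -eq $(( $2 )) ]; }

for byte in 0x02 0x12 0x10; do
  if only_flags "$byte" 0x02; then only=yes; else only=no; fi
  if at_least_flags "$byte" 0x02; then atleast=yes; else atleast=no; fi
  echo "$byte only-SYN=$only at-least-SYN=$atleast"
done
```

A SYN+ACK packet (0x12) fails the "only SYN" test but passes the "at least SYN" test, while a pure ACK (0x10) fails both.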
Exercise
• What is the filter to check that only (SYN and ACK) are set
(connection accepted)?
• What is the filter to check that at least (SYN and ACK) are set?
Solution
• tcp[13] = 0x12
• (tcp[13] & 0x12) = 0x12
Exercise
In treasurehunt_fw_eth1.pcap
• How many packets have at least (SYN and ACK) set ?
• Establish the list of listening services (IP:port) on the internal
network.
Solution
• $ tcpdump -n -r treasurehunt_fw_eth1.pcap \
'src net 192 and ((tcp[13] & 0x12) = 0x12)' | wc -l
• $ tcpdump -n -r treasurehunt_fw_eth1.pcap \
'src net 192 and ((tcp[13] & 0x12) = 0x12)' \
| cut -d ' ' -f 3 | sort -u
Exercise
What is the filter to check if the IP More Fragments (MF) bit is set?
Solution
(ip[6] & 0x20) = 0x20
18.3
Packet capture
Figure 18.7: https://securityintelligence.com/is-full-packet-capture-worth-theinvestment/
List available network interfaces:
$ tcpdump -D
Capture and display on screen:
$ sudo tcpdump -i <interface>
or
$ sudo tcpdump -i <interface> <filter>
Write packets to file:
$ sudo tcpdump -i <interface> -w <file.pcap>
To stop, hit ctrl + c
Figure 18.8: https://www.sans.org/readingroom/whitepapers/forensics/implementing-full-packet-capture-37392
Figure 18.9: TCPDUMP server connected to a mirroring port
Rotating capture files:
-G <seconds> : write to new file after specified time
Attention: the filename must specify a time format:
-w trace-%Y%m%d.%H%M%S.pcap
Can be combined with
-C <size> : limit files to specified size (in MB)
Figure 18.10: tcpdump rotate capture
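The file name given to -w is a strftime pattern, in which %m is the month and %M the minutes, so the two are easy to mix up. A pattern can be previewed with date before starting a long capture. This sketch assumes GNU date, and capture_name is a made-up function name.

```shell
# Preview the file name produced by a strftime pattern for a given
# unix timestamp (GNU date assumed; capture_name is a hypothetical helper).
capture_name() {
  TZ=UTC date -d "@$1" '+trace-%Y%m%d.%H%M%S.pcap'
}

capture_name 1234567890
```

This prints trace-20090213.233130.pcap: year, month and day, followed by hours, minutes and seconds, so that the files sort chronologically by name.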
Exercise
On your machine,
• find your main network interface
• record packets on this interface, rotating every 5 seconds
Solution
$ sudo tcpdump -i <interface> \
-G 5 \
-w trace-%Y%m%d.%H%M%S.pcap
18.4
Chapter review
Chapter 19
Wireshark
(p.m.)
Part VI
Cloud forensics
Chapter 20
Why cloud forensics ?
Part VII
Final words
Chapter 21
Practical considerations
21.1
Preserving evidence
When digital forensics is involved, it means something pretty bad has happened.
It can be a corporate investigation in case of employee misconduct or stolen data.
It can also be for a criminal investigation.
In either case, the analyst must absolutely preserve the evidence, so that the analysis
can be cross-checked later if needed. In this Section, we will cover some practical
considerations to keep in mind before and during the investigation:
1. the order of volatility;
2. using write blockers;
3. checking hashes;
4. preserving the chain of custody.
21.1.1
Order of volatility
21.1.2
Write blockers
When you plug a disk (like a USB stick, SSD or hard drive) into a computer, most
Operating Systems will automatically read and modify some files on the device.
This obviously alters the evidence, even before the analyst has a chance to create
an image.
To avoid this, a write blocker should always be used!
A forensic disk controller or hardware write-blocker [29] is a specialized type of
computer hard disk controller made for the purpose of gaining read-only access to
computer hard drives without the risk of damaging the drive’s contents.
Figure 21.1: A SATA write blocker from Tableau. The write blocker can be
plugged to the host computer using either USB or FireWire (on the left).
Forensic disk controllers intercept write commands from the host operating system,
preventing them from reaching the drive. Whenever the host bus architecture
supports it, the controller reports that the drive is read-only. The disk controller
can either deny all writes to the disk and report them as failures, or use on-board
memory to cache the writes for the duration of the session.
A disk controller that caches writes in memory presents the appearance to the
operating system that the drive is writable, and uses the memory to ensure that
the operating system sees changes to the individual disk sectors it attempted to
overwrite. It does this by retrieving sectors from the disk if the operating system
hasn’t attempted to change them, and retrieving the changed version from memory
for sectors that have been changed.
21.1.3
Hash checking
21.1.4
Chain of custody
21.2
Time and time zones
References
[1] “Digital forensics - wikipedia.” https://en.wikipedia.org/wiki/Digital_forensics.
[2] “Intelligence analysis - wikipedia.” https://en.wikipedia.org/wiki/Intelligence_analysis.
[3] “The differences between data, information, and intelligence - united states cybersecurity magazine.” https://www.uscybersecurity.net/csmag/thedifferences-between-data-information-and-intelligence/.
[4] “SIFT workstation | SANS institute.” https://www.sans.org/tools/siftworkstation/.
[5] R. H. Arpaci-Dusseau and A. C. Arpaci-Dusseau, Operating Systems: Three Easy Pieces, 1.00 ed. Arpaci-Dusseau Books, 2018.
[6] “Disk sector - wikipedia.” https://en.wikipedia.org/wiki/Disk_sector.
[7] “ext2 - wikipedia.” https://en.wikipedia.org/wiki/Ext2.
[8] “The second extended file system.” https://www.nongnu.org/ext2-doc/ext2.html.
[9] “FTK® imager - exterro.” https://www.exterro.com/ftk-imager.
[10] “Our software | ASR data.” http://www.asrdata.com/?page_id=205.
[11] “Mactime - SleuthKitWiki.” https://wiki.sleuthkit.org/index.php?title=Mactime.
[12] “BitLocker - wikipedia.” https://en.wikipedia.org/wiki/BitLocker.
[13] “BitLocker overview - windows security | microsoft learn.” https://learn.microsoft.com/en-us/windows/security/operating-system-security/dataprotection/bitlocker/.
[14] “Windows registry - wikipedia.” https://en.wikipedia.org/wiki/Windows_Registry.
[15] “Registry: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet.” https://renenyffenegger.ch/notes/Windows/registry/tree/HKEY_LOCAL_MACHINE/System/CurrentControlSet/.
[16] “Security identifiers | microsoft learn.” https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/manage/understand-security-identifiers.
[17] “Security identifier - wikipedia.” https://en.wikipedia.org/wiki/Security_Identifier.
[18] “Windows forensic analysis | SANS poster.” https://www.sans.org/posters/windows-forensic-analysis/.
[19] “Updates to the RecentDocs key in windows 10 – forensic 4:cast.” https://forensic4cast.com/2019/03/the-recentdocs-key-in-windows-10/.
[20] “Windows ShellBag forensics in depth.” https://www.giac.org/paper/gcfa/9576/windows-shellbag-forensics-in-depth/128522.
[21] “Libyal/libevtx: Library and tools to access the windows XML event log (EVTX) format.” https://github.com/libyal/libevtx/.
[22] “x86 assembly/advanced interrupts - wikibooks, open books for an open world.” https://en.wikibooks.org/wiki/X86_Assembly/Advanced_Interrupts.
[23] “System call - wikipedia.” https://en.wikipedia.org/wiki/System_call.
[24] “Internet protocol version 4 - wikipedia.” https://en.wikipedia.org/wiki/Internet_Protocol_version_4.
[25] “Transmission control protocol - wikipedia.” https://en.wikipedia.org/wiki/Transmission_Control_Protocol.
[26] “Internet control message protocol - wikipedia.” https://en.wikipedia.org/wiki/Internet_Control_Message_Protocol.
[27] “User datagram protocol - wikipedia.” https://en.wikipedia.org/wiki/User_Datagram_Protocol.
[28] “Address resolution protocol - wikipedia.” https://en.wikipedia.org/wiki/Address_Resolution_Protocol.
[29] “Forensic disk controller - wikipedia.” https://en.wikipedia.org/wiki/Forensic_disk_controller.