Digital forensics

Thibault Debatty

Contents

1 Preamble
2 Requirements
  2.1 SIFT

I Disk Forensics
3 Disk and filesystem
  3.1 FAT filesystem
  3.2 ext2
  3.3 Chapter review
4 Disk imaging
  4.1 FTK Imager
  4.2 Linux
  4.3 Chapter review
5 Mounting an image
  5.1 Mount a dd image on Linux and SIFT
  5.2 Mount an E01 image on Linux and SIFT
  5.3 Windows : FTK Imager
  5.4 Chapter review
6 Forensics tools
  6.1 The Sleuth Kit
  6.2 Chapter review
7 BitLocker drive encryption
  7.1 Chapter review

II Windows forensics
8 Windows internals
  8.1 Windows registry
  8.2 Event Logs
  8.3 Security Identifier (SID)
  8.4 Chapter review
9 Windows artifacts
  9.1 System information
  9.2 Account usage
  9.3 Application execution
  9.4 File and folder opening
  9.5 Deleted items
  9.6 Physical location and network activity
  9.7 Browser activity
  9.8 External device/USB usage
10 Forensics tools
  10.1 RegRipper
  10.2 evtxinfo and evtxexport
  10.3 Eric Zimmerman’s tools
  10.4 Thumbcache viewer
  10.5 Additional exercises

III Memory forensics
11 Principles of computer memory
  11.1 Address space of a process
  11.2 Virtualization and paging [5]
  11.3 Interrupts
  11.4 System calls
  11.5 Chapter review
12 Windows memory
  12.1 Windows architecture
  12.2 Chapter review
13 Memory acquisition
14 Memory analysis
15 Linux memory forensics

IV Mobile device forensics

V Network forensics
16 Command line tricks
17 Network protocols
  17.1 IPv4
  17.2 TCP
  17.3 ICMP
  17.4 UDP
  17.5 ARP
18 tcpdump
  18.1 Requirements
  18.2 Packet analysis
  18.3 Packet capture
  18.4 Chapter review
19 Wireshark

VI Cloud forensics
20 Why cloud forensics ?

VII Final words
21 Practical considerations
  21.1 Preserving evidence
  21.2 Time and time zones

References

Chapter 1 Preamble

Digital forensics [1] (sometimes known as digital forensic science) is a branch of forensic science encompassing the recovery, investigation, examination and analysis of material found in digital devices, often in relation to mobile devices and computer crime.

Digital forensics is an extremely broad subject, for two main reasons. First, because of the diversity of materials involved:

1. disk from a desktop computer (Windows or Mac), which can contain artifacts or logs produced by the operating system or by the installed applications;
2. memory of a desktop computer;
3. storage or memory image of a smartphone (iPhone and Android);
4. disk or memory dump from a server;
5. traffic captured on a network;
6. logs from a cloud environment.

Second, because of the skills required from the analyst. Indeed, the process of digital forensics directly relates to the process of intelligence analysis [2], [3]:

1. correct data collection (from disk or memory for example) requires care;
2. processing the data to extract information from the evidence requires knowing which tool(s) can be used and how;
3. analyzing the information to get actual intelligence (being able to explain what actually happened and why) requires understanding how the system under analysis (Windows, a network, a cloud environment) is supposed to work, so that you can interpret the information produced by your tools;
4. and all steps require a good deal of critical thinking.
In the following chapters, we will try to cover the different sources of information you might encounter in a forensics investigation: Windows computer, memory dump, smartphone, network trace, cloud environment. For each data source, we will briefly explain (or review) how it works, and we will showcase some tools that can be used to extract information.

Chapter 2 Requirements

This book contains a lot of exercises. We strongly encourage you to do them, to practice your investigation skills and to help you understand and memorize the different techniques.

To do these exercises, you will need:

• a system with Windows installed (can be a VM), on which you have administrator rights;
• the SIFT workstation (can also be a VM, see below).

2.1 SIFT

The SIFT workstation [4] is a collection of forensic tools. It was originally created by Rob Lee, from the SANS Institute. The current version of the SIFT workstation is based on Ubuntu 20.04 and comes with a lot of preinstalled forensics tools like Plaso, The Sleuth Kit, RegRipper, Volatility etc.

The default credentials of the SIFT workstation are:

• username: sansforensics
• password: forensics

The easiest way to run the SIFT workstation is to download the VM appliance that you can find on https://www.sans.org/tools/sift-workstation/

Exercise
Download and install the SIFT workstation.

Figure 2.1: SIFT workstation running in a VM

Part I Disk Forensics

Most computer devices store permanent information on a hard drive or Solid State Drive (SSD). Hence one of the main sources of information in a digital forensics investigation is the disk of each device involved. From the OS point of view, a disk is simply a bunch of numbered but unorganized sectors. To store information, the disk is usually divided into one or more pieces called partitions.
Inside each partition, a filesystem is used to define how and where the files and folders are recorded.

In this first Part of the book, we will see:

1. how information is stored on disks (Chapter 3);
2. how to create a correct image of a disk (Chapter 4);
3. how to mount this image in the SIFT workstation (Chapter 5);
4. how to extract files (possibly deleted) from the image.

Chapter 3 Disk and filesystem

From the OS point of view, a disk is simply a sea of sectors, numbered starting from 0. The size of a sector is traditionally 512 bytes, or 4096 bytes for newer drives (known as Advanced Format, AF) [5], [6].

To store data efficiently, the OS must organize how files and folders are stored on the disk. Two concepts allow this: partitions split the disk into smaller pieces (sometimes also called volumes), and filesystems define how files and folders are stored inside the volume.

3.1 FAT filesystem

The File Allocation Table (FAT) was one of the first filesystems. It is almost the only filesystem that all operating systems can read and write. Hence, despite its age and limitations, it is still used today, mainly for USB drives.

There have been different versions of FAT. In this Section we will discuss FAT32, the most commonly used version today.

To organize data, FAT32 divides the partition into 2 areas:

• the system area stores meta-information about the files and directories, and about the filesystem itself;
• the data area stores the actual content of files (and directories).

The system area itself is composed of:

• the boot record, which contains the jump instruction used for the system to boot, and information about the filesystem;
• 2 copies of the file allocation table.
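Because a disk is just a run of fixed-size sectors numbered from 0, locating any structure on it comes down to simple arithmetic. The snippet below is a minimal sketch of that calculation; the function name sector_to_bytes and the sector number 2048 are made up for illustration and do not come from a real image.

```shell
# Byte offset of a sector = sector number x sector size
# (sector_to_bytes is a hypothetical helper name)
sector_to_bytes() {
    echo $(( $1 * $2 ))
}

# Sector 0 always starts at byte 0, whatever the sector size
sector_to_bytes 0 512

# Hypothetical example: sector 2048 on a drive with 512-byte sectors
sector_to_bytes 2048 512

# The same sector number on an Advanced Format drive (4096-byte sectors)
sector_to_bytes 2048 4096
```

The same numbers can be handed to dd through its bs, skip and count operands to read a single sector out of an image.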
The figure below represents the main fields of the boot record:

• JUMP (located at 0x00, 3 bytes) is the jump instruction for the system boot;
• OS ID (located at 0x03, 8 bytes) shows which OS was used to format the device;
• Sectors/FAT (0x24, 4 bytes) indicates the number of sectors occupied by each FAT;
• SN (0x43, 4 bytes) is the volume serial number (not the serial number of the drive);
• LABEL (0x47) is the volume label, displayed by the OS;
• FS type (0x52) is the filesystem type.

Figure 3.1: FAT32 boot record

On a Linux system or SIFT workstation, you can read the boot record of an image with:

    hexdump -n 96 -C <image>

Note that FAT32 uses little endian byte order, so the first byte of a multi-byte value is the least significant one. In the example below:

• the jump instruction is EB 58 90;
• the OS ID is 6d 6b 66 73 2e 66 61 74 or mkfs.fat, which shows the disk was formatted on a Linux system;
• each FAT occupies e8 0e 00 00 or 232 + 14 x 256 = 3816 sectors;
• the volume label is 32 47 20 20 20 20 20 20 20 20 20, which translates to 2G (followed by padding spaces);
• likewise, the FS type is FAT32.

After the boot record comes the file allocation table. This table has one entry for each cluster in the data area. With FAT32, each entry is 32 bits (4 bytes). Only the lowest 28 bits are actually used for addressing because the highest 4 bits are reserved for future use. Remember that FAT32 uses little endian byte ordering: the values below are written in the order in which the bytes appear on disk, so the reserved bits show up as a ? in place of the second-to-last hexadecimal character.

The FAT tracks which files occupy which clusters, and which clusters are free:

• if the cluster is free, the corresponding value in the FAT is 0x00 00 00 ?0;
• if the cluster is used by a file, the value in the FAT is the address of the next cluster for this file;
• if the cluster is the last cluster of a file, the value in the FAT is 0xf8 ff ff ?f.

The figure below shows an example of a FAT.
It contains a single file, which spans clusters 0x00 00 00 00, 0x00 00 00 01 and 0x00 00 00 03. The clusters 0x00 00 00 02 and 0x00 00 00 04 are free.

Figure 3.2: Example of a file allocation table. The green column indicates the cluster numbers. It is actually not stored on the disk.

When a file is deleted from the partition, the OS will:

• remove the corresponding entry from the directory (in practice the entry is only flagged as deleted: the first byte of the file name is overwritten with 0xE5);
• mark the corresponding clusters as free (by writing the value 0x00 00 00 00 in the FAT).

The actual data (located in the data area of the partition) is left untouched, which means that the data is still present on the drive and can be recovered as long as it has not been overwritten (file recovery is covered later, with The Sleuth Kit).

Size limitations

Because the field of the boot record that stores the total number of sectors is 32 bits wide, a FAT32 partition can hold at most 2^32 sectors. The maximum size of a FAT32 partition is thus 2 TB for drives that use sectors of 512 bytes, and 16 TB for drives that have sectors of 4096 bytes.

3.2 ext2

ext2 [7], [8] is the ancestor of ext4, which is currently the default filesystem for a lot of Linux distributions. ext4 performs better and has additional features (like journaling) compared to ext2, but the working principles of both filesystems are the same.

In ext2, each file is described by an inode. The inode contains, among others:

• the length of the file in bytes;
• the device ID (this identifies the device containing the file);
• the ownership of the file (UID and GID);
• the permissions associated with the file (read, write, execute);
• various timestamps (access, modification, inode change, deletion);
• pointers (direct or indirect) to the blocks that store the file’s contents.

3.2.1 Partition structure

When a partition is formatted with ext2, some space is reserved at the beginning of the partition to store inodes, as illustrated in the figure below.

3.2.2 Directory

In ext2, a directory is a special file. Its data blocks (so the content of the ‘file’) hold a list of directory entries.
Each directory entry associates one file name with one inode number. To find a file, the directory is searched front-to-back for the associated filename. For reasonable directory sizes, this is fine. But for very large directories this is inefficient. Hence one of the optimizations of ext3 is a second way of storing directories (HTree) that is more efficient than just a list of filenames.

Figure 3.3: Ext2 inode pointers

Figure 3.4: Ext2 disk organization

Because directories are actually a special kind of file, each directory is also represented by an inode. The root directory is always stored in inode number two, so that the OS can find it when the partition is mounted. Subdirectories are implemented by storing the name of the subdirectory in the name field, and the inode number of the subdirectory in the inode field.

3.3 Chapter review

At the end of this Chapter, you should be able to answer the following questions:

• Explain how FAT32 works, and what happens when you delete a file.
• Explain the role and working of ext2.

Chapter 4 Disk imaging

When performing a forensics investigation, we must absolutely preserve the original evidence. For disks, this means we must avoid modifying their content. Hence, we must first create an image of the disk, and perform the analysis on this image.

Depending on the situation, there are 3 ways to create a disk image:

1. using a hardware write blocker;
2. using a bootable USB stick;
3. directly on the system under investigation.

4.1 FTK Imager

FTK Imager [9] is a data preview and imaging tool used to acquire digital evidence in a forensically sound manner by creating copies of data without changing the original in any way. The latest version supports the AFF4 format and execution on portable drives. It was originally developed by AccessData, and is now maintained by Exterro.
It can run on a Windows computer, or directly from a USB drive.

With FTK Imager you can:

• Create forensic images of entire local hard drives, CDs and DVDs, thumb drives and other USB devices, or just the files and folders you need.
• Preview the contents of forensic images stored on local machines or network drives.
• Create hashes of files to verify data integrity using either Message Digest 5 (MD5) or Secure Hash Algorithm (SHA-1).

Figure 4.1: FTK Imager

Additional resources
To create a USB drive with FTK Imager: https://cylab.be/blog/180/running-and-imaging-with-ftk-imager-from-a-flash-device

FTK Imager can create images from different sources:

• Physical Drive creates an exact copy of the complete physical drive. So if the system has a 2 TB hard drive, but only 20 GB used, the image size will be 2 TB. This is the best option, as it allows deleted files to be recovered.
• Logical Drive creates an exact copy of a single partition. If there are deleted partitions, or data hidden outside of the main partition, we will not be able to recover that data.
• Image File creates an image from another image, so it basically converts one image to another format.
• Contents of a Folder only copies the content of a folder. This means we cannot recover deleted files.
• Fernico Device is meant for Fernico FAR archive systems.

To save the image, FTK Imager supports different formats:

• dd is a simple bit-by-bit copy of the source, with no additional information. The name originates from the dd command that you can find on Unix systems.
• E01 is also known as e01/ex01, EnCase evidence file or Expert Witness Format (EWF). It also creates a bit-by-bit copy of the source, but has additional features: encryption (AES256) and compression (LZ) of the data, header information (MD5 and SHA-1 hashing of the data, case number, evidence name and number, date and time of acquisition etc.).
This should be the preferred format whenever possible.
• SMART is the format used by the SMART forensics tool from ASR Data [10].
• AFF stands for Advanced Forensics Format. It is an open source format that has features similar to E01.
• AD1 is a proprietary format from AccessData (FTK Imager) used to store the image of a folder.

Exercise
Download and install FTK Imager on your Windows machine, then use it to create an image of a USB key that you own.

4.2 Linux

On a Linux system, you can simply use the dd command to create an image of a disk:

    sudo dd if=/dev/device of=/path/to/image bs=128K

For example:

    sudo dd if=/dev/sda of=~/usb.img bs=128K

Replace /dev/sda with the device that corresponds to the disk you want to image; lsblk lists the block devices available on your system.

4.3 Chapter review

At the end of this Chapter, you should be able to answer the following questions:

• Explain why and how we can collect a correct image of a disk.
• Explain the different possibilities for creating a disk image.
• On a Windows computer, create an image of a disk.
• On a Linux computer, create an image of a disk.

Chapter 5 Mounting an image

5.1 Mount a dd image on Linux and SIFT

As dd creates an exact byte copy of the original drive, you can simply mount the image using the mount command:

    sudo mount -t <fs> -o loop,ro /path/to/image /path/to/mountpoint

For example:

    sudo mkdir -p /mnt/images/01
    sudo mount -t vfat -o loop,ro ~/usb.img /mnt/images/01

Exercise
Download the exercise file usb-01.img.zip from https://cylab.be/s/fFMqA
This file is a dd image of a USB drive. What is the content of the file password.txt ?

Solution

    wget https://cylab.be/s/fFMqA -O usb-01.img.zip
    unzip usb-01.img.zip
    sudo mkdir -p /mnt/images/01
    sudo mount -t vfat -o loop,ro usb-01.img /mnt/images/01
    cat /mnt/images/01/password.txt

5.2 Mount an E01 image on Linux and SIFT

As mentioned earlier, the Expert Witness Format (E01) is a special format that ‘packs’ the disk image, possibly compressed, with additional information.
So mounting an E01 image on Linux or SIFT requires 2 steps:

1. use the ewfmount command to ‘expose’ the raw disk image inside the E01 container:

    sudo ewfmount /path/to/image.E01 /mnt/e01

2. use mount to mount the partition:

    sudo mount -o ro,loop /mnt/e01/ewf1 /mnt/windows

For example:

    sudo su
    # the mount points must exist before they are used
    mkdir -p /mnt/e01 /mnt/windows_mount
    ewfmount usb-02.E01 /mnt/e01
    mount -o ro,loop -t vfat /mnt/e01/ewf1 /mnt/windows_mount
    ls /mnt/windows_mount

Additional resources
ewfmount is already installed on SIFT. On a Debian based distribution (like Ubuntu), you can install it with:

    sudo apt install ewf-tools

Exercise
Download the exercise file usb-02.E01 from https://cylab.be/s/5Lne8
This file is an E01 image of a USB drive. What is the content of the file password.txt ?

5.3 Windows : FTK Imager

On a Windows machine, you can use FTK Imager to mount an image.

Figure 5.1: Mounting an E01 image with FTK Imager

FTK Imager proposes 3 mount types:

• Physical only attaches the image as a physical drive, without actually mounting the different partitions. The content of the image cannot be viewed in Windows Explorer, but the drive can be viewed using Windows applications that perform Physical Name Querying, like partition editors. This mount type is only available with full disk images like RAW/dd and E01.
• Logical attaches the partition as a new drive, which is visible in Windows Explorer. This mode is only available for logical images that only contain the content of files and folders, like AD1.
• Physical & Logical is available for full disk images. It attaches the image as a physical drive, and mounts the different partitions in Windows Explorer.

5.4 Chapter review

At the end of this Chapter, you should be able to:

• Mount a disk image in the SIFT workstation.
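Since the images above are mounted read-only, the image file itself should be identical before and after the analysis. One way to check this is to compare a hash taken before mounting with a hash taken afterwards. The snippet below is only a sketch of that idea: it assumes sha1sum from GNU coreutils is available, and it uses a small temporary file as a stand-in for a real image.

```shell
# Stand-in for a real image: a small temporary file (hypothetical, for illustration only)
image=$(mktemp)
printf 'example image content' > "$image"

# Hash the image before working with it (keep only the hash, drop the file name)
before=$(sha1sum "$image" | cut -d' ' -f1)

# ... mount the image read-only, inspect the files, unmount ...

# Hash the image again afterwards and compare both values
after=$(sha1sum "$image" | cut -d' ' -f1)
if [ "$before" = "$after" ]; then
    echo "image unchanged"
else
    echo "image was modified"
fi

rm -f "$image"
```

With a real image, the first hash would typically be recorded at acquisition time and kept with the case notes.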
Chapter 6 Forensics tools

6.1 The Sleuth Kit

The Sleuth Kit (TSK) is a collection of command line tools used to analyze disk images and recover deleted files. It is already installed on the SIFT workstation.

With TSK you can:

• analyze raw (dd), E01 and AFF images (see Chapter 4);
• analyze NTFS, FAT, ExFAT, Ext4, Ext3 and a lot of other file systems;
• recover the content of deleted blocks;
• create a timeline of file activity, which can be imported into a spreadsheet to create graphs and reports.

The complete list of features is available on the web site https://www.sleuthkit.org/

6.1.1 Filesystem and partition information

To list the partitions contained in an image:

    mmls <image>

To display the type and details of a file system:

    fsstat -o <offset in sectors> <image>

Figure 6.1: Example of fsstat showing details of a FAT32 file system

Exercise
Download and extract usb-03.img.zip from https://cylab.be/s/LOz9Z. This image contains multiple partitions. List the different partitions and filesystems.

Solution

    wget https://cylab.be/s/LOz9Z -O usb-03.img.zip
    unzip usb-03.img.zip
    # list the partitions
    # start and end positions are indicated
    # in sectors
    mmls usb-03.img
    # get the type of the first partition
    fsstat -o 2048 usb-03.img
    ...

Additional resources
On a Linux system, you may also list partitions and filesystems with

    fdisk -l <image>

But fdisk is usually less accurate than the TSK tools.

6.1.2 File recovery

As we have seen previously, when a file or directory is deleted, the content is actually not erased from the disk. The data sectors are marked as free, and the corresponding entry is removed from the index structure (inode or FAT entry). This means that, in some cases, the content of deleted files can be recovered.
With TSK, the first step is to use fls to list deleted files:

    fls -d <image>

This will show, for each file, the corresponding inode number or FAT entry number (inum). Next, we can use icat to extract the content of a file:

    icat -r <image> <inum>

You can find an example in the figure below.

Figure 6.2: Recover the content of a deleted file with The Sleuth Kit

Exercise
In usb-01.img, what is the content of the deleted file deleted.txt ?

Exercise
Download usb-04.E01 from https://cylab.be/s/kbcRa
The image contains 3 deleted files (PDF, DOCX, PNG). Recover the files.

Additional resources
Foremost is another tool you can use to recover deleted files: https://cylab.be/blog/283/recovering-deleted-files-with-foremost

6.1.3 Timeline creation

As we have seen in Chapter 3, files and directories usually have times associated with them. The number and meaning of these times depend on the file system type. For example, Ext2/3 file systems have a Modified, Accessed, Changed and Deleted time. FAT stores the Written, Accessed, and Created time, although by spec the Created and Accessed times are optional and the Accessed time is only accurate to the day. NTFS has Created, Modified, Changed, and Accessed times.

The fls tool from TSK extracts this information from a disk image. Then the mactime tool sorts all of the temporal data into a single timeline.

You can run fls and save the result to a body file (a plain text file) with the following command. Note that the -m option expects the mount point to use as a prefix for file names (here /), and that -r makes the listing recursive:

    fls -r -m / <image> > <bodyfile.txt>

The body file format stores the following fields for each file:

    MD5|name|inode|mode_as_string|UID|GID|size|atime|mtime|ctime|crtime

Figure 6.3: The Sleuth Kit : fls

In a second step, you can use the mactime tool to create a sorted and well formatted report of disk activity. mactime has different options that you can find on the help page [11].
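To get a feel for the body file format, it can help to pick one line apart by hand. The snippet below is a minimal sketch: the sample line is made up (it does not come from one of the exercise images), it assumes that the time fields hold Unix epoch seconds, as TSK writes them, and the `date -d @...` syntax assumes GNU date, as found on the SIFT workstation.

```shell
# Hypothetical body-file line: the values are made up for illustration
line='0|/password.txt|5|r/rrwxrwxrwx|0|0|21|1672531200|1672531200|0|1672531200'

# Field 2 is the file name, field 9 the modification time (mtime)
name=$(printf '%s\n' "$line" | cut -d'|' -f2)
mtime=$(printf '%s\n' "$line" | cut -d'|' -f9)

# Convert epoch seconds to a readable UTC date (the @ syntax assumes GNU date)
mtime_utc=$(date -u -d "@$mtime" '+%Y-%m-%d %H:%M:%S')

echo "$name was last modified at $mtime_utc UTC"
```

This is only meant to show how the fields line up; for a real investigation, mactime does this conversion and the sorting for the whole file.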
The simplest command to analyze a body file is:

    mactime -b <bodyfile.txt>

In the output, the different activity types are identified by the letters ‘m’, ‘a’, ‘c’, ‘b’. But their meaning depends on the underlying filesystem:

    File system   m              a          c              b
    Ext4          Modified       Accessed   Changed        Created
    Ext2/3        Modified       Accessed   Changed        N/A
    FAT           Written        Accessed   N/A            Created
    NTFS          File Modified  Accessed   MFT Modified   Created
    UFS           Modified       Accessed   Changed        N/A

Figure 6.4: The Sleuth Kit : mactime

6.2 Chapter review

At the end of this Chapter you should be able to:

• Get information about the partitions and filesystems contained in an image
• Extract deleted files from an image

Chapter 7 BitLocker drive encryption

BitLocker [12], [13] is a full volume encryption feature included with the Pro, Enterprise and Education editions of Microsoft Windows. It is designed to protect data by providing encryption for entire volumes. By default, it uses the AES encryption algorithm in cipher block chaining (CBC) or XTS mode, with a 128-bit or 256-bit key. CBC is applied to each sector individually.

Starting with Windows Server 2012 and Windows 8, Microsoft has complemented BitLocker with the Microsoft Encrypted Hard Drive specification, which allows the cryptographic operations of BitLocker encryption to be offloaded to the storage device’s hardware, thus reducing the performance impact for the user.

Three authentication mechanisms can be used to protect an encrypted volume:

• Transparent operation mode: This mode uses the capabilities of TPM 1.2 hardware to provide a transparent user experience. The user powers up and logs into Windows as usual. The key used for disk encryption is encrypted by the TPM chip and will only be released to the OS loader code if the early boot files appear to be unmodified.
The pre-OS components of BitLocker achieve this by implementing a Static Root of Trust Measurement, a methodology specified by the Trusted Computing Group (TCG). This mode is vulnerable to a cold boot attack, as it allows a powered-down machine to be booted by an attacker. It is also vulnerable to a sniffing attack, as the volume encryption key is transferred in plain text from the TPM to the CPU during a successful boot.
• User authentication mode: This mode requires that the user provides some authentication to the pre-boot environment in the form of a pre-boot PIN or password.
• USB Key Mode: The user must insert a USB device that contains a startup key into the computer to be able to boot the protected OS. Note that this mode requires that the BIOS on the protected machine supports the reading of USB devices in the pre-OS environment.

Some combinations of the above authentication mechanisms are also possible:

• TPM + PIN
• TPM + PIN + USB Key
• TPM + USB Key

Also, all BitLocker combinations allow the creation of a recovery key that can also be used to decrypt a volume.

When you try to mount a BitLocker encrypted device or partition, the software you use will usually require or ask for the encryption password or recovery key.

7.1 Chapter review

At the end of this Chapter, you should be able to answer the following questions:

• Explain the working of BitLocker

Part II Windows forensics

Chapter 8 Windows internals

As explained in the Preamble, performing a correct forensics analysis and using forensics tools correctly requires understanding how the investigated system is actually supposed to work. In this Chapter, we will cover some internal aspects of Windows that are relevant for a forensics investigation.
8.1 Windows registry

The Windows registry [14] is a hierarchical database of key-value pairs that stores settings and information for Windows and for applications that opt to use the registry. In other words, the registry contains information, settings, options, and other values for programs and hardware installed on all versions of Microsoft Windows operating systems. For example, when a program is installed, a new subkey is added to the registry, containing settings such as the program’s location, its version, and how to start the program. However, Windows applications are not required to use the registry.

The Windows registry stores all these settings in one logical hierarchy (like a tree), but in multiple files (see Section 8.1.1). Indeed, user-based registry settings are loaded from a user-specific file. This way the registry allows multiple users to share the same machine, and also allows programs to work for less privileged users.

Backup and restoration are also simplified, as the registry can be accessed over a network connection for remote management/support, including from scripts, using the standard set of APIs, as long as the Remote Registry service is running and firewall rules permit this.

Because the registry is a database, it offers improved system integrity with features such as atomic updates. If two processes attempt to update the same registry value at the same time, one process’s change will precede the other’s and the overall consistency of the data will be maintained.

The registry has a hierarchical tree-like structure, with 7 predefined root keys. The main ones are:

• HKEY_LOCAL_MACHINE or HKLM stores settings that are specific to the local computer.
• HKEY_USERS or HKU contains subkeys for each user profile on the machine.
• HKEY_CURRENT_USER or HKCU stores settings that are specific to the currently logged-in user.
The HKEY_CURRENT_USER key is a link to the subkey of HKEY_USERS that corresponds to the user, hence the same information is accessible in both locations.

For example, the name of your computer is stored in the registry, at the key

    HKLM\SYSTEM\CurrentControlSet\Control\ComputerName\ComputerName

On a Windows computer, you can view and modify the content of the registry with regedit.exe, the built-in Windows Registry Editor.

Figure 8.1: Registry Editor

Exercise
On your Windows machine, launch regedit, and check the name of your computer in the registry.

8.1.1 Hives

Even though the registry presents itself as an integrated hierarchical database, branches of the registry are actually stored in a number of disk files called hives.

For example, individual settings for users on a system are stored in a hive (disk file) per user. During user login, the system loads the user hive under the HKEY_USERS key and sets the HKCU (HKEY_CURRENT_USER) symbolic reference to point to the current user. This allows applications to store/retrieve settings for the current user implicitly under the HKCU key.

The user-specific HKEY_CURRENT_USER registry hive is stored in NTUSER.DAT inside the user profile. There is one of these per user. If a user has a roaming profile, then this file will be copied to and from a server at logout and login respectively.
Here are the main hives:
• C:\Windows\System32\config\SAM is the Security Accounts Manager and contains login information about the users;
• C:\Windows\System32\config\SECURITY contains security information, and possibly passwords;
• C:\Windows\System32\config\SOFTWARE contains information about installed applications and the default Windows settings;
• C:\Windows\System32\config\SYSTEM contains information relating to hardware and system configuration;
• C:\Users\$USER$\NTUSER.DAT contains user preferences and settings (see above); in the registry, it is mapped to HKEY_CURRENT_USER;
• C:\Users\$USER$\AppData\Local\Microsoft\Windows\UsrClass.dat contains information concerning User Account Control (UAC) configuration and GUI display settings for the user experience; in the registry, it is mapped to HKEY_CURRENT_USER\Software\Classes.

Figure 8.2: Some hives on Windows 11

8.1.2 ControlSets

The control set [15] registry branch records the information needed to start Windows, as well as device-related information used to run Windows (Windows services). Windows stores at least two control sets in the registry: HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001 and HKEY_LOCAL_MACHINE\SYSTEM\ControlSet002.

Usually, both of them hold the same information. However, if a fundamental change is made to the system, such as a change of hardware, Windows may no longer be able to boot because of a faulty entry in the registry’s control set. For this reason, only one of the copies is changed. If Windows manages to boot correctly, it copies the newer control set over the older one so that both are in sync again.

The registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet is just a link to one of the two real control sets: the one that is currently loaded.
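When hives are analyzed offline, the CurrentControlSet link is not present, because it is only created when Windows boots. The analyst then has to resolve it by hand, using the Select key described just below. The following is a minimal sketch, assuming the number stored under Select has already been read from the SYSTEM hive; the function names are hypothetical and not part of any tool.

```python
def control_set_name(current: int) -> str:
    r"""Return the control set name for a given SYSTEM\Select\Current number."""
    if not 1 <= current <= 999:
        raise ValueError(f"unexpected control set number: {current}")
    # Control sets are named ControlSet001, ControlSet002, ...
    return f"ControlSet{current:03d}"


def resolve_current_control_set(path: str, current: int) -> str:
    """Replace the CurrentControlSet link in a registry path by the real control set."""
    parts = path.split("\\")
    return "\\".join(
        control_set_name(current) if part.lower() == "currentcontrolset" else part
        for part in parts
    )


if __name__ == "__main__":
    live_path = r"SYSTEM\CurrentControlSet\Control\ComputerName\ComputerName"
    # Assumption for the example: Select\Current holds the value 1.
    print(resolve_current_control_set(live_path, 1))
```

With this helper, a path documented for a live system can be rewritten into the path that actually exists in an extracted SYSTEM hive.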
The number of the current control set is recorded in the Current value, which is available in the registry under HKEY_LOCAL_MACHINE\SYSTEM\Select.

Exercise
On your Windows machine, use the Registry Editor to check the currently used ControlSet.

Figure 8.3: Registry Control Sets

8.1.3 Most Recently Used (MRU)

Some registry entries store lists of Most Recently Used (MRU) items. The first item listed in the value data is the most recently accessed, and the last entry is the oldest.

8.2 Event Logs

When interesting events take place, Windows records event information in the event logs. Since Windows Vista, these logs are stored in

C:\Windows\System32\winevt\Logs

The logs are stored in a specific format called Windows XML Event Log, with the extension .evtx.

Figure 8.4: C:\Windows\System32\winevt\Logs

Windows also has a built-in tool called the Event Viewer that allows administrators and users to view the event logs on a local or remote machine.

Figure 8.5: Windows Event Viewer

These events are categorized in 3 classes:
• System : events generated by the Windows operating system;
• Application : events generated by applications on the local machine;
• Security : events related to login attempts.

Each event also has an event ID. Here are a few useful examples:
• 4624 : successful logon
• 4625 : unsuccessful logon
• 4634 : logon session terminated
• 4647 : logon session terminated by user
• 4648 : user logon attempted by a user with different credentials
• 4672 : user logon with admin rights
• 4720 : user account created

The complete list of event IDs and descriptions can be found at https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/

Exercise
On your Windows machine
1. open the Event Viewer
2. use the filtering functionality to find all failed logon attempts on your machine

For logon events, Windows also records additional information, including the logon type.
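When reviewing exported Security events, it can help to keep the event IDs above and the logon types listed below in small lookup tables. The following is a minimal sketch: the dictionary names, the function names and the sample records are illustrative assumptions, and real records would come from an exported .evtx file.

```python
# Event IDs and logon types as listed in this section.
EVENT_IDS = {
    4624: "successful logon",
    4625: "unsuccessful logon",
    4634: "logon session terminated",
    4647: "logon session terminated by user",
    4648: "logon attempted with different credentials",
    4672: "logon with admin rights",
    4720: "user account created",
}

LOGON_TYPES = {
    2: "Interactive",
    3: "Network",
    4: "Batch",
    5: "Service",
    8: "NetworkCleartext",
    9: "NewCredentials",
    10: "RemoteInteractive",
}


def describe_event(event_id, logon_type=None):
    """Build a short human-readable description of a Security event."""
    text = EVENT_IDS.get(event_id, "unknown event")
    if logon_type is not None:
        name = LOGON_TYPES.get(logon_type, "unknown")
        text += f" ({name} logon, type {logon_type})"
    return f"{event_id}: {text}"


def failed_logons(events):
    """Keep only the unsuccessful logon events (ID 4625)."""
    return [event for event in events if event["event_id"] == 4625]


if __name__ == "__main__":
    # Hypothetical records, only to show how the helpers are used.
    sample = [
        {"event_id": 4624, "logon_type": 2},
        {"event_id": 4625, "logon_type": 10},
        {"event_id": 4625, "logon_type": 3},
    ]
    for event in failed_logons(sample):
        print(describe_event(event["event_id"], event["logon_type"]))
```

A failed logon of type 10 or 3 suggests a remote attempt, which is usually more interesting to an investigator than a mistyped password at the console.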
The logon type shows whether the logon event was caused by a direct logon from the user (Interactive logon), by the execution of a scheduled task (Batch logon), or through the network (Network logon).

The complete list of logon types is presented below:
• Interactive (2) : Console logon; RUNAS; Hardware remote control solutions (such as Network KVM or Remote Access / Lights-Out Card in server);
• Network (3) : NET USE; RPC calls; Remote registry;
• Batch (4) : Scheduled tasks;
• Service (5) : Windows services;
• NetworkCleartext (8) : IIS Basic Auth (IIS 6.0 and newer); Windows PowerShell with CredSSP;
• NewCredentials (9) : RUNAS /NETWORK;
• RemoteInteractive (10) : Remote Desktop (formerly known as “Terminal Services”).

Figure 8.6: Windows Event showing Logon Type

8.3 Security Identifier (SID)

A Security Identifier (SID) [16], [17] is a unique, immutable identifier of a user, user group, or other security principal. The SID is thus roughly the Windows equivalent of the UID on a Unix system. Unlike the username, the SID cannot be modified. This makes it possible to uniquely identify a user, even if the username is changed.

Windows grants or denies access and privileges to resources based on access control lists (ACLs), which use SIDs to uniquely identify users and their group memberships. When a user logs into a computer, an access token is generated that contains the user and group SIDs and the user privilege level. When a user requests access to a resource, the access token is checked against the ACL to permit or deny a particular action on a particular object.

The format of a SID can be illustrated using the following example:

S-1-5-21-3623811015-3361044348-30300820-1013

In this example:
• S : indicates that the string is a SID;
• 1 : is the version of the SID specification;
• 5 : is the identifier authority value;
• 21 : is the subauthority type.
For example, 21 is a domain, and 18 is LocalSystem;
• 3623811015-3361044348-30300820 : is the unique identifier for the subauthority;
• 1013 : is a relative ID (RID) that identifies the user or group on this system (in this example, in this domain).

By default, the administrator account on a system has RID 500, and the guest account has RID 501.

You can find your SID by typing the following command in a command prompt:

whoami /user

Figure 8.7: whoami /user

Exercise
Find your SID

You can also find the SID of all users on the system with the following command:

wmic useraccount get name,sid

8.4 Chapter review

At the end of this Chapter, you should be able to answer the following questions:
• In which directory are most registry hives located?
• Explain the working and organization of the Windows registry (include hives).
• Explain what event logs are, and give a few examples.
• Explain the use and format of Windows SID.

Chapter 9 Windows artifacts

When a computer is running, Windows stores a lot of (possibly interesting) information in many different locations. To help the investigation, we can group information and artifacts in different categories (as suggested by SANS in [18]):
1. System information;
2. Account usage;
3. Application execution;
4. File and folder opening;
5. Deleted items;
6. Physical location and network activity;
7. Browser activity;
8. External device/USB usage.

9.1 System information

Hostname

The hostname of the system is saved by Windows in registry key

SYSTEM\CurrentControlSet\Control\ComputerName\ComputerName

Timezone

The current system time zone is stored by Windows in registry key

SYSTEM\CurrentControlSet\Control\TimeZoneInformation

Additional resources
The timezone information stored in the registry is actually a reference to an entry in tzres.dll, like @tzres.dll,-300.
You can find the string description of the entry online, for example on https://www.win7dll.info/tzres_dll.html

Version and installation time

The information related to the currently installed version of Windows is saved in the registry, in

SOFTWARE\Microsoft\Windows NT\CurrentVersion

Information related to the previously installed versions is stored in the registry, in

SYSTEM\Setup\Source OS

Both locations hold roughly the same keys:
• ProductName;
• BuildNumber;
• InstallTime;
• etc.

Figure 9.1: Windows Current Version information

Last Shutdown Time

The time of the last shutdown is stored by Windows in registry key

SYSTEM\CurrentControlSet\Control\Windows\ShutdownTime

9.2 Account usage

These artifacts make it possible to determine which user(s) used the computer, when and how.

User accounts

The list of user accounts is stored in the registry, in SAM\Domains\Account\Users. The accounts are listed by relative identifier (RID, see Section 8.3). Each entry shows, among others:
• account creation time;
• last login time;
• last password change;
• login count;
• group memberships.

Microsoft Cloud Accounts

If the user uses a Microsoft Cloud Account to log into the system, the email address associated with the account is stored in

SAM\Domains\Account\Users\<RID>\InternetUserName

Logon attempts

When a logon attempt is made (successful or failed), Windows records it in the Security event log (see Section 8.2).

9.3 Application execution

These artifacts make it possible to show that an application was executed by the user.

Jump Lists

Windows Jump Lists were introduced in Windows 7 and give the user quick access to frequently or recently used items via the task bar. They can identify applications in use and provide additional metadata about the items accessed via those applications.
Jump lists are stored in

%USERPROFILE%\AppData\Roaming\Microsoft\Windows\Recent\AutomaticDestinations

Each jump list file is named according to an application identifier (AppID). The list of AppIDs can be found on https://dfir.to/EZJumpList
• Automatic Jump List Creation Time = first time an item was added to the jump list. Typically, the first time an object was opened by the application.
• Automatic Jump List Modification Time = last time an item was added to the jump list. Typically, the last time the application opened an object.

Figure 9.2: Windows Jump Lists

Last visited MRU

This entry tracks the applications that are currently used by the user and the directory location of the last file accessed by each application. For Windows XP, this entry is located in:

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\LastVisitedMRU

Since Windows 7, the entry is in:

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\LastVisitedPidlMRU

Figure 9.3: Last Visited MRU

Run MRU

This key stores the MRU list of commands executed using the Run dialog box:

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\RunMRU

Figure 9.4: Run dialog box

Figure 9.5: Registry entry RunMRU

UserAssist

The UserAssist key records metadata on GUI-based programs executed by the user. The key is located in:

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\UserAssist\{GUID}\Count\

The {GUID} identifies the type of execution:
• CEBFF5CD means the executable was directly triggered;
• F4E57C4B means a shortcut was used.

The entries show the application path, and additional information depending on the Windows version:
• last run time;
• run count;
• focus time : total time in milliseconds that the application was in focus;
• focus count : total number of times the application was re-focused on.
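The names of the UserAssist values are ROT-13 encoded (see the remark that follows), so they have to be decoded before they can be read. The following is a minimal sketch using the rot_13 codec of the Python standard library; the function names are hypothetical, and the GUID lookup only relies on the two prefixes listed above.

```python
import codecs

# GUID prefixes listed above, mapped to the type of execution.
EXECUTION_TYPES = {
    "CEBFF5CD": "executable directly triggered",
    "F4E57C4B": "shortcut used",
}


def decode_userassist_name(name: str) -> str:
    """Decode a ROT-13 encoded UserAssist value name.

    Only the letters A-Z and a-z are rotated; digits, backslashes and
    other characters are left untouched.
    """
    return codecs.decode(name, "rot_13")


def execution_type(guid: str) -> str:
    """Return the type of execution for a UserAssist GUID subkey name."""
    prefix = guid.strip("{}").split("-")[0].upper()
    return EXECUTION_TYPES.get(prefix, "unknown")


if __name__ == "__main__":
    print(decode_userassist_name("P:\\Hfref\\guvon\\"))
    print(execution_type("{CEBFF5CD}"))
```

Because ROT-13 is its own inverse, the same function can also be used to encode a path when searching a raw hive for a known program name.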
Remark: values are encoded using ROT-13 of character values in the ASCII range [A-Za-z]. For example, P:\Hfref\guvon\ is the ROT-13 encoded version of C:\Users\thiba\

Figure 9.6: UserAssist registry entries

9.4 File and folder opening

These artifacts make it possible to determine which files or folders have been opened by the user.

Open/Save MRU

Windows stores the list of files that have been opened or saved using a Windows shell dialog box in the Most Recently Used (MRU) registry entry. For Windows XP, this entry is located in

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\OpenSaveMRU

For Windows 7 and more recent versions, the entry is

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\OpenSavePidlMRU

The Windows shell dialog box is the dialog box used by most common applications to open or save a file.

Last Visited MRU

This registry entry stores the applications that are currently in use by the user and the directory location of the last file accessed by each application. For Windows XP, the entry is located in

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\LastVisitedMRU

Since Windows 7, the entry is in

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\ComDlg32\LastVisitedPidlMRU

Recent Docs

This registry key tracks the last files and folders opened. It is used to populate places like the “Recent” menus present in the Start menu:

NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs

It is an MRU list, but with a subtlety [19]: the MRUListEx entry stores the index of the last modified registry entry, in hexadecimal. Hence it can be considered as a pointer to the start of the list.

Office Recent Files

MS Office programs track their own list of recent files, to make it easier for users to access previously opened files.
This list can be found in different locations:

NTUSER.DAT\Software\Microsoft\Office\<Version>\<AppName>\File MRU

where <Version> can be one of:
• 16.0 = Office 2016/2019/M365;
• 15.0 = Office 2013;
• 14.0 = Office 2010;
• 12.0 = Office 2007;
• 11.0 = Office 2003;
• 10.0 = Office XP.

or

NTUSER.DAT\Software\Microsoft\Office\<Version>\<AppName>\User MRU\LiveId_####\File MRU

for Microsoft 365, or

NTUSER.DAT\Software\Microsoft\Office\<Version>\<AppName>\User MRU\AD_####\File MRU

for Microsoft 365 (Azure Active Directory).

MS Word Reading Locations

Beginning with Word 2013, the last known position of the user within a Word document is recorded in

NTUSER.DAT\Software\Microsoft\Office\<Version>\Word\Reading Locations

Shell Bags

Shell Bags [20] are used by Windows to save the view preferences (position, size, etc.) of each folder opened by the user in the File Explorer. For each opened directory, Windows creates a ShellBag entry in

USRCLASS.DAT\Local Settings\Software\Microsoft\Windows\Shell\Bags
USRCLASS.DAT\Local Settings\Software\Microsoft\Windows\Shell\BagMRU

ShellBags can hold a lot of information about:
• local folders;
• network folders;
• removable devices;
• deleted folders;
• opened ZIP archives.

9.5 Deleted items

These artifacts make it possible to determine which files have been deleted by the user.

Recycle Bin

As on most Operating Systems, when a user deletes a file on Windows, the file is simply moved to a hidden system folder. For Windows XP:

C:\Recycler

Since Windows 7:

C:\$Recycle.Bin

Each user is assigned a sub-folder named after the user’s SID (which can be mapped to a user name by inspecting the registry). Since Windows 7, the files whose names start with $I###### contain the original filename and the deletion date and time, and the files whose names start with $R###### contain the original content of the deleted file.
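The metadata held in a $I file can be extracted with a few lines of code. The following is a minimal sketch, not a reference parser: it assumes the commonly documented layout (an 8-byte version, the 8-byte original file size, an 8-byte FILETIME deletion timestamp, then the original path in UTF-16, preceded by a 4-byte length in version 2 files), which should be verified against your own samples. The function names and the sample record are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME values count 100-nanosecond intervals since 1601-01-01 (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a Windows FILETIME value to a timezone-aware datetime (UTC)."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_i_file(data: bytes) -> dict:
    """Parse the content of a $I file (layout assumed, see the text above)."""
    version, size, deleted = struct.unpack_from("<qqQ", data, 0)
    if version == 1:
        # Assumed version 1 layout: fixed 520-byte UTF-16 path field.
        raw_path = data[24:24 + 520]
    elif version == 2:
        # Assumed version 2 layout: path length in characters, then the path.
        (length,) = struct.unpack_from("<I", data, 24)
        raw_path = data[28:28 + 2 * length]
    else:
        raise ValueError(f"unsupported $I version: {version}")
    path = raw_path.decode("utf-16-le").split("\x00", 1)[0]
    return {
        "version": version,
        "original_size": size,
        "deleted_at": filetime_to_datetime(deleted),
        "original_path": path,
    }


if __name__ == "__main__":
    # Synthetic version 2 record, built only for this example.
    path = "C:\\Users\\IEUser\\Desktop\\secret.txt\x00"
    record = struct.pack("<qqQI", 2, 1024, 132223104000000000, len(path))
    record += path.encode("utf-16-le")
    print(parse_i_file(record))
```

On a real image, the bytes would be read from a $I file found in the SID sub-folder of the Recycle Bin, and the matching $R file would hold the deleted content.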
Windows Search Database

Windows Search indexes more than 900 file types, including email and file metadata, allowing users to search based on keywords. The database contains extensive file metadata and even partial file content. Since Windows 7, the database is saved in

C:\ProgramData\Microsoft\Search\Data\Applications\Windows\Windows.edb

Thumbs.db

Thumbs.db is a hidden database file created in directories where images were viewed as thumbnails. In some cases, the thumbnails of deleted pictures are kept in Thumbs.db. Since Windows 10, the thumbnail cache is centralized in

%USERPROFILE%\AppData\Local\Microsoft\Windows\Explorer

Figure 9.7: Thumbnails cache on Windows 11

Shell Bags

In some cases, shell bags still hold information about folders that have since been deleted (see Section 9.4).

9.6 Physical location and network activity

(p.m.)

9.7 Browser activity

(p.m.)

9.8 External device/USB usage

(p.m.)

Chapter 10 Forensics tools

10.1 RegRipper

RegRipper is a Perl script that dumps the content of a registry hive file into readable text. It is preinstalled on the SIFT workstation.

Figure 10.1: RegRipper

Additional resources
On the SIFT workstation, RegRipper has a bug. You can find the fix at https://cylab.be/blog/287/sift-workstation-fix-rippl-error-globalsymbol-plugindir-requires-explicit-package-name

RegRipper actually relies on plugins to parse the content of the registry hives. You can list the available plugins with:

rip.pl -l

Next to each plugin (between the square brackets), you will find the profiles that use this plugin. A profile is simply a predefined group of plugins.
Figure 10.2: RegRipper plugins and profiles

You can analyze a hive using a specific profile with:

rip.pl -r <hive> -f <profile>

Or, to run only a single plugin:

rip.pl -r <hive> -p <plugin>

The most commonly used profiles are:
• sam : for SAM hives
• system : for SYSTEM hives
• ntuser : for NTUSER.DAT

And here are a few interesting plugins:
• timezone
• compname
• shutdown
• networklist
• shares
• unreadmail
• recentdocs
• typedpaths
• userassist

Exercise: User profiling
Download and extract hives-01.zip from https://cylab.be/s/Q2zQ0
Use RegRipper to parse the registry hives, and answer the following questions:
1. What is the RID of IEUser?
2. To which groups does this user belong?
3. When was the last time this account logged into the system?
4. When was the password changed?
5. What is the login count?

Solution
This kind of information is stored in the SAM hive:

rip.pl -f sam -r SAM

Exercise: System profiling
Using hives-01.zip, use RegRipper to parse the registry hives, and answer the following questions:
1. Which ControlSet is used by this system?
2. What is the timezone of the system?
3. What is the name of the machine?
4. When was the last shutdown (in UTC)?

Solution
This kind of information is stored in the SYSTEM hive:

rip.pl -f system -r SYSTEM

Exercise: Networks
5. What networks was the machine connected to?
6. What is the last network used by the machine?
7. When was the last connection to this network?
8. What type of network is this?
9. Are there any shared drives on this machine?

Exercise: User activity
Using hives-01.zip, use RegRipper to parse registry hives and answer the following questions:
1. What is the mail client used by the user IEUser (if any)?
2. What is the last .zip file opened by the user, and when (in UTC)?
3. What is the last .txt file opened by the user, and when (in UTC)?
4.
What is the last .pdf file opened by the user, and when (in UTC)?
5. What is the last folder opened by the user, and when (in UTC)?
6. Is there any evidence of the user searching for specific paths in Windows (TypedPaths registry key)?
7. List the last programs executed by the user.

Solution
This kind of information is stored in the NTUSER.DAT hive of each user:

rip.pl -f ntuser -r NTUSER.DAT

10.2 evtxinfo and evtxexport

evtxinfo and evtxexport are part of the Libevtx project [21]. This library makes it possible to access and dump the content of Windows XML Event Log (EVTX) databases. These tools are already installed on the SIFT machine. They can be used on the command line with:

evtxinfo <file.evtx>
evtxexport <file.evtx>

Exercise
Download and extract eventlogs-01.zip from https://cylab.be/s/Gwsxl
Use evtxinfo to analyze the Security event log and answer the following questions:
1. What is the version of the Event Log?
2. What is the number of records in the Event Log?
3. Is there anything suspicious about the Event Log information?

Solution

evtxinfo Security.evtx

Exercise
Use evtxexport to dump the content of the Security event log, and answer the following questions:
1. What is the last event logged in the Security log, and what is its meaning (according to the event identifier)?
2. Which account is related to the last event?
3. Is there anything interesting about how this account was used? Check the preceding events.

10.3 Eric Zimmerman’s tools

Eric Zimmerman is a forensics specialist who wrote a collection of analysis tools: https://ericzimmerman.github.io/

The tools run on Windows and are thus not installed on the SIFT workstation.
Additional resources
You can find installation instructions for Eric Zimmerman’s tools at: https://cylab.be/blog/290/install-eric-zimmermans-forensics-toolkit

10.3.1 Registry Explorer

The Registry Explorer makes it possible to load hives and explore the registry. In the right pane (Values), the top row can be used to filter values.

10.3.2 LNK files

10.4 Thumbcache viewer

Thumbcache Viewer is a small utility for inspecting and visualizing the content of thumbcache files (Thumbs.db). You can download the tool from https://thumbcacheviewer.github.io/

As you may notice on the screenshot above, the path of the file is not saved in the cache. Thumbnails are only identified by a hash. This means we must re-compute the hash of each image on the device to find the image corresponding to a thumbnail.

Additional resources
You can find the details of the hash algorithm at https://www.swiftforensics.com/2012/06/windows-7-thumbcache-hash-algorithm.html

Exercise
Download and extract the thumbcache database from https://cylab.be/s/XaVkj
Analyze the images. Is there a suspicious image?

10.5 Additional exercises

Download and extract case20-disk-01.zip from https://cylab.be/s/j8RIq

Answer the following questions. For each question you must be able to give an answer, and briefly explain the tool or technique used to find the answer.

User profiling
1. Which accounts are active on this computer?
2. Do they have administrator rights?
3. Can you find traces of failed login attempts?

System profiling
1. What is the hostname of the computer?
2. What is the timezone of the computer?
3. What is the installed version of Windows?
4. When was this installed?
5. When was the last shutdown?

Executed applications
1. According to userassist, which applications have been executed by TimmersVic?
Part III Memory forensics

Chapter 11 Principles of computer memory

In this Chapter, we will cover some principles that are common to (almost) all operating systems regarding memory organization.

11.1 Address space of a process

The address space of a process contains all of the memory state of the running program. For example, the code of the program (the instructions) has to live in memory somewhere, and thus it is in the address space. The program, while it is running, uses a stack to keep track of where it is in the function call chain, as well as to allocate local variables and pass parameters and return values to and from routines. Finally, the heap is used for dynamically-allocated, user-managed memory, such as the memory you might receive from a call to malloc() in C or new in an object-oriented language such as C++ or Java. Of course, there are other things in there too (e.g., statically-initialized variables), but for now let us just assume those three components: code, stack, and heap.

In Figure 11.1, we have a tiny address space (only 16KB). The program code lives at the top of the address space (starting at 0 in this example, and packed into the first 1KB of the address space). Code is static (and thus easy to place in memory), so we can place it at the top of the address space and know that it won’t need any more space as the program runs.

Figure 11.1: An example address space of a process

Next, we have the two regions of the address space that may grow (and shrink) while the program runs. Those are the heap (at the top) and the stack (at the bottom). We place them like this because each wishes to be able to grow, and by putting them at opposite ends of the address space, we can allow such growth: they just have to grow in opposite directions.
The heap thus starts just after the code (at 1KB) and grows downward in the figure, toward higher addresses (say when a user requests more memory via malloc()); the stack starts at 16KB and grows upward in the figure, toward lower addresses (say when a user makes a procedure call). However, this placement of stack and heap is just a convention; you could arrange the address space in a different way if you’d like (as we’ll see later, when multiple threads co-exist in an address space, no nice way to divide the address space like this works anymore, alas).

Of course, when we describe the address space, what we are describing is the abstraction that the OS is providing to the running program.

11.2 Virtualization and paging [5]

From the perspective of memory, early machines didn’t provide much of an abstraction to users. For example, the physical memory of such a machine could consist of 64KB reserved for the Operating System, with one single process (program) using the rest of the available memory.

When computers evolved and became able to run multiple processes at the same time, OS developers had to implement techniques to run these processes without having to modify the programs themselves. This was implemented using 2 mechanisms: memory virtualization and paging.

With virtualization, each process running on the computer ‘thinks’ it is using the complete available address space, starting at address 0. This is called the virtual memory. The code and data of the process are actually saved in a different place in physical memory.

To use the available physical memory in the most effective way, most modern OSes implement paging: the address space is split into fixed-size units (typically 4096 bytes) called pages, and each page is placed in a slot of the same size in physical memory.

Figure 11.2: Paging

To record where each virtual page of the address space is placed in physical memory, the operating system keeps a per-process data structure known as a page table.
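A page table can be pictured as a mapping from virtual page numbers to the physical start address of each page. The following is a minimal sketch of a lookup, assuming 32-bit virtual addresses and 4096-byte pages (a 20-bit page number and a 12-bit offset); the dictionary used as page table and the addresses are made up for the example, and real page tables are hardware-defined structures.

```python
PAGE_SIZE = 4096   # bytes per page
OFFSET_BITS = 12   # 2**12 == 4096


def split_virtual_address(vaddr: int) -> tuple:
    """Split a 32-bit virtual address into (page number, offset)."""
    page_number = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    return page_number, offset


def translate(vaddr: int, page_table: dict) -> int:
    """Translate a virtual address using a per-process page table.

    page_table maps a virtual page number to the physical start
    address of the page (a simplification made for this example).
    """
    page_number, offset = split_virtual_address(vaddr)
    if page_number not in page_table:
        raise LookupError(f"page {page_number:#x} is not mapped")
    return page_table[page_number] + offset


if __name__ == "__main__":
    # Hypothetical page table of one process.
    page_table = {0x00000: 0x3000, 0x00001: 0x7A000}
    print(hex(translate(0x00001ABC, page_table)))
```

Two processes can use the same virtual address without conflict, since each lookup goes through the page table of the process that issued it.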
The major role of the page table is to store address translations for each of the virtual pages of the address space, thus letting us know where in physical memory they live.

When a process tries to access a memory address, for example to read or write a variable, the virtual address must be translated into a physical memory address. This is the task of the Memory Management Unit (MMU), and it takes 3 steps:
1. the page table of the process is found thanks to the Page Map Address Register (PMAR);
2. the virtual address is split between page number (20 bits) and offset (12 bits);
3. the physical start address of the page is found in the page table, and the offset is added.

Figure 11.3: Memory address translation

Figure 11.4: Page number and offset

To avoid wasting available memory, shared libraries are handled in a special way: for all processes, they are mapped to the same physical memory addresses. This way, shared libraries do not need to be duplicated in memory. This is also the case for some OS memory areas that must be accessible by user processes. This is illustrated in Figure 11.5.

Figure 11.5: Shared libraries

11.3 Interrupts

The x86 has 4 protection levels, numbered 0 (most privileged) to 3 (least privileged). In practice, most operating systems use only 2 levels: 0 and 3, which are then called kernel mode and user mode, respectively. The current privilege level with which the x86 executes instructions is stored in the %cs register, in the field Current Privilege Level (CPL).

For example, to make a system call on the x86 architecture (see below), a program invokes the int n instruction, where n specifies the index of the interrupt in the interrupt descriptor table (IDT).

When a software interrupt [22] occurs, the x86 CPU first saves the current context on the kernel stack.
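On x86, the CPL field mentioned above occupies the two low-order bits of the %cs selector, so it can be extracted with a simple mask. The following is a minimal sketch; the selector values are only illustrative (0x10 and 0x33 are values commonly seen for kernel and user code on 64-bit systems) and the function name is hypothetical.

```python
def current_privilege_level(cs: int) -> int:
    """Return the CPL encoded in the two low-order bits of a %cs value."""
    return cs & 0b11


def mode_name(cs: int) -> str:
    """Name the execution mode, for systems that only use levels 0 and 3."""
    cpl = current_privilege_level(cs)
    if cpl == 0:
        return "kernel mode"
    if cpl == 3:
        return "user mode"
    return f"privilege level {cpl}"


if __name__ == "__main__":
    # Illustrative selector values, not taken from a specific system.
    for cs in (0x10, 0x33):
        print(hex(cs), mode_name(cs))
```

When registers are recovered from a memory dump, this kind of check quickly tells whether a saved context belonged to user code or to kernel code.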
Figure 11.6: Interrupt table

In the original 8086 processor (and all x86 processors in Real Mode), the Interrupt Vector Table (IVT) controlled the flow into an Interrupt Service Routine (ISR). The IVT started at memory address 0x00, and could go as high as 0x3FF, for a maximum of 256 ISRs (ranging from interrupt 0 to 255). Each entry in the IVT contained 2 words of data: a value for the Instruction Pointer (IP) and a value for the Code Segment (CS) (in that order).

For example, let’s say that we have the following interrupt:

int 14h

When we trigger the interrupt, the processor goes to entry number 20 in the IVT (14h = 20). As each entry is 4 bytes long, this entry is located at address 20 × 4 = 0x50. The processor then loads the instruction pointer and the code segment (%eip and %cs) from this entry. This modifies the privilege level (if required) and starts the execution of the interrupt handler. At the end of the interrupt handler, the values saved during the int instruction are popped from the stack, and the CPU resumes execution at the saved %eip.

Since the 286 (and extended on the 386), interrupts may be managed by a table in memory called the Interrupt Descriptor Table (IDT). The IDT only comes into play when the processor is in protected mode. Much like the IVT, the IDT contains a list of pointers to the ISRs. The following assembly structure represents an IDT entry:

struc idt_entry_struct
    base_low:  resw 1   ; lower 16 bits of the ISR address
    sel:       resw 1   ; code segment selector of the ISR
    always0:   resb 1   ; not used, always zero
    flags:     resb 1   ; DPL and other flags
    base_high: resw 1   ; upper 16 bits of the ISR address
endstruc

where DPL is the Descriptor Privilege Level.

IVT and IDT are installed in memory by the OS during the boot process.

11.4 System calls

As we have seen in the previous Section, most modern processors define two different privilege levels under which software may be executed: a user-mode program is hence limited to its own address space so that it cannot access or modify other running programs or the operating system itself, and is usually prevented from directly manipulating hardware devices (e.g.
the frame buffer or network devices).

However, many normal applications obviously need access to these components. The operating system (kernel), which executes at the highest privilege level, therefore defines system calls to provide well-defined, safe implementations of such operations. System calls [23] can thus roughly be defined as functions that:
• are implemented in the operating system,
• are executed in kernel mode but
• can be executed by user programs.

Implementing system calls requires a control transfer from user space to kernel space, which involves some sort of architecture-specific feature. A typical way to implement this is to use a software interrupt or trap. Interrupts transfer control to the operating system kernel, so software simply needs to set up some register with the number of the system call needed, and execute the software interrupt.

This is the only technique provided for many RISC processors, but CISC architectures such as x86 support additional techniques. For example, the x86 instruction set contains the instructions SYSCALL/SYSRET and SYSENTER/SYSEXIT (these two mechanisms were independently created by AMD and Intel, respectively, but in essence they do the same thing). These are “fast” control transfer instructions that are designed to quickly transfer control to the kernel for a system call without the overhead of an interrupt.

11.5 Chapter review

At the end of this Chapter, you should be able to answer the following questions:
• Explain the address space of a single process
• Explain how and why we need memory virtualization and paging
• Explain how interrupts work
• Explain what a system call is, and how it can be implemented using interrupts

Chapter 12 Windows memory

In this Chapter, we will cover in a little more detail the internals of Windows, and how this OS manages memory and processes.
12.1 Windows architecture

Like many other Operating Systems, Windows uses only 2 of the protection levels available on the x86 architecture: user mode and kernel mode. Windows also has a monolithic kernel: the bulk of the OS and device driver code shares the same kernel-mode protected memory space. This means that any OS component or device driver can potentially corrupt data used by other OS components (intentionally or not). However, Microsoft tries to address this by strengthening and validating the components that can be loaded in protected memory. For example, Windows device drivers must be signed by Microsoft before they can be loaded.

The simplified architecture of Windows is represented in the Figure below.

Figure 12.1: Simplified Windows Architecture

The Hyper-V hypervisor at the bottom is a feature introduced with Windows 8. This component runs with the same CPU privilege (0) as the rest of the kernel, but because it uses specialized CPU virtualization instructions (VT-x on Intel and SVM on AMD), it can isolate itself from the rest of the kernel, while also monitoring it.

Service processes are processes that host Windows services such as the Task Scheduler or the Print Spooler.

System processes are specific system processes like the logon process and the Session Manager.

Environment subsystems make it possible to support different OS environments (or personalities) presented to the user and programmer. This feature is for example responsible for the Windows Subsystem for Linux (WSL).

The subsystem DLLs are an abstraction layer between the API presented to (and used by) application developers, and the internal native calls implemented mostly in Ntdll.dll. The subsystem DLLs are well documented and stable, while the Ntdll.dll calls are not documented and subject to regular changes...
The device drivers include hardware drivers that translate user I/O function calls into specific hardware device I/O requests, and non-hardware device drivers, such as file system and network protocol implementations.

The Hardware Abstraction Layer (HAL) isolates the kernel and device drivers from platform-specific hardware differences (like differences between motherboards).

12.1.1 Environment subsystems

The role of an environment subsystem is to expose some subset of the base Windows executive system services to applications. Each subsystem can provide access to different subsets of the native services and system calls in Windows.

Each executable image (.exe) is bound to only one subsystem. When the image is run, the process creation code examines the subsystem type field in the image header so that it can notify the proper subsystem of the new process.

When Windows boots, the subsystems are started by the Session Manager (Smss.exe) process. The subsystem startup information is stored under the registry key HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems. The Required value lists the subsystems that load when the system boots. The Windows value contains the file specification of the (classical) Windows subsystem: csrss.exe, which stands for Client/Server Runtime Subsystem.

12.2 Chapter review

At the end of this Chapter, you should be able to answer the following questions:
• Explain the architecture of Windows, and the role of the different components.

Chapter 13
Memory acquisition

Chapter 14
Memory analysis

Chapter 15
Linux memory forensics

Part IV
Mobile device forensics

Part V
Network forensics

A lot of our digital activity involves network usage.
Hence part of a digital forensics investigation involves the analysis of network traffic. Compared to a Windows forensics investigation, network forensics analysis presents 2 additional challenges:

1. Network forensics requires preparation: unlike Windows, a router or a switch does not store logs or statistics about transmitted network traffic. Hence a logging or recording infrastructure must be put in place before the incident actually happens. Otherwise there is simply nothing to analyze...

2. The amount of data to store and process may become huge. For example, a single 1 Gb/s network link used at 100% will generate daily:

\[ \frac{1 \times 10^{9}\ \text{bit/s} \times 3600 \times 24}{8 \times 10^{9}} = 10800\ \text{GB/day} \]

That is roughly 10 TB of data, for a single link (and in a single direction).

This means that if a forensics investigation must be performed over full packet capture network data, the investigator will need large enough storage space, and appropriate tools to process this large amount of data.

Wireshark is probably the most used tool to analyze network captures (and the easiest). However, it is only able to open relatively small files (roughly 500 MB maximum). Hence the first step of an investigation will be to use command line tools to pre-process the data.

So in the next Chapters we will first review some command line tricks and commands that will be useful for network traffic analysis. Then we will review the main network protocols that you may encounter. And in Chapter 18 we will show how tcpdump can be used to pre-process the data.

Chapter 16
Command line tricks

Analyzing network traffic requires using and combining command line tools! This Chapter contains a lot of exercises.
To do the exercises yourself, you will need:
• a Linux system
• the file lorem.txt

cat
Reads the content of a (text) file
$ cat lorem.txt

wc
Counts the number of lines, words and characters of a file
$ wc lorem.txt
39 1939 13222 lorem.txt
To show only the number of lines:
$ wc -l lorem.txt

piping
Redirects the output of one command to the input of another
$ cat lorem.txt | wc

Exercise
Show only the number of lines in lorem.txt
Solution
$ cat lorem.txt | wc -l

cut
Extracts the given field(s) from each line.
$ cut -d <delimiter> -f <field(s)>
$ cut -d ' ' -f 2,4-6

Exercise
In lorem.txt, show only the first word of each line.
Solution
$ cat lorem.txt | cut -d ' ' -f 1

sort
Sorts the lines.
For numeric values:
$ sort -n
For human readable values (1M, 1G):
$ sort -h

uniq
Filters out adjacent identical lines (used for log deduplication)
$ cat uniq.txt
line a
line a
line b
line a
line b
$ cat uniq.txt | uniq
line a
line b
line a
line b
To show the count:
$ uniq -c
$ cat uniq.txt | uniq -c
2 line a
1 line b
1 line a
1 line b
Often used together with sort:
$ cat uniq.txt | sort | uniq -c
3 line a
2 line b
or
$ cat uniq.txt | sort | uniq -c | sort -n
2 line b
3 line a

Exercise
In lorem.txt, what is the first word of each line that appears most often?
Solution
$ cat lorem.txt | cut -d ' ' -f 1 | sort | uniq -c \
  | sort -n

Only show repeated lines:
$ uniq -d

grep
Searches for a string or pattern in each line
E.g.
$ grep USB /var/log/syslog

Defining a pattern
Quantifiers:
• * : 0 or more
• ? : 0 or 1
• + : at least 1
E.g. Lo+rem matches Lorem and Loorem and Looorem and ...
$ grep Lo+rem lorem.txt

Classes represent different possible characters:
• .
represents any character (1 instance)
$ grep "L.rem" lorem.txt
• \s : whitespace
• \S : non-whitespace
• \d : digit
• \D : non-digit

Combining classes and quantifiers
E.g. .* represents 0 or more instances of any character
$ grep "L.*m" lorem.txt

Some special characters must be escaped with \ :
• \/
• \\
• \*
• \^
• \%

Exercise
Which pattern will match Lorem ipsum ?
a. Lorem\sipsum
b. Lorem\Sipsum
c. Lorem*ipsum

Exercise
Which (simplified) pattern can be used to match a URL ?
a. https?:\/\/\s*
b. http.://\s*
c. https?:\/\/\S*
d. https?://\S*
Solution
https?:\/\/\S*
BUT: https://mathiasbynens.be/demo/url-regex

Figure 16.1: regex101.com lets you test your patterns

Case insensitive:
$ grep -i
Count the number of matching lines:
$ grep -c
Invert match (show lines NOT matching the pattern):
$ grep -v
Show context (print N lines before and after each match):
$ grep -C <N>

Exercise
Info regarding your CPU is kept in /proc/cpuinfo
processor  : 0
vendor_id  : GenuineIntel
cpu family : 6
model      : 158
model name : Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
...
How would you automatically (in a script) compute the number of vcores ?
Solution
$ grep -c "processor" /proc/cpuinfo

!! Some regex meta-characters are not supported by grep by default:
• + : 1 or more
• ? : 0 or 1
• ( and ) : group
• | : or
• {} : range specifier, e.g. a{4}
To use them: grep -E or egrep

Exercise
Compare the result of
• grep Lo+rem lorem.txt and
• grep -E Lo+rem lorem.txt

zgrep
Allows using grep on compressed files
$ zgrep "UFW BLOCK" /var/log/syslog.2.gz

head & tail
• head shows the first 10 lines
• tail shows the last 10 lines
$ head lorem.txt
$ tail -2 lorem.txt

tail
To keep showing appended data as the file grows (follow):
$ tail -f /var/log/syslog

time
Runs a command and shows resource usage (running time)
$ time <my heavy command>
$ time grep usb /var/log/syslog
By default, shows:
• real : wall clock execution time
• user : CPU time spent in user mode
• sys : CPU time spent in kernel mode
Many other values are available (check man time for the list)

Chapter 17
Network protocols

Before we start using tcpdump to perform actual network traffic analysis, we give here a quick review of the main network protocols we will encounter. For each protocol we quickly list and explain the main header fields. We also use the offset notation, because this notation is also used by tcpdump and Berkeley Packet Filters (BPF). For example, ip[9] is the 10th Byte of the IP header (offset of 9 Bytes).

17.1 IPv4

Internet Protocol version 4 [24] is the protocol that defines and enables internetworking at the internet layer of the Internet Protocol Suite. It uses a logical addressing system and performs routing, which is the forwarding of packets from a source host to the next router that is one hop closer to the intended destination host on another network.

Figure 17.1: IPv4 Header

• IP Header Length:
  – expressed in words of 4 Bytes
  – minimum : 5
• TOS / Differentiated Services Byte:
  – 6 bits Differentiated Services
  – 2 bits Explicit Congestion Notification (ECN)
• Total length:
  – in Bytes
• Fragmentation:
  – X : Reserved (evil bit)
  – D : Do not fragment
  – M : More fragments
  – Offset : position of this fragment in the original packet (in units of 8 Bytes)

Protocol ip[9]:
• 1 : ICMP
• 2 : IGMP
• 6 : TCP
• 17 : UDP
• 41 : IPv6
• 47 : GRE

17.2 TCP

The Transmission Control Protocol [25] provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network.

Figure 17.2: TCP Header

Common ports:
• 20 : ftp-data
• 21 : ftp
• 22 : SSH
• 23 : telnet
• 25 : smtp
• 43 : whois
• 53 : dns
• 80 : http
• 110 : pop3
• 143 : IMAP
• 443 : https
• 1433 : MS SQL
• 3128 : Squid HTTP Proxy
• 3306 : MySQL

Sequence number:
• random initial value
• incremented by the size of the data sent (in Bytes)

ACK number:
• next expected sequence number

HL : Header Length:
• expressed in words of 4 Bytes
• minimum : 5

R : Reserved

Flags tcp[13]:
0x       8   4   2   1     8   4   2   1
tcp[13]: CWR ECE URG ACK   PSH RST SYN FIN

CWR and ECE are used for Explicit Congestion Notification (ECN).

17.3 ICMP

The Internet Control Message Protocol [26] is a supporting protocol in the Internet protocol suite. It is used by network devices, including routers, to send error messages and operational information indicating success or failure when communicating with another IP address. For example, an error is indicated when a requested service is not available or when a host or router could not be reached.

Figure 17.3: ICMP

The main type and code combinations are listed in the table below:

Type  Code  Description
0     0     Echo reply
3     0     Network unreachable
3     1     Host unreachable
3     2     Protocol unreachable
3     3     Port unreachable
5     0     Network redirect
8     0     Echo request
11    0     TTL exceeded

17.4 UDP

The User Datagram Protocol [27] is one of the core communication protocols of the Internet protocol suite used to send messages (transported as datagrams in packets) to other hosts on an Internet Protocol (IP) network. Unlike TCP, UDP uses a simple connectionless communication model with a minimum of protocol mechanisms. UDP provides checksums for data integrity, and port numbers for addressing different functions at the source and destination of the datagram. It has no handshaking dialogues and thus exposes the user’s program to any unreliability of the underlying network; there is no guarantee of delivery, ordering, or duplicate protection.
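To make the offset notation concrete for UDP, here is a minimal Python sketch (not part of the course material) that decodes the 8-Byte UDP header: four 16-bit big-endian fields holding the source port, destination port, length and checksum. The function name parse_udp_header and the sample values are assumptions chosen for illustration only.

```python
import struct

UDP_HEADER_LEN = 8  # the UDP header is always 8 Bytes long


def parse_udp_header(datagram: bytes) -> dict:
    """Decode the four 16-bit big-endian fields of a UDP header."""
    if len(datagram) < UDP_HEADER_LEN:
        raise ValueError("a UDP datagram is at least 8 Bytes long")
    src_port, dst_port, length, checksum = struct.unpack(
        "!HHHH", datagram[:UDP_HEADER_LEN]
    )
    return {
        "src_port": src_port,  # offset 0, i.e. udp[0:2]
        "dst_port": dst_port,  # offset 2, i.e. udp[2:2]
        "length": length,      # offset 4, header + payload, in Bytes
        "checksum": checksum,  # offset 6
    }


# Hand-built datagram (hypothetical values): a 4-Byte payload sent to port 53.
sample = struct.pack("!HHHH", 50000, 53, UDP_HEADER_LEN + 4, 0) + b"test"
header = parse_udp_header(sample)
print(header)
```

The comments map each field to the offset notation used in this Chapter, so a filter on the destination port would test the 2 Bytes starting at offset 2 of the UDP header.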
Figure 17.4: UDP Header

Here are the port numbers used by some protocols that run on top of UDP:
• 67 and 68 : DHCP
• 123 : NTP
• 137 and 138 : netbios
• 161 and 162 : snmp
• 514 : syslog

17.5 ARP

The Address Resolution Protocol [28] is a communication protocol used for discovering the link layer address, such as a MAC address, associated with a given internet layer address, typically an IPv4 address.

Figure 17.5: ARP

Type:
• 1 : Ethernet
• 0x0800 : IPv4

Opcode:
• 1 : request
• 2 : response

Address size:
• IPv4 : 4 Bytes
• Ethernet : 6 Bytes

Chapter 18
tcpdump

• Captures and analyzes network traffic
• Suitable for LARGE captures (unlike wireshark)

18.1 Requirements

This Chapter contains a lot of exercises to get you familiar with tcpdump. We strongly encourage you to do the exercises by yourself. To do so, you will need a Linux system, with tcpdump installed, and the following exercise files:
• treasurehunt_fw_eth1.pcap : https://cylab.be/s/uHlYB
• lotsofweb.pcap : https://cylab.be/s/2FnZC

18.2 Packet analysis

tcpdump -r <file>

Do NOT perform address conversion (DNS):
tcpdump -n -r <file>
• speeds up processing
• avoids detection by the attacker

On some systems, you have to run tcpdump as superuser:
sudo tcpdump ...

Exercise
What is the wall clock time required to analyze the file lotsofweb.pcap:
1. with address conversion?
2. without address conversion?
Solution
1. time tcpdump -r lotsofweb.pcap
2. time tcpdump -n -r lotsofweb.pcap

18.2.1 Display options

Show only the n first packets:
tcpdump -r <file> -c <n>
tcpdump -n -r lotsofweb.pcap -c 12 | tail -1

Figure 18.1: Default tcpdump display format

By default:
• packet time (no date) in local (capture) TZ
• L3 protocol
• source IP.port > destination IP.port

For TCP:
• flags : . = ACK, S = SYN, R = RST, ...
• relative sequence number : computed next sequence
• relative ack
• window, length
• L7 protocol

Exercise
In lotsofweb.pcap :
1. make the list of source IP addresses
2. which IP address sent most packets?
Solution
1. $ tcpdump -n -r lotsofweb.pcap | cut -d ' ' -f 3 \
   | cut -d '.' -f 1-4 | sort | uniq
2. $ tcpdump -n -r lotsofweb.pcap | cut -d ' ' -f 3 \
   | cut -d '.' -f 1-4 | sort | uniq -c | sort -n

18.2.2 Output formatting

Show the L2 header (including Ethernet MAC addresses):
$ tcpdump -e

Figure 18.2: TCPDUMP showing L2 headers

Exercise
Using the 12th packet of lotsofweb.pcap,
1. what is the MAC address of the default gateway?
2. what is the manufacturer of this router?
Solution
1. $ tcpdump -r lotsofweb.pcap -e -c 12
2. check https://macvendors.com or:
   $ curl https://api.macvendors.com/`tcpdump -n \
     -r lotsofweb.pcap -c 12 -e | tail -1 \
     | cut -d ' ' -f 2`

Timestamps
• -t : no timestamps
• -tt : unix timestamps
• -ttt : delta with previous packet
• -tttt : date and time in local TZ
• -ttttt : relative to first packet

Exercise
At what date and time (UTC/GMT) was the first packet received ?
Solution
$ tcpdump -n -r lotsofweb.pcap -c 1 -tt
Then convert the unix timestamp, for example on https://www.epochconverter.com/ or:
$ TZ="UTC" date -d @`tcpdump -n -r lotsofweb.pcap \
  -c 1 -tt | cut -d ' ' -f 1`

Sequence numbers
-S : show real (absolute) sequence numbers

Exercise
In lotsofweb.pcap, what is the real sequence number of the 20th packet ?
Solution
$ tcpdump -n -r lotsofweb.pcap -c 20 -S | tail -1

Exercise
In lotsofweb.pcap, what is the most often appearing sequence number ?
Solution
$ tcpdump -n -r lotsofweb.pcap -S | cut -d ' ' -f 9 \
  | cut -d ':' -f 1 | cut -d ',' -f 1 | sort -n | uniq -c | sort -n
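The counting done by the last pipeline can also be sketched in Python, which may be easier to extend once the extraction logic grows. This is only an illustration under stated assumptions: the sample lines are made up to imitate tcpdump -n -S output (they are not taken from lotsofweb.pcap), the helper name count_sequence_numbers is hypothetical, and the sketch looks for the "seq" keyword with a regular expression instead of relying on field positions as cut does.

```python
import re
from collections import Counter

# Matches "seq 1000" as well as "seq 1001:1101" and captures the first number.
SEQ_RE = re.compile(r"\bseq (\d+)")


def count_sequence_numbers(lines):
    """Return a Counter mapping each starting sequence number to its frequency."""
    counts = Counter()
    for line in lines:
        match = SEQ_RE.search(line)
        if match:  # pure ACK packets have no "seq" field and are skipped
            counts[int(match.group(1))] += 1
    return counts


# Made-up lines imitating `tcpdump -n -S` output (assumption, for illustration).
SAMPLE = [
    "10:00:00.000001 IP 10.0.0.1.40000 > 10.0.0.2.80: Flags [S], seq 1000, win 64240, length 0",
    "10:00:00.000002 IP 10.0.0.2.80 > 10.0.0.1.40000: Flags [S.], seq 5000, ack 1001, win 65160, length 0",
    "10:00:00.000003 IP 10.0.0.1.40000 > 10.0.0.2.80: Flags [.], ack 5001, win 502, length 0",
    "10:00:00.000004 IP 10.0.0.1.40000 > 10.0.0.2.80: Flags [P.], seq 1001:1101, ack 5001, win 502, length 100",
    "10:00:00.000005 IP 10.0.0.1.40000 > 10.0.0.2.80: Flags [P.], seq 1001:1101, ack 5001, win 502, length 100",
]

counts = count_sequence_numbers(SAMPLE)
print(counts.most_common(1))
```

Unlike the cut-based pipeline, this variant ignores packets that carry only an acknowledgement number, so the two approaches may report different counts on a real capture.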
18.2.3 Anomaly detection

Example: 3-sigma rule of thumb
• compute average value mu
• compute standard deviation sigma
• trigger an alert if value > mu + 3 sigma

Why ?
• if the value is distributed according to a normal distribution,
• 99.7% of values lie within mu ± 3 sigma
• 99.993% of values lie within mu ± 4 sigma

Figure 18.3: Empirical detection rule

Figure 18.4: This is sometimes called AI powered security

Figure 18.5: Azure Sentinel

18.2.4 Output formatting : packet inspection

• -v : more details, like IP fragmentation (+), TTL and options (but multiline)
• -vv : even more details
• -vvv : more (too many) details

Figure 18.6: Analysis of fragmented packets with TCPDUMP

Packet payload
• -x : hexadecimal
• -xx : hexadecimal, with L2 (ethernet) headers
• -X : hexadecimal and ASCII
• -XX : hexadecimal and ASCII, with L2 headers

18.2.5 Filtering

• tcpdump filtering relies on libpcap and Berkeley Packet Filters (BPF)
• also used by some other tools
• Byte level (limited protocol analysis)

basic format: protocol[offset:length]

filter examples:
• $ tcpdump -n -r lotsofweb.pcap 'ip[9] = 1' : IP protocol 1 (ICMP)
• $ tcpdump -n -r lotsofweb.pcap 'tcp[2:2] = 80' : destination port 80

operators
• = != > >= ...
• and or not

Exercise
In treasurehunt_fw_eth1.pcap
• count the number of icmp packets
• which IP sent most icmp packets?
Solution
• $ tcpdump -n -r treasurehunt_fw_eth1.pcap 'ip[9] = 1' | wc -l
• $ tcpdump -n -r treasurehunt_fw_eth1.pcap 'ip[9] = 1' | cut -d ' ' -f 3 | sort | uniq -c | sort -n

Syntactic sugar (macros)
• port 22 : destination or source port 22
• src host 172.16.16.128
• src net not 172.16

Available macros:
• host
• src host
• dst host
• net
• src net
• dst net
• port
• src port
• dst port
• icmp
• tcp
• udp

Exercise
In treasurehunt_fw_eth1.pcap
• count the number of echo request (ping) packets
Solution
$ tcpdump -n -r treasurehunt_fw_eth1.pcap 'icmp[0] = 8 and icmp[1] = 0' | wc -l

Exercise
In treasurehunt_fw_eth1.pcap, how many echo request packets are coming from a public (external) IP?
Solution
$ tcpdump -n -r treasurehunt_fw_eth1.pcap 'icmp[0] = 8 and icmp[1] = 0 and src net not 192.168' | wc -l

Exercise
In treasurehunt_fw_eth1.pcap, which IP sent most echo request packets?
Solution
$ tcpdump -n -r treasurehunt_fw_eth1.pcap 'icmp[0] = 8 and icmp[1] = 0' | cut -d ' ' -f 3 | sort | uniq -c | sort -n

Write filtered packets to another file:
-w <file>
Useful for analysis with Wireshark!

Exercise
In treasurehunt_fw_eth1.pcap:
• how many packets are DNS queries?
• extract these packets to a pcap file
• open your pcap file with wireshark to check your answer
Solution
$ tcpdump -n -r treasurehunt_fw_eth1.pcap 'udp and dst port 53' -w dns.pcap

18.2.6 Filtering : bitmasks

What if the field is shorter than a Byte ? E.g. IP header length (IHL), TCP flags, etc.
1. use a bitmask to set irrelevant bits to 0
2. test the resulting value

E.g. find packets that have IP options, i.e. packets where IHL is > 5.
ip[0] contains :
• 4-bit IP version
• 4-bit IHL

ip[0]   : ? ? ? ? a b c d
bitmask : 0 0 0 0 1 1 1 1 (0x0f)
result  : 0 0 0 0 a b c d
test    : > 5

Final filter : (ip[0] & 0x0f) > 5

Exercise
In lotsofweb.pcap :
• how many packets have IP options ?
• extract these packets to a new pcap
• open your pcap with wireshark to check which option is set
Solution
$ tcpdump -n -r lotsofweb.pcap '(ip[0] & 0x0f) > 5' -w ip_options.pcap

Filtering TCP flags

0x        8   4   2   1     8   4   2   1
tcp[13] : CWR ECE URG ACK   PSH RST SYN FIN

ONLY SYN flag
tcp[13] = 0x02

AT LEAST SYN flag
tcp[13] : CWR ECE URG ACK   PSH RST SYN FIN
mask    : 0   0   0   0     0   0   1   0
result  : 0   0   0   0     0   0   SYN 0
test    : 0   0   0   0     0   0   1   0
(tcp[13] & 0x02) = 0x02

Exercise
• What is the filter to check that only (SYN and ACK) are set (connection accepted)?
• What is the filter to check that at least (SYN and ACK) are set?
Solution
• tcp[13] = 0x12
• (tcp[13] & 0x12) = 0x12

Exercise
In treasurehunt_fw_eth1.pcap
• How many packets have at least (SYN and ACK) set ?
• Establish the list of listening services (IP:port) on the internal network.
Solution
• $ tcpdump -n -r treasurehunt_fw_eth1.pcap 'src net 192 and ((tcp[13] & 0x12) = 0x12)' | wc -l
• $ tcpdump -n -r treasurehunt_fw_eth1.pcap 'src net 192 and ((tcp[13] & 0x12) = 0x12)' | cut -d ' ' -f 3 | sort -u

Exercise
What is the filter to check if the IP More Fragments (MF) bit is set?
Solution
(ip[6] & 0x20) = 0x20

18.3 Packet capture

Figure 18.7: https://securityintelligence.com/is-full-packet-capture-worth-the-investment/

Figure 18.8: https://www.sans.org/reading-room/whitepapers/forensics/implementing-full-packet-capture-37392

Figure 18.9: TCPDUMP server connected to a mirroring port

List available network interfaces:
$ tcpdump -D

Capture and display on screen:
$ sudo tcpdump -i <interface>
or
$ sudo tcpdump -i <interface> <filter>

Write packets to file:
$ sudo tcpdump -i <interface> -w <file.pcap>
To stop, hit ctrl + c

Rotating capture files:
-G <seconds> : write to a new file after the specified time
Attention: the filename must contain a time format:
-w trace-%Y%m%d.%H%M%S.pcap
Can be combined with
-C <size> : limit files to the specified size (in MB)

Figure 18.10: tcpdump rotate capture

Exercise
On your machine,
• find your main network interface
• record packets on this interface, rotating every 5 seconds
Solution
$ sudo tcpdump -i <interface> \
  -G 5 \
  -w trace-%Y%m%d.%H%M%S.pcap

18.4 Chapter review

Chapter 19
Wireshark

(p.m.)

Part VI
Cloud forensics

Chapter 20
Why cloud forensics ?

Part VII
Final words

Chapter 21
Practical considerations

21.1 Preserving evidence

When digital forensics is involved, it means something pretty bad has happened. It can be a corporate investigation in case of employee misconduct or stolen data. It can also be a criminal investigation. In either case, the analyst must absolutely preserve the evidence, so that the analysis can later be cross-checked if needed.

In this Section, we will cover some practical considerations to keep in mind before and during the investigation:
1. the order of volatility;
2. using write blockers;
3. checking hashes;
4. preserving the chain of custody.

21.1.1 Order of volatility

21.1.2 Write blockers

When you plug a disk (like a USB stick, SSD or hard drive) into a computer, most Operating Systems will automatically read and modify some files on the device. This obviously alters the evidence, even before the analyst has a chance to create an image. To avoid this, a write blocker should always be used!
A forensic disk controller or hardware write-blocker [29] is a specialized type of computer hard disk controller made for the purpose of gaining read-only access to computer hard drives without the risk of damaging the drive’s contents.

Figure 21.1: A SATA write blocker from Tableau. The write blocker can be plugged to the host computer using either USB or FireWire (on the left).

Forensic disk controllers intercept write commands from the host operating system, preventing them from reaching the drive. Whenever the host bus architecture supports it, the controller reports that the drive is read-only. The disk controller can either deny all writes to the disk and report them as failures, or use on-board memory to cache the writes for the duration of the session.

A disk controller that caches writes in memory presents the appearance to the operating system that the drive is writable, and uses the memory to ensure that the operating system sees changes to the individual disk sectors it attempted to overwrite. It does this by retrieving sectors from the disk if the operating system hasn’t attempted to change them, and retrieving the changed version from memory for sectors that have been changed.

21.1.3 Hash checking

21.1.4 Chain of custody

21.2 Time and time zones

References

[1] “Digital forensics - wikipedia.” https://en.wikipedia.org/wiki/Digital_forensics.
[2] “Intelligence analysis - wikipedia.” https://en.wikipedia.org/wiki/Intelligence_analysis.
[3] “The differences between data, information, and intelligence - united states cybersecurity magazine.” https://www.uscybersecurity.net/csmag/the-differences-between-data-information-and-intelligence/.
[4] “SIFT workstation | SANS institute.” https://www.sans.org/tools/sift-workstation/.
[5] R. H. Arpaci-Dusseau and A. C. Arpaci-Dusseau, Operating Systems: Three Easy Pieces, 1.00 ed. Arpaci-Dusseau Books, 2018.
[6] “Disk sector - wikipedia.” https://en.wikipedia.org/wiki/Disk_sector.
[7] “ext2 - wikipedia.” https://en.wikipedia.org/wiki/Ext2.
[8] “The second extended file system.” https://www.nongnu.org/ext2-doc/ext2.html.
[9] “FTK® imager - exterro.” https://www.exterro.com/ftk-imager.
[10] “Our software | ASR data.” http://www.asrdata.com/?page_id=205.
[11] “Mactime - SleuthKitWiki.” https://wiki.sleuthkit.org/index.php?title=Mactime.
[12] “BitLocker - wikipedia.” https://en.wikipedia.org/wiki/BitLocker.
[13] “BitLocker overview - windows security | microsoft learn.” https://learn.microsoft.com/en-us/windows/security/operating-system-security/data-protection/bitlocker/.
[14] “Windows registry - wikipedia.” https://en.wikipedia.org/wiki/Windows_Registry.
[15] “Registry: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet.” https://renenyffenegger.ch/notes/Windows/registry/tree/HKEY_LOCAL_MACHINE/System/CurrentControlSet/.
[16] “Security identifiers | microsoft learn.” https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/manage/understand-security-identifiers.
[17] “Security identifier - wikipedia.” https://en.wikipedia.org/wiki/Security_Identifier.
[18] “Windows forensic analysis | SANS poster.” https://www.sans.org/posters/windows-forensic-analysis/.
[19] “Updates to the RecentDocs key in windows 10 – forensic 4:cast.” https://forensic4cast.com/2019/03/the-recentdocs-key-in-windows-10/.
[20] “Windows ShellBag forensics in depth.” https://www.giac.org/paper/gcfa/9576/windows-shellbag-forensics-in-depth/128522.
[21] “Libyal/libevtx: Library and tools to access the windows XML event log (EVTX) format.” https://github.com/libyal/libevtx/.
[22] “x86 assembly/advanced interrupts - wikibooks, open books for an open world.” https://en.wikibooks.org/wiki/X86_Assembly/Advanced_Interrupts.
[23] “System call - wikipedia.” https://en.wikipedia.org/wiki/System_call.
[24] “Internet protocol version 4 - wikipedia.” https://en.wikipedia.org/wiki/Internet_Protocol_version_4.
[25] “Transmission control protocol - wikipedia.” https://en.wikipedia.org/wiki/Transmission_Control_Protocol.
[26] “Internet control message protocol - wikipedia.” https://en.wikipedia.org/wiki/Internet_Control_Message_Protocol.
[27] “User datagram protocol - wikipedia.” https://en.wikipedia.org/wiki/User_Datagram_Protocol.
[28] “Address resolution protocol - wikipedia.” https://en.wikipedia.org/wiki/Address_Resolution_Protocol.
[29] “Forensic disk controller - wikipedia.” https://en.wikipedia.org/wiki/Forensic_disk_controller.