A Technique for Measuring Data Persistence Using the Ext4 File System Journal¹

Kevin D. Fairbanks
Electrical and Computer Engineering Department
United States Naval Academy
Annapolis, MD

Abstract—In this paper, we propose a method of measuring data persistence using the Ext4 journal. Digital Forensic tools and techniques are commonly used to extract data from media. A great deal of research has been dedicated to the recovery of deleted data; however, there is a lack of information on quantifying the chance that an investigator will be successful in this endeavor. To that end, we suggest that the file system journal be used as a source of empirical evidence of data persistence, which can later be used to formulate the probability of recovering deleted data under various conditions. Knowing this probability can help investigators decide where best to invest their resources. We have implemented a proof-of-concept system that interrogates the Ext4 file system journal and logs relevant data. We then detail how this information can be used to track the reuse of data blocks through the examination of file system metadata structures. This preliminary design contributes a novel method of tracking deleted data persistence that can be used to generate the information necessary to formulate probability models regarding the full and/or partial recovery of deleted data.

Keywords-Ext4; File System Forensics; Digital Forensics; Journal; Data Persistence; Data Recovery; Persistence Measurement

I. INTRODUCTION

The field of Digital Forensics regularly requires the extraction of data from media. Once the data has been extracted, different methods of analyzing it may be initiated and may, in fact, lead to further data extraction. From a general standpoint, the targeted data can be split into two categories: allocated and unallocated. Under normal circumstances, it can be assumed that allocated data is always present and persists until it is reclassified as unallocated and/or purposefully overwritten. It is when data is classified as unallocated that the value of its persistence rapidly approaches zero from the standpoint of a normal information system. In this paper, we define persistence as a property of data that depends on a variety of factors, including its allocation status. Within the scope of this paper, we focus on measuring the persistence of unallocated data rather than on the recovery of data.

Depending upon the circumstances, the recovery of unallocated or deleted data can be of the utmost importance. For example, if an employee is suspected of leaking important documents from a company using digital media such as an external hard disk or thumb drive, finding evidence of the data leakage in the form of file fragments or unused directory entries can help determine the scope and breadth of the malicious activity. A less security-oriented example involves the accidental deletion of data by a user. From the user's standpoint, the recovery of this data (e.g., digital pictures, term papers, etc.) is critical. From a data extraction standpoint, the complex nature of modern information systems and the many layers of abstraction that exist between an end user and a storage solution make the successful recovery of all or part of this data depend on many factors of varying importance.
¹ This paper has been accepted and will be published in the Proceedings of the 39th IEEE Computer Society International Conference on Computers,

Currently, it cannot be stated with mathematical certainty that a deleted file possessing a specific set of attributes, such as size and elapsed time since deletion, has a certain chance of full or partial recovery. This uncertainty is due to the complex nature of contemporary storage devices and the many layers of abstraction between a user application and the data being saved. In [1], Fairbanks and Garfinkel posit that data can experience a decay rate and list many factors that can affect this rate. This paper seeks to extend that premise by proposing a method to observe data decay in the Ext4 file system. The proposed method makes use of the file system journal to determine when blocks of data have been overwritten, as an alternative to differential analysis techniques that compare disk images taken before and after a set of changes has occurred, as mentioned in [12]. Once proper measurements of data decay can be consistently taken, experiments to gather empirical data can be designed. Thus, we view the development of the proposed method as a foundational step toward solving the larger problem of quantifying the probability of data persistence after deletion.

II. BACKGROUND

A. Layers of Data Abstraction

Fig. 1: Data Abstraction Layers

As noted in [6], when analyzing digital data, there are multiple layers of abstraction to consider. This is depicted in Fig. 1, which focuses on the analysis of persistent data from storage media rather than memory or network sources. The analysis of application data involves looking into targeted files, understanding the formats associated with those files, and understanding how a particular application or set of applications interprets and manipulates those files. This includes the analysis of JPEG, MP3, and even Sqlite3 files.

While application data is usually what end users interact with the most, applications typically rely on a file system for data storage. File system analysis makes use of the data structures that allow applications to manipulate files. Although the file system layer is primarily concerned with the storage and retrieval of application data, in the process it creates a great deal of metadata about the application data, and even about the file system itself, that can be used for data recovery and analysis.

File systems typically reside in one or more volumes. Volumes are used to organize physical media: they may partition physical media into several areas or combine several physical media devices into one or more logical volumes. Examples of volume organization include RAID and LVM. Analysis at the physical layer often involves the interpretation of bytes, sectors, and/or pages, depending upon the physical medium.

Each layer in this model affects the persistence of data. For example, a word processing application may be used to create and modify a file. From the perspective of the file system, each time a modification takes place, a new updated version of the file may be created and the old one deleted. This process generally leaves the old, deleted data on the physical media until it is overwritten. Beneath the file system, the file data could be striped or duplicated across multiple volumes.
Also, one or more volumes may reside on physical media that inherently makes duplicate copies of data to address errors, or that lengthens the lifetime of the media device by distributing usage across the entire device. The preceding example illustrates the complex nature of data storage and the challenge of measuring data persistence in modern information systems. While data is not truly irrecoverable until it has been overwritten on the physical media, analysis at that level should not be considered trivial and, in certain situations, requires specialized equipment. Our proposed approach works at the file system level of abstraction by observing one of its crash recovery mechanisms, the file system journal, and using it as a vector to detect when data has potentially been overwritten.

B. The Ext4 File System

Fig. 2: File System Layout²

² Appears in [2] and is adapted from [4]

The features and data structures of the Ext4 file system and its predecessors are detailed in [2], [3], and [4]. The purpose of this section is to provide a high-level overview that facilitates comprehension of the forthcoming results. The Ext family of file systems (Ext2, Ext3, and Ext4) generally divides a partition into evenly sized block groups, with the potential exception of the last block group. While each block group has a bitmap for data blocks and another for inodes to denote their respective allocation status, only a fraction of the block groups contain backup copies of the file system super block and group descriptors. The file system super block contains metadata about the overall file system, such as the number of free blocks and inodes, the block size, and the number of blocks and inodes per block group. Each block group has a descriptor that contains information such as the allocation status of data blocks and inodes as well as the offsets of important blocks within the block group. The inode table is where the inode data structures are actually stored. An inode is a file system data structure that contains file metadata such as timestamps and, most importantly, the locations of the data blocks associated with a file. Ext2 and Ext3 use a system of direct and indirect block pointers to provide a mapping of the data blocks, while Ext4 employs extents for this purpose.

File names are contained in directories, which are a special type of file. Each filename is associated with an inode number through the use of a directory entry structure. This arrangement makes it possible to associate a single inode with more than one filename. For convenience, Table 2 in the Appendix describes the data structure of a directory entry.
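To make this layout concrete, the following minimal Python sketch walks the classic directory entry format summarized in Table 2 across a raw directory block. It is an illustrative sketch under stated assumptions rather than production tooling: the function name is ours, and a complete parser would also need to handle hash-tree (htree) directories and checksummed entry tails.

    import struct

    def parse_dir_entries(block: bytes):
        """Walk the Ext4 directory entries in one raw directory block.

        Per Table 2, each entry is: inode number (32 bits), record
        length (16 bits), name length (8 bits), file type (8 bits),
        followed by the file name. All values are little endian.
        """
        entries, offset = [], 0
        while offset + 8 <= len(block):
            inode, rec_len, name_len, ftype = struct.unpack_from('<IHBB', block, offset)
            if rec_len < 8:      # malformed record length; stop walking
                break
            name = block[offset + 8:offset + 8 + name_len].decode('utf-8', 'replace')
            entries.append((offset, inode, name, ftype))
            offset += rec_len    # the record length points to the next live entry
        return entries

Because deletion is implemented by enlarging the record length of the preceding entry rather than by zeroing bytes, names of deleted files can linger between live entries; this residue is exactly what the directory block analysis in Section V recovers.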
C. File System Journals

Many current operating systems make use of journaling file systems. This includes Microsoft Windows' use of NTFS [5], Apple's OS X use of HFS+, and many Linux distributions' use of Ext3, Ext4, and/or XFS. File system journals are normally used in situations where a file system may have been unmounted uncleanly, leaving it in an inconsistent state. Events such as power failures, operating system crashes, and the removal of the volume containing the file system before all data has been flushed to it can lead to this inconsistent state. In these situations, the file system journal can be replayed to bring the file system back to a consistent state in less time than it would take to perform a full file system check.

Generally speaking, the use of a journal does not guarantee that all user data can be recovered after a system failure. As its main purpose is to restore consistency, it simply ensures that a file system transaction, such as a set of write operations, has taken place fully or not at all. This property is referred to as atomicity. Although our proposed technique focuses on the Ext4 file system due to its open source nature, in theory it can be generalized and applied to several other file systems. In [2], Ext4 file system structures are examined from a Digital Forensic perspective and comparisons are drawn between Ext4 and its predecessor, Ext3. Although the overall structure of the file system is important, we shall focus on the Ext4 journal.

D. The Ext4 Journal

Fig. 3: Journal Blocks

Fig. 3, taken from [2], summarizes the major block types contained in the Ext4 file system journal. The journal is a fixed-size, reserved area of the disk. Although it does not have to reside on the same device as the file system being journaled, it commonly does. Also, because space for the journal is usually allocated when the file system is created, its blocks are normally contiguous. The Ext4 journal operates in a circular fashion: when the end of the journal area has been reached, new transactions committed to the journal overwrite the data at the beginning of the journal area. The journal uses a journal super block to indicate the start of the journal.

The Ext4 journaling mechanism, the Journal Block Device 2 (JBD2), is a direct extension of the Ext3 JBD [3]. Like the original JBD, JBD2 is not directly tied to Ext4; the fact that other file systems can use it as a journaling mechanism is important to note, as it explains why JBD2 is not file system aware. When JBD2 is given a block of data to journal, it does not know whether that block contains actual data or metadata about a set of files or the file system. The only information that it has, and really needs for recovery situations, is the file system block number to which the journaled block corresponds.

Ext4 groups sets of write operations to the file system into transactions. Each transaction is then passed to JBD2 to be recorded. Every transaction has a unique sequence number that is used in its descriptor, commit, and revoke blocks. The descriptor block marks the beginning of a transaction and contains a list of the file system blocks that are being updated in the transaction. The commit block marks the end of a transaction, indicating that the file system area of the disk has been successfully updated. If a transaction has not been completed when a system crash occurs, a revoke block is created. A revoke block nullifies all uncommitted transactions with sequence numbers less than its own. In Fig. 3, transaction 55 has fully completed: if a system crash occurs before the metadata can be written to the file system area of the volume, the metadata from the journal will be copied to the correct file system blocks when the journal is replayed. Transaction 56, however, is not committed; when the journal is replayed, the revoke block with sequence number 57 will nullify this transaction.
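These block types are easy to recognize when scanning the raw journal, because every JBD2 administrative block begins with the same 12-byte header. The Python sketch below classifies journal blocks by that header; it is a minimal illustration that assumes the standard big-endian JBD2 on-disk encoding, not a complete journal parser.

    import struct

    JBD2_MAGIC = 0xC03B3998      # h_magic shared by all JBD2 administrative blocks
    BLOCK_TYPES = {
        1: 'descriptor',         # starts a transaction; lists file system block numbers
        2: 'commit',             # marks the transaction as fully written
        3: 'superblock v1',
        4: 'superblock v2',
        5: 'revoke',             # nullifies uncommitted transactions
    }

    def classify_journal_block(block: bytes):
        """Return (block type, transaction sequence number) for an
        administrative block, or None for a journaled data block."""
        magic, blocktype, sequence = struct.unpack_from('>III', block, 0)
        if magic != JBD2_MAGIC:
            return None          # journaled copy of a file system block
        return BLOCK_TYPES.get(blocktype, 'unknown'), sequence

Blocks that lack the magic number are the journaled copies of file system blocks themselves; the descriptor block that precedes them supplies the file system block numbers to which they correspond.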
As noted in [3], the Ext4 journal was modified from the Ext3 journal to support both 32-bit and 64-bit file systems. Also, to increase reliability, checksumming was added to the journal, owing both to the importance of the metadata contained within it and to the frequency with which the journal is accessed and written. If this feature is enabled, a checksum is included in each transaction commit block. The addition of transaction checksums makes it possible to detect when blocks are not written to the journal. A benefit of this approach is that while the original JBD used a two-stage commit process, JBD2 can write a complete transaction at once.

Like its predecessor Ext3, Ext4 can use one of three modes of journaling: Journal, Ordered, and Writeback. Journal mode is the safest mode: it writes all file data, file metadata, and file system metadata to the journal area first, and the data is then copied to the actual file system blocks in the target volume. The safety of Journal mode thus compromises performance, as every write to the file system causes two writes. Both Ordered and Writeback mode write only file and file system metadata to the journal area. The major difference between the two modes is that Ordered mode ensures that data is first written to the file system area before the corresponding journal transaction is marked complete, whereas in Writeback mode, file system area writes and journal area writes can be interspersed. Writeback mode maximizes performance, while Ordered mode minimizes the risk of file system corruption when only journaling metadata. Many Linux distributions operate in Ordered mode by default.

III. HYPOTHESIS

Fig. 4: Sample journal monitoring output viewed in an Sqlite3 database browser

We propose continuously monitoring the Ext4 journal as a method of measuring the persistence of data. In particular, we suggest monitoring journal descriptor blocks, as they contain the file system block numbers of the data being written to the journal. Depending on the mode in which the journal is operating, the file system blocks recorded in the journal can be analyzed to reveal further information about data persistence in the Ext4 file system.

To test this hypothesis, a series of Python scripts has been implemented that makes use of the Ext2-4 debugging utilities. The scripts use both the dump and logdump commands of the debugfs program to regularly gather the contents of the file system journal. The contents are then parsed and inserted into an Sqlite3 database, which can be queried offline to gather persistence measurements. Sample output from this system, displayed using an Sqlite3 database browser, is shown in Fig. 4. As our goal is to test the suitability of the Ext4 journal as a mechanism for measuring data persistence, all blocks written to the journal were recorded for analysis.

Because the mechanism that gathers data from the journal runs continuously, the circular nature of the journal does not inhibit the ability to collect persistence measurements. As summarized in Fig. 4, each JBD2 descriptor and commit block pair (denoted by block types 1 and 2, respectively) carries matching, monotonically increasing transaction sequence numbers along with commit timestamps. Fig. 4 also denotes the journal and file system blocks used by each transaction. While the journal blocks will eventually be reused, the transaction sequence numbers and commit times are sufficient to ensure proper ordering when tracking file system block reuse.
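The collection scripts are not reproduced here, but the following sketch illustrates the general shape such a monitor could take. The debugfs invocation ("logdump -a" via the -R request option) is real e2fsprogs functionality; the device path, table layout, polling interval, and output-parsing patterns are our own assumptions and would need to be adapted to the output format of the debugfs version in use.

    import re
    import sqlite3
    import subprocess
    import time

    DEVICE = '/dev/loop0'        # hypothetical measurement target

    def poll_journal(db: sqlite3.Connection) -> None:
        """Dump the whole journal with debugfs and log, for every
        transaction, the file system blocks its descriptor records."""
        out = subprocess.run(['debugfs', '-R', 'logdump -a', DEVICE],
                             capture_output=True, text=True).stdout
        sequence = None
        for line in out.splitlines():
            m = re.search(r'sequence (\d+), type (\d+)', line)
            if m:                # administrative block: remember its sequence number
                sequence = int(m.group(1))
                continue
            m = re.search(r'FS block (\d+) logged at journal block (\d+)', line)
            if m and sequence is not None:
                db.execute('INSERT INTO journal_log VALUES (?, ?, ?, ?)',
                           (time.time(), sequence, int(m.group(1)), int(m.group(2))))
        db.commit()

    db = sqlite3.connect('persistence.db')
    db.execute('CREATE TABLE IF NOT EXISTS journal_log '
               '(ts REAL, sequence INTEGER, fs_block INTEGER, journal_block INTEGER)')
    while True:                  # continuous monitoring defeats journal wrap-around
        poll_journal(db)
        time.sleep(5)            # hypothetical polling interval

Querying journal_log for a single fs_block then returns, in sequence order, every transaction that rewrote that block, which is the raw material for the persistence measurements discussed below.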
IV. EXPERIMENTAL PROCEDURE

A. Test Environment Setup

The experiments for this preliminary research were conducted in an Ubuntu 14.04.1 LTS Linux environment. Version 3.8.0 of the Linux kernel was used throughout the experiments, and version 1.42.9 of e2fsprogs (the Ext2, Ext3, and Ext4 file system utilities) was used to confirm our analysis of the Ext4 journal. All data gathered from the journal was also verified using a hex editor to ensure the proper functioning of the data collection scripts.

In order to obtain consistent results, a 10 GB file was created using the dd command. This file was formatted using the mkfs.ext4 tool with all of its default behaviors and then mounted as a loop device. We believe this setup is ideal for determining the validity of a persistence measurement technique. If the file system volume on which the operating system and the majority of the binary application files reside were used as a measurement target, the journal would capture the effects of various logging and background mechanisms executing on the measurement target. Although it is important to understand the effect normal operating system behavior has on data persistence, it should be decoupled from the development and testing of persistence measurement techniques as much as possible. This approach allows researchers to separate measurement technique limitations and idiosyncrasies from artifacts produced by a particular operating system and its associated applications.

B. Ordered Mode Experiment

This test of the proposed measurement strategy consisted of several simple operations. First, a file, Test.txt, was created using the vim text editor and saved. Next, text was added to the file and it was saved again. The cp command was then used to make a duplicate of the file, Test2.txt. Finally, the original file was deleted using the rm command. Throughout this series of operations, the file system journal was active and operating in Ordered mode. From the perspective of usable data logged to the journal, Ordered and Writeback modes are expected to behave similarly; therefore, a separate experiment using Writeback mode was not conducted.

C. Journal Mode Test

The major focus of this research is the study of data persistence. In this context, it is reasonable to sacrifice file system performance in order to gain insight into the potential to recover deleted data. With this in mind, the loop device was unmounted and the journal was set to operate in Journal mode. The procedure from the Ordered Mode experiment was repeated, altering only the name of Test.txt to Test3.txt and Test2.txt to Test4.txt.

V. RESULTS & ANALYSIS

A. Ordered Mode Results

TABLE 1: BLOCK DESCRIPTIONS

    File System Block Number    Description
    673                         Group 0 Inode Table
    657                         Group 0 Inode Bitmap
    1                           Group Descriptors
    8865                        1st block of root directory
    0                           File System Super Block
    642                         Group 1 Bitmap

A review of the journal data logged during the Ordered Mode experiment revealed several blocks being modified. They are summarized in Table 1 along with a description of their contents. Each block was recorded in its entirety in the file system journal and is consequently accessible for the study of data persistence. The inode and data block bitmaps, blocks 657 and 642 respectively, can be used to determine changes over time in the allocation status of the structures they denote. In Fig. 4, it can be seen that block 657 is updated in several transactions. From block 673, entire inodes can be extracted. It has been noted in [2] that Ext4 inodes make use of both inode-resident extents and extent trees when necessary. Furthermore, it was demonstrated that when extent trees are created, the blocks that make up the tree are recorded in the file system journal. Therefore, it is entirely possible to determine which file system data blocks are modified by analyzing journal transactions and extracting the necessary information from the inode structure. An example of this is displayed in Fig. 5, where the file system block number of the data is highlighted on the line beginning at offset 0x0045B20.

Fig. 5: Inode Structure Retrieved from Journal
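As a concrete illustration of that extraction step, the sketch below decodes the extents stored directly inside an Ext4 inode recovered from a journaled copy of an inode table block. The field offsets follow the published on-disk extent format, but the sketch handles only the simple inode-resident case (tree depth 0) and assumes the inode uses extents at all; it is illustrative rather than the tooling used in these experiments.

    import struct

    EXT4_EXTENT_MAGIC = 0xF30A   # eh_magic of a valid extent header

    def data_blocks_from_inode(inode: bytes):
        """Yield (logical block, length, physical start block) for each
        extent held in the inode's 60-byte i_block area (bytes 40-99).

        A 12-byte extent header is followed by up to four 12-byte
        extent records when the tree depth is 0; deeper trees point
        at separate index blocks, which the journal also records.
        All fields are little endian."""
        i_block = inode[40:100]
        magic, entries, _max, depth, _gen = struct.unpack_from('<HHHHI', i_block, 0)
        if magic != EXT4_EXTENT_MAGIC or depth != 0:
            return               # no extents here, or a multi-level extent tree
        for i in range(entries):
            ee_block, ee_len, hi, lo = struct.unpack_from('<IHHI', i_block, 12 + i * 12)
            yield ee_block, ee_len, (hi << 32) | lo

Cross-referencing the physical start blocks yielded here against the file system block numbers captured from descriptor blocks is what allows the reuse, and hence the persistence, of a deleted file's data blocks to be tracked.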
During the experiment, the ls -l command was used to retrieve the inode numbers associated with the Test.txt and Test2.txt files. It was observed that the Test.txt file was initially associated with inode 12 after creation. After the file was manipulated, its inode changed to 14. The creation of the Test2.txt file via the cp command then associated that file with inode 12.

In this experiment, different versions of block 8865 were revealed to contain a wealth of information, including the names and inode numbers of temporary files created by the text editor. This is captured in Fig. 6, and Table 2 is provided to aid in its analysis. Fig. 6-a displays the data from the directory file while Test.txt was being edited with vim. In this subfigure, it can be seen that .Test.txt.swp points to inode 14, while .Test.txt.swx, which is no longer valid due to the record length of the previous entry, pointed to inode 13. Fig. 6-b shows the state of the directory after the data has been added to Test.txt and Test2.txt has been created using cp. This subfigure shows that Test2.txt is now associated with inode 12, while Test.txt is associated with inode 14. It is worth noting that while "Test2.txt.swp" appears to be a filename, the name length field at offset 0x0062 (0x09) associated with the directory entry causes the ".swp" portion to be invalid. By looking past the last valid entry in this directory, it can be seen that the file Test.txt~ was at one point associated with inode 12. The ".swx" that follows this filename entry is residue from an earlier version of the directory data block. Finally, Fig. 6-c exhibits the final state of the block after Test.txt has been deleted. The major difference between Fig. 6-c and Fig. 6-b is that the record length field of the Test2.txt entry at offset 0x0060 has been changed to 0x0fd4 (4052) to indicate that any entry after it is invalid.

Fig. 6: Directory Changes. (a) Before Test.txt is saved. (b) After Test2.txt is created. (c) After deletion of Test.txt.

B. Journal Mode Results

The results of the Journal Mode experiment closely mirrored those of the Ordered Mode experiment. The primary difference was the expected inclusion of file data blocks in the journal, enabling greater detail to be gained directly from the journal monitoring technique. As none of the data blocks were reused, we were able to handily retrieve the data from the file system area and verify that it matched what was committed in the journal. While this is not novel from a data recovery standpoint, it could be very important from a persistence measurement viewpoint. Since newer versions of the data blocks are recoverable from the journal, it may be possible to determine the persistence of data when blocks are only partially overwritten. Thus, this mode of journaling yields greater resolution when measuring data persistence.
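That added resolution can be quantified: with full data journaling, two journaled versions of the same file system block can be compared byte-for-byte to estimate how much of the earlier content survives a partial overwrite. The helper below is a hypothetical illustration of such a block-level measurement, not part of the experimental tooling.

    def surviving_fraction(old: bytes, new: bytes) -> float:
        """Fraction of bytes from an earlier version of a file system
        block that remain unchanged in a later version of that block.

        In Journal mode both versions can be pulled from journaled
        data blocks; in Ordered or Writeback mode only metadata is
        journaled, so resolution drops to whole blocks."""
        assert len(old) == len(new), 'compare equal-sized block images'
        same = sum(1 for a, b in zip(old, new) if a == b)
        return same / len(old)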
VI. RELATED WORK

Using the file system journal to gather data is not unique from a Digital Forensics perspective, as noted in [7] and [8]. In [9], a method of continuously monitoring the journal to detect the malicious modification of timestamps is examined. In the context of computer security, persistence has been studied using virtualization technology, as in [10]. Due to the nature of memory, understanding how data persists is important to memory forensics and has resulted in research such as that conducted in [11]. For detecting changes after a set of events has occurred, differential analysis has been employed in various forms [12]; however, this type of analysis typically makes use of multiple disk images or virtual machine snapshots. To our knowledge, using the file system journal as a vector to measure data persistence is a novel approach.

VII. LIMITATIONS

The proposed method of measuring data persistence relies on the file system journal. As such, it cannot reliably determine whether data persists on the physical media (at a lower level of abstraction) in the presence of operations such as wear leveling. This method of monitoring is also ineffective on file systems that do not employ a journal, such as FAT. Furthermore, Ext4 does not flush data to the file system immediately when a write operation is performed. Instead, it buffers data for a period to give the block allocator time to find contiguous blocks. A tradeoff is that files with a short lifetime may never be written to the file system at all; in these situations, the data may not appear in the file system journal. As this data is ephemeral in nature, its impact on persistence may be minimal, but this warrants further study.

VIII. CONCLUSIONS AND FUTURE WORK

We have proposed a method of measuring data persistence by continuously monitoring the file system journal and have demonstrated that the level of detail that can be readily gathered varies with the method of journaling employed. The technique takes advantage of the file system journal's primary purpose as a recovery mechanism, extracting key information, such as the list of file system blocks that have been updated, from journal descriptor blocks. This method is proposed as an alternative to differential analysis techniques that detect file system block changes between disk images taken before and after a set of operations has been performed. While the monitoring technique imposes overhead, it collects data on incremental file system changes that may be lost when differencing disk images.

Another tradeoff is that the proposed method is influenced by the method of journaling employed on the target file system. If the journal is operating in Ordered or Writeback mode, the technique can handily track inode and block allocation changes, and it can directly track the persistence of data in directory files. The limitation of these modes is that the content of a normal file system data block is never recorded in the journal, which caps the resolution of persistence tracking at the size of a file system block as the smallest unit of measurement. If the file system journal is operating in Journal mode, where all file data is recorded to the journal before being written to the file system area, then all of the benefits of Ordered mode are gained as well as insight into file content persistence.
This will allow research to be conducted into instances where a file system block is only partially overwritten. As our long-term goal is to measure persistence under varying circumstances in order to generate data that will aid in the formation of probability models, we will have full control over the method of journaling employed in future experiments.

The proposed method of gathering evidence of data persistence captures many of the changes to the file system in an incremental manner. It also addresses the issue of missing data that would otherwise arise from the circular nature of the journal. The tradeoff is that the journal must be constantly monitored. Other methods of monitoring system changes may be more efficient or could be used in conjunction with the proposed method. Future work includes studying these techniques and performing a comparative analysis of the different ways in which persistence can be measured at different layers of abstraction.

Finally, once measurements of data persistence are taken, research must be done to understand the best ways to quantify persistent data and the rate at which unallocated data is overwritten. For example, if a file system is quiescent when it is not being actively written to by an application, data persistence most likely depends upon the number of user-initiated operations since file deletion. However, if in the absence of user activity the file system, operating system, or media device begins to perform a background activity, such as defragmentation, then the time since file deletion may be the most important factor.

APPENDIX

TABLE 2: DIRECTORY ENTRY FORMAT³

    Size      Description
    32b       Inode Number
    16b       Record Length
    8b        Name Length
    8b        File Type
    Varies    File Name

³ All multiple-byte values are little-endian encoded.

ACKNOWLEDGEMENTS

This work was supported, at least in part, as a Naval Academy Research Council project.

REFERENCES

[1] K. Fairbanks and S. Garfinkel, "Column: Factors Affecting Data Decay," Journal of Digital Forensics, Security and Law, p. 7, 2012.

[2] K. Fairbanks, "An analysis of Ext4 for digital forensics," Digital Investigation, vol. 9, pp. S118–S130, Aug. 2012.

[3] A. Mathur, M. Cao, S. Bhattacharya, A. Dilger, A. Tomas, and L. Vivier, "The new ext4 filesystem: current status and future plans," in Proceedings of the Linux Symposium, 2007, vol. 2, pp. 21–33.

[4] D. Bovet and M. Cesati, Understanding the Linux Kernel, 3rd ed. Sebastopol, CA: O'Reilly Media, Inc., 2005.

[5] "NTFS Technical Reference," March 28, 2003. Available online: http://technet.microsoft.com/enus/library/cc758691(v=ws.10).aspx

[6] B. Carrier, File System Forensic Analysis. Addison-Wesley, 2005.

[7] C. Swenson, R. Phillips, and S. Shenoi, "File system journal forensics," in Advances in Digital Forensics III, IFIP International Federation for Information Processing, vol. 242. Boston: Springer, 2007, pp. 231–244.

[8] K. Eckstein, "Forensics for advanced UNIX file systems," in Proceedings of the Fifth Annual IEEE SMC Information Assurance Workshop, 10-11 June 2004, pp. 377–385, doi: 10.1109/IAW.2004.1437842.

[9] K. D. Fairbanks, C. P. Lee, Y. H. Xia, and H. L. Owen, "TimeKeeper: A Metadata Archiving Method for Honeypot Forensics," in Information Assurance and Security Workshop, 2007 (IAW '07), IEEE SMC, 20-22 June 2007, pp. 114–118.

[10] J. Chow, B. Pfaff, T. Garfinkel, and M. Rosenblum,
"Shredding your garbage: reducing data lifetime through secure deallocation," in Proceedings of the 14th USENIX Security Symposium (SSYM'05), vol. 14. Berkeley, CA: USENIX Association, 2005.

[11] A. Schuster, "The impact of Microsoft Windows pool allocation strategies on memory forensics," Digital Forensics Research Workshop, 2008.

[12] S. Garfinkel, A. Nelson, and J. Young, "A general strategy for differential forensic analysis," Digital Investigation, vol. 9, pp. S50–S59, Aug. 2012.