Stage 1:

Goal:
Because Mirage may change the physical bits of an image after it is checked out, we need an effective way to hash and sign the image based on its logical structure rather than its physical bit layout. To do so, we need to examine the detailed structure of the vmdk file and of the common Linux filesystems (e.g. ext2/3). This lets us split the hash/sign work into two parts: the vmdk metadata and the guest machine's file system. The results of the two parts are then combined into the final result.

Method:

1. The vmdk description text
The “Virtual Disk Format 1.1” manual provided by VMware gives the design details of the VMware virtual disk structure. Basically, a text descriptor describing the layout of the data in the virtual disk serves as the header of a vmdk image file; in other words, the descriptor is the metadata of the vmdk. Figure 1 shows the descriptor of our sample vmdk file.

Figure 1 The description of vmdk image

2. Guest Machine's File System
After acquiring the vmdk's metadata, we can access the whole file tree (starting from the root directory '/') by mounting the vmdk image at a mount point with the “vmware-mount” command-line tool from the VMware Workstation utility package. We can then write a shell script (or a C/Java program) that walks the file tree recursively. For every file, the Linux built-in command “stat” retrieves the information related to that file, including the file name, number of blocks, inode, access permissions, file timestamps, and owner/group IDs. We concatenate this information with the file content before hashing.
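The per-file step described above (stat information concatenated with the file content, then hashed) might be sketched in Python as follows. This is only a minimal sketch: the names file_digest and STAT_FIELDS are hypothetical, and the choice of stat fields is an assumption. In particular, the access time is left out here because reading a file can update it, and which fields actually stay unchanged across a Mirage check-out would have to be verified on real images.

```python
import hashlib
import os
import stat

# Hypothetical selection of "logical" stat fields. Which fields survive a
# Mirage check-out unchanged is an assumption that must be verified.
STAT_FIELDS = ("st_mode", "st_uid", "st_gid", "st_size", "st_mtime_ns")


def file_digest(path, rel_name):
    """Return the SHA-1 hex digest of one file's stat record plus its content."""
    info = os.lstat(path)  # lstat describes a symlink itself instead of following it
    record = ["name=" + rel_name]
    record += ["%s=%s" % (field, getattr(info, field)) for field in STAT_FIELDS]

    digest = hashlib.sha1()
    digest.update("\n".join(record).encode("utf-8", "surrogateescape") + b"\n")

    if stat.S_ISREG(info.st_mode):
        # Only regular files have content that is safe to read to the end.
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
    elif stat.S_ISLNK(info.st_mode):
        # For a symlink, the link target stands in for the content.
        digest.update(os.readlink(path).encode("utf-8", "surrogateescape"))
    # Devices, FIFOs, sockets and directories contribute metadata only.
    return digest.hexdigest()
```

Passing the path relative to the mount point as rel_name, rather than the absolute path, keeps the digest independent of where the image happens to be mounted.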
So the pseudocode should look like this:

    hash_dir(directory):
        foreach child in directory
            stat $child > statinfo
            cat statinfo $child > tmpfile
            sha1sum tmpfile > thisfilehash
            previousfilehash = thisfilehash XOR previousfilehash
            if [ $child is a directory ]
                hash_dir($child)
            end
        end

Figure 2 The output of the stat of a vmdk

Note that in Linux everything is represented as a file, including devices and processes, so we need to be careful when dealing with /dev and /proc (generally, /proc should be empty). As mentioned in the given script, we can also use “fdisk -ul” to get the partition information of the vmdk image, which is also important to the integrity of the whole image.

Discussion:
Apart from the above, we should also check whether other file system information can be obtained, such as the superblock and the inode pool, since these parts should also be covered by the integrity check. Although filesystem-related commands such as “e2fsck” can provide some details about the ext2/3 filesystem, simply mounting the image may still not expose enough information about the file system. We may look further into the VMware utility tools, or turn to other methods based on the Linux file system API.
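As one possible way to cover the superblock without mounting, its fields can be decoded directly from a raw partition image. The sketch below assumes the standard ext2/3 on-disk layout (primary superblock 1024 bytes into the partition, little-endian fields, magic number 0xEF53 at offset 56). It also assumes the raw partition has already been extracted from the vmdk, which is not shown. The names parse_superblock and read_superblock are hypothetical, and the fields selected are only a guess at what is stable: counters such as free blocks or mount count are skipped because they change during normal use.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # the primary superblock starts 1024 bytes into the partition
EXT2_MAGIC = 0xEF53       # s_magic value shared by ext2 and ext3


def parse_superblock(raw):
    """Decode a few layout-defining fields from raw ext2/3 superblock bytes."""
    inodes_count, blocks_count = struct.unpack_from("<II", raw, 0)
    (log_block_size,) = struct.unpack_from("<I", raw, 24)
    (blocks_per_group,) = struct.unpack_from("<I", raw, 32)
    (inodes_per_group,) = struct.unpack_from("<I", raw, 40)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/3 superblock (magic=0x%04x)" % magic)
    return {
        "inodes_count": inodes_count,
        "blocks_count": blocks_count,
        "block_size": 1024 << log_block_size,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        "uuid": bytes(raw[104:120]).hex(),  # s_uuid, 16 bytes at offset 104
    }


def read_superblock(image_path):
    """Read and decode the primary superblock of a raw partition image."""
    with open(image_path, "rb") as image:
        image.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(image.read(1024))
```

The resulting dictionary could be serialized in a fixed key order and hashed alongside the per-file hashes, so that the file system parameters are covered by the signature as well.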