University of South Australia

Development of an approach using automation to enhance the process of Computer Forensic Analysis

Minor Thesis

Student: Daniel Walton
ID: 110071749
Email Address: waldj007@mymail.unisa.edu.au
Supervisor: Dr Elena Sitnikova
Subject: Master of Science (Cyber Security and Forensic Computing) LMIA
Year: 2013

AUTHOR'S DECLARATION

I declare that this Thesis does not incorporate, without acknowledgement, any material previously submitted for a degree or diploma in any university; and that to the best of my knowledge it does not contain any materials previously published or written by another person except where due reference is made in the text.

Daniel Walton

ACKNOWLEDGEMENTS

To my wonderful wife Rebecca, thank you for your patience and support all the way through my study for this thesis; it has been invaluable. To my adorable daughter Charlotte, your coming into the world has brightened our lives, and it is such a joy watching you grow and discover this world. To Dr Allan Watt, without your urging I may not have begun these studies. Thank you for your help and encouragement. To my supervisor Dr Elena Sitnikova, many thanks for your help, guidance and support with this thesis.

Abstract

One of the biggest issues for digital forensic investigators is that storage in computers continues to grow rapidly in size, while the ability to extract intelligence from that data is not increasing at the same rate. This makes it harder and harder to find relevant data in a growing sea of irrelevant data. Digital forensic analysis is usually performed when an incident of interest is detected. Incidents are detected either by humans or by automated systems that identify suspicious behaviour. There are automated systems which detect a need for digital forensic analysis by identifying suspicious activity (forensic readiness).
To do this, these systems usually have a set of criteria or rules that define normal activity, as well as criteria for abnormal or suspicious activity, and will send alerts when suspicious activity occurs. However, when it comes to the forensic analysis of evidence, there is not much software available to analyse forensic evidence and provide findings. Automated analysis of evidence will be invaluable for investigators, as it will help them to remove irrelevant data and focus on data of interest. This research examined what is currently available for the automatic analysis of evidence, as well as assessing the implementation of an automatic analysis system.

AUTHOR'S DECLARATION .............................................................. 1
ACKNOWLEDGEMENTS .................................................................. 2
Abstract .......................................................................... 3
CHAPTER ONE - Introduction ........................................................ 7
1. Background ..................................................................... 7
1.1 Forensic Readiness ............................................................ 7
1.2 Forensic evidence acquisition ................................................. 8
1.3 Forensic Analysis ............................................................. 8
1.4 Automated analysis ............................................................ 9
1.5 Significance of the Problem .................................................. 10
1.6 Research Issue ............................................................... 11
1.6.1 Sub-Problems ............................................................... 11
1.7 Elaboration of the Sub-problems .............................................. 12
1.8 Research thesis title ........................................................ 14
Chapter Two – Literature Review .................................................. 15
2 Literature review .............................................................. 15
2.1 Automating processing of evidence ............................................ 15
2.2 Computer Profiling ........................................................... 17
2.3 Timeline analysis ............................................................ 19
2.4 Analysis ..................................................................... 21
Chapter Three - Methodology ...................................................... 23
3. Methodology ................................................................... 23
3.1 Focus of research ............................................................ 24
CHAPTER Four – Computer profiling ................................................ 26
4. Introduction .................................................................. 26
4.1 Methods ...................................................................... 28
CHAPTER FIVE – Log2timeline and Plaso ............................................ 30
5. Introduction .................................................................. 30
5.1 Comparison ................................................................... 31
CHAPTER SIX – Analysis of different analysis systems ............................. 33
6. Introduction .................................................................. 33
6.1 Incident Response / Malware (Malicious Software) ............................. 33
6.2 Intellectual Property theft (IP Theft) ....................................... 33
6.3 Access to Child Abuse Material (CAM) ......................................... 33
6.4 Snort ........................................................................ 34
6.5 Markov chain analysis methods ................................................ 35
6.6 Further different analysis methods ........................................... 35
6.7 Statistics ................................................................... 36
CHAPTER SEVEN – Tests and implementations ........................................ 38
7. Introduction .................................................................. 38
7.1 Profiling .................................................................... 38
7.2 Large Files .................................................................. 40
7.3 Most Visited websites ........................................................ 42
7.4 User profile registry dates and times ........................................ 43
7.5 Analysis ..................................................................... 44
7.6 Intellectual Property (IP) Theft ............................................. 45
7.7 Incident Response / Malware analysis ......................................... 49
7.8 Time changing ................................................................ 51
7.9 Statistics ................................................................... 51
7.10 Rule based analysis ......................................................... 52
7.11 Findings .................................................................... 53
CHAPTER EIGHT - CONCLUSIONS AND FURTHER WORK ..................................... 54
8.1 Research Conclusions ......................................................... 54
8.2 Areas for further study ...................................................... 57
8.3 Conclusion ................................................................... 59
References ....................................................................... 61
Appendix A – Glossary ............................................................ 65
Appendix B – List of formats that Log2Timeline tool parses ....................... 66
Appendix C – USB history report .................................................. 68
Appendix D – Web URL's ........................................................... 73

CHAPTER ONE - Introduction

1. Background

Digital forensic investigators are swimming in a sea of data: the size of storage keeps increasing, and the amount of data people generate is increasing as well. On the other side, the ability of digital forensic tools to deal with all this data is not increasing at the same rate. Data reduction is done by removing known files by their cryptographic hash, using date ranges, or finding files based on specific keywords (Richard & Roussev 2006). A great deal of analysis is currently performed manually by investigators.
The motivation for this paper is to find what sort of automatic analysis techniques are currently in use, what research has been done (often different to real-world practice), find research areas of interest, and investigate an approach which will be able to assist with the analysis of evidence.

1.1 Forensic Readiness

Forensic investigations are normally started when something suspicious is detected, and forensic analysis is initiated to find out exactly what happened. The detection is either manual, by a human who is suspicious about some activity (or lack of it) and initiates procedures to have it investigated, or automatic, by a specialised computer monitoring system which sends an alert so that a decision can be made as to whether forensic analysis is required. If forensic analysis is required, then the evidence is acquired and analysis started. There are many different types of anomaly detection systems and methods for computers. Anti-virus software is used on most computers to detect and remove viruses and other forms of malicious software such as worms, Trojan horses and spyware. An Intrusion Detection System (IDS) is used to detect suspicious network traffic; it monitors network traffic and raises alerts when suspicious traffic is detected. Security Information and Event Management (SIEM) systems monitor the logs generated by computers (mostly servers), IDSs and network devices like firewalls, routers and switches for suspicious activity. Anti-spam software is used to detect unsolicited commercial email, otherwise known as spam. These systems all have methods for discerning between legitimate content and behaviour and abnormal content or behaviour. Forensic readiness is a model for the early detection and collection of evidence relating to suspicious activity (Tan 2001).
SIEM systems, when configured correctly, provide forensic readiness capabilities for the ICT infrastructure they have been configured to monitor, especially by providing rule-based event detection and secure logging systems (Rowlingson 2004). This works by collecting logs from all servers, network infrastructure (switches, routers, IDSs) and sometimes data from workstations. As these logs come from different places, they are also in different formats: Windows event logs, syslog event logs from Unix/Linux and, more commonly, different types of text-based delimited logs. Conversions are performed across the different formats so as to combine them into one aggregated log. These aggregated logs are then automatically examined for suspicious activity. Forensic readiness helps detect incidents which will need investigation by computer forensic analysis.

1.2 Forensic evidence acquisition

Once an event is detected, evidence must be acquired into a form suitable for analysis. Acquired evidence is a snapshot of the system at the point of acquisition (McKemmish 1999; Rider, Mead & Lyle 2010). This could be a copy of a mobile phone, memory from a computer, or storage such as hard disk drives (HDDs) and USB flash drives (Sutherland et al. 2008). The acquired evidence is collected in a way that allows it to be verified afterwards, to confirm the evidence has not been modified since the initial acquisition (NIST 2004). This is done with a cryptographic hash, usually an MD5 or SHA1 hash. Once the evidence has been acquired, a copy can be made of it; the master copy is stored securely away and the working copy used for analysis.

1.3 Forensic Analysis

Analysis is always performed on the acquired evidence files. As the evidence is a static resource, it is easy to try different analysis methods, as well as to take the time to compare tools to make sure the analysis results are repeatable.
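The acquisition-verification step described in section 1.2 can be illustrated with a short sketch. This is a simplified illustration, not taken from any particular forensic tool: it hashes an evidence image in fixed-size chunks so that large images never need to fit in memory, and later re-hashes the working copy to confirm it still matches the digest recorded at acquisition time.

```python
import hashlib

def hash_evidence(path, chunk_size=1024 * 1024):
    """Compute MD5 and SHA1 digests of an evidence file in a single pass."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def verify_evidence(path, recorded_md5):
    """Re-hash a working copy and compare against the digest recorded at acquisition."""
    current_md5, _ = hash_evidence(path)
    return current_md5 == recorded_md5
```

In practice the recorded digest would be stored with the chain-of-custody documentation, and verification would be repeated before and after analysis sessions.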
Most forensic analysis is performed manually by investigators using any of the four main forensic tools: AccessData's Forensic Tool Kit (FTK), GetData's Forensic Explorer, Guidance Software's EnCase, and X-Ways Forensics. These tools have many automated features for the processing of evidence and extraction of data of interest, but the actual analysis still has to be done by the investigator. The automated features provided by these tools are very useful: removing irrelevant data, opening and viewing many different file types, viewing the content of container files, parsing operating system files for metadata, finding deleted files, data carving, and the ability to search files with raw searches as well as indexed searches. All these features help the investigator with the forensic analysis of evidence, yet they do not actually perform any automated analysis of the evidence. For example, for the detection of IP theft (Australian Institute of Criminology 2008) the investigator will analyse the USB device history information, Windows shortcut 'LNK' files for removable device access, Internet history, registry shellbag entries and sequential access/create times of files, and correlate the results to determine whether there are signs that data was copied to a USB flash drive. This process is quite manual, and there may be ways of automating some of it.

1.4 Automated analysis

The open source Intrusion Detection System (IDS) Snort uses a simple rule-based detection system to detect network packets of interest (Roesch 1999). The syntax of Snort's rule language is flexible enough that rules can be written to detect items of interest in protocols that it was not originally designed to detect; an example of this is 'SCADA MODBUS' communications over TCP/IP (Morris, Vaughn & Dandass 2012). Snort's rule-based analysis is primarily packet by packet, and rules do not normally analyse more than one packet at a time.
The network packets it analyses usually follow standardised and documented protocols, but for the automated analysis of computer evidence there are many more types of data sources to process. For a thorough analysis of evidence from a computer running MS Windows, examination of the following is required: the file system, event logs, the registry and parsed registry structures (examples include shell bags, UserAssist entries and USB device entries), the Internet history of all installed browsers, and image EXIF metadata. These different pieces of data are stored in different files and in different formats; reading from all these areas is complicated, as each format needs its own custom parser to extract intelligence, and the data from each file then needs to be combined. As Snort's rule-based system lends itself to adaptation to monitor and analyse different protocols from the same packet dump, a similar rule-based system should also work over a simplified, standardised file format for the recording of metadata from evidence, one which differentiates between the different sources of metadata. A comparison of the forensic tools currently available with the earlier tools made by the same companies finds minimal difference in the area of automated analysis. Their functionality is mostly still the same as it was; they are mostly just scaled-up versions that deal with more data and parse more file types (Marrington et al. 2007). Automated analysis has the potential to revolutionise the digital forensic area by building on existing tools and providing actual analysis of evidence. This would save the investigator having to manually parse through evidence looking for areas of interest, as the results of the automated analysis would immediately highlight the areas of interest for them to analyse.
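A Snort-style rule system operating over a standardised metadata format, as proposed above, could be sketched as follows. The entry fields and rule format here are hypothetical, chosen purely for illustration: each normalised entry records its source artifact and a message, and a rule fires when its source matches and its keyword appears in the message.

```python
# Hypothetical sketch of rule-based analysis over normalised evidence metadata.
# Each entry is a dict with a 'source' (e.g. 'registry', 'lnk', 'filesystem')
# and a 'message'; each rule pairs a source and keyword with an alert text.

RULES = [
    {"source": "registry", "keyword": "USBSTOR",
     "alert": "USB storage device history found"},
    {"source": "lnk", "keyword": "E:\\",
     "alert": "Shortcut referencing a removable drive"},
]

def apply_rules(entries, rules=RULES):
    """Return an alert for every (entry, rule) pair that matches."""
    alerts = []
    for entry in entries:
        for rule in rules:
            if entry["source"] == rule["source"] and rule["keyword"] in entry["message"]:
                alerts.append({"alert": rule["alert"], "entry": entry})
    return alerts
```

Because the rules are plain data rather than code, they could be shared between investigators and extended without modifying the analysis engine, mirroring how Snort rule sets are distributed.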
The investigator could then spend more time on validating the results of the automated analysis and collating their findings, as the automated analysis system quickly directs them to areas of interest, saving much labour time. If the analysis system used rules, these could be shared, and investigators could write new rules providing additional analysis features.

1.5 Significance of the Problem

Investigators currently spend a lot of time manually analysing evidence, and in large cases with much data it would make a big difference if there were a way to speed up analysis of the supplied evidence. Investigators could then quickly get a profile of the activity on each computer, as well as easily find signs of suspicious activity, enabling them to spend more time on solving cases and less on the extraction and analysis of evidence. In many police departments there are backlogs for digital forensic analysis which would benefit from an automatic analysis system that could analyse evidence and provide reports with profiles of the computer. This would also cover all users, as well as the results of automatic analysis showing what suspicious activity has occurred on the computer. As usual, investigators will need to back up any findings with evidence, and so the reports will contain information explaining where the results came from. Hard disk drives are only getting bigger, and a way to quickly automate the analysis of evidence will be of great benefit to the digital forensic field. Development of the 'log2timeline' tool gave investigators a large jump in capability, as it enabled them to automatically extract file information and metadata where previously this was a manual task (Guðjónsson 2010). The plan for this thesis is to advance forensic analysis capability by building on the capabilities of 'Plaso', adding automated analysis so as to reduce the amount of manual analysis the investigator needs to perform.
1.6 Research Issue

The research issue for this project is: How can automation be used to improve the process of Computer Forensic Analysis?

1.6.1 Sub-Problems

This has generated a number of sub-problems as follows:

1 What are the existing tools for extracting relevant information from evidence, and what is the quality of the information they extract?
2 What solutions are there for parsing the many undocumented file and metadata formats which are yet to be discovered and documented but could contain information of interest?
3 How can low false-positive and false-negative rates be ensured while keeping a high detection rate of relevant information?
4 What approach can be used to enhance digital forensic analysis with automation?

1.7 Elaboration of the Sub-problems

There are many different formats for the storing of data and metadata on computers, which compounds the analysis problem (Brownstone 2004). For many of these files the format is undocumented or proprietary, which further complicates analysis (Kent, Chevalier & Grance 2006). For example, in an intellectual property theft case the investigator would want to at least parse the registry (Accessdata 2005; Wong 2007) and the 'setupapi.dev.log/setupapi.log' for details of USB removable device history, parse all link files for details of files opened from removable devices, parse Internet Explorer history for local file access to get further detail regarding access to removable devices, and also parse the NTFS USN Journal (Carrier 2005) to check that files were not renamed before being copied to removable devices. It is quite time consuming to parse all these different areas and combine the information to analyse what has happened.
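Combining these artifact sources amounts to merging per-artifact records into one timeline and then reviewing related events together. The record structure below is hypothetical, used only to illustrate the idea; real parsers for the registry, LNK files and the USN Journal would each emit records in this common shape.

```python
# Hypothetical sketch: merge records from several artifact parsers into one
# timeline sorted by timestamp, so that related events (device insertion,
# file rename, file access) can be reviewed side by side.

def merge_timeline(*artifact_streams):
    """Each stream is a list of (timestamp, source, description) tuples."""
    merged = [record for stream in artifact_streams for record in stream]
    return sorted(merged, key=lambda record: record[0])

# Illustrative records, as might be produced by individual artifact parsers.
registry_events = [("2013-05-01T10:02:11", "registry", "USB device first connected")]
lnk_events = [("2013-05-01T10:05:40", "lnk", "designs.dwg opened from E:\\")]
usn_events = [("2013-05-01T10:04:58", "usn", "designs.dwg renamed from secret.dwg")]

timeline = merge_timeline(registry_events, lnk_events, usn_events)
```

In this merged view, the device connection, the rename recorded in the USN Journal, and the shortcut pointing at the removable drive fall into one sequence, which is exactly the correlation an investigator performs manually today.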
As many of these different files and metadata formats are undocumented, the ability to extract useful information from them is often dependent on individual analysts performing research, discovering the internal formats and writing software to extract useful data. Didier Stevens in 2006 posted his 'UserAssist' tool to parse the 'userassist' entries from the Windows Registry. He later discovered that for Windows 7 and Windows Server 2008 R2 the format had changed, and so released a newer version of the tool with support for extracting these values from the newer versions of Windows (Stevens 2010). In 2012 he released an updated version with beta support for Windows 8. New files and metadata useful for forensic analysis are continually being discovered. Because of this, there is always the concern that investigators who do not keep up to date with new discoveries about what metadata can be extracted may miss information. This leads to a lower standard of care, potentially resulting in innocent people being incarcerated or guilty people being let off because relevant information was missed. With regard to automatic analysis, current forensic tools are not able to analyse evidence and provide the investigator with a report showing discovered suspicious activity, nor provide a profile of activity on the computer. The current forensic tools:

- are able to view many different file types
- are able to parse and extract data from many different files
- can help exclude known irrelevant files and detect known relevant files
- can analyse email
- are able to index and search the evidence.

There are many different tools to help the investigator with forensic analysis, but their abilities can mostly be summed up as assisting the investigator to analyse the evidence, as opposed to analysing the evidence themselves and providing the investigator with results to check.
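The UserAssist entries mentioned above are a small example of this reverse-engineering burden: the registry value names are ROT13-encoded, so even listing which programs were run requires a decoding step. A minimal sketch of that step follows (the value name shown is a real Windows XP-era UserAssist counter prefix; the parsing of the registry hive itself is omitted):

```python
import codecs

def decode_userassist_name(value_name):
    """UserAssist registry value names are ROT13-encoded; return the readable form."""
    return codecs.decode(value_name, "rot13")

# 'HRZR_EHACNGU' is the encoded form of the XP-era counter prefix 'UEME_RUNPATH'.
decoded = decode_userassist_name("HRZR_EHACNGU")
```

The decoding is trivial once known, but discovering that the names were ROT13-encoded in the first place, and that the binary value data changed format in Windows 7, required exactly the kind of individual research the paragraph above describes.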
Some of the most common of these tools are AccessData's Forensic Tool Kit (FTK), GetData's Forensic Explorer, Guidance Software's EnCase, and X-Ways Forensics. The best that can be hoped for from them is better assistance with analysis, as opposed to better analysis. The investigator still needs to combine the different sources of data and perform analysis themselves to discover what actually happened. The detection rate of an automated analysis system is important, as investigators do not want to be flooded with irrelevant data. An email spam filter is one example: a high false-negative rate means spam is not removed and the user receives too much spam, while a high false-positive rate means many legitimate emails are wrongly filtered out as spam, which is also unwanted. What is wanted is a high detection rate (or high signal-to-noise ratio) for items of interest, with unwanted items removed. To automatically analyse evidence, software would need to be able to read all the parts of the evidence relevant to the analysis being performed. For example, to detect the use of a USB flash drive to copy files off a computer in an intellectual property theft case, information from the file system, link files and the registry is used together to link activity. Multiple sources of data need to be combined and used to help detect findings of relevance. This presents challenges, as different formats of information need to be combined in ways where their content remains readable without conflicting or losing detail.

1.8 Research thesis title

The proposed research thesis title is: Development of an approach using automation to enhance the process of Computer Forensic Analysis.

Chapter Two – Literature Review

2 Literature review

The main aim of this research is to discover what existing systems there are for the automated analysis of computer evidence, as well as to research an approach for providing automation for analysis.
With the proposal of an automatic evidence analysis system, some compromises need to be made. Snort is a network Intrusion Detection System (IDS); it examines network packets to find suspicious activity in network traffic. These network packets arrive serially and have a known size, date and time. This makes the task of analysis a lot simpler, as every packet has an arrival time connected to it, and network traffic follows documented standards and protocols, which helps analysis and the job of detecting anomalies. With the forensic analysis of computers there are many different places for data to be stored, as well as many different file formats, and most are either not standardised or the format is not open.

2.1 Automating processing of evidence

Richard and Roussev wrote a paper discussing the processing of evidence in ways that minimise the analysis load for investigators, motivated by the growth in the size of collected evidence (Richard & Roussev 2006). They propose a distributed system with which to divide up the evidence so that multiple computers can process it. This is automated processing, not automated analysis, and it seems very similar to what the commercial product FTK does. They include many ways to help cull known irrelevant files, as well as ways to help reveal data of relevance by extracting metadata and other additional information. This automated processing concept could be of use for automated analysis, as it is designed to scale processing across many systems and might be of use for large cases where resources are of concern.
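The culling step described here — dropping files whose hashes appear in a known-good set and flagging those in a known-bad set — is straightforward to sketch. This is an illustration only: real hash sets, such as the NIST NSRL reference library, contain millions of entries and are distributed in their own formats, and the hash values below (other than the standard SHA1 of an empty file) are placeholders.

```python
import hashlib

# Illustrative hash sets; real ones (e.g. NSRL) hold millions of entries.
KNOWN_GOOD = {"da39a3ee5e6b4b0d3255bfef95601890afd80709"}  # SHA1 of an empty file
KNOWN_BAD = set()  # hashes of files known to be of interest

def sha1_of(path):
    """SHA1-hash a file in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def triage(paths):
    """Split paths into (flagged, remaining), silently culling known-good files."""
    flagged, remaining = [], []
    for path in paths:
        digest = sha1_of(path)
        if digest in KNOWN_GOOD:
            continue  # cull: known irrelevant file
        (flagged if digest in KNOWN_BAD else remaining).append(path)
    return flagged, remaining
```

Because each file is hashed independently, this is also the kind of workload that parallelises naturally across the distributed system Richard and Roussev propose.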
In his paper "A second generation computer forensic analysis system", Ayers proposes a system for the processing of evidence which is very similar to that proposed by Richard and Roussev (2006). As with theirs, it is mostly concerned with the processing, searching and hashing of evidence to reduce the time investigators spend analysing it; one point of difference is his focus on creating an audit trail of the actions of the software and its users (Ayers 2009). Farrell (2009) in his thesis works from a similar premise: storage devices are getting cheaper and larger, so the automatic processing of evidence to create automated reports will help remove "some of the load" from law enforcement staff (Farrell 2009). His analysis method is fairly easy to implement, in that it primarily collects statistics about various forms of data in the evidence and collates them into a report. An example of this would be listing the most commonly used email addresses, web pages and recently accessed documents. A welcome feature is the creation of different reports for different users, which is very useful for differentiating who on the computer performed which actions. As with Richard and Roussev, he mentions using hashing functions combined with known-good and known-bad file sets to detect files to ignore or to flag as important, helping with the extraction of relevant data and removal of irrelevant data. Statistics can be a simple method of analysis to implement in an automatic analysis system, helping to provide a profile of the system. Elsaesser and Tanner discuss the idea of using abstract models to "guide" or assist the analysis of a computer which has been attacked, to find details regarding the network intrusion (Elsaesser & Tanner 2001).
Like the previous papers, it looks at ways to help the investigator deal with the great amount of data in computers, although in this case it is specifically looking at log files to find signs of network intrusions. The abstract models they put forward can also be applied in the general analysis of evidence, to determine the capabilities a user may have and, with that, determine which activities they could or could not have performed. An example: new software was installed on the system, but the user in question did not have the rights to install software, which should raise an alert to analyse this further. These concepts are quite viable for detecting abnormal activity but may require more processing time. Regarding the use of general purpose programming languages, Garfinkel's paper on the creation of fiwalk.py using the 'pyflag' library shows how Python can be used to extract file system information and file metadata, outputting the processed information into easily parsable 'XML' (eXtensible Markup Language) files, leaving the output readily available for others to parse and extend in other programs (Garfinkel 2009). This sort of system helps the development of other tools, as it extracts information and presents it in an open format that is easy to modify and use.

2.2 Computer Profiling

Andrew Marrington (2007, 2009, 2011) was involved in the writing of several important papers on the subject of "computer profiling". His premise is to simplify the analysis of evidence by generating a profile of the evidence which enables the investigator to get a good idea of what activity has occurred, saving the investigator from having to do the analysis themselves and enabling them to make an easy choice as to whether or not a full analysis of the evidence is needed.
This is a common theme: as storage increases in size, being able to create profiles of evidence will help investigators quickly see whether there may be findings of interest in the evidence. This is closer to the concept of automatic analysis in digital forensics. Marrington, with regard to the forensic reconstruction of computer activity using events, mentions that there are four classes of objects to be found on a computer system, namely "Application, Content, Principal, and System", and discusses the detection of relationships between these (Marrington et al. 2007). They mention that finding relationships between objects is important, and that it is complicated on computers because of the many different formats of data, and they put forward models and ideas on how this could be done. The proposed profiling system uses information from the file system, event logs, file metadata (using libextractor), Word metadata and user information from the registry, which covers the most important data areas on a computer, although there is no mention of using link files, jump lists, further registry data (including registry dates and times, UserAssist entries, shell bag entries and shim cache entries (Davis 2012)) or internet history, which would broaden the information available for analysis. The analysis performed is profiling based, and there is no checking for common items of interest like signs of IP theft, malware infection or other potential areas of interest, but it does make it a lot easier for an analyst to get a good idea of the evidence. A PhD thesis written by Andrew Marrington expands further on the previous paper, with discussion of computer profiling, further in-depth analysis of related areas including 'datamining' and the analysis of files with statistics based on extractable text, and substantial analysis of different "computational models".
It examines computational models put forward by Brian Carrier as well as by Gladyshev and Patel, concluding that these models are not feasible without a method to automatically describe evidence based on “a finite state model” (Marrington 2009), and decides that a better system is to model the computer history using the computer event log as a foundation. Analysis of computer evidence is a complex problem, as there are many different types of file formats and metadata, as well as different event logs, to be processed and analysed. He mentions that models are sometimes hard to translate to the real world, and that implementation of a model is its best test. The following paper about detecting inconsistencies in timelines shows what Marrington's model can achieve with a software implementation. Marrington et al. discuss some specific automatic analysis in their paper regarding detecting changes to computer clocks, by examining events from the event log in light of the fact that certain events and actions cannot occur before a user has logged on to the system (Marrington et al. 2011). Users need to complete login proceedings before they can open applications; if there are events showing the user had applications open at a time when they were not logged in, then that could only occur if the time had been changed. To detect this they correlate file system and document metadata with event log information to compare user activity with logon and logoff events. This is an excellent system which could be paired with more user activity information extracted from the user's local registry file (NTUSER.dat), internet history and other areas to get more detail regarding user activity. Regarding the detection of changes to the system time, there are additional methods which could also have been mentioned in the paper, as there are gaps in their analysis.
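The logon/logoff consistency check described above can be sketched as follows. This is a minimal illustration with hypothetical session and activity data; a real system would reconstruct the sessions from Security event log entries and the activity from file and document metadata.

```python
from datetime import datetime

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")

# Logon/logoff sessions reconstructed from the event log: (user, logon, logoff).
sessions = [
    ("alice", parse("2013-05-01 09:00"), parse("2013-05-01 17:00")),
]

# User activity events recovered from file metadata, link files, etc.
activity = [
    ("alice", parse("2013-05-01 10:30"), "opened report.docx"),
    ("alice", parse("2013-05-01 22:15"), "opened secret.xls"),  # outside any session
]

def outside_sessions(sessions, activity):
    """Flag activity attributed to a user at a time they were not logged in."""
    flagged = []
    for user, when, what in activity:
        logged_in = any(u == user and start <= when <= end
                        for u, start, end in sessions)
        if not logged_in:
            flagged.append((user, when, what))
    return flagged

flagged = outside_sessions(sessions, activity)
```

Activity falling outside every known session is exactly the inconsistency that suggests the system clock was changed.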
The Windows event log files are themselves sequential (ring buffers), and new entries should not be older than previous entries; this can easily be detected by sequentially parsing the event log files, sorting by event file offset, and comparing the dates for each entry. There are further places which can be examined to detect time changes, such as thumbs.db thumbnail database files, the NTFS USN Journal, Windows restore points and Volume Shadow snapshots, all of which have sequential entries. The creator of the Bulk Extractor tool, Garfinkel, compares different computers using what he calls “Cross Drive Analysis” over the output of the Bulk Extractor program to find which computers are related (Garfinkel 2006). Bulk Extractor is a program which processes a disk image at sector level; it does not read the file system or parse any files, it just processes the text it can extract from each sector, which in itself is not all that sophisticated, but from this it is able to give a rough profile of the computer's activity. Most tools are document or file system based and do not focus on analysing unallocated areas, which this tool does. The gathered profile data can then be compared to other computers' profiles to see if there are any connections. The main areas of weakness are that it will not be able to read fragmented files properly (although modern file systems are self-defragmenting, so this is not so much of an issue) and that it cannot read inside compressed or encrypted files. Bulk Extractor is quite impressive in that it has the ability to decompress and read some compressed file types, and from looking at its roadmap this will only improve. The statistical abilities combined with the data it collects show that reporting what occurs most often will frequently reveal a profile of behaviour on the machine.
An example is the most visited internet URLs and the most frequently occurring email addresses; these show how useful statistics can be to quickly bring information of interest out of a sea of irrelevant data. With regard to analysis of activities, this will only be lightly covered by this tool, as it does not parse the registry, link files, event logs, file system or other areas of file metadata.

2.3 Timeline analysis

In order to extract as much intelligence as possible with regard to activities, dates and times, timelines of computer activity were created to help with analysis. Initially investigators would create a timeline of file system activity and use that for the analysis of incidents; as this proved helpful, the use of this analysis method grew, with investigators parsing as many areas of a computer as possible to extract dates, times and signs of activity. This was originally done manually by combining in Excel the internet history, parsed event logs, registry entries and file system data culled from many different tools. It was a very manual process using different tools with different outputs, which all needed to be combined into a common format for analysis. This manual and time consuming job was automated by the creation of the log2timeline software, as described in Guðjónsson's 2010 seminal paper on mastering the super timeline. The ‘log2timeline’ software improved the making of timelines by parsing many different areas of the computer for artefacts, dates and times and combining it all into a format which is easy for analysts to use; Guðjónsson calls it the “Super Timeline”. The ‘log2timeline’ tool also helped by providing tools for basic filtering. Timelines help with the analysis of incidents, as an investigator can look at a time period of interest and examine all the recorded activity for that time.
As an example, looking at the timeline of when a virus infected the system can help find all the virus's changes to the file system and registry, help pinpoint the infection vector, and reveal how it starts and where it is stored. Creating timelines is a lot easier using the ‘log2timeline’ tool, although the analyst still has to do the analysis themselves, which can be a burden with the vast amount of data that these timelines often contain. For analysts to analyse the timeline fully they need to understand the meaning of the entries and how they fit into the operation of a computer (Guðjónsson 2010), which requires a lot of knowledge. Guðjónsson had concerns that the ability of the ‘log2timeline’ tool to extract so much detailed information raises the need for filtering, to remove the irrelevant information and keep just the relevant. At the moment there is no simple automated way to do this, apart from knowing time periods of interest to focus on or having specific whitelist or blacklist keywords to help refine the timeline. Being able to easily find suspicious entries would speed up investigations while quickly reducing the amount of information the investigator needs to wade through, which raises the requirement for some way to quickly and easily remove irrelevant or known-good entries so as to focus on the items of interest (Guðjónsson 2010); this is something best automated. Creating a graph of the number of entries over time is a method which will help to easily visualise the timeline, assisting with the detection of spikes and surges of activity as well as giving an overall profile of activity on the computer. Easy visualisation of the timeline will also help with the presentation of reports, as it will help non-technical people like lawyers and judges to understand the data.
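Graphing the number of timeline entries over time can be sketched as a simple per-hour histogram. The timestamps here are hypothetical; in practice they would be parsed from log2timeline output.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timeline timestamps; a late-night burst stands out as a spike.
timestamps = [
    "2013-05-01 09:05", "2013-05-01 09:10", "2013-05-01 09:12",
    "2013-05-01 13:30",
    "2013-05-01 23:01", "2013-05-01 23:02", "2013-05-01 23:03",
    "2013-05-01 23:04", "2013-05-01 23:05",
]

def entries_per_hour(timestamps):
    """Bucket timeline entries by hour so spikes of activity stand out."""
    buckets = Counter()
    for ts in timestamps:
        hour = datetime.strptime(ts, "%Y-%m-%d %H:%M").strftime("%Y-%m-%d %H:00")
        buckets[hour] += 1
    return buckets

histogram = entries_per_hour(timestamps)
for hour, count in sorted(histogram.items()):
    print(f"{hour}  {'#' * count}")
```

Even this crude text bar chart makes a surge of activity at 23:00 immediately visible, which is the kind of visual cue that also helps non-technical readers.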
Most of the ideas proposed by Guðjónsson for the simplification of the output of ‘log2timeline’ are directly applicable in an automatic analysis system.

2.4 Analysis

As part of his model, Marrington (2011) proposed “Four phases of analysis”, which fit with his event log focus. They are discovery, content categorisation and extraction, relationship extraction, and event correlation, and they pose an excellent model for the processing of data into more manageable forms (Marrington et al. 2011). The last of the four phases, “event correlation”, is required to link all the discovered data and relationships to the event log entries; a system not focussing on log entries could leave out this phase, an example being one using the output of Guðjónsson's (2010) ‘Log2Timeline’ software. The ‘Log2Timeline’ software is quite impressive in the breadth of metadata and logs it is able to extract. As mentioned above, Marrington's system extracted file system, metadata and log information from the most important places, but this pales into insignificance compared to the list of places that ‘Log2Timeline’ extracts file system, metadata and log data from. ‘Log2Timeline’ extracts its information from a wide array of files and metadata and provides a vast amount of data, but fortunately it is in a reasonably standardised format, which eases analysis. Super timelines make it a lot easier to visualise activity on the system at any point in time. Temporal analysis is the analysis of events around a specific time. So, for a malware infection, temporal analysis of the timeline at the time of infection can provide us with the point of infection, all malware-related files and where they are stored, the malware's method of start-up, and any other changes the malware may have made.
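Temporal analysis of this kind amounts to filtering the timeline down to a window around the time of interest, which can be sketched as follows (the entries and field layout are hypothetical, loosely modelled on super-timeline rows):

```python
from datetime import datetime, timedelta

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")

# Hypothetical super-timeline entries: (timestamp, source, description).
timeline = [
    (parse("2013-05-02 14:01:10"), "WEBHIST", "visit to compromised site"),
    (parse("2013-05-02 14:01:55"), "FILE", "created C:/Windows/Temp/svch0st.exe"),
    (parse("2013-05-02 14:02:30"), "REG", "Run key updated"),
    (parse("2013-05-03 09:00:00"), "FILE", "unrelated document opened"),
]

def temporal_window(timeline, centre, minutes=5):
    """Return all timeline entries within +/- `minutes` of the time of interest."""
    delta = timedelta(minutes=minutes)
    return [entry for entry in timeline if abs(entry[0] - centre) <= delta]

# Everything within five minutes of the suspected infection time.
hits = temporal_window(timeline, parse("2013-05-02 14:02:00"))
```

With the window centred on the suspected infection time, the unrelated next-day entry falls away and the infection vector, dropped file and persistence change remain.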
The timeline can be analysed for spikes of activity, which can often indicate events of interest; some examples are file copying, deletion of many files, antivirus scans, and software installation. The log2timeline software does not do much in the way of analysis itself, but it does collect and parse information from many areas and presents it in a simple format ready for analysis. It would be interesting to see the results if Marrington were to combine the information that Log2Timeline extracts with their analysis system, and see the additional level of detail and relationships that their report tool would be able to produce. Peisert and Bishop discuss modelling system logs with regard to the detection of actions by intruders. Their paper is mostly applicable to the implementation of forensic readiness, with the detection of suspicious incidents, rather than to digital forensic analysis. They do, however, discuss an analysis model which they label “Requires/Provides”, involving the concept of “capabilities” which are needed to reach a goal, as well as the capabilities then provided by reaching that goal (Peisert & Bishop 2007b). With regard to forensic analysis this can be used, for example, to detect that software has been installed or used which provides capabilities that do not fit with the types of activity expected from the user, such as the use of peer-to-peer software like FrostWire, or even visiting certain types of websites. Carrier and Spafford used an automated analysis technique of looking for outlier files by discovering and classifying normal activity (Carrier & Spafford 2005). Outlier files or activity are files or activity which are not normal, and they can be detected once rules have been created which whitelist normal activity. An example of this is the discovery of executable files in the C:\Windows\Fonts directory, which is abnormal and should be detected as outlier activity.
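The outlier-file check in this example can be sketched as a simple whitelist rule. The extension whitelist and the file listing below are hypothetical; a real implementation would walk the parsed file system.

```python
# Hypothetical file listing taken from a parsed file system.
files = [
    r"C:\Windows\Fonts\arial.ttf",
    r"C:\Windows\Fonts\update.exe",   # an executable where only fonts belong
    r"C:\Windows\System32\cmd.exe",
]

# Extensions considered normal for the Fonts directory (assumed whitelist).
FONTS_DIR = r"c:\windows\fonts"
NORMAL_FONT_EXTENSIONS = {".ttf", ".otf", ".fon", ".ttc"}

def fonts_dir_outliers(files):
    """Flag files in the Fonts directory whose extension is not whitelisted."""
    outliers = []
    for path in files:
        lower = path.lower()
        if lower.startswith(FONTS_DIR + "\\"):
            ext = "." + lower.rsplit(".", 1)[-1]
            if ext not in NORMAL_FONT_EXTENSIONS:
                outliers.append(path)
    return outliers
```

The rule encodes “normal activity” (fonts in the Fonts directory) once, and everything that fails the whitelist is surfaced as an outlier for the analyst.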
Categorisation of outlier files can be helpful for automatic analysis, yet it depends on a good rule set of blacklisted and whitelisted files. This is also directly applicable to the analysis of the registry with regard to the detection of outlier “autoruns”. Autoruns is a term referring to programs automatically run at system startup, a mechanism commonly used by unwanted programs like malware and viruses.

Chapter Three - Methodology

3. Methodology

The research methodology will be similar to the agile software development model, working iteratively with analysis, development, implementation and evaluation as distinct stages. It will involve examining the data which the system collects after certain suspicious actions are performed. Windows XP will be the operating system of choice, as it is very well understood by forensic tools and provides a wealth of data for parsing. The methodology to address each of the sub-problems is proposed below.

Sub-Problem 1 – What are the existing tools for extracting relevant information from evidence, and what is the quality of the information they extract? Comparison will be made between the Log2timeline tool, the Plaso tool and various specific tools for the extraction of file metadata. The use of the Plaso and log2timeline tools in investigations is not as common as it could be. The integration of these tools with an automated analysis system will help reduce many of the issues arising from the many different tools for the extraction of metadata. These tools will be compared and the one most suitable for integration with an automated analysis system will be used.

Sub-Problem 2 – What solutions are there for parsing the many undocumented file and metadata formats which are yet to be discovered and documented but could contain information of interest? Microsoft and Apple will both continue making new operating systems with new file systems, file types and metadata to be extracted.
Collaboration on the ‘log2timeline’ and Plaso tools, by adding the new file types and metadata, will help ensure that new file types and metadata can be parsed and understood by forensic tools. Forensic analysis systems will need to keep up to date and have new rules to deal with the new metadata.

Sub-Problem 3 – How can a low false-positive and false-negative detection rate be ensured while keeping a high detection rate of relevant information? Research into text-based analysis systems, rule-based analysis systems like Snort, and statistical analysis systems, with comparisons made to discover which methods provide a more reliable analysis system with higher detection rates and more relevant data. Testing will determine which rules work better, with comparisons to find the most reliable methods.

Sub-Problem 4 – What approach can be used to enhance digital forensic analysis with automation? Comparison with similar tools such as Bayesian spam filters, and primarily with Snort and its rule-based system, as well as examining and comparing different analysis approaches such as statistics, Markov chains, natural language processing and rule-based systems, to find an approach for automating analysis.

3.1 Focus of research

Guðjónsson, discussing future work in the computer timeline creation area, states that “The need for a tool that can assist with the analysis after the creation of the super timeline is becoming a vital part for this project to properly succeed” (Guðjónsson 2010), and the focus of this research will be to work towards assisting with analysis. The concept for the implementation of such a system is based on the open-source network Intrusion Detection System (IDS) tool called Snort. This is a tool which reads a standardised data source (in Snort's case, network packets), analyses it using user-created rules, and creates alerts for packets of interest.
Statistics, as well as profiling each user account, will help the investigator to get an idea of the activity of each user on a computer, which when combined will provide a profile of the computer itself. A rule-based system like Snort could use the output of the ‘log2timeline’ program as a standardised source of computer activity information for analysis. With a flexible enough rule language, investigators will be able to easily write new rules to detect new and unforeseen activity.

CHAPTER Four – Computer profiling

4. Introduction

The idea of computer profiling is to provide the investigator with a report which describes the computer and outlines user activity. This should give them a rough idea of who has used the computer, some of the activity that has occurred, and when it occurred. This can save investigators the time they would need to do the profiling themselves, especially as some of the profiling that can be done automatically, such as statistics, is very hard to do manually. Good profiling data can help the investigator to quickly direct and focus their investigation on the relevant areas. Computer evidence presents a “quantity and complexity” problem for the investigator, especially as computer storage and complexity are increasing, so computer profiling can help filter out less relevant information and focus attention on the specifics the investigator needs (Marrington 2009). With all analysis, compromises need to be made: some information is much more complicated to extract than other information, and some is of lower priority. A choice needs to be made between what is more useful and what is harder to extract, to obtain a useful balance. As there is a lot of detail extracted by Log2Timeline and Plaso, this information can be processed to provide computer and user profile information.
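As a small illustration of the kind of user profile statistic such processing could produce, counting visits per internet history domain might look like the following sketch (the URLs are hypothetical; a real implementation would read the parsed browser history):

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical exported internet history URLs for one user account.
urls = [
    "http://www.facebook.com/profile",
    "http://www.facebook.com/photos",
    "http://www.facebook.com/home",
    "http://intranet.example.com/timesheet",
    "http://www.news.example.org/today",
]

def top_domains(urls, n=3):
    """Count visits per domain to produce a 'top viewed websites' statistic."""
    counts = Counter(urlparse(u).netloc for u in urls)
    return counts.most_common(n)
```

A few lines of counting turn a raw history export into a ranked list of the user's most visited sites, a statistic that would be tedious to compile by hand.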
The following lists some of the potential information that these reports could contain.

The “Quick Report” could contain the following:

System profile:
• Operating system details (Windows 7, Home Premium, 64-bit …)
• Hardware details (CPU details, RAM capacity, storage information, add-on cards)
• Users (list all users)
• Time zone
• Installed software

The “Medium Report” could contain the following (including the contents of the “Quick Report”):

System profile:
• Categorisation of installed software, with flagging of suspicious software.
• Listing of the ten largest files on the system (can be used to find TrueCrypt containers and large archives/videos/mailboxes).

User profile:
• Connected USB storage drives per user.
• Statistics of “user created” files.
• Top viewed websites.
• Recent website domains.
• Recently opened files.
• Most recently created, modified or accessed files in the user profile.
• Users' profile paths.
• Date range showing the first and last date that each user used this computer.

The “Exhaustive Report” could contain the following (including the contents of the “Medium Report”):

System profile:
• Activity spikes: find activity spikes for the system. These often show times of interest.

User profile:
• Categorising internet history to profile users' web activity and flag suspicious sites.
• Listing folders which only contain audio or video files (potential pornography). It is common for people with Child Abuse Material (CAM) to catalogue it in different folders, so detecting such behaviour is important. (It is also common for ordinary users to structure their data this way, but this still helps in CAM cases as it can point to potential areas of interest.)
• Statistics on top file authors (taken from file metadata). Many file types record the name of the user who edited them (especially Office documents).
• Listing recent internet searches.
• Listing logon and logoff dates for each user, potentially flagging logins at abnormal hours or days.
• Listing all occurrences of sequentially created or accessed files, as this could show file copying.
• From the NTFS USN Journal, finding all renamed files and flagging all occurrences where the new filename appears in accesses to removable devices. One potential anti-forensic method to try to avoid forensic analysis is to rename a file before copying it to, or opening it from, a removable device. An example is renaming the file “company customerlist.xls” to “ccl.xls”, a much more innocuous file name, and then copying that.
• Activity spikes: find activity spikes for each user. These often show times of interest.

4.1 Methods

Many forensic tools already have the ability to create basic computer profiles for evidence. EnCase 6 can create computer profile reports, and these contain a lot of useful information about the computer, its configuration and its users, although they do not provide any information regarding user activity. Its computer profile report mostly covers what is mentioned under the “Quick Report” in the section above. The program called “Forensic Scanner” written by Harlan Carvey (2011) goes further by doing some basic analysis for each user as well as detecting some signs of malware infection. Compared to the above lists, it would be like creating a “Quick Report” with some information from the “Medium Report”. The normal way to extract much of the data mentioned under the “Medium Report” and “Exhaustive Report” is by manual manipulation of extracted information. For example, the internet history domain list can be exported, and with an easily written script (in this case a simple Python script) the domains can be counted and a list of the most visited domains and their counts created for analysis. The issue with this is that it is quite a manual process and needs to be repeated for each new case.

CHAPTER FIVE – Log2timeline and Plaso

5.
Introduction

Kristinn Guðjónsson wrote his 2010 paper regarding the extraction of metadata and file system data. This paper culminated in the development of his log2timeline tool, which extracts data from different sources and types and combines them to provide a timeline of system activity (Guðjónsson 2010). Log2timeline development continued, with further file parsers being added, up to version 0.65, by which stage development had moved to the Plaso tool. Plaso is written in Python, whereas ‘Log2Timeline’ was written in Perl. Manual extraction of metadata has been the traditional method for analysts. One of the issues with this is that not all tools are the same and some tools miss a lot of information. Analysts often need to keep current with the state of metadata extraction for different file types, because some programs are unable to extract all the possible information and may only extract part of it. The concept of creating timelines by combining many different parsed metadata logs together was not widely used, partly because of the amount of effort required to combine different types of data. The ‘log2timeline’ and ‘Plaso’ tools removed a lot of that complexity. If an investigator were to ignore these two programs and their ability to extract metadata for a timeline, they would have to extract the data manually using many different parsers and then combine the output of these different tools into one format. This is not simple, as some tools produce different columns of information which need to be manipulated before they can be combined. This was done as a test to compare manual methods with automated ones, using EnCase. EnCase 6 was used with its link file, internet history and event log parsers, as well as file system information, and the results were then combined using Excel. Combining the different tables of information from the different parsers was quite a time consuming job.
The link file parser output and the file system information each contained at least three different columns of date/time information which all needed to be added to the timeline, and the parsed event logs and link files both contained multiple columns of information which needed combining. Log2Timeline makes what would be an exhausting manual job a relatively simple processing job. From having to use many different tools (commercial, free and open-source) for this process, it has been reduced to the use of one tool. When it came out it was a ground-breaking tool for the creation of timelines; there was at the time no commercial tool which could do this, and the same is true today. ‘Plaso’ is a newer version of this tool which has been rewritten in Python to be easier to extend with third-party-developed Python code because of its open architecture, but it has not yet reached feature parity with log2timeline in the number of different metadata file types parsed. (Appendix B contains a full list of file types that Log2Timeline parses.)

5.1 Comparison

The following table contains a quick comparison between Log2Timeline and Plaso.

Attribute                            Log2Timeline   Plaso
Maturity                             Mature         Not mature yet
Metadata formats parsed              Many           Medium (most important are covered)
NTFS Volume Shadow Snapshots (VSS)   Unable         Able to parse
Source – Folder                      Able           Able
Source – Disk image                  Unable         Able
Development                          Medium         Easier; Python is simpler to develop with

Table 1 – Tool comparison

‘Plaso’ currently cannot parse all the different file formats that L2T (Log2Timeline) could, but it can parse the most useful formats. Its ability to access NTFS Volume Shadow Snapshots (VSS) and disk images directly is a big jump in capabilities over L2T. It is a lot easier to develop for, and the developers are looking at addressing the missing parsers. The ‘Plaso’ developer website has a Google Docs spreadsheet which contains a list of currently implemented features and wanted features.
Users can collaborate on existing features and propose new features, including file format parsers. This ability to easily collaborate will help resolve issues with unknown file formats. Unknown file formats, their reverse engineering and their documentation will still be a problem, but having a program (Plaso) with an easily accessible collaborative development system will make it a lot easier to implement new parsers for newly discovered file formats. There are no tools which can automatically parse and decode unknown file and metadata formats; the digital forensics community, through collaboration and data sharing, can help ensure that new formats can be parsed.

CHAPTER SIX – Analysis of different analysis systems

6. Introduction

Before examining what software and algorithms can be used for analysis, it is best to look first at what types of analysis may potentially be used. Once the types of analysis have been examined, different analysis systems can then be examined for suitability. What are some common types of digital forensic analysis? The following are the most common (Carrier 2003; Davis 2012; Turner 2006).

6.1 Incident Response / Malware (Malicious Software)

The purpose here is to look for unauthorised software on the computer. Malware includes viruses, worms, rootkits, Trojan horses and key loggers. Analysis of malware involves looking for the malware's initial entry point (infection vector), its propagation mechanism, any artifacts left on the system, and its persistence mechanism (Baker, Hutton & Hylender 2011; Carvey 2013).

6.2 Intellectual Property theft (IP Theft)

The purpose here is to find all possible methods for copying IP and look for any signs that this has occurred.
Common areas for analysis here are emails with attached files sent to the staff member's personal email address, and removable storage devices which were connected on the staff member's last days of work and on which access to work-related files is found (Australian Institute of Criminology 2008; Goodman 2001). This will be expanded on further in Chapter 7, showing the implementation of scripts using Plaso to help with IP theft investigations.

6.3 Access to Child Abuse Material (CAM)

The purpose here is to look for collections of media files (such as videos and photos), find signs of large container files, find file copying methods, look for suspicious internet browsing activity, and look for the use of evidence-hiding software (Watt 2012). Excluded from the list are more specific types of analysis. Specific analysis commonly involves proving that something did or did not occur and who was using the computer at that time; an example is confirming when the computer was used, and by whom, in order to confirm someone's alibi. Electronic discovery is also not listed, as it does not involve much real analysis (mostly just searching and exporting relevant search hits).

6.4 Snort

Snort is often seen as being both a signature-based program and a rules-based program (Eckmann 2001). As Snort has a rules-based system which can detect suspicious activity, as well as the flexibility to support the creation of new rules, it can stay current with the detection of threats on the network. So, for forensic analysis, a rules-based system which also supports the sharing and creation of new rules will help the analysis process and endeavour to keep the analysis system current (Aickelin, Twycross & Hesketh-Roberts 2008; Roesch 1999). There is the example of the ‘RegRipper’ program by Harlan Carvey (Guðjónsson 2010; Stevens 2006).
The continual addition of new analysis plug-ins, and the updating of existing ones, has enabled it, for example, to add parsing of registry shell bag artefacts (Carvey 2012). Yet all the discussion and rules regarding Snort related to matching one packet only; there is no mention of linking together information from various packets for a match, a concept which would be useful for digital forensics. An example of this is finding link files pointing to removable devices near the time a USB storage device was inserted, which suggests that the files the link files point to came from that USB device. Further research into Snort confirmed that Snort matches per packet only, with no rules able to reference prior packets. Being able to examine the state of multiple packets in one rule would be very useful. The Prelude IDS (Intrusion Detection System) has rules written in Python which integrate directly and have full access to the program and Python syntax, making it very flexible for rule writing (Zaraska 2003). A flexible rule-based analysis system like Snort, but able to reference other entries at once, would be very useful for digital forensic analysis.

6.5 Markov chain analysis methods

Security Information and Event Management (SIEM) systems are used for the analysis of log files and have provided much useful information regarding log file analysis and correlation (Swift 2006). When it comes to analysing log files, “Markov chain” analysis methods, more specifically the Hidden Markov Model (HMM), have the potential to be quite useful for the detection of abnormal entries as well as the filtering out of known-good entries. With HMM, known-good log files are analysed to create a baseline of normal activity, and this information can then be used in the analysis of evidence log files to look for anomalies (Peng, Li & Ma 2005).
HMM is a system where probabilities are used to calculate the expected next log entry, and major deviations from the norm can be flagged (García-Teodoro et al. 2009). This has the potential to detect rare activity, such as the signs of a failing hard drive or the artefacts of a malware infection. Research suggests that HMM can be effective but requires considerable computing resources (Peisert & Bishop 2007a). A modified Bayes algorithm has been used and compared to HMM, with similar behaviour (Peng, Li & Ma 2005). Principal Component Analysis (PCA) has been proposed as an algorithm which is simpler to implement as well as more efficient to process than HMM (Xu et al. 2009). These analysis methods could be quite useful for examining log files, but are not expected to be of much use for file system or registry data.

6.6 Further different analysis methods

There are some other analysis techniques worth examining for digital forensic analysis.

White/black listing: a simple alternative to HMM and PCA is to use white/black listing. There are a lot of known bad entries and keywords that could be used (Simpson et al. 2011). Blacklists can be used to flag known suspicious filenames, internet traffic and applications for immediate analysis. Child Abuse Material could possibly be detected by checking all of ‘Plaso's’ timeline information for the usual keywords. Expanding on this, “specific” keywords for different types of crimes or activity could also be used to quickly flag different types of behaviour, for example finding drug-related or pornographic material. Whitelisting and blacklisting of known services and programs which are run at startup would remove known entries, flag known-bad entries, and reveal unknown entries which could be potential areas of analysis in a malware case (Wong 2007).
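A blacklist keyword scan over timeline entries of the kind just described can be sketched as follows (the keyword list and entries are hypothetical; a real scan would run over full Plaso output):

```python
# Hypothetical blacklist of keywords considered suspicious in timeline entries.
BLACKLIST = {"frostwire", "truecrypt", "eraser", "ccleaner"}

# Hypothetical timeline entries: (timestamp, description).
timeline = [
    ("2013-05-01 10:00", "Program Files/OpenOffice/soffice.exe executed"),
    ("2013-05-01 11:30", "FrostWire setup.exe created in Downloads"),
    ("2013-05-01 11:45", "visit to http://www.frostwire.com/download"),
]

def blacklist_hits(timeline, blacklist):
    """Flag any timeline entry whose description contains a blacklisted keyword."""
    hits = []
    for ts, description in timeline:
        lowered = description.lower()
        matched = [word for word in blacklist if word in lowered]
        if matched:
            hits.append((ts, description, matched))
    return hits
```

Because the blacklist is just data, investigators can extend it per case type (for example, drug-related or CAM-related keyword sets) without changing the scanning code.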
The system and network service accounts under Windows cannot be logged into by ordinary users, so any "user-like" activity, such as internet history or link files found under these user profiles, is activity that should be detected by a blacklist. This is a simple analysis method which would quickly help detect known bad as well as unknown entries, which can then be further analysed.

6.7 Statistics

Statistics can be used to find common activity; an example of this is finding that a staff member's most accessed website on their work computer is facebook.com, which suggests that they do not do much work. Statistics can thus show the things a user commonly accessed or used. Statistics can also show which user account was used the most and the time periods in which it was used. They can be used to detect instances of sequentially accessed or created files, which is common when files are copied and could point to intellectual property theft; this would appear as sudden spikes of activity in the last accessed or created date/times of files on the file system. Most users use their computer over a certain time frame; for office computers this is usually business hours, so the detection of activity outside these times might signify a malware infection or other suspicious activity (Guðjónsson 2010).

Rules based analysis: The flexibility of rule-based systems has already been examined in the discussion of the Snort IDS. Rules could also be used to tag entries, and those tags could in turn be used in other rules (Garfinkel 2009; Marrington et al. 2011).

Some example rules:

Example rule 1: if an application of category "wiping tool" exists AND the registry MRU is empty, then raise the alert "possible evidence cleaning has occurred".

Example rule 2: raise an alert when more than 2 password failures occur within two minutes.

Example rule 3: raise an alert when more than 10 file system create times fall within two minutes (could detect copying).
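Threshold rules like examples 2 and 3 can be sketched as a sliding window over event timestamps. The two-minute window and the counts follow the examples above, while the event representation is an illustrative assumption:

```python
from datetime import datetime, timedelta

def threshold_alerts(timestamps, limit, window=timedelta(minutes=2)):
    """Return the times at which more than `limit` events fall inside `window`."""
    events = sorted(timestamps)
    alerts = []
    start = 0
    for end, ts in enumerate(events):
        # Shrink the window from the left until it spans at most `window`.
        while ts - events[start] > window:
            start += 1
        if end - start + 1 > limit and ts not in alerts:
            alerts.append(ts)
    return alerts

# Hypothetical file-creation burst: 11 files created seconds apart.
base = datetime(2013, 10, 27, 20, 28)
burst = [base + timedelta(seconds=i) for i in range(11)]
print(threshold_alerts(burst, limit=10))  # alerts at the 11th creation time
```

The same function covers both example rules by varying `limit` and the event source (failed-logon events versus file-creation times).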
A potential issue with rules is the flexibility of the rule language. The Prelude Intrusion Detection System, with its rules written in Python, demonstrates how much flexibility a full programming language can provide (Zaraska 2003). This concept would be directly applicable to creating rules for ‘Plaso’ as well. Some examination was made of model-based diagnosis, which found that such approaches are considerably more complicated, as they attempt to model the whole system, and can be very computationally intensive (Elsaesser & Tanner 2001).

CHAPTER SEVEN – Tests and implementations

7. Introduction

For automated Digital Forensic analysis to occur there are a few prerequisites. An easily available source of data to analyse is important; ‘Plaso’ has the ability to extract and provide this data as well as to serve as a platform for the development of automatic analysis. This chapter examines how ‘Plaso’ can be utilised to provide automated Digital Forensic analysis. The long-term plan is to contribute the results of these findings back into ‘Plaso’. This was implemented using version 1.0.2 of ‘Plaso’ running under Debian. Tests were performed to understand the behaviour of ‘Plaso’ and to find limitations which might need to be worked around. One initial limitation was that file sizes were not gathered; the developers were helpful and added that feature. Testing and analysis were originally only going to involve Windows XP, but after preliminary tests with Windows 7 found some interesting artefacts it was decided that Windows 7 was worth including in the tests. For the tests which were performed, manual analysis methods and results were compared with the results gained by running analysis scripts over the metadata gathered by ‘Plaso’, to make sure the results of both types of analysis were the same.
The approach used was to first use Plaso tools such as psort.py and pinfo.py to test the viability of the proposed filtering and analysis methods, and then to implement them in Python. This is demonstrated in the IP theft section of this chapter, which is also described in chapter 6. The final intention is to contribute the Python code back into the Plaso codebase to add analysis abilities to the Plaso tool.

7.1 Profiling

When ‘Plaso’ is used to process a computer it collects an initial profile of the computer to help with processing. This data is very helpful for forensic analysis because it provides essential details regarding the computer's configuration, saving the investigator from having to find this information out themselves.

The following text box contains example profile information gathered by ‘Plaso's’ pinfo.py tool at the time the evidence is processed and metadata extracted. ‘Plaso’ gathers this information to help it understand the environment and decide what metadata needs to be collected. In this case it can be seen that the operating system is Windows XP, so ‘Plaso’ will not try to extract Apple OSX or Linux specific metadata.

windir = //WINDOWS
hostname = XP
users = [
  { u'name': u'systemprofile', u'path': u'%systemroot%\\system32\\config\\systemprofile', u'sid': u'S-1-5-18'},
  { u'name': u'LocalService', u'path': u'%SystemDrive%\\Documents and Settings\\LocalService', u'sid': u'S-1-5-19'},
  { u'name': u'NetworkService', u'path': u'%SystemDrive%\\Documents and Settings\\NetworkService', u'sid': u'S-1-5-20'},
  { u'name': u'user', u'path': u'%SystemDrive%\\Documents and Settings\\user', u'sid': u'S-1-5-21-1957994488-1409082233-839522115-1003'}]
zone = UTC
time_zone_str = AUS Eastern Standard Time
guessed_os = Windows
sysregistry = //WINDOWS/system32/config
systemroot = //WINDOWS/system32
osversion = Microsoft Windows XP
store_range = (1L, 1L)
code_page = cp1252

Figure 1 – Plaso pinfo.py Windows XP computer profile information.
The computer profile data gathered by ‘Plaso’ provides a foundation for the development of additional analysis abilities.

With normal computer forensic analysis, usually the most that could be expected for a computer profile was what Encase 6's Initialise Case would provide. It gave details about the computer hardware, the software installed, the operating system and a list of users, but the information was not easy to work with (especially the hardware and software sections) and did not provide any detail regarding user activity. The software and hardware sections were not that useful, as the way they were presented made them too verbose, and excessive white space made them hard to read. By contrast, the profile information above is directly accessible to Python scripts which use Plaso to analyse its data store files (these contain the extracted metadata). The profile generated by ‘Plaso’ contains the most essential information that would have been contained in an Encase 6 Initialise Case report. With the data already extracted by ‘Plaso’, and with the right analysis rules, something with similar content to the medium report referred to in the computer profiling section could quite easily be created. Simple rules with ‘Plaso’ were created to extract USB history (Wong 2007), "UserAssist" user activity reports (Stevens 2010), internet history, parsed link files and largest-file reports; some of this information can be seen below.

7.2 Large Files

Finding and examining the largest files in the evidence is a common step at the beginning of analysis, as it may help find container files (such as .zip, .rar and .7z archives), email mailbox files (for example, .PST or .OST Outlook mailbox files), large video files (which could be copies of downloaded movies) or even encrypted volumes. For example, Truecrypt encrypted volumes are usually very large files, so they can often be detected by their size.
As ‘Plaso’ has a field containing each file's size, it is quite easy to iterate over the files in the evidence looking for the largest ones. With Encase the process is to show all files and then sort by size; this is a manual process in Encase, whereas with ‘Plaso’ it can be automated. For example, the following large files were found using Plaso.

For Windows 7:

Size (Bytes)    Filename
20,971,520,000  /temp/disk_image.raw
5,883,215,872   /pagefile.sys
4,412,411,904   /hiberfil.sys
4,193,572,720   /Windows/RE_DRIVE/recoverycd_iso2/OSImg2.swm
4,185,625,598   /Windows/RE_DRIVE/RECOVERYCD_ISO/RECOVERY_DVD/OSImg.swm
3,295,094,784   /temp/BT5R3-GNOME-32.iso
2,811,326,464   /temp/BT5R2-KDE-64.iso
1,556,324,352   /System Volume Information/{15d3b509-a95e-11e2-9343-6c626d311a8a}{3808876b-c176-4e48-b7ae-04046e6cc752}
1,301,618,688   /System Volume Information/{a62ac45c-ae52-11e2-87e3-e0b9a5aa25e5}{3808876b-c176-4e48-b7ae-04046e6cc752}
801,424,592     /ProgramData/Microsoft/Application Virtualization Client/SoftGrid Client/sftfs.fsd

Table 2 – Windows 7 top 10 large files

A quick examination of Table 2 finds that there is a 20GB file in the temp folder called "disk_image.raw". If this were a Child Abuse Material (CAM) case then this file might be an encrypted volume, or a virtual machine disk image file needing examination. The files under "/System Volume Information/" show that there are Volume Shadow snapshots on the disk, and in the above list there is at least 2.8GB of changes in the snapshots (which are potentially deleted files). The .ISO DVD-ROM image files "BT5R3-GNOME-32.iso" and "BT5R2-KDE-64.iso", based on their filenames, most likely contain the ‘BackTrack’ penetration testing distribution, which contains many hacking tools.
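A report like the one above can be produced with a few lines of Python. This hypothetical sketch assumes the file events have already been read out of the Plaso store as simple (size, path) pairs:

```python
import heapq

def largest_files(file_events, n=10):
    """Return the n largest files as (size, path) pairs, biggest first.

    file_events is an iterable of (size_in_bytes, path) pairs, e.g. built
    by iterating over file system metadata events in a Plaso store.
    """
    return heapq.nlargest(n, file_events, key=lambda event: event[0])

# Hypothetical file listing.
events = [
    (20_971_520_000, "/temp/disk_image.raw"),
    (4_412_411_904, "/hiberfil.sys"),
    (21_491, "/Documents and Settings/user/Recent/grml-cheatcodes.txt.lnk"),
]
for size, path in largest_files(events, n=2):
    print(f"{size:>15,}  {path}")
```

Using heapq.nlargest keeps memory use small even when the evidence contains millions of file entries, since the full list never needs to be sorted.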
For Windows XP:

Size (Bytes)   Filename
1,610,612,736  /pagefile.sys
331,805,736    /share/Service pack3/WindowsXP-KB936929-SP3-x86-ENU.exe
131,170,400    /share/sp1a/xpsp1a_en_x86.exe
76,699,621     /WINDOWS/Driver Cache/i386/driver.cab
67,108,864     /$LogFile
24,412,160     /$MFT
20,056,462     /WINDOWS/ServicePackFiles/i386/sp3.cab
20,056,462     /WINDOWS/Driver Cache/i386/sp3.cab
16,258,580     /WINDOWS/Fonts/batang.ttc
14,688,256     /WINDOWS/ime/IMJP8_1/DICTS/imjpst.dic

Table 3 – Windows XP top 10 large files

Here it can be seen that there are no user-created files larger than 14MB, which means the user does not have an Outlook mailbox or any other large files. The sp3.cab files suggest that Windows XP Service Pack 3 has been installed; further proof is that the file listing also contains Service Pack 3's installation file (WindowsXP-KB936929-SP3-x86-ENU.exe).

7.3 Most visited websites

Reporting on the most visited websites can provide a profile of user activity on the computer. For our Windows XP virtual machine there is not much in the way of activity.

Count  Item
376    support.microsoft.com
71     runonce.msn.com
48     clients1.google.com.au
45     www.google.com.au
35     www.microsoft.com
24     windowsupdate.microsoft.com
22     www.ninemsn.com.au
21     asset.9msn.com.au
18     s3.buysellads.com
18     html5test.com

Table 4 – Windows XP top 10 web domains

Count  Item
677    www.google.com.au
336    www.facebook.com
80     mail.google.com
58     bits.wikimedia.org
54     code.google.com
52     en.wikipedia.org
46     webmail.mycompany.com.au
41     www.mozilla.com
31     e5.onthehub.com
29     www.socketmobile.com

Table 5 – Windows 7 top 10 web domains

Comparison between these two tables gives an idea of how much activity occurred on each computer, as well as what kind of activity. It can be seen that on the Windows 7 computer there was more traffic to social media (Facebook), cloud email (Gmail) and the work webmail page, which might be indicative of how the user spends their time.
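A sketch of how such a domain report can be generated. The URL list is illustrative, and the normalisation shown (lower-casing and extracting the host name) is the kind of clean-up that the raw history data required before counting:

```python
from collections import Counter
from urllib.parse import urlparse

def top_domains(urls, n=10):
    """Count visits per host name, most visited first."""
    hosts = []
    for url in urls:
        host = urlparse(url.strip().lower()).netloc
        if host:  # skip malformed or schemeless entries
            hosts.append(host)
    return Counter(hosts).most_common(n)

# Hypothetical browser-history URLs.
history = [
    "http://www.google.com.au/search?q=plaso",
    "http://www.facebook.com/",
    "HTTP://WWW.GOOGLE.COM.AU/",
]
print(top_domains(history))  # google.com.au counted twice despite the casing
```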
The data extracted from Plaso here was not immediately usable for creating statistics; cleaning up of the URLs was required first, as can be seen in Appendix D.

7.4 User profile registry dates and times

The dates and times of a user profile's registry files reveal how long the user profile has been in use. For more specific detail, examination of the event logs for logon and logoff times will provide further data (depending, of course, on whether this logging has been enabled).

Datetime                   Timestamp type  Message
2012-01-16 00:51:19 (UTC)  Create time     C:/Documents and Settings/user/NTUSER.DAT
2013-10-27 09:45:44 (UTC)  Modify time     C:/Documents and Settings/user/NTUSER.DAT

Table 6 – Windows XP user profile data

The table displays the dates and times of the registry files for the user called 'user'. From this it can be seen that this user account was first used on 16/01/2012 and last used on 27/10/2013, excluding this user profile from any activity before or after this time period. As there are no other user profiles on this system (system user accounts are not included), a time period for the use of this computer can be established.

7.5 Analysis

As can be seen from the above examples, it is quite easy to extract and analyse information using ‘Plaso’ as the foundation for the activity. There are many other small but useful types of analysis which can be done that are not directly part of any particular overall analysis theme, but which can help build an idea of the activities of users or of the operating system. Some examples are:

• Statistics showing which user-creatable file types exist in the user profiles and the number of each; this shows the activity which has or has not occurred.
• Extracting the internet history, USB activity reports, link file report and internet history local file access information for manual analysis, as the investigator might require the additional detail these provide.
• Creating and exporting timelines around activity of interest to provide additional detail. A possible example is to create a timeline of the time when malware infected the system, or when a user was suspected of stealing company IP.

There are a few types of analysis commonly performed manually where automation can help.

7.6 Intellectual Property (IP) Theft

In IP theft cases, analysts look for methods by which data may have been exfiltrated from the computer. The most common method of copying company data is the use of USB removable drives, because they are portable and small, and these days have a lot of storage capacity. Someone could copy all of a company's IP onto a USB flash drive and walk straight out the front door without anyone noticing; in the days when information was on paper, this was much harder.

Some common forensic areas to examine to find whether this has happened are:

– Windows shortcut files (link files) pointing to removable drives
– USB device activity
– Windows internet history for file:// access to removable drives

The manual analysis method an analyst would commonly use here is to parse the link files with Encase 6, export the registry, parse it for USB history information with Woanware's USB Device Forensics tool, and then correlate link file access with USB device insertion, so that the client can be told which USB devices most likely hold their stolen information. The link files contain quite a bit of detail about the drive the files came from, such as the "Volume Label" and "Drive Serial number", which can be used to confirm which drive it was. With Plaso under Python it is simple to iterate through the dump file extracting all USB history information as well as link file activity information. It is then quite simple to look for USB drive activity at times when there was access to files on removable devices.
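A minimal sketch of this correlation, assuming the USB insertion events and link-file events have already been pulled from the Plaso dump file into simple (timestamp, description) tuples; the field names and the one-hour window are illustrative assumptions:

```python
from datetime import datetime, timedelta

def correlate_usb_linkfiles(usb_events, link_events, window=timedelta(hours=1)):
    """Pair each link-file access with USB insertions shortly before it."""
    matches = []
    for link_time, link_path in link_events:
        for usb_time, device in usb_events:
            # A match: the device was inserted within `window` before the access.
            if timedelta(0) <= link_time - usb_time <= window:
                matches.append((device, link_path, usb_time, link_time))
    return matches

# Hypothetical events matching the test scenario described below.
usb = [(datetime(2013, 10, 27, 19, 57, 22), "Lexar JD FireFly USB Device")]
links = [(datetime(2013, 10, 27, 20, 28, 46), "E:/tmp/grml-cheatcodes.txt")]
for device, path, inserted, accessed in correlate_usb_linkfiles(usb, links):
    print(f"{path} accessed at {accessed} on '{device}' (inserted {inserted})")
```

The quadratic loop is adequate for the handful of USB events a single machine typically yields; for larger data sets the events could be merged by sorted timestamp instead.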
The system stores information regarding the first time a USB drive was plugged in, and some information regarding the other times it was inserted. Device information is stored which describes the connected USB removable drive. In our test the USB flash drive was inserted at 19:57 and a text file was opened from it at 20:28. The information recovered from ‘Plaso’ corroborated this activity.

At 19:57:22 on 2013-10-27 the USB flash drive was connected. The following data had been extracted by ‘Plaso’ from the registry and found by the USB analysis script:

device_type: Disk
friendly_name: Lexar JD FireFly USB Device
parent_id_prefix: 8&107bbb1a&0
product: Prod_JD_FireFly
revision: Rev_1100
serial: 7&1fb3deb6&0&AAMA14G4OXX83PM1&0
subkey_name: Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100
vendor: Ven_Lexar

At 20:28:46 on 2013-10-27 a text file was opened from the USB flash drive, and the following information was discovered. The internet history showed that the following local file was accessed:

user@file:///E:/tmp/grml-cheatcodes.txt

A link file was created at C:/Documents and Settings/user/Recent/grml-cheatcodes.txt.lnk. This link file contained the following information about the USB flash drive:

File size: 21491
Drive serial number: 0x4e9b6351
Volume label: USBDISK
Local path: E:\tmp\grml-cheatcodes.txt
Working dir: E:\tmp

An examination of the ‘Plaso’ timeline at 20:28 found UserAssist information revealing that the program "C:\WINDOWS\system32\NOTEPAD.EXE" was executed at this time, which matches the above artefacts regarding a text file being opened from a USB flash drive. So if this file was important company intellectual property, the extracted information could be used to pinpoint when the access happened, which user account was active at the time and which device the file was accessed from. Further details can be found in Appendix C.

Using the Plaso tools, the above information can be extracted with command line tools.
First, all Windows shortcut (.lnk) entries for removable drives are exported from the Plaso dump file to a new Plaso dump file.

Image 1 – Export removable lnk information

Then all USB activity is exported to the same file.

Image 2 – Export USB information

Image 3 – List entries

The first two commands extract the information from the Plaso store file using two different search terms, placing the link file entries and USB activity into one dump file. Combining the different search terms into one dump file means that querying the dump file will provide combined results showing when USB devices were connected and which files were opened from them. A script called script_usb.py has been created to extract the USB device and link file information for analysis, producing all of the above information in one pass; the image below contains some example output.

Image 4 – Extract link file and USB storage device information as one report

The output can be sent to a text file and analysed in Excel. A report showing all connected USB storage devices can be obtained with the following command.

Image 5 – Display connected USB storage devices

This command can also be scripted in Python, as seen in the following screenshot.

Image 6 – Display connected USB storage devices (Python)

These scripts can be combined so that they all run in one pass, speeding up processing compared to using the psort.py command, where each query has to be run one at a time. As these scripts can be combined, they can create multiple different reports for the analyst at the same time. An example would be exporting separate reports for link files, internet history, USB devices and UserAssist activity, which would provide a good overview of activity on the computer.

7.7 Incident Response / Malware analysis

For this type of analysis the analyst is looking for things that do not belong, that is, software which has infected the computer.
Users do not intentionally install malware; it gets onto the machine without their consent and therefore tries to hide. It is nevertheless an abnormality, so comparing a normal computer to a malware-infected one can help find abnormal activity by removing "known good" activity. With malware, analysis focuses on finding the initial entry point, the propagation method, artefacts left on the system and the persistence mechanism (how it starts) (Baker, Hutton & Hylender 2011).

If the time of infection is known, then a timeline of events from around that time period can be extracted from Plaso for examination, to find exactly which files were affected and potentially how the malware came in. An example of this is a certain web page being visited, a Java program being downloaded and executed, and this in turn downloading the malware from another site.

This is where statistics can be useful, as they can point to spikes in activity which may be evidence of a malware infection, although with such statistics care must be taken to exclude normal activity, such as the system booting or Windows updates being installed, which also create spikes in activity.

Tests were performed on whitelisting and blacklisting malware persistence artefacts through examination of the registry run keys, the Windows startup folder contents and services, and this worked well in the basic tests that were performed. The next stage would be obtaining some malware and testing it in a virtual machine, making comparisons between before and after the infection. Training to create whitelists of known good services and startup applications would help improve the detection rate. Additional blacklisting areas added for detection are:

– any internet history under any of the computer's system user profiles (the systemprofile, LocalService and NetworkService accounts)
– any link files under the same folders
– suspicious software (evidence cleaners, hack tools, remote access tools, key loggers)

There should not be any signs of user activity under any of these profiles, as they are system accounts not used by users, so any detected user activity is suspicious. With Plaso it is possible to loop through the data, running blacklists and whitelists over the relevant areas looking for relevant entries.

[\Microsoft\Windows\CurrentVersion\Run] IMEKRMIG6.1: C:\WINDOWS\ime\imkr6_1\IMEKRMIG.EXE
[\Microsoft\Windows\CurrentVersion\Run] IMJPMIG8.1: C:\WINDOWS\IME\imjp8_1\IMJPMIG.EXE /Spoil /RemAdvDef /Migration32
[\Microsoft\Windows\CurrentVersion\Run] MSPY2002: C:\WINDOWS\System32\IME\PINTLGNT\ImScInst.exe /SYNC
[\Microsoft\Windows\CurrentVersion\Run] PHIME2002A: C:\WINDOWS\System32\IME\TINTLGNT\TINTSETP.EXE /IMEName
[\Microsoft\Windows\CurrentVersion\Run] PHIME2002ASync: C:\WINDOWS\System32\IME\TINTLGNT\TINTSETP.EXE /SYNC
[\Microsoft\Windows\CurrentVersion\Run] SchedulingAgent: mstinit.exe /firstlogon

Table 7 – Windows XP autorun entries

In this case most of the entries related to Microsoft's Input Method Editor (IME) software, so they were added to the whitelist. The last application, "mstinit.exe", was identified as the Microsoft Scheduling Agent and was whitelisted as well. It is important to note that much malware will adopt the name of a program which normally exists on the system as camouflage, so rules need to be specific enough to ignore the legitimate version and detect files of the same name in unusual places.

7.8 Time changing

The system time could be changed by either users or malware as an anti-forensic technique to evade detection of activity (Marrington et al. 2011; Willassen 2008; Guðjónsson 2010). As ‘Plaso’ stores the offsets into the event logs, time changing can be detected by iterating through event log entries and checking for the date and time moving backward.
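A sketch of that check, assuming event log records have been extracted as (record_number, timestamp) pairs, where the record number increments with each entry written; the data shown is illustrative:

```python
from datetime import datetime

def find_time_regressions(records):
    """Return record pairs where the clock moved backward.

    records is an iterable of (record_number, timestamp); record numbers
    increase in write order, so timestamps should never decrease.
    """
    ordered = sorted(records)  # sort by record number
    regressions = []
    for (num_a, ts_a), (num_b, ts_b) in zip(ordered, ordered[1:]):
        if ts_b < ts_a:
            regressions.append((num_b, ts_a, ts_b))
    return regressions

# Hypothetical event log records: the clock was wound back before record 3.
logs = [
    (1, datetime(2013, 10, 27, 10, 0)),
    (2, datetime(2013, 10, 27, 11, 0)),
    (3, datetime(2013, 10, 26, 9, 0)),
]
print(find_time_regressions(logs))
```

The same comparison applies to any incrementing artefact, such as MFT record numbers or Windows XP System Restore folder names, as discussed below.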
Comparison between the dates and sequence numbers of other parts of the system which increment (MFT records, the USN Journal, Windows XP System Restore folder names, UserAssist entries (Windows XP) and thumbs.db) can also detect whether the time went backward. Depending on logging settings, the Windows event log can record any attempt to change the date and time. This was tested manually in Encase 6 by parsing the event logs and examining them by hand. Plaso's dump file information regarding event logs was examined, and the offset information was found and compared.

7.9 Statistics

Statistics are a potential analysis area for digital forensics which would be quite time consuming for a person to generate manually but quite quick and easy for a computer. Some uses are the detection of the following behaviour:

– Activity spikes (potentially detecting malware installations)
– Sequential file accesses or creations, which can be used to detect files being copied; useful in IP theft cases
– Flagging rarely occurring events in the event logs (needs training, but can be used to detect abnormal behaviour)
– Providing statistics on the numbers of different file types (helps profile user activity)
– Using averages (mean/median), standard deviation and most/least common values to provide additional information for analysis

From a high level, the statistics engine could record the number of certain activities per day/week/month/year, based on raw entries as well as over specific areas, to provide further, more focused statistics. Some of these potential specific areas are: overall statistics for the computer, user profiles and the Windows/software folders. Having specific areas for statistics can help differentiate between spikes in user activity and spikes in operating system activity. This could potentially be used to discover normal computer usage times and to flag activity outside these hours.
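The spike detection idea can be sketched by bucketing event timestamps per day and flagging days well above the average. The two-standard-deviation cut-off and the sample data are illustrative assumptions:

```python
from collections import Counter
from datetime import date, datetime
from statistics import mean, stdev

def daily_spikes(timestamps, sigmas=2.0):
    """Return the days whose event counts exceed mean + sigmas * stdev."""
    per_day = Counter(ts.date() for ts in timestamps)
    counts = list(per_day.values())
    if len(counts) < 2:  # not enough days to establish a baseline
        return []
    cutoff = mean(counts) + sigmas * stdev(counts)
    return sorted(day for day, c in per_day.items() if c > cutoff)

# Hypothetical activity: one event per day, then a burst of 50 on one day.
quiet = [datetime(2013, 10, d, 12, 0) for d in range(1, 20)]
burst = [datetime(2013, 10, 20, 3, i) for i in range(50)]
print(daily_spikes(quiet + burst))  # the burst day stands out
```

As noted above, such a detector would still need normal spikes (boots, Windows updates) excluded, for example by whitelisting the event types counted.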
7.10 Rule based analysis

Many of the proposed analysis rules are already common in manual computer forensics, but automation allows them all to be run at once, rather than a human spending time performing each one in turn:

– user-created files outside of profile folders (abnormal behaviour)
– flagging archive and backup files in user folders (IP theft?)
– folders which contain only photos and videos (of interest in a CAM case)
– flagging cloud storage and file uploading software such as uTorrent, Frostwire, Dropbox and Mega, which is useful in IP theft cases
– flagging endpoint monitoring software, which could have useful logs for analysis
– flagging phone backup files and folders
– mass file deletions (recycle bin entries)

These rules can be quite useful in directing an analyst to relevant areas of interest, saving them the time needed to look around and manually gather this information.

7.11 Findings

It was demonstrated that ‘Plaso’ can be extended to perform automatic analysis and reduce the amount of manual analysis required of investigators. The Python programming language, using the ‘Plaso’ libraries, can be used to process extracted metadata and find relevant information. The power of an easy-to-use programming language like Python, combined with all the information that ‘Plaso’ extracts, provides a lot of potential for analysis. Implementing an analysis script which works together with ‘Plaso's’ extraction of system and file metadata is a good step towards helping analysts focus their valuable time on relevant areas, and provides information which can help them avoid performing irrelevant analysis. The ‘Plaso’ tool itself provides a good stepping stone for analysis, as it is able to extract a lot of information from the evidence.
This can be an issue, as it is possible to extract more information than is required; for the USB analysis, our rule extracted far too much information, with at least 10-15 artefacts all generated within a minute, mostly containing the same data. Moving forward, this information could be trimmed to select only the most relevant and useful records for presentation. It is possible that when examining the activity of one USB storage device, most of the artefacts will relate to the first and last times the device was connected to the computer, so care will need to be taken when limiting search results for USB devices so as not to miss artefacts from insertions between the first and last insertion times.

CHAPTER EIGHT - CONCLUSIONS AND FURTHER WORK

This thesis examines an overall research question regarding the use of automation for the improvement of digital forensic analysis. Different methods and concepts were examined and analysed, and some were tested for viability. Comparison between Log2timeline and ‘Plaso’ revealed that ‘Plaso’ has the greater potential for further growth and expandability with regard to adding abilities for the analysis of evidence. Examination of Snort found that even though Snort is quite a flexible tool, it was not directly applicable in this case. Its flexibility had merit, but as it is aimed at network packets and its rules are unable to analyse or match across multiple packets at once, it was not suitable for this type of analysis. Of the many different types of analysis that could be performed, the analysis types closer to real-world analysis were simpler to implement (in this case black/white lists and simple rules) and had lower false positive rates. Analysis systems based on statistics, such as PCA and Markov chains, require a good amount of training first to reduce the number of false-positive hits.
The OpenIOC (Indicators of Compromise) website set up by Mandiant holds a lot of potential as a useful source of information about real-world malware indicators which could be integrated into blacklists. Creating computer profile and user profile reports is quite beneficial and helps the analyst get a rough idea of activity on the computer. Of course, if more information is required from the report, then more time is needed for extraction and processing to generate it. The information parsed by Plaso was able to be processed to gather computer and user profile information as seen above, to perform rule-based analysis and to generate statistics. Tests were performed which confirmed that automated analysis using ‘Plaso’ as a foundation is very viable.

8.1 Research Conclusions

It is now possible to answer the original questions posed by the research:

Q. What are the existing tools for extracting relevant information from evidence, and what is the quality of the information they extract?

As already mentioned, there are many commercial, free and open-source tools for extracting information from evidence. These all differ in the amount, quality and detail of the information extracted. Open-source tools lend themselves to improvement, as the code is easily available, so long as there is good documentation and support. There is the example of Encase 6, where users had asked for HFS+ filesystem support for at least four years and it was not implemented; they were eventually encouraged to use Encase 7 (when it was released), which did have HFS+ support but had a terribly unintuitive user interface.

Q. What solutions are there for parsing the many undocumented file and metadata formats which are yet to be discovered and documented but could contain information of interest?

There are no tools that can automatically decode unknown file and metadata formats.
Discovering the formats of unknown files and extracting useful metadata is done either by reverse engineering or by using information from the developer of the format. Parsing the NTFS USN Journal, for example, was made possible by information provided by Microsoft. For undocumented formats, the Digital Forensics community can work together to make sure new formats are parsed into a usable form, by reverse engineering them together or by lobbying the developer of the format to provide documentation. The Log2timeline and Plaso tools are open-source, which means people can collaborate on their development and improvement. As they currently parse more metadata formats than other open-source tools, they provide a good foundation for further development. The Digital Forensics community should collaborate to ensure that Plaso can parse all known relevant file formats, and that when new discoveries are made Plaso is updated to handle the new formats; the Plaso developers maintain a spreadsheet of implemented and unimplemented formats, so users can easily monitor progress as well as help the project by contributing parsers. It is well known that proprietary commercial tools are slow to add newly documented formats for parsing, and open-source software has the advantage of being able to update more quickly.

Q. How can a low false-positive and false-negative detection rate be ensured while keeping a high detection rate of relevant information?

One method is to rate the output based on its perceived level of quality: results from the white and black lists are rated at "alert" level, while results based on algorithms (HMM, PCA, Bayes) are rated at a "suspicious" level until tests and training improve their results. This helps prevent alert fatigue, or "cry wolf" syndrome, where the analyst effectively becomes trained to ignore results because of the number of false positives.
Training the system with a large amount of known-good data to strengthen the whitelisting first will help minimise the probability of false positives. Q. What approach can be used to enhance digital forensic analysis with automation? Automation can be used to help with the extraction of metadata and also to perform many analysis tasks which are currently still performed manually by skilled forensic analysts. Using Python scripts in conjunction with Plaso and the extracted dump file, a lot of the usual manual analysis work can be quickly automated. Everything from using rules and statistics to process and analyse data, down to the simple extraction of parsed data into reports, saves the analyst from manually having to parse different sources of information. An example of this is extracting the USB storage history, Internet history and link file information from the Plaso dump file into separate reports for the investigator. 8.2 Areas for further study 8.2.1 Categories Categorisation could be considered, in a way, as a support for statistics. Categorising Internet history and software application types, and creating graphs showing the percentage of each type, can help with the computer profiling process and give the analyst an idea of activity on the computer. Combining the dates and times from all user-attributable actions can help map out exactly when the user normally uses the computer and enable the flagging of abnormal activity; for example, activity on a work computer after hours is suspicious when the usual activity on that computer is between 9am and 5pm. As another example, a graph showing that 40% of Internet use was work-related websites and 60% was social media quickly provides an overview of the user's browsing activity. Categorisation therefore has real potential to help the analyst.
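As an illustration of the categorisation idea, the following sketch buckets Internet history URLs into categories and reports the percentage of each. The category lists and sample URLs are invented for illustration; real input would be the URL report extracted from the Plaso dump file:

```python
# Hypothetical sketch: categorise browsing history and compute the
# percentage of activity per category, as described in section 8.2.1.
from collections import Counter
from urllib.parse import urlparse

# Illustrative category definitions (a real deployment would use much
# larger, curated domain lists).
CATEGORIES = {
    "social media": {"facebook.com", "twitter.com", "instagram.com"},
    "work": {"intranet.example.com", "mail.example.com"},
}

def categorise(url):
    """Return the category of a URL based on its host name."""
    host = urlparse(url).netloc.lower()
    for category, domains in CATEGORIES.items():
        if any(host == d or host.endswith("." + d) for d in domains):
            return category
    return "other"

def category_percentages(urls):
    """Percentage of history entries falling into each category."""
    counts = Counter(categorise(u) for u in urls)
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}

history = [
    "http://facebook.com/home",
    "http://intranet.example.com/wiki",
    "http://www.facebook.com/photos",
    "http://news.example.org/",
]
print(category_percentages(history))
```

The resulting percentages can be fed straight into a charting library to produce the overview graphs suggested above.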
8.2.2 Correlation between different data types An example of this is the correlation performed above between link files and registry entries for USB storage, linking the USB device information (serial number, manufacturer and model) with the link file information (filename, date/time accessed, drive volume name and drive serial number) so that the investigator can confirm which USB storage device a particular file was accessed from. This shows that correlation is already in use by investigators, albeit manually, which limits how much advanced correlation can be performed. There is a lot of potential for further research here, and the literature review showed that there has already been much work in related areas. Similar analysis is already being performed by SIEMs. SIEM systems collect many different types of logs (Windows and Unix server logs, network switch, router, firewall and network IDS (Intrusion Detection System) logs) and attempt to correlate related log entries together. Most papers on this subject attempt to correlate using statistics. The SIEM model is not a clean fit for computer analysis, as evidence contains both log-type evidence (each entry sequential and related by time) and filesystem-based evidence (each entry related by filesystem hierarchy as well as dates and times). There is potential for filesystem-based evidence to use some of the SIEM statistics-based analysis methods, but as these items are related by filesystem structure as well as by at least three timestamps, there is additional complexity for analysis. Log-based evidence can use the SIEM correlation models fully, but consideration is needed for how SIEM-like systems might analyse filesystem activity. 8.2.3 File contents All of the above analysis is primarily based on metadata. Analysis based on file contents would be a new and difficult frontier for research.
Files containing text are already dealt with by index searching, and in the electronic discovery area new technologies like predictive coding and context searching are extracting more and more intelligence every day. The application of this technology can only help Digital Forensic analysis. Text clustering for categorising files and emails provides the ability to find similar emails and documents, which can help group similar information and topics (Decherchi, Tacconi & Redi 2009). Detecting user behaviour based on emails and other correspondence, however, remains a challenge for further research. 8.2.4 Plaso – tagging The analysis performed using Plaso did not utilise its powerful tagging system. Using this could help with linking similar things together. Some examples are:
• Software-related information: software application folders, software-related registry entries, software link files, prefetch information, as well as UserAssist information regarding executed programs.
• User actions: including Internet history, user profile files, link files, security events and UserAssist entries.
• User-generated data files (not always in the user profile).
• Installed software: found in the registry uninstall section, Program Files folders and application menu folders (All Users and individual).
8.2.5 Plaso – correspondence Adding more correspondence-related information into the timeline can help investigators get a better idea of user activity. Plaso already supports parsing and including Skype chat messages, but adding email, SMS and other chat-based messaging information into Plaso can only provide more visibility of the behaviour of the computer and its user. With the flexibility of Plaso, plug-ins could be created providing the ability to import email, SMS and other chat-based messaging listings from the different commercial forensic tools.
This could dovetail quite nicely with the "User actions" tagging suggested above. 8.3 Conclusion An approach to providing automation for analysis has been trialled, showing the flexibility of the Plaso toolkit in conjunction with Python scripts. Computer and user profiling was tested, along with several other types of analysis, and the successful results show the strength of controlling Plaso for analysis with Python. The ability to extract many types of metadata reports from Plaso is also beneficial for analysis, and coupling this output with further Python-driven analysis or statistics is more beneficial still. The openness and ease of using Python to control Plaso lends itself to development by the forensic community. With analysis code contributed by a community working together, Plaso has great potential to offer the investigator many different analysis capabilities. References: Accessdata 2005, Registry Quick Find Chart, p. 16. Aickelin, U, Twycross, J & Hesketh-Roberts, T 2008, 'Rule Generalisation using Snort', International Journal of Electronic Security and Digital Forensics (IJESDF), vol. x, no. x, viewed 27 May 2013, <http://www.cs.nott.ac.uk/~uxa/papers/ijesdf_fuzzy_ids.pdf>. Australian Institute of Criminology 2008, Intellectual Property Crime and Enforcement in Australia, Research and Public Policy Series, no. 94, viewed 9 November 2013, <http://www.aic.gov.au/documents/B/D/0/%7BBD0BC4E6-0599-467A-8F64-38D13B5C0EEB%7Drpp94.pdf>. Ayers, D 2009, 'A second generation computer forensic analysis system', Digital Investigation, vol. 6, pp. S34–S42, viewed 6 March 2013, <http://linkinghub.elsevier.com/retrieve/pii/S1742287609000371>. Baker, W, Hutton, A & Hylender, C 2011, 2011 Data Breach Investigations Report, p. 72, viewed 24 October 2013, <http://www.wired.com/images_blogs/threatlevel/2011/04/Verizon-2011-DBIR_04-1311.pdf>.
Brownstone, RD 2004, 'Collaborative Navigation of the Stormy', Technology, vol. X, no. 5. Carrier, B 2003, 'Defining Digital Forensic Examination and Analysis Tools Using Abstraction Layers', International Journal of Digital Evidence, vol. 1, no. 4, pp. 1–12, viewed 2 April 2011, <http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Defining+Digital+Forensic+Examination+and+Analysis+Tools+Using+Abstraction+Layers#2>. ― 2005, File System Forensic Analysis, viewed 20 April 2013, <http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:file+system+forensic+analysis#0>. Carrier, B & Spafford, EH 2005, 'Automated digital evidence target definition using outlier analysis and existing evidence', in Proceedings of the 2005 Digital Forensics Research Workshop, Citeseer, pp. 1–10, viewed 2 April 2011, <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.81.1345&rep=rep1&type=pdf>. Carvey, H 2011, Forensic Scanner. ― 2012, RegRipper Updates, 'Windows Incident Response' blog post. ― 2013, HowTo: Malware Detection, pt I, viewed 24 October 2013, <http://windowsir.blogspot.com.au/2013/07/howto-malware-detection-pt-i.html>. Davis, A 2012, Leveraging the Application Compatibility Cache in Forensic Investigations. Eckmann, S 2001, 'Translating Snort rules to STATL scenarios', Proc. Recent Advances in Intrusion Detection, pp. 1–13, viewed 23 May 2013, <http://www.raid-symposium.org/Raid2001/papers/eckmann_raid2001.pdf>. Elsaesser, C & Tanner, M 2001, 'Automated diagnosis for computer forensics', The MITRE Corporation, pp. 1–16, viewed 20 April 2013, <http://www.mitre.org/work/tech_papers/tech_papers_01/elsaesser_forensics/esaesser_forensics.pdf>. Farrell, P 2009, 'A Framework for Automated Digital Forensic Reporting', no. March, viewed 20 April 2013, <https://calhoun.nps.edu/public/handle/10945/4878>.
García-Teodoro, P, Díaz-Verdejo, J, Maciá-Fernández, G & Vázquez, E 2009, 'Anomaly-based network intrusion detection: techniques, systems and challenges', Computers & Security, vol. 28, no. 1-2, pp. 18–28, viewed 17 October 2013, <http://linkinghub.elsevier.com/retrieve/pii/S0167404808000692>. Garfinkel, SL 2006, 'Forensic feature extraction and cross-drive analysis', Digital Investigation, vol. 3, pp. 71–81, viewed 11 March 2013, <http://linkinghub.elsevier.com/retrieve/pii/S1742287606000697>. ― 2009, 'Automating Disk Forensic Processing with SleuthKit, XML and Python', 2009 Fourth International IEEE Workshop on Systematic Approaches to Digital Forensic Engineering, IEEE, pp. 73–84. Goodman, M 2001, 'Making computer crime count', FBI Law Enforcement Bulletin, vol. 70, no. 8, pp. 10–17, viewed 2 September 2011, <http://www.ncjrs.gov/App/abstractdb/AbstractDBDetails.aspx?id=190553>. Guðjónsson, K 2010, 'Mastering the super timeline with log2timeline', SANS Institute, viewed 21 May 2013, <http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Mastering+the+Super+Timeline+With+log2timeline#0>. Kent, K, Chevalier, S & Grance, T 2006, 'Guide to integrating forensic techniques into incident response', NIST Special Publication, viewed 23 August 2011, <http://cybersd.com/sec2/800-86Summary.pdf>. Marrington, A 2009, 'Computer profiling for forensic purposes', viewed 20 October 2013, <http://eprints.qut.edu.au/31048>. Marrington, A, Baggili, I, Mohay, G & Clark, A 2011, 'CAT Detect (Computer Activity Timeline Detection): A tool for detecting inconsistency in computer activity timelines', Digital Investigation, vol. 8, pp. S52–S61, viewed 2 April 2013, <http://linkinghub.elsevier.com/retrieve/pii/S1742287611000314>. Marrington, A, Mohay, G, Clark, A & Morarji, H 2007, 'Event-based computer profiling for the forensic reconstruction of computer activity', vol. 2007, pp. 71–87, viewed 20 April 2013, <http://eprints.qut.edu.au/15579>.
McKemmish, R 1999, 'What is Forensic computing?', Trends and Issues in Crime and Criminal Justice, ISSN 0817-8542, Australian Institute of Criminology, no. 118, pp. 1–6. Morris, T, Vaughn, R & Dandass, Y 2012, 'A Retrofit Network Intrusion Detection System for MODBUS RTU and ASCII Industrial Control Systems', 2012 45th Hawaii International Conference on System Sciences, IEEE, pp. 2338–2345, viewed 16 March 2013, <http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6149298>. NIST (National Institute of Standards and Technology) 2004, Forensic Examination of Digital Evidence: A Guide for Law Enforcement, viewed 13 June 2013, <http://www.ncjrs.gov/App/abstractdb/AbstractDBDetails.aspx?id=199408>. Peisert, S & Bishop, M 2007a, 'Analysis of computer intrusions using sequences of function calls', Dependable and Secure …, viewed 27 April 2013, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4198178>. ― 2007b, 'Toward models for forensic analysis', UC Davis Previously Published Works, viewed 20 April 2013, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4155346>. Peng, W, Li, T & Ma, S 2005, 'Mining logs files for data-driven system management', ACM SIGKDD Explorations Newsletter, vol. 7, no. 1, pp. 44–51. Richard, GG & Roussev, V 2006, 'Next-generation digital forensics', Communications of the ACM, vol. 49, no. 2, viewed 11 May 2013, <http://dl.acm.org/citation.cfm?id=1113074>. Rider, K, Mead, S & Lyle, J 2010, 'Disk Drive I/O Commands and Write Blocking', International Federation for Information …, vol. 242, pp. 163–177, viewed 27 June 2013, <http://cs.anu.edu.au/iojs/index.php/ifip/article/view/11099>. Roesch, M 1999, 'Snort – lightweight intrusion detection for networks', Proceedings of the 13th USENIX conference on …, viewed 9 April 2013, <http://static.usenix.org/publications/library/proceedings/lisa99/full_papers/roesch/roesch.pdf>. Rowlingson, R 2004, 'A ten step process for forensic readiness', International Journal of Digital Evidence, vol. 2, Citeseer, no. 3, pp.
1–28, viewed 16 October 2011, <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.65.6706&rep=rep1&type=pdf>. Simpson, S, Howard, M, Randolph, K, Goldschmidt, C, Coles, M, Belk, M, Saario, M, Sondhi, R, Tarandach, I, Yonchev, Y & Vähä-Sipilä, A 2011, Fundamental Practices for Secure Software Development, 2nd edn. Stevens, D 2006, UserAssist, viewed 27 June 2013, <http://blog.didierstevens.com/programs/userassist/>. ― 2010, New Format for UserAssist Registry Keys, no. December. ― 2012, UserAssist Windows 2000 Thru Windows 8, no. July. Sutherland, I, Evans, J, Tryfonas, T & Blyth, A 2008, 'Acquiring volatile operating system data: tools and techniques', ACM SIGOPS Operating Systems Review, vol. 42, ACM, no. 3, pp. 65–73, viewed 8 April 2011, <http://portal.acm.org/citation.cfm?id=1368516>. Swift, D 2006, 'A Practical Application of SIM/SEM/SIEM – Automating Threat Identification', SANS Infosec Reading Room, The SANS …, viewed 27 October 2012, <http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:A+Practical+Application+of+SIM/SEM/SIEM+Automating+Threat+Identification#0>. Tan, J 2001, 'Forensic readiness', Cambridge, MA: @stake, pp. 1–23, viewed 16 October 2011, <http://isis.poly.edu/kulesh/forensics/forensic_readiness.pdf>. Turner, P 2006, 'Selective and intelligent imaging using digital evidence bags', Digital Investigation, vol. 3, pp. 59–64, viewed 9 November 2013, <http://linkinghub.elsevier.com/retrieve/pii/S174228760600065X>. Watt, AC 2012, 'Development of a Framework for the Investigation into the Methods Used for the Electronic Trafficking and Concealment of Child Abuse Material', University of South Australia, p. 348. Willassen, S 2008, 'Finding evidence of antedating in digital investigations', ARES, IEEE, pp. 26–32, viewed 5 May 2013, <http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4529317>.
Wong, L 2007, 'Forensic analysis of the Windows registry', Forensic Focus, viewed 26 June 2013, <http://www.forensictv.net/Downloads/digital_forensics/forensic_analysis_of_windows_registry_by_lih_wern_wong.pdf>. Xu, W, Huang, L, Fox, A, Patterson, D & Jordan, M 2009, Detecting large-scale system problems by mining console logs, … on Operating systems …, viewed 13 August 2013, <http://dl.acm.org/citation.cfm?id=1629587>. Zaraska, K 2003, Prelude IDS: current state and development perspectives, URL http://www.prelude-ids.org/download/misc/ …, viewed 29 October 2013, <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.106.5542&rep=rep1&type=pdf>.

Appendix A – Glossary

Container files: Files which contain other files. In this category are email mailbox files (.OST, .PST, etc.) and archive files (.zip, .7z, .rar, etc.).
Shell bags: A part of the registry containing information in shell item format, of use in digital forensics for finding viewed folders. Especially useful for listing files from removable storage devices or from folders which have since been deleted.
Shell items: Used in Windows to identify items in the Windows folder hierarchy, and more specifically in Windows shortcut (.lnk) files and in the Shellbags registry keys.
Social media: Websites like Facebook and Instagram which are primarily socially focused.
URL: Uniform Resource Locator, normally used for web links.
UserAssist: A part of the registry which contains a list of the programs a user has run and the last time each was run.

Appendix B – List of formats that the Log2Timeline tool parses

• Apache2 access logs
• Apache2 error logs
• Google Chrome history
• Encase dirlisting
• Windows Event Log files (EVT)
• Windows Event Log files (EVTX)
• EXIF.
Extracts EXIF information or metadata from various media files
• Firefox bookmarks
• Firefox 2 history
• Firefox 3 history
• FTK Imager dirlisting CSV file
• Generic Linux log file
• Internet Explorer history files, parsing index.dat files
• Windows IIS W3C log files
• ISA server text export. Copy query results to clipboard and into a text file
• Mactime body files (to provide an easy method to modify from mactime format to some other)
• McAfee AntiVirus log files
• MS-SQL error log
• Opera Global and Direct browser history
• OpenXML metadata, for metadata extraction from Office 2007 documents
• PCAP files, parsing network dump files created by tools such as Wireshark and tcpdump
• PDF. Parses the basic PDF metadata to capture creation dates and other information from PDF documents
• Windows Prefetch directory
• Windows Recycle Bin (INFO2 or $I)
• Windows Restore Points
• Safari browser history files
• Windows XP SetupAPI.log file
• Adobe Local Shared Object files (SOL/LSO), aka Flash Cookies
• Squid access logs (httpd_emulate off)
• TLN (timeline) body files
• UserAssist key of the Windows registry – really an NTUSER.DAT parser, since other keys are parsed as well
• Volatility.
The output file from the psscan and psscan2 modules of Volatility
• Windows Shortcut files (LNK)
• Windows WMIProv log file
• Windows XP Firewall Log files (W3C format)

Appendix C – USB history report

Each entry lists: datetime; timestamp source; source; description / message.

2013-10-24T13:00:00+00:00; Last Access Time; LNK Windows Shortcut; [Empty description] File size: 0 File attribute flags: 0x00000010 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp
2013-10-24T13:00:00+00:00; Last Access Time; LNK Windows Shortcut; [Empty description] File size: 21491 File attribute flags: 0x00000020 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp\grml-cheatcodes.txt Working dir: E:\tmp
2013-10-25T00:00:00+11:00; Last Access Time; LNK Windows Shortcut; [Empty description] File size: 21491 File attribute flags: 0x00000020 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp\grml-cheatcodes.txt Working dir: E:\tmp
2013-10-25T06:37:46+00:00; Creation Time; LNK Windows Shortcut; [Empty description] File size: 0 File attribute flags: 0x00000010 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp
2013-10-25T06:37:46+00:00; Content Modification Time; LNK Windows Shortcut; [Empty description] File size: 21491 File attribute flags: 0x00000020 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp\grml-cheatcodes.txt Working dir: E:\tmp
2013-10-25T06:37:46+00:00; Creation Time; LNK Windows Shortcut; [Empty description] File size: 21491 File attribute flags: 0x00000020 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp\grml-cheatcodes.txt Working dir: E:\tmp
2013-10-25T06:37:56+00:00; Content Modification Time; LNK Windows Shortcut; [Empty description] File size: 0 File attribute flags: 0x00000010 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp
2013-10-25T17:37:46+11:00; Content Modification Time; LNK Windows Shortcut; [Empty description] File size: 21491 File attribute flags: 0x00000020 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp\grml-cheatcodes.txt Working dir: E:\tmp
2013-10-25T17:37:46+11:00; Creation Time; LNK Windows Shortcut; [Empty description] File size: 21491 File attribute flags: 0x00000020 Drive type: 2 Drive serial number: 0x4e9b6351 Volume label: USBDISK Local path: E:\tmp\grml-cheatcodes.txt Working dir: E:\tmp
2013-10-26T10:57:12.078125+11:00; Last Written; REG NTUSER key; [\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2\{c3e7850e-3dd0-11e3-bebc-806d6172696f}] BaseClass: [REG_SZ] Drive
2013-10-26T10:57:12.078125+11:00; Last Written; REG NTUSER key; [\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2] Value: No values stored in key.
2013-10-27T10:37:18.796875+11:00; Last Written; REG NTUSER key; [\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2\{c0bdf39c-3e96-11e3-873c-806d6172696f}] BaseClass: [REG_SZ] Drive
2013-10-27T10:37:18.796875+11:00; Last Written; REG NTUSER key; [\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2] Value: No values stored in key.
2013-10-27T11:12:33.562500+11:00; Last Written; REG NTUSER key; [\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2] Value: No values stored in key.
2013-10-27T19:57:16.109375+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Services\USBSTOR\Security] Security: [REG_BINARY]
2013-10-27T19:57:20.531250+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Control\Class\{36FC9E60-C465-11CF-8056-444553540000}\0003] DriverDate: [REG_SZ] 7-1-2001 DriverDateData: [REG_BINARY] DriverDesc: [REG_SZ] USB Mass Storage Device DriverFlags: [REG_DWORD_LE] 1 DriverVersion: [REG_SZ] 5.1.2600.0 InfPath: [REG_SZ] usbstor.inf InfSection: [REG_SZ] USBSTOR_BULK InfSectionExt: [REG_SZ] .NT MatchingDeviceId: [REG_SZ] usb\class_08&subclass_06&prot_50 ProviderName: [REG_SZ] Microsoft
2013-10-27T19:57:20.578125+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Services\USBSTOR] DisplayName: USB Mass Storage Driver ErrorControl: Normal (1) ImagePath: system32\DRIVERS\USBSTOR.SYS Start: Manual (3) Type: Kernel Device Driver (0x1)
2013-10-27T19:57:22.437500+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Enum\USB\Vid_05dc&Pid_a810\6&38fcca26&0&1] Capabilities: [REG_DWORD_LE] 4 Class: [REG_SZ] USB ClassGUID: [REG_SZ] {36FC9E60-C465-11CF-8056-444553540000} CompatibleIDs: [REG_MULTI_SZ] USB\Class_08&SubClass_06&Prot_50 USB\Class_08&SubClass_06 USB\Class_08 ConfigFlags: [REG_DWORD_LE] 0 DeviceDesc: [REG_SZ] USB Mass Storage Device Driver: [REG_SZ] {36FC9E60-C465-11CF-8056-444553540000}\0003 HardwareID: [REG_MULTI_SZ] USB\Vid_05dc&Pid_a810&Rev_1100 USB\Vid_05dc&Pid_a810 LocationInformation: [REG_SZ] USB Device Mfg: [REG_SZ] Compatible USB storage device ParentIdPrefix: [REG_SZ] 7&1fb3deb6&0 Service: [REG_SZ] USBSTOR UINumber: [REG_DWORD_LE] 0
2013-10-27T19:57:22.437500+11:00; First Connection Time; REG SYSTEM key: USBStor Entries; [\ControlSet001\Enum\USBSTOR] subkey_name: Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100
2013-10-27T19:57:22.437500+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Enum\USBSTOR\Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100] Value: No values stored in key.
2013-10-27T19:57:22.453125+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Enum\USBSTOR\Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100\7&1fb3deb6&0&AAMA1CG1OXX83PM1&0\Device Parameters\MediaChangeNotification] Value: No values stored in key.
2013-10-27T19:57:22.453125+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Enum\USBSTOR\Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100\7&1fb3deb6&0&AAMA1CG1OXX83PM1&0\LogConf] Value: No values stored in key.
2013-10-27T19:57:22.453125+11:00; Last Connection Time; REG SYSTEM key: USBStor Entries; [\ControlSet001\Enum\USBSTOR] device_type: Disk friendly_name: Lexar JD FireFly USB Device parent_id_prefix: 8&107bbb1a&0 product: Prod_JD_FireFly revision: Rev_1100 serial: 7&1fb3deb6&0&AAMA1CG1OXX83PM1&0 subkey_name: Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100 vendor: Ven_Lexar
2013-10-27T19:57:22.484375+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Control\DeviceClasses\{53f56307-b6bf-11d0-94f2-00a0c91efb8b}\##?#USBSTOR#Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100#7&1fb3deb6&0&AAMA1CG1OXX83PM1&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}\#] SymbolicLink: [REG_SZ] \\?\USBSTOR#Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100#7&1fb3deb6&0&AAMA1CG1OXX83PM1&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}
2013-10-27T19:57:22.484375+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Control\DeviceClasses\{53f56307-b6bf-11d0-94f2-00a0c91efb8b}\##?#USBSTOR#Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100#7&1fb3deb6&0&AAMA1CG1OXX83PM1&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}] DeviceInstance: [REG_SZ] USBSTOR\Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100\7&1fb3deb6&0&AAMA1CG1OXX83PM1&0
2013-10-27T19:57:25.953125+11:00; Last Written; REG NTUSER key; [\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2\E] BaseClass: [REG_SZ] Drive
2013-10-27T20:01:50.234375+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Enum\USBSTOR\Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100\7&1fb3deb6&0&AAMA1CG1OXX83PM1&0] Capabilities: [REG_DWORD_LE] 0 Class: [REG_SZ] DiskDrive ClassGUID: [REG_SZ] {4D36E967-E325-11CE-BFC1-08002BE10318} CompatibleIDs: [REG_MULTI_SZ] USBSTOR\Disk USBSTOR\RAW ConfigFlags: [REG_DWORD_LE] 0 DeviceDesc: [REG_SZ] Disk drive Driver: [REG_SZ] {4D36E967-E325-11CE-BFC1-08002BE10318}\0003 FriendlyName: [REG_SZ] Lexar JD FireFly USB Device HardwareID: [REG_MULTI_SZ] USBSTOR\DiskLexar___JD_FireFly______1100 USBSTOR\DiskLexar___JD_FireFly______ USBSTOR\DiskLexar___ USBSTOR\Lexar___JD_FireFly______1 Lexar___JD_FireFly______1 USBSTOR\GenDisk GenDisk Mfg: [REG_SZ] (Standard disk drives) ParentIdPrefix: [REG_SZ] 8&107bbb1a&0 Service: [REG_SZ] disk UINumber: [REG_DWORD_LE] 0
2013-10-27T20:01:50.234375+11:00; Last Connection Time; REG SYSTEM key: USBStor Entries; [\ControlSet001\Enum\USBSTOR] device_type: Disk friendly_name: Lexar JD FireFly USB Device parent_id_prefix: 8&107bbb1a&0 product: Prod_JD_FireFly revision: Rev_1100 serial: 7&1fb3deb6&0&AAMA1CG1OXX83PM1&0 subkey_name: Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100 vendor: Ven_Lexar
2013-10-27T20:01:50.359375+11:00; Last Written; REG SYSTEM key; [\ControlSet001\Enum\USBSTOR\Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100\7&1fb3deb6&0&AAMA1CG1OXX83PM1&0\Device Parameters] Value: No values stored in key.
2013-10-27T20:01:50.359375+11:00; Last Connection Time; REG SYSTEM key: USBStor Entries; [\ControlSet001\Enum\USBSTOR] device_type: Disk friendly_name: Lexar JD FireFly USB Device parent_id_prefix: 8&107bbb1a&0 product: Prod_JD_FireFly revision: Rev_1100 serial: 7&1fb3deb6&0&AAMA1CG1OXX83PM1&0 subkey_name: Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100 vendor: Ven_Lexar
2013-10-27T20:10:37.328125+11:00; Last Written; REG NTUSER key; [\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2\{ca738c90-3ee5-11e3-8741-525400d288c3}] BaseClass: [REG_SZ] Drive
2013-10-27T20:10:37.328125+11:00; Last Written; REG NTUSER key; [\Software\Microsoft\Windows\CurrentVersion\Explorer\MountPoints2] Value: No values stored in key.
2013-10-27T20:28:46.296000+11:00; Last Visited Time; WEBHIST MSIE Cache File URL record; Location: Visited: user@file:///E:/tmp/grml-cheatcodes.txt Number of hits: 1 Cached file size: 0
2013-10-27T20:28:46.296875+11:00; Last Written; REG NTUSER key: MRUx List; [\Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs\.txt] 1 [0]: grml-cheatcodes.txt
2013-10-27T20:28:46.593000+11:00; Last Visited Time; WEBHIST MSIE Cache File URL record; Location: :2013102720131028: user@file:///E:/tmp/grml-cheatcodes.txt Number of hits: 1 Cached file size: 0
2013-10-27T20:28:46.625000+11:00; Last Visited Time; WEBHIST MSIE Cache File URL record; Location: :2013102720131028: user@file:///E:/tmp/grml-cheatcodes.txt Number of hits: 1 Cached file size: 0
2013-10-27T20:28:46.703125+11:00; Last Written; REG NTUSER key: MRUx List; [\Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs] 1 [1]: tmp 2 [0]: grml-cheatcodes.txt
2013-10-27T20:28:48.187500+11:00; crtime; FILE NTFS_DETECT crtime; /dev/nbd0p1:/Documents and Settings/user/Recent/grml-cheatcodes.txt.lnk
2013-10-27T20:28:48.812500+11:00; mtime; FILE NTFS_DETECT mtime; /dev/nbd0p1:/Documents and Settings/user/Recent/grml-cheatcodes.txt.lnk
2013-10-27T20:28:48.812500+11:00; ctime; FILE NTFS_DETECT ctime; /dev/nbd0p1:/Documents and Settings/user/Recent/grml-cheatcodes.txt.lnk
2013-10-27T20:28:48.812500+11:00; atime; FILE NTFS_DETECT atime; /dev/nbd0p1:/Documents and Settings/user/Recent/grml-cheatcodes.txt.lnk
2013-10-27T20:28:48+11:00; Last Checked Time; WEBHIST MSIE Cache File URL record; Location: Visited: user@file:///E:/tmp/grml-cheatcodes.txt Number of hits: 1 Cached file size: 0
2013-10-27T20:28:48+11:00; Last Checked Time; WEBHIST MSIE Cache File URL record; Location: :2013102720131028: user@file:///E:/tmp/grml-cheatcodes.txt Number of hits: 1 Cached file size: 0

Appendix D – Web URLs

The Internet history URLs shown below were extracted directly from Plaso. Further Python scripts were used to clean up the output and separate the file:// links from the http:// and https:// links before further analysis could take place.
:2013102720131028: SYSTEM@:Host: My Computer :2013102720131028: SYSTEM@file:///C:/WINDOWS/system32/oobe/updshell.htm :2013102720131028: user@:Host: My Computer :2013102720131028: user@file:///E:/tmp/grml-cheatcodes.txt Cookie:user@support.microsoft.com/ http://support.microsoft.com/Styles/onemscomcomponents.css http://support.microsoft.com/Styles/oneMscomMaster.css http://windowsupdate.microsoft.com/windowsupdate/v6/default.aspx http://windowsupdate.microsoft.com/windowsupdate/v6/default.aspx?ln=en-us http://www.ninemsn.com.au/?ocid=iefvrt http://www.ninemsn.com.au/css/style.min.css?v=10 Visited: SYSTEM@file:///C:/WINDOWS/system32/oobe/updshell.htm Visited: user@file:///E:/tmp/grml-cheatcodes.txt Visited: user@http://go.microsoft.com/fwlink/?LinkId=54729&clcid=0x0c09 Visited: user@http://ninemsn.com.au/?ocid=iefvrt Visited: user@http://www.microsoft.com/isapi/redir.dll?prd=ie&pver=6&ar=msnhome Visited: user@http://www.ninemsn.com.au/?ocid=iefvrt 73
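The clean-up step described in Appendix D, separating the file:// links from the http:// and https:// links before further analysis, can be sketched as follows. This is an illustrative reimplementation rather than the exact script used in the research; the sample input lines are taken from the listing above:

```python
# Sketch of the Appendix D post-processing: split the raw URL listing
# extracted from Plaso into local file links and web links.

def split_links(lines):
    """Partition raw history lines into file://, http(s)://, and other."""
    local, web, other = [], [], []
    for line in lines:
        line = line.strip()
        if "file:///" in line:
            local.append(line)          # local/removable-media access
        elif "http://" in line or "https://" in line:
            web.append(line)            # web browsing activity
        elif line:
            other.append(line)          # cookies, host records, etc.
    return local, web, other

raw = [
    "Visited: user@file:///E:/tmp/grml-cheatcodes.txt",
    "http://www.ninemsn.com.au/?ocid=iefvrt",
    "Visited: user@http://go.microsoft.com/fwlink/?LinkId=54729",
]
local, web, other = split_links(raw)
```

Separating the two groups first means the file:// entries can feed the USB/link-file analysis while the web entries feed the Internet history categorisation.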