
University of South Australia
Development of an approach using
automation to enhance the process of
Computer Forensic Analysis
Minor Thesis
Student:
Daniel Walton
ID:
110071749
Email Address:
waldj007@mymail.unisa.edu.au
Supervisor:
Dr Elena Sitnikova
Subject:
Master of Science (Cyber Security and
Forensic Computing)
Year:
2013
AUTHOR’S DECLARATION
I declare that this Thesis does not incorporate, without acknowledgement, any material
previously submitted for a degree or diploma in any university; and that to the best of my
knowledge it does not contain any materials previously published or written by another
person except where due reference is made in the text.
Daniel Walton
ACKNOWLEDGEMENTS
To my wonderful wife Rebecca, thank you for your patience and support all the way through
the process of my study for this thesis, it has been invaluable. To my adorable daughter
Charlotte, your coming into the world has brightened our lives as it’s so much of a joy
watching you grow and discover this world.
To Dr Allan Watt, without your urging I may not have begun these studies. Thank you for
your help and encouragement.
To my supervisor Dr Elena Sitnikova, many thanks for your help, guidance and support with
this thesis.
Abstract
One of the biggest issues facing digital forensic investigators is that computer storage continues to grow rapidly in size, while the ability to extract intelligence from that data is not increasing at the same rate. This makes it ever harder to find relevant data in a growing sea of irrelevant data. Digital forensic analysis is usually performed when an incident of interest is detected, either by humans or by automated systems that identify suspicious behaviour. Systems that detect a need for digital forensic analysis by identifying suspicious activity provide what is known as Forensic Readiness. To do this, they usually apply a set of criteria or rules that define normal activity, together with criteria for abnormal or suspicious activity, and send alerts when suspicious activity occurs. When it comes to the forensic analysis of evidence, however, little software exists to analyse forensic evidence and report findings. Automated analysis of evidence would be invaluable for investigators, as it would help them remove irrelevant data and focus on data of interest. This research examined what is currently available for the automatic analysis of evidence, and assessed the implementation of an automatic analysis system.
AUTHOR’S DECLARATION
ACKNOWLEDGEMENTS
Abstract
CHAPTER ONE - Introduction
1. Background
1.1 Forensic Readiness
1.2 Forensic evidence acquisition
1.3 Forensic Analysis
1.4 Automated analysis
1.5 Significance of the Problem
1.6 Research Issue
1.6.1 Sub-Problems
1.7 Elaboration of the Sub-problems
1.8 Research thesis title
Chapter Two – Literature Review
2 Literature review
2.1 Automating processing of evidence
2.2 Computer Profiling
2.3 Timeline analysis
2.4 Analysis
Chapter Three - Methodology
3. Methodology
3.1 Focus of research
CHAPTER Four – Computer profiling
4. Introduction
4.1 Methods
CHAPTER FIVE – Log2timeline and Plaso
5. Introduction
5.1 Comparison
CHAPTER SIX – Analysis of different analysis systems
6. Introduction
6.1 Incident Response/Malware (Malicious Software)
6.2 Intellectual Property theft (IP Theft)
6.3 Access to Child Abuse Material (CAM)
6.4 Snort
6.5 Markov chain analysis methods
6.6 Further different analysis methods
6.7 Statistics
CHAPTER SEVEN – Tests and implementations
7. Introduction
7.1 Profiling
7.2 Large Files
7.3 Most Visited websites
7.4 User profile registry dates and times
7.5 Analysis
7.6 Intellectual Property (IP) Theft
7.7 Incident Response / Malware analysis
7.8 Time changing
7.9 Statistics
7.10 Rule based analysis
7.11 Findings
CHAPTER EIGHT - CONCLUSIONS AND FURTHER WORK
8.1 Research Conclusions
8.2 Areas for further study
8.3 Conclusion
References
Appendix A – Glossary
Appendix B – List of formats that Log2Timeline tool parses
Appendix C – USB history report
Appendix D – Web URLs
CHAPTER ONE - Introduction
1. Background
Digital forensic investigators are swimming in a sea of data: the size of storage keeps increasing, and the amount of data people generate is increasing with it. At the same time, the ability of digital forensic tools to deal with all this data is not growing at the same rate. Data reduction is currently done by removing known files by their cryptographic hash, using date ranges, or finding files based on specific keywords (Richard & Roussev 2006).
A great deal of analysis is currently performed manually by investigators. The motivation for this thesis is to find what sorts of automatic analysis techniques are currently in use, examine what research has been done (often quite different from real-world practice), identify research areas of interest, and investigate an approach that can assist with the analysis of evidence.
1.1 Forensic Readiness
Forensic investigations are normally started when something suspicious is detected, and forensic analysis is initiated to find out exactly what happened. The detection is done either manually, by a human who notices suspicious activity (or its absence) and initiates procedures to have it investigated, or automatically, by a specialised computer monitoring system which sends an alert so that a decision can be made as to whether forensic analysis is required. If it is, the evidence is acquired and analysis begins.
There are many different types of anomaly detection systems and methods for computers. Anti-virus software runs on most computers and is used to detect and clean malicious software, including viruses, worms, Trojan horses and spyware. An Intrusion Detection System (IDS) monitors network traffic and raises alerts when suspicious traffic is detected. Security Information and Event Management (SIEM) systems monitor the logs generated by computers (mostly servers), IDSs and network devices such as firewalls, routers and switches for suspicious activity. Anti-spam software is used to detect unsolicited commercial email, otherwise known as spam. All of these systems have methods for discerning legitimate content and behaviour from abnormal content and behaviour.
Forensic readiness is a model for the early detection and collection of evidence relating to suspicious activity (Tan 2001). SIEM systems, when configured correctly, provide forensic readiness for the ICT infrastructure they monitor, especially by providing rule-based event detection and secure logging (Rowlingson 2004). This works by collecting logs from all servers, network infrastructure (switches, routers, IDSs) and sometimes workstations. Because these logs come from different places, they arrive in different formats: Windows event logs, syslog event logs from Unix/Linux and, more commonly, various types of delimited text logs. The different formats are converted so that they can be combined into one aggregated log, which is then automatically examined for suspicious activity. Forensic readiness helps detect incidents which will need investigation by computer forensic analysis.
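The conversion-and-aggregation step described above can be sketched as follows. This is a minimal illustration, not any particular SIEM's input handling: the syslog pattern is the classic BSD form, and the delimited format is an invented example.

```python
import re
from datetime import datetime

def parse_syslog_line(line, year=2013):
    """Parse a classic BSD syslog line, e.g. 'May  1 12:00:01 host sshd[42]: msg'.
    BSD syslog omits the year, so it must be supplied."""
    m = re.match(r"(\w{3})\s+(\d+) (\d{2}:\d{2}:\d{2}) (\S+) (.*)", line)
    if not m:
        return None
    month, day, clock, host, message = m.groups()
    ts = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return {"time": ts, "source": "syslog", "host": host, "message": message}

def parse_csv_line(line):
    """Parse a hypothetical delimited export: '2013-05-01 12:00:02,host2,text'."""
    stamp, host, message = line.split(",", 2)
    return {"time": datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"),
            "source": "csv", "host": host, "message": message}

def aggregate(syslog_lines, csv_lines):
    """Normalise both formats into one shape, then sort into a single timeline."""
    events = [e for e in map(parse_syslog_line, syslog_lines) if e]
    events += [parse_csv_line(l) for l in csv_lines]
    return sorted(events, key=lambda e: e["time"])
```

Once every record carries the same fields, a single set of detection rules can run over the aggregated log regardless of where each entry originated.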
1.2 Forensic evidence acquisition
Once an event is detected evidence must be acquired into a form for analysis.
Acquired evidence is a snapshot of the system at the point of acquisition
(McKemmish 1999; Rider, Mead & Lyle 2010). This could be a copy of a mobile
phone, Memory from a computer or storage like hard disk drives (HDD's) and USB
flash drives (Sutherland et al. 2008). The acquired evidence is collected in a way so
that it can be verified afterwards to confirm the evidence hasn't been modified since
the initial acquisition (NIST 2004). This is done with a cryptographic hash and is
usually a MD5 hash or a SHA1 hash.
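The hash-then-verify step can be sketched in a few lines; this reads the image in chunks so that multi-gigabyte evidence files do not need to fit in memory.

```python
import hashlib

def hash_image(path, block_size=1 << 20):
    """Compute MD5 and SHA-1 of an evidence image, reading in 1 MiB chunks."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(block_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def verify(path, expected_md5):
    """Re-hash a working copy and compare with the hash recorded at acquisition."""
    return hash_image(path)[0] == expected_md5
```

Recording both digests at acquisition means the evidence can still be verified even if one algorithm is later considered weak.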
Once the evidence has been acquired, a copy can be made: the master copy is stored securely away, and the working copy is used for analysis.
1.3 Forensic Analysis
Analysis is always performed on the acquired evidence files. As the evidence is a
static resource it is easy to try different analysis methods as well as have the time to
compare tools to make sure the analysis results are repeatable. Most forensic
analysis is manually performed by investigators using any of the four main forensic
8
tools; AccessData’s Forensic Tool Kit (FTK), GetData’s Forensic Explorer, Guidance
Software’s Encase, and X-Ways Forensics. These tools have many automated
features for the processing of evidence and extraction of data of interest but the
actual analysis still has to be done by the investigator.
The automated features provided by these tools are very useful: removing irrelevant data, opening and viewing many different file types, viewing the content of container files, parsing operating system files for metadata, finding deleted files, data carving, and searching for files with both raw and indexed searches. All these features help the investigator with the forensic analysis of evidence, yet none of them actually performs automated analysis of the evidence.
For example, to detect IP theft (Australian Institute of Criminology 2008) the investigator will analyse the USB device history information, Windows shortcut ‘LNK’ files for removable device access, Internet history, registry shellbag entries and sequential access/create times of files, and correlate the results to find whether there are signs that data was copied to a USB flash drive. This process is largely manual, and there may be ways of automating some of it.
1.4 Automated analysis
The open source Intrusion Detection System (IDS) Snort uses a simple rule based
detection system to detect network packets of interest (Roesch 1999). The syntax for
Snorts rule language is flexible enough that rules can be written to detect items of
interest in traffic in protocols that it was not originally designed to detect and an
example of this is ‘SCADA MODBUS’ communications over TCPIP (Morris, Vaughn
& Dandass 2012).
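To give a flavour of the rule syntax, the sketch below flags a Modbus/TCP write request arriving from an unexpected client. It is illustrative only, not a rule from any published ruleset: the `$MODBUS_CLIENT`/`$MODBUS_SERVER` variables would need to be defined in the Snort configuration, and the `sid` is a placeholder. The match relies on the Modbus function code sitting at byte offset 7 of the TCP payload, immediately after the 7-byte MBAP header.

```
# Hypothetical rule: alert on a Modbus "write single coil" (function code 0x05)
# from any host other than the authorised client. Variables and sid are placeholders.
alert tcp !$MODBUS_CLIENT any -> $MODBUS_SERVER 502 \
    (msg:"MODBUS write from unauthorised client"; \
     content:"|05|"; offset:7; depth:1; \
     sid:1000001; rev:1;)
```

The same pattern of a condition plus an alert message is what a rule-based evidence analysis system would borrow, with metadata records taking the place of packets.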
Snort’s rule-based analysis is primarily packet by packet; rules do not normally analyse more than one packet at a time. The network packets it analyses usually follow standardised and documented protocols, but for the automated analysis of computer evidence there are many more types of data sources to process. A thorough analysis of evidence from a computer running MS Windows requires examination of the filesystem, event logs, the registry and parsed registry structures (for example shell bags, userassist entries and USB device history), the Internet history of all installed browsers, and image EXIF metadata.
9
These different pieces of data are stored in different files and in different formats. Reading from all these areas is complicated, as each format needs its own custom parser to extract intelligence, and the data from each file then needs to be combined.
As Snort’s rule-based system lends itself to adaptation to monitor and analyse different protocols from the same packet dump, a similar rule-based system should also work over a simplified, standardised file format for recording metadata from evidence, one which differentiates between the different sources of metadata so that the origin of each entry is preserved.
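One possible shape for such a standardised record is sketched below. The field names and the `source` labels are assumptions for illustration, not a published format: the point is only that every parser emits the same structure, with the entry's origin preserved so that rules can match on it.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime

@dataclass
class Event:
    """One normalised metadata entry. 'source' records where the entry came
    from, e.g. 'filesystem', 'registry/userassist', 'evtx', 'iehistory'."""
    timestamp: datetime
    source: str
    description: str
    attributes: dict = field(default_factory=dict)

# Events from any parser share one shape, so a rule engine can match on
# 'source' and 'attributes' without knowing the original on-disk format.
e = Event(datetime(2013, 5, 1, 9, 30), "registry/userassist",
          "Program executed", {"path": r"C:\tools\example.exe", "run_count": 3})
```

A rule could then be as simple as "flag any `registry/userassist` entry whose path points at removable media", independent of how the underlying registry data was encoded.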
A comparison of the forensic tools currently available with the earlier tools made by the same companies finds minimal difference in the area of automated analysis. Their functionality is mostly unchanged; they are largely scaled-up versions that deal with more data and parse more file types (Marrington et al. 2007).
Automated analysis has the potential to revolutionise the digital forensic field by building on existing tools and providing actual analysis of evidence. This would save the investigator from manually parsing through evidence looking for areas of interest, as the results of the automated analysis would immediately highlight them. The investigator could then spend more time validating the results of the automated analysis and collating their findings, saving much labour time. If the analysis system used rules, these could be shared, and investigators could write new rules providing additional analysis features.
1.5 Significance of the Problem
Investigators currently spend a lot of time manually analysing evidence and with
large cases with much data it would make a big difference, if there was a way to
speed up analysis of the supplied evidence. This would be so investigators can
quickly get a profile of the activity on each computer, as well easily finding signs of
suspicious activity as this will enable them to spend more time on the solving of
cases and less on the extraction and analysis of evidence.
Many police departments have backlogs of digital forensic analysis which would benefit from an automatic analysis system that could analyse evidence and provide reports profiling the computer and all of its users, together with the results of automatic analysis showing what suspicious activity has occurred on the computer. As usual, investigators will need to back up any findings with evidence, so the reports must contain information explaining where each result came from. Hard disk drives are only getting bigger, and a way to automate the analysis of evidence quickly would be of great benefit to the digital forensic field.
Development of the ‘log2timeline’ tool gave investigators a large jump in capability, as it enabled them to automatically extract file information and metadata where previously this was a manual task (Guðjónsson 2010). The plan for this thesis is to advance forensic analysis capability by building on the capabilities of ‘Plaso’, adding automated analysis so as to reduce the amount of manual analysis the investigator needs to perform.
1.6 Research Issue
The research issue for this project is:
How can automation be used to improve the process of Computer
Forensic Analysis?
1.6.1 Sub-Problems
This has generated a number of sub-problems, as follows:
1. What existing tools are there for extracting relevant information from evidence, and what is the quality of the information they extract?
2. What solutions are there for parsing the many undocumented file and metadata formats, yet to be discovered and documented, which could contain information of interest?
3. How can low false-positive and false-negative rates be ensured while keeping a high detection rate for relevant information?
4. What approach can be used to enhance digital forensic analysis with automation?
1.7 Elaboration of the Sub-problems
There are many different formats for storing data and metadata on computers, which compounds the analysis problem (Brownstone 2004). For many of these files the format is undocumented or proprietary, which complicates analysis further (Kent, Chevalier & Grance 2006). For example, in an intellectual property theft case the investigator would want at least to parse the registry (Accessdata 2005; Wong 2007) and the ‘setupapi.dev.log’/‘setupapi.log’ for USB removable device history; parse all link files for details of files opened from removable devices; parse Internet Explorer history for local file access, for further detail regarding access to removable devices; and parse the NTFS USN Journal (Carrier 2005) to check that files were not renamed before being copied to removable devices. It is quite time consuming to parse all these different areas and combine the information to analyse what has happened.
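One of the steps above, pulling USB device installation history out of a setupapi log, can be sketched as follows. The line patterns are based on the general shape of these logs (a `[Device Install ...]` header followed by a `Section start` timestamp), but the exact format varies between Windows versions, so treat the regular expressions as assumptions to be validated against real logs.

```python
import re

# Header lines look roughly like:
#   >>>  [Device Install (Hardware initiated) - USB\VID_0781&PID_5567\123456]
# followed a line or two later by:
#   >>>  Section start 2013/05/01 12:00:00.000
DEVICE = re.compile(r">>>\s+\[Device Install.*?-\s+(USBSTOR|USB)\\(.+?)\]")
START = re.compile(r">>>\s+Section start ([\d/]+ [\d:.]+)")

def usb_installs(lines):
    """Yield (device_id, install_time) pairs from setupapi log lines."""
    current = None
    for line in lines:
        m = DEVICE.search(line)
        if m:
            current = m.group(2)
            continue
        m = START.search(line)
        if m and current:
            yield current, m.group(1)
            current = None
```

Output from a parser like this would then be correlated with link files and USN Journal entries, as described above, rather than read in isolation.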
As many of these files and metadata structures are undocumented, the ability to extract useful information from them often depends on individual analysts researching and discovering the internal formats and writing software to extract the data. Didier Stevens posted his ‘Userassist’ tool in 2006 to parse the ‘userassist’ entries from the Windows Registry; he later discovered that the format had changed for Windows 7 and Windows Server 2008 R2, and released a newer version of the tool supporting these newer versions of Windows (Stevens 2010). In 2012 he released an updated version with beta support for Windows 8. New discoveries of files and metadata useful for forensic analysis are made continually, so there is always the concern that investigators who do not keep up to date with what new metadata can be extracted may miss information. This leads to a lower standard of care, potentially resulting in innocent people being incarcerated, or guilty people being let off, because relevant information was missed.
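As a small illustration of why such parsers matter: userassist value names are stored ROT13-encoded in the registry, so without decoding, the program paths they record are unreadable. A minimal sketch (the example value name is invented):

```python
import codecs

def decode_userassist(value_name):
    """Userassist value names are ROT13-encoded in the Windows Registry;
    decoding reveals the executed program's path."""
    return codecs.decode(value_name, "rot13")

# e.g. an encoded name like 'HRZR_EHACNGU:P:\\...' decodes to 'UEME_RUNPATH:C:\\...'
```

This is the simple end of the spectrum; most undocumented formats require far more reverse engineering than a single substitution cipher, which is exactly why the field depends on analysts like Stevens publishing their findings.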
With regard to automatic analysis, current forensic tools are not able to analyse evidence and provide the investigator with a report showing discovered suspicious activity together with a profile of activity on the computer.
The current forensic tools:
- are able to view many different file types;
- are able to parse and extract data from many different files;
- can help exclude known irrelevant files and detect known relevant files;
- can analyse email, and are able to index and search the evidence.
There are many different tools to help the investigator with forensic analysis, but their abilities can mostly be summed up as assisting the investigator to analyse the evidence, rather than analysing the evidence themselves and providing the investigator with results to check. Some of the most common of these tools are AccessData’s Forensic Tool Kit (FTK), GetData’s Forensic Explorer, Guidance Software’s EnCase, and X-Ways Forensics. The best that can currently be hoped for is better assistance with analysis rather than better analysis: the investigator still needs to combine the different sources of data and perform the analysis themselves to discover what actually happened.
The detection rate of an automated analysis system is important, as investigators do not want to be flooded with irrelevant data. An email spam filter is a useful comparison: with a high false-negative rate, spam is not removed and the user receives too much spam, while with a high false-positive rate many legitimate emails are incorrectly filtered out as spam, which is also unwanted. What is wanted is a high detection rate (a high signal-to-noise ratio) for items of interest, with unwanted items removed.
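These trade-offs can be made concrete with the standard rates computed from true/false positive and negative counts; a small sketch for evaluating an analysis rule set against labelled test evidence:

```python
def detection_rates(tp, fp, tn, fn):
    """Basic rates for tuning an analysis rule set: keep the false-positive
    rate low (little noise for the investigator) without letting the
    detection rate fall (missed items of interest)."""
    return {
        "detection_rate": tp / (tp + fn),       # recall: items of interest found
        "false_positive_rate": fp / (fp + tn),  # irrelevant items wrongly flagged
        "precision": tp / (tp + fp),            # flagged items that matter
    }
```

For forensic use, precision matters as much as recall: every false positive costs investigator time to manually dismiss.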
To automatically analyse evidence, software needs to be able to read all the parts of the evidence relevant to the analysis being performed. For intellectual property theft, for example, detecting the use of a USB flash drive to copy files off a computer requires information from the file system, link files and the registry to be linked together. Multiple sources of data must be combined and used to help detect findings of relevance. This presents challenges, as different formats of information need to be combined in ways that keep their content readable without conflict or loss of detail.
1.8 Research thesis title
The proposed research thesis title is:
Development of an approach using automation to enhance the process of Computer Forensic Analysis
Chapter Two – Literature Review
2 Literature review
The main aim of this research is to discover what existing systems there are for the automated analysis of computer evidence, and to research an approach for providing automation for analysis.
With the proposal of an automatic evidence analysis system, some compromises need to be made. Snort is a network Intrusion Detection System (IDS): it examines network packets to find suspicious activity in network traffic. These packets arrive serially and have a known size, date and time. This makes the task of analysis much simpler, as every packet has an arrival time attached to it, and network traffic follows documented standards and protocols, which helps analysis and the job of detecting anomalies. With the forensic analysis of computers, there are many different places for data to be stored and many different file formats, and most are either not standardised or not open.
2.1 Automating processing of evidence
Richard and Roussev wrote a paper discussing ways of processing evidence so as to lower the analysis load on investigators, motivated by the growth in the size of collected evidence (Richard & Roussev 2006). They propose a distributed system which divides up the evidence so that multiple computers can process it. This is automated processing, not automated analysis, and it seems very similar to what the commercial product FTK does. They include many ways to help cull known irrelevant files, as well as ways to help reveal relevant data by extracting metadata and other additional information. This automated processing concept could be of use for automated analysis, as it is designed to scale processing across many systems and might suit large cases where resources are a concern.
In his paper “A second generation computer forensic analysis system”, Ayers proposes a system for processing evidence which is very similar to that proposed by Richard and Roussev (2006). As with theirs, it is mostly concerned with the processing, searching and hashing of evidence to reduce the investigator’s time spent analysing it; one point of difference is his focus on creating an audit trail of the actions of the software and its users (Ayers 2009).
Farrell (2009), in his thesis, works from a similar premise: storage devices are getting cheaper and larger, so the automatic processing of evidence to create automated reports will help remove “some of the load” from law enforcement staff (Farrell 2009). His analysis method is fairly easy to implement, in that it primarily collects statistics about various forms of data in the evidence and collates them into a report; an example would be listing the most commonly used email addresses, web pages and recently accessed documents. What is nice to see is the creation of different reports for different users, which is very useful for differentiating who on the computer did what. As with Richard and Roussev, he mentions using hash functions combined with known-good and known-bad file sets to detect files to ignore or to flag as important, helping with the extraction of relevant data and the removal of irrelevant data. Statistics are a simple analysis method to implement in an automatic analysis system, and they help provide a profile of the system.
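The statistics-based, per-user reporting Farrell describes can be sketched with simple frequency counting; the record fields below are invented for illustration.

```python
from collections import Counter

def top_items(records, key, n=5):
    """Rank the most frequent values of one field (e.g. visited URL or
    email address) to build a simple activity profile."""
    return Counter(r[key] for r in records).most_common(n)

history = [
    {"user": "alice", "url": "http://example.com"},
    {"user": "alice", "url": "http://example.com"},
    {"user": "alice", "url": "http://news.test"},
]

# Per-user reports: group records by user first, then rank within each group.
by_user = {}
for r in history:
    by_user.setdefault(r["user"], []).append(r)
reports = {u: top_items(rs, "url") for u, rs in by_user.items()}
```

The appeal of this approach is exactly what Farrell notes: it is trivial to implement, yet the resulting profile gives an investigator a quick sense of what each user did.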
Elsaesser and Tanner discuss the idea of using abstract models to “guide” or assist the analysis of a computer which has been attacked, in order to find details of the network intrusion (Elsaesser & Tanner 2001). Like the previous papers, it looks at ways to help the investigator deal with the great amount of data in computers, although in this case it looks specifically at log files to find signs of network intrusions. The abstract models they put forward could also be applied in the general analysis of evidence to determine the capabilities a user has, and from that determine which activities they could or could not have performed. For example, if new software was installed on the system but the user in question did not have the rights to install it, an alert should be raised for further analysis. These concepts are quite viable for detecting abnormal activity but may require more processing time.
Regarding the use of general purpose programming languages, Garfinkel’s paper on the creation of fiwalk.py using the ‘pyflag’ library shows how Python can be used to extract file system information and file metadata, outputting the processed information into easily parsable ‘XML’ (eXtensible Markup Language) files, leaving the output available for others to parse and extend in other programs (Garfinkel 2009). This sort of system helps the development of other tools, as it extracts information and presents it in an open format that is easy to modify and use.
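Consuming such XML output from another program is straightforward; a hedged sketch is below. The `fileobject`/`filename`/`filesize` element names follow the general shape of fiwalk-style output but should be treated as assumptions (real output may also use XML namespaces, which this sketch ignores).

```python
import xml.etree.ElementTree as ET

def list_files(xml_text):
    """Pull file name and size from fiwalk-style XML output.
    Element names here are assumed, not taken from a schema."""
    root = ET.fromstring(xml_text)
    out = []
    for fo in root.iter("fileobject"):
        name = fo.findtext("filename")
        size = fo.findtext("filesize")
        out.append((name, int(size) if size else None))
    return out
```

This is the benefit Garfinkel's design aims at: a downstream tool needs only a generic XML parser, not knowledge of the on-disk filesystem structures.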
2.2 Computer Profiling
Andrew Marrington (2007, 2009, 2011) was involved in the writing of several
important papers on the subject of “Computer profiling”. His premise is to simplify
analysis of evidence by generating a profile of the evidence which will enable the
investigator to get a good idea of what activity has occurred, saving the investigator
from having to do the analysis themselves and enables them to make an easy
choice whether or not they will need to do a full analysis on the evidence. This is a
common theme as storage is increasing in size and being able to create profiles of
evidence will help investigators to quickly see whether or not they may be findings of
interest in the evidence. This is more towards the concept of digital forensics
automatic analysis.
With regard to the forensic reconstruction of computer activity using events, Marrington et al. state that there are four classes of objects to be found on a computer system, namely “Application, Content, Principal, and System”, and discuss the detection of relationships between them (Marrington et al. 2007). They note that finding relationships between objects is important, and that it is complicated on computers because of the many different formats of data; they put forward models and ideas on how it could be done. The proposed profiling system uses information from the file system, event logs, file metadata (using libextractor), Word metadata and user information from the registry, which covers the most important data areas on a computer. However, there is no mention of using link files, jump lists, further registry artefacts (registry dates and times, userassist entries, shell bag entries and shim cache entries (Davis 2012)) or Internet history, which would broaden the information available for analysis. The analysis performed is profiling-based; there is no checking for common items of interest such as signs of IP theft or malware infection or other potential areas of interest, but it does make it much easier for an analyst to get a good overview of the evidence.
A PhD thesis by Andrew Marrington expands on the previous paper with further discussion of computer profiling, in-depth analysis of related areas including data mining and statistical analysis of files based on extractable text, and substantial analysis of different “computational models”. It examines the computational models put forward by Brian Carrier, and by Gladyshev and Patel, concluding that these models are not feasible without a method to automatically describe evidence based on “a finite state model” (Marrington 2009), and decides that a better approach is to model the computer history using the computer event log as a foundation. Analysis of computer evidence is a complex problem, as there are many different types of file formats and metadata, as well as different event logs, to be processed and analysed. He notes that models are sometimes hard to translate to the real world, and that implementation of a model is its best test.
The following paper on detecting inconsistencies in timelines shows what
Marrington's model can perform with a software implementation. Marrington et al.
discuss some specific automatic analysis regarding detecting changes to the
computer clock by examining events from the event log, exploiting the fact that
certain events and actions cannot occur before a user has logged on to the system
(Marrington et al. 2011). Users need to complete the login process before they can
open applications; if there are events showing a user opening applications at a time
when they were not logged in, that could only occur if the time had been changed.
To detect this they correlate file system and document metadata with event log
information, comparing user activity with logon and logoff events. This is an excellent
system which could be paired with more user activity information extracted from the
user's local registry file (NTUSER.dat), internet history and other areas to gain more
detail regarding user activity.
Regarding the detection of changes to the system time, there are additional methods
which could also have been mentioned in the paper, as there are gaps in their
analysis. The Windows event log files are themselves sequential (ring buffers) and
new entries shouldn't be older than previous entries, which can easily be detected by
sequentially parsing the event log files, sorting by event file offset and comparing the
dates of consecutive entries. There are further places which can be examined to
detect time changes, such as thumbs.db thumbnail database files, the NTFS USN
Journal, Windows restore points and Volume Shadow Snapshots, which all have
sequential entries.
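The sequential-offset check described above can be sketched in a few lines. This is an illustrative sketch only: the record offsets, timestamps and tuple format are hypothetical inputs, not the output of a real Windows event log parser.

```python
from datetime import datetime

def find_time_regressions(entries):
    """Given event-log records as (record_offset, timestamp) tuples, in
    the order they were written, flag records dated earlier than any
    record written before them - a sign the clock may have been set back."""
    anomalies = []
    newest = None
    for offset, ts in entries:
        if newest is not None and ts < newest:
            anomalies.append((offset, ts))
        newest = ts if newest is None else max(newest, ts)
    return anomalies

# Hypothetical parsed log: the third record was written after the second
# (larger file offset) yet carries an earlier date.
records = [
    (0x100, datetime(2013, 5, 1, 9, 0)),
    (0x180, datetime(2013, 5, 1, 9, 5)),
    (0x200, datetime(2013, 4, 30, 23, 0)),
]
print(find_time_regressions(records))
```

The same comparison applies unchanged to any of the other sequential sources mentioned (USN Journal entries, restore points), as long as they can be read in write order.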
Garfinkel, the creator of the Bulk Extractor tool, compares different computers using
what he calls “Cross Drive Analysis” over the output of the Bulk Extractor program to
find which computers are related (Garfinkel 2006). The Bulk Extractor tool processes
a disk image at sector level; it doesn't read the file system or parse any files, it just
processes the text it can extract from each sector. This in itself isn't all that
sophisticated, but from it the tool is able to give a rough profile of the computer's
activity. Most tools are document or file system based and don't focus on analysing
unallocated areas, which this tool does. The gathered profile data can then be
compared to other computers' profiles to see if there are any connections. The main
weaknesses are that it cannot read fragmented files properly (although modern file
systems are self-defragmenting, so this is less of an issue) and that it has limited
ability to read inside compressed or encrypted files. Bulk Extractor is quite
impressive in that it can decompress and read some compressed file types, and
from looking at its roadmap this will only improve. Its statistical abilities, combined
with the data it collects, show that reporting what occurs most often will frequently
reveal a profile of behaviour on the machine. Examples are the most visited internet
URLs and the most frequently occurring email addresses; these show how useful
statistics can be in quickly bringing information of interest out of a sea of irrelevant
data. Analysis of user activities is only lightly covered by this tool, as it doesn't parse
the registry, link files, event logs, file system or other areas of file metadata.
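The frequency statistics described here reduce to a simple counter over features pulled out of raw text. The following sketch illustrates the idea only; the regular expression and sample text are assumptions, not Bulk Extractor's actual implementation.

```python
import re
from collections import Counter

# Simplified email pattern for illustration.
EMAIL_RE = r"[\w.+-]+@[\w-]+\.[\w.-]+"

def top_features(raw_text, pattern, n=10):
    """Count every match of a feature pattern (email address, URL, ...)
    in raw extracted text and return the n most frequent matches."""
    return Counter(re.findall(pattern, raw_text)).most_common(n)

# Hypothetical text recovered from disk sectors.
blob = "contact a@x.com or b@y.org; a@x.com wrote again; a@x.com replied"
print(top_features(blob, EMAIL_RE))
```

Running the same counting over URLs extracted from every sector of a drive is what produces the “most visited / most occurring” profiles discussed above.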
2.3 Timeline analysis
To extract as much intelligence as possible with regard to activities, dates and times,
timelines of computer activity were created to help with analysis. Initially
investigators would create a timeline of file system activity and use that for the
analysis of incidents; as this proved helpful, the method has grown, with
investigators parsing as many areas of a computer as possible to extract dates,
times and signs of activity. This was originally done manually by combining in Excel
the internet history, parsed event logs, registry entries and file system data culled
from many different tools. This was a very manual process using different tools with
different outputs, which all needed to be converted into a common format for
analysis.
This manual and time-consuming job was automated by the creation of the
log2timeline software, as described in Guðjónsson's 2010 seminal paper on
mastering the super timeline. The ‘log2timeline’ software improved the making of
timelines by parsing many different areas of the computer for artefacts, dates and
times and combining them all together into a format which is easy for analysts to
use; Guðjónsson calls it the “Super Timeline”. The ‘log2timeline’ tool also provided
tools to help with basic filtering.
Timelines help with the analysis of incidents, as an investigator can look at a time
period of interest and examine all the recorded activity for that time. As an example,
looking at the timeline around the time a virus infected the system can help find all
the virus's changes to the file system and registry, pinpoint the infection vector, and
discover how it starts and where it is stored. Creating timelines is a lot easier using
the ‘log2timeline’ tool, although the analyst still has to do the analysis themselves,
which can be a burden given the vast amount of data that these timelines often
contain. To analyse the timeline fully, analysts need to understand the meaning of
the entries and how they fit into the operation of a computer (Guðjónsson 2010),
which requires a lot of knowledge.
Guðjónsson had concerns that the ability of the ‘log2timeline’ tool to extract so much
detailed information raises the need for filtering to remove the irrelevant information
and keep just the relevant. At the moment there is no simple automated way to do
this, apart from knowing time periods of interest to focus on or having specific
whitelist or blacklist keywords to help refine the timeline. Being able to easily find
suspicious entries would speed up investigations while quickly reducing the amount
of information the investigator needs to wade through; this raises the requirement for
some way to quickly and easily remove irrelevant or known-good entries so as to
focus on the items of interest (Guðjónsson 2010), which is something best
automated. Creating a graph of the number of entries over time is a method which
helps to easily visualise the timeline and assists with the detection of spikes and
surges of activity, as well as gaining an overall profile of activity on the computer.
Easy visualisation of the timeline will also help with the presentation of reports, as it
will help non-technical people like lawyers and judges to understand the data. Most
of the ideas proposed by Guðjónsson for the simplification of the output of
‘log2timeline’ are directly applicable in an automatic analysis system.
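The entries-over-time graph suggested here amounts to bucketing timestamps and flagging unusually busy buckets. A minimal sketch, with made-up timestamps and an arbitrary spike threshold:

```python
from collections import Counter
from datetime import datetime

def activity_histogram(timestamps, bucket="%Y-%m-%d %H:00"):
    """Count timeline entries per (hourly) bucket."""
    return Counter(ts.strftime(bucket) for ts in timestamps)

def spikes(hist, factor=2):
    """Return buckets whose count exceeds `factor` times the mean count."""
    mean = sum(hist.values()) / len(hist)
    return {b: c for b, c in hist.items() if c > factor * mean}

# Hypothetical timeline: quiet activity at 09:00 and 10:00, a burst at 11:00.
entries = ([datetime(2013, 5, 1, 9, 0)] * 2
           + [datetime(2013, 5, 1, 10, 0)] * 2
           + [datetime(2013, 5, 1, 11, m) for m in range(20)])
hist = activity_histogram(entries)
print(spikes(hist))
```

The same histogram feeds directly into a bar chart for reports, which is where the visualisation benefit for non-technical readers comes in.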
2.4 Analysis
As part of his model, Marrington proposed “Four phases of analysis” which fit with
his event log focus. They are discovery, content categorisation and extraction,
relationship extraction, and event correlation, and they pose an excellent model for
the processing of data into more manageable forms (Marrington et al. 2011). The
last of the four phases, “Event correlation”, is required to link all the discovered data
and relationships to the event log entries; a system not focussing on log entries
could leave out this phase, an example being one using the output of Guðjónsson's
2010 ‘Log2Timeline’ software.
The ‘Log2Timeline’ software is quite impressive in the breadth of metadata and logs
it is able to extract. As mentioned above, Marrington's system extracted file system,
metadata and log information from the most important places, but this pales in
comparison to the list of places that ‘Log2Timeline’ extracts file system, metadata
and log data from. ‘Log2Timeline’ extracts its information from a wide array of files
and metadata and provides a vast amount of data, but fortunately it is in a broadly
standardised format, which eases analysis.
Super timelines make it a lot easier to visualise activity on the system at any point in
time. Temporal analysis is the analysis of events around a specific time. So with a
malware infection, temporal analysis of the timeline at the time of the infection can
provide us with the point of infection, all malware-related files and their locations, the
malware's method of start-up, and any other changes the malware may have made.
The timeline can also be analysed for spikes of activity, which can often indicate
events of interest; some examples are file copying, deletion of many files, antivirus
scans, and software installation.
The log2timeline software doesn't do much in the way of analysis itself, but it does
collect and parse information from many areas and presents it in a simple format
ready for analysis. It would be interesting to see the results if Marrington were to
combine the information that Log2Timeline extracts with their analysis system, and
see the additional level of detail and relationships that their report tool would be able
to produce.
Peisert and Bishop discuss modelling system logs with regard to the detection of
actions by intruders. Their paper is mostly applicable to the implementation of
forensic readiness, with the detection of suspicious incidents, as compared to digital
forensic analysis. They do, however, discuss an analysis model which they label
“Requires/Provides”, involving the concept of “Capabilities” which are needed to
attain a goal, as well as the capabilities then provided by attaining that goal (Peisert
& Bishop 2007b). With regard to forensic analysis this can be used, for example, to
detect that software has been installed or used which provides capabilities that don't
fit with the types of activity expected from the user, like the use of peer-to-peer
software such as FrostWire, or even visiting certain types of websites.
Carrier and Spafford used an automated analysis technique of looking for outlier
files by discovering and classifying normal activity (Carrier & Spafford 2005). Outlier
files or activity are those which are not normal and can be detected once rules have
been created which whitelist normal activity. An example is the discovery of
executable files in the C:\Windows\Fonts directory, which is abnormal and should be
detected as outlier activity. Categorisation of outlier files can be helpful for automatic
analysis, yet it depends on a good rule set of blacklisted and whitelisted files. This is
also directly applicable to analysis of the registry with regard to detection of outlier
“autoruns”. Autoruns is a term referring to programs automatically run at system
startup, a mechanism commonly used by unwanted programs like malware and
viruses.
Chapter Three - Methodology
3. Methodology
The research methodology will be similar to the agile software development model,
working iteratively with analysis, development, implementation and evaluation as
distinct stages.
It will involve examining the data which the system collects after certain suspicious
actions are performed. Windows XP will be the operating system of choice, as it is
very well understood by forensic tools and provides a wealth of data for parsing.
The methodology to address each of the sub-problems is proposed below.
Sub-Problem 1 – What are the existing tools for extracting relevant information from
evidence, and what is the quality of the information these tools extract?
Comparison will be done between the Log2timeline tool, the Plaso tool and different
specific tools for the extraction of file metadata. The use of the Plaso and
log2timeline tools in investigations isn't quite so common as it could be. The
integration of these tools with an automated analysis system will help reduce many
of the issues there are with the many different tools for the extraction of metadata.
These tools will be compared and the one most suitable for integration with an
automated analysis system will be used.
Sub-Problem 2 – What solutions are there for parsing the many undocumented file
and metadata formats which are yet to be discovered and documented but could
contain information of interest?
Microsoft and Apple will both continue making new operating systems with new file
systems, file types and metadata to be extracted. Collaboration on the ‘log2timeline’
and Plaso tools, by adding the new file types and metadata, will help ensure that
new file types and metadata can be parsed and understood by forensic tools.
Forensic analysis systems would need to keep up to date and have new rules to
deal with the new metadata.
Sub-Problem 3 – How to ensure a low false-positive and false-negative detection
rate while keeping a high detection rate of relevant information?
Research will be conducted regarding text-based analysis systems, rule-based
analysis systems like Snort, and statistical analysis systems, with comparisons
made to discover which methods provide a more reliable analysis system with
higher detection rates and more relevant results. Testing will be carried out
regarding which rules work better, with comparisons to find the most reliable
methods.
Sub-Problem 4 - What approach can be used to enhance digital forensic analysis
with automation?
Comparison with similar tools like Bayesian spam filters, and primarily with Snort and
how its rule-based system works, as well as examining and comparing different
analysis-based systems (statistics, Markov chains, natural language processing,
rule-based systems) to find an approach for automating analysis.
3.1 Focus of research
Guðjónsson notes, with regard to future work in the computer timeline creation area,
that “The need for a tool that can assist with the analysis after the creation of the
super timeline is becoming a vital part for this project to properly succeed”
(Guðjónsson 2010), and the focus of this research will be to work towards assisting
with that analysis.
The concept for the implementation of such a system is based on the open-source
network Intrusion Detection System (IDS) tool called Snort. This is a tool which reads
a standardised data source (in Snort's case, network packets), analyses it using
user-created rules and creates alerts for packets of interest.
Statistics, as well as profiling each user account, will help the investigator to get an
idea of the activity of each user on a computer, which when combined will provide a
profile of the computer itself. A rule-based system like Snort could use the output of
the ‘log2timeline’ program as a standardised source of computer activity information
for analysis. With a flexible enough rule language, investigators will easily be able to
write new rules to detect new and unforeseen activity.
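Using timeline output as a standardised source for rules might look like the following sketch. The CSV columns are heavily simplified (real log2timeline output has more fields such as timezone, MACB flags, user and host) and the rule predicate is hypothetical, not Snort syntax:

```python
import csv
import io

# Hypothetical, heavily simplified timeline rows in CSV form.
TIMELINE_CSV = """date,time,source,sourcetype,desc
05/01/2013,09:12:00,WEBHIST,Chrome History,http://example.com/ visited
05/01/2013,09:15:00,LNK,Shortcut,file on E:\\ opened
"""

# Each rule is a name plus a predicate evaluated over one timeline row.
RULES = [
    ("removable-device file access",
     lambda row: row["source"] == "LNK" and "E:\\" in row["desc"]),
]

def run_rules(csv_text, rules):
    """Apply every rule to every timeline row and collect alerts."""
    alerts = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, predicate in rules:
            if predicate(row):
                alerts.append((name, row["desc"]))
    return alerts

print(run_rules(TIMELINE_CSV, RULES))
```

Because the timeline format is uniform, adding a detection for new activity is just a matter of appending another (name, predicate) pair, which is the flexibility argued for above.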
CHAPTER Four – Computer profiling
4. Introduction
The idea of computer profiling is to provide the investigator with a report which
describes the computer and an outline of user activity. This should give them a
rough idea of who has used the computer, some of the activity that has occurred as
well as when it occurred.
This can help investigators by saving the time that they would need to do the
profiling themselves, especially as some of the profiling that can be done
automatically, such as statistics, is very hard to do manually.
Good profiling data can help the investigator to quickly direct and focus their
investigation on the relevant areas. Computer evidence presents a “quantity and
complexity” problem for the investigator, especially as computer storage and
complexity keep increasing, so computer profiling can help filter out less relevant
information and focus attention on the specifics the investigator needs (Marrington
2009).
With all analysis, compromises need to be made. Some information is a lot more
complicated to extract than other information, and some is of lower priority. A choice
needs to be made between what is more useful and what is harder to extract, to
obtain a useful balance. As Log2Timeline and Plaso extract a great deal of detail,
this information can be processed to provide computer and user profile information.
The following lists some of the potential information that these reports could contain:
The “Quick Report” could contain the following:
System profile:
• Operating system details (Windows 7, Home Premium, 64-bit, …)
• Hardware details (CPU details, RAM capacity, storage information, add-on cards)
• Users (list of all users)
• Time zone
• Installed software
The “Medium Report” could contain the following:
(Including the contents of the “Quick Report”)
System profile:
• Categorising of installed software with flagging of suspicious software.
• Listing the top ten large files on the system (can be used to find Truecrypt
containers, large archives/videos/mailboxes)
User profile:
• Connected USB storage drives per user.
• Statistics of “user created” files.
• Top viewed websites.
• Recent website domains.
• Recently opened files.
• Most recently created, modified or accessed files in the user profile.
• Users' profile paths.
• Date range showing the first and last date that each user used this computer.
The “Exhaustive Report” could contain the following:
(including the contents of the “Medium Report”)
System profile:
• Activity spikes: find activity spikes for the system. These often show times of interest.
User Profile:
• Categorising internet history to profile users' web activity and flag suspicious
sites.
• List folders which contain only audio or video files (potential pornography). It is
common for people with Child Abuse Material (CAM) to catalogue it in different
folders, so detecting such behaviour is important. (It is also common for ordinary
users to structure their data this way, but this still helps in CAM cases as it can point
to potential areas of interest.)
• Statistics on top file authors (taken from file metadata). Many file types record
the name of the user who edited them (especially Office documents).
• List recent internet searches.
• List logon and logoff dates for each user. Potentially flag logins at abnormal
hours or days.
• List all occurrences of sequentially created or accessed files, as this could show
file copying.
• From the NTFS USN Journal, find all renamed files and flag all occurrences
where the new filename appears in accesses to removable devices. One anti-forensic
method used to try to avoid forensic analysis is to rename a file before copying it or
opening it from the removable device. An example is renaming the file
“company customerlist.xls” to “ccl.xls”, a much more innocuous file name, before
copying it.
• Activity spikes: find activity spikes for each user. These often show times of interest.
4.1 Methods
Many forensic tools already have the ability to create basic computer profiles of
evidence. EnCase 6 can create computer profile reports, and these contain a lot of
useful information about the computer, its configuration and its users, although they
don't provide any information regarding user activity. Its computer profile report
mostly covers what is mentioned under the “Quick Report” in the section above.
The “Forensic Scanner” program written by Harlan Carvey (Carvey 2011) goes
further by doing some basic analysis for each user as well as detecting some signs
of malware infection. Compared to the lists above, it would be like creating a “Quick
Report” with some information from the “Medium Report”.
The normal way to extract much of the data mentioned under “Medium Report” and
“Exhaustive Report” is by manual manipulation of extracted information. For
example, the internet history domain list can be exported, and with an easily written
script (in this case a simple Python script) the domains can be counted and a list of
the most visited domains and their counts can be created and used for analysis. The
issue is that this is quite a manual process and needs to be repeated for each new
case.
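The domain-counting script mentioned above might look like the following sketch; the exported history list is a hypothetical example, not real case data.

```python
from collections import Counter
from urllib.parse import urlparse

def top_domains(urls, n=5):
    """Count visits per domain from an exported URL list and return
    the n most visited domains with their counts."""
    return Counter(urlparse(u).netloc for u in urls).most_common(n)

# Hypothetical exported browser history.
history = [
    "http://www.facebook.com/home",
    "http://www.facebook.com/profile",
    "http://intranet.example.com/timesheet",
]
print(top_domains(history))
```

Wrapping such one-off scripts into a reusable profiling stage is exactly the repetition problem the proposed automation is meant to remove.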
CHAPTER FIVE – Log2timeline and Plaso
5. Introduction
Kristinn Guðjónsson's 2010 paper concerned the extraction of metadata and file
system data, and culminated in the development of his log2timeline tool, which
extracts data from different sources and types and combines them to provide a
timeline of system activity (Guðjónsson 2010).
Log2timeline development continued, with further file parsers being added up to
version 0.65, by which stage development had moved to the Plaso tool. Plaso is
written in Python, whereas ‘Log2Timeline’ was written in Perl.
Manual extraction of metadata has been the traditional method for analysts. One
issue with this is that not all tools are equal, and some tools miss a lot of information.
Analysts need to keep current with the state of metadata extraction for different file
types, because some programs will be unable to extract all the possible information
and may extract only part of it. The concept of creating timelines by combining many
different parsed metadata logs together wasn't widely used, partially because of the
effort required to combine different types of data. The ‘log2timeline’ and ‘Plaso’ tools
removed a lot of that complexity.
If an investigator were to ignore these two programs and their ability to extract
metadata for a timeline, they would have to extract the data manually using many
different parsers and then combine the output of these different tools into one
format. This isn't simple, as different tools produce different columns of information
which need to be manipulated before they can be combined.
A test was performed to compare manual methods against automated ones using
EnCase. Using EnCase 6 with its link file, internet history and event log parsers, as
well as file system information, the output was combined using Excel. Combining the
different tables of information from the different parsers was quite a time-consuming
job: the link file parser output and the file system information each contained at least
three different columns of date/time information which all needed to be added to the
timeline, and the parsed event logs and link files both contained multiple columns of
information which needed combining.
Log2Timeline makes what would be an exhausting manual job a relatively simple
processing job. A process that required many different tools (commercial, free and
open-source) has been reduced to the use of one tool. When it came out it was a
ground-breaking tool for the creation of timelines; there was at the time no
commercial tool which could do this, and the same is true today.
‘Plaso’ is a newer version of this tool which has been rewritten in Python to be
easier to extend with third-party-developed Python code because of its open
architecture, but it currently hasn't reached feature parity with log2timeline in the
number of different metadata file types parsed. (Appendix B contains a full list of file
types that Log2Timeline parses.)
5.1 Comparison
The following table contains a quick comparison between Plaso and Log2Timeline.

| Attributes                          | Log2Timeline | Plaso                                     |
|-------------------------------------|--------------|-------------------------------------------|
| Maturity                            | Mature       | Not mature yet                            |
| Metadata formats parsed             | Many         | Medium (most important are covered)       |
| NTFS Volume Shadow Snapshots (VSS)  | Unable       | Able to parse                             |
| Source – Folder                     | Able         | Able                                      |
| Source – Disk image                 | Unable       | Able                                      |
| Development                         | Medium       | Easier, Python is simpler to develop with |

Table 1 – Tool comparison
‘Plaso’ currently can't parse all the different file formats that L2T (Log2Timeline) can,
but it does parse the most useful formats. Its ability to access NTFS Volume Shadow
Snapshots (VSS) and disk images directly is a big jump in capability over L2T. It is a
lot easier to develop for, and the developers are looking at addressing the missing
parsers.
The ‘Plaso’ developer website has a Google Docs spreadsheet which contains a list
of currently implemented features and wanted features. Users can collaborate on
existing features and propose new ones, including file format parsers. This ability to
easily collaborate will help resolve issues with unknown file formats. Unknown file
formats, and their reverse engineering and documentation, will still be a problem, but
having a program (Plaso) with an easily accessible collaborative development
system will make it much easier to implement new parsers for newly discovered file
formats. There are no tools which can automatically parse and decode unknown file
and metadata formats; through collaboration and data sharing, the digital forensics
community can help make sure that new formats can be parsed.
CHAPTER SIX – Analysis of different analysis systems
6. Introduction
Before examining what software and algorithms can be used for analysis, it is best
to look first at what types of analysis may potentially be used. Once the types of
analysis have been examined, different analysis systems can then be examined for
suitability.
What are some common types of digital forensic analysis? The following are the
most common (Carrier 2003; Davis 2012; Turner 2006).
6.1 Incident Response / Malware (Malicious Software)
The purpose here is to look for unauthorised software on the computer. Malware
includes viruses, worms, rootkits, Trojan horses and keyloggers. Analysis of
malware involves looking for the malware's initial entry point (initial infection vector),
its propagation mechanism, any artifacts left on the system and its persistence
mechanism (Baker, Hutton & Hylender 2011; Carvey 2013).
6.2 Intellectual Property Theft (IP Theft)
The purpose here is to find all possible methods for copying IP and look for any
signs that this has occurred. Common areas for analysis are emails with attached
files being sent to the staff member's personal email address, and removable
storage devices connected on the staff member's last days of work where access to
work-related files is found (Australian Institute of Criminology 2008; Goodman
2001). This is expanded on in Chapter 7, which shows the implementation of scripts
using Plaso to help with IP theft investigations.
6.3 Access to Child Abuse Material (CAM)
The purpose here is to look for collections of media files (such as videos and
photos), find signs of large container files, find file copying methods, and look for
suspicious internet browsing activity as well as the use of evidence-hiding software
(Watt 2012).
Excluded from the list are more specific types of analysis. Specific analysis
commonly involves proving that something did or didn't occur and who was using
the computer at that time; an example is confirming when the computer was used
and by whom, to confirm someone's alibi. Electronic discovery is also not listed, as it
doesn't involve much real analysis (mostly just searching and exporting relevant
search hits).
6.4 Snort
Snort is often seen as being both a signature-based program and a rules-based
program (Eckmann 2001). As Snort has a rules-based system which can detect
suspicious activity, as well as the flexibility to support creation of new rules, it can
stay current with the detection of threats on the network. So for forensic analysis, a
rules-based system which also supports the sharing and creation of new rules will
help the analysis process and keep the analysis system current (Aickelin, Twycross
& Hesketh-Roberts 2008; Roesch 1999).
There is the example of the ‘RegRipper’ program by Harlan Carvey (Guðjónsson
2010; Stevens 2006). The continual addition of new analysis plug-ins and updating
of existing ones has enabled it, for example, to add the parsing of registry shell bag
artefacts (Carvey 2012).
Yet all the discussion and rules mentioned regarding Snort relate to matching one
packet only; there is no mention of linking together information from various packets
for a match, a concept which would be useful for digital forensics. An example of this
is finding link files pointing to removable devices near the time a USB storage device
was inserted, which suggests that the files the link files point to came from that USB
device. Further research into Snort confirmed that Snort matches per packet only,
with no rules able to reference prior packets. Being able to examine the state of
multiple packets in one rule would be very useful.
The Prelude IDS (intrusion detection system) has rules written in Python which
integrate directly with the program and have full access to Python syntax, which
makes it very flexible for rule writing (Zaraska 2003).
A flexible rule-based analysis system like Snort, but able to reference multiple
entries at once, would be very useful for digital forensic analysis.
6.5 Markov chain analysis methods
Security Information and Event Management (SIEM) systems are used for the
analysis of log files and provide much useful information regarding log file analysis
and correlation (Swift 2006). When it comes to analysing log files, “Markov chain”
analysis methods, more specifically the Hidden Markov Model (HMM), have the
potential to be quite useful for the detection of abnormal entries as well as the
filtering out of known-good entries.
With HMM, known-good log files are analysed to create a baseline of normal activity,
and this information can then be used in the analysis of evidence log files to look for
anomalies (Peng, Li & Ma 2005). HMM is a system where probabilities are used to
calculate the expected next log entry, and major deviations from the norm can be
flagged (García-Teodoro et al. 2009). This has the potential to detect rarely
occurring activity, such as a failing hard drive, as well as artefacts of a malware
infection.
There has been research suggesting that HMM can be effective but requires
considerable computing resources (Peisert & Bishop 2007a). A modified Bayes
algorithm has been used and compared to HMM with similar behaviour (Peng, Li &
Ma 2005). Principal Component Analysis (PCA) has been proposed as an algorithm
which is simpler to implement and more efficient for processing than HMM (Xu et al.
2009). These analysis methods could be quite useful for examining log files, but are
not expected to be of much use for file system or registry data.
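The baseline idea can be illustrated with a plain first-order Markov chain, which is a simplification of a full HMM (an HMM additionally models hidden states). The event names and the probability threshold below are assumptions for illustration only:

```python
from collections import Counter, defaultdict

def train_transitions(event_sequence):
    """Build first-order transition probabilities P(next | current)
    from a known-good sequence of log event types."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(event_sequence, event_sequence[1:]):
        counts[cur][nxt] += 1
    return {cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for cur, c in counts.items()}

def flag_anomalies(model, sequence, threshold=0.05):
    """Flag transitions that were rare or never seen in the baseline."""
    flagged = []
    for cur, nxt in zip(sequence, sequence[1:]):
        p = model.get(cur, {}).get(nxt, 0.0)
        if p < threshold:
            flagged.append((cur, nxt, p))
    return flagged

# Baseline of normal sessions; the event names are made up for illustration.
baseline = ["logon", "app_start", "app_exit", "logoff"] * 50
model = train_transitions(baseline)
print(flag_anomalies(model, ["logon", "service_fail", "app_start"]))
```

Transitions never observed in the known-good baseline (here, anything involving `service_fail`) get probability zero and are flagged, which is the filtering behaviour described above.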
6.6 Further analysis methods
There are some other analysis techniques which are worth considering for digital
forensic analysis.
White/black listing:
A simple alternative to HMM and PCA is to use white/black listing. There are a lot of
known-bad entries and keywords that could be used (Simpson et al. 2011).
Blacklists can be used to flag known suspicious filenames, internet traffic and
applications for immediate analysis.
Child Abuse Material could potentially be detected by checking all of ‘Plaso's’
timeline information for known keywords. Expanding on this, keywords specific to
different types of crimes or activity could also be used to quickly flag different types
of behaviour, for example finding drug-related or pornographic material.
White listing and blacklisting of known services and programs which run at startup
would remove known entries, flag known bad entries and reveal unknown entries,
which could be potential areas of analysis for a malware case (Wong 2007). The
system and network service accounts under Windows cannot be logged into by
ordinary users, so any user-like activity, such as internet history or link files, found
under these profiles should be detected by a blacklist. This is a simple analysis
method which can quickly detect known bad entries as well as unknown entries for
further analysis.
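As an illustrative sketch (not part of ‘Plaso’ itself), the white/black list triage described above can be expressed in a few lines of python; the list contents and entry names here are hypothetical, not real known-good or known-bad data.

```python
# Sketch: white/black list triage of startup entries. The list contents
# and the entry names passed in are hypothetical examples.
WHITELIST = {"ctfmon.exe", "explorer.exe"}   # assumed known good
BLACKLIST = {"keylogger.exe", "nc.exe"}      # assumed known bad

def triage(entries):
    """Split startup entries into known-good, known-bad and unknown lists."""
    known_good, known_bad, unknown = [], [], []
    for name in entries:
        key = name.lower()
        if key in WHITELIST:
            known_good.append(name)
        elif key in BLACKLIST:
            known_bad.append(name)
        else:
            unknown.append(name)   # candidate for manual analysis
    return known_good, known_bad, unknown

good, bad, unk = triage(["ctfmon.exe", "nc.exe", "mystery.exe"])
```

Known-bad entries would be raised as alerts, while the unknown list becomes the analyst's starting point for manual examination.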
6.7
Statistics
Statistics can be used to find common activity. For example, discovering that a staff
member's most accessed website on their work computer is facebook.com suggests
that they may not do much work. Statistics can thus show what the user commonly
accessed or used, which user account was used the most, and the time periods in
which it was used.
Statistics can also detect instances of sequentially accessed or created files, which
is common when files are copied and could point to intellectual property theft. This
would appear as a sudden spike in the last-accessed or created date/times for files
on the file system.
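A minimal sketch of this spike detection, assuming file-creation timestamps have already been extracted into python datetime objects (the threshold and timestamps below are illustrative):

```python
from collections import Counter
from datetime import datetime

# Sketch: flag minutes containing an unusually high number of
# file-creation events, which can indicate bulk copying.
def creation_spikes(timestamps, threshold=3):
    """Return the minutes in which at least `threshold` creations occurred."""
    per_minute = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    return sorted(minute for minute, n in per_minute.items() if n >= threshold)

# Four creations in one minute plus one isolated event (made-up data).
stamps = [datetime(2013, 10, 27, 20, 28, s) for s in (1, 2, 3, 4)] + \
         [datetime(2013, 10, 27, 9, 0, 0)]
spikes = creation_spikes(stamps)
```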
Most users use their computer over a certain time frame; for office computers this is
usually business hours, so the detection of activity outside these times might signify
a malware infection or other suspicious activity (Guðjónsson 2010).
Rules based analysis:
The flexibility of rule based systems has already been examined in the discussion of
the IDS Snort. Rules could also be used to tag entries, and those tags could in turn
be used in further rules (Garfinkel 2009; Marrington et al. 2011).
Some example rules:
Example rule 1: if an application in the category “wiping tool” exists AND the registry
MRU is empty, then raise the alert “possible evidence cleaning has occurred”.
Thresholds could be used to detect simple spikes in certain activity.
Example rule 2: raise an alert when more than 2 password failures occur within two
minutes.
Example rule 3: raise an alert when more than 10 file system create times occur
within two minutes (could detect copying).
A potential issue with rules is the flexibility of the rule language. The Prelude
Intrusion Detection System, with its rules written in python, has a great deal of
flexibility (Zaraska 2003). This concept would be directly applicable to creating rules
for ‘Plaso’ as well.
Some examination was made of model based diagnosis systems, which found that
they are considerably more complicated because they try to model the whole system
and can be very computationally intensive (Elsaesser & Tanner 2001).
CHAPTER SEVEN – Tests and implementations
7. Introduction
For automated digital forensic analysis to occur there are a few prerequisites. An
easily available source of data to analyse is important; ‘Plaso’ is able to extract and
provide this data, as well as act as a platform for the development of automatic
analysis. This chapter examines how ‘Plaso’ can be utilised to provide automated
digital forensic analysis. The long term plan is to contribute the results of these
findings back into ‘Plaso’.
This was implemented using version 1.0.2 of ‘Plaso’ running under Debian. Tests
were performed to understand the behaviour of ‘Plaso’ and find limitations which
might need to be worked around. One initial limitation was that file sizes were not
gathered; the developers were helpful and added that feature.
Testing and analysis were originally only going to be performed with Windows XP,
but after preliminary tests with Windows 7 found some interesting artefacts it was
decided that Windows 7 was worth including in the tests.
For the tests performed, manual analysis methods and results were compared
against the results gained by running analysis scripts over the metadata gathered
by ‘Plaso’, to make sure the results from both types of analysis were the same.
The approach used was to first test the viability of the proposed filtering and
analysis methods with Plaso tools like psort.py and pinfo.py, and then implement
them in python. This is demonstrated in the IP theft section of this chapter, which is
also described in Chapter 6. The final intention is to contribute the python code back
into the Plaso codebase to add analysis abilities to the tool.
7.1
Profiling
When ‘Plaso’ is used to process a computer it collects an initial profile of the
computer to help with processing. This data is very helpful for forensic analysis
because it provides essential details regarding the computer's configuration, saving
the investigator from having to find this information out themselves.
The following text box contains example profile information gathered by ‘Plaso’ at
the time the evidence is processed and the metadata extracted. ‘Plaso’ gathers this
information to help it understand the environment and decide what metadata needs
to be collected. In this case it can be seen that the operating system is Windows XP,
so ‘Plaso’ will not try to extract Apple OSX or Linux specific metadata.
windir = //WINDOWS
hostname = XP
users =
[
{
u'name': u'systemprofile',
u'path': u'%systemroot%\\system32\\config\\systemprofile',
u'sid': u'S-1-5-18'},
{
u'name': u'LocalService',
u'path': u'%SystemDrive%\\Documents and Settings\\LocalService',
u'sid': u'S-1-5-19'},
{
u'name': u'NetworkService',
u'path': u'%SystemDrive%\\Documents and Settings\\NetworkService',
u'sid': u'S-1-5-20'},
{
u'name': u'user',
u'path': u'%SystemDrive%\\Documents and Settings\\user',
u'sid': u'S-1-5-21-1957994488-1409082233-839522115-1003'}]
zone = UTC
time_zone_str = AUS Eastern Standard Time
guessed_os = Windows
sysregistry = //WINDOWS/system32/config
systemroot = //WINDOWS/system32
osversion = Microsoft Windows XP
store_range = (1L, 1L)
code_page = cp1252
Figure 1 – Plaso pinfo.py Windows XP computer profile information.
The computer profile data gathered by ‘Plaso’ provides a foundation for the
development of additional analysis abilities.
With normal computer forensic analysis, the most that could usually be expected for
a computer profile was what Encase 6's Initialise Case would provide. It gave details
about the computer hardware, installed software, the operating system and a list of
users, but the information was not easy to work with (especially the hardware and
software sections) and provided no detail regarding user activity. The software and
hardware sections were presented too verbosely, with too much white space, making
them hard to read. In contrast, the profile information above is directly accessible to
python scripts utilising Plaso to analyse its data store files (which contain the
extracted metadata).
The above profile generated by ‘Plaso’ contains the most essential information that
would have been contained in an Encase 6 Initialise Case report. With the data
already extracted by ‘Plaso’, and with the right analysis rules, something with similar
content to the medium report referred to in the computer profiling section could quite
easily be created.
Simple rules with ‘Plaso’ were created to extract USB history (Wong 2007),
“UserAssist” user activity reports (Stevens 2010), Internet history, parsed link files
and largest-file reports; some of this information can be seen below.
7.2
Large Files
Finding and examining the largest files in the evidence is a common task at the
beginning of analysis, as it may reveal container files (like .zip, .rar and .7z
archives), email mailbox files (for example .PST or .OST Outlook mailbox files),
large video files (which could be copies of downloaded movies), or even encrypted
volumes. For example, Truecrypt encrypted volumes are usually very large files, so
they can often be detected by their size.
As ‘Plaso’ has a field containing the file size, it is quite easy to iterate over the files
in the evidence looking for the largest ones. With Encase the process is to show all
files and then sort by size; this is a manual process in Encase, whereas with ‘Plaso’
it can be automated.
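A minimal sketch of this automation, operating on hypothetical per-file metadata records rather than an actual Plaso store file:

```python
# Sketch: report the N largest files from per-file metadata records.
# The record layout ("filename"/"size" keys) is an assumed simplification.
def largest_files(records, top_n=10):
    """Return the top_n records sorted by descending file size."""
    return sorted(records, key=lambda r: r["size"], reverse=True)[:top_n]

# Illustrative records echoing the tables below.
records = [
    {"filename": "/pagefile.sys", "size": 1_610_612_736},
    {"filename": "/temp/disk_image.raw", "size": 20_971_520_000},
    {"filename": "/WINDOWS/Fonts/batang.ttc", "size": 16_258_580},
]
top = largest_files(records, top_n=2)
```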
Using Plaso, the following large files were found for each of our test systems.
For Windows 7:
Size (Bytes)      Filename
20,971,520,000    /temp/disk_image.raw
5,883,215,872     /pagefile.sys
4,412,411,904     /hiberfil.sys
4,193,572,720     /Windows/RE_DRIVE/recoverycd_iso2/OSImg2.swm
4,185,625,598     /Windows/RE_DRIVE/RECOVERYCD_ISO/RECOVERY_DVD/OSImg.swm
3,295,094,784     /temp/BT5R3-GNOME-32.iso
2,811,326,464     /temp/BT5R2-KDE-64.iso
1,556,324,352     /System Volume Information/{15d3b509-a95e-11e2-9343-6c626d311a8a}{3808876b-c176-4e48-b7ae-04046e6cc752}
1,301,618,688     /System Volume Information/{a62ac45c-ae52-11e2-87e3-e0b9a5aa25e5}{3808876b-c176-4e48-b7ae-04046e6cc752}
801,424,592       /ProgramData/Microsoft/Application Virtualization Client/SoftGrid Client/sftfs.fsd
Table 2 – Windows 7 top 10 large files
A quick examination of Table 2 finds a 20GB file in the temp folder called
“disk_image.raw”. If this were a Child Abuse Material (CAM) case then this file might
be an encrypted volume or a virtual machine disk image file needing examination.
The files under “/System Volume Information/” show that there are Volume Shadow
snapshots on the disk, and in the above list there is at least 2.8GB of changes in the
snapshots (which are potentially deleted files). Based on their filenames, the .ISO
DVD image files “BT5R3-GNOME-32.iso” and “BT5R2-KDE-64.iso” most likely
contain the ‘BackTrack’ penetration testing distribution, which includes many hacking
tools.
For Windows XP
Size (Bytes)     Filename
1,610,612,736    /pagefile.sys
331,805,736      /share/Service pack3/WindowsXP-KB936929-SP3-x86-ENU.exe
131,170,400      /share/sp1a/xpsp1a_en_x86.exe
76,699,621       /WINDOWS/Driver Cache/i386/driver.cab
67,108,864       /$LogFile
24,412,160       /$MFT
20,056,462       /WINDOWS/ServicePackFiles/i386/sp3.cab
20,056,462       /WINDOWS/Driver Cache/i386/sp3.cab
16,258,580       /WINDOWS/Fonts/batang.ttc
14,688,256       /WINDOWS/ime/IMJP8_1/DICTS/imjpst.dic
Table 3 – Windows XP top 10 large files
Here it can be seen that there are no user created files larger than 14MB, which
suggests the user does not have an Outlook mailbox or any other large files. The
sp3.cab files suggest that Windows XP Service Pack 3 has been installed; further
proof is that the file listing also contains Service Pack 3's installation file
(WindowsXP-KB936929-SP3-x86-ENU.exe).
7.3
Most Visited websites
Reporting on the most visited websites can provide a profile of user activity on the
computer.
For our Windows XP virtual machine there isn't much in the way of activity.
Count   Domain
376     support.microsoft.com
71      runonce.msn.com
48      clients1.google.com.au
45      www.google.com.au
35      www.microsoft.com
24      windowsupdate.microsoft.com
22      www.ninemsn.com.au
21      asset.9msn.com.au
18      s3.buysellads.com
18      html5test.com
Table 4 – Windows XP top 10 web domains
Count   Domain
677     www.google.com.au
336     www.facebook.com
80      mail.google.com
58      bits.wikimedia.org
54      code.google.com
52      en.wikipedia.org
46      webmail.mycompany.com.au
41      www.mozilla.com
31      e5.onthehub.com
29      www.socketmobile.com
Table 5 – Windows 7 top 10 web domains
Comparison between these two tables gives an idea of how much activity occurred
on each computer, as well as what kind of activity. It can be seen that on the
Windows 7 computer there was more traffic to social media (Facebook), cloud email
(Gmail) and the work webmail page, which might be indicative of how the user uses
their time. The data extracted from Plaso was not immediately usable for creating
statistics; the URLs first required cleaning up, as can be seen in Appendix D.
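The domain counting behind these tables can be sketched with python's standard library, assuming the raw URLs have already been extracted from the browser history (the history entries below are illustrative):

```python
from collections import Counter
from urllib.parse import urlparse

# Sketch: reduce raw browser-history URLs to their domains and count visits.
def top_domains(urls, top_n=10):
    """Return (domain, count) pairs, most visited first."""
    domains = Counter(urlparse(u).netloc for u in urls)
    return domains.most_common(top_n)

history = [
    "http://www.google.com.au/search?q=plaso",
    "http://www.google.com.au/search?q=forensics",
    "http://www.facebook.com/home.php",
]
ranking = top_domains(history)
```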
7.4
User profile registry dates and times:
The dates and times of a user profile's registry files reveal how long the user profile
has been in use. For more specific detail, examination of the event logs for log-on
and log-off times will provide further data (depending on whether this logging has
been enabled).
Date/time (UTC)        Timestamp type   Message
2012-01-16 00:51:19    Create time      C:/Documents and Settings/user/NTUSER.DAT
2013-10-27 09:45:44    Modify time      C:/Documents and Settings/user/NTUSER.DAT
Table 6 – Windows XP user profile data
The table displays the dates and times of the registry files for the user called 'user'.
From this it can be seen that this user account was first used on 16/01/2012 and last
used on 27/10/2013, excluding this user profile from any activity before or after this
time period. As there are no other user profiles on this system (system user
accounts are not included), a time period for the use of this computer can be
established.
7.5
Analysis
As can be seen from the above examples, it is quite easy to extract and analyse
information using ‘Plaso’ as the foundation. There are many other small but useful
types of analysis which, while not directly part of any particular overall analysis
theme, can help give an idea of the activities of users or of the operating system.
Some examples of this are:
• Statistics showing which user-creatable file types exist in the user profiles and the
amount of each; this shows the activity which has or has not occurred.
• Extracting the Internet history, USB activity reports, link file report and Internet
history local file access information and making them available for manual analysis,
as the investigator might require the additional detail these provide.
• Creating and exporting timelines around activity of interest to provide additional
detail. A possible example is creating a timeline around the time malware infected
the system, or when a user was suspected of stealing company IP.
There are a few types of analysis commonly performed manually where automation
can help.
7.6
Intellectual Property (IP) Theft
In IP theft cases analysts look for methods by which data may have been exfiltrated
from the computer. The most common method for copying company data is the use
of USB removable drives, because they are portable and small, and these days
have a lot of storage capacity. Someone could copy all of a company's IP onto a
USB flash drive and walk straight out the front door without anyone noticing; in the
old days, when information was on paper, this was much harder.
Some common forensic areas to examine to find whether this has happened:
– Windows Shortcut files (link files) to removable drives
– USB device activity
– Windows Internet history for file:// access to removable drives
The manual analysis method the analyst would commonly use here is to parse the
link files with Encase 6, export the registry, parse it for USB history information with
Woanware's USB Device Forensics tool, and then correlate link file access with USB
device insertion, so that the client can be told which USB devices most likely hold
their stolen information. The link files contain quite a bit of detail about the drive the
files came from, like “Volume Label” and “Drive Serial number”, which can be used
to confirm which drive it was.
With Plaso under python it is simple to iterate through the dump file, extracting all
USB history information as well as link file activity information. It is then quite simple
to look for USB drive activity at the times files on removable devices were accessed.
The system stores information about the first time a USB drive was plugged in and
some information regarding subsequent insertions, along with device information
describing the connected USB removable drive.
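The correlation step can be sketched as follows. The event layout and the one-hour window are assumptions for illustration, not the actual script_usb.py implementation:

```python
from datetime import datetime, timedelta

# Sketch: match USB insertion events with link-file (shortcut) events
# that follow within a time window, suggesting a file was opened from
# the inserted device. Field names are illustrative.
def correlate_usb_access(usb_events, link_events, window=timedelta(hours=1)):
    matches = []
    for usb in usb_events:
        for link in link_events:
            if usb["time"] <= link["time"] <= usb["time"] + window:
                matches.append((usb["device"], link["path"]))
    return matches

# Timestamps mirror the test case described in the text.
usb = [{"device": "Lexar JD FireFly", "time": datetime(2013, 10, 27, 19, 57, 22)}]
links = [{"path": "E:/tmp/grml-cheatcodes.txt",
          "time": datetime(2013, 10, 27, 20, 28, 46)}]
hits = correlate_usb_access(usb, links)
```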
In our test the USB flash drive was inserted at 19:57 and a text file was opened from
it at 20:28; the information recovered from ‘Plaso’ corroborated this activity.
At 19:57:22 on 2013-10-27 the USB flash drive was connected. The following data
had been extracted by ‘Plaso’ from the registry and found by the USB analysis
script.
device_type: Disk
friendly_name: Lexar JD FireFly USB Device
parent_id_prefix: 8&107bbb1a&0
product: Prod_JD_FireFly
revision: Rev_1100
serial: 7&1fb3deb6&0&AAMA14G4OXX83PM1&0
subkey_name: Disk&Ven_Lexar&Prod_JD_FireFly&Rev_1100
vendor: Ven_Lexar
At 20:28:46 on 2013-10-27 a text file was opened from the USB flash drive, and the
following information was discovered.
The Internet history showed that the following local file was accessed:
user@file:///E:/tmp/grml-cheatcodes.txt
A link file was created at: C:/Documents and Settings/user/Recent/grml-cheatcodes.txt.lnk
This link file contained the following information about the USB flash drive:
File size: 21491
Drive serial number: 0x4e9b6351
Volume label: USBDISK
Local path: E:\tmp\grml-cheatcodes.txt
Working dir: E:\tmp
An examination of the ‘Plaso’ timeline at 20:28 found UserAssist information
revealing that the program “C:\WINDOWS\system32\NOTEPAD.EXE” was executed at this
time, which matches the above artefacts regarding a text file being opened from a
USB flash drive.
So if this file was important company intellectual property, the extracted information
could be used to pinpoint when it happened, which user account was active at the
time, and which device the file was accessed from. Further details can be found in
Appendix C.
Using the Plaso command line tools, the above information can be extracted as
follows. First, all windows shortcut (.lnk) entries for removable drives are exported
from the Plaso store to a new dump file.
Image 1 – Export removable lnk information
Then all USB activity is exported to the same file.
Image 2 – Export USB information
Image 3 – List entries
The top two commands extract the information from the Plaso store file using two
different search terms, placing link file entries and USB activity into one dump file.
Combining the different search terms in one dump file means that querying it
provides combined results showing when USB devices were connected and which
files were opened from them.
A script called script_usb.py has been created to extract the USB device and link
file information, gathering all of the above information in one pass; the image below
contains some example output.
Image 4 – Extract Link file and USB Storage devices information as one report.
The output can be sent into a text file and analysed in Excel.
A report showing all connected USB storage devices can be obtained by the
following command.
Image 5 – Display connected USB Storage devices
This command can also be scripted in python, as seen in the following screenshot.
Image 6 – Display connected USB Storage devices (python script)
These scripts can be combined so that they all run in one pass, speeding up
processing compared to the psort.py command, where each query has to be run
one at a time. Combined, the scripts can create multiple different reports for the
analyst at the same time; an example would be exporting separate reports for link
files, Internet history, USB devices and UserAssist activity, which together provide
a good overview of activity on the computer.
7.7
Incident Response / Malware analysis
For this type of analysis the analyst is looking for things that don't belong, i.e.
software which has infected the computer. Users don't intentionally install malware;
it gets onto their machine without their consent and consequently tries to hide. It is
nevertheless an abnormality, so comparing a normal computer to a malware infected
one can help find abnormal activity by removing “known good” activity. With
malware, the analysis focuses on finding the initial entry point, the propagation
method, artefacts left on the system, and the persistence mechanism (how it starts)
(Baker, Hutton & Hylender 2011).
If the time of the infection is known, a timeline of events from around the same
period can be extracted from Plaso for examination, to find exactly which files were
affected and potentially how the malware came in. An example would be that a
certain web page was visited, a Java program was downloaded and executed, and
this downloaded the malware from another site.
This is where statistics can be useful, as they can point to spikes in activity which
may be evidence of a malware infection. With such statistics, however, care is
needed to exclude normal activity, like the system booting or Windows updates
being installed, which also creates spikes in activity.
Tests were done on whitelisting and blacklisting malware persistence artefacts
through examination of the registry run keys, the Windows startup folder contents
and services; this worked well in the basic tests performed. The next stage would be
obtaining some malware and testing in a virtual machine, making comparisons
between before and after the infection. Training to create whitelists of known good
services and start-up applications will help improve the detection rate.
Additional blacklisting areas added for detection are:
– any internet history under any of the computer's system user profiles (the
systemprofile, LocalService and NetworkService accounts)
– any link files under the same folders
– suspicious software (evidence cleaners, hack tools, remote access tools, key
loggers)
There should not be any signs of user activity under any of these profiles as they
are system accounts, not used by people, so any detected user activity is
suspicious. With Plaso it is possible to loop through the data, running blacklists and
whitelists over the relevant areas looking for matching entries.
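A minimal sketch of this check over hypothetical event records; the field names, profile list and paths are illustrative assumptions:

```python
# Sketch: any browsing or link-file activity under a system service
# profile is suspicious, since these accounts are not used interactively.
SYSTEM_PROFILES = ("systemprofile", "LocalService", "NetworkService")

def suspicious_system_activity(events):
    """Return the paths of user-like events found under system profiles."""
    flagged = []
    for e in events:
        if e["type"] in ("internet_history", "link_file"):
            if any(profile in e["path"] for profile in SYSTEM_PROFILES):
                flagged.append(e["path"])
    return flagged

events = [
    {"type": "internet_history",
     "path": "C:/Documents and Settings/LocalService/history.dat"},
    {"type": "internet_history",
     "path": "C:/Documents and Settings/user/history.dat"},
]
flagged = suspicious_system_activity(events)
```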
[\Microsoft\Windows\CurrentVersion\Run] IMEKRMIG6.1:
C:\WINDOWS\ime\imkr6_1\IMEKRMIG.EXE
[\Microsoft\Windows\CurrentVersion\Run] IMJPMIG8.1:
C:\WINDOWS\IME\imjp8_1\IMJPMIG.EXE /Spoil /RemAdvDef /Migration32
[\Microsoft\Windows\CurrentVersion\Run] MSPY2002:
C:\WINDOWS\System32\IME\PINTLGNT\ImScInst.exe /SYNC
[\Microsoft\Windows\CurrentVersion\Run] PHIME2002A:
C:\WINDOWS\System32\IME\TINTLGNT\TINTSETP.EXE /IMEName
[\Microsoft\Windows\CurrentVersion\Run] PHIME2002ASync:
C:\WINDOWS\System32\IME\TINTLGNT\TINTSETP.EXE /SYNC
[\Microsoft\Windows\CurrentVersion\Run] SchedulingAgent: mstinit.exe /firstlogon
Table 7 – Windows XP autorun entries
In this case most of the entries related to Microsoft's Input Method Editor (IME)
software, so they were added to the whitelist. The last application, “mstinit.exe”,
was identified as the Microsoft Scheduling Agent and was whitelisted as well. It is
important to note that much malware will adopt the name of a program which
normally exists on the system as camouflage, so rules need to be specific enough to
ignore the legitimate version and detect files of the same name in unusual places.
7.8
Time changing
The system time could be changed by either users or malware to try to evade
detection of activity (Marrington et al. 2011; Willassen 2008) as an anti-forensic
technique (Guðjónsson 2010). As ‘Plaso’ stores the offsets into the event logs, time
changing can be detected by iterating through event log entries and checking for the
date and time moving backward. Comparing the dates against the sequence
numbers of other parts of the system which increment (MFT records, the USN
Journal, Windows XP System Restore folder names, UserAssist entries (Windows
XP), thumbs.db) can also detect the time going backward. Depending on logging
settings, the Windows event log can record any attempt to change the date and
time. This was tested manually in Encase 6 by parsing the event logs and
examining them; Plaso's dump file information regarding event logs was then
examined and the offset information was found and compared.
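A minimal sketch of this rollback detection, assuming event-log entries have already been reduced to (offset, timestamp) pairs:

```python
from datetime import datetime

# Sketch: detect the system clock being wound back by walking event-log
# entries in on-disk (offset) order and flagging timestamps that decrease.
def time_rollbacks(entries):
    """entries: (offset, timestamp) pairs; returns offsets where the
    timestamp is earlier than the previous entry's."""
    rollbacks = []
    ordered = sorted(entries)                 # order by log offset
    for (_, prev), (off, cur) in zip(ordered, ordered[1:]):
        if cur < prev:
            rollbacks.append(off)
    return rollbacks

log = [(0, datetime(2013, 1, 1, 10)), (1, datetime(2013, 1, 1, 11)),
       (2, datetime(2013, 1, 1, 9))]          # clock wound back here
backs = time_rollbacks(log)
```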
7.9
Statistics
Statistics are a potential analysis area for digital forensics which would be quite time
consuming for a person to generate manually, but quite quick and easy for a
computer.
Some uses are for the detection of the following behaviour:
– Activity spikes (potentially detecting malware installations)
– Sequential file accesses or creations, which can detect files being copied and are
useful in IP theft cases
– Flagging rarely occurring events in the event logs (needs training, but can be
used to detect abnormal behaviour)
– Statistics on the numbers of different file types (helps profile user activity)
– Averages (mean/median), standard deviation, and most and least common values
to provide additional information for analysis
From a high level, the statistics engine could record the number of certain activities
per day/week/month/year, based on raw entries as well as over specific areas to
provide further, more focused statistics. Some of these potential specific areas are:
overall statistics for the computer, user profiles, and the windows/software folders.
Having specific areas for statistics can help differentiate between spikes in user
activity and spikes in operating system activity. Statistics could also be used to
discover normal computer usage times and to flag activity outside these hours.
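A minimal sketch of such out-of-hours flagging, with assumed business hours of 09:00-17:00 (the window and timestamps are illustrative):

```python
from datetime import datetime

# Sketch: flag event timestamps falling outside assumed business hours.
def after_hours(events, start_hour=9, end_hour=17):
    """Return the timestamps outside [start_hour, end_hour)."""
    return [ts for ts in events if not start_hour <= ts.hour < end_hour]

events = [datetime(2013, 10, 27, 10, 30),   # during business hours
          datetime(2013, 10, 27, 2, 15)]    # suspicious overnight event
odd = after_hours(events)
```

In practice the usage window itself could be learned from the statistics described above, rather than hard-coded.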
7.10
Rule based analysis
Many of the proposed analysis rules are already common in manual computer
forensics, but they can all be run at once, rather than a human spending the time to
perform each one in turn.
– User created files outside of profile folders (abnormal behaviour)
– Flagging archive and backup files in user folders (IP theft?)
– Folders which only contain photos and videos (of interest in a CAM case)
– Flagging cloud storage and file uploading software like uTorrent, Frostwire,
Dropbox and Mega (useful for IP theft cases)
– Flagging endpoint monitoring software, which could have useful logs for analysis
– Flagging phone backup files and folders
– Mass file deletions (recycle bin entries)
These rules can be quite useful in directing an analyst to relevant areas of interest,
saving them the time needed to look around and gather this information manually.
7.11 Findings
It was demonstrated that ‘Plaso’ can be extended to perform automatic analysis and
reduce the amount of manual analysis required of investigators. The python
programming language, using the ‘Plaso’ libraries, can be used to process extracted
metadata and find relevant information. The power of an easy-to-use programming
language like python, combined with all the information ‘Plaso’ extracts, provides a
lot of potential for analysis.
Implementing an analysis script which works together with ‘Plaso's’ extraction of
system and file metadata is a good step toward helping analysts focus their valuable
time on relevant areas, and provides information which can help them avoid
performing irrelevant analysis.
The ‘Plaso’ tool itself provides a good stepping stone for analysis, as it is able to
extract a lot of information from the evidence. This can be an issue, as it is possible
to extract more information than is required: for the USB analysis, our rule extracted
far too much information, with at least 10-15 artefacts all generated within a minute
which mostly contained the same data. Moving forward, this information could be
trimmed to select only the most relevant and useful records for presentation. When
examining the activity of one USB storage device, most of the artefacts will likely
occur around when the device was first connected and when it was last connected,
so care will need to be taken when limiting search results for USB devices so as not
to miss artefacts for insertions between the first and last times.
CHAPTER EIGHT - CONCLUSIONS AND FURTHER
WORK
This thesis examines an overall research question regarding the use of automation
to improve digital forensic analysis. Different methods and concepts were examined
and analysed, and some were tested for viability. Comparison between Log2timeline
and ‘Plaso’ revealed that ‘Plaso’ has the most potential for further growth and
expandability with regard to adding evidence analysis abilities.
Examination of Snort found that even though Snort is quite a flexible tool, it was not
directly applicable in this case. Its flexibility had merit, but it is aimed at network
packets, and as its rules are unable to analyse or match across multiple packets at
once it was not suitable for this type of analysis.
Of the many different types of analysis that could be performed, those closer to real
world analysis were simpler to implement (in this case black/white lists and simple
rules) and had lower false positive rates. Analysis systems based on statistics, like
PCA and Markov chains, require a good amount of training first to reduce the
number of false-positive hits. The OpenIOC (Indicators of Compromise) website set
up by Mandiant holds a lot of potential as a source of information about real world
malware indicators which could be integrated into blacklists.
Creating computer profile and user profile reports is quite beneficial and helps the
analyst get a rough idea of activity on the computer. Of course, if more information
is required from the report, then more time is needed to extract and process the
data to generate it.
The information parsed by Plaso was able to be processed to gather computer and
user profile information as seen above, perform rule based analysis, and generate
statistics. Tests confirmed that automated analysis using ‘Plaso’ as a foundation is
very viable.
8.1
Research Conclusions
It is now possible to answer the original questions posed by the research:
Q. What are the existing tools for extracting relevant information from
evidence, and what is the quality of the information they extract?
As already mentioned, there are many commercial, free and open-source tools for
extracting information from evidence. These all differ in the amount, quality and
detail of the information extracted. Open-source tools lend themselves to
improvement, as the code is easily available, so long as there is good
documentation and support. An example to the contrary is Encase 6, where users
had asked for HFS+ filesystem support for at least four years and it was never
implemented; they were finally encouraged to use Encase 7 (when it was released),
which did have HFS+ support but had a terribly unintuitive user interface.
Q. What solutions are there for parsing the many undocumented file and
metadata formats which are yet to be discovered and documented but
could contain information of interest?
There are no tools that can automatically parse or decode unknown file and
metadata formats. Discovering the formats of unknown files and extracting useful
metadata is done either by reverse engineering or by using information from the
developer of the format. Parsing the NTFS USN Journal, for example, was made
possible by information provided by Microsoft.
For undocumented formats, the digital forensics community can work together to
make sure new formats are parsed into a usable form, by reverse engineering them
together or by lobbying the developer of the format to provide documentation.
The Log2Timeline and Plaso tools are open-source, which means people can
collaborate in their development and improvement. As they currently parse more
metadata formats than other open-source tools, they provide a good foundation for
further development. The digital forensics community should collaborate to make
sure that Plaso can parse all known relevant file formats, and that when new
discoveries are made Plaso is updated to handle them. This is made easier since
the Plaso developers maintain a spreadsheet of implemented and unimplemented
formats, so users can easily monitor progress as well as help the project by
contributing parsers.
It is well known that proprietary commercial tools are slow in adding newly
documented formats for parsing and open-source software has the advantage of
being able to update more quickly.

Q. How to ensure a low false-positive and false-negative detection rate
while keeping a high detection rate of relevant information?
One method is to rate the output based on its perceived level of quality. Results from
the white and black lists can be rated at an "alert" level, while results based on
algorithms (HMM, PCA, Bayes) are rated at a lower "suspicious" level until testing and
training improve their accuracy. This helps prevent alert fatigue, or the "cry wolf"
syndrome, where the analyst effectively becomes trained to ignore results because of
the volume of false positives.
Training the system with a large amount of known-good data to strengthen the white
listing first will help minimise the probability of false positives.
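One way to implement this two-tier rating is sketched below. The source labels, confidence scores and threshold here are hypothetical illustrations of the idea, not part of any existing tool.

```python
# Sketch of the two-tier alert rating scheme (all names and values assumed).

def rate_finding(source, score=0.0, threshold=0.8):
    """Rate a detection result so list-based hits outrank model-based ones.

    White/black list matches are trusted enough to raise a full "alert";
    statistical results (HMM, PCA, Bayes) stay "suspicious" until their
    confidence clears a threshold tuned as testing and training improve.
    """
    if source in ("whitelist", "blacklist"):
        return "alert"
    if source == "statistical":
        return "alert" if score >= threshold else "suspicious"
    return "info"

# Example triage of three hypothetical findings.
for source, score in [("blacklist", 1.0), ("statistical", 0.55),
                      ("statistical", 0.91)]:
    print(source, "->", rate_finding(source, score))
```

The point of the design is that the analyst can act on "alert" results immediately while "suspicious" results accumulate evidence, which keeps the false-positive cost of the statistical detectors from training the analyst to ignore everything.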

Q. What approach can be used to enhance digital forensic analysis with
automation?
Automation can be used to help with the extraction of metadata and also to perform
many analysis tasks which are currently still performed manually by skilled forensic
analysts.
Using Python scripts in conjunction with Plaso and the extracted dump file, much of
the usual manual analysis work can be quickly automated. Everything from applying
rules and statistics to process and analyse data, down to the simple extraction of
parsed data into reports, saves the analyst from having to parse different sources of
information by hand. An example of this is extracting the USB storage history,
Internet history and link file reports from the Plaso dump file as separate reports for
the investigator.
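As a rough illustration of that splitting step, the sketch below groups rows of a CSV-style timeline export (such as psort produces from a Plaso dump) into separate per-artefact reports. The "source" column name and its values are assumptions based on the output reproduced in the appendices, not a fixed Plaso schema.

```python
from collections import defaultdict

# Map timeline source types to report names (assumed values; check the
# actual "source" column of your psort export before relying on these).
REPORTS = {
    "LNK": "link_file_report",
    "WEBHIST": "internet_history_report",
    "REG": "registry_report",
}

def split_reports(rows):
    """Group timeline rows into per-artefact reports by their source type."""
    reports = defaultdict(list)
    for row in rows:
        name = REPORTS.get(row.get("source"))
        if name:
            reports[name].append(row)
    return reports

# Tiny worked example with rows shaped like a psort CSV export.
rows = [
    {"source": "LNK", "message": "E:\\tmp\\grml-cheatcodes.txt"},
    {"source": "WEBHIST",
     "message": "Visited: user@file:///E:/tmp/grml-cheatcodes.txt"},
    {"source": "FILE", "message": "NTFS mtime"},
]
reports = split_reports(rows)
print(sorted(reports))
```

Each resulting group can then be written out as its own report file for the investigator, which is exactly the manual collation step the text describes automating.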
8.2 Areas for further study
8.2.1 Categories
This could be considered, in a way, as supporting statistics. Categorising Internet
history and software application types, and creating graphs showing the percentage
of each type, can help with the computer profiling process by giving the analyst an
idea of activity on the computer. Combining the dates and times of all
user-attributable actions can help map out exactly when the user normally uses the
computer and enable flagging of abnormal activity, for example activity on a work
computer after hours, which is suspicious when usual activity on that computer is
between 9am and 5pm.
As another example, a graph showing that 40% of Internet use was for work-related
websites and 60% was for social media will quickly provide an overview of the user's
browsing activities. Categorisation therefore has clear potential to help the analyst.
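Both ideas can be sketched in a few lines. The categories, visit data and the 9am to 5pm window below are invented for the example; a real implementation would classify URLs against a category list and take the usage window from the computer's established activity profile.

```python
from collections import Counter

def category_percentages(visits):
    """Percentage breakdown of browsing categories, for charting."""
    counts = Counter(cat for cat, _ in visits)
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}

def after_hours(visits, start=9, end=17):
    """Flag visits whose hour falls outside the usual usage window."""
    return [(cat, hour) for cat, hour in visits
            if not start <= hour < end]

# Hypothetical (category, hour-of-day) pairs for one user.
visits = [("work", 10), ("social", 11), ("social", 13),
          ("social", 21), ("work", 14)]
print(category_percentages(visits))  # work 40%, social 60%
print(after_hours(visits))           # the 9pm social-media visit
```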
8.2.2 Correlation between different data types
An example of this is the correlation performed above between link files and registry
entries for USB storage, linking the USB device information (serial number,
manufacturer and model) with the link file information (filename, date/time accessed,
drive volume name and drive serial number) so that the investigator can confirm
which USB storage device a particular file was accessed from. This shows that
correlation is already in use by investigators, albeit manually, which limits how
advanced the correlation performed can be.
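A minimal sketch of that LNK-to-registry correlation is shown below, using the device and file from the Appendix C report. Joining on the volume label is purely for illustration; a real implementation would resolve the link through the MountedDevices and MountPoints2 registry data.

```python
# Hypothetical parsed link-file entries (values from the Appendix C report).
lnk_records = [
    {"path": "E:\\tmp\\grml-cheatcodes.txt",
     "volume_label": "USBDISK", "drive_serial": "0x4e9b6351"},
]
# Hypothetical parsed USBSTOR registry entries.
usb_devices = [
    {"friendly_name": "Lexar JD FireFly USB Device",
     "serial": "AAMA1CG1OXX83PM1", "volume_label": "USBDISK"},
]

def correlate(lnks, devices, key="volume_label"):
    """Attach the matching device name to each link-file record."""
    by_key = {d[key]: d for d in devices}
    return [{**lnk, "device": by_key[lnk[key]]["friendly_name"]}
            for lnk in lnks if lnk[key] in by_key]

for hit in correlate(lnk_records, usb_devices):
    print(hit["path"], "accessed from", hit["device"])
```

Automating even this simple join removes the manual cross-referencing step and opens the way to the more advanced correlation the text argues is currently out of reach.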
There is a lot of potential for research in this area, and the literature review showed
that much research has already been done. Similar analysis is already being
performed by SIEMs.
SIEM systems collect many different types of logs (Windows and Unix server logs,
and network switch, router, firewall and network IDS (Intrusion Detection System)
logs) and attempt to correlate related log entries together. Most papers on this
subject attempt to correlate by statistics. The SIEM model is not a clean fit for
computer analysis, as evidence contains both log-type evidence (each entry
sequential and related by time) and filesystem-based evidence (each entry related by
filesystem hierarchy plus dates and times).
There is potential for filesystem-based evidence to use some of the SIEM
statistics-based analysis methods, but as these items are related by filesystem
structure as well as by at least three timestamps, there is additional complexity for
analysis. Log-based evidence can use the SIEM correlation models fully, but
consideration is needed before filesystem activity can be analysed by SIEM-like
systems.
8.2.3 File contents
All of the above analysis is primarily based on metadata. Analysis based on file
contents would be a new and difficult frontier for research. Files containing text are
already dealt with by index searching, and in the electronic discovery field new
technology such as predictive coding and context searching is extracting more and
more intelligence every day. The application of this technology can only help digital
forensic analysis.
Text clustering for categorising files and emails provides the ability to find similar
emails and documents, which can help group related information and topics
(Decherchi, Tacconi, & Redi 2009). Detecting user behaviour based on emails and
other correspondence, however, remains a challenge for further research.
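To make the clustering idea concrete, here is a toy sketch that groups similar documents using bag-of-words Jaccard similarity. This is deliberately simplistic; practical systems such as the one cited above use TF-IDF vectors and proper clustering algorithms like k-means.

```python
def jaccard(a, b):
    """Word-overlap similarity between two strings (0.0 to 1.0)."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b)

def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering: join a document to the first cluster
    whose seed document is similar enough, otherwise start a new cluster."""
    clusters = []
    for doc in docs:
        for c in clusters:
            if jaccard(doc, c[0]) >= threshold:
                c.append(doc)
                break
        else:
            clusters.append([doc])
    return clusters

# Hypothetical email subject lines: the two "project deadline" items group
# together, the unrelated one forms its own cluster.
docs = [
    "meeting agenda for project deadline",
    "project deadline meeting notes",
    "holiday photos from the beach",
]
clusters = cluster(docs)
print(clusters)
```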
8.2.4 Plaso - tagging
The analysis performed using Plaso didn't utilise its powerful tagging system.
Using this could help with linking similar things together. Some examples are:
• Software-related information: software application folders, software-related
registry entries, software link files, prefetch information, as well as UserAssist
information regarding executed programs.
• User actions: including Internet history, user profile files, link files, security
events and UserAssist entries.
• User-generated data files (not always in the user profile).
• Installed software: can be found in the registry uninstall section, the Program
Files folders and the application menu folders (All Users and individual).
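For reference, Plaso tag definitions live in a plain-text tagging file passed to its tagging analysis plugin. A sketch along the lines of the groupings above might look like the following; the rule names are ours, and the exact data_type values should be verified against the current Plaso documentation before use:

```
application_execution
  data_type is 'windows:prefetch:execution'
  data_type is 'windows:registry:userassist'

user_actions
  data_type is 'windows:lnk:link'
  data_type is 'msiecf:url'
```

Each rule name becomes a tag applied to every event matching one of its conditions, so events from quite different sources can be pulled together under a single label in the timeline.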
8.2.5 Plaso – correspondence
Adding more correspondence-related information into the timeline can help
investigators get a better idea of user activity. Plaso already has support for parsing
and including Skype chat messages, but adding email, SMS and other chat-based
messaging information can only provide more visibility of the behaviour of the
computer and its user.
Given Plaso's flexibility, plug-ins could be created to import email, SMS and other
chat-based message listings from the different commercial forensic tools. This could
dovetail nicely with the "User actions" tagging suggested above.
8.3 Conclusion
An approach for automating analysis has been trialled, demonstrating the flexibility
of the Plaso toolkit in conjunction with Python scripts. Computer and user profiling
was tested, as were several other types of analysis, all successfully, showing the
strength of controlling Plaso for analysis with Python.
The ability to extract many types of metadata reports from Plaso is also beneficial for
analysis, and coupling this with Python to provide further analysis or statistics on the
output makes it more valuable still. The openness and ease of using Python to
control Plaso lends itself to development by the forensic community.
With analysis code contributed by a collaborating community, Plaso has great
potential to offer the investigator many additional analysis capabilities.
References:
Accessdata 2005, Registry Quick Find Chart, p. 16.
Aickelin, U, Twycross, J & Hesketh-Roberts, T 2008, ‘Rule Generalisation using Snort’,
International Journal of Electronic Security and Digital Forensics (IJESDF), vol. x, no. x,
viewed 27 May 2013, <http://www.cs.nott.ac.uk/~uxa/papers/ijesdf_fuzzy_ids.pdf>.
Australian Institute of Criminology 2008, Intellectual Property Crime and Enforcement in
Australia, Australian Institute of Criminology, no. 94, viewed 9 November 2013,
<http://www.aic.gov.au/documents/B/D/0/%7BBD0BC4E6-0599-467A-8F6438D13B5C0EEB%7Drpp94.pdf>.
Ayers, D 2009, ‘A second generation computer forensic analysis system’, Digital
Investigation, vol. 6, pp. S34–S42, viewed 6 March 2013,
<http://linkinghub.elsevier.com/retrieve/pii/S1742287609000371>.
Baker, W, Hutton, A & Hylender, C 2011, 2011 data breach investigations report, p. 72,
viewed 24 October 2013,
<http://www.wired.com/images_blogs/threatlevel/2011/04/Verizon-2011-DBIR_04-1311.pdf>.
Brownstone, RD 2004, ‘Collaborative Navigation of the Stormy’, Technology, vol. X, no. 5.
Carrier, B 2003, ‘Defining Forensic Examination and Analysis Tools Using Abstraction
Layers’, International Journal of Digital Evidence, vol. 1, no. 4, pp. 1–12, viewed 2 April
2011,
<http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Defining+Digital+Forensic+Examination+and+Analysis+Tools+Using+Abstraction+Layers#2>.
― 2005, File system forensic analysis, viewed 20 April 2013,
<http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:file+system+forensic+analysis#0>.
Carrier, B & Spafford, EH 2005, ‘Automated digital evidence target definition using outlier
analysis and existing evidence’, in Proceedings of the 2005 Digital Forensics Research
Workshop, Citeseer, pp. 1–10, viewed 2 April 2011,
<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.81.1345&rep=rep1&type=pdf>.
Carvey, H 2011, Forensic Scanner.
― 2012, RegRipper Updates, ‘Windows Incident Response’ Blog post.
― 2013, HowTo: Malware Detection, pt I, viewed 24 October 2013,
<http://windowsir.blogspot.com.au/2013/07/howto-malware-detection-pt-i.html>.
Davis, A 2012, Leveraging the Application Compatibility Cache in Forensic Investigations.
Eckmann, S 2001, ‘Translating Snort rules to STATL scenarios’, Proc. Recent Advances in
Intrusion Detection, pp. 1–13, viewed 23 May 2013, <http://www.raidsymposium.org/Raid2001/papers/eckmann_raid2001.pdf>.
Elsaesser, C & Tanner, M 2001, ‘Automated diagnosis for computer forensics’, The Mitre
Corporation, pp. 1–16, viewed 20 April 2013,
<http://www.mitre.org/work/tech_papers/tech_papers_01/elsaesser_forensics/esaesser_forens
ics.pdf>.
Farrell, P 2009, ‘A Framework for Automated Digital Forensic Reporting’, no. March,
viewed 20 April 2013, <https://calhoun.nps.edu/public/handle/10945/4878>.
García-Teodoro, P, Díaz-Verdejo, J, Maciá-Fernández, G & Vázquez, E 2009, ‘Anomaly-based
network intrusion detection: Techniques, systems and challenges’, Computers &
Security, vol. 28, no. 1-2, pp. 18–28, viewed 17 October 2013,
<http://linkinghub.elsevier.com/retrieve/pii/S0167404808000692>.
Garfinkel, SL 2006, ‘Forensic feature extraction and cross-drive analysis’, Digital
Investigation, vol. 3, pp. 71–81, viewed 11 March 2013,
<http://linkinghub.elsevier.com/retrieve/pii/S1742287606000697>.
― 2009, ‘Automating Disk Forensic Processing with SleuthKit, XML and Python’, 2009
Fourth International IEEE Workshop on Systematic Approaches to Digital Forensic
Engineering, IEEE, pp. 73–84.
Goodman, M 2001, ‘Making computer crime count’, FBI Law Enforcement Bulletin, vol. 70,
FBI, no. 8, pp. 10–17, viewed 2 September 2011,
<http://www.ncjrs.gov/App/abstractdb/AbstractDBDetails.aspx?id=190553>.
Guðjónsson, K 2010, ‘Mastering the super timeline with log2timeline’, SANS Institute,
viewed 21 May 2013,
<http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Mastering+the+Super+Ti
meline+With+log2timeline#0>.
Kent, K, Chevalier, S & Grance, T 2006, ‘Guide to integrating forensic techniques into
incident response’, NIST Special Publication, viewed 23 August 2011,
<http://cybersd.com/sec2/800-86Summary.pdf>.
Marrington, A 2009, ‘Computer profiling for forensic purposes’, viewed 20 October 2013,
<http://eprints.qut.edu.au/31048>.
Marrington, A, Baggili, I, Mohay, G & Clark, A 2011, ‘CAT Detect (Computer Activity
Timeline Detection): A tool for detecting inconsistency in computer activity timelines’,
Digital Investigation, vol. 8, pp. S52–S61, viewed 2 April 2013,
<http://linkinghub.elsevier.com/retrieve/pii/S1742287611000314>.
Marrington, A, Mohay, G, Clark, A & Morarji, H 2007, ‘Event-based computer profiling for
the forensic reconstruction of computer activity’, vol. 2007, pp. 71–87, viewed 20 April
2013, <http://eprints.qut.edu.au/15579>.
McKemmish, R 1999, ‘What is Forensic computing?’, Trends and Issues in Crime and
Criminal Justice, vol. 0817-8542, Australian Institute of Criminology, no. 118, pp. 1–6.
Morris, T, Vaughn, R & Dandass, Y 2012, ‘A Retrofit Network Intrusion Detection System
for MODBUS RTU and ASCII Industrial Control Systems’, 2012 45th Hawaii International
Conference on System Sciences, IEEE, pp. 2338–2345, viewed 16 March 2013,
<http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6149298>.
NIST (National Institute of Standards and Technology) 2004, Forensic Examination of
Digital Evidence: A Guide for Law Enforcement, viewed 13 June 2013,
<http://www.ncjrs.gov/App/abstractdb/AbstractDBDetails.aspx?id=199408>.
Peisert, S & Bishop, M 2007a, ‘Analysis of computer intrusions using sequences of function
calls’, Dependable and Secure …, viewed 27 April 2013,
<http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4198178>.
― 2007b, ‘Toward models for forensic analysis’, UC Davis Previously Published Works,
viewed 20 April 2013, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4155346>.
Peng, W, Li, T & Ma, S 2005, ‘Mining logs files for data-driven system management’, ACM
SIGKDD Explorations Newsletter, vol. 7, no. 1, pp. 44–51.
Richard, GG & Roussev, V 2006, ‘Next-generation digital forensics’, Communications of the
ACM, vol. 49, no. 2, viewed 11 May 2013, <http://dl.acm.org/citation.cfm?id=1113074>.
Rider, K, Mead, S & Lyle, J 2010, ‘Disk Drive I/O Commands and Write Blocking’,
International Federation for Information …, vol. 242, pp. 163–177, viewed 27 June 2013,
<http://cs.anu.edu.au/iojs/index.php/ifip/article/view/11099>.
Roesch, M 1999, ‘Snort-lightweight intrusion detection for networks’, Proceedings of the
13th USENIX conference on …, viewed 9 April 2013,
<http://static.usenix.org/publications/library/proceedings/lisa99/full_papers/roesch/roesch.pdf>.
Rowlingson, R 2004, ‘A ten step process for forensic readiness’, International Journal of
Digital Evidence, vol. 2, Citeseer, no. 3, pp. 1–28, viewed 16 October 2011,
<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.65.6706&rep=rep1&type=pdf>.
Simpson, S, Howard, M, Randolph, K, Goldschmidt, C, Coles, M, Belk, M, Saario, M,
Sondhi, R, Tarandach, I, Yonchev, Y & Vähä-Sipilä, A 2011, Fundamental Practices for
Secure Software Development 2ND EDITION.
Stevens, D 2006, Userassist, viewed 27 June 2013,
<http://blog.didierstevens.com/programs/userassist/>.
― 2010, New Format for UserAssist Registry Keys, no. December.
― 2012, UserAssist Windows 2000 Thru Windows 8, no. July.
Sutherland, I, Evans, J, Tryfonas, T & Blyth, A 2008, ‘Acquiring volatile operating system
data tools and techniques’, ACM SIGOPS Operating Systems Review, vol. 42, ACM, no. 3,
pp. 65–73, viewed 8 April 2011, <http://portal.acm.org/citation.cfm?id=1368516>.
Swift, D 2006, ‘A Practical Application of SIM/SEM/SIEM-Automating Threat
Identification’, Paper, SANS Infosec Reading Room, The SANS …, viewed 27 October 2012,
<http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:A+Practical+Application
+of+SIM/SEM/SIEM+Automating+Threat+Identification#0>.
Tan, J 2001, ‘Forensic readiness’, Cambridge, MA: @stake, pp. 1–23, viewed 16 October
2011, <http://isis.poly.edu/kulesh/forensics/forensic_readiness.pdf>.
Turner, P 2006, ‘Selective and intelligent imaging using digital evidence bags’, Digital
Investigation, vol. 3, pp. 59–64, viewed 9 November 2013,
<http://linkinghub.elsevier.com/retrieve/pii/S174228760600065X>.
Watt, AC 2012, ‘Development of a Framework for the Investigation into the Methods Used
for the Electronic Trafficking and Concealment of Child Abuse Material’, University of
South Australia, p. 348.
Willassen, S 2008, ‘Finding evidence of antedating in digital investigations’, ARES, IEEE,
pp. 26–32, viewed 5 May 2013,
<http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4529317>.
Wong, L 2007, ‘Forensic analysis of the Windows registry’, Forensic Focus, viewed 26 June
2013,
<http://www.forensictv.net/Downloads/digital_forensics/forensic_analysis_of_windows_regi
stry_by_lih_wern_wong.pdf>.
Xu, W, Huang, L, Fox, A, Patterson, D & Jordan, M 2009, Detecting large-scale system
problems by mining console logs, … on Operating systems …, viewed 13 August 2013,
<http://dl.acm.org/citation.cfm?id=1629587>.
Zaraska, K 2003, Prelude IDS: current state and development perspectives, URL
http://www.prelude-ids.org/download/misc/ …, viewed 29 October 2013,
<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.106.5542&rep=rep1&type=pdf>.
Appendix A – Glossary
Container files: Files which contain other files, such as email mailbox files
(.OST, .PST, etc.) and archive files (.zip, .7z, .rar, etc.).

Shell bags: A part of the registry containing information, in shell item format,
about folders a user has viewed. It is especially useful for listing files from
removable storage devices or from folders which have since been deleted.

Shell items: Structures used in Windows to identify items in the Windows
folder hierarchy, found in particular in Windows shortcut (.lnk) files and in the
Shellbags registry keys.

Social media: Websites such as Facebook and Instagram which are primarily
socially focused.

URL: Uniform Resource Locator, normally used for web links.

UserAssist: A part of the registry which contains a list of the programs a user
has run and the last time each program was run.
Appendix B – List of formats that Log2Timeline tool
parses
• Apache2 Access logs
• Apache2 Error logs
• Google Chrome history
• Encase dirlisting
• Windows Event Log files (EVT)
• Windows Event Log files (EVTX)
• EXIF. Extracts exif information or metadata from various media files
• Firefox bookmarks
• Firefox 2 history
• Firefox 3 history
• FTK Imager Dirlisting CSV file
• Generic Linux log file
• Internet Explorer history files, parsing index.dat files
• Windows IIS W3C log files
• ISA server text export. Copy query results to clipboard and into a text file
• Mactime body files (to provide an easy method to modify from mactime format
to some other)
• McAfee AntiVirus Log files
• MS-SQL Error log
• Opera Global and Direct browser history
• OpenXML metadata, for metadata extraction from Office 2007 documents
• PCAP files, parsing network dump files created by tool such as Wireshark and
tcpdump (PCAP)
• PDF. Parse the basic PDF metadata to capture creation dates and other
information from PDF documents.
• Windows Prefetch directory
• Windows Recycle Bin (INFO2 or $I)
• Windows Restore Points
• Safari Browser history files
• Windows XP SetupAPI.log file
• Adobe Local Shared Object files (SOL/LSO), aka Flash Cookies
• Squid Access Logs (httpd_emulate off)
• TLN (timeline) body files
• UserAssist key of the Windows registry - well really NTUSER.DAT parser
since there are other keys parsed as well
• Volatility. The output file from the psscan and psscan2 modules from volatility
• Windows Shortcut files (LNK)
• Windows WMIProv log file
• Windows XP Firewall Log files (W3C format)
Appendix C – USB history report
The USB history report below was produced by the Python scripts from the Plaso dump
file. The report's original five-column layout (datetime, timestamp description, source,
source description, message) did not survive conversion to this format, so its
recoverable content is summarised here.

Device identification (SYSTEM registry hive: USBSTOR, DeviceClasses and driver keys):
• Friendly name: Lexar JD FireFly USB Device (Ven_Lexar, Prod_JD_FireFly, Rev_1100)
• Device serial number: 7&1fb3deb6&0&AAMA1CG1OXX83PM1&0
• Driver: USB Mass Storage Device (usbstor.inf, system32\DRIVERS\USBSTOR.SYS)
• First connection time: 2013-10-27T19:57:22+11:00; last connection time:
2013-10-27T20:01:50+11:00

Volume and file activity (Windows Shortcut (LNK) files, NTUSER registry hive, MSIE
web history and NTFS records):
• Volume label USBDISK, drive serial number 0x4e9b6351, drive type 2 (removable),
mounted as drive E: (MountPoints2 entries, including MountPoints2\E)
• The folder E:\tmp and the file E:\tmp\grml-cheatcodes.txt (21,491 bytes) were
accessed via shortcut files, with LNK creation, content modification and last access
timestamps between 2013-10-24 and 2013-10-27
• On 2013-10-27 at around 20:28+11:00, RecentDocs registry entries
(RecentDocs\.txt: grml-cheatcodes.txt) and MSIE web history records
(Visited: user@file:///E:/tmp/grml-cheatcodes.txt) were created, along with the
NTFS timestamps for /Documents and Settings/user/Recent/grml-cheatcodes.txt.lnk
Appendix D – Web URLs
The internet history URLs shown below were extracted directly from Plaso. Further
Python scripts were used to clean up the output and separate the file:// links from the
http:// and https:// links before further analysis could happen.
:2013102720131028: SYSTEM@:Host: My Computer
:2013102720131028: SYSTEM@file:///C:/WINDOWS/system32/oobe/updshell.htm
:2013102720131028: user@:Host: My Computer
:2013102720131028: user@file:///E:/tmp/grml-cheatcodes.txt
Cookie:user@support.microsoft.com/
http://support.microsoft.com/Styles/onemscomcomponents.css
http://support.microsoft.com/Styles/oneMscomMaster.css
http://windowsupdate.microsoft.com/windowsupdate/v6/default.aspx
http://windowsupdate.microsoft.com/windowsupdate/v6/default.aspx?ln=en-us
http://www.ninemsn.com.au/?ocid=iefvrt
http://www.ninemsn.com.au/css/style.min.css?v=10
Visited: SYSTEM@file:///C:/WINDOWS/system32/oobe/updshell.htm
Visited: user@file:///E:/tmp/grml-cheatcodes.txt
Visited: user@http://go.microsoft.com/fwlink/?LinkId=54729&clcid=0x0c09
Visited: user@http://ninemsn.com.au/?ocid=iefvrt
Visited: user@http://www.microsoft.com/isapi/redir.dll?prd=ie&pver=6&ar=msnhome
Visited: user@http://www.ninemsn.com.au/?ocid=iefvrt