The Needle in the Haystack - Office of Information Technology

The Needle in the Haystack:
Find the Offending File
Robert K. Henry, CISSP, GCIH, GCFA
Information Security Officer
HR Has an Employee Grievance

Hostile Workplace – Sexual Harassment


Inappropriate/offensive files stored on web server
and displayed in office
College Staff Already Involved
College Investigation

Course Site Files Deleted


Six weeks prior to HR grievance report
No Backups!

Backup System on the fritz at time files were
deleted
College Investigation

How do we get the goods?

College systems admin made manual backups to local
PC drive

Not removed from local drive after backup system was repaired
The Mission:


Find inappropriate material among 6 GB (> 7000 files) of
mixed images, word-processed, and text files.
Identify owner/creator of files
Search Options

Manual
grep
ssdeep
foremost
sorter
Content Based Image Retrieval, CBIR

Evaluation Criteria:

Easy!
Free!
Search Options

Manual (The First Responder's Strategy)




zzzzzzzzzzzzzzzzzz!


Thumbnails
Slide Show
One-at-a-time
Too much room for error
Pretty Inefficient (32 hours of searching)

Two people spent two workdays each going
through the DVDs
Search Options

But . . .

it worked!


Identified inappropriate word-processed files and images
in one directory on one of the DVDs
Due to multiple file copying, creator/owner of files doesn't
show up in Windows file properties

Did I mention the files were uploaded via FTP with shared
user IDs?

Not much accountability!
Search Options
There’s gotta be an easier way!
Search Options-- grep


Built-in *nix string search command also available
for Windows
Steps to conduct search with grep (1)

Make a forensic image of the disks
#dd if=/dev/sr0 of=dvdimage.img conv=noerror,sync
Search Options--grep

Steps to conduct search with grep (2)

Extract Strings

Ascii strings first
#strings --radix=d dvdimage.img > dvdimage.str

Unicode strings second
#srch_strings -t d -e l dvdimage.img > dvdimage.uni.str
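
To make the string-extraction step concrete, here is a minimal Python sketch of roughly what the two commands above produce: printable runs with their decimal offsets, first as ASCII and then as 16-bit little-endian text. It is an approximation, not a reimplementation of strings or srch_strings; the function names and the minimum length of 4 are assumptions made for illustration.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Yield (decimal_offset, text) for runs of printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

def utf16le_strings(data: bytes, min_len: int = 4):
    """Yield (decimal_offset, text) for printable characters stored as UTF-16LE.

    Each character is a printable ASCII byte followed by a zero byte,
    which is how Windows applications often store plain English text.
    """
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("utf-16-le")

# Hypothetical usage against an image file such as dvdimage.img:
#   data = open("dvdimage.img", "rb").read()
#   for offset, text in ascii_strings(data):
#       print(offset, text)
```

Keeping the offset next to each string is what lets a later keyword hit be traced back to a location in the image.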
Search Options--grep

Steps to conduct search with grep (3)

Examine Strings Files

Create “dirty word” file

Use “dirty word” file to search strings for, well, dirty words
#grep -f dirtyWords.txt dvdimage.str > grepOutput.txt
#grep -f dirtyWords.txt dvdimage.uni.str > grepOutput.uni.txt
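
The word-list search can be sketched in a few lines of Python. This is a simplification: grep -f treats every line of the file as a regular expression, while the sketch below does plain substring matching (closer to grep -F -f) and skips blank lines in the word list. The names load_patterns and grep_wordlist are hypothetical.

```python
def load_patterns(text: str):
    """Return the non-empty lines of a word list, one pattern per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def grep_wordlist(lines, patterns, ignore_case: bool = False):
    """Return (line_number, line) for every line containing any pattern."""
    needles = [p.lower() for p in patterns] if ignore_case else list(patterns)
    hits = []
    for number, line in enumerate(lines, start=1):
        haystack = line.lower() if ignore_case else line
        if any(needle in haystack for needle in needles):
            hits.append((number, line))
    return hits

# Hypothetical usage with the files named on the slide:
#   patterns = load_patterns(open("dirtyWords.txt").read())
#   lines = open("dvdimage.str", errors="replace").read().splitlines()
#   for number, line in grep_wordlist(lines, patterns):
#       print(number, line)
```

Like the grep commands on the slide, the default is case-sensitive; passing ignore_case=True behaves like adding -i.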
Search Options--grep

Results

Process sounds a little involved, however . . .

Took about 30 minutes to image DVDs and run commands.
 Not Bad!

Identified Word-Processed files with inappropriate jokes

Doesn't get image files (didn't expect it to)

Doesn't Identify Creator of files
 Zero non-repudiation
 Doesn't help investigation confirm or deny ownership of files

Bonus: found survey data with Too Much Information
 Protected student information in clear text
Search Options--ssdeep

Linux and Windows

http://ssdeep.sourceforge.net/

Uses fuzzy hashing


A “partial” or “inexact” hashing of files to identify similar
files
Its author, Jesse Kornblum, even uses the phrase
“finding needles in haystacks” in his documentation!

Haven't heard of it being used to find questionable pictures, but
why not give it a try?
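
To illustrate the idea of "partial" hashing, here is a toy Python sketch. It is not ssdeep's algorithm or its scoring: it only shows the general principle of splitting a file at content-defined boundaries, hashing each piece, and comparing the sets of piece hashes. The window size, trigger value, and function names are assumptions chosen for the example.

```python
import hashlib

def piecewise_digests(data: bytes, window: int = 7, trigger: int = 16) -> set:
    """Split data at content-defined boundaries; return the set of chunk digests.

    A boundary is declared wherever the sum of the last `window` bytes,
    modulo `trigger`, hits a fixed value, so boundaries depend only on
    nearby content and survive edits elsewhere in the file.
    """
    digests = set()
    chunk_start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte that left the window
        if rolling % trigger == trigger - 1:
            digests.add(hashlib.sha1(data[chunk_start:i + 1]).hexdigest()[:8])
            chunk_start = i + 1
    if chunk_start < len(data):  # trailing chunk with no closing boundary
        digests.add(hashlib.sha1(data[chunk_start:]).hexdigest()[:8])
    return digests

def similarity(a: bytes, b: bytes) -> float:
    """Jaccard similarity of the two files' chunk-digest sets (0.0 to 1.0)."""
    da, db = piecewise_digests(a), piecewise_digests(b)
    if not da and not db:
        return 1.0
    return len(da & db) / len(da | db)
```

A conventional hash such as MD5 or SHA-1 changes completely when a single byte changes. With piecewise hashing only the pieces near the change differ, so two nearly identical files still score close to 1.0.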
Search Options--ssdeep

“ssdeep! Go find files in the test directory that
look like files in the “homeStuff” directory!”
#ssdeep -lrd test homeStuff

Bummer-
Identified exact matches only
Search Options--ssdeep

Need to try carving out portion of file for true
fuzziness

Skip the first 20 blocks (header info and more) of file and
cut out the next 70 blocks for the hash comparison:
#dd if=dsc00219.jpg of=219partial.jpg skip=20 count=70

Create file for comparison
#ssdeep 219partial.jpg > testhash.txt

Compare fuzzy hash of image to images in directory
#ssdeep -lrm testhash.txt homeStuff
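
The carving step can also be expressed in Python. The sketch below assumes dd's default block size of 512 bytes, so skip=20 count=70 corresponds to bytes 10240 through 46079 of the input. The function names are hypothetical.

```python
def carve_blocks(data: bytes, skip: int, count: int, block_size: int = 512) -> bytes:
    """Return `count` blocks of `data`, starting after the first `skip` blocks."""
    start = skip * block_size
    return data[start:start + count * block_size]

def carve_file(src_path: str, dst_path: str, skip: int, count: int,
               block_size: int = 512) -> int:
    """File-to-file version of the same carve; returns the bytes written."""
    with open(src_path, "rb") as src:
        src.seek(skip * block_size)
        chunk = src.read(count * block_size)
    with open(dst_path, "wb") as dst:
        dst.write(chunk)
    return len(chunk)

# Hypothetical usage mirroring the dd command on the slide:
#   carve_file("dsc00219.jpg", "219partial.jpg", skip=20, count=70)
```

If the source file is shorter than skip + count blocks, the carve simply returns whatever data remains.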
Search Options--ssdeep

Results:

Not Promising

Can check for similarities in files on a file-by-file basis, but that's
too much like a manual search

Can easily find exact matches
 so you must have the file you are looking for ???

However . . .
 Useful for an intellectual property issue or finding known bad
files
Search Options--foremost

Linux and Windows


http://sourceforge.net/projects/foremost/
Identifies files based on a database of file headers and
footers

Find a list of most file headers at http://www.wotsit.org
Search Options--foremost
This is the header of a gzip file displayed in a hex
editor
The gzip header is 0x1f 0x8b 0x08
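
Header-based identification can be sketched as a simple byte search. The Python below only locates candidate gzip headers in a raw image. A real carver such as foremost also uses footers and size limits from its configuration to decide where each file ends, which this sketch does not attempt. The names GZIP_MAGIC and find_signatures are hypothetical.

```python
GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip ID bytes plus the deflate method byte

def find_signatures(data: bytes, signature: bytes = GZIP_MAGIC):
    """Return every offset in `data` where `signature` begins."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(signature, pos + 1)
    return offsets

# Hypothetical usage against a disk image:
#   data = open("dvdimage.img", "rb").read()
#   for offset in find_signatures(data):
#       print("possible gzip file at byte", offset)
```

Three bytes is a short signature, so some hits will be coincidental; each candidate still has to be checked.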
Search Options--foremost
#foremost -o pathToOutputDir -c pathToConfigFile pathToImageFile
foremost--Results
Search Options--sorter

Linux and Windows

Perl wrapper for several Sleuthkit tools
http://www.sleuthkit.org/

Runs against a disk image

Finds active or deleted files

Then displays thumbnail view of the files
Search Options--sorter
#sorter -s -d pathToOutputDir pathToInputFile
Search Options--sorter

Results

Saves many steps compared to foremost

Still have a bunch of thumbnails to look through
Search Options
There’s gotta be an easier way!
Search Options--CBIR

Content Based Image Retrieval

Commercial Versions Available


My Office (me) too cheap—didn’t even look into
commercial options!
Free and Open Source

imgSeek

Linux and Windows
http://www.imgseek.net/

GNU Image Finding Tool

Linux
http://www.gnu.org/software/gift/gift.html
Search Options--CBIR

ImgSeek Demo
Lessons Learned

Mission Accomplished!

Not so much

Found inappropriate material among 6 GB of mixed
images, word-processed, and text files

Failed to identify owner/creator of files

Identified a potentially useful tool
Lessons Learned

Need to develop incident response procedure for
entire organization


Procedures for breaches of Personally Identifiable
Information and Payment Card data are on the books
Procedures for responding to HR requests need
documentation

And need distribution to de-centralized IT units
References:

The Sleuthkit (includes sorter)
http://www.sleuthkit.org/

foremost
http://sourceforge.net/projects/foremost/

ssdeep
http://ssdeep.sourceforge.net/

imgSeek
http://www.imgseek.net/

GIFT (GNU Image Finding Tool)
http://www.gnu.org/software/gift/gift.html
Presentation available at:

http://boisestate.edu/oit/iso/HTCIA&CBIR.ppt
Questions?
bhenry@boisestate.edu