PCAP data How we get it Direct capture from the NIC on a machine tcpdump wireshark Netwitness etc. 1 Network coverage – an aside Network coverage is how much of the traffic on the network your sensor network can see. You can have different types of monitoring on different parts of the network, but the main idea is to avoid blind spots. This applies to PCAP, flow, logs, and everything else. 2 Network coverage – an aside Since different segments of the network carry different traffic, where you decide to place your sensors will determine what you can see. What would you see on the outside of the border firewall that you wouldn't see inside? What kinds of things do you WANT to see? 3 Network coverage – an aside Things to think about NAT – solve with placement of sensors VPN – solve with placement of sensors or VLAN specific configuration Multiple border gateways – solve using channel bonding/aggregation 4 Network coverage – an aside On the outside of your firewall, you see the attacks that didn't get through in addition to the things that did. On the inside of your firewall you see things that actually got through. The outside tells you who's attacking and how. The inside tells you what attacks worked. 5 Network coverage – an aside In addition to the amount of the network that's covered, we can also think about WHEN the network is being covered. Sometimes you'll want PCAP data for a couple of hours, but couldn't handle collecting it 24/7. When might that be? Could you perhaps trigger full PCAP for a time based on some event? Absolutely! 6 PCAP data Hands on Now that we know where, why, and how to collect PCAP data, let's go do some captures. 7 PCAP data Doing analysis - Wireshark Wireshark is your good old fashioned, run of the mill, go-to, protocol analyzing, packet capturing, file carving buddy. Learn to love it. 
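The capture tools named above all boil down to the same basic operation. As a sketch of what direct capture looks like with tcpdump — note these are command fragments that need root and a live capture interface, and the interface name, sizes, filters, and addresses here are illustrative assumptions, not part of the course material:

```shell
# Direct capture from the NIC (interface "eth0" is an assumption):

# Rotate through 10 files of 100 MB each so a long-running
# capture can't fill the disk
tcpdump -i eth0 -s 0 -C 100 -W 10 -w rolling.pcap

# Capture filters keep the volume down, e.g. only DNS,
# or only one host's web traffic
tcpdump -i eth0 -w dns.pcap 'udp port 53'
tcpdump -i eth0 -w web.pcap 'tcp port 80 and host 10.0.0.5'
```

The rotation flags are also one way to implement the triggered, time-boxed full PCAP idea above: leave the rotating capture running and copy the files off when an event fires.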
8 PCAP data Doing analysis - Wireshark What we’ll be doing today Learning the layout of the interface Capturing PCAP data Looking at the structure of packets Filtering packets to find interesting things Following a TCP session Carving files Reading emails 9 PCAP data Doing analysis - Wireshark Sources for pcaps http://wiki.wireshark.org/SampleCaptures http://packetlife.net/captures/ http://www.pcapr.net http://www.icir.org/enterprisetracing/download.html Your own machine 10 PCAP data Doing analysis - Wireshark So that’s Wireshark. Pretty nice, huh? When it comes to finding out exactly how your machine got pwned (aka owned, pwnt, etc.), it’s pretty effective. Also, the functionality of Wireshark can be extended by coding up plugins and decoders, and anything else you want. It’s open source! 11 PCAP data Doing analysis - Wireshark But what if we don’t have time to do all that poking about and sifting through packets? Is there a better way to look through a big pile of PCAP data? I thought you’d never ask… 12 PCAP data Doing analysis - Netwitness What we’ll be doing today Learning the interface Importing some PCAP data Doing (almost) everything we just did in Wireshark in less time than it took us before Catching things that we might have missed before 13 PCAP data Doing analysis - Netwitness Netwitness is a tool for getting a quick picture of what someone was doing on the network, especially if you’re going after less advanced threats, like insider threats or the average criminal. Currently there’s a freeware version and a paid version. Give it a try next time you get stuck during an investigation. Often you can catch certain clues via the session based view that you wouldn’t simply by digging through PCAPs. 14 PCAP data Doing analysis – Other tools In addition to sitting down and doing deep dive analysis on PCAP data by hand, we can also run it through automated processes (sometimes even at line speed!) to do all sorts of other stuff. 
This is how firewalls and IDS work, after all. Depending on the audience, this is where we discuss our organization's custom tools. 15 PCAP data Generating flow and alert data Useful when someone hands you a big wad of PCAP and you have no other data Can be done when you've got data from before you fielded your flow monitoring or alert generating apps (IDS, firewall, etc.) Makes analysis of large data sets easier since it's faster to look at coarse grained data. We'll cover this when appropriate. 16 PCAP Data Conclusion When you have PCAP you can see pretty much everything. It's very heavyweight once you start dealing with enterprise level networks. It's the only way you'll see what's being said on the network, but it's not as good as flow or log/alert data for figuring out what's important to look at. 17 Agenda Day 1 Agenda and motivation Intro to forensic data types Working with PCAP data What it looks like How to interpret it How to get it Working with flow data What it looks like How to interpret it How to get it Day 2 PCAP and flow recap Working with logs and alerts What they look like How to interpret them Getting them all in one place SIEMs and their familiars Fielding a monitoring solution 18 Flow data Things to keep in mind This is easy data to get, so make sure you do. Better used to figure out where to look than to figure out exactly what happened. Even when you're not on an investigation, you should collect flow data to do baselining. Visualization helps a lot. 19 Flow data What is flow data? There's some variation, but generally a record contains the following: Source and dest ip Source and dest port Protocol Start time + (duration | end time) # of packets # of bytes Directionality? Depends on format. 
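To make that field list concrete, here is a toy text rendering of a few flow records and a one-liner answering a typical flow question ("who touched SSH?"). The column layout is invented for illustration — real exports (NetFlow, IPFIX, SiLK) are binary formats:

```shell
# Toy flow log, one record per line.
# Columns (invented layout): sip dip sport dport proto packets bytes
cat > records.txt <<'EOF'
10.0.0.5 192.168.1.9 51234 22 6 14 2100
10.0.0.5 192.168.1.9 51240 80 6 9 4500
192.168.1.9 10.0.0.5 22 51234 6 12 1800
EOF

# "Who touched SSH?" -- filter on dport (field 4), no packet contents needed
awk '$4 == 22 { print $1, "->", $2, $7 " bytes" }' records.txt
# -> 10.0.0.5 -> 192.168.1.9 2100 bytes
```

Notice the third record is the same conversation seen from the other direction — that is the unidirectional-record behavior discussed next.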
20 Flow data Netflow v5 protocol Source: caida.org/tools/utilities/flowscan/arch.xml 21 Flow data Command line output 22 Flow data Directionality Some types of flow records are unidirectional (SiLK, rw tools), and others are bidirectional (argus, ratools, original flow data). Unidirectional flow data has a separate record for each side of the conversation. This is how Cisco NetFlow v5, v9, and IPFIX records are specified. Bidirectional flow data combines both sides into one record, usually having extra fields for "# of sender packets", "# of destination bytes", and other things that would get muddled by combining two unidirectional flows. 23 Flow data Directionality Depending on what you need, you can convert between bidirectional and unidirectional using whatever tool is appropriate to your data set. 24 Flow data Cutoff and Aging Until conversations end, their flow data sits in the router/switch/etc. memory, taking up space (a DoS vector?). So if we've got lots of very long lived flows, or flows that didn't terminate cleanly (no FIN/ACK teardown), we need to free up that memory and write the flows. For long flows, we have a configurable time (say 30 minutes) after which we write a record and start a new one. Figuring out how long the flow actually was will require massaging your data. For broken flows, another cutoff time (maybe 15 seconds?) will clear them out. 25 Flow data Sampling When there's too much traffic for your switch, NIC, or whatever to handle, sampling is used to throttle the workload. Instead of every packet being recorded in a flow (sample rate = 1 out of 1), we take 1 out of N packets, make flow records, and then scale the appropriate values by N. We will miss flows due to this, but for very large throughputs it's necessary. Also, N is not always constant over time. 26 Flow data Formats And then there are different formats… Cisco NetFlow v5 and v9 are very common. V5 will only do IPv4, though. IPFIX is a lot like v9 plus some interesting fields. It's an open protocol put out by the IETF. 
sFlow hardware accelerated, forced sampling, mainly an HP thing. And there are others, but we’ll focus on v5/v9 and IPFIX. 27 Flow data Formats There isn’t a current standard for how to store flow data on disk, so different software suites will store it differently to suit their search and compression capabilities. Choose your software suite based on what formats it can consume, and be prepared to perform a conversion if you switch. 28 Flow data Capturing Switches and routers Flow data is gathered by the network hardware, and then sent over the network to one or more listeners. To set up collection and forwarding, look up instructions particular to your device and the revision of its OS (typically Cisco IOS). Remember, this is going over the network, so it can be intercepted, falsified, or blocked by attackers, outages, and misconfigurations! 29 Flow data Capturing Machines on the network Creates flow data based on what network traffic that machine can see. Can either generate flow data and forward it to another collector, store it locally, or both. Also possible to collect flow data from other machines or network hardware. Eventually your flow data will have to end up somewhere. You want that somewhere to be handy to your analysts. 30 Flow data Analyzing with argus Argus is another popular tool which is much easier to deploy, so we’ll be using it to do some sleuthing. Become familiar with a few of the tools Locate a scanning machine Detect beaconing Find activities by a compromised machine Find routing misconfigurations 31 Flow data Capturing with SiLK YAF – yet another flowmeter Produces IPFIX data from files or network traffic Can write to disk or push out over network Lightweight, easy to install Works well with SiLK tools 32 Flow data Capturing – consolidating in SiLK rwflowpack Part of the SiLK toolset Designed to receive input from multiple sensors and build a consolidated repository for analysis Just one of the pieces of a full sensor network. 
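Once rwflowpack has built the repository, the usual SiLK workflow is to pair rwfilter (select records) with rwcut (print fields). A sketch — the dates, the `out` type, and the sensor/class setup are assumptions about how your particular install was configured:

```shell
# Pull yesterday's outbound TCP to port 22 from the SiLK repository
# and print a few fields for eyeballing (example date, adjust to taste)
rwfilter --type=out --start-date=2012/01/15 \
         --proto=6 --dport=22 --pass=stdout \
  | rwcut --fields=sip,dip,sport,dport,bytes,stime
```

These are command fragments, not a runnable script — rwfilter has nothing to return until a repository exists.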
33 Flow data Analyzing with SiLK SiLK tools Produced by CERT NetSA Relatively easy to use We’ve already been using them and have done a decent amount of writing on how to use them (check my transfer folder) 34 Flow data SiLK tools - conclusion Free, very powerful, extensible, pretty easy to use. Command line tools are great for things that we have running as daemons, but for visualizing flow data we can find a better interface. With the right tools, we can add better visualization. 35 Flow data Visualizing Open source Afterglow + graphviz: cheap, but too much work to set up Free/commercial Scrutinizer: quick and easy, consumes pretty much any flow data, free version is limited to 24 hours of data Lynxeon: belongs in the SIEM category, visualization tool is worth a mention though, 60 day trial 36 Flow data Visualization http://www.networkuptime.com/tools/netflow/ http://freshmeat.net/search/?q=netflow&section=projects TONS more Source: plixer.com, vizworld.com, networkuptime.com 37 Flow data Continuing research Flowcon, Centaur Jam, etc. Come join us! Share your tools! Statistical anomaly/group detection Complicated math New-ish technology, but worth a look if you’ve got a pile of netflow data that you’re sitting on. 
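Before leaving flow data: even without a visualization suite, plain coreutils can give you a crude "top talkers" view of a flow log, which is often all the picture you need. A sketch, assuming your suite can dump flows as whitespace-separated text (the column order here is invented for illustration):

```shell
# Hypothetical text dump of flows: sip dip dport bytes
cat > flowdump.txt <<'EOF'
10.0.0.5 192.0.2.1 443 9000
10.0.0.7 192.0.2.1 80 1200
10.0.0.5 192.0.2.8 22 3000
EOF

# Total bytes per source address, biggest sender first
awk '{ sent[$1] += $4 } END { for (h in sent) print sent[h], h }' flowdump.txt \
  | sort -rn
```

Here 10.0.0.5 floats to the top with 12000 bytes sent. It's not a graph, but sorted aggregates like this are the backbone of the baselining and anomaly-spotting described above.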
38 Agenda Day 1 Agenda and motivation Intro to forensic data types Working with PCAP data What it looks like How to interpret it How to get it Working with flow data What it looks like How to interpret it How to get it Day 2 PCAP and flow recap Working with logs and alerts What they look like How to interpret them Getting them all in one place SIEMs and their familiars Fielding a monitoring solution 39 PCAP reCAP Most granular data we can collect Takes a lot of resources to gather Great for finding out how machines got pwned Bad for figuring out what's going on quickly Can be converted into flow and alert data with the right tools 40 FLOW reFLOW Info about conversations on the network Cheap and easy to collect Quick to analyze with the right tools Different analysis suites, formats 41 CTF Forensics Jim Irving Overview • Forensics in a CTF Context • Network forensics tools • Host based forensics tools Working in a CTF environment Unlike a typical forensic investigation, a CTF will always be down and dirty with as close to 0 rules as you can get. You also care a lot more about speed than you do accuracy. There isn't a court case going on here. Working in a CTF environment Forensics generally requires that you know an awful lot about the underlying systems, but in a CTF there are tons of systems that you don't have the time to learn. For that reason we'll be focusing on tools that you can use where the learning has already been done for you. Possible scenarios • Each team has a server VM that has to be protected/attacked • Teams get a VM or disk image of a compromised machine to be analyzed • Points are awarded for a series of web challenges • Crazy mess, like half the team has to play Team Fortress or something… For reals forensic challenges If you're given VMs or disk images to analyze, then you'll be doing a lot of host based analysis. Realistically, I cannot teach you what you need to perform a comprehensive analysis in 8 hours. 
So we're going to focus on the things that at least help you figure out where to start googling. When you do google, you'll probably be looking for free tools or scripts to perform a particular task. When competing with more intelligent opponents, rapidly acquire and use tools to level the playing field. Forensics as a support class In challenges where you have machines that must be protected while you attack others, forensics allows you to see what's happening. You'll have access to network traffic, system logs, and whatever else you can get visibility to within the system. The purpose of using your forensic tools in this context will be to: • See where attacks are coming from so you can set up defenses • See what kinds of attacks are being used so you can see what works and figure out how to neutralize it (stealing ideas is a legit strategy) How the class is divided • Look at several network tools and practice deploying them quickly: Wireshark, Snort, Argus, etc. • Look at freely available host based forensics and practice common techniques: disk dumping, memory analysis (Volatility), timelining, file carving/recovery Phase 1: Network stuff Something to think about when dealing with network traffic is where it's coming from and who can see it. If you have a VM to protect, you will at least be able to see the traffic going into and out of it. In this situation you want to be getting full packet capture of the traffic to that machine. If possible it would be nice to get that data passively so that if you get owned the attacker won't be able to see what you're using for defense. Also you won't have to set up tools on the machine you're protecting. Phase 1: Network stuff If you're all attacking the same thing, like web challenges, you probably won't have a good way to monitor data going into and out of the victim server. If you can, check whether that's considered cheating. If it isn't, you should definitely do it. 
Setting up your sensors Assuming we're defending a VM, and that we're passively monitoring traffic, we're going to set up the following tools: • Wireshark – for keeping a packet level eye on certain types of data • Snort – for alerting us when certain things happen • Argus – for quickly sifting through connection data • And maybe some other things Setting up your sensors We're going to start with the SIFT Workstation because it's already got most of the tools on it. Once you get it downloaded, install argus, like so: sudo apt-get install argus-client And the password is forensics. Setting up your sensors Now we need to clean up snort. Open up /etc/snort/snort.conf (you'll need to sudo). Now find the section that says "preprocessor http_inspect_server" and make the following changes: preprocessor http_inspect_server: server default \ profile all ports { 80 8080 8180 } oversize_dir_length 500 \ server_flow_depth 0 \ client_flow_depth 0 \ post_depth 0 Additionally, any port that you think might have HTTP going over it should be added to the brackets. Setting up your sensors Once you've changed the preprocessor block, scroll down to the bottom of the file and comment out all the "include something.rules" lines EXCEPT local.rules. Now save and close. Now to practice… While you are here, before leaving this page, turn on your VMs and make absolutely sure that you can do all of the following. • Start wireshark and see packets • snort -A console -c /etc/snort/snort.conf • argus -d -e 'localhost' -w argus.log • Anything else you think you might use, like tcpdump, netcat, telnet, Firefox (seriously) Uses for wireshark Wireshark is going to serve as your real time, packet level visibility tool. You want it to let you know when very specific types of traffic occur and allow you to quickly inspect them. To this end you'll probably want to have more than one instance of it running, tiled across the screen, with some very specific capture filters in place. 
If(derp what's a capture filter?) remediate(); Uses for wireshark Let's practice this now. We're going to set up the following capture filters AND TEST THEM: • DNS traffic out of the server • HTTP requests into the server • HTTP responses from the server with error codes • FTP traffic into or out of the server Uses for wireshark SAVE THESE FILTERS!!! You will forget them when the time comes and it is far better to have them ready to go and a snapshot in place. Can you think of any more filters that might be useful? Once you've got a good set of filters that prove useful in the field, please consider making them publicly available. Snort Snort is going to be particularly useful when defending a VM. From your passive monitoring box it allows you to make very specific filters for alerts. From your vulnerable server it gives you something with which to filter and drop packets. That means you'll need to learn to write some rules. www.snort.org/assets/166/snort_manual.pdf (Start on page 130. Careful, it's not always right.) Snort The rules that you write will go into /etc/snort/rules/local.rules. Whenever you write a new rule you'll need to save the file, and restart snort by re-running the command that started it. We're going to write a few rules to get the hang of the syntax and then try to make actual useful rules that you can modify and use in a competition. Snort rule syntax Snort rules, at their most basic, look like this: action protocol src_ip src_port direction dst_ip dst_port (rule options) Here's one with the information filled in: alert tcp any any -> any 80 (msg:"text goes here"; sid:10001; rev:1;) Inside the rule options, the msg, sid, and rev are required. Snort rule syntax Unfortunately, the syntax for snort rules is really expansive, the documentation has a lot of deprecated stuff, and it's really hard to find a good quick tutorial. So here's a link to the shortest, closest to right one that I've found. 
It's linked here since we certainly can't reprint it and it's way too big to recreate. Bookmark this on your analysis machine. ftp://petrinet.dvo.ru/pub/Vyatta/build-iso/pkgs/vyattasnort/debian/my/snort_rules.html Snort rule syntax Now let's try writing some rules using the documentation that we have. • Alert on an HTTP GET to the server with "../" in the URL. • Drop tcp traffic to port 22 on the server (it's like iptables, but completely different) • Alert on an HTTP response from the server containing the byte string "E3 80 04 32 54" Quick snort review Snort on your sensor for seeing very specific attacks. Write rules that match the type of traffic that the attackers will need to hit your services. Snort on your server to drop packets that look scary, as an impromptu IPS. Can also be used to deny scouting information by intercepting outbound error messages (if you perhaps can't turn off debugging output). Using argus in a hurry Argus is a netflow collector/analyzer. Typically netflow is only useful when you have a lot of it, but there are a few things that can make it useful even in a quick setting. Netflow is a record of all the conversations that go across a network. It doesn't contain all the contents of the packets. It's helpful when you want to see the forest and not concentrate on the trees, so to speak. Using argus in a hurry Whenever you start your sensor box, start argus in daemon mode and have it write out to a file (argus -d -w /somefile.log). This is going to keep writing to a file, and you're going to analyze that file using ra. 
ra reads the output of argus and will print its contents in different ways depending on what parameters you pass it, and there are a lot of parameters… Using argus in a hurry ra commands look like this: ra (bunch of options) - (filter expression) The options tell ra how to display the data, modify the data, and a little bit of how to filter the data (like by time), and the filter expression part tells ra what data you want (or more specifically what you don't want). That's a hyphen surrounded by spaces between the two halves. It's required. ra options Here are some of the important options to use: • -r = which file to read, this is your log file • -t = time range to look in, will search all if omitted • -s = which fields to print, will print a default set if omitted So the first half of a call to ra where you look through the last 15 minutes and only want to see the src_ip, dest_ip, and direction would look like this: ra -r /somefile.log -t -15m -s saddr dir daddr - ra filter expressions You can omit the filter expression and ra will just print everything. There are a lot of these, but here are the really useful ones: • tcp = only tcp traffic • src port ## or dst port ## or just port ## = only traffic on the specified port • host (hostname) = only traffic involving the specified host, can be prepended with src or dst These can be combined using the keywords "and" and "or" to make compound statements. You can also prepend anything with "not" to negate it. rasort A lot of the time the output of ra isn't going to help much, so what you do is, right after the ra command, put the following: | rasort -m (keyword) | less Where keyword is typically bytes or pkts or something. For more keywords just do "man rasort". Same thing goes for "man ra". Playing with argus So now that you've got a little idea of how to use argus we'll try a few. • List all hosts that are talking and sort them by sent packets • List all hosts that were talking to the vulnerable server in the last 5 minutes. 
• List all the hosts that the top host from the previous example talked to in the past hour. Using argus to greatest effect Argus is mostly going to help you see who's talking to you and sort them by how and when they're doing that talking. When you detect that you're getting attacked, check who's been talking to you a lot recently. Check every so often for high numbers of low-packet flows as evidence of scanning. Also every so often search for everything but what you expect to see, so that you'll see what you're not expecting. Network summing up So you've got snort running all the time and someone on defense writing filters specific to whatever service you're running. That's going to tell you when you're getting attacked. You've also got wireshark up and running so that when you see an attack you can quickly look at the packets and the tcp stream, figure out what's going on, and improve the filters. Finally you've got argus running so that once you calm down from the attacks you can see who all is attacking you and maybe even who else they're talking to. In addition to that, you're checking it every so often to see how and when you're getting scanned. There you go. That's how you play defense. Phase 2: Host based forensics We're going to assume that you know how to look for rootkits, so you can find hidden processes and hidden things in the registry. What we'll concentrate on is the forensic investigative techniques and not necessarily the defensive techniques. First things first… The first thing you need to do in most situations is take images of memory or disk. Here's what you'll most likely see: • A virtual machine • A disk image of a virtual machine • A busted USB key • A machine accessible only across the net Imaging a virtual machine - memory I'm going to focus on VMware here, FYI. Taking images of VMs is a particularly entertaining thing to do and there are a couple of tricks. 
To get a physical memory image, snapshot the machine, and the *.vmem file that gets created is the best memory image you can get. Better than you can even get by running imaging tools inside the guest. Imaging a virtual machine - disk Disk imaging, however, is not as easy. I have found no reliable means to go from .vmdk to a raw image, so you want to do the following. 1. Make an ISO containing MoonSols community edition (http://www.moonsols.com/windows-memory-toolkit/), dd.exe (http://www.chrysocome.net/dd), and a statically linked version of dcfldd (http://dcfldd.sourceforge.net/). 2. Make a directory on your analysis machine and share it with the VM (as a web share for Windows VMs). 3. For Windows, run "dd.exe --list" and pick which device looks right, then run "dd.exe if=(device from --list) of=(path to a file on the shared folder)". For Linux, run "mount" and see what's mounted on /, and then run "dcfldd if=(/dev/whatever) of=(path to file on share) bs=512". 4. Profit. Imaging a disk image It's a disk image, dummy. But seriously, if you want a particular type of memory image, or to boot from it, create a new VM using the image as the disk, then snapshot the VM once it's running and use the .vmem as the memory image. Imaging a USB key Plug it into a Linux machine, figure out which device it is, and then run the same dcfldd command from before, but with the if parameter set to the USB device. dcfldd will continue on errors and so will be effective even for damaged drives. Imaging across a network On your analysis machine install netcat (nc), gzip, and dcfldd. Then run the following: nc -l -p (some port) | gzip -dfc | dcfldd of=(some filename) And then on your source machine install all of those and run the following: dcfldd if=(drive to image) | gzip -cf | nc (ip of analysis machine) (that port) -q 30 This will pump the disk image across the network, gzipped, and then deflate it on the other end and write it to a file. 
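One step worth bolting onto that network-imaging recipe is an integrity check: hash both ends and confirm the image survived the trip. A self-contained sketch of the same compress/decompress stages, minus the network, with invented file names:

```shell
# Stand-in for the disk being imaged (the real thing would be a device node)
dd if=/dev/zero of=disk.img bs=1024 count=64 2>/dev/null
printf 'unique-marker' >> disk.img

# Same gzip compress/decompress stages the netcat transfer uses
gzip -cf disk.img | gzip -dfc > received.img

# Hash both ends; matching digests mean nothing was corrupted in transit
sha256sum disk.img received.img
cmp -s disk.img received.img && echo "image verified"
```

In a real transfer you'd run sha256sum (or md5sum) separately on the source drive and on the received file and compare by eye — a mismatch means re-image, not analyze.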
Things to do with memory images You'll learn a lot more about this from your rootkit class, but I'll put some of this in brief for quick and dirty analysis. On your analysis machine, install python and download Volatility (use TortoiseSVN to pull it from http://volatility.googlecode.com/svn/branches/Volatility-1.4_rc1). Make sure python is on your PATH, and then go drop your memory image wherever Volatility is sitting. Volatility quick and dirty Once you've got all that set up, here are things to do. Prepend all of these commands with "python volatility" and append them with "-f (path to image file)" 1. Run both pslist and psscan3 2. Find any process that's in one and not the other. 3. Run filescan, connscan, regkeys, sockscan, or anything else and look very closely at anything that's reported for the mismatched processes 4. Run memdump -p (pid of process) -o (offset of process) --output-file=(some filename) 5. Run strings on that output file and start googlin'. Finding deleted files For this class we're going to focus on finding deleted files in disk dumps. I'm not going to discuss trying to find files hidden by rootkits, since rootkits ought not to be running on your analysis machine. To accomplish this, we'll be focusing on using the SANS SIFT 2.0 VM. It's got Sleuthkit, PTK, and a bunch of other things already installed on it. There's also a ton of video tutorials for doing this sort of thing on www.sans.org. You should certainly take a look at them since we'll be doing it very quickly here. Finding deleted files 1. Obtain your disk image (not memory image, they don't work well for this) as previously directed. 2. Make it available on your analysis machine somehow. 3. Open PTK or Autopsy from the VM. 4. Open a new case, fill out data, whatever. 5. Literally JUST CLICK ON THE TAB FOR FILE ANALYSIS AND FOLLOW THE DIRECTIONS. It more or less really is that easy. In fact, the methodology for getting timelines of file activity is much the same. 
Now we're going to do that with as many sample disk images as I can bring. This is our easy win, folks. Host based summation So when you get a forensics challenge, you're going to do the following things: 1. Get images of memory and disk. 2. Run Volatility on the memory image. 3. Load the disk image into PTK and look for files, make a timeline, and do anything else that looks shiny in the UI. 4. Profit more. Wrap up So that's the gist of it. You have now used several tools, have a working analysis machine, and have commands that will get you some of the things you'll need. The toolkit that you have is popular and videos of it being used are all over YouTube. If I didn't cover it, you shouldn't have any trouble finding someone who has. Go forth, and conquer.