STATISTICAL FLOW ANALYSIS
Section 4.1
Network Forensics: Tracking Hackers Through Cyberspace

PURPOSE
• Identify compromised hosts
  • Send out more traffic
  • Use unusual ports
  • Communicate with known malicious systems
• Confirm / disprove data leakage
  • Volume of exported data
• Individual profiling
  • Reveal
    • Normal working hours
    • Periods of inactivity
    • Sources of entertainment
  • Correlate activity exchanges

PROCESS OVERVIEW
• Defined
  • “Flow record: A subset of information about a flow. Typically, a flow record includes the source and destination IP address, source and destination port (where applicable), protocol, date, time, and the amount of data transmitted in each flow.” (Davidoff & Ham, 2012)

FLOW RECORD PROCESSING SYSTEM
• Flow record processing systems include the following components:
  • Sensor: The device that is used to monitor the flows of traffic on any given segment and extract important bits of information to a flow record.
  • Collector: A server (or multiple servers) configured to listen on the network for flow record data and store it to a hard drive.
  • Aggregator: When multiple collectors are used, the data is typically aggregated on a central server for analysis.
  • Analysis: Once the flow record data has been exported and stored, it can be analyzed using a wide variety of commercial, open-source, and homegrown tools.1

1.
p. 161

SENSORS
• Sensor types
  • Network equipment
    • Many switches support flow record creation and export
    • Cisco: NetFlow format
    • SonicWall: IPFIX and NetFlow
    • Be cautious of “sampling”, which does not produce comprehensive data
  • Standalone appliances
    • Used if existing network equipment does not support flow data
  • Software
    • Argus: Audit Record Generation and Utilization System
    • Softflowd
    • YAF: Yet Another Flowmeter

SENSOR SOFTWARE
• Argus
  • Two packages
    • Argus Server
    • Argus Client
  • Libpcap-based
  • Supports BPF filtering
  • Documentation specifically mentions forensic investigation
  • Exports in Argus's compressed format over UDP
• Softflowd
  • Passively monitors traffic
  • Exports record data in NetFlow format
  • Runs on Linux and OpenBSD
  • Libpcap-based
• YAF
  • Libpcap-based; reads pcap files or captures live packets
  • Exports IPFIX format over SCTP, TCP or UDP
  • Supports BPF filters

SENSOR PLACEMENT
• Investigators often do not have much control over placement
• Infrastructures should be set up with flow monitoring in mind but usually are not
• Factors to consider
  • Duplication is inefficient and must be minimized
  • Time synchronization is crucial
  • Most flow records are collected on external devices such as firewalls, but this ignores internal network traffic, which can be valuable
  • Resources are important when planning; prioritize
  • Do not overload your network capacity

MODIFYING THE ENVIRONMENT
• Leverage existing equipment
  • Switches, routers, firewalls, NIDS / NIPS
• Upgrade network equipment
  • If existing equipment will not work, deploy replacements
• Deploy additional sensors
  • Use port mirroring to send packets to a standalone sensor
  • A network tap is another option

FLOW RECORD EXPORT PROTOCOLS
• Proprietary: Cisco's NetFlow
• Open standard: IPFIX
  • Relatively new and not yet mature; better tools are on the horizon

NETFLOW
• Maintains a cache that tracks the state of all active flows observed
• Completed flows are marked as “expired” and exported in a “NetFlow Export” packet to a collector
• Newer versions (NetFlow v9)
  are transport-layer independent: UDP, TCP and SCTP
• Older versions only support UDP and IPv4

IPFIX
• Extends NetFlow v9
• Handles bidirectional flow reporting
• Reduces redundancy
• Better interoperability
• Extensible flow record data using data templates
  • Template defines the data to be exported
  • Sensor uses the template to construct flow data export packets

SFLOW
• Supported by many devices, but not Cisco's
• Conducts statistical packet sampling
• Does not support recording and processing every packet
• Scales very well
• Generally not very good for forensic analysis

COLLECTION AND AGGREGATION
• Placement factors to consider
  • Congestion
    • Flow records generate network traffic and can intensify congestion
    • Choose a location where this will cause low network impact
  • Security
    • Export flow records on a separate VLAN if possible
    • Isolate physical cables
    • Encrypt using IPsec or TLS
  • Reliability
    • Consider using TCP or SCTP over UDP
  • Capacity
    • One sensor or many?
  • Analysis strategy
    • Can affect all of the above; plan accordingly

COLLECTION SYSTEMS
• Commercial options
  • Cisco NetFlow Collector
  • ManageEngine's NetFlow Analyzer
  • WatchPoint NetFlow Collector

COLLECTION SYSTEMS CONTINUED
• Open source options
  • SiLK: System for Internet Level Knowledge
    • Command-line
    • Most powerful, with the biggest learning curve
    • Collector-specific tools: flowcap and rwflowpack
  • Flow-tools
    • Modular and easily extensible
    • Only accepts UDP input
  • Nfdump / NfSen
    • Collector daemon: nfcapd
    • Reads from a UDP network socket or pcap files
  • Argus
    • Supports Argus format and NetFlow v1-8
    • NetFlow v9 and IPFIX not yet supported

ANALYSIS
• Defined
  • “Statistics: ‘The science which has to do with the collection, classification, and analysis of facts of a numerical nature regarding any topic.’ (The Collaborative International Dictionary of English v.0.48).” (Davidoff & Ham, 2012)
• Purpose
  • Store a summary of information about the traffic flowing across the network
  • Forensic data carving does not apply
  • Still very useful

FLOW RECORD TECHNIQUES
• Goals and resources
  • This should shape your analysis
  • Assess available time, staff, equipment and tools
• Starting indicators: the triggering event
  • Example evidence:
    • IP address of compromised or malicious system
    • Time frame of suspect activity
    • Known ports of suspect activity
    • Specific flows which indicate abnormal or unexplained activity

FLOW RECORD TECHNIQUES CONTINUED
• Analysis techniques
  • Filtering
  • Baselining
  • “Dirty Values”
  • Activity pattern matching

FILTERING
• Important for narrowing down a large pool of evidence
• Remove extraneous data
• Start by isolating activity relating to specific IP address(es)
• Filter for known patterns of behavior
• Use small percentages of data for detailed analysis

BASELINING
• Advantage of flow record data vs. full traffic capture
  • Dramatically smaller, allowing for longer retention
• Build a profile of “normal” network activity
• Network baseline
  • General trends over a period of time
• Host baseline
  • Historical baseline can identify anomalous behavior
  • Most flow patterns will change dramatically if a host is compromised or under attack

“DIRTY VALUES”
• Suspicious keywords
  • IP addresses
  • Ports
  • Protocols

ACTIVITY PATTERN MATCHING
• Elements
  • IP address
    • Internal network or Internet-exposed network
    • Country of origin
    • Who is it registered to?
  • Ports
    • Assigned / well-known ports link to specific applications
    • Is the system scanning or being scanned?
  • Protocols and flags
    • Layers 3 and 4 are often tracked in flow record data
    • Connection attempts
    • Successful port scans
    • Data transfers
  • Directionality
    • Data coming in (something downloaded) or going out (something uploaded)
  • Volume of data transferred
    • Lots of small packets can indicate port scanning
    • Large amounts of data are usually cause for concern

SIMPLE PATTERNS
• Many-to-one IP addresses
  • DoS attack
  • Syslog server
  • “Drop box” data repository on destination IP
  • Email server (at destination)
• One-to-many IP addresses
  • Web server
  • Email server (at source)
  • Spam bot
  • Warez server
  • Network port scanning
• Many-to-many IP addresses
  • Peer-to-peer file sharing
  • Widespread port scanning
• One-to-one IP addresses
  • Targeted attack
  • Routine server communication

COMPLEX PATTERNS
• Fingerprinting
  • Matching complex flow record patterns to specific activities
• Example:
  • TCP SYN port scan
    • One source IP address
    • One or more destination IP addresses
    • Destination port numbers increase incrementally
    • Volume of packets surpasses a specified value within a given period of time
    • TCP protocol
    • Outbound protocol flags set to “SYN”

FLOW RECORD ANALYSIS TOOLS
• flow-tools
• SiLK
• Argus
• FlowTraq
• Nfdump / NfSen

SiLK
• rwfilter
  • Extracts flows of interest
  • Filters by time and category
  • Partitions them by protocol attributes
  • Generally as functional as BPF
• rwstats, rwcount, rwcut, rwuniq
  • Basic manipulation utilities
• rwidsquery
  • Takes a Snort rule or alert file, determines which flows match it, and writes an rwfilter invocation to select them
• rwpmatch
  • Libpcap-based program that reads SiLK-format flow metadata and an input source and saves only the packets that match the metadata
• Advanced SiLK
  • Includes a Python extension, “PySiLK”

FLOW-TOOLS
• Variety of tools for flow export data
  • Collection
  • Storage
  • Processing
  • Sending
• “flow-report”
  • ASCII text report based on stored flow data
• “flow-nfilter”
  • Filter based on primitives specific to
    flow-tools
• “flow-dscan”
  • Identifies suspicious traffic based on flow export data

ARGUS CLIENT TOOLS
• ra
  • Reads
  • Filters
  • Prints
  • Supports BPF filtering
• racluster
  • Aggregates flow records based on user-specified criteria
• rasort
  • Sorts based on user-specified criteria
• ragrep
  • Regular expression and pattern matching
• rahisto
  • Generates a frequency distribution table for user-selected metrics: flow duration, src and dst port numbers, byte transfer, packet counts, average duration, IP address, ports, etc.

FLOWTRAQ
• Commercial tool by ProQueSys
• Supports many formats and sniffs traffic directly
• Users can
  • Filter
  • Search
  • Sort
  • Produce reports
• Designed for forensics and incident response

NFDUMP
• Part of the nfdump suite
• Includes
  • Aggregate flow records by specific fields
  • Limit by time range
  • Generate statistics
    • IP addresses
    • Interfaces
    • Ports
  • Anonymize IP addresses
  • Customize output format
  • BPF-style filters

NFSEN
• Graphical, web-based interface for nfdump

ETHERAPE
• Libpcap-based graphical tool
• Visually displays activity in real time
• Colors designate traffic protocol
  • HTTP
  • SMB
  • ICMP
  • IMAPS
• Does not take flow records as input

Works Cited
Davidoff, S., & Ham, J. (2012). Network Forensics: Tracking Hackers Through Cyberspace. Boston: Prentice Hall.
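
EXAMPLE: TCP SYN PORT SCAN FINGERPRINT
The TCP SYN port scan fingerprint listed under COMPLEX PATTERNS can be sketched in Python roughly as follows. This is a minimal illustration, not the method of any particular flow tool. It assumes the flow records have already been parsed into simple objects. The FlowRecord class, the functions find_syn_scanners and ports_increase, and the thresholds min_flows and window are all hypothetical names and values chosen for the example. It also assumes tcp_flags holds the OR of all TCP flags seen in a flow, so a value of exactly SYN means the connection never progressed past the first packet.

```python
from collections import defaultdict
from dataclasses import dataclass

TCP = 6      # IANA protocol number for TCP
SYN = 0x02   # TCP SYN flag bit


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical, simplified flow record (not a real tool's schema)."""
    start: float     # flow start time, seconds since the epoch
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: int    # IP protocol number
    tcp_flags: int   # OR of the TCP flags seen in the flow


def ports_increase(burst):
    """True if, per destination IP, ports strictly increase in time order."""
    last_port = {}
    for f in burst:
        if f.dst_ip in last_port and f.dst_port <= last_port[f.dst_ip]:
            return False
        last_port[f.dst_ip] = f.dst_port
    return True


def find_syn_scanners(flows, min_flows=100, window=60.0):
    """Return {src_ip: set of dst_ips} for sources matching the fingerprint.

    A source matches when at least `min_flows` SYN-only TCP flows fall
    within `window` seconds and their destination ports keep increasing.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    by_src = defaultdict(list)
    for f in flows:
        # Keep only TCP flows in which SYN was the only flag observed.
        if f.protocol == TCP and f.tcp_flags == SYN:
            by_src[f.src_ip].append(f)

    scanners = {}
    for src, recs in by_src.items():
        recs.sort(key=lambda f: f.start)
        lo = 0
        for hi in range(len(recs)):
            # Slide the left edge so the burst spans at most `window` seconds.
            while recs[hi].start - recs[lo].start > window:
                lo += 1
            burst = recs[lo:hi + 1]
            if len(burst) >= min_flows and ports_increase(burst):
                scanners[src] = {f.dst_ip for f in burst}
                break
    return scanners
```

In practice the same conditions would more likely be expressed as filters in a flow analysis tool; the sketch only makes each element of the fingerprint explicit (single source, SYN-only TCP flows, incrementing destination ports, and a volume threshold within a time window).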