Processing Intelligence Feeds with Open Source Software
Chris Horsley, SC Leung, Tomas Lima, L. Aaron Kaplan, Raphael Vinot

Overview
• Current topics in automatic incident handling for CERTs
• IFAS
• HKCERT, IFAS and use-cases
• IHAP project
• ContactDB project
• Current R&D

IFAS
• Information Feed Analysis System

Knowing what’s going on
How do national CSIRTs know what’s happening?
National CSIRTs need visibility on networks in their economy
However, many national CSIRTs don’t operate networks themselves, and normally don’t have global (or any) direct visibility
How does the CSIRT know what’s going on in their country?

The kindness of strangers
Luckily, there are a lot of network operators, research teams, vendors, and other CSIRTs out there that collect information, and will share it with national CSIRTs.
And here comes the “but”...

So much data, so many formats
There are many feeds, all with their own data formats and mediums:
• Formats: CSV, JSON, XML, STIX, IODEF
• Mediums: HTML, RSS, email, HTTP APIs
While there are efforts to standardise data formats, this will take a long time, and will likely never cover 100% of feeds
We can’t change the format of remote feeds - we can only change what we do with the data.

The need for standards
Different feeds use many terms to mean the same thing:
ip, source_ip, src_ip, endpoint, attacker_ip, cnc_ip...
If we receive events from many feeds, we need to normalise them so we can compare them with each other.

The need for storage
As a national CSIRT, we’re concerned with the health of national networks, which means measurement.
We can only measure over the long term if we store events, enabling us to analyse them.
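The normalisation step described under "The need for standards" can be sketched roughly as below. This is only an illustration, not how IFAS or Abusehelper implement it: the alias names come from the slide, while the canonical name source_ip, the function normalise_event, and the sample feeds are assumptions made up for the example. Mapping every alias to one field is also a simplification, since a real ontology would likely keep the role of the address (attacker, C&C server, victim) in a separate field.

```python
# Hypothetical alias table: the feed-specific names are the ones listed on the
# slide; choosing "source_ip" as the canonical name is an assumption.
FIELD_ALIASES = {
    "ip": "source_ip",
    "source_ip": "source_ip",
    "src_ip": "source_ip",
    "endpoint": "source_ip",
    "attacker_ip": "source_ip",
    "cnc_ip": "source_ip",
}


def normalise_event(raw_event, feed_name):
    """Return a new event dict with known aliases renamed to canonical names.

    Unknown field names are passed through unchanged, so no data is dropped.
    """
    event = {"feed": feed_name}
    for key, value in raw_event.items():
        canonical = FIELD_ALIASES.get(key.strip().lower(), key)
        event[canonical] = value
    return event


# Two made-up feeds reporting the same host under different field names.
events = [
    normalise_event({"src_ip": "192.0.2.10", "malware": "Trojan.abc"}, "feed_a"),
    normalise_event({"attacker_ip": "192.0.2.10", "port": 80}, "feed_b"),
]

# After normalisation the two events can be compared on a common field.
same_host = events[0]["source_ip"] == events[1]["source_ip"]
print(same_host)
```

Once every event carries the same field names, storing and searching them (for example by ASN, country, or malware name) no longer depends on which feed an event came from.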
We also want to search through events, like:
• C&C servers in domestic networks in the last week
• Bots infected with Trojan.abc on BigISP
• Defaced web sites targeting gov.zz

Need for automation
There’s way too much network event data out there to process manually
Options:
a) use lots of analyst time doing tedious log processing
b) write lots of small, independent scripts
c) ignore inbound logs completely
d) use an automated processing system

So what do we need?
We need something which automatically:
• Gathers many different types of feeds
• Normalises the data in those feeds
• Stores that data somewhere
• Allows search and performs statistical analysis

IFAS
• IFAS = Information Feed Analysis System
• Project sponsored by HKCERT and developed by HKCERT and CSIRT Foundry
• An integration of open source tools, released as open source for CSIRTs

Architecture
• Abusehelper: gather, process, and enrich feeds, generate events
• Logstash: process and normalise feeds
• Elasticsearch: store events in schema-free index server
• Kibana: search through events
• IFAS Reporter: get overall statistics, build realtime dashboards

Kibana event searches

Freeform statistical reporting
• Nesting, filtering, deduplication

IFAS - Dashboard
• Visualize information
• Drill down right at the chart

What you need to start
Software
• Open source under Apache 2.0 License
• Only possible thanks to the hard work released under open source licenses by the Abusehelper and Elasticsearch teams
• Contributions, bug reports, feature requests most welcome!
Hardware
• Production: 8-16GB memory machine
• Dev: 4GB possible
• Multi-core machine (4+ ideal)
• Runs in a VM no problem

Out of the box feeds
Out of Box Feed Plugins (4 publicly available)
• Abuse.ch
• CleanMX
• Millersmiles
• Phishtank
Other developed Plugins
• Malc0de
• Malicious Domain List
• Arbor SRF
• Shadowserver
• Zone-H
Future
• … more, and your own

Where to get it
• Currently under closed pilot to trusted CSIRTs
• Eventually public release
• Please contact contact@ifas.io for details

Demos

IFAS and Use Cases
SC Leung, HKCERT

Give a sense of Today’s Events

IFAS - Log Search
Powerful search on all the information collected
[Screenshot callouts: Keywords here; Feed Details; Add columns of interest]

IFAS - Reporter
Statistical analysis - Trends & Distributions

Free form statistical reports
• Nesting, filtering, deduplication
• Example: number of phishings in “.AU” in each ASN by brand

IFAS - Alert
Set tracking criteria and get notified ASAP
• domain: *.gov.hk
• Alert lists: educational institutions (hkedu), NGOs (hkorg)

Dashboard
Real-time situational awareness for CERT management

Public Situational Awareness on Compromised Servers / PCs
Hong Kong Security Watch Report

Analysis of Trend with Events
• Correlate Cryptolocker 2013-Oct with Zeus

Engage ISPs for large scale incident handling
• Data do help HKCERT in engaging ISPs (their sales team)
• Data do help a server hosting SP understand their customers’ security problems

Converting security events into incident reports
• Defacement
• Phishing
• Export to CSV for batch processing, with some other scripts
• Malware hosting: a bit difficult
• Large volume of incidents: need prioritisation

Future of IFAS - a collaboration platform
• All you can use
• All you can contribute
  • Add input filters for new feeds
  • Add new plug-in modules
  • Add new charts and visualizations
  • Integrate with other systems, e.g. RTIR
  • …
• Standard language: STIX, taxonomy of ENISA

DSMS (Decision Support & Monitoring System)
• An ongoing project that turns security events into Actionable Data
• Set Priority, Choose Monitors, Consolidate Results

Decision Support Sub-system: Interfaces to Monitors
[Diagram; labels recovered from the slide]
• Input, URL, IFAS, Profile, Tasks, Request to monitor, Output, Incident Mgmt, Story
• Monitoring services, each reached through an interface module:
  • Status (online/offline?): status check (HTTP, DNS) via proxy
  • Public analysis systems (VirusTotal, ThreatExpert)
  • Private analysis systems
  • Web reputation (D-Shield)
• Consolidated Results

Incident handling automation project
IHAP

IHAP
• Very similar to IFAS, developed in parallel by CERT.pt, CERT.at
• Also uses Logstash, Elasticsearch and Abusehelper
• Less work on the web interface, more work on the ontology (“Data harmonisation document”)

IHAP - History
• Discussions about CERT.AT developments/documents
• Discussions about cooperation between CERTs
• ENISA support

IHAP - Goals
• Open Source
• Maintainable
• Flexible and Modular - must be possible to integrate existing software and modules (Pastemon, AbuseHelper, etc.)
• Reusable
• Easily Extendable - should require little knowledge and only basic programming skills
• Easily Deployable
• Easily Updatable - easy to share new developments with other CERTs and update the system with that new code
• Easily Configurable - config files that can be easily modified to fit a CERT’s needs
• Documented - must be well documented

Links & Code
• http://www.enisa.europa.eu/activities/cert/support/incident-handling-automation

Common field names for AH
• https://bitbucket.org/clarifiednetworks/abusehelper/wiki/Data%20Harmonization%20Ontology
• A standard set of well defined field names within Abusehelper (AH)
• Allows CERTs to:
  • Write bots which are interoperable within AH
  • Measure in identical ways
  • Parse different feeds more easily (“generic sanitizer bot”): you just have to define the mappings

contactDB

Background / problem
• abuse@ lookups suck (IRT object not in use, no standard; just now RIPE DB is changing with abuse-c:)
• Getting the right lookup is non-trivial, complex
• Many (national) CERTs create their own abuse contact lookup DBs.
• National CERT DB, TI directory, FIRST data cannot be looked up automatically via scripts.

Idea
• A caching contact database with more specific internal data
• Some of this data (tel nos, etc.) will never be in the public whois
• Unify with TI, FIRST etc. data
• Make it query-able by scripts

What databases exist? What can we query?

Abuse contact lookup - flow
[Flow diagram; labels recovered from the slide, grouped by apparent role]
• Inputs: number based resource (IP addr, netblock, ASN); name based resource (domain name, hostname)
• Steps: Get country(), Gethostbyname(), Extract ccTLD
• Data sources: TI, FIRST, CERT.org DBs; Whois DB (RIPE, ARIN, ..); Maxmind; RIPE DB; Cymru, ...; Whois DB (registrant, registrar); National CERT DB; IANA ccTLD list
• Intermediate results: country code; IRT object, abuse-c, ...; national CERT for country
• Result: email address

What exists now?
• Public code repo ;-)
• Whois server (thx Mauro)
• RESTful API (Mauro, Rafiot)
• Some scripts to import TI data (Aaron, David)
• Still some bugs ;-)

Code & document with RIPE
• Document (WIP): https://github.com/certtools/contactdb/blob/master/doc/contactdatabases-for-abuse-handling.mkd
• Codebase: https://github.com/certtools/contactdb (thx Rafiot, David, Mauro!)

Summary
• The CERT community has limited resources for development
• We re-implement the same thing all the time
• Let’s share code or at least exchange ideas on how to automate incident handling!
• Let’s share how we measure success
• Thanks HKCERT, ENISA, CERT.at, CERT.pt, CIRCL, etc.
• Mailing list: https://tiss.trustedintroducer.org/mailman/listinfo/ihap

Thanks!