CySIS Cyber-Socio Intelligent Systems Laboratory

Making Smart Decisions in Cyber and Information War
Paulo Shakarian
Arizona State University
Tempe, AZ
shak@asu.edu

Russian Cyber-Warfare
Estonia (2007): Massive hacktivist DDoS
Georgia (2008): Botnet-driven DDoS followed by hacktivist DDoS for the purpose of silencing news media and government sites
LiveJournal (2011): Massive DDoS attacks by the Optima botnet to silence anti-Putin journalism

2014 Russian Cyber-Warfare in Ukraine and Crimea
• Small-scale cyber attacks by independent hacking groups
• Some disruption of communication networks between Crimea and Ukraine by conventional forces
• Ukrainian parliament members' phones hacked, and Ukraine gov’t website down for 72 hours
• Sandworm cyber-espionage platform (discovered Oct. 2014)
• No large denial of service on the scale of Estonia, Georgia, or LiveJournal
• Where are the big DDoS attacks?

Military-political, economic, [and] informational competition does not subside but grows in the world.
- Vladimir Putin, Dec. 2013

Social Media Tactics
• Recruitment of trolls to increase pro-Kremlin opinion in social media
  • Paid to post ~100 comments a day on social media and major news media articles
  • Generally write provocative messages to disrupt normal conversation on a message
• Pro-Russian social media accounts
  • “Polite People” features Russian Army personnel as respectful to the local population
  • Recruitment of fighters for East Ukraine
  • Narratives stressing religious commonality between Ukraine and Russia and vilifying the West
• Deliberate false information
  • Information operations used to disrupt and delay counter-information campaigns

MH-17 Disinformation
Putin immediately blames the Ukrainian military for the incident.
Militia claims “Only dead bodies were aboard the plane”
“Spanish air traffic controller” working in Ukraine blames the Ukrainian military for the attack
All highly disseminated
All false

Early Identification
• Can we identify viral cascades before they go viral?
• Two queries:
  • Size-based: If we observe a cascade that has m participants, can we predict whether it will grow to size T or greater?
  • Time-based: If we observe a cascade that has run for t time periods, can we predict whether it will grow to size T or greater?
• Ideally, we would set T to be an order of magnitude greater than the current observation.

Large Cascades are Rare
Our study on a Sina Weibo dataset (17.9M users, 22M Tweets) confirmed the previously observed power-law relationship between cascade size and frequency.
Hence, when viewed as a classification problem, the classes are highly imbalanced.

Structural Diversity
• An individual adopts a behavior based on the fraction of the circles he is associated with that have previously adopted it.
• Inspired by real-world results of Ugander et al. 2012.
• Allows for additional information to be considered (e.g., geography, culture).
Intuition: Leverage structural-diversity-based measures derived from the subgraph of the initial adopters.

Viral Classification
[Figure panels: Size-Based, Time-Based]
• Our method (feature set Am) significantly outperformed previously published best results (Bm) and baseline time-based features (Cm).

Viral Classification (Size-Based)
[Figure panels: Stability, Precision vs. Recall]
• Our features were generally more stable when used to predict cascades of greater sizes.
• By varying the training threshold (while keeping the definition of “viral” fixed for classification), we could trade precision for recall.

Power Grid Cascading Failure
The power grid is heterogeneous, meaning large-scale reconnaissance is difficult.
However, to cause a cascade, the adversary may need to recon and attack only a small portion of the power grid.
[Figure: example grid with nodes labeled G, T, and D]

The Model
The Attacker conducts cyber-attacks against the IT systems of power grid infrastructure to disable certain substations whose loss leads to a cascading failure.
The Defender can harden a limited number of systems to prevent the attacker from causing them to fail.

Technical Preliminaries
Power grid network:
Source and load nodes:
Edge load:
Failure operator (applied iteratively):
Payoff function (zero-sum game):

Approach
• Deterministic Best-Response: To deal with NP-hardness (in most cases), we utilized a greedy heuristic.
• Minimax (Mixed) Strategy: Leveraged the double-oracle algorithm (which provides an exact solution given exact best-response oracles), using greedy algorithms as the oracles.
• Deterministic Load-Based (DLB): From the physics literature, based on a definition of load applied to nodes.

Experimental Evaluation
Dataset: An Italian 380 kV power transmission grid.
• 310 nodes: 113 source, 96 load, and the remainder transmission nodes
• The nodes were connected by 361 edges representing the power lines
All experiments were run on a server with
• An Intel X5677 Xeon processor, 3.46 GHz, 12 MB cache
• 288 GB of physical memory
• Red Hat Enterprise Linux version 6.1

[Figure: Defense Against the Attacker’s Minimax Strategy. y-axis: Expected Payoff (Disconnected Nodes); x-axis: Resources (ka = kd)]

[Figure: Defense Against the Attacker’s Best Response to DLB. y-axis: Expected Payoff (Disconnected Nodes); x-axis: Resources (ka = kd)]

[Figure: Analysis of Attack Positions. y-axis: Disconnected Nodes; x-axis: Load. Annotation: Low-load / high-payoff!!]

Cyber Adversarial Intent
• Conducting malware forensics is a time-consuming task for an analyst, even with a malware sandbox:
A [automated] sandbox cannot tell you what malware does.
It may report basic functionality, but it cannot tell you that the malware is a custom Security Accounts Manager (SAM) hash dump utility or an encrypted keylogging backdoor, for example. Those are conclusions that you must draw on your own.
- Practical Malware Analysis
• Can we quickly infer a set of malware tasks from attributes observed in a sandbox run?

Key takeaways:
• Advanced Persistent Threats (APTs) are the most likely course of action for an enemy to conduct intelligence gathering in cyberspace.
• Social engineering is the most common attack vector for launching even the most complex APTs.
• Social media presents a large attack surface that is well suited for social-engineering-launched APTs.

Why do so many APTs originate from China?
1999: Active offense (Zhu Wenguan and Chen Taiyi): importance of pre-emptive offense
1999: Unrestricted Warfare (Qiao Liang and Wang Xiangsui): warfare extends to political, scientific, and economic arenas, and can also occur during “peace time.”
2002: Gen. Dai Qingmin: cyber operations are both precursory (before operations) and whole-course (during operations)
Long Fancheng and Li Decai: cyber operations against social, economic, and political targets can be conducted without fear of such activities leading to large-scale military engagements.
Wang Wei and Yang Zhen (Nanjing Military Academy): in a war against an information-centric community, the political system, economic potential, and strategic objectives are high-level targets
[Photo: CIA World Fact Book]

How Do We Determine the Adversary’s Intent?
• Current approaches rely on analysis of discovered malware in the aftermath of an attack
• High reliance on a human analyst supported by tools
  • Disassembler (IDA Pro): an interactive disassembler that creates maps of program execution
  • Sandbox: a controlled environment for malware program execution
• Reports generated by these approaches need the aid of security analysts to determine intent

Toward Automating a Solution
• Given malware “attribute atoms” (features)
• We wish to infer “Tasks”

System Design
[Diagram, components in flow order:]
• Malware X
• Sandbox (generates analysis reports)
• Parser (represents Malware X as a set of attributes)
• Knowledge base (malware samples represented as a set of attributes)
• Instance-based model (inputs: the knowledge base and the parsed attributes of Malware X)
• Probability distribution over the set of families that X could belong to
• Assign each family's probability to the tasks associated with that family, and sum the probabilities for each task
• Final result: return the set of tasks with a probability of at least 0.5

Results
[Figure: Average F1. Labels recovered from the chart: Mandiant, GVDG, MetaSploit, SVM, RF, ACTR-IB, Invincea]
The ACT-R instance-based model outperforms standard machine learning approaches and a state-of-the-art malware capability detection system offered by INVINCEA Inc.

Can we do better?
• Malware analysis is primarily reactive, done in the aftermath of an attack
• Can we be more proactive against emerging threats?
Hackers in groups like Anonymous rely on anonymized social connections to plan and execute hacktivist operations
Other hackers use these communication channels to buy/sell exploits and malware
Can we leverage this communication to gain threat intelligence?

Introduction to Cyber-Warfare
Rated 9 out of 10. Outstanding overview… fascinating read about a most important subject
- Slashdot
Should be on the shelf of every professional concerned with computer security.
- ComputingReviews.com
A balanced blend of history and technical details
- Help Net Security
If you are teaching this subject then use this book.
- Krypt3ia
This book feels as if it can stand the test of time.
- Professional Security Magazine
This book will be indispensable.
- Lieutenant General (ret.) Charles P. Otstott
Currently used as a text at the U.S. Naval Postgraduate School.

Thank You!
shak@asu.edu
http://shakarian.net
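
Supplementary sketch (System Design slide): the last step of the pipeline turns a probability distribution over malware families into a set of tasks by summing, for each task, the probabilities of the families associated with it, and keeping tasks that reach 0.5. The Python below is only an illustration of that aggregation under stated assumptions. The function name `infer_tasks`, the family names, the task names, and the probabilities are all hypothetical, and the family distribution is taken as given rather than produced by the instance-based model.

```python
from collections import defaultdict


def infer_tasks(family_probs, family_tasks, threshold=0.5):
    """Aggregate family probabilities into task probabilities.

    family_probs: dict mapping family name -> probability that the sample
                  belongs to that family (assumed to be supplied upstream).
    family_tasks: dict mapping family name -> set of tasks that family performs.
    Returns a dict of task -> summed probability for tasks at or above threshold.
    """
    task_scores = defaultdict(float)
    for family, prob in family_probs.items():
        # Each task of a family receives that family's probability mass.
        for task in family_tasks.get(family, ()):
            task_scores[task] += prob
    return {task: score for task, score in task_scores.items() if score >= threshold}


# Hypothetical knowledge-base fragment and family distribution.
family_tasks = {
    "family_a": {"keylogging", "beacon"},
    "family_b": {"beacon", "hash_dump"},
    "family_c": {"hash_dump"},
}
family_probs = {"family_a": 0.6, "family_b": 0.3, "family_c": 0.1}

predicted = infer_tasks(family_probs, family_tasks)
print(sorted(predicted))
```

In this made-up example, "hash_dump" is dropped because the families that perform it together carry less than 0.5 of the probability mass, while "beacon" is kept because two families contribute to it.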