Malicious Information Gathering
Caezar

Introduction
- By creating mobile agents that communicate and make decisions, a vast number of “secrets” can be learned without triggering the target system’s alarms
- Presented by Caezar, a network security research developer

History
- Knowing your opponent’s strategy has always been critical to success
- Since the advent of radio transmission, opposing groups have monitored each other’s strategic communications
- Today the medium has shifted to the Internet, but the espionage concepts are identical

Topics of Discussion
- Distributed Hacking
- Agents
- Information Representation
- Goal Seeking
- Mobility
- Application

Distributed Hacking
- Performs identically to common hacking
- Appears to come from many Internet addresses
- Very difficult to stop, but fairly easy to detect
- Recent examples include the Trinoo and TFN attacks on major web sites

Distributed Hacking
- Each node is a module of the central command
- As the distance between nodes increases, communication latency becomes extremely difficult to manage
- The central command decides which action to take, so the flexibility of the network decreases as its size increases

Agents
- An agent is a person or thing that acts on your behalf, so that the authority does not need to be present
- Normally, agents make the best available decision without consulting the authority
- Only in exceptional cases does an agent halt to wait for orders or permission

Agents
- A human food-delivery agent, for instance, will not halt at a closed road; instead, he will choose another route and continue to perform his tasks
- These are often called “autonomous agents” in artificial intelligence (AI) research, but I will refer to them simply as “agents”

Agents
- To be effective, computer agents need sensors, actors and a management unit
- For information gathering, the sensors are packet filters and pattern matchers
- The actors simply communicate interesting data from the sensors back to the Authority Agent, or to an intermediary Director Agent

Agents
- The management unit receives requests from other Agents or Director Agents and prioritizes them, possibly declining to perform the service
- Because agents should ideally be small, we use a simple rule set with a busy indicator to decide when to accept or reject requests

Agents vs. Modules
- Modules have a predefined purpose and use; agents are multipurpose and have interchangeable uses
- Modules perform their tasks immediately; agents may schedule the task for later completion
- Agents can choose to reject a request; modules cannot make that choice

Information Representation
- For the system of agents to communicate effectively, they must agree on a format for representing their collected data
- If the collected data is going to be used to mount a network attack on a particular target, the agents should be able to identify weaknesses and catalog them with a Librarian for future use by a Director

Information Representation
- To be cataloged and retrieved, learned information must be consistently and uniquely named
- XML and ASN.1 both make good information descriptions
- Both can be used easily with SQL-based exploit libraries by a Librarian Agent

Goal Seeking
- Bruce Schneier of Counterpane Systems recently elaborated on attack trees in Dr. Dobb’s Journal, December 1999
- The attack tree model is an excellent pattern against which the data collected by the Librarian can be compared
- Each fully matching branch indicates an available method of attack

Goal Seeking
- Creating an attack tree for every desirable goal allows each goal to be broken into small pieces and managed by a Director Agent
- Nodes shared between trees can be applied to several trees without wasted effort
- With a good set of attack trees and agent tools, a Director can find and exploit security flaws without intervention

Goal Seeking
- Given several goals, the management unit can schedule the agent to fulfill many requests without requiring further communication with the network
- Director Agents divide larger tasks into smaller ones and
distribute the work among available agents

Goal Seeking
- An example is port-scanning a Class C network
- The task can be divided between Directors, who then subdivide the job between their agents
- Each agent makes only a small number of requests
- Taken as a whole, this example agent network makes a complete scan

Mobility
- By adding an exploit sensor to every agent, literally thousands of infectable machines can be found
- If a Librarian Agent provides exploit code to the agents, those found machines become hacked machines
- If the exploit code starts an agent running, then the agent network grows of its own volition

Mobility
- For the rest of this presentation, assume that a particular target network is being attacked; for instance, “bigcompany.com”
- We do not want to infect the whole Internet to attack or monitor the target
- By restricting the agent to infect only machines “closer” to the target, we eliminate the radical growth problem

Application
- To bring all of this material together, we need the following components:
  - A goal like “Collect all of the e-mail to or from bigcompany.com”
  - A library of exploit code designed to inject a running agent into currently popular systems
  - A Librarian Agent capable of searching the database and returning code matched to a requested target

Application
- Component list (continued):
  - E-mail capture code with source and destination filters (e.g. dsniff by Dug Song)
  - A Director Agent to manage agent communication, e.g. with the Librarian Agent
  - A communication channel; direct TCP for simplicity

Application
- Identify a large list of initial nodes, perhaps by scanning for known Trojans
- Insert Librarian and Director Agents
- Communicate the initial target list to the Director
- As the Director spawns new agents, they begin to feed data back regarding more potential targets, network layout, adjacency, etc.
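
The two-level division of work described on the Goal Seeking slide (a Class C range split between Directors, then between each Director's agents) comes down to simple partitioning. The sketch below shows only that arithmetic and sends no network traffic. The function names, the counts of Directors and agents, and the documentation-only range 192.0.2.0/24 are assumptions made for illustration, not details from the presentation.

```python
import ipaddress


def split_evenly(items, parts):
    """Split a list into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def plan_work(network, directors, agents_per_director):
    """Return {director_index: [agent_chunk, ...]} covering every address exactly once."""
    addresses = list(ipaddress.ip_network(network))
    plan = {}
    for d, director_share in enumerate(split_evenly(addresses, directors)):
        plan[d] = split_evenly(director_share, agents_per_director)
    return plan


# Hypothetical sizing: 4 Directors with 8 agents each over a 256-address range.
plan = plan_work("192.0.2.0/24", directors=4, agents_per_director=8)
covered = [addr for chunks in plan.values() for chunk in chunks for addr in chunk]
print(len(covered), len(plan[0][0]))  # prints "256 8"
```

With these assumed numbers each agent is responsible for only 8 addresses, which is the point the slide makes: every individual agent does very little, yet the plan as a whole covers the full range.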
Application
- The agent network surrounds the target
- Depending on the quality of the exploit library, it will be able to monitor portions of the target’s network traffic
- The ability to attack routers, ISP hosts and other “hard” targets is critical to the success of this sort of attack

What This Means
- Distributed information gathering, especially agent-based, can do much of the work of network mapping software without triggering your IDS
- Corporate and government security models must consider simple TCP services like SMTP, POP3 and HTTP as totally insecure unless explicitly audited

What This Means
- Assume that mobile, autonomous agents are already available to the intelligence community
- Assume that attackers do not need to probe your network before attacking
- They can listen long enough before attacking to passively identify your systems and servers

What This Means
- With the introduction of automated distributed attacks, it is reasonable to expect that Artificial Intelligence and Superficial Intelligence will pose new threats to online security
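
The attack tree comparison described under Goal Seeking can be illustrated with a small sketch, which is equally useful to a defender auditing which branches an observer could already satisfy. It assumes a tree of AND/OR nodes whose leaves are named conditions, and a set of collected facts standing in for the Librarian's catalog. The `Node` class, the function name, and the toy safe-opening tree are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str = "LEAF"  # one of "LEAF", "AND", "OR"
    children: list = field(default_factory=list)


def satisfied(node, facts):
    """Return True if this subtree is fully matched by the known facts."""
    if node.kind == "LEAF":
        return node.name in facts
    results = [satisfied(child, facts) for child in node.children]
    # AND nodes need every child matched; OR nodes need at least one.
    return all(results) if node.kind == "AND" else any(results)


# Toy tree with illustrative names: the goal is reachable via any OR branch.
tree = Node("open safe", "OR", [
    Node("pick lock"),
    Node("learn combination", "OR", [
        Node("find written combination"),
        Node("eavesdrop", "AND", [
            Node("listen to conversation"),
            Node("get target to state combination"),
        ]),
    ]),
])

facts = {"listen to conversation", "get target to state combination"}
open_branches = [child.name for child in tree.children if satisfied(child, facts)]
print(satisfied(tree, facts), open_branches)  # prints "True ['learn combination']"
```

Each top-level branch that evaluates to true corresponds to what the slides call a fully matching branch. Removing either fact breaks the AND node, so the same check shows which single condition closes a branch.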