Heat-seeking Honeypots: Design and Experience

Reporter: 鄭志欣
Advisor: Hsing-Kuo Pao
Date: 2011/05/26
Conference
• John P. John, Fang Yu, Yinglian Xie, Arvind Krishnamurthy and Martin Abadi. "Heat-seeking Honeypots: Design and Experience." 20th International World Wide Web Conference (WWW 2011).
Outline
• Introduction
• System Design
• Result
• Discussion
• Conclusions
Introduction
• Many malicious activities rely on compromised Web servers
– Phishing, malware pages, open proxies
– Vulnerable servers
– Roughly 90% of Web attacks involve compromised legitimate sites
• Understanding the following would be of great value:
– How do attackers identify Web servers running vulnerable applications?
– How do they compromise them?
– What subsequent actions do they perform on these servers?
Introduction
• Honeypot
– Client-based
• Visits suspicious servers
• Executes malicious binaries
– Server-based
• Passive
• Waits for attackers
• Challenges
– How to effectively get attackers to target these honeypots?
– How to select which Web applications to emulate?
Our System
• Our system: heat-seeking honeypots
– Actively attract attackers
– Dynamically generate and deploy honeypot pages
– Analyze logs to identify attack patterns
System Design
• Obtaining attacker queries
• Creation of honeypot pages
• Advertising honeypot pages to attackers
• Detecting malicious traffic
Architecture
[Figure: system architecture]
Obtaining attacker queries
• Attackers can find vulnerable Web servers in two ways:
– Perform brute-force port scanning on the Internet.
– Make use of Internet search engines.
• Example query for a PHP vulnerability: phpizabi v0.848b c1 hfp1
• Bing logs
– Queries are grouped into patterns
– SBotMiner: inurl:/includes/joomla.php [a-z]{3,7}
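How a query group like the one above could be checked is sketched below. This is a minimal sketch, not SBotMiner itself: it assumes the pattern on the slide is an ordinary regular expression in which only the trailing [a-z]{3,7} varies, and the function name and the sample queries are hypothetical.

```python
import re

# Pattern from the slide. The literal prefix is escaped so that only the
# trailing [a-z]{3,7} part is interpreted as a regular expression.
JOOMLA_QUERY = re.compile(
    re.escape("inurl:/includes/joomla.php ") + r"[a-z]{3,7}"
)


def in_query_group(query: str) -> bool:
    """Return True if the whole query matches the attacker query pattern."""
    return JOOMLA_QUERY.fullmatch(query) is not None


# Hypothetical sample queries, for illustration only.
samples = [
    "inurl:/includes/joomla.php abc",
    "inurl:/includes/joomla.php toolongword",
    "joomla templates",
]
matches = [q for q in samples if in_query_group(q)]
```

Using fullmatch rather than search keeps ordinary queries that merely mention the same path out of the group.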
Creation of honeypot pages
• How do we create an appropriate honeypot?
– (a) Install vulnerable Web software
• Pros: shows how the attacker interacts with and compromises the software
• Cons: needs a domain expert to set up the software
– (b) Set up Web pages matching the query
• Pros: pages look similar to the ones created by real software; can be automated
• Cons: fewer interactions (limited depth of attack)
– (c) Set up proxy pages
• Pros: combines the advantages of (a) and (b)
• Cons: could forward malicious attacks to real servers
Creation of honeypot pages
• In our deployment, we choose a combination of options (a) and (b)
– Issue each query to search engines (Bing and Google)
– Emulate the top three results
– A crawler fetches the Web pages at these URLs
– Strip JavaScript and rewrite all the links on the page
– Ex: http://path/to/honeypot/includes/joomla.php
• A few common Web applications are installed in separate VMs
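The link-rewriting step can be illustrated with a short sketch. It assumes that a rewritten link keeps the original URL path and query string under the honeypot root. The function name and example.com are hypothetical, and http://path/to/honeypot is the placeholder used on the slide, not a real address.

```python
from urllib.parse import urljoin, urlsplit

# Placeholder root taken from the slide's example URL.
HONEYPOT_ROOT = "http://path/to/honeypot"


def to_honeypot_url(page_url: str, href: str, root: str = HONEYPOT_ROOT) -> str:
    """Map a link found on a crawled page to the same path under the honeypot."""
    absolute = urljoin(page_url, href)      # resolve relative links first
    parts = urlsplit(absolute)
    local = root.rstrip("/") + parts.path   # keep the original path
    if parts.query:
        local += "?" + parts.query          # keep the original query string
    return local


# Hypothetical crawled page and links, for illustration only.
page = "http://example.com/includes/joomla.php"
rewritten = [
    to_honeypot_url(page, "/includes/joomla.php"),
    to_honeypot_url(page, "index.php?id=1"),
]
```

Keeping the path intact means that a query such as inurl:/includes/joomla.php can still match the honeypot copy of the page.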
Advertising honeypot pages to attackers
• Ideally, we want our honeypot pages to appear in the top results of all malicious searches (this would need help from a major search engine).
• In our deployment
– Boost the chance of honeypot pages being returned
– Add surreptitious links pointing to our honeypot pages on other public Web pages (e.g., the authors' homepages)
• Similar to search engine optimization (SEO)
Detecting malicious traffic
• Identifying crawlers
– Characterizing the behavior of known crawlers
– Identifying unknown crawlers
• Identifying malicious traffic
Identifying crawlers
• Well-known: Google’s crawler uses Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
• Characterizing the behavior of known crawlers
– We identify a few known crawlers by looking at the user agent string and verifying that the IP address belongs to the claimed search engine.
– A single search engine uses multiple IP addresses to crawl pages, so addresses are grouped by AS.
• To distinguish static links (honeypot pages) from dynamic links (real Web software)
– Each dynamic link is accessed by only one crawler.
– Example: /ucp.php?mode=register&sid=1f23...e51a1b
– Generalized with AutoRE: /ucp.php mode=register sid=[0-9a-f]{32}
– Dynamic links: E
– Static links: C
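One way to test whether a URL falls under a generalized dynamic-link pattern is sketched below. It assumes a pattern can be stored as a path plus one regular expression per query parameter, which is a hypothetical representation and not AutoRE's own output format. The 32-character sid in the example is made up, since the slide elides the real one.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical encoding of the slide's pattern:
#   /ucp.php mode=register sid=[0-9a-f]{32}
DYNAMIC_PATTERNS = [
    ("/ucp.php", {
        "mode": re.compile(r"register"),
        "sid": re.compile(r"[0-9a-f]{32}"),
    }),
]


def is_dynamic(url: str) -> bool:
    """Return True if the URL matches one of the dynamic-link patterns."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    for path, param_res in DYNAMIC_PATTERNS:
        # The path and the set of parameter names must both agree.
        if parts.path != path or set(params) != set(param_res):
            continue
        if all(rx.fullmatch(params[name]) for name, rx in param_res.items()):
            return True
    return False


# Made-up session id, for illustration only.
example = "/ucp.php?mode=register&sid=" + "0123456789abcdef" * 2
example_is_dynamic = is_dynamic(example)
```

Matching on parameter names and value shapes, rather than on the exact URL, lets one pattern cover every session id a crawler might be handed.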
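Identifying crawlers (cont.)
• Identifying unknown crawlers
– Identify other IP addresses that behave similarly to known crawlers
• "Similar" is defined in two parts, where P is the set of all links the IP accesses, E the dynamic links, and C the static links:
– First, the IP must access a large fraction of the static links: |P ∩ C| / |C| ≥ K
– Second, every link it accesses outside C must be a dynamic link: P − C ⊆ E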
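The two-part similarity test can be sketched as follows. The sketch reads the first condition as the fraction of static links C that the IP covers, and the second as "everything accessed outside C matches a dynamic-link pattern in E". That is one interpretation of the slide, not necessarily the authors' exact definition. The default K = 0.75 is the threshold chosen later in the talk; the pattern, function names, and sample links are hypothetical.

```python
import re

# Hypothetical dynamic-link pattern standing in for the set E.
DYNAMIC = [re.compile(r"/ucp\.php\?mode=register&sid=[0-9a-f]{32}")]


def matches_dynamic(url: str) -> bool:
    """Return True if the URL matches any dynamic-link pattern."""
    return any(rx.fullmatch(url) for rx in DYNAMIC)


def looks_like_crawler(visited, crawlable, k: float = 0.75) -> bool:
    """Apply both similarity conditions to one IP address (or AS group)."""
    visited, crawlable = set(visited), set(crawlable)
    if not crawlable:
        return False
    coverage = len(visited & crawlable) / len(crawlable)  # first condition
    extras = visited - crawlable                          # links outside C
    return coverage >= k and all(matches_dynamic(u) for u in extras)


# Hypothetical static links and visit sets, for illustration only.
C = {"/a.php", "/b.php", "/c.php", "/d.php"}
crawler_like = looks_like_crawler({"/a.php", "/b.php", "/c.php"}, C)
attacker_like = looks_like_crawler({"/a.php", "/b.php", "/c.php", "/admin.php"}, C)
```

A visitor that covers most of C but also requests a link outside both C and E fails the second condition, so it is not treated as a crawler.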
Identifying crawlers (cont.)
• Identifying malicious traffic
– Heat-seeking honeypots (static pages) attract attacker visits.
– From the honeypot logs, attackers are
• not targeting these static pages
• trying to access non-existent files or private files
• Whitelist
– Contains honeypot pages, real software pages, and favicon.ico.
– Requests for links outside the whitelist are suspicious.
– Blacklist-based approaches need human operators or security experts.
– The whitelist approach is automated and can be applied to different types of software.
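The whitelist check itself is simple enough to sketch in a few lines. The log format (pairs of IP address and requested path), the whitelist entries, and the function name are assumptions made for illustration; the addresses come from a documentation-only range.

```python
def suspicious_visits(log, whitelist):
    """Return the (ip, path) entries whose path is not on the whitelist."""
    return [(ip, path) for ip, path in log if path not in whitelist]


# Hypothetical whitelist and log entries, for illustration only.
whitelist = {"/includes/joomla.php", "/index.php", "/favicon.ico"}
log = [
    ("192.0.2.1", "/index.php"),
    ("192.0.2.7", "/admin/config.php.bak"),
    ("192.0.2.7", "/favicon.ico"),
]
flagged = suspicious_visits(log, whitelist)
malicious_ips = {ip for ip, _ in flagged}
```

Because the whitelist is built from the pages that were actually deployed, nobody has to enumerate attack signatures in advance, which is the contrast with blacklists drawn on the slide.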
Result
• Duration: 3 months
• Location: under a personal home page at the University of Washington CS department
• 96 automatically generated honeypot Web pages
• 4 manually installed Web application software packages
• 54,477 visits from 6,438 distinct IP addresses
Result
• Distinguishing malicious visits
• Properties of attacker visits
• Comparing honeypots
• Applying whitelists to the Internet
Distinguishing malicious visits
• Popular search engine crawlers: Google, Bing and Yahoo
• Links visited by only one crawler are dynamic links in the software.
[Figure annotation: low pagerank]
Crawler visit
We choose K = 75%
Attack visits to each honeypot page
[Figure annotation: Joomla]
Properties of attacker visits
[Figure annotation: 0.1 aggressive IP]
Geographic locations & Discovery time
Discovery time: we calculate the number of days between the first crawl of the page by a search crawler and the first visit to the page by an attacker.
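The discovery-time definition is a date difference, sketched below under the assumption that each visit to a page has already been labeled as coming from a crawler or an attacker. The log format, labels, function name, and dates are hypothetical.

```python
from datetime import date


def discovery_time_days(visits):
    """Days from the first crawler visit to the first attacker visit.

    `visits` is a list of (date, kind) pairs, with kind being either
    "crawler" or "attacker". Returns None if either kind is missing.
    """
    first_crawl = min((d for d, kind in visits if kind == "crawler"), default=None)
    first_attack = min((d for d, kind in visits if kind == "attacker"), default=None)
    if first_crawl is None or first_attack is None:
        return None
    return (first_attack - first_crawl).days


# Hypothetical visit log for one honeypot page, for illustration only.
visits = [
    (date(2011, 1, 10), "crawler"),
    (date(2011, 1, 12), "crawler"),
    (date(2011, 1, 25), "attacker"),
    (date(2011, 2, 3), "attacker"),
]
days = discovery_time_days(visits)
```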
Comparing Honeypots
• 1. Web server
– No hostname
– Just IP
– No hyperlinks
• 2. Vulnerable software
– Links to them on Web sites
– Search engine can find them
• 3. Heat-seeking honeypot pages
– Emulate Vulnerable pages
– Search engine can find them
Comparison of the total number of visits and the number of distinct IP addresses
Attack types
Applying whitelists to the Internet
[Figure annotation: 0.25]
Discussion
• Detectability of heat-seeking honeypots
– Attackers may detect client-based honeypots
– Full versions of the software packages are not installed
• Attracting more attacks
– Deploy on PlanetLab (different domains)
• Improving reaction times
– Cooperation with search engines
• Whitelist
– Administrators can use it to secure Web applications
Conclusion
• In this paper, we present heat-seeking
honeypots, which deploy honeypot pages
corresponding to vulnerable software in order
to attract attackers.
• Further, our system can detect malicious IP addresses solely through their Web access patterns.
• The false-negative rate is at most 1%.
Thank You