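P2P Security

Chapter contents
6.1 Anonymity
6.2 Reputation and trust
6.3 File pollution
6.4 Routing security
6.5 Frontier security research

6.1 Anonymity
• The roots of "anonymity" can be traced back to secure hash functions. It covers sender anonymity, receiver anonymity, file-identifier anonymity, relationship anonymity, and so on.
• P2P systems lend themselves naturally to anonymity.

Anonymity
• Author anonymity: the identity of the author of a shared resource cannot be learned by an interested party, that is, the identity cannot be linked to the shared resource.
• Publisher anonymity: the publisher's identity cannot be linked to the resources it provides.
• Reader (requester) anonymity: anyone in the network may read a resource, but information about the reader is neither disclosed nor discoverable.
• Server anonymity: the server's information cannot be linked in any way to the resources it provides.
• Document anonymity: the server itself does not know what content it stores.
• Query anonymity: the server knows the ID of the requested resource, but no third party can confirm that this ID is correct.

Anonymity methods
• Anonymous proxy: users send messages through an anonymizing proxy. The proxy's own security is a concern, and the proxy is a system bottleneck that is easy to attack.
• Mix-net: users reach the server through a set of mix relays. The core relay nodes are a security risk (Tor).
• Random relaying: Freenet, Tarzan (thorough and detailed in theory, but extremely hard to implement), and others.

Mix relay networks: onion routing
• Tor, a widely used anonymizing proxy service, is based on onion routing.
• The user runs an onion proxy on the local machine. The proxy periodically communicates with other Tor nodes and so builds a virtual circuit through the Tor network. To client programs the onion proxy presents a SOCKS interface, so applications can use Tor as a proxy server and send their traffic over the virtual circuit.
• Tor encrypts at the application layer of the seven-layer protocol stack, following the "onion" pattern. The name comes from the structure: like an onion, only the outer skin is visible, and reaching the core requires peeling it off layer by layer.
• Traffic between routers is encrypted with symmetric keys, which gives a layered structure. Each node on the path acts like one layer of onion skin wrapped around the client. This protects the origin of the message and keeps communication between onion routers secure.

Onion routing (a real-time MIX network)
• General-purpose infrastructure for anonymous communications over a public network (e.g., the Internet)
• Supports several types of applications (HTTP, FTP, SMTP, rlogin, telnet, …) through the use of application-specific proxies
• Anonymous connections through onion routers are built dynamically to carry application data
• Distributed, fault tolerant, and secure

Onion routing: network setup and operation
• Long-term socket connections between "neighboring" onion routers are established (links).
• The two neighbors on a link set up two DES keys using the Station-to-Station protocol, one key in each direction, to secure the link.
• Several anonymous connections can be multiplexed on one link. Each anonymous connection is then assigned an ACI, an identifier that is local to the link.
• Messages are carried ATM-style, split into fixed-size 48-byte cells. Cells are encrypted with DES. Cells from different connections are mixed on the link, but the order within each connection is preserved.
[Figure: numbered cells from several connections being mixed onto one link]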
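The cell-based multiplexing on a link can be sketched as follows. This is only an illustrative Python sketch under stated assumptions: the function names `to_cells`, `mix`, and `demux` are hypothetical, the ACI is carried as a tag next to a 48-byte payload, the last cell is zero-padded, round-robin interleaving stands in for the mixing step, and link encryption (DES on the slide) is left out entirely.

```python
from itertools import zip_longest

CELL_SIZE = 48  # bytes per cell, as stated on the slide


def to_cells(aci, payload):
    """Split payload into fixed-size cells tagged with the connection's ACI.

    Assumption: the last cell is zero-padded to CELL_SIZE.
    """
    cells = []
    for offset in range(0, len(payload), CELL_SIZE):
        chunk = payload[offset:offset + CELL_SIZE]
        cells.append((aci, chunk.ljust(CELL_SIZE, b"\x00")))
    return cells


def mix(*streams):
    """Interleave the cells of several connections round-robin.

    Cells of different connections are mixed, but the relative order
    of the cells of any one connection is preserved.
    """
    mixed = []
    for group in zip_longest(*streams):
        mixed.extend(cell for cell in group if cell is not None)
    return mixed


def demux(link_cells):
    """Group the cells seen on a link by ACI, keeping arrival order."""
    by_aci = {}
    for aci, data in link_cells:
        by_aci.setdefault(aci, []).append(data)
    return by_aci


conn_a = to_cells(12, b"A" * 100)  # three cells
conn_b = to_cells(8, b"B" * 48)    # one cell
link = mix(conn_a, conn_b)
print([aci for aci, _ in link])
```

Because every cell has the same length, an observer of the link cannot tell connections apart by packet size; the ACI is what lets the receiving router sort the cells back into connections.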
Overview of architecture
[Figure: application (initiator), application proxy, onion proxy, entry funnel, onion routers joined by long-term socket connections, exit funnel, application (responder)]
• application proxy
  – prepares the data stream for transfer
  – sanitizes application data
  – processes the status message sent by the exit funnel
• onion proxy
  – opens the anonymous connection via the OR network
  – encrypts/decrypts data
• entry funnel
  – multiplexes connections from onion proxies
• exit funnel
  – demultiplexes connections from the OR network
  – opens the connection to the responder application and reports a one-byte status message back to the application proxy

Onions
• An onion is a multi-layered data structure that encapsulates the route of the anonymous connection within the OR network.
• Each layer contains:
  – backward crypto function (DES-OFB, RC4)
  – forward crypto function (DES-OFB, RC4)
  – IP address and port number of the next onion router
  – expiration time
  – key seed material, used to generate the keys for the backward and forward crypto functions
• Each layer is encrypted with the public key of the onion router for which the data in that layer is intended.
• Layers, outermost first:
  bwd fn | fwd fn | next = blue | keys
  bwd fn | fwd fn | next = green | keys
  bwd fn | fwd fn | next = 0 | keys

Anonymous connection setup illustrated
[Figure sequence: the onion travels from the onion proxy toward the responder application; each onion router removes its layer and records the state below]
  router | bwd | fwd
  magenta | entry funnel, crypto fns and keys | blue, ACI = 12, crypto fns and keys
  blue | magenta, ACI = 12, crypto fns and keys | green, ACI = 8, crypto fns and keys
  green | blue, ACI = 8, crypto fns and keys | exit funnel
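The layered structure of an onion can be illustrated with a short Python sketch. It rests on loudly stated assumptions: the slides encrypt each layer with the router's public key, whereas this sketch substitutes a toy, insecure XOR keystream built from SHA-256 so that it runs with the standard library alone; the names `keystream_xor`, `build_onion`, and `peel`, the JSON layer encoding, and the sample keys are all hypothetical, and the crypto-function fields and expiration time of a real layer are omitted.

```python
import hashlib
import json


def keystream_xor(key, data):
    """Toy cipher: XOR data with SHA-256(key || offset) blocks. NOT secure.

    Stand-in for the per-router public-key encryption on the slide.
    Applying it twice with the same key restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def build_onion(route, router_keys):
    """Wrap one layer per router, innermost (last router) first."""
    onion = b""
    for i in range(len(route) - 1, -1, -1):
        # None plays the role of "next = 0" for the last router.
        next_hop = route[i + 1] if i + 1 < len(route) else None
        layer = json.dumps({"next": next_hop, "inner": onion.hex()}).encode()
        onion = keystream_xor(router_keys[route[i]], layer)
    return onion


def peel(onion, key):
    """Remove one layer: returns (next hop, remaining onion)."""
    layer = json.loads(keystream_xor(key, onion).decode())
    return layer["next"], bytes.fromhex(layer["inner"])


keys = {"magenta": b"key-m", "blue": b"key-b", "green": b"key-g"}
onion = build_onion(["magenta", "blue", "green"], keys)
hop, rest = peel(onion, keys["magenta"])
print(hop)  # the first router learns only its successor
```

Each router can read only its own layer, so it learns the next hop and nothing about the rest of the route.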
Tarzan: a P2P anonymizing network layer
• Tarzan (known in Chinese as 泰山) is a P2P anonymous IP network overlay. It achieves anonymity through layered encryption of data and multi-hop routing of messages.
• Tarzan extends the mix-net approach to the P2P setting: nodes communicate through a sequence of relay nodes, and this sequence forms a tunnel.
• It provides sender anonymity, receiver anonymity, and relationship anonymity (the fact that two nodes are communicating cannot be discovered by other nodes).
• Tarzan: A Peer-to-Peer Anonymizing Network Layer, ACM CCS 2002. http://pdos.lcs.mit.edu/tarzan/

Anonymity
• A participant can communicate anonymously with a non-participant
• The user can talk to CNN.com
• Nobody knows who the user is

The Vision for Anonymization
• Thousands of nodes participate
• Bounce traffic off one another
• Mechanism to organize nodes: peer-to-peer
• All applications can use it: IP layer

Alternative 1: Proxy Approach
• Intermediate node to proxy traffic
• Completely trust the proxy
• Example: Anonymizer.com

Threat model
• Corrupt proxy(s)
  – Adversary runs proxy(s)
  – Adversary targets proxy(s) and compromises, possibly adaptively
• Network links observed
  – Limited, localized network sniffing
  – Wide-spread (even global) eavesdropping
  – e.g., Carnivore, Chinese firewall, ISP search warrants

Failures of Proxy Approach
• Proxy reveals identity
• Traffic analysis is easy
• CNN blocks connections from proxy
• Adversary blocks access to proxy (DoS)

Alternative 2: Centralized Mixnet
• MIX encoding creates encrypted tunnel of relays
  – Individual malicious relays cannot reveal identity
• Packet forwarding through tunnel
• Examples: Onion Routing, Freedom
  – Small-scale, static network

Failures of Centralized Mixnet
• CNN blocks core routers
• Adversary targets core routers
• So, add cover traffic between relays
  – Hides data traffic among cover
• Still allows network-edge analysis
  – Internal cover traffic does not protect edges
  – External cover traffic prohibitively expensive? (n^2 communication complexity)

Tarzan: Me Relay, You Relay
• Thousands of nodes participate
  – CNN cannot block everybody
  – Adversary cannot target everybody
• Cover traffic protects all nodes
  – Global eavesdropping gains little info

Benefits of Peer-to-Peer Design
• Thousands of nodes participate
• Cover traffic protects all nodes
• All nodes also act as relays
  – No network edge to analyze
  – First hop does not know he's first

Tarzan: Joining the System
1. Contacts known peers to learn neighbor lists
2. Validates each peer by directly pinging

Tarzan: Generating Cover Traffic
• Nodes begin passing cover traffic with mimics:
  – Nodes send at some traffic rate per time period
  – Traffic rate independent of actual demand
  – All packets are same length and link encrypted

Tarzan: Selecting tunnel nodes
[Figure: a tunnel from the user to the PNAT]
• To build a tunnel: iteratively selects peers and builds the tunnel from among the last hop's mimics

But, Adversaries Can Join System
• Adversary can join more than once by spoofing addresses outside its control
  – Contact peers directly to validate IP addr and learn PK
• Adversary can join more than once by running many nodes on each machine it controls
  – Randomly select by subnet "domain" (/16 prefix, not IP)

Tarzan network security model
• One router may hold many IP addresses and so operate many virtual Tarzan nodes. Tarzan therefore defines the notion of a domain, which identifies a subnet controlled by a single malicious party.
• As the figure shows, a malicious router controls its entire domain (subnet). An ordinary malicious node cannot control the whole domain, but it can eavesdrop on the traffic of the other nodes in the domain.
• Domains are usually delimited by the /16 or /24 prefix of the node's IP address.

Tarzan architecture
[Figure: Tarzan architecture]

6.2 Reputation and trust
• Anonymity hides network behavior. "Reputation" is the opposite: it rewards "good" network behavior.
• "Trust" is usually built on "reputation", and the two are often not distinguished.
• BitTorrent's choking algorithm is in essence a reputation mechanism. Research on anonymity and trust is currently receiving considerable attention.

Issues in designing a P2P reputation and trust system
• The system must be self-policing: it defines shared norms of behavior and reputation/trust measures for its users, and the users follow and reinforce these norms overall even without centralized authentication or an authoritative third party.
• The system must be anonymous: a user's reputation should be tied to an opaque ID.
• Newcomers should receive no extra advantage: a user's reputation must be measured by behavior over many transactions.
• The overhead added by the reputation/trust mechanism should be kept as small as possible.
• The system should tolerate malicious nodes well.

Related work
• CCS 2002 [Damiani et al., 2002]: a reputation-based method for selecting reliable resources in a P2P network. Before downloading a resource, each peer assesses its reliability through a distributed polling algorithm, which limits the spread of malicious resources in the network.
• ACM Conference on Electronic Commerce [Xiong and Liu, 2003]: PeerTrust, a reputation-based trust model for P2P e-commerce communities. The model quantifies and compares the trustworthiness of peers based on transaction feedback.
• BitTorrent's choking algorithm is an implicit reputation method, but it considers only the current download and ignores past behavior.

EigenTrust: a complete P2P reputation management algorithm
• EigenTrust, WWW 2003 [Kamvar et al., 2003], Stanford University
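• EigenTrust computes trust values from the eigenvector of the matrix of satisfaction scores among users.
• Basis of the trust values:
  – After every transaction the users rate each other. For example, after user i downloads a file from user j (the download may also fail), user i rates the transaction with a trust value tr(i,j): 1 for success, -1 for failure (the download failed or the file was not the one wanted).
  – User i's rating of user j accumulated over their history is called the satisfaction, written $s_{ij}$.

How EigenTrust collects and computes trust values
• Transitive trust (friends of friends): user i trusts the users who gave it correct downloads, and therefore also trusts the trust values those users report.
• Normalizing the satisfaction: normalization keeps malicious nodes from giving other nodes excessively high or low ratings. The normalized trust value is written $c_{ij}$, with $\sum_j c_{ij} = 1$.

The Math
• Peer i's trust in peer k is passed on through its friends' trust in k. Ask your friends j what they think of peer k, and weight each friend's opinion by how much you trust him:
  $c'_{ik} = \sum_j c_{ij} c_{jk}$
• In matrix form:
  $c'_i = C^T c_i$
  where $C$ is the matrix $[c_{ij}]$, $C^T$ is its transpose, and $c_i$ is the vector of the values $c_{ij}$.
• Ask your friends: $t = C^T c_i$.
• Ask your friends' friends: $t = (C^T)^2 c_i$.
• Ask n times: $t = (C^T)^n c_i$. The larger the number of steps n, the broader the set of opinions and the more accurate the result.
• It can be shown that for large n the trust vector $t_i$ of every user i converges to the left principal eigenvector of the matrix $C$.
• In other words, in the EigenTrust model t is a global vector, and each element $t_j$ is the trust that the system as a whole places in user j.
• Therefore, each peer doesn't have to store and compute its own trust vector. The whole network can cooperate to store and compute t.

Simple, non-distributed algorithm
• Initialize: $t^{(0)}$ (in [Kamvar et al., 2003], the uniform vector $e$ with $e_i = 1/n$)
• Repeat until convergence: $t^{(k+1)} = C^T t^{(k)}$

Simple algorithm pseudocode
[Figure: pseudocode of the simple algorithm]

Distributed Algorithm
The algorithm below ignores lying or dishonest peers for now. For the detailed algorithm and its analysis, see the paper [Kamvar et al., 2003].
For each peer i:
  – First, ask peers who know you for their opinions of you.
  – Repeat until convergence:
    · Compute current trust value: $t_i^{(k+1)} = c_{1i} t_1^{(k)} + \dots + c_{ni} t_n^{(k)}$
    · Send your opinion $c_{ij}$ and trust value $t_i^{(k+1)}$ to your acquaintances.
    · Wait for the peers who know you to send you their trust values and opinions.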
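The simple, non-distributed iteration $t^{(k+1)} = C^T t^{(k)}$ can be tried out with a few lines of Python. This is a minimal sketch, not the authors' implementation: the names `normalize` and `eigentrust` are hypothetical, negative satisfaction scores are clipped to zero before normalizing and a peer without positive scores trusts everyone equally (both assumptions, following our reading of [Kamvar et al., 2003]), and the iteration starts from the uniform vector.

```python
def normalize(s):
    """Turn satisfaction scores s[i][j] into local trust c[i][j].

    Assumption: negative scores are clipped to 0, so each row of c
    is non-negative and sums to 1.
    """
    n = len(s)
    c = []
    for row in s:
        clipped = [max(x, 0.0) for x in row]
        total = sum(clipped)
        # Assumption: with no positive experience, trust everyone equally.
        c.append([x / total for x in clipped] if total > 0 else [1.0 / n] * n)
    return c


def eigentrust(c, tol=1e-10, max_iter=1000):
    """Power iteration t <- C^T t, starting from the uniform vector."""
    n = len(c)
    t = [1.0 / n] * n
    for _ in range(max_iter):
        # t_next[i] = c[0][i] * t[0] + ... + c[n-1][i] * t[n-1]
        t_next = [sum(c[j][i] * t[j] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(t, t_next)) < tol:
            return t_next
        t = t_next
    return t


# s[i][j]: peer i's accumulated satisfaction with peer j (made-up numbers).
s = [[0, 4, 1],
     [2, 0, 2],
     [3, 1, 0]]
trust = eigentrust(normalize(s))
print(trust)
```

Since every row of $C$ sums to 1, the entries of the trust vector keep summing to 1 across iterations, so the result can be read as the share of global trust held by each peer.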
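EigenTrust with secure score management
[Figure: the secure score-management version of the EigenTrust algorithm]

6.3 File pollution
• File pollution is the phenomenon in which malicious users of a P2P file-sharing network ("polluters") publish fake or even malicious files labeled as popular content, trick other users into downloading them, and exploit the network's free sharing to spread them further.
• Case: in 2003 the company Overpeer succeeded in making polluted files account for more than half of all files on Kazaa/FastTrack, the most popular network at the time. http://www.slyck.com/story1019.html

Titles, versions, copies
• The title is the title of a song/movie/software (the index key)
• A given title can have thousands of versions
• Each version can have thousands of copies

Types of file pollution
(1) Index pollution
• Index pollution injects large numbers of bogus records into the index service of a P2P network. The records point to versions and/or copies that do not exist. A user who tries to download according to these records gets a "cannot connect" message. If enough bogus records are injected, impatient users may give up after a few failed attempts.
• Index pollution can target versions as well as copies. It differs from ordinary version pollution and copy pollution in that the injected index records point to objects that do not exist, so the polluter does not need powerful pollution servers to provide large amounts of upload capacity.

(2) Version pollution
• The polluter first creates a large number of polluted versions, with malicious or wrong content, for one target keyword (or several at once). The polluter then injects the index information of these versions into the target P2P network and serves many downloadable copies from its pollution servers.
• Without effective identification and management mechanisms, users who search for the title are easily drawn to the polluted versions because of their many downloadable copies.
• A user who downloads a polluted version and does not check it promptly will probably share the local copy with other users. The polluted versions then spread widely, may come to outnumber the copies of the correct version, and eventually drown the correct copies, making the title unusable.
• Polluted versions take many forms. For mp3 songs the polluter may truncate the file, insert noise, insert undecodable data segments, or even insert insults. For executables the polluter may insert worms, Trojans, or other malicious code.
• Because shared resources in P2P networks are so diverse, there is hardly any effective automatic way to tell good versions from bad. Version pollution is therefore hard to notice, and in most cases only manual inspection works. The delay of manual inspection lets polluted versions spread not only directly from pollution servers but also, faster and more widely, through the sharing behavior of ordinary users.

File Pollution (Infocom 05)
[Figure: a pollution company turns original content into polluted content; its pollution servers inject it into the file sharing network]
• Unsuspecting users spread pollution! [Figure: Alice, Bob]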
Index Poisoning (Infocom 06)
[Figure: a file sharing network with nodes 123.12.7.98, 23.123.78.6, and 234.8.89.20, and its index]
  title | location
  bigparty | 123.12.7.98
  smallfun | 23.123.78.6
  heyhey | 234.8.89.20
• After poisoning, the index holds an additional record:
  bighit | 111.22.22.22

FastTrack/Kazaa Overlay
• ON = ordinary node, SN = super node
• Each SN maintains a local index

FastTrack Query
[Figure: Alice, ordinary nodes (ON), and a super node (SN)]

FastTrack Download
• HTTP request for hash value [Figure: Bob]
• P2P file transfer

Index Poisoning in FastTrack and Overnet
• FastTrack/Kazaa: advertise to supernodes (target_song, bogus_IP) for many bogus IPs and many versions of target_song
• Overnet/eDonkey: advertise the record (hash_target_keyword, bogus_version_id)

Attacks: How Effective?
• For a given title, what fraction of the "displayed copies" are
  – clean?
  – poisoned?
  – polluted?
• Brute-force approach:
  – attempt to download all versions
  – versions that don't download are poisoned
  – for those versions that download, listen to or watch each one
• How do we determine pollution levels without downloading?

Solution
• Harvest version ids and copy locations
  – FastTrack: crawl
  – Overnet: insert node, receive publish messages
• Heuristic for classifying versions into poisoned, polluted, and clean versions

Copies at Users
[Figure: copies per user in FastTrack and Overnet]
• For certain titles, a tiny fraction of users advertise the majority of the copies

6.4 Routing security
• John R. Douceur. The Sybil Attack. In Proceedings of the IPTPS02 Workshop, Cambridge, MA (USA), March 2002.
• Atul Singh, Miguel Castro, Peter Druschel, and Antony Rowstron. Defending Against Eclipse Attacks on Overlay Networks. In Proceedings of the European SIGOPS Workshop, Leuven, Belgium, September 2004.
• Miguel Castro, Peter Druschel, Ayalvadi Ganesh, Antony Rowstron, and Dan S. Wallach. Secure routing for structured peer-to-peer overlay networks. OSDI 2002. (OSDI is held in even years, SOSP in odd years.)

Sybil Attack
[Figure: Sybil attack]

Why Use Sybil Attack?
• disruption
• for-profit motives: RIAA (Recording Industry Association of America)
• disproportionate access to resources (computation, storage)
• control network

Eclipse Attack
• Overlay network
  – Decentralized graph of nodes on edge of network
  – Each node maintains a neighbor set
  – Typically limited control over membership
• Eclipse attack
  – Malicious nodes conspire to hijack and dominate the neighbor set of correct nodes
  – "Eclipse" correct nodes from each other
  – Control data traffic through routing

Example
[Figure: overlay with nodes A to I]
• C and F control traffic from B to *

Eclipse
[Figure: a solar eclipse]

Secure routing for structured peer-to-peer overlay networks
• Miguel Castro, Peter Druschel, Ayalvadi Ganesh, Antony Rowstron, and Dan S. Wallach
• http://research.microsoft.com/enus/um/people/mcastro/

The problem
• P2P systems: resilient but not secure
• Malicious nodes:
  – fake IDs
  – distort routing table entries
  – prevent correct message delivery
• "Techniques to allow nodes to join, to maintain routing state, and to forward messages securely in presence of malicious nodes"

Sub-problems
• Securely assigning IDs to nodes
  – attacker may capture all replicas for an object
  – attacker may target a particular victim
• Securely maintaining routing tables
  – attackers may populate with faulty entries
  – most messages are routed to faulty nodes
• Securely forwarding messages
  – even with proper routing tables, faulty nodes can corrupt, drop, misroute messages

6.5 Frontier security research
Security conference list (see: computer security conference ranking and statistic):
• CCS: ACM Conference on Computer and Communications Security
• Security: Usenix Security Symposium
• NDSS: ISOC Network and Distributed System Security Symposium
• Sigcomm and Infocom
• IPTPS / IEEE P2P / IEEE Infocom

Paper reading
• CCS 2009: ShadowWalker: Peer-to-peer Anonymous Communication using Redundant Structured Topologies. Prateek Mittal and Nikita Borisov
• IPTPS 2010: Blindfold: A System to "See No Evil" in Content Discovery. Ryan S. Peterson, Bernard Wong, and Emin Gün Sirer, Cornell University and United Networks, L.L.C.
• IPTPS 2010: Strange Bedfellows: Community Identification in BitTorrent. David Choffnes, Jordi Duch, Dean Malmgren, Roger Guimerà, Fabián Bustamante, and Luís A. Nunes Amaral, Northwestern University
• Infocom 2010: Identifying Malicious Nodes in Network-Coding Based Peer-to-Peer Streaming Networks
• ICCCN '09: A Systematic Study on Peer-to-Peer Botnets. Ping Wang, Lei Wu, Baber Aslam, and Cliff C. Zou

P2P Botnet
• IRC, HTTP, P2P
• HTTP -> FFSN
• IRC -> P2P

Botnet Architecture
[Figure: botmaster, bots, recruiting]

Botnets
[Figure: botnet admin, bot, spammer]

P2P Botnet
• Storm (Overnet)
• Nugache
• Waledac

P2P Botnets
• While IRC bots simply connect to their IRC server, P2P bots must follow a series of steps to connect with their P2P network
• The initial P2P bot code contains a list of possible peers and code that attempts to connect the bot with the P2P network
• After the bot joins the network, the peer list is updated
• Then the bot searches the network and downloads the secondary injection code (code that instructs the bot to send spam or perform other malicious activities)

P2P Botnet: Storm
[Figure: Storm]

Effectiveness of Storm
[Figure: effectiveness of Storm, from [Smith08]]

Hybrid P2P Botnet
[Figure: hybrid P2P botnet]

Botnet Construction
• Routing information is built at two moments: new infection and reinfection.
• New infection: when A infects B, A hands its peer list to B. If B is a servent, B is added to A's peer list and A is added to B's peer list.
• Reinfection: when A tries to infect B and both are already servent bots, A and B each add the other to their own peer list. Each then picks R peers at random from its own peer list and sends them to the other, and uses the R peers it receives to enrich its own list. If a peer list is full, entries are replaced.
• This is called the hockey card algorithm.
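For analysis purposes, for example to study how evenly peer lists end up connected, the reinfection exchange described above can be simulated as plain list bookkeeping. The following Python sketch makes several assumptions that the slides do not state: the function names `add_peer` and `reinfection_exchange` are hypothetical, a full list evicts a random entry (the slide only says entries are replaced), and a node never lists itself or the same peer twice.

```python
import random


def add_peer(peers, candidate, owner, capacity, rng):
    """Add candidate to owner's peer list, replacing an entry if full."""
    if candidate == owner or candidate in peers:
        return
    if len(peers) >= capacity:
        # Assumption: the victim is chosen at random.
        peers.pop(rng.randrange(len(peers)))
    peers.append(candidate)


def reinfection_exchange(a, b, peer_lists, r, capacity, rng=random):
    """Simulate the peer-list exchange between two servent nodes a and b."""
    list_a, list_b = peer_lists[a], peer_lists[b]
    # Step 1: each side adds the other to its own list.
    add_peer(list_a, b, a, capacity, rng)
    add_peer(list_b, a, b, capacity, rng)
    # Step 2: each side draws up to R random entries from its own list.
    from_a = rng.sample(list_a, min(r, len(list_a)))
    from_b = rng.sample(list_b, min(r, len(list_b)))
    # Step 3: each side merges the entries received from the other side.
    for peer in from_b:
        add_peer(list_a, peer, a, capacity, rng)
    for peer in from_a:
        add_peer(list_b, peer, b, capacity, rng)


peer_lists = {"A": ["C", "D"], "B": ["E", "F"]}
reinfection_exchange("A", "B", peer_lists, r=2, capacity=10)
print(peer_lists)
```

Repeating such exchanges over many node pairs shuffles entries between lists, which is presumably where the trading-card name comes from.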
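Further reading and practice
1. Download, install, and run the Tor anonymity network, and understand the onion routing technique.
2. Read papers from the frontier of security research.

Term paper idea 5
• Anonymity: INFOCOM 2009 proposed an anonymity scheme for file retrieval systems. Could it be extended into an optimized structural design?
  – MIX-Crowds, an Anonymity Scheme for File Retrieval Systems. Wai Hung Tang (The University of Hong Kong, HK); H. W. Chan (The University of Hong Kong, HK)
• Trust: EigenTrust proposed an algorithm for computing trust, and IPTPS09 proposed EigenSpeed, a secure bandwidth evaluation algorithm. How can the trust that arises from interaction between people be brought into such algorithms?
  – IPTPS09: EigenSpeed: Secure Peer-to-peer Bandwidth Evaluation
• Routing security: can economic models be brought into the security field?
  – INFOCOM 2009: Routing Fairness in Chord: Analysis and Enhancement

Preparing for the project defense
• Schedule: 2010/4/29, 8:00 to 11:40 in the morning.
• One member of each group gives a PPT presentation, limited to 20 minutes.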