Proceedings of the 7th Annual ISC Graduate Research Symposium
ISC-GRS 2013
April 24, 2013, Rolla, Missouri

Doyal Mukherjee
Department of Computer Science
Missouri University of Science and Technology, Rolla, MO 65409

PRIVACY PRESERVING COMMUNICATIONS USING SPHERICAL CHORD

ABSTRACT

Structured overlay networks provide simple methods for data storage and data lookup, but they remain vulnerable to different kinds of attack. These overlay networks aim for a protocol that preserves anonymity and security. Anonymity is one of the parameters of network security; in a network it is defined as the state of being unidentified or unacknowledged. This paper first analyzes the Chord protocol and models the security of the system when malicious nodes are present. Different types of attack on the system are analyzed, and it is shown that anonymity is compromised in such a system. Later in the paper we present a rough extension to Chord and discuss the resilience of our system against such attacks.

1. INTRODUCTION

Structured overlay peer to peer networks are distributed systems without a centralized approach, in which each node has a particular responsibility assigned to it. These networks try to achieve availability, scalability, reliability, anonymity and security. Their basic premise, however, is to locate a particular data item among the nodes present. The Chord protocol is an efficient lookup mechanism that completes a lookup in O(log N) hops. We analyze this protocol and present a new solution that helps address its current anonymity problems.

During communication over a network, a browser advertises the IP address, the domain, the platform and the information that is requested. This critical data is exposed on the network and can easily be monitored and misused by attackers [1]. Data is therefore never private on a network, and anonymous communication was introduced to achieve privacy. The advantages of anonymous communication are as follows:

- Sending private data or distributing anonymous content on the network.
- Keeping the existence of a Virtual Private Network (VPN) private.
- Supporting covert missions among various organizations.
- Hiding a user's personal data, for example health history or financial data.

Current systems such as Tor [4] and Tarzan [3] encrypt data across servers and create anonymous circuits. These systems relay messages through a small set of servers, so when the number of nodes and the bandwidth increase they face scalability issues along with system overhead and maintenance costs. They also face security issues, because nodes have full knowledge of the other nodes in the system and thus act as a single point of failure and attack. We propose a system called "Spherical Chord", which adapts the basic principle of the Chord protocol but scales to a significantly greater number of nodes and is intended to be resilient against different security attacks on the system.

2. RELATED WORK

Our research builds on an analysis of the following concepts.

2.1. Overlay Network

A typical peer to peer network is considered an overlay network: the nodes of the system are not physically connected to each other, but virtual links are formed between them by design. These virtual links help in structuring the overlay network and, if designed properly, allow operations such as data lookup to be performed efficiently. A protocol should clearly specify how the network performs lookups and how nodes join and leave the system, with the least possible complexity.
The protocol also focuses on creating a decentralized overlay network so that there is no single point of failure. Popular peer to peer protocols are Napster, Gnutella, CAN and Chord. Each protocol has its strengths and weaknesses, but in this paper we focus on the Chord protocol [1].

2.2. Distributed Hash Table (DHT)

The premise of a DHT [12] is to find the node in the system that is responsible for a particular data item. Each data item is associated with a key, and this key helps locate the node or group of nodes responsible for storing the data item. Popular protocols using this scheme include CAN, Pastry and Chord. Each node helps in the lookup operation by maintaining virtual links to a small subset of nodes. When a data item is requested with its key, the lookup request is forwarded across the nodes until the target node holding the data item is reached. The Chord protocol requires O(log N) hops to complete a request. Our system, Spherical Chord, will require at most 3 * O(log N) hops, where N is the number of nodes in the whole system.

Nodes and keys are mapped to identifiers by a hash function. Chord and Pastry generally use a cryptographic hash function such as SHA-1 to hash nodes, data items and keys to identifiers. These identifiers are the basic building block of any lookup operation.

2.3. Consistent Hashing

The Chord protocol uses consistent hashing, in which keys and nodes are evenly distributed by the hash function. We generally use the SHA-1 hash function to generate an m-bit identifier for every node and key. A node holding a data item is identified by hashing its IP address, while a key identifier is produced by hashing the key itself. The identifier length m is chosen so that the probability of two nodes or keys hashing to the same identifier is negligible.
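As a minimal sketch of the identifier mapping described above, the following Python snippet hashes a node address or a key with SHA-1 and keeps m bits of the digest. The function name chord_id, the example IP address and key, and the reduction modulo 2^m for small m are illustrative assumptions, not part of the Chord specification.

```python
import hashlib


def chord_id(value: str, m: int = 160) -> int:
    """Map a node address or a key to an m-bit identifier (hypothetical helper)."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()  # 160-bit SHA-1 digest
    # Assumption: for m < 160 we simply keep the low-order m bits.
    return int.from_bytes(digest, "big") % (2 ** m)


# Example values are made up for illustration.
node_identifier = chord_id("192.0.2.17", m=6)  # a node is hashed by its IP address
key_identifier = chord_id("report.pdf", m=6)   # a key is hashed by the key itself
print(node_identifier, key_identifier)
```

With m = 6 the identifier space has only 64 values, so collisions between nodes are likely in practice; the negligible-collision property relies on a large m, such as the full 160 bits of SHA-1.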
The keys are assigned to the nodes in the following way. The identifiers are placed on an identifier circle modulo $2^m$. Key k is assigned to the first node whose identifier is equal to or follows k in the identifier space. This node is called the successor node of key k, denoted successor(k). In total, the identifiers are arranged in a circle from 0 to $2^m - 1$, and the successor node of key k is the first node encountered when moving clockwise from k [9][10].

Fig 1: An identifier circle with m=6, k=5 and n=10.

Fig 1 shows an identifier circle with m=6. It therefore has a total of $2^6 = 64$ identifiers, numbered from 0 to 63 ($2^6 - 1$). There are 10 nodes containing data items and 5 keys.

3. CHORD LOOKUP PROTOCOL

We look at the Chord protocol [8] in detail, as our own system, Spherical Chord, is an extension of it.

3.1 SIMPLE NODE LOCALIZATION

In simple node localization, a lookup follows successor pointers around the circle, visiting every node on the way if necessary. The algorithm is given as follows:

    // ask node n to find the successor of id
    n.find_successor(id)
      if (id in (n, successor])
        return successor;
      else
        // forward the query around the circle
        return successor.find_successor(id);

The number of messages required by this algorithm is linear in N, where N is the number of nodes in the system. This scheme was adopted in the overlay structure because it helps realize the basic goals of an efficient protocol, namely:

- Load balancing
- Scalability
- Dynamic nature
- No critical point
- Deterministic

Fig 2: Simple Node Localization for lookup(54)

Fig 2 shows the lookup operation for key 54: the request traverses the circle until it finds node 56, the successor node of key 54, which demonstrates that a lookup takes linear time.
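The successor assignment and the simple lookup above can be sketched as follows. This is an illustration rather than the authors' implementation: the ten node identifiers are assumed (chosen to be consistent with the nodes 8, 42 and 56 mentioned for the figures), and the helper names are hypothetical.

```python
M = 6                                   # identifier bits, as in Fig 1
RING = 2 ** M                           # 64 identifiers, 0..63
# Assumed node identifiers (ten nodes); they are not listed in the text.
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]


def successor(k: int) -> int:
    """First node whose identifier is equal to or follows k on the circle."""
    k %= RING
    for node in NODES:                  # NODES is sorted in ascending order
        if node >= k:
            return node
    return NODES[0]                     # wrap around past 2^m - 1


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b              # the interval wraps through 0


def simple_lookup(start: int, key: int):
    """Follow successor pointers one node at a time; return (owner, visited)."""
    visited = [start]
    n = start
    while True:
        succ = successor(n + 1)         # immediate successor node of n
        if in_half_open(key, n, succ):
            return succ, visited
        n = succ
        visited.append(n)


owner, visited = simple_lookup(8, 54)
print(owner, visited)
```

On this assumed ring, the lookup for key 54 starting at node 8 passes through eight of the ten nodes before node 56 is identified as the successor, which illustrates the linear cost.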
3.2 SCALABLE NODE LOCALIZATION

To accelerate lookups beyond simple node localization, the Chord protocol maintains additional routing information, stored in a finger table. If m is the number of bits in the identifier, each node n maintains a finger table with at most m entries. The i-th entry in the table at node n contains the identity of the first node, s, that succeeds n by at least $2^{i-1}$ on the identifier circle, i.e., $s = \mathrm{successor}(n + 2^{i-1})$, where $1 \le i \le m$ (and all arithmetic is modulo $2^m$). We call node s the i-th finger of node n and denote it by n.finger[i].node. The definition of the finger table is given as follows:

Notation              Definition
finger[k].start       $(n + 2^{k-1}) \bmod 2^m$
finger[k].interval    [finger[k].start, finger[k+1].start)
finger[k].node        First node >= n.finger[k].start
successor             The next node on the identifier circle; finger[1].node
predecessor           The previous node on the identifier circle

Table 1: Definition of the finger table

Fig 3: Finger Table for node 8.

Fig 4: Lookup operation for Key (54).

Fig 3 shows the finger table of node N8, whose entries are given by finger[i] = successor($n + 2^{i-1}$). Fig 4 demonstrates that, using the finger table of node 8, we can take a long hop to node 42, as it is the entry closest to the target node 56. From there we take smaller hops until we reach the destination.

When a lookup is performed in such a scenario, all successor pointers must be up to date, so an algorithm must run periodically in the background to update the finger tables when nodes join or fail.

In scalable node localization each node stores information about only a small number of nodes, and it knows more about the nodes that closely follow it than about nodes farther away from it. For a lookup we just check the entries of the finger table and forward the request on the basis of the successor pointers. When we look up a key, we search for the node that immediately precedes it and find that node's successor through the finger table. The algorithm is as follows:

    // ask node n to find the successor of id
    n.find_successor(id)
      if (id in (n, successor])
        return successor;
      else
        n' = closest_preceding_node(id);
        return n'.find_successor(id);

    // search the local table for the highest predecessor of id
    n.closest_preceding_node(id)
      for i = m downto 1 do
        if (finger[i] in (n, id))
          return finger[i];
      return n;

Using the above algorithm, the number of hops a request needs to complete is only O(log N), where N is the number of nodes in the system. This operation is thus scalable and more time efficient than simple node localization. Our system is also an extension of Chord, and thus the time required would also be O(log N).

4. ANONYMITY IN CHORD PROTOCOL

The Chord protocol can query for a data item using two routing schemes, namely recursive and iterative routing. In recursive routing, the node finds the appropriate next node in its finger table and sends the request to that node. The lookup request is forwarded recursively in this way until it reaches the node containing the data item. Some information is then passed back along the reverse path, which could be the data itself or the IP address of the node holding the data item.

In iterative routing, the request initiator itself queries the nodes along the path. Each queried node returns the entry in its finger table that is closest to the target node, and the initiator queries that node next. Finally the initiator queries the target node and the data item is returned.

The larger the number of requests for a data item D that a node sees, the better is its estimate of the frequency with which D is accessed. This notion is used in particular when anonymity in the Chord protocol is discussed.
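The finger-table lookup of Section 3.2, routed recursively as described above, can be sketched as below. The sketch records which nodes handle a request, since those are the nodes that could observe it. The node identifiers are assumed (ten nodes consistent with nodes 8, 42 and 56 in the figures), global knowledge of the ring is used only to build the tables, and all function names are hypothetical.

```python
M = 6
RING = 2 ** M
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # assumed node identifiers


def successor(k: int) -> int:
    """First node at or after identifier k, wrapping around the circle."""
    k %= RING
    return next((n for n in NODES if n >= k), NODES[0])


def in_interval(x: int, a: int, b: int, closed_right: bool = False) -> bool:
    """True if x lies in the circular interval (a, b), or (a, b] if closed_right."""
    if closed_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b                         # interval wraps through 0


def finger_table(n: int) -> list:
    """finger[i] = successor(n + 2^(i-1)) for i = 1..m (list index 0 is finger 1)."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]


def closest_preceding_node(n: int, key: int) -> int:
    """Highest finger of n that lies strictly between n and key."""
    for finger in reversed(finger_table(n)):      # i = m down to 1
        if in_interval(finger, n, key):
            return finger
    return n


def find_successor(n: int, key: int, path=None):
    """Recursive routing: return (owner, nodes that handled the request)."""
    path = (path or []) + [n]
    succ = finger_table(n)[0]
    if in_interval(key, n, succ, closed_right=True):
        return succ, path
    nxt = closest_preceding_node(n, key)
    if nxt == n:                                  # guard: no closer finger known
        return succ, path
    return find_successor(nxt, key, path)


owner, path = find_successor(8, 54)
print(owner, path)
```

On this assumed ring, only nodes 8, 42 and 51 handle the request for key 54 before node 56 is returned. Each of them can observe that key 54 was requested, but under recursive routing an intermediate node sees only the previous hop, which is not necessarily the initiator.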
The Chord protocol achieves anonymity if the request for a particular data item is seen by only a few nodes in the system.

Storage anonymity means that the node containing the data item is not revealed when that data is requested. Because a Chord lookup locates the node responsible for a data item, storage anonymity is not achieved in the Chord protocol.

Requester anonymity means hiding the existence of the initiator of the request. Recursive routing, as explained earlier, provides a higher degree of anonymity against observers because lookup requests do not reveal the initiator. A malicious node in the system cannot tell who actually initiated a request, because the initiating node appears to be merely forwarding the request on behalf of another node [15][16].

5. THREAT MODEL

To evaluate the system, we first develop a threat model against which our system is meant to be resilient. The attacks we consider are as follows [7][2]:

Dropping lookup requests: This is a typical denial of service attack. When a malicious node receives a packet for a request, it simply drops that packet. The system must be designed to recover from such an attack.

Randomly misrouting packets: In this type of attack the malicious node does not drop the packet but routes the lookup request to a wrong hop. The lookup query thus takes a different path and never actually reaches the destination. This kind of attack is difficult to detect, because the malicious node appears to be cooperating with the system in forwarding the request.

Performing a sub ring attack: This type of attack is the hardest to detect, as a group of malicious nodes collaborates so that the lookup request does not reach the destination.
Here the attackers maintain two finger tables: the correct one, and a second one that contains, for each entry in the correct finger table, the malicious node that succeeds it. When a lookup request is received, the malicious node uses this second finger table, so the request propagates only across malicious nodes and finally reaches a malicious destination. The malicious nodes thus collaborate to give the impression that they are forwarding the request correctly, while it ultimately ends at a malicious node.

6. SPHERICAL CHORD

The protocol we have developed is an extension of Chord in which anonymity is better preserved. It combines many Chord rings into a spherical structure and is therefore called "Spherical Chord". Consistent hashing returns an m-bit identifier for both nodes and keys. The number of rings formed in our system is m+1.

Fig 5: Spherical Chord Structure (m=3).

Fig 5 shows a typical structure for Spherical Chord. As the number of bits in the identifier is m=3, the number of rings in the structure is m+1, i.e. 4. Each ring has $2^m$ nodes, and the total number of nodes in the system is given as

$$2^m + m \cdot 2^m - 2m \qquad (1)$$

For communication in our structure we have to consider the interlocking points. All the inner m rings have an interlocking point with the outer ring. The points can be formulated as follows:

Outer Ring:  0 and $2^m/2$
Ring 1:      1 and $2^m/2 + 1$
Ring 2:      2 and $2^m/2 + 2$
...
Ring m:      m and $2^m/2 + m$          (2)

Equation (2) gives the interlocking points. For the example in Fig 5, the interlocking points of the outer ring are 0 and 4, those of Ring 1 are 1 and 5, and so on.

When a lookup is initiated, the number of hops required depends on whether the destination is an interlocking position or a non-interlocking position. If a node in one ring wants to communicate with an interlocking position of another ring, it takes a maximum of 2*O(log N) hops.
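As a quick numerical check of the structure for the example m=3, the sketch below assumes that Equation (1) counts m+1 rings of 2^m nodes each, with the 2m nodes at interlocking points shared between two rings and counted once, and that ring i interlocks with the outer ring at positions i and 2^m/2 + i, as in the Fig 5 example. Both readings and the function names are assumptions made for illustration.

```python
def total_nodes(m: int) -> int:
    """Assumed reading of Eq. (1): 2^m + m*2^m - 2*m distinct nodes."""
    return 2 ** m + m * 2 ** m - 2 * m


def interlocking_points(m: int) -> dict:
    """Assumed reading of Eq. (2): ring i -> (i, 2^m/2 + i); ring 0 is the outer ring."""
    half = 2 ** m // 2
    return {ring: (ring, half + ring) for ring in range(m + 1)}


m = 3
print("rings:", m + 1, "nodes:", total_nodes(m))
for ring, points in interlocking_points(m).items():
    name = "Outer Ring" if ring == 0 else f"Ring {ring}"
    print(name, points)
```

Under these assumptions, m=3 gives 26 distinct nodes and the pairs (0, 4), (1, 5), (2, 6) and (3, 7), which agrees with the outer-ring and Ring 1 points quoted for Fig 5.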
If a node in one ring wants to communicate with a node at a non-interlocking position of another ring, the number of hops required is 3*O(log N), where N is the number of nodes in each ring of the system.

The number of available paths also increases compared with the standard Chord protocol. If the communication from source to destination takes place among interlocking points, the number of paths available is given as

$$((m \cdot 2) - 2) \cdot 2 \qquad (3)$$

If the communication takes place among points other than the interlocking positions, the number of paths available from source to destination is given as

$$((m \cdot 2) - ((m \cdot 2) - 2)) \cdot 3 \qquad (4)$$

The increase in path availability helps preserve communication even in the presence of malicious nodes. The Chord protocol is also unidirectional, as requests move in the clockwise direction. Our system would support multidirectional routing on account of the different paths available.

6.1. THREAT MODEL FOR SPHERICAL CHORD

The threat model for Spherical Chord is discussed in accordance with the threat model discussed for Chord previously. We have not analyzed our protocol against the Sybil attack [11] or integrity attacks. The expected behavior of our system can be summarized as follows:

Dropping lookup requests: The impact of this attack can be drastically reduced, as the number of paths available is significantly increased. Each ring has two interlocking points, and the total number of interlocking points in the system is m*2. Because of these points, the number of available paths is at least twice that of the earlier system, and thus we expect this type of denial of service attack to be significantly less effective.

Randomly misrouting packets: In the earlier Chord protocol, the malicious nodes would route the packet to the farthest node possible. In our system the number of available paths increases significantly, so to actually misroute a packet the malicious nodes have to take into account the structure of the different rings and the total paths available.
Furthermore, if one request is misrouted, a different path can be taken by the initiator.

Performing a sub ring attack: If many malicious nodes collaborate to perform this attack, our system will also be compromised. We will analyze the system further to provide resilience against such an attack.

7. CONCLUSIONS

Anonymity is key when performing any operation in a peer to peer system in the presence of malicious nodes. We first analyzed the Chord protocol, as it is one of the popular protocols used to look up a data item in a decentralized system. We then presented the Spherical Chord system, which incorporates more nodes and provides more paths among nodes to communicate, thereby helping preserve privacy in data lookup requests. In future work we will analyze our protocol against other attacks and build a simulator to evaluate its efficiency against the standard Chord protocol.

8. ACKNOWLEDGMENTS

This research is currently sponsored by the Intelligent Systems Center (Missouri University of Science and Technology).

9. REFERENCES

[1] N. Borisov. Anonymous routing in structured peer-to-peer overlays. PhD thesis, University of California at Berkeley, Berkeley, CA, USA, 2005.
[2] N. Borisov, G. Danezis, P. Mittal, and P. Tabriz. Denial of service or denial of security? How attacks on reliability can compromise anonymity. In Proceedings of CCS, October 2007.
[3] M. J. Freedman and R. Morris. Tarzan: A peer-to-peer anonymizing network layer. In Proceedings of CCS, Washington, DC, November 2002.
[4] D. Goodin. Tor at heart of embassy passwords leak. The Register, September 10, 2007.
[5] B. N. Levine, M. K. Reiter, C. Wang, and M. K. Wright. Timing attacks in low-latency mix-based systems. In A. Juels, editor, Proceedings of FC, pages 251–265. Springer-Verlag, LNCS 3110, February 2004.
[6] M. Wright, M. Adler, B. N. Levine, and C. Shields, "An analysis of the degradation of anonymous protocols," in Proceedings of the Network and Distributed Security Symposium.
[7] N. Borisov, G. Danezis, P. Mittal, and P. Tabriz, "Denial of service or denial of security? How attacks on reliability can compromise anonymity," in Proceedings of CCS 2007, October 2007, pp. 92–102.
[8] I. Stoica, R. Morris, D. Liben-Nowell, D. R. Karger, M. F. Kaashoek, F. Dabek, and H. Balakrishnan, "Chord: A scalable peer-to-peer lookup protocol for internet applications," IEEE/ACM Trans. Netw., vol. 11, no. 1, pp. 17–32, 2003.
[9] D. Lewin. Consistent hashing and random trees: Algorithms for caching in distributed networks. Master's thesis, Department of EECS, MIT, 1998. Available at the MIT Library, http://thesis.mit.edu/.
[10] D. Karger, E. Lehman, F. Leighton, M. Levine, D. Lewin, and R. Panigrahy. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing (El Paso, TX, May 1997), pp. 654–663.
[11] J. Douceur, "The Sybil Attack," in Proceedings of the 1st International Peer-To-Peer Systems Workshop, March 2002.
[12] A. Tran, N. Hopper, and Y. Kim, "Hashing it out in public: common failure modes of DHT-based anonymity schemes," in WPES '09: Proceedings of the 8th ACM Workshop on Privacy in the Electronic Society. New York, NY, USA: ACM, 2009, pp. 71–80.
[13] E. Sit and R. Morris, "Security considerations for peer-to-peer distributed hash tables," in IPTPS '01: Revised Papers from the First International Workshop on Peer-to-Peer Systems. London, UK: Springer-Verlag, 2002, pp. 261–269.
[14] D. Wallach, "A survey of peer-to-peer security issues," in International Symposium on Software Security, Tokyo, Japan, 2002, pp. 42–57.
[15] Charles W. O'Donnell and Vinod Vaikuntanathan, "Information Leak in the Chord Lookup Protocol," MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, 2005.
[16] Keith Needels, "Detecting and Recovering from Overlay Routing Attacks in Peer-to-Peer Distributed Hash Tables," Department of Computer Science, Rochester Institute of Technology, 2008.