Project Proposal
CS8803 AIA Advanced Internet Application Development
Instructor: Prof. Ling Liu
Feb 13th, 2007
Submitted By: Girish Saini (GTID: 902176994), Kaushik Bhandankar (GTID: 902176968)

1 Motivation:

• BitTorrent: BitTorrent is a peer-to-peer (P2P) file distribution protocol. It distributes large amounts of data widely without the original distributor bearing the full cost of hardware, hosting and bandwidth. Instead of the distributor alone serving each recipient, under BitTorrent each recipient also supplies data to newer recipients. This significantly reduces the cost and burden on any individual source, provides redundancy against system problems, and reduces dependence on the original distributor.
  To share a file or group of files, a peer first creates a "torrent": a small file containing metadata about the files to be shared and about the tracker, the computer that coordinates the file distribution. Peers that want to download the file first obtain its torrent file and connect to the specified tracker, which tells them which other peers to download the pieces of the file from.

• The problem with BitTorrent: The tracker tells the requesting peer which other peers have a copy of the file. However, the BitTorrent protocol currently chooses the peers to connect to at random. As a result, the chosen peers may perform poorly, and the protocol takes longer to stabilize (i.e. to reach a state in which all connected peers provide good download speed). This in turn increases the total download time for the file.

2 Objectives:

• To design and implement a peer-to-peer system based on the trust and reputation of peers. If prior knowledge about the history or past performance of peers is available, better decisions can be made about which peers to select for downloading a particular file.
  This knowledge comes from storing historical information about the peers we have interacted with. For peer selection, the historical data can be used to identify trustworthy peers.
• To develop a scheme to 'rank' peers based on their past interactions with other peers in the network.
• If all clients repeatedly select the highest-ranked node, issues of fairness arise and that node becomes overloaded with transfer requests. Time permitting, we will therefore determine the optimal ratio between the percentage of peers selected via our ranking scheme and the percentage of peers selected randomly.

3 Related Work:

Trust-based networks have received a lot of attention in recent years owing to the increasing popularity of P2P systems. This project builds on a body of related work in this field. The most prominent examples are listed here:

• Managing trust in a P2P Information system [1] – This paper discusses implementing trust in a P2P system in a decentralized manner, without the peers having global knowledge of the network topology.
• Enhancing BitTorrent with Trust Management [2] – This paper discusses a trust-based scheme for BitTorrent that builds an index of the trustworthiness of peers and shares resources based on this index.
• PeerTrust: Supporting Reputation-Based Trust in Peer-to-Peer Communities [3] – This paper discusses a parameterized model for estimating trustworthiness in a distributed fashion in a P2P system.

4 Proposed Work:

• Phase One: This phase of our project will involve studying the existing BitTorrent protocol in more detail and producing an implementation that we can extend with our peer ranking and selection scheme. During this phase we expect to gain a complete understanding of how both the tracker and the peers function.
• Phase Two: In this phase of our project we will incorporate our modifications into the BitTorrent implementation from Phase One. This phase will also involve developing a ranking scheme based on parameters such as number of interactions, upload/download speed, etc. Our scheme's architecture is as follows:

  [Figure 1: Proposed Architecture. Diagram labels: Peer (Peer Selection Module, Intelligence Module, Data Transfer Module), Tracker, Local File System, BitTorrent Protocol Interface, To/From peer]

  • Each peer would comprise three modules:
    • Peer Selection module – Responsible for selecting the peers for file transfers.
    • Intelligence module – Implements the tit-for-tat scheme of the BitTorrent protocol.
    • Data Transfer module – Responsible for the actual transfer of data with connected peers (data transfer would be done via the BitTorrent interface).
  • In our scheme, each peer stores the peers that it has interacted with in the past, together with a ranking of those peers in terms of 'goodness' or 'badness'.
  • The tracker is modified to keep records of peer interaction history. However, it stores only which other peers a peer has interacted with; it does not store or compute any peer ranks.
  • The client obtains a torrent file from anywhere on the web as usual. It contacts the tracker to get the list of peers that have a copy of that file. We call this list List-A.
  • The tracker sends the client List-A. In addition, for each peer on List-A it sends a second list (List-B) containing the peers that the List-A peer has interacted with.
  • The requesting peer then contacts the peers on List-B to obtain the rank of each peer on List-A.
  • The requesting peer then selects a certain number of the highest-ranking peers from List-A. It also selects a certain number of peers randomly from the peers remaining on List-A.

• Phase Three: In this phase we will test and benchmark our code.
  We plan to measure the overhead involved in selecting the highest-ranked peers, discounting disparities in network conditions. We will run tests to see how downloads are affected by selecting the highest-ranked peers mixed with randomly selected peers.

• Phase Four: This phase will be completed if time permits. We will run several experiments to benchmark which ratio of random to highest-ranked peers is optimal in terms of total download quality.

5 Plan of Action

The initial thrust would be an in-depth study of the BitTorrent protocol and of the algorithms that are part of the current implementation. This would be followed by developing a basic model of a BitTorrent client/server, which would serve as a platform for implementing our distributed trust scheme. The peer selection algorithm would use the trust scheme as a criterion in its selection process. Finally, we would evaluate the performance of our system.

• Feb 13 – Completion of the project proposal.
• Feb 25 – Learn about BitTorrent terminology, protocol and implementation, especially the implementation of the tracker and peer-level interaction.
• March 15 – Implement a basic BitTorrent model.
• March 30 – Develop a distributed trust scheme on top of our BitTorrent implementation. This would involve developing a ranking scheme to estimate the ranks of peers based on parameters such as number of interactions, upload/download rate, etc.
• April 10 – Tweak the peer selection algorithm of BitTorrent to use the trust index developed by the trust scheme.
• April 15 – Evaluation and testing of the developed model.
• April 24 – Buffer period and documentation. If some of the modules require a time extension, the buffer period would make up for it.
• May 3 – Project completion (in terms of deliverables).

6. Evaluation and Testing Method

The tweaked peer selection algorithm would perform the peer selection process using the trust model developed by us.
It would select a certain percentage of peers based on the trust scheme (these would be the best peers with respect to the peer performing the selection) and select the remaining peers at random (a degree of randomness is needed to prevent starvation of new or weakly connected peers, as already discussed). We would evaluate the performance of the system (in terms of the total download time of a torrent file in a single session) by varying the percentage of randomness incorporated in the peer selection algorithm. For the evaluation, we would need a few machines with our BitTorrent model running on them (network conditions would be assumed stable for evaluation purposes).

7. Things not addressed by the project

The project does not propose to secure the centralized tracker against attacks by malicious peers. The project also does not address issues related to the optimistic choking/unchoking carried out by peers (the BitTorrent model developed would lack this feature). We also do not plan to consider file transfers spanning multiple sessions.

8. Bibliography

[1] Z. Despotovic and K. Aberer. Managing trust in a P2P Information system. In Proceedings of the Tenth International Conference on Information and Knowledge Management (CIKM), 2001.
[2] Chris Lenfest. Enhancing BitTorrent with Trust Management.
[3] L. Xiong and L. Liu. PeerTrust: Supporting Reputation-Based Trust in Peer-to-Peer Communities. IEEE Transactions on Knowledge and Data Engineering (TKDE), July 2004.
[4] http://wiki.theory.org/BitTorrentSpecification
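
Illustrative sketch (not part of the deliverables): The peer selection step described under Proposed Work and Evaluation (take some of the highest-ranked peers from List-A, then fill the rest at random from the remaining List-A peers) could look roughly like the Python below. The proposal does not yet fix a ranking formula, so several things here are assumptions made only for illustration: the function names aggregate_ranks and select_peers, scores on a 0 to 1 scale, averaging the scores reported by List-B peers, and treating peers with no reports as rank 0.0.

```python
import random
from statistics import mean


def aggregate_ranks(reports):
    """Combine the scores reported by List-B peers for each List-A peer.

    reports maps a List-A peer id to the list of scores (assumed 0..1)
    collected from the peers that interacted with it. Averaging is an
    assumption; peers with no reports get 0.0.
    """
    return {peer: (mean(scores) if scores else 0.0)
            for peer, scores in reports.items()}


def select_peers(list_a, ranks, num_peers, ranked_fraction, rng=random):
    """Pick num_peers from list_a: a fraction by rank, the rest at random."""
    if not 0.0 <= ranked_fraction <= 1.0:
        raise ValueError("ranked_fraction must be between 0 and 1")
    num_peers = min(num_peers, len(list_a))
    num_ranked = int(num_peers * ranked_fraction)

    # Highest-ranked peers first; unknown peers default to rank 0.0.
    by_rank = sorted(list_a, key=lambda p: ranks.get(p, 0.0), reverse=True)
    top = by_rank[:num_ranked]

    # Fill the remaining slots randomly from the peers not yet chosen,
    # so that new or weakly connected peers are not starved.
    remaining = by_rank[num_ranked:]
    randomly_chosen = rng.sample(remaining, num_peers - num_ranked)
    return top + randomly_chosen


if __name__ == "__main__":
    # Hypothetical data: five List-A peers and the scores gathered for them.
    reports = {
        "p1": [0.9, 0.7],
        "p2": [0.2],
        "p3": [],
        "p4": [1.0, 0.9],
        "p5": [0.5],
    }
    ranks = aggregate_ranks(reports)
    print(select_peers(list(reports), ranks, num_peers=4, ranked_fraction=0.5))
```

Varying ranked_fraction between 0 and 1 corresponds to the ratio of ranked to random selection that the evaluation proposes to study.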