ODISSEA Security Issues
Mehdi Kharrazi
Kulesh Shanmugasundaram

SYN
  SYN
  P2P Security Basics
  Introduction to ODISSEA
  Security Issues in ODISSEA
  Trust via Reputation
  FIN

P2P Basics
  All nodes are created equal. Not really!
  Network classification based on network connectivity
    – Exponential networks: homogeneous network; most nodes have close to the average connectivity
    – Scale-free networks: connectivity follows a power law, that is, there are a few highly connected nodes and many sparsely connected nodes
  Current P2P systems are scale-free networks

Network Maps
  Partial map of the Gnutella network
  Note the hierarchical structure of the network

Network Maps…
  Gnutella neighborhood map

Failure vs. Attack
  Failure:
    – Random failure of nodes and/or infrastructure elements
  Attack:
    – Systematic failure of nodes and/or infrastructure elements
  Scale-free networks are failure-tolerant
  Exponential networks are attack-tolerant
  Why?
  Most P2P systems give priority to failure tolerance over attack tolerance

Possible Targets
  Underlying protocol layers
  P2P routing mechanism
  Nodes themselves
  Trust system
  Homeostasis (of the system)
  Applications/Application Protocols
  Users
  More on that: “Security Issues in Peer-to-Peer Systems” http://vip.poly.edu/kulesh/skunk/talks/

ODISSEA: A p2p Search Engine
  A p2p search engine
  Applications:
    – Search in p2p networks
    – Search in intranets
    – Web search
    – Middleware
  How does the search engine work?

Security Issues
  Three Categories:
    1. P2P Search Engine Related
    2. P2P Network Related
    3. General Security Issues

Search Engine Related:
    – Content Poisoning:
      • Crawler
      • Parser
      • Query Processor
    – Protocol Security
      • Protection against MIMs
      • Truthful Execution of Ranking Algorithms
    – Compartmentalization
      • Search on a multi-level security network
    – Anonymity
      • P2P networks are used for anonymity

Content Poisoning
  Crawler:
    – Crawler associates the wrong URL with some document
    – E.g.: associates playboy.com/index.html with the ODISSEA web site!
  Suggested solutions:
    1. Random Re-Crawling:
      • Re-crawl a URL at random
      • Simple, but has re-crawling overhead
      • No verification from the source!
    2. Signed Documents:
      • Have the web server sign the document (just another header)
      • Parser verifies the signature prior to parsing
      • No re-crawling overhead
      • Requires PKI, and the web server needs to support signatures

Content Poisoning
  Parser:
    – Malicious parser associates wrong keywords
    – E.g.: associates ODISSEA with porn!
  Suggested solutions:
    – TruthSayer for XML documents (Oakland ’01)
  Query Processor:
    – Censorship by query processors!

Protocol Security
  ODISSEA Search Protocol
    – Has no security primitives at all
    – A man-in-the-middle (MIM) attack is an easy possibility
      • Queries and query results can be altered
      • Postings and documents can be altered
      • E.g. integrity of copies
  Ranking Algorithms
    – Users have the option to send their own algorithm
    – There is no way to ensure that the proper algorithm is used
    – I say “PageRank”, the query processor uses “PigeonRank”

ODISSEA for Multilevel Security Architecture
  Ideal Setting: NSA Information Processing Facility
  Environment:
    – Large secure intranet (100,000 nodes)
    – Multi-level security (from Unclassified to Umbra)
    – Users/nodes move between levels
  Design Goals:
    – Optimal use of resources across levels
    – Enforce multi-level security via compartmentalization
    – Allow for a fast, scalable search engine
    – Agile enough to allow users to move back and forth
    – Withstand malicious users, nodes, etc.
  Simple, Stupid, Scheme:
    – Assign a key (bit string) to each level
    – XOR every token of a document with the corresponding key
    – Search for (keyword XOR key)
    – Trivial to break and not scalable

Trust

Local Trust
  Local trust value (as on eBay)
  Problems:
    – Does not give a wide view of the peer’s reputation
    – Or it aggregates over the whole network and causes congestion
  Solution:
    – Transitive trust: if I trust you, then I also trust the peers you trust

Aggregate Local Trust
  Normalized local trust
  Aggregate local trust values
  If $C$ is the matrix $[c_{ij}]$: $t_i = C^T c_i$
  To get a wider view, peer i asks his friends’ friends: $t_i = (C^T)^2 c_i$
  ...and so on: $t_i = (C^T)^n c_i$
  For large n, the trust vector $t_i$ converges to the same vector for every peer i

Distributed EigenTrust
  Each node can calculate its own EigenTrust value; the computation uses p, a distribution over pre-trusted peers
    – Pre-trusted peers are essential for breaking malicious collectives
    – For example, the very first nodes in the network, i.e. the designers

Distributed EigenTrust Algorithm
  Fast convergence

Secure EigenTrust
  Calculate the trust value of each peer by more than one peer (score managers)
  If there is a difference of opinion, then vote!
  Use a DHT to assign score managers, using different hash functions
  Upsides:
    – Anonymity (can’t tell whose trust you’re computing)
    – Randomization (can’t make yourself your own score manager)
    – Redundancy (more than one score manager)

Load distribution
  Deterministic algorithm
    – Choose the responding peer with the highest trust value
  Probabilistic algorithm
    – Choose peer i with probability proportional to its trust value. With a probability of 10%, select a peer j with zero trust value.
    – Why 10%?
      • A balance between allowing new users to gather trust and not granting malicious users a high chance of providing inauthentic files

FIN
  Questions, comments, concerns?
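
Backup: EigenTrust sketch
  The trust aggregation and the probabilistic peer selection from the Trust slides can be sketched in a few lines of Python. This is a minimal, centralized sketch, not the distributed or secure variant. The update formula did not survive in the slide text, so the sketch assumes the damped iteration from the published EigenTrust algorithm, t ← (1 − a) Cᵀ t + a p. The damping value a = 0.15, the function names (normalize_rows, eigentrust, choose_peer) and the sample ratings are all illustrative assumptions.

```python
import random


def normalize_rows(local):
    """Turn raw local ratings s_ij into normalized local trust c_ij.

    Negative ratings are clipped to zero. A peer with no positive
    ratings yields None; eigentrust() substitutes p for that row.
    """
    rows = []
    for row in local:
        clipped = [max(v, 0.0) for v in row]
        total = sum(clipped)
        rows.append([v / total for v in clipped] if total > 0 else None)
    return rows


def eigentrust(local, pretrusted, a=0.15, tol=1e-9, max_iter=1000):
    """Iterate t <- (1 - a) * C^T t + a * p until t stops changing.

    local      -- square matrix of raw ratings, local[i][j] = i's rating of j
    pretrusted -- set of peer indices that form the distribution p
    a          -- weight given to the pre-trusted peers (assumed value)
    """
    n = len(local)
    p = [1.0 / len(pretrusted) if i in pretrusted else 0.0 for i in range(n)]
    c = [row if row is not None else p for row in normalize_rows(local)]
    t = p[:]
    for _ in range(max_iter):
        # (C^T t)_j = sum over i of c_ij * t_i
        agg = [sum(c[i][j] * t[i] for i in range(n)) for j in range(n)]
        nxt = [(1 - a) * agg[j] + a * p[j] for j in range(n)]
        if sum(abs(x - y) for x, y in zip(nxt, t)) < tol:
            return nxt
        t = nxt
    return t


def choose_peer(candidates, trust, explore=0.10, rng=random):
    """Pick a responding peer.

    With probability `explore`, pick a zero-trust peer so newcomers can
    gather trust; otherwise pick with probability proportional to trust.
    """
    zero = [c for c in candidates if trust[c] == 0]
    if zero and rng.random() < explore:
        return rng.choice(zero)
    trusted = [c for c in candidates if trust[c] > 0]
    if not trusted:
        return rng.choice(candidates)
    return rng.choices(trusted, weights=[trust[c] for c in trusted], k=1)[0]


# Peers 0 and 1 rate each other; peers 2 and 3 form a closed collective
# that only rates itself. Peer 0 is the single pre-trusted peer.
ratings = [
    [0, 5, 0, 0],
    [5, 0, 0, 0],
    [0, 0, 0, 9],
    [0, 0, 9, 0],
]
trust = eigentrust(ratings, pretrusted={0})
```

  In this toy network the collective (peers 2 and 3) receives no trust, because no peer reachable from the pre-trusted peer rates it. This illustrates the point on the Distributed EigenTrust slide about pre-trusted peers breaking malicious collectives.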