Preserving Peer Replicas By Rate-Limited Sampled Voting
Petros Maniatis, Mema Roussopoulos, TJ Giuli, David Rosenthal, Mary Baker, Yanto Muliadi

Problem
- Academic publishing is moving to the Web
- Libraries rent access to the publisher’s copy
- But... what if publishers go out of business?

Solution: LOCKSS
- Digital preservation among libraries
- Need to address scalability and security issues

Characteristics of LOCKSS
- Long-term, large-scale
- Lack of central control
- Avoid long-term secrets like encryption keys
- Resist random failures and deliberate attack for a long time

Design Assumptions
- Storage is unreliable
- Third-party reputation is problematic
  - Vulnerable to slander and subversion
  - Can cash in a history of good behavior
- Strong adversary
  - Need to prepare for unforeseen attacks

Design Principles
- No long-term secrets
  - Secrets require storage that is effectively impossible to replicate, audit, repair, or regenerate
- Use inertia
  - Rate-limit changes

Design Principles (continued)
- Reduce predictability
- Intrinsic intrusion detection
  - Bimodal behavior

The Existing LOCKSS System
- Use persistent Web caches
  - Crawl the journal websites
  - Distribute to local readers
  - Preserve by cooperating with other caches
- Use “opinion polls” in a peer-to-peer network
  - Compare the hash values of a specified part of the content

The Opinion Polls
- Provide content authenticity and integrity
  - Based on independently obtained copies
- Peers vote on large archival units (AUs)
- An AU is checked every three months
  - With ~17 peers
- Only repair a peer’s replica if that peer participated in the past
  - Prevents free-loading and theft

The New Opinion Poll Protocol: Assumptions
- Each peer uses one of a number of independent implementations of the LOCKSS protocol to limit common-mode failures
- Each peer’s AU is subject to a low rate of undetected random damage
- Polling rate >> random damage rate

The New Opinion Poll Protocol: Definitions
- Malign peer: one that tries to subvert the system
- Loyal peer: one that follows the LOCKSS protocol at all times
- Damaged peer: a loyal peer with a damaged AU
- Healthy peer: a loyal peer with the correct AU
- Goal: a high probability that peers are healthy despite failures and attacks

The Idea of Polling
- A peer invites a small subset of the peers it has recently encountered
- Each computes a fresh digest of its AU
- If the caller of the poll receives votes that overwhelmingly agree with its own version
  - Do nothing

The Idea of Polling (continued)
- If the caller of the poll receives votes that overwhelmingly disagree
  - Ask for a copy to repair its own
  - Vote again
- If the result of the poll is neither a landslide win nor a landslide loss, the caller raises an alarm to attract human attention to the situation

Voting Membership
- Inner circle
  - Decides the poll outcome
- Outer circle
  - Nominated by the inner circle
  - May become members of the inner circle in the future

Sybil-Attack Prevention
- Sybil attack: use an unlimited number of forged identities to subvert a system
- Prevention schemes:
  - Infrequent voting (limits the rate of change in the system)
  - Bimodal distribution of system states (increases the chance of triggering alarms)
  - Require each peer to expend significant computing power for each step
    - Computing the hash for an AU
  - Churn (to be explained later)

Details
- Each peer maintains two lists
  - Reference list: recently encountered peers
  - Friends list: peers with an out-of-band relationship
- Bootstrapping
  - A peer copies all entries from its current friends list into its reference list
- Each reference has a random expiration time

Poll Initiation
- Choose N random peers from the reference list (the inner circle)
- Send encrypted poll messages
- Remove from the inner circle any peer that cannot answer the challenge-response questions within a specified time frame
- If there are too few inner circle members, invite additional peers from the reference list
- Abort when the reference list is exhausted

Poll Effort
- Receiver must solve a puzzle to show effort
  - Makes it computationally difficult for attackers to forge multiple identities
- Inner circle also nominates outer circle members
  - Every inner circle nominator affects the outer circle equally
- Initiator also polls outer circle members

Vote Verification
- If the proof of effort is incorrect, the vote is invalid and the peer is blacklisted
- If the proof is correct and the hash matches, the vote is valid and agreeing
- If the proof is correct and the hash mismatches, the vote is valid and disagreeing

Vote Tabulation
- If the number of agreeing votes is below a threshold (landslide loss), the initiator needs to repair its copy
- If the number of agreeing votes is above a threshold (landslide win), the initiator updates its reference list and schedules the next poll
- Otherwise, raise an alarm

Inter-poll Alarm
- Triggered if an initiator fails to collect enough votes for a long time

Repair
- Need to detect inconsistencies between the voting information and the repaired AU
- If the initiator cannot complete the repair process, raise the corresponding alarm

Reference List Update
- Remove all disagreeing peers and some randomly chosen agreeing peers from the inner circle
- Reset the expiration time for the remaining peers
- Insert all outer circle peers whose votes were valid and agreeing
- Insert randomly chosen entries from the friends list, up to a churn factor

Vote Construction
- A vote consists of a hash of the AU interleaved with provable computational effort
- Vote computation is divided into rounds; in each round the computational effort and the hashed portion double in size
- Each subsequent challenge depends on the previous challenge

Protocol Analysis
- Need to achieve the following
  - Prevent an adversary from gaining a foothold
  - Make it expensive for the adversary to waste another peer’s resources
  - Make it likely that attacks are detected

Effort Sizing
- Use memory-bound computations
- An initiator needs to expend more effort than the cumulative effort it imposes on the voters

Timeliness of Effort
- Only proofs of recent effort can affect the system
  - An adversary needs to keep expending resources to maintain a foothold

Rate Limiting
- Loyal peers call polls autonomously and infrequently
- The rate of progress of an attack is limited by the victims, not by the attackers

Reference List Churning
- Avoid depending on a fixed set of peers
  - They become easy targets
- Avoid depending entirely on random peers
  - They can launch Sybil attacks
- With a friends list
  - Attackers can gain a foothold on the outer circle list but not on the friends list

Obfuscation of Protocol State
- Encrypt all but the first protocol message exchanged by a poll initiator and each potential voter
- Make all loyal peers invited into a poll, even those who decline to vote, go through the motions of the protocol
  - An observer can’t deduce the number of loyal peers involved in deciding the outcome of a poll

Alarms
- Raising an alarm is expensive
  - Involves human examination
- If an attacker’s goal is to raise alarms...

Adversary Analysis
- Complete parameter knowledge
- Exploitation of common peer vulnerability
  - Take over a fraction of the population running the same implementation
- Unconstrained identities
  - Infinite IP addresses
- Stealth
  - One cannot discern loyal peers from compromised ones

Adversary Analysis (continued)
- Total information awareness
  - Identities of all malign peers
- Perfect work balancing
- Perfect digital preservation
  - Incorruptible copies of good and bad AUs
- Local eavesdropping
- Local spoofing
  - One end of the communication needs to be in the local network

Adversary Attacks
- Platform attacks
  - Can take over a fraction of peers instantaneously
- Protocol attacks
  - Play against the LOCKSS protocol

Protocol Attacks
- Stealth modification
  - Replace good AUs with bad ones
- Nuisance
  - Raise many alarms
- Attrition
  - Prevent loyal peers from making repairs
- Theft
  - Obtain published content without paying

Protocol Attacks (continued)
- Free-loading
  - Obtain services without supplying services in return

Counter-Attack Techniques
- Adversary foothold in a reference list
  - Needs to wait for an invitation to vote
  - Needs to behave well for a long time before the attack (without raising alarms)
- Adversary votes based on the good AU but supplies the bad AU for repair
  - Ask for random sample bits (verified) before each poll
  - The repair AU must match the initial bits

Stealth Modification Attack Strategy
- Two phases
  - Lurk: build a foothold in loyal peers’ reference lists
  - Attack
    - Need to have the majority of votes
    - Need to keep the number of loyal peers below the alarm threshold
- An adversary...
  - Needs to wait for an initiator to call for votes
  - Needs to go through many rounds of voting without triggering an alarm
  - Needs to expend effort to maintain the foothold in the reference list

Simulation
- Running LOCKSS for 30 years
- 1000 peers
  - Clusters of 30 peers
  - 29 peers in the initial friends list
    - 80% from the local cluster
- 20 years of lurking
- 10 years of attacking

Results
- Low rates of false alarms in the absence of attacks
- Can sustain up to 1/3 of the peers subverted (with 10% churn)
- System degrades gracefully
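
Sketch: Vote Verification and Tabulation

The verification and tabulation rules described earlier (an invalid proof gets the voter blacklisted, valid votes are agreeing or disagreeing, and the poll ends in a landslide win, a landslide loss, or an alarm) can be illustrated with a small Python sketch. This is not the LOCKSS implementation. The names `Vote`, `Outcome`, `tabulate`, `loss_below`, and `win_above` are hypothetical, the thresholds are illustrative values, SHA-256 is an assumed digest, and the `proof_ok` flag stands in for real verification of the proof of effort.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    LANDSLIDE_WIN = "landslide win"    # update reference list, schedule next poll
    LANDSLIDE_LOSS = "landslide loss"  # initiator must repair its copy
    ALARM = "alarm"                    # inconclusive: ask for human attention


@dataclass
class Vote:
    voter: str
    au_digest: bytes
    proof_ok: bool  # assumption: stands in for checking the proof of effort


def digest(au: bytes) -> bytes:
    """Digest of an AU (SHA-256 is an assumption, not taken from the slides)."""
    return hashlib.sha256(au).digest()


def tabulate(own_au: bytes, votes: list[Vote], loss_below: int, win_above: int):
    """Classify votes and decide the poll outcome.

    Returns (outcome, blacklist). Votes with a bad proof are invalid and
    their voters are blacklisted; only valid votes are counted.
    """
    own = digest(own_au)
    blacklist = [v.voter for v in votes if not v.proof_ok]
    valid = [v for v in votes if v.proof_ok]
    agreeing = sum(1 for v in valid if v.au_digest == own)

    if agreeing > win_above:
        outcome = Outcome.LANDSLIDE_WIN
    elif agreeing < loss_below:
        outcome = Outcome.LANDSLIDE_LOSS
    else:
        outcome = Outcome.ALARM
    return outcome, blacklist


if __name__ == "__main__":
    good, bad = digest(b"good AU"), digest(b"bad AU")
    votes = [Vote(f"peer{i}", good, True) for i in range(9)]
    votes.append(Vote("peer9", bad, True))
    outcome, blacklist = tabulate(b"good AU", votes, loss_below=3, win_above=7)
    print(outcome.value, blacklist)
```

Keeping a gap between the two thresholds is what makes the outcome bimodal: any result that falls between them is treated as suspicious and raises an alarm instead of being resolved automatically.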