Trustworthy Services from Untrustworthy Components: Overview

Fred B. Schneider
Department of Computer Science
Cornell University
Ithaca, New York 14853 U.S.A.

Joint work with Lidong Zhou and Robbert van Renesse.

Fault-tolerance by Replication

[Figure: Servers, Client]

The “fine print” …
– Replica failures are independent.
– Replica coordination protocol exists.
– No secrets stored in server’s state.

Trustworthy Services

A trustworthy service…
– tolerates component failures
– tolerates attacks
– might involve confidential data

N.b. Cryptographic keys must be kept confidential and are useful for authentication, even when data is not secret.

Revisiting the “Fine Print”

Replica failures are independent.
– But attacks are not independent.
Replica coordination protocol exists.
– But such protocols involve assumptions, and assumptions are vulnerabilities.
  • Timing assumptions versus Denial of Service
No secrets stored in server’s state.
– But secrets cannot be avoided for authentication.
  • Replicating a secret erodes confidentiality.

Compromised Components

Correct component satisfies specification. Compromised component does not.
– Adversary might control a compromised component.
– Component is compromised if adversary knows secrets being stored there.
A recovery protocol transforms a component: compromised → correct

Component Correlation

Components are correlated to the extent that one attack suffices to compromise all.

Correlation arises from:
– Dependence on the environment
– Vulnerabilities in shared design / code
– Shared secrets

Goal: Eliminate sources of correlation.

Correlation: Environment Vulnerabilities

Vulnerabilities = Assumptions
– Weaker assumptions are better.

“Synchronous system” assumption:
– Bounded message delivery delay
– Bounds on process execution speed
  • violated by denial of service attacks
  • needed for “agreement protocols” in deterministic systems [FLP]

Correlation > Towards Weaker Assumptions: Eschewing Synchronous Systems

Asynchronous system model is weaker but requires making “sacrifices”:
– Sacrifice determinacy: Use “randomized protocols” (requires randomness).
– Sacrifice liveness but preserve safety.
– Sacrifice state machine replication: Use quorums or other weaker mechanisms. Some service semantics cannot be implemented.

Correlation: Eschewing Shared Design / Code

Solution: Diversity!

Expensive or impossible to obtain:
  • Development costs
  • Interoperability risks

Still, what diversity does exist should be leveraged.

Correlation > Leveraging Extant Diversity: Adversary Structures

t-resilience: Service is not compromised unless more than t components are.
– Known as a threshold structure.

FS-resilience: If FS = {F1, F2, …, Fr} then service is not compromised provided the set C of compromised components satisfies C ⊆ Fi for some i.
– Select FS according to dimensions of diversity.
– Known as an adversary structure.
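The two resilience notions above can be compared with a minimal sketch. The function names (`is_tolerated`, `threshold_structure`) and the server names are hypothetical and chosen only for illustration. The check itself is the C ⊆ Fi condition from the slide.

```python
from itertools import combinations


def is_tolerated(compromised, adversary_structure):
    """Return True if the compromised set C satisfies C ⊆ F_i for some F_i."""
    c = frozenset(compromised)
    return any(c <= frozenset(f) for f in adversary_structure)


def threshold_structure(components, t):
    """Express t-resilience as an adversary structure.

    Listing every subset of size t is enough, because any smaller
    compromised set is contained in one of them.
    """
    items = sorted(set(components))
    k = min(t, len(items))
    return [frozenset(s) for s in combinations(items, k)]


# Hypothetical deployment: two servers per operating system.
servers = {"linux-1", "linux-2", "bsd-1", "bsd-2"}
by_os = [frozenset({"linux-1", "linux-2"}), frozenset({"bsd-1", "bsd-2"})]
by_count = threshold_structure(servers, 1)

print(is_tolerated({"linux-1", "linux-2"}, by_os))     # both Linux servers
print(is_tolerated({"linux-1", "linux-2"}, by_count))  # more than t = 1
print(is_tolerated({"linux-1", "bsd-1"}, by_os))       # spans two OS families
```

In this sketch the structure chosen along the operating-system dimension tolerates one exploit that takes out every server sharing an OS, which a threshold of t = 1 over the same four servers does not.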
Correlation: Eliminating Shared Secrets

(n,t) secret sharing [Shamir, Blakley]:
– Secret s is divided into n shares.
– Any t or more shares suffice for reconstructing s.
– Fewer shares convey no information about s.
– Can be adapted for arbitrary adversary structures.

Threshold cryptography:
– Perform cryptographic operations piecewise using shares of private key; result is as if private key was used.

Example: Threshold digital signatures

Proactive Recovery

When is recovery protocol run?
– After an attack is detected. Not sufficient to reboot from good system image.
  • Must get system state (or have stateless service).
  • Must also “refresh” secrets.
– Periodically, even if an attack is not detected. Not all attacks are detected; proactive recovery defends against undetected attacks.

Adversary strategy: Increase the window of vulnerability (the interval between proactive recovery executions).

Proactive Recovery: Secret Refresh

– Refresh secret shares: PSS and APSS.
– Refresh symmetric keys:
  • Revisit KDC.
  • Force new password choices.
– Refresh public / private key pairs:
  • Invent new server private key.
  • Must disseminate new server public key.

Proactive Recovery > Secret Refresh: Refresh Private / Public Keys I

Approach: Tamper-proof hardware.
– Key material stored in tamper-resistant hardware. Key cannot be read or modified.

Attacker can still instigate crypto operations with key. Protocols must accommodate such possible rogue behavior.

Proactive Recovery > Secret Refresh: Refresh Private / Public Keys II

Approach: Use off-line private keys.
– New public keys are propagated through a secure out-of-band channel.

Use off-line private keys to sign the new public keys. Components storing off-line keys can be connected to network using a one-way channel (e.g. “pump”).
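As a rough illustration of the (n,t) secret sharing and share refresh mentioned in the slides above, here is a minimal Shamir-style sketch. The names `split`, `reconstruct`, and `refresh`, and the prime modulus `P`, are assumptions made for this example. The single-dealer `refresh` is a deliberate simplification: in PSS and APSS the refresh is a distributed protocol among the servers, which this sketch does not attempt to reproduce. It assumes Python 3.8 or later for the modular inverse via `pow`.

```python
import secrets

P = 2**127 - 1  # prime modulus for the field (an assumption of this sketch)


def _eval_poly(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) at x, mod P."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % P
    return y


def split(secret, n, t):
    """Divide `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t - 1)]
    return {x: _eval_poly(coeffs, x) for x in range(1, n + 1)}


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    total = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def refresh(shares, t):
    """Add shares of zero: new shares, same secret.

    Simplification: one party picks the zero-polynomial here, whereas a
    proactive secret sharing protocol would do this jointly.
    """
    zero_poly = [0] + [secrets.randbelow(P) for _ in range(t - 1)]
    return {x: (y + _eval_poly(zero_poly, x)) % P for x, y in shares.items()}


secret = 123456789
old = split(secret, n=5, t=3)
new = refresh(old, t=3)

subset_old = {x: old[x] for x in (1, 3, 5)}
subset_new = {x: new[x] for x in (2, 3, 4)}
print(reconstruct(subset_old) == secret)
print(reconstruct(subset_new) == secret)
```

The point of the refresh step is that shares from before and after a refresh are not meant to be combined, so shares an adversary collected in one window of vulnerability should not help in the next.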
Proactive Recovery: Transparency and Change I

Scalability concerns dictate that clients be shielded from changes due to proactive recovery.

Service public / private key:
– Proactive secret sharing changes private key shares without changing private key (or public key).

Server identities:
– A single contacted server operates as a delegate.
– Service key signs responses to client.
– Self-verifying messages impede rogue delegates from spoofing as clients.

Proactive Recovery: Transparency and Change II

Server public keys. If client must know…
– Local certificate: <Server name, New server public key, Epoch number>
  Signed by server using server’s off-line key.
– Global certificate: Local certificate signed by service private key.
  • Service signs only if local signature on certificate is valid.
  • Use t+1 threshold crypto for service signature.
  Stored at 2t+1 servers (out of 3t+1).
– Client obtains current public key for server i:
  Retrieve global certificates for all servers from 2t+1 servers. Epoch numbers in t+1 of the returned sets will be the same; that epoch is current.

Research Programme Trajectory

Cornell On-line Certification Authority (COCA)
Asynchronous Proactive Secret Sharing (APSS)
Distributed Blinding Protocol
Codex Secret Store

Key ideas:
– Weak computational models (asynchronous)
– Thresholdization [sic] / “multi-party computation”
– Proactive protocols (vs Transparency)
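Returning to the client lookup described under “Transparency and Change II”, the epoch-agreement rule can be sketched as follows. This is one reading of the slide, not the protocol itself: the function name `current_key` and the response format are hypothetical, and the sketch assumes the signatures on each global certificate have already been verified elsewhere.

```python
from collections import Counter


def current_key(responses, t):
    """Pick the (epoch, public_key) pair reported by at least t+1 responders.

    `responses` holds one (epoch, public_key) pair per queried server,
    taken from that server's stored global certificate for the target
    server. Returns None if no pair is reported t+1 times.
    """
    if len(responses) != 2 * t + 1:
        raise ValueError("expected responses from exactly 2t+1 servers")
    pair, votes = Counter(responses).most_common(1)[0]
    return pair if votes >= t + 1 else None


# Hypothetical example with t = 1: one of three responders is stale.
replies = [(7, "pk-epoch-7"), (6, "pk-epoch-6"), (7, "pk-epoch-7")]
print(current_key(replies, t=1))
```

With exactly 2t+1 responses, at most one pair can reach t+1 votes, so the choice is unambiguous whenever it exists.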