Privacy and Contextual Integrity: Framework and Applications Anupam Datta CMU Joint work with Adam Barth, John Mitchell (Stanford), Helen Nissenbaum (NYU) and Sharada Sundaram (TCS) Problem Statement Is an organization’s business process compliant with privacy regulations and internal policies? Examples of organizations – Hospitals, financial institutions, other enterprises handling sensitive information Examples of privacy regulations – HIPAA, GLBA, COPPA, SB1386 Goal: Develop methods and tools to answer this question 2 Privacy Project Space What is Privacy? [Philosophy, Law, Public Policy] Formal Model, Policy Language, Compliance-check Algorithms [Programming Languages, Logic] Implementation-level Compliance [Software Engg, Formal Methods] Data Privacy [Databases, Cryptography] 3 Project Overview What is privacy? – Policy language – Privacy laws including HIPAA, COPPA, GLBA expressible Compliance-check algorithms – Conceptual framework Does system satisfy privacy and utility goals? Case studies – – – Patient portal deployed at Vanderbilt Hospital UPMC (ongoing discussions) TCS 4 Contextual Integrity Philosophical framework for privacy Central concept: Context – [N2004] Examples: Healthcare, banking, education What is a context? – Set of interacting agents in roles – Norms of transmission – Roles in healthcare: doctor, patient, … Doctors should share patient health information as per the HIPAA rules Purpose Improve health 5 MyHealth@Vanderbilt Workflow Humans + Electronic system Health Answer Yes! except broccoli Now that I have cancer, Should I eat more vegetables? Health Question Secretary Doctor Patient Health Answer Nurse Privacy: HIPAA compliance+ Utility: Schedule appointments, obtain health answers 6 MyHealth@Vanderbilt Improved Health Answer Health Answer Yes! except broccoli Secretary Health Question Now that I have cancer, Should I eat more vegetables? Doctor Patient Health Answer Nurse Responsibility: Doctor should answer health questions •Message tags used for policy enforcement •Minimal disclosure 7 Privacy vs. Utility Privacy – Utility – Certain information should be communicated Tension between privacy and utility “Minimum necessary” Workflows Violate Privacy Feasible Workflows Permissiveness Certain information should not be communicated Violate Utility 8 Design-time Analysis: Big Picture Purpose Contextual Integrity Business Objectives Utility Checker (ATL*) Norms Privacy Policy Privacy Checker (LTL) Business Process Design Utility Evaluation Privacy Evaluation Assuming agents responsible 9 Auditing: Big Picture Business Process Execution Run-time Monitor Audit Logs Privacy Policies Utility Goals Audit Algos Policy Violation + Accountable Agent Agents may not be responsible 10 In more detail… Model and logic Privacy policy examples – – GLBA – financial institutions MyHealth portal Compliance – – checking Design time analysis (fully automated) Auditing (using oracle) Language can express HIPAA, GLBA, COPPA [BDMN2006] 11 Model Inspired by Contextual Integrity Health Question Bob – Alice Communication via send actions: – Now that I have cancer, Should I eat more vegetables? Sender: Bob in role Patient Recipient: Alice in role Nurse Subject of message: Bob Tag: Health Question Message: Now that …. Data model & knowledge evolution: contents(msg) vs. tags (msg) Agents acquire knowledge by: – – receiving messages deriving additional attributes based on data model Health Question Protected Health Information 12 Model [BDMN06, BDMS07] State determined by knowledge of each agent Transitions change state – – Set of concurrent send actions Send(p,q,m) possible only if agent p knows m A11 K11 ... K0 A12 K12 A13 K13 ... Concurrent Game Structure G = <k, Q, , , d, > 13 Logic of Privacy and Utility Syntax ::= send(p1,p2,m) | contains(m, q, t) | tagged(m, q, t) | inrole(p, r) | t t’ | | | x. | U | S | O | <<p>> p1 sends p2 message m m contains attrib t about q m tagged attrib t about q p is active in role r Attrib t is part of attrib t’ Classical operators Temporal operators Strategy quantifier Semantics Formulas interpreted over concurrent game structure 14 Specifying Privacy MyHealth@Vanderbilt In all states, only nurses and doctors receive health questions G p1, p2, q, m send(p1, p2, m) contains(m, q, health-question) inrole(p2, nurse) inrole(p2, doctor) 15 Specifying Utility MyHealth@Vanderbilt Patients have a strategy to get their health questions answered p inrole(p, patient) <<p>> F q, m. send(q, p, m) contains(m, p, health-answer) 16 MyHealth Responsibilities Tagging Nurses should tag health questions G p, q, s, m. inrole(p, nurse) send(p, q, m) contains(m, s, health-question) tagged(m, s, health-question) Progress – Doctors should answer health questions G p, q, s, m. inrole(p, doctor) send(q, p, m) contains(m, s, health-question) F m’. send(p, s, m’) contains(m’, s, health-answer) 17 Gramm-Leach-Bliley Example Sender role Attribute Subject role Financial institutions must notify consumers if they share their non-public personal information with nonaffiliated companies, but the notification may occur either before or after the information sharing occurs Recipient role Temporal condition 18 Workflow Design Results Theorems: Assuming all agents act responsibly, checking whether workflow achieves – Privacy is in PSPACE (in the size of the formula describing the workflow) – Utility is decidable for a restricted class of formulas ATL* model-checking is undecidable for concurrent game structures with imperfect information, but decidable with perfect information Idea: – Use LTL model-checking algorithm Check that all executions satisfy privacy and utility properties Definition and construction of minimal disclosure workflow Algorithms implemented in model-checkers, e.g. SPIN, MOCHA 19 Auditing Results Who to blame? Accountability – Design of audit log – Use Lamport causality structure, standard concept from distributed computing Algorithms – – Irresponsibility + causality Finding agents accountable for policy violation in graph-based workflows using audit log Finding agents who act irresponsibly using audit log Algorithms use oracle: – – O(msg) = contents(msg) Minimize number of oracle calls 20 Conclusions Framework inspired by contextual integrity Business Process as Workflow Role-based responsibility for human and mechanical agents Compliance checking Workflow design assuming agents responsible Privacy, utility decidable (model-checking) Minimal disclosure workflow constructible Auditing logs when agents irresponsible Automated From policy violation to accountable agents Finding irresponsible agents Using oracle Case studies – – MyHealth patient portal deployed at Vanderbilt University hospital Ongoing interactions with UPMC 21 Future Work Framework – – – Compliance vs. risk management Privacy principles – – Focusing on healthcare Detailed specification of privacy laws – Current results apply to system specification More case studies – Utility decidability; small model theorem Auditing algorithms Privacy analysis of code – Minimum necessary one example; what else? Improve algorithmic results – Do we have the right concepts? Adding time , finer-grained data model Priorities of rules, inconsistency, paraconsistency Immediate focus on HIPAA, GLBA, COPPA Legal, economic incentives for responsible behavior 22 Publications/Credits – A. Barth, A. Datta, J. C. Mitchell, S. Sundaram Privacy and Utility in Business Processes, to appear in Proceedings of 20th IEEE Computer Security Foundations Symposium, July 2007. – A. Barth, A. Datta, J. C. Mitchell, H. Nissenbaum Privacy and Contextual Integrity: Framework and Applications, in Proceedings of 27th IEEE Symposium on Security and Privacy , pp. 184-198, May 2006. Work covered in The Economist, IEEE Security & Privacy editorial 23 Thanks Questions? 24 Additional Technical Slides 25 Related Languages Model Sender Recipient Subject Attributes Past Future Combination RBAC Role Identity XACML Flexible Flexible Flexible o o EPAL Fixed Role Fixed o P3P Fixed Role Fixed o o LPU Role Role Role Legend: unsupported o partially supported fully supported Utility not considered LPU fully supports attributes, combination, temporal conditions 26 Deciding Utility ATL* model-checking of concurrent game structures is – – Decidable with perfect information Undecidable with imperfect information Theorem: There is a sound decision procedure for deciding whether workflow achieves utility Intuition: – Translate imperfect information into perfect information by considering all possible actions from one player’s point of view 27 Local communication game Quotient – structure under invisible actions, Gp States: Smallest equivalence relation K1 ~p K2 if K1 K2 and a is invisible to p – Actions: [K] [K’] if there exists K1 in [K] and K2 in [K’] s.t. K1 K2 For all LTL formulas visible to p, Gp |= <<p>> implies G |= <<p>> Lemma: 28 Auditing Results Definitions – – Design of audit log Algorithms – – Policy compliance, locally compliant Causality, accountability Finding agents accountable for locally-compliant policy violation in graph-based workflows using audit log Finding agents who act irresponsibly using audit log Algorithms use oracle: – – O(msg) = contents(msg) Minimize number of oracle calls 29 Policy compliance/violation Contemplated Action Judgment Policy Future Reqs History Strong compliance – – [BDMN2006] Action does not violate current policy requirements Future policy requirements after action can be met Locally compliant policy – Agents can determine strong compliance based on their local view of history 30 Causality Lamport Causality [1978] “happened-before” 31 Accountability & Audit Log Accountability – Causality + Irresponsibility Audit – – log design Records all Send(p,q,m) and Receive(p,q,m) events executed Maintains causality structure O(1) operation per event logged 32 Auditing Algorithm Goal Find agents accountable for a policy violation Algorithm(Audit log A, Violation v) 1. 2. Construct G, the causality graph for v in A Run BFS on G. At each Send(p, q, m) node, check if tags(m) = O(m). If not, and p missed a tag, output p as accountable Theorem: – The algorithm outputs at least one accountable agent for every violation of a locally compliant policy in an audit log of a graph-based workflow that achieves the policy in the responsible model 33 Proof Idea Causality graph G includes all accountable agents – There is at least one irresponsible agent in G – – Accountability = Causality + Irresponsibility Policy is satisfied if all agents responsible Policy is locally compliant In graph-based workflows, safety responsibilities violated only by mistagging – O(msg) = tags(msg) check identifies all irresponsible actions 34 MyHealth Example 1. Policy violation: Secretary Candy receives health-question mistagged as appointment-request 2. Construct causality graph G and search backwards using BFS Candy received message m from Patient Jorge. O(m) = health-question, but tags(m) = appointment-request. Patient responsible for health-question tag. Jorge identified as accountable 35