Sensitive Data
Data that should not be made public
What if some but not all of the elements of a DB are sensitive?
• Inherently sensitive
• From a sensitive source
• Declared sensitive
• Part of a sensitive attribute or record
• Sensitive in relation to previously disclosed information

Access Decisions
• Need an access policy (programmed into the DBMS)
• Availability – blocking; permanent blocking
• Acceptability of Access (sensitive data)
• Assurance of Authenticity

Types of Disclosures
• Exact Data
• Bounds
• Negative Results
• Existence of Data
• Probable Values

Security vs. Precision
• Aim to protect all sensitive data while revealing as much nonsensitive data as possible
• Want to maintain perfect confidentiality with maximum precision

Inference
Way to infer / derive sensitive data from nonsensitive data

Direct Attack
• List NAME where SEX=M ∧ DRUGS=1
• List NAME where (SEX=M ∧ DRUGS=1) ∨ (SEX≠M ∧ SEX≠F) ∨ (DORM=AYRES)

Indirect Attack
• Sum
  • Show STUDENT-AID WHERE SEX=F ∧ DORM=Grey
• Count
  • Show Count, STUDENT-AID WHERE SEX=M ∧ DORM=Holmes
  • List NAME where (SEX=M ∧ DORM=Holmes)
• Median
• Tracker Attacks – using additional queries that produce small results

Controls
• Suppression – don't provide sensitive data
• Concealing – don't provide actual values ("close to")
• Limited Response Suppression
  • n-item k-percent rule eliminates low-frequency elements from being displayed (may need to suppress additional rows/columns)
• Combined Results
  • Sums
  • Ranges
  • Rounding
• Random Sample
• Random Data Perturbation
• Query Analysis – "should the result be provided?"

Conclusion on the Inference Problem
• Suppress obviously sensitive information
• Track what the user knows
• Disguise the data

Aggregation
• Building sensitive results from less sensitive inputs
• Data mining – process of sifting through multiple databases and correlating multiple data elements to find useful information

Multilevel Databases
Differentiated Security
• Security of a single element may be different from security of other elements
• Two levels (sensitive and nonsensitive) are inadequate to represent some security situations
• Security of an aggregate (sum, count, …) may be different from security of the individual elements
Granularity

Security Issues
Integrity
• *-property for access control
• Either a process cleared at a high level cannot write to a lower level, or the process must be a "trusted process"
Confidentiality
• Different users at different levels may get different query results
• Polyinstantiation – a record can appear more than once, with different levels of confidentiality

Proposals for Multilevel Security
Separation
• Partitioning – divide the DB into separate DBs, each with its own level of sensitivity
• Encryption (time consuming)
• Integrity Lock – each data item contains a sensitivity label and a checksum
  • Sensitivity label must be unforgeable, unique, concealed
  • Checksum must be unique
• Sensitivity lock

Design of Multilevel Secure Databases
• Integrity Lock – not efficient (space/time)
• Trusted Front-end (Guard) – does authentication and filtering
• Commutative Filters – screen the user's request and reformat it so that only appropriate data is returned
• Distributed (federated) database
  • Trusted front-end controls access to two DBMSs – one for high-sensitivity data and one for low-sensitivity data
  • Very complex
• Window/View
  • Subset of a database containing exactly the information that the user is entitled to access
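
Two of the ideas above, the integrity lock (a sensitivity label plus a checksum on each data item) and the window/view (only the data the user is entitled to see), can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference design: the level names in `LEVELS`, the key `GUARD_KEY`, and the helpers `lock`, `verify`, and `view_for` are all hypothetical; the checksum is an HMAC-SHA256 chosen for the example; and the sketch does not conceal the label, which a real integrity lock would also have to do.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical ordering of sensitivity levels; the names are illustrative only.
LEVELS = {"public": 0, "confidential": 1, "secret": 2}

# Assumption: this key is held only by the trusted front end, never by the DBMS.
GUARD_KEY = b"demo-key-held-by-the-trusted-front-end"


@dataclass(frozen=True)
class LockedItem:
    record_id: str  # which record the value belongs to
    field: str      # which attribute the value belongs to
    value: str      # the data element itself
    label: str      # sensitivity label
    checksum: str   # keyed checksum binding the four fields above


def _checksum(record_id, field, value, label):
    # Length-prefix each part so that shifting bytes between parts changes the input.
    message = b"".join(
        len(part.encode()).to_bytes(4, "big") + part.encode()
        for part in (record_id, field, value, label)
    )
    return hmac.new(GUARD_KEY, message, hashlib.sha256).hexdigest()


def lock(record_id, field, value, label):
    """Attach a sensitivity label and a keyed checksum to one data element."""
    if label not in LEVELS:
        raise ValueError(f"unknown sensitivity label: {label}")
    return LockedItem(record_id, field, value, label,
                      _checksum(record_id, field, value, label))


def verify(item):
    """Return True if value, label, and position still match the checksum."""
    expected = _checksum(item.record_id, item.field, item.value, item.label)
    return hmac.compare_digest(expected, item.checksum)


def view_for(items, clearance):
    """Return the elements visible at the given clearance; reject tampered items."""
    visible = []
    for item in items:
        if not verify(item):
            raise ValueError(
                f"integrity lock broken on {item.record_id}.{item.field}")
        if LEVELS[item.label] <= LEVELS[clearance]:
            visible.append((item.record_id, item.field, item.value))
    return visible
```

Because the checksum covers the record and field identifiers as well as the value and label, copying a locked value into another record, or relabeling a secret element as public, makes `verify` fail. Storing a label and a checksum next to every element is also what makes the approach costly in space and time.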