Chapter 6 – Database Security
Integrity for databases: record integrity, data correctness, update integrity
Security for databases: access control, inference, and aggregation
Multilevel secure databases: partitioned, cryptographically sealed, filtered

Introduction to Databases
Database – collection of data and a set of rules that organize the data by specifying certain relationships among the data
Database administrator (DBA)
Database management system (DBMS) – database manager, front end
Records – contain a related group of data
Fields (elements) – elementary data items
Schema – logical structure of the database
Subschema – view into the database
Relational
  • Tables (relations): rows (tuples) and columns (attributes)
  • DB2, Oracle, Access
Hierarchical
  • IMS
Object-oriented
Queries
  • SELECT NAME = ‘ADAMS’
  • SELECT (ZIP = ‘43210’) ∧ (NAME = ‘ADAMS’)
Project
  • SHOW FIRST WHERE (ZIP = ‘43210’) ∧ (NAME = ‘ADAMS’)
Join
  • SHOW NAME, AIRPORT WHERE NAME.ZIP = AIRPORT.ZIP

Advantages of Using Databases
Shared access
Minimal redundancy
Data consistency
Data integrity
Controlled access

Security Requirements
Physical database integrity
Logical database integrity
Element integrity
Auditability
Access control
User authentication
Availability

Integrity of the Database
Users must be able to trust the accuracy of the data values
Updates are performed only by authorized individuals
Integrity is the responsibility of the DBMS, the OS, and the computing system manager
Must be able to reconstruct the database at the point of a failure

Element Integrity
Correctness or accuracy of elements
Field checks
Access control
Maintain a change log – list every change made to the database

Auditability & Access Control
Desirable to generate an audit record of all accesses to the database (reads/writes)
Pass-through problem – a record or element can be accessed without the data being transferred to the user (no reads/writes)
Databases separated logically by user access privileges

Other Security Requirements
User authentication
Confidentiality
Availability

Reliability and Integrity
Database integrity
Element integrity
Element accuracy
Some protection from the OS
  • File access
  • Data integrity checks

Two-Phase Update
Problem – the computing system fails in the middle of modifying data
Intent Phase – gather the resources needed for the update; write a commit flag to the database
Update Phase – make the permanent changes

Redundancy / Internal Consistency
Error detection / correction codes (parity bits, Hamming codes, CRCs)
Shadow fields
Log of user accesses and changes

Concurrency / Consistency
Access by two users sharing the same database must be constrained (lock)
Monitors – check entered values to ensure consistency with the rest of the DB
  • Range comparisons
  • State constraints – describe the condition of the database (unique employee #)
  • Transition constraints – conditions that must hold before changes are applied to the DB

Sensitive Data
Data that should not be made public
What if some, but not all, of the elements of a DB are sensitive?
  • Inherently sensitive
  • From a sensitive source
  • Declared sensitive
  • Part of a sensitive attribute or record
  • Sensitive in relation to previously disclosed information

Access Decisions
Need an access policy (programmed into the DBMS)
Availability – blocking; permanent blocking
Acceptability of access (sensitive data)
Assurance of authenticity

Types of Disclosures
Exact data
Bounds
Negative results
Existence of data
Probable values

Security vs. Precision
Aim to protect all sensitive data while revealing as much nonsensitive data as possible
Want to maintain perfect confidentiality with maximum precision

Inference
Way to infer / derive sensitive data from nonsensitive data
Direct Attack
  • List NAME where SEX=M ∧ DRUGS=1
  • List NAME where (SEX=M ∧ DRUGS=1) ∨ (SEX≠M ∧ SEX≠F) ∨ (DORM=AYRES)
Indirect Attack
  Sum
  • Show STUDENT-AID WHERE SEX=F ∧ DORM=Grey
  Count
  • Show Count, STUDENT-AID WHERE SEX=M ∧ DORM=Holmes
  • List NAME where (SEX=M ∧ DORM=Holmes)
  Median
Tracker Attacks – use additional queries that produce small results

Controls
Suppression – don’t provide sensitive data
Concealing – don’t provide actual values (“close to”)
Limited Response Suppression
  • n-item k-percent rule eliminates low-frequency elements from being displayed (may need to suppress additional rows/columns)
Combined Results
  • Sums
  • Ranges
  • Rounding
Random Sample
Random Data Perturbation
Query Analysis – “should the result be provided?”

Conclusion on the Inference Problem
Suppress obviously sensitive information
Track what the user knows
Disguise the data

Aggregation
Building sensitive results from less sensitive inputs
Data mining – process of sifting through multiple databases and correlating multiple data elements to find useful information

Multilevel Databases
Differentiated Security
  • Security of a single element may be different from the security of other elements
  • Two levels (sensitive and nonsensitive) are inadequate to represent some security situations
  • Security of an aggregate (sum, count, …) may be different from the security of the individual elements
Granularity

Security Issues
Integrity
  • *-property for access control
  • Either a process cleared at a high level cannot write to a lower level, or the process must be a “trusted process”
Confidentiality
  • Different users at different levels may get different query results
  • Polyinstantiation – a record can appear more than once, with different levels of confidentiality

Proposals for Multilevel Security
Separation
  • Partitioning – divide the DB into separate DBs, each with its own level of sensitivity
  • Encryption (time consuming)
  • Integrity Lock – each data item contains a sensitivity label and a checksum
    Sensitivity label must be unforgeable, unique, concealed
    Checksum must be unique
  • Sensitivity lock

Design of Multilevel Secure Databases
Integrity Lock – not efficient (space/time)
Trusted Front-end (Guard) – does authentication and filtering
Commutative Filters – screen the user’s request and reformat it so that only appropriate data is returned
Distributed (federated) database
  • Trusted front-end controls access to two DBMSs – one for high-sensitivity data and one for low-sensitivity data
  • Very complex
Window/View
  • Subset of a database containing exactly the information that the user is entitled to access
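
The Window/View idea can be made concrete with a small sketch. The Python example below uses the standard-library sqlite3 module. The table name, the view name, and the sample rows are hypothetical and invented only for illustration; the column names echo the field names used in the inference queries earlier in the chapter. The view exposes only the columns assumed to be nonsensitive. SQLite has no GRANT/REVOKE, so this shows the subset itself rather than enforcement; in a multi-user DBMS one would typically grant users access to the view and not to the base table.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (
        name        TEXT,
        dorm        TEXT,
        student_aid INTEGER,  -- assumed sensitive
        drugs       INTEGER   -- assumed sensitive
    );
    INSERT INTO student VALUES ('ADAMS',  'Holmes', 5000, 1);
    INSERT INTO student VALUES ('BAILEY', 'Grey',      0, 0);

    -- The view (window) contains only what a low-privilege user may see.
    CREATE VIEW student_public AS
        SELECT name, dorm FROM student;
    """
)


def read_public_view(connection):
    """Return (column names, rows) visible through the student_public view."""
    cursor = connection.execute("SELECT * FROM student_public ORDER BY name")
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


columns, rows = read_public_view(conn)
print(columns)
print(rows)
```

A query against student_public never returns student_aid or drugs, because those columns are not part of the view's definition. This is a sketch of the subset idea only; it does not address inference from the values that remain visible.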