Sensitive Data and Multilevel Database Issues

Sensitive Data
Data that should not be made public
What if some, but not all, of the elements of a DB are sensitive?

• Inherently sensitive
• From a sensitive source
• Declared sensitive
• Part of a sensitive attribute or record
• Sensitive in relation to previously disclosed information
Access Decisions
Need an access policy (programmed into the DBMS)
Availability – blocking; permanent blocking
Acceptability of Access (sensitive data)
Assurance of Authenticity

Types of Disclosures

Exact Data

Bounds

Negative Results

Existence of Data

Probable Values
Security vs. Precision
Aim to protect all sensitive data while revealing as much nonsensitive data as possible
Want to maintain perfect confidentiality with maximum precision

Inference
A way to infer or derive sensitive data from nonsensitive data
Direct Attack

• List NAME where SEX=M ∧ DRUGS=1
• List NAME where (SEX=M ∧ DRUGS=1) ∨ (SEX≠M ∧ SEX≠F) ∨ (DORM=AYRES)
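The second query is a disguised form of the first: its extra clauses are there to make the request look less pointed, yet they select no additional records. A minimal sketch, assuming a small hypothetical table (the rows and the helper names `obvious`, `padded`, and `list_names` are invented for illustration and do not come from the slides):

```python
# Hypothetical toy records; none of this data comes from the slides.
STUDENTS = [
    {"NAME": "Avery", "SEX": "M", "DRUGS": 1, "DORM": "Holmes"},
    {"NAME": "Blake", "SEX": "F", "DRUGS": 0, "DORM": "Grey"},
    {"NAME": "Casey", "SEX": "M", "DRUGS": 0, "DORM": "Grey"},
    {"NAME": "Drew",  "SEX": "F", "DRUGS": 2, "DORM": "Holmes"},
]

def obvious(r):
    # The plain direct attack: SEX=M AND DRUGS=1.
    return r["SEX"] == "M" and r["DRUGS"] == 1

def padded(r):
    # Same predicate, ORed with clauses that match nobody in this toy table:
    # every SEX value is M or F, and no record has DORM=AYRES (an assumption
    # of this example).
    return (obvious(r)
            or (r["SEX"] != "M" and r["SEX"] != "F")
            or r["DORM"] == "AYRES")

def list_names(pred):
    return [r["NAME"] for r in STUDENTS if pred(r)]

direct = list_names(obvious)
disguised = list_names(padded)
print(direct, disguised)
```

Both lists contain the same single name, so a control that only inspects the surface form of a query would not stop this attack.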
Indirect Attack

Sum
• Show STUDENT-AID WHERE SEX=F ∧ DORM=Grey

Count
• Show Count, STUDENT-AID WHERE SEX=M ∧ DORM=Holmes
• List NAME where (SEX=M ∧ DORM=Holmes)
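The sum and count queries can be sketched as follows, assuming a hypothetical table in which `AID` stands in for STUDENT-AID (the rows, amounts, and helper names are invented). A sum of zero reveals that nobody in the group receives aid, and a count of one turns the group's sum into one person's value.

```python
# Hypothetical toy records; AID stands in for STUDENT-AID.
STUDENTS = [
    {"NAME": "Avery", "SEX": "M", "DORM": "Holmes", "AID": 5000},
    {"NAME": "Blake", "SEX": "F", "DORM": "Grey",   "AID": 0},
    {"NAME": "Casey", "SEX": "M", "DORM": "Grey",   "AID": 3000},
    {"NAME": "Drew",  "SEX": "F", "DORM": "Holmes", "AID": 2000},
    {"NAME": "Ellis", "SEX": "F", "DORM": "Grey",   "AID": 0},
]

def group(sex, dorm):
    return [r for r in STUDENTS if r["SEX"] == sex and r["DORM"] == dorm]

def total_aid(sex, dorm):
    return sum(r["AID"] for r in group(sex, dorm))

def count(sex, dorm):
    return len(group(sex, dorm))

def names(sex, dorm):
    return [r["NAME"] for r in group(sex, dorm)]

# Sum attack: a total of 0 means no woman in Grey receives any aid.
grey_women_aid = total_aid("F", "Grey")

# Count attack: the count is 1, so the group's sum is one person's aid,
# and the NAME query identifies that person.
holmes_men_count = count("M", "Holmes")
holmes_men_aid = total_aid("M", "Holmes")
holmes_men_names = names("M", "Holmes")
print(grey_women_aid, holmes_men_count, holmes_men_aid, holmes_men_names)
```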


Median
Tracker Attacks – using additional queries whose results can be combined to isolate a small result that the DBMS would not release directly
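A minimal sketch of a tracker attack, assuming a hypothetical DBMS policy that refuses any count below a threshold (`MIN_RESULT_SIZE`, the rows, and the helper names are all invented for illustration). The attacker never asks the forbidden question; two permitted counts are subtracted instead.

```python
# Hypothetical toy records, invented for illustration.
ROWS = [
    {"NAME": "Avery",  "SEX": "M", "RACE": "C", "DORM": "Holmes"},
    {"NAME": "Blake",  "SEX": "F", "RACE": "C", "DORM": "Holmes"},
    {"NAME": "Casey",  "SEX": "F", "RACE": "B", "DORM": "Holmes"},
    {"NAME": "Drew",   "SEX": "F", "RACE": "C", "DORM": "Grey"},
    {"NAME": "Ellis",  "SEX": "F", "RACE": "B", "DORM": "Grey"},
    {"NAME": "Finley", "SEX": "M", "RACE": "B", "DORM": "Grey"},
]

MIN_RESULT_SIZE = 2  # assumed policy: refuse counts smaller than this

def count(pred):
    n = sum(1 for r in ROWS if pred(r))
    if n < MIN_RESULT_SIZE:
        raise PermissionError("result too small to release")
    return n

def is_female(r):
    return r["SEX"] == "F"

def narrow(r):
    return r["RACE"] == "C" and r["DORM"] == "Holmes"

# The direct query matches a single record, so the DBMS refuses it.
try:
    count(lambda r: is_female(r) and narrow(r))
    direct_refused = False
except PermissionError:
    direct_refused = True

# Tracker: count(a AND b) = count(a) - count(a AND NOT b).
# Both counts on the right are large enough to be released.
inferred = count(is_female) - count(lambda r: is_female(r) and not narrow(r))
print(direct_refused, inferred)
```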
Controls
Suppression – don’t provide sensitive data
Concealing – don’t provide actual values, only values “close to” them
Limited Response Suppression

• The n-item k-percent rule eliminates low-frequency elements from being displayed (additional rows/columns may need to be suppressed so that the withheld values cannot be recovered from totals)
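One common reading of the n-item k-percent rule is that a cell is withheld when n or fewer of its items account for more than k percent of the reported total. A minimal sketch under that assumption (the function name `must_suppress` and the sample figures are hypothetical):

```python
def must_suppress(contributions, n, k):
    """Return True if the n largest contributions exceed k percent of the total."""
    total = sum(contributions)
    if total <= 0:
        # This sketch assumes non-negative contributions and chooses not to
        # flag an empty or all-zero cell; a real policy might treat it differently.
        return False
    largest = sum(sorted(contributions, reverse=True)[:n])
    return largest * 100 > k * total

dominated = must_suppress([9000, 500, 500], n=1, k=80)   # one item is 90% of the cell
balanced = must_suppress([3000, 3000, 4000], n=1, k=80)  # largest item is 40%
single = must_suppress([5000], n=1, k=80)                # a lone item is 100%
print(dominated, balanced, single)
```

The sketch only decides whether one cell is sensitive. Choosing which additional rows or columns to withhold, so the cell cannot be recomputed from totals, is a separate step not shown here.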
Controls

Combined Results
• Sums
• Ranges
• Rounding
Random Sample
Random Data Perturbation
Query Analysis – “should the result be provided?”
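Two of the controls above, rounding and random data perturbation, are simple enough to sketch. This is an illustration under stated assumptions: the function names, the rounding base, the error range, and the sample values are all hypothetical.

```python
import random

def round_to_base(value, base):
    """Round a non-negative integer to the nearest multiple of base (halves round up)."""
    return (value + base // 2) // base * base

def perturb(values, spread, rng):
    """Add an independent error drawn uniformly from [-spread, spread] to each value."""
    return [v + rng.uniform(-spread, spread) for v in values]

aid = [5000, 1234, 3050, 2000]  # hypothetical sensitive values

# Rounding: report each value only to the nearest 100.
rounded = [round_to_base(v, 100) for v in aid]

# Perturbation: report values that are close to, but not equal to, the real ones.
# A fixed seed is used here only to make the sketch repeatable.
noisy = perturb(aid, spread=50, rng=random.Random(0))
print(rounded)
```

Because each error is bounded by `spread`, the mean of the perturbed values stays within `spread` of the true mean, which is why aggregate statistics remain useful while individual values are disguised.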

Conclusion on the Inference Problem

Suppress obviously sensitive
information

Track what the user knows

Disguise the data
Aggregation
Building sensitive results from less sensitive inputs
Data mining – the process of sifting through multiple databases and correlating multiple data elements to find useful information

Multilevel Databases

Differentiated Security
• The security of a single element may differ from the security of other elements
• Two levels (sensitive and nonsensitive) are inadequate to represent some security situations
• The security of an aggregate (sum, count, …) may differ from the security of the individual elements

Granularity
Security Issues

Integrity
• *-property for access control
• Either a process cleared at a high level cannot write to a lower level, or the process must be a “trusted process”
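The write rule above can be sketched as a single check. This assumes a totally ordered set of levels with hypothetical names (`low`, `medium`, `high`); categories or compartments, which real multilevel systems often add, are omitted.

```python
# Hypothetical ordered levels; the slides do not name any.
LEVELS = {"low": 0, "medium": 1, "high": 2}

def may_write(process_level, object_level, trusted=False):
    """Allow a write only at the same or a higher level, unless the process is trusted."""
    if trusted:
        return True
    return LEVELS[process_level] <= LEVELS[object_level]

print(may_write("high", "low"))                # False: write-down is blocked
print(may_write("low", "high"))                # True
print(may_write("high", "low", trusted=True))  # True: trusted process exemption
```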

Confidentiality
• Different users at different levels may get different query results
• Polyinstantiation – a record can appear more than once, with different levels of confidentiality
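A minimal sketch of polyinstantiation, assuming numeric levels and a hypothetical `assignment` field (the records and the `lookup` helper are invented). Returning the highest-level instance the user is cleared to see is one possible policy, chosen here for illustration.

```python
# The same name appears twice, once per confidentiality level (hypothetical data).
RECORDS = [
    {"name": "Rowan", "level": 0, "assignment": "program manager"},
    {"name": "Rowan", "level": 2, "assignment": "field agent"},
    {"name": "Sage",  "level": 0, "assignment": "analyst"},
]

def lookup(name, clearance):
    """Return the highest-level instance of the record visible at this clearance."""
    visible = [r for r in RECORDS
               if r["name"] == name and r["level"] <= clearance]
    return max(visible, key=lambda r: r["level"], default=None)

low_user = lookup("Rowan", clearance=0)
high_user = lookup("Rowan", clearance=2)
print(low_user["assignment"], "|", high_user["assignment"])
```

The two users issue the same query and receive different answers, which is the confidentiality behavior the bullets describe.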
Proposals for Multilevel Security

Separation
• Partitioning – divide the DB into separate DBs, each with its own level of sensitivity
• Encryption (time-consuming)
• Integrity Lock – each data item contains a sensitivity label and a checksum
Sensitivity label must be unforgeable, unique, concealed
Checksum must be unique
Sensitivity lock
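A minimal sketch of an integrity lock, using an HMAC as the checksum. This is one possible construction, not necessarily the one the slides have in mind: the key, the record and field identifiers, and the helper names are hypothetical. Binding the record and field into the checksum is what makes it unique to one element in this sketch. The label is stored in the clear here, so the "concealed" requirement is not covered.

```python
import hashlib
import hmac
import json

# Hypothetical secret held only by the trusted component that applies locks.
KEY = b"demo-key-for-illustration-only"

def checksum(record_id, field, value, label):
    # Cover the element's location as well as its value and label, so a locked
    # item cannot be copied to another record or field without detection.
    message = json.dumps([record_id, field, value, label]).encode()
    return hmac.new(KEY, message, hashlib.sha256).hexdigest()

def lock(record_id, field, value, label):
    return {"value": value, "label": label,
            "checksum": checksum(record_id, field, value, label)}

def verify(record_id, field, item):
    expected = checksum(record_id, field, item["value"], item["label"])
    return hmac.compare_digest(expected, item["checksum"])

item = lock("r17", "salary", "52000", "secret")
relabeled = dict(item, label="public")  # attacker lowers the label

print(verify("r17", "salary", item))       # True
print(verify("r17", "salary", relabeled))  # False: label was changed
print(verify("r18", "salary", item))       # False: item moved to another record
```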

Design of Multilevel Secure Databases
Integrity Lock – not efficient (space/time)
Trusted Front-end (Guard) – does authentication and filtering
Commutative Filters

• Screen the user’s requests and reformat them so that only appropriate data is returned
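The request-reformatting idea can be sketched with Python's `sqlite3` module. This is a hedged illustration, not a full commutative filter: the `staff` table, its `level` column, and `filtered_select` are hypothetical, and only the query-rewriting half is shown (a complete filter would typically also re-check what the DBMS returns).

```python
import sqlite3

ALLOWED_COLUMNS = {"name", "dept"}  # hypothetical schema

def filtered_select(conn, columns, dept, clearance):
    """Rewrite a simple request so the DBMS returns only rows at or below the clearance."""
    if not columns or not set(columns) <= ALLOWED_COLUMNS:
        raise ValueError("unknown or disallowed column")
    # The filter, not the user, adds the level condition to the query.
    sql = (f"SELECT {', '.join(columns)} FROM staff "
           "WHERE dept = ? AND level <= ? ORDER BY name")
    return conn.execute(sql, (dept, clearance)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, level INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Rowan", "ops", 0), ("Sage", "ops", 2), ("Tate", "lab", 0)],
)

low_result = filtered_select(conn, ["name"], "ops", clearance=0)
high_result = filtered_select(conn, ["name"], "ops", clearance=2)
print(low_result, high_result)
```

Pushing the level condition into the query lets the DBMS do the selection work, so less inappropriate data ever leaves it.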
Design of Multilevel Secure Databases

Distributed (federated) database
• Trusted front-end controls access to two DBMSs – one for high-sensitivity data and one for low-sensitivity data
• Very complex

Window/View
• Subset of a database containing exactly the information that the user is entitled to access
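A view of this kind can be sketched with an SQL `CREATE VIEW` statement, run here through Python's `sqlite3` module. The `staff` table, its columns, and the view name `staff_low` are hypothetical; the view exposes only the low-level rows and omits the salary and level columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER, level INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [("Rowan", "ops", 52000, 0), ("Sage", "ops", 91000, 2), ("Tate", "lab", 60000, 0)],
)

# The view is the "window": a subset of rows and columns for low-clearance users.
conn.execute(
    "CREATE VIEW staff_low AS SELECT name, dept FROM staff WHERE level = 0"
)

rows = conn.execute("SELECT name, dept FROM staff_low ORDER BY name").fetchall()
view_columns = [d[0] for d in conn.execute("SELECT * FROM staff_low").description]
print(rows, view_columns)
```

SQLite has no per-user permissions, so this sketch only shows how the subset is defined. In a multi-user DBMS, the user would be granted access to the view and not to the underlying table.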