
Trust and Security in Biological Databases:
Security When Collaborating.
Gio Wiederhold
Depts. of Computer Science, Electr. Eng. and Medicine
Stanford University, CA.
www-db.stanford.edu/people/gio.html
Four related points will be made. They are not primarily technological,
since the majority of failures we experience in protecting privacy are
caused by misunderstanding of settings and objectives.
• Protection of Privacy requires checking what goes OUT.
• Access control mechanisms only keep bad guys from getting IN.
• In bioinformatics and medicine there are many types of collaborators.
• Collaborators are allowed in, but what they take out must be controlled.
7/26/2016
Gio AAAS 04
1
1 & 3. Protection of Privacy requires checking what goes OUT.
Privacy requires that data considered private do not fall into inappropriate or
public hands.
Private data resides in a variety of systems used by a variety of collaborators

    System                                     Collaborators
a.  Medical record systems - holistic          Caregivers and researchers
b.  Drug toxicity and effectiveness studies    Researchers and pharmas
c.  Hospital and clinic admission records      Caregivers and managers
d.  Financial records                          Managers and accountants
e.  Billing and payment information            Accountants and payors
[Figure: participants with access to patient information: Patient, Physician, Ward staff, Inpatient, Clinics, Laboratory, Laboratory staff, Pharmacy, Accounting, Billing, Insurance Carriers, Accreditation, CDC, etc.]
The complexity of usage is such that imposing a fine-grained cell
structure is not practical for the information providers.
Those participants have legitimate access rights.
They don’t have the right to reveal the information they need.
They don’t have the right to read related information.
2. Access control mechanisms only keep bad guys from getting IN.
Current Solution
• Keep bad guys OUT
– Access control requires authentication and authorization
– Collaborators and Customers get into authorized areas
• Once they are IN no further checking occurs in computer systems
– Further checking is done when physical assets are protected
• Examples: warehouses, even warehouse stores:
[Figure: Access control on the way in, then Collected contents, then a Release filter on the way out. The release filter is not done now in computer systems.]
Why this omission?
Privacy is entrusted to security specialists and surrogates
• Cryptographers: provide important tools, but these serve binary settings
• Database administrators: valued for making data available
• Network administrators: valued for keeping systems accessible to users
4. Collaborators are allowed in,
but what they take out must be controlled.
Solution
• Symmetric checking of access to information systems and
also the subsequent release of their contents
– Act like a warehouse store
o Check and/or remove restricted topics in outgoing documents
   a. Researchers: Names, employers, addresses, emails, . . .
   b. Payors: other incidents, prior diseases, admissions, . . .
   c. . . . : check specific contents for each collaborating & authorized role
o Better: check that all terms in outgoing documents are acceptable
    Use a topic-specific inclusive word/phrase list, and filter others
    Paranoia is safest, and the cost is bearable
      o Most application / usage areas use fewer than 3000 terms
      o Trapped documents can be released by a security officer
o Extract text from images, such as x-rays, and then check those texts
      o Many media contain unexpected private or identifying data
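The inclusive word-list check described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term list, the tokenization rule, and the function names are hypothetical, and a real topic-specific list would be much larger (the slide suggests up to about 3000 terms).

```python
import re

# Hypothetical allow-list for one collaborator role. A real list would be
# topic-specific and far larger.
ALLOWED_TERMS = {
    "patient", "was", "given", "aspirin", "and", "the", "dose", "mg",
    "no", "adverse", "reaction", "observed",
}

def unknown_terms(document, allowed):
    """Return the sorted terms in the document that are not on the allow-list."""
    # Assumption: a term is a run of letters or digits, compared in lowercase.
    words = re.findall(r"[a-z0-9]+", document.lower())
    return sorted(set(words) - allowed)

def release_check(document, allowed):
    """Return ("release", []) or ("hold", offending_terms).

    A held document is not discarded; it is meant to be handed to a
    security officer for manual review.
    """
    offending = unknown_terms(document, allowed)
    if offending:
        return ("hold", offending)
    return ("release", [])

if __name__ == "__main__":
    print(release_check("Patient was given aspirin.", ALLOWED_TERMS))
    print(release_check("Patient John Smith was given aspirin.", ALLOWED_TERMS))
```

The check is deliberately paranoid: any term not on the list, including names and numbers, causes the document to be held rather than released.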
Release checking can also protect privacy
in commercial domains
Customers are collaborators; you want customers IN, not OUT
In simple and perfect systems they cannot access private areas, but
 System failures - trap doors, etc. abound
o Release checking provides a backstop and intrusion detection
 Updates for customer convenience create unexpected interactions
o Helpful query modification broadens access
 New usages were not foreseen during design partitioning
o Customer access to inventory for rapid supply-line verification
o New, unforeseen collaborators -- Russians in Kosovo
Techniques -- much content has signatures that are (nearly) unique
o Check to stop credit card numbers in outgoing data, as from music sites
o Check to stop email addresses in outgoing reports
• Don’t rely exclusively on access control when the objective is to
prevent the release of private information!
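A signature check of this kind might look like the following Python sketch. The regular expressions and function names are assumptions made for illustration. The Luhn checksum is the standard check-digit test for payment card numbers, used here to reduce false alarms on other long digit strings.

```python
import re

# Assumed patterns: 13 to 19 digits, optionally separated by spaces or
# hyphens, and a simplified email address shape.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_signatures(text):
    """Return card-like numbers and email addresses found in outgoing text."""
    cards = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            cards.append(digits)
    return {"cards": cards, "emails": EMAIL.findall(text)}

def safe_to_release(text):
    """Release only if no card number or email address was detected."""
    found = find_signatures(text)
    return not found["cards"] and not found["emails"]

if __name__ == "__main__":
    print(find_signatures("card 4111 1111 1111 1111, contact alice@example.org"))
    print(safe_to_release("track-0042.mp3"))
```

Because the filter inspects what leaves the system, it applies regardless of how the requester got in, which is what lets it act as a backstop for access control failures.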
Abstract: Security when Collaborating
Panel presentation on “Trust and Security in Biological Databases”; Gio Wiederhold, Ph.D, Stanford University, CA
Traditional security mechanisms have focused on access control, assuming that we can distinguish the good and the
bad guys, and can label any data collection as being accessible to the good guys. If those assumptions hold, the technology is
conceptually simple, and only made hard by technical faults. However, there are many practical situations where such sharp
distinctions cannot be made, so that the technologies developed to solve access control become inadequate. In medicine, but
also in many commercial data collections, we find unstructured data. Such data are collected and stored without the submitter
being fully aware of their future use and hence able to consider all future access needs. A complementary technology to
augment access control is result filtering: namely inspecting the contents of documents before they leave the boundary of the
protected system.
I will briefly cite the issue in two settings, one simple and one more complex. Military documents have long been
classified into mandatory and discretionary classifications. Legitimate accessors are identified with respect to those
categories. But when a new situation arises, the old labels are inadequate. When we had to share information with the
Russians in Kosovo, no adequate labeling existed. Relabeling all stored documents was clearly impractical. A filter can be
written to check the text for limited, locally relevant contents, and make those available. Any document containing
unrecognized noun-phrases would be withheld, or could be handed over to a security officer for manual processing.
More complex situations occur when we have statistical data, such as census data or, as in bioinformatics, phenotypic and
genomic data. We want to prevent the release of statistical summaries for cells that have fewer than, say, 10 instances, to
reduce the likelihood of inference back to an individual. If we use access control, we have to precompute the minima for
columns and rows and aggregate their categorizations for access to prevent release. However, the distributions in those cells
are very uneven. So if we check the actual contents at the time of release, we can allow much smaller categories to be used for
access and only omit or aggregate cells that are too small.
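As a rough illustration of checking cell sizes at release time, the Python sketch below aggregates or omits cells below a minimum count. The threshold of 10 comes from the text above; the function name, the "other" label, and the sample counts are hypothetical. The sketch also does not address inference by subtraction from a published total, which a real system would need to consider.

```python
MIN_CELL_SIZE = 10  # threshold taken from the "fewer than 10 instances" example

def filter_cells(counts, min_size=MIN_CELL_SIZE):
    """Return the cells that may be released from a query result.

    Cells with at least min_size instances pass unchanged. Smaller cells
    are pooled under the hypothetical label "other"; the pooled cell is
    itself omitted if it is still too small.
    """
    released = {}
    pooled = 0
    for category, n in counts.items():
        if n >= min_size:
            released[category] = n
        else:
            pooled += n
    if pooled >= min_size:
        released["other"] = pooled
    return released

if __name__ == "__main__":
    # Made-up counts per phenotype category, for illustration only.
    print(filter_cells({"A": 120, "B": 4, "C": 7, "D": 35}))
```

Since the decision is made on the actual result, fine-grained categories can be offered for queries, and only the cells that turn out to be too small are pooled or withheld.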
Checking results being released can also provide a barrier for credit card theft and the like. If a person who
masquerades as a customer locates a trapdoor and removes 10,000 credit cards instead of an MP3 tune, that can easily be
recognized, since those data have very different signatures.
In summary, many of our accessors are collaborators or customers, although we know little about them. We want to
give them the best possible service, and still protect our property or the privacy that individuals are trusting us to keep.
Focusing only on access control, and then not checking what is released, is an inadequate, even naive, approach for systems
involving collaboration.
Research leading to these concepts and supporting technologies was supported by NSF under the HPCC and DL2 programs
Trust and Security in Biological Databases: Brief Biography
Security when Collaborating; Gio Wiederhold, Stanford University, CA
Gio Wiederhold is an emeritus professor of Computer Science,
Medicine and Electrical Engineering at Stanford University. Since 1976
he has supervised 33 PhD theses in these departments. Currently Gio is
continuing part time at Stanford and consulting. He still gives seminars
on Business on the Internet and on Genome databases. Research
being disseminated includes privacy protection in collaborative settings,
large-scale software composition, enabling interoperation of
semantically heterogeneous information systems, including simulations
for projecting outcomes. His consulting now focuses on valuation of
intellectual property inherent in software.
Gio Wiederhold was born in Italy, received a degree in Aeronautical
Engineering in Holland in 1957 and a PhD in Medical Information
Science from the University of California at San Francisco in 1976.
Prior to his academic career he spent 16 years in the software industry.
Wiederhold has authored and coauthored more than 350 publications
and reports on computing and medicine. He spent 1991-1994 in
Washington as a program manager at DARPA. Wiederhold has been
elected fellow of the ACMI, the IEEE, and the ACM. His web page is
http://www-db.stanford.edu/people/gio.html.
Information about protecting against the release of private information can be
found at http://www-db.stanford.edu/pub/gio/TIHI/TIHI.html