Offline Auditing for Privacy
Jeff Dwoskin, Bill Horne, Tomas Sander
Trusted Systems Laboratory
Princeton
© 2004 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice
Why Auditing for Privacy?
Potential advantages
1. Collect and analyze log data to detect privacy violations offline
   – May also work where enforcement doesn’t
2. Create trail of what happened to privacy sensitive data for
   • Documentation
   • Forensics
   • Demonstrate compliance with internal privacy policy
3. Watch the watchers
18 April 2020 page 2
Two challenges
1. How can we audit for the benefit of privacy?
   • Privacy violation detection system functionality
   • Compliance functionality
2. How can auditing itself be performed in a privacy-friendly and secure way?
   • Integrity
   • Encrypted storage
   • Pseudonymization and anonymization of audit file data
   • Etc.
What can we collect?
• Data access
  – User, Application, Time, Data record accessed
  – Source
    • E.g. machine the request came from, internal/external etc.
  – Part of the data record itself
    • E.g. age of data record subject
  – Consent information present
    • Opt in, opt out
• Privacy sensitive activities
• Deletion of records
• Consequences
– e.g. alert issued, where enforcement inappropriate
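The fields listed above could be captured as one record per access. The following is a minimal sketch, not a schema from the talk: the class name `AuditRecord`, the `Consent` enum, and every field name are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Consent(Enum):
    """Consent information present on the accessed record (assumed values)."""
    OPT_IN = "opt_in"
    OPT_OUT = "opt_out"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class AuditRecord:
    """One logged event; all field names are hypothetical."""
    user: str                               # who accessed the data
    application: str                        # which application made the request
    time: datetime                          # when the access happened
    record_id: str                          # which data record was accessed
    source_host: Optional[str] = None       # machine the request came from
    source_internal: Optional[bool] = None  # internal vs. external request
    subject_age: Optional[int] = None       # part of the data record itself
    consent: Consent = Consent.UNKNOWN      # opt in / opt out
    activity: str = "access"                # e.g. "access", "delete"
    consequence: Optional[str] = None       # e.g. "alert issued"


example = AuditRecord(
    user="dr_smith",
    application="ehr-viewer",
    time=datetime(2004, 6, 1, 9, 30),
    record_id="patient-17",
    source_host="ward-pc-03",
    source_internal=True,
    subject_age=42,
    consent=Consent.OPT_IN,
)
```

Making the record immutable (`frozen=True`) is a design choice that fits an audit trail, where entries should not be altered after they are written.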
How can we analyze collected data?
• Against simple privacy policy rules
– (e.g., expressed in languages like EPAL)
• Have counters and collect statistics about behaviors that might be suspicious.
• Organize them into reports.
• Hope:
– Offline auditing can be more sophisticated due to lack of real-time requirements.
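The three steps above (check rules, keep counters, build a report) can be sketched as follows. This is an illustration under stated assumptions, not an EPAL evaluator: the rule "marketing use requires opt-in", the `purpose` field, the function names, and the threshold are all hypothetical.

```python
from collections import Counter


def violates_consent_rule(entry: dict) -> bool:
    """Hypothetical policy rule: marketing use requires opt-in consent."""
    return entry["purpose"] == "marketing" and entry["consent"] != "opt_in"


def build_report(entries, threshold=3):
    """Check every entry against the rule and count accesses per user."""
    violations = [e for e in entries if violates_consent_rule(e)]
    per_user = Counter(e["user"] for e in entries)
    # Users at or above the (assumed) threshold are listed for manual review.
    heavy_users = sorted(u for u, n in per_user.items() if n >= threshold)
    return {
        "violations": violations,
        "access_counts": dict(per_user),
        "heavy_users": heavy_users,
    }


log = [
    {"user": "alice", "purpose": "treatment", "consent": "opt_out"},
    {"user": "bob", "purpose": "marketing", "consent": "opt_out"},
    {"user": "bob", "purpose": "marketing", "consent": "opt_in"},
    {"user": "bob", "purpose": "treatment", "consent": "opt_in"},
]
report = build_report(log, threshold=3)
```

Because the analysis runs offline over the whole log, it can make several passes (rule checks, then counters) without the latency limits of real-time enforcement.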
What does HIPAA say about auditing?
We propose that audit control mechanisms be put in place to record and examine system activity. We adopt this requirement in the final rule.
How is this interpreted?
• Create events
  – creation of records that contain PHI
  – import of records that contain PHI
• Delete events
  – user command to delete PHI
  – automated command to delete PHI
• Modify events
  – editing of data
  – re-association of data
  – de-identifying of PHI
• View events
  – access to PHI by any user
  – export of PHI to digital media or network
  – print or FAX of PHI
• Non-PHI events
  – user login & logout
  – changes to user accounts
  – detection of a virus
  – network link failures
  – changes to network security configuration
  – etc.
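The event taxonomy above could be encoded so that each logged action maps to one of the five categories. This is only a sketch: the action strings and the names `EventCategory`, `ACTION_CATEGORY`, and `categorize` are assumptions, not identifiers from HIPAA or from the talk.

```python
from enum import Enum


class EventCategory(Enum):
    CREATE = "create"
    DELETE = "delete"
    MODIFY = "modify"
    VIEW = "view"
    NON_PHI = "non_phi"


# Hypothetical action names mapped to the categories listed on the slide.
ACTION_CATEGORY = {
    "create_record": EventCategory.CREATE,
    "import_record": EventCategory.CREATE,
    "user_delete": EventCategory.DELETE,
    "automated_delete": EventCategory.DELETE,
    "edit_data": EventCategory.MODIFY,
    "reassociate_data": EventCategory.MODIFY,
    "deidentify_phi": EventCategory.MODIFY,
    "access_phi": EventCategory.VIEW,
    "export_phi": EventCategory.VIEW,
    "print_phi": EventCategory.VIEW,
    "fax_phi": EventCategory.VIEW,
    "user_login": EventCategory.NON_PHI,
    "user_logout": EventCategory.NON_PHI,
    "account_change": EventCategory.NON_PHI,
    "virus_detected": EventCategory.NON_PHI,
    "network_link_failure": EventCategory.NON_PHI,
    "security_config_change": EventCategory.NON_PHI,
}


def categorize(action: str) -> EventCategory:
    """Return the category of an action; raises KeyError for unmapped actions."""
    return ACTION_CATEGORY[action]
```

Raising on an unknown action, rather than silently defaulting, makes gaps in the mapping visible during an audit.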
What kinds of things might you look for?
• access to PHI by anyone not directly related to the patient's treatment, payment, or healthcare operations
• access to information not corresponding to the role of the user
• access to PHI of VIPs or community figures
• access to records that have not been accessed in a long time
• access to PHI of an employee
• access to PHI of a terminated employee
• access to sensitive records such as psychiatric records
• access to PHI of minors
• data recorded without a corresponding order
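Several of the checks listed above can be expressed as simple predicates over one access and the patient it concerns. The sketch below covers only some of them and rests on assumptions: the dictionary keys (`care_team`, `is_vip`, `is_employee`, `age`, `last_accessed`, `record_type`), the age limit of 18, and the one-year staleness window are all hypothetical.

```python
from datetime import datetime, timedelta

SENSITIVE_TYPES = {"psychiatric"}  # assumed set of sensitive record types


def suspicious_reasons(access, patient, now, stale_after=timedelta(days=365)):
    """Return the list of reasons why one access deserves a closer look."""
    reasons = []
    if access["user"] not in patient["care_team"]:
        reasons.append("user not tied to treatment, payment, or operations")
    if patient.get("is_vip"):
        reasons.append("VIP or community figure")
    if patient.get("is_employee"):
        reasons.append("patient is an employee")
    if patient["age"] < 18:  # assumed definition of a minor
        reasons.append("patient is a minor")
    if access["record_type"] in SENSITIVE_TYPES:
        reasons.append("sensitive record type")
    last = patient.get("last_accessed")
    if last is not None and now - last > stale_after:
        reasons.append("record not accessed in a long time")
    return reasons
```

An empty list means none of the implemented checks fired; it does not mean the access was legitimate.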
Pseudonymization
• Work by Flegel:
  – Audit data is intercepted by a local pseudonymiser and then forwarded by syslog to remote hosts or stored
  – Pseudonymiser substitutes (predefined) identifying features (types of identifying info) by shares, generated via Shamir’s secret sharing scheme.
    • Record encrypted under key K. K can be reconstructed if at least k shares are found.
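The threshold property used above (K can be rebuilt from any k shares) comes from Shamir's scheme itself. Below is a minimal sketch of generic (k, n) secret sharing over a prime field; it is not Flegel's pseudonymiser, and the prime, the function names, and the choice to treat the key as an integer are assumptions made for illustration. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_secret(secret: int, k: int, n: int):
    """Split `secret` into n shares so that any k of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over at least k distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `split_secret(K, 3, 5)` yields five shares, and `recover_secret` applied to any three of them returns `K`; with fewer than three shares the interpolated value gives no information about `K`.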
Further work on pseudonymization
• Anonymouse log file anonymiser: analysis possible, but anonymised data cannot be recovered
• Privacy enhanced IDS supports the recovery of pseudonymised info e.g. IDA, AID
Searching encrypted log data
• Ex: public key based solutions:
  – IBE based solutions
    • Waters, Balfanz, Durfee, Smetters
    • Boneh, Di Crescenzo, Ostrovsky, Persiano
• Idea:
– In Identity Based Encryption (IBE) every string can be used as a public key for encryption
– Corresponding decryption key supplied by key distribution center (KDC)
Searching Encrypted Log Files II
• Encryption:
  1. For each document m, choose a random symmetric key K and encrypt m under K.
  2. For keywords w1, …, wl in m, encrypt (FLAG, K) under each of the public keys w1, …, wl.
  3. Store the results c1, …, cl with the encrypted document.
• Keyword search:
  1. For keyword w, the investigator requests the private key corresponding to w from the KDC.
  2. For each document m, the investigator attempts decryption of c1, …, cl.
  3. If FLAG is found, the document contains w and K is recovered.
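The data flow of the two procedures above can be sketched with the standard library only. Two substitutions are made and should be read as assumptions: real IBE is replaced by a toy KDC that derives a per-keyword symmetric key with HMAC (so, unlike IBE, the encryptor must also ask the KDC for keys), and the cipher is a toy SHA-256 keystream that is not secure. All names (`ToyKDC`, `encrypt_entry`, `search`, `FLAG`) are hypothetical.

```python
import hashlib
import hmac
import secrets

FLAG = b"FLAG:"  # fixed marker that signals a successful decryption


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy stream cipher (illustration only, NOT secure)."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class ToyKDC:
    """Stand-in for the IBE key distribution center."""

    def __init__(self):
        self._master = secrets.token_bytes(32)

    def key_for(self, keyword: str) -> bytes:
        return hmac.new(self._master, keyword.encode(), hashlib.sha256).digest()


def encrypt_entry(kdc: ToyKDC, document: bytes, keywords):
    k = secrets.token_bytes(32)                # step 1: random key K
    body = toy_encrypt(k, document)            # step 1: encrypt m under K
    tags = [toy_encrypt(kdc.key_for(w), FLAG + k) for w in keywords]  # step 2
    return body, tags                          # step 3: store c1..cl with doc


def search(entries, keyword_key: bytes):
    """Try the keyword key on every c_i; FLAG reveals K for matching docs."""
    hits = []
    for body, tags in entries:
        for c in tags:
            plain = toy_decrypt(keyword_key, c)
            if plain.startswith(FLAG):
                hits.append(toy_decrypt(plain[len(FLAG):], body))
                break
    return hits
```

An investigator who obtains `kdc.key_for("alice")` can decrypt exactly the entries tagged with that keyword; entries without it stay opaque, since a wrong key yields a matching `FLAG` prefix only with negligible probability.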