Gio CS Forum Oct01-1

TIHI: Protecting Information when Access is Granted for Collaboration
Gio Wiederhold
1. Stanford University CSD (mostly) www-db.stanford.edu/people/gio.html
2. Symmetric Security Technologies www.2ST.com

Information for Collaboration
[Diagram: each kind of information and the collaborator it is shared with]
Medical Records: Insurance Company
Medical Records: Medical Researchers
Manufacturer’s Specs: Subcontractor
Business Vendor Content: Customer
Operational Data: Logistics Provider
Intelligence Data: Front-line soldier
Strategic Data: Allied Forces

Access Patterns versus Data
[Diagram labels: Laboratory staff, Clinics, Laboratory, Accounting, Accreditation, Patient, Physician, Pharmacy, Inpatient, Billing, Insurance Carriers, Ward staff, CDC, etc.]
Gio Wiederhold TIHI Oct96 3

Primitive and Safe: Isolation
• No communication among disjoint systems
• All sharing of information by data re-entry
[Diagram labels: Discretionary security, airgaps, Mandatory security]

Automation of Sharing
• Multi-level secure (MLS) system
  – Involves OS and DBMS
  – Programmed read up, write down permitted
  – Complex: hard and lengthy (1y+) to validate

MLS problem: inconsistency
• Information at each level is incomplete
  – Make up cover stories?
• Ok for enemies
• Not acceptable for our own staff/soldiers
[Diagram labels: Secret | Secret]

Multi-computer system approach
• Uses more computers
  – are cheap now
• Secure communication
  – typically manually monitored
• Avoids complexity, lags of MLS systems
  – Validation in communication portals

Security and Cryptography
• Encryption is essential
  – Hides information from enemies
  – Isolates layers from each other
  – Allows shared use of communication paths
• Encryption is not the solution, only a tool
  – Isolated data do not provide information
  – Software processes clear data
  – Software is too large and too dynamic to validate in a timely manner
  – 95% of failures are people failures
No obvious solution: new thinking needed

False Assumption
Data in the files of an enterprise are organized according to external access rights
Inefficient and risky for an enterprise which uses information mainly internally and then must serve external needs

The Gap: Assumption that Access right = Retrievable data
• Access rights assume a certain partitioning of data
• Enterprise data are partitioned for internal needs
• Partitions only match in simple cases/artificial examples
[Diagram labels: customer, firewall, authentication, query, result, database access & authorization agent]
Data sources are rarely perfectly matched to all access rights

Technical Access Problems: Military
• More direct connectivity creates risks: `disintermediation’
• Query cannot specify object precisely
  – `Causes for low unit readiness?’ (helpful database gets extra stuff)
• Objects (N) are not organized according to all possible access classifications (a) = (Na)
  – `Problems with ship propulsion, but not propellers’
• Some objects cover multiple classes
  – `Units in Persian Gulf?’
• Some objects are misfiled (happens easily to others), costly/impossible to guarantee avoidance
  – Intel data in operational mission file

Technical Access Problems: Health Care
• Queries do not
specify object precisely
  – Relevant history for low-weight births (helpful database gets extra stuff)
• Objects (N) are not organized according to all possible access classifications (a) = (Na)
  – Nursing hierarchy by bed and ward
  – Infectious disease hierarchy by risk
• Some objects cover multiple classes
  – Patient with stroke and HIV
• Some objects are misfiled (happens easily to others), costly/impossible to guarantee avoidance
  – Psychiatric data in patient with alcoholism

Access Rights/Needs Overlap
[Diagram labels: NCA, C O T S, Logistics, Intel, JC, Warfighters, Allies, PR]

Security Objective in Collaboration?
Prevent Inappropriate Disclosure of Information!
This differs from preventing access to computers and information, as is needed to protect from invaders and hackers
• ACCESS CONTROL is based on Metadata
  – Descriptions and labels, set a priori, are checked
• RELEASE CONTROL also sees contents
  – Works also when metadata cannot / does not adequately describe content information

Dominant approach for Data
• Authenticate Customer in Firewall
• Validate query against database schema
• If both O.K., process query and ship results
[Diagram labels: customer, firewall, authentication, query, result, sources, database access & authorization agent]

Today: Many Coalitions
Foreign: NATO, +, British, French, Kosovo IFOR, ...
• Each has its own, intersecting requirement
• Discretionary access at lower levels
  – Policies for dozens of countries controlling release of Data and Metadata
• Many duplicated systems
  – High rate of information transfer among them
  – Excessive load creates high error rates
  – Difficult to protect from hackers and enemies

Changing Security Protection
Yesterday: Internal Focus. Access is granted to employees only
Today: External Focus. Payors, suppliers, customers and trusted prospects all need some form of access
Yesterday: Centralized assets. Applications and data are centralized in fortified IT bunkers
Today: Distributed assets. Applications and data are distributed across servers, locations, and business units
Yesterday: Prevent losses. The goal of security is to protect against confidentiality breaches
Today: Generate revenue. The goal of security is to enable e-Commerce & collaboration
Yesterday: IT control. DB/Network manager decides who gets access
Today: Local control. Functional units need the authority to grant access

Access right = Retrievable data
• Access rights assume a certain partitioning of data
• Domain data are partitioned according to internal needs
• They only match in simple cases / artificial examples
[Diagram labels: customer, firewall, authentication, query, result, database access & authorization agent]
Data sources are rarely perfectly matched to all access rights

Symmetric Solution
Symmetric checking of both access to data and the subsequent release of data
• Access Control with authentication and authorization of collaborators upon entry
• Content-based release filtering of data when exiting the secure perimeter

Filling the Gap
Check the content of the result before it leaves the firewall
[Diagram labels: query, result, firewall, Security mediator: Human & software agent module]

Security Mediator
• Dedicated hardware plus software module, intermediate between "customers" and databases within firewall
• A modern tool for the security officer
accessed via firewall protection by customers (or collaborators) with assigned roles
• Managed by the security officer, via simple security-specific rules that match filters to roles
• Performs symmetric screening (queries and results)

Result Checking
is understood and performed today in many non-computerized settings:
• Briefcases are inspected when leaving secure facilities
• Computers cannot be taken (in or) out of SCIFs
• Vehicles are inspected also on exiting warehouses with valuable contents
Computer security system requirements have been modeled poorly with respect to such practice

Overall Schematic
[Diagram labels: External Customer, Firewall, Security Officer's Mediator System, Database, Internal Customer, Network]

Hardware
• Computer workstation
  – UNIX and NT implementation
  – external access through firewall
    • firewall can provide authentication
  – internal access to database(s) that contain releasable information
    • multi (two)-level security provision
  – internal storage, inside firewall:
    • rules defining cliques - external roles
    • log of accepted and denied requests
    • mediator software

Software Components
C++ and Java implementations
[Group labels on slide: service, maintenance, support]
• Rule interpreter
• Primitives to support rule execution
• Rule maintenance tools
• Log analysis tool
• Firewall interface
• Domain database interface
• Logger

Rule Processing
Features:
• Paranoia: Every applicable rule must be enforced for a query to be successful or a result to be releasable, else it is processed by the security officer (SO)
• Default: If no rule applies, then the request is processed by the SO
• SO can pass, reject, or edit queries and results
• SO may inform customer, mediator software will not
• All queries and results, successful or not, are logged for audit
• Rules are stored within the mediator, with exclusive security access by the SO

The Rule Language
Goals:
• Simple and
easy to formulate by the SO
• Easy to enter into the system and to observe there
• Employs a collection of primitive functions to provide comprehensive and adequate security
• Functions can exploit views in RDBMS
• Some rule functions provide text validation
• Some functions may need domain knowledge
  – Functions to process manufacturing designs
  – Functions to extract text from images

Rule Organization
• Rules are categorized as:
  – SET-UP (Maintenance)
  – PRE-QUERY
  – POST-PROCESSING
• External, authenticated users are grouped into Cliques to simplify rule management
• Tables and their columns are grouped into segments to simplify access management
• Rules use primitives supplied by specialists

Primitives
Selected by rule for various clique roles
• Allow / disallow values
• Allow / disallow value ranges
• Limit results to approved good-word lists
• Disallow output containing bad words
• Limit output to specified times, places
• Limit number of queries per period
• Can augment queries for result filtering
• Etc.

Content primitives tested in TIHI*
*NSF/NIH funded HPCC projects
• Check against good-word dictionary
  – dictionary created by processing ok records
• Check against a bad-word dictionary
  – less paranoid, less secure, used by Net-nanny etc.
• Check for seeded entries in high value files
  – password files
• Check for patterns in personal data
  – credit cards, email addresses
• Check cell count in statistical results
  – at query time append COUNT request
• Extraction of text from images
  – for further filtering

Creating Wordlists
TIHI is Paranoid
• Result filtering primarily based on Good-word lists
  – Created by processing examples of O.K.
responses
  – Augmented dynamically by terms found objectionable by system, but approved by security officer
• Current work
  – Image filtering, to omit and extract text from images
• Possible future work
  – use noun phrases to increase specificity

Filtering of text
Not perfect:
• Words out-of-context can pass the filter
  – Ophthalmology: names must not pass, yet `Iris Smith’ contains the good-word `iris’
  – Risk reduces rapidly with multiple words
• Can never have all good-words in list
  – Load for security officer -- seek a balance
• Cost: all of contents must be processed
  – Good technology from spell checkers
  – Domain-specific word-lists are modest in size

Rules implement policy
• Tight security policy:
  – simple rules
  – many requests/responses referred to security officer
  – much information output denied by security officer
  – low risk
  – poor public and community physician relations
• Liberal but careful security policy
  – complex rules
  – few requests/responses referred to security officer
  – of remainder, much information output denied by security officer
  – low risk
  – good public and community physician relations
• Sloppy security policy
  – simple rules
  – few requests/responses referred to security officer
  – little information output denied by security officer
  – high risk
  – unpredictable public and community physician relations

Security requires attention
• Security officer’s focus is security :-(
  – not for a computer system designer,
  – nor database or network administrator,
  – nor for management.
• Having and owning the tool enables the role
• Security mediator provides logging for
  – focused audit trail
  – system improvements
  – accountability
• Must be able to deal effectively with exceptions, else it encourages bypassing security without logging.
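
The good-word filtering and audit logging described on the preceding slides can be sketched roughly as follows. This is a minimal illustration in Python, not the TIHI implementation (the slides mention C++ and Java); the function names `build_good_words` and `screen_result`, the tokenizer, and the log record layout are all assumptions made for this example.

```python
import re

def build_good_words(ok_responses):
    """Collect the vocabulary of responses already approved for release."""
    words = set()
    for text in ok_responses:
        words.update(re.findall(r"[a-z]+", text.lower()))
    return words

def screen_result(text, good_words, audit_log):
    """Return (released, unknown_words) and log the decision for audit."""
    tokens = re.findall(r"[a-z]+", text.lower())
    unknown = sorted({t for t in tokens if t not in good_words})
    # Paranoid policy: a single word outside the good-word list blocks
    # release, so the result would go to the security officer instead.
    released = not unknown
    audit_log.append({"text": text, "released": released, "unknown": unknown})
    return released, unknown

# Hypothetical approved examples from an ophthalmology setting.
good = build_good_words(["patient has a scar on the iris",
                         "left eye pressure normal"])
log = []
passed = screen_result("iris pressure normal", good, log)
blocked = screen_result("patient Iris Smith", good, log)
```

In this sketch `Iris` slips through because `iris` is a legitimate ophthalmology term, but `Smith` is not in the list, which is the slide's point that risk drops quickly once several words must all pass. A security officer who approved an unknown term could add it to `good`, making the approval persistent.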
Responsibility Assignment
• Database administrator :-)
  – Primary task: assure availability of data
  – Provides helpful services, such as broadening a search: risk
• Network administrator :-|
  – Primary task: keep network running: transparent
• System administrator :-|
  – Buys glossy product to escape responsibility
• Security officer :-(
  – Not in loop, no tools
  – Investigates violations, takes blame for failures
Needs tools as well

Coverage of Access Paths
[Diagram: Security officer :-(, Security Mediator, Database administrator, Database. Labels: authentication-based good/bad control; prior use; good guy; security needs; good query; DB schema-based ok control; ancillary information; validated to be ok; history; result is likely ok; processable query; performance, function requests]

Rule system
• Optional: without rules every interaction goes to the security officer (in & out)
• Creates efficiency: routine requests will be covered by rules: 80% of instances / 20% of types
• Gives control to Security officer: rules can be incrementally added/deleted/analyzed
• Primitives simplify rule specification: source, transmit date/time, prior request, ...

Benign and ID areas in an X-ray
Integrated IDs are crucial for practice (40% of X-rays are lost)
Paranoid: { Benign is defined positively: a. value range, b. good-word list; else it is potentially bad }

Application of Rules
[Diagram: Data Requestor / External customer, Firewall, Parse Query, Query Checking, Execute Query, Result checking, SO. Labels: authenticated ID; Query; success; failure; error; rule; else; advice; edits; ancillary information; results; cleared results]

:-( Security Officer
• Profile
  – Human responsible for database security/privacy policies
  – Must balance data availability vs.
data security/privacy
• Tasks (current)
  – Advises staff on how to try to follow policy
  – Investigates violations to find & correct staff failures
  – Currently has no computer-aided tools
• Tasks (with mediators)
  – Defines and enters policy rules in security mediator
  – Monitors exceptions, especially violations
  – Monitors operation, to obtain feedback for improvements

Roles
• Security officer :-( manages security policy; not a computer specialist or database administrator.
• Computer specialist provides tools: agent workstation program for security mediation
• Enterprise / institution defines policies; its security officer (SO) uses the program as the tool
• Tool formalizes system practices; rules, managed by the SO, define the practice

Assigning the Responsibility
Database Administrator :-)
  – Can create views limiting access in RDBMSs
  – Prime role is to assure convenient data access
Network Administrator :-|
  – Can restrict incoming and outgoing IP addresses
  – Prime role is to keep network up and connected to the Internet
Specialist Security Officer :-(
  – Prime responsibility is security & privacy protection
  – Implements security policy
  – Interacts with database & network administrators

Hypothetical benefits: Prevents
1. Secure data are inadvertently shipped to insecure backup by trusted user
2. HIV symptoms are shown to cardiac researcher
3. US manager obtains EU-restricted personnel data
4. Misclassified data are released at low level
5. Credit card numbers are released when false customer appears to get an MP3 song
6. Passwords are transmitted to hacker when access control failed

Multiple Internal sources are covered
[Diagram labels: External Requestors, original request, Firewall, certified, Security Mediator, S.O.]
[Diagram labels, continued: certified query, Integrating Mediator, Internal Requestors, result, Logs, unfiltered result, Protected, Shared Databases]

Implementations
• UNIX prototype
• UNIX - Java at Incyte Corporation [SST]
  – protect medical & genomic information
• NT - Java development system
• Primitives for Drawings, as Aircraft Specs
• Trusted Image Dissemination
  – wavelet-based decomposition to locate texts
  – extract for OCR
  – blank text frequency if not found in good rules

Effective Settings
• External access is a modest fraction of total use
  – collaboration, government oversight, safety monitoring
• Restructuring internal partitioning would induce significant inefficiencies
  – for example: Hospital: MD/patients vs. research/insurance
• Errors are seriously embarrassing
  – in practice 2-5% of data are misfiled, doing better is costly
• Locus of control is needed
  – Security officer cannot trust/control DB / network admins

Intrusion detection – two-level
[Diagram labels: Model of normal behavior; Observations, initial, continuing; Compare; Events; Monitor; Assess; Stop; Stream of information]

TIHI Summary
• Avoids the -- often false -- assumption that access rights match data organization
• Collaboration is an underemphasized issue beyond encrypted transmission, firewalls, passwords, authentication
• There is a need for flexible, selective access to data without the risk of exposing related information in an enterprise
• In TIHI, service is provided by the Security Mediator: a rule-based gateway processor of queries and results under control of a security officer who implements enterprise policies
• Our solution has been applied to Healthcare; it is also relevant to collaborating (virtual) enterprises and to many military situations.
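
The Security Mediator summarized above, a rule-based gateway that screens both queries and results, might be sketched as below. This is an illustrative Python outline under stated assumptions, not the TIHI code: the `Mediator` class, its rule dictionaries keyed by clique, and the status strings are hypothetical names chosen for the example. It follows the "Paranoia" and "Default" features from the Rule Processing slide: every applicable rule must hold, and a request with no applicable rule is referred to the security officer.

```python
from dataclasses import dataclass, field

REFER_TO_SO = "refer to security officer"
RELEASE = "release"

@dataclass
class Mediator:
    pre_query_rules: dict    # clique -> list of predicates over the query text
    post_query_rules: dict   # clique -> list of predicates over the result rows
    log: list = field(default_factory=list)

    def _passes(self, rules, clique, item):
        applicable = rules.get(clique, [])
        # Default: no applicable rule means the security officer decides.
        # Paranoia: every applicable rule must be satisfied.
        return bool(applicable) and all(rule(item) for rule in applicable)

    def handle(self, clique, query, run_query):
        """Screen the query, run it, screen the result; log every outcome."""
        if not self._passes(self.pre_query_rules, clique, query):
            self.log.append((clique, query, "query referred"))
            return REFER_TO_SO, None
        rows = run_query(query)
        if not self._passes(self.post_query_rules, clique, rows):
            self.log.append((clique, query, "result referred"))
            return REFER_TO_SO, None
        self.log.append((clique, query, "released"))
        return RELEASE, rows

# Hypothetical rules for a "researchers" clique.
mediator = Mediator(
    pre_query_rules={"researchers": [lambda q: "name" not in q.lower()]},
    # In the spirit of Min_Rows_Retrieved: refuse very small result sets.
    post_query_rules={"researchers": [lambda rows: len(rows) >= 3]},
)

def fake_db(query):
    # Stand-in for the protected database.
    return [("stroke",), ("stroke",), ("asthma",)]

status, rows = mediator.handle("researchers", "SELECT diagnosis FROM visits", fake_db)
unknown_status, _ = mediator.handle("visitors", "SELECT diagnosis FROM visits", fake_db)
```

The mediator never tells the customer why a request was referred; it only logs the outcome, leaving any explanation to the security officer, as the slides require.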
Security Mediator Benefits
• Dedicated to security task (may be multi-level secure)
• Uses only its rules and relevant function, all directly, avoids interaction with DB views and procedures
• Maintained by responsible authority: the security officer
• Policy setting independent of database(s) and DBA(s)
• Logs just those transactions that penetrate the firewall, records attempted violations independent of DB logs*
• Systems behind firewall need not be multi-level secure
• Databases behind firewall need not be perfect
* also used for replication, recovery, warehousing

Backup

[Screenshot: Security officer screen]

[Screenshot: Patient's own data screen]

[Screenshot: part of Patient result]

[Screenshot: Disallowed result]

Security officer reaction
Choices:
1. Reject result
2. Edit result
3. Pass result (& update the list of good-words, making approval persistent)

Security Table Definition... (continued)
Security Function            Object Name    Object Value
Validate_text                table.column   invalid_words
Min_Rows_Retrieved           ALL/clique     integer
Num_Queries_Segment          ALL/segment    integer
Query_Intersection_Clique    ALL/clique     integer
Query_Intersection_Segment   ALL/segment    integer
Secure_Keyword_Clique        ALL/clique     keyword
Secure_Keyword_Segment       ALL/segment    keyword
Session_Time                 ALL/clique     TIME
User_Hours_Start             ALL/clique     start_time
User_Hours_End               ALL/clique     end_time
Segment_Hours_Start          ALL/segment    start_time
Segment_Hours_End            ALL/segment    end_time
Limit_Function_Clique        ALL/clique     function_name

Rule application - Overview
• Does customer belong to a clique? If yes, switch to it
• Does the customer clique satisfy all pre-query rules? (e.g., Session_Start, Stat_Only, Queries_Per_session)
• Do the columns and tables belong to a segment?
• Does the query satisfy all pre-query rules?
(e.g., valid segments)
• Does query need re-phrasing or augmentation? (e.g., Stat_Only to detailed Select)
• Send Query to appropriate Database (or mediator)
• Does query result satisfy all post-query rules? (e.g., Min_Rows_Retrieved, Secure_Keyword_Clique)
• Apply any result transformation rules (e.g., random falsification of data, aggregation)
• Update log and internal statistics

Implementation
Set-up
• Security Officer enters rules into a file
• Rule file is parsed to generate an SQL script that inserts rows into the security_rules table
• SQL script is executed against the database

Implementation... (continued)
Customer Session Loop
• Security Mediator Workstation accepts the customer query, logs it, and passes control to the Security Mediator Software (SMS)
• SMS reads the security_rules table and calls many different modules (sub-routines) to validate the query (pre-query checks)
• If okay, SMS executes the query (Embedded SQL calls)
• Mediator Workstation gets results from the database and calls other SMS modules to perform the post-query checks
• If all checks are passed, the Mediator Workstation logs and returns results; awaits another invocation
• Result is accepted by customer and used or displayed

System Operations
• Customer connects remotely, via firewall for authentication, to security officer's machine
• Clique membership is assessed
• System prompts customer for query
• Query is parsed and validated against rules
• Validated query is sent to database system
• Results are retrieved and validated against rules
• Validated results are made available to customer

Benign and ID areas in an X-ray
Integrated IDs are crucial for practice (40% of X-rays are lost)
Paranoid: { Benign is defined positively: a. value range, b.
good-word list; else it is potentially bad }

Processing Flow
[Figure]

Source X-ray image
[Figure: whitened to protect privacy for this presentation]

Wavelet decomposition
[Figure]

Candidate Text areas
[Figure]

Extracted textual fields
[Figure: blackened to protect privacy for this presentation]

OCR conversion & analysis
[Figure annotations: Name: Not in good-list, Not approved; Error in OCR: Not in good-list, Not approved]

Reconstituted image
Identification area blurred by removing high frequency components

Removal of Ident’s from an MRI Image
[Figure]

Chest X-ray
[Figure]
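
The "Reconstituted image" step, blurring an identification area by removing high-frequency components, can be illustrated with a small sketch. The slides name wavelet-based decomposition but do not say which wavelet or how many levels were used; the example below assumes a single-level 2-D Haar transform, and `blur_region` is a hypothetical helper name. For that transform, zeroing the three detail sub-bands and inverting is the same as replacing each 2x2 block by its mean, which is what the code does directly on a list-of-lists grayscale image.

```python
def blur_region(image, top, left, height, width):
    """Return a copy of image with the region's finest detail removed.

    Equivalent to zeroing single-level Haar detail coefficients inside
    the region: every 2x2 block is replaced by its mean value.
    """
    if height % 2 or width % 2:
        raise ValueError("region height and width must be even")
    out = [row[:] for row in image]          # leave the input untouched
    for r in range(top, top + height, 2):
        for c in range(left, left + width, 2):
            block = [(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)]
            mean = sum(image[rr][cc] for rr, cc in block) / 4
            for rr, cc in block:
                out[rr][cc] = mean
    return out

# Tiny grayscale example: the left 2x2 block stands in for an ID area
# with sharp text-like contrast; the right block is benign.
xray = [
    [0, 255, 10, 10],
    [255, 0, 10, 10],
]
blurred = blur_region(xray, top=0, left=0, height=2, width=2)
```

A single Haar level only removes the finest detail; making real text illegible would need the same step repeated over several levels or a larger block size, so this is a sketch of the idea rather than a usable anonymizer.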