Collecting and Protecting Sensitive Data in Research

CU Morningside IRB
Joyce Plaza MS, MBE, CIP
419 West 119 Street
New York, NY 10027
212-851-7040
Fax: 212-851-7044
http://www.columbia.edu/cu/irb/
April 8, 2015

Objectives
• Review the privacy and confidentiality protection criteria the IRB must consider
• Provide definitions relevant for research data
• Provide examples of the types of protections
• Describe the specific protections used in the Guilamo-Ramos: High Use Alcohol Venues Study

45 CFR 46, 21 CFR 56
Require the IRB to ensure that certain criteria are satisfied prior to approval of human subjects research.

§46.111 / §56.111 Criteria
The IRB shall determine that all of the following requirements are satisfied:
(1) Risks to subjects are minimized.
(2) Risks to subjects are reasonable in relation to anticipated benefits.
(3) Selection of subjects is equitable.
(4) Informed consent will be sought from each prospective subject or the subject's legally authorized representative.
(5) Informed consent will be appropriately documented.
(6) When appropriate, the research plan makes adequate provision for monitoring the data collected to ensure the safety of subjects.
(7) When appropriate, there are adequate provisions to protect the privacy of subjects and to maintain the confidentiality of data.

Submission Issues
The manner in which data are collected, recorded, and maintained is reviewed by the IRB and influences its determinations.

Definition
Identifiable data
• Any information about a living individual that is linked to, associated with, or contains the name or any details of the individual that would allow someone to directly or indirectly identify a subject from the information collected.
Definition
Sensitive Data with Potential High Risk
• Information about a living individual that would potentially cause serious risk or harm to a subject if there were a breach of confidentiality (e.g., Social Security numbers, HIV status, substance abuse, criminal activity, negligence in the workplace).

IRB Terminology Related to Data
• De-identified: identifiers have been removed from the dataset in such a manner that no member of the research team is able to identify the individual from whom the information was collected.
• Coded: identifiers have been removed from the dataset but can readily be found through the use of a master list that is accessible to the investigator.

IRB Terminology Related to Data: Anonymous vs. Confidential
• Anonymous: any information about a living individual that was collected in such a manner that identifiers were never associated with the information and no one was ever able to identify from whom the information was collected.
• Confidential (is not anonymous): protection of study participants’ data such that an individual participant’s data will not be disclosed except to another authorized person.

Definition of Privacy
“The quality of being secluded from the presence or view of others.”
“Refers to a person’s desire to control the access of others to themselves.”
(from http://www.research.uky.edu/ori/ORIForms/32-Privacy-vsConfidentiality.pdf)
• Is there a risk to subjects’ privacy when collecting the data?

Is there risk to privacy when recruiting?
Guilamo-Ramos: High Use Alcohol Venues Study: recruited in alcohol-use venues; adults unable to participate in screening or take a detailed flyer due to privacy concerns were provided with a card containing only contact information for the study.

Privacy considerations when collecting sensitive data
• Are interviews conducted in a private location?
• Are subjects reminded that they do not have to answer any questions they do not want to answer?
• Focus Groups: Are focus group participants reminded that they should also keep the discussion confidential?

Guilamo-Ramos: High Use Alcohol Venues Study
Conducted interviews in the home or at a neutral site chosen by the subject.

Definition of Confidentiality
• “Discretion in keeping information secret.” Refers to the researcher’s handling of the subject’s identifiable private information.
• Is there a risk of a breach of confidentiality at any time during study procedures?
All data that can potentially cause harm to subjects upon a breach should have direct identifiers of the subjects replaced with a code.

Coded Data
• The link that cross-references the subject’s identity with the code should be stored in a locked location separate from the data.
• The Principal Investigator should consider how many and which staff should have access to the link. Limiting the number of staff who have access to the link should be considered for more sensitive high-risk data.

Data Protection Plans
Any data collected for research purposes that could pose risk or harm to subjects upon a breach of confidentiality should be protected against a potential breach. The methods or processes for protecting the confidentiality of the data should be proportionate to the level of potential risk of the study.

Ensure that all study data is protected
• Any other data that is collected during the course of a research study, such as that involving the regulatory or financial management of the study, must also be stored in a secure manner.

Guilamo-Ramos: High Use Alcohol Venues Study
• On consent forms, tapes, transcripts and surveys, subjects were identified by a random code number only.

Anonymous data
Guilamo-Ramos: High Use Alcohol Venues Study: collected “refusal bias information” that did not contain identifiers.
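The coding approach described in the Coded Data slides above (direct identifiers replaced with a random code, and the link stored separately from the data) can be illustrated with a minimal sketch. This is not the procedure used by any particular study; the field names, the code format, and the sample records are hypothetical and exist only for illustration.

```python
import secrets

# Hypothetical field names; a real study would list its own direct identifiers.
DIRECT_IDENTIFIERS = ("name", "phone")


def new_code(existing):
    """Return a random code that is not already in use."""
    while True:
        # Random, so the code itself reveals nothing about the subject.
        code = "S-" + secrets.token_hex(4)
        if code not in existing:
            return code


def code_records(records, identifier_fields=DIRECT_IDENTIFIERS):
    """Split records into coded data and a separate code-to-identity link."""
    coded_data = []
    link = {}
    for record in records:
        code = new_code(link)
        # The link holds only the identifiers, keyed by the code.
        link[code] = {f: record[f] for f in identifier_fields if f in record}
        # The coded record holds everything else, plus the code.
        coded = {k: v for k, v in record.items() if k not in identifier_fields}
        coded["subject_code"] = code
        coded_data.append(coded)
    return coded_data, link


# Made-up sample records, for illustration only.
records = [
    {"name": "Example Person A", "phone": "555-0100", "drinks_per_week": 4},
    {"name": "Example Person B", "phone": "555-0101", "drinks_per_week": 9},
]
coded_data, link = code_records(records)
```

In a setup like this, only `coded_data` would be analyzed, shared, or transferred, while `link` would be written to a separate, locked location with access limited to the staff designated by the Principal Investigator.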
Guilamo-Ramos: High Use Alcohol Venues Study
Links to the subject codes were kept in locked files on a password-protected computer.

Guilamo-Ramos: High Use Alcohol Venues Study: Personnel Training
All project staff were required to complete certain levels of training (40 hours) before they were granted access to the codes. This included training established by the Dominican and international organizations on the protection of human subjects.

Guilamo-Ramos: High Use Alcohol Venues Study
Personnel signed confidentiality statements requiring them to report any breach of confidentiality to the PI. Training included data safety, confidentiality of participants, limits of confidentiality and proper administration of the protocol.

Storage of Research Data: Paper files
• Consider separating data files from consent forms.
• It is recommended that paper records containing research data be stored in a locked cabinet with access limited to research personnel.
• The level of security and restriction should increase with the sensitivity of the data captured in the research records.

Computerization of Data
• Electronic records containing research data should be maintained on password-protected devices with access limited to research personnel. The level of security and restriction (e.g., encryption, hashing) should increase with the sensitivity of the data captured in the electronic research records.

Patient Data: CUMC Policy
• CUMC Information Security Policies require that all portable data files stored on USB, CD/DVD, and mobile laptops that include PHI be *encrypted* and *password-protected* at all times.

Breach of Confidentiality
The three biggest sources of a breach of data stored electronically:
• Laptops
• USB drives
• Web sites

Transferring data
• Electronic transfer: encryption needed
• All electronic transmission of patient information over the Internet must be *encrypted*.
This includes email, file transfers and other data transfer modalities.
• Paper transfer: will data be transferred by postal mail or FedEx, or hand-carried by a member of the study team? Data transfer needs to be protected from a breach (e.g., data transferred separately from consent forms and codes).

Guilamo-Ramos: High Use Alcohol Venues Study
• Data transferred electronically from the Dominican Republic to the US were stripped of identifiers and contained only code numbers.

Guilamo-Ramos: High Use Alcohol Venues Study
• The study team identified the most serious risk as the potential loss of confidentiality.
• Participants were notified of the confidentiality procedures in the informed consent.
• The procedure for notifying the IRB of any adverse events was included in the Study Description.

Guilamo-Ramos: High Use Alcohol Venues Study
Collection of Protected Health Information also required HIPAA Form A.

HIPAA
The Health Information Technology for Economic and Clinical Health (HITECH) Act, part of the American Recovery and Reinvestment Act (ARRA) of 2009, established new notification requirements for reporting the loss or theft of patient information (Protected Health Information, PHI) that is not protected by encryption. These requirements apply in both the clinical and research contexts.

Archiving and Long-Term Storage of Research Data
Data protection plans must consider all record-keeping processes and storage of data, from initial collection to post-study storage, destruction, or complete de-identification of the data. Such plans should include details for all modes of storage: paper, electronic, video/audio recordings, films, etc.

Audio/Video Recording of Data: Recordings and Transcriptions
Guilamo-Ramos: High Use Alcohol Venues Study: after the audio-recorded interviews were transcribed, the recordings were destroyed. Participants were not identified by name on the transcripts.

Physical Security of Data
• Computer located in a secure location (e.g.
a locked office)
• Who has access to this office?
• Paper files: are they in a locked file cabinet?
• Are identifying codes and data kept separately?
• Do transcripts contain identifiers?
• Will identifiers be destroyed at any time?

Secondary Data
Requires IRB review if it contains private identifiable information (either direct identifiers or indirect identifiers).
If the data is sensitive, confidentiality procedures are required.

Social Security Numbers
New York has enacted legislation to protect the confidentiality of Social Security numbers (SSNs). The "NY Social Security Number Protection Law," which became effective on January 1, 2008, imposes harsh penalties on organizations that fail to protect the confidentiality of Social Security numbers that they have collected and stored.

Generally, SSNs should not be collected unless permitted by Columbia policy
Any plan to collect SSNs for research purposes must be submitted to and approved by the IRB prior to such collection. The submission must include a justification for the collection of SSNs and provide the following:
• an explanation of how and where the SSNs will be stored;
• who will have access to the data;
• the plan to protect the confidentiality and security of the data.

Certificates of Confidentiality
• To protect the confidentiality of sensitive, higher-risk data, obtain a Certificate of Confidentiality (CoC). CoCs are issued by the National Institutes of Health (NIH), as well as other HHS agencies, to protect identifiable research information from forced disclosure.
• A CoC allows the investigator to refuse to disclose identifying information on research participants in any civil, criminal, administrative, legislative, or other proceeding, whether at the federal, state, or local level.

Study Description: Document Privacy Protections
Describe how subject privacy will be protected, and the limits to protection.
Protections should cover, for example, screening activities, HIPAA provisions, forums such as focus groups where private information may be shared, and recordings of research activities, as applicable. Limitations such as compelled disclosure and mandatory reporting should also be described.

Study Description: Document Data and Safety Monitoring
Describe how data and safety will be monitored locally to identify unanticipated problems (e.g., events, outcomes, or occurrences that are unexpected, at least possibly related to the research, and suggest an increase in risk of harm to subjects or others).

Study Description: Document Potential Risks
Describe potential risks, including data on risks that have been encountered in past studies.

Study Description: Document Confidentiality of Study Data
Describe how this will be maintained (if it is to be maintained) locally, and during transmission to another site, if applicable. Include a clear description of how data will be stored, specifically indicating whether data will contain direct or indirect identifiers. Describe protections related to accessing the study data, whether in electronic or paper form.

Publication of Research Results
• Any publication of research results must be done in a manner in which subjects cannot be identified, unless express written permission has been provided by the subject(s).

Summary: Collecting Sensitive Data
• Identify all risks to privacy/confidentiality
• Devise a comprehensive plan of protections
• Document the details for the IRB
• Train study personnel
• Monitor the data until the identifiable data is discarded or the data is completely de-identified.

Questions? Contact the IRB Offices
CUMC IRB
For contact information see: http://www.cumc.columbia.edu/dept/irb or call 212-305-5883
Morningside IRB
For contact information see: http://www.columbia.edu/cu/irb/ or call 212-851-7040