Research data management hands-on activities
Erik Mitchell, Metadata and web services

Contents
Research data management from conception to publication
   Background
   Procedures
   Case study
Store and backup data
   Activity 1 -- Ensure that I always have local copies
   Activity 2 -- Give myself a 99.999999% assured backup
Describe, document and organize data
   Activity 3 -- Automatically share project documents, code and data across devices and collaborators
   Activity 4 -- Track versions of code and data
Share data with colleagues
   Activity 5 -- Make my code and selected data available for others to use
   Activity 6 -- Create a data citation
   Activity 7 -- Get a PURL to your data
Plan for data management
   Activity 9 -- Data Management Planning: Use the DMPTool to write a data management plan

Research data management from conception to publication

Background
Ensuring that researchers do not lose data means employing data storage methods that 1) use multiple storage media, 2) employ bit-checking software to ensure data is not corrupted and 3) adhere to desired security and privacy objectives. This exercise takes us through the configuration and use of multiple examples of storage environments so you can select the best way to approach your own data storage needs.
1. Be able to identify types of storage environments (e.g. local disk, cloud-sync, disk snapshots/backups)
2. Understand the technical and policy issues surrounding the choice of storage environments
3. Be prepared to create your own data storage environment.
Procedures
For the purpose of this exercise we will work with three types of storage environments: local disk, cloud-synchronization services, and disk/file versioning and backup systems. The following table outlines the features and limitations of each type of service.

Type of service: Local disk
Examples: Hard drive, flash drive, external USB disk
Benefits: High-speed I/O, local availability, easy to secure physically
Limitations: Most susceptible to "bit rot"; risk of data loss if the physical device is stolen; few services to manage versions

Type of service: Cloud-sync service and network drives
Examples: Google Drive, Box, Dropbox, Amazon S3, network shares
Benefits: High replication, built-in bit-checking tools (sometimes), cross-device availability, built-in sharing and security features
Limitations: Security and IRB issues are a higher risk with network availability; not a true backup unless implemented in a certain way; may require network connectivity for access

Type of service: Disk or file version/backup systems
Examples: Time Machine (OS X), File History (Windows 8), GitHub, Exversion
Benefits: Provides a true snapshot that can be recovered/restored; can be automated; may include descriptive metadata; can preserve files as a group, enabling batched recovery or access
Limitations: Some solutions rely on local disk while others use a cloud service; most technically complex to set up and administer; considerable overhead associated with proper administration

Case study
I have a set of data derived from a database that is not considered to be "protected" in the IRB sense of the word but came with licensing restrictions from the database vendor who runs the resource. The agreement I signed allows me to publish derivative analyses of the data but not the core dataset itself. In order to do the analysis I am designing my own algorithm using a set of programming scripts. My data management approach for this project is to 1) ensure I always have a copy of my source data, 2) track versions of my Python script code alongside versions of the code output and 3) be able to publish the code and excerpts of the dataset that the code can process so that others can reproduce and build on my work. In addition, I want to be able to easily access these files from multiple machines and share the code and source files with my collaborators.

I happen to know a lot about my storage options and want to experiment with a number of solutions, using different options for each step in the process. From an article on hard drive failure rates (http://bit.ly/hddfailure), for example, I learned that the Annualized Failure Rate (AFR) for hard drives is around 1.4% on average and that as drives age the chance that they fail climbs considerably, with AFRs growing to 8.6% in year 3 (http://bit.ly/hddfailure_afr); a short calculation illustrating this appears after the table below. Considering my compliance requirements, my cost sensitivity and my interest in being 'in the cloud' as much as possible, I decide to set up the following data management infrastructure:

Step: Store and back up data -- Ensure that I always have local copies
Service selected: Time Machine on my Mac (Activity 1)
Why? Easy to set up, fast, and cheap storage for roughly 92% reliable protection (see the articles cited above).

Step: Store and back up data -- Give myself a 99.999999% assured backup
Service selected: Amazon S3 (Activity 2)
Why? S3 is not as intuitive as Google Drive or Dropbox, but it does automatic version control and bit-fixity checking.

Step: Describe, document and organize data -- Automatically replicate the current version of project documents, code and data across devices and collaborators
Service selected: Google Drive (Activity 3)
Why? Google Drive is easy to use and cross-device compatible. It includes version control and allows me to set detailed permissions.

Step: Describe, document and organize data -- Track versions of code and data
Service selected: GitHub (Activity 4)
Why? GitHub performs true version control over code and forces me to create good metadata. I can even leverage the GitHub wiki and other collaboration features.

Step: Share data with colleagues -- Make my code and selected data available for others to use
Service selected: GitHub (Activity 5)
Why? All I have to do is make the right version of the data public!

Step: Share data with colleagues -- Create a data citation
Service selected: Dataverse (Activity 6)
Why? Keeps my data someplace even after I delete or abandon my GitHub account.

Step: Share data with colleagues -- Get a PURL to your data
Service selected: EZID (Activity 7)
Why? I have my data in GitHub or some other repository but I need a permanent link to it.

Step: Plan for data management -- Create a data management plan
Service selected: DMP Tool (Activity 8)
Why? I need a tool to help me create a data management plan for a grant submission.
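To make the failure-rate reasoning above concrete, here is a minimal Python sketch of how cumulative drive survival can be estimated from annualized failure rates. The per-year AFR values are placeholders loosely based on the figures cited above (1.4% on average, 8.6% in year 3); the exact numbers and the assumption that years fail independently are illustrative only.

# Rough estimate of the chance a drive survives its first three years,
# assuming independent annualized failure rates (AFRs) per year.
# The AFR values below are illustrative placeholders, not measured data.
afr_by_year = [0.014, 0.014, 0.086]  # year 1, year 2, year 3

survival = 1.0
for afr in afr_by_year:
    survival *= (1 - afr)

print("Estimated 3-year survival: {:.1%}".format(survival))
print("Estimated 3-year failure:  {:.1%}".format(1 - survival))

Even a rough calculation like this makes the case for keeping at least one copy of the data somewhere more durable than a single local disk.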
Store and backup data

Activity 1 -- Ensure that I always have local copies
While setting up Time Machine or Windows File History is not too difficult, it is hard to do in this workshop given our lack of external drives! Try out the OS X tutorial (http://support.apple.com/kb/ht1427), the Lifehacker article on Windows backup options (http://bit.ly/windowsfilehistory) or this Ubuntu guide for backups in Linux (https://help.ubuntu.com/community/BackupYourSystem).

Activity 2 -- Give myself a 99.999999% assured backup
While I am confident in my everyday Time Machine backup, I also want to make sure that I have certain files stored in a secure and positively backed up space. In order to do this I am going to manually upload my dataset into Amazon S3. Amazon S3 (Simple Storage Service) stores files in a system with some important features, including 1) automatic file replication across storage media and geographic regions, 2) bit-fixity checking to make sure the file you upload is the file you download and 3) file version history (called versioning). We will set each of these services up.

1. Log in to Amazon using the credentials for this class (note: all files and credentials will be deleted at the end of the workshop)
   a. Login URL: https://researchbench.signin.aws.amazon.com/console
   b. User: ask instructor
   c. Password: ask instructor
2. Once you log in you should see the main console.
3. Click on "S3." The S3 console allows you to create file folders, called buckets in S3, and to upload files one by one. There are also programmatic tools in the Amazon AWS API toolkit to automate this (a scripted example appears later in this activity), as well as S3-integrated secure file transfer programs (SFTP) like Cyberduck (see below).
4. Click on "Create bucket" and name the bucket with the convention yyyy_dlab_yourinitials (e.g. 2013_dlab_workshop_etm).
5. Click on "Set Up Logging" and fill out the logging setup using the bucket you created as your target bucket.
6. Click "Create."
7. Look through the properties screen once the bucket has been created and see if you can answer the following questions.
8. Questions:
   a. Can I turn my bucket into a website?
   b. What is Amazon Glacier and how does it relate to S3?
   c. What permission options do I have?
   d. What is logging?
9. Let's turn on versioning so that we can track changes to our files over time. If you are not already looking at the properties for your bucket, click on the "Properties" button and click on "Versioning." Click on "Enable Versioning" and click OK.
10. Before we upload a file, let's create one. Open your favorite text editor and create a text file with the content "This is my data file. It has version 1.0." Save the file with the file name data_file.txt.
11. Let's upload the file to our bucket. Click on your bucket name on the left-hand side of the screen. Once in your bucket, click on "Actions >> Upload."
12. Click on "Add Files" and add the data file you just created.
13. Click on "Start Upload."
14. Once your file is uploaded, take a look at it. Right-click on the file name and choose "Download" or "Open."
15. Let's look at the file properties as well. Right-click on your file again and choose "Properties." This opens the properties window on the right-hand side of the screen.
16. Look through the properties and note the details in the table.

AWS feature: Storage Class
Function: Switches between standard and reduced redundancy storage. Reduced redundancy storage is only 99.99% durable (e.g. 1 object in 10,000 may become corrupted) while standard redundancy is 99.999999999% durable.

AWS feature: Server Side Encryption
Function: Encrypts the file before it is stored on disk. This ensures that your files are encrypted on Amazon's servers, though it is not as secure as client-side encryption.

AWS feature: Permissions
Function: Allows you to share or make public specific files in a bucket.

AWS feature: Metadata
Function: Shows the metadata tags assigned to your object. Many of these tags are considered "actionable" by S3, meaning that Amazon takes action based on their presence and value.

17. You can add metadata tags to your buckets and files that make them easier to work with. For example, you may want to tag the files with your research project name.
18. To add a research project name tag to your file, click on the Metadata stanza in properties and type in your tag name (research_project) and tag value (D-Lab workshop). It is worth noting that this was broken in Safari when I tried it!
19. You can also add multiple versions of a file. Edit your file to change the version number in the text to "2.0."
20. Upload the file again using the same process as above. Amazon S3 should detect that you are adding the same file by matching the file name.
21. With your new file uploaded you can click on the "Transfers" button and see a history of file actions.
22. You can also see all of the file versions by clicking on the "Show" button next to Versions.
23. Using the "Download" function in the right-click menu for the file, make sure you can retrieve both version 1.0 and version 2.0 of your file.
24. Questions:
   a. Amazon S3 supports server-side encryption; where can you set this property?
   b. How could you compare the persistence of a file on Amazon S3 versus one stored on your local hard drive?
   c. Is storing data on Amazon S3 a breach of IRB protocol or University policy?
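Before moving on to Cyberduck, it is worth noting that the console steps above can also be scripted. The sketch below uses boto3, the AWS SDK for Python, which is not part of this workshop; the bucket name is a placeholder, and the snippet assumes your AWS credentials are already configured (for example as environment variables) and that the bucket already exists. It mirrors the steps we just performed by hand: enable versioning, upload the tagged data file, list the stored versions, and download a specific one.

# Sketch only: a scripted version of the console steps above, using boto3
# (the AWS SDK for Python). Assumes AWS credentials are already configured
# and that the bucket below already exists.
import boto3

bucket = "2013_dlab_workshop_etm"   # placeholder: use your own bucket name
key = "data_file.txt"

s3 = boto3.client("s3")

# Step 9 equivalent: turn on versioning for the bucket.
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# Steps 11-13 and 18 equivalent: upload the file with a metadata tag.
s3.upload_file(
    "data_file.txt", bucket, key,
    ExtraArgs={"Metadata": {"research_project": "D-Lab workshop"}},
)

# Step 22 equivalent: list every stored version of the file.
versions = s3.list_object_versions(Bucket=bucket, Prefix=key)
for v in versions.get("Versions", []):
    print(v["VersionId"], v["LastModified"], "latest" if v["IsLatest"] else "")

# Step 23 equivalent: download one specific version by its id
# (versions are returned newest first, so the last entry is the oldest).
oldest = versions["Versions"][-1]
s3.download_file(
    bucket, key, "data_file_v1.txt",
    ExtraArgs={"VersionId": oldest["VersionId"]},
)

Scripting like this is how you would automate regular deposits of source data rather than uploading files one by one in the console.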
25. While the Amazon S3 web console is OK, you can also manage the service using any number of tools. We will familiarize ourselves with Cyberduck.
26. Download Cyberduck from http://cyberduck.ch/
27. Launch Cyberduck and click on "Open Connection."
28. Select Amazon S3 from the option list and use the following credentials (note: these credentials will only work on 11/19/2013)
   a. Access Key ID: ask instructor or create your own
   b. Secret Access Key: ask instructor or create your own
29. Browse the file structure in Cyberduck and take notice of the available features. In general you should find that Cyberduck allows you to upload files pretty easily but also makes it a bit more difficult to look at file properties.
30. Check out the bucket properties by right-clicking on the bucket and choosing "Info."
31. Let's update our data file to version 3.0 and upload it. Edit your text file and click on "Action >> Upload." Did it work? In my experience Cyberduck throws some errors but does upload the file.
32. Ready for some advanced work? Try the "Action >> Synchronize" activity. What happens?
33. As you can see, turning on logging is both good and bad: logs are great, but the AWS S3 logs are so detailed that they can be overwhelming.

Describe, document and organize data

Activity 3 -- Automatically share project documents, code and data across devices and collaborators
While Amazon S3 is great for archival file storage, we have seen that it can be somewhat cumbersome for collaboration and sharing. AWS requires each person to have an AWS account and to manually sync files (note: there are programs that use S3 as a back end and provide the sort of continuous backup services we are describing in this section). Ease of use and transparency are two key issues for anyone implementing a backup, and for that reason it is important to identify products that allow us to track versions and sync data every time a file is saved. This process, also known as continuous data protection (CDP) or real-time backup, is an excellent intermediate step that recognizes that most changes made throughout the day do not need persistent archival versions in systems like S3. For this exercise we will use Google Drive, but equally good services include Box.net, Dropbox, Microsoft SkyDrive, and Amazon Cloud Drive (which uses S3!). More options can be found at http://creativeoverflow.net/the10-best-alternatives-to-dropbox/.

Note: You may already have Google Drive or another service like it set up on your machine. If you do not, set it up today! If you do, read through the exercise and look for opportunities to explore. Re-installing Google Drive on your machine will just cause problems, so don't do it!

1. Let's start by installing Google Drive on your computer if you have not already. You can get Google Drive at https://tools.google.com/dlpage/drive
2. Install the software and log in. Allow the installer to place the folder in your home directory.
3. If you have time/interest, you can also install the Google Drive app on your smartphone or tablet (iOS and Android). Log in and browse around.

Google Drive and other services automatically synchronize and track versions of your data by attaching operational hooks to your operating system that copy files to the cloud service every time you save a document. There is nothing magic here, and in fact the tools to do this have been around for a very long time (rsync). These file-syncing tools are useful, however, because they include data sharing and collaborative tools, because they are multi-platform and because they work continuously. A minimal sketch of this idea follows.
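To illustrate the idea (not to replace Google Drive), here is a minimal Python sketch of what such a sync tool does at its core: it walks a source folder, detects files whose contents have changed since the last pass, and copies them to a destination. The folder paths are placeholders; real services add sharing, conflict handling, and version history on top of this.

# Minimal illustration of what a file-sync tool does under the hood:
# fingerprint each file and copy it when the content changes.
# Folder paths are placeholders.
import hashlib
import shutil
from pathlib import Path

SOURCE = Path("my_project")        # the folder you work in
DESTINATION = Path("synced_copy")  # stand-in for the cloud copy

def file_hash(path):
    """Return a fingerprint of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def sync_once(source, destination):
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dest = destination / src.relative_to(source)
        # Copy only if the file is new or its contents changed.
        if not dest.exists() or file_hash(src) != file_hash(dest):
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            print("synced", src)

sync_once(SOURCE, DESTINATION)

A real service runs the equivalent of sync_once continuously, in both directions, and for every collaborator who shares the folder.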
4. To move a file into Google Drive, simply copy or move the file or folder into the synchronized folder. Nothing should appear different, except that you should be able to see your file and other preferences in your system toolbar.
5. With your file synchronized, go into Google Drive and find the file.
6. Right-click on the file and select "Manage Revisions." You should notice just a single version of the file! This underscores an important point about file backup and management services: file versions are unique to the provider and do not migrate across syncing platforms. This means that while we have three versions of our data file in S3, we only have the most recent in Google Drive! Had we started in Google Drive, the opposite would be true. This is worth thinking about as you design a data management implementation.

Before we leave Google Drive, let's look at sharing. One of the strong suits of Google Drive is that you can share with anyone (as long as they have a Gmail account). You can create your sharing permissions at the folder or file level, and permissions are inherited (e.g. a file within a folder shares the folder permissions unless you override them).

7. Right-click on your folder name in Google Drive and choose "Sharing."
8. Once in the sharing window, click on the "Change" link for the overall file settings. You will notice that the file is set to private, but we can also set some pretty wide-open access options.
9. When you set your sharing options to allow others to see it (either everyone or specific people), all they have to do is decide to add these files to their Google Drive folder and the files will update automatically for each user. This means no more emailing files, version control problems and file copying issues!

Activity 4 -- Track versions of code and data
So far we have seen how to check our data file into a long-term repository to ensure that we can always get back to the source data (S3) and how we can use Google Drive or other similar services to set up an 'always-on' backup and collaboration solution. As you may recall, however, I also want to track the software that I am using in my research. Although I care about my data and my software equally, I also want to put some extra "protection" on my software to ensure that the code gets "checked in" to a code base. In order to try this out we are going to work with a tool called GitHub.

GitHub is a hosted version control system that is popular in the software writing community. GitHub is actually a branded, cloud-based hosted service built on the open source software package Git. We will use GitHub for our exercise because it is easy to install the client and point it at a hosted server, and it gives us the power of cloud backups. If you were working with very private and secure data, however, you might decide that it is better to run Git and GitLab (the server portion) locally.

1. Create a GitHub account. Go to http://github.com and register on the main page. Choose the free account.
2. Download the GitHub application for Mac (http://mac.github.com/) or for Windows (http://windows.github.com).
3. Log in.
   a. In OS X the login is under Preferences.
   b. In Windows the login is under. . .?
4. When you look at GitHub you should see a pretty much empty window.
In GitHub, the process of uploading a document to the repository is known as "committing." The process of indicating that you are going to work on a file is called "checking out." When you check in or commit code you may have to resolve conflicts if you happen to commit or check in a file that has been modified since you checked it out. It is in this way - check out, modify, check in, resolve conflicts - that code changes are managed. GitHub is good for more than just code: it can store any type of file, including text, images and other media. For our purposes we will use GitHub to store our data and a code file.

5. Before we do our first commit to the repository, create a new text file called "code_file.txt" that contains the text "This is my code file. It has version 1.0." Save the file in your project directory (aka your Google Drive directory).
6. With our two files created, let's create a repository. Repositories in GitHub are like buckets in S3. They contain your files along with their versions, metadata and update history.
7. Click on the plus sign at the bottom of the screen or "File >> New Repository."
8. Give your repository a name and select the local file location. You will notice that I chose the folder above the folder that I already had for this project (e.g. my project name is dlab_workshop_etm; I already had a folder called dlab_workshop_etm that I wanted to be my home repository, so I pointed GitHub to the parent folder). This tells GitHub to start using my existing folder as a GitHub repository. This is good and bad, and you may have a preferred approach. NOTE: you should add your initials to the repository name, as GitHub does not like conflicting repository names!
9. You should be pushed to the Sync Changes screen. If not, double-click on your repository and look for the changes icon.
10. Every time you add data to a repository you must add a title and some descriptive metadata. Best practice is to accurately describe the changes. After you have entered the description, click the "Commit & Sync" button. You should be asked if you want to sync your repository to GitHub. This uploads all of your data and code to GitHub.
11. With your repository synced, let's take a moment to look at the service on GitHub (the website). Go to http://github.com. Log in and look for your repositories (e.g. https://github.com/eriktmitchell/dlab_workshop). You will notice that GitHub recommends that you add a README file to your project. Let's do that!
12. In the web browser, click the "Add a README" button. GitHub will give you a text editor.
13. Add text to your file and click commit. You will now see the committed file in your web browser.
14. With your readme file in the central GitHub repository, we now need to get the file onto your machine. In the GitHub client (Windows/Mac), click "Sync Branch."
15. This will download the readme file to your local machine. Look in your Finder or Windows file explorer to confirm the file exists.
16. We can also change files locally and have our GitHub client detect the changes. Edit your code file to change the version to 2.0.
17. Return to the GitHub client and look under the Changes tab. You will notice unsynced changes. Review the changes and accept them by clicking the "Commit & Sync" button.
18. You can also move back to previous versions under the History tab. Choose a version to roll back to and choose "Action >> Rollback."

A detail: rollback erases the change history, while revert puts the previous file back into place but preserves the history. It is worth thinking about this - revert preserves your action history while rollback erases it! A command-line sketch of the difference follows.
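If you want to see what the desktop client is doing behind the scenes, the same history operations exist in plain Git on the command line. The sketch below is a rough illustration using Python's subprocess module and a throwaway folder: "git revert" corresponds to the client's revert (history preserved) and "git reset --hard" corresponds to rollback (history discarded). It assumes Git is installed; the folder and file names are placeholders, not part of the workshop setup.

# Rough illustration of revert vs. rollback using plain Git commands.
# Assumes git is installed; uses a throwaway directory with placeholder names.
import subprocess
from pathlib import Path

repo = Path("demo_repo")
repo.mkdir(exist_ok=True)

def git(*args):
    subprocess.run(["git", *args], cwd=repo, check=True)

git("init")
# Identify yourself so commits work in a fresh environment.
git("config", "user.email", "you@example.com")
git("config", "user.name", "Workshop Participant")

# Commit version 1.0, then version 2.0 of the code file.
code = repo / "code_file.txt"
for version in ("1.0", "2.0"):
    code.write_text("This is my code file. It has version %s.\n" % version)
    git("add", "code_file.txt")
    git("commit", "-m", "Code file version %s" % version)

# Revert: adds a new commit that undoes the last one -- history is preserved.
git("revert", "--no-edit", "HEAD")

# Rollback: moves the branch back one commit -- that commit is erased from history.
git("reset", "--hard", "HEAD~1")

git("log", "--oneline")  # inspect what is left of the history

Running the log command at each stage makes the difference obvious: revert leaves a longer history, while reset makes it shorter.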
Git also includes the concept of "branching," which allows you to create different versions of code and documents to try out. Branching can be particularly useful when trying out different combinations of files or tracking the impact of multiple code or data changes.

19. To branch your repository, click on Branches and click on the plus sign next to your main repository. Name your branch.
20. You can then publish your branch by clicking the publish button (or test locally!). Let's test out branching by switching to our testing branch, modifying two files and then reverting back to our master.
   a. Make sure your testing branch is active under Branches by double-clicking on it.
   b. Edit your data and code files to add the text "this file is in my testing branch."
   c. Go to your Changes tab and review the changes you made. Comment and sync.
21. Take a look at the website: in your code you should see two branches (master and testing).
22. What is even more amazing is that GitHub helps you track multiple versions of these files on your computer. Return to your GitHub client on your computer and switch back to your master branch (Branches, double-click on master).
23. With your branch switched, go to Finder or your file explorer and open the file. You should see your master file there!

The ability to track multiple branches or versions of files simultaneously is an important feature of GitHub. Using this feature you can test different combinations of data and data analysis techniques!

Share data with colleagues

Activity 5 -- Make my code and selected data available for others to use
GitHub is great for managing complex file structures and multiple grouped versions of files. It is also great for engaging the developer and researcher community. With a free account, everything you put into GitHub is open to the world. These are called public repositories. With a paid account you can make your code and data private in GitHub. This is useful as you are going through the research process: you may not want to make all of your mistakes public! In addition to setting a public/private flag on GitHub, you can choose to ignore certain files. In order to support both the developers and the users of code, GitHub has a number of documentation features, including an integrated wiki, statistics on use and an issue tracking system. These features take some time to get to know, so we will only mention them here. If you like GitHub, though, they are worth some consideration.

Activity 6 -- Create a data citation
There are many other options for publishing data for others to use. These include sites like ICPSR, the Data Hub, Merritt and Dataverse. In this portion of the exercise we will look at a tool hosted by Harvard called Dataverse.

1. Go to Dataverse and create an account (http://thedata.harvard.edu/dvn)
2. Register your account
3. Create a dataverse
4. Create a study

The Dataverse is unique among the examples we have looked at so far in that it is centered on the idea of publication. In our work with S3, Google Drive and GitHub we were focused on data production, analysis and management and thought of publication as a final process.
In Dataverse you take the final form of your data and upload it with proper descriptive metadata for others to use. The study creation form collects the details required to publish your data. A full tutorial on creating and managing your data files in Dataverse is available at http://www.irss.unc.edu/odum/contentSubpage.jsp?nodeid=659#data

Activity 7 -- Get a PURL to your data
Making your data citable can be as simple as publishing your GitHub data or as complex as creating an entry in Dataverse, ICPSR or your data publication platform of choice. A key activity common to all of these is getting a persistent uniform resource locator (PURL) that will point to the data no matter where it lives. PURLs are most frequently used in the form of DOIs (Digital Object Identifiers) for citations. You most likely have already seen DOIs and used them in your citations. While sites like Dataverse create DOIs for you, you can also choose to host your data anywhere you like and assign your own DOI. In order to support this, the UC Berkeley Library hosts a DOI service provided by the California Digital Library called EZID.

EZID (easy-eye-dee) makes it easy to create and manage unique, long-term identifiers:
   o create identifiers for anything: texts, data, bones, terms, etc.
   o store citation metadata for identifiers in a variety of formats
   o update current URL locations so citation links are never broken
   o use EZID's programming interface for automated operation at scale
   o choose from a variety of persistent identifiers, including ARKs and DataCite DOIs
[excerpted from the EZID web site]

For more details and help with EZID, check out their documentation at http://n2t.net/ezid/home/documentation.

Objective
Generate a DOI with the simple demo version of EZID, and then resolve the DOI name (i.e., retrieve the digital object from a DOI).

Procedures

Visit EZID
1. Visit http://n2t.net/ezid/
2. In the menu bar at the top, click Demo. (We're working with the demo version.)

Create a DOI
3. In the EZID form, select the DOI option.
4. We're going to create a DOI for the Berkeleyside article on the Doe Library Centennial.
   o Open a new web browser tab, and visit http://www.berkeleyside.com/2012/03/22/uc-berkeleys-doelibrary-celebrates-100-years/
   o Keep the article open for your reference.

Enter metadata for the digital object
5. Return to the EZID form. Enter the metadata details of the Berkeleyside article into the form.
6. After entering all the metadata, click the Create button.
7. A new page is displayed with the identifier details. The created DOI is highlighted in yellow. Copy the DOI here: ___________________
8. Note: EZID creates a QR code for the DOI, which is displayed in the upper right-hand corner. You can read the QR code with your mobile device to retrieve the Berkeleyside article.

Resolve a DOI (i.e., retrieve the digital object from a DOI)
9. Open yet another web browser tab.
10. Visit the following URL: http://dx.doi.org/ + [the DOI you generated]
   o For example, http://dx.doi.org/10.5072/FK2RN39P5
   o Please note: it can take a few minutes before the test DOI becomes active.

Edit a DOI
In EZID, you can edit the metadata for a DOI (including a new location URL). This is done in the Manage IDs functionality. In the EZID demo version we are using, we are unable to edit the temporary test DOI we created.

Summary
EZID lets you create permanent identifiers (including DOIs and ARKs) via a web interface or an API.
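Because a DOI is simply a redirect maintained by the DOI resolver, you can also check one programmatically. The short Python sketch below uses the requests package to follow the redirect for a DOI, mirroring the "Resolve a DOI" steps above. The identifier shown is the example test DOI from those steps; substitute the one you minted, and remember that demo DOIs may take a few minutes to become active (and may eventually stop resolving).

# Resolve a DOI by following the redirect from the DOI resolver.
# The identifier below is the example test DOI from the steps above;
# substitute your own.
import requests

doi = "10.5072/FK2RN39P5"  # placeholder test DOI
response = requests.get("https://dx.doi.org/" + doi, allow_redirects=True, timeout=30)

print("Status code:", response.status_code)
print("Resolved to:", response.url)  # final URL after redirects

If the final URL is the page you registered, your citation link is working; this is exactly what a reader's browser does when they click a DOI in your paper.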
Plan for data management

Activity 9 -- Data Management Planning: Use the DMPTool to write a data management plan

Background
The DMPTool provides step-by-step instructions and guidance for writing a data management plan. It helps you meet the specific requirements of different funding agencies. More details at: https://dmp.cdlib.org/about/dmp_about

Objective
In this exercise, we are going to observe the workflow of writing a data management plan that meets the NSF requirements for the Chemistry Division.

Procedures
1. Visit https://dmp.cdlib.org/
2. In the upper right-hand corner, click Login.
3. In the Select Your Institution drop-down menu, select University of California, Berkeley.
4. Select your user status (either new or returning user), and click Go.
5. Once you're logged into the DMPTool, on the top menu bar, click on Funder Requirements.
   a. You will see a listing of funding agencies. The DMPTool will help you prepare a data management plan that meets these agencies' requirements.
   b. Note the links to specific requirements, sample plans, and a word processing document template.
6. Let's create a data management plan that meets the NSF-CHE: Chemistry Division requirements.
   a. On the top menu bar, click on My Plans.
   b. In the Create a new plan drop-down menu, select NSF-CHE: Chemistry Division. Click Go.
7. In the Plan Name box, enter My first plan. Click the Save and next button at the bottom.
8. In the first section of your data management plan, you give details of the products of research.
   a. The left-hand column labeled Progress outlines the sections you will write for your data management plan.
   b. The right-hand column labeled Resources includes links to helpful, context-specific resources for writing the current section.
   c. The Help box in the center with the green banner provides tips and suggestions for writing the particular section of your data management plan.
9. Enter a random string of characters in the text box in the middle. Click the Save and next button.
   a. For the remaining four sections (from Data format to Archiving of Data), skim the help information, enter some random text in the text box and then click the Save and next button.
10. After you've completed the fifth section, Archiving of Data, you will see your data management plan displayed.
11. Export the plan as an RTF word processing file by selecting the Rich Text option at the top and clicking the Export button.
12. In the top menu bar of the DMPTool, click on My Plans. Your new plan is listed, and there are options for editing your plan.
13. Let's share our data management plan.
   a. Click on the [share] link for your plan.
   b. A link is now displayed. This link can be shared with colleagues. It directs the reader to a publicly accessible PDF version of your plan.
   c. Let's remove your data management plan PDF from public viewing. Click on the [retract] link for your plan.
14. Please log out when done.

Summary
DMPTool helps you create data management plans that meet a number of funding agencies' specific requirements. You can export your plan as a text or PDF file, and you can share the PDF file through a link that is publicly accessible. The tool provides contextual help information for each section of your plan and offers links to local data resources.