VERSIONS Project – WP2 – Report of Researchers Questionnaire – v1b – 29 February 2008

Versions of academic papers online - the experience of authors and readers: report of a survey of researchers

Panayiota Polydoratou and Frances Shipsey
November 2007, corrections February 2008

Document history
Authors: Panayiota Polydoratou, Frances Shipsey
Version 1a: 16 November 2007
Version 1b: 29 February 2008

Contents

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
EXECUTIVE SUMMARY
1. INTRODUCTION
1.1 Objectives of the User Requirements Study
1.2 Focus on the discipline of economics
2. METHODOLOGY
2.1 Survey design
2.2 Publicising the survey
3. RESULTS – GENERAL
4. IDENTITY CHARACTERISTICS OF THE RESPONDENTS
4.1 Role
4.2 Responsibilities
4.3 Subject discipline
4.4 Research experience
4.5 Countries of respondents
5. AUTHORS: RESEARCH OUTPUTS
5.1 Types of research outputs
5.2 Versions/revisions kept
6. AUTHORS: PAPERS FOR SUBMISSION TO REFEREED JOURNALS
6.1 Level of research activity
6.2 Research activity by role
6.3 Versions of all research outputs kept, by number of journal articles produced
7. AUTHORS: REVISING AND STORING ACADEMIC PAPERS FOR REFEREED JOURNALS
7.1 Versions of journal articles most likely to be kept by authors
7.2 Accessibility of own versions to their authors
7.3 Reasons for not having access to all versions
7.4 Organisation and storage of different versions
7.5 Organising own versions – successful strategies
7.6 Organising own versions - difficulties
8. AUTHORS: RESPONSIBILITY FOR SECURE STORAGE OF DIFFERENT VERSIONS
9. AUTHORS: DISSEMINATION OF RESEARCH OUTPUTS
9.1 Digital repositories
9.2 Authors’ intentions regarding deposit
9.3 Authors’ attitudes regarding deposit – further exploration
9.3 Other dissemination routes
9.4 Role for libraries/institutions in assisting with dissemination
10. AUTHORS: ATTITUDES TOWARDS MAKING VERSIONS OF RESEARCH OUTPUTS OPENLY ACCESSIBLE
10.1 Versions of academic papers authors are interested in making openly accessible, if permitted
10.2 Authors’ understanding of which versions of their papers they are allowed to disseminate
11. RESEARCHERS: FINDING MULTIPLE VERSIONS OF THE SAME ACADEMIC PAPER
12. RESEARCHERS: CITING PAPERS FOUND IN FULL TEXT ONLINE
13. RESEARCHERS: PERSISTENCE – LONG TERM AVAILABILITY OF ACADEMIC PAPERS
14. IDENTIFYING VERSIONS
14.1 Labelling and naming versions
14.2 Chronological and numeric labelling
14.3 Describing the content and relationships
14.4 Linking - Collocation
14.5 Textual comparison
14.6 Signposting – Publisher and author versions
14.7 Other suggestions
15. DISCUSSION
16. CONCLUSIONS
APPENDIX A - QUESTIONNAIRE
APPENDIX B - OTHER SUBJECT DISCIPLINES OF RESPONDENTS
APPENDIX C – COUNTRIES OF RESPONDENTS
APPENDIX D – PERSONAL INFORMATION MANAGEMENT AND VERSIONS – SUCCESS STORIES
APPENDIX E - PERSONAL INFORMATION MANAGEMENT AND VERSIONS – DIFFICULTIES
APPENDIX F – RESPONSIBILITY FOR SECURE LONG TERM STORAGE OF VERSIONS OF ACADEMIC PAPERS

List of tables

Table 1: Role of respondents
Table 2: Subject disciplines of respondents
Table 3: Research experience of respondents
Table 4: Research experience by respondent roles
Table 5: Responses by geographic region
Table 6: Expected research outputs – totals and by subject discipline
Table 7: Types of research output produced in a typical research project
Table 8: Which versions of research outputs do researchers personally keep?
Table 9: Number of papers produced in past 2 years for publication in refereed journals
Table 10: Number of papers produced, by role of respondents
Table 11: Revisions of research outputs kept, by level of research activity
Table 12: Versions of journal articles kept by authors
Table 13: Ease of access by authors to their own final author versions of articles
Table 14: Reasons for not having easy access to own final author versions
Table 15: Organisation of different versions of papers
Table 16: Responsibility for the secure long term storage of versions
Table 17: Awareness of institutional repositories
Table 18: Reasons for not depositing papers in institutional repository
Table 19: Willingness to deposit final author version in IR if invited to do so
Table 20: Author attitudes towards providing final author versions to IRs
Table 21: Attitudes towards deposit - ranked
Table 22: Other dissemination channels used - Economics/Econometrics
Table 23: Number of dissemination channels used by economists
Table 24: University/institution depositing research outputs for authors - usefulness
Table 25: Further comments by authors on making versions OA
Table 26: Level of understanding of versions allowed for dissemination
Table 27: Researchers’ experience: finding more than one version/copy online
Table 28: Researchers’ experience: ease of identifying the preferred version
Table 29: Researchers: difficulties faced in identifying versions
Table 30: Reference practices when citing papers
Table 31: Persistence of papers
Table 32: Version identification methods – All methods ranked in order of support
Table 33: Version identification methods - Labelling versions
Table 34: Version identification methods – Describing relations
Table 35: Version identification methods - Chronological labelling
Table 36: Version identification methods - By version control number
Table 37: Version identification methods – Note in the description of papers
Table 38: Version identification methods - Describing content
Table 39: Version identification methods - Collocation
Table 40: Version identification methods - Textual comparison
Table 41: Version identification methods - Published version
Table 42: Version identification methods - Author’s latest version

List of figures

Figure 1: Responsibilities of respondents
Figure 2: Types of research outputs authors expect to produce
Figure 3: Other preferred dissemination routes – all subjects
Figure 4: Which versions authors are interested in making openly accessible, if permitted

Acknowledgements

The VERSIONS Project is grateful to all project partners for their assistance with drafting and publicising the online questionnaire. In addition, Neil Jacobs, the JISC Programme Manager, and several LSE academic staff provided useful feedback on the survey design. Professor Ian Walker, University of Warwick, and Professor John Sutton, LSE, kindly facilitated dissemination of the survey to the European Economic Association and Royal Economic Society membership respectively, as did a representative of the Latin American and Caribbean Economic Association (LACEA). Thanks are due to Clive Graham for editorial assistance. Any errors in the report remain the authors’ own.

Executive summary

Identity of respondents

Four hundred and sixty-four people replied to the survey. More than half of the respondents (56.9%) were members of academic staff. Research staff, including both postdoctoral and contract/freelance researchers, accounted for almost one fifth of the response (19.4%). The remaining response was divided between students (23.3%) and non-active researchers (0.4%). A range of responsibilities was declared, the most common being teaching (347 researchers). Forty respondents were journal editors. Ninety-six respondents indicated ‘other’ responsibilities, such as acting as members of editorial boards and/or having other roles in the editorial/publishing process, for example as referees and editorial assistants.

Subject discipline

‘Economics and Econometrics’ was the most represented subject discipline (75%) in the questionnaire.
This was the main target group. Researchers in the disciplines of ‘Accounting and Finance’ and ‘Business and Management Studies’ represented a further 9% of the overall response. Other subject disciplines represented were physics, library and information management, computer science and informatics, and statistics and operational research.

Research experience of respondents

Research experience was distributed fairly evenly among the respondents: 0-5 years (32%), 6-10 years (30%) and more than 10 years (38%). Furthermore, research experience appeared to be consistent with the role of the respondents.

Geographic spread

The questionnaire received responses from researchers in 41 countries. The largest groups of researchers were based in the UK (80 people), the USA (57 people), Germany (49 people) and the Netherlands (40 people). However, as the sample was small, results are indicative rather than conclusive and cannot be generalised without further research.

Research outputs produced and retained by respondents

The researchers produce a range of research outputs, most commonly papers for submission to refereed journals (434 researchers). Conference/workshop/seminar material was the second most commonly expected research output (311 researchers), and conference papers (274) the third. The prevalence of articles for publication in refereed journals demonstrates the popularity of the medium and is indicative of the way that research is communicated and disseminated to this day. Over half (59%) produce 4 or more different types of research output from a typical research project. More than 90% of respondents indicated that they personally keep either all or the major revisions that they make to their research outputs.
This appears to be an encouraging finding as regards the availability of the final author version (produced by author / coauthors – agreed with journal, following referee comments), ie a version that can be deposited in an institutional repository. Only one person noted that they do not keep a personal copy of their publications, which suggests that the availability of content is not, and will not be, an obstacle to its inclusion in an open access repository.

Papers for submission to refereed journals

More than 70% of respondents are likely to produce between 1 and 6 papers for submission to refereed journals during a 2-year period of research, indicating that the survey respondents were a research-active group who could be expected to comment on research practices from direct experience. The role of the respondents, and consequently their research experience, is also in line with the number of research outputs they were likely to produce. The majority of the professors who participated in the survey were likely to produce 4 or more research outputs. Lecturers/associate professors were most likely to produce between 4 and 6 research outputs, while postdoctoral staff and doctoral students were likely to produce between 1 and 3 outputs over a 2-year period of research.

Versions of journal articles kept by researchers

Focussing specifically on researcher practice relating to papers intended for publication in refereed journals, the majority of the respondents permanently keep a range of different versions of these. Of particular interest is that 91% of the respondents stated that they would personally and permanently keep the final author version, post-refereeing. 92% stated that they personally keep the final published version produced by the publisher, while 79% keep the version of the paper that has been submitted to a journal for peer review.
This is an encouraging result for the likely availability of versions permitted for deposit in open access repositories by many publishers’ standard agreements.

Authors’ own ability to quickly access their final accepted versions

Ease of access by authors to their personal copies of final accepted versions of journal articles is a more mixed picture, though still encouraging. 59% of the researchers have an easily accessible copy of all their final author versions. Another 37% indicated that they have an easily accessible copy of most of their final author versions. As above, this very positive result indicates that the large majority of those surveyed have access to the version required for deposit into an institutional repository (IR). The primary reason (90 respondents) for not having easy access to all of their final author versions of journal articles was the lack of electronic copies before a certain date. The next most common reasons noted by the respondents were:
• that the papers were stored on different servers and therefore would prove difficult to retrieve (30 researchers)
• loss or damage to their computer systems (29 people)

Researchers’ personal information management and versions

Researchers’ satisfaction with their own personal information management strategies for storing and organising their files, which can run to dozens of versions and revisions, was mixed. Around half were satisfied with their systems and indicated that they tend to organise files by project, date or version control number. A small number of those who were satisfied mentioned use of a version control system or other tool at their institution. Some mentioned that they only retain their final versions.
The other half of the respondents expressed difficulties in organising their various versions of a document, which they attributed to working on multiple computer systems, retaining too many versions, difficulties in managing the styles of co-authors, and insufficient naming systems and labelling of versions.

Responsibility for secure long term storage of versions of journal articles

Researchers felt that authors should have the primary responsibility for secure long term storage during all the stages at which authors themselves create the versions (draft, submitted version, final accepted version). From the point at which the publisher creates versions (proof, final published version), researchers saw publishers as the main group with responsibility. Strikingly, authors’ universities/institutions (including libraries) were seen as having a lesser role than both authors and publishers to play in providing secure long term storage of versions. Publishers were viewed by more respondents as having a role to play in long term storage of final author versions (168) than were authors’ own institutions including libraries (81).

Dissemination of your work (full text and online)

One third (33%) of the researchers indicated the availability of a digital repository at their university where they can deposit their papers. The remaining two thirds of the respondents noted either that there was no such service or that they were unaware of one. More than half of the researchers who noted that they have access to a digital repository in their university replied that they submit their papers there. However, this group represents only one fifth (90 researchers) of the overall response.
Attitudes towards depositing final author versions of journal articles

More than 81% replied that if their university invited them to place a copy of their paper in the institutional repository and requested the ‘final author version’ (the definition was provided again along with the question), they would provide this version. Other attitudes regarding deposit of final author versions of journal articles were explored through a series of statements to which researchers Strongly agreed/Slightly agreed at the following rates:
• Willing to provide – helps to disseminate research quickly (85%)
• Willing to provide on condition readers are made aware it is not the published version (84%)
• Willing to provide if link to published version is provided (78%)
• Would take too much time to provide this version (12%)
• Consider this version to be inferior to the publisher PDF version (50%)
• Place the publisher PDF on personal website as first priority, if permitted (73%)
• Willing to provide author final version to fellow researcher if requested by email (91%)
• Concerned that I might lose citations to the published version if I provide final author version (42%)
• Unsure whether publisher copyright agreement permits me to provide this version (68%)
• Intend to provide such versions in the future (66%)

Other dissemination routes

Researchers are already disseminating their work through a variety of channels, such as personal website, university working paper series website, REPEC and SSRN. Almost half of the economists surveyed (49%) disseminate their work through 3 or more different routes (in addition to publication in refereed journals and any use of institutional repositories). 73% disseminate their work through 2 or more different routes.
Given this, it is not surprising that a majority of researchers (78%) indicated that it would be useful if their institution could support deposit of their research outputs in alternative routes such as a personal web page, group or other web pages held at the institution, and relevant subject repositories.

Author attitudes towards open access dissemination of their versions

Researchers were asked which versions of their academic papers produced for publication in a refereed journal they would like to make available on open access, if permitted. The most popular response was the publisher-created published version (385). Next most popular was the final author accepted version (274), followed by the submitted version (191). Interestingly, a minority were interested in making even more versions openly accessible, if permitted, such as the draft version circulated to colleagues or peers (116) and the publisher’s proof (100).

Authors’ understanding of permitted open access versions

Just 11% of respondents feel that they have a full understanding of which versions of their academic papers they are allowed to disseminate in full text, in which location(s), and at which time(s). Of the remainder, most (44%) say they have a ‘limited understanding’. This suggests that the VERSIONS Project could help to raise awareness through inclusion of information about the RoMEO list and copyright transfer agreements in the forthcoming toolkit for researchers.

Finding multiple versions of the same academic paper online

A large majority of respondents (93.4%) have experienced locating more than one full text version / copy when searching for a paper online. More than half (54%) noted that they frequently or very frequently find several versions of the same research paper available online. Only 5% replied that they have never come across such an occurrence and only 2% of the researchers replied that they were not aware whether this was happening.
41% of respondents report difficulties with establishing which online version(s) they wish to read, though a slight majority (54%) state that it is generally quick and easy to decide which versions/copies to use. The most common problems experienced are ‘knowing if I have found the latest (most recently issued) version’, ‘knowing whether there is a published version’ and ‘knowing the difference between the content of one version and another’. ‘Time taken to look at different versions’ was the fourth most common difficulty, though it is perhaps implicit in the first three as well.

Citing papers that you have found in full text online

Asked about citing journal articles when they have read an earlier version, 73% of researchers said they would cite the published version only, 7% said they would cite both the published version and the earlier version, 13% said they would cite the earlier author version they have read, while 5% would not cite at all if they had not read the published version. A large number of free text comments were received to qualify the answers given to this question, and these reveal that researchers do in fact spend time and effort reading both published and earlier versions so that they can cite correctly.

Persistence – long term availability of academic papers

About half of the respondents noted that it is important to them that the online version of a paper they have cited remains available and accessible online. The location appeared not to be so important to the respondents; having quick and easy access to the information, preferably via a search engine, was raised by a few as an important feature. Comments made by the respondents raise questions regarding the preservation of the material that is made available online, its sustainability and the role repositories have in this area.
Version identification solutions

Of 10 potential version identification methods proposed, the three most popular with survey respondents were:
• A method of indicating which is the published version – Essential or very important 75% (Essential 37%, Very important, useful for me 38%)
• A method of indicating which is the author's latest version of a paper – Essential or very important 67% (Essential 25%, Very important, useful for me 42%)
• A standardised way of recording and displaying the date of manuscript completion – Essential or very important 58% (Essential 19%, Very important, useful for me 39%)

We asked researchers to let us know which terms they (or their publishers) use for revision stages. 71 free text replies were received. The most commonly used term was ‘draft’ (33), usually in combination with something else, eg 'Submitted draft'. Terms which were mentioned in significant numbers by researchers included: preprint (6), manuscript (4), do not cite (3), submitted (35), accepted (14), final (15), revised (5), reviewed (3). Postprint was mentioned by just one respondent.

1. Introduction

The VERSIONS Project [1], funded by the Joint Information Systems Committee (JISC) Digital Repositories Programme [2], addresses issues and uncertainties relating to version identification in digital repositories and open access collections of research papers. The VERSIONS Project investigates how academics produce, archive, disseminate and access electronic versions of papers at different stages in their lifecycles. The project looks at researchers’ attitudes towards the current situation. The project has a focus on research papers, sometimes referred to as eprints, in the subject discipline of economics. It takes a comparative view by drawing on established partnerships and experience with European libraries specialising in economics.
Through this partnership it has been possible to conduct and disseminate widely in European countries a user requirements study and survey of current practice among academic economists. This report (along with two related reports [3] from the project) forms work package 2 of the VERSIONS Project, the User Requirements Study and Repository Use investigation. The results of the user requirements survey and of a publications list analysis will be used to develop a set of guidelines on good practice in relation to version identification, to produce a toolkit of guidelines for academic researchers and to make recommendations on standards for versions to JISC. The VERSIONS project is led by the London School of Economics and Political Science Library [4], with the Nereus [5] Consortium of European economics research libraries as associate partners.

1.1 Objectives of the User Requirements Study

The user requirements study and repository use investigation are intended to meet the following project aim:
• to clarify the position on different versions of academic papers in economics available for deposit in digital repositories, in order to help build trust among academic users of repository content

The online questionnaire targeted at researchers, which forms the focus of this report, was designed to meet the following project objectives:
• finding out about researchers’ understanding of different versions in the lifecycle of an academic paper
• finding out researchers’ attitudes towards secure storage and open access availability of papers at different stages in the lifecycle
• discovering any variations in requirements depending on specific stakeholder roles (eg author, journal editor, head of department, teacher, etc)
• finding out about existing repository use by researchers, looking at both institutional and subject repositories
• uncovering current practices among academic researchers in retention of their own authors’ versions

[1] VERSIONS Project – http://www.lse.ac.uk/versions
[2] JISC Digital Repositories Programme – http://www.jisc.ac.uk/whatwedo/programmes/programme_digital_repositories.aspx
[3] ‘Report of interviews with academic economists and support staff on the question of version identification’ and ‘Identifying versions of academic papers in digital repositories: report of a survey of experts’
[4] London School of Economics and Political Science, Library – http://www.lse.ac.uk/library
[5] Nereus – http://www.nereus4economics.info

1.2 Focus on the discipline of economics

It is important to note that the VERSIONS Project has a focus on one discipline, that of economics. The User Requirements Study has therefore predominantly surveyed economics researchers. This focus partly reflects the experience and priorities of the project partners. However, it also stems from the fact that economics is a discipline already rich in content at all stages of the lifecycle of a document. Economists have widely accepted the use of pre-prints and working papers online as a necessary part of the process of publishing academic journal articles, in part to overcome inherent delays in publication. The time lag varies but can be as much as three years. The distributed collaborative project RePEc [6] is the best known economics eprints archive to date and is used by many economics researchers. The prevalence of pre-print versions of papers in economics, now often sitting alongside final publisher versions on publisher websites and final author versions in institutional and subject repositories, does mean that there is ample material on which to base the investigation of version identification. It was anticipated that the focus on one subject discipline and on one content type would produce a valuable set of results for a real community of research, which will be available for re-use by the JISC community in other subject areas.
The report by Sue Sparks [7] on disciplinary differences highlighted the fact that there are cultural differences between disciplines which need to be taken into account when considering researchers’ needs for access to resources and dissemination of their outputs. A question posed in her report asked: ‘What is the single most essential resource you use, the one that you would be lost without?’ Economists responded [8]:

• 18.2% preprints
• 9.1% postprints
• 54.5% journal articles
• 18.2% datasets

[6] RePEc: www.repec.org
[7] Sue Sparks. JISC Disciplinary Differences Report. Rightscom Ltd, August 2005. http://www.jisc.ac.uk/uploaded_documents/Disciplinary%20Differences%20and%20Needs.doc
[8] Ibid. Appendix C, Table 43

The high importance attached to pre-prints by economists (as compared with other disciplines surveyed in the report on disciplinary differences) confirms the real usage of multiple versions of papers by researchers in this discipline.

2. Methodology

2.1 Survey design

During the spring of 2006, notes from the interviews with economists and other stakeholders from the digital repository community were written up and developed into scenarios. These notes and the findings from the interviews also informed the development of an online questionnaire.

Following circulation of the first draft questionnaire to the project partners in March 2006, it was decided that the VERSIONS survey would be more effective if split into two separate sets of questions, one for researchers and one for other interested parties (Library staff, IT support staff, research funders, publishers etc.). The questions to be posed to each group are quite different, so two separate surveys were considered an ideal solution to avoid wasting respondents’ time on irrelevant material.

Very useful written comments on the draft questions were received from NEREUS partners.
Both sets of questions were revised and tested through numerous iterations to ensure that they were relevant and as concise as possible. Two draft surveys were circulated to the project partners and a conference call was held on 24 April, allowing partners to comment further on each questionnaire. These valuable comments provided a fresh view on the questions, and the feedback was used by the VERSIONS team to prepare the final drafts. The JISC Programme Manager also provided useful feedback and the draft questionnaire was tested on a couple of LSE economists.

The application used for the questionnaires was Bristol Online Surveys (BOS) and the questions used in the surveys have been made available to BOS academic subscribers as examples of surveys produced using the survey software [9]. The Project is grateful for the help of the BOS Support Team in facilitating this.

Readers looking at the survey of researchers were invited in question 1 to follow a link to the expert survey and to move away if their role was not that of active researcher, for example if they were part of the digital repositories community. Equally, researchers who happened across the survey for experts/stakeholders were invited to transfer over to the present survey.

[9] Bristol Online Surveys (BOS): http://www.survey.bris.ac.uk

The questionnaire can be seen at Appendix A.

2.2 Publicising the survey

The VERSIONS Project questionnaires were both publicised during June 2006 through the following mechanisms:

By project partners in the Nereus network:

• Survey details were added to the Nereus website by the Nereus Programme Manager.
• LSE and Nereus partners helped to promote the survey by emailing the details to their economists and / or adding links to the survey to their library websites.
Beyond LSE and Nereus, various promotion efforts were made by the VERSIONS Project Team:

• Through the JISC Digital Repositories Programme wiki.
• By contacting a large number of economic associations identified through the website of the International Economic Association [10], inviting them to publicise the survey via their own discussion lists. Positive responses were received from the European Economic Association, the Royal Economic Society and from the Latin American and Caribbean Economic Association.
• An invitation to forward the survey to researchers, especially economics researchers, was circulated via several repository, OAI, library and university administrator discussion lists, as well as through library contacts in other countries such as Australia. It was hoped that this approach would yield further responses.
• SSRN [11]’s fee-based Economics Research Network Announcements Service was also used to publicise the survey.

[10] International Economic Association: http://www.iea-world.com
[11] Social Science Research Network: http://www.ssrn.com/

3. Results – general

The online survey was launched on May 9th and ran for two months, closing on July 9th 2006. Four hundred and sixty four (464) researchers responded to the survey.

4. Identity characteristics of the respondents

Researchers were asked to indicate their current role, other responsibilities and the subject area they engaged in. They were also asked to provide information about the number of years they had been engaged in research and to indicate the country in which they were based.

4.1 Role

The researchers were asked to indicate their current role, choosing from the list of options shown in the table below. More than half of the respondents (56.9%) were members of academic staff. Research staff, including both postdoctoral and contract/freelance researchers, accounted for almost one fifth of the response (19.4%).
The remaining response was divided between students (23.3%) and non-active researchers (0.4%). The response is presented in the following table.

Role of respondents | Number | Percentage
Professor | 112 | 24.1%
Lecturer / Associate Professor | 152 | 32.8%
Post doctoral research staff | 69 | 14.9%
Student (PhD or other research degree) | 108 | 23.3%
Contract / freelance researcher | 21 | 4.5%
Not an active researcher | 2 | 0.4%
Total | 464 | 100%

Table 1: Role of respondents

4.2 Responsibilities

In addition to their role, the respondents were also asked to select from a list of responsibilities, shown in the figure below. It was possible for respondents to select more than one responsibility. The majority of the respondents (317 researchers), as one might expect from a group of academic researchers, indicated that they had teaching responsibilities.

The respondents include a significant number of researchers with responsibility for academic publications or other promotion of economics research: heads of department or research unit (67), journal editors (39), working paper series editors (25), officers of learned societies or research associations (34).

The ‘Other’ category attracted 96 responses, revealing some additional responsibilities held. Ten (10) of the responsibilities noted by the respondents were related to journal publishing, such as acting as a member of an editorial board, referee or journal editorial assistant. A further 19 responses indicated a managerial and/or administrative responsibility in addition to the main stated role, for example director of PhD programme, deputy director of think tank, research facility manager, director of degree course, exams officer, admissions mentor. The remainder either indicated that they were researchers, teaching assistants or students (which were responsibilities implicit in the question about roles) or replied that they had no other responsibilities.
The fairly high number of ‘Other’ responses is partly accounted for by a flaw in the design of this question, which omitted a ‘None’ or ‘N/A’ option.

[Figure 1: Responsibilities of respondents. Bar chart of number of respondents by responsibility: Teacher 317; Other (please specify) 96; Head of department or research unit 67; Journal editor 39; Officer of learned society or research association 34; Working paper series editor 25; Dean or head of university research 6.]

4.3 Subject discipline

Another question in the section on identity characteristics of the respondents concerned the subject discipline that they engaged in. The categories used in this question were the Units of Assessment (UOA) used for the UK Research Assessment Exercise. Respondents were provided with further information about the subject groupings used [12] and were advised that: ‘We are happy to receive responses from all disciplines, but you may find that some of the questions are targeted specifically towards economists.’

[12] RAE 2008 Units of Assessment and Recruitment of Panel Members: http://www.rae.ac.uk/pubs/2004/03/

Three subject categories were offered: Economics and Econometrics, Accounting and Finance, Business and Management Studies. Other subject categories could be selected by choosing Other and picking from a drop-down list.

The majority of the response (75%) was from researchers in the field of Economics and Econometrics, which was the target group aimed for. Taken together with respondents from the related disciplines of Accounting and Finance and Business and Management Studies, these subject disciplines represent 84% of the overall response. The remaining 16% of respondents were mainly drawn from physics, library and information management, computer science and informatics, and statistics and operational research. This information can be found in Appendix B.
Subject disciplines | Response | Response by %
Economics and Econometrics (UOA 34) | 347 | 74.8
Accounting and Finance (UOA 35) | 15 | 3.2
Business and Management Studies (UOA 36) | 29 | 6.3
Other | 73 | 15.7
Total | 464 | 100

Table 2: Subject disciplines of respondents

4.4 Research experience

The researchers were asked ‘How long have you been engaged in research’. The result appears to be in keeping with the roles of the researchers. For instance, the majority of those who had reached professorship status (91 respondents) noted that they had more than 10 years of research experience. Similarly, the largest group of those at lecturer/associate professor level (74 of 152 respondents) indicated research experience of 6-10 years. The majority of the PhD students (97 respondents) fell into the category of 0-5 years of research experience. Results are presented in the following 2 tables.

Research experience | Response | Response by %
More than 10 years | 175 | 37.7
6 - 10 years | 141 | 30.4
0 - 5 years (including PhD research) | 148 | 31.9
Total | 464 | 100

Table 3: Research experience of respondents

Role | More than 10 years | 6 - 10 years | 0 - 5 years (including PhD research)
Professor | 91 | 18 | 3
Lecturer / Associate Professor | 61 | 74 | 17
Post doctoral research staff | 16 | 35 | 18
Student (PhD or other research degree) | 1 | 10 | 97
Contract / freelance researcher | 6 | 4 | 11
Not an active researcher | 0 | 0 | 2
Totals | 175 | 141 | 148

Table 4: Research experience by respondent roles

4.5 Countries of respondents

The survey was international and the questionnaire received responses from researchers from 41 countries in total. The majority of the respondents were based in Europe (355 respondents). Eighty (80) researchers were based in the UK, 57 in the USA, 49 in Germany and 40 in the Netherlands. Response by geographic region is shown in the following table and responses by all countries can be found in Appendix C.
Region | Number of respondents
Europe | 355
North America | 66
Latin America | 30
Australia | 4
Rest of world | 9
Total | 464

Table 5: Responses by geographic region

5. Authors: Research outputs

The first substantive section of the questionnaire asked respondents to think about a current or recent typical research project and to say which type(s) of output they expect / hope to produce from that research. They were then asked to comment on which revisions they personally keep from these research outputs.

This pair of questions was designed to test some of the findings from the interviews: that dissemination of research in economics is achieved through a range of publications including conference materials, working / discussion papers, journal articles, chapters, and so on, and that researchers are therefore handling large personal collections of digital objects representing related versions of the same intellectual content. The question about which revisions researchers personally keep was intended to test the impression which emerged from the interviews, that most researchers keep much or most of what they produce and may in some cases be storing up to 60 or so revisions of the same intellectual content.

5.1 Types of research outputs

The researchers were provided with a list of different options from which they were asked to indicate the research output(s) that they expected / hoped to produce from that research. Respondents could select more than one type of output.

The vast majority of respondents (434 researchers) indicated that they expected / hoped to produce an article for publication in a refereed journal. This entirely accords with the findings of the interviews, in which researchers all agreed that journal articles are the pre-eminent type of publication in economics.
The second most common form of research output was conference/workshop/seminar presentation (311 respondents). Conference papers (274) were the third most common research output type selected from the list presented. Three kinds of working paper were also selected: Working / discussion paper (institutional working paper series with no quality control) (123), Working / discussion paper (institutional working paper series with quality control) (172), and Working paper (membership working paper series such as NBER, CEPR, IZA) (87).

Considering all conference material as one type of research output, however, shows that 349 individual respondents selected one or both types of conference material. Considering all working / discussion paper series as one type shows that 284 individual respondents selected at least one kind of working / discussion paper series as a research output type they would expect to produce.

Other was selected by 5 respondents only. The following outputs were given in this category:

• Books for Students (Schoolbooks)
• Instructional Materials
• Report to donor
• Software
• Website

The full list of responses is shown in the table below.
Research outputs expected | Economics and Econometrics (UOA 34) | Accounting and Finance (UOA 35) | Business and Management Studies (UOA 36) | Other | All subjects
Conference paper | 187 | 8 | 20 | 59 | 274
Conference / workshop / seminar presentation | 233 | 11 | 18 | 49 | 311
Working / discussion paper (institutional working paper series with no quality control) | 98 | 2 | 9 | 14 | 123
Working / discussion paper (institutional working paper series with quality control) | 154 | 4 | 6 | 8 | 172
Working paper (membership working paper series such as NBER, CEPR, IZA) | 82 | 1 | 3 | 1 | 87
Journal article in refereed journal | 326 | 12 | 28 | 68 | 434
Journal article in unrefereed journal | 25 | 2 | 6 | 8 | 41
Report for funding body | 36 | 0 | 4 | 20 | 60
Book chapter | 67 | 3 | 4 | 24 | 98
Book | 24 | 1 | 3 | 16 | 44
Dataset | 28 | 0 | 2 | 10 | 40
Thesis | 75 | 5 | 10 | 16 | 106
Other | 2 | 0 | 0 | 3 | 5

Table 6: Expected research outputs – totals and by subject discipline

The expected research outputs for all subjects are also illustrated in the following figure.

[Figure 2: Types of research outputs authors expect to produce. Bar chart titled ‘Expected produced outputs’, plotting the ‘All subjects’ counts from Table 6 for each output type.]

Further analysis was carried out on the responses to see the extent to which researchers produce multiple types of output from the same research project. The results are shown in the table below.
Number of different research output types | Number of respondents | % of all respondents
1 only | 57 | 12.3%
2 or more | 408 | 87.9%
3 or more | 363 | 78.2%
4 or more | 273 | 58.8%
5 or more | 153 | 33.0%
6 or more | 66 | 14.2%
7 or more | 19 | 4.1%
8 or more | 6 | 1.3%
9 or more | 1 | 0.2%

Table 7: Types of research output produced in a typical research project

Over three quarters of researchers typically expect to produce 3 or more types of research output from a project and over half expect to produce 4 or more types of output. Within each of these types, the researcher (and co-authors) could be expected to produce multiple versions and revisions.

5.2 Versions/revisions kept

In answer to the question ‘Thinking about revisions you make to your research outputs during their preparation, which revisions do you personally keep / plan to keep stored in electronic form (eg on your computer or network drive) at the end of the process?’, 90% of researchers replied that they keep either all revisions or the major revisions. A further 8% keep the latest revision only.

This is an encouraging finding regarding the availability of versions of papers that could be deposited in an institutional repository. It appears to eliminate one possible cause for non-deposit of research papers: not having retained a personal electronic copy of one’s work. Only 1 respondent did not keep a personal copy. Results are shown in the following table.

Versions of research outputs kept | Response | Response by %
Keep all revisions | 167 | 36.0
Keep major revisions but not all | 251 | 54.1
Keep the latest revision that I worked on only | 38 | 8.2
Do not keep a personal copy | 1 | 0.2
Don't know | 0 | 0.0
Do not produce research outputs | 0 | 0.0
Other (please specify) | 7 | 1.5
Total | 464 | 100

Table 8: Which versions of research outputs do researchers personally keep?
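As an illustration of how the percentages in Table 8 and the combined 90% figure follow from the raw counts, the arithmetic can be reproduced with a short sketch. The counts are copied from Table 8; the variable and function names are hypothetical and chosen only for this example, and the sketch is not part of the project's own analysis.

```python
# Counts copied from Table 8; all names below are illustrative assumptions.
table8_counts = {
    "Keep all revisions": 167,
    "Keep major revisions but not all": 251,
    "Keep the latest revision that I worked on only": 38,
    "Do not keep a personal copy": 1,
    "Don't know": 0,
    "Do not produce research outputs": 0,
    "Other (please specify)": 7,
}


def percentage_shares(counts):
    """Return each count as a percentage of the total, rounded to one decimal place."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


shares = percentage_shares(table8_counts)
total_respondents = sum(table8_counts.values())

# Combined share of respondents keeping all revisions or the major revisions.
keep_all_or_major = (
    table8_counts["Keep all revisions"]
    + table8_counts["Keep major revisions but not all"]
)
keep_all_or_major_pct = round(100 * keep_all_or_major / total_respondents, 1)

for label, pct in shares.items():
    print(f"{label}: {pct}%")
print(f"All or major revisions kept: {keep_all_or_major_pct}%")
```

The same approach applies to the other single-choice tables in this report, where each percentage is a count divided by the 464 survey respondents.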
The 7 ‘Other’ responses were as follows:

• I SHOULD keep only major revisions, but I am not that systematic.
• Keep all revisions (and only those) that have in some way been seen by others than myself and my co-authors, for example, the revision that has been published as a working paper and the various revisions that have been sent to academic journals for review.
• Keep all revisions : until I get to a milestone e.g. version submitted for peer review, and then I will clear out revision versions up to that key version - keep doing this until at end I have only milestone versions
• Keep all versions until formally produced in which case I will only keep the final version I produced.
• Keep major revisions but not all : keep all the versions I have sent to a refereed journal
• Keep SELECTED revisions: those deemed as significant, stand-alone output
• the version submitted plus the final published version

6. Authors: Papers for submission to refereed journals

The next part of the questionnaire began to focus specifically on researchers’ practice in relation to refereed journal articles only. This was of particular interest to the project, partly because of the importance of this publication type to economists themselves and partly because of the focus of institutional repositories and the open access movement on making high quality refereed research material available to a wider public.

6.1 Level of research activity

Researchers were asked to indicate the number of papers they produced in the past two years that were intended for publication in refereed journals. Two years (rather than one) was proposed as the time period, in order to get a better idea of average output and to reduce the effects of any atypical year.

More than 70% of respondents noted that they were likely to produce from 1-6 papers for submission to refereed journals during a period of 2 years of conducting research.
Of the 52 respondents who had not produced any papers, but expected to do so in the near future, 29 were students. This question confirmed that the survey respondents were genuinely research active and that their replies about attitudes and practice regarding versions of their papers could be held, to some extent, to be indicative of economics researchers.

Number of papers produced in past two years | Response | Response by %
More than 6 | 68 | 14.7
4–6 | 161 | 34.7
1–3 | 180 | 38.8
0 – I expect to produce academic papers in the near future | 52 | 11.2
0 – I am unlikely to produce academic papers | 1 | 0.2
Don't know | 2 | 0.4
Total | 464 | 100

Table 9: Number of papers produced in past 2 years for publication in refereed journals

6.2 Research activity by role

The role of the respondents, and consequently their research experience, is also in line with the number of papers they were likely to produce. The majority of the professors who participated in the survey produced 4 or more papers. Lecturers/associate professors were most likely to produce 4-6 papers, while postdoctoral staff and doctoral students were most likely to produce 1-3 papers for publication in refereed journals over the two-year period.

Therefore, although in terms of advocacy those in the most senior positions are likely to have more research output, and multiple versions of it, available, researchers in all groups are likely to produce, on average, from 1 to 3 papers. Results by role of the researchers are presented in the following table.
Number of papers produced | Professor | Lecturer / Associate Professor | Post doctoral research staff | Student (PhD or other research degree) | Contract / freelance researcher | Not an active researcher
More than 6 | 32 | 26 | 6 | 3 | 1 | 0
4–6 | 51 | 76 | 22 | 11 | 1 | 0
1–3 | 28 | 44 | 34 | 62 | 12 | 0
0 - I expect to produce academic papers in the near future | 1 | 6 | 7 | 29 | 7 | 2
0 - I am unlikely to produce academic papers | 0 | 0 | 0 | 1 | 0 | 0
Don't know | 0 | 0 | 0 | 1 | 0 | 0
No Answer | 0 | 0 | 0 | 1 | 0 | 0

Table 10: Number of papers produced, by role of respondents

6.3 Versions of all research outputs kept, by number of journal articles produced

The number of journal articles produced was cross tabulated against the researchers’ declared practice in retaining personal copies of their research outputs in general. Only one person noted that they do not keep a personal copy of their publications, which suggests that the availability of digital content is not, and will not be, the obstacle to its inclusion in an open access repository. The relations between expected productivity and the availability of different versions are shown in the following table.

Number of papers produced | Keep all revisions | Keep major revisions but not all | Keep the latest revision that I worked on only | Do not keep a personal copy | Don't know | Do not produce research outputs | Other
More than 6 | 22 | 38 | 8 | 0 | 0 | 0 | 0
4–6 | 56 | 88 | 16 | 0 | 0 | 0 | 1
1–3 | 63 | 99 | 12 | 1 | 0 | 0 | 5
0 - I expect to produce academic papers in the near future | 24 | 25 | 2 | 0 | 0 | 0 | 1
0 - I am unlikely to produce academic papers | 0 | 1 | 0 | 0 | 0 | 0 | 0
Don't know | 1 | 0 | 0 | 0 | 0 | 0 | 0
No Answer | 1 | 0 | 0 | 0 | 0 | 0 | 0
Totals | 167 | 251 | 38 | 1 | 0 | 0 | 7

Table 11: Revisions of research outputs kept, by level of research activity

7. Authors: Revising and storing academic papers for refereed journals

Still maintaining the focus on journal articles, the questionnaire went back to the question of which versions researchers are actually keeping themselves (eg on their own computer or network drive). The intention was to drill down very precisely in order to discover whether, in cases where the journal publisher permits deposit of articles in an open access repository, there is a likelihood that the author will be able to produce a usable open access version of their paper.

7.1 Versions of journal articles most likely to be kept by authors

Respondents were asked to say, for each of six different versions of their journal articles, which ones they would personally keep (eg on their own computer or network drive). The six different versions, which more or less represent the stages in a linear process from draft to publication, were:

• Early draft versions (not circulated to anyone, other than co-authors)
• Draft versions circulated to colleagues or peers for feedback before submitting to a journal
• Version submitted to a journal for peer review
• Final author version produced by yourself/co-authors agreed with the journal, following referee comments
• Version produced by publisher – proof copy
• Version produced by publisher – Final published version (often in PDF as it appears in the journal itself)

The possible answers for each version were:

• Keep permanently
• Keep until an updated version was produced (if applicable)
• Do not produce/have this version
• Don’t know
• Don’t produce papers

In line with the results shown in earlier sections dealing with all types of research output, the answers to this question showed that authors do retain personal copies of different versions of their journal articles.
Of particular note is that 91% of respondents say they keep permanently the final author version produced by themselves/co-authors, agreed with the journal, following referee comments. This is the version which many of the key academic publishers will permit authors to deposit in open access repositories, and so this response is an encouraging result for the population of repositories. The full results are shown in the table below.

Which of the following versions of a paper, that you have written for publication in a refereed journal, would you personally keep (eg on your own computer or network drive)?

Early draft version(s) (before circulation to anyone, other than co-authors)
Keep permanently | 39.9% | 185
Keep until updated version produced (if applicable) | 50.4% | 234
Do not produce/have this version | 5.4% | 25
Don't know | 3.0% | 14
Don't produce papers | 1.3% | 6

Draft version circulated to colleagues or peers for feedback before submitting to a journal
Keep permanently | 53.9% | 250
Keep until updated version produced (if applicable) | 38.1% | 177
Do not produce/have this version | 4.1% | 19
Don't know | 2.6% | 12
Don't produce papers | 1.3% | 6

Version submitted to a journal for peer review
Keep permanently | 78.9% | 366
Keep until updated version produced (if applicable) | 16.4% | 76
Do not produce/have this version | 1.9% | 9
Don't know | 1.5% | 7
Don't produce papers | 1.3% | 6

Final author version produced by yourself/co-authors - agreed with the journal, following referee comments
Keep permanently | 90.7% | 421
Keep until updated version produced (if applicable) | 5.8% | 27
Do not produce/have this version | 1.5% | 7
Don't know | 0.6% | 3
Don't produce papers | 1.3% | 6

Version produced by publisher - Proof copy
Keep permanently | 62.5% | 290
Keep until updated version produced (if applicable) | 25.0% | 116
Do not produce/have this version | 7.1% | 33
Don't know | 4.1% | 19
Don't produce papers | 1.3% | 6

Version produced by publisher - Final published version (often in PDF format as it appears in the journal itself)
Keep permanently | 91.8% | 426
Keep until updated version produced (if applicable) | 1.1% | 5
Do not produce/have this version | 4.5% | 21
Don't know | 1.3% | 6
Don't produce papers | 1.3% | 6

Table 12: Versions of journal articles kept by authors

7.2 Accessibility of own versions to their authors

While the response above concerning retention of versions was very encouraging, it is one matter to keep all versions or milestone versions and another to be able to retrieve these easily. In the following set of questions, researchers were asked about their personal information management practices to see how much this might affect the true availability of usable versions of journal articles.

More than half (58%) of the researchers said they have an ‘easily accessible’ copy of all of their ‘final author versions’ among their personal files (electronic or paper). A further 36% indicated that they have an easily accessible copy of most of their final author versions. This again is a very encouraging finding, though it also means that for 42% of respondents there may be some articles for which they do not have an easily accessible final author version.

Easily accessible copy of final author versions | Response | Response by %
All | 271 | 58.4
Most | 169 | 36.4
Some | 18 | 3.9
None | 1 | 0.2
Don't know | 2 | 0.4
Don't produce | 3 | 0.6
Total | 464 | 100

Table 13: Ease of access by authors to their own final author versions of articles

7.3 Reasons for not having access to all versions

Researchers who did not answer that they had all their final author versions easily accessible were invited to provide their reasons from a suggested list. They could select all those that applied.
The list was developed from comments made during the interview phase of the study and from anecdotal reports by repository managers about difficulties in obtaining papers from authors. An option for ‘Other’ responses was also included.

One hundred and fifty six (156) researchers (out of an applicable 190) replied to this question. Not having electronic copies of their own work before a certain date was selected by 90 respondents as a reason for not having all their articles easily accessible. The next most common reasons were that the papers were stored on various servers and would therefore prove difficult to retrieve (30 responses) and loss or damage to their computer system (29 people). It is interesting to note that this pattern of response is followed by all the researchers irrespective of their role.

Other reasons included not having the final version when working with co-authors (26 respondents) and making iterative changes to the manuscript in the later stages, meaning that a ‘final author version’ would have to be assembled (for example to take into account hand amendments made to a proof; noted by 24 respondents).

These results are interesting and provide a number of pointers to repository managers for advocacy and training about personal information management. The lack of electronic copies before a certain date is a problem which should diminish in the future. However, note that software obsolescence was mentioned in the ‘Other’ category.
Reason for not having easy access to own final author versions | Number of respondents who gave this reason | % of total respondents to this question (156) | % of all survey respondents
I do not have electronic copies before a certain date | 90 | 57.7% | 19.4%
I do not have copies produced while I was at a previous university/institution | 19 | 12.2% | 4.1%
I do not have copies of papers that I co-authored, the principal / lead investigator has this version | 26 | 16.7% | 5.6%
Changes to the manuscript are made iteratively between myself and the publisher in the later stages so I would have to assemble such a version | 24 | 15.4% | 5.2%
Loss or damage to my computer | 29 | 18.6% | 6.3%
Papers are stored electronically but would be difficult to retrieve from various servers | 30 | 19.2% | 6.5%
Loss or damage to paper files | 11 | 7.1% | 2.4%
I have discarded print copies of older papers before a certain date and do not have electronic versions | 17 | 10.9% | 3.7%
Other | 19 | 12.2% | 4.1%

Table 14: Reasons for not having easy access to own final author versions

The 19 responses under ‘Other’ are worth reproducing here, as they give a glimpse of how difficult researchers find it to keep on top of this:

All of my papers are stored online by a physics database ([name of database]) (including source files), so there is need to keep them locally. I do keep most locally though.
Answers above are approximate. I would hope to always have a final version but I can't be sure
Changed software and can no longer read electronic versions of old papers
Copies to older papers might be in old computers that I do not have access to anymore.
don't understand question
Files sometimes retrieved with difficulty
have electronic copies in ancient programmes (particularly word perfect with graphics in eps) which may not be supported any longer
Honestly, I am not sure, whether I have electronic versions of older papers, but I think so.
I currently do not have any versions ready for refereed journals
I do not have access to the journal on-line
I do not have papers older than 12 years stored either on harddisk or on CD; I keep those earlier versions, which contain ideas that were cut at later stages.
I don't remember the subarchives where I stored them.
I have moved so much, that my files are a mess!!!
I have paper copies of all my publications, but not electronic copies of older ones.
I kept most copies of the work I did in previous institutions, but these are not readily available
I think in some instances there was not a final electronic version made available by the publisher. This is in particular true for continental journals and book chapters
Lost somewhere on my pc
My files are spread over multiple computers and thumb drives, in multiple geographic locations, sometimes making it difficult to find things.
sloppiness

Finally, it should be remembered that despite the reasons given in this section, the vast majority of respondents said they do have all or most of their final author versions easily accessible.

7.4 Organisation and storage of different versions

Following on from the questions about keeping and retrieving versions of their research papers, respondents were asked how satisfied they were with the way in which they organise revisions and different versions of their own work on their own computer or storage medium. Just under half of the respondents to the survey (49%) said they were satisfied, while a similar proportion (48%) said they were not completely satisfied. This finding suggests that researchers might welcome suggestions for, or alternatives to, the way they manage their digital content at present.
Satisfaction with storage and organisation | Response | Response by %
Yes | 229 | 49.4
No, not completely | 222 | 47.8
Don't know | 12 | 2.6
Don't produce research outputs | 1 | 0.2
Total | 464 | 100

Table 15: Organisation of different versions of papers

Those who expressed their satisfaction with their personal information management strategy were invited to describe the system they use and how it helps them to organise their files. Those who said they were not completely satisfied were invited to describe some of the difficulties that they face when organising their own revisions of documents. These optional questions were rather popular, attracting 134 and 147 responses from applicable groups of 229 and 222 respectively. The full responses can be seen as Appendix D and Appendix E.

7.5 Organising own versions – successful strategies

Those who answered yes to whether they are satisfied with the way they store revisions to their work were invited to describe briefly the system they use. Selected responses are grouped below in categories. Many respondents used several techniques and typically had invested some time in thinking about and planning how to store their work. For the full list of replies, see Appendix D.

By project
‘Different folders for each paper, and sub-folders for different stages of the paper and a specific folder for published papers.’
‘By project, then by date subfolders… I also name files with a date extension, reflecting the most recent revision.’
‘For each paper, one file. For each version, date of production.’
‘Prepare a folder for each article, including dataset and commands list.
There I also save the versions, giving different names to files only if the changes are substantial, and only once the work is fully completed.’
‘Separate folder for each publication, separate subfolders for each version, storing all data (including figures etc.)’
‘Separate file folders for different projects, versions are always dated in the filename and usually the journal name is added to the file name.’

By date
‘Working drafts are stored in dossiers by year / subdossier by the name of the paper. They are not updated and kept as is. Final versions (PDF and latex) are stored in the special dossier where there are two subdossiers: pdf and latex. These files are updated and replaced by newer ones.’
‘Every filename includes a date eg. hello060518. That way it is easy to find the latest version among co-authors and for myself.’
‘Every time I make a new version, I save the version with the same name, except for the date. So when I continue I open the last version.’
‘I save each article by name and date (plus a, b, c etc. if several in one day).’
‘Windows and word. Just sort files by date.’
‘Though my files are not always easily accessible they are dated for version control.’
‘The name of the version contains the last date of activity on that paper.’

Version control by number
‘name_of_file_version_number, the highest version number is the latest version.’
‘I number succeeding drafts of my work and the final one I give a title with the word “final” in it.’
‘I use numbers as used for software versions eg. 2.2 for second revision of first major revision.’
‘I use version numbers, eg. “paper 2.1.doc”, changing the second number with each edit of any significance and the first number if there is a milestone in the process – team review / change of direction etc. I keep the milestone versions in a backup folder within the main folder that the document is developed in.
When working with colleagues I ask them to include their initials with the version number so it might go “paper CR2.1”, paper “FB2.2” etc.’
‘For each project I use a numbering system projectname_xxx.ext. Any revision by any coauthor increments the number.’
‘Filenames with subsequent versions contain the version number, so that I can easily sort them in my file manager, and see which is the last one. For example: thesis01.pdf thesis02.pdf thesis05.pdf etc’

Use of version control systems or computer file systems
‘Version control system (CVS or subversion).’
‘Subversion versioning software, running on a university server that is backed up daily.’
‘SVN archive containing both LaTex documentation and R code / datasets from computations. SVN stores all revisions automatically, earlier versions can always be retrieved.’
‘I use BSCW, gmail (store them into emails), my PCs at home and at work, so I have 4 backups. BSCW has the best way to keep in mind what is the current version. This is my primal organiser.’
‘Computer file system, BSCW.’
‘Spotlight on Mac OSX.’

Retain the latest version only
‘I usually throw away everything as soon as I have a new version. Unless, ie. the new version is in another language or has some substantial changes in it, so that I may need the first version for some other purpose.’
‘Keep the latest version only.’

7.6 Organising own versions - difficulties

Respondents who answered that they were not completely satisfied with the way they organised their own revisions of documents were invited to note any of the difficulties that they faced during this process. Selected replies have been grouped below in categories. For the full list of responses, see Appendix E.
Changing between computers
‘Maintaining coordinated archives between multiple machines.’
‘Synchronization of laptop and desktop hard drives.’
‘Determining the chronological order of the versions; maintaining a common site for all versions – I work on a desktop in the office, a desktop at home, a laptop, and keep files on USB key memory devices as well.’
‘I am using two different computers for my work. As a result I sometimes have my work at different stages on the two computers. I would like to find an easy way to both systems up to date at any point in time.’
‘Problems arise because of using multiple machines – use stick to transfer but sometimes versions get muddled! Need a virtual storage space accessible from anywhere and one which has auto back up!’

Keeping too many versions
‘I keep some useless versions.’
‘Sometimes I have too many copies. Difficult to remember which one was a major revision.’
‘I revise many times and am always afraid to lose bits that are removed from the paper. So, I store everything “just in case” but get lost in the end when I want to recover that. This is even more difficult when I have co-authors.’
‘I think that I keep too many old versions. I like to keep several while working on the paper in case files get corrupted etc but I seldom go back afterwards and delete all unimportant versions.’
‘Spending some time on deciding what to keep and what to drop would be more efficient.’

Co-authoring
‘When I work with co-authors who have different archiving or identifying techniques it makes it hard to keep a clear path of revisions.’
‘The problem is that co-authors sometimes do revisions on the wrong version. We don’t agree which is the latest version.’
‘Generally, I keep them all in a folder, with versions organised by date the revision was completed, so it is easy to know which is the most recent.
However, my co-authors do not always follow this naming system, so sometimes when I get a revisions back from them via email, if I don’t save it and rename it right away, it is hard for me to locate the most recent version when I go back to work on it.’

Knowing or indicating the differences between versions
‘Could do with formal version control system listing changes, rather than tracking versions by date.’
‘It is difficult to keep track of revisitions that at first may seem to be minor but may develop into major changes.’
‘Difficult to see which version is the latest when there are only a few minor differences.’
‘I have developed a system whereby every iteration of a paper is identified by a paper name and date (and, if co-authored, by which of us has done the iteration). What I am not so efficient at is distinguishing between versions with limited differences and those where substantial changes have been made.’
‘Sometimes I make alternative versions of the same paper (horizontal versions) and find it difficult to acknowledge which specific changes I have made in each version.’

Inadequate naming systems
‘I do not have a consistent renaming system.
This causes major problems in finding the correct version after a long period.’
‘I keep versions of different stages of each research project, and am not happy with my labelling scheme which often does not reveal which version contains what.’
‘- consistent naming - documents spread over multiple computers (office, laptop) - different word processors (LaTex, Word) and types (pdf) - retrieving last version, retrieving submitted version (if changes have been made)’

Mixing / confusion of versions
‘Sometimes versions get mixed up, and then I do not know which version I sent out to a journal (for example).’
‘Accidental editing of an earlier version, confusing the date order; co-authors mixing versions so non-tested variants occur.’
‘no consistent description of version status; not sure who has a copy of each version; different version control systems used over time.’

No logical way of labelling versions
‘Sometimes there is no clear ranking of versions because they are produced to match different requirements (conference, journal submission…)’
‘Sometimes it is difficult to keep track of different versions and sometimes a similar paper belongs to a project and to a conference etc. and I do not know where to file it. Also, I am never sure how many versions to keep – I do not want to lose important information but an information overkill may be a problem as well.
Especially, after some time has passed I might not remember which are the relevant versions and what are the differences.’

Don’t retain / ever possess required versions (2)
‘I’d like to have version published… A few times I have noted that the final draft I keep is slightly different (some sentences) from the one published.’
‘Final version is often still not camera ready ie graphics and figures are often submitted electronically as separate files (in addition to a text file) to the publishers so no actual 'as submitted' final version exists which can be easily submitted to a repository.’

Lack of time and/or lack of skills for information management
‘I am a bad information manager - something else always captures my imagination before I get around to it’
‘Sometimes it gets messy !’
‘time constraints’
‘i am chaos’

8. Authors: Responsibility for secure storage of different versions

The questionnaire then moved on from asking authors about their own practice in creating and storing versions of their work to investigate their attitudes towards the responsibility of others for long term secure storage of versions.

Throughout the lifecycle of a research output’s production, the majority of respondents felt that the storage of their documents was mainly the responsibility of the authors and co-authors. This was particularly so during the early stages of producing a paper for submission to a refereed journal. From the moment the paper is submitted to a journal for peer review the responsibility largely remains with the authors of the paper, though some respondents (116) indicated an increase in the publishers’ responsibility for storage. At the publisher’s proof stage the emphasis shifts from authors (156 respondents) to publishers (243 respondents).
For the final version produced by the publisher for publication, the vast majority (434 respondents) felt that publishers have a responsibility for secure storage of research outputs. However, the role of the authors’ universities/institutions (including libraries), and a notion of shared responsibility amongst all interested parties, were also highlighted. Interestingly though, at no stage in the lifecycle of a journal article was the author’s university/institution (including library) seen as the most important actor in secure storage of academic papers. The stage at which the institution was seen as having the most significant role was in secure storage of the published version, but even in this case the publisher was seen by more respondents as having a role to play. Equally startlingly, 102 respondents saw no role for the author’s university/institution in secure long term storage of any versions of academic papers, including the published versions. Subject repositories were not seen as having a major role to play, though perhaps the high number of ‘Don’t know’ responses indicates that the term ‘subject repositories’ was not well enough explained.
Responsibility for secure storage | Authors / Co-authors | Authors’ universities / institutions (including libraries) | Publishers | Subject repositories
Early draft version(s) before circulation (other than to co-authors) | 181 | 13 | 9 | 14
Draft version circulated to colleagues or peers for feedback before submitting to a journal | 242 | 40 | 6 | 47
Version submitted to a journal for peer review | 340 | 46 | 116 | 47
Final version produced by yourself / co-authors - agreed with the journal, following referee comments | 349 | 81 | 168 | 62
Version produced by publisher - Proof copy | 156 | 47 | 243 | 33
Version produced by publisher - Final published version (often in PDF format as it appears in the journal itself) | 239 | 305 | 434 | 257
None of these | 18 | 102 | 7 | 35
Don't know | 9 | 16 | 9 | 135

Table 16: Responsibility for the secure long term storage of versions

Fifty-seven (57) respondents answered an optional follow-up question inviting them to comment further on responsibility for secure long term storage of versions of academic papers. These comments are shown in full as Appendix F. Some respondents noted the role that subject repositories such as RePEc already play in this area, and raised questions about the roles that all parties (universities, authors, repositories, publishers, etc.) should assume in the process of preserving and sustaining digital material. One respondent in particular encapsulated the above in the following statement:

‘… it depends on the function of the repository. If it is an archival repository of the publisher, of the university, etc. the level of detailed and complete information should be higher than in the case of a repository of eprints. I think that the debate in this area is very confusing because it does not make any difference in the final function of the repository involved. Even if the digital environment creates convergence the distinction should be defined at a logical level to ensure a clear framework of the responsibilities involved and the policies to develop’.
9. Authors: Dissemination of research outputs

The next main section of the questionnaire asked respondents to continue thinking in their role as authors and to consider a series of questions relating to their work and digital repositories.

9.1 Digital repositories

The first question in this section asked about awareness of a university digital repository where the researcher could deposit their papers. A couple of links to institutional repositories were provided so that respondents could check that their understanding of the term ‘digital repository’ was correct. One third (33%) of the researchers said that their university did have such a repository. The remaining two thirds of the respondents noted either that there was no such service (43%) or that they did not know whether there was one (24%).

More than half of the researchers who knew that they have access to a digital repository in their university replied that they submit their papers there. However, this represents only one fifth (90 researchers) of the overall response. Advocacy about the availability and use of digital repositories, and about libraries’ role in providing this service, therefore remains important.

Availability of a digital repository at the institution | Response | Response by %
Yes | 152 | 32.8
No | 200 | 43.1
Don't know | 112 | 24.1
Total | 464 | 100

Table 17: Awareness of institutional repositories

Researchers who have a digital repository at their institution but who have not deposited any papers were invited to say briefly why this was. Thirty-eight (38) replies were received and the reasons have been summarised in the following table.
Reasons for not depositing papers in institutional digital repository | Response
Time taken or lack of time to learn how to do this | 10
Didn’t know about it | 4
Use personal website/prefer personal communication | 3
Interferes with later publication or with anonymity of peer review | 2
Satisfied with publication of journal and/or subscription working paper series | 2
Author/institution reluctant to disseminate preliminary versions | 2
Newness (of author at institution or of repository) | 2
University’s back-up systems are sufficient or all that is available | 2
Concerns about lack of security | 2
Required final author version does not exist as an easily submissible file | 1
Concerns about motive of university – surveillance of outputs | 1
Complexity of system | 1

Table 18: Reasons for not depositing papers in institutional repository

The main category of reasons given can be seen as a lack of time on the part of researchers, and points to a need for further promotion and simplification by the repository management (to address time, lack of awareness and complexity). In many cases researchers sounded interested, saying that they hadn’t got around to it. However, one of the responses citing time as a factor noted:

takes as long as this survey to submit a paper into the university repository with questiona[b]l[e] benefits; strongly prefer international preprint servers.

There were a few other specific concerns which indicated reservations about the idea of institutional repositories or about the practicalities. Some of these are relevant to the question of versions.

But again I try to avoid it because they are made available on the web and I don't like that referees can identify the author in this way.
Does not bring anything, as there is no feedback given upon "publication", and some journals are reluctant to consider papers that have been made available in such a way.
Institution is hesitant to have incomplete or soon to be revised work stored permanently.
The University asks for a final as published version but NOT the version which is typeset in the journal. Often no such document exists as an easily submissible file.

9.2 Authors’ intentions regarding deposit

Despite the actual rate of submissions noted in the previous section, 81% of respondents said they would be willing to deposit if invited. In this carefully worded question about author attitudes we asked ‘If your university invites you to place a copy of your paper in the institutional repository and requests from you the ‘final author version’, would you provide this version?’ The definition of ‘final author version’ which had been previously provided was repeated here to be sure that this was understood. The positive response is very encouraging for the population of institutional repositories and suggests that, on the whole, the concerns mentioned in the previous section by a minority of respondents are not what holds most people back from providing an author version of their work.

Willingness to provide final author version | Response | Response by %
Yes | 375 | 80.8
No | 25 | 5.4
Don't know | 60 | 12.9
Don't produce papers | 4 | 0.9
Total | 464 | 100

Table 19: Willingness to deposit final author version in IR if invited to do so

9.3 Authors’ attitudes regarding deposit – further exploration

Authors’ attitudes were further explored by presenting a series of statements about providing final author versions of their papers and asking respondents to say to what extent they agreed or disagreed. The statements were developed from some of the findings of the interviews and from other concerns reported by repository managers. The different statements were:

I am willing to provide this version – it helps me to disseminate my work quickly
I am willing to provide this version on condition that readers are made aware that it is not the published version
I am willing to provide this version on condition that a link to the published version is provided
It would take too much time for me to provide this version
I consider this version to be inferior to the publisher PDF version
I place the publisher PDF version on my personal website as my first priority, if permitted
I am willing to provide an author final version to a fellow researcher if requested by email
I am concerned that I might lose citations to the published version if I provide my final author version
I am unsure whether the publisher copyright agreement permits me to provide this version
I intend to provide such versions in the future

In general, respondents had a positive attitude towards the provision of final author versions of their papers. A total of 396 respondents (85%) strongly or slightly agreed that they would be willing to provide such versions and that this helps them to disseminate their research quickly. The level of agreement with the related statement ‘I intend to provide such versions in the future’ was rather lower, at 304 respondents (66%). This perhaps confirms the importance for repository managers of establishing workflows which make it easy for researchers to deposit. It is encouraging to see that there was disagreement with the statement ‘It would take too much time for me to provide this version’, with 371 (80%) strongly or slightly disagreeing. Of these, 264 (57% of all respondents) strongly disagreed that it would take too much time.
The high level of agreement with the statement that authors would provide a copy to a peer if requested by email (424 respondents, 91%) shows the importance for researchers of personal communication as part of their networking and scholarly communication, and the extent to which they welcome such approaches from fellow researchers in their field. They also agreed that they would be happy to provide the final author version provided that readers are made aware that this is not the final publisher version (84%) and that a link to the publisher version is also provided (78%).

Concerns about the quality of the final author version as compared with a publisher PDF version were explored through a pair of statements. 50% strongly/slightly agreed with the bare statement ‘I consider this version to be inferior to the publisher PDF version’. 73% agreed that their preference is to place a publisher PDF on their website, if permitted.

Uncertainty about what the standard author-publisher agreements permit is a recurring concern expressed by authors, and this is borne out by the 68.3% who strongly/slightly agreed that they are unsure whether the publisher copyright agreement permits them to provide their final author version. If the 72 ‘Don’t know’ responses to this question can also be taken as a measure of uncertainty, it could be said that 84% of respondents were unsure about this issue. The SHERPA/RoMEO service is of course helping to clear up a great deal of uncertainty in this area, though this reply still points to a role for individual repository managers in promoting this valuable service to their authors and in educating authors about negotiating their contracts with publishers.
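The percentages quoted for the copyright statement can be rechecked from the response counts. The following is a minimal sketch only, assuming the counts reported in Table 20 (317 agreeing, 72 ‘Don’t know’) and the survey total of 464 respondents; the function and variable names are illustrative and are not part of the survey analysis itself.

```python
# Illustrative recalculation of two percentages from the survey counts.
# The counts come from Table 20; all names here are hypothetical helpers.

TOTAL_RESPONDENTS = 464  # all survey respondents


def pct(count, total=TOTAL_RESPONDENTS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


unsure_agree = 317      # Strongly/Slightly agree: unsure whether copyright permits
unsure_dont_know = 72   # 'Don't know' responses to the same statement

agree_pct = pct(unsure_agree)
combined_pct = pct(unsure_agree + unsure_dont_know)

print(f"Agree: {agree_pct}%")
print(f"Agree or Don't know: {combined_pct}%")
```

Under these assumptions, the combined figure rounds to the 84% cited in the text.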
Responses were divided on the statement about a perceived loss of citations to the published version if the final author version were made publicly available in a repository, with 42% agreeing and 46% disagreeing.

Extent to which authors agree with these statements about providing final author versions of papers

Statement | Strongly/Slightly agree | Slightly/Strongly disagree | Don't know | Don't produce papers
OK - helps me to disseminate quickly | 396 | 51 | 14 | 3
OK - provided readers aware not published version | 388 | 48 | 24 | 4
OK - provided link to published version | 360 | 71 | 29 | 4
Would take too much time | 57 | 371 | 29 | 7
Consider this version inferior to publisher PDF version | 230 | 202 | 28 | 4
Place published PDF on personal website as my first priority, if permitted | 337 | 88 | 35 | 4
Provide to peer on email request | 424 | 26 | 11 | 3
Concerned about loss of citations | 193 | 213 | 52 | 6
Unsure whether copyright permits | 317 | 72 | 72 | 3
Intend to provide in future | 304 | 46 | 111 | 3

Table 20: Author attitudes towards providing final author versions to IRs

The Strongly/Slightly agree responses are shown in the table below, ranked in order.

Statement | Strongly/Slightly Agree | Response by %
Provide to peer on email request | 424 | 91.4
OK - helps me to disseminate quickly | 396 | 85.3
OK - provided readers aware not published version | 388 | 83.6
OK - provided link to published version | 360 | 77.6
Place published PDF on personal website as my first priority, if permitted | 337 | 72.6
Unsure whether copyright permits | 317 | 68.3
Intend to provide in future | 304 | 65.5
Consider this version inferior to publisher PDF version | 230 | 49.6
Concerned about loss of citations | 193 | 41.6
Would take too much time | 57 | 12.3

Table 21: Attitudes towards deposit - ranked

9.3 Other dissemination routes

The questionnaire at this point moved on to discuss the ways in which researchers already disseminate their own research outputs.
Some feedback received when working with economists on adding their papers to the institutional repository is that they perceive no additional benefit, because of the existence of international preprint servers such as RePEc and subscription-based networks such as SSRN (though SSRN is free for individual researchers to deposit in). Subject-based collections such as these are important because this is where networking takes place in economics.

Respondents were invited to say whether, in addition to formal publication in refereed academic journals and dissemination in university/institutional collections, as described above, they disseminate their own research findings, in full text, through any of the following channels. The channels suggested are shown in the figure below and researchers were able to select all those that applied.

The most popular dissemination channel was a personal website or home page, with 301 (65% of all survey respondents) saying they used this method. Use of the RePEc service was lower, but with 199 out of 347 Economics and Econometrics respondents using it, the rate is still high at 57%.

Dissemination channel | Number of respondents | As % of all Economics/Econometrics respondents (347)
Personal website / homepage | 234 | 67.4
University website for working paper or discussion paper series | 214 | 61.7
REPEC (IDEAS, EconPapers) | 199 | 57.3
SSRN | 160 | 46.1
Don’t produce research outputs | 3 | 0.9
Other | 29 | 8.4

Table 22: Other dissemination channels used - Economics/Econometrics

Other dissemination routes suggested by economics researchers were conference websites, email communication with colleagues, and working paper series such as NBER or CEPR. Once again the questionnaire had not provided an option to answer None, so 11 responses in the Other category are accounted for by this answer.
This question attempted to discover the extent to which researchers are already engaged in promoting their work and making it accessible online. The results do show that a majority of authors are using 2 or more channels, as shown in the table below, with around half disseminating their work through 3 or more channels (in addition to publication in journals and any deposit with institutional repositories).

Number of dissemination channels used by economics authors | Response | Response as % of all Economics/Econometrics respondents (347)
5 | 2 | 0.6%
4 or more | 61 | 18.2%
3 or more | 171 | 49.3%
2 or more | 253 | 72.9%
1 or more | 332 | 95.7%

Table 23: Number of dissemination channels used by economists

The overall figures for use of different dissemination channels are shown in the figure below.

[Bar chart: Other preferred dissemination routes, number of respondents per channel. Personal website / homepage 301; University website for working paper or discussion paper series 256; REPEC (IDEAS, EconPapers) 209; SSRN 181; Other (please specify) 60; Don't produce research outputs 8]

Figure 3: Other preferred dissemination routes – all subjects

9.4 Role for libraries/institutions in assisting with dissemination

Given that a majority of researchers are working to promote their research through a number of different dissemination channels, the next question asked whether there was a role for universities / institutions to play in doing this on authors’ behalf. The majority of researchers (76%, counting the ‘Very useful’, ‘Useful’ and ‘A little useful’ responses) indicated that it would be useful if their institution could deposit their research on their behalf. This result indicates that academic staff and researchers are aware of the visibility they can get on the web and that they would appreciate some assistance with the administration required to promote their work.
Repository managers can take note of this as a potential way to provide additional services connected with the institutional repository.

Attitudes towards university/institution depositing research outputs on behalf of authors   Response   Response by %
Very useful                                                                                 179        38.6
Useful                                                                                      140        30.2
A little useful                                                                             33         7.1
Not useful                                                                                  31         6.7
Don't know                                                                                  9          1.9
Don't produce research outputs                                                              2          0.4
N/a                                                                                         70         15.1
Total                                                                                       464        100

Table 24: University/institution depositing research outputs for authors - usefulness

10. Authors: Attitudes towards making versions of research outputs openly accessible

Following the questions about how researchers disseminate their work through channels that are already familiar to them, they were invited to consider again the question of versions of their work and to reflect on which versions of their work they are interested in making openly accessible to the general public, if permitted.

10.1 Versions of academic papers authors are interested in making openly accessible, if permitted

The vast majority (385 people) noted that their preferred version would be the version produced by the publisher and published in a refereed journal. The second most preferred version (274 people) would be the final version produced by the author(s) and in agreement with the journal, following referee comments. It was interesting that 116 respondents were even interested in making available an early draft which they have circulated to colleagues or peers for feedback, before submitting it to a journal.
[Bar chart: Which of the following versions of your academic papers are you interested in making openly accessible to the general public, if permitted. Categories: Draft version circulated to colleagues or peers - before submission; Version submitted to journal for peer review; Final author version; Publisher proof; Published version PDF; Don't know; Don't produce papers. Bar labels: 385, 274, 191, 116, 100, 3, 3.]

Figure 4: Which versions authors are interested in making openly accessible, if permitted

Additional comments were invited on this question and 44 responses were received, which are summarised in the following table.

Further comments about making versions of academic papers openly accessible   Response
Supporting Open Access                                                        10
Prefer latest version                                                         7
Prefer published version only                                                 3
Role of working papers                                                        4
Citation problems                                                             3
Copyright/IPR uncertainty                                                     2
Miscellaneous                                                                 9
N/A                                                                           6

Table 25: Further comments by authors on making versions OA

Comments on open access included:

    I am completely committed to the principle of open access and would always do as much as I can to share my work and encourage others to likewise.
    Information wants to be free
    this should be the norm unless VERY good reasons can be given for not doing so
    Not everything is published in journals, so earlier versions could be made open to public on website
    I am very much in favour of this

One researcher was in favour of open access to all versions, but perhaps only after an interval of time:

    Depends on timing (e.g., 50 years on is fine)

Those who view the latest version as the best were divided between those who were happy to make work accessible as it goes along, replacing papers with later versions, and those who do not wish to make earlier versions available at all.
Those who preferred to use only the published version also want to see the best quality version used, but accept that copyright agreements do not always permit this:

    I want to make them accessible at each stage, but not multiple copies at once. That is to say, once a new version is available, I would like to remove the old versions from the public domain to keep clear the definitive version to date
    I prefer that all major versions should be preserved and accessible, but only the latest and best one visible -- just like arXiv
    My research is very original and I do not necessarily want others to see how my thoughts developed to the final published form: if I was happy to publish the earlier, inadequate versions I would have published them, but I hold off until I have a final version that I am content/happy with
    I don't see the point of several different versions of the same paper, people create the final version because it's the best version
    I would prefer that after publication the only version available was the final published version, both in a repository and in the journal, however, I understand that some publishers do not permit this and so therefore have made the proofs available

A few comments were made about the role of working paper series:

    Sometimes, I publish an early draft as a technical report while being reviewed for a conference. The early draft is typically lengthier and less polished. It is not a 'version' of the published paper. It is more an accompanying paper.
    My experience is that students tend to rely on working paper (and similar) version, probably due to being unfamiliar with journals (or for reasons of convenience). thus I strongly prefer pre-print versions to be explicitly linked to the published version.
    as a second best I, of course, prefer students reading pre-p[r]int versions rather than no academic papers
    It is common knowledge which are the accepted secondary channels (working paper series) and which are not. If the paper can get into one of those it is as good as publication. But again I don't like to be pres[s]urised into publishing working paper series in those that are not generally accepted as of high standard. In that case it is more of an obstacle. So it really depends on the school / department and its international standing - e.g. CEPR, NBER, LSE...

A couple of comments were made about the refereeing process, one of which was also concerned with the role of working papers:

    It really depends on the value-added of the refereeing process. If the referee made the paper slightly worse as a condition of an ACCEPT recommendation, I might very well want the better (i.e., better in my opinion) version available to the community. The main advantage of the publisher version over the final author version, as far as I can tell, is ease of citation and typesetting.
    I think it's important to make the submitted (working paper) version available as the refereeing process occasionally requires the author to remove important material, and this material needs to be accessible somewhere.

A few of the researchers raised concerns about confusion over citations when more than one version of the same paper is available:

    A problem is that basically the same paper gets cited in different ways which can be confusing
    In order to avoid intel[l]ectual property conflicts, extreme care should be put regarding the credits, dates, etc.
    Public repositories should be able to produce a cover page with a copyright statement and the full reference to the published journal version to avoid miscitations of a paper and to follow some copyright transfer agreements.
Other miscellaneous single comments included:

    I usually don't get PDF-files from my editors to put on my website; I only get paper versions from proof reading on.
    Would be perfect to have a place to put all the paper easier than the ones that exist.
    All versions should indicate status and possibly contain links to later versions.
    avoid having too many channels like SSRN, REPEC etc that seem to provide the same service
    I'm a bit puzzled at the emphasis on the distinction between "final author version" and "final published version". Surely it's just a question of formatting. Is there any reason why one should prefer the latter to the former?
    I find that the copyright rules (a) are complicated and time-consuming to understand, and (b) inhibit dissemination of my work. I find myself risking breaking the rules/consciously breaking them in order to display published work on my web page, with the intention of pleading ignorance and taking them down, should I get challenged for this. The rules should be simplified and better promulgated, authors should retain copyright, they should be able to do what they want with their own work.

10.2 Authors’ understanding of which versions of their papers they are allowed to disseminate

While the previous questions explored authors’ interest in disseminating different versions of their work, the present question aimed to discover the level of understanding researchers feel that they have about what is permitted. We asked ‘Do you feel that you have a full understanding of which versions of your academic papers (intended for publication in refereed academic journals) you are allowed to disseminate in full text, in which locations and at which times?’ This question was optional and attracted 414 responses.
Only 11% of respondents to this question indicated that they have a full understanding of which versions of their papers intended for publication they are allowed to disseminate in full. A large number of the respondents (87%) said they have either some, limited or no understanding of this. Once again this does point to a role for IR managers and others involved with open access digital repositories in further explaining to authors what their rights are when they negotiate their contracts with publishers.

Level of understanding           Response   Response by %
Full understanding               46         11.1
Some understanding               136        32.9
Limited understanding            182        44.0
No understanding                 40         9.7
Don’t know                       10         2.4
Don’t produce research outputs   0          0.0
Total                            414        100

Table 26: Level of understanding of versions allowed for dissemination

11. Researchers: Finding multiple versions of the same academic paper

Addressing researchers now as readers/seekers of academic papers, we asked a pair of questions about their experience of searching for papers and the effect on them, if any, of finding multiple versions and/or copies online.

Researchers’ replies indicate that it is not unusual for them to find more than one version of a research paper available online. More than half of the researchers (54%) noted that they frequently or very frequently find several versions of the same research paper available online. Only 5% replied that they have never come across such occurrences and only 2% of the researchers replied that they were not aware if this was happening.
Frequency of finding more than one version   Response   Response by %
Very frequently                              77         16.6
Frequently                                   172        37.1
Sometimes                                    184        39.7
Never                                        23         5.0
Don't know                                   8          1.7
Total                                        464        100

Table 27: Researchers’ experience: finding more than one version/copy online

Furthermore, 54% of all respondents stated that if they find multiple versions and / or copies of the same work, it is generally quick and easy to establish which one they wish to read. However, 41% of respondents noted that it is not easy to do so.

Ease with which version wanted is identified   Response   Response by %
Yes                                            251        54.1
No                                             188        40.5
Do not find multiple versions / copies         25         5.4
Total                                          464        100

Table 28: Researchers’ experience: ease of identifying the preferred version

Respondents who said it was not generally quick and easy to identify the version they wish to read (188) were asked to say what difficulties they had faced, from a suggested list, shown in the table below. They were allowed to select more than one.

The most common difficulty that the researchers noted when they find several versions of a paper online is knowing whether they have found the latest (most recently issued) version (161 respondents). Other concerns related to uncertainty about whether there is a published version, which may often also be the latest version (124 respondents), and knowing the difference between the content of one version and another (106 respondents).
Difficulties faced in identifying versions                                                Number of respondents
Knowing whether there is a published version                                              124
Knowing if I have found the latest (most recently issued) version                         160
Knowing which version is most authoritative                                               71
Knowing the difference between the content of one version and another                     106
Knowing whether I have found all of the versions / copies that could be available to me   53
Time taken to look at different versions                                                  91
Other                                                                                     6

Table 29: Researchers: difficulties faced in identifying versions

Among the ‘Other’ replies the following issues were raised:

    How to refer to the paper if submitted or in press
    Knowing if I have an access to the versions
    Sometimes even the publishing date is not reliable
    Author(s) contact with no answer

12. Researchers: Citing papers found in full text online

Queries and uncertainties among authors about the possible loss of citations if they place earlier versions of their work online do come up from time to time. Therefore researchers were asked to indicate their attitudes towards citing others’ work through a question worded as ‘If you read an earlier version of a paper that has been published in a journal, how would you prefer to cite it?’ with possible responses to be selected from the list shown in the table below.

The majority (73%) said that they would cite the published version only, even though they had read the earlier version. The next most popular choice (12.5%) was to cite the earlier author version they have read, while a small number (7%) went for citing both published and earlier author version. This response should be encouraging to cited authors as it suggests that in 80% of cases they will receive citations for their published version.
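The 80% figure combines two of the response options in Table 30: citing the published version only, and citing both the published and the earlier author version. As a minimal sketch of that arithmetic, the Python below recomputes the shares from the response counts reported in Table 30; the dictionary keys and the `share` helper are illustrative names, not part of the survey.

```python
# Response counts as reported in Table 30 (all 464 survey respondents).
# The key names are illustrative labels, not the questionnaire wording.
responses = {
    "published_only": 339,
    "published_and_earlier": 33,
    "earlier_only": 58,
    "no_citation": 22,
    "dont_know": 12,
}

total = sum(responses.values())


def share(count, base):
    """Return count as a percentage of base, rounded to one decimal place."""
    return round(100 * count / base, 1)


# Respondents whose citation would include the published version.
published_cited = responses["published_only"] + responses["published_and_earlier"]

print("Total respondents:", total)
print("Published version only (%):", share(responses["published_only"], total))
print("Published version cited (%):", share(published_cited, total))
```

On these counts the combined share is just over 80%, which agrees with the rounded figure quoted in the text.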
Preferred cited version                                                           Response   Response by %
Cite the published version only, even though I have read the earlier version      339        73.1
Cite both the published version and the earlier author version that I have read   33         7.1
Cite the earlier author version that I have read                                  58         12.5
Do not cite any version of the paper if I have not read the published version     22         4.7
Don't know                                                                        12         2.6
Total                                                                             464        100

Table 30: Reference practices when citing papers

Respondents were invited to make further comments on this issue and 87 responses were received, many of which clarified that researchers do spend time reading and checking both published and earlier versions in case any changes have been made. They noted that they would cite an earlier version if the content was different to the final published version. Also, they noted that due to the current practice of citations and their role in denoting impact they would ensure that they would credit the authors of the final publisher version. In general, they stressed the importance of citing accurately, tracking down and reading any version – preferably the final publisher version – before citing in order to adhere to quality standards. Selected comments are shown below:

    The standard is to cite the published version, but it might make sense to cite both.
    Sometimes the working paper version is more extensive; in that case, I might cite that one, depending on the circumstances.
    I would qualify this answer by adding that sometimes, a journal paper makes use of an accompanying working paper to develop at length particular arguments or present data etc. In this case, I might refer to both versions. Otherwise, I would always prefer to reference the peer-refereed version.
    I prefer to cite the published version, but I may want to cite material in the working version that was excluded from the published version.
    I would cite both if there were large differences or the working paper had extensive appendices that were useful - increasingly common practice.
    I would try to get the published version anyway and therefore only cite the pre-pub.-version if there are some differences that are important for my own work that are not included in the published version
    It is very important that the reader can understand the status of the paper, only cites what they have read and makes that explicit in their citation eg by including "(online pre-press version)" However authors can help, I have posted pre-press versions on my website and stated that the final version will only differ in minor typographical changes, the content will not change in a substantive way. Even then it is best for the reader to cite what they have read.
    I usually cite the published version, unless the published version no longer includes the material that I am particularly referencing. In which case, I cite the working paper and note the published version doesn't contain the material of particular interest to my work
    Cite the published version and, if it not OA, the best OA copy generally available.
    Depends on differences between versions.
    I always read the published version if this is available
    Try to get a hold of the published version and cite that one. If the published version is, for example, clearly shorted than the earlier version, might have to cite both.
    I would check the published version to verify that the cite applied. In case it didn't, I would cite the earlier version.

13. Researchers: Persistence – long term availability of academic papers

About half of the respondents noted that it is important to them that the version of the paper they have cited (and is available online), remains available and accessible online.
Although the location appeared not to be important to the respondents, having quick and easy access to the information, preferably via a search engine, was raised by a few as an important feature. Comments that the respondents made raise questions regarding the preservation of the material that is made available online and its sustainability. The respondents made several comments in this regard including some about the role of repositories, eg:

    A main function of public repositories (for me) is to provide a stable URL for preprints of papers (and not version management, which I do myself)
    This is my biggest worry, I am aware of an important thesis by somebody now dead that is only available in a hidden corner of a university website, I don't think the university realise it is there.
    It's very annoying if the link changes or disappears!
    These are essential - increases the quantity and quality of resources to a very great extent.
    This should be part of the responsibility of a national library, or major archive (such as arXiv).
    We drill into our students the importance of folks being able to find the sources used, yet use of the internet (and bizarre citations of temporary URLs instead of authors, titles, date of publication and date of use) is making it harder and harder to find a source twice -- without even addressing the very real problem of versions.

Persistence        Response   Response by %
Essential          143        30.8
Important          234        50.4
Unimportant        67         14.4
Very unimportant   9          1.9
Don't know         11         2.4
Total              464        100

Table 31: Persistence of papers

On this question authors were invited to provide further comments and 47 replies were received. Comments were made about the nuisance of broken links:

    This is my biggest worry, I am aware of an important thesis by somebody now dead that is only available in a hidden corner of a university website, I don't think the university realise it is there.
    Far too often online references disappear or move to other locations, usually within two years of referencing them. This is a very serious issue which needs sorting out.
    It is worrying to be citing online publications in my current book, all the time wondering for how long the links will be valid. Out-of-date links will date my book fast.
    We drill into our students the importance of folks being able to find the sources used, yet use of the internet (and bizarre citations of temporary URLs instead of authors, titles, date of publication and date of use) is making it harder and harder to find a source twice -- without even addressing the very real problem of versions.
    It's very annoying if the link changes or disappears!

Some researchers admitted that they download papers to get around the problem of broken links:

    I tend to store these papers myself so can provide it if requested.
    It very annoying not to find the paper in the address it used to be, when you return to check something.
    I usually save things I find interesting, but then get too many things in my computer and cds and often can't locate what i saved.

Others were less worried and felt that as long as papers remain available online, they will be retrievable through search engines or by going to authors’ websites:

    it doesnt need to stay in same place if there is a simple way of finding where it has gone!
    Personal web pages make the best storage places in this respect - they can cross institutions and life changes!
    The place is irrelevant, just Google it.
    It's more important that the paper is STILL available --don't mind if is no longer available at the same location.
    The same location is less important providing that the paper remains online. Google or other search engines can normally locate it.
    Journal restrictions (eg electronic for a limited time, without extra payment) are the biggest problem

Several respondents could see a role in this for libraries and archives:

    This should be part of the responsibility of a national library, or major archive (such as arXiv)
    It occurs very frequently that the digital location of a paper (even published) is changed. It is sometimes very times consuming sometimes impossible to find online-papers from the references list. In most cases this is not in the responsibility of the authors, but of libraries or publishers.
    A main function of public repositories (for me) is to provide a stable URL for preprints of papers (and not version management, which I do myself).

Further comments of note:

    If it moves location it may not be possible to verify which version it is
    With all online resources, it is important that the URL be highly durable, so that future readers can consult one's sources.
    The biggest problem seems to come from papers found on the author's person web site and having the author change institutions. Links to the new web page at the old site would be very useful.

14. Identifying versions

In this part of the questionnaire respondents were asked to indicate their views about the potential of 10 different version identification approaches for the research products they produce, applicable throughout their lifecycle. The suggested methods could be used in conjunction with each other and they address issues such as filenaming, metadata, human readable notes, and ways of establishing links between related versions. It was therefore anticipated that the response to these questions would be favourable in general and that nevertheless some priorities from the researchers’ perspective would emerge. These questions were optional.
In some cases results are given as a percentage of the total survey respondent base (464) and in some, as a percentage of responses actually made to the question.

Of the 10 version identification methods proposed, the 3 most popular with researchers were (as a percentage of all responses received for each proposed method):

    A method of indicating which is the published version – deemed Essential or very important by 75% (Essential 37%, Very important, useful for me 38%).
    A method of indicating which is the author's latest version of a paper – viewed as Essential or very important by 67% (Essential 25%, Very important, useful for me 42%).
    A standardised way of recording and displaying the date of manuscript completion – seen as Essential or very important, useful for me by 58% (Essential 19%, Very important, useful for me 39%).

The following table shows all the suggested version identification methods ranked in order of those deemed either Essential or Very Important. The percentages of question respondents who viewed each solution as either Essential, Very Important or Interesting are also shown as an indication of potential for these solutions.
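As a cross-check on this ranking, the sketch below recomputes the "Essential or very important" share for each question from the response counts reported in Tables 33 to 42 and sorts the questions by it. Two points are assumptions rather than anything stated in the report: the base of 464 (all respondents, N/A included) is used because it reproduces the published percentages, and ties are broken by the Essential count. The variable names are illustrative.

```python
# Base used for the percentages. Assumption: all 464 respondents (N/A
# answers included), since this reproduces the figures in the report.
TOTAL_RESPONDENTS = 464

# (question, Essential count, Very important count), from Tables 33 to 42.
counts = [
    ("Q26", 38, 172),   # terminology for each stage (Table 33)
    ("Q27", 27, 141),   # terminology for relations between versions (Table 34)
    ("Q28", 89, 180),   # date of manuscript completion (Table 35)
    ("Q29", 45, 134),   # version number (Table 36)
    ("Q30", 64, 160),   # note that this is the latest revision (Table 37)
    ("Q31", 18, 131),   # author notes on relationships (Table 38)
    ("Q32", 27, 136),   # collocation of versions (Table 39)
    ("Q33", 22, 127),   # textual comparison (Table 40)
    ("Q34", 173, 176),  # indicating the published version (Table 41)
    ("Q35", 118, 194),  # indicating the author's latest version (Table 42)
]

# Sort by combined support, highest first. Assumption: ties are broken by
# the Essential count alone.
ranked = sorted(counts, key=lambda row: (-(row[1] + row[2]), -row[1]))

for question, essential, very_important in ranked:
    combined = round(100 * (essential + very_important) / TOTAL_RESPONDENTS, 1)
    print(question, combined)
```

Under these assumptions the sorted order matches the order of questions in Table 32, from Q34 at the top to Q31 at the bottom.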
All percentages are % of respondents to the question.

Q no   Essential or very important   Essential, very important, or interesting   Essential   Version identification method
Q34    75.2%                         83.8%                                       37.3%       A method of indicating which is the published version
Q35    67.2%                         79.9%                                       25.4%       A method of indicating which is the author's latest version of a paper
Q28    58.0%                         79.6%                                       19.2%       A standardised way of recording and displaying the date of manuscript completion
Q30    48.3%                         69.6%                                       13.8%       A standardised note in the description of the paper stating that it is the latest revision available
Q26    45.3%                         81.7%                                       8.2%        A standardised terminology to describe each stage in the process of developing a research output
Q29    38.6%                         71.6%                                       9.7%        A standardised way of referring to different revisions by version number
Q27    36.2%                         77.1%                                       5.8%        A standardised terminology to describe how one version relates to another (for example B is a digital copy of A, C is a digital revision of A)
Q32    35.1%                         68.9%                                       5.8%        A method of linking records together so all versions of a given paper are retrieved by searches and presented as a group (collocation)
Q33    32.1%                         65.9%                                       4.7%        A method of comparing the text of different versions and displaying the differences between them
Q31    32.1%                         72.2%                                       3.9%        Notes provided by the author, describing how one version relates to another

Table 32: Version identification methods – All methods ranked in order of support

Detailed results for each method of version identification are presented in the following sections.

14.1 Labelling and naming versions

The researchers were asked to note how important it was to have a standardised terminology assigned at each stage in the process of developing a research output as a means to identify the different versions of the research product.
Just under half of the respondents to this question noted that such a method would be either essential (8%) or very important, useful for me (37%).

Terminology assigned at each stage   Response   Response by %
Essential                            38         8.2
Very important, useful for me        172        37.1
Interesting, but not so important    169        36.4
Not important at all                 25         5.4
Don't know                           11         2.4
N/A                                  49         10.6
Total                                464        100

Table 33: Version identification methods - Labelling versions

Furthermore, researchers were asked to indicate which terms they (or their publishers) use for revision stages. More than seventy (70) free text replies were received. The most commonly used term was ‘draft’ (33), usually in combination with something else, eg 'Submitted draft'. Terms which were mentioned in significant numbers by researchers included: preprint (6), manuscript (4), do not cite (3), submitted (35), accepted (14), final (15), revised (5), reviewed (3). Postprint was mentioned by just one respondent.

A standardised terminology that describes how one version relates to another was also considered very important and useful by 30% of the researchers and interesting by 41% of the respondents.

Linking                              Response   Response by %
Essential                            27         5.8
Very important, useful for me        141        30.4
Interesting, but not so important    190        40.9
Not important at all                 31         6.7
Don't know                           14         3.0
N/A                                  61         13.1
Total                                464        100

Table 34: Version identification methods – Describing relations

14.2 Chronological and numeric labelling

Having a method in place that allows for the displaying of the date of completion of a document was considered either essential (19%) or very important (39%) by more than half of the researchers.
Chronological labelling              Response   Response by %
Essential                            89         19.2
Very important, useful for me        180        38.8
Interesting, but not so important    100        21.6
Not important at all                 23         5.0
Don't know                           6          1.3
N/A                                  66         14.2
Total                                464        100

Table 35: Version identification methods - Chronological labelling

The option of being able to organise the different versions of papers by assigning a version control number was considered interesting but not so important by about one third of the respondents and very important by another 29%.

Numbering versions                   Response   Response by %
Essential                            45         9.7
Very important, useful for me        134        28.9
Interesting, but not so important    153        33.0
Not important at all                 49         10.6
Don't know                           9          1.9
N/A                                  74         15.9
Total                                464        100

Table 36: Version identification methods - By version control number

Almost half of the researchers (48%) considered a note in the description of the paper, stating that the version accessed is the latest one available, either essential (14%) or very important (34%) for their work.

Track revisions                      Response   Response by %
Essential                            64         13.8
Very important, useful for me        160        34.5
Interesting, but not so important    99         21.3
Not important at all                 49         10.6
Don't know                           21         4.5
N/A                                  71         15.3
Total                                464        100

Table 37: Version identification methods – Note in the description of papers

14.3 Describing the content and relationships

The researchers were asked to indicate their preference towards a method that would include notes provided by the author, describing how one version relates to another. About 68% of the respondents noted that this would be either an interesting (40%) or very important (28%) feature for them.
Describing relationships             Response   Response by %
Essential                            18         3.9
Very important, useful for me        131        28.2
Interesting, but not so important    186        40.1
Not important at all                 53         11.4
Don't know                           7          1.5
N/A                                  69         14.9
Total                                464        100

Table 38: Version identification methods - Describing content

14.4 Linking - Collocation

A method of linking records together so all versions of a given paper are retrieved by searches and presented as a group (collocation) was noted as something interesting by 34% of the respondents while another 29% replied that they would find such a function very important and useful.

Linking versions                     Response   Response by %
Essential                            27         5.8
Very important, useful for me        136        29.3
Interesting, but not so important    157        33.8
Not important at all                 58         12.5
Don't know                           13         2.8
N/A                                  73         15.7
Total                                464        100

Table 39: Version identification methods - Collocation

14.5 Textual comparison

Despite this being indicated as one of the difficulties that the respondents encountered when they find more than one version of the same paper available online, only 27% noted that a method that would allow them to compare the text of different versions and display the differences between them would be very important and useful to them. About one third of the respondents (34%) noted that they thought such a method would be interesting but not so important.

Comparing text and displaying differences   Response   Response by %
Essential                                   22         4.7
Very important, useful for me               127        27.4
Interesting, but not so important           157        33.8
Not important at all                        77         16.6
Don't know                                  4          0.9
n/a                                         77         16.6
Total                                       464        100

Table 40: Version identification methods - Textual comparison

14.6 Signposting – Publisher and author versions

Researchers deemed a method for indicating which version of a paper is the published version as most important.
It was noted as something essential for them and their research by 37% of researchers, while another 38% noted that this would be very important and useful to them.

Published version indication         Response   Response by %
Essential                            173        37.3
Very important, useful for me        176        37.9
Interesting, but not so important    40         8.6
Not important at all                 7          1.5
Don't know                           0          0.0
n/a                                  68         14.7
Total                                464        100

Table 41: Version identification methods - Published version

A method for identifying the author’s latest version was also essential to 25% of the researchers and very important and useful for their research to another 42%.

Latest version indication            Response   Response by %
Essential                            118        25.4
Very important, useful for me        194        41.8
Interesting, but not so important    59         12.7
Not important at all                 10         2.2
Don't know                           7          1.5
n/a                                  76         16.4
Total                                464        100

Table 42: Version identification methods - Author’s latest version

14.7 Other suggestions

The final substantive question of the survey asked respondents if they had any other suggestions themselves about how versions could be identified clearly in digital repositories. Thirty four (34) responses were received to this question. Seven of these specifically referred to perceived problems with the solution proposed in Question 30, which had asked about the usefulness of a way to flag something as the most recent version.

15. Discussion

The questionnaire was long and some researchers told us it took around 20 minutes to complete. In view of the number of questions we took the decision to make some questions optional in the hope that busy researchers would be willing to complete the questionnaire even if they skipped the non-essential questions. We were satisfied with the response, at 464 researchers. However we note that there were a further 109 incomplete survey responses.
We did not include these in our report or analysis, because it was not possible to determine whether they were duplicates caused by a researcher starting the survey then starting again from the beginning on another occasion.

In general the questions appear to have been understood and to have worked as expected, apart from one or two where we did not allow for a No or N/A response. In addition, questions 10-14 concerning responsibility for secure long term storage of journal articles were weakened by the fact that we did not define the term ‘subject repository’ clearly enough for respondents.

Question 9 about researchers’ own practice regarding storage and organisation of their own versions and revisions generated much interest and feeling among researchers. It highlighted the fact that researchers are themselves dealing with large personal and collaborative collections of digital files, even before the stage at which any of these are disseminated or passed on to repositories and archives.

The series of questions asking about interest in different version identification solutions could perhaps have been better structured to bring out priorities. As it was, many of the solutions received broad support.

16. Conclusions

The questionnaire completed by researchers brought to light a number of interesting points which have a bearing on the question of version identification and versions of academic papers in the context of open access digital repositories.

Firstly, the questionnaire confirmed the findings of the VERSIONS interviews that academic authors are dealing with large personal collections of versions and revisions of academic papers. It could be of great assistance to researchers to have some guidance or support in managing these, as it is clearly a time-consuming administrative task.
The respondents were a research-active group with a range of dissemination routes and types of research output typically pursued for each research project. 59% say they would produce 4 or more different types of research output for a typical project (such as conference paper, working paper in institutional or other series, article, chapter, report, etc). This active scholarly communication culture suggests that many versions of related works can be in circulation at one time.

In principle, authors tend to keep the author-created final accepted versions of refereed journal articles that would be suitable for deposit in open access repositories. 91% said they do personally keep these versions. In practice, the difficulties with personal information management mentioned above can mean that these potential open access versions are not easily accessible and are therefore less likely to be deposited (in the absence of any strong policy or mandate). However, it is encouraging that 81% of respondents said they would be willing to provide such a version for deposit in an institutional repository if requested to do so.

There remains some uncertainty among researchers about their agreements with publishers, despite the untiring efforts of many participants in the open access and scholarly communications communities to explain the current position. 68% said they were unsure whether they are permitted to place a final author version of their articles in a repository. This suggests scope for continuing advocacy work by institutional repository managers about resources such as the RoMEO service.

Another issue explored was the extent to which authors are already disseminating their work online themselves. In this respect the questions focussed very specifically on economics-related sites such as RePEc and SSRN.
The findings were that a little under half of economics researchers are using 3 or more such routes already, in addition to formal publication in refereed journals. Institutional repository managers offering to increase visibility of research outputs for this group of researchers will therefore need to explain the unique selling point of a well-managed open access repository, which is likely to be only one of 5 or so possible dissemination channels. 69% of researchers said it would be very useful or useful if their university/institution could take on a role in depositing research outputs on their behalf.

The question of loss of citations is sometimes mentioned by researchers as one concern they have about placing earlier or author-created versions of their work online. However, 80% of respondents said they would cite the published version of an article (or the published version and the earlier author version) even when they have read an earlier version online. Further answers to this question indicated that readers are taking time and effort to locate published versions so that they do not inadvertently cite content that has been removed from the published version.

In general authors and readers are united in their wish to highlight the latest and published versions. However, there was significant comment in the free text answers on citation, on storing authors’ own versions, and on secure long term storage about the value of working paper versions in economics.

The question about who and which institutions should have responsibility for the secure long term storage of versions of authors’ work did suggest that universities and institutional repositories still have some way to go in explaining how they can play a part in this.
It was striking that authors consider publishers to have a more significant role than universities/libraries in the secure long term storage of final author-created versions. It was also notable that 102 researchers saw no role for universities/libraries in the secure long term storage of any versions of their journal articles (including published versions).

Appendix A - Questionnaire

Versions of academic papers online - the experience of authors and readers

Thank you for your interest in the VERSIONS Project survey on writing, disseminating and accessing academic research papers. This is one of two surveys being conducted by the VERSIONS Project. If you are not a researcher or student but have another interest in the question of version identification, please complete our survey for providers and other interested parties at http://www.survey.lse.ac.uk/versionssurveyexperts

The VERSIONS Project is led by the London School of Economics and Political Science, with the Nereus consortium of European research libraries in economics as associate partners. The project is funded by the Joint Information Systems Committee (JISC).
VERSIONS Project (http://www.lse.ac.uk/versions)

Nereus Consortium (http://www.nereus4economics.info)
• The London School of Economics (UK)
• Tilburg University (NL)
• Erasmus University of Rotterdam (NL)
• German National Library of Economics (D)
• Sciences Po (F)
• Université Libre de Bruxelles (B)
• University College Dublin (IRL)
• UCL (University College London) (UK)
• University of Oxford (UK)
• The University of Warwick (UK)
• Katholieke Universiteit Leuven (B)
• Vienna University of Economics and Business Administration (A)
• Maastricht University (NL)
• Carlos III de Madrid (ES)
• Charles University CERGE-EI (CZ)
• Université Toulouse 1 Sciences Sociales (F)

Joint Information Systems Committee -- Digital Repositories Programme (http://www.jisc.ac.uk/index.cfm?name=programme_digital_repositories)

Introduction

An academic research paper evolves during its development from idea to published article. As a result numerous versions of the same paper are produced by the author. The ease with which papers can be created and distributed in digital form has led to multiple versions being available online. It can be difficult to establish the status of these online papers, or the relationships between available versions.

We hope to collect valuable information from you about:
• versions of your own academic papers that you produce as research outputs
• how you manage your own papers while revising your work
• how you disseminate full text copies of your work online
• your experience of finding and identifying other researchers' papers

About you (Page 1 of 8)

Your role and subject area

1. Which of the following best describes your role?
• Professor
• Lecturer / Associate Professor
• Post doctoral research staff
• Student (PhD or other research degree)
• Contract / freelance researcher
• Not an active researcher

If you are not active in research or teaching, please consider completing our survey for providers and other interested parties at http://www.survey.lse.ac.uk/versionssurveyexperts which may be more relevant to you.

2. Do you have any of the following responsibilities? (select all that apply)
• Head of department or research unit
• Dean or head of university research
• Teacher
• Journal editor
• Working paper series editor
• Officer of learned society or research association
• Other (please specify):

3. Which subject discipline are you engaged in?
[More information: What is a UOA? These are Units of Assessment as used in the Research Assessment Exercise in the United Kingdom. See http://www.rae.ac.uk/pubs/2004/03/ for further details.]
• Economics and Econometrics (UOA 34)
• Accounting and Finance (UOA 35)
• Business and Management Studies (UOA 36)
• Other

We are happy to receive responses from all disciplines, but you may find that some of the questions are targeted specifically towards economists.

You as an author (Page 2 of 8)

Your research outputs

4. Thinking about a current or recent typical research project, which of the following do you expect / hope to produce as output(s) from that research? (select all that apply)
• Conference paper
• Conference / workshop / seminar presentation
• Working / discussion paper (institutional working paper series with no quality control)
• Working / discussion paper (institutional working paper series with quality control)
• Working paper (membership working paper series such as NBER, CEPR, IZA)
• Journal article in refereed journal
• Journal article in unrefereed journal
• Report for funding body
• Book chapter
• Book
• Dataset
• Thesis
• Other (please specify):

5. Thinking about revisions you make to your research outputs during their preparation, which revisions do you personally keep / plan to keep stored in electronic form (e.g. on your computer or network drive) at the end of the process?
• Keep all revisions
• Keep major revisions but not all
• Keep the latest revision that I worked on only
• Do not keep a personal copy
• Don’t know
• Do not produce research outputs
• Other (please specify):

You as an author - papers produced for refereed academic journals (Page 3 of 8)

In the following questions we focus on your academic papers, intended for publication in refereed academic journals, and the process of writing, revising and storing them.

Writing papers for submission to refereed journals

6. How many papers intended for publication in refereed academic journals have you produced in the past two years? (Optional)
• More than 6
• 4–6
• 1–3
• 0 – I expect to produce academic papers in the near future
• 0 – I am unlikely to produce academic papers
• Don’t know

Revising and storing your academic papers intended for refereed journals

Final author version, in the following questions, relates to the version produced by yourself / co-authors, as agreed with the journal following referee comments.

7. Which of the following versions of a paper, that you have written for publication in a refereed journal, would you personally keep (e.g. on your own computer or network drive)?
Response options for each version:
• Keep permanently
• Keep until updated version produced (if applicable)
• Do not produce/have this version
• Don’t know
• Don’t produce papers

Versions:
a) Early draft version(s) (before circulation to anyone, other than co-authors)
b) Draft version circulated to colleagues or peers for feedback before submitting to a journal
c) Version submitted to a journal for peer review
d) Final author version produced by yourself/co-authors – agreed with the journal, following referee comments
e) Version produced by publisher – Proof copy
f) Version produced by publisher – Final published version (often in PDF format as it appears in the journal itself)

8. Thinking about storing your academic papers in the long term and focussing on final author versions of your papers, do you have an easily accessible copy of these among your personal files (electronic or paper)?
• All
• Most
• Some
• None
• Don’t know
• Don’t produce

If you did not answer All, please provide your reasons. (Optional) (select all that apply)
• I do not have electronic copies before a certain date
• I do not have copies produced while I was at a previous university / institution
• I do not have copies of papers that I co-authored, the principal / lead investigator has this version
• Changes to the manuscript are made iteratively between myself and the publisher in the later stages so I would have to assemble such a version
• Loss or damage to my computer
• Papers are stored electronically but would be difficult to retrieve from various servers
• Loss or damage to paper files
• I have discarded print copies of older papers before a certain date and do not have electronic versions
• Other (please specify):

9. Are you satisfied with the way in which you organise revisions and different versions of your own work, on your own computer or storage medium?
[More information: For example, some researchers keep all files on a specific research area in a folder by research topic, subdivided by document types such as conference paper, journal article submissions etc. Others use an incremental numbering system in the filename to denote revisions to documents. Others keep everything and simply search by the date and time that the documents are saved.]
• Yes
• No, not completely
• Don’t know
• Don’t produce research outputs

If Yes, please describe the system that you use and how it helps you to organise your files. (Optional)

If No, not completely, please describe some of the difficulties that you face when organising your own revisions of documents. (Optional)

Responsibility for secure storage of papers produced for publication in refereed academic journals

In this section we ask you to consider what responsibility, if any, individuals and organisations should take for secure long term storage of versions of academic papers, at different stages in the revision process. You may select more than one.

10. Authors / Co-authors should take responsibility for secure storage of: (select all that apply)
• Early draft version(s) before circulation (other than to co-authors)
• Draft version circulated to colleagues or peers for feedback before submitting to a journal
• Version submitted to a journal for peer review
• Final version produced by yourself / co-authors – agreed with the journal, following referee comments
• Version produced by publisher – Proof copy
• Version produced by publisher – Final published version (often in PDF format as it appears in the journal itself)
• None of these
• Don’t know

11. Authors' universities / institutions (including libraries) should take responsibility for secure storage of: (select all that apply)
• Early draft version(s) before circulation (other than to co-authors)
• Draft version circulated to colleagues or peers for feedback before submitting to a journal
• Version submitted to a journal for peer review
• Final version produced by yourself / co-authors – agreed with the journal, following referee comments
• Version produced by publisher – Proof copy
• Version produced by publisher – Final published version (often in PDF format as it appears in the journal itself)
• None of these
• Don’t know

12. Publishers should take responsibility for secure storage of: (select all that apply)
• Early draft version(s) before circulation (other than to co-authors)
• Draft version circulated to colleagues or peers for feedback before submitting to a journal
• Version submitted to a journal for peer review
• Final version produced by yourself / co-authors – agreed with the journal, following referee comments
• Version produced by publisher – Proof copy
• Version produced by publisher – Final published version (often in PDF format as it appears in the journal itself)
• None of these
• Don’t know

13. Subject repositories should take responsibility for secure storage of: (select all that apply)
• Early draft version(s) before circulation (other than to co-authors)
• Draft version circulated to colleagues or peers for feedback before submitting to a journal
• Version submitted to a journal for peer review
• Final version produced by yourself / co-authors – agreed with the journal, following referee comments
• Version produced by publisher – Proof copy
• Version produced by publisher – Final published version (often in PDF format as it appears in the journal itself)
• None of these
• Don’t know

14. Do you have anything to add to your answers above about responsibility for secure long term storage of versions of academic papers? (Optional)

You as an author - disseminating your work in full text online (Page 4 of 8)

Digital repositories

In this section we ask you about your use of digital repositories to store and disseminate full text copies of the papers that you intend for publication in refereed academic journals. A digital repository can be defined as an online collection where you can put, store and retrieve digital objects as well as descriptions about the objects.

15. Does your university have a digital repository where you can deposit your papers?
Examples of institutional repositories are London School of Economics and Political Science's LSE Research Online http://eprints.lse.ac.uk or Tilburg University's Academic Output: http://www.tilburguniversity.nl/services/library/ir/
• Yes
• No
• Don’t know

If Yes, have you ever placed any papers into the repository?
• Yes
• No
• Don’t know
• Don’t produce papers

If you answered No, can you tell briefly why? (Optional)

16. If your university invites you to place a copy of your paper in the institutional repository and requests from you the final author version, would you provide this version?
Final author version means the version produced by yourself / co-authors, as agreed with the journal following referee comments.
Response options for each statement:
• Strongly agree
• Slightly agree
• Slightly disagree
• Strongly disagree
• Don’t know
• Don’t produce papers

Statements:
• I am willing to provide this version – it helps me to disseminate my research quickly
• I am willing to provide this version on condition that readers are made aware that it is not the published version
• I am willing to provide this version on condition that a link to the published version is provided
• It would take too much time for me to provide this version
• I consider this version to be inferior to the publisher PDF version
• I place the publisher PDF version on my personal website as my first priority, if permitted
• I am willing to provide an author final version to a fellow researcher if requested by email
• I am concerned that I might lose citations to the published version if I provide my final author version
• I am unsure whether the publisher copyright agreement permits me to provide this version
• I intend to provide such versions in the future

Other dissemination routes

In this section we ask you about other online services you use to disseminate your papers and how this affects your attitudes towards the use of digital repositories.

17. In addition to formal publication in refereed academic journals and dissemination in university / institutional collections, as described above, do you disseminate your own research findings, in full text, through any of the following channels? (select all that apply)
• Personal website / homepage
• University website for working paper or discussion paper series
• REPEC (IDEAS, EconPapers)
• SSRN
• Don’t produce research outputs
• Other (please specify):

18. If your university / institution could deposit your research outputs in these alternative channels on your behalf, would this be: (Optional)
• Very useful
• Useful
• A little useful
• Not useful
• Don’t know
• Don’t produce research outputs

Accessibility

19. Which of the following versions of your academic papers are you interested in making openly accessible to the general public, if permitted: (select all that apply)
[More information: Some publishers do not permit authors to disseminate the published PDF version of articles but do allow the dissemination of final author versions. Other publishers prefer authors to use the final publisher version. The SHERPA/ROMEO list of publisher copyright policies provides further information: http://www.sherpa.ac.uk/romeo.php]
• Draft version circulated to colleagues or peers for feedback before submitting to a journal
• Version submitted to a journal for peer review
• Final version produced by yourself / co-authors – agreed with the journal, following referee comments
• Version produced by publisher – Proof copy
• Version produced by publisher – Final published version (often in PDF format as it appears in the journal itself)
• Don’t know
• Don’t produce papers

20. Do you have further comments to add about making versions of your academic papers openly accessible? (Optional)

21. Do you feel that you have a full understanding of which version(s) of your academic papers (intended for publication in refereed academic journals) you are allowed to disseminate in full text, in which locations and at which times? (Optional)
• Full understanding
• Some understanding
• Limited understanding
• No understanding
• Don’t know
• Don’t produce research outputs

You as a reader (Page 5 of 8)

In the following questions we ask you about your experience of searching for academic papers of other researchers in your field.

Finding multiple versions of the same academic paper

22. When searching for papers by other authors, how frequently do you find more than one full text version / copy available online?
• Very frequently
• Frequently
• Sometimes
• Never
• Don’t know

23. If you find multiple versions and / or copies of the same work, is it generally quick and easy to establish which one(s) you wish to read?
• Yes
• No
• Do not find multiple versions / copies

If No, what difficulties do you face? (select all that apply)
• Knowing whether there is a published version
• Knowing if I have found the latest (most recently issued) version
• Knowing which version is the most authoritative
• Knowing the difference between the content of one version and another
• Knowing whether I have found all of the versions / copies that could be available to me
• Time taken to look at different versions
• Other (please specify):

Citations and persistence (Page 6 of 8)

Citing papers that you have found in full text online

24. If you read an earlier version of a paper that has been published in a journal, how would you prefer to cite it?
• Cite the published version only, even though I have read the earlier version
• Cite both the published version and the earlier version that I have read
• Cite the earlier author version that I have read
• Do not cite any version of the paper if I have not read the published version
• Don’t know

Do you have any further comments relating to reading and citing pre-publication versions of academic papers? (Optional)

Long-term availability of academic papers online (persistence)

25. If you have read or cited a version of a paper online, how important is it to you that the version remains available at the same location?
• Essential
• Important
• Unimportant
• Very unimportant
• Don’t know

You as an author or a reader - identifying versions (Page 7 of 8)

The questions on this page are all optional. If you have time to complete them this will provide us with valuable information. However, if you need to complete the survey more quickly, please click the Continue button at the bottom of the page now to proceed to two final questions.
How useful do you feel that the following methods of identifying versions could be in helping you to organise your own research, and locate the work of others, if they could be implemented? Some of these methods are available already in some digital repositories.

Labelling and naming versions

26. A standardised terminology to describe each stage in the process of developing a research output: (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

When thinking about preparing a paper for publication in a refereed journal, which words / phrases for stages in the revision process do you or your publisher use? (For example, ‘submitted draft’ or ‘preprint’). Please briefly list the stages. (Optional)

27. A standardised terminology to describe how one version relates to another (for example B is a digital copy of A, C is a digital revision of A): (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

Chronological and numeric labelling

28. A standardised way of recording and displaying the date of manuscript completion: (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

29. A standardised way of referring to different revisions by version number: (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

30. A standardised note in the description of the paper stating that it is the latest revision available: (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

Describing the content and relationships

31. Notes provided by the author, describing how one version relates to another: (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

Linking

32. A method of linking records together so all versions of a given paper are retrieved by searches and presented as a group (collocation): (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

Textual comparison

33. A method of comparing the text of different versions and displaying the differences between them: (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

Signposting

34. A method of indicating which is the published version: (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

35. A method of indicating which is the author's latest version of a paper: (Optional)
• Essential
• Very important, useful for me
• Interesting, but not so important
• Not important at all
• Don’t know

36. Do you have further comments to add about how versions of academic papers could be identified clearly in digital repositories? (Optional)

More about you (Page 8 of 8)

It will help us to interpret the survey results further if you can answer the following questions.

37. How long have you been engaged in research? (Optional)
[More information: Please indicate the number of years, beginning from the first year of a PhD degree.]
• More than 10 years
• 6 – 10 years
• 0 – 5 years (including PhD research)

38. Which country are you based in for your study / research?
Appendix B - Other subject disciplines of respondents

As noted in the section on identities, the majority of responses were from economics and related disciplines:

Economics and Econometrics (UOA 34)         347    74.8
Accounting and Finance (UOA 35)             15     3.2
Business and Management Studies (UOA 36)    29     6.3
Other                                       73     15.7
Total                                       464    100

The list below shows the subject disciplines of those 16% of respondents who were not from economics and related disciplines:

Discipline                                                          Response   Response by %
(UOA 9) Psychiatry, Neuroscience and Clinical Psychology            1          1.4%
(UOA 14) Biological Sciences                                        1          1.4%
(UOA 17) Earth Systems and Environmental Sciences                   1          1.4%
(UOA 19) Physics                                                    16         21.9%
(UOA 21) Applied Mathematics                                        4          5.5%
(UOA 22) Statistics and Operational Research                        7          9.6%
(UOA 23) Computer Science and Informatics                           7          9.6%
(UOA 26) Chemical Engineering                                       1          1.4%
(UOA 31) Town and Country Planning                                  1          1.4%
(UOA 32) Geography and Environmental Studies                        1          1.4%
(UOA 37) Library and Information Management                         8          11.0%
(UOA 38) Law                                                        1          1.4%
(UOA 39) Politics and International Studies                         2          2.7%
(UOA 40) Social Work and Social Policy & Administration             4          5.5%
(UOA 41) Sociology                                                  3          4.1%
(UOA 42) Anthropology                                               1          1.4%
(UOA 43) Development Studies                                        1          1.4%
(UOA 44) Psychology                                                 1          1.4%
(UOA 45) Education                                                  2          2.7%
(UOA 52) French                                                     1          1.4%
(UOA 58) Linguistics                                                1          1.4%
(UOA 60) Philosophy                                                 1          1.4%
(UOA 62) History                                                    2          2.7%
(UOA 63) Art and Design                                             1          1.4%
(UOA 64) History of Art, Architecture and Design                    1          1.4%
(UOA 66) Communication, Cultural and Media Studies                  3          4.1%

Appendix C – Countries of respondents

Country                                     Response   Response as %
Argentina                                   4          0.9%
Australia                                   4          0.9%
Austria                                     32         6.9%
Belgium                                     16         3.4%
Bolivia                                     2          0.4%
Brazil                                      6          1.3%
Canada                                      7          1.5%
Chile                                       1          0.2%
China (People's Republic of) Also Tibet     1          0.2%
Colombia                                    5          1.1%
Costa Rica                                  1          0.2%
Czech Republic                              20         4.3%
Denmark                                     2          0.4%
Finland                                     1          0.2%
France                                      28         6.0%
Germany                                     49         10.6%
Greece                                      2          0.4%
Hungary                                     3          0.6%
India                                       1          0.2%
Italy (also Vatican City)                   25         5.4%
Japan                                       2          0.4%
Mexico                                      7          1.5%
Netherlands                                 40         8.6%
Norway                                      3          0.6%
Pakistan                                    1          0.2%
Panama                                      1          0.2%
Peru                                        1          0.2%
Philippines                                 1          0.2%
Portugal (also Madeira, Azores)             5          1.1%
Puerto Rico                                 1          0.2%
Romania (Rumania)                           1          0.2%
Russia                                      1          0.2%
Singapore                                   1          0.2%
Spain                                       24         5.2%
Sweden                                      9          1.9%
Switzerland                                 5          1.1%
Turkey                                      1          0.2%
Ukraine                                     1          0.2%
United Kingdom                              80         17.2%
United States                               57         12.3%
Uruguay                                     1          0.2%

Appendix D – Personal information management and versions – success stories

Question 9 asked ‘Are you satisfied with the way in which you organise revisions and different versions of your own work, on your own computer or storage medium?’ Those who answered Yes (229 respondents) were invited to ‘describe the system that you use and how it helps you to organise your files’. 134 people replied.

• Draft form with primary number 0; issued versions allocated consecutive numbers. Letters used with numbers to ensure versions are saved regularly whilst in drafting stage.
• organize along dates of storing and no. of revision
• filename with version number when published filename includes the word "final"
• I define a project folder, with subfolders indicating consecutive journal submissions, and further subfolders indicating the various revisions that have been submitted.
• I use subject based filing system and use the date to distinguish between versions that I have produced.
• Yes, I have my files organised by folder, each related to particular topics within which I deposit related research outputs (irrespective of dissemination method (i.e. journal, conference, etc.)). Filenames contain date of update to foster sorting.
• I use version numbers, eg "paper 2.1.doc" changing the second number with each edit of any significance and the first number if there is a milestone in the process - team review/change of direction etc. I keep the milestone versions in a backup folder within the main folder that the document is developed in. When working with colleagues I ask them to include their initials with the version number so it might go "paper CR2.1" "paper FB2.2" etc.
• I keep research outputs in a directory/file hierarchy on my computer organised by date, then by research project, then by paper, then by version. The directory/file hierarchy is backed up for security.
• Each version has a title, number, and date; I keep a separate folder for the latest version.
• In my publication folder, per year and per publication. Back-up on a stand-alone HD
• I keep a folder for each paper. I number versions and put old version in a sub-folder. After publication, I move the folder to a folder of old papers.
• Number each version, like papername1.pdf, papername2.pdf etc.
• each paper has its own directory organized in several subdirectories (data, text, literature and so on). Each subdirectroy is also divided in sub-sub-directories. older versions are put in subdirectories ordered by time.
• Have files organized by topics, use file-date information to refer to appropriate version.
• directories by version
• On a "research project" folder, I keep all major versions of the paper. On a "publications" folder, I keep just the latest version currently submitted for publication or the final version, if already published (and if I have it in electronical format).
• keep latest version only
• By creating one folder per version I keep track of all previous versions. paper xxx revision 1 data programs output text literature revision 2 .....
• naming a project with a simple word and save files with date and other data included. for example: imagination_18May2006 or imagination_jeea.
dating files
subversion (svn)
I have a separate folder for each paper, where I store dated versions of the paper (I record the date in the file name).
for every paper or work I'm producing, I'm attaching at the end of the document a version number and a date. for each new improvement that affects seriously my paper, I'm attaching to the document a new version number. eg. paper-v1.3-12.07.2005.doc. for the significants versions (which I'm producing for some review) I'm preparing pdf files.
Change file name and make backups once major revisions are made.
Filenames with subsequent versions contain the version number, so that I can easily sort them by name in my file manager, and see which is the last one. for example: thesis01.pdf thesis02.pdf thesis05.pdf etc..
Windows + WORD
Just sort files by date
A directory per paper, with a number suffix indicating version number.
They are organized in electronic folders, one per paper.
store in one folder, giving each subsequent version a higher number
I collect all the revisions and final versions in one folder and name them in ascending order (the oldest being version 1)
One project - one folder
I have a "research" folder, where each of my projects has a different subfolder. Everytime I change the version of a paper, I save it by using the current date on the filename. Like this, I know immediately what is the last version of my paper.
One folder per project!
I use numbers as used for software versions e.g. 2.2 for second version of first major revision
For each project I use a numbering system projectname_XXX.ext Any revision by any coauthor increments the number
I save major versions until the paper is accepted for publication. Then I save only the final accepted version.
different folder per research project and different subfolders for different versions
I keep all printed papers arranged chronologically and the NBER keeos all Working Papers electronically
Numbering versions Documenting major changes in a separate document (if necessary)
I name the different files starting with the date, but keeping the same name otherwise. I should however have back-up copies (now only on the network at work, but need also personal back-up)
CVS
More or less intuitive folder and file names. I rarely look at old versions, but I can find them if I must, albeit after some effort.
Working draft are stored in dossiers by year/subdossier by the name of the paper. They are not updated and keeped as it is. Final versions (pdf and latex) are stored in the special dossier where there are two subdossiers: pdf and latex. These files are updated and replaced by newer ones.
By Project, then by date subfolders. This allows me to keep track of most recent revisions. I also name files with a date extension, reflecting the most recent revision.
I include current date in the file name. If I want to keep the previous draft, I create another file whose name includes the current date and this will be the one on which revisions will be made.
I give the paper a name, and then number all versions: i.e. paper1,paper2,paper3. When a paper is submitted to a journal, I rename it, adding the title of the journal, i.e. paper_res1,paper_res2. Final author versions get the suffix _final and proofs get the suffix _proofs.
I generally create a specific place for each new research project where raw material (text, data, simulations, etc...) is stored.
I also keep the latest version of each file and latex files stored in folders for each paper with the appropriate tables and figures as well as the relevant (multi-project) bibliography file (for use with bibtex); data stored also in folder if relevant
Drafts folder with all revisions and files numbered; Revisions folders with work files
Every file name includes a date e.g. hello060518. This way it is easy to find the latest version among coauthors and for myself. If I am looking for a specific note (which also have dates) it is easy to find when tying it back to the date. I can easily erase very old versions.
My research is organized in folders by research topic, with a subfolder for each research paper. While writing and revising a paper, I always work on the same file (e.g. paper.tex). I keep a copy of each version that is circulated or submitted (saved as paperK.tex for the Kth circulated version or paperXXX.tex for the version submitted to conference or journal XXX, together with the corresponding pdf file paperK.pdf or paperXXX.pdf) This system is simple and transparent. There is never confusion about the latest version (which is always paper.tex) and submitted versions can instantly be identified.
name_of_file_version_number ; the highest version number is the latest version.
all my papers have a file name composed of different parts... first the initials of all the authors, then a key word of the paper and finally the version of the paper numbered as versions of software are numbered (v2, v2.1, v2.11, v3, etc...).
Different folders for each paper, and sub-folders for different stages of the paper. + a specific folder for published papers
I organize all my work within folders named by year: e.g. \2005 \2006 \2004 etc. On January each year I burn 2 CD backups of all the previous year folder. I also keep a backup copy of my hard disk every six months or so.
Each version of each paper has a unique name, e.g. uniqequil010205, where the first part indicates the paper and the second indicates the version. Using dates as part of the name makes it easy to see which is the last version. Also, Windows autimatically displays the files in chronological order.
In the "My Documents" directory, I have a subdirectory for each paper, containing the latest revision of the paper and (sometimes) previous versions.
Filing in appropriate folder and backups.
Prepare a folder for each article, including dataset and commands list. There I also save the versions, giving different names to files only if the changes are substantial, and only once the work is fully completed.
I store a new major revision in a new folder and minir revisions in the same folder
I use bscw, gmail (store them into emails), my pc's at home and at work, so I have 4 back ups. bscw has the best way to keep in mind what is the current version. This is my primal organizer.
One folder per paper.
Computer file system, BSCW
i simply give every new version a new name (including a number).
(1) Main directories "accepted", "submitted", "work" (2) Subdirectories for each paper (3) Different versions of the paper identifiable by the date in the name of the paper, like "AAA220506"
the name of the version contains the last date of activity on that paper.
mostly just filesystem based occasionally using revision control software
I try to make sure that each item is either dated or numbered, so that I never get mixed up about which one is the most recent.
I have created sub-files for each paper that I write and gathered the different papers written in files by research topic
All versions are classified in the same directory. The name of the file includes the date of revision.
No special system, just directories for every year and inside for each conference or other type of publication
I allot a folder for each paper, and keep all its versions there
Same title + ascending abc for different versions
Files have dates
By date
I save each new version with the version date in the name
Make sure each subsequent version is referenced by a revision number, such as rev3, sometimes with date or the name of the last drafter or editor
I number the version of the same paper chronologically and save under a subdirectory which indicates the destination (one paper may end up as an inside publication, a journal, or/and a report, each with its own style and format).
different folders for each paper, versions include date
I keep copies of all files of a version I want to store permanently in seperate subdirectories. If necessary, I use CVS.
I keep all files numbered (e.g., draft1, draft2) or with date (draft0105_2006).
By paper, by iteration (identified by start date) within paper
Folder for each paper, various versions in same folder. Back up copies on home computer.
01_ 02_
SVN archive containing both LaTeX documentation and R code/datasets from computations. SVN (subversion) stores all revisions automatically, earlier version can always be retrieved.
files have short names ("xyz"), different version shown in appendix ("xyz_v1") and journal acronym if submitted in this version "xyz_v2_cje"
every version of the file has a distinct name, ending with the last date of actualisation or with "final version", "layouteditor", so I don't get confused about which is the latest version but have a backup at the hand if necessary.
rank order of folders coauthor year title of article (paper) or journal to which paper will be submitted last digits give number of revision
I fact, there is no need to keep all versions after a paper is published.
usually we go through 10-15 rounds of revisions before we submit. Then follow the revisions during the review process, sometimes only 3, more often 5, up to 15.
I save each article by name and date (plus a, b, c, etc if several in one day). I save chapter drafts similarly. All version of each article are stored in a single subfolder in a folder marked 'articles' or 'books'. Once accepted/published, the icon for each is changed for easy reference. If rejected and abandoned I would plan to treat it the same way.
files that are named accordingly
All versions are in one subdirectory, version number is in the name of file
normal backup procedures; more recently: Subversion version control system on a personal server
Every time I make a new version, I save the version with the same name, exept for the date. So when I continue I open the last version...
Separate folder for each publication, separate subfolders for each version, storing all data (including figures etc.)
I have subdirectories for different papers.
Version control system (CVS or Subversion)
Just construct a new folder for every article. Store the different version and the database used in this folder.
by research theme and journal title
choose adequate file name & add date in format yymmdd, for example Estimating Risk Premium 060612.doc
I add a version number to the name of the paper. The final version has "_final" in its name.
subdirectories for different stages of work
I am using simple file structure in directories which is sorted by conferences or others events
I usually throw away everything as soon as I have a new version. Unless, i.e., the new version is in another language or has some substantial changes in it, so that I may need the first version for some other purpose later. When writing a new version and keeping the old one, I usually add "New" to the title, so I may have some "NewNew" and "NewNewNew" versions, but this rarely occurs. I can always identify which is the last one.
I number each version sequentially, add working paper series and number to versions brought out as working papers, add journal name (with revision number, if relevant) to versions submitted to journals, and label final publisher PDF files in the same way as for papers by other authors.
I have files containing different version of the paper with different dates
Create a new directory for each version
I simply keep all the different versions. After two years I send everything to dvd and erase them from my computer (except final versions)
First I have a folder for each topic of research I am interested in. In each topic folder, I have a folder for each project. In each project folder, I use several folders for: (a) bibliography, (b) data, tables and graphs, (c) econometric output, (d) PowerPoint presentations, and (e) drafts. All drafts have dates printed on the cover page so I am able to order them chronologically.
I keep a separate folder for each of paper produced, in which there are subfolders for each phase of the work. Within them there are the material produced (images, database, spreadsheets, outputs, etc) including paper versions.
I always put a date on the file name for each major revision. The front page also has a date.
I have a separate folder for each project, and I keep all drafts in the same folder.
Each new version of a paper has a new name with the date specified on which I started the revision. I keep older versions in a separate folder called 'older versions' on my computer. In a folder called 'recent version' I keep the latest version to work on. Besides these two folders, I also have a folder for 'publication in a journal' in which I keep the submitted version, the reviewed versions, and the final version. For each paper, I have these three folders.
Include date in file names
Separate file folders for different projects, versions are always dated in the file name and usually the journal name is added to the file name.
Date each file and give it a name that indicates the sequence.
I number succeeding drafts of my work and the final one I give a title with the word "final" in it. Some titles refer to versions produced for particular conferences.
For each paper, one file. For each version, date of production.
Different versions are numbered consecutively (1,2,3 ...). Different papers are in different folders.
Research Directory/coauthor/paper paper/final paper/seminars paper/drafts/working papers paper/submissions/journal
Everytime I work on the paper I save it in the same folder with the same title but a different date after the file. This is possible because my computer has lots of memory so the cost is low. This means a folder might have twenty or more versions. When the final version is ready I put FINAL in capital letters after the file name.
I have a different sub directory for each set of revisions, which would include new versions of spreadsheets and new versions of the paper.
I produce papers using Latex, and I usually put a version number in the title when major changes are made. Minor changes (slight rewording, typo corrections) are effected by overwriting the existing version. This system seems to work.
I attach version numbers to all revisions. If I am co-author on someone else's paper, I attach my initials and date to the filename of the revised version.
Spotlight on Mac OS X
I have a version control filing system and full back-ups
I use my own computer. I have an directory structure with each paper having its own directory. Inside that directory are major iterations (the filename is updated for each iteration), and a subdirectory for figures. The submitted version is named as such, and then when the referee comments are received I make a new subdirectory and produce the revised version in that.
I use Subversion versioning software, running on a university server that is backed up daily

Appendix E – Personal information management and versions – Difficulties

Those who answered No, not completely (222 respondents) were invited to ‘describe some of the difficulties that you face when organising your own revisions of documents’.

probs arise because of using multiple machines - use stick to transfer but sometimes versions get muddled! Need a virtual storage space accessible from anywhere and one which has auto back up!
I would like to have more comprehensive system (better than Microsoft file system) to archive them by creating and using easily and rapidly a classification system and a filing system where preserve with a unique persistent identifier all the materials selected for preservation
I am a bad information manager - something else always captures my imagination before I get around to it
It is difficult to keep track of revisitions that at first may seem to be minor but may develop into major changes. I need a better discipline for describing and storing revisions.
I tend to end up with names for documents that sometimes confuse versions.
I tend to retain too many draft versions and sometimes I get confused about which is which
Changes in software versions make going back to old versions difficult
don't catalog carefully
too many files relating to the same version: one tex file and many matlab codes for just the same version of the paper leads to too many names for just one version
I tend to disaggregate the work to much and sometimes it is hard to trace back a document
Sometimes hard to keep track of all changes, unless a good naming convention is adopted early on. Spending some time on deciding what to keep and what to drop would be more efficient.
Sometimes I cannot remember what is the name of the revision that I'm looking for at the moment.
I should improve my system of naming the files:-)
I have many different version (and a lot when the paper has been rejected several times). It is difficult to remember where all the versions are.
finding a filename and sometimes what has changed in the different versions
if there are too many versions, it is sometimes difficult to organize them
(1) takes up too much space (2) makes it harder to find what you are looking for (3) Figures (and other files) called on by the main document can get put in the wrong place.
No consistency in my work (only versioning of important documents, versioning is done manually - new order in the file system), as I have no co-authored documents this is no real problem (in the moment, but could be as soon as I write more documents).
Could do with a formal version control system listing changes, rather than tracking versions by date
sometimes I do not know which is the latest version or which version was sent to a journal
Sometimes there is no clear ranking of versions because they are produced to match different requirements (conference, journal submission ...)
finding them easily
keeping track of which version was produced when and what for
Sometimes it gets messy !
Difficulties with tracking the most recent version, especially when working with co-author (i.e. always!)
I often lack sufficient time to dedicate to file management - also among the many tasks one has as researcher and teacher this is not considered to be the most important although it is nice and convenient when looking for a file.
I keep version of different stages of each research project, and am not happy with my labeling scheme which often does not reveal which version contains what.
I am using two different computers for my work. As a result sometimes I have my work at different stages on the two computers. I would like to find an easy way to get both systems up to date at any point in time.
At times when getting information from older versions (information that at some point I thought not to include but then changed the decision) some times I am not very sure which previous version contains the information.
Too many versions kept and difficulties to find the one I am looking for
I easily get messed up with when I did produce what.
Often there are almost too many different versions of the papers to keep track of things.
forget to rename updated versions
no rational numbers attributed
My organisation of versions of papers have improved over time. For the older papers I have to search by "date and time" but the newer I have used a numbering system (especially when writing together with other researchers).
Multiple folders, with dates, and number of version in file
Excessive storage needed for different versions of code. I am not really good enough/take enough time to organize all vintages of code well.
Moving too often...files get lost.
i keep on rearranging my storage system structure and have several parallel "versions"
different versions may have different merits, and the final version (accepted) may or may not be the "best" one...
to recognize the latest version!
sometimes my folder organization gets messed up and I do not remember where I saved the latest version
Given long lag for publication and need to revise multiple times, one sometimes looses tract of new versions of paper as well as versions of dataset that correspond to the paper.
I revise many times and am always afraid to lose bits that are removed from the paper. So, I store everything "just in case" but get lost in the end when I want to recover that. This is even more difficult when I have more co-authors.
Could use a better version-naming system, including perhaps an automatic one (Word's 'copy' naming too clunky).
I need to improve my unique system of naming and storing electronic and paper copies as more work - in progress and completed - accumulates...
Incomplete tracking of changes
Sloppy dating of drafts
It is a nightmare!!! I used to try to name things with titles that would say something, btu with so many drafts, it is impossible, and I lose some. The best system I ahve come up with is paper title_date, but that is not fool proof either. When working alone, it is ok, but when I had a coauthor, I accidentally overwrote things and it was a mess. In switching between institutions (4 times in 6 years), I have also permanently lost data that I need to revise a paper for it to be published... ARGH!
i do not know whether to give them a number corresponding to the number of the revision or to give them a date
Maintaining coordinated archives between multiple machines
Too many files, partly because storage is cheap (fortunately)
I'd like to have version published. Come to think about it: why don't I have it? A few times I have noted that the final draft I keep is slightly different (some sentences) from the one published.
People have asked me for the PDF version of the published paper and I cannot send it! In the final analysis, I guess the reason is that going through the paper proofs, correcting them and seeing the paper published is, for me, a completely cumbersome process.
I'm sometimes changing my habits, in other words find it difficult to have a systematic organisation of my storage behaviour
Folders tend to become messy.
Synchronization of laptop and desktop hard-drives.
Sometimes may have computer problems with files
Sometimes it is difficult to keep track of different versions and sometimes a similar paper belongs to a project and to a conference etc. and I do not know where to file it. Also, I am never sure how much versions to keep - I do not want to lose important information but an information overkill may be a problem as well. Especially, after some time has passed I might not remember which are the relevant versions and what are the differences.
Keeping too many working versions and forgetting to destroy them when possible makes my filesystem messy. And the XP/Word files/folders dichotomy is to flat: I'd like to have 'bookshelves', some further organising 'tools' in the operating system.
Difficult to see which version is the latest when there are only o few minor differences.
Some difficulties keeping track of final versions of older papers
Different folders and even computers, no systematic naming.
Sometimes lots of versions all date stored but takes up too much space
Need to be more organised and concentrate on storing final version unless it is a paper that changes radically over the time of writing
I work on several computers, so have to be careful about which version I'm currently working on.
Sometimes, papers get merged together or splitted into splitted papers. It is then difficult to keep track of the history.
it takes a lot of space, I sometimes get confused with which one is the latest
Sometimes I have too many copies. Difficult to remember which one was a major revision.
time consuming!
I tend to have a jumble of old directories including everything i had at the time, on the various machines I use. It should b more systematic
too many documents. forget what they are
Remembering the reasons for revisions
The problem is that co-authors sometimes do revisions on the wrong version.
We don't agree which is the latest version
determining the chronological order of the versions; maintaining a common site for all versions--I work on a desktop in the office, a desktop at home, a laptop, and keep files on USB key memory devices as well
Generally, I keep them all in a folder, with versions organized by date the revision was completed, so it is easy to know which is the most recent. However, my co-authors do not always follow this naming system, so sometimes when I get a revision back from them via email, if I don't save it and rename it right away, it is hard for me to locate the most recent version when I go back to work on it.
I save too many iterations of a paper. I have a system for dating versions, but I sometimes save minor revisions or deviate from my system, so it's hard to figure out which draft is which.
too many versions of them so it takes time to figure out which one I want to send out or work on
Things that seem like dead ends, or not worth the space, might be good responses to referee comments later. Also, if current working version is a dead end, might want to re-try from an earlier version. But, assessing which versions to retain as backups is hard.
I tend to keep so much that I can't catalogue it.
Not organized enough, so I keep way too many old versions, and sometimes am not sure I have the most up to date one easily available and may pick an old one instead
I get easily confused as to which papers are the latest versions due to poor labelling of files.
While I plan to always keep the latest version of a doc often this does not happen.
Most problems occur when a number of co-authors are collaborating.
Some revisions are ex-post undesirable for certain submissions. The balance between keeping enough versions and not getting clogged is not 100% resolved.
lack of quick access, lack of ability to adequately describe the files
it is somehow difficult to reconstruct the order of the different versions after few years have past
I do not have a consistent renaming system. This causes major problems in finding the correct version after a long period.
not having the latest version when a co-author deals with the submission business
it's sometimes time consuming to find the latest version when many versions are stored; there remains a risk that I select the wrong version when disseminating the paper
I do not subdivide the different versions.
Problem of different folder (local, server) and where are the current versions
I work on several computers and sometimes lose track of where the most recent revision is.
It's time-consuming to save each extra step and writing/re-writing is an iterative process, so only main revisions are stored.
After a long time I might not remember some details about some versions, perhaps, since different versions are just stored (under the name that also includes date of that version of paper) without any other comments. Generally I try to include key comments (in short) into the name of file.
Documents are organised to the project AND to a folder named "publications"; Sometimes I forget to save the final document to the "Publications" so that I have to look for the right version.
I am not working on one computer. Sometimes there are three workingstations. Sometimes I forget my usb-stick and then I can't update the files on the other workstation.
i am chaos
I keep all my revisions together in the same folder, and sometimes I don't name each one properly to signify what I submitted, what I got back, etc. So I sometimes have to go by the date of the file.
Sometimes versions get mixed up, and then I do not know which version I sent out to a journal (for example).
mistakes in change management
sometimes not sure which is most recent
I cannot always find the version needed to cut and paste parts of the text which are ok after all
no consistent description of version status; no sure who has a copy of each version; different version control systems used over time
i MAKE A MESS OF IT
not systematic, no tool to keep track, plan to use CVS
E.g. difficulties to keep track of the versions, especially of the order in which they were written.
too many versions increase chance of confusion, working with old drafts, especially when using more than one computer, also: loosing ideas when proceeding with the work in another direction and not simply revising older versions
I do have some difficult to retrieve my papers in my computer because some times I forget the names of the files or where I save them.
Sometimes I cannot easily identify which version is which. Therefore I now add dates to each version of the paper in the name of the file.
Could be difficult to have overview if there are many versions.
Making sure that a certain version is the definitive one.
I normally save everything in a separate folder for each paper, but when coming back to this folder after a while nothing makes immediate sense, except for the final published version.
after a number of revisions is confusing to keep proper track of all documents and versions.
There are simply too many versions.
Lazy naming of files especially in “DOS” eight character era.
Accidental editing of an earlier version, confusing the date order; co-authors mixing versions, so non-nested variants occur.
Sometimes I loose track of storage and/or make mistakes identifying the different versions
Sometimes I mix the version of papers, databases and is a mess to find out the last one when I get the response from referees.
Hard to keep track of changes, versions, dates, etc
Locating different versions on different machines (home, school, old home machine, old school machines)
SOMETIMES IT IS DIFFICULT TO ORDER FILES IN ORDER TO KEEP TRACK OF CHANGES. ESPECIALLY WHEN COAUTHORING.
I keep some useless versions.
content of a revision
I am storing to many versions before deciding which ones to discard.
Version control and retrieval are major problems.
Not always possible to know which the final version of a paper was when one comes back to look at an old paper.
sometimes it is difficult for me to find the final version. The main problem is related to the fact that I store the versions in the same directory. Nowadays I created a file entitled "final version" to store the final versions of my papers. I hope it helps a lot in organizing my work.
sub-directories for different projects, but a few files might be in more than one and so searching can take longer than optimal
Sometimes I forget where I have put my last version
I have developed a system whereby every iteration of a paper is identified by paper name and date (and, if co-authored, by which of us has done the iteration). What I am not so efficient at is distinguishing between versions with limited differences and those where substantial changes have been made.
Sometimes I make alternative versions of the same paper (horizontal versions) and find difficult to acknowledge which specific changes I have made in each version.
organizational aspects
Sometimes it is hard to find the last version. Nevertheless, with the last papers I think I am doing a better job.
When I work with co-authors who have different archiving or identifying techniques it makes hard to keep a clear path of revisions.
Sometimes I don't remember under which criteria I classified and stored the document.
some documents and in work, some in latex, so difficult to send for revisions
Not very systematic in filing them or naming them
Can be hard to recall which is the latest version when it comes to revising a paper to take account of reviewers' comments. Occasionally, co-authors may need to be provided with the latest version and this may not always be immediately obvious.
I do not always label versions clearly, for subsequent reference.
Sometimes the name is too long that I can not distinguish the version, either if it's the first or the last one. I know it is a minor problem, however, I have always to invest time thinking about it.
I try to save too many old versions.
I have problems when working with different computers. Sometimes I forget to update files in one of them.
Very often, I forget to write the date of the revision on the draft. Sometime later, it is difficult to remember in which order they were written.
Too many diskettes, some not correctly labeled. My present computer does not accept diskettes.
papers are not well organized. it takes too much time to be on top of all versions so it is easier to store. i need to be better organized.
I some times got confused by the versions I have
changes are made iteratively - corrections to the final proofs are often made only by the lead author - all other co-authors typically don't have a copy. Final version is often still not camera ready ie graphics and figures are often submitted electronically as separate files (in addition to a text file) to the publishers so no actual 'as submitted' final version exists which can be easily submitted to a repository.
Organising a repository takes time. Upgrade of computer systems and changing institutions can cause repositories to be lost.
time constraints
I think that I keep too many old versions. I like to keep several while working on the paper incase files get corrupted etc but I seldom go back afterwards and delete all unimportant versions.
whilst I am reasonably happy they are secure and safe, there is always the possibilty of the computor glinch. I do not always have paper copies of interim versions, so things could be lost
probs arise because of using multiple machines - use stick to transfer but sometimes versions get muddled! Need a virtual storage space accessible from anywhere and one which has auto back up!

Appendix F – Responsibility for secure long term storage of versions of academic papers

In Question 14, researchers were asked if they had anything to add to their answers about responsibility for secure long term storage of versions of academic papers (Qus 10-13).

Subjects in Q 11 and 13 should collaborate to ensure preservation.
ideally I like to post pre-accepted-for-publication versions in the public domain eg personal or professional website to speed general feedback and participate in ongoing communities of practice - but this is a personal/discipline thing - I am not working in a field where I am too worried about others "stealing" my work - mainly I am wanting to give my students access to my work to supplement my teaching. Right now I am not teaching but keep my research team colleagues up to date by posting documents to a project website - it has both public and team-member-only areas, most documents have a version for each area. I expect all these web resources to be kept going over a period of years.
I personally don't feel that any versions before the "nearly final" ones should be kept forever. Depending on the remit of the IR and copyright positions, then the more "final" versions should be kept. However, at the moment, my position is that the publishers should have the primary responsibility for keeping the formally published version
These questions are not completely correct and relevant: it depends on the function of the repository. If it is an archival repository of the publisher, of the university, etc. the level of detailed and complete information should be higher than in the case of a repository of eprints. I think that the debate in this area is very confusing because it does not make any difference in the final function of the repository involved. Even if the digital environment creates convergency the distinction should be defined at a logical level to ensure a clear framework of the responsibilities involved and the policies to develop
Q7 needs an option that says something like I keep all versions until final publication.... What versions an author keeps seems to me to be entirely up to the author's own discretion. I personally keep a complete version trail but I'd resent being told I HAD to do that - I do it for my own reasons. It is only the final published version that should be made available to the community. To make lots of different versions available widely will only ultimately lead to confusion and chaos, claim and counter-claim about who said what and when, and who has priority, and ... It would be madness.
Universities/institutions should provide access to backup capacity. In that respect they would be responsible for secure storage
University should take responsibility for secure storage of the discussion papers released in their series
Working papers that are refereed to in published papers need to be stored permanently as well.
Keep it simple ;-)
The word "responsibility" is a bit odd. It's great that "subject repositories" have large collections of papers, but is it their responsibility to do so? I think not.
often, some online free available repositories should keep draft versions of the paper (circulated to colleagues or peers for feedback)
I do not understand the term 'subject repositories'. What is your definition of 'responsibility' (legal, routine procedure,...)?
No
Yes, create a universal repository place on the Internet for all versions of the papers. Each user would have a secured personal page to store his/her values.
I think it is important to avoid disseminating working papers in the wrong way. Before they are published with a recognised journal or working paper series they should only be made available directly by the researcher among known peers. I do not agree with the tradition that schools try to make you publish working papers formally because it is sometimes premature and also it may affect the peer reviewing process.
Electronic Paper Archives are very helpful already (SSRN, RePec)
The questions are a bit misleading, since they don't give the option of keeping the Working Paper version of a paper (released under some series) and the final published version. The working paper version does not necessarily have a one-to-one correspondence with the categories above.
No. Thanks for doing this!
answer to question 13: responsibility for draft version only if published as working paper/placed in repository by authors
We have to devise a very simple method to publish a paper. I continue wondering why I do nto have the PDF versions of the published versions. I guess I am so tired after checking the proofs that I just want to see the paper published and move on.
Universities/institutions have the responsibility for secure storage of working paper versions
I think that the issue of confidentiality (regarding the use of sensible data, for example) could hardly be addressed if storage of initial material were outside the authors.
no
Universities and libraries should take responsibility not only for secure storage by broad ACCESSIBILITY of academic papers in the lines of the Open Archives Initiative
University computer centre should ensure adequate backup, as ours does
checks in 11 and 13 mean either/or
The published version of a paper should be the definitive outcome of a research project. Any working papers at earlier stages should be the responsibility of the authors alone
No
I don't like to have old versions of my papers circulating on the net so I'd like to have a centralized place where I could take down an old version and replace it with a new one. SSRN's system is pretty good.
I think it's a good idea to archive material that a less-than-perfectly-attentive referee wanted stricken. That material might be useful in later work on a similar topic.
Universities should also keep permanent copies of working papers (in digital forma).
I take "Version submitted to a journal for peer review" as the working paper version, whereas "Draft version circulated to colleagues or peers for feedback before submitting to a journal" is taken as the version presented in seminars and conferences
Some papers don't come out in refereed journals - then working paper versions should be held.
pre-publication versions, in particular working paper sand/or its provider, ought to indicate where the paper was eventually published
Not sure of the definition of "subject repository"
It would not make sense to story all versions. Universities, Publishers and repositories should store the published versions only.
There may be copyright implications. Copyright should remain with authors; it should not have to be assigned either to university employers (academic freedom & freedom of employment issues) or publishers (sponging off publicly funded universities), however both employers and publishers should be granted full licenses to use the relevant material as they wish.
data storage/online supplementary data/ supporting material need to be kept too
I assume that the "version submitted to journal for peer review" is also published in a working paper series by the university.
Books + book chapters should also be stored and made available electronically.
no
In the item 13, If we are talking about the institutional repositories they should keep the final version submitted to a journal for peer review and the version produced by publisher. If the repositories is one such eprints that you can make and save colleagues and co-authors comments, them you should keep all the draft version and also the version produced by publisher.
It is also preferable to keep working paper series online so that all papers (also older ones) are accessible for all interested parties
No, thanks
What's a subject repository?
Most problematic to me are appendices referred to, but not included in, the publisher's final version. I try to ensure they are included in a working paper version (preferably available online).
no.
In my opinion, it is mainly the responsibility of the authors, provided that the university supplies means to make back-ups etc.
My views expressed in the above answers, that I have a responsibility for keeping versions from every stage whilst others should only keep final published versions, is based on my view that the changes made (in particular the cuts made from initial to final version) may be relevant for other papers within an ongoing project - and are therefore relevant to my own work but not to anyone else.
I believe it is important to keep the initial version submitted to a journal for peer review to avoid potential claims of plagiarism.
Institutions should all have WP series with research output. Personal web pages should have links to WP series and to publications, and (if different) versions submitted to journal for peer review.
Universities should offer institutional repositories (which would secure long term storage) with open access and train researches to regularly deposit their material.
University libraries have traditionally stored the paper version of an article, so long as the institution subscribed. The situation has changed now, and I think there's a strong case for arguing that the author's institution should keep a final version of all output, regardless of whether or not a subscription is active.
The final version produced by an author is often in a strange format to suit reviewers, and not in a nicely set up format for easy reading. Not always the ideal thing to share on a repository. I would prefer to put in the published version, but this often violates copyright.
I would not be happy with unfinished versions being publicly accessible as they may not reflect the quality of my work and sections edited out may form the basis of further articles. However if they were securely stored just for me to access that would be different