EU FP7 research in Open Access Repositories.doc

EU FP7 research in Open Access Repositories

1

Sara Pérez Álvarez, Felipe Pablo Álvarez and Isidro F. Aguillo

{sara.perez.alvarez; felipepablo.alvarez; isidro.aguillo}@cchs.csic.es

Cybermetrics Lab, CCHS-CSIC, Albasanz 26-28, Madrid, 28037 (Spain)

Abstract

Open access repositories are a reliable source of academic items that can be used for testing the capabilities of webometric analysis. This paper deals with the actions needed for extracting web indicators from bibliographic records in open access repositories, provides guidelines to support further webometric studies and presents the results of a preliminary web impact evaluation carried out over a sample of 1386 EU FP7 output papers available from the OpenAIRE database. The European Commission project OpenAIRE aims, among other objectives, to provide impact measures to assess the research performance from repository contents and, especially, of Special Clause 39 (SC39) project participants within EU FP7. Using URL citations, title mentions and copies of titles as the main web impact indicators, this study suggests that, a priori, the implementation of the mandatory clause SC39 to encourage open access to European research may indeed have resulted in a greater and more immediate web visibility of these papers.

Introduction

Webometrics is a quantitative science devoted to the analysis of scholarly communication, like Informetrics, Bibliometrics and Scientometrics. It was introduced in the mid-nineties by a group of researchers including Ingwersen (1997, 1998), Rousseau (1997), Aguillo (1998), Bar-Ilan (1999), Smith (1999) and Thelwall (2002, 2009), among others.

Methodologically, Webometrics is concerned with gathering data and measuring aspects of the Web, such as web sites or web pages, hyperlinks, rich files (documents) or media files, web search engine results, Web 2.0 social networks, etc. The data can be collected directly, using robots or crawlers specially customized for this task, or indirectly, by extracting information from the databases of the large commercial search engines (Google, Bing and others).

The web indicators can be useful for describing both formal and informal academic activities and results, the performance of organizations, institutions, research groups or even individual scientists and scholars. They can be grouped in three main families (Table 1): those describing activity or presence, counting number of pages, documents, files or other items; a second group describes the visibility or impact of such contents, obtaining statistics after applying link or mention analysis; finally, usage analysis is a fairly new group consisting of numbers related to visits and visitors of the websites. A more detailed classification is available in Aguillo (2009).

1 This work is supported by the OpenAIRE project, grant agreement number 246686, under the Seventh Framework Programme of the European Union. The authors thank the Statistical Analysis Unit of the CCHS-CSIC (Spain) for its assistance.

Table 1. Comparative classification of the main webometric and bibliometric indicators.

FAMILY | WEBOMETRICS | BIBLIOMETRICS
Activity | Web pages; Web documents; Web domains; Web contents | Publications; Authors/Affiliations; Disciplines; Evolution/Dynamics
Impact | Link Analysis; Mention Analysis; Web 2.0 | Citation Analysis; Semantic Analysis
Usage | Visits/Visitors; Downloads | Altmetrics; Journal Circulation

As shown in Table 1, a novel and promising approach is to examine the use and citation of articles in new forums: Web 2.0 services (Priem & Hemminger, 2010). Because measurements of these new traces may inform alternatives to traditional citation metrics, they have been dubbed "altmetrics". This is an umbrella term which condenses ideas on how to combine social media with aspects of traditional scholarly practice (Priem et al., 2010). As such, it is properly a subset of Webometrics (Bar-Ilan et al., 2012).

Within this framework, this paper deals with actions needed for extracting web indicators from bibliographic records in open access repositories, provides guidelines to support further webometric studies and presents the results of a preliminary web impact evaluation carried out over a sample of records available in the OpenAIRE repository network.

The European Commission project OpenAIRE aims to deliver an electronic infrastructure and supporting mechanisms for the identification, deposition, access and monitoring of Framework Programme 7 and European Research Council (ERC) funded articles, and to provide impact measures to assess the research performance from repository contents and, especially, of Special Clause 39 (SC39) project participants within EC FP7. SC39 covers scholarly literature across 7 disciplines of projects granted after August 2008. In particular, the participants of projects with SC39 in their contracts shall deposit their scientific publications in institutional or subject-based repositories allowing open access (after an embargo period, if applicable) to project outcomes.

Open access (OA) means free availability of the results of research, but it also brings other advantages such as immediate and global dissemination, increased citations, new metrics, open data, access to publicly funded research, etc. (Giglia, 2010; Swan, 2007; Suber, 2006; Jeffery, 2006; Zhang, 2006). OpenAIRE is working to integrate its information closely with the CORDA database, the master database of all EU-funded research projects. Soon it should be possible to click on a project in CORDIS (the EU's portal for research funding), for example, and to access all the open access papers published by that project (Manola, 2012).

Given the limited duration of the OpenAIRE project (2009-2012), data collection for traditional bibliometrics is highly constrained since impact analysis based on bibliometrics requires an extensive period of time (at least 2-3 years) after research results have been published in order to gain enough insights. As a result, the alternative metrics promising an earlier analysis (usage statistics and webometrics) are also considered in OpenAIRE as indicators for assessing the impact of FP7 publications.

Figure 1 shows the relative positions of the three groups of indicators being developed according to their specific characteristics. In that figure, the concept of quality refers to the fact that citations come from peers recognizing papers already published in refereed journals with a high visibility.

Figure 1: Mapping the indicators according to general characteristics. Source: Aguillo (2011a)
[Diagram labels: Coverage, Latency, Freshness, Quality. Indicator groups: Visits/Downloads, Citations, Mentions/Links.]

Methodology

The original and most widespread approach to web impact analysis is to count hyperlinks to the objects studied. However, although link counts were available from free commercial search engines for over a decade, this search facility no longer exists as of 2012 (Thelwall & Sud, 2011). The loss of this tool requires theoretical and methodological developments in Webometrics (Aguillo, 2012). The role of link analysis might now be assumed by mention analysis, a promising technique that had already been reported by several authors (Aguillo, 2009; Thelwall, 2009). Without abandoning the search engines, the goal now is not to analyze links but terms or phrases, and to evaluate their presence in a quantitative way. In this regard, two possible alternative methods for estimating the online impact of any piece of information on the Web are URL citations and title mentions. A URL citation is the mention of the URL of a web page or web site in another web page, whether accompanied by a hyperlink or not, and a title mention is the inclusion of a title in a web page, with or without a hyperlink (Thelwall, Sud & Wilkinson, 2012).

The data set analyzed here is a collection of 1386 records (9% are SC39 records: 122 titles) available in the OpenAIRE network and extracted on 1 November 2011. Not all the repositories with SC39 records are OpenAIRE compliant, which means that only partial results are available at this stage.

OpenAIRE is based on a technology developed in an earlier project called Driver (www.driverrepository.eu/). It uses the same underlying technology to index Framework Programme 7 (FP7) publications and results. FP7 project participants are encouraged to deposit their papers, reports and conference presentations in their institutional open access repositories. The OpenAIRE engine constantly crawls these repositories to identify and index any publications related to FP7-funded projects (Manola, 2012). It is also linked to CERN's open access repository for 'orphan' publications, those from FP7 participants that do not have access to an institutional repository of their own.

OpenAIRE stores bibliographic metadata (harvested from repositories or claimed by end-users) and related project information (from CORDIS) in a database. Relevant data is then indexed and presented in the OpenAIRE portal (www.openaire.eu). The sample records were obtained by querying the OpenAIRE database, and the results were transformed into a CSV file.

First, the extracted CSV fields are analysed. The aim is to detect bibliographic errors, to evaluate the quality of the metadata in the records and to prepare the different web search strategies. The unit of the webometric analysis is each deposited item. Each item is represented by its title and the (first or corresponding) author of the paper, in order to count title mentions and copies of documents, and by its URL or URLs, in order to count URL citations. Figure 2 shows the basic model for data extraction: parsing the record, cleaning strange characters and preparing the different strategies with the correct syntax for the search engines to be used in the web impact study.

Figure 2. Theoretical diagram of actions needed for extracting web indicators from bibliographic records in an open access repository.
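The flow in Figure 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool used in the study: the column names (title, author, url), the ';' separator for multiple URLs and the two sample rows are all invented for the example.

```python
import csv
import io


def prepare_searches(csv_text):
    """Parse CSV records and prepare the search strings for each record.

    Assumptions of this sketch: columns are named 'title', 'author' and
    'url', and several URLs in one field are separated by ';'.
    """
    searches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Collapse runs of whitespace left over from the export.
        title = " ".join((row.get("title") or "").split())
        author = (row.get("author") or "").strip()
        urls = [u.strip() for u in (row.get("url") or "").split(";") if u.strip()]
        if not title:
            continue  # unexpectedly empty title field: nothing to search for
        searches.append({
            "title_mention": f'"{title}"',
            "title_copy": f'intitle:"{title}"',
            "author": author,
            "url_citations": [f'"{u}"' for u in urls],
        })
    return searches


# Invented sample rows, used only to exercise the function.
sample = (
    "title,author,url\n"
    "Measuring the   web impact of papers,Smith,"
    "http://example.org/1; http://example.org/1.pdf\n"
    ",Jones,http://example.org/2\n"
)
for record in prepare_searches(sample):
    print(record)
```

Records with an empty title are skipped rather than guessed at, which mirrors the decision to treat empty fields as a metadata quality problem.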

As for the sources used for extracting the web indicators, the following have been considered (Table 2):

Table 2. Proposal of sources for building individual web indicators.

SECTIONS AND TOOLS
Public Web: search engines (Google, Bing); specialized (Scholar)
Web 2.0: Web 2.0 tools (Mendeley, Bibsonomy)

WEB INDICATORS
ACTIVITY: Webpages, Documents, Ranks
IMPACT: Links, Mentions, Ranks
USAGE

Public Web entries: YES, YES, YES, YES, YES, PageRank; YES, YES, YES
Web 2.0 entries: YES, YES, (YES), (YES)

Google includes the "link" operator, but this does not allow an easy collection of aggregated data. That is, it provides the total number of links to a webpage, but it does not allow quantifying the links coming from a given source (Orduña-Malea, 2012).

Regarding Web 2.0, we have also studied other tools such as CiteULike, Connotea and Delicious. Additional applications and their possibilities for research assessment are described in Wouters & Costas (2012). We have considered the search results returned by general search engines because they provide all the different mentions at once (Table 3). The strategy used is: "title" site:Web 2.0 domain. Finally, we have chosen Mendeley (mendeley.com) and Bibsonomy (bibsonomy.org). In the case of Connotea and Delicious, a small set of titles from the sample was searched, but no usable results were obtained.

Table 3. Title mentions returned by general search engines in Web 2.0 domains.

Domain | General search engine results (title mentions)
Mendeley | Metadata record. Mentions as related research. Number of times referenced by other documents.
Bibsonomy | Number of times referenced by other documents. Metadata record in each author's list of documents in BibTeX format.
CiteULike | Metadata record. "Posting history".
Connotea | Number of times the document is tagged in a bookmark.
Delicious | Number of times the document is tagged in a bookmark.

Quality of metadata

The main problem detected is the lack of homogeneity among repositories when providing the information:

• Character encoding differs among repositories. In many cases, especially when the language is not English or the discipline uses Greek letters (mathematical or scientific symbols) or other non-Roman characters, the record is full of strange characters. This applies to titles and to authors' names alike.

• In some repositories, fields are unexpectedly empty.

• Some fields have multiple entries, e.g. different URLs.
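Many of the "strange characters" follow a recognizable pattern: UTF-8 text that was decoded as Latin-1 somewhere along the way, so that "é" becomes "Ã©". The sketch below shows one possible way to undo that specific case. It is an assumption that this is the cause in a given record; text damaged in other ways (for example through Windows-1252, or with bytes already lost) is returned unchanged.

```python
def repair_mojibake(text):
    """Undo UTF-8 text that was wrongly decoded as Latin-1.

    If the text cannot be re-encoded as Latin-1, or the resulting bytes
    are not valid UTF-8, the input is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


print(repair_mojibake("example of BÃ©zier and NURBS"))
print(repair_mojibake("moteur et rÃ©vÃ©lateur du travail collectif"))
```

Because correctly encoded accented text fails the round trip and is left alone, the function can be applied to every title without first deciding which ones are damaged.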

Guidelines for webometric analysis

The following criteria and procedures are recommended for performing the analysis:



• Titles of scientific papers are usually long, which reduces the probability of generating noise, so the full title of the record is used (without exceeding the limits of the search engines: no more than 32 words in Google or 150 characters in Bing). The text should be enclosed in quotation marks (strict adjacency operator) for exact matching.

• When the title is short, the first author's last name can be added.

• If there are two versions of the title (original and translated), they can be combined using the OR operator, taking into account the limitations of search engines when more than one Boolean operator is used.

• Regarding titles with non-standard characters, the use of the wildcard operator (*) in Google, Google Scholar and Bing has been studied. Table 4 summarizes the main results obtained when testing the wildcard operator (*) on titles from the sample. These results call into question the effective use of this operator in all three search engines. It is therefore recommended to use the search string with standard characters (or the parts of the title with standard characters, combined by the AND operator) plus the author's last name.
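The guidelines above can be sketched as a small query builder. This is a hedged illustration, not the procedure used in the study: the function names are invented, "standard" is approximated as plain ASCII without LaTeX-style markup, the threshold of 5 words counts all words rather than only relevant ones, and only the Google word limit is applied (the Bing character limit and the OR combination of translated titles are left out).

```python
GOOGLE_MAX_WORDS = 32  # word limit quoted in the guidelines above


def is_standard(word):
    """True if the word is plain ASCII and free of LaTeX-style markup."""
    return word.isascii() and not any(c in "$\\{}" for c in word)


def build_title_query(title, author_last_name="", min_words=5):
    """Build a title-mention query following the guidelines above."""
    words = title.split()[:GOOGLE_MAX_WORDS]
    if all(is_standard(w) for w in words):
        # Clean title: exact phrase, plus the author only if it is short.
        query = '"' + " ".join(words) + '"'
        if len(words) < min_words and author_last_name:
            query += f' AND "{author_last_name}"'
        return query
    # Otherwise keep only the runs of standard words and join them with AND.
    segments, current = [], []
    for word in words:
        if is_standard(word):
            current.append(word)
        elif current:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    parts = [f'"{s}"' for s in segments]
    if author_last_name:
        parts.append(f'"{author_last_name}"')
    return " AND ".join(parts)


print(build_title_query(r"The cosmology of induced $f({\cal R})$ gravity", "Brouzakis"))
# prints: "The cosmology of induced" AND "gravity" AND "Brouzakis"
```

Splitting on the non-standard words avoids the wildcard operator altogether, which is the behaviour the tests with the sample titles argue for.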

Table 4. Use of wildcard operator (*) by Bing, Google Scholar and Google.

BING
Wildcard operator (*): find multiple forms of a word.
Exceptions found (titles from the sample):

1st e.g. Title: An example of high order residual distribution scheme using non Lagrange elements: example of BÃ©zier and NURBS.
• Search query (*): "An example of high order residual distribution scheme using non Lagrange elements: example of B* and NURBS". No results.
• Search query (by incomplete title): "An example of high order residual distribution scheme using non Lagrange elements: example of ". Relevant results.

2nd e.g. Title: La conception et les usages de ressources en ligne comme moteur et rÃ©vÃ©lateur du travail collectif des enseignants.
• Search query (*): "La conception et les usages de ressources en ligne comme moteur et révélat* du travail collectif des enseignants". No results.
• Search query (AND): "La conception et les usages de ressources en ligne comme moteur" AND "du travail collectif des enseignants". Relevant results.

GOOGLE SCHOLAR
Wildcard operator (*): substitute for whole words.
Exceptions found (titles from the sample):

E.g. Title: A colecÃ§Ã£o de estirpes autÃ³ctones de Saccharomyces cerevisiae das principais regiÃµes vitivinÃcolas portuguesas.
• Search query (*): "A * de estirpes * de Saccharomyces cerevisiae das principais * portuguesas". No results.
• However, correct full title: "A colecção de estirpes autóctones de Saccharomyces cerevisiae das principais regiões vitivinícolas portuguesas". Relevant results.
• Proposed search: allintitle:Saccharomyces cerevisiae das principais author:Machado (or: estirpes AND "Saccharomyces cerevisiae das principais" AND portuguesas author:Machado). Relevant results.

GOOGLE
Wildcard operator (*): substitute for whole words.
Exceptions found (titles from the sample):

E.g. Title: The cosmology of induced $f({\cal R})$ gravity.
• Search query (*): "The cosmology of induced * gravity" AND "Brouzakis". 3 filtered results are returned (5 unfiltered), but 2 of them are unrelated.
• However, the search using the same criteria as for Bing and Google Scholar: "The cosmology of induced" AND "gravity" AND "Brouzakis" returns 21 filtered results (70 unfiltered), all relevant.

Once these criteria have been applied to prepare titles and authors, Table 5 presents a summary of the indicators and search strategies used in the analysis:

Table 5. Web indicators and search strategies.

Web indicator: Title mentions
Search strategy: "Title". Exceptions:
• Titles with fewer than 5 relevant words: "title" AND author's last name.
• Titles with non-standard characters: "the search string with standard characters (or parts with standard characters combined by the AND operator)" AND author's last name.
• Search in Mendeley and Bibsonomy: "Title" site:Web 2.0 domain.

Web indicator: Title copies
Search strategy:
• Google: intitle:"x" or allintitle:x.
• Google Scholar: the allintitle and author tags, or the structure intitle:"x" AND author:x. As allintitle returns errors when some symbols are used, intitle is used instead, as in the case of Google.
• Bing: there is no allintitle tag; intitle:"x" can be used instead.
As a result, the search strategy suggested here is:
• Bing and Google: intitle:"x" (intitle:"x") (AND author last name)
• Google Scholar: intitle:"x" (intitle:"x") (author:x)

Web indicator: URL citations
Search strategy: "URL"
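The title-copy and Web 2.0 strategies of Table 5 can be written down as two small helpers. This is a sketch with invented function names and an invented engine label ("scholar"), and it only builds the query text; it does not send anything to a search engine. The sample title is taken from the reference list purely for illustration.

```python
def title_copy_query(engine, title, author_last_name=""):
    """Build an intitle query for counting copies of a title.

    'engine' is a label chosen for this sketch: "scholar" selects the
    Google Scholar author: tag, anything else the Bing/Google form.
    """
    query = f'intitle:"{title}"'
    if not author_last_name:
        return query
    if engine == "scholar":
        return f"{query} author:{author_last_name}"
    return f"{query} AND {author_last_name}"


def site_mention_query(title, domain):
    """Build a title-mention query restricted to one Web 2.0 domain."""
    return f'"{title}" site:{domain}'


print(title_copy_query("google", "The calculation of Web Impact Factors", "Ingwersen"))
print(title_copy_query("scholar", "The calculation of Web Impact Factors", "Ingwersen"))
print(site_mention_query("The calculation of Web Impact Factors", "mendeley.com"))
```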

Results

Some relevant results from the data set are:



• 94% of the sample records are "open". However, 69% link only to the metadata record in the repository instead of to the full-text document.

• Only 26% (16% for the SC39 records) provide a URL to the full text (97% to PDF files; 50% in the case of SC39 records).

• Regarding the names of the PDF files, most of them (95%) are not representative, as they do not refer clearly to the document content (instead, they consist of numbers, title abbreviations combined with authors, other codes, parts, etc.). We consider that the most appropriate name for a PDF file would include explicit semantic content related to the author(s), the publication year and the title.

• There is a lack of homogeneity among repositories in the type and number of URLs to be extracted for this field. 88% of the total records present one unique URL (76%) or two (22%). In the case of SC39 records this percentage is still higher, reaching almost 100%: 79% present 1 URL and 19% present 2.

The results of the webometric analysis are presented below. All data come from the filtered results offered by the search engines.

URL citations

The total number of URLs analysed is 1800, and the total number of URL citations received according to Google is 4807. As most of the records in the sample have a unique URL, and 69% link only to the metadata record in the repository, it is not surprising that the largest number of citations received corresponds to this type of URL (Table 6). The main repositories by number of URL citations received are: 1) the CERN Document Server (http://cdsweb.cern.ch/); 2) French repositories: L'archive ouverte pluridisciplinaire HAL (http://hal.archives-ouvertes.fr), HAL – Inria (http://hal.inria.fr/); 3) University of Twente (http://doc.utwente.nl/); 4) the Orphan Repository (http://openaire.cern.ch/).

Table 6. Distribution of URL citations.

Type of URL | Nº of URLs in the sample | Nº of URL citations in Google | Bibliographic citations (by DOI) in Google Scholar | Percentage over the total citations | Percentage over the total nº of URLs
Metadata records in the main repositories | 999 | 2834 | | 59% | 56%
PDFs | 374 | 921 | | 19% | 21%
PURL (handles) | 266 | 496 | | 10% | 15%
Other URLs (mainly, metadata records in databases such as IEEE Xplore, Science Direct, etc.) | 68 | 305 | | 6% | 4%
DOIs (identifier) | 24 | 197 | 288 | 4% | 1%
Other file formats (not PDF) | 69 | 54 | | 1% | 4%
TOTAL | 1800 | 4807 | | 100% | 100%
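The percentages in Table 6 can be rechecked from the raw counts, and the same counts give a rough average of citations per URL for each type. The sketch below uses only the figures reported in the table; the short labels and the function name are chosen for the example. The average is a descriptive ratio and not a substitute for a statistical test on the per-URL distributions.

```python
# (number of URLs, number of URL citations in Google), as reported in Table 6
TABLE_6 = {
    "Metadata records": (999, 2834),
    "PDFs": (374, 921),
    "PURL (handles)": (266, 496),
    "Other URLs": (68, 305),
    "DOIs": (24, 197),
    "Other file formats": (69, 54),
}


def summarize(table):
    """Return total URLs, total citations and per-type shares and ratios."""
    total_urls = sum(urls for urls, _ in table.values())
    total_cites = sum(cites for _, cites in table.values())
    rows = {}
    for kind, (urls, cites) in table.items():
        rows[kind] = {
            "share_of_citations": 100 * cites / total_cites,
            "share_of_urls": 100 * urls / total_urls,
            "citations_per_url": cites / urls,
        }
    return total_urls, total_cites, rows


total_urls, total_cites, rows = summarize(TABLE_6)
print(total_urls, total_cites)
for kind, r in rows.items():
    print(f"{kind:20s} {r['share_of_citations']:5.1f}% of citations, "
          f"{r['share_of_urls']:5.1f}% of URLs, "
          f"{r['citations_per_url']:.2f} citations per URL")
```

On these counts DOIs average about 8.2 citations per URL against about 2.8 for metadata record URLs, even though DOIs account for only around 1% of the URLs.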

However, the number of citations received varies significantly by type of URL (Kruskal-Wallis test, p < 0.001). In particular, DOIs receive more citations than metadata record URLs (Figure 3). This is also true for SC39 records.

Figure 3. Nº of citations/Type of URLs.

Given the ratio of URL citations to DOIs, this set is worth analysing in more detail. The number of times these 24 titles have been cited by other works was therefore obtained through Google Scholar, giving a total of 288 bibliographic citations. As expected, the most cited publication of this set proved to be the oldest (from 2004). Nevertheless, the next three most cited DOIs are SC39 titles (from 2009 and 2011).

Title mentions

In webometric analysis, self-mentions should always be excluded (Aguillo, 2012) using expressions like "-site:urlrepository". Self-mentions in Google represent only 1% of the results, while in Bing this figure is significantly higher, 47%. Accordingly, the results presented in this study always refer to mentions other than self-mentions. It is also noteworthy that the difference between the number of mentions returned by Google (56414) and by Bing (7793) is considerable: Bing returns 86% fewer results.
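As a small illustration of the two points above, the sketch below appends the exclusion operator to a query and recomputes the gap between the two engines from the reported totals. The helper names and the sample query are invented for the example; the repository host is one of those named in the study.

```python
def exclude_self_mentions(query, repository_host):
    """Append a -site: restriction so that the repository's own pages are dropped."""
    return f"{query} -site:{repository_host}"


def percent_fewer(reference, other):
    """Percentage by which 'other' falls short of 'reference'."""
    return 100 * (reference - other) / reference


print(exclude_self_mentions('"Some title"', "hal.inria.fr"))
# prints: "Some title" -site:hal.inria.fr
print(round(percent_fewer(56414, 7793)))  # Google vs Bing title mentions
# prints: 86
```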

As for the distribution of mentions, and focusing on Google (Bing is set aside because of its smaller number of results), the highest percentage of titles (Figure 4) falls in the range from 31 to 40 mentions (20% of titles).

Figure 4. Title mentions (without self-mentions) in Google.

SC39 titles represent 10% of total mentions. In this case (Figure 5), the highest percentage of results is in the range of 41 to 50 mentions (27% of titles). Figure 5 also shows that 61% of the results are concentrated in the second part of the graph, from 41 mentions onwards, unlike what happened with the whole sample set (Figure 4), where the highest weight lies on the first half of the chart (0 to 40 mentions).

Figure 5. SC39 title mentions (without self-mentions) in Google.

Copies of titles

A total of 8960 copies of titles were found in Google, 10% of which are SC39 titles. The highest percentage of titles (52%) falls in the range between 1 and 5 copies. In the case of the SC39 titles, most have between 6 and 10 copies. On the other hand, it is rare to find titles that have more than 20 copies or none (only 4%), and the same is true of SC39 titles.

Using Google Scholar, 1601 copies were detected (82% fewer than in Google), 9% of which are SC39 titles. 81% of the titles in Google Scholar have only 1 copy, and in no case are there more than 10 copies per title. Again, this also applies to SC39 titles.

Social bookmarking: Mendeley and Bibsonomy

The study of title mentions in the Mendeley and Bibsonomy sites using Google reflects a larger presence in the former than in the latter (Figure 6). For the total sample, 64% of titles are present in Mendeley versus 35% in Bibsonomy. For the SC39 records the figures are 83% versus 27%.

There were a total of 4216 title mentions from Mendeley (11% relate to SC39 titles). In both cases, the highest concentration of mentions is in the range of 1 to 5. Specifically, 45% of the titles are mentioned from 1 to 5 times in Mendeley, while in the case of SC39 titles this figure rises to 61%. No title exceeds 60 mentions in this site (no more than 30 mentions in the case of SC39 titles).

Regarding Bibsonomy, there were a total of 2639 mentions (5% refer to SC39 titles). In both cases, most titles are not mentioned in this social bookmarking service. Of the titles that are mentioned in Bibsonomy, most appear only 1 to 5 times, and those mentioned more than 10 times are rarer still. The same applies to SC39 titles (Figure 6).

Figure 6. Title mentions in Bibsonomy and Mendeley (by Google)

As has been observed, the share of SC39 records is in general similar in all cases, around 9 to 11%. However, this is not true of Bibsonomy, where the share drops to 5%.

Considering the sum of the collected mentions (titles, copies of titles and presence on Mendeley and Bibsonomy) as an overall indicator of web impact, there is a slight statistically significant difference in favor of the SC39 titles (Mann-Whitney test, p <0.02). In other words, titles that meet clause SC39 have a visibility slightly higher than the rest of the titles in the sample.
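For readers unfamiliar with the test, the sketch below shows how a Mann-Whitney U statistic is computed from two groups of per-title totals. It is a plain-Python illustration only: the two small samples are invented, the p-value uses the normal approximation without tie or continuity correction, and it does not reproduce the analysis reported above, which would normally be run with a statistics package.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b, approximate two-sided p-value)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (min(u1, u2) - mean) / sd
    return u1, u2, math.erfc(abs(z) / math.sqrt(2))


# Invented totals (mentions + copies + bookmarks per title), for illustration only.
sc39_titles = [62, 75, 48, 90, 55]
other_titles = [40, 35, 52, 28, 44, 61]
print(mann_whitney_u(sc39_titles, other_titles))
```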

Conclusions

One of the main conclusions to be drawn from this study is that the lack of homogeneity and standardization in the records, in terms of the levels of description and/or the terminology used in certain fields (such as titles, authors or the type and number of URLs), makes webometric analysis difficult, so it has been necessary to establish some recommendations. Furthermore, an analysis of this type on a set of titles requires a specific design of the search strategies, which must be studied in detail to ensure that the results obtained are really representative of each and every record in the sample.

Regarding the results of the present study, it can be concluded that the number of title mentions is greater than the number of URL citations. That is, there are fewer mentions of the "addresses" of the documents than of their titles: 4807 URL citations versus 56414 title mentions. Likewise, we found that the most cited "addresses", proportionately, are those referring to DOIs.

As for the title mentions, the fact that self-mentions do not pose a relevant percentage of the total mentions retrieved by Google (not in the case of Bing), suggests that the sample achieved substantial visibility (taking into account that, a priori, 99% of the total mentions come from external sites). However, it is necessary to develop similar studies to compare and to correlate the results obtained.

In connection with SC39 records, it is interesting to highlight that most of the records that have a URL to a DOI are SC39 and that, even though these publications are recent (mainly 2009-2011), they are in the leading positions in terms of the number of bibliographic citations received. Furthermore, most records in the sample do not exceed 40 mentions per title, but in the case of SC39 titles visibility is proportionally higher, as most have between 41 and 100 mentions per title. In terms of visibility on the selected social bookmarking tools, there is a much more notable presence in Mendeley than in Bibsonomy, especially in the case of SC39 records.

In the search for copies of titles, SC39 titles again showed a greater presence (in Google), with most of them having between 6 and 10 copies. It is rare (both on Google and Google Scholar) to find a title which does not have any copy.

Ideally, the full-text files of the documents should have greater visibility. For that, it is advisable to use as the official URL the one pointing to the full-text document, as Aguillo (2011b) recommended when discussing research priorities in relation to the open access initiatives. It is also recommended that this URL to the full text be, if possible, short, free of strange or complex codes and meaningful in content.

There is growing evidence suggesting that open access increases the citation and impact of research results, as Swan (2010) concluded after analyzing a series of studies devoted specifically to this issue. The data obtained in the present study suggest, a priori, that the implementation of the SC39 mandatory clause to encourage open access to European research may indeed have resulted in a greater and more immediate visibility of these titles. Of course, subsequent studies are needed to confirm this result more firmly.

References

(All URLs have been reviewed in May 2012)

Aguillo, I.F. (1998). STM information on the Web and the development of new Internet R & D databases and indicators. In D. Raitt (Ed.), Proceedings, Online Information 98 (pp. 239-243). London: Learned Information.

Aguillo, I.F. (2012). La necesaria evolución de la cibermetría. Anuario ThinkEPI, 2012, v. 6. Retrieved from http://www.thinkepi.net/la-necesaria-evolucion-de-la-cibermetria

Aguillo, I.F. (2011a). Building web indicators for the EU OA repository. In: Workshop on New Research Lines in Informetrics. IPP-CCHS (CSIC). Madrid, May 16th 2011. Retrieved from http://digital.csic.es/bitstream/10261/40279/1/OpenAIRE%20Webometrics.pdf

Aguillo, I.F. (2011b). Ranking Web de repositorios: Webometrics y el acceso abierto. In: Visibilidad y Acceso a la Producción Científica. Lima (Perú). 22-24 de septiembre de 2011.

Aguillo, I.F. (2009). Measuring the institution's footprint in the web. Library Hi Tech, 27(4): 540-556.

Bar-Ilan, J. (1999). Search engine results over time - a case study on search engine stability. Cybermetrics, 2(1): Paper 1. Retrieved from http://cybermetrics.cindoc.csic.es/articles/v2i1p1.html

Bar-Ilan, J., Haustein, S., Peters, I., Priem, J., Shema, H. & Terliesner, J. (2012). Beyond citations: Scholar's visibility on the social Web (Preprint). Retrieved from http://arxiv.org/ftp/arxiv/papers/1205/1205.5611.pdf

Giglia, E. (2010). Open access to scientific research: where are we and where are we going? European Journal of Physical and Rehabilitation Medicine. Minerva Medica, 461-469. Retrieved from http://eprints.rclis.org/bitstream/10760/14980/1/eur_jnl_med_rehab_3_2010_open_access%5B1%5D.pdf

Ingwersen, P. (1998). The calculation of Web Impact Factors. Journal of Documentation, 54(2), 236-243.

Jeffery, K. (2006). Open access: An introduction. Retrieved from http://www.ercim.org/publication/Ercim_News/enw64/jeffery.html

Manola, N. (2012). Open access: EU project results go public. Retrieved from http://cordis.europa.eu/fetch?CALLER=PRINT_OFFR&SESSION=&ACTION=D&RCN=8519

Orduña-Malea, E. (2012). Fuentes de enlaces web para análisis cibermétricos (2012). Anuario ThinkEPI, 2012 (to be published).

Priem, J., Taraborelli, D., Groth, P. & Neylon, C. (2010). Alt-metrics: A manifesto. Retrieved from http://altmetrics.org/manifesto

Priem, J., & Hemminger, B. H. (2010). Scientometrics 2.0: Toward new metrics of scholarly impact on the social Web. First Monday, 15(7). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2874/2570

Rousseau, R. (1997). Sitations: an exploratory study. Cybermetrics, 1(1), paper 1. Retrieved from http://cybermetrics.cindoc.csic.es/articles/v1i1p1.html

Smith, A. G. (1999). A tale of two Web spaces; comparing sites using Web Impact Factors. Journal of Documentation, 55(5), 577-592.

Suber, P. (2006). Open access overview. Retrieved from http://www.earlham.edu/~peters/fos/overview.htm

Swan, A. (2007). Open Access and the progress of science. American Scientist, 95(3), 198-200.

Swan, A. (2010). The Open Access citation advantage: Studies and results to date. Technical Report, School of Electronics & Computer Science, University of Southampton. Retrieved from http://eprints.ecs.soton.ac.uk/18516/2/Citation_advantage_paper.pdf

Thelwall, M. (2002). An initial exploration of the link relationship between UK university Web sites. ASLIB Proceedings, 54(2), 118-126.

Thelwall, M. (2009). Introduction to webometrics: Quantitative Web research for the social sciences. Synthesis Lectures on Information Concepts, Retrieval, and Services, 116 pp. doi:10.2200/S00176ED1V01Y200903ICR004

Thelwall, M. and Sud, P. (2011). A comparison of methods for collecting web citation data for academic organizations. Journal of the American Society for Information Science and Technology, 62: 1488–1497. doi:10.1002/asi.21571

Thelwall, M., Sud, P., & Wilkinson, D. (2012). Link and co-inlink network diagrams with URL citations or title mentions. Journal of the American Society for Information Science and Technology (in press). Retrieved from http://www.scit.wlv.ac.uk/~cm1993/papers/URCitationsTitleMentionNetworks_preprint.doc

Wouters, P. & Costas, R. (2012). Users, narcissism and control – tracking the impact of scholarly publications in the 21st century. SURFfoundation, Utrecht. Retrieved from http://www.surffoundation.nl/nl/publicaties/Documents/Users%20narcissism%20and%20control.pdf

Zhang, Y. (2006). The Effect of Open Access on Citation Impact: A Comparison Study Based on Web Citation Analysis. Libri, 56(3), 133-199. Retrieved from http://librijournal.org/pdf/2006-3pp145-156.pdf
