Google Scholar for bibliometrics

Worksheet

Time required: around 45 minutes

This worksheet provides detailed worked examples and can be used for introductory-level training.
It is based on the use of Google Scholar with the Publish or Perish software, which is freely available to download.
It covers the use of Google Scholar as a bibliometric assessment tool for individuals and also its use for journal ranking.

Please explore the other MyRI items that can be used with this practical worksheet: other worksheets, online tutorial, posters, datasheets, product profiles and booklet.

Website: http://www.ndlr.ie/myri/
Last edited: 02 May 2011

For your notes:

Part One
Google Scholar for personal bibliometric assessment

Section One
Publication counts for an author

Obtaining an accurate list of an author’s publications is tricky. Various products provide tools to assist with this, but they give very different lists because they are based on different data sources. Publication counts are widely given as a metric for authors but are of limited value: they give no real indication of the use or impact of the research (is it actually read and cited at all?). On their own they also do not indicate how the publication rate compares with the average for the research area, which varies widely across fields.

You will need to have installed the free Publish or Perish software on your PC from http://www.harzing.com/pop.htm and have internet access.
Start the software. All actions are done via the software interface (not your web browser).
Choose the Author Impact tab (in older versions of the software this is in a left panel) and input your author’s name.
It is important to put inverted commas round names in order to keep initials or first names together with surnames, so type “F Convery” or “F J Convery” in the Author’s name box.
Pick the relevant subject areas, unchecking subjects as below.
Click the Lookup button.

(NB: searching for “F Convery” has the same effect as searching for “Convery f”.)

There are no other filtering and refining tools available, and this list probably still includes papers by the wrong author. You therefore have to go through the list and uncheck any articles not by your author. Re-sort your results by clicking on the Author column heading to make this check easier.

Note that Google Scholar often returns more items than products like Scopus and Web of Science. There is no published list of what is indexed, but sources such as repositories will likely return duplicates of the publisher versions of articles. Book publisher catalogues are included, as are many government, NGO and patent sites.

Adjust the column widths so you can see the Publisher column.
Sort your author list by publisher by clicking on the Publisher column heading.
Scroll down through the Publisher column and you will see repositories and academic websites along with journal publisher sites.

Whilst having many versions of a publication makes Google Scholar a good source of citation information, as a source of publication lists it will likely need quite a lot of tidying up and deduplicating of titles. Sort your list by title now by clicking on the Title column heading and you will see many items listed 2 or 3 times for the same title. If you are checking on an applicant for a post, this means that you cannot simply do an author look-up in PoP and assume that a publication count of 213 equals 213 separate research outputs.

Merge duplicates to get a better list of publications. Drag an item to drop it onto another.
The number of publications will reduce but the citations for the merged item will aggregate. Unmerge items by right-clicking on the item and choosing Split Citations from the popup menu.

Explore the Copy> button to the right of the results screen for ways to save the publication list for the author.

We return below to the various citation metrics provided here.

Section Two
Citation counts for individual items & cited reference searching

We now move on to the real core of bibliometrics, citation analysis, starting with citation counts for individual articles by a researcher. In Section One you covered getting a list of an author’s research output in Google Scholar. For the items in such a list under the Author Impact tab, the number of cites to each article or item is shown in the left column of the display, and you can click on the Cites column heading to sort items by number of cites.

You can also use the General Citation Search option in the left panel as a direct way to check a single known reference for citations without having to create an author set first.

1. Click on the General citations tab
2. Type in Kiniry as the author
3. Type “Hands-on look at java mobile agents” in The phrase box
4. Select Title words only
5. Click on Lookup

The record will be returned to you with the number of cites to it found in this database. Note that there are various versions of the paper with different citation counts here, but the principal IEEE published journal article appears at the top of the list.

Section Three
Citation analysis of an author set of publications

A number of metrics can be applied to a set of author publications rather than just getting a citation count for each discrete piece of research. By far the most important is the author h-index and its variants, which are covered here.
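As background for this section, the h-index of a set of publications is the largest number h such that h of the publications have at least h citations each. The short Python sketch below illustrates that calculation. It is only an illustration of the definition, not the way Publish or Perish is implemented, and the citation counts are made-up values rather than data for any real author.

```python
def h_index(citations):
    """Return the largest h such that h items have at least h citations each."""
    # Rank the items from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the item at this rank still has enough citations
        else:
            break  # later items have even fewer citations
    return h


# Hypothetical citation counts for six publications (illustrative only).
example_counts = [25, 8, 5, 3, 3, 1]
print(h_index(example_counts))  # prints 3
```

In this example three publications have at least 3 citations each, but there are not four with at least 4, so the h-index is 3. The single paper with 25 citations does not raise the value, which is the limitation that several of the variants try to address.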
In Section One you learned how to get an accurate list of an author’s publications in Google Scholar using the Publish or Perish software, checking and removing any false hits by scanning the result set.

Repeat that process under the Author Impact tab to obtain a list of publications in Publish or Perish for a selected author, Joseph Kiniry (Joseph R Kiniry, Joe Kiniry, J Kiniry, Joseph Kiniry), who works in computer science, so limit the subject choice to Engineering, Computer Science, Mathematics.
Scan the list and uncheck any papers you think may not be by this author. The metrics recalculate as you do that, but these seem to be all by our author.
Click on the Cites column heading to sort by citation count with the highest first, and scroll back to the top of the list to see some documents with very high citation counts. The author h-index displays above the references.

At the top of the summary results is a very wide range of metrics regarding these papers and this author. Among them are total papers, total cites, and average cites per year and per paper. Also given is the h-index for the author.

A range of variants of the h-index is provided, each trying to correct one or more limitations of the original, such as the following, which you can explore:
o g-index, giving more weight to highly cited articles. Because this author has a few really highly cited papers, this is very high in his case compared to the h-index, which does not give any extra “credit” for such high citation rates
o Zhang’s e-index, which attempts to allow for the overall surplus of citations above the h-index point; again very high for this author
o Hc-index, contemporary h-index, which gives lower weighting to research that was done some time ago, i.e. is the researcher currently active?
o HI-index, individual h-index, which gives higher weighting to cites to single-author publications
o AW-index, age-weighted index, which allows for the fact that more recent publications have had less chance for citation rates to build up

Part Two
Google Scholar for journal ranking

Journal h-index is the only journal ranking tool available with Google Scholar and Publish or Perish.

1. Start the Publish or Perish software
Download and install the free Publish or Perish software from http://www.harzing.com/pop.htm
Start the software. All actions are done via the software interface (not your web browser).

2. Search for the journal title of interest
Click on the Journal Impact tab.
In the Journal Title box, type in the journal title of interest: “Review of Finance”. Enclose title words that must appear together in double quotes.
Limit the subject areas to reduce the number of results from similarly titled journals that you have to filter out. Untick the boxes to the right so that Business, Administration, Finance, Economics; and Social Sciences… are the only subject categories selected.
You should also limit to a certain year range because PoP can only cope with 1,000 articles at a time, and for some journals there will certainly be more than this over a few years. Limit to 3 or 4 years; if over 1,000 results are still returned (a box will pop up to warn you if this happens), narrow the year range further and repeat the search.
Type in 2008 to 2010 in the year ranges.
Check you have settings as below, then click on the Lookup button.
After a short pause you see the documents indexed and cited for Review of Finance for the period 2008-2010. The results may include items from other journals as well, so move on to the next step for essential further checking and filtering of results.

3.
Check and filter out any incorrect journal titles in the result list
Click on the Publications column heading to sort the results by publication, which makes a manual check easier.
Scan the whole list of resulting articles, looking at the Publication column content. Searching for “Review of Finance” finds title words in that order and has therefore retrieved Brazilian Review of Finance and International Review of Finance, amongst others. Unfortunately there is no option to do an exact title search to avoid this problem.
Untick unwanted entries in the left Cites column. You can do this by marking the whole block: click on the first unwanted entry, which will be highlighted in blue, move to the last unwanted entry, then hold down Shift and click so that the whole block is highlighted. Then choose “Uncheck selection” in the right column. Alternatively, you can uncheck each one individually in the left-hand Cites column.

4. View the h-index for the journal and year range
View the h-index for the journal title Review of Finance for the year range 2008-2010 at the top of the display.
The metrics, including the journal h-index, will be recalculated without the unticked entries.
As previously mentioned, the journal h-index focuses on the number of articles from the journal that have a high number of citations. The h-index for this journal title for 2008-2010 is about 22. This means that there are 22 articles from this journal with 22 or more citations each.
There are many other metrics provided for this journal as well that can be explored.

Part Three
Managing and re-running queries

The Multi-Query Center page in Publish or Perish contains a list of recently run queries. It allows you to add further queries, organise queries into folders and re-run queries.

1.
Start the Publish or Perish software
You will need to have installed the free Publish or Perish software on your PC from http://www.harzing.com/pop.htm and have internet access.
Start the software. All actions are done via the software interface (not your web browser).

2. Setting up a query for current and future analysis
Click on the Multi-Query Center tab.
Click on All Queries.
Click on the New Folder icon. A Query Folder Properties box will appear.
In the Folder name box, type in “rite of spring”. Click on OK.
Ensure that the rite of spring folder is highlighted. This means that your search results will automatically populate this folder.
Click on the New Query icon.
In the Query Properties box, type in the title of the item you are interested in, “Russian folk melodies in the rite of spring”, as shown below, ensuring that the Title words only box is ticked.
Click OK, and then when asked “Do you want to perform a lookup for the new query?”, click on Yes.
Repeat the above steps for the following additional titles:
“The rite of spring genesis of a masterpiece”
“Jeux de nombres automated rhythm in the rite of spring”
Highlight each of the titles to view the metrics.
To update the metrics at a future date, mark all the items that you wish to update. Right-click your mouse and then select Lookup.

3. Exporting results for further processing
Highlight the items you want to export.
Right-click your mouse and make your selection from the menu, e.g. Copy statistics for Excel.
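Once a list of titles and citation counts has been exported, it can be processed outside Publish or Perish. The Python sketch below is one possible approach under stated assumptions: the records are written as simple (title, cites) pairs, which is a hypothetical layout and not the actual export format, and all titles and counts are invented. It merges records whose titles differ only in case or spacing, summing their citations in the way described for merging duplicates in Part One, and then computes a g-index as the largest g for which the top g items together have at least g × g citations (capped here at the number of items).

```python
from collections import defaultdict


def normalise(title):
    """Lower-case a title and collapse whitespace so trivial variants match."""
    return " ".join(title.lower().split())


def merge_duplicates(records):
    """Sum citation counts for records whose normalised titles are identical."""
    merged = defaultdict(int)
    for title, cites in records:
        merged[normalise(title)] += cites
    return dict(merged)


def g_index(citations):
    """Largest g such that the top g items together have at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites  # running total of citations for the top `rank` items
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical exported records (title, cites); not real data.
records = [
    ("A Sample Paper", 10),
    ("a sample  paper", 4),  # same title with different case and spacing
    ("Another Study", 6),
    ("Third Item", 1),
]

merged = merge_duplicates(records)
print(len(merged))                # prints 3
print(g_index(merged.values()))   # prints 3
```

Matching on normalised titles only catches the simplest duplicates. Versions with different wording or truncated titles would still need the manual check described in Part One.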