Bibliometric research methods
Faculty Brown Bag, IUPUI
Cassidy R. Sugimoto

Overview
  Vocabulary
  Citation analysis
  Citation indices
  Bibliometric laws
  Impact factor
  Applications

Vocabulary
  Scholarly communications: formal and informal
  Scientometrics: scientific communication
  Informetrics: thinking beyond scholarly “texts”
  Webometrics: the web
  Bibliometrics: application of statistical and mathematical methods (formal channels)

Citation analysis
  Citing document: A
  Cited document: B
  A references B
  B is cited by A
  Why do people cite?
  Why are some articles not cited?
  What does a citation mean?

Who’s on first?
  Embedded citation index from 'En mishpat: Babylonian Talmud (1546) (Weinberg, 1997)
  Shepard’s Citation Index (1873) (Shapiro, 1992)
  Institute for Scientific Information (ISI)
  Scopus
  Google Scholar

Comparison
  Distribution of unique and overlapping citations in Scopus and Web of Science (n=8,549)

  Source                    Citations   Share of n
  Scopus (total)            7,333       86%
  Web of Science (total)    6,108       71%
  Scopus only               2,441       29%
  Overlap                   4,892       57%
  Web of Science only       1,216       14%

Are you a citation index?

Bibliometric research OR “Why I love good indexes”

Citation analysis: methods
  Not just articles…
  Variable: PRODUCERS
  Variable: ARTIFACTS
  Variable: CONCEPTS
  Hybrid approaches
    Chaomei Chen: http://www.pages.drexel.edu/~cc345/citespace/figures/terrorism1990-2003-300dpi.png

h-index
  Hirsch (2005): A scientist has index h if h of [his/her] Np papers have at least h citations each, and the other (Np − h) papers have at most h citations each.
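Hirsch's definition of the h-index can be turned into a short calculation. The sketch below is one minimal way to do it, not taken from the slides: it sorts citation counts in descending order and finds the last rank at which the paper still has at least as many citations as its rank. The function name `h_index` and the sample citation counts are hypothetical, chosen only for illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            # Counts only decrease from here, so no larger h is possible.
            break
    return h


if __name__ == "__main__":
    # Hypothetical citation counts for one author's six papers.
    example = [10, 8, 5, 4, 3, 0]
    print(h_index(example))  # prints 4
```

With these sample counts, four papers have at least four citations each and the remaining two have at most four, so h = 4.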
Bibliometric laws

Lotka’s Law (1926)
  The number (of authors) making n contributions is about 1/n² of those making one; and the proportion of all contributors that make a single contribution is about 60 percent (60,15,7…6>10)
  Not statistically exact
  May be changing with the current model of scholarship

Bradford’s law (1934)
  Journals in a field can be divided into three parts:
  1) Core: relatively few journals producing 1/3 of all articles
  2) Zone 2: same # of articles, but a greater # of journals
  3) Zone 3: same # of articles, but a still greater # of journals
  The number of journals in the first zone after the core is a constant n times the number in the core, and in the second zone it is n² times that number: 1:n:n²
  Not statistically exact
  General power law distribution (akin to Pareto’s law in economics)

Zipf’s Law (1935)
  Listing the words occurring within a text in order of decreasing frequency, the rank of a word on that list multiplied by its frequency will equal a constant.
  The equation for this relationship is: r × f = k, where r is the rank of the word, f is the frequency, and k is the constant
  James Joyce's Ulysses:
    10th most frequent: 2,653 times
    100th most frequent: 265 times
    200th most frequent: 133 times
    The rank of the word multiplied by the frequency of the word equals a constant that is approximately 26,500
  Not statistically exact
  General power law probability distribution

Other power law probability distributions
  Pareto’s law (economics)
    80-20 rule
    Law of the vital few
    Principle of factor sparsity
  PageRank (Google)
  The Long Tail (markets)

Journal impact factors

As a research method…
  Reliability?
  Validity?
  Limitations?
  Applications?
    Finding and use
    Collection development
    Reference services
    Collection evaluation
    Use studies
    Information retrieval algorithms
    Diffusion of ideas
    Domain areas and interdisciplinarity
    Mapping science

Writing your paper…
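For anyone who wants to try the three bibliometric laws on their own data, the arithmetic behind them can be sketched in a few lines. This is an illustrative sketch only, under stated assumptions: the function names are hypothetical, the Zipf figures are the Ulysses counts quoted in the slides, the 60 percent single-contribution share comes from the Lotka slide, and the Bradford inputs (5 core journals, n = 3) are made-up values.

```python
def zipf_constants(rank_freq):
    """Return r * f for each (rank, frequency) pair; Zipf's Law says these are roughly equal."""
    return [rank * freq for rank, freq in rank_freq]


def lotka_share(n, single_share=0.60):
    """Approximate share of authors making n contributions: single_share / n**2."""
    return single_share / n ** 2


def bradford_zone_sizes(core_journals, n):
    """Journal counts in the core, Zone 2, and Zone 3 under the 1 : n : n**2 ratio."""
    return [core_journals, core_journals * n, core_journals * n ** 2]


if __name__ == "__main__":
    # Rank and frequency figures for Ulysses as quoted in the slides.
    ulysses = [(10, 2653), (100, 265), (200, 133)]
    print(zipf_constants(ulysses))  # prints [26530, 26500, 26600]

    # Shares of authors with 1, 2, and 3 contributions, assuming 60% make one.
    print([round(lotka_share(n), 3) for n in (1, 2, 3)])  # prints [0.6, 0.15, 0.067]

    # Hypothetical field: 5 core journals and a Bradford multiplier of 3.
    print(bradford_zone_sizes(5, 3))  # prints [5, 15, 45]
```

The three Zipf products all land near 26,500, and the Lotka shares reproduce the 60, 15, 7 sequence on the slide. None of the laws is statistically exact, so real data should be expected to deviate from these idealized values.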