Does Transformative Research Have Early Signs?

[This is part of the panel on Scientometrics being proposed by Stephanie Shipp]

Chaomei Chen
College of Information Science and Technology
Drexel University
3141 Chestnut Street, Philadelphia, PA 19104-2875
Email: chaomei.chen@drexel.edu

Motivations

Research assessment has become a central issue for a growing number of government agencies and private organizations in making decisions and policies. New indicators of research excellence and predictors of impact keep appearing one after another. However, if we look behind the available methods and beyond the horizon decorated by the various types of indicators, we encounter a few questions again and again: What is the nature of creativity in science? Is there a way to recognize great ideas early on? Are there ways to help us choose the right paths? Can we make ourselves more creative?

The general consensus on creative thinking is that we ought to think outside the box and keep an open mind as much as we can. However, a practical question is: What does it take to move from where we are now to the next, somewhat more desirable, place in the vast space of potential discoveries and solutions? Are there intriguing and tangible patterns, generic enough across the diverse collection of wisdom, to get us started and help us move along?

I will first introduce an explanatory and computational theory of scientific discovery. The theory provides an extensible framework, which currently consists of structural and temporal properties as the necessary conditions for potentially significant discoveries. I will illustrate the potential of the theory by demonstrating how it can be used to derive metrics for identifying the potential of transformative research. Finally, I will demonstrate how such metrics can be validated and discuss their implications for assessing and monitoring the forefront of research.

Structural Variations as Early Signs of Transformative Research

Our explanatory theory of discovery suggests that one of the mechanisms for advancing scientific knowledge is to make novel connections between previously disjoint bodies of knowledge. This means that we can measure the novelty of newly proposed connections in terms of the extent to which the new links are positioned between previously disjoint bodies of knowledge. If a new link connects two or more distant fields of study for the first time, then its novelty measure should be high. In contrast, if a newly added link is merely a restatement of an existing link, then its novelty measure should be low. Between the two extremes, a newly proposed link may introduce new interpretations of existing evidence without introducing structural variations as far as the specific knowledge representation is concerned.

From the perspective of foraging for knowledge, the perceived profitability is in line with the high-risk, high-payoff expectation of transformative research; we would give higher scores to novel and unprecedented connections because such conceptualizations are harder to come by. Detecting structural variations in networks allows us to pinpoint the specific links that alter the structure of existing knowledge the most, which is valuable information for additional validation, for example, by consulting with the scientists themselves and other domain experts. In addition, the ability to pinpoint the potential of specific connections makes it possible for analysts to keep track of the evolution of their impact over time, so that one can verify in due course whether scientific ideas identified today as having transformative potential do turn out to be transformative.
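
As a rough illustration of the idea above, the following Python sketch scores a proposed link by how far apart its two endpoints are in the existing network. It is only a sketch under stated assumptions: the scoring rule (0 for a restated link, 1 for a link between disconnected parts, and 1 - 1/d for endpoints d hops apart), the function names, and the toy network are all hypothetical and are not the metrics defined by the theory itself.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_length(adj, source, target):
    """Breadth-first search; return the hop count, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr == target:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None


def link_novelty(adj, u, v):
    """Hypothetical novelty score in [0, 1] for a proposed link (u, v).

    Assumed rule, for illustration only:
      - endpoints not connected at all -> 1.0 (previously disjoint)
      - endpoints already linked       -> 0.0 (restatement)
      - endpoints d hops apart         -> 1 - 1/d
    """
    d = shortest_path_length(adj, u, v)
    if d is None:
        return 1.0
    if d == 0:
        return 0.0  # a link from a node to itself adds nothing
    return 1.0 - 1.0 / d


# Hypothetical toy network: two bodies of knowledge with no links between them.
edges = [("a1", "a2"), ("a2", "a3"), ("b1", "b2"), ("b2", "b3")]
adj = build_adjacency(edges)

for u, v in [("a1", "a2"), ("a1", "a3"), ("a1", "b3")]:
    print(u, v, link_novelty(adj, u, v))
```

In this toy network the restated link a1-a2 scores 0.0, the within-cluster shortcut a1-a3 scores 0.5, and the link a1-b3 between the two disjoint clusters scores 1.0, which matches the ordering described in the text.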

Validation with Citation Networks

We develop a number of generic metrics that can be used to identify the transformative potential of a given research idea, in particular in connection with network representations of contemporary knowledge. Assume that at a given time point t, the knowledge of a topic, a field, or a discipline up to that point, K(t), can be represented by mixtures of various subtopics or by associative networks of conceptual components and their interrelationships. For any scientific ideas, either topics or conceptual components, that emerge after the time point t, i.e. at t + Δt, we measure the extent to which these new ideas depart from K(t), the accumulated knowledge up to t.

In order to assess the extent to which these metrics can capture transformative research, we take a retrospective-predictive approach by focusing on citation counts received by scientific publications that induced strong structural variations in the past. In other words, our hypothesis is that the degree of structural variation introduced by a scientific publication (as a symbol of scientific ideas) is a significant predictor of its citation counts in subsequent years.

The measurement of transformative potential is derived from the explanatory and computational theory of transformative discovery. The central idea in measuring transformative potential is that structural variations provide early signs. According to our theory, ideas that introduce a greater degree of structural variation are more likely to have the potential to transform the knowledge structure than those that alter the existing structure to a lesser extent.

On this panel, I will specifically focus on two metrics derived from this line of reasoning. The synthesis span metric measures the degree of structural variation in terms of the distance between the existing structure and a new structure. The structure can be a network representation or a probabilistic distribution of multiple topics and citation clusters.
In other words, the synthesis span indicates the amount of boundary spanning implied by the research in question. The other metric, structural divergence, measures the overall change between the old and the new structures in terms of the centrality measures of individual entities. This metric assumes a network representation. Intuitively, a high score on this metric identifies contributions that cause a significant shift in the centers of concentration of the existing network. If we were to apply it to a network of world scientific activities, it would track the shift of the world center of scientific activities.

One way to assess the validity of these metrics is to see to what extent they are good predictors of how soon and how well an associated research embodiment is recognized. For scholarly publications, citations are generally regarded as a reasonable indicator of impact, or at least of how much attention peer scientists have paid to the cited publications. The list of suspected predictors of citation has grown longer and longer over the years. Reviews and survey papers are known to attract a large share of citations. Papers written by many co-authors from prestigious universities are suspected to be citation attractors. The number of references cited by a paper is also considered a possible factor. Many models have been proposed, involving many independent variables.

I will discuss our preliminary results, which suggest that the new computational theory of transformative discovery offers a much simpler explanation of why and how scientific papers are cited. The underlying boundary-spanning mechanisms provide a consistent explanation of why review papers tend to be cited more, why papers citing more references tend to be cited more, and why papers with a diverse group of co-authors tend to be cited more. The initial results are very encouraging.
Not only can we summarize the state of the art as often as we wish, but we also gain access to alternative means of identifying the transformative potential of newly emerged ideas, and even of exploring what-if scenarios and other speculations.
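
To make the two metrics discussed above more concrete, here is a minimal Python sketch of how one might compute quantities in their spirit. Everything specific in it is an assumption made for illustration: the choice of Jensen-Shannon divergence as the distance, the use of normalized degree as the centrality measure, the function names, and the toy data. The actual definitions of synthesis span and structural divergence may differ.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    p and q are dicts mapping an item to its probability; missing items
    count as probability 0. The result lies in [0, 1].
    """
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        # Kullback-Leibler divergence from a to the mixture m.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def degree_distribution(edges):
    """Each node's share of all link endpoints (normalized degree).

    Duplicate links are counted once. Assumes at least one link and
    no self-loops.
    """
    unique = {frozenset(edge) for edge in edges}
    degree = {}
    for edge in unique:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    total = sum(degree.values())
    return {node: d / total for node, d in degree.items()}


def structural_divergence(old_edges, new_links):
    """Assumed stand-in: change in the centrality distribution after new links."""
    before = degree_distribution(old_edges)
    after = degree_distribution(list(old_edges) + list(new_links))
    return js_divergence(before, after)


def synthesis_span(existing_topics, paper_topics):
    """Assumed stand-in: distance between two topic distributions."""
    return js_divergence(existing_topics, paper_topics)


# Hypothetical toy data, not taken from any real citation network.
old_edges = [("a1", "a2"), ("a2", "a3"), ("b1", "b2"), ("b2", "b3")]
print("restated link:   ", structural_divergence(old_edges, [("a2", "a1")]))
print("boundary-spanning:", structural_divergence(old_edges, [("a1", "b3")]))

existing = {"topicA": 0.7, "topicB": 0.3}
narrow = {"topicA": 0.8, "topicB": 0.2}
broad = {"topicA": 0.1, "topicB": 0.1, "topicC": 0.8}
print("narrow paper span:", synthesis_span(existing, narrow))
print("broad paper span: ", synthesis_span(existing, broad))
```

Under these assumptions, a link that merely restates an existing one leaves the centrality distribution unchanged and scores zero, while a link between the two clusters produces a positive score. Likewise, a paper whose topic mixture stays close to the existing one yields a smaller span than a paper that draws mostly on a topic absent from the existing structure.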