
Henry Small
- PhD
- Senior Researcher at SciTech Strategies Inc.
About
97
Publications
28,597
Reads
13,870
Citations
Introduction
Current institution
SciTech Strategies Inc.
Current position
- Senior Researcher
Publications (97)
A comparison of Laudan's method of theory evaluation with a Bayesian approach.
A Bayesian account of the Guillemin and Schally work on TRF, originally presented in social constructivist terms by Latour and Woolgar in the book Laboratory Life (1979).
A naïve Bayes approach to theory confirmation is used to compute the posterior probabilities for a series of four models of DNA considered by James Watson and Francis Crick in the early 1950s using multiple forms of evidence considered relevant at the time. Conditional probabilities for the evidence given each model are estimated from historical so...
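The naïve Bayes calculation described above can be sketched as follows. This is a minimal illustration, not the author's computation: the model names `model_1` to `model_4`, the equal priors, and the conditional probabilities are placeholder values chosen only to show the arithmetic, and the function name `naive_bayes_posteriors` is hypothetical. Under the naïve assumption, the likelihood of all evidence given a model is the product of the individual conditional probabilities.

```python
import math


def naive_bayes_posteriors(priors, likelihoods):
    """Return P(model | all evidence) for each model.

    priors:      dict mapping model name -> prior probability P(model)
    likelihoods: dict mapping model name -> list of P(evidence_i | model)

    Assumes the pieces of evidence are conditionally independent given
    the model (the "naive" assumption), so their probabilities multiply.
    """
    unnormalized = {
        model: priors[model] * math.prod(likelihoods[model])
        for model in priors
    }
    total = sum(unnormalized.values())
    return {model: value / total for model, value in unnormalized.items()}


# Placeholder inputs (NOT historical estimates): four competing models
# with equal priors and two pieces of evidence each.
priors = {"model_1": 0.25, "model_2": 0.25, "model_3": 0.25, "model_4": 0.25}
likelihoods = {
    "model_1": [0.2, 0.1],
    "model_2": [0.5, 0.4],
    "model_3": [0.3, 0.3],
    "model_4": [0.9, 0.8],
}

posteriors = naive_bayes_posteriors(priors, likelihoods)
best_model = max(posteriors, key=posteriors.get)
```

With these placeholder numbers the model whose evidence likelihoods are jointly largest receives the highest posterior; real use would replace them with probabilities estimated from the historical sources.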
The presentation discusses the relationship of Bayesian concepts of confirmation and information theory and entropy, giving an example from the history of continental drift.
In the book Laboratory Life Latour and Woolgar present an account of how scientific “facts” are formed through a process of microsocial interactions among individuals and “inscription devices” in the lab initially described as social construction. The process moves through a series of steps during which the details and nature of the object become m...
The confirmation of scientific theories is approached by combining Bayesian probabilistic methods, in particular Bayesian causal networks, and the analysis of citing sentences for highly cited papers. It is assumed that causes and their effects can be identified by linguistic methods from the citing sentences and that the cause-and-effect pairs can...
In the 1970s, quantitative science studies were being pursued by sociologists, historians, and information scientists. Philosophers were part of this discussion, but their role would diminish as sociology of science asserted itself. An anti-science bias within the sociology of science became evident in the late 1970s, which split the science studie...
Quantitative analysis of the knowledge content of a significant technological innovation is a novel approach to understanding the scientific discovery process. Here we describe such an analysis applied to the invention of recombinant DNA technology in the early 1970s. Two focal papers are selected, i.e., Jackson et al., 1972 and Cohen et al., 1973. A kno...
We report that the rate of hedging in citing sentences for biomedical papers is inversely related to the citations received by the papers as measured by the number of citances in citing papers. Hedging is often regarded as an expression of uncertainty in rhetorical studies of scientific text. Citing sentences, or citances, are retrieved from the Pu...
The top 1000 biomedical papers by number of citations are classified by method, type of method and non-methods by examination of citation contexts. Supervised machine learning is applied to the context data for a training sample of papers which is then used to classify the full list, revealing that words indicating utility are most important for the...
Eugene Garfield’s ideas on citation indexing were gradually shaped over the course of the 1950s by his exposure to the thinking of various individuals such as J. D. Bernal, H. G. Wells, Chauncey Leake, William Adair, and Joshua Lederberg. Two key concepts emerged during this early period which guided his later work: the importance of interdisciplin...
This panel pays tribute to Dr. Eugene Garfield, one of the “fathers” of bibliometrics, former president of ASIS (1999–2000) and the founder of the ISI citation databases. Dr. Garfield passed away on February 25, 2017. In this panel, we will highlight his contributions to information science. The panelists are all well-known researchers who have kno...
No other individual has had a greater influence on the fields of scientometrics, informetrics, and information science generally than Eugene Garfield. Most of his contributions over the decades are found to have had their origins very early in his career. Chemistry and chemical information launched his career and led to his involvement with medical...
A procedure for identifying discoveries in the biomedical sciences is described that makes use of citation context information, or more precisely citing sentences, drawn from the PubMed Central database. The procedure focuses on use of specific terms in the citing sentences and the joint appearance of cited references. After a manual screening proc...
John P. A. Ioannidis and colleagues asked the most highly cited biomedical scientists to score their top-ten papers in six ways.
The identification of emerging topics is of current interest to decision makers in both government and industry. Although many case studies present retrospective analyses of emerging topics, few studies actually nominate emerging topics for consideration by decision makers. We present a novel approach to identifying emerging topics in science and...
This study presents a methodology that can be used to characterize emergent topics within the context of a contemporaneous, global micro-model of the scientific literature. To illustrate its effectiveness, two known emergent nanotechnology topics (graphene and dye-sensitized solar cells) are characterized. We show that the model and methodology are...
Historically, co-citation models have been based only on bibliographic information. Full-text analysis offers the opportunity to significantly improve the quality of the signals upon which these co-citation models are based. In this work we study the effect of reference proximity on the accuracy of co-citation clusters. Using a corpus of 270,521 fu...
We present a novel approach to identifying emerging topics in science and technology. An existing co-citation cluster model is combined with a new method for clustering based on direct citation links. Both methods are run across multiple years of Scopus data, and emergent co-citation threads in a specific year are matched against the direct citatio...
This article reports on a study of indicators and precursors of 'hot science' at the research problem level. 713 research problems in science were judged to be 'hot', 'average' or 'cold' by eight NSF and NIH program officers in two card-sorting exercises. Research problems are individual clusters of documents from a micro-structural co-citation mod...
The structure and evolution of science and technology can be studied at multiple levels. Most such studies explore the developments of fields, disciplines, or specialties. Given the large numbers of articles underlying these analyses, developments appear to be continuous and smooth in most cases. By contrast, analysis of structure and evolution at...
It is proposed that citation contexts, the text surrounding references in scientific papers, be analyzed in terms of an expanded notion of sentiment, defined to include attitudes and dispositions toward the cited work. Maps of science at both the specialty and global levels are used as the basis of this analysis. Citation context samples are taken...
Responding to Ton van Raan's critique of citation theories, this article explores referencing in landmark scientific texts for clues to a viable citation theory. The evolution of the modern bibliographic reference is described, the earliest form being authors commenting on other authors or themselves. Landmark scientific texts by Aristotle, Newton,...
Interdisciplinarity can be manifest in many forms: through collaboration or communication between scientists working in different fields or through the work of individual scientists who employ concepts or methods across disciplines. This latter form of interdisciplinarity is addressed here with the goal of understanding how ideas in different field...
Research fronts represent the most dynamic areas of science and technology and the areas that attract the most scientific interest. We construct a methodology to identify these fronts, and we use quantitative and qualitative methodology to analyze and describe them. Our methodology is able to identify these fronts as they form, with potential use by...
The behavior of co-citation clusters is studied over a wide range of similarity values, and we demonstrate the existence of critical or percolation transitions marked by a sudden expansion of cluster size with a small decrease in similarity which, in most cases, reflects the emergence of a giant component on the overall graph for the dataset. The s...
A particular feature of the impact factor is its sensitivity to field effects. The level of impact factor in a closed field is dependent on the citing behavior in the field (average number of references, immediacy of referencing) and growth rate. If the field exchanges citations with other fields, which is the rule in practice, the impact factor is...
A new approach to the field normalization of the classical journal impact factor is introduced. This approach, called the audience factor, takes into consideration the citing propensity of journals for a given cited journal, specifically, the mean number of references of each citing journal, and fractionally weights the citations from those citing...
A case study of an emerging research area is presented dealing with the creation of organic thin film transistors, a subtopic within the general area called “plastic electronics.” The purpose of this case study is to determine the structural properties of the citation network that may be characteristic of the emergence, development, and application...
We explore an empirical approach to studying the social and political implications of science by gathering scientists’ perceptions of the social impacts of their research. It was found that 78 percent of the scientists who responded to a survey, drawn from a variety of fields, indicated that the research performed in connection with a recent highly cited...
We explore the possibility of using co-citation clusters over three time periods to track the emergence and growth of research areas, and predict their near term change. Data sets are from three overlapping six-year periods: 1996-2001, 1997-2002 and 1998-2003. The methodologies of co-citation clustering, mapping, and string formation are re...
The digitalization of data has provided healthcare providers with a venue to collect, store and retrieve large amounts of documents in databases, data warehouses, and data repositories. However, one of the challenges posed is how to interpret meaningful information from this data. The citation analysis, cluster analysis and discovery of knowledge h...
We explore the possibility of using co-citation clusters over three time periods to track the emergence and growth of research areas, and predict their near term change. Data sets are from three overlapping six-year periods: 1996-2001, 1997-2002 and 1998-2003. The methodologies of co-citation clustering, mapping, and string formation are reviewed, a...
A survey of authors of highly cited papers in 22 fields was undertaken in connection with a new bibliometric resource called Essential Science Indicators (ESI®). Authors were asked to give their opinions on why their papers are highly cited. They generally responded by describing specific internal, technical aspects of their work, relating them to...
In a series of seminal studies Robert K. Merton created a coherent theoretical view of the social system of science that includes the salient features of the formal publication system, thereby providing a theoretical basis for scientometrics and citationology. A fundamental precept of this system is the view of citations as symbolic payment of inte...
Can maps of science tell us anything about paradigms? The author reviews his earlier work on this question, including Kuhn's reaction to it. Kuhn's view of the role of bibliometrics differs substantially from the kinds of reinterpretations of paradigms that information scientists are currently advocating. But these reinterpretations are necessary i...
Despite the similarity of the above title to a movie of recent vintage, I do not want to give the impression that this is material for a Hollywood script about two guys who attempt to drive off a cliff. I undertake this account of my work with Belver Griffith, not so much as history, but more as personal therapy, reflection on a creative and someti...
... this paper is to suggest further indicators relevant for measuring scientific activity, in the hope that this will lead to a better estimate of the condition of science
Science mapping is discussed in the general context of information visualization. Attempts to construct maps of science using citation data are reviewed, focusing on the use of co-citation clusters. New work is reported on a dataset of about 36,000 documents using simplified methods for ordination, and nesting maps hierarchically. An overall map of...
A methodology is presented for creating pathways through the scientific literature following strong co-citation links. A specific path is described starting in economics and ending in astrophysics traversing 331 documents. Special attention is given to where the path crosses disciplinary boundaries and how analogy can be used to model the thought p...
Data visualization techniques have opened up new possibilities for science mapping. To exploit this opportunity new methods are needed to position tens of thousands of documents in a single coordinate space. A general framework is described for achieving this goal involving hierarchical clustering, ordination of clusters, and the merging of ordinat...
Science mapping projects have been revived by the advent of virtual reality software capable of navigating large synthetic three dimensional spaces. Unlike the earlier mapping efforts aimed at creating simple maps at either a global or local level, the focus is now on creating large scale maps displaying many thousands of documents which can be inp...
Describes the overall statistical properties of the citation network in science using a new data representation of the citation links within the ISI (Institute for Scientific Information) database. Longitudinal coupling is introduced, which depends on documents which act both as cited and citing items, and bibliographic and cocitation coupling are...
SCI-Map is a new PC based system for mapping the scientific literature. By selecting a seed item, the user can build a network or cluster of nodes interactively, and can view the structure as it is being built. New nodes are selected for addition to the network by the strength of their links to the items already clustered, and the positions of new...
At ISI we have used a consistent method for clustering the combined Science Citation Index and Social Sciences Citation Index for the last seven years (1983 to 1989). This method involves clustering highly cited documents by single-link clustering and then clustering the resultant clusters, a total of four times. This gives a hierarchical or nested s...
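The single-link step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ISI procedure: it treats single-link clustering at a fixed similarity threshold as finding connected components of the graph whose edges are links at or above that threshold. The function name `single_link_clusters`, the threshold value, and the link data are hypothetical, and the repeated clustering of clusters is not shown.

```python
def single_link_clusters(links, threshold):
    """Group documents into single-link clusters at a similarity threshold.

    links:     iterable of (doc_a, doc_b, similarity) tuples
    threshold: two documents are joined when similarity >= threshold

    Returns a list of sets of document identifiers. Documents whose links
    all fall below the threshold end up in singleton clusters.
    """
    parent = {}

    def find(doc):
        # Union-find lookup with path halving.
        parent.setdefault(doc, doc)
        while parent[doc] != doc:
            parent[doc] = parent[parent[doc]]
            doc = parent[doc]
        return doc

    for doc_a, doc_b, similarity in links:
        root_a, root_b = find(doc_a), find(doc_b)
        if similarity >= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters

    clusters = {}
    for doc in list(parent):
        clusters.setdefault(find(doc), set()).add(doc)
    return list(clusters.values())


# Hypothetical co-citation links with made-up similarity values.
links = [
    ("A", "B", 0.8),
    ("B", "C", 0.6),
    ("C", "D", 0.2),  # below threshold, so it does not join the clusters
    ("D", "E", 0.9),
]
clusters = single_link_clusters(links, threshold=0.5)
```

A chain of strong links (A to B, B to C) is enough to place A and C in the same cluster even without a direct link between them, which is the defining property of single-link clustering.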
A co-citation analysis of research on acquired immune deficiency syndrome is presented, covering the years 1982 to 1987. Through the use of cluster strings and co-citation maps at various levels of aggregation, the development of the field is traced, including major findings and shifts in research emphasis. The implications of this case of a rapidl...
Data on highly cited papers from a recently compiled citation index for physics in the 1920s are explored in relation to the revolutionary developments that are widely regarded as constituting a golden age in physics. It is found that most of the "classic" papers which we recognize today as having brought about the revolution in quantum and wave m...
The specialty of collagen research is tracked over a ten year period, 1970–1979, using the methodology of co-citation cluster strings. Independently obtained annual clusters are linked together over time by the percentage of highly cited documents continuing from year to year. All inter-year links are clustered by single-linkage to form the string...
A method is described for generating reviews or synopses of scientific fields called specialty narratives. The raw data for the narrative are the Science Citation Index and selected texts from the published scientific literature. The review process is modeled as a walk through a co-citation network using a quasi-minimal spanning tree path and a dep...
Previous attempts to map science using the co-citation clustering methodology are reviewed, and their shortcomings analyzed. Two enhancements of the methodology presented in Part I of the paper, fractional citation counting and variable level clustering, are briefly described and a third enhancement, the iterative clustering of clusters, is introduce...
Each year the Institute for Scientific Information (ISI) undertakes an analysis of its database, derived from the Science Citation Index and the Social Science Citation Index. The aim of this analysis is to create "maps of science" that show the topography of the sciences at various levels of aggregation. This technique is still in its infan...
Earlier experiments in the use of co-citations to cluster the Science Citation Index (SCI) database are reviewed. Two proposed improvements in the methodology are introduced: fractional citation counting and variable level clustering with a maximum cluster size limit. Results of an experiment using the 1979 SCI are described comparing the new methods...
A co-citation cluster analysis of a three year (1975–1977) cumulation of the Social Sciences Citation Index is described, and clusters of information science documents contained in this database are identified using a journal subset concentration measure. The internal structure of the information science clusters is analyzed in terms of co-citatio...
The recently developed technique of 'cocitation analysis' is used to examine intellectual developments in the geosciences from 1970 to 1975. The results for 1970 show a cluster of plate tectonics articles that are often cited together. Results for 1973 and 1975 indicate the rise of new clusters of articles associated with new research interests, eg...
Many information scientists are concerned with the operation of document retrieval systems serving scientists in various fields. The scientists served by these systems are often members of what have been called invisible colleges, groups of scientists in frequent communication with one another and involved with highly specialized subject matters. O...
The techniques of co-citation clustering and citation context analysis are combined to concretely define the shared knowledge within a research specialty. The cluster for a large and fast moving biomedical specialty, recombinant-DNA, is presented in terms of the highly cited documents comprising it and their co-citation links. By examining citation...
The technique of co-citation cluster analysis is applied to a special three-year (1972–1974) file of the Social Sciences Citation Index. An algorithm is devised for identifying clusters which belong to a discipline based on the percentage of source documents which appear in a disciplinary journal set. Clusters in three disciplines (economics, sociol...
An interpretation of citation practice in scientific literature is offered which regards citation of a document as an act of symbol usage. By examining the language of the text around the footnote number, the particular idea the citing author is associating with the cited document may be determined. The document is viewed as symbolic of the idea exp...
The classification of journal titles into fields or specialties is a problem of practical importance in library and information science. An algorithm is described which accomplishes such a classification using the single-link clustering technique and a novel application of the method of bibliographic coupling. The novelty consists in the use of two...
The phenomenon of specialization in science is receiving increasing attention as it becomes clear that the 'specialty' is the principal mode of social and cognitive organization in modern science. Recently, there have been some attempts to formulate theories of specialty growth and change. Before sufficient evidence can accumulate to confirm or re...
Rather than trying to reply in kind to some recent, slightly polemical, criticisms of the use of citations, we will discuss the assumptions underlying the use of citations in the study of science. We shall attempt to explicate the principles underlying our work, with certain technical problems, and end with a brief panegyric on research program...
The concept of tri-citation is introduced, as a logical extension of co-citation, and a geometrical model (the circle model) is devised to account for these and all other forms of multiple citation. The model is used to predict distances between documents which can be scaled metrically in two dimensional space. The model is also used to predict obs...
A new form of document coupling called co-citation is defined as the frequency with which two documents are cited together. The co-citation frequency of two scientific papers can be determined by comparing lists of citing documents in the Science Citation Index and counting identical entries. Networks of co-cited papers can be generated for specifi...
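The definition above (the number of documents that cite two given papers together) can be sketched as follows. This is a minimal illustration, not the original procedure: the function names `cocitation_counts` and `cocitation_by_list_comparison` and the sample reference lists are hypothetical. The second function mirrors the list-comparison approach mentioned in the abstract, counting identical entries in two lists of citing documents.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_to_cited):
    """Count how often each pair of documents is cited together.

    citing_to_cited: dict mapping a citing document to the list of
    documents it cites. Returns a Counter keyed by sorted pairs.
    """
    counts = Counter()
    for references in citing_to_cited.values():
        # Each citing document contributes 1 to every pair it cites.
        for pair in combinations(sorted(set(references)), 2):
            counts[pair] += 1
    return counts


def cocitation_by_list_comparison(citers_of_a, citers_of_b):
    """Co-citation frequency of two documents from their citer lists."""
    return len(set(citers_of_a) & set(citers_of_b))


# Hypothetical data: four citing papers and the documents they cite.
citing_to_cited = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
    "P4": ["A"],
}
counts = cocitation_counts(citing_to_cited)

# The same A-B frequency obtained by comparing lists of citing documents.
ab_frequency = cocitation_by_list_comparison(["P1", "P2", "P4"], ["P1", "P2", "P3"])
```

In this sample, documents A and B are cited together by P1 and P2, so both approaches give a co-citation frequency of 2 for that pair.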