A central issue in evaluative bibliometrics is the characterization of the citation distribution of papers in the scientific literature. Here, we perform a large-scale empirical analysis of journals from every field in Thomson Reuters' Web of Science database. We find that only 30 of the 2,184 journals have citation distributions that are inconsistent with a discrete lognormal distribution at the rejection threshold that controls the False Discovery Rate at 0.05. We find that large, multidisciplinary journals are over-represented in this set of 30 journals, leading us to conclude that, within a discipline, citation distributions are lognormal. Our results strongly suggest that the discrete lognormal distribution is a globally accurate model for the distribution of the "eventual impact" of scientific papers published in single-discipline journals in a single year sufficiently removed from the present date.
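The abstract does not name its multiple-testing procedure; the standard way to control the False Discovery Rate at a fixed level is the Benjamini-Hochberg step-up rule, sketched below over hypothetical per-journal goodness-of-fit p-values (the p-values are invented for illustration).

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up rule: reject the null hypothesis
    (here, lognormality of a journal's citation distribution) for the
    k smallest p-values, where k is the largest rank satisfying
    p_(k) <= (k / m) * alpha."""
    m = len(p_values)
    # Indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k:
            rejected[idx] = True
    return rejected
```

With five hypothetical journals and p-values [0.001, 0.008, 0.039, 0.2, 0.9], only the first two are flagged as inconsistent with the lognormal model, even though 0.039 would pass an uncorrected test at 0.05.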
The explosion of disaster health information results in information overload among response professionals. The objective of this project was to determine the feasibility of applying semantic natural language processing (NLP) technology to address this overload. The project characterizes concepts and relationships commonly used in disaster health-related documents on influenza pandemics as the basis for adapting an existing semantic summarizer to the domain. Methods include human review and semantic NLP analysis of a set of relevant documents, followed by a pilot test in which two information specialists used the adapted application for a realistic information-seeking task. According to the results, the ontology of influenza epidemics management can be described with a manageable number of semantic relationships that involve concepts from a limited number of semantic types. Test users demonstrated several ways of engaging with the application to obtain useful information. This suggests that existing semantic NLP algorithms can be adapted to support information summarization and visualization for influenza epidemics and other disaster health areas. However, additional research is needed in the areas of terminology development (as many relevant relationships and terms are not part of existing standardized vocabularies), NLP, and user interface design.
Automatic document categorization is an important research problem in Information Science and Natural Language Processing. Many applications, including Word Sense Disambiguation and Information Retrieval in large collections, can benefit from such categorization. This paper focuses on automatic categorization of documents from the biomedical literature into broad discipline-based categories. Two different systems are described and contrasted: CISMeF, which uses rules based on human indexing of the documents by the Medical Subject Headings® (MeSH®) controlled vocabulary in order to assign metaterms (MTs), and Journal Descriptor Indexing (JDI), based on human categorization of about 4,000 journals and statistical associations between journal descriptors (JDs) and textwords in the documents. We evaluate and compare the performance of these systems against a gold standard of humanly assigned categories for one hundred MEDLINE documents, using six measures selected from trec_eval. The results show that performance is comparable for five of the measures, and JDI is superior for one. We conclude that these results favor JDI, given the significantly greater intellectual overhead involved in human indexing and in maintaining a rule base for mapping MeSH terms to MTs. We also note a JDI variant that associates JDs with MeSH indexing rather than textwords; it may be worthwhile to investigate whether this statistical JDI method and the rule-based CISMeF are complementary, and whether they might be combined and evaluated together.
An experiment was performed at the National Library of Medicine® (NLM®) in word sense disambiguation (WSD) using the Journal Descriptor Indexing (JDI) methodology. The motivation is the need to solve the ambiguity problem confronting NLM's MetaMap system, which maps free text to terms corresponding to concepts in NLM's Unified Medical Language System® (UMLS®) Metathesaurus®. If the text maps to more than one Metathesaurus concept at the same high confidence score, MetaMap has no way of knowing which concept is the correct mapping. We describe the JDI methodology, which is ultimately based on statistical associations between words in a training set of MEDLINE® citations and a small set of journal descriptors (assigned by humans to journals per se) assumed to be inherited by the citations. JDI is the basis for selecting the best meaning that is correlated to UMLS semantic types (STs) assigned to ambiguous concepts in the Metathesaurus. For example, the ambiguity "transport" has two meanings: "Biological Transport" assigned the ST Cell Function and "Patient transport" assigned the ST Health Care Activity. A JDI-based methodology can analyze text containing "transport" and determine which ST receives a higher score for that text, which then returns the associated meaning, presumed to apply to the ambiguity itself. We then present an experiment in which a baseline disambiguation method was compared to four versions of JDI in disambiguating 45 ambiguous strings from NLM's WSD Test Collection. Overall average precision for the highest-scoring JDI version was 0.7873 compared to 0.2492 for the baseline method, and average precision for individual ambiguities was greater than 0.90 for 23 of them (51%), greater than 0.85 for 24 (53%), and greater than 0.65 for 35 (79%). On the basis of these results, we hope to improve performance of JDI and test its use in applications.
One of the most significant recent advances in health information systems has been the shift from paper to electronic documents. While research on automatic text and image processing has taken separate paths, there is a growing need for joint efforts, particularly for electronic health records and biomedical literature databases. This work aims to compare text-based versus image-based access to multimodal medical documents using state-of-the-art methods of processing text and image components. A collection of 180 medical documents, each containing an image accompanied by a short text describing it, was divided into training and test sets. Content-based image analysis and natural language processing techniques were applied individually and in combination for multimodal document analysis. The evaluation consisted of an indexing task and a retrieval task based on the "gold standard" codes manually assigned to corpus documents. The performance of text-based and image-based access, as well as of combined document features, was compared. Image analysis proved more adequate for both the indexing and the retrieval of the images. In the indexing task, multimodal analysis outperformed both independent image and text analysis. This experiment shows that text describing images can be usefully analyzed in the framework of a hybrid text/image retrieval system.
We describe the use of a domain-independent methodology to extend a natural language processing (NLP) application, SemRep (Rindflesch, Fiszman, & Libbus, 2005), based on the knowledge sources afforded by the Unified Medical Language System (UMLS®) (Humphreys, Lindberg, Schoolman, & Barnett, 1998), to support the area of health promotion within the public health domain. Public health professionals require good information about successful health promotion policies and programs that might be considered for application within their own communities. Our effort seeks to improve access to relevant information for the public health profession and to help those in the field remain an information-savvy workforce. NLP and semantic techniques hold promise for helping public health professionals navigate the growing ocean of information: they can organize and structure this knowledge into a focused public health framework, paired with a user-friendly visualization application, to summarize the results of PubMed searches in this field.
Many information portals are adding social features with hopes of enhancing the overall user experience. Invitations to join and welcome pages that highlight these social features are expected to encourage use and participation. While this approach is widespread and seems plausible, the effect of providing and highlighting social features remains to be tested. We studied the effects of emphasizing social features on users' response to invitations, their decisions to join, their willingness to provide profile information, and their engagement with the portal's social features. The results of a quasi-experiment found no significant effect of social emphasis in invitations on receivers' responsiveness. However, users receiving invitations highlighting social benefits were less likely to join the portal and provide profile information. Social emphasis in the initial welcome page for the site also was found to have a significant effect on whether individuals joined the portal, how much profile information they provided and shared, and how much they engaged with social features on the site. Unexpectedly, users who were welcomed in a social manner were less likely to join and provided less profile information; they also were less likely to engage with social features of the portal. This suggests that even in online contexts where social activity is an increasingly common feature, highlighting the presence of social features may not always be the optimal presentation strategy.
Many recent studies on MEDLINE-based information seeking have shed light on scientists' behaviors and associated tool innovations that may improve efficiency and effectiveness. Few if any studies, however, examine scientists' problem-solving uses of PubMed in actual contexts of work and corresponding needs for better tool support. Addressing this gap, we conducted a field study of novice scientists (14 upper level undergraduate majors in molecular biology) as they engaged in a problem solving activity with PubMed in a laboratory setting. Findings reveal many common stages and patterns of information seeking across users as well as variations, especially variations in cognitive search styles. Based on findings, we suggest tool improvements that both confirm and qualify many results found in other recent studies. Our findings highlight the need to use results from context-rich studies to inform decisions in tool design about when to offer improved features to users.
Using the Arts & Humanities Citation Index (A&HCI) 2008, we apply mapping techniques previously developed for mapping journal structures in the Science and Social Science Citation Indices. Citation relations among the 110,718 records were aggregated at the level of the 1,157 journals specific to the A&HCI, and we examine whether a cognitive structure can be reconstructed and visualized from these journal structures. Both cosine-normalization (bottom up) and factor analysis (top down) suggest a division into approximately twelve subsets. The relations among these subsets are explored using various visualization techniques. However, we were not able to retrieve this structure using the ISI Subject Categories, including the 25 categories which are specific to the A&HCI. We discuss options for validation, such as comparison against the categories of the Humanities Indicators of the American Academy of Arts and Sciences and the panel structure of the European Reference Index for the Humanities (ERIH), and compare our results with the curriculum organization of the Humanities Section of the College of Letters and Sciences of UCLA as an example of institutional organization.
This paper challenges recent research (Evans, 2008) reporting that the concentration of cited scientific literature increases with the online availability of articles and journals. Using Thomson Reuters' Web of Science, the present paper analyses changes in the concentration of citations received (two- and five-year citation windows) by papers published between 1900 and 2005. Three measures of concentration are used: the percentage of papers that received at least one citation (cited papers); the percentage of papers needed to account for 20, 50 and 80 percent of the citations; and, the Herfindahl-Hirschman index. These measures are used for four broad disciplines: natural sciences and engineering, medical fields, social sciences, and the humanities. All these measures converge and show that, contrary to what was reported by Evans, the dispersion of citations is actually increasing.
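The three concentration measures can be computed directly from a list of per-paper citation counts; the sketch below is a minimal illustration on invented data (of the paper's 20-, 50-, and 80-percent thresholds, only the 80-percent case is shown).

```python
def concentration_measures(citations):
    """Three concentration measures for a set of papers:
    (1) share of papers cited at least once,
    (2) share of most-cited papers needed to account for 80% of citations,
    (3) Herfindahl-Hirschman index of the citation shares."""
    n, total = len(citations), sum(citations)
    cited_share = sum(1 for c in citations if c > 0) / n
    # Count papers, most cited first, until 80% of citations are covered.
    running, top = 0, 0
    for c in sorted(citations, reverse=True):
        if running >= 0.8 * total:
            break
        running += c
        top += 1
    share_for_80 = top / n
    # HHI: sum of squared citation shares; higher means more concentrated.
    hhi = sum((c / total) ** 2 for c in citations)
    return cited_share, share_for_80, hhi
```

Rising dispersion, as reported in this study, shows up as a higher cited share, a higher share of papers needed for 80% of citations, and a lower HHI.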
In a recent presentation at the 17th International Conference on Science and Technology Indicators, Schneider (2012) criticised the proposal of Bornmann, de Moya Anegon, and Leydesdorff (2012) and Leydesdorff and Bornmann (2012) to use statistical tests in order to evaluate research assessments and university rankings. We agree with Schneider's proposal to add statistical power analysis and effect size measures to research evaluations, but disagree that these procedures should replace significance testing. Accordingly, effect size measures were added to the Excel sheets that we have made available online for testing performance differences between institutions in the Leiden Ranking and the SCImago Institutions Ranking.
Hirsch has introduced the h-index to quantify an individual's scientific research output by the largest number h of a scientist's papers that received at least h citations. In order to take into account the highly skewed frequency distribution of citations, Egghe proposed the g-index as an improvement of the h-index. I have worked out 26 practical cases of physicists from the Institute of Physics at Chemnitz University of Technology and compare the h and g values. It is demonstrated that the g-index discriminates better between different citation patterns. This can also be achieved by evaluating Jin's A-index, which reflects the average number of citations in the h-core, and interpreting it in conjunction with the h-index. h and A can be combined into the R-index to measure the h-core's citation intensity. I have also determined the A and R values for the 26 data sets. For a better comparison, I utilize interpolated indices. The correlations between the various indices as well as with the total number of papers and the highest citation counts are discussed. The largest Pearson correlation coefficient is found between g and R. Although the correlation between g and h is relatively strong, the arrangement of the data set is significantly different, depending on whether they are put into order according to the values of either h or g.
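The four indicators compared in this study can all be computed from a list of per-paper citation counts. A minimal sketch, using the common convention that g does not exceed the number of papers:

```python
import math

def bibliometric_indices(citations):
    """Compute the h-, g-, A-, and R-indices from per-paper citation
    counts, as defined by Hirsch, Egghe, and Jin respectively."""
    ranked = sorted(citations, reverse=True)
    # h: largest h such that the h most cited papers each have >= h citations.
    h = sum(1 for i, c in enumerate(ranked, 1) if c >= i)
    # g: largest g such that the top g papers together have >= g^2 citations.
    g, cum = 0, 0
    for i, c in enumerate(ranked, 1):
        cum += c
        if cum >= i * i:
            g = i
    # A: mean citations in the h-core; R: square root of h-core citations.
    core = sum(ranked[:h])
    a_index = core / h if h else 0.0
    r_index = math.sqrt(core)
    return h, g, a_index, r_index
```

For citation counts [10, 8, 5, 4, 3, 2, 1] this yields h = 4, g = 5, A = 6.75, and R = sqrt(27), about 5.2.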
Co-occurrence matrices, such as co-citation, co-word, and co-link matrices, have been used widely in the information sciences. However, confusion and controversy have hindered the proper statistical analysis of this data. The underlying problem, in our opinion, involved understanding the nature of various types of matrices. This paper discusses the difference between a symmetrical co-citation matrix and an asymmetrical citation matrix as well as the appropriate statistical techniques that can be applied to each of these matrices, respectively. Similarity measures (like the Pearson correlation coefficient or the cosine) should not be applied to the symmetrical co-citation matrix, but can be applied to the asymmetrical citation matrix to derive the proximity matrix. The argument is illustrated with examples. The study then extends the application of co-occurrence matrices to the Web environment where the nature of the available data and thus data collection methods are different from those of traditional databases such as the Science Citation Index. A set of data collected with the Google Scholar search engine is analyzed using both the traditional methods of multivariate analysis and the new visualization software Pajek that is based on social network analysis and graph theory.
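The paper's central recommendation, applying the cosine to the asymmetrical citation matrix rather than to the symmetrical co-citation matrix, can be illustrated with a toy citing-documents-by-cited-units matrix (the data are invented):

```python
import numpy as np

# Rows: citing documents; columns: cited units (toy, invented data).
C = np.array([
    [1, 0, 2],
    [0, 1, 1],
    [3, 0, 1],
    [0, 2, 0],
], dtype=float)

# Cosine similarity between cited units = cosine between the COLUMNS
# of the asymmetrical citation matrix; this yields the proximity matrix.
norms = np.linalg.norm(C, axis=0)
proximity = (C.T @ C) / np.outer(norms, norms)
```

The resulting proximity matrix is symmetric with ones on the diagonal and can then be passed to multivariate analysis or to network software such as Pajek.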
The h-index provides us with nine natural classes which can be written as a matrix of three vectors. The three vectors are: X = (X1, X2, X3), which indicates the distribution of publications over the h-core, the h-tail, and the uncited papers, respectively; Y = (Y1, Y2, Y3), which denotes the citation distribution of the h-core, the h-tail, and the so-called "excess" citations (above the h-threshold), respectively; and Z = (Z1, Z2, Z3) = (Y1-X1, Y2-X2, Y3-X3). The matrix V = (X, Y, Z)^T provides a measure of academic performance in which all nine numbers can be given meanings in different dimensions. The "academic trace" tr(V) of this matrix follows naturally and constitutes a single indicator of total academic achievement, summarizing and weighting the accumulation of publications and citations. This measure can also be used to combine the advantages of the h-index and the Integrated Impact Indicator (I3) into a single number with a meaningful interpretation of the values. We illustrate the use of tr(V) for two journal sets, two universities, and ourselves as two individual authors.
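Since the trace uses only the diagonal elements X1, Y2, and Z3, it can be computed from raw citation counts alone. The sketch below reads "excess" citations as the citations above h received by each h-core paper; that reading is our assumption, as the abstract does not spell it out.

```python
def academic_trace(citations):
    """tr(V) = X1 + Y2 + Z3, where X1 = h (size of the h-core),
    Y2 = citations received by the h-tail, and Z3 = Y3 - X3, with
    Y3 = citations above the h-threshold in the core (assumed reading)
    and X3 = number of uncited papers."""
    ranked = sorted(citations, reverse=True)
    h = sum(1 for i, c in enumerate(ranked, 1) if c >= i)
    y2 = sum(ranked[h:])                         # h-tail citations
    y3 = sum(c - h for c in ranked[:h])          # excess core citations
    x3 = sum(1 for c in citations if c == 0)     # uncited papers
    return h + y2 + (y3 - x3)
```

For counts [10, 8, 5, 4, 3, 2, 1, 0, 0]: h = 4, tail citations = 6, excess citations = 11, uncited papers = 2, so tr(V) = 4 + 6 + 9 = 19.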
We introduce a novel methodology for mapping academic institutions based on their journal publication profiles. We believe that journals in which researchers from academic institutions publish their works can be considered as useful identifiers for representing the relationships between these institutions and establishing comparisons. However, when academic journals are used for research output representation, distinctions must be introduced between them, based on their value as institution descriptors. This leads us to the use of journal weights attached to the institution identifiers. Since a journal in which researchers from a large proportion of institutions published their papers may be a bad indicator of similarity between two academic institutions, it seems reasonable to weight it in accordance with how frequently researchers from different institutions published their papers in this journal. Cluster analysis can then be applied to group the academic institutions, and dendrograms can be provided to illustrate groups of institutions following agglomerative hierarchical clustering. In order to test this methodology, we use a sample of Spanish universities as a case study. We first map the study sample according to an institution's overall research output, then we use it for two scientific fields (Information and Communication Technologies, as well as Medicine and Pharmacology) as a means to demonstrate how our methodology can be applied, not only for analyzing institutions as a whole, but also in different disciplinary contexts.
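A minimal sketch of this pipeline, with an IDF-style journal weight standing in for the weighting scheme (the paper's exact weight function is not given here, so this choice, and the toy data, are illustrative):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Toy institution-by-journal publication counts (rows: institutions).
P = np.array([
    [5, 0, 1, 0],
    [4, 1, 0, 0],
    [0, 0, 6, 2],
    [0, 1, 5, 3],
], dtype=float)

# Down-weight journals in which many institutions publish (IDF-style;
# an assumption, the study's own weighting may differ).
n_inst = P.shape[0]
journal_reach = (P > 0).sum(axis=0)
weights = np.log(1 + n_inst / journal_reach)

# Weighted profiles, cosine distances, agglomerative clustering.
profiles = P * weights
Z = linkage(pdist(profiles, metric="cosine"), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
```

The linkage matrix Z can be passed to scipy.cluster.hierarchy.dendrogram to draw the dendrograms described in the text.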
In this article, we analyze the citations to articles published in 11 biological and medical journals from 2003 to 2007 that employ author-choice open access models. Controlling for known explanatory predictors of citations, only 2 of the 11 journals show positive and significant open access effects. Analyzing all journals together, we report a small but significant increase in article citations of 17%. In addition, there is strong evidence to suggest that the open access advantage is declining by about 7% per year, from 32% in 2004 to 11% in 2007.
This article statistically analyses how the citation impact of articles deposited in the Condensed Matter section of the preprint server ArXiv (hosted by Cornell University), and subsequently published in a scientific journal, compares to that of articles in the same journal that were not deposited in that archive. Its principal aim is to further illustrate and roughly estimate the effect of two factors, 'early view' and 'quality bias', upon differences in citation impact between these two sets of papers, using citation data from Thomson Scientific's Web of Science. It presents estimates for a number of journals in the field of condensed matter physics. In order to discriminate between an 'open access' effect and an early view effect, longitudinal citation data were analysed covering a time period as long as 7 years. Quality bias was measured by calculating ArXiv citation impact differentials at the level of individual authors publishing in a journal, taking co-authorship into account. The analysis provided evidence of a strong quality bias and early view effect. Correcting for these effects, in a sample of 6 condensed matter physics journals studied in detail there is no sign of a general 'open access advantage' for papers deposited in ArXiv. The study does provide evidence that ArXiv accelerates citation, because ArXiv makes papers available earlier, not because it makes them freely available.
In a recent paper entitled "Inconsistencies of Recently Proposed Citation Impact Indicators and how to Avoid Them," Schreiber (2012, at arXiv:1202.3861) proposed (i) a method to assess tied ranks consistently and (ii) fractional attribution to percentile ranks in the case of relatively small samples (e.g., for n < 100). Schreiber's solution to the problem of how to handle tied ranks is convincing, in my opinion (cf. Pudovkin & Garfield, 2009). The fractional attribution, however, is computationally intensive and cannot be done manually for even moderately large batches of documents. Schreiber attributed scores fractionally to the six percentile rank classes used in the Science and Engineering Indicators of the U.S. National Science Board, and thus missed, in my opinion, the point that fractional attribution at the level of hundred percentiles (or, equivalently, quantiles as the continuous random variable) is only a linear, and therefore much less complex, problem. Given the quantile values, the non-linear attribution to the six classes or any other evaluation scheme is then a question of aggregation. A new routine based on these principles (including Schreiber's solution for tied ranks) is made available as software for the assessment of documents retrieved from the Web of Science (at
This article examines the relationship between acquaintanceship and coauthorship patterns in a multi-disciplinary, multi-institutional, geographically distributed research center. Two social networks are constructed and compared: a network of coauthorship, representing how researchers write articles with one another, and a network of acquaintanceship, representing how those researchers know each other on a personal level, based on their responses to an online survey. Statistical analyses of the topology and community structure of these networks point to the importance of small-scale, local, personal networks predicated upon acquaintanceship for accomplishing collaborative work in scientific communities.
Using the CD-ROM version of the Science Citation Index 2010 (N = 3,705 journals), we study the (combined) effects of (i) fractional counting on the impact factor (IF) and (ii) transformation of the skewed citation distributions into a distribution of 100 percentiles and six percentile rank classes (top-1%, top-5%, etc.). Do these approaches lead to field-normalized impact measures for journals? In addition to the two-year IF (IF2), we consider the five-year IF (IF5), the respective numerators of these IFs, and the number of Total Cites, counted both as integers and fractionally. These various indicators are tested against the hypothesis that the classification of journals into 11 broad fields by PatentBoard/National Science Foundation provides statistically significant between-field effects. Using fractional counting, the between-field variance is reduced by 91.7% in the case of IF5, and by 79.2% in the case of IF2. However, the differences in citation counts are not significantly affected by fractional counting. These results accord with previous studies, but the longer citation window of a fractionally counted IF5 can lead to significant improvement in the normalization across fields.
The impact factor of an academic journal for a given year is the number of times the average article published in that journal in the previous two years is cited in that year. From 1994 to 2005, the average impact factor of journals listed by the ISI increased by an average of 2.6 percent per year. This paper documents this growth and explores its causes.
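The definition, and the compounding implied by 2.6 percent annual growth, in a short sketch (the journal figures are hypothetical):

```python
def impact_factor(cites_to_prev_two_years, items_prev_two_years):
    """Two-year impact factor for year Y: citations received in Y by
    items published in Y-1 and Y-2, divided by the number of citable
    items published in those two years."""
    return cites_to_prev_two_years / items_prev_two_years

# Hypothetical journal: 300 citations in year Y to 200 items from the
# previous two years gives an impact factor of 1.5.
toy_if = impact_factor(300, 200)

# A 2.6% average annual rise compounds to roughly a third over the
# eleven yearly steps from 1994 to 2005.
growth_factor = 1.026 ** 11
```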
The bibliometric measure impact factor is a leading indicator of journal influence, and impact factors are routinely used in making decisions ranging from selecting journal subscriptions to allocating research funding to deciding tenure cases. Yet journal impact factors have increased gradually over time, and moreover impact factors vary widely across academic disciplines. Here we quantify inflation over time and differences across fields in impact factor scores and determine the sources of these differences. We find that the average number of citations in reference lists has increased gradually, and this is the predominant factor responsible for the inflation of impact factor scores over time. Field-specific variation in the fraction of citations to literature indexed by Thomson Scientific's Journal Citation Reports is the single greatest contributor to differences among the impact factors of journals in different fields. The growth rate of the scientific literature as a whole, and cross-field differences in net size and growth rate of individual fields, have had very little influence on impact factor inflation or on cross-field differences in impact factor.
The launching of Scopus and Google Scholar, and methodological developments in Social Network Analysis, have made many more indicators for evaluating journals available than the traditional Impact Factor, Cited Half-life, and Immediacy Index of the ISI. In this study, these new indicators are compared with one another and with the older ones. Do the various indicators measure new dimensions of the citation networks, or are they highly correlated with one another? Are they robust and relatively stable over time? Two main dimensions are distinguished, size and impact, which together shape influence. The H-index combines the two dimensions and can also be considered an indicator of reach (like Indegree). PageRank is mainly an indicator of size, but has important interactions with centrality measures. The Scimago Journal Ranking (SJR) indicator provides an alternative to the Journal Impact Factor, but its computation is less straightforward.
In this paper we distinguish between top-performance and lower-performance groups in the analysis of statistical properties of bibliometric characteristics of two large sets of research groups. We find intriguing differences between top-performance and lower-performance groups, but also between the two sets of research groups. These latter differences are particularly interesting, as they may indicate the influence of research management strategies. Lower-performance groups have a larger scale-dependent cumulative advantage than top-performance groups. We also find that, regardless of performance, larger groups have fewer uncited publications. We introduce a simple model in which processes at the micro level lead to the observed phenomena at the macro level. Top-performance groups are, on average, more successful over the entire range of journal impact. We fit our findings into a concept of hierarchically layered networks. In this concept, the network of research groups constitutes a layer one hierarchical step higher than the basic network of publications connected by citations. The cumulative size advantage of citations received by a group resembles preferential attachment in the basic network, in which highly connected nodes (publications) increase their connectivity faster than less connected nodes. In our study, however, it is size that causes the advantage: in general, the larger a group (node in the research group network), the more incoming links it acquires in a non-linear, cumulative way. Moreover, top-performance groups are about an order of magnitude more efficient in creating linkages (i.e., receiving citations) than lower-performance groups.
The ISI Impact Factors suffer from a number of drawbacks, among them the statistics (why should one use the mean and not the median?) and the incomparability among fields of science because of systematic differences in citation behavior among fields. Can these drawbacks be counteracted by counting citation weights fractionally instead of using whole numbers in the numerators? (i) Fractional citation counts are normalized in terms of the citing sources and thus take into account differences in citation behavior among fields of science. (ii) Differences in the resulting distributions can be tested statistically for their significance at different levels of aggregation. (iii) Fractional counting can be generalized to any document set, including journals or groups of journals, and thus the significance of differences among both small and large sets can be tested. A list of fractionally counted Impact Factors for 2008 is available online at http://www.leydesdorff.net/weighted_if/weighted_if.xls. The between-group variance among the thirteen fields of science identified in the U.S. Science and Engineering Indicators is not statistically significant after this normalization. Although citation behavior differs considerably between disciplines, the reflection of these differences in fractionally counted citation distributions could not be used as a reliable instrument for the classification.
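Fractional counting in the sense used here weights each citation by the inverse of the citing paper's reference list length, so a citation from a long-referencing field counts for less. A minimal sketch (the data are invented):

```python
def fractional_citations(cited_id, citing_reference_lists):
    """Fractionally counted citations to one paper: each citation is
    weighted by 1 / (number of references in the citing paper),
    normalizing in terms of the citing sources."""
    return sum(
        1.0 / len(refs)
        for refs in citing_reference_lists
        if cited_id in refs
    )

# Paper "A" cited once from a 2-reference paper and once from a
# 50-reference paper: 1/2 + 1/50 = 0.52 fractional citations,
# versus 2 citations under whole-number counting.
citing = [["A", "B"], ["A"] + ["X%d" % i for i in range(49)]]
score = fractional_citations("A", citing)
```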