About
156
Publications
111,020
Reads
11,890
Citations
Introduction
Kevin Boyack has been with SciTech Strategies since July 2007. Previously he worked at Sandia National Labs in areas of combustion, transport processes, socio-economic war gaming, and science mapping. His recent work and current interests include detailed mapping of the structure and dynamics of science and technology, accuracy of maps and classifications, merging of multiple data types and sources, identification and prediction of emerging topics, and development of advanced metrics.
Additional affiliations
September 1985 - March 1990
May 2007 - present
SciTech Strategies Inc
Position
- CEO
Description
- Bibliometrics research
March 1990 - July 2007
Education
September 1985 - April 1990
Publications (156)
We are inviting contributions to a series of three workshops in which we intend to explore sociological foundations of bibliometric indicators.
We evaluated how the gender composition of top-cited authors within different subfields of research has evolved over time. We considered 9,071,122 authors with at least 5 full papers in Scopus as of September 1, 2022. Using a previously validated composite citation indicator, we identified the 2% top-cited authors for each of 174 science subfields...
Using a model of the literature indexed in Scopus, we have increased the accuracy of our ability to predict which of 20,747 research communities would achieve exceptional growth from 32.2 to 39.6 using double exponential smoothing of inertial indicators and by doing predictions in each of 26 fields rather than across the entire model. Each field no...
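The abstract above names double exponential smoothing but is cut off before giving any details. As a rough illustration only, the following sketch implements the textbook form of the technique (Holt's linear-trend method) on a single numeric series. The function name, the initialization from the first two points, and the `alpha`, `beta`, and `horizon` parameters are assumptions for this example; they are not taken from the paper and do not reproduce its inertial indicators or per-field setup.

```python
def double_exponential_smoothing(series, alpha, beta, horizon=1):
    """Forecast `horizon` steps ahead with Holt's linear-trend smoothing.

    alpha smooths the level, beta smooths the trend; both are
    illustrative parameters in (0, 1), not values from the source.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")

    # Assumed initialization: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]

    for value in series[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step forecast.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend

    return level + horizon * trend
```

For a perfectly linear series such as `[1, 2, 3, 4, 5]`, the smoothed level and trend track the line exactly, so the one-step forecast continues it.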
Climate change has been an ongoing topic in nearly all areas of society for many years. A discussion of climate change without referring to scientific results is not imaginable. This is especially the case for policies since action on the macro scale is required to avoid costly consequences for society. In this study, we deal with the question of how r...
The accurate forecasting of exceptional growth in research areas has been an extremely difficult problem to solve. In a previous study we introduced an approach to forecasting which research clusters in a global model of the scientific literature would have an annual growth rate of 8% over a three-year period. In this study we (a) introduc...
Massive scientific productivity accompanied the COVID-19 pandemic. We evaluated the citation impact of COVID-19 publications relative to all scientific work published in 2020 to 2021 and assessed the impact on scientist citation profiles. Using Scopus data until August 1, 2021, COVID-19 items accounted for 4% of papers published, 20% of citations r...
Massive scientific productivity accompanied the COVID-19 pandemic. We evaluated the citation impact of COVID-19 publications relative to all scientific work published in 2020-2021 and assessed the impact on scientist citation profiles. Using Scopus data until August 1, 2021, COVID-19 items accounted for 4% of papers published, 20% of citations rece...
Disagreement is essential to scientific progress but the extent of disagreement in science, its evolution over time, and the fields in which it happens remain poorly understood. Here we report the development of an approach based on cue phrases that can identify instances of disagreement in scientific articles. These instances are sentences in an a...
We examined the extent to which the scientific workforce in different fields was engaged in publishing COVID-19-related papers. According to Scopus (data cut, 1 August 2021), 210 183 COVID-19-related publications included 720 801 unique authors, of which 360 005 authors had published at least five full papers in their career and 23 520 authors were...
Disagreement is essential to scientific progress. However, the extent of disagreement in science, its evolution over time, and the fields in which it happens, remains largely unknown. Leveraging a massive collection of scientific texts, we develop a cue-phrase based approach to identify instances of disagreement citations across more than four mill...
Our work analyzes the artificial intelligence and machine learning (AI/ML) research portfolios of six large research funding organizations from the United States [National Institutes of Health (NIH) and National Science Foundation (NSF)]; Europe [European Commission (EC) and European Research Council (ERC)]; China [National Natural Science Foundati...
Recent concerns about the reproducibility of science have led to several calls for more open and transparent research practices and for the monitoring of potential improvements over time. However, with tens of thousands of new biomedical articles published per week, manually mapping and monitoring changes in transparency is unrealistic. We present...
We examined the extent to which the scientific workforce in different fields was engaged in publishing COVID-19-related papers. According to Scopus (data cut, March 1, 2021), 129,570 COVID-19-related publications included 495,010 unique authors, of which 211,894 authors had published at least 5 full papers in their career and 15,012 authors were at...
Portfolio analysis is a fundamental practice of organizational leadership and is a necessary precursor of strategic planning. Successful application requires a highly detailed model of research options. We have constructed a model, the first of its kind, that accurately characterizes these options for the biomedical literature. The model comprises...
This Formal Comment presents an update to citation databases of top-cited scientists across all scientific fields, including more granular information on diverse indicators.
The prediction of exceptional or surprising growth in research is an issue with deep roots and few practical solutions. In this study, we develop and validate a novel approach to forecasting growth in highly specific research communities. Each research community is represented by a cluster of papers. Multiple indicators were tested, and a composite...
Recent large-scale bibliometric models have largely been based on direct citation, and several recent studies have explored augmenting direct citation with other citation-based or textual characteristics. In this study we compare clustering results from direct citation, extended direct citation, a textual relatedness measure and several citation-te...
We aimed to assess whether Nobel prizes (widely considered the most prestigious award in science) are clustering in work done in a few specific disciplines. We mapped the key Nobel prize-related publication of each laureate awarded the Nobel Prize in Medicine, Physics, and Chemistry (1995–2017). These key papers mapped in only narrow sub-regions of...
The prediction of exceptional or surprising growth in research is an issue with deep roots and few practical solutions. In this study we develop and validate a novel approach to forecasting growth in highly specific research communities. Each research community is represented by a cluster of papers. Multiple indicators were tested, and a composite...
There are many different relatedness measures, based for instance on citation relations or textual similarity, that can be used to cluster scientific publications. We propose a principled methodology for evaluating the accuracy of clustering solutions obtained using these relatedness measures. We formally show that the proposed methodology has an i...
Indicators that could predict the success or failure of 3459 research proposals are identified and evaluated. The sample was highly homogeneous (all proposals were from one medical school and submitted to one funding agency) but heterogeneous within this context (all types of NIH proposals are included). The most important exogenous indicator was w...
Citation metrics are widely used and misused. We have created a publicly available database of 100,000 top scientists that provides standardized information on citations, h-index, coauthorship-adjusted hm-index, citations to papers in different authorship positions, and a composite indicator. Separate data are shown for career-long and single-year...
We report that the rate of hedging in citing sentences for biomedical papers is inversely related to the citations received by the papers as measured by the number of citances in citing papers. Hedging is often regarded as an expression of uncertainty in rhetorical studies of scientific text. Citing sentences, or citances, are retrieved from the Pu...
Currently, there is a growing interest in ensuring the transparency and reproducibility of the published scientific literature. According to a previous evaluation of 441 biomedical journal articles published in 2000–2014, the biomedical literature largely lacked transparency in important dimensions. Here, we surveyed a random sample of 149 biomedi...
To highlight uncertain norms in authorship, John P. A. Ioannidis, Richard Klavans and Kevin W. Boyack identified the most prolific scientists of recent years.
We report characteristics of in-text citations in over five million full text articles from two large databases - the PubMed Central Open Access subset and Elsevier journals - as functions of time, textual progression, and scientific field. The purpose of this study is to understand the characteristics of in-text citations in a detailed way prior t...
Citation analysis and discourse analysis of 369 R01 NIH proposals are used to discover possible predictors of proposal success. We focused on two issues: the Matthew effect in science—Merton’s claim that eminent scientists have an inherent advantage in the competition for funds—and quality of writing or clarity. Our results suggest that a clearly a...
We aimed to assess which factors correlate with collaborative behavior and whether such behavior associates with scientific impact (citations and becoming a principal investigator). We used the R index which is defined for each author as log(Np)/log(I1), where I1 is the number of co-authors who appear in at least I1 papers written by that author an...
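The I1 quantity described above has the same shape as an h-index taken over co-authors instead of citations. The sketch below computes it from a list of author lists and then forms the ratio log(Np)/log(I1). The abstract is truncated before it defines Np, so `n_p` is left as a caller-supplied value here. The function names and the input layout are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def coauthor_core_size(papers, author):
    """Return I1: the largest k such that k co-authors each appear
    on at least k of `author`'s papers.

    `papers` is an assumed input format: one list of author names per paper.
    """
    counts = Counter()
    for authors in papers:
        for name in set(authors):  # count each co-author once per paper
            if name != author:
                counts[name] += 1

    ranked = sorted(counts.values(), reverse=True)
    i1 = 0
    for rank, shared_papers in enumerate(ranked, start=1):
        if shared_papers >= rank:
            i1 = rank
        else:
            break
    return i1


def r_index(n_p, i1):
    """Return log(n_p) / log(i1); n_p is supplied by the caller because
    the source text is cut off before defining Np."""
    if n_p <= 0 or i1 <= 1:
        raise ValueError("need n_p > 0 and i1 > 1 for a finite ratio")
    return math.log(n_p) / math.log(i1)
```

With four papers by author "A" in which "B" and "C" each appear twice and "D" once, two co-authors appear on at least two papers, so I1 is 2.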
We investigated the similarities of pairs of articles that are cocited at the different cocitation levels of the journal, article, section, paragraph, sentence, and bracket. Our results indicate that textual similarity, intellectual overlap (shared references), author overlap (shared authors), proximity in publication time all rise monotonically as...
Stakeholders in the science system need to decide where to place their bets. Example questions include: Which areas of research should get more funding? Who should we hire? Which projects should we abandon and which new projects should we start? Making informed choices requires knowledge about these research options. Unfortunately, to date research...
We investigate the similarities of pairs of articles which are co-cited at the different co-citation levels of the journal, article, section, paragraph, sentence and bracket. Our results indicate that textual similarity, intellectual overlap (shared references), author overlap (shared authors), proximity in publication time all rise monotonically a...
In 1965, Derek de Solla Price foresaw the day when a citation-based taxonomy of science and technology would be delineated and correspondingly used for science policy. A taxonomy needs to be comprehensive and accurate if it is to be useful for policy making, especially now that policy makers are utilizing citation-based indicators to evaluate peopl...
This is the last paper in the Synthesis section of this special issue on ‘Same Data, Different Results’. We first provide a framework of how to describe and distinguish approaches to topic extraction from bibliographic data of scientific publications. We then compare solutions delivered by the different topic extraction approaches in this special i...
Visualization of literature-related information is common in scientometrics and related fields. Despite this, relatively little work has been done to visualize knowledge organization systems, such as controlled vocabularies or thesauri. In this paper we explore the creation and use of contextual visualizations based on thesauri. Two different metho...
A dataset containing 111,616 documents in astronomy and astrophysics (Astro-set) has been created and is being partitioned by several research groups using different algorithms. For this paper, rather than partitioning the dataset directly, we locate the data in a previously created model of the full Scopus database. This allows comparisons between...
What motivates the research strategies of nations and institutions? We suggest that research primarily serves two masters–altruism and economic growth. Some nations focus more research in altruistic (or non-economic) fields while others focus more research in fields associated with economic growth. What causes this difference? Are there characteris...
Description of 114 DC2 disciplines and grouping into fields.
(DOCX)
Industry-related strings used to find industry authored papers along with the numbers of matches for 2010–2013 using institution strings and associated Scopus affiliation profiles.
Due to overlap in the matches, the unique set of matches is less than the sum over strings and match types.
(DOCX)
Citation metrics are increasingly used to appraise published research. One challenge is whether and how to normalize these metrics to account for differences across scientific fields, age (year of publication), type of document, database coverage, and other factors. We discuss the pros and cons for normalizations using different approaches. Additio...
[This corrects the article DOI: 10.1371/journal.pbio.1002501.]
Many fields face an increasing prevalence of multi-authorship, and this poses challenges in assessing citation metrics. Here, we explore multiple citation indicators that address total impact (number of citations, Hirsch H index [H]), co-authorship adjustment (Schreiber Hm index [Hm]), and author order (total citations to papers as single; single o...
Proportion of scientists from each scientific discipline represented among those with top composite scores.
Abbreviations of disciplines as in Fig 1.
(DOCX)
Top 1,000 scientists ranked according to composite score.
(DOCX)
Indicator values for all 84,116 scientists represented in the study.
(XLSX)
Bubble scatter plots showing the correlation between various log-adjusted citation indices among themselves.
(TIF)
EDITOR'S SUMMARY
From early cartography to modern science maps, visual presentations facilitate understanding of large amounts of data. A traveling exhibit entitled Places and Spaces: Mapping Science has presented outstanding maps illustrating different designs and applications since 2005. The 10th year of the exhibit focuses on the future of scien...
EDITOR'S SUMMARY
The science mapping community offers insights and points to trends in scientific inquiry by revealing connections among publications, authors, terminology and citation patterns. The authors applied the process of creating science maps to topics compiled by GuideStar to reveal altruistic motives driving nonprofit organizations (NPOs...
John P. A. Ioannidis and colleagues asked the most highly cited biomedical scientists to score their top-ten papers in six ways.
Uzzi et al. (2013) recently argued that the highest impact articles are likely to reference novel combinations of existing knowledge while still building upon typical combinations. In this study we replicate this intriguing finding using slightly different methods. We also show, however, that the findings are not free from disciplinary effects. For...
The National Institutes of Health (NIH) is the largest source of funding for biomedical research in the world. This funding is largely effected through a competitive grants process. Each year the Center for Scientific Review (CSR) at NIH manages the evaluation, by peer review, of more than 55,000 grant applications. A relevant management question i...
The identification of emerging topics is of current interest to decision makers in both government and industry. Although many case studies present retrospective analyses of emerging topics, few studies actually nominate emerging topics for consideration by decision makers. We present a novel approach to identifying emerging topics in science and...
Background: The ability of a scientist to maintain a continuous stream of publication may be important, because research requires continuity of effort. However, there is no data on what proportion of scientists manages to publish each and every year over long periods of time.
Methodology/Principal Findings: Using the entire Scopus database, we est...
This study presents a methodology that can be used to characterize emergent topics within the context of a contemporaneous, global micro-model of the scientific literature. To illustrate its effectiveness, two known emergent nanotechnology topics (graphene and dye-sensitized solar cells) are characterized. We show that the model and methodology are...
Cited non-source documents such as articles from regional journals, conference papers, books and book chapters, working papers and reports have begun to attract more attention in the literature. Most of this attention has been directed at understanding the effects of including non-source items in research evaluation. In contrast, little work has be...
A comprehensive, state-of-the-art examination of the changing ways we measure scholarly performance and research impact.
Bibliometrics has moved well beyond the mere tracking of bibliographic citations. The web enables new ways to measure scholarly productivity and impact, making available tools and data that can reveal patterns of intellectual act...
The majority of the effort in metrics research has addressed research evaluation. Far less research has addressed the unique problems of research planning. Models and maps of science that can address the detailed problems associated with research planning are needed. This article reports on the creation of an article-level model and map of science...
A great deal of work has been done to understand how science contributes to technological innovation and medicine. This is no surprise given the amount of money invested annually in R&D. However, what is not well known is that U.S. science (R&D) investment is only one-sixth that of the annual revenue received by non-profit organizations (NPOs) i...
A system of four research levels, designed to classify scientific journals from most applied to most basic, was introduced by Francis Narin and colleagues in the 1970s. Research levels have been used since that time to characterize research at institutional and departmental levels. Currently, less than half of all articles published are in journals...
We have generated a list of highly influential biomedical researchers based on Scopus citation data from the period 1996-2011. Of the 15,153,100 author identifiers in Scopus, approximately 1% (n=149,655) have an h-index >=20. Of those, we selected 532 authors who belonged to the 400 with highest total citation count (>=25,142 citations) and/or the...
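The h-index threshold mentioned above follows the standard definition: the largest h such that an author has h papers with at least h citations each. A minimal sketch of that definition follows; the function name and the plain list-of-counts input are assumptions for illustration and say nothing about how Scopus or the authors computed their values.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each.

    `citations` is an assumed input: one citation count per paper.
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h
```

For citation counts of 10, 8, 5, 4 and 3, four papers have at least four citations each, so the function returns 4. A filter such as `h_index(counts) >= 20` would then mirror the selection step described in the abstract.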
Historically, co-citation models have been based only on bibliographic information. Full-text analysis offers the opportunity to significantly improve the quality of the signals upon which these co-citation models are based. In this work we study the effect of reference proximity on the accuracy of co-citation clusters. Using a corpus of 270,521 fu...
An indicator of conformity - the tendency for a scientific paper to reinforce existing belief systems - is introduced. This indicator is based on a computational theory of innovation, where an author's belief systems are compared to socio-cognitive norms. Evidence of the validity of the indicator is provided using a sample of 4180 high impact paper...