Research Evaluation

Published by Oxford University Press

Online ISSN: 1471-5449


Print ISSN: 0958-2029


Figure 1. SciTS cluster map. Note: Two-dimensional map of the 95 final synthesized SciTS topic statements, grouped into seven clusters. Each numbered point represents one synthesized statement (a list of all statements organized by cluster is in Table 1). Statements closer to each other are considered more similar in meaning than statements farther apart. The polygon-shaped boundaries group the statements into related clusters.
Figure 2. SciTS concept map. Note: A comprehensive SciTS issues map showing labeled clusters and regions. Synthesized SciTS topic statements (refer to Figure 1) are no longer shown as individual points; rather, they are grouped and represented by clusters (7) and then by regions (4). The average importance rating for each cluster is displayed inside the cluster.
Mapping a research agenda for the science of team science

June 2011


518 Reads






An increase in cross-disciplinary, collaborative team science initiatives over the last few decades has spurred interest by multiple stakeholder groups in empirical research on scientific teams, giving rise to an emergent field referred to as the science of team science (SciTS). This study employed a collaborative team science concept-mapping evaluation methodology to develop a comprehensive research agenda for the SciTS field. Its integrative mixed-methods approach combined group process with statistical analysis to derive a conceptual framework that identifies research areas of team science and their relative importance to the emerging SciTS field. The findings from this concept-mapping project constitute a lever for moving SciTS forward at theoretical, empirical, and translational levels.

Piloting an approach to rapid and automated assessment of a new research initiative: Application to the National Cancer Institute's Provocative Questions initiative

December 2013


73 Reads

Funders of biomedical research are often challenged to understand how a new funding initiative fits within the agency's portfolio and the larger research community. While traditional assessment relies on retrospective review by subject matter experts, it is now feasible to design portfolio assessment and gap analysis tools leveraging administrative and grant application data that can be used for early and continued analysis. We piloted such methods on the National Cancer Institute's Provocative Questions (PQ) initiative to address key questions regarding diversity of applicants; whether applicants were proposing new avenues of research; and whether grant applications were filling portfolio gaps. For the latter two questions, we defined measurements called focus shift and relevance, respectively, based on text similarity scoring. We demonstrate that two types of applicants were attracted by the PQs at rates greater than or on par with the general National Cancer Institute applicant pool: those with clinical degrees and new investigators. Focus shift scores tended to be relatively low, with applicants not straying far from previous research, but the majority of applications were found to be relevant to the PQ the application was addressing. Sensitivity to comparison text and inability to distinguish subtle scientific nuances are the primary limitations of our automated approaches based on text similarity, potentially biasing relevance and focus shift measurements. We also discuss potential uses of the relevance and focus shift measures including the design of outcome evaluations, though further experimentation and refinement are needed for a fuller understanding of these measures before broad application.

Scientific and Public Health Impacts of the NIEHS Extramural Asthma Research Program - Insights from Primary Data

December 2009


77 Reads

A conceptual model was developed to guide evaluation of the long-term impacts of research grant programs at the National Institutes of Health, National Institute of Environmental Health Sciences. The model was then applied to the extramural asthma research portfolio in two stages: (1) using extant data sources, (2) involving primary data collection with asthma researchers and individuals in positions to use asthma research in development of programs, policies and practices. Reporting on the second stage, this article describes how we sought to broaden the perspectives included in the assessment and obtain a more nuanced picture of research impacts by engaging those involved in conducting or using the research.

Figure 1. Collaboratively authored concept map of HIV/AIDS clinical trials network success factors. Source: Kagan et al. (2009)
Figure 4. Days for protocol registration, non-US clinical research sites
Figure 5. Kaplan-Meier plots for protocols in US and non-US sites: days for protocol registration
Integrating utilization-focused evaluation with business process modeling for clinical research improvement

October 2010


84 Reads

New discoveries in basic science are creating extraordinary opportunities to design novel biomedical preventions and therapeutics for human disease. But the clinical evaluation of these new interventions is, in many instances, being hindered by a variety of legal, regulatory, policy and operational factors, few of which enhance research quality, the safety of study participants or research ethics. With the goal of helping increase the efficiency and effectiveness of clinical research, we have examined how the integration of utilization-focused evaluation with elements of business process modeling can reveal opportunities for systematic improvements in clinical research. Using data from the NIH global HIV/AIDS clinical trials networks, we analyzed the absolute and relative times required to traverse defined phases associated with specific activities within the clinical protocol lifecycle. Using simple median duration and Kaplan-Meier survival analysis, we show how such time-based analyses can provide a rationale for the prioritization of research process analysis and re-engineering, as well as a means for statistically assessing the impact of policy modifications, resource utilization, re-engineered processes and best practices. Successfully applied, this approach can help researchers be more efficient in capitalizing on new science to speed the development of improved interventions for human disease.
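As a rough illustration of the time-based analysis this abstract mentions, the sketch below computes a Kaplan-Meier survival curve and its median for protocol registration times. It is a minimal sketch, not the authors' implementation: the function names `kaplan_meier` and `km_median` and the sample durations are hypothetical, and protocols still unregistered at the end of follow-up are assumed to be treated as censored.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct event time.

    durations: days until registration completed, or until follow-up ended
    observed:  True if registration completed, False if censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the share still unregistered.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def km_median(curve):
    """First time at which S(t) drops to 0.5 or below, or None if it never does."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical registration times in days; False marks a censored protocol.
days = [30, 45, 45, 60, 90, 120]
done = [True, True, False, True, True, False]
curve = kaplan_meier(days, done)
median_days = km_median(curve)
```

Unlike a simple median of completed cases, the Kaplan-Meier median keeps censored protocols in the risk set until they drop out, which matters when slow sites have not yet finished.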

Measuring the evolution and output of cross-disciplinary collaborations within the NCI Physical Sciences-Oncology Centers Network

December 2013


169 Reads

Development of effective quantitative indicators and methodologies to assess the outcomes of cross-disciplinary collaborative initiatives has the potential to improve scientific program management and scientific output. This article highlights an example of a prospective evaluation that has been developed to monitor and improve progress of the National Cancer Institute Physical Sciences-Oncology Centers (PS-OC) program. Study data, including collaboration information, was captured through progress reports and compiled using the web-based analytic database: Interdisciplinary Team Reporting, Analysis, and Query Resource. Analysis of collaborations was further supported by data from the Thomson Reuters Web of Science database, MEDLINE database, and a web-based survey. Integration of novel and standard data sources was augmented by the development of automated methods to mine investigator pre-award publications, assign investigator disciplines, and distinguish cross-disciplinary publication content. The results highlight increases in cross-disciplinary authorship collaborations from pre- to post-award years among the primary investigators and confirm that a majority of cross-disciplinary collaborations have resulted in publications with cross-disciplinary content that rank in the top third of their field. With these evaluation data, PS-OC Program officials have provided ongoing feedback to participating investigators to improve center productivity and thereby facilitate a more successful initiative. Future analysis will continue to expand these methods and metrics to adapt to new advances in research evaluation and changes in the program.

Table 1. Milestone timing (in months) for the 22 primary studies publications 
Figure 1. Distribution of all 1,429 citations from 2006–10 for the 22 primary studies publications.  
Figure 2. Publication dissemination landscape for the 22 primary studies publications.  
Figure 3. Differences in pace of citation for high–low AIS groups of the 22 primary studies publications.  
Modeling the dissemination and uptake of clinical trials results

September 2013


64 Reads

A select set of highly cited publications from the National Institutes of Health (NIH) HIV/AIDS Clinical Trials Networks was used to illustrate the integration of time interval and citation data, modeling the progression, dissemination, and uptake of primary research findings. Following a process marker approach, the pace of initial utilization of this research was measured as the time from trial conceptualization, development and implementation, through results dissemination and uptake. Compared to earlier studies of clinical research, findings suggest that select HIV/AIDS trial results are disseminated and utilized relatively rapidly. Time-based modeling of publication results as they meet specific citation milestones enabled the observation of points at which study results were present in the literature summarizing the evidence in the field. Evaluating the pace of clinical research, results dissemination, and knowledge uptake in synthesized literature can help establish realistic expectations for the time course of clinical trials research and their relative impact toward influencing clinical practice.

Greatest 'HITS': A new tool for tracking impacts at the National Institute of Environmental Health Sciences

December 2013


106 Reads

Evaluators of scientific research programs have several tools to document and analyze products of scientific research, but few tools exist for exploring and capturing the impacts of such research. Understanding impacts is beneficial because it fosters a greater sense of accountability and stewardship for federal research dollars. This article presents the High Impacts Tracking System (HITS), a new approach to documenting research impacts that is in development at the National Institute of Environmental Health Sciences (NIEHS). HITS is designed to help identify scientific advances in the NIEHS research portfolio as they emerge, and provide a robust data structure to capture those advances. We have downloaded previously un-searchable data from the central NIH grants database and developed a robust coding schema to help us track research products (going beyond publication counts to the content of publications) as well as research impacts. We describe the coding schema and key system features as well as several development challenges, including data integration, development of a final data structure from three separate ontologies, and ways to develop consensus about codes among program staff.

Normalization of peer-evaluation measures of group research quality across academic disciplines

June 2010


154 Reads

Peer-evaluation-based measures of group research quality such as the UK's Research Assessment Exercise (RAE), which do not employ bibliometric analyses, cannot directly avail of such methods to normalize research impact across disciplines. This is seen as a conspicuous flaw of such exercises and calls have been made to find a remedy. Here a simple, systematic solution is proposed based upon a mathematical model for the relationship between research quality and group quantity. This model manifests both the Matthew effect and a phenomenon akin to the Ringelmann effect and reveals the existence of two critical masses for each academic discipline: a lower value, below which groups are vulnerable, and an upper value beyond which the dependency of quality on quantity reduces and plateaus appear when the critical masses are large. A possible normalization procedure is then to pitch these plateaus at similar levels. We examine the consequences of this procedure at RAE for a multitude of academic disciplines, corresponding to a range of critical masses.

Why Do Academic Scientists Engage in Interdisciplinary Research?

February 2004


324 Reads

We introduce a measure of interdisciplinarity as the diversity of academic research production across scientific domains. Our dataset concerns more than 900 permanent researchers employed by a large French university which is ranked first among French universities in terms of impact. As expected we find that the traditional academic career incentives do not stimulate interdisciplinary research, while having connections with industry does. The context of work in the laboratory (size, colleagues' status, age and affiliations) strongly affects the propensity to undertake interdisciplinary research.

Academic Patenting in Europe: New Evidence from the KEINS Database.

February 2008


358 Reads

The paper provides summary statistics from the KEINS database on academic patenting in France, Italy, and Sweden. It shows that academic scientists in those countries have signed many more patents than previously estimated. This re-evaluation of academic patenting comes by considering all patents signed by academic scientists active in 2004, both those assigned to universities and the many more held by business companies, governmental organizations, and public laboratories. Specific institutional features of the university and research systems in the three countries contribute to explaining these ownership patterns, which are remarkably different from those observed in the USA. In the light of these new data, European universities' contribution to domestic patenting appears not to be much less intense than that of their US counterparts.

Figure 1: Different web entities (site, directory and page) in an open accessible web space.  
Figure 2: A log file sample showing the proposed access types search engine, backlink and direct access which can be differentiated by a heuristic.  
Constructing experimental indicators for Open Access documents

November 2006


77 Reads

The ongoing paradigm change in the scholarly publication system makes it necessary to construct alternative evaluation criteria/metrics which appropriately take into account the unique characteristics of electronic publications and other research output in digital formats. Today, major parts of scholarly open access (OA) publications and the self-archiving area are not well covered in traditional citation and indexing databases. The growing share and importance of freely accessible research output demands new approaches/metrics for measuring and evaluating these new types of scientific publication. We propose a simple quantitative method which establishes indicators by measuring the access/download pattern of OA documents and other web entities of a single web server. The experimental indicators are constructed based on standard local web usage data. This new type of web-based indicator is developed to model the specific demand for better study/evaluation of the accessibility, visibility and interlinking of open accessible documents. We conclude that e-science will need new stable e-indicators.

Agent-based Simulation of Cooperative Innovation

January 2010


146 Reads

This article introduces an agent-based simulation model representing the dynamic processes of cooperative R&D in the manufacturing sector of South Korea. Firms’ behaviors were defined according to empirical findings on a data set from the internationally standardized Korean Innovation Survey in 2005. Simulation algorithms and parameters were defined based on the determinants of firms’ likelihood to participate in cooperation with other firms when conducting innovation activities. The calibration process was conducted to the point where artificially generated scenarios were equivalent to the one observed in the real world. The aim of this simulation game was to create a basic implementation that could be extended to test different policy strategies in order to observe sector responses (including cross-sector spillovers) when promoting cooperative innovation. Based on the evaluation of simulated research collaboration data, sector responses to strategies concerning government intervention in R&D of the firms can now be assessed.

Past performance, peer review, and project selection: A case study in the social and behavioral sciences

November 2009


403 Reads

Does past performance influence success in grant applications? We tested whether the decisions of the Netherlands Research Council for the Economic and Social Sciences correlate with the past performances of applicants in publications and citations, and with the results of the Council's peer reviews. The Council proves successful in distinguishing grant applicants with above-average from below-average performance, but within the former group there was no correlation between past performance and receiving a grant. When comparing the best-performing researchers who were denied funding with those who received it, the rejected researchers significantly outperformed the funded ones. The best rejected proposals score on average as high on the outcomes of the peer-review process as the accepted proposals. The Council successfully corrected for gender effects during the selection process. We explain why these findings may apply beyond this case. However, if research councils are not able to select the ‘best’ researchers, perhaps they should reconsider their mission. We discuss the role of research councils in the science system in terms of variation, innovation and quality control.

Table 1. How were excellent papers defined in the study (in percent)?
How are excellent (highly cited) papers defined in bibliometrics? A quantitative analysis of the literature

January 2014


1,274 Reads

As the subject of research excellence has received increasing attention (in science policy) over the last few decades, increasing numbers of bibliometric studies have been published dealing with excellent papers. However, many different methods have been used in these studies to identify excellent papers. The present quantitative analysis of the literature has been carried out in order to acquire an overview of these methods and an indication of an "average" or "most frequent" bibliometric practice. The search in the Web of Science yielded 321 papers dealing with "highly cited", "most cited", "top cited" and "most frequently cited". Of the 321 papers, 16 could not be used in this study. In around 80% of the papers analyzed in this study, a quantitative definition has been provided with which to identify excellent papers. With definitions which relate to an absolute number, either a certain number of top cited papers (58%) or papers with a minimum number of citations are selected (17%). Around 23% worked with percentile rank classes. Across these papers, the threshold used is on average the top 7.6% (arithmetic mean) or the top 3% (median). The top 1% is used most frequently in the papers, followed by the top 10%. With the thresholds presented in this study, in future, it will be possible to identify excellent papers based on an "average" or "most frequent" practice among bibliometricians.
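To make the percentile-based definitions in this abstract concrete, the sketch below selects the top x% of a paper set by citation count. It is only an illustration under stated assumptions: the function name `top_percent`, the tie-handling rule (papers tied at the cut-off are all kept), and the citation counts are hypothetical and not taken from the study.

```python
import math


def top_percent(citations, percent):
    """Return the papers in the top `percent` by citation count.

    citations: dict mapping paper id -> citation count
    percent:   threshold such as 1 or 10 (top 1%, top 10%)
    Papers tied with the cut-off count are all included (an assumption).
    """
    if not citations:
        return {}
    # Number of papers the threshold nominally covers, at least one.
    k = max(1, math.ceil(len(citations) * percent / 100))
    ranked = sorted(citations.values(), reverse=True)
    cutoff = ranked[k - 1]
    return {paper: n for paper, n in citations.items() if n >= cutoff}


# Hypothetical citation counts for ten papers.
counts = {
    "p1": 120, "p2": 45, "p3": 45, "p4": 10, "p5": 3,
    "p6": 0, "p7": 8, "p8": 1, "p9": 22, "p10": 5,
}
top10 = top_percent(counts, 10)
top20 = top_percent(counts, 20)
```

In practice such thresholds are usually applied within a field and publication year rather than across a mixed set, since citation rates differ widely between disciplines.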

Towards a better list of citation superstars: Compiling a multidisciplinary list of highly cited researchers

April 2006


46 Reads

A new approach to producing multidisciplinary lists of highly cited researchers is described and used for compiling the first multidisciplinary list of highly cited researchers. This approach is essentially related to the recently discovered law of the constant ratios, and gives a better-balanced representation of different scientific fields.

Impact vitality: An indicator based on citing publications in search of excellent scientists

July 2013


39 Reads

This paper contributes to the quest for an operational definition of ‘research excellence’ and proposes a translation of the excellence concept into a bibliometric indicator. Starting from a textual analysis of funding program calls aimed at individual researchers and from the challenges for an indicator at this level in particular, a new type of indicator is proposed. The impact vitality indicator (Rons and Amez, 2008) reflects the vitality of the impact of a researcher's publication output, based on the change in volume over time of the citing publications. The introduced metric is shown to possess attractive operational characteristics and meets a number of criteria which are desirable when comparing individual researchers. The validity of one of the possible indicator variants is tested using a small dataset of applicants for a senior full-time research fellowship. Options for further research involve testing various indicator variants on larger samples linked to different kinds of evaluations.

Figure 1. Percentage of publications with local focus over time 
Figure 2. Percentage of local and non-local papers with a Colombian address by degree of interdisciplinarity (Rao-Stirling diversity) 
Interdisciplinarity and research on local issues: Evidence from a developing country

April 2013


435 Reads

This paper examines the role of interdisciplinarity on research pertaining to local issues. Using Colombian publications from 1991 until 2011 in the Web of Science, we investigate the relationship between the degree of interdisciplinarity and the local orientation of the articles. We find that a higher degree of interdisciplinarity in a publication is associated with a greater emphasis on local issues. In particular, our results support the view that research that combines cognitively disparate disciplines, what we refer to as distal interdisciplinarity, is associated with more local focus of research. We discuss the policy implications of these results in the context of national research assessments targeting excellence and socio-economic impact.
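The figure caption for this article names Rao-Stirling diversity as its measure of interdisciplinarity. The sketch below shows one common form of that index, the sum of p_i * p_j * d_ij over all ordered pairs of distinct categories, where p_i is the share of a paper's references in category i and d_ij is a cognitive distance between categories. It is a minimal sketch: the function name `rao_stirling`, the category names, and the distance values are hypothetical, and some authors sum over unordered pairs instead, which halves the value.

```python
def rao_stirling(proportions, distance):
    """Rao-Stirling diversity: sum over ordered pairs i != j of p_i * p_j * d_ij.

    proportions: dict mapping category -> share of references (sums to 1)
    distance:    dict mapping frozenset({i, j}) -> distance in [0, 1]
    """
    categories = list(proportions)
    total = 0.0
    for i in categories:
        for j in categories:
            if i != j:
                pair = frozenset((i, j))
                total += proportions[i] * proportions[j] * distance[pair]
    return total


# Hypothetical reference shares for one paper and hypothetical distances.
shares = {"physics": 0.5, "chemistry": 0.3, "sociology": 0.2}
dist = {
    frozenset(("physics", "chemistry")): 0.2,
    frozenset(("physics", "sociology")): 0.9,
    frozenset(("chemistry", "sociology")): 0.8,
}
diversity = rao_stirling(shares, dist)
```

Because each pair is weighted by its distance, a paper that draws on physics and sociology scores higher than one that draws equally on physics and chemistry, which is the intuition behind the abstract's notion of distal interdisciplinarity.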

An Integrated Impact Indicator (I3): A New Definition of "Impact" with Policy Relevance

May 2012


258 Reads

Allocation of research funding, as well as promotion and tenure decisions, are increasingly made using indicators and impact factors drawn from citations to published work. A debate among scientometricians about proper normalization of citation counts has resolved with the creation of an Integrated Impact Indicator (I3) that solves a number of problems found among previously used indicators. The I3 applies non-parametric statistics using percentiles, allowing highly-cited papers to be weighted more than less-cited ones. It further allows unbundling of venues (i.e., journals or databases) at the article level. Measures at the article level can be re-aggregated in terms of units of evaluation. At the venue level, the I3 creates a properly weighted alternative to the journal impact factor. I3 has the added advantage of enabling and quantifying classifications such as the six percentile rank classes used by the National Science Board's Science & Engineering Indicators.

Research evaluation per discipline: A peer-review method and its outcomes

July 2013


180 Reads

This paper describes the method for ex-post peer-review evaluation per research discipline used at the Vrije Universiteit Brussel, and the outcomes obtained. Pertinent advice and responses at different levels benefit research quality, competitivity and visibility. Imposed reflection and contacts modify the researcher's attitude and improve team strategies. Deeper insights and data sets on research disciplines and extracted general recommendations support the university management's policy decisions, instruments and guidelines. Comparisons with other assessments lead to a better understanding of possibilities and limitations of different evaluation processes. The described peer-review method can be applied systematically, yielding a complete overview, or on an ad hoc basis for a particular discipline, based on demands from research teams or on strategic or policy arguments.

Research and Development in a Service Economy

February 1997


15 Reads

Canada has a service economy and R&D in Canada is mainly a service sector activity. This paper examines the sectoral distribution of expenditure on R&D performance, with emphasis on the business sector in Canada and with international comparisons. Human resources are a key component in the performance of R&D, and comparisons are made, over time, of the number of research workers in service and non-service industries, of the ratio of professional to technical and other personnel, and of the changes in educational levels of R&D personnel. Using Canadian experience as a guide, some conclusions are drawn about the measurement challenges in producing indicators of the transition to a service economy.

Figure 1. Distribution of academic researchers by network size  
Table 1. Participation in collaborative grants by type of partner and discipline
Table 2. Average network size by discipline
Table 3. Summary statistics
What Drives the Emergence of Entrepreneurial Academics? A Study on Collaborative Research Partnerships in the UK

December 2008


200 Reads

We study the patterns of engagement in collaborative research among university researchers. We investigate how frequently researchers engage in collaborative research with third parties, as well as the factors that affect the probability of interacting with industry as compared to other types of partners. By focusing on a large sample of recipients of collaborative research grants, we examine how environmental and individual characteristics impact on both the size of the network set up by academic researchers, and the type of partner chosen. Our results demonstrate that individual characteristics matter both for the size and the type of the network, while department-level characteristics are particularly important for the type of the network set up by the researcher.

Table 2. Effects of patenting on research performance: results from random effects panel tobit regression
Patent and Publication Activities of German Professors: An Empirical Assessment of Their Co-Activity

June 2007


183 Reads

The growing importance of technology-relevant non-publication output of university research has come into the focus of policy-makers' interest. A fierce debate arose on possible negative consequences of the increasing commercialization of science, as it may come with a reduction in research performance. This paper investigates the relationship between publishing as a measure of scientific output and patenting for German professors active in a range of science fields. We combine bibliometric/technometric indicators and econometric techniques to show that patenting positively correlates with the publication output and quality of patenting researchers.

Research actors and the State: research evaluation and evaluation of science and technology policies in Spain

April 1995


186 Reads

This paper describes the development of research evaluation in Spain. It assumes that research evaluation, R&D policy and programme evaluation are embedded in the development of an R&D system and are characterised by general Spanish policy-making. Research evaluation, in a context of delegation and as a self-organising system for research actors guaranteed by the state, has been strongly developed in the last few years; R&D policy and programme evaluation is less institutionalised. The explanation is linked to the sequence of reforms of the R&D system and to the setting up of the first Spanish science and technology policy. The support of the European Commission (HCM contract CHRX-CT93-0240) and the Spanish National R&D Plan (projects SEC 93-0688 and SEC 94-0796) is acknowledged.

The Use of Behavioural Additionality in Innovation Policy-Making

December 2011


480 Reads

The overarching aim of this paper is to contribute to a better understanding of how the main target variable of innovation policy – change in behaviour – can be better conceptualized and put into practice in evaluation and policy making. The paper first develops a theoretical framework of the concept of behavioural additionality. On that basis it looks in detail at the way behavioural additionality is operationalized in evaluation practice and how the concept is discussed and applied in the interaction between policy-makers and evaluators. The paper utilizes a statistical analysis of 171 innovation policy evaluations, a text analysis of selected behavioural additionality evaluation reports and finally a number of in-depth case studies of evaluations. Based on the theoretical considerations and the empirical findings, the paper identifies three different uses of behavioural additionality in innovation policy evaluations. It concludes that despite the widespread use of the behavioural additionality concept, an improved theoretical basis and serious methodological improvements are needed to realize the full potential of this concept for evaluation and policy practice. Those who are interested in this article or the underlying study are welcome to contact the authors.

Table 2. Annual growth in labour productivity in 1990-97 and manufacturing share of GDP.
Table 10. Estimated elasticity of productivity growth with respect to innovation output.
The link between firm-level innovation and aggregate productivity growth: A cross-country examination

February 2003


230 Reads

A broad definition of innovation input is used, in which R&D is one of several sources of innovation. A quantitative innovation output measure is used in the analysis, which is based on a large representative sample of firms, including small firms. An econometric framework based on the knowledge-production function accounting for both selectivity and simultaneity bias is employed. The results from Nordic countries show that, given difficulties in pooling the data, it is important to identify country-specific models to account for country-specific effects and differences in countries' national innovation systems.
