Article

Bibliometric Approximation of a Scientific Specialty by Combining Key Sources, Title Words, Authors and References


Abstract

Bibliometric methods for the analysis of highly specialized subjects are increasingly investigated and debated. Information and assessments well-focused at the specialty level can help make important decisions in research and innovation policy. This paper presents a novel method to approximate the specialty to which a given publication record belongs. The method partially combines sets of key values for four publication data fields: source, title, authors and references. The approach is founded in concepts defining research disciplines and scholarly communication, and in empirically observed regularities in publication data. The resulting specialty approximation consists of publications associated with the investigated publication record via key values for at least three of the four data fields. This paper describes the method and illustrates it with an application to publication records of individual scientists. The illustration also successfully tests the focus of the specialty approximation in terms of its ability to connect and help identify peers. Potential tracks for further investigation include analyses involving other kinds of specialized publication records, studies for a broader range of specialties, and exploration of the potential for diverse applications in research and research policy context.


... In choosing Web of Science (WoS) over Scopus for bibliometric analysis, it was important to consider the features and coverage of both databases. While Scopus is known for its scope and user-friendly interface, Web of Science is known for its selective journal coverage and quality data [25,26]. The Web of Science Core Collection (WoSCC) is particularly emphasized because it contains meticulously curated, complete literature data, including titles, abstracts, keywords, references, and citations, making it a preferred choice for scientometric analyses [25]. ...
... Web of Science is established as an important bibliometric data source and its citation indexes are widely used for research evaluation [28]. The selective nature of Web of Science's journal coverage, coupled with its quality-controlled data, makes it a reliable source for conducting responsible bibliometric analysis in the field of scientometrics [26]. As the Web of Science Core Collection is increasingly applied in academic research, this data source should be used with caution [29]. ...
Article
Full-text available
This research uncovers contemporary patterns by employing bibliometric analysis to examine sustainability research in the education domain. Using WoS data, we map academic outputs and observe a rising publication trend, reflecting growing interest in global sustainability imperatives. The United Kingdom, Germany, and the United States were the most productive countries, underscoring the field's international and interdisciplinary character. The analysis showed a shift in topical focus from environmental education to sustainable education as a result of integrating the SDGs into every level of education. The results highlight the role of education in sustainability and call for more research to better evaluate and implement educational efforts. This study not only presents a history of the field but also sets a future agenda for the discipline, namely the significance of education for sustainability. In this way, our work enhances knowledge of sustainability in education, and our results lay a theoretical and methodological basis for further research and activities in the field.
... The "Most Active Source Titles" analysis assists in identifying journals, conferences, or other publications that are the primary venues for research publications in the field under study [26]. It allows researchers to figure out where their research will receive maximum exposure or where they can seek current and relevant research. ...
... Analysis of highly cited documents can help in understanding essential research trends and focus in a particular field. Identifying highly cited documents can provide insights into research topics that are in the limelight, indicate the direction in which research is progressing [26], and help recognize research that has made significant contributions in a particular field. These documents often reflect important thoughts, findings or concepts that have influenced scientific development and direction [26]. Meanwhile, Table 11, viewed from the top five highly cited articles, shows that the first is by D. ...
... Bibliometric analysis aims to quantitatively measure, evaluate, and comprehend the trends, patterns, and characteristics of scientific publications within a specific research subject. This approach entails the gathering and examination of bibliographic data, encompassing article titles, authors, abstracts, keywords, publication journals, and other pertinent information pertaining to scientific publications (Rons, 2018). Subsequently, this data undergoes processing and analysis employing bibliometric methodologies, including frequency analysis, co-citation analysis, correlation analysis, network analysis, and so forth. ...
... The database search technique is derived from the Scopus core collection as of July 7, 2023. The search string is meticulously executed utilizing themes in the Scopus database, enabling the extraction of published data based on title, abstract, and keywords (Rons, 2018). Scopus is utilized for its reputation as the foremost and most superior database, owing to its extensive data coverage across diverse scientific areas. ...
... For six applicants for a 5-year Senior Research Fellowship term awarded by the Vrije Universiteit Brussel, specialty approximations were constructed as described by Rons (2018). The method's approach is founded in concepts defining research disciplines (Sugimoto and Weingart, 2015) and scholarly communication (Ni, Sugimoto, and Cronin, 2013), and in empirically observed regularities for sources (Bradford, 1934; Garfield, 1971), title words (Zipf, 1935), authors (Lotka, 1926) and references (Price, 1965): - Phase 1: Construction of the seed record, for individual scientists, by enlarging the publication record with the publications referred to. ...
... - Phase 3: Identification of all publications covered by key values for at least three of the four data fields, constituting the specialty approximation. In all phases the same operationalizations were used as in Rons (2018), except for a lower coverage threshold of 50% instead of 80% in phase 2. This setting focuses results more strongly on the most representative field values, which matters more than exhaustiveness in this application, where the only aim is to select prominent scientists in the specialty as potential reviewers. In phases 1 and 3, publications were collected from the online Web of Science of Clarivate Analytics (key figures in Table 1). ...
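The three-phase procedure described in these excerpts can be sketched in code. The sketch below is a simplified illustration, not the exact operationalization of Rons (2018): `key_values` greedily takes the most frequent values of a field until they jointly cover the chosen share of the seed record (50% in this application), and `specialty_approximation` keeps candidate publications matching key values in at least three of the four data fields. The field names and record structure are assumptions made for illustration.

```python
from collections import Counter

FIELDS = ("source", "title_words", "authors", "references")

def key_values(seed, field, coverage=0.5):
    # Greedily add the most frequent values of `field` until they
    # jointly cover at least `coverage` of the seed publications
    # (simplified stand-in for the operationalization in Rons, 2018).
    counts = Counter(v for rec in seed for v in rec[field])
    keys, covered = set(), set()
    for value, _ in counts.most_common():
        keys.add(value)
        covered |= {i for i, rec in enumerate(seed) if value in rec[field]}
        if len(covered) >= coverage * len(seed):
            break
    return keys

def specialty_approximation(seed, candidates, min_fields=3):
    # Phase 2: key values per data field; Phase 3: keep publications
    # covered by key values for at least `min_fields` of the four fields.
    keys = {f: key_values(seed, f) for f in FIELDS}
    return [rec for rec in candidates
            if sum(bool(keys[f] & set(rec[f])) for f in FIELDS) >= min_fields]
```

A candidate publication that shares, say, a key source, a key title word and a key reference with the seed record is retained even if none of its authors are key authors, which is what makes the combination tolerant to variation in any single field.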
Preprint
Many contemporary research funding instruments and research policies aim for excellence at the level of individual scientists, teams or research programmes. Good bibliometric approximations of related specialties could be useful for instance to help assign reviewers to applications. This paper reports findings on the usability of reviewer suggestions derived from a recently developed specialty approximation method combining key sources, title words, authors and references (Rons, 2018). Reviewer suggestions for applications for Senior Research Fellowships were made available to the evaluation coordinators. Those who were invited to review an application showed a normal acceptance rate, and responses from experts and coordinators contained no indications of mismatched scientific focus. The results confirm earlier indications that this specialty approximation method can successfully support tasks in research management.
... The incorporation of Scopus databases into a visualization study can expand the content of the analysis [14,16]. We developed different search formulas on the basis of the categories of the different databases (Figure 1). When searching PubMed, the search formula is: (("Chronic Fatigue Syndrome") OR ("Chronic Fatigue Immune Dysfunction Syndrome") OR ("Systemic Exertional Intolerance Disease") OR ("Myalgic Encephalomyelitis") OR ("persistent fatigue disorder") OR ("unexplained chronic fatigue")) AND (("health economics") OR ("Medical economics") OR ("Healthcare economics") OR ("Health care economics") OR ("public health economics") OR ("Health services economics") OR ("Medical cost analysis")). When searching Scopus, the search formula is: ...
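The nested OR/AND structure of such search formulas can be generated programmatically rather than typed by hand. The snippet below is a hypothetical helper, not taken from the cited study, that assembles a PubMed-style boolean query from synonym groups: terms within a group are OR-ed, and the groups are AND-ed together.

```python
def boolean_query(*groups):
    # Each group is a list of synonymous terms; terms within a group
    # are OR-ed, and the groups themselves are AND-ed together.
    def clause(terms):
        return "(" + " OR ".join('("{}")'.format(t) for t in terms) + ")"
    return " AND ".join(clause(g) for g in groups)

# Illustrative (abbreviated) synonym groups for the query quoted above.
disease = ["Chronic Fatigue Syndrome", "Myalgic Encephalomyelitis"]
economics = ["health economics", "Medical cost analysis"]
query = boolean_query(disease, economics)
```

Keeping the synonym lists as data makes it easy to regenerate equivalent formulas for databases with different syntax requirements.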
Article
Full-text available
Objective To explore the research trends and hotspots of health economics evaluations of patients with chronic fatigue syndrome. Methods To explore the research trends and hotspots of health economics evaluations of chronic fatigue syndrome, 180 articles published between 1991 and 2024 were visualized and analyzed with CiteSpace 6.3, VOSviewer 1.6.20, and R 4.3.3. The analysis covers seven aspects: annual publication volume, journal distribution, author country, publishing organization, author collaboration, citation analysis, and keyword analysis. Results Few studies have evaluated the health economics of individuals with chronic fatigue syndrome in China or abroad; Chinese studies are especially rare, and UK research results are mostly found in other countries. Moreover, cooperation and linkages between institutions, as well as between authors, are not yet strong. Conclusion The hotspot among health economics evaluation methods in this field is cost-effectiveness analysis, and the hotspot among diagnosis and treatment methods is cognitive-behavioral therapy. We also found that, from a health economics perspective, chronic fatigue syndrome may have a strong potential association with depression. Health economic evaluations of multiple treatments should be conducted simultaneously to increase attention to this field and provide a reference basis for low-cost, high-quality diagnostic and treatment programs.
... Searching by title produces a representative sample of the field of interest, based on a long tradition of research showing the usefulness of article titles [34]. The results were downloaded from the database as a CSV file. ...
Article
Audience Response Systems (ARS) can be used to increase students' commitment and engagement. ARS are becoming popular in lectures, complementing traditional master classes and making more profitable use of class time. Several researchers have studied the impact of ARS in the classroom; however, there is a lack of information about the current research landscape with which to identify paths toward the development of scientific research and projects in the ARS field. This bibliometric study discusses a collection of bibliometric parameters on the ARS literature calculated from data downloaded from the Scopus database. A total of 2,015 publications from Scopus were considered. Results showed that the number of publications has been stable since 2010, with a noticeable decrease in 2019. The United States and the United Kingdom are the most productive countries, with 898 papers from the US and 179 from the UK. The most prolific author was Daniel Zingaro from the University of Toronto, with a total of 10 manuscripts published. This study provides researchers interested in conducting research on ARS with insights on potential venues for publication and on collaboration with the research institutions and researchers that are most prolific in the field.
... Bibliometric studies serve to monitor and trace scientific research [68,69]. They help to make important research decisions by analyzing specialized topics [70] since useful information can be extracted [71], and they constitute an alternative to other types of reviews [72]. Different indicators can be used for a bibliometric analysis [73]. ...
Article
Full-text available
There is consensus, both in academia and in the business world, that one of the main resources of a company is the incorporation of technology and, along with this, its capacity to generate innovation. Therefore, knowing the development of a company’s research becomes essential. The aim of this work is to develop a bibliometric analysis of the literature published in the Web of Science database to analyze the advances and trends in the development of research. The methodology analyzed bibliometric quantity and quality indicators using Bibliometrix, VOSviewer, and SciMAT software. The results show the evolution of the topic as well as recognition of the different lines along which research has organized the debate.
... Bibliometric methods are valuable tools for tracking and tracing scientific processes (Benavides-Velasco et al., 2013; Wydra, 2020). They are increasingly used to support critical decision-making in research policies on highly specialized topics (Rons, 2018), since they provide useful information for researchers (Rey-Martí et al., 2016). Bibliometric indicators, such as citation counts and journal impact factors, are often used to evaluate the impact of articles (Pan et al., 2018). ...
... Thus, much bibliometrics research tends to select the unit of analysis based on the research objectives: for example, research that analyzes references (Rons, 2018), that determines the productivity of authors (Jiménez, Prieto, & García, 2019), or that brings journalists closer together (Yang et al., 2016). ...
Article
Full-text available
Background: Journalism and public relations are two fields that both collaborate and compete with each other. Several studies have confirmed this dualism, where the two terminologies are interrelated in the same scientific publication. Purpose: This study aims to find the interconnection between the two fields across several studies published in international journals. Method: This inquiry applies the bibliometrics method with data sourced from the Web of Science and uses VOSviewer as an analysis and mapping tool. Results: The results show that keywords containing "public relations" outnumber those containing "journalism." This study reveals six clusters of keyword mapping that form specific themes: crisis communication management, ethics, professional education, public relations practitioner-journalist relationships, media relations and publicity scope, news media management, and public relations and the media. Comparing the most cited references from public relations and journalism yields a 2:2 ratio, i.e., equal. There are six most cited authors, four from the USA and two from Australia. Conclusions: Public relations issues were found more often than journalism ones because most articles are written by experts, especially Americans and Australians, who have worked in public relations, although some also had early careers in journalism. However, journalism studies were still used as references in most articles. Implications: The work of Western researchers is still at the forefront of the development of public relations science and journalism studies, which challenges researchers from developing countries to develop studies at the international level.
... As bibliometrics is intended to quantitatively analyze the bibliographic features of publications in certain fields of knowledge, one of which is keywords (Smolina et al., 2020), it can be used, for example, by researchers to identify particular concepts to be researched (Rons, 2018). ...
Article
Full-text available
Many films have told the story of sport, either as the main story or as the background of the film. However, so far no research has been found that maps research results related to film and sport. Therefore, this study examines the various studies on film and sport that global researchers have produced. This research uses the bibliometric method. The data source is the Web of Science, while the tools used to process and display the data are ScientoPy and VOSviewer. The results show that scientific publications related to film and sport developed with fluctuations from 2000 to 2021. Authors from America and England occupy the top positions for number of publications; however, in 2020 and 2021, researchers from France and Canada were more productive in publishing their scientific works. Most published scientific works related to film and sport fall into the WoS Social Science category, although between 2020 and 2022 more are categorized under Communication and Film, Radio, & Television. The reference sources most widely cited in scientific publications on film and sport are Beeton's book Film-Induced Tourism (2005) and Crosson's Sport and Film (2013). The keyword mapping shows six clusters representing the keywords used by authors in scientific works related to film and sport. In conclusion, sport and cinema research is mapped comprehensively in this study.
... Bibliometric methods are valuable tools to track and trace scientific processes (Benavides-Velasco et al., 2013;Wydra, 2020). They are increasingly used to help critical decision-making in research policies on highly specialized topics (Rons, 2018) since they provide useful information for researchers (Rey-Martí et al., 2016). Bibliometric indicators, such as cited counts and journal impact factors, are often used to evaluate the impact of articles (Pan et al., 2018). ...
Article
Full-text available
In recent decades, interest in creativity has grown significantly; its importance relates to its impact on company performance, since creativity is defined as the root of innovation. Even though research has been quite fruitful in many disciplines, creativity in small and medium enterprises has been less explored. This article reviews the literature on creativity in small and medium enterprises and aims to establish proposals for future research. To achieve this objective, a bibliometric analysis was developed, considering the construction of scientific maps, performance analysis, and graphic maps. Additionally, a content analysis of the selected articles was performed to establish the variables studied around creativity. The study has shown that it is necessary to increase research on creativity in small and medium businesses across a wide variety of topics. Therefore, we offer a valuable framework to resolve existing gaps and guide future researchers.
... Bibliometric methods are used to help make important decisions on specialized topics (Rons, 2018). This is because they allow the monitoring and tracking of systems of scientific development (Benavides-Velasco et al., 2013) and provide useful information for researchers (Albort-Morant & Ribeiro-Soriano, 2016; Rey-Martí et al., 2016). ...
Article
Full-text available
Technological innovation is a matter of interest to governments, decision-makers, entrepreneurs, and researchers due to its impact on competitiveness, which is why publications in this field have grown exponentially. To synthesize the main research topics and highlight possible lines for future research, this work aims to develop a bibliometric analysis of technological innovation in the field of the food industry, based on the review of 1015 papers published in specialized journals. The methodology consists of analyzing bibliometric indicators of quantity and quality using the VOSviewer and SCIMat tools. The results show recognition of the different lines in which the research has organized the debate, grouping them into 12 main themes positioned on a strategic map. Furthermore, this study presents directions for future research obtained from the analysis of existing gaps. This study contributes to the literature on innovation by providing a systematization of technological innovation in the food industry.
... A novel method to approximate the specialty to which a given publication record belongs is presented by [31]. The method partially combines sets of key values for four publication data fields: source, title, authors, and references. ...
Article
Full-text available
Abstract. This paper presents an overview of entrepreneurship in the family business. We used detailed bibliometric analysis to map terms and analyze relations and tendencies between documents, keywords, authors, universities, organizations, and countries. The purpose is to better understand the phenomenon of entrepreneurship, its relationships, and the implications related to causes and consequences derived from a family business in the first stage of its life. Scholars with interests in entrepreneurship may find relevant and pertinent information about patterns of research between universities, authors, countries, and keywords, and about their co-citations and co-occurrences. We used descriptive bibliometric analysis to report publication and citation trends from 2005 to 2018, combining bibliometric analysis and mapping with thematic analysis, using Web of Science and VOSviewer software. We based this research on the Web of Science (WoS) Core Collection, including the Science Citation Index Expanded (SCI-Expanded), Social Sciences Citation Index (SSCI), Arts & Humanities Citation Index (A&HCI), and Emerging Sources Citation Index (ESCI), in order to analyze the most productive authors, institutions and countries; in addition, we looked for the most cited papers and articles. Bibliometric indicators represent bibliographic data, including the total number of publications and citations between 2005 and 2018 found in WoS. Using VOSviewer, we created graphical visualizations of the bibliographic material, developing maps of different terms (journals, keywords, institutions) alongside bibliographic coupling and co-citation analysis. Keywords: Entrepreneurship, bibliometrics, family business, VOSviewer, WoS
... We assume that searching the title generates a reasonable, possibly representative sample, of the field of interest. We base this assumption on a long tradition of research on the nature and usefulness of article titles (Rons 2018). ...
Article
Full-text available
In this study the evolution of the Big Data (BD) and Data Science (DS) literatures, and the relationship between the two, are analyzed with bibliometric indicators that help establish the course taken by publications in these research areas before and after the concepts formed. We observe a surge in BD publications alongside a gradual increase in DS publications. Interestingly, a new publication course emerges combining the BD and DS concepts. We evaluate the three literature streams using various bibliometric indicators, including research areas and their origin, central journals, the countries producing and funding research and startup organizations, citation dynamics, dispersion, and author commitment. We find that BD and DS have differing academic origins and different leading publications. Of the two terms, BD is more salient, possibly catalyzed by the strong acceptance of the pre-coordinated term by the research community, intensive citation activity, and also, we observe, by generous funding from Chinese sources. Overall, the DS literature serves as a theory base for BD publications.
... To provide an overview of the developed work, the presentation of results concerning publications used two approaches: one employing quantitative bibliometric indicators, and the other employing scientometric analysis based on maps of authors' networks, drawn up using a computational tool. Computational tools may be helpful in circumstances such as extending collaborations toward less familiar areas or in interdisciplinary research (Rons, 2018). Quantitative bibliometric indicators, such as number of articles published, number of citations and h-index allow the analysis of scientific performance of authors and their works (Cobo et al., 2015;Baier-Fuentes et al., 2018). ...
Article
Full-text available
This study aims to present a scientometric analysis, based on author network maps, to determine the most influential and relevant authors with published work on the topic of small and medium-sized enterprises, competitiveness and its measurement, including the use of key performance indicators. The academic research is based on prospecting to retrieve the most relevant research studies and to establish links with authors from key international research groups. To facilitate this study, we used research results from the Scopus and Web of Science databases, owing to their significant number of indexed scientific articles. The extracted data were compiled and analyzed through author networks using the statistical software Sci2 Tool, which supports temporal, geospatial, topical and network analysis. This study also attempts to point out research trends and gaps in this area. The results obtained are illustrated by author network maps, which reveal the main authors and groups of research topics, thereby improving access to information in a scientific manner.
... A partial explanation for the observation that reference values given by the item-oriented approach incorporating terms in the calculation of similarity estimates tend to perform worse than just using cited references might simply be that terms introduce more noise when publications from the same scientific problem area are to be identified. While terms are connected to the communicative aspect of fields and specialties, as they capture specific terminology, cited references connect to the cognitive aspects of a given scientific problem area, as they mirror a shared body of theories, methods and important papers (Rons 2018). Subject matter mismatch is probably more likely when terms (which are far less specific than cited references) are used for similarity estimation than when cited references are used, especially with a quite simple and straightforward bag-of-terms approach as we do here. ...
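The contrast drawn here between term-based and reference-based similarity can be made concrete with a minimal sketch. Assuming a plain Jaccard overlap as the similarity measure (an assumption for illustration; the item-oriented calculation in the cited study is more elaborate), the same function applies to a bag of terms or to a set of cited references:

```python
def jaccard(a, b):
    # Overlap of two feature sets: |A intersect B| / |A union B|.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Bag-of-terms similarity: shared terminology can match even when
# the underlying problem areas differ (subject matter mismatch).
terms_sim = jaccard(["citation", "impact", "model"],
                    ["citation", "network", "model"])

# Reference-based similarity: overlap in cited references mirrors
# a shared body of theories, methods and key papers.
refs_sim = jaccard(["ref_A", "ref_B", "ref_C"],
                   ["ref_A", "ref_B", "ref_D"])
```

The point of the passage is that an identical numeric overlap carries different evidential weight: a shared cited reference is a far more specific signal of a common problem area than a shared term.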
Article
Full-text available
In this paper, we compare two sophisticated publication-level approaches to ex-post citation normalization: an item-oriented approach and an approach falling under the general algorithmically constructed classification system approach. Using articles published in core journals in Web of Science (SCIE, SSCI & A&HCI) during 2009 (n = 955,639), we first examine, using the measure Proportion explained variation (PEV), to what extent the publication-level approaches can explain and correct for variation in the citation distribution that stems from subject matter heterogeneity. We then, for the subset of articles from life science and biomedicine (n = 456,045), gauge the fairness of the normalization approaches with respect to their ability to identify highly cited articles when subject area is factored out. This is done by utilizing information from publication-level MeSH classifications to create high quality subject matter baselines and by using the measure Deviations from expectations (DE). The results show that the item-oriented approach had the best performance regarding PEV. For DE, only the most fine-grained clustering solution could compete with the item-oriented approach. However, the item-oriented approach performed better when cited references were heavily weighted in the similarity calculations.
... Bibliometrics is a recognized approach to analyse research literature production and induce synthetic reviews (Alfonzo et al., 2014; Baumgartner, 2010; Colin et al., 2014; Jiain et al., 2015). It is now also used for the analysis of highly specialized subjects (Rons, 2018). Bibliometrics encompasses different counting and mapping methods (Garfield, 2006). ...
Preprint
Purpose The rapid development of eHealth requires the extension of existing health informatics competence sets. These competences are needed not only by health-care professionals but also by health-care consumers. The purpose of this paper is to analyse the literature production on health informatics and eHealth competences/skills (EHCS). Design/methodology/approach Bibliometric analysis and mapping have been used as a form of distant reading to perform thematic analysis, identify gaps in knowledge and predict future trends. Findings This study shows that the literature production on health informatics and EHCS differs in bibliometric indicators as well as in research content. Thematic analysis showed that medicine is the most productive subject area in both fields. However, health informatics competences/skills are more oriented toward education, nursing, electronic health records and evidence-based practice, while EHCS cover health information technology, engineering, computer science and patient-centred care. The literature production exhibits a positive trend and is geographically widespread in both fields. Research limitations/implications The use of the Scopus database might have led to different results than Web of Science or Medline, because different databases cover different lists of source titles. The authors tried various search strings and used the most optimal one for their study; however, a different search string might produce slightly different outcomes. In addition, the thematic analysis was performed on information source abstracts and titles only, as the analysis of full texts (where available) could lead to different results. Despite the fact that the thematic analysis was performed by three researchers with different scientific backgrounds, its results remain subjective.
On the other hand, the bibliometric analyses and comparison of health informatics and eHealth competences have never been done before, and this study revealed some important gaps in research in both fields. Practical implications The World Health Organization defined four distinct but related components of eHealth: mobile health, health information systems, telemedicine and distance learning. While the research in telemedicine and health information systems seems to be well covered, the skills and competencies in mobile health and distance learning should be researched more extensively. Social implications More research is needed on the skills and competencies associated with so-called connected health, a new subfield in eHealth research. The skills and competencies needed to better implement and use services for the management of chronic diseases and health coproduction, and to implement eHealth in developing countries, are currently under-researched areas and candidates for future research. For both health informatics competencies/skills and EHCS, we noted that more research is needed for personalised medicine, health coproduction, smart health, internet of things, internet of services and intelligent health systems. Originality/value The literature production on health informatics and EHCS has been analysed for the first time and compared in a systematic way, using bibliometrics. The results reveal current research directions as well as knowledge gaps, and could thus provide guidelines for further research.
Article
Educational data mining (EDM) enhances the educational system by uncovering hidden patterns of academic data. The discipline of EDM has grown rapidly and produced numerous publications, leading to knowledge dissemination among researchers. This research aims to understand the EDM field literature by examining the citation network of significant publications. This research utilizes a quantitative approach based on citation main path analysis (MPA) to analyze 1009 Web of Science (WoS) publications between 1988 and 2023. The study uncovers 22 significant publications that have shaped the knowledge diffusion trajectories of EDM. The research reveals that EDM has undergone three phases of evolution, each of which represents a substantial shift in the research focus: automated adaptation, leveraging human decision, and advanced predictive analytics. Unlike previous EDM reviews, this study applies a novel approach using multiple global MPA, uncovering five key sub‐research areas: student performance, early warning, learning behavior, transfer learning, and dropout. Notably, recent trends emphasize a growing focus on student performance. The primary contribution of this paper lies in its comprehensive mapping of EDM's developmental trajectory, offering an understanding of its diverse research trends. By elucidating these patterns and emerging areas, this study not only enriches the existing literature but also identifies unexplored topics that can guide future research directions, distinguishing itself from other EDM reviews by offering a more systematic and data‐driven analysis of the field's evolution.
Article
Full-text available
Bananas (Musa spp.) are among the most widely consumed fruits globally, yet their high perishability and short shelf-life pose significant challenges to the postharvest industry. To address this, edible coatings have been extensively studied for their ability to preserve the physical, microbiological, and sensory qualities of bananas. Among various types of edible coatings, polysaccharide-based coatings, particularly chitosan, have emerged as the most effective. The dipping method is predominantly employed for their application, surpassing spraying and brushing techniques. This review integrates insights from bibliometric analysis using Scopus, revealing that research on edible coatings for bananas began in 2009, with 45 journals contributing to the field. Key trends, including publication growth, author contributions, and geographical focus, are explored through VOS-viewer analysis. Mechanistically, edible coatings enhance postharvest banana quality by limiting gaseous exchange, reducing water loss, and preventing lipid migration. Performance is further improved by incorporating active ingredients such as antioxidants, antimicrobials, and plasticizers. Despite their benefits over synthetic chemicals, the commercial adoption of edible coatings faces limitations, related to scalability and practicality. This review highlights these challenges while proposing future directions for advancing edible coating technologies for banana preservation.
Article
Full-text available
Traditional irrigation and drainage management in China's irrigation districts suffers from problems such as complex management processes, dispersed information systems, difficulty in resource sharing, diverse data sources, and insufficient intelligent decision support. To further realize refined intelligent management decision-making for smart irrigation districts, this article uses as data sources the basic information of 28 Yellow River diversion irrigation districts in Henan Province, as well as the overall plan reports and issue management manuals of 7 management offices of the first and second phase projects of the Zhaokou Irrigation District. These data are collected and organized to build a knowledge graph for issue management in Yellow River diversion irrigation districts. A BERT + BiLSTM + CRF model is then used to intelligently identify entities such as irrigation projects and problem events in the inspection log texts of the Zhaokou Irrigation District, and entity alignment technology is used to match the inspection texts with entities in the knowledge graph. Finally, combined with the graph retrieval function, the intelligent generation of management decision-making solutions for smart irrigation districts is realized. Validation on specific irrigation district examples and analysis of the relevant models' evaluation indexes demonstrate the reliability of the problem management decision-making scheme proposed in this paper. This application effectively facilitates the integration of information data in the Yellow River diversion irrigation areas, enables visual management of intelligent irrigation zones, and presents a novel concept for the informatization construction of China's irrigation areas.
Chapter
Determining priorities for sustainable development involves identifying critical problems, creating the conditions necessary for successfully solving them, and identifying a set of methods and tools to ensure that these conditions are met. The existing quantitative approaches to priority setting do not consider the problems of resource provision for scientific priorities. The developed expert system makes it possible to use multilevel information-logical structures effectively for the systematic analysis of scientific priorities according to many criteria. It supports the transition from research goals to methods of achieving these goals, the choice of specific means, and the development of the necessary research methodology.
Article
The life cycle assessment (LCA) is an important environmental management tool that has been developed since the 1960s and it is a widely accepted field of research in the scientific community. This paper focuses on the changes of research trends of this field in the recent two decades (1999–2018) from the perspective of bibliometrics. First, we use the Web of Science (WoS) database to collect relevant literature, which is a widely used database in the field of bibliometrics. Then, this paper analyzes in detail the amount of publications, citations, cooperation models and their evolution trends, and identifies the most productive countries, institutions and authors in this field. Finally, by studying the author-keyword co-occurrence networks of the LCA papers in different periods, the research hotspots and their changes in this field are explored. We hope that the research of this paper will contribute to the faster and better development of the LCA field.
Chapter
Full-text available
Novelties are part of our daily lives. We constantly adopt new technologies, conceive new ideas, meet new people, experiment with new situations. Occasionally, we as individuals, in a complicated cognitive and sometimes fortuitous process, come up with something that is not only new to us, but to our entire society, so that what is a personal novelty can turn into an innovation at a global level. Innovations occur throughout social, biological and technological systems and, though we perceive them as a very natural ingredient of our human experience, little is known about the processes determining their emergence. Still, the statistical occurrence of innovations shows striking regularities that represent a starting point for gaining a deeper insight into the whole phenomenology. This paper represents a small step in that direction, focusing on reviewing the scientific attempts to effectively model the emergence of the new and its regularities, with an emphasis on more recent contributions: from the plain Simon model, dating back to the 1950s, to the newest model of Polya's urn with triggering of one novelty by another. What seems to be key in the successful modelling schemes proposed so far is the idea of looking at evolution as a path in a complex space, physical, conceptual, biological, technological, whose structure and topology get continuously reshaped and expanded by the occurrence of the new. Mathematically it is very interesting to look at the consequences of the interplay between the "actual" and the "possible" and this is the aim of this short review.
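The urn-with-triggering scheme mentioned above can be sketched in a few lines. This is a minimal illustrative simulation, not the reviewed authors' exact model; the parameter names `rho` (reinforcement) and `nu` (number of triggered novelties) follow the common convention for such models and are assumptions here.

```python
import random

def urn_with_triggering(steps, rho=2, nu=1, seed=0):
    """Minimal Polya urn with triggering.

    Each draw is reinforced with `rho` extra copies of the drawn color;
    drawing a never-seen color (a novelty) adds `nu + 1` brand-new colors
    to the urn, expanding the "adjacent possible".
    Returns the sequence of drawn colors.
    """
    rng = random.Random(seed)
    urn = [0]          # start with a single ball of color 0
    seen = set()       # colors drawn at least once
    next_color = 1
    sequence = []
    for _ in range(steps):
        ball = rng.choice(urn)
        sequence.append(ball)
        urn.extend([ball] * rho)              # reinforcement
        if ball not in seen:                  # novelty triggers new colors
            seen.add(ball)
            urn.extend(range(next_color, next_color + nu + 1))
            next_color += nu + 1
    return sequence
```

Running the simulation for a few thousand steps shows the hallmark regularity: the number of distinct colors grows sublinearly with the number of draws.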
Article
Full-text available
Selecting among alternative projects is a core management task in all innovating organizations. In this paper, we focus on the evaluation of frontier scientific research projects. We argue that the “intellectual distance” between the knowledge embodied in research proposals and an evaluator’s own expertise systematically relates to the evaluations given. To estimate relationships, we designed and executed a grant proposal process at a leading research university in which we randomized the assignment of evaluators and proposals to generate 2,130 evaluator–proposal pairs. We find that evaluators systematically give lower scores to research proposals that are closer to their own areas of expertise and to those that are highly novel. The patterns are consistent with biases associated with boundedly rational evaluation of new ideas. The patterns are inconsistent with intellectual distance simply contributing “noise” or being associated with private interests of evaluators. We discuss implications for policy, managerial intervention, and allocation of resources in the ongoing accumulation of scientific knowledge. This paper was accepted by Lee Fleming, entrepreneurship and innovation.
Article
Full-text available
We study research collaborations between cities in Africa, the Middle East and South Asia, focusing on the topics of malaria and tuberculosis. For this investigation we introduce a method to predict or recommend high-potential future (i.e., not yet realized) collaborations. The proposed method is based on link prediction techniques. A weighted network of co-authorships at the city level is constructed. Next, we calculate scores for each node pair according to three different measures: weighted Katz, rooted PageRank, and SimRank. The resulting scores can be interpreted as indicative of the likelihood of future linkage for the given node pair. A high score for two nodes that are not linked in the network is then treated as a recommendation for future collaboration. Results suggest that - of the three measures studied - the weighted Katz method leads to the most accurate predictions. Cities that often take part in new intercity collaborations are referred to as facilitator cities.
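The Katz-style scoring described above can be sketched as a truncated power series over the weighted adjacency matrix: paths of length l contribute with weight beta^l, so short paths dominate. This is a minimal illustration on a toy network, not the paper's exact weighting or network construction.

```python
import numpy as np

def weighted_katz_scores(A, beta=0.05, max_len=5):
    """Truncated weighted Katz index: S = sum_{l=1..max_len} beta^l * A^l.

    A is a (weighted) adjacency matrix. A higher S[i, j] for an unlinked
    pair (i, j) is read as a recommendation for a future link.
    """
    A = np.asarray(A, dtype=float)
    S = np.zeros_like(A)
    Al = np.eye(A.shape[0])
    for l in range(1, max_len + 1):
        Al = Al @ A                 # A^l
        S += (beta ** l) * Al
    return S
```

On a simple path network 0-1-2-3, the unlinked pair (0, 2) scores higher than (0, 3), reflecting the shorter connecting path.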
Article
Full-text available
Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations.
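Plain bibliographic coupling, which PBC builds on, counts the references two articles share; a context-aware variant can additionally weight each shared reference by the similarity of the passages that cite it. The sketch below is illustrative only: the Jaccard weighting is an assumption, not the authors' formula, and `ctx_a`/`ctx_b` are hypothetical mappings from reference IDs to citation-context text.

```python
def coupling_strength(refs_a, refs_b):
    """Plain bibliographic coupling: number of shared references."""
    return len(set(refs_a) & set(refs_b))

def context_weighted_coupling(refs_a, refs_b, ctx_a, ctx_b):
    """Illustrative PBC-style score: each shared reference contributes
    the Jaccard similarity of the word sets of its citation contexts
    in the two articles (an assumed weighting, for illustration only)."""
    score = 0.0
    for r in set(refs_a) & set(refs_b):
        wa = set(ctx_a.get(r, "").lower().split())
        wb = set(ctx_b.get(r, "").lower().split())
        if wa or wb:
            score += len(wa & wb) / len(wa | wb)
    return score
```

Two articles citing the same reference in similarly worded passages thus score higher than two articles that merely share the reference.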
Article
Full-text available
Author name ambiguity is a crucial problem in any type of bibliometric analysis. It arises when several authors share the same name, but also when one author expresses their name in different ways. This article focuses on the former, also called the “namesake” problem. In particular, we assess the extent to which this compromises the Thomson Reuters Essential Science Indicators ranking of the top 1 % most cited authors worldwide. We show that three demographic characteristics that should be unrelated to research productivity—name origin, uniqueness of one’s family name, and the number of initials used in publishing—in fact have a very strong influence on it. In contrast to what could be expected from Web of Science publication data, researchers with Asian names—and in particular Chinese and Korean names—appear to be far more productive than researchers with Western names. Furthermore, for any country, academics with common names and fewer initials also appear to be more productive than their more uniquely named counterparts. However, this appearance of high productivity is caused purely by the fact that these “academic superstars” are in fact composites of many individual academics with the same name. We thus argue that it is high time that Thomson Reuters starts taking name disambiguation in general and non-Anglophone names in particular more seriously.
Conference Paper
Full-text available
This paper proposes a network analytic approach for scientific paper recommendations to researchers and academic learners. The proposed approach makes use of the similarity between citing and cited papers to eliminate irrelevant citations. This is achieved by combining both content-related and network-based similarities. The process of selecting recommendations is inspired by the ways researchers adopt in literature search, i.e. traversing certain paths in a citation network by omitting others. In this paper, we present the application of the newly devised algorithm to provide paper recommendations. To evaluate the results, we conducted a study in which human raters evaluated the paper recommendations and the ratings were compared to the results of other network analytic algorithms (such as Main Path Analysis and Modularity Clustering) and a well known recommendation algorithm (Collaborative Filtering). The evaluation shows that the newly devised algorithm yields good results comparable to those generated by Collaborative Filtering and exceeds those of the other network analytic algorithms.
Article
Full-text available
Purpose – The purpose of this paper is to identify criteria for and definitions of disciplinarity, and how they differ between different types of literature. Design/methodology/approach – This synthesis is achieved through a purposive review of three types of literature: explicit conceptualizations of disciplinarity; narrative histories of disciplines; and operationalizations of disciplinarity. Findings – Each angle of discussing disciplinarity presents distinct criteria. However, there are a few common axes upon which conceptualizations, disciplinary narratives, and measurements revolve: communication, social features, topical coherence, and institutions. Originality/value – There is considerable ambiguity in the concept of a discipline. This is of particular concern in a heightened assessment culture, where decisions about funding and resource allocation are often discipline-dependent (or focussed exclusively on interdisciplinary endeavors). This work explores the varied nature of disciplinarity and, through synthesis of the literature, presents a framework of criteria that can be used to guide science policy makers, scientometricians, administrators, and others interested in defining, constructing, and evaluating disciplines.
Article
Full-text available
Discipline-specific research evaluation exercises are typically carried out by panels of peers, known as expert panels. To the best of our knowledge, no methods are available to measure overlap in expertise between an expert panel and the units under evaluation. This paper explores bibliometric approaches to determine this overlap, using two research evaluations of the departments of Chemistry (2009) and Physics (2010) of the University of Antwerp as a test case. We explore the usefulness of overlay mapping on a global map of science (with Web of Science subject categories) to gauge overlap of expertise and introduce a set of methods to determine an entity’s barycenter according to its publication output. Barycenters can be calculated starting from a similarity matrix of subject categories (N-dimensions) or from a visualization thereof (2-dimensions). We compare the results of the N-dimensional method with those of two 2-dimensional ones (Kamada-Kawai maps and VOS maps) and find that they yield very similar results. The distance between barycenters is used as an indicator of expertise overlap. The results reveal that there is some discrepancy between the panel’s and the groups’ publications in both the Chemistry and the Physics departments. The panels were not as diverse as the groups that were assessed. The match between the Chemistry panel and the Department was better than that between the Physics panel and the Department.
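Once each subject category has a position on a 2-D map of science, a publication-weighted barycenter of the kind described above is a direct computation. The sketch below assumes hypothetical category coordinates and publication counts; the Euclidean distance between two barycenters then serves as the expertise-overlap indicator.

```python
import numpy as np

def barycenter(category_coords, pub_counts):
    """Publication-weighted barycenter of an entity on a 2-D map of science.

    category_coords: dict mapping subject category -> (x, y) map position
    pub_counts:      dict mapping subject category -> number of publications
    """
    total = sum(pub_counts.values())
    x = sum(pub_counts[c] * category_coords[c][0] for c in pub_counts) / total
    y = sum(pub_counts[c] * category_coords[c][1] for c in pub_counts) / total
    return np.array([x, y])

def expertise_distance(coords, panel_counts, group_counts):
    """Distance between two barycenters as an (inverse) overlap indicator."""
    return float(np.linalg.norm(barycenter(coords, panel_counts)
                                - barycenter(coords, group_counts)))
```

A panel publishing evenly across two categories and a group concentrated in one of them end up at different barycenters, and the distance quantifies the mismatch.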
Conference Paper
Full-text available
The internal homogeneity of research disciplines within the subject categories (SCs) of the Web of Science database (WoS) regarding their publication and citation practices is an essential precondition for the field-normalization of citation indicators. As former research has shown, this imperative of underlying homogeneity seems not to be met throughout all categories. A keyword-based clustering method displays both the diversity of research areas included in an SC and the substantial differences between the clusters' mean citation rates. This proof-of-concept paper, based on one country set and two SCs, presents a bootstrapping method that quantifies the degree of heterogeneity within subject categories as a stability interval. The MNCS 95% stability intervals of our sets span 6.7% and 7.3% of the corresponding scores. This kind of robustness measure could be implemented in future evaluative citation analysis in order to convey the coarseness of bibliometric point estimates.
Article
Full-text available
Nowadays, science should address societal challenges, such as ‘sustainability’, or ‘responsible research and innovation’. This emerging form of steering toward broad and generic goals involves the use of ‘big words’: encompassing concepts that are uncontested themselves, but that allow for multiple interpretations and specifications. This paper is based on the premise that big words matter in the structuring of scientific practice, and it empirically traces how three ‘big words’ – ‘sustainability’, ‘responsible innovation’ and ‘valorization’ (a term closely linked to knowledge utilization) – steer research activities within a Dutch research program of nanotechnology that is explicitly related to societal challenges. To do so, the theory of articulation is extended with the concept of ideographs. We report on how the top-down steering ambitions of policy are countervailed by the bottom-up dynamics and logics of researchers. We also conclude that when ‘big words’ are used in an organizational and administrative setting, it changes their effects.
Article
Full-text available
A similarity-oriented approach for deriving reference values used in citation normalization is explored and contrasted with the dominant approach of utilizing database-defined journal sets as a basis for deriving such values. In the similarity-oriented approach, an assessed article's raw citation count is compared with a reference value that is derived from a reference set, which is constructed in such a way that articles in this set are estimated to address a subject matter similar to that of the assessed article. This estimation is based on second-order similarity and utilizes a combination of 2 feature sets: bibliographic references and technical terminology. The contribution of an article in a given reference set to the reference value is dependent on its degree of similarity to the assessed article. It is shown that reference values calculated by the similarity-oriented approach are considerably better at predicting the assessed articles' citation count compared to the reference values given by the journal-set approach, thus significantly reducing the variability in the observed citation distribution that stems from the variability in the articles' addressed subject matter.
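The core idea above, letting similar articles contribute to the reference value in proportion to their similarity, reduces to a similarity-weighted mean of citation counts. The sketch below is a minimal illustration: the paper's second-order similarity estimation is not reproduced, and the linear weighting is an assumed choice.

```python
def similarity_weighted_reference_value(ref_set):
    """Reference value as a similarity-weighted mean of citation counts.

    ref_set: list of (citations, similarity) pairs for articles estimated
    to address a subject similar to the assessed article. An article's
    contribution grows with its similarity to the assessed article.
    """
    wsum = sum(s for _, s in ref_set)
    return sum(c * s for c, s in ref_set) / wsum
```

The assessed article's raw citation count would then be compared against this value instead of a journal-set average.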
Article
Full-text available
Field normalization, and its effect on bibliometric indicators, is a widely discussed topic among bibliometricians. It is not the necessity of field normalization around which the debate evolves, but how to field-normalize bibliometric indicators. In this article we present the results of a study in which publication data from a large disciplinary database in economics (EconLit) are combined with the multidisciplinary citation indexes produced by Thomson Reuters. The main purpose of the study is to investigate whether it is possible to combine the classification scheme of the economics database with the advantages of the citation indexes (both multiple addresses and citation data), in order to improve the applicability of the citation indexes in research performance studies in the field of economics and its periphery. The authors show the starting points of both databases, the outcome of the matching and combining of both sets of publications, and the effects of EconLit field classification in terms of differences in impact levels. The study clearly shows that research performance exercises conducted in the field of economics would benefit from the labeling of publications in the citation indexes with a more detailed classification scheme, as found in EconLit.
Article
Full-text available
We introduce the quantitative method named "reference publication year spectroscopy" (RPYS). With this method one can determine the historical roots of research fields and quantify their impact on current research. RPYS is based on the analysis of the frequency with which references are cited in the publications of a specific research field in terms of the publication years of these cited references. The origins show up in the form of more or less pronounced peaks mostly caused by individual publications which are cited particularly frequently. In this study, we use research on graphene and on solar cells to illustrate how RPYS functions, and what results it can deliver.
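The counting step of RPYS is straightforward: tally the publication years of all cited references across the field's papers and look for years that stand out against their neighborhood. The sketch below uses the deviation from the five-year median as the peak indicator, a common detrending choice that may differ in detail from the original study.

```python
from collections import Counter

def rpys_spectrum(cited_ref_years):
    """Reference Publication Year Spectroscopy, minimal form.

    cited_ref_years: iterable of publication years, one per cited reference.
    Returns (raw year counts, deviation of each year's count from the
    median of its five-year window); pronounced positive deviations mark
    candidate historical roots of the field.
    """
    counts = Counter(cited_ref_years)
    spectrum = {}
    for y in sorted(counts):
        window = sorted(counts.get(y + d, 0) for d in range(-2, 3))
        spectrum[y] = counts[y] - window[2]   # count minus 5-year median
    return counts, spectrum
```

A year cited far more often than its neighbors, typically driven by one heavily cited publication, shows up as a clear positive peak in the spectrum.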
Article
Full-text available
Since the Science Citation Index emerged within the system of scientific communication in 1964, an intense controversy about its character has been raging: in what sense can citation analysis be trusted? This debate can be characterized as the confrontation of different perspectives on science. In this paper the citation representation of science is discussed: the way the citation creates a new reality of as well as in the world of science; the main features of this reality; and some implications for science and science policy.
Article
Full-text available
Three different types of bibliometrics — literature bibliometrics, patent bibliometrics, and linkage bibliometrics — can all be used to address various government performance and results questions. Applications of these three bibliometric types will be described within the framework of Weinberg's internal and external criteria: whether the work being done is good science, efficiently and effectively done, and whether it is important science from a technological viewpoint. Within all bibliometrics the fundamental assumption is that the frequency with which a set of papers or patents is cited is a measure of the impact or influence of that set. The literature bibliometric indicators are counts of publications and citations received in the scientific literature, and various derived indicators covering such phenomena as cross-sectoral citation, coauthorship and concentration within influential journals. One basic observation of literature bibliometrics, which carries over to patent bibliometrics, is that of highly skewed distributions — with a relatively small number of high-impact patents and papers, and large numbers of patents and papers of minimal impact. The key measure is whether an agency is producing or supporting highly cited papers and patents. The final set of data is in the area of linkage bibliometrics, looking at citations from patents to scientific papers. These are particularly relevant to the external criteria, in that institutions and supporting agencies whose papers are highly cited in patents are making measurable contributions to a nation's technological progress.
Article
We examine how the premature death of eminent life scientists alters the vitality of their fields. While the flow of articles by collaborators into affected fields decreases after the death of a star scientist, the flow of articles by non-collaborators increases markedly. This surge in contributions from outsiders draws upon a different scientific corpus and is disproportionately likely to be highly cited. While outsiders appear reluctant to challenge leadership within a field when the star is alive, the loss of a luminary provides an opportunity for fields to evolve in new directions that advance the frontier of knowledge within them.
Article
This article proposes a model of document selection by real users of a bibliographic retrieval system. It reports on Part 1 of a longitudinal study of decision making on document use by academics during an actual research project. (Part 2 followed up the same users on how the selected documents were actually used in subsequent stages.) The participants are 25 self-selected faculty and graduate students in Agricultural Economics. After a reference interview, the researcher conducted a search of DIALOG databases and prepared a printout. The users selected documents from this printout; they were asked to read and think aloud while selecting documents. Their verbal reports were recorded and analyzed from a utility-theoretic perspective. The following model of the decision-making in the selection process emerged: document information elements (DIEs) in document records provide the information for judging documents on 11 criteria (including topicality, orientation, quality, novelty, and authority); the criteria judgments are combined in an assessment of document value along five dimensions (epistemic, functional, conditional, social, and emotional values), leading to the use decision. This model accounts for the use of personal knowledge and decision strategies applied in the selection process. The model has implications for the design of an intelligent document selection assistant.
Article
The claim that co-citation analysis is a useful tool to map subject-matter specialties of scientific research in a given period, is examined. A method has been developed using quantitative analysis of content-words related to publications in order to: (1) study coherence of research topics within sets of publications citing clusters, i.e., (part of) the “current work” of a specialty; (2) to study differences in research topics between sets of publications citing different clusters; and (3) to evaluate recall of “current work” publications concerning the specialties identified by co-citation analysis. Empirical support is found for the claim that co-citation analysis identifies indeed subject-matter specialties. However, different clusters may identify the same specialty, and results are far from complete concerning the identified “current work.” These results are in accordance with the opinion of some experts in the fields. Low recall of co-citation analysis concerning the “current work” of specialties is shown to be related to the way in which researchers build their work on earlier publications: the “missed” publications equally build on very recent earlier work, but are less “consensual” and/or less “attentive” in their referencing practice. Evaluation of national research performance using co-citation analysis appears to be biased by this “incompleteness.”
Article
Publication and citation patterns can vary significantly between related disciplines or narrower specialties, even when they share journals. Journal-based structures are therefore not accurate enough to approximate certain specialties, neither subject categories in global citation indices, nor cell sub-structures (Rons, 2012). This paper presents first test results of a new methodology that approximates the specialty of a highly specialized seed record by combining criteria for four publication metadata fields, thereby broadly covering the conceptual components defining disciplines and scholarly communication. To offer added value compared to journal-based structures, the methodology needs to generate sufficiently distinct results for seed directories in related specialties (sharing subject categories, cells, or even sources) with significantly different characteristics. This is tested successfully for the sub-domains of theoretical and experimental particle physics. In particular, analyses of specialties whose characteristics deviate from those of the broader discipline in which they are embedded can benefit from an approach that discerns down to the specialty level. Such specialties are potentially present in all disciplines, for instance as cases of peripheral, emerging, frontier, or strategically prioritized research areas.
Article
Bibliometric analyses depend on the quality of data sets and of the author name disambiguation process, which attributes papers bearing author names to real persons. Errors in the author name disambiguation process can distort the results of the analyses. To assess the resulting error in the analysis outcomes, Monte Carlo simulations can be used. This paper presents a basic algorithm for such simulations and shows how errors lead to changes in the results of different kinds of analyses (rankings, and regression analyses with the number of papers as the dependent variable). The results show that rankings of authors depend more on data set quality than regression coefficients do. Both the mean and the individual per-person data set quality are important for valid ranking results. Regression coefficients change by less than 10% under the quality of current automatic attribution processes.
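A Monte Carlo check of ranking robustness along these lines can be sketched as follows: randomly reattribute a fraction of papers and measure how often the top-ranked author is unchanged. This is an illustrative simplification, not the paper's algorithm; `error_rate` and the uniform reattribution scheme are assumptions.

```python
import random

def top_rank_stability(paper_counts, error_rate, trials=200, seed=0):
    """Fraction of Monte Carlo trials in which the top-ranked author
    keeps the top rank after random paper misattribution.

    paper_counts: dict mapping author -> number of papers
    error_rate:   probability that any single paper is reattributed
                  to a (uniformly chosen) author.
    """
    rng = random.Random(seed)
    authors = list(paper_counts)
    true_top = max(paper_counts, key=paper_counts.get)
    stable = 0
    for _ in range(trials):
        noisy = dict(paper_counts)
        for a in authors:
            for _ in range(paper_counts[a]):
                if rng.random() < error_rate:   # simulate a disambiguation error
                    noisy[a] -= 1
                    noisy[rng.choice(authors)] += 1
        if max(noisy, key=noisy.get) == true_top:
            stable += 1
    return stable / trials
```

A large lead in paper counts survives small error rates almost always, while close rankings are far more sensitive, mirroring the paper's point that rankings depend strongly on data set quality.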
Article
Discipline-specific research evaluation exercises are typically carried out by panels of peers, known as expert panels. To the best of our knowledge, no methods are available to measure overlap in expertise between an expert panel and the units under evaluation. This paper explores bibliometric approaches to determine this overlap, using two research evaluations of the departments of Chemistry (2009) and Physics (2010) of the University of Antwerp as a test case. We explore the usefulness of overlay mapping on a global map of science (with Web of Science subject categories) to gauge overlap of expertise and introduce a set of methods to determine an entity’s barycenter according to its publication output. Barycenters can be calculated starting from a similarity matrix of subject categories (N-dimensions) or from a visualization thereof (2-dimensions). We compare the results of the N-dimensional method with those of two 2-dimensional ones (Kamada-Kawai maps and VOS maps) and find that they yield very similar results. The distance between barycenters is used as an indicator of expertise overlap. The results reveal that there is some discrepancy between the panel’s and the groups’ publications in both the Chemistry and the Physics departments. The panels were not as diverse as the groups that were assessed. The match between the Chemistry panel and the Department was better than that between the Physics panel and the Department.
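The 2-dimensional variant of the barycenter computation reduces to a publication-weighted centroid on a map of science, with the distance between two barycenters serving as the overlap indicator. A minimal sketch, assuming category coordinates are already available from a Kamada-Kawai or VOS layout (the data structures here are illustrative):

```python
import math

def barycenter(pub_counts, coords):
    """Publication-weighted centroid of an entity on a 2-D map of
    science. pub_counts: {subject_category: n_publications};
    coords: {subject_category: (x, y)} from a precomputed layout."""
    total = sum(pub_counts.values())
    x = sum(n * coords[c][0] for c, n in pub_counts.items()) / total
    y = sum(n * coords[c][1] for c, n in pub_counts.items()) / total
    return x, y

def overlap_distance(panel_counts, group_counts, coords):
    """Distance between two barycenters; smaller distance suggests
    better overlap in expertise."""
    px, py = barycenter(panel_counts, coords)
    gx, gy = barycenter(group_counts, coords)
    return math.hypot(px - gx, py - gy)
```

The N-dimensional method replaces the `(x, y)` coordinates with rows of the category similarity matrix; the paper reports that both variants yield very similar results.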
Article
Tenure decisions and university rankings are just two examples where interfield comparison of academic output is needed. There are differences in publication performance among fields when the number of papers is used as the quantity measure and the Journal Impact Factor is used as the quality measure. For example, it is well known that economics departments publish less than chemistry departments and that their journals have lower impact factors. But there is no consensus on the magnitude of the difference or on the methodology for the adjustment. Every decision maker makes his own adjustment and uses a different formula. In this paper, we quantify the publication performance differences among nine academic fields by using data from 1417 departments in the United States. We use two quality measures. First, we weight the publications by the impact factor of the journals. Second, we only consider the publications in the journals that are in the top quartile of the subject categories. We see that there are vast interfield differences in terms of the number of publications. Moreover, we find that the interfield differences are augmented when we consider the quality of the publications. Lastly, we rank the departments according to the quality of their graduate programs. We see that there are also huge differences among the departments with graduate programs of comparable rank.
Article
Calls for interdisciplinary collaboration have become increasingly common in the face of large-scale complex problems (including climate change, economic inequality, and education, among others); however, outcomes of such collaborations have been mixed, due, among other things, to the so-called “translation problem” in interdisciplinary research. This article presents a potential solution: an empirical approach to quantitatively measure both the degree and nature of differences among disciplinary tongues through the social and epistemic terms used (a research area we refer to as discourse epistemetrics), in a case study comparing dissertations in philosophy, psychology, and physics. Using a support-vector model of machine learning to classify disciplines based on relative frequencies of social and epistemic terms, we were able to markedly improve accuracy over a random selection baseline (distinguishing between disciplines with as high as 90% accuracy) as well as acquire sets of most indicative terms for each discipline by their relative presence or absence. These lists were then considered in light of findings of sociological and epistemological studies of disciplines and found to validate the approach's measure of social and epistemic disciplinary identities and contrasts. Based on the findings of our study, we conclude by considering the beneficiaries of research in this area, including bibliometricians, students, and science policy makers, among others, as well as laying out a research program that expands the number of disciplines, considers shifts in socio-epistemic identities over time and applies these methods to nonacademic epistemological communities (e.g., political groups).
Article
Individual, excellent scientists have become increasingly important in the research funding landscape. Accurate bibliometric measures of an individual's performance could help identify excellent scientists, but still present a challenge. One crucial aspect in this respect is an adequate delineation of the sets of publications that determine the reference values to which a scientist's publication record and its citation impact should be compared. The structure of partition cells formed by intersecting fixed subject categories in a database has been proposed to approximate a scientist's specialty more closely than can be done with the broader subject categories. This paper investigates this cell structure's suitability as an underlying basis for methodologies to assess individual scientists, from two perspectives: (1) Proximity to the actual structure of publication records of individual scientists: The distribution and concentration of publications over the highly fragmented structure of partition cells are examined for a sample of ERC grantees; (2) Proximity to customary levels of accuracy: Differences in commonly used reference values (mean expected number of citations per publication, and threshold number of citations for highly cited publications) between adjacent partition cells are compared to differences in two other dimensions: successive publication years and successive citation window lengths. Findings from both perspectives are in support of partition cells rather than the larger subject categories as a journal based structure on which to construct and apply methodologies for the assessment of highly specialized publication records such as those of individual scientists.
Article
Normalization of citation scores using reference sets based on Web-of-Science Subject Categories (WCs) has become an established ("best") practice in evaluative bibliometrics. For example, the Times Higher Education World University Rankings are, among other things, based on this operationalization. However, WCs were developed decades ago for the purpose of information retrieval and evolved incrementally with the database; the classification is machine-based and partially manually corrected. Using the WC "information science & library science" and the WCs attributed to journals in the field of "science and technology studies," we show that WCs do not provide sufficient analytical clarity to carry bibliometric normalization in evaluation practices because of "indexer effects." Can the compliance with "best practices" be replaced with an ambition to develop "best possible practices"? New research questions can then be envisaged.
Article
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
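The generative model can be made concrete with a small sampler. Note this sketch uses collapsed Gibbs sampling, a common alternative inference method, not the variational EM procedure the paper itself presents; hyperparameter values and the corpus format are illustrative assumptions.

```python
import random

def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, iters=100, seed=0):
    """Minimal collapsed Gibbs sampler for LDA. docs is a list of
    token lists. Returns per-document topic mixtures (theta), the
    vocabulary, and topic-word counts. Illustrative only."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    wid = {w: i for i, w in enumerate(vocab)}
    ndk = [[0] * n_topics for _ in docs]      # doc-topic counts
    nkw = [[0] * V for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                       # topic totals
    z = []                                    # topic of each token
    for d, doc in enumerate(docs):            # random initialization
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            zd.append(k)
            ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
        z.append(zd)
    for _ in range(iters):                    # resample each token's topic
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d][k] -= 1; nkw[k][wid[w]] -= 1; nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][wid[w]] + beta)
                           / (nk[t] + V * beta) for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights)[0]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
    theta = [[(ndk[d][t] + alpha) / (len(docs[d]) + n_topics * alpha)
              for t in range(n_topics)] for d in range(len(docs))]
    return theta, vocab, nkw
```

Each row of `theta` is the document's finite mixture over topics that the abstract describes as the explicit representation of a document.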
Article
The majority of the effort in metrics research has addressed research evaluation. Far less research has addressed the unique problems of research planning. Models and maps of science that can address the detailed problems associated with research planning are needed. This article reports on the creation of an article-level model and map of science covering 16 years and nearly 20 million articles using cocitation-based techniques. The map is then used to define discipline-like structures consisting of natural groupings of articles and clusters of articles. This combination of detail and high-level structure can be used to address planning-related problems such as identification of emerging topics and the identification of which areas of science and technology are innovative and which are simply persisting. In addition to presenting the model and map, several process improvements that result in more accurate structures are detailed, including a bibliographic coupling approach for assigning current papers to cocitation clusters and a sequential hybrid approach to producing visual maps from models.
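The bibliographic coupling step mentioned above assigns a current paper to an existing cocitation cluster based on shared cited references. A minimal sketch of that idea, with the tie-breaking rule and data structures as simplifying assumptions of our own:

```python
def assign_by_coupling(paper_refs, clusters):
    """Assign a paper to the cocitation cluster with which it shares
    the most cited references (bibliographic coupling strength).
    paper_refs: set of reference ids; clusters: {name: set_of_ref_ids}.
    Ties are broken by cluster name order, an arbitrary choice."""
    return max(sorted(clusters),
               key=lambda c: len(paper_refs & clusters[c]))
```

A production system would normalize coupling strength by cluster size and handle papers that share no references with any cluster; this sketch shows only the core matching rule.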
Article
A system of four research levels, designed to classify scientific journals from most applied to most basic, was introduced by Francis Narin and colleagues in the 1970s. Research levels have been used since that time to characterize research at institutional and departmental levels. Currently, less than half of all articles published are in journals that have been classified by research level. There is thus a need for the notion of research level to be extended in a way that all articles can be so classified. This article reports on a new model – trained from title and abstract words and cited references – that classifies individual articles by research level. The model covers all of science, and has been used to classify over 25 million articles from Scopus by research level. The final model and set of classified articles are further characterized. Code is available at https://github.com/SciTechStrategies/rlev-model.
Article
The notion of ‘core documents’, first introduced in the context of co-citation analysis and later re-introduced for bibliographic coupling and extended to hybrid approaches, refers to the representation of the core of a document set according to given criteria. In the present study, core documents are used for the identification of new emerging topics. The proposed method proceeds from independent clustering of disciplines in different time windows. Cross-citations between core documents and clusters in different periods are used to detect new, exceptionally growing clusters or clusters with changing topics. Three paradigmatic types of new, emerging topics are distinguished. Methodology is illustrated using the example of four ISI subject categories selected from the life sciences, applied sciences and the social sciences.
Article
Field normalized citation rates are well-established indicators for research performance from the broadest aggregation levels such as countries, down to institutes and research teams. When applied to still more specialized publication sets at the level of individual scientists, a more accurate delimitation is also required of the reference domain that provides the expectations to which a performance is compared. This necessity for sharper accuracy challenges standard methodology based on predefined subject categories. This paper proposes a way to define a reference domain that is more strongly delimited than in standard methodology, by building it up out of cells of the partition created by the pre-defined subject categories and their intersections. This partition approach can be applied to different existing field normalization variants. The resulting reference domain lies between those generated by standard field normalization and journal normalization. Examples based on fictive and real publication records illustrate how the potential impact on results can exceed or be smaller than the effect of other currently debated normalization variants, depending on the case studied. The proposed Partition-based Field Normalization is expected to offer advantages in particular at the level of individual scientists and other very specific publication records, such as publication output from interdisciplinary research.
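The partition idea can be sketched numerically: each publication's expected citation rate is taken from its partition cell, i.e. the exact set of subject categories it belongs to, rather than from any single broad category. This is a simplified reading for illustration; the paper describes several normalization variants, and the mean-of-ratios aggregation chosen here is one common convention.

```python
def partition_normalized_score(pubs, cell_means):
    """Mean of per-publication ratios of observed to expected citations,
    where the expectation comes from the publication's partition cell.
    pubs: list of (citations, frozenset_of_subject_categories);
    cell_means: {frozenset_of_categories: mean_citations_per_paper}."""
    ratios = [cites / cell_means[cats] for cites, cats in pubs]
    return sum(ratios) / len(ratios)
```

A paper indexed in both "Physics, Particles & Fields" and "Astronomy & Astrophysics" is thus compared only to papers in that same intersection cell, not to either full category.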
Article
In this paper the adequacy of the co-word method for mapping the structure of scientific inquiry is explored. Co-word analysis of both the keywords and the titles of a set of papers in `acidification research' is undertaken and the results are found to be comparable, though the keyword-derived results provide greater detail. This strongly suggests that keyword indexing does not, as has sometimes been claimed, distort co-word findings. It also points to differences between titles (which often emphasize the supposed originality of an article) and keywords (which tend to show the relationship between the paper and other publications). The paper also explores important differences between the methodological assumptions that underlie the Paris/Keele co-word clustering algorithms and the factor analysis method for creating clusters.
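The first step of any co-word analysis, whether run on keywords or title words, is a co-occurrence matrix counting how often term pairs appear in the same paper. A minimal sketch of that step (the clustering that follows, Paris/Keele or factor-analytic, is not shown):

```python
from collections import Counter
from itertools import combinations

def coword_matrix(papers):
    """Count term co-occurrences across papers. papers: list of
    term lists (keywords or title words). Returns a Counter keyed
    by sorted term pairs."""
    pairs = Counter()
    for terms in papers:
        # sorted(set(...)) deduplicates within a paper and gives a
        # canonical order so (a, b) and (b, a) count as one pair
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs
```

Running this separately on keyword lists and on tokenized titles, then comparing the resulting maps, mirrors the comparison the paper performs.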
Article
This Note examines the data base used by Lotka in propounding his Law, and by Price in elaborating it, and questions the validity of the generalizations drawn from it.
Article
This paper first explains the need to define subfields of science by means of “filters” that selectively retrieve papers from a database, and then describes how such filters are constructed and calibrated. Good filters should have precision and recall of the order of 90% so as to be representative of a subfield; they are created by an interactive partnership between an expert in the subject and a bibliometrician. They are based primarily on the use of title keywords, often in combination rather than singly, and specialist journals. Their calibration depends on experts marking lists of papers extracted by the filter as relevant, don't know or not relevant. This allows the actual size of a subfield to be estimated and hence the relative importance accorded to it within a major field of science. It permits organisations and countries to see their contributions to individual scientific subfields in detail.
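The calibration arithmetic described above can be made explicit: once experts have marked a sample of retrieved papers as relevant or not, and a sample of missed papers has been identified, precision and recall follow directly, and the true subfield size can be estimated from them. The function name and argument layout are our own simplification.

```python
def filter_calibration(retrieved_relevant, retrieved_total, missed_relevant):
    """Precision and recall of a subfield filter from expert marking,
    plus an estimate of the true subfield size.
    retrieved_relevant: retrieved papers marked relevant;
    retrieved_total: all papers the filter retrieved;
    missed_relevant: relevant papers the filter failed to retrieve."""
    precision = retrieved_relevant / retrieved_total
    recall = retrieved_relevant / (retrieved_relevant + missed_relevant)
    # retrieved_total * precision recovers the relevant retrieved papers;
    # dividing by recall scales up to all relevant papers in the database
    estimated_size = retrieved_total * precision / recall
    return precision, recall, estimated_size
```

With the 90% precision and recall the paper recommends, the estimated size is close to the retrieved count, which is why well-calibrated filters make a subfield's size directly measurable.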
Article
An account of discoveries pertaining to linguistic change, presenting many problems to the psychologist whose interest lies in speech-behavior or meaning. Harvard Book List (edited) 1955 #268 (PsycINFO Database Record (c) 2012 APA, all rights reserved)