
Francesco Osborne
- Doctor of Philosophy
- Senior Research Fellow at The Open University
About
131 Publications
54,694 Reads
2,257 Citations
Introduction
I am a Senior Research Fellow at the Knowledge Media Institute of the Open University in Milton Keynes, UK, where I lead the Scholarly Knowledge Mining team. My research covers Artificial Intelligence, Information Extraction, Knowledge Graphs, Science of Science, Semantic Web, Research Analytics, and Semantic Publishing. I have authored more than a hundred peer-reviewed publications in the top journals and conferences of my research areas.
More info at https://www.francescoosborne.net.
Publications (131)
The ability to recognise new research trends early is strategic for many stakeholders, such as academics, institutional funding bodies, academic publishers and companies. While the state of the art presents several works on the identification of novel research topics, detecting the emergence of a new research area at a very early stage, i.e., when...
Being able to rapidly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. The literature presents several approaches to identifying the emergence of new research topics, which rely on the assumption that the topic is already exhibiting a certain d...
With the explosive growth of artificial intelligence (AI) and big data, it has become vitally important to organize and represent the enormous volume of knowledge appropriately. As graph data, knowledge graphs accumulate and convey knowledge of the real world. It has been well-recognized that knowledge graphs effectively represent complex informati...
This paper presents a comprehensive review of the use of Artificial Intelligence (AI) in Systematic Literature Reviews (SLRs). An SLR is a rigorous and organised methodology that assesses and integrates prior research on a given topic. Numerous tools have been developed to assist and partially automate the SLR process. The increasing role of AI in t...
The rapid evolution of AI and the increased accessibility of scientific articles through open access mark a pivotal moment in research. AI-driven tools are reshaping how scientists explore, interpret, and contribute to the body of scientific knowledge, offering unprecedented opportunities. Nonetheless, a significant challenge remains: dealing with...
Knowledge Organization Systems (KOSs), such as term lists, thesauri, taxonomies, and ontologies, play a fundamental role in categorising, managing, and retrieving information. In the academic domain, KOSs are often adopted for representing research areas and their relationships, primarily aiming to classify research articles, academic courses, pate...
The rise of big data has introduced significant challenges in managing, storing, analyzing, and modeling data. These challenges require the integration of diverse storage and computing platforms. Consequently, incorporating new data sources and developing new data transformation methods is typically a slow and expensive process. Furthermore, these...
The integration of Environmental, Social, and Governance (ESG) factors into corporate decision-making is a fundamental aspect of sustainable finance. However, ensuring that business practices align with evolving regulatory frameworks remains a persistent challenge. AI-driven solutions for automatically assessing the alignment of sustainability repo...
The process of news digitalization over the past decades has released massive amounts of news content, revolutionizing consumer access to news and disrupting traditional business models. These radical changes have also introduced new opportunities for media content analysis, potentially opening up new scenarios for ambitious large-scale media analy...
Ontologies of research topics are crucial for structuring scientific knowledge, enabling scientists to navigate vast amounts of research, and forming the backbone of intelligent systems such as search engines and recommendation systems. However, manual creation of these ontologies is expensive, slow, and often results in outdated and overly general...
Despite the seismic changes brought about by the web and social media, mainstream news sources still play a crucial role in democratic societies. In particular, a healthy democracy requires a balanced and diverse media landscape, able to provide an arena in which the various topics and viewpoints relevant to the political discourse of the day are p...
Numerous methods and pipelines have recently emerged for the automatic extraction of knowledge graphs from documents such as scientific publications and patents. However, adapting these methods to incorporate alternative text sources like micro-blogging posts and news has proven challenging as they struggle to model open-domain entities and relatio...
Several techniques and workflows have emerged recently for automatically extracting knowledge graphs from documents like scientific articles and patents. However, adapting these approaches to integrate alternative text sources such as micro-blogging posts and news and to model open-domain entities and relationships commonly found in these sources i...
Online platforms have become the primary means for travellers to search, compare, and book accommodations for their trips. Consequently, online platforms and revenue managers must acquire a comprehensive understanding of these dynamics to formulate competitive and appealing offerings. Recent advancements in natural language processing, specifical...
This paper explores the growing importance of Environmental, Social, and Governance (ESG) criteria in financial assessments and conducts an AI-driven analysis of ESG concepts' evolution from 1980 to 2022. Focusing on media sources from the United States and the United Kingdom, the study utilizes the Dow Jones News Article dataset for a comprehensiv...
In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scient...
This manuscript presents a comprehensive review of the use of Artificial Intelligence (AI) in Systematic Literature Reviews (SLRs). Our study focuses on how AI techniques are applied in the semi-automation of SLRs, specifically in the screening and extraction phases. We examine 21 leading SLR tools using a framework that combines 23 traditional fea...
In recent years, the importance of Environmental, Social, and Governance (ESG) criteria in assessing financial investments has grown significantly. This paper presents an AI-driven analysis of ESG concepts and their evolution from 1980 to 2022, with a specific focus on media sources from the United States and the United Kingdom. The primary data source...
The crucial task of analysing the complex dynamics of the research landscape and uncovering the latest insights from the scientific literature is of paramount importance to researchers, governments, and commercial organizations. Springer Nature, one of the leading academic publishers worldwide, plays a significant role in this domain and regularly...
Understanding the relationship between the composition of a research team and the potential impact of their research papers is crucial as it can steer the development of new science policies for improving the research enterprise. Numerous studies assess how the characteristics and diversity of research teams can influence their performance across s...
In the last few years, chatbots have become mainstream solutions adopted in a variety of domains for automatizing communication at scale. In the same period, knowledge graphs have attracted significant attention from business and academia as robust and scalable representations of information. In the scientific and academic research domain, they are...
The tourism and hospitality sectors have become increasingly important in the last few years, and the companies operating in this field are constantly challenged with providing innovative services. At the same time, (big) data has become the “new oil” of this century and Knowledge Graphs are emerging as the most natural way to collect, refine,...
Security token offerings (STOs), based on blockchain technology, are attracting increasing attention as an innovative alternative means of venture financing. Information about specific STOs is generally provided in white papers. This study analyses the content of white papers using a unique sample of 188 STOs from 2017 to 2021 to identify which top...
In the last few years, we have witnessed the emergence of several knowledge graphs that explicitly describe research knowledge with the aim of enabling intelligent systems for supporting and accelerating the scientific process. These resources typically characterize a set of entities in this space (e.g., tasks, methods, evaluation techniques, prote...
Over the past few decades, we have seen a proliferation of scientific articles available online. This data-rich environment offers several opportunities but also challenges, since it is difficult to explore these resources and identify all the relevant content. Hence, it is crucial that they are appropriately annotated with their relevant concepts so to...
Research publishing companies need to constantly monitor and compare scientific journals and conferences in order to inform critical business and editorial decisions. Semantic Web and Knowledge Graph technologies are natural solutions since they allow these companies to integrate, represent, and analyse a large quantity of information from heteroge...
Science communication has a number of bottlenecks that include the rising number of published research papers and its non-machine-accessible and document-based paradigm, which makes the exploration, reading, and reuse of research outcomes rather inefficient. Recently, Knowledge Graphs (KG), i.e., semantic interlinked networks of entities, have been...
In recent years, we saw the emergence of several approaches for producing machine-readable, semantically rich, interlinked descriptions of the content of research publications, typically encoded as knowledge graphs. A common limitation of these solutions is that they address a low number of articles, either because they rely on human experts to sum...
Interest in Artificial Intelligence (AI) continues to grow rapidly, hence it is crucial to support researchers and organisations in understanding where AI research is heading. In this study, we conducted a bibliometric analysis on 257K articles in AI, retrieved from OpenAlex. We identified the main conceptual themes by performing clustering analysi...
Classifying scientific articles, patents, and other documents according to the relevant research topics is an important task, which enables a variety of functionalities, such as categorising documents in digital libraries, monitoring and predicting research trends, and recommending papers relevant to one or more topics. In this paper, we present th...
Scientific conferences are essential for developing active research communities, promoting the cross-pollination of ideas and technologies, bridging between academia and industry, and disseminating new findings. Analyzing and monitoring scientific conferences is thus crucial for all users who need to take informed decisions in this space. However,...
Academia and industry share a complex, multifaceted, and symbiotic relationship. Analysing the knowledge flow between them, understanding which directions have the biggest potential, and discovering the best strategies to harmonise their efforts is a critical task for several stakeholders. Research publications and patents are an ideal medium to an...
Analysing research trends and predicting their impact on academia and industry is crucial to gain a deeper understanding of the advances in a research field and to inform critical decisions about research funding and technology adoption. In recent years, we have seen the emergence of several publicly available, large-scale Scientific Knowledge Graph...
Knowledge graphs (KGs) are widely used for modeling scholarly communication, performing scientometric analyses, and supporting a variety of intelligent services to explore the literature and predict research dynamics. However, they often suffer from incompleteness (e.g., missing affiliations, references, research topics), leading to a reduced scope...
The incompleteness of Knowledge Graphs (KGs) is a crucial issue affecting the quality of AI-based services. In the scholarly domain, KGs describing research publications typically lack important information, hindering our ability to analyse and predict research dynamics. In recent years, link prediction approaches based on Knowledge Graph Embedding...
Classifying research papers according to their research topics is an important task to improve their retrievability, assist the creation of smart analytics, and support a variety of approaches for analysing and making sense of the research environment. In this paper, we present the CSO Classifier, a new unsupervised approach for automatically class...
Identifying the research topics that best describe the scope of a scientific publication is a crucial task for editors, in particular because the quality of these annotations determines how effectively users are able to discover the right content in online libraries. For this reason, Springer Nature, the world's largest academic book publisher, has...
Major academic publishers need to be able to analyse their vast catalogue of products and select the best items to be marketed in scientific venues. This is a complex exercise that requires characterising with a high precision the topics of thousands of books and matching them with the interests of the relevant communities. In Springer Nature, this...
The continuous growth of scientific literature brings innovations and, at the same time, raises new challenges. One of them is related to the fact that its analysis has become difficult due to the high volume of published papers for which manual effort for annotations and management is required. Novel technological infrastructures are needed to hel...
Ontologies of research areas have been proven to be useful resources for analysing and making sense of scholarly data. In this chapter, we present the Computer Science Ontology (CSO), which is the largest ontology of research areas in the field, and discuss a number of applications that build on CSO to support high-level tasks, such as topic classi...
Scientific knowledge has traditionally been disseminated and preserved through research articles published in journals, conference proceedings, and online archives. However, this article-centric paradigm has often been criticized because it does not allow this knowledge to be automatically processed, categorized, and reasoned over. An alternative vision is to gener...
Understanding, monitoring, and predicting the flow of knowledge between academia and industry is of critical importance for a variety of stakeholders, including governments, funding bodies, researchers, investors, and companies. To this purpose, we introduce ResearchFlow, an approach that integrates semantic technologies and machine learning to qua...
Research on database and information technologies has been rapidly evolving over the last couple of years. This evolution was led by three major forces, Big Data, AI, and the Connected World, that open the door to innovative research directions and challenges, yet exploiting four main areas: (i) computational and storage resource modeling and organizati...
Academia and industry are constantly engaged in a joint effort for producing scientific knowledge that will shape the society of the future. Analysing the knowledge flow between them and understanding how they influence each other is a critical task for researchers, governments, funding bodies, investors, and companies. However, current corpora are...
Ontologies of research areas are important tools for characterizing, exploring, and analyzing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance,...
Ontologies of research areas have been proven to be useful in many applications for analysing and making sense of scholarly data. In this chapter, we present the Computer Science Ontology (CSO), which is the largest ontology of research areas in the field of Computer Science, and discuss a number of applications that build on CSO to support high-le...
The increasing interest in analysing, describing, and improving the research process requires the development of new forms of scholarly data publication and analysis that integrates lessons and approaches from the field of Semantic Technologies, Science of Science, Digital Libraries, and Artificial Intelligence. This editorial summarises the conten...
In this paper, we present a preliminary approach that uses a set of NLP and Deep Learning methods for extracting entities and relationships from research publications and then integrates them in a Knowledge Graph. More specifically, we (i) tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing...
Context. Systematic Reviews (SRs) are means for collecting and synthesizing evidence from the identification and analysis of relevant studies from multiple sources. To this aim, they use a well-defined methodology meant to mitigate the risks of biases and ensure repeatability for later updates. SRs, however, involve significant effort. Goal. The go...
In this paper we focus on the International Journal of Human-Computer Studies (IJHCS) as a domain of analysis, to gain insights about its evolution in the past 50 years and what this evolution tells us about the research landscape associated with the journal. To this purpose we use techniques from the field of Science of Science and analyse the rel...
In the last decade, the research literature has reached an enormous volume with an unprecedented current annual increase of 1.5 million new publications. As research gets ever more global and new countries and institutions, either from academia or corporate environments, start to contribute, it is important to monitor this complex phenomenon and un...
Knowledge graphs (KG) are large networks of entities and relationships, typically expressed as RDF triples, relevant to a specific domain or an organization. Scientific Knowledge Graphs (SKGs) focus on the scholarly domain and typically contain metadata describing research publications such as authors, venues, organizations, research topics, and ci...
[article available at https://arxiv.org/abs/1806.04055]
Software architecture (SA) is celebrating 25 years. This is so if we consider the seminal papers establishing SA as a distinct discipline and scientific publications that have identified cornerstones of both research and practice, like architecture views, architecture description languages, a...
The identification of research topics and trends is an important scientometric activity, as it can help guide the direction of future research. In the Semantic Web area, initially topic and trend detection was primarily performed through qualitative, top-down style approaches, that rely on expert knowledge. More recently, data-driven, bottom-up app...
In the last decade, research literature reached an enormous volume with an unprecedented current annual increase of 1.5 million new publications. As research gets ever more global and new countries and institutions, either from academia or corporate environment, start to contribute with their share, it is important to monitor this complex scenario...
This book constitutes the refereed proceedings of the 3rd International Workshop, SAVE-SD 2017, held in Perth, Australia, in April 2017, and the 4th International Workshop, SAVE-SD 2018, held in Lyon, France, in April 2018. The 6 full, 2 position and 4 short papers were selected from 16 submissions. The papers describe multiple ways in which schola...
Technologies such as algorithms, applications and formats are an important part of the knowledge produced and reused in the research process. Typically, a technology is expected to originate in the context of a research area and then spread and contribute to several other fields. For example, Semantic Web technologies have been successfully adopted...
Purpose
This paper introduces the Research Articles in Simplified HTML (or RASH), which is a Web-first format for writing HTML-based scholarly papers; it is accompanied by the RASH Framework, a set of tools for interacting with RASH-based articles. The paper also presents an evaluation that involved authors and reviewers of RASH articles submitted...
Purpose: this paper introduces the Research Articles in Simplified HTML (or RASH), which is a Web-first format for writing HTML-based scholarly papers; it is accompanied by the RASH Framework, a set of tools for interacting with RASH-based articles. The paper also presents an evaluation that involved authors and reviewers of RASH articles submitt...
The ability to promptly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. While the literature describes several approaches which aim to identify the emergence of new research topics early in their lifecycle, these rely on the assumption that th...
The third edition of the Workshop on Semantics, Analytics and Visualisation: Enhancing Scholarly Data (SAVE-SD 2017) is taking place in Perth, Australia on the 3rd of April 2017, co-located with the 26th International World Wide Web Conference. The main goal of the workshop is to provide a venue for researchers, publishers and other companies to en...
The ontology engineering research community has focused for many years on supporting the creation, development and evolution of ontologies. Ontology forecasting, which aims at predicting semantic changes in an ontology, represents instead a new challenge. In this paper, we want to give a contribution to this novel endeavour by focusing on the task...
In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture ‘standard’ scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositori...
The process of classifying scholarly outputs is crucial to ensure timely access to knowledge. However, this process is typically carried out manually by expert editors, leading to high costs and slow throughput. In this paper we present Smart Topic Miner (STM), a novel solution which uses semantic web technologies to classify scholarly publications...
Purpose: this paper introduces the Research Articles in Simplified HTML (or RASH), which is a Web-first format for writing HTML-based scholarly papers; it is accompanied by the RASH Framework, i.e. a set of tools for interacting with RASH-based articles. The paper also presents an evaluation that involved authors and reviewers of RASH articles, subm...
In this poster paper we introduce the RASH Online Conversion Service, i.e., a Web application that allows the conversion of ODT documents into RASH, an HTML-based markup language for writing scholarly articles, and from RASH into LaTeX. This tool allows authors with no experience in HTML to easily produce HTML-based papers and supports the publishin...
The second edition of the Workshop on Semantics, Analytics and Visualisation: Enhancing Scholarly Data (SAVE-SD 2016) is held in Montreal, Canada on April 11, 2016 and co-located with the 25th International World Wide Web Conference. Its main goal is bringing together publishers, companies and researchers working on semantics, analytics and visuali...
The natural language processing (NLP) community has developed a variety of methods for extracting and disambiguating information from research publications. However, they usually focus only on standard research entities such as authors, affiliations, venues, references and keywords. We propose a novel approach, which combines NLP and semantic techn...
Question (1)
R-Classify (https://cso.kmi.open.ac.uk/classify/) is a new open application for extracting #ResearchTopics from articles in #ComputerScience. You can use it to select the best keywords for making your paper findable!
It is based on a technology developed in collaboration with Springer Nature for improving the #findability of scientific documents in #DigitalLibraries and on the Web. Its application led to a 25% increase in the number of downloads of the processed papers. It makes use of the Computer Science Ontology (https://cso.kmi.open.ac.uk/), a large #KnowledgeGraph of research topics in this field.
Preprint: