Article

The Semantic Web

... This extends the capabilities of a global network of documents by switching over to a network of data (Web of Data) and provides in-document data links that can identify any object, person, or concept. The idea of [64] is to present information in a special format that allows software agents to disclose its semantics, i.e., its meaning or content. The authors of [64] assigned the role of an explicit and formal specification of semantics to the ontology, considering it the true core of the new approach. In a general sense, an ontology is a system of concepts of a subject area, represented as a set of entities connected by various relations. ...
Article
The review summarizes the results of the long-term work carried out at the Glushko Thermocenter for the creation of topical databases and the use of new information technologies ensuring the integration of diversified electronic resources. The basic principles of the IVTANTHERMO thermodynamic database and the latest results on the expansion of its content and functional capabilities are examined in detail. The THERMAL thermophysical database, which includes bibliographic data on a wide range of thermophysical, optical, electrical, and other physical properties, is described. The future plans to update the THERMAL database and expand its scope are reviewed. The advantages of modern information technologies for the solution of pressing problems of the integration of diversified resources (such as databases, text documents, spreadsheets, plots, and data files in proprietary formats) using a unified infrastructure are studied. It is demonstrated that ontological modeling can be used as the most effective tool of categorization and search in organizing a flexible data structure peculiar to substances and materials with properties that depend on the type of sample, manufacturing technology, environmental effects, etc.
... Communication in social media is semantic (meaningful). Each individual social media user is considered unique and cannot be generalized (Berners-Lee et al., 2001). Therefore, measuring engagement is needed as a guide for communicators conducting marketing communications, paying attention to the patterns that emerge when content is uploaded on social media. ...
Article
Full-text available
In the era of Industry 4.0, social media user engagement contributes to the effectiveness of marketing communications. Companies that use social media as marketing communication channels are still trying to define the concept of engagement according to their subjective understanding and the definitions of measurement offered by social media insights. This article provides an update on the concept of social media user engagement in the context of marketing communication in Indonesia. Through a qualitative approach based on a literature study and interviews with informants who use social media as a marketing communication platform, it was found that social media user engagement serves to measure and find patterns of content that are effective for marketing communication. In general, social media user engagement can be grouped into three dimensions: affective, cognitive, and behavioral engagement.
... Semantic web technologies facilitate the representation, publication, and linking of data, as well as information retrieval [62]. Instead of navigating through web pages, the semantic web proposes navigating through the data [63]. From a web of siloed data, where information is inaccessible to machines, the semantic web, ...
Thesis
Full-text available
The development of digital technologies has led to the digitization of medical information and the move from paper records to electronic health records (EHRs). The data generated in a hospital contain valuable information for medical research. Hospitals have set up clinical data warehouses (CDWs) to facilitate the secondary use of data. In a CDW, researchers need to identify the patients eligible for a clinical study and to return to the EHR to fill in the study's electronic case report form. The main difficulty lies in the unstructured nature of medical information, which is present as free text. Natural language processing methods are needed to structure the data in order to facilitate its querying and extraction. The objective of this thesis was to develop tools and methods to help researchers conduct feasibility studies and find information in an EHR. The main contributions of this thesis are the following: A terminology of drugs in the French language. Many studies focus on the use, effectiveness, and tolerance of drugs in real life. Drugs also make it possible to identify certain diseases. The absence of a standardized drug terminology led to the construction of Romedi, an open drug repository, which offers good performance for detecting and identifying drugs in hospital data. A semantic annotator scalable to a data warehouse. Semantic annotation consists of linking sequences of words in a document to the concepts of a terminology; it enables the detection and indexing of medical concepts. How can millions of documents in a CDW be indexed with medical terminologies containing several hundred thousand terms? In this work, we propose a new algorithm, IAMsystem, which scales to a data warehouse and whose complexity depends little on the size of the terminology. A sense inventory of medical abbreviations. Abbreviations are widely used in medicine. They add complexity to natural language processing tasks and must be taken into account by a semantic annotator. This work presents two algorithms for automatically detecting abbreviations from a corpus of medical documents and proposes the first abbreviation inventory derived from French-language hospital data. A strategy for linking hospital data with death certificates. The vital status of individuals is of paramount importance for many epidemiological studies, and feasibility studies need to know whether eligible patients are alive or deceased. Large volumes of data require a strategy to reduce the number of comparisons. We show that a vector space model offers excellent results for reducing the number of comparisons and that it is possible to automatically generate a gold standard from hospital data in order to link hospital data and death certificates by machine learning. An interface for EHR review. An interface, SmartCRF, was developed to quickly search for information in an EHR.
It consists of a timeline, a search engine, a document viewer, and a recommendation system. Compared with the standard clinical software, it reduces the time spent verifying the inclusion and exclusion criteria of a feasibility study and facilitates the completion of an electronic case report form.
Article
Full-text available
Metal–organic polyhedra (MOPs) are discrete, porous metal–organic assemblies known for their wide-ranging applications in separation, drug delivery, and catalysis. As part of The World Avatar (TWA) project—a universal and interoperable knowledge model—we have previously systematized known MOPs and expanded the explorable MOP space with novel targets. Although these data are available via a complex query language, a more user-friendly interface is desirable to enhance accessibility. To address a similar challenge in other chemistry domains, the natural language question-answering system “Marie” has been developed; however, its scalability is limited due to its reliance on supervised fine-tuning, which hinders its adaptability to new knowledge domains. In this article, we introduce an enhanced database of MOPs and a first-of-its-kind question-answering system tailored for MOP chemistry. By augmenting TWA’s MOP database with geometry data, we enable the visualization of not just empirically verified MOP structures but also machine-predicted ones. In addition, we renovated Marie’s semantic parser to adopt in-context few-shot learning, allowing seamless interaction with TWA’s extensive MOP repository. These advancements significantly improve the accessibility and versatility of TWA, marking an important step toward accelerating and automating the development of reticular materials with the aid of digital assistants.
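The in-context few-shot parsing strategy described above can be pictured with a small sketch: a handful of question-to-SPARQL exemplars are placed in a prompt and a language model is asked to translate a new question. The prompt wording, the `ask_llm` helper, and the property names below are illustrative assumptions, not the actual Marie implementation or TWA vocabulary.

```python
# Sketch of in-context few-shot semantic parsing (illustrative only; the real
# Marie system and the TWA ontology use their own prompt format and terms).

FEW_SHOT_EXAMPLES = [
    ("Which MOPs have a molecular weight above 2000?",
     "SELECT ?mop WHERE { ?mop a :MetalOrganicPolyhedron ; :hasMolecularWeight ?w . FILTER(?w > 2000) }"),
    ("List MOPs assembled from copper paddlewheel units.",
     "SELECT ?mop WHERE { ?mop a :MetalOrganicPolyhedron ; :hasChemicalBuildingUnit :CuPaddlewheel . }"),
]

def build_prompt(question: str) -> str:
    """Assemble a few-shot prompt that asks the model to emit SPARQL."""
    parts = ["Translate the question into a SPARQL query over the MOP knowledge graph.\n"]
    for q, sparql in FEW_SHOT_EXAMPLES:
        parts.append(f"Q: {q}\nSPARQL: {sparql}\n")
    parts.append(f"Q: {question}\nSPARQL:")
    return "\n".join(parts)

def parse_question(question: str, ask_llm) -> str:
    """`ask_llm` is a hypothetical callable wrapping any LLM completion API."""
    return ask_llm(build_prompt(question)).strip()
```

The advantage of this style over supervised fine-tuning is that adapting to a new knowledge domain only requires swapping the exemplars, not retraining a model.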
Article
Full-text available
Ontologies define the main concepts and relations of a domain and can play the role of a common language between domain experts, software developers, and computer systems, allowing for easier and more comprehensive data management. Ontologies can provide structure and context for data, enabling better analysis and decision‐making. Ontologies can be leveraged to improve various Machine Learning‐based tasks (they can be used to improve the accuracy and consistency of training data, and ML‐based predictions can be combined with ontology‐based reasoning). Ontologies are key components for achieving semantic data integration. In the context of this deliverable, we surveyed 40 ontologies and 7 other knowledge organization systems related to food safety and categorized them according to a set of appropriate criteria. Subsequently, we analysed the 18 case studies that could involve ontologies and, for each one, described the possible use of ontologies and what the benefit would be. Finally, the identified case studies were evaluated with respect to a set of criteria regarding benefits, cost, and maturity.
Article
Full-text available
This article proposes a framework of linked software agents that continuously interact with an underlying knowledge graph to automatically assess the impacts of potential flooding events. It builds on the idea of connected digital twins based on the World Avatar dynamic knowledge graph to create a semantically rich asset of data, knowledge, and computational capabilities accessible to humans, applications, and artificial intelligence. We develop three new ontologies to describe and link environmental measurements and their respective reporting stations, flood events, and their potential impact on population and built infrastructure as well as the built environment of a city itself. These coupled ontologies are deployed to dynamically instantiate near real-time data from multiple fragmented sources into the World Avatar. Sequences of autonomous agents connected via the derived information framework automatically assess consequences of newly instantiated data, such as newly raised flood warnings, and cascade respective updates through the graph to ensure up-to-date insights into the number of people and building stock value at risk. Although we showcase the strength of this technology in the context of flooding, our findings suggest that this system-of-systems approach is a promising solution to build holistic digital twins for various other contexts and use cases to support truly interoperable and smart cities.
Article
Full-text available
According to World Health Organization (WHO) data from 2000 to 2019, the number of people living with Diabetes Mellitus and Chronic Kidney Disease (CKD) is increasing rapidly. Diabetes Mellitus increased by 70% and ranked in the top 10 among all causes of death, while deaths from CKD increased by 63%, rising from 13th to 10th place. In this work, we combine a drug dose prediction model, drug-drug interaction warnings, and warnings for potassium-raising (K-raising) drugs to create a novel and effective ontology-based assistive prescription recommendation system for patients having both Type-2 Diabetes Mellitus (T2DM) and CKD. Although there are several computational solutions that use ontology-based systems for treatment plans for these types of diseases, none of them combine information analysis and treatment plan prediction for T2DM and CKD. The proposed method is novel: (1) We develop a new drug-drug interaction model and drug dose ontology called DIAKID (for drugs of T2DM and CKD). (2) Using comprehensive Semantic Web Rule Language (SWRL) rules, we automatically extract the correct drug dose, K-raising drugs, and drug-drug interaction warnings based on the Glomerular Filtration Rate (GFR) value of T2DM and CKD patients. The proposed work achieves very competitive results, and this is the first time such a study has been conducted on both diseases. The proposed system will guide clinicians in preparing prescriptions by giving necessary warnings about drug-drug interactions and doses.
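As a rough illustration of how a GFR-conditioned warning rule of this kind might look, the sketch below writes one SWRL-style rule as a string and mirrors its logic in plain Python. The threshold, property names, and drug names are invented for the example and are not the published DIAKID rules.

```python
# Illustrative only: a SWRL-style rule (as text) and an equivalent Python check.
# The GFR threshold and drug names are assumptions, not the DIAKID ontology.
SWRL_RULE = (
    "Patient(?p) ^ hasGFR(?p, ?g) ^ swrlb:lessThan(?g, 30) ^ "
    "prescribed(?p, ?d) ^ KRaisingDrug(?d) -> hasWarning(?p, \"K-raising drug with low GFR\")"
)

K_RAISING_DRUGS = {"spironolactone", "amiloride"}  # example set only

def check_prescription(gfr: float, drugs: list[str]) -> list[str]:
    """Return the warnings that the rule above would fire for one patient."""
    warnings = []
    if gfr < 30:
        for drug in drugs:
            if drug.lower() in K_RAISING_DRUGS:
                warnings.append(f"K-raising drug with low GFR: {drug}")
    return warnings

print(check_prescription(25.0, ["Spironolactone", "Metformin"]))
```

In an ontology-based system the rule text would be handled by a reasoner over the patient instances rather than hard-coded as above; the sketch only conveys the shape of the logic.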
Article
Full-text available
With the proliferation of Big Data, organizational decision making has become more complex. Business Intelligence (BI) is no longer restricted to querying marketing and sales data. It is more about linking data from disparate applications and churning through large volumes of unstructured data such as emails, call logs, social media, and news, in an attempt to derive insights that provide actionable intelligence and better inputs for future strategy making. Semantic technologies like knowledge graphs have proved to be useful tools that help in linking disparate data sources intelligently and also enable reasoning through the complex networks created as a result of this linking. Over the last decade, the processes of creating, storing, and maintaining knowledge graphs have matured sufficiently, and they are now making inroads into business decision making as well. Very recently, these graphs have also been seen as a potential way to reduce hallucinations of large language models, by including them during pre‐training as well as during generation of output. There are also a number of challenges, including building and maintaining the graphs, reasoning with missing links, and so on. While these remain open research problems, we present in this article a survey of how knowledge graphs are currently used for deriving business intelligence, with use‐cases from various domains. This article is categorized under: Algorithmic Development > Text Mining; Application Areas > Business and Industry.
Chapter
The paper deals with the impact of the digital transformation on the research, documentation, and dissemination of historical information, with a focus on the spatial development history of cities. The Digital Urban History Lab aims to address the challenges and exploit the opportunities of this digital transformation to move one step closer towards ‘Serious 3D’ in research, education, and popularization of cultural heritage. The use of digital 3D reconstructions in scholarly projects, documentaries, and exhibitions has become increasingly common. However, unresolved issues have arisen regarding the scholarly nature of these reconstructions and their findability, accessibility, interoperability, and reusability. The Digital Urban History Lab addresses the above questions using the example of reprocessing, documenting, and communicating the latest findings about the medieval cities of Mainz, Worms, and Speyer. The focus is on the sustainability of research data and includes the development of a CIDOC CRM referenced data model and a virtual research environment using Linked Data technologies. The Digital Urban History Lab represents an exhibition space where the 3D models are presented along with interactive access to the knowledge behind them. The focus of the consideration is the working method of a source-based hypothetical 3D reconstruction of the past, which is captured by the concept of a ‘Scientific Reference Model’. Overall, the project illustrates the potential of scientifically based 3D models, supported by structured, semantically enriched, referenceable research data, which ensure accessibility and reusability for research, education, the creative industries, and more. Keywords: Serious 3D, Scientific Reference Model, Hypothetical 3D reconstruction, data modeling, CIDOC CRM
Conference Paper
This paper considers an approach to solving a number of problems in computer support for scientific research through the concept of a scientific distributed decentralized information and computing network consisting of typical nodes for scientific data and computation. The concept is based on the Semantic Web ideas of autonomous software agents situationally communicating with data and computation providers, represented as a set of highly specialized scientific information and computing web systems maintained by small research teams.
Conference Paper
The report describes the technology for building a software solution that automates the routine processes related to the development of the original TSUNM3 and CTM models. The applied scientific information and computing web system Meteo+ provides computer support for scientific research, first of all to professionals in computational geophysics who develop and improve their own computational models. At the same time, the resulting collection of modeling results makes it possible to involve other specialists in detailed analysis.
Article
Full-text available
Several industry‐specific metadata initiatives have historically facilitated structured data modeling for the web in domains such as commerce, publishing, social media, and so forth. The metadata vocabularies produced by these initiatives allow developers to “wrap” information on the web to provide machine‐readable signals for search engines, advertisers, and user‐facing content on apps and websites, thus assisting with surfacing facts about people, places, and products. A universal iteration of such a project called Schema.org started in 2011, resulting from a partnership between Google, Microsoft, Yahoo, and Yandex to collaborate on a single structured data model across domains. Yet, few studies have explored the metadata vocabulary terms in this significant web resource. What terms are included, upon what subject domains do they focus, and how does Schema.org represent knowledge in its conceptual model? This article presents findings from our extraction and analysis of the documented release history and complete hierarchy on Schema.org's developer pages. We provide a semantic network visualization of Schema.org, including an analysis of its modularity and domains, and discuss its global significance concerning fact‐checking and COVID‐19. We end by theorizing Schema.org as a gatekeeper of data on the web that authors vocabulary that everyday web users encounter in their searches.
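To make the "wrapping" idea concrete, the short sketch below emits a minimal JSON-LD snippet using Schema.org terms for a news article; the values are placeholders and the chosen properties are only one of many valid ways to mark up such a page.

```python
import json

# Minimal Schema.org markup for a web page, serialized as JSON-LD.
# The article metadata here is a placeholder, not a real publication.
article_markup = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "headline": "Example headline",
    "datePublished": "2021-05-01",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "publisher": {"@type": "Organization", "name": "Example News"},
}

# Embedded in a <script type="application/ld+json"> tag, this block gives
# search engines machine-readable signals about the page's content.
print(json.dumps(article_markup, indent=2))
```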
Chapter
In archaeology, maps have always played an important role for the primary documentation of findings from fieldwork, whether from excavation or survey. Since the advent of computerized database systems in the 1970s, archaeologists have amassed huge amounts of digital spatial information, usually stored in a geographic information system (GIS). From its beginning, archaeologists have been very eager to explore the analytical possibilities of GIS. The initial focus of GIS use in archaeology was on landscape archaeology. An important technological development in GIS over the past 20 years has been the development of shared data services. The process of data collection and management is one side of working with spatial information in GIS, but making sense of it is equally important. A major trend in software development over the past decade has been a move away from proprietary tools to free and open source software, in particular in academia.
Chapter
Big Data has become an increasingly cited buzzword in archaeological research over the past decade. Archaeology is a discipline that spans both the sciences and the humanities/social sciences/arts: various departments of archaeology at UK universities fall under the remit of all of these different divisions. This chapter looks at some of the projects that have or could claim to have conducted Big Data archaeological research up until the time of writing in late 2019. Visualization is the task of translating information into human‐perceptible form, easing understanding through the enhancement of visual perception/cognition. Spatial binning is widely used in sciences such as biology/zoology and has been regularly used in archaeology to plot the results of field‐walking surveys. Fundamental to the practice of Big Data analytics in archaeology is the construction, adaptation, and maintenance of workflows that allow the efficient processing and interpretation of large datasets.
Chapter
A narrative traffic jam emerges in the current public debate when it involves immigration and language. On the one hand, a one‐way narrative could create exclusion, labeling, disinformation; on the other, a contamination of the mother tongue could be a precious tool for intercultural dialogue. Immigration sits within economic, social, cultural, political, and technological fields whose effects – positive or negative – have global relevance, starting from language. It is a multidimensional process, continuously translated and interpreted, that shows the degree of integration that the immigrant holds within the host society; a multidimensional market that exists and resists, admits and excludes. It is also a place of encounter between diversity and hospitality, tradition and innovation; a time of possibilities off‐ and online.
Chapter
Just as inexpensive and powerful devices are driving ecosystems of connected Internet of Things in work and home environments, the battlefield has experienced a proliferation of and reliance on networked devices. However, current networking technologies are ill‐suited for content sharing in these emerging military networks where fixed infrastructures and constant connectivity cannot be assumed. The communication paradigm, content‐based networking, is proving to be a highly effective solution for operation in mobile infrastructure‐less environments where intermittent and disrupted connectivity is expected. We present approaches and tradeoffs for content‐centric military IoT architectures that facilitate generation and dissemination of content in challenging tactical edge environments. These architectures are designed to address information flow across different underlying tactical data links by managing the dissemination of mission‐critical information across an overlay network optimized for disconnected, intermittent, and limited (DIL) connectivity operations.
Book
The monitoring of resuscitation patients is one of the most critical healthcare tasks. Internet of Things (IoT) medical devices enable healthcare providers to continuously monitor patients by providing them with vital signs data. This chapter proposes a knowledge representation and reasoning framework to semantically annotate data in order to analyze the semantics of vital signs monitors and the data that come from them. It addresses complex queries on semantically annotated data. The datasets are published as Linked Data, which allows querying the data streams and enriching them with other datasets to obtain additional information. The chapter presents work that has used the IoT in healthcare applications and applied semantics in smart health. It provides the details and implementation of the proposed approach to enhance the monitoring of patients in intensive care units, and it concludes with future work.
Thesis
As part of its fourth revolution, the industrial world is undergoing extensive digitalization across all sectors. The research work of this thesis is set in the context of the transition toward the industry of the future, and more specifically in the mechanical machining industry. It addresses the problem of integrating industrial data and knowledge as a support for decision support systems (DSS). The proposed approach is applied to failure diagnosis of connected machining equipment. The thesis first proposes a conceptual framework for structuring the heterogeneous databases and knowledge bases required to set up the DSS. Through a first traceability function, the system capitalizes the description of the characteristics of all the particular events and adverse phenomena that may appear during machining. The diagnostic function makes it possible to understand the causes of these failures and to propose improvement solutions, through the reuse of the knowledge stored in the domain ontology and reasoning based on business rules. The proposed knowledge-based system is implemented in a global decision-support framework developed within the collaborative ANR project Smart Emma. A practical application was carried out on two real databases from two different manufacturers.
Article
Full-text available
Knowledge graphs are widely used in information queries. They are built using triples from knowledge bases, which are extracted with varying levels of accuracy. Accuracy plays a key role in a knowledge graph, and knowledge graph construction uses several techniques to refine and remove inaccurate triples. Many algorithms have been employed to refine triples while constructing knowledge graphs. These techniques use information about triples and their connections to identify erroneous triples. However, they lack effective correspondence to human evaluations. Hence, this paper proposes a machine learning approach to identify inaccurate triples in a way that corresponds to actual human evaluations, by injecting supervision through a subset of crowd-sourced human evaluations of triples. Our model uses probabilistic soft logic's soft truth values and an empirical feature, the fact strength, that we derived from the triples. We evaluated the model using the NELL and YAGO datasets and observed improvements of 12.56% and 5.39% in their respective precision. In addition, we achieved an average improvement of 4.44% in F1 scores, representing better prediction accuracy. The inclusion of the fact strength augmented the modeling precision by an average of 2.13% and provided higher calibration. Hence, the primary contribution of this paper is a model that effectively identifies erroneous triples with high correspondence to actual human judgment.
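The abstract does not give the exact definition of the fact-strength feature, so the sketch below only illustrates the general recipe: derive a simple per-triple feature, combine it with a soft truth value, and train a supervised classifier on crowd-labeled triples. The feature definition and all data values are assumptions made for illustration.

```python
from collections import Counter
from sklearn.linear_model import LogisticRegression

# Toy triples: (subject, predicate, object, PSL soft truth value, human label).
# All values are fabricated for the sketch.
triples = [
    ("Paris", "capitalOf", "France", 0.95, 1),
    ("Paris", "capitalOf", "Germany", 0.40, 0),
    ("Berlin", "capitalOf", "Germany", 0.90, 1),
    ("Lyon", "capitalOf", "France", 0.55, 0),
]

# One *assumed* notion of fact strength: how often a subject-predicate pair is
# observed, normalized by the most frequent pair in the corpus.
pair_counts = Counter((s, p) for s, p, *_ in triples)
max_count = max(pair_counts.values())

X = [[soft_truth, pair_counts[(s, p)] / max_count]
     for s, p, _, soft_truth, _ in triples]
y = [label for *_, label in triples]

# Supervision from crowd-sourced labels trains a simple accuracy classifier.
clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[0.6, 0.5]]))  # class probabilities: [inaccurate, accurate]
```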
Article
In the era of Internet Technology (IT), uncertainty management is a challenge in many fields. These include e‐commerce, social and sensor networks, scientific data production and mining, object tracking, data integration, geo‐located services, and recently the Internet and Web of Things. Because of the uncertain data published on the web, web resources are diverse: identical resources may be available from heterogeneous platforms, and heterogeneous resources may represent the same objects. These resources are hugely heterogeneous, conflicting, inconsistent, or in incompatible formats. This uncertainty is inherently related to many factors, such as information extraction and integration. Hence, with the proliferation of web resources, referencing through the uncertain web has become increasingly difficult. The traditional techniques used for the classical web cannot handle uncertain navigation. Generally, uncertainty is implicitly represented, decided randomly, or even neglected. Harnessing these uncertain resources to their full potential in order to handle uncertain navigation raises major challenges that relate to each phase of their life cycle: creation, representation, and navigation. In this article, we establish a probabilistic approach to model and interpret uncertain web resources. We present operators to compute response uncertainty. Finally, we create algorithms to validate resources and achieve uncertain hypertext navigation.
Article
Full-text available
Data‐driven discovery in geoscience requires an enormous amount of FAIR (findable, accessible, interoperable and reusable) data derived from a multitude of sources. Many geology resources include data based on the geologic time scale, a system of dating that relates layers of rock (strata) to times in Earth history. The terminology of this geologic time scale, including the names of the strata and time intervals, is heterogeneous across data resources, hindering effective and efficient data integration. To address that issue, we created a deep‐time knowledge base that consists of knowledge graphs correlating international and regional geologic time scales, an online service of the knowledge graphs, and an R package to access the service. The knowledge base uses temporal topology to enable comparison and reasoning between various intervals and points in the geologic time scale. This work unifies and allows the querying of age‐related geologic information across the entirety of Earth history, resulting in a platform from which researchers can address complex deep‐time questions spanning numerous types of data and fields of study.
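The abstract mentions an online service of the knowledge graphs and an R package; as a rough Python analogue, the sketch below shows how one might query such a SPARQL endpoint for time intervals. The endpoint URL, prefix, and property names are placeholders, not the project's actual service.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint and vocabulary -- placeholders for the real deep-time service.
ENDPOINT = "https://example.org/deep-time/sparql"

QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gts: <http://example.org/geologic-time#>
SELECT ?interval ?label WHERE {
  ?interval a gts:GeochronologicEra ;
            rdfs:label ?label .
} LIMIT 10
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

# Print each interval IRI together with its human-readable label.
for binding in results["results"]["bindings"]:
    print(binding["interval"]["value"], "-", binding["label"]["value"])
```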
Article
Full-text available
Knowledge graphs (KGs) have emerged as a compelling abstraction for organizing the world's structured knowledge and for integrating information extracted from multiple data sources. They are also beginning to play a central role in representing information extracted by AI systems, and for improving the predictions of AI systems by giving them knowledge expressed in KGs as input. The goals of this article are to (a) introduce KGs and discuss important areas of application that have gained recent prominence; (b) situate KGs in the context of the prior work in AI; and (c) present a few contrasting perspectives that help in better understanding KGs in relation to related technologies.
Article
Full-text available
Data Science (DS) has emerged from the shadows of its parents—statistics and computer science—into an independent field since its origin nearly six decades ago. Its evolution and education have taken many sharp turns. We present an impressionistic study of the evolution of DS anchored to Kuhn's four stages of paradigm shifts. First, we construct the landscape of DS based on curriculum analysis of the 32 iSchools across the world offering graduate‐level DS programs. Second, we paint the “field” as it emerges from the word frequency patterns, ranking, and clustering of course titles based on text mining. Third, we map the curriculum to the landscape of DS and project the same onto the Edison Data Science Framework (2017) and ACM Data Science Knowledge Areas (2021). Our study shows that the DS programs of iSchools align well with the field and correspond to the Knowledge Areas and skillsets. iSchools' DS curricula exhibit a bias toward “data visualization” along with machine learning, data mining, natural language processing, and artificial intelligence; go light on statistics; are slanted toward ontologies and health informatics; and give surprisingly minimal thrust to eScience/research data management, which we believe would add a distinctive iSchool flavor to DS.
Article
Full-text available
Distributional semantic models such as the Latent Dirichlet Allocation (LDA) model (Guo et al., Concurr. Comput.: Pract. Exper. 29(3), 319–343, 2016) define similar representations for words that occur in similar contexts. LDA was originally used to model documents and extract topics in Information Retrieval. In recent years, LDA has become a hot topic in ontology learning because of the exponential increase in the number of documents and textual data, not only on the web but also in digital libraries. LDA-based approaches have proven to provide the best results. However, they suffer from several limitations related to concept and relation extraction, as well as handling corpus evolution and maintenance. In order to cope with these problems, we propose in this paper LEOnto⁺, an extended version of LEOnto (Tissaoui et al. 2020; Tissaoui et al., SN Comput. Sci. J. 1: 336, 2020), providing a new approach for automatic ontology enrichment from a textual corpus. In LEOnto⁺, LDA is used to provide dimension reduction and to identify semantic relationships between topic-document and word-topic probability distributions. We report several experiments conducted using several evaluation techniques (criteria-based evaluation, gold standard evaluation, expert evaluation, task-based evaluation, and corpus-based evaluation). We also compare the results of LEOnto⁺ with two existing methods using their respective datasets. The evaluation results show that LEOnto⁺ outperforms the aforementioned methods (particularly in terms of precision). We also apply our approach to two large corpora in order to demonstrate its scalability.
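A minimal sketch of the LDA step that such approaches rely on is shown below: topics are learned from a tiny toy corpus and the highest-probability words per topic are treated as candidate concepts. The corpus and the selection heuristic are illustrative; LEOnto⁺'s own pipeline adds relation extraction and enrichment steps not shown here.

```python
from gensim import corpora
from gensim.models import LdaModel

# Toy corpus; a real ontology-enrichment pipeline would use a large domain corpus.
documents = [
    ["ontology", "concept", "relation", "semantic", "web"],
    ["topic", "model", "word", "distribution", "document"],
    ["ontology", "enrichment", "concept", "extraction", "corpus"],
    ["probability", "topic", "word", "model", "lda"],
]

dictionary = corpora.Dictionary(documents)
bow_corpus = [dictionary.doc2bow(doc) for doc in documents]

lda = LdaModel(bow_corpus, num_topics=2, id2word=dictionary, passes=10, random_state=0)

# Treat the top words of each topic as candidate concepts for enrichment.
for topic_id in range(lda.num_topics):
    candidates = [word for word, _ in lda.show_topic(topic_id, topn=3)]
    print(f"topic {topic_id}: candidate concepts -> {candidates}")
```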
Article
Full-text available
Narrative cartography is a discipline which studies the interwoven nature of stories and maps. However, conventional geovisualization techniques of narratives often encounter several prominent challenges, including the data acquisition & integration challenge and the semantic challenge. To tackle these challenges, in this paper, we propose the idea of narrative cartography with knowledge graphs (KGs). Firstly, to tackle the data acquisition & integration challenge, we develop a set of KG-based GeoEnrichment toolboxes to allow users to search and retrieve relevant data from integrated cross-domain knowledge graphs for narrative mapping from within a GISystem. With the help of this tool, the retrieved data from KGs are directly materialized in a GIS format which is ready for spatial analysis and mapping. Two use cases — Magellan’s expedition and World War II — are presented to show the effectiveness of this approach. In the meantime, several limitations are identified from this approach, such as data incompleteness, semantic incompatibility, and the semantic challenge in geovisualization. For the latter two limitations, we propose a modular ontology for narrative cartography, which formalizes both the map content (Map Content Module) and the geovisualization process (Cartography Module). We demonstrate that, by representing both the map content and the geovisualization process in KGs (an ontology), we can realize both data reusability and map reproducibility for narrative cartography.
Article
Full-text available
The use of the Semantic Web and linked data increases the possibility of data accessibility, interpretability, and interoperability. It supports cross-domain data and knowledge sharing and avoids the creation of research data silos. Widely adopted in several research domains, the Semantic Web has seen relatively limited use with respect to sustainability assessments. A primary barrier is that the framework of principles and technologies required to link and query data from the Semantic Web is often beyond the scope of industrial ecologists. Linking a dataset to the Semantic Web requires the development of a semantically linked core ontology in addition to the use of existing ontologies. Ontologies provide logical meaning to the data and the possibility to develop machine-readable data formats. To enable and support the uptake of semantic ontologies, we present a core ontology developed specifically to capture the data relevant for life cycle sustainability assessment. We further demonstrate the utility of the ontology by using it to integrate data relevant to sustainability assessments, such as EXIOBASE and the Yale Stocks and Flow Database, into the Semantic Web. These datasets can be accessed at a machine-readable endpoint using SPARQL, a semantic query language. The present work provides the foundation necessary to enhance the use of the Semantic Web with respect to sustainability assessments. Finally, we provide our perspective on the challenges toward the adoption of Semantic Web technologies and the technical solutions that can address these challenges.
Article
Graphics, uncertainty, and semantics are three approaches to building models. The combination of the three approaches is a way to develop a stronger modeling method. This article surveys the research efforts toward combining these aspects, which can be divided into two routes: One is to combine graphics and uncertainty as probabilistic graphical models and then incorporate semantics, and the other is to combine graphics and semantics and then incorporate probability to handle uncertainty. The models and methods involved in these efforts are introduced and their expressiveness, pros, and cons are discussed.
Article
With the development of the Semantic Web and Artificial Intelligence techniques, ontology has become a very powerful way of representing not only knowledge but also their semantics. Therefore, how to construct ontologies from existing data sources has become an important research topic. In this paper, an approach for constructing ontologies by mining deep semantics from eXtensible Markup Language (XML) Schemas (including XML Schema 1.0 and XML Schema 1.1) and XML instance documents is proposed. Given an XML Schema and its corresponding XML instance document, 34 rules are first defined to mine deep semantics from the XML Schema. The mined semantics is formally stored in an intermediate conceptual model and then is used to generate an ontology at the conceptual level. Further, an ontology population approach at the instance level based on the XML instance document is proposed. Now, a complete ontology is formed. Also, some corresponding core algorithms are provided. Finally, a prototype system is implemented, which can automatically generate ontologies from XML Schemas and populate ontologies from XML instance documents. The paper also classifies and summarizes the existing work and makes a detailed comparison. Case studies on real XML data sets verify the effectiveness of the approach.
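One of the simplest mappings of the kind mentioned above — turning named complex types into classes — can be sketched as follows; the schema snippet, namespace handling, and Turtle output are deliberately minimal assumptions, not the paper's 34-rule procedure.

```python
import xml.etree.ElementTree as ET

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"

# Tiny example schema; a real run would parse a full XML Schema file.
schema_text = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Book"/>
  <xs:complexType name="Author"/>
</xs:schema>"""

root = ET.fromstring(schema_text)

# Rule sketch: every named complexType becomes an OWL class (emitted as Turtle).
lines = ["@prefix ex: <http://example.org/onto#> .",
         "@prefix owl: <http://www.w3.org/2002/07/owl#> ."]
for ctype in root.findall(f"{XSD_NS}complexType"):
    name = ctype.get("name")
    if name:
        lines.append(f"ex:{name} a owl:Class .")

print("\n".join(lines))
```

A full approach of this kind would additionally map elements, attributes, and type hierarchies to properties and subclass axioms, and then populate the ontology with individuals drawn from the instance documents.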
Article
Full-text available
Over the years, a growing number of semantic data repositories have been made available on the web. However, this has created new challenges in exploiting these resources efficiently. Querying services require knowledge beyond the typical user’s expertise, which is a critical issue in adopting semantic information solutions. Several proposals to overcome this difficulty have suggested using question answering (QA) systems to provide user‐friendly interfaces and allow natural language use. Because question answering over knowledge bases (KBQAs) is a very active research topic, a comprehensive view of the field is essential. The purpose of this study was to conduct a systematic review of methods and systems for KBQAs to identify their main advantages and limitations. The inclusion criteria rationale was English full‐text articles published since 2015 on methods and systems for KBQAs. Sixty‐six articles were reviewed to describe their underlying reference architectures.
Article
Self-adaptive service-oriented Applications (Self-Apps) must be able to understand themselves or the environment in which they are executed and propose solutions to meet changing conditions. The development of these applications is not a trivial task, since it encompasses issues from different research areas. Despite the importance of frameworks for Self-Apps, there is a lack of comprehensive analysis of how the design of such applications is performed, of the standardization of concepts, and of the coverage of minimum requirements for Self-Apps. The main contribution of this article is to present this comprehensive analysis, providing the state of the art for this subject. The analysis was built through a Systematic Mapping Study, based on a total of 65 studies, from which we identify the main attributes for Quality of Service (QoS), search strategies, and service management strategies employed in the design of frameworks for Self-Apps. The main aspects of the requirements involved in the design of Self-Apps are pointed out to stakeholders; for example, these applications must implement a method for evaluating QoS based on metrics. We also put forward the S-Frame, a modular solution that brings together the main features for the design of Self-Apps, and describe the main challenges concerning these applications.
Article
Full-text available
In this paper, a novel approach for building a Chinese medical knowledge graph applied in smart healthcare based on the IoT and WoT is presented, using deep neural networks combined with self-attention to generate the medical knowledge graph and make disease diagnosis and treatment advisement more convenient. Although great success has been achieved on medical knowledge graphs in recent studies, the issue of a comprehensive Chinese medical knowledge graph appropriate for telemedicine or mobile devices has been ignored. Our study rests on a working theory based on semantic mobile computing and deep learning. Several experiments demonstrate that the approach performs well in generating various types of Chinese medical knowledge graphs, with results similar to the state of the art. It also works well in terms of accuracy and comprehensiveness, highly consistent with the predictions of the theoretical model. Our work on Chinese medical knowledge graphs proves inspiring and encouraging and can stimulate the development of smart healthcare.
Article
Full-text available
Covid-19 is an acute respiratory infection and presents various clinical features ranging from no symptoms to severe pneumonia and death. Medical expert systems, especially in the diagnosis and monitoring stages, can contribute positively to the struggle against Covid-19. In this study, a rule-based expert system is designed as a predictive tool for self-pre-diagnosis of Covid-19. The potential users are smartphone users, healthcare experts, and government health authorities. The system not only shares the data gathered from the users with experts, but also analyzes the symptom data as a diagnostic assistant to predict possible Covid-19 risk. To do this, a user fills out a patient examination card that drives an online Covid-19 diagnostic test and receives an unconfirmed online test prediction result together with a set of precautionary and supportive action suggestions. The system was tested on 169 positive cases. The results produced by the system were compared with the real PCR test results for the same cases. For patients with certain symptomatic findings, no significant difference was found between the results of the system and the confirmed PCR test results. Furthermore, the suggestions produced by the system were compared with the written suggestions of a collaborating health expert; they were similar and in line with the expert's suggestions. The system can be suitable for diagnosing and monitoring positive cases in areas other than clinics and hospitals during the Covid-19 pandemic. The results of the case studies are promising and demonstrate the applicability, effectiveness, and efficiency of the proposed approach in all communities.
Chapter
The virtual representation and integration of the internet with physical objects, devices, or things has been growing exponentially in recent years. This has motivated the community to design and develop new Internet of Things (IoT) platforms to capture, access, store, share, and communicate data for information retrieval and intelligent applications. However, the associated dynamism, resource constraints, cost, and the nature of the IoT impose special design obligations for its effectiveness in the days ahead and hence pose a challenge to the community. Making machines understand web data according to the terminology of different fields is a complex task. It opens up new challenges for researchers, as such an effort requires the provision of semantically structured, appropriate information sources in this information age. The advent of numerous smart devices, operators, and IoT service providers, together with time-consuming and complex operations and inadequate research and innovation, gives rise to design complexity. Efficient functioning and effective implementation of the domain require the inclusion of semantics and the desired interoperability among these factors. This motivates the authors to review and emphasize a few of the emerging trends of semantic technology impacting the IoT. In particular, the work focuses on different aspects such as information modeling, ontology design, machine learning, network tools, security policy, and the processing of semantic data, and discusses the issues and challenges in the current scenario.
Chapter
P3 is a petri dish brimming with questions, not answers, but suggestions to explore. The aim is not to teach or pontificate but to swing the proverbial pendulum between science and engineering in the context of commercial and consumer services. The reader may ponder the amorphous questions or wonder in confusion. We disrupt the status quo and indulge in orthogonal, nonlinear, and asymmetric information arbitrage, which may not be correct. This is a seed, sterile unless cultivated. We aspire to convey that tools and data related to the affluent world are not a template to be “copied” or applied to systems in the remaining (80%) parts of the world, which suffer from economic constraints. We need different thinking that resists the inclination of the affluent 20% of the world to treat the rest of the world (80% of the population) as a market. The 80/20 concept evokes the Pareto [1] theme in P3, and the implication is that ideas may float between the (porous) 80/20 domains (partition).
Article
Full-text available
General ontology is a prominent theoretical foundation for information technology analysis, design, and development. Ontology is a branch of philosophy which studies what exists in reality. A widely used ontology in information systems, especially for conceptual modeling, is the BWW (Bunge-Wand-Weber), which is based on ideas of the philosopher and physicist Mario Bunge, as synthesized by Wand and Weber. The ontology was founded on an early subset of Bunge's philosophy; however, many of Bunge's ideas have evolved since then. An important question, therefore, is: do the more recent ideas expressed by Bunge call for a new ontology? In this paper, we conduct an analysis of Bunge's earlier and more recent works to address this question. We present a new ontology based on Bunge's later and broader works, which we refer to as Bunge's Systemist Ontology (BSO). We then compare BSO to the constructs of BWW. The comparison reveals both considerable overlap between BSO and BWW, as well as substantial differences. From this comparison and the initial exposition of BSO, we provide suggestions for further ontology studies and identify research questions that could provide a fruitful agenda for future scholarship in conceptual modeling and other areas of information technology.
Article
The relationships between the concepts of a mathematical subject area are analyzed using the example of the section on equations of mathematical physics. A variant of a thesaurus entry for terms and the equations and formulas related to them is proposed. The peculiarity of such a thesaurus lies in using the context of formulas for their additional identification in the subject area. In addition, it is proposed to take into account the indices of the authors and articles in which thesaurus terms occur. The proposed approach helps to refine search queries and reduce information noise when the thesaurus is used in digital bibliographic collections.
Article
Full-text available
Multiple sclerosis (MS) is a neurological disorder that strikes the central nervous system. Due to the complexity of this disease, healthcare sectors are increasingly in need of shared clinical decision-making tools to provide practitioners with insightful knowledge and information about MS. These tools ought to be comprehensible to both technical and non-technical healthcare audiences. To aid this cause, this literature review analyzes the state-of-the-art decision support systems (DSSs) in MS research with a special focus on model-driven decision-making processes. The review clusters common methodologies used to support the decision-making process in classifying, diagnosing, predicting, and treating MS. This work observes that the majority of the investigated DSSs rely on knowledge-based and machine learning (ML) approaches, so the utilization of ontology and ML in the MS domain is also examined to extend the scope of this review. Finally, this review summarizes the state-of-the-art DSSs, discusses the methods that have commonalities, and addresses future work on applying DSS technologies in the MS field.
Article
Full-text available
Problems related to requirements specification, such as ambiguity and incompleteness, are still recurrent in information system development processes. Requirements reuse is one of the mechanisms that can help reduce these setbacks. In this sense, the objective of this work is to propose a method for creating and publishing semantic thesauri of requirements for reuse, using Semantic Web technologies and standards and following Linked Data principles. For the formal description of these thesauri, the core ontology used is the Simple Knowledge Organization System. This ontological model provides a set of axioms and properties aimed at the creation of thesauri, making it possible to document precisely and reliably, in a knowledge graph, the definition, hierarchy, and other interrelationships among the requirements of a system. A Web service prototype is also presented, which works as a repository for reuse and demonstrates the method in practice. A study on the feasibility of implementing the proposal, carried out with practitioners, is described, in which a group discussion was held followed by the individual completion of an evaluation questionnaire. The study obtained mostly favorable results, and some suggestions for improvement were pointed out. The participants considered the proposal relevant to Requirements Engineering and to have potential for expansion, since the presented guidelines allow the creation of new types of inference and navigability over the stored requirements.
Chapter
The World Wide Web (WWW) is a global information medium through which users can read and write via computers connected to the internet. The Web is one of the services available on the internet; it was created in 1989 by Sir Tim Berners-Lee. Since then, great refinements have been made in web usage and in the development of its applications. Semantic Web technologies enable machines to interpret data published on the web in a machine-interpretable form. The semantic web is not a separate web; it is an extension of the current web with additional semantics. Semantic technologies play a crucial role in making data understandable to machines. To achieve machine understanding, we should add semantics to existing websites. With additional semantics, we can achieve the next level of the web, where knowledge repositories are available for a better understanding of web data. This facilitates better search, accurate filtering, and intelligent retrieval of data. This paper discusses the Semantic Web and the languages involved in describing documents in a machine-understandable format.
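A small sketch of "adding semantics" with RDF is given below: a few triples describe a document and its author and are serialized to Turtle with rdflib. The namespace and resource IRIs are placeholders chosen for the example.

```python
from rdflib import Graph, Literal, URIRef, RDF
from rdflib.namespace import FOAF, DCTERMS

g = Graph()
doc = URIRef("http://example.org/doc/semantic-web-article")   # placeholder IRI
author = URIRef("http://example.org/person/tim")              # placeholder IRI

# Machine-readable statements about the document and its author.
g.add((doc, RDF.type, FOAF.Document))
g.add((doc, DCTERMS.title, Literal("The Semantic Web")))
g.add((doc, DCTERMS.creator, author))
g.add((author, RDF.type, FOAF.Person))
g.add((author, FOAF.name, Literal("Tim Berners-Lee")))

# Turtle is one of the standard serializations a software agent can consume.
print(g.serialize(format="turtle"))
```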
Article
Full-text available
Ontologies have long been employed in the life sciences to formally represent and reason over domain knowledge and they are employed in almost every major biological database. Recently, ontologies are increasingly being used to provide background knowledge in similarity-based analysis and machine learning models. The methods employed to combine ontologies and machine learning are still novel and actively being developed. We provide an overview over the methods that use ontologies to compute similarity and incorporate them in machine learning methods; in particular, we outline how semantic similarity measures and ontology embeddings can exploit the background knowledge in ontologies and how ontologies can provide constraints that improve machine learning models. The methods and experiments we describe are available as a set of executable notebooks, and we also provide a set of slides and additional resources at https://github.com/bio-ontology-research-group/machine-learning-with-ontologies.
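A very small sketch of the structure-based semantic similarity measures surveyed there is shown below, using a toy is-a hierarchy and a Wu-Palmer-style shared-ancestor score; the hierarchy and the depth counting are assumptions made for illustration, not the paper's own measures.

```python
# Toy is-a hierarchy (child -> parent); illustrative, not a real bio-ontology.
PARENT = {
    "insulin signaling": "signal transduction",
    "MAPK cascade": "signal transduction",
    "signal transduction": "cellular process",
    "cellular process": "biological process",
}

def ancestors(term: str) -> list[str]:
    """Path from a term up to the root, including the term itself."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def wu_palmer(a: str, b: str) -> float:
    """Wu-Palmer-style similarity: deeper shared ancestors give higher scores."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    common = next(t for t in anc_a if t in anc_b)   # lowest common ancestor
    depth = lambda t: len(ancestors(t))             # crude depth (root has depth 1)
    return 2 * depth(common) / (depth(a) + depth(b))

print(wu_palmer("insulin signaling", "MAPK cascade"))  # 0.75 for this toy hierarchy
```

Scores of this kind, or learned ontology embeddings, can then be fed to machine learning models as background-knowledge features alongside the raw data.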
Article
Interoperability is a key concern in systems‐of‐systems (SoS). Numerous frameworks have been proposed to deal with this, but they are generally on a high level and do not provide specific guidance for technical implementation. However, in the context of simulation, the Levels of Conceptual Interoperability Model (LCIM) has been proposed. Also, the semantic web initiative has been introduced to provide description logic information to web pages. This paper investigates how these two concepts can be combined into a general approach for SoS interoperability. It also expands on the LCIM model by providing more details about the world models of a system and its content on the higher levels of interoperability. The combination is illustrated using an example of autonomous vehicles, and experiences from other applications are also discussed.
Article
Full-text available
The emergence of the Internet of Things (IoT) in the medical field has led to the massive deployment of a myriad of medical connected objects (MCOs). These MCOs are being developed and implemented for remote healthcare monitoring purposes, including elderly patients with chronic diseases, pregnant women, and patients with disabilities. Accordingly, different associated challenges are emerging, including the heterogeneity of the health data gathered from these MCOs in ever‐changing contexts. These contexts relate to the continuously changing constraints and requirements of MCO deployment (time, location, state). Other contexts are related to the patient (medical record, state, age, sex, etc.) and should be taken into account to ensure more precise and appropriate treatment of the patient. These challenges are difficult to address due to the absence of a reference model for describing the health data and their sources and linking these data with their contexts. This article addresses this problem and introduces a semantic‐based context‐aware system (the IoT Medicare system) for patient monitoring with MCOs. This system is based on a core domain ontology (HealthIoT‐O) designed to describe the semantics of heterogeneous MCOs and their data. Moreover, efficient interpretation and management of this knowledge in diverse contexts are ensured through SWRL rules, such as the verification of the proper functioning of the MCOs and the analysis of the health data for diagnosis and treatment purposes. A case study of gestational diabetes disease management is proposed to evaluate the effectiveness of the implemented IoT Medicare system. An evaluation phase is provided and focuses on the quality of the elaborated semantic model and the performance of the system.
Chapter
The computerization of the health and medical industry, which includes the employment of information systems and the use of technological medical gadgets, produces huge amounts of data in hospitals, clinics, and other medical establishments on a regular basis. This enormous volume of health- and medicine-associated data from medical records, patient monitoring, etc., continues to grow and thus needs to be managed properly for facilitating better healthcare services and the development of enhanced practices and biomedical products. This poses the challenge of finding valuable information, analyzing it, and transforming it into knowledge for enabling better decision making. However, the key challenge of maintaining the interoperability of health‐related data, which are huge, disparate, and distributed, needs to be addressed. The Internet of Things (IoT) and Semantic Web Technologies (SWTs) are two key emerging technologies that play a significant role in overcoming these challenges of handling and presenting data searches. Their role and usage in addressing the concerns of the health and medical sector are explored in this chapter, along with their challenges, various applications, and the future research scope they offer. Further, the integration of data mining and machine learning for healthcare and medical data analytics is also discussed in this chapter.
Chapter
Recommender Systems guide users in choosing objects from a variety of possible options in a personalized manner. Broadly, there are two categories of recommender systems, i.e., content-based and collaborative filtering-based. These systems suggest items based on the past interests of customers. They personalize the information by using relevant information. These systems are used in various domains, such as recommending movies, products to purchase, restaurants, and places to visit. This chapter discusses the concepts of content‐based recommender systems, including the distinct features in their design and implementation. The high-level architecture and applications of these systems in various domains are also presented in this chapter.
Article
Spatial data infrastructures (SDIs) provide access to spatial data and services for humans to solve spatial problems but represent a barrier for machines such as search engines or spatial services. In this article we propose an architecture for spatial knowledge infrastructure (SKI), which attempts to overcome these limitations, and describe the interactions between its components. The SKI architecture proposed is illustrated for two scenarios. The first is an architecture for an ideal solution requiring spatial data on the web, and the second is for a practical short‐term solution that takes into consideration legacy SDIs as well as spatial data on the web in its implementation. In addition, the utility of the proposed SKI architecture is illustrated with two examples demonstrating the scope of the SKI in two different ways: where the SKI is a “knowledge enabler,” and where the SKI acts as a “knowledge creator.” A discussion on how well the proposed architecture meets current best practice for spatial data on the web, as well as obstacles and challenges in the implementation of the proposed solutions, concludes the article.
Article
Full-text available
Background: Sharing sensitive data across organizational boundaries is often significantly limited by legal and ethical restrictions. Regulations such as the EU General Data Protection Rules (GDPR) impose strict requirements concerning the protection of personal and privacy sensitive data. Therefore new approaches, such as the Personal Health Train initiative, are emerging to utilize data right in their original repositories, circumventing the need to transfer data. Results: Circumventing limitations of previous systems, this paper proposes a configurable and automated schema extraction and publishing approach, which enables ad-hoc SPARQL query formulation against RDF triple stores without requiring direct access to the private data. The approach is compatible with existing Semantic Web-based technologies and allows for the subsequent execution of such queries in a safe setting under the data provider's control. Evaluation with four distinct datasets shows that a configurable amount of concise and task-relevant schema, closely describing the structure of the underlying data, was derived, enabling the schema introspection-assisted authoring of SPARQL queries. Conclusions: Automatically extracting and publishing data schema can enable the introspection-assisted creation of data selection and integration queries. In conjunction with the presented system architecture, this approach can enable reuse of data from private repositories and in settings where agreeing upon a shared schema and encoding a priori is infeasible. As such, it could provide an important step towards reuse of data from previously inaccessible sources and thus towards the proliferation of data-driven methods in the biomedical domain.
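The kind of schema introspection that such an extraction step boils down to can be sketched as follows: a generic SPARQL query lists the classes and properties actually used in a local RDF graph via rdflib. The sample data is a placeholder; the cited system adds configuration, aggregation, and safe publishing on top of this idea.

```python
from rdflib import Graph

g = Graph()
# Placeholder triples standing in for a private repository.
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:p1 a ex:Patient ; ex:hasAge 42 .
ex:p2 a ex:Patient ; ex:hasDiagnosis ex:d1 .
""", format="turtle")

# Generic introspection query: which classes occur, and with which properties?
SCHEMA_QUERY = """
SELECT DISTINCT ?cls ?prop WHERE {
  ?s a ?cls ;
     ?prop ?o .
}
"""

for row in g.query(SCHEMA_QUERY):
    print(row.cls, row.prop)
```

Publishing only this derived schema, rather than the triples themselves, is what lets outside researchers author SPARQL queries without ever seeing the private data.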