Sébastien Harispe

IMT Mines Alès | EMA · EuroMov Digital Health in Motion

PhD Computer Science

About

52
Publications
15,137
Reads
1,026
Citations
Additional affiliations
May 2014 - present
IMT Mines Alès
Position
  • PostDoc Position
April 2011 - May 2014
IMT Mines Alès
Position
  • PhD Student

Publications (52)
Article
Full-text available
This paper presents and compares several text classification models that can be used to extract the outcome of a judgment from justice decisions, i.e., legal documents summarizing the different rulings made by a judge. Such models can be used to gather important statistics about cases, e.g., success rate based on specific characteristics of cases’...
Chapter
Full-text available
Association Rule Mining (ARM) in the context of imperfect data (e.g. imprecise data) has received little attention so far despite the prevalence of such data in a wide range of real-world applications. In this work, we present an ARM approach that can be used to handle imprecise data and derive imprecise rules. Based on evidence theory and Multiple...
Conference Paper
Full-text available
Many approaches exist for grouping/segmenting objects that may or may not share similar characteristics. Classical distances are used under the assumption that these characteristics are independent and define a metric space. However, when these characteristics are organized in a representation of...
Conference Paper
Full-text available
This article presents the contribution of the Laboratoire de Génie Informatique et d'Ingénierie de Production (LGI2P) team at IMT Mines Alès to the DÉfi Fouille de Textes (DEFT) 2019 challenge. In particular, it details two approaches proposed for the tasks related to (1) indexing and (2) document similarity. These methods rely on techniques...
Chapter
Court decisions are legal documents that undergo careful analysis by lawyers in order to understand how judges make decisions. Such analyses can indeed provide invaluable insight into application of the law for the purpose of conducting many types of studies. As an example, a decision analysis may facilitate the handling of future cases and detect...
Conference Paper
Full-text available
Organizations are expressing a growing interest in developing approaches that leverage past experiences in order to improve their decision-making processes. In this Lessons Learned (Retour d'Expérience) context, the study of semi-automated approaches for knowledge capitalization and exploitation is of central...
Article
Data veracity is one of the main issues regarding Web data. Truth Discovery models can be used to assess it by estimating value confidence and source trustworthiness through analysis of claims on the same real-world entities provided by different sources. Many studies have been conducted in this domain. True values selected by most models have the...
Article
Data veracity is one of the main issues regarding web data. Facing fake news proliferation and disinformation dangers, Truth Discovery models can be used to assess this veracity by estimating value confidence and source trustworthiness through analysis of claims on the same real-world entities provided by different sources. This treatment is crucia...
Conference Paper
Full-text available
Automatic elicitation of implicit evocations - i.e. indirect references to entities (e.g. objects, persons, locations) - is central for the development of intelligent agents able to understand the meaning of written or spoken natural language. This paper focuses on the definition and evaluation of models that can be used to summarize a set of wo...
Article
Full-text available
This article introduces an automated knowledge inference approach taking advantage of relationships extracted from texts. It is based on a novel framework making possible to exploit (i) a generated partial ordering of studied objects (e.g. noun phrases), and (ii) prior knowledge defined into ontologies. This framework is particularly suited for def...
Conference Paper
Summarizing a body of information is a complex task which mainly depends on the ability to distinguish important information and to condense notions through abstraction. Considering a knowledge representation partially ordering concepts into a directed acyclic graph, this study focuses on the problem of summarizing several human descriptions expres...
Conference Paper
When evaluating an odor, non-specialists generally provide descriptions as bags of terms. Nevertheless, these evaluations cannot be processed by classical odor analysis methods, which have been designed for trained evaluators with an excellent mastery of a professional controlled vocabulary. Indeed, currently, mainly oriented approaches based on lear...
Conference Paper
The main aim of truth-finding methods is to identify the most reliable and trustworthy data among a set of facts. Since existing methods assume a single true value, they cannot deal with numerous real-world use cases in which a set of true values exists for a given fact, even for functional predicates (e.g. Picasso was born in Málaga and in Spain). T...
Conference Paper
Full-text available
Designing approaches able to automatically detect uncertain expressions in natural language is central to building efficient models based on text analysis, in particular in domains such as question answering, approximate reasoning, and knowledge base population. This article proposes an overview of several contributions and classifications defining...
Article
Full-text available
Detecting uncertainty in natural language is central to the development of many models based on text analysis, e.g. question answering, approximate reasoning, knowledge base enrichment. After a synthesis of the different classifications of uncertainty and the corresponding detection methods, this...
Conference Paper
Full-text available
The need for indexing biomedical papers with the MeSH is constantly growing, and automated approaches are constantly evolving. Since 2013, the BioASQ challenge has been promoting these evolutions by proposing datasets and evaluation metrics. In this paper, we present our system, USI, and how we adapted it to participate in this challenge this year....
Conference Paper
Full-text available
Ontologies are core elements of numerous applications that are based on computer-processable expert knowledge. They can be used to estimate the Information Content (IC) of the key concepts of a domain: a central notion on which depend various ontology-driven analyses, e.g. semantic measures. This paper proposes new IC models based on the belief fun...
Article
Full-text available
Despite their large volume and accessibility, much digital data cannot be properly exploited because it is contained in texts in poorly structured or unstructured forms. Relation extraction is a process that brings together techniques for extracting entities and relations from texts, giving us...
Book
Full-text available
Artificial Intelligence federates numerous scientific fields in the aim of developing machines able to assist human operators performing complex treatments -- most of which demand high cognitive skills (e.g. learning or decision processes). Central to this quest is to give machines the ability to estimate the likeness or similarity between things i...
Chapter
The capacity of assessing the similarity of objects or stimuli has long been characterized as a central component for establishing numerous cognitive processes. It is therefore not surprising that measures of similarity or distance play an important role in a large variety of treatments and algorithms, and are of particular interest for the develop...
Chapter
Back in the 60s, the quest for artificial intelligence (AI) had originally been motivated by the assumption that “[…] every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it […]” [McCarthy et al., 2006]. Even if this assumption has today proved to be pretenti...
Chapter
This chapter is dedicated to semantic measure evaluation and discusses in particular two important topics: (i) how to objectively evaluate measures and (ii) how to guide their selection with regard to specific needs. To tackle these central questions, we propose technical discussions on both methodological and practical aspects related to semantic...
Chapter
As we have seen, two main families of semantic measures can be distinguished: corpus-based measures, which take advantage of unstructured or semi-structured texts, and knowledge-based measures which rely on ontologies.
Conference Paper
Full-text available
Knowledge-based semantic measures are cornerstone to exploit ontologies not only for exact inferences or retrieval processes, but also for data analyses and inexact searches. Abstract theoretical frameworks have recently been proposed in order to study the large diversity of measures available; they demonstrate that groups of measures are particula...
Conference Paper
Full-text available
Semantic similarity and relatedness are cornerstones of numerous treatments in which lexical units (e.g., terms, documents), concepts or instances have to be compared from texts or knowledge representation analysis. These semantic measures are central for NLP, information retrieval, sentiment analysis and approximate reasoning, to mention a few. In...
Thesis
Full-text available
The notions of semantic proximity, distance, and similarity have long been considered essential for the elaboration of numerous cognitive processes, and are therefore of major importance for the communities involved in the development of artificial intelligence. This thesis studies the diversity of semantic measures which can be used to compare lex...
Article
Full-text available
The Semantic Measures Library and Toolkit are robust open source and easy to use software solutions dedicated to semantic measures. They can be used for large scale computations and analyses of semantic similarities between terms/concepts defined in terminologies and ontologies. The comparison of entities (e.g. genes) annotated by concepts is also...
Article
Full-text available
Semantic measures are today widely used to estimate the strength of the semantic relationship between elements of various types: units of language (e.g., words, sentences), concepts or even entities (e.g., documents, genes, geographical locations). They play an important role in the comparison of these elements according to semantic proxies: texts an...
Conference Paper
Full-text available
Many applications take advantage of both ontologies and the Linked Data paradigm to characterize various kinds of resources. To fully exploit this knowledge, measures are used to estimate the relatedness of resources regarding their semantic characterization. Such semantic measures mainly focus on specific aspects of the semantic characterization (...
Conference Paper
Full-text available
Semantic Measures (SMs) are of critical importance in multiple treatments relying on ontologies. However, the improvement and use of SMs are currently hampered by the lack of a dedicated theoretical framework and an extensive generic software solution. To meet these needs, this paper introduces a unified theoretical framework of graph-based SMs, fr...
Article
Ontologies are widely adopted in the biomedical domain to characterize various resources (e.g. diseases, drugs, scientific publications) with non-ambiguous meanings. By exploiting the structured knowledge that ontologies provide, a plethora of ad hoc and domain-specific semantic similarity measures have been defined over the last years. Nevertheles...
Data
Optimal pairwise computation cost for two coding DNA sequences. (PDF)
Article
Full-text available
Until now the most efficient solution to align nucleotide sequences containing open reading frames was to use indirect procedures that align amino acid translation before reporting the inferred gap positions at the codon level. There are two important pitfalls with this approach. Firstly, any premature stop codon impedes using such a strategy. Seco...

Projects (3)
Archived project
PhD thesis on how to use knowledge bases to enrich indexing and clustering algorithms.
Project
Taking uncertainty into account in knowledge discovery from texts. The pipeline extracts relations from texts, uses prior knowledge such as a taxonomy to enrich the extractions, and infers knowledge.