Guillaume Gravier’s research while affiliated with the French National Centre for Scientific Research and other places


Publications (204)


Fig. 3: Examples of attention maps on one of the e-SNLI entailment pairs.
Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation
  • Preprint

January 2025 · 3 Reads · Guillaume Gravier
The attention mechanism contributes to the majority of recent advances in machine learning for natural language processing. It also yields an attention map that shows the proportional influence of each input on the model's decision. Empirical studies postulate that attention maps can serve as an explanation for model output. However, it remains questionable whether this explanation helps regular people understand and accept the model output (the plausibility of the explanation). Recent studies show that attention weights in RNN encoders are hardly plausible because they are spread across input tokens. We thus propose three additional constraints on the learning objective function to improve the plausibility of the attention map: regularization to increase attention weight sparsity, semi-supervision that supervises the map with a heuristic, and supervision by human annotation. Results show that all three techniques improve attention map plausibility to some degree. We also observe that specific instructions for human annotation might have a negative effect on classification performance. Beyond the attention map, experiments on text classification tasks also show that, regardless of which constraint brings the gain, the contextualization layer plays a crucial role in finding the right space for identifying plausible tokens.

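As a rough illustration of the sparsity-regularization constraint described in the abstract, one can add an entropy penalty on the attention distribution to a standard classification loss. This is a minimal PyTorch sketch, not the paper's exact objective; the `attn_weights` tensor and the coefficient `lam` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def attention_entropy(attn_weights: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    # attn_weights: (batch, seq_len), rows summing to 1 (softmax output).
    # Low entropy = mass concentrated on few tokens = a sharper, sparser map.
    return -(attn_weights * (attn_weights + eps).log()).sum(dim=-1).mean()

def regularized_loss(logits, labels, attn_weights, lam=0.1):
    # Classification loss plus a penalty that rewards sparse attention maps.
    return F.cross_entropy(logits, labels) + lam * attention_entropy(attn_weights)
```

Minimizing the added term pushes attention mass onto fewer tokens, which is one way to operationalize "increase the attention weight sparsity".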

Decreasing graph complexity with transitive reduction to improve temporal graph classification

International Journal of Data Science and Analytics

Carolina Jerônimo · [...]
Domains such as bioinformatics, social network analysis, and computer vision describe relations between entities that cannot be interpreted as vectors or fixed grids; instead, they are naturally represented by graphs. Often this kind of data evolves over time, respecting a temporal order; such data is known as temporal graphs. Temporal graphs are challenging because subgraph patterns are difficult to find and the distance between those patterns may change irregularly over time. While state-of-the-art methods are primarily designed for static graphs and may not capture temporal information, recent works have proposed mapping temporal graphs to static graphs to allow the use of conventional static kernel approaches. This work presents a new method for temporal graph classification based on transitive reduction, which explores new kernels and graph neural networks for temporal graph classification. We compare the impact of transitive reduction on the mapping to static graphs in terms of accuracy and computational efficiency across different classification tasks. Experimental results demonstrate the effectiveness of the proposed mapping method in improving the accuracy of supervised classification for temporal graphs while maintaining reasonable computational efficiency.
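For intuition, the core operation is standard transitive reduction over a directed acyclic graph, available directly in networkx. This sketch shows only that step, not the paper's full temporal-to-static mapping or the downstream kernel/GNN classifiers.

```python
import networkx as nx

# Toy DAG obtained from a temporal-to-static mapping: the edge (a, c)
# is redundant because c is already reachable through b.
G = nx.DiGraph([("a", "b"), ("b", "c"), ("a", "c")])

# Transitive reduction preserves reachability with the fewest edges,
# shrinking the graph handed to the classifier.
reduced = nx.transitive_reduction(G)
print(sorted(reduced.edges()))  # [('a', 'b'), ('b', 'c')]
```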



Filtering Safe Temporal Motifs in Dynamic Graphs for Dissemination Purposes

November 2023 · 19 Reads

Lecture Notes in Computer Science

In this paper, we address the challenges posed by dynamic networks in various domains, such as bioinformatics, social network analysis, and computer vision, where relationships between entities are represented by temporal graphs that respect a temporal order. To understand the structure and functionality of such systems, we focus on small subgraph patterns, called motifs, which play a crucial role in understanding dissemination processes in dynamic networks, such as the spread of fake news, infectious diseases, or computer viruses. We propose a novel approach called temporal motif filtering for classifying dissemination processes in labeled temporal graphs. Our approach identifies and examines key temporal subgraph patterns, contributing significantly to our understanding of dynamic networks. To further enhance classification performance, we combine directed line graph transformations with temporal motif removal, and we additionally integrate motif filtering, directed edge transformations, and transitive edge reduction. Experimental results demonstrate that the proposed approaches consistently improve classification accuracy across various datasets and tasks. These findings can unlock deeper insights into diverse domains and enable more accurate and efficient strategies for addressing spreading processes in dynamic environments. Our work contributes to the field of temporal graph analysis and classification, opening new avenues for understanding and exploiting dynamic networks. A sketch of the line graph transformation follows.
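The directed line graph transformation mentioned above can be illustrated generically: temporal edges become nodes, connected when one edge can extend a time-respecting walk through the other. This is a common temporal-to-static mapping, not necessarily the paper's exact construction or its motif-filtering criteria.

```python
import networkx as nx

def directed_line_graph(temporal_edges):
    # Nodes are the temporal edges (u, v, t); an arc e1 -> e2 exists when
    # e2 leaves the head of e1 at a strictly later time, i.e. e2 can
    # extend a time-respecting walk through e1.
    L = nx.DiGraph()
    L.add_nodes_from(temporal_edges)
    for (u1, v1, t1) in temporal_edges:
        for (u2, v2, t2) in temporal_edges:
            if v1 == u2 and t1 < t2:
                L.add_edge((u1, v1, t1), (u2, v2, t2))
    return L

edges = [("a", "b", 1), ("b", "c", 2), ("b", "c", 0)]
print(list(directed_line_graph(edges).edges()))
# Only ('a','b',1) -> ('b','c',2) respects time; ('b','c',0) is too early.
```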



Figure 2: ROC curves for evaluating plausibility via attention weights and saliency.
Filtrage et régularisation pour améliorer la plausibilité des poids d'attention dans la tâche d'inférence en langue naturelle

June 2023 · 20 Reads · 3 Citations

We scrutinize the plausibility of the attention mechanism in a natural language inference task, specifically its capacity to offer a human-plausible explanation for the detected relationship between two sentences. Using the Explanation-Augmented Stanford Natural Language Inference corpus, we found that attention weights are seldom plausible in practice and usually fail to focus on the relevant tokens. We assess various strategies to improve the plausibility of attention weights, leveraging masks derived from morphosyntactic analysis or regularization to enforce parsimony. Our results show that these strategies significantly improve the plausibility of the attention weights and yield superior performance compared to saliency map methods.
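As an illustration of the morphosyntactic-mask strategy, one can zero attention weights on function words and renormalize. The sketch below uses spaCy part-of-speech tags and assumes the attention weights are aligned with spaCy's tokenization; the actual mask used in the paper may differ.

```python
import spacy
import torch

nlp = spacy.load("en_core_web_sm")
CONTENT_POS = {"NOUN", "PROPN", "VERB", "ADJ", "NUM"}  # tokens allowed to attend

def pos_filtered_attention(sentence: str, attn: torch.Tensor) -> torch.Tensor:
    # Keep attention only on content words, then renormalize to sum to 1.
    mask = torch.tensor([float(tok.pos_ in CONTENT_POS) for tok in nlp(sentence)])
    filtered = attn * mask
    return filtered / filtered.sum().clamp_min(1e-12)
```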


Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation

June 2023 · 31 Reads

Lecture Notes in Computer Science

The attention mechanism contributes to the majority of recent advances in machine learning for natural language processing. It also yields an attention map that shows the proportional influence of each input on the model's decision. Empirical studies postulate that attention maps can serve as an explanation for model output. However, it remains questionable whether this explanation helps regular people understand and accept the model output (the plausibility of the explanation). Recent studies show that attention weights in RNN encoders are hardly plausible because they are spread across input tokens. We thus propose three additional constraints on the learning objective function to improve the plausibility of the attention map: regularization to increase attention weight sparsity, semi-supervision that supervises the map with a heuristic, and supervision by human annotation. Results show that all three techniques improve attention map plausibility to some degree. We also observe that specific instructions for human annotation might have a negative effect on classification performance. Beyond the attention map, results on text classification tasks also show that the contextualization layer plays a crucial role in finding the right space for identifying plausible tokens, regardless of which constraint brings the gain.
Keywords: Attention mechanism · Explainability · Plausibility · Regularization · Semi-supervision · Supervision


Confronting Active Learning for Relation Extraction to a Real-life Scenario on French Newspaper Data

With recent deep learning advances in natural language processing, tasks such as relation extraction have been solved on benchmark data with near-perfect accuracy. However, in a realistic scenario, such as a French newspaper company mostly dedicated to local information, relations are of varied, highly specific types, virtually no data is annotated for relations, and many entities co-occur in a sentence without being related. We question the use of supervised state-of-the-art models in such a context, where resources such as time, computing power, and human annotators are limited. To adapt to these constraints, we experiment with an active-learning-based relation extraction pipeline, consisting of a binary LSTM-based model for detecting the relations that do exist and a state-of-the-art model for relation classification. We compare several classification models of different depths, from simplistic word embedding averaging to graph neural networks and BERT-based models, as well as several active learning query strategies, including a proposed balanced uncertainty-based strategy, in order to find the most cost-efficient yet accurate approach for our newspaper company's use case. Our findings highlight the unsuitability of deep models in this data-scarce scenario, as well as the need to further develop data-driven active learning strategies.
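One plausible reading of a "balanced uncertainty-based" strategy is entropy-based uncertainty sampling with a per-class quota over the model's predicted labels. The sketch below is an assumption about how such a strategy could look, not the paper's exact proposal.

```python
import numpy as np

def balanced_uncertainty_query(probs: np.ndarray, n_queries: int) -> list[int]:
    # probs: (n_samples, n_classes) predicted probabilities on the unlabeled pool.
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    predicted = probs.argmax(axis=1)
    n_classes = probs.shape[1]
    per_class = max(1, n_queries // n_classes)
    chosen: list[int] = []
    for c in range(n_classes):
        idx = np.where(predicted == c)[0]
        idx = idx[np.argsort(-entropy[idx])]  # most uncertain first
        chosen.extend(idx[:per_class].tolist())
    return chosen[:n_queries]
```

The per-class quota is what makes the selection "balanced": plain uncertainty sampling tends to flood the batch with examples from whichever class the model is least sure about.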



Fig. 2: Overall model architecture. The dashed lines indicate that layers share weights; in other words, we use the same first three layers (Embedding, Contextualization, Attention) for both premise and hypothesis.
Fig. 3: The attention mechanism used in the experiments. The intuition is that the α vector should indicate how relevant each word is with respect to the sentence embedding of the other side, h_m.
A Study of the Plausibility of Attention between RNN Encoders in Natural Language Inference

December 2021 · 20 Reads · 4 Citations

Attention maps in neural models for NLP are appealing as a way to explain the decision made by a model, hopefully emphasizing words that justify the decision. While many empirical studies hint that attention maps can provide such justification from the analysis of sound examples, only a few assess the plausibility of explanations based on attention maps, i.e., the usefulness of attention maps for humans to understand the decision, and these studies furthermore focus on text classification. In this paper, we report on a preliminary assessment of attention maps in a sentence comparison task, namely natural language inference. We compare the cross-attention weights between two RNN encoders with human-based and heuristic-based annotations on the e-SNLI corpus. We show that the heuristic reasonably correlates with human annotations and can thus facilitate the evaluation of plausible explanations in sentence comparison tasks. Raw attention weights, however, remain only loosely related to a plausible explanation.
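One simple way to quantify how well attention weights line up with human rationales, in the spirit of the comparison above, is to treat per-token attention as a score for "belongs to the explanation" and compute a ROC AUC against the annotated tokens. The arrays below are made-up illustrative values, not data from the paper.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

human_rationale = np.array([0, 1, 1, 0, 0, 1])               # 1 = token highlighted by annotators
attention = np.array([0.05, 0.40, 0.25, 0.10, 0.05, 0.15])   # model's weights, same tokens

# High AUC means attention ranks annotated tokens above the others.
print(roc_auc_score(human_rationale, attention))
```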


Citations (69)


... Despite producing accurate results, the methods are computationally expensive since the graphs obtained from the translation to static ones are huge. To reduce the complexity of the graph, in our previous work [24], we proposed a method based on Transitive Reduction (TR) for removing redundancy in static graphs. In graph theory, this redundancy, the transitive relation, may be seen as the existence of two or more paths between two different vertices. ...

Reference:

Decreasing graph complexity with transitive reduction to improve temporal graph classification
A Novel Method for Temporal Graph Classification based on Transitive Reduction
  • Citing Conference Paper
  • October 2023

... While other post-hoc explanation techniques (such as gradient-based methods or feature suppression) can provide faithful explanations, they present two main drawbacks: (i) they incur additional computational costs during each inference, and (ii) they offer benefits only to the model developer, without the flexibility to impose constraints for plausible explanations [2], while their explanations cannot be guaranteed to be plausible for end-users [3,21]. ...

Filtrage et régularisation pour améliorer la plausibilité des poids d'attention dans la tâche d'inférence en langue naturelle

... While plausibility is an interesting property that allows explanations to be presented in an easily comprehensible way to individuals with limited knowledge of neural models, without additional computational costs, contributions in this direction remain limited and rare. Given that multiple studies have suggested that raw attention weights lack plausibility (see, e.g., [20]), the issue of enforcing their plausibility is an obvious one that calls for further exploration. As it has been shown possible to incorporate constraints on attention while maintaining satisfactory performance [25,12,31], we propose three approaches for enforcing plausibility constraints on attention maps, namely, sparsity regularization, semi-supervised learning, and supervised learning. ...

A Study of the Plausibility of Attention between RNN Encoders in Natural Language Inference

... presents how such a system works: the method for relation detection discriminates the positive examples from the negative examples, and our neighbours-based method classifies the positive examples among the compatible relation types. Such a two-step approach has already been exploited with promising results [14]. ...

Active Learning for Interactive Relation Extraction in a French Newspaper’s Articles

... For some applications, it may be useful to store some high-level audio descriptors, such as the signal envelope or a transcription via an automatic speech recognition algorithm. Alternatively, a representative short excerpt from the original audio footage (a so-called acoustic thumbnail, Gravier et al. [2014]) could be included. The computation of the acoustic fingerprint may also change in future payload versions. ...

Audio thumbnails for spoken content without transcription based on a maximum motif coverage criterion
  • Citing Conference Paper
  • September 2014

... Paragraph Vector [13] is a well-known method that uses a large set of unlabeled text to learn sentence representations, but the representations are not optimized for a specific task because they are learned with unsupervised objectives. In contrast, some researchers have worked on supervised sentence embedding, particularly for natural language understanding, using recurrent neural networks (RNNs) [16,33,36] and long short-term memory (LSTM) networks [22,35,5]. However, because we cannot define an objective function based on classification error between ID cases and OOD cases, these methods are not directly applicable to our problem, in which only ID sentences are available as a training set. ...

Is it time to switch to word embedding and recurrent neural networks for spoken language understanding?

... Clustering algorithms that can be employed include K-Means, K-medoids, and C-Means. Methods for reducing dimensionality include Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) (Saranya et al., 2020) (Simeoni et al., 2021). 3) Semi-Supervised Learning: falls between the two approaches and involves working with both labeled and unlabeled data, thereby merging the two previously described approaches of supervised and unsupervised methods. ...

Rethinking deep active learning: Using unlabeled data at model training
  • Citing Conference Paper
  • January 2021

... cohesion, coupling). On the other hand, recent software engineering research has showcased the potential of using Large Language Models (LLMs) [60] and Representation Learning techniques [36,91] to enhance the performance of downstream tasks [84] like code summarization and code search [21] through rich, high-dimensional representation vectors of source code. ...

A survey on training and evaluation of word embeddings

International Journal of Data Science and Analytics

... Label propagation can predict pseudo-labels for unlabeled data (test data) and increase the amount of training samples at training time. KGs promote this transductive process [10,15,33] by modeling latent relations between labeled and unlabeled data, thereby facilitating classification through label propagation. Figure 1: The overview of our proposed framework, named GCNBoost. ...

A correlation-based entity embedding approach for robust entity linking
  • Citing Conference Paper
  • November 2020

... How much are two given words related? In general, the way of automatically computing a degree of relatedness between words falls into one of the following categories of methods [22]: corpus-based methods, which use large corpora of natural language texts, and exploit co-occurrences of words, as for instance [47]; knowledge-based methods, which rely on structured resources, as for instance [13]; and hybrid methods, which are a mix of the two, as for instance [38]. Corpus-based methods benefit from the huge availability of textual documents and the advancements in the field of natural language processing and, for this reason, they have been widely investigated in the literature for a long time. ...

A Novel Path-Based Entity Relatedness Measure for Efficient Collective Entity Linking
  • Citing Chapter
  • November 2020

Lecture Notes in Computer Science