Damien Graux’s research while affiliated with Trinity College Dublin and other places

Publications (53)


Figure 1: LLM performance across the three benchmarks (higher is better).
Figure 2: Performance of LLMs as co-pilots, measuring the closeness (%) of generations to the "gold" references.
An Extensive Evaluation of PDDL Capabilities in off-the-shelf LLMs
  • Preprint
  • File available

February 2025 · 15 Reads

Kaustubh Vyas · Damien Graux · Sébastien Montella · [...] · Jeff Z. Pan

Recent advances have shown large language models (LLMs) to be proficient in code generation and chain-of-thought reasoning, laying the groundwork for tackling automatic formal planning tasks. This study evaluates the potential of LLMs to understand and generate Planning Domain Definition Language (PDDL), an essential representation in artificial intelligence planning. We conduct an extensive analysis across 20 distinct models spanning 7 major LLM families, both commercial and open-source. Our comprehensive evaluation sheds light on the zero-shot capabilities of LLMs for parsing, generating, and reasoning with PDDL. Our findings indicate that while some models demonstrate notable effectiveness in handling PDDL, others show limitations in more complex scenarios requiring nuanced planning knowledge. These results highlight the promise and current limitations of LLMs in formal planning tasks, offering insights into their application and guiding future efforts in AI-driven planning paradigms.
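The parsing track of such an evaluation can be pictured with a lightweight well-formedness check on model output. The sketch below is a minimal illustration, not the paper's evaluation harness: `query_llm` is a hypothetical stand-in for whichever model API is under test, and its canned response only serves to make the example self-contained and runnable.

```python
# Minimal sketch of a zero-shot PDDL-generation check. `query_llm` is a
# hypothetical stand-in for the model API being benchmarked.

def query_llm(prompt: str) -> str:
    # Hypothetical: replace with a real call to the LLM under evaluation.
    return """(define (domain demo)
  (:predicates (at ?x ?loc) (connected ?a ?b))
  (:action move
    :parameters (?x ?from ?to)
    :precondition (and (at ?x ?from) (connected ?from ?to))
    :effect (and (not (at ?x ?from)) (at ?x ?to))))"""

def looks_like_pddl_domain(text: str) -> bool:
    """Cheap structural check: balanced parentheses plus required keywords."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing paren without a matching opener
                return False
    required = ("(define", "(:action", ":precondition", ":effect")
    return depth == 0 and all(keyword in text for keyword in required)

prompt = ("Write a PDDL domain with a single `move` action over predicates "
          "(at ?x ?loc) and (connected ?a ?b). Output PDDL only.")
print("well-formed:", looks_like_pddl_domain(query_llm(prompt)))
```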


Figure 4: Progressive accumulation of input and output LLM tokens across agent iterations on MuSiQue. Token counts at each iteration build upon previous ones, resulting in monotonically increasing counts that demonstrate the cumulative nature of multi-step approaches. Tokenisation is performed using OpenAI's open-source 'tiktoken' tool. The Hybrid + SyncGE method appears only in Iteration 1 as it is a single-step approach.
Figure 5: Analysis of the relationship between the number of hops in questions and the required number of agent iterations on the MuSiQue dataset. For each hop count, we analyse the number of iterations required by GeAR to determine question answerability. The maximum iteration limit was set to 4, with "4+" indicating cases where the agent could not determine answerability within this limit. The visualisation presents two complementary perspectives on the same data: the left panel shows a box plot emphasising the median and distribution of stopping iterations, while the right panel focuses on the mean number of iterations across different hop counts.
Example of a query from MuSiQue that is not answerable solely based on the provided gold passages.
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation

December 2024 · 55 Reads

Retrieval-augmented generation systems rely on effective document retrieval capabilities. By design, conventional sparse or dense retrievers face challenges in multi-hop retrieval scenarios. In this paper, we present GeAR, which advances RAG performance through two key innovations: (i) graph expansion, which enhances any conventional base retriever, such as BM25, and (ii) an agent framework that incorporates graph expansion. Our evaluation demonstrates GeAR's superior retrieval performance on three multi-hop question answering datasets. Additionally, our system achieves state-of-the-art results with improvements exceeding 10% on the challenging MuSiQue dataset, while requiring fewer tokens and iterations compared to other multi-step retrieval systems.
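To make the graph-expansion idea concrete, here is a toy sketch, not the GeAR implementation: a BM25 base retriever (via the open-source `rank_bm25` package) whose results are expanded with neighbours from an invented document-proximity graph.

```python
# Toy sketch of graph expansion over a BM25 base retriever (not the GeAR
# implementation). Assumes the `rank_bm25` package; the corpus and the
# document-proximity graph below are invented for illustration.
from rank_bm25 import BM25Okapi

docs = {
    "d1": "marie curie won the nobel prize in physics",
    "d2": "pierre curie was married to marie curie",
    "d3": "the nobel prize is awarded in stockholm",
}
links = {"d1": ["d2", "d3"], "d2": ["d1"], "d3": ["d1"]}  # shared entities

corpus_ids = list(docs)
bm25 = BM25Okapi([docs[d].split() for d in corpus_ids])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Plain BM25 base retrieval: top-k documents for the query."""
    scores = bm25.get_scores(query.split())
    ranked = sorted(zip(corpus_ids, scores), key=lambda pair: -pair[1])
    return [doc_id for doc_id, _ in ranked[:k]]

def graph_expand(seed: list[str]) -> list[str]:
    """Expand the seed set with its graph neighbours (order-preserving)."""
    expanded = list(seed)
    for doc_id in seed:
        for neighbour in links.get(doc_id, []):
            if neighbour not in expanded:
                expanded.append(neighbour)
    return expanded

seed = retrieve("who was married to the nobel laureate marie curie")
print(graph_expand(seed))  # BM25 hits first, then their graph neighbours
```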


From An LLM Swarm To A PDDL-Empowered HIVE: Planning Self-Executed Instructions In A Multi-Modal Jungle

December 2024 · 11 Reads

In response to the call for agent-based solutions that leverage the ever-increasing capabilities of the deep models' ecosystem, we introduce Hive -- a comprehensive solution for selecting appropriate models and subsequently planning a set of atomic actions to satisfy the end-users' instructions. Hive operates over sets of models and, upon receiving natural language instructions (i.e., user queries), schedules and executes explainable plans of atomic actions. These actions can involve one or more of the available models to achieve the overall task, while respecting end-users' specific constraints. Notably, Hive handles tasks that involve multi-modal inputs and outputs, enabling it to handle complex, real-world queries. Our system is capable of planning complex chains of actions while guaranteeing explainability, using an LLM-based formal logic backbone empowered by PDDL operations. We introduce the MuSE benchmark to offer a comprehensive evaluation of the multi-modal capabilities of agent systems. Our findings show that our framework redefines the state of the art for task selection, outperforming competing systems that plan operations across multiple models, while offering transparency guarantees and fully adhering to user constraints.
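As a rough illustration of the "explainable plan of atomic actions" idea, the following sketch executes a plan step by step over a registry of models while enforcing user constraints and printing a trace. Everything here (action names, the registry, the constraint check) is hypothetical; the actual Hive planner is PDDL-based and far richer.

```python
# Illustrative plan executor (hypothetical names, not the Hive system):
# each atomic action is bound to a registered model, and execution is
# traced for explainability and checked against user constraints.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    name: str                    # atomic action, e.g. one PDDL plan step
    model: str                   # which registered model carries it out
    run: Callable[[str], str]    # the model invocation itself

MODEL_REGISTRY = {
    "captioner": lambda x: f"caption({x})",
    "translator": lambda x: f"translate({x})",
}

def execute_plan(plan: list[Action], user_input: str, allowed: set[str]) -> str:
    """Run actions in order; refuse models outside the user's constraints."""
    state = user_input
    for step in plan:
        if step.model not in allowed:
            raise ValueError(f"constraint violated: {step.model} not allowed")
        state = step.run(state)
        print(f"[trace] {step.name} via {step.model} -> {state}")
    return state

plan = [
    Action("describe_image", "captioner", MODEL_REGISTRY["captioner"]),
    Action("translate_caption", "translator", MODEL_REGISTRY["translator"]),
]
print(execute_plan(plan, "photo.jpg", allowed={"captioner", "translator"}))
```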




Figure 2: Extending machine learning methods with reproducibility steps.
Replication results.
Reproduce, Replicate, Reevaluate. The Long but Safe Way to Extend Machine Learning Methods

March 2024 · 37 Reads

Proceedings of the AAAI Conference on Artificial Intelligence

Reproducibility is a desirable property of scientific research. On the one hand, it increases confidence in results. On the other hand, reproducible results can be extended on a solid basis. In rapidly developing fields such as machine learning, the latter is particularly important to ensure the reliability of research. In this paper, we present a systematic approach to reproducing (using the available implementation), replicating (using an alternative implementation) and reevaluating (using different datasets) state-of-the-art experiments. This approach enables the early detection and correction of deficiencies and thus the development of more robust and transparent machine learning methods. We detail the independent reproduction, replication, and reevaluation of the initially published experiments with a method that we want to extend. For each step, we identify issues and draw lessons learned. We further discuss solutions that have proven effective in overcoming the encountered problems. This work can serve as a guide for further reproducibility studies and generally improve reproducibility in machine learning.
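The paper's contribution is methodological, but in practice the "reproduce" step typically starts by pinning every source of randomness. A generic sketch of that step, assuming a Python/NumPy (and optionally PyTorch) stack, not taken from the paper:

```python
# Generic reproducibility preamble: pin all common sources of randomness
# (a standard practice sketch, not a recipe from the paper).
import os
import random

import numpy as np

SEED = 42
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)

try:
    import torch  # only relevant if the experiment uses PyTorch
    torch.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
except ImportError:
    pass
```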


Auto-generation of Blockchain-Based Distributed Applications Using Ontologies

March 2024 · 12 Reads · 2 Citations

Blockchain has been promoted as a solution to business issues within every major domain, from supply chains to financial institutions to the healthcare industry. This is marked as the transition to Blockchain 2.0. However, this mass migration of industries cannot yet become a reality due to the limitations in standards and expertise of smart contracts within the various domains and concerns about the legal validity of smart contracts. It is therefore necessary to standardize the concepts of smart contracts within blockchain frameworks in relation to legal agreements and to provide a direct mapping of agreements to code. This would allow for the standardization and reuse of smart contracts across domains and make them legally enforceable. We target the R3 Corda blockchain framework and propose a novel ontology, CordaO, that can be used to model Corda smart contracts (CorDapps). We also develop a tool, CordaOntoG, that auto-generates the relevant state, contract, and flow code in Java, which can be deployed and run on a Corda network. The ontology and code generator are then evaluated with elementary domain-specific agreements such as clinical trial patient registration, car rental, and invoices.
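The ontology-to-code mapping can be pictured as follows. This is a toy illustration in Python with `rdflib`, not the actual CordaOntoG tool; the `cordao#` namespace, the ContractState/hasField terms, and the inline ontology are all hypothetical.

```python
# Toy illustration of ontology-driven code generation in the spirit of
# CordaOntoG (not the actual tool). Namespace and terms are hypothetical.
from rdflib import Graph, Namespace, RDF

CORDA = Namespace("http://example.org/cordao#")  # hypothetical namespace

TTL = """
@prefix c: <http://example.org/cordao#> .
c:IOUState a c:ContractState ;
    c:hasField c:lender, c:borrower, c:amount .
"""
g = Graph().parse(data=TTL, format="turtle")

# Emit a Java state-class skeleton for each modelled contract state.
for state in g.subjects(RDF.type, CORDA.ContractState):
    name = state.split("#")[-1]
    fields = sorted(str(f).split("#")[-1]
                    for f in g.objects(state, CORDA.hasField))
    lines = [f"public class {name} implements ContractState {{"]
    lines += [f"    private final String {field};" for field in fields]
    lines.append("}")
    print("\n".join(lines))
```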


Large Language Models and Knowledge Graphs: Opportunities and Challenges

August 2023 · 757 Reads · 4 Citations

Large Language Models (LLMs) have taken Knowledge Representation -- and the world -- by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges.



Efficient Semantic Summary Graphs for Querying Large Knowledge Graphs

May 2022 · 20 Reads

Knowledge Graphs (KGs) integrate heterogeneous data, but one challenge is the development of efficient tools allowing end users to extract useful insights from these sources of knowledge. In such a context, reducing the size of a Resource Description Framework (RDF) graph while preserving all information can speed up query engines by limiting data shuffle, especially in a distributed setting. This paper presents two algorithms for RDF graph summarization: Grouping-Based Summarization (GBS) and Query-Based Summarization (QBS), the latter being an optimized and lossless version of the former. We empirically study the effectiveness of the proposed lossless RDF graph summarization for retrieving complete data, by rewriting SPARQL queries with fewer triple patterns using a semantic similarity. We conduct our experimental study on four datasets of different sizes. Compared with the state-of-the-art query engine Sparklify executed over the original RDF graphs as a baseline, QBS reduces query execution time by up to 80%, and the summarized RDF graph is up to 99% smaller than the original.
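The grouping intuition behind GBS can be illustrated in a few lines of `rdflib`: subjects sharing the same predicate signature collapse into a single summary node. This is a toy sketch of the general idea, not the paper's algorithm; the graph below is invented.

```python
# Toy sketch of grouping-based RDF summarization (not the paper's
# algorithm): subjects with identical predicate signatures are merged.
from collections import defaultdict
from rdflib import Graph

TTL = """
@prefix ex: <http://example.org/> .
ex:alice ex:name "Alice" ; ex:knows ex:bob .
ex:bob   ex:name "Bob"   ; ex:knows ex:alice .
ex:acme  ex:name "ACME"  ; ex:located ex:dublin .
"""
g = Graph().parse(data=TTL, format="turtle")

groups = defaultdict(list)
for subject in set(g.subjects()):
    signature = frozenset(g.predicates(subject=subject))
    groups[signature].append(subject)

for signature, members in groups.items():
    predicates = sorted(uri.split("/")[-1] for uri in signature)
    print(f"summary node {predicates}: {len(members)} subject(s)")
```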


Citations (35)


... However, LLMs have superseded those frameworks to stand as planners on their own (i.e. the LLM-as-planner paradigm). Multiple prompt engineering techniques (Graux et al., 2024) were designed to leverage in-context learning, aiming to directly generate multi-step problem solutions. More specifically, Chain-of-Thought has revealed the promising reasoning capabilities of LLMs, and new techniques were therefore fashioned, such as the self-consistency decoding strategy, Tree-of-Thought (Yao et al., 2023a), Program-of-Thought, and Graph-of-Thought (Yao et al., 2023c; Besta et al., 2024). ...

Reference:

From An LLM Swarm To A PDDL-Empowered HIVE: Planning Self-Executed Instructions In A Multi-Modal Jungle
[PromptEng] First International Workshop on Prompt Engineering for Pre-Trained Language Models
  • Citing Conference Paper
  • May 2024

... For our training data, we use knowledge graph verbalizations from the field of knowledge graph-to-text approaches [1]. Therefore, of the dimensions described in survey papers on combining LMs with existing knowledge bases [20,21], AutoRAG uses all three: training from knowledge base data, using knowledge base entities for prompt construction, and augmenting generated output. Our prompt tuning based on concept embedding sequences is an approach both for controllable text generation [39] and context compression [23]. ...

Large Language Models and Knowledge Graphs: Opportunities and Challenges

... This performance increase has gone hand-in-hand with an explosion in popularity. The global market for Automatic Speech Recognition technology has grown to ten times its size in just 8 years, and this growth is accelerating [1,2]. As ASR improves in reliability, this opens the door to more advanced use cases that were previously out of reach for early voice-recognition technology. ...

Multi Platform-Based Hate Speech Detection
  • Citing Conference Paper
  • January 2023

... This method was novel because it used the best-fit occurrence estimate methodology to generate query facets from domain ontologies with query indicator words, allowing for dynamic query extension. In their work, Niazmand et al. [31] introduced the grouping-based summarization (GBS) and query-based summarization (QBS) methods for summarizing RDF graphs. The second technique was an improved lossless variant of the first. ...

Efficient semantic summary graphs for querying large knowledge graphs
  • Citing Article
  • May 2022

International Journal of Information Management Data Insights

... Parvin et al. (2021) state that usability is a quality criterion that determines how easy user interfaces are to use. On the other hand, Graux & Orlandi (2022) mention that the World Wide Web Consortium (W3C) defines accessibility as ensuring that products and services are usable. The continuous evaluation of usability and accessibility is a crucial component in the design and development of user interfaces. This process makes it possible to determine how end users interact with the system and whether they can achieve their goals effectively, efficiently, and satisfactorily. ...

Through the Lens of the Web Conference Series: A Look Into the History of the Web
  • Citing Conference Paper
  • April 2022

... HRAN (Li, Liu, et al., 2021) takes full advantage of the heterogeneity of KGs when aggregating neighbour information, but ignores the overall structure. Unlike many models that focus only on information about 1-hop neighbourhoods, EIGAT (Zhao et al., 2022) and GFA-NN (Sadeghi et al., 2021) take into account the global characteristics of the entities, but are still unable to perform cross-relational information transfer. In recent years, most emerging KGE models have utilized complex neural structures, such as tensor networks, graph convolutional networks, and transformers, to learn richer representations. ...

Embedding Knowledge Graphs Attentive to Positional and Centrality Qualities
  • Citing Book
  • September 2021

... Those metadata are necessary to describe a KB in a catalog as they constitute the core of its catalog entry. They can also be useful for result ranking in query federation systems that annotate results with their sources, such as Corese [5] and BioFed [14], to apply trust-based policies [23] or to privilege the freshness of a source. ...

Beyond Classical SERVICE Clause in Federated SPARQL Queries: Leveraging the Full Potential of URI Parameters

... Centralized systems can face challenges related to scalability, data sovereignty, and vulnerabilities associated with single points of failure. However, the digital horizon is changing, with decentralized technologies, especially those foundational to Web 3.0 and blockchain, heralding a new paradigm in spatial data storage (Mahmoodi, 2021). ...

A Fully Decentralized Triplestore Managed via the Ethereum Blockchain

... Big Data Analytics, hence, refers to the strategy of analysing large volumes of data gathered from a wide variety of sources, including social networks, transaction records, videos, digital images, and different kinds of sensors. In an attempt to support the European data economy policy [1], our consortium proposed a training approach [2] and established the infrastructure for collaborative work of teachers/trainers with PhD students and other interested parties such as industries. ...

Deploying a Strategy to Unlock Big Data Research and Teaching Activities in the West Balkan Region

... Structure. Definitions of graph structure in the literature typically refer to measures of how nodes and edges are connected, such as measures of node degree [9,22,25,33,45,58,70,71,73,74,98,110] or the frequency of an edge [9,45,58,70,71,73]. While there is no single universal definition of graph structure, this work follows the examples set in existing literature and defines structure as follows: the structure of a graph is a quantitative description of the local connectivity of individual nodes and edges in a graph, such as node degree or edge frequency. ...

Embedding Knowledge Graphs Attentive to Positional and Centrality Qualities