Pierre Larmande

  • PhD, data integration and bioinformatics
  • Researcher at Institute of Research for Development

About

114
Publications
24,027
Reads
1,954
Citations
Introduction
I am a senior scientist at IRD (http://www.ird.fr). I am also a staff member of the South Green Platform (http://www.southgreen.fr/). My main research interests are (plant) ontologies, semantic annotation, data integration, the Semantic Web, metadata, knowledge management, and genomics. More info at https://sites.google.com/site/larmandepierre
Current institution
Institute of Research for Development
Current position
  • Researcher
Additional affiliations
September 2020 - November 2020
Institute of Research for Development
Position
  • Researcher
November 2018 - August 2020
Institute of Research for Development
Position
  • Researcher
September 2016 - October 2018
Institute of Research for Development
Position
  • Engineer

Publications (114)
Article
Motivation: Pre-trained Language Models (PLMs) have achieved remarkable performance across various natural language processing tasks. However, they encounter challenges in biomedical Named Entity Recognition (NER), such as high computational costs and the need for complex fine-tuning. These limitations hinder the efficient recognition of biological...
Preprint
Full-text available
Motivation: Pre-trained Language Models (PLMs) have achieved remarkable performance across various natural language processing tasks. However, they encounter challenges in biomedical Named Entity Recognition (NER), such as high computational costs and the need for complex fine-tuning. These limitations hinder the efficient recognition of biological...
Article
Full-text available
Jasmonate is an essential phytohormone involved in plant development and stress responses. Its perception occurs through the CORONATINE INSENSITIVE (COI) nuclear receptor, which targets the Jasmonate-ZIM domain (JAZ) repressors for degradation by the 26S proteasome. Consequently, repressed transcription factors are released and expression of ja...
Article
Full-text available
Background: As the number of genome-wide association study (GWAS) and quantitative trait locus (QTL) mappings in rice continues to grow, so does the already long list of genomic loci associated with important agronomic traits. Typically, loci implicated by GWAS/QTL analysis contain tens to thousands of single-nucleotide polymorphisms (S...
Conference Paper
Full-text available
DLinker is a system for matching instances of two RDF data sources. Its performance is mainly based on the deep comparison of literals. The main comparison algorithm is based on the search for the longest common subsequence (LCS) present in the literals. The validation of the similarity between two literals is performed by a mathematical formula. T...
Chapter
Understanding genotype–phenotype relationships is one of the most important areas of research in agronomy. The new challenges aim at understanding these relationships on the level of the different molecular entities responsible for the expression of complex phenotypic traits. Recent advances in high-throughput technologies have resulted in tremendo...
Chapter
Recent advances in high-throughput technologies have resulted in tremendous increase in the amount of data in the agronomic domain. There is an urgent need to effectively integrate complementary information to understand the biological system in its entirety. We have developed AgroLD, a knowledge graph that exploits the Semantic Web technology and...
Chapter
Next-generation sequencing technologies have enabled high-density genotyping for large numbers of samples. Nowadays, SNP calling pipelines produce up to millions of such markers, which need to be filtered in various ways according to the type of analysis. One of the main challenges still lies in the management of an increasing volume of genotyping fi...
Chapter
Recent advances in sequencing technologies and high-throughput phenotyping have revolutionized analysis in the plant sciences. However, there is an urgent need to effectively integrate and assimilate complementary information to understand the biological system in its entirety. We have developed AgroLD, a knowledge graph that explo...
Article
Full-text available
Due to the rapid evolution of high-throughput technologies, a tremendous amount of data is being produced in the biological domain, which poses a challenging task for information extraction and natural language understanding. Biological named entity recognition (NER) and named entity normalisation (NEN) are two common tasks aiming at identifying an...
Article
Full-text available
Since its emergence in China, the COVID-19 pandemic has spread rapidly around the world. Faced with this unknown disease, public health authorities were forced to experiment, in a short period of time, with various combinations of interventions at different scales. However, as the pandemic progresses, there is an urgent need for tools and methodolo...
Article
Currently, gene information available for Oryza sativa species is located in various online heterogeneous data sources. Moreover, methods of access are also diverse, mostly web-based and sometimes query APIs, which might not always be straightforward for domain experts. The challenge is to collect information quickly from these applications and com...
Article
Full-text available
In semantic annotation, semantic concepts are linked to natural language. Semantic annotation helps in boosting the ability to search and access resources and can be used in information retrieval systems to augment the queries from the user. In the research described in this paper, we aimed to identify ontological concepts in scientific text contai...
Technical Report
Full-text available
We describe in this document the COMOKIT model using the standard O.D.D. protocol in its first review version.
Preprint
Full-text available
Currently, gene information available for Oryza sativa species is located in various online heterogeneous data sources. Moreover, methods of access are also diverse, mostly web-based and sometimes query APIs, which might not always be straightforward for domain experts. The challenge is to collect information quickly from these applications and com...
Preprint
Full-text available
Semantic annotation is the process in which semantic concepts are linked to natural language. It helps to improve the search for and access to resources and can be used in information retrieval systems to augment the queries from the user. In this paper, we are interested in identifying ontological concepts in scientific text contained in spreadsheet...
Preprint
Full-text available
Candidate gene prioritization makes it possible to rank, among a large number of genes, those that are strongly associated with a phenotype or a disease. Because of the large amount of data that needs to be integrated and analysed, gene-to-phenotype association is still a challenging task. In this paper, we evaluated a knowledge graph approach combined with emb...
Article
Full-text available
Motivation: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Making use of this information effectively requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set of markers across samples with a rapid...
Article
Full-text available
Text mining has become an important research method in biology, with its original purpose to extract biological entities, such as genes, proteins and phenotypic traits, to extend knowledge from scientific papers. However, few thorough studies on text mining and application development, for plant molecular biology data, have been performed, especial...
Article
Full-text available
Background: The study of genetic variations is the basis of many research domains in biology. From genome structure to population dynamics, many applications involve the use of genetic variants. The advent of next-generation sequencing technologies led to such a flood of data that the daily work of scientists is often more focused on data management...
Article
Full-text available
Background: Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are...
Article
Motivation: Modern genomic breeding methods rely heavily on very large amounts of phenotyping and genotyping data, presenting new challenges in effective data management and integration. Recently, the size and complexity of datasets have increased significantly, with the result that data is often stored on multiple systems. As analyses of interest i...
Preprint
Full-text available
Motivation: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs [22]. Making use of this information effectively requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set of markers across samples with a...
Article
Understanding genotype-phenotype relationships is one of the most important areas of research in agronomy. However, genotype-phenotype interactions are difficult to identify because they are expressed at different molecular scales within the plant and are strongly influenced by environmental factors. The technologies for...
Article
Full-text available
Recent advances in high-throughput technologies have resulted in a tremendous increase in the amount of omics data produced in plant science. This increase, in conjunction with the heterogeneity and variability of the data, presents a major challenge to adopt an integrative research approach. We are facing an urgent need to effectively integrate an...
Data
Report of the online survey. Report of 3 sessions evaluating the AgroLD user interfaces. (PDF)
Data
Examples of SPARQL queries. Example of SPARQL queries showing the benefits of property path algorithm, and complex queries. (PDF)
Data
AgroLD user guide. This document shows how to use the various features of the platform. (PDF)
Thesis
Full-text available
Developing a semantic annotation framework to annotate semantically some datasets using several ontologies: The Agronomic Linked Data (AgroLD) project
Presentation
Full-text available
Develop a semantic annotation framework to annotate semantically some datasets using several ontologies, using Python Flask, RDFLib, NLTK and Elasticsearch
Chapter
Text mining research is becoming an important topic in biology, with the aim of extracting biological entities from scientific papers in order to extend biological knowledge. However, few thorough studies have been carried out for plant molecular biology data, especially rice, resulting in a lack of datasets available to exploit advanced machine learning...
Article
Full-text available
The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are pr...
Preprint
Full-text available
Text mining research is becoming an important topic in biology, with the aim of extracting biological entities from scientific papers in order to extend biological knowledge. However, few thorough studies on text mining and applications have been carried out for plant molecular biology data, especially rice, resulting in a lack of datasets available to t...
Article
African rice (Oryza glaberrima) was domesticated independently from Asian rice. The geographical origin of its domestication remains elusive. Using 246 new whole-genome sequences, we inferred the cradle of its domestication to be in the Inner Niger Delta. Domestication was preceded by a sharp decline of most wild populations that started more than...
Preprint
Full-text available
Background: Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties and elite breeding materials are a...
Preprint
Full-text available
Recent advances in high-throughput technologies have resulted in a tremendous increase in the amount of omics data produced in plant science. This increase, in conjunction with the heterogeneity and variability of the data, presents a major challenge to adopt an integrative research approach. We are facing an urgent need to effectively integrate an...
Article
Improving productivity of the staple crops wheat and rice is essential to feed the growing global population, particularly in the context of a changing climate. However, current rates of yield gain are insufficient to support the predicted population growth. New approaches are required to accelerate the breeding process, and many of these are drive...
Article
Full-text available
In this article, we present a joint effort of the wheat research community, along with data and ontology experts, to develop wheat data interoperability guidelines. Interoperability is the ability of two or more systems and devices to cooperate and exchange data, and interpret that shared information. Interoperability is a growing concern to the wh...
Article
Full-text available
Many vocabularies and ontologies are produced to represent and annotate agronomic data. However, those ontologies are spread out, in different formats, of different sizes, with different structures and from overlapping domains. Therefore, there is a need for a common platform to receive and host them, align them, and enable their use in agro-informa...
Article
Full-text available
In this article, we present a joint effort of the wheat research community, along with data and ontology experts, to develop wheat data interoperability guidelines. Interoperability is the ability of two or more systems and devices to cooperate and exchange data, and interpret that shared information. Interoperability is a growing concern to the wh...
Article
With the development of new experimental technologies, biologists are faced with an avalanche of data to be computationally analyzed for scientific advancements and discoveries to emerge. Faced with the complexity of analysis pipelines, the large number of computational tools, and the enormous amount of data to manage, there is compelling evidence...
Article
Full-text available
The South Green Web portal (http://www.southgreen.fr/) provides access to a large panel of public databases, analytical workflows and bioinformatics resources dedicated to the genomics of tropical and Mediterranean crops. The portal currently contains about 20 information systems and tools and targets a broad range of crops...
Conference Paper
Full-text available
The Agronomic Linked Data project (AgroLD) is a Semantic Web knowledge base designed to integrate data from various publicly available plant-centric data sources. The aim of the AgroLD project is to provide a portal through which bioinformaticians and domain experts can exploit the homogenized data to bridge knowledge. Here we present new too...
Article
Full-text available
Background: Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly...
Article
Full-text available
Background: Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly...
Conference Paper
Full-text available
Many vocabularies and ontologies are produced to represent and annotate agronomic data. By reusing the NCBO BioPortal technology, we have already designed and implemented an advanced prototype ontology repository for the agronomy domain. We plan to turn that prototype into a real service to the community. The AgroPortal project aims at reusing the...
Poster
Full-text available
In the coming years, the study of the interaction between the epigenome and the mobilome is likely to give insights on the role of TEs on genome stability and evolution. In the present project we have created tools to collect epigenetic datasets from different laboratories and databases and translate them to a standard format to be integrated, anal...
Conference Paper
Full-text available
In recent years, the data deluge in many areas of scientific research has brought challenges in the treatment and improvement of agricultural data. Research in the bioinformatics field is no exception to this trend. This paper presents some approaches aiming to solve the Big Data problem by combining the increase in semantic search capacity on existing da...
Poster
Full-text available
In the context of agronomic resources, there are currently neither reference data nor domain-specific questions for evaluating interfaces equipped with question-answering systems (QAS). To this end, we built a gold standard consisting of a set of reference questions formulated in lan...
Conference Paper
Full-text available
Agronomy is an overarching field comprising various research areas such as genetics, plant molecular biology, ecology and earth science. The last several decades have seen the successful development of high-throughput technologies that have revolutionised and transformed agronomic research. The application of these technologies has generated larg...
Article
Full-text available
In recent years, the data deluge in many areas of scientific research has brought challenges in the treatment and improvement of agricultural data. Research in the bioinformatics field is no exception to this trend. This paper presents some approaches aiming to solve the Big Data problem by combining the increase in semantic search capacity on existing da...
Article
Full-text available
Similarly to what happens in biomedicine, communities engaged in agronomic research need to access specific sets of ontologies for data annotation and integration. For instance, it has been established that the scientific challenges in plant breeding have switched from genetics to phenotyping and that standard traits/phenotypes vocabularies are nec...
Conference Paper
Full-text available
Advancements in empirical technologies have generated vast amounts of heterogeneous data. This situation has created a need to integrate the data to understand the system of interest in its entirety. Therefore, information systems play a crucial role in managing these data, enabling biologists to extract new knowledge. The plant bi...
Conference Paper
Full-text available
Today, the revolution in empirical technologies has generated vast amounts of data. This data deluge has created an urgent need to assimilate it with a panoramic view. To this end, information systems play a central role in managing and integrating these data, aiding the biologists in exploiting this integrated information for the extraction of new...
Conference Paper
Full-text available
The objective of plant phenotyping is to advance plant science for breeding and crop management. Phenotyping platforms automate the measurements of traits from the cell to the whole plant by using novel sensors and methods in both controlled environment and in the field. Big data are produced in the form of alphanumeric matrices, images, statistica...
Article
Full-text available
Our project is to develop and support a reference ontology repository for the agronomic domain. By reusing the NCBO BioPortal technology, we have already designed and implemented a prototype ontology repository for plants and a few crops. We plan to turn that prototype into a real service to the community. The AgroPortal project aims at reusing the...
Conference Paper
Full-text available
The data deluge produced by Next Generation Sequencing (NGS) raises serious computational challenges in terms of storage, search, sharing, analysis, and data visualization that redefine some practices in data management. Gigwa was mainly developed to manage genomic and genotyping data from NGS analysis.
Conference Paper
Full-text available
This application was developed in the context of studies of genetic and phenotypic diversity in Asian and African rice (Oryza sativa and Oryza glaberrima). The objective of these studies is to identify genes of interest through association genetics approaches in order to understand biological processes related to plant development and plasticity or...
Article
Full-text available
Today, the revolution in empirical technologies has generated vast amounts of data. This data deluge has created an urgent need to assimilate it with a panoramic view. To this end, information systems play a central role in managing and integrating these data, aiding the biologists in exploiting this integrated information for the extraction of new...
Conference Paper
In recent years, a large amount of "-omics" data has been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a...
Data
Full-text available
NCBO Annotator: Ontology-based annotation workflow. Customized IBC Annotator for database schemas.
  • First, direct annotations are created by recognizing concepts in raw text.
  • Second, annotations are semantically expanded using knowledge of the ontologies.
  • Third, all annotations are aggregated and scored according to the context in which they hav...
Data
Full-text available
Plant genetic and genomic data are accessible through several databases around the world. Data integration is necessary to allow biologists to access all available data. Combined use of Web Services and Semantic Web standards will make it possible to overcome the heterogeneity of data sources and to automate data integration.
