Maguelonne Teisseire

  • Research Director at French National Institute for Agriculture, Food, and Environment (INRAE)

About

372
Publications
41,548
Reads
2,887
Citations
Current institution
French National Institute for Agriculture, Food, and Environment (INRAE)
Current position
  • Research Director


Publications (372)
Conference Paper
Full-text available
In spatial dynamics modeling, expert knowledge is used to define state rules or state-change rules. This knowledge contributes to the consistency of the models but is tedious to obtain. This work proposes an approach for automatically extracting knowledge from textual data, ...
Article
Full-text available
Land artificialization is a significant modern concern, as it is irreversible, diminishes agriculturally suitable land and causes environmental problems. Our project, Hérelles, aims to address this challenge by developing a framework for land artificialization management. In this framework, we associate urban planning rules in text form with cluste...
Preprint
Full-text available
To address the current crises (climatic, social, economic), self-sufficiency, a set of practices that combine energy sobriety, self-production of food and energy, and self-construction, is attracting increasing interest. The CNRS STAY project (Savoirs Techniques pour l'Auto-suffisance, sur YouTube) explores this topic by analyzing techniques sh...
Article
The European Water Framework Directive (2000) sets the goal of achieving good ecological status in all water bodies in the short and medium term. This good status is established through biological indices based on aquatic living organisms. Hydrological conditions are one of the important physical characteristics of these ecosystems, all the more so ...
Preprint
Full-text available
Language models now constitute essential tools for improving efficiency for many professional tasks such as writing, coding, or learning. For this reason, it is imperative to identify inherent biases. In the field of Natural Language Processing, five sources of bias are well-identified: data, annotation, representation, models, and research design....
Conference Paper
Full-text available
Spatial information in text makes it possible to understand the geographical context and relationships within text for better decision-making across various domains such as disease surveillance, disaster management and other location-based services. Therefore, it is crucial to understand the precise geographical context for location-sensitive applications....
Article
Full-text available
Food Security (FS) is a major concern in West Africa, particularly in Burkina Faso, which has been the epicenter of a humanitarian crisis since the beginning of this century. Early warning systems for FS and famines rely mainly on numerical data for their analyses, whereas textual data, which are more complex to process, are rarely used. However, t...
Article
Spatial information extraction from textual documents and its accurate geo-referencing are important steps in epidemiology, with many applications such as outbreak detection and disease surveillance and control. However, inaccurate extraction of such geospatial information will result in inaccurate location identification, which in consequence...
Article
Full-text available
Land artificialization is a serious problem for civilization. Urban planning and natural risk management aim to mitigate it. In France, these practices rely on Local Land Plans (PLU – Plan Local d’Urbanisme) and Natural Risk Prevention Plans (PPRn – Plan de Prévention des Risques naturels), which contain land use rules. To facilitate automa...
Article
Full-text available
Crises such as natural disasters and public health emergencies generate vast amounts of text data, making it challenging to classify the information into relevant categories. Acquiring expert-labeled data for such scenarios can be difficult, leading to limited training datasets for text classification by fine-tuning BERT-like models. Unfortunately,...
Article
Full-text available
“Can you tell me where San Jose is located?” “Uh! Do you know that there are more than 1700 locations named San Jose in the world?” The official name of a location is often not the name with which we are familiar. Spatial named entity (SNE) disambiguation is the process of identifying and assigning precise coordinates to a place name that can be id...
Conference Paper
Full-text available
Digital news sources are the primary source of information for health officials and stakeholders to stay informed about potential health risks. However, with the abundance of news sources available, it can be challenging to distinguish relevant news articles from irrelevant ones. To address this issue, we propose a metadata-based approach for class...
Preprint
In the context of Epidemic Intelligence, many Event-Based Surveillance (EBS) systems have been proposed in the literature to promote the early identification and characterization of potential health threats from online sources of any nature. Each EBS system has its own surveillance definitions and priorities, which makes the task of select...
Article
Full-text available
Background: The timely and accurate identification of food insecurity situations represents a challenging issue. Household surveys are routinely used in low-income countries and are an essential tool for obtaining key food security indicators that are used by decision makers to determine the targets of food security interventions. Methodology: This...
Article
Full-text available
This paper presents an annotated dataset used in the MOOD Antimicrobial Resistance (AMR) hackathon, hosted in Montpellier, June 2022. The collected data concerns unstructured data from news items, scientific publications and national or international reports, collected from four event-based surveillance (EBS) Systems, i.e. ProMED, PADI-web, HealthM...
Article
Full-text available
In the context of Epidemic Intelligence, many Event-Based Surveillance (EBS) systems have been proposed in the literature to promote the early identification and characterization of potential health threats from online sources of any nature. Each EBS system has its own surveillance definitions and priorities, which makes the task of select...
Chapter
Food security is a major concern in West Africa, particularly in Burkina Faso, which has been the epicenter of a humanitarian crisis since the beginning of this century. Early warning systems for food insecurity and famines rely mainly on numerical data for their analyses, whereas textual data, which are more complex to process, are rarely used. To...
Article
Full-text available
A variety of remote sensing applications call for automatic optical classification of satellite images. Recently, satellite missions, such as Sentinel-2, allow us to capture images in real-time of the Earth’s scenario. The classification of this large amount of data requires increasingly precise and fast methods, which must take into account not on...
Article
Full-text available
Spatial information has gained more attention in natural language processing tasks in different interdisciplinary domains. Moreover, spatial information is available in two forms: Absolute Spatial Information (ASI), e.g., Paris, London, and Germany, and Relative Spatial Information (RSI), e.g., south of Paris, north of Madrid and 80 km from Rome. The...
Article
Soil surface characteristics (SSCs) are of high importance for water infiltration processes in crop fields. As SSCs present strong spatiotemporal variability influenced by climatic conditions and agricultural practices, their monitoring has already been explored by using UAV images and multispectral remote sensing. However, each technique has encounte...
Article
Full-text available
This dataset is composed of spatial (e.g. location) and thematic (e.g. diseases, symptoms, virus) entities concerning avian influenza in social media (textual) data in English. It was created from three corpora: the first one includes 10 transcriptions of YouTube videos and 70 tweets manually annotated. The second corpus is composed of the same tex...
Chapter
Full-text available
Online news sources are popular resources for learning about current health situations and developing event-based surveillance (EBS) systems. However, having access to diverse information originating from multiple sources can misinform stakeholders, eventually leading to false health risks. The existing literature contains several techniques for pe...
Article
Full-text available
Agro-sylvo-pastoral systems are common around the Mediterranean Basin, where they provide a variety of goods and services to the local populations. Their sustainability relies on efficient grazing management, especially in Mediterranean rangelands. The diversity of pastoral resources, combined with the variety of grazing management techniques and f...
Article
Full-text available
Purpose: Event-Based Surveillance (EBS) systems detect and monitor diseases by analysing articles from online newspapers and reports from health organizations (e.g. FAO, OIE, etc.). However, they only partially integrate data from social networks, even though these data are present in large quantities on the web. The purpose of this study is to exploit s...
Article
Full-text available
Purpose: In the first quarter of 2020, the World Health Organization (WHO) declared COVID-19 a global public health emergency. As a result, users from all over the world shared their thoughts about COVID-19 on social media platforms, e.g., Twitter, Facebook, etc. It is therefore important to analyze public opinions about COVID-19 from diffe...
Article
After many years of decline, hunger in Africa is growing again. This represents a global societal issue that all disciplines concerned with data analysis are facing. The rapid and accurate identification of food insecurity situations is a complex challenge. Although a number of food security alert and monitoring systems exist in food insecure count...
Article
Full-text available
Here, we introduce ITEXT-BIO, an intelligent process for biomedical domain terminology extraction from textual documents and subsequent analysis. The proposed methodology consists of two complementary approaches, including free and driven term extraction. The first is based on term extraction with statistical measures, while the second considers mo...
Article
Full-text available
Data produced by social networks may contain weak signals of possible epidemic outbreaks. In this paper, we focus on Twitter data during the waiting period before the appearance of the first COVID-19 cases outside China. Among the huge flow of tweets that reflects a growing global concern in all countries, we propose to analyze such data with an adapta...
Chapter
We present the MeDO project, aimed at developing resources for text mining and information extraction in the wastewater domain. We developed a specific Natural Language Processing (NLP) pipeline named WEIR-P (WastewatEr InfoRmation extraction Platform) which identifies the entities and relations to be extracted from texts, pertaining to information...
Conference Paper
Full-text available
As part of a partnership with the Montpellier metropolitan area (3M) and its smart city department, we are building a platform for integrating and analyzing massive heterogeneous data for intelligent observation of the territory concerned. In this article, we describe a process for automatically collecting documents ...
Article
Textual data is available to an increasing extent through different media (social networks, companies data, data catalogues, etc.). New information extraction methods are needed since these new resources are highly heterogeneous. In this article, we propose a text matching process based on spatial features and assessed through heterogeneous textual...
Article
Full-text available
In this paper, we propose a methodology for designing a data lake dedicated to spatial data and an implementation of this specific framework. Inspired by previous proposals on general data lake design and based on the Geographic information – Metadata standard (ISO 19115), the contribution presented in this paper integrates, with the same phil...
Article
Full-text available
In this article, the authors experiment with an approach for producing a consistent land cover map of rangeland areas in the French peri-Mediterranean zones, represented by the Occitanie and Provence-Alpes-Côte d’Azur regions. Four different data sources are used: the land cover ...
Conference Paper
Full-text available
Effective management of a data lake requires an efficient metadata management system. Many studies have addressed this aspect and proposed solutions. Nevertheless, few have focused on data lakes dedicated to spatial information. Yet this geographic dimension is fundamental whenever ...
Chapter
Identifying food insecurity situations in a timely and accurate manner is a complex challenge. To prevent food crises and design appropriate interventions, several food security warning and monitoring systems are very active in food-insecure countries. However, the limited types of data selected and the limitations of data processing methods used make it dif...
Chapter
In this paper, we propose a multidimensional mapping approach for heterogeneous textual data that exploits firstly the spatial dimension and secondly the thematic dimension. Based on the Spatial Textual Representation (STR) as well as the Geodict geographic database, the contribution presented in this paper integrates the thematic dimension of docu...
Article
The expansion of satellite technologies makes remote sensing data abundantly available. While the access to such data is no longer an issue, the analysis of this kind of data is still challenging and time consuming. In this paper, we present an object-oriented methodology designed to handle multi-annual Satellite Image Time Series (SITS). This meth...
Conference Paper
ABSTRACT. The project "Mégadonnées, données liées et fouille de données pour les réseaux d’assainissement" (MeDo) aims to take advantage of the big data available on the web to document the geometry and history of a sewer network, by combining different data mining techniques and multiplying the sources analyzed...
Book
Full-text available
The TERRE-ISTEX project aims to identify scientific research dealing with specific geographical territories based on heterogeneous digital content available in scientific papers. The project is divided into three main work packages: (1) identification of the periods and places of empirical studies, and which reflect the publications resulting...
Preprint
The TERRE-ISTEX project aims to identify the evolution of research work in relation to study areas, disciplinary crossings and concrete research methods based on the heterogeneous digital content available in scientific corpora. The project is divided into three main actions: (1) to identify the periods and places which have been the subject of...
Book
We present Gemedoc, a platform for text similarity annotation based on the spatial and the thematic dimension. To this end, a two-step annotation protocol was designed to assess the similarity between two documents: (1) identification of salient features according to the two analysis dimensions; (2) similarity assessment according to a 4-degree sca...
Preprint
The TERRE-ISTEX project aims to identify scientific research dealing with specific geographical territories based on heterogeneous digital content available in scientific papers. The project is divided into three main work packages: (1) identification of the periods and places of empirical studies, and which reflect the publications resulting...
Article
Background: Rapid advancements in biomedical research have accelerated the number of relevant electronic documents published online, ranging from scholarly articles to news, blogs, and user-generated social media content. Nevertheless, the vast amount of this information is poorly organized, making it difficult to navigate. Emerging technologies s...
Article
Full-text available
Environmental and more generally geospatial information is now provided by crowdsourcing but also by public administrations in the context of the open data policies. Analyses of such data are still challenging, because of their heterogeneity (structural, semantic, spatial, and temporal) and because of the difficulty in choosing the "best" knowledge...
Book
Environmental and more generally geospatial information is now provided by crowdsourcing but also by public administrations in the context of the open data policies. Analyses of such data are still challenging, because of their heterogeneity (structural, semantic, spatial, and temporal) and because of the difficulty in choosing the "best" knowledge...
Article
Full-text available
Nowadays, remote sensing technologies produce huge amounts of satellite images that can be helpful to monitor geographical areas over time. A satellite image time series (SITS) usually contains spatio-temporal phenomena that are complex and difficult to understand. Conceiving new data mining tools for SITS analysis is challenging since we need to s...
Article
Texts, in addition to maps and satellite images, have become an important spatial data resource in recent years. Electronic written texts used in mediated interactions, especially short messages, have triggered the emergence of new ways of writing. Extracting information from such short messages, which represent a rich source of information, is high...
Article
Enhancing the frequency of satellite acquisitions is a key issue for the Earth Observation community today. Repeated observations are crucial for monitoring purposes, particularly when intra-annual processes should be taken into account. Time series of images constitute a valuable source of information in these cases. The goal of this paper is...
Conference Paper
In the past few years, texts have become an important spatial data resource, in addition to maps, satellite images and GPS. Electronic written texts used in mediated interactions, especially short messages (SMS, tweets, etc.), have triggered the emergence of new ways of writing. Extracting information from such short messages, which represent a ric...
Conference Paper
Full-text available
Polysemy is the capacity for a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which makes it possible to find different meanings for a term. Polysemy detection is also important for information extraction (IE) systems. In addition, polysemy detection is important for building/enriching terminologies...
Article
Recent improvements in positioning technology have led to a much wider availability of massive moving object data. A crucial task is to find the moving objects that travel together. Commonly, these object sets are called object movement patterns. Due to the emergence of many different kinds of object movement patterns in recent years, different ap...
Conference Paper
Full-text available
Biomedical ontologies play an important role in information extraction in the biomedical domain. We present a workflow for automatically updating biomedical ontologies, composed of four steps. We detail two contributions concerning the concept extraction and semantic linkage of extracted terminology.
Article
Data mining methods extract knowledge from huge amounts of data. Recently with the explosion of mobile technologies, a new type of data appeared. The resulting databases can be described as spatiotemporal data in which spatial information (e.g., the location of an event) and temporal information (e.g., the date of the event) are included. In this a...
Article
In this letter, we propose a new active transductive learning (ATL) framework for object-based classification of satellite images. The framework couples graph-based label propagation with active learning (AL) to exploit positive aspects of the two learning settings. The transductive approach considers both labelled and unlabelled image objects to p...
Article
Full-text available
Urban growth is an ongoing trend and one of its direct consequences is the development of buried utility networks. With growing needs among consumers, new networks are being installed and more underground space is being occupied. Locating these networks is becoming a challenging task. Mispositioning of utility networks is an important problem for...
Conference Paper
With the amount of textual data available on the web, new knowledge extraction methodologies are being proposed. Some original methods allow users to combine different types of data in order to extract relevant information. In this context, we present the cornerstone of manipulations on textual documents and their preparation for extracting...
Conference Paper
Knowledge discovery from texts, particularly the identification of spatial information is a difficult task due to the complexity of texts written in natural language. Here we propose a method combining two statistical approaches (lexical and contextual analysis) and a text mining approach to automatically identify types of spatial relations. Experi...
Conference Paper
Full-text available
Highlights: This paper describes the GEOSUD project which aims to implement a national data and services infrastructure in order to facilitate the use of satellite imagery by the French scientific community and public institutions. This ecosystem of innovation is part of the THEIA Land Data Centre.
Article
Full-text available
Terminology extraction is an essential task in domain knowledge acquisition, as well as for information retrieval. It is also a mandatory first step aimed at building/enriching terminologies and ontologies. As often proposed in the literature, existing terminology extraction methods feature linguistic and statistical aspects and solve some problems...
Article
Full-text available
The notion of spatial planning refers to various concepts such as spatial and temporal information, stakeholders, opinions, history, politics, etc. Today, with the development of digital technologies (blogs, forums, social networks, etc.), all the stakeholders involved express themselves, and all the documents...
Article
Data & Knowledge Engineering (DKE) serves designers, managers, and users of database systems, expert systems, and knowledge-based systems. The major aim of the journal is to identify, investigate, and analyze the underlying principles in the design and effective use of these systems. The DKE journal will be devoted to cross-fertilization of ideas a...
Article
This article describes a decision-support system developed to enable the analysis of data on the functioning of hydro-ecosystems; these data are numerous, diverse, and drawn from varied sources. The system comprises an integrated database, a warehouse for exploring the dimensions associated with the data,...
Conference Paper
Full-text available
Gradual patterns highlight covariations of attributes of the form "The more/less X, the more/less Y". Their usefulness in several applications has recently stimulated the synthesis of several algorithms for their automated discovery from large datasets. However, existing techniques require all the interesting data to be in a single database relat...
Conference Paper
Full-text available
Polysemy is the property of a term having several meanings. Polysemy prediction is a first step for Word Sense Induction (WSI), which makes it possible to find different meanings for a term, as well as for information extraction systems. In addition, polysemy detection is important for ...
Article
Full-text available
Economic development based on industrialization, intensive agriculture expansion and population growth places greater pressure on water resources through increased water abstraction and water quality degradation [40]. River pollution is now a visible issue, with emblematic ecological disasters following industrial accidents such as the pollution of...
Article
The main use of satellite imagery concerns the processing of the spectral and spatial dimensions of the data. However, to extract useful information, the temporal dimension also has to be accounted for, which increases the complexity of the problem. For this reason, there is a need for suitable data mining techniques for this source of data. In this wo...
Article
Nowadays, sequence databases are available in several domains with increasing sizes. Exploring such databases with new pattern mining approaches involving new data structures is thus important. This paper investigates this data mining challenge by presenting OrderSpan, an algorithm that is able to extract a set of closed partially ordered patterns...
Article
Full-text available
High-spatial-resolution satellites usually have the constraint of a low temporal frequency, which leads to long periods without information in cloudy areas. Furthermore, low-spatial-resolution satellites have higher revisit cycles. Combining information from high- and low-spatial-resolution satellites is thought to be a key factor for studies that require...
Book
Knowledge discovery from texts, particularly the identification of spatial information, is a difficult task due to the complexity of texts written in natural language. Here we propose a method combining two statistical approaches (lexical and contextual analysis) and a text mining approach to automatically identify types of spatial relations. Expe...
Article
Full-text available
In this contribution we present a local evaluation procedure of Landsat-MODIS fusion methods for crop monitoring purposes. Two fusion methods are applied to obtain a two-year time series of Landsat-resolution images. The validation is applied at pixel level in order to analyze if the simulated images are capable of unmixing coarse-resolution pixels...
Article
We propose a new data mining process to extract original knowledge from hydro-ecological data, in order to help the identification of pollution sources. This approach is based (1) on a domain knowledge discretization (quality classes) of physico-chemical and biological parameters, and (2) on an extraction of temporal patterns used as discriminant f...
