Sandra Bringay

Agence Régionale de Santé (ARS), Paris, Île-de-France, France

Publications (55) · 13.61 Total impact

  • Jérome Pasquet, Sandra Bringay, Marc Chaumont
    ABSTRACT: Many different hypotheses may be chosen for modeling a steganography/steganalysis problem. In this paper, we look closer into the case in which Eve, the steganalyst, has partial or erroneous knowledge of the cover distribution. More precisely, we suppose that Eve knows the algorithms and the payload size that have been used by Alice, the steganographer, but she does not know the image distribution. In this source-cover mismatch scenario, we demonstrate that an Ensemble Classifier with Features Selection (EC-FS) allows the steganalyst to obtain state-of-the-art performance, while requiring a training database 100 times smaller than the previous state-of-the-art approach. Moreover, we propose the islet approach in order to increase classification performance.
    EUSIPCO 2014, 22nd European Signal Processing Conference 2014, Lisbon, Portugal; 09/2014
  • ABSTRACT: Rapid population growth and the development of human activities (such as agriculture, industry, and transport) have increased the vulnerability of water resources. Due to the complexity of natural processes and the numerous interactions between hydro-systems and human pressures, water quality is difficult to quantify. In this context, we present a knowledge discovery process applied to hydrological data. To achieve this objective, we combine successive methods to extract knowledge from data collected at stations located along several rivers. First, data are pre-processed in order to obtain different spatial proximities. Next, we apply a standard algorithm to extract sequential patterns. Finally, we propose a combination of two techniques: (1) to filter patterns based on an interest measure, and (2) to group and present them graphically, to help the experts. Such elements can be used to assess spatialized indicators to assist the interpretation of ecological and river monitoring pressure data.
    Ecological Informatics 06/2014; · 1.96 Impact Factor
  • ABSTRACT: In the framework of the French project Patients' Mind, we focus on the semi-automatic analysis of online health forums. Online health forums are areas of exchange where patients, on condition of anonymity, can talk about their personal experiences freely. These resources are a gold mine for health professionals, giving them access to patient-to-patient exchanges, patient-to-health-professional exchanges, and even exchanges between health professionals. In this paper, we focus on the emotions expressed by the authors of the messages and more precisely on the targets of these emotions. We suggest an innovative method to identify these targets, based on the notion of semantic roles and using the FrameNet resource. Our method has been successfully validated on a real data set.
    15th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2014), Kathmandu, Nepal; 04/2014
  • ABSTRACT: To identify local meteorological drivers of dengue fever in French Guiana, we applied an original data mining method to the available epidemiological and climatic data. Through this work, we also assessed the contribution of the data mining method to the understanding of factors associated with the dissemination of infectious diseases and their spatiotemporal spread. We applied contextual sequential pattern extraction techniques to epidemiological and meteorological data to identify the most significant climatic factors for dengue fever, and we investigated the relevance of the extracted patterns for the early warning of dengue outbreaks in French Guiana. The maximum temperature, minimum relative humidity, global brilliance, and cumulative rainfall were identified as determinants of dengue outbreaks, and the precise intervals of their values and variations were quantified according to the epidemiologic context. The strongest significant correlations were observed between dengue incidence and meteorological drivers after a 4-6-week lag. We demonstrated the use of contextual sequential patterns to better understand the determinants of the spatiotemporal spread of dengue fever in French Guiana. Future work should integrate additional variables and explore the notion of neighborhood for extracting sequential patterns. Dengue fever remains a major public health issue in French Guiana. The development of new methods to identify such specific characteristics becomes crucial in order to better understand and control spatiotemporal transmission.
    Journal of the American Medical Informatics Association 02/2014; · 3.57 Impact Factor
  • ABSTRACT: Internet health forums are a rich textual resource with content generated through free exchanges among patients and, in certain cases, health professionals. We tackle the problem of retrieving clinically relevant information from such forums, with relevant topics being defined from clinical auto-questionnaires. Texts in forums are largely unstructured and noisy, calling for adapted preprocessing and query methods. We minimize the number of false negatives in queries by using a synonym tool to achieve query expansion of initial topic keywords. To avoid false positives, we propose a new measure based on a statistical comparison of frequent co-occurrences in a large reference corpus (Web) to keep only relevant expansions. Our work is motivated by a study of breast cancer patients' health-related quality of life (QoL). We consider topics defined from a breast-cancer specific QoL-questionnaire. We quantify and structure occurrences in posts of a specialized French forum and outline important future developments.
    Studies in health technology and informatics 01/2014; 205:1070-1074.
  • Studies in health technology and informatics 01/2014; 205:1185.
  • ABSTRACT: We propose a new data mining process to extract original knowledge from hydro-ecological data, in order to help identify pollution sources. This approach is based (1) on a domain-knowledge discretization (quality classes) of physico-chemical and biological parameters, and (2) on an extraction of temporal patterns used as discriminant features to link physico-chemistry with biology in river sampling sites. For each bio-index quality value, we obtained a set of significant discriminant features. We used them to identify the physico-chemical characteristics that impact different biological dimensions according to their presence in the extracted knowledge. The experiments agree with the domain knowledge and also highlight significant mismatches between physico-chemical and biological quality classes. We then discuss the value of using discriminant temporal patterns for the exploration and analysis of temporal environmental data such as hydro-ecological databases.
    Ecological Informatics 01/2014; · 1.96 Impact Factor
  • ABSTRACT: Rapid population growth and the development of human activities (such as agriculture, industry, and transport) have increased the vulnerability of water resources. Due to the networked nature of river distribution, the interactions between hydro-systems and human pressures are difficult to understand. In addition, many hypotheses about river water pollution can be formulated. In this context, knowledge discovery is a promising process to better understand and manage such phenomena. We have combined the results of several data mining methods to extract actionable knowledge from data collected by stations located along several rivers. First, data are pre-processed (aggregated) according to different spatial relationships, which leads to the extraction of semantically different patterns in the second phase of the process. Then, the resulting datasets are mined to extract sequential and spatio-sequential patterns. Finally, patterns are filtered using a new quality measure based on the notion of contradiction. Such elements can be used to assess specialized indicators to assist the experts in river water quality restoration.
    Revue internationale de géomatique 12/2013; 23(3-4):471-496.
  • ABSTRACT: Technological advances in data acquisition make it possible to better monitor dynamic phenomena in various domains, including the environment. The collected data are more complex (spatial, temporal, heterogeneous, and multi-scale). The exploitation of these data requires new methods of data analysis and knowledge discovery. In this context, approaches for discovering spatio-temporal patterns are particularly relevant. This paper provides a detailed review of these works. We focus on two examples of patterns: colocation and spatio-sequential patterns. These patterns have been used to study real applications in the field of environment.
    Revue d'intelligence artificielle 10/2013;
  • Spatial Analysis and GEOmatics 2009 (SAGEO'2013), Brest, France; 09/2013
  • ABSTRACT: In this paper, we focus on methods for extracting spatial information from text documents. After presenting textual descriptions of space and the manual annotation of named entities, mainly location and organization, we present our proposal, Text2Geo. It is a hybrid method which combines an information extraction approach based on patterns with a supervised classification approach to explore context. We discuss some results obtained on the dataset of Thau lagoon.
    In Proceedings of WIMS'13 (International Conference on Web Intelligence, Mining and Semantics); 01/2013
  • Julien Rabatel, Sandra Bringay, Pascal Poncelet
    ABSTRACT: Traditional sequential patterns do not take into account contextual information associated with sequential data. For instance, when studying purchases of customers in a shop, a sequential pattern could be “frequently, customers buy products A and B at the same time, and then buy product C”. Such a pattern does not consider the age, the gender or the socio-professional category of customers. However, by taking into account contextual information, a decision expert can adapt his/her strategy according to the type of customers. In this paper, we focus on the analysis of a given context (e.g., a category of customers) by extracting context-dependent sequential patterns within this context. For instance, given the context corresponding to young customers, we propose to mine patterns of the form “buying products A and B then product C is a general behavior in this population” or “buying products B and D is frequent for young customers only”. We formally define such context-dependent sequential patterns and highlight relevant properties that lead to an efficient extraction algorithm. We conduct our experimental evaluation on real-world data and demonstrate performance issues.
    Advances in Knowledge Discovery and Management (AKDM-3), 01/2013: pages 23-41; Springer.
  • ABSTRACT: Epidemiological surveillance is an important issue of public health policy. In this paper, we describe a method based on knowledge extraction from news and news classification to understand the evolution of an epidemic. Descriptive studies are useful for gathering information on the incidence and characteristics of an epidemic. New approaches, based on new modes of mass publication through the web, are being developed: they rely on the analysis of user queries or on the echo that an epidemic may have in the media. In this study, we focus on a particular medium: web news. We propose the Epimining approach, which allows the extraction of information from web news (based on pattern research) and a fine classification of these news items into various classes (new cases, deaths...). The experiments conducted on a real corpus (AFP news) showed a precision greater than 94% and an F-measure above 85%. We also investigate the value of taking into account the data collected through social networks such as Twitter to trigger alarms.
    Emerging Trends in Knowledge Discovery and Data Mining, Lecture Notes in Artificial Intelligence 01/2013: pages 11-21; Springer., ISBN: 978-3-642-36777-9
  • Proceedings of the 2012 Pacific-Asia conference on Emerging Trends in Knowledge Discovery and Data Mining; 05/2012
  • International Conference on Geographic Information Science (AGILE'2012), 24-27 April 2012, Avignon, France; 04/2012
  • ABSTRACT: Health risk management, such as the study of epidemics, produces large quantities of spatio-temporal data. The development of new methods able to manage such specific characteristics becomes crucial. To tackle this problem, we define a theoretical framework for extracting spatio-temporal patterns (sequences representing the evolution of locations and their neighborhoods over time). Classical frequency support considers neither a pattern's neighborhood nor its evolution over time. We thus propose a new interestingness measure taking into account both spatial and temporal aspects. An algorithm based on a pattern-growth approach with efficient successive projections over the database is proposed. Experiments conducted on real datasets highlight the relevance of our method.
    PAKDD (2); 01/2012
  • EGC; 01/2012
  • ABSTRACT: In this article, we present a knowledge discovery project on hydrological data. To this end, we apply a sequential pattern extraction algorithm to data collected at stations located along several rivers. The data are pre-processed in order to consider different spatial proximities, and the analysis of the number of patterns obtained highlights the influence of the relationships thus defined. We propose and detail an objective evaluation measure, called the least temporal contradiction measure, to help the expert discover novelties. These elements lay the first foundations for more ambitious work aimed at proposing spatialized indicators to assist the interpretation of ecological monitoring data for watercourses and of pressure data.
    Revue des Nouvelles Technologies de l'Information. 12/2011; E-22(MQDC 2012):165-188.
  • ABSTRACT: Data mining allows users to discover novelty in huge amounts of data. Frequent pattern methods have proved to be efficient, but the extracted patterns are often too numerous and thus difficult for end users to analyze. In this paper, we focus on sequential pattern mining and propose a new visualization system to help end users analyze the extracted knowledge and to highlight novelty according to databases of referenced biological documents. Our system is based on three visualization techniques: clouds, solar systems, and treemaps. We show that these techniques are very helpful for identifying associations and hierarchical relationships between patterns among related documents. Sequential patterns extracted from gene data using our system were successfully evaluated by two biology laboratories working on Alzheimer's disease and cancer.
    Journal of Biomedical Informatics 10/2011; 44:760-774. · 2.13 Impact Factor
  • Julien Rabatel, Sandra Bringay, Pascal Poncelet
    ABSTRACT: Today, many industrial companies must face problems raised by maintenance. In particular, the anomaly detection problem is probably one of the most challenging. In this paper, we focus on the railway maintenance task and propose to automatically detect anomalies in order to predict potential failures in advance. We first address the problem of characterizing normal behavior. In order to extract interesting patterns, we have developed a method that takes into account the contextual criteria associated with railway data (itinerary, weather conditions, etc.). We then measure the compliance of new data with the extracted knowledge, and provide information about the seriousness and the exact location of a detected anomaly.
    Expert Systems with Applications 06/2011; 38:7003-7015. · 1.85 Impact Factor

Publication Stats

59 Citations
13.61 Total Impact Points

Institutions

  • 2011
    • Agence Régionale de Santé (ARS)
      Paris, Île-de-France, France
    • Laboratoire Bordelais de Recherche en Informatique
      Bordeaux, Aquitaine, France
  • 2008–2011
    • Université Montpellier 2 Sciences et Techniques
      • Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM)
      Montpellier, Languedoc-Roussillon, France
  • 2010
    • French National Centre for Scientific Research
      • Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM)
      Montpellier, Languedoc-Roussillon, France
    • Paul Valéry University, Montpellier 3
      Montpellier, Languedoc-Roussillon, France