Alrick Dias’s research while affiliated with Mediterranean Institute of Marine and Terrestrial Biodiversity and Ecology and other places


Publications (12)


Visualisation de données sous forme de graphes en archéologie. Rencontre opérationnelle des archéologues d’ArkeoGIS et des écologues d’IndexMed
  • Article
  • Full-text available

October 2017 · 86 Reads · 5 Citations

Archéologies numériques

Cyrille Blanpain · [...]

The one thing that “archaeological”, “biodiversity” and “social systems” studies share is that data production is both expensive and rarely automated. Long time series and/or large spatial surveys are difficult to conduct, since several observers must be involved. The robustness and reproducibility of the observations are also harder to obtain, and reproducibility is obviously impossible in the archaeological sciences, even as modelling methods improve. In a context of multi-source data production, the equivalence of observation systems and the inter-calibration of observers become crucial. Multi-disciplinary integrative approaches become necessary to study systems where the data output of each discipline is discontinuous, imprecise and unevenly distributed. Yet all the variables of these systems (characterization of economic activities and human settlement, studies of production, characteristics of discovered or reconstituted objects, biotic and abiotic data, maps of anthropogenic and natural pressures, rendered and perceived services, societal perception...) interact over time and at every spatial scale. After a few years of existence, ArkeoGIS aggregates 67 databases representing over 50,000 objects (sites, analyses...). Building on this standardization of archaeological and palaeoenvironmental information, it seemed important to test new data mining methods, to see whether "related" and complex data can be linked to these archaeological data sets. Linking extracts of the databases aggregated within ArkeoGIS allowed us to set up cross-querying and to test its possibilities in a prototype developed by the IndexMed consortium. This open-source prototype allows links to be established between objects from different databases. The IndexMed consortium aims to identify and address the scientific challenges related to data quality and heterogeneity. The use of graphs allows us to consider the data despite their disparity and without ranking them, and improves decision support using emerging data mining methods (collaborative clustering, machine learning, graph approaches, knowledge representation). Adapting these methods to archaeology allows us to go beyond "simple" data aggregation: ArkeoGIS can therefore also be used to power tools that let us mine our data and metadata. KEYWORDS. visualisation, data qualification, graph, distributed information system, archeology. Acknowledgements: We thank all the active members of the IndexMed consortium for their contributions, and the GDR MaDICS and EcoStat for their endorsements and support. The authors obviously also wish to thank their respective communities, and concerning ArkeoGIS more particularly the authors of the databases used: G. Hoffmann, M. McCormick, C. Morrissey, C. Morel, M. Trautmann, N. Schneider, H. Wagner, C. Jeunesse, M. Roth-Zehner, D. Schwartz and C. Schmid-Merkl, as well as Dino Ienco for proofreading the computer-science terminology.
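As an illustration of the graph linking described in this abstract, the sketch below (not the actual ArkeoGIS/IndexMed code; the record fields such as site_id are invented) shows how extracts from two aggregated databases can become nodes of a graph that are linked when they refer to the same site, using the Python networkx library.

```python
# Minimal sketch (not the ArkeoGIS/IndexMed prototype): records from two aggregated
# databases become graph nodes and are linked when they share a hypothetical
# "site_id" attribute, so they can be explored with standard graph algorithms.
import networkx as nx

# Hypothetical extracts from two databases aggregated in ArkeoGIS
archaeo_records = [
    {"id": "arch-001", "site_id": "S12", "period": "La Tene"},
    {"id": "arch-002", "site_id": "S07", "period": "Hallstatt"},
]
paleo_records = [
    {"id": "pal-101", "site_id": "S12", "proxy": "pollen"},
    {"id": "pal-102", "site_id": "S99", "proxy": "charcoal"},
]

G = nx.Graph()
for rec in archaeo_records:
    G.add_node(rec["id"], source="archaeology", **rec)
for rec in paleo_records:
    G.add_node(rec["id"], source="paleoenvironment", **rec)

# Link objects from different databases that refer to the same site
for a in archaeo_records:
    for p in paleo_records:
        if a["site_id"] == p["site_id"]:
            G.add_edge(a["id"], p["id"], relation="same_site")

print(list(G.edges(data=True)))  # [('arch-001', 'pal-101', {'relation': 'same_site'})]
```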


Visualisation de données sous forme de graphes en archéologie. Rencontre opérationnelle des archéologues d’ArkeoGIS et des écologues d’IndexMed - (Data visualisation in archaeology based on graph approach. Operational meeting of ArkeoGIS archaeologists and IndexMed ecologists)

October 2017 · 185 Reads

The one thing that “archaeological”, “biodiversity” and “social systems” studies share is that data production is both expensive and rarely automated. Long time series and/or large spatial surveys are difficult to conduct, since several observers must be involved. The robustness and reproducibility of the observations are also harder to obtain, and reproducibility is obviously impossible in the archaeological sciences, even as modelling methods improve. In a context of multi-source data production, the equivalence of observation systems and the inter-calibration of observers become crucial. Multi-disciplinary integrative approaches become necessary to study systems where the data output of each discipline is discontinuous, imprecise and unevenly distributed. Yet all the variables of these systems (characterization of economic activities and human settlement, studies of production, characteristics of discovered or reconstituted objects, biotic and abiotic data, maps of anthropogenic and natural pressures, rendered and perceived services, societal perception...) interact over time and at every spatial scale. After a few years of existence, ArkeoGIS aggregates 67 databases representing over 50,000 objects (sites, analyses...). Building on this standardization of archaeological and palaeoenvironmental information, it seemed important to test new data mining methods, to see whether "related" and complex data can be linked to these archaeological data sets. Linking extracts of the databases aggregated within ArkeoGIS allowed us to set up cross-querying and to test its possibilities in a prototype developed by the IndexMed consortium. This open-source prototype allows links to be established between objects from different databases. The IndexMed consortium aims to identify and address the scientific challenges related to data quality and heterogeneity. The use of graphs allows us to consider the data despite their disparity and without ranking them, and improves decision support using emerging data mining methods (collaborative clustering, machine learning, graph approaches, knowledge representation). Adapting these methods to archaeology allows us to go beyond "simple" data aggregation: ArkeoGIS can therefore also be used to power tools that let us mine our data and metadata.


IndexMEED cases studies using "Omics" data with graph theory

September 2017 · 255 Reads

Biodiversity Information Science and Standards

Data produced within marine and terrestrial biodiversity research projects that evaluate and monitor Good Environmental Status have a high potential for use by stakeholders involved in environmental management. However, environmental data, especially in ecology, are not readily accessible to various users. The specific scientific goals and the logics of project organization and information gathering lead to a decentralized data distribution. In such a heterogeneous system spanning different organizations and data formats, it is difficult to harmonize the outputs efficiently, and few tools are available to assist. For instance, standards and specific protocols can be applied to interconnect databases; such semantic approaches greatly increase data interoperability. This communication presents recent results and the activity of the IndexMEED consortium (Indexing for Mining Ecological and Environmental Data), which aims to build new approaches to investigate complex research questions and to support the emergence of new scientific hypotheses based on graph theory (Auber et al. 2014). Current developments in data mining based on graphs, the potential for relevant contributions to environmental research (particularly for strategic decision-making), and new ways of organizing data will be presented (David et al. 2015). In particular, the consortium makes decisions on how i) to analyze heterogeneous distributed data spread throughout different databases combining molecular and habitat-characteristics data [3], ii) to create matches and incorporate some approximations, iii) to identify statistical relationships between observed data and the emergence of contextual patterns, using a calculation library and a distributed computing centre at the European level, and iv) to encourage openness and data sharing while complying with the general FAIR principles (Findable, Accessible, Interoperable, Re-usable and citable), in order to enhance the value of data and their utilization. IndexMEED participants are now exploring the ability of two scientific communities (ecology sensu lato and computer sciences) to work together, using several case studies. The ECOSCOPE project aims to meet the need for access to structured and complementary omics datasets to better understand the state of biodiversity and its dynamics. Indeed, the ECOSCOPE case study aims to visualize, through the graph approach, links between datasets and databases from genetics to ecosystems. Another case study, displaying anthropology fossils and omics data on the same graph, will also be presented. The DEVOTES (DEVelopment Of innovative Tools for understanding marine biodiversity and assessing good Environmental Status) and CIGESMED (Coralligenous based Indicators to evaluate and monitor the "Good Environmental Status" of the MEDiterranean coastal water) European projects, conducted by IMBE, focus on photo-quadrats, cartography and omics data of marine hard bottoms in order to discover context patterns that help build decision support systems. The case study of the French project “65 Millions d’observateurs” is testing AskOmics to provide a graph-based querying interface using RDF (Resource Description Framework) and SPARQL technologies. Scientific questions can be addressed by these new data mining approaches, which offer new ways to investigate heterogeneous environmental data with graph mining (Muñoz et al. 2017). The uses of data from biodiversity research demonstrate the prototype's functionalities (David et al. 2016) and introduce new perspectives for analyzing environmental and societal responses, including decision-making at large scale, at the information-system level and the observing-system level as well as at the observed-system level.
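The abstract mentions a graph-based querying interface built on RDF and SPARQL (AskOmics). The minimal sketch below assumes a purely hypothetical namespace and triple layout rather than the actual ECOSCOPE or AskOmics data model; it only shows the general pattern with rdflib: a few triples linking omics samples to habitats, queried with SPARQL.

```python
# Minimal sketch, not the AskOmics implementation: hypothetical RDF triples linking
# omics samples to habitats, queried with SPARQL as in the graph-based interface
# described in the abstract. The namespace and identifiers are invented.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/indexmeed/")  # hypothetical namespace
g = Graph()
g.add((EX.sample42, EX.sequencedFrom, EX.quadrat7))
g.add((EX.quadrat7, EX.habitat, Literal("coralligenous")))
g.add((EX.sample43, EX.sequencedFrom, EX.quadrat9))
g.add((EX.quadrat9, EX.habitat, Literal("Posidonia meadow")))

# Which samples come from coralligenous habitats?
results = g.query("""
    PREFIX ex: <http://example.org/indexmeed/>
    SELECT ?sample WHERE {
        ?sample ex:sequencedFrom ?quadrat .
        ?quadrat ex:habitat "coralligenous" .
    }
""")
for row in results:
    print(row.sample)  # http://example.org/indexmeed/sample42
```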


Results of IndexMed GRAIL Days 2016: How to use standards to build GRAphs and mIne data for environmentaL research? IndexMeed consortium for data mining in ecology

April 2017 · 228 Reads

Data produced by biodiversity research projects that evaluate and monitor Good Environmental Status have a high potential for use by stakeholders involved in [marine] environmental management. The lack of specific scientific objectives, poor organizational logic, and a characteristically disorganized collection of information lead to a decentralized data distribution, hampering environmental research. In such a heterogeneous system across different organizations and data formats, it is difficult to efficiently harmonize the outputs, and few tools are available to assist. The task of the newly created IndexMeed consortium is to index biodiversity data (and to provide an index of qualified existing open datasets) and to make it possible to build graphs that assist in the analysis and development of new ways to mine data. Standards (including TDWG recommendations) and specific protocols can be applied to interconnect databases; such semantic approaches greatly increase data interoperability. The aim of this poster is to present the results of the 2016 IndexMed workshop (https://indexmed2016.sciencesconf.org) and the recent actions of the consortium (renamed IndexMeed, Indexing for Mining Ecological and Environmental Data): new approaches to investigate complex research questions and support the emergence of new scientific hypotheses. With one day of plenary sessions and two days of practical workshops, this event was dedicated to sharing experience and expertise and to acquiring practical methods for constructing graphs and valuing data through metadata and "data papers". Recent developments in data mining based on graphs, the potential for important contributions to environmental research (particularly regarding strategic decision-making), and new ways of organizing data were also discussed at the workshop. In particular, this workshop promoted decisions on how (i) to analyze heterogeneous distributed data spread across different databases, (ii) to create matches and incorporate some approximations, (iii) to identify statistical relationships between observed data and the emergence of contextual patterns, and (iv) to encourage openness and the sharing of data, in order to value data and their utilization. The IndexMeed project participants are now exploring the ability of two scientific communities (ecology sensu lato and computer sciences) to work together. The uses of data from biodiversity research demonstrate the prototype's functionalities and introduce new perspectives for analyzing environmental and societal responses, including decision-making. The output of the seminar lists scientific questions that can be resolved by the new data mining approaches and proposes new ways to investigate heterogeneous environmental data with graph mining.
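One concrete way to read the point about standards (including TDWG recommendations) interconnecting databases is field-level mapping onto shared terms. The sketch below maps two invented source schemas onto Darwin Core term names; the source field names and mapping tables are assumptions, not the consortium's actual crosswalk.

```python
# Minimal sketch of the standards idea: two hypothetical source schemas are mapped
# onto shared Darwin Core (TDWG) term names so their records can later be compared
# or linked. The source-side field names and mappings are invented.
DWC_MAPPINGS = {
    "db_alpha": {"species": "scientificName", "lat": "decimalLatitude",
                 "lon": "decimalLongitude", "date": "eventDate"},
    "db_beta":  {"taxon_name": "scientificName", "y": "decimalLatitude",
                 "x": "decimalLongitude", "sampling_date": "eventDate"},
}

def to_darwin_core(record: dict, source: str) -> dict:
    """Rename a source record's fields to Darwin Core terms where a mapping exists."""
    mapping = DWC_MAPPINGS[source]
    return {mapping.get(k, k): v for k, v in record.items()}

print(to_darwin_core({"species": "Corallium rubrum", "lat": 43.2, "lon": 5.3}, "db_alpha"))
# {'scientificName': 'Corallium rubrum', 'decimalLatitude': 43.2, 'decimalLongitude': 5.3}
```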


Results of IndexMed GRAIL Days 2016: How to use standards to build GRAphs and mIne data for environmentaL research?

December 2016 · 318 Reads

Data produced by biodiversity research projects that evaluate and monitor Good Environmental Status have a high potential for use by stakeholders involved in [marine] environmental management. The lack of specific scientific objectives, poor organizational logic, and a characteristically disorganized collection of information lead to a decentralized data distribution, hampering environmental research. In such a heterogeneous system across different organizations and data formats, it is difficult to efficiently harmonize the outputs, and few tools are available to assist. The task of the newly created IndexMeed consortium is to index biodiversity data (and to provide an index of qualified existing open datasets) and to make it possible to build graphs that assist in the analysis and development of new ways to mine data. Standards (including TDWG) and specific protocols can be applied to interconnect databases; such semantic approaches greatly increase data interoperability. The aim of this talk is to present the results of the 2016 IndexMed workshop (https://indexmed2016.sciencesconf.org) and the recent actions of the consortium (renamed "IndexMeed", Indexing for Mining Ecological and Environmental Data): new approaches to investigate complex research questions and support the emergence of new scientific hypotheses. With one day of plenary sessions and two days of practical workshops, this event was dedicated to sharing experience and expertise and to acquiring practical methods for constructing graphs and valuing data through metadata and "data papers". Recent developments in data mining based on graphs, the potential for important contributions to environmental research (particularly regarding strategic decision-making), and new ways of organizing data were also discussed at the workshop. In particular, this workshop promoted decisions on how (i) to analyze heterogeneous distributed data spread across different databases, (ii) to create matches and incorporate some approximations, (iii) to identify statistical relationships between observed data and the emergence of contextual patterns, and (iv) to encourage openness and the sharing of data, in order to value data and their utilization. The IndexMeed project participants are now exploring the ability of two scientific communities (ecology sensu lato and computer sciences) to work together. The uses of data from biodiversity research demonstrate the prototype's functionalities and introduce new perspectives for analyzing environmental and societal responses, including decision-making. The output of the seminar lists scientific questions that can be resolved by the new data mining approaches and proposes new ways to investigate heterogeneous environmental data with graph mining.


Graph approach of heterogeneous data, the new possibilities developed by the IndexMed consortium for data mining in ecology

October 2016 · 242 Reads

In a multi-source data production framework in ecology, the equivalence of observation systems and inter-calibration become crucial. Increasingly, integrative trans-disciplinary approaches become necessary in the study of systems where measurement in each discipline is patchy, imprecise and badly distributed. Yet all the variables (biotic, abiotic, anthropogenic and natural pressures, perceived and rendered services, societal perception, etc.) of these systems interact across a wide range of spatiotemporal scales. Beyond theoretical scientific issues and the intrinsic heterogeneity and complexity of biodiversity data, from genes to ecosystems and their links to environmental parameters, the improvement of data quality is hindered by data management issues: i) the dynamics of updating voluminous datasets, ii) the updating of reference repositories and standards supporting data administration, iii) the heterogeneity of data producers and their motivations to maintain and supply their information systems, and iv) the diversity of the targeted end-users and their skills. Describing data quality is an objective of the IndexMed consortium (http://www.indexmed.eu), based on an analysis of both the common and the differing elements between databases. These descriptions, as metadata, form a body of criteria used for data mining. The graph-based model is an abstraction tool that allows us to compare the various databases despite their differences and that improves decision support using emerging data mining methods. Practically, it is intended to establish the equivalence of data, based on data dictionaries, thesauri and ontologies. From the established logical relationships, new qualifiers can be deduced, even across data heterogeneity.
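The equivalence of data "based on data dictionaries, thesauri and ontologies" can be pictured with a toy thesaurus. The sketch below uses an invented broader-term hierarchy rather than any IndexMed vocabulary, and deduces that two records annotated with different terms are comparable because they share a broader concept.

```python
# Minimal sketch of thesaurus-based equivalence: a small, invented "broader term"
# hierarchy lets records annotated with different vocabularies be recognised as
# comparable before they are placed in a graph. Not an IndexMed vocabulary.
BROADER = {  # child term -> broader term (hypothetical hierarchy)
    "coralligenous assemblage": "hard bottom",
    "rocky reef": "hard bottom",
    "hard bottom": "benthic habitat",
    "Posidonia meadow": "benthic habitat",
}

def ancestors(term):
    """All broader terms of a term, walking up the hierarchy."""
    out = set()
    while term in BROADER:
        term = BROADER[term]
        out.add(term)
    return out

def comparable(term_a, term_b):
    """Two annotations are treated as equivalent if they share any broader term."""
    return bool(({term_a} | ancestors(term_a)) & ({term_b} | ancestors(term_b)))

print(comparable("coralligenous assemblage", "rocky reef"))       # True, both "hard bottom"
print(comparable("coralligenous assemblage", "Posidonia meadow")) # True via "benthic habitat"
```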


Project IndexMed: original solutions to manage the heterogeneity of marine ecology data in the Mediterranean Sea

September 2016 · 64 Reads

In studies of biodiversity and socio-ecological systems (SES), data production in coastal and marine areas is expensive and still has a low level of automation. Studies over long time series and/or large spatial areas are difficult to conduct, and when several observers must be involved, the robustness and reproducibility of the observations are more difficult to obtain. In a multi-source data production framework, the equivalence of observation systems and their inter-calibration become crucial. Multidisciplinary or trans-disciplinary integrative approaches become necessary in the study of systems where the output of each discipline is discontinuous, imprecise and poorly distributed. Yet all the variables (biotic and abiotic data, anthropogenic and natural pressures, perceived and rendered services, societal image…) of these systems interact over time and at every spatial scale. A better overall understanding of the balance of SES and their influence on biodiversity will be made possible by constructing and testing methods for the co-interpretation of analyses of these heterogeneous data. Data mining methods must be able to bring new perspectives to the disciplinary research fields that ultimately examine interrelated objects (environmental chemistry, genomics, transcriptomics, metabolomics, population/landscape ecology, socio-ecological systems). The IndexMed consortium aims to identify and overcome the scientific barriers related to data quality and heterogeneity. The use of a graph-based model allows us to consider the data, despite their differences, at a similar level, and improves decision support using emerging data mining methods (collaborative clustering, machine learning, graph mining, knowledge representation, etc.).
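As a hedged illustration of the "mining graphs" methods listed above, the sketch below runs community detection (one possible graph-mining technique, not necessarily the consortium's choice) on an invented graph of stations linked when they share species or pressures.

```python
# Minimal sketch of one graph-mining method (community detection) as a stand-in
# for the approaches listed in the abstract; the observation graph is invented.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.Graph()
# Hypothetical links between stations that share species or pressures
G.add_edges_from([
    ("station_A", "station_B"), ("station_B", "station_C"), ("station_A", "station_C"),
    ("station_D", "station_E"), ("station_E", "station_F"), ("station_D", "station_F"),
    ("station_C", "station_D"),  # weak bridge between the two groups
])

communities = greedy_modularity_communities(G)
for i, community in enumerate(communities):
    print(f"group {i}: {sorted(community)}")
# Typically recovers the two triangles as two groups despite the bridge edge
```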


Figure 1: IndexMed workflow and e-services. The resolution service compares the index with storage data in e-infrastructures and other remote XML/JSON feeds from different databases. When necessary and possible, it creates a persistent identifier or links datasets or data records with existing identifiers. A scientist interface, adapted to the level and needs of each user, allows a qualification process. The indexing service accepts and manages data for computing services such as data mining and graph analyses, and the statistical results and graph models are stored and offered as a new persistent feed. Where possible, data qualification uses tools, standards and recommendations at both national (SINP [National Information System on Biodiversity], RBDD [Network of Research Databases]) and international levels (MedOBIS [Mediterranean Ocean Biogeographic Information System], OBIS, GBIF [Cryer et al., 2009], Life-Watch, GEO-BON, etc.) or shared by other research entities (e.g. IRD [Institute of Research for Development] or MNHN [National Museum of Natural History, Paris]). Heterogeneity in datasets may result from a lack of standards for naming and describing data [Kattge et al., 2011; Madin et al., 2008]. Thus, attention must be paid to the characterisation of concepts by using a "controlled vocabulary" (i.e. with a commonly chosen shared definition) and semantic links between these concepts, which implies building a thesaurus in the first place. A thesaurus appears more appropriate than an ontology because of its flexibility. Several eco-informatics initiatives have attempted to build such a thesaurus (see [Michener & Jones, 2012; Laporte, 2012]) and they are expected to be taken into account.
Figure 2: Example of an application of the prototype graph to a data set of 100 photo-quadrats taken on coralligenous habitats in Marseille.
Figure 3: Iterative quality approach and IndexMed output.
IndexMed projects: new tools using the CIGESMED DataBase on Coralligenous for indexing, visualizing and data mining based on graphs

July 2016 · 590 Reads · 8 Citations

Data produced by the SeasEra CIGESMED project (Coralligenous based Indicators to evaluate and monitor the "Good Environmental Status" of the MEDiterranean coastal waters) have a high potential to be used by the various stakeholders involved in environmental management. A new consortium called IndexMed, whose task is to index Mediterranean biodiversity data, makes it possible to build graphs in order to analyse the CIGESMED data and to develop new ways of mining coralligenous data. This paper presents the prototypes under development, which test the ability of graph databases and tools to connect biodiversity objects with non-centralized data. The project explores the ability of two scientific communities to work together. The uses of data from coralligenous habitats demonstrate the prototype's functionalities and introduce new perspectives for analysing environmental and societal responses.
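A possible, simplified reading of "connecting biodiversity objects" with graphs is sketched below: photo-quadrats and species form a bipartite graph whose projection shows which quadrats are connected through shared species. All identifiers are invented, and this is not the CIGESMED prototype itself.

```python
# Minimal sketch (not the CIGESMED prototype): coralligenous photo-quadrats and the
# species annotated in them form a bipartite graph; projecting it shows which
# quadrats are linked through shared species. All identifiers are invented.
import networkx as nx
from networkx.algorithms import bipartite

B = nx.Graph()
quadrats = ["Q1", "Q2", "Q3"]
B.add_nodes_from(quadrats, kind="quadrat")
B.add_nodes_from(["Corallium rubrum", "Parazoanthus axinellae"], kind="species")
B.add_edges_from([
    ("Q1", "Corallium rubrum"), ("Q2", "Corallium rubrum"),
    ("Q2", "Parazoanthus axinellae"), ("Q3", "Parazoanthus axinellae"),
])

# Quadrat-to-quadrat graph, weighted by the number of shared species
P = bipartite.weighted_projected_graph(B, quadrats)
print(list(P.edges(data=True)))
# [('Q1', 'Q2', {'weight': 1}), ('Q2', 'Q3', {'weight': 1})]
```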


Aller au-delà des données agrégées dans ArkeoGIS : utilisation de graphes au sein d'IndexMed

June 2016 · 98 Reads

What studies in archaeology, biodiversity or social systems have in common is that data production is both expensive and rarely automated. Surveys over long time series and/or large spatial extents are difficult to conduct once several observers are required, and the robustness and reproducibility of the observations are also harder to obtain. In a context of multi-source data production, the equivalence of observation systems and the inter-calibration of observers become crucial. Multi- or even trans-disciplinary integrative approaches become necessary for studying systems in which data production in each discipline is discontinuous, imprecise and unevenly distributed. Yet all the variables of these systems (maps of human settlements, characterization of economic activities, studies of production, inventories of objects, biotic and abiotic data, maps of anthropogenic and natural pressures, rendered and perceived services, societal image, ...) interact over time and at every spatial scale. After a few years of existence, ArkeoGIS now aggregates more than 60 databases representing over 50,000 objects (sites, analyses). Building on this standardization of archaeological and palaeoenvironmental information, it seemed important to us to apply new data mining methods in order to see whether "related" data can be linked to these archaeological data sets. The link between ArkeoGIS and the EPD (European Pollen Database) allowed us to set up a cross-query and to test this possibility within a prototype developed by the IndexMed consortium. This open-source prototype enables links to be established between objects from different databases. The IndexMed consortium aims to identify and then remove the scientific obstacles related to data quality and heterogeneity. The use of graphs makes it possible to consider the data despite their disparity and without ranking them, and improves decision support by using emerging data mining methods (collaborative clustering, machine learning, graph mining, knowledge representation); adapting these methods to archaeology allows us to go beyond "simple" data aggregation. The objective: a better overall understanding of the historical interactions between humans and biodiversity, made possible by constructing and testing methods for the co-interpretation of these heterogeneous data. Data mining methods will bring new perspectives to the disciplinary research fields that ultimately study closely interrelated objects (links between archaeological data and environmental chemistry, genomics, transcriptomics, metabolomics, population/landscape ecology, socio-ecological systems).
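The ArkeoGIS-EPD cross-query idea can be sketched as a simple spatial and chronological join. The example below uses invented site and pollen-core records, a purely illustrative distance threshold, and plain coordinate arithmetic; it is a sketch of the principle, not the prototype's implementation.

```python
# Minimal sketch of the ArkeoGIS <-> EPD cross-query idea: hypothetical archaeological
# sites and pollen cores are matched when they lie within a distance threshold and
# their date ranges overlap. All values and identifiers are invented.
from math import hypot

sites = [{"id": "arkeo-S12", "x": 7.45, "y": 48.58, "start": -800, "end": -450}]
cores = [{"id": "epd-C03", "x": 7.47, "y": 48.60, "start": -1200, "end": 0},
         {"id": "epd-C09", "x": 6.10, "y": 47.20, "start": -500, "end": 500}]

def overlaps(a, b):
    """True if the two date ranges intersect."""
    return a["start"] <= b["end"] and b["start"] <= a["end"]

MAX_DIST = 0.1  # degrees, purely illustrative
links = [(s["id"], c["id"]) for s in sites for c in cores
         if hypot(s["x"] - c["x"], s["y"] - c["y"]) <= MAX_DIST and overlaps(s, c)]
print(links)  # [('arkeo-S12', 'epd-C03')]
```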


Bilan Juin 2015 – Février 2016 du Projet VIGI-GEEK : VIsualisation of Graph In trans-disciplinary Global Ecology, Economy and Sociology data-Kernel.

February 2016 · 112 Reads · 1 Citation

"VIGI-GEEK" proposes to build a tool for representing data from different disciplinary fields (ecology, sociology, economics) in the form of graphs, and to develop methods for creating scenarios through successive approximations (co-evolution of factors), based on concepts currently described by global approaches. The objective is to build configurable graphs from heterogeneous data on Mediterranean ecology (from the molecule to the ecosystem, through life-history traits, up to landscapes and human-environment interactions) and to analyse these data using algorithms drawn from other disciplines. Within a multidisciplinary consortium called "IndexMed", "VIGI-GEEK" is to develop, in the medium term, the uses of these graphs for decision support in environmental management, within the framework of a research project to be submitted to European calls for proposals (BiodivERsA, FEDER, SeasEra, H2020).


Citations (3)


... On this basis, the databases of the ArkeoGIS project of the University of Strasbourg were used to extract distribution patterns of the archaeological record. The data has been filtered and harmonised depending on the availability and the choice of the chronological period (fig. 6) (Bernard 2014; David et al. 2017). A digital elevation model (ASTER_GDEM2) was obtained from the United States Geological Survey (USGS) (Herzog/Yépez 2015; Suwandana et al. 2012; Barsi et al. 2014). ...

Reference:

SFB1070-15-Human-Made Environments
Visualisation de données sous forme de graphes en archéologie. Rencontre opérationnelle des archéologues d’ArkeoGIS et des écologues d’IndexMed

Archéologies numériques

... makes it possible to aggregate disparate databases within a free, online tool using a "bottom-up" ontology built by the practitioners of the discipline. Building on several decades of more or less comparable experience (Archaeomedes then ArchaeDYN, Fastionline), and aggregating archaeological projects (Digital Atlas of Roman and Medieval Civilizations - DARMC; Chronocarto, NOMISMA, artefacts… are under consideration) as well as environmental projects (European Pollen Database, MedMAX, Banadora or DCCD in progress), the ArkeoGIS project found in IndexMed a team developing a very well-suited tool, even though it was initially developed for studies in marine ecology (David et al. 2016). ...

IndexMed projects: new tools using the CIGESMED DataBase on Coralligenous for indexing, visualizing and data mining based on graphs

... Organizing, linking and analyzing heterogeneous environmental data is complex but required by all arising environmental management challenges (Madon et al. 2022). Building decision support systems that take into account all types of data, from the molecular level to the global scale, including contexts such as physical and chemical environments, requires organizing these links semantically (David et al. 2015). In this chapter, the WIBE IS serves as a case study to present the steps followed to curate and organize inter- and multidisciplinary data. ...

A First Prototype for Indexing, Visualizing and Mining Heterogeneous Data in Mediterranean Ecology: Within the IndexMed Consortium Interdisciplinary Framework