About
128 Publications
82,820 Reads
12,604 Citations
Publications (128)
The biomedical research community is motivated to share and reuse data from studies and projects by funding agencies and publishers. Effectively combining and reusing neuroimaging data from publicly available datasets requires the capability to query across datasets in order to identify cohorts that match both neuroimaging and clinical/behavioral...
Background
Birth defects are functional and structural abnormalities that impact about 1 in 33 births in the United States. They have been attributed to genetic and other factors such as drugs, cosmetics, food, and environmental pollutants during pregnancy, but for most birth defects there are no known causes.
Methods
To further characterize assoc...
The December 2022 release of the SPARC Portal (https://sparc.science) included the first significant update to the anatomical connectivity flatmaps since the portal first launched. These flatmaps provide an interactive and visual map for the display and exploration of the autonomic nervous system of human, rat, mouse, pig, and cat (https://sparc...
Antibodies are ubiquitous key biological research resources yet are tricky to use as they are prone to performance issues and represent a major source of variability across studies. Understanding what antibody was used in a published study is therefore necessary to repeat and/or interpret a given study. However, antibody reagents are still frequent...
Birth defects are functional and structural abnormalities that impact 1 in 33 births in the United States. Birth defects have been attributed to genetic as well as other factors, but for most birth defects there are no known causes. Small molecule drugs, cosmetics, foods, and environmental pollutants may cause birth defects when the mother is expos...
The Stimulating Peripheral Activity to Relieve Conditions (SPARC) program is a US National Institutes of Health-funded effort to improve our understanding of the neural circuitry of the autonomic nervous system (ANS) in support of bioelectronic medicine. As part of this effort, the SPARC project is generating multi-species, multimodal data, models,...
The Neuroimaging Data Model (NIDM) is a series of specifications for describing all aspects of the neuroimaging data lifecycle from raw data to analyses and provenance. NIDM uses community-driven terminologies along with unambiguous data dictionaries within a Resource Description Framework (RDF) document to describe data and metadata for integratio...
Investigator-generated transcriptomic datasets interrogating circulating immune cell (CIC) gene expression in clinical type 1 diabetes (T1D) have underappreciated re-use value. Here, we repurposed these datasets to create an open science environment for the generation of hypotheses around CIC signaling pathways whose gain or loss of function contri...
Discovering and learning how to use fast-growing publicly available online bioinformatics and data resources can be challenging for bench scientists. The NIDDK Information Network (dkNET; https://dknet.org) is an open community resource information portal for biomedical researchers supported by the National Institute of Diabetes and Digestive and K...
Traumatic brain injury (TBI) is a major public health problem. Despite considerable research deciphering injury pathophysiology, precision therapies remain elusive. Here, we present large-scale data sharing and machine intelligence approaches to leverage TBI complexity. The Open Data Commons for TBI (ODC-TBI) is a community-centered repository emph...
BACKGROUND
Improving rigor and transparency measures should lead to improvements in reproducibility across the scientific literature, but assessing measures of transparency tends to be very difficult if performed manually.
OBJECTIVE
This study addresses an enhancement of the Rigor and Transparency Index (RTI v.2.0), which attempts to automatically...
Background
Improving rigor and transparency measures should lead to improvements in reproducibility across the scientific literature; however, the assessment of measures of transparency tends to be very difficult if performed manually.
Objective
This study addresses the enhancement of the Rigor and Transparency Index (RTI, version 2.0), which atte...
In this perspective article, we consider the critical issue of data and other research object standardisation and, specifically, how international collaboration and organizations such as the International Neuroinformatics Coordinating Facility (INCF) can encourage that emerging neuroscience data be Findable, Accessible, Interoperable, and Reusable...
The Stimulating Peripheral Activity to Relieve Conditions (SPARC) program is a US National Institutes of Health-funded effort to improve our understanding of the neural circuitry of the autonomic nervous system in support of bioelectronic medicine. As part of this effort, the SPARC project is generating multi-species, multimodal data, models, simul...
The Data and Resource Center (DRC) of the NIH-funded SPARC program is developing databases, connectivity maps, and simulation tools for the mammalian autonomic nervous system. The experimental data and mathematical models supplied to the DRC by the SPARC consortium are curated, annotated and semantically linked via a single knowledgebase. A data po...
Antibodies are widely used reagents to test for expression of proteins and other antigens. However, they might not always reliably produce results when they do not specifically bind to the target proteins that their providers designed them for, leading to unreliable research results. While many proposals have been developed to deal with the problem...
A Correction to this paper has been published: https://doi.org/10.1007/s12021-021-09522-x
The Data and Resource Center (DRC) of the NIH-funded SPARC program is developing databases, connectivity maps and simulation tools for the mammalian autonomic nervous system. The experimental data and mathematical models supplied to the DRC by the SPARC consortium are curated, annotated and semantically linked via a single knowledgebase. A data por...
Traumatic brain injury (TBI) is a major unsolved public health problem worldwide, with considerable preclinical research dedicated to recapitulating clinical TBI, deciphering the underlying pathophysiology, and developing therapeutics. However, the heterogeneity of clinical TBI, and correspondingly of preclinical studies, has made translation from be...
The NIH Common Fund Stimulating Peripheral Activity to Relieve Conditions (SPARC) initiative is a large-scale program that seeks to accelerate the development of therapeutic devices that modulate electrical activity in nerves to improve organ function. Integral to the SPARC program are the rich anatomical and functional datasets produced by investi...
There is great need for coordination around standards and best practices in neuroscience to support efforts to make neuroscience a data-centric discipline. Major brain initiatives launched around the world are poised to generate huge stores of neuroscience data. At the same time, neuroscience, like many domains in biomedicine, is confronting the is...
Motivation: Antibodies are widely used reagents to test for expression of proteins. However, they might not always reliably produce results when they do not specifically bind to the target proteins that their providers designed them for, leading to unreliable research results. While many proposals have been developed to deal with the problem of ant...
The ever-accelerating pace of biomedical research results in a corresponding acceleration in the volume of biomedical literature created. Since new research builds upon existing knowledge, the rate of increase in the available knowledge encoded in biomedical literature makes easy access to that implicit knowledge more vital over time. Toward the...
The Research Resource Identifier was introduced in 2014 to better identify biomedical research resources and track their use across the literature, including key digital resources like databases and software. Authors include an RRID after the first mention of any resource used. Here we provide an overview of RRIDs and analyze their use for digital...
Over the last 5 years, multiple stakeholders in the field of spinal cord injury (SCI) research have initiated efforts to promote publication standards and to enable sharing of experimental data. In 2016, NIH/NINDS hosted representatives from the SCI community to streamline these efforts and to discuss the future of data sharing in the field accordin...
The NIDDK Information Network (dkNET; https://dknet.org) is an open community resource portal for basic and clinical investigators in diabetes, digestive, endocrine, metabolic, kidney, and urologic diseases [1]. dkNET provides access to a collection of diverse research resources, including data, information, materials, organisms, tools, funding opp...
This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles (Data Citation Synthesis Group, 2014), a synopsis and harmonization of the recommendations of major science policy bodies. The roadmap was developed by the Repositories Early Adopt...
There has been a recent major upsurge in concerns about reproducibility in many areas of science. Within the neuroimaging domain, one approach to promoting reproducibility is to target the re-executability of the publication. The information supporting such re-executability can enable the detailed examination of how an initial finding generali...
Data generated by scientific research enables further advancement in science through reanalyses and pooling of data for novel analyses. With the increasing amounts of scientific data generated by biomedical research providing researchers with more data than they have ever had access to, finding the data matching the researchers' requirements contin...
Most biomedical data repositories issue locally unique accession numbers, but do not provide globally unique, machine-resolvable, persistent identifiers for their datasets, as required by publishers wishing to implement data citation in accordance with widely accepted principles. Local accessions may, however, be prefixed with a namespace identifier...
The NIDDK Information Network (dknet.org) is a portal for basic and clinical investigators that makes it easier to discover, obtain, and reuse scientific research resources. Here we demonstrate how dkNET can connect researchers to resources for obesity research. A search for “obesity” returns 264,498 results (Table 1), including physical resources...
Reproducibility was assessed in the scientific literature and the field of immunology did not fare well, as described in Vasilevsky et al. (2013). Antibodies and constructs used in immunological publications could not be identified in more than 50% of the cases studied. Similarly, in publications describing the recognition of immune epitopes by the ad...
Objective:
Finding relevant datasets is important for promoting data reuse in the biomedical domain, but it is challenging given the volume and complexity of biomedical data. Here we describe the development of an open source biomedical data discovery system called DataMed, with the goal of promoting the building of additional data indexes in the...
Digital repositories bring direct impact and influence on the research community and society, but measuring their value using formal metrics remains challenging. It is challenging to define a single perfect metric that covers all quality aspects. Here, we distinguish between impact and influence and discuss measures and mentions as...
Hua Xu, Jeffrey Grethe, Ian Fore- [...]
Introduction
The bioCADDIE Data Discovery Index Consortium (http://biocaddie.org) is charged with developing a working prototype of a DDI that facilitates data discovery by a broad range of users in addition to its other goals. The prototype development is carried out by the core development team (CDT) at bioCADDIE. The CDT has built the DataMed se...
This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles, a synopsis and harmonization of the recommendations of major science policy bodies. The roadmap was developed by the Repositories Expert Group, as part of the Data Citation Implem...
Today’s science increasingly requires effective ways to find and access existing datasets that are distributed across a range of repositories. For researchers in the life sciences, discoverability of datasets may soon become as essential as identifying the latest publications via PubMed. Through an international collaborative effort funded by the N...
The value of broadening searches for data across multiple repositories has been identified by the biomedical research community. As part of the US National Institutes of Health (NIH) Big Data to Knowledge initiative, we work with an international community of researchers, service providers and knowledge experts to develop and test a data index and...
In many disciplines, data is highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. D...
The value of broadening searches for data across multiple repositories has been identified by the biomedical research community. As part of the NIH Big Data to Knowledge initiative, we work with an international community of researchers, service providers and knowledge experts to develop and test a data index and search engine, which are based on m...
The purpose of this document is to specify the basic data types required for storing electrophysiology and optical imaging data to facilitate computer-based neuroscience studies and data sharing. These requirements were developed within a working group of the Electrophysiology Task Force in the INCF Program on Standards for Data Sharing.
There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent...
The NIF Registry developed and maintained by the Neuroscience Information Framework is a cooperative project aimed at cataloging research resources, e.g., software tools, databases and tissue banks, funded largely by governments and available as tools to research scientists. Although originally conceived for neuroscience, the NIF Registry has over...
Annotated data for redirect classification.
An XML file listing resource redirection candidates annotated with good/bad labels used to train/test redirect detection classifiers. For each of the 178 resource candidates, the HTML-stripped text of its about or home page, its registry NIF ID, and its URL are provided along with its label.
(XML)
Annotated data for resource candidate interactive learning scheme testing.
An XML file listing resource candidates detected by the system in March 2014 and labeled by an annotator as good or bad. For each resource candidate, its Textpresso score, URL, and description are provided alongside its label.
(XML)
Annotated data set for ModelDB mentions from publisher and NIF searches.
An XML file containing mentions of the resource ModelDB, detected by Springer, Nature, and NIF search API searches for the resource’s name, synonyms, and URL, and annotated with a good/bad label used for active learning experiments. Besides the label, each entry includes a title,...
Screenshot of the RDW resource candidates view.
A page of curated resource candidates is shown.
(PNG)
Data for Fig 10.
Comma-separated file listing ModelDB mentions found by RDW compared to a human-curated list of ModelDB mentions, used to generate Fig 10.
(CSV)
Screenshot of the RDW resource co-occurrence heat map.
The heat map shows the 50 most frequently co-occurring resource mentions in the RDW database as of July 2015. The darker a cell, the more co-occurring mentions it represents.
(PNG)
Registry content.
Tab-separated file listing SciCrunch registry resources and curated attributes.
(CSV)
Most mentioned new URLs as resource candidates.
A text file listing new resource candidate URLs grouped by host with 100+ mentions in open access papers together with their closest aligned Registry resource and similarity score.
(TXT)
Annotated data set for resource NER.
An XML file listing sentence fragments with their corresponding parse trees for the methods sections from four Elsevier journals (Neuroimage, Brain Research, Neurobiology of Disease and Journal of Neuroscience Methods) and resource name entities for them.
(XML)
The NIDDK Information Network (dkNET; http://dknet.org) was launched to serve the needs of basic and clinical investigators in metabolic, digestive and kidney disease by facilitating access to research resources that advance the mission of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). By research resources, we mean t...
This paper describes how DISCO, the data aggregator that supports the Neuroscience Information Framework (NIF), has been extended to play a central role in automating the complex workflow required to support and coordinate the NIF's data integration capabilities. The NIF is an NIH Neuroscience Blueprint initiative designed to help researchers acces...
The NIF system is a semantic search engine that uses an ontology to improve search quality. In this experience paper we present SKEYQL, our semantic keyword query language and describe a number of ontology-based query reformulation strategies that go beyond standard query expansion techniques. We also present a set of lessons learnt and strategies...
The NIF Registry is available to download in a couple of ways. The version attached to this paper is a snapshot, and we recommend that you use an up-to-date version.
The places to view the updated registry are:
1. The main NIF site
https://neuinfo.org/mynif/search.php?q=*&t=registry&b=0&r=20
*download the registry from here by looking at the "sour...
We report on progress of employing the Kepler workflow engine to prototype “end-to-end” application integration workflows that concern data coming from microscopes deployed at the National Center for Microscopy Imaging Research (NCMIR). This system is built upon the mature code base of the Cell Centered Database (CCDB) and integrated rule-oriented...