Tim Clark

Harvard Medical School | HMS · Department of Neurology

PhD in Computer Science

About

105
Publications
47,010
Reads
17,287
Citations
Additional affiliations
June 2003 - April 2004
self-employed
Position
  • Consultant
Description
  • Consulted for the Harvard Office of the CIO and miscellaneous startups.
May 2004 - present
Massachusetts General Hospital
Position
  • Computer Scientist; Director, Biomedical Informatics Core

Publications (105)
Article
Full-text available
Software is as integral as a research paper, monograph, or dataset in terms of facilitating the full understanding and dissemination of research. This article provides broadly applicable guidance on software citation for the communities and institutions publishing academic journals and conference proceedings. We expect those communities and institu...
Article
This special issue is intended to inform the scientific computing community about recent advances and the current state of the art in software and data citation. Initial work has been done elsewhere to define standards and principles for software and data citation, and the basic required infrastructure is now in place. The challenge now is to adopt...
Article
One of the key goals of the FAIR guiding principles is defined by its final principle – to optimize data sets for reuse by both humans and machines. To do so, data providers need to implement and support consistent machine readable metadata to describe their data sets. This can seem like a daunting task for data providers, whether it is determining...
Article
The FAIR principles describe characteristics intended to support access to and reuse of digital artifacts in the scientific research ecosystem. Persistent, globally unique identifiers, resolvable on the Web, and associated with a set of additional descriptive metadata, are foundational to FAIR data. Here we describe some basic principles and exempl...
Preprint
Full-text available
The main output of the FORCE11 Software Citation working group (https://www.force11.org/group/software-citation-working-group) was a paper on software citation principles (https://doi.org/10.7717/peerj-cs.86) published in September 2016. This paper laid out a set of six high-level principles for software citation (importance, credit and attribution...
Article
The main output of the FORCE11 Software Citation working group (https://www.force11.org/group/software-citation-working-group) was a paper on software citation principles (https://doi.org/10.7717/peerj-cs.86) published in September 2016. This paper laid out a set of six high-level principles for software citation (importance, credit and attribution...
Article
Full-text available
This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles (Data Citation Synthesis Group, 2014), a synopsis and harmonization of the recommendations of major science policy bodies. The roadmap was developed by the Repositories Early Adopt...
Article
Full-text available
This article presents a practical roadmap for scholarly publishers to implement data citation in accordance with the Joint Declaration of Data Citation Principles (JDDCP), a synopsis and harmonization of the recommendations of major science policy bodies. It was developed by the Publishers Early Adopters Expert Group as part of the Data Citation Im...
Article
Full-text available
Most biomedical data repositories issue locally unique accession numbers, but do not provide globally unique, machine-resolvable, persistent identifiers for their datasets, as required by publishers wishing to implement data citation in accordance with widely accepted principles. Local accessions may, however, be prefixed with a namespace identifier...
Preprint
Full-text available
This article presents a practical roadmap for scholarly publishers to implement data citation in accordance with the Joint Declaration of Data Citation Principles (JDDCP), a synopsis and harmonization of the recommendations of major science policy bodies. It was developed by the Publishers Early Adopters Expert Group as part of the Data Citation Im...
Preprint
Full-text available
Most biomedical data repositories issue locally unique accession numbers, but do not provide globally unique, machine-resolvable, persistent identifiers for their datasets, as required by publishers wishing to implement data citation in accordance with widely accepted principles. Local accessions may, however, be prefixed with a namespace identifier...
Article
Full-text available
Significance: This work provides evidence that the protein tau induces changes in blood vessels distinct from the effects of amyloid beta on vasculature and indicates a previously unknown pathway by which pathological tau may accelerate cognitive decline in Alzheimer’s disease.
Article
Full-text available
Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity...
Preprint
Full-text available
This document summarizes a series of authoritative views on research data management, sharing and citation, developed in Expert Groups, Working Groups, and other activities organized through FORCE11 (http://force11.org), an international community of over 2,000 members dedicated to advancing research communications and e-scholarship. It was prepare...
Preprint
Full-text available
Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity...
Preprint
Full-text available
Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, or EUDat). These data have widely different levels of sensitivity and securi...
Article
Full-text available
Identifying accurate biomarkers of cognitive decline is essential for advancing early diagnosis and prevention therapies in Alzheimer's disease. The Alzheimer's disease DREAM Challenge was designed as a computational crowdsourced project to benchmark the current state-of-the-art in predicting cognitive outcomes in Alzheimer's disease based on high...
Article
Full-text available
There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent...
Article
Full-text available
Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, or EUDat). These data have widely different levels of sensitivity and securi...
Conference Paper
Full-text available
Potential drug-drug interactions (PDDI) are a significant source of preventable drug-related harm. One contributing factor is that there is no standard way to represent PDDI knowledge claims and associated evidence in a computable form. The research we present in this paper addresses this problem by creating a new version of the Drug Interaction Kn...
Article
Full-text available
This report documents the program and the outcomes of Dagstuhl Perspectives Workshop 11331 "The Future of Research Communication". The purpose of the workshop was to bring together researchers from these different disciplines, whose core research goal is changing the formats, standards, and means by which we communicate science.
Conference Paper
Full-text available
Two complementary models for biomedical literature-data integration are presented: entity-based and argument-based. We believe the argument-based model is a novel application in this domain and can be exceptionally useful in providing better support than currently exists for robust and reproducible science. We describe both approaches, along with s...
Article
Full-text available
Reproducibility and reusability of research results is an important concern in scientific communication and science policy. A foundational element of reproducibility and reusability is the open and persistently available presentation of research data. However, many common approaches for primary data publication in use today do not achieve sufficien...
Preprint
Full-text available
Reproducibility and reusability of research results is an important concern in scientific communication and science policy. A foundational element of reproducibility and reusability is the open and persistently available presentation of research data. However, many common approaches for primary data publication in use today do not achieve sufficien...
Preprint
Reproducibility and reusability of research results is an important concern in scientific communication and science policy. A foundational element of reproducibility and reusability is the open and persistently available presentation of research data. However, many common approaches for primary data publication in use today do not achieve sufficien...
Preprint
Full-text available
This short article provides operational guidance on implementing scholarly data citation and data deposition, in conformance with the Joint Declaration of Data Citation Principles (JDDCP, http://force11.org/datacitation) to help achieve widespread, uniform human and machine accessibility of deposited data. The JDDCP is the outcome of a cross-domain...
Article
In this special issue we present a series of articles on application of web semantics to problems in eLifeScience – the digital conduct and communication of research in biology and biomedicine. Life science research presents a number of challenges and opportunities to web semantics and web science. This special issue shows some prime examples of ho...
Article
Full-text available
Semantic web technologies can support the rapid and transparent validation of scientific claims by interconnecting the assumptions and evidence used to support or challenge assertions. One important application domain is medication safety, where more efficient acquisition, representation, and synthesis of evidence about potential drug-drug intera...
Article
Full-text available
Background: With the advent of inexpensive assay technologies, there has been an unprecedented growth in genomics data as well as the number of databases in which it is stored. In these databases, sample annotation using ontologies and controlled vocabularies is becoming more common. However, the annotation is rarely available as Linked Data, in a m...
Article
Full-text available
Background: Social media has the potential to accelerate the pace of biomedical research through online collaboration, discussions, and faster sharing of information. Focused web-based scientific social collaboratories such as the Alzheimer Research Forum have been successful in engaging scientists in open discussions of the latest research and ide...
Conference Paper
Full-text available
We would like to present eXframe, a software platform for developing Semantic Web genomics repositories. eXframe is implemented using Drupal 7, an open-source PHP/MySQL based content management system. eXframe provides a user-friendly interface for researchers to enter the information about their experiments and share these with their colleagu...
Conference Paper
Annotopia is an open source, open services platform for creating, managing, manipulating and sharing open annotation using the W3C Open Annotation Data Model. It can create and/or manage annotation of HTML, PDF, and other resources including data and ontology concepts, with text, semantic tags, and other annotation types. It supports fine-grained p...
Article
Full-text available
Scholars have made handwritten notes and comments in books and manuscripts for centuries. Today's blogs and news sites typically invite users to express their opinions on the published content; URLs allow web resources to be shared with accompanying annotations and comments using third-party services like Twitter or Facebook. These contributions ha...
Article
Full-text available
Provenance is a critical ingredient for establishing trust of published scientific content. This is true whether we are considering a data set, a computational workflow, a peer-reviewed publication or a simple scientific claim with supportive evidence. Existing vocabularies such as DC Terms and the W3C PROV-O are domain-independent and general-purp...
Article
Background: Huntington's disease (HD) is a neurodegenerative disorder with selective vulnerability of striatal neurons and involves extensive transcriptional dysregulation early in the disease process. Previous work in cell and mouse models has shown that histone modifications are altered in HD. Specifically, monoubiquitylated histone H2A (uH2A) i...
Article
Full-text available
"Once upon a time, several engineers, biologists and clinicians realized that a lot of information in biomedicine was partitioned into 'silos' that do not intercommunicate. These silos were a side effect of the existence of different disciplines required to, for example, develop new drugs. The engineers decided to dispose of the silos, and to put t...
Article
Full-text available
Scientific publications are documentary representations of defeasible arguments, supported by data and repeatable methods. They are the essential mediating artifacts in the ecosystem of scientific communications. The institutional "goal" of science is publishing results. The linear document publication format, dating from 1665, has survived transit...
Data
The NIF Registry is available to download in a couple of ways. The version attached to this paper is a snapshot and we recommend that you use an up-to-date version. The places to view the updated registry are: 1. The main NIF site https://neuinfo.org/mynif/search.php?q=*&t=registry&b=0&r=20 *download the registry from here by looking at the "sour...
Article
Full-text available
Most literature searching in biomedicine is now conducted via PubMed, Google Scholar or other web-based bibliographic search mechanisms. Yet until now a public, open, interoperable and complete web-adapted information schema for bibliographic citations, bibliographic references and scientific discourse has not been available. Such a schema, express...
Article
Full-text available
In Huntington's disease (HD; MIM ID #143100), a fatal neurodegenerative disorder, transcriptional dysregulation is a key pathogenic feature. Histone modifications are altered in multiple cellular and animal models of HD suggesting a potential mechanism for the observed changes in transcriptional levels. In particular, previous work has suggested an...
Data
Gene Ontology (GO)-Biological Process (GOTERM_BP_FAT) Functional Annotation Clustering of “Hyperacetylated in TG” genes. (DOCX)
Data
Gene-specific primer sequences for single-gene ChIP confirmation experiments. (DOCX)
Data
Gene Ontology (GO)-Biological Process (GOTERM_BP_FAT) Functional Annotation Clustering of “Not acetylated in TG” genes. (DOCX)
Data
Gene Ontology (GO)-Biological Process (GOTERM_BP_FAT) Functional Annotation Clustering of “Hypoacetylated in TG” genes. (DOCX)
Data
Gene Ontology (GO)-Biological Process (GOTERM_BP_FAT) Functional Annotation Clustering of “Ectopically acetylated in TG” genes. (DOCX)
Data
Gene-specific primer sequences for RT-qPCR analysis. (DOCX)
Article
Full-text available
With the advancement of technology and the wide adoption of ontologies as knowledge representation formats, in the last decade, a handful of models were proposed for the externalization of the rhetoric and argumentation captured within scientific publications. Conceptually, most of these models share a similar representation form of the scientific...
Article
Full-text available
Our group has developed a useful shared software framework for performing, versioning, sharing and viewing Web annotations of a number of kinds, using an open representation model. The Domeo Annotation Tool was developed in tandem with this open model, the Annotation Ontology (AO). Development of both the Annotation Framework and the open model was...
Article
Full-text available
To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open 'data commoning' culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared 'Investigation-St...
Article
Full-text available
The breadth of information resources available to researchers on the Internet continues to expand, particularly in light of recently implemented data-sharing policies required by funding agencies. However, the nature of dense, multifaceted neuroscience data and the design of contemporary search engine systems make efficient, reliable and relevant...
Article
The dissemination of knowledge derived from research and scholarship has a fundamental impact on the ways in which society develops and progresses, and at the same time it feeds back to improve subsequent research and scholarship. Here, as in so many other areas of human activity, the internet is changing the way things work; two decades of emergen...
Data
Genomics Tables. Database schema of the genomics tables
Article
Full-text available
Genome-wide experiments are routinely conducted to measure gene expression, DNA-protein interactions and epigenetic status. Structured metadata for these experiments is imperative for a complete understanding of experimental conditions, to enable consistent data processing and to allow retrieval, comparison, and integration of experimental results....
Article
Background / Purpose: The Pain Research Forum (PRF) is a free, interactive web site dedicated to basic and translational pain research. Modeled on the highly successful Alzheimer Research Forum and similar sites, the PRF is the first virtual community dedicated to chronic and neuropathic pain. Main conclusion: Launched in June 2011, the PRF of...
Article
Full-text available
Background: There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies...
Article
Full-text available
Translational medicine requires the integration of knowledge using heterogeneous data from health care to the life sciences. Here, we describe a collaborative effort to produce a prototype Translational Medicine Knowledge Base (TMKB) capable of answering questions relating to clinical practice and pharmaceutical drug discovery. We developed the Tra...
Article
Full-text available
science publishing, online communities, science policy, new forms of publishing, bioinformatics, digital repositories, semantic publishing, citation
Article
Full-text available
The Translational Medicine Ontology provides terminology that bridges diverse areas of translational medicine including hypothesis management, discovery research, drug development and formulation, clinical research, and clinical practice. Designed primarily from use cases, the ontology consists of essential terms that are mapped to other ontolog...
Article
Full-text available
Collaboration on the Web has great potential to accelerate the pace of scientific communication and the development of collective knowledge-bases. Focused research collaboratories and forums are required to pose open research questions, conduct discourse, identify critical funding gaps in research and accelerate the pace of therapeutic development....