Ugis Sarkans

European Molecular Biology Laboratory | EMBL · EMBL Hinxton (EBI)

About

84
Publications
11,658
Reads
11,235
Citations

Publications (84)
Article
Despite the huge impact of data resources in genomics and structural biology, until now there has been no central archive for biological data for all imaging modalities. The BioImage Archive is a new data resource at the European Bioinformatics Institute (EMBL-EBI) designed to fill this gap. In its initial development BioImage Archive accepts bioim...
Preprint
Despite the importance of data resources in genomics and structural biology, until now there has been no central archive for biological data for all imaging modalities. The BioImage Archive is a new data resource at the European Bioinformatics Institute (EMBL-EBI) designed to fill this gap. It accepts bioimaging data associated with publication in...
Article
Full-text available
Bioimaging data have significant potential for reuse, but unlocking this potential requires systematic archiving of data and metadata in public databases. We propose draft metadata guidelines to begin addressing the needs of diverse communities within light and electron microscopy. We hope this publication and the proposed Recommended Metadata for...
Article
Imaging technologies are used throughout the life and biomedical sciences to understand mechanisms in biology and diagnosis and therapy in animal and human medicine. We present criteria for globally applicable guidelines for open image data tools and resources for the rapidly developing fields of biological and biomedical imaging.
Article
Full-text available
This protocol illustrates the steps necessary to deposit correlated 3D cryo-imaging data from cryo-structured illumination microscopy and cryo-soft X-ray tomography with the BioStudies and EMPIAR deposition databases of the European Bioinformatics Institute. There is currently a real need for a robust method of data deposition to ensure unhindered...
Article
Full-text available
ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional genomics data at EMBL-EBI. It was established in 2002, initially as an archive for publication-related microarray data, and was later extended to accept sequencing-based data. Over the last decade an increasing share of biological experiments involve multiple technologies assayin...
Preprint
Full-text available
Biological and biomedical imaging datasets record the constitution, architecture and dynamics of living organisms across several orders of magnitude of space and time. Imaging technologies are now used throughout the life and biomedical sciences to achieve discovery and understanding of biological mechanisms in the basic sciences as well as assessm...
Article
Full-text available
Uncovering cellular responses from heterogeneous genomic data is crucial for molecular medicine, in particular for drug safety. This can be realized by integrating the molecular activities in networks of interacting proteins. As proof-of-concept we challenge network modeling with time-resolved proteome, transcriptome and methylome measurements in iP...
Article
Full-text available
This paper was originally published under standard Nature America Inc. copyright. As of the date of this correction, the Resource is available online as an open-access paper with a CC-BY license. No other part of the paper has been changed.
Article
Full-text available
ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional genomics data from a variety of technologies assaying functional modalities of a genome, such as gene expression or promoter occupancy. The number of experiments based on sequencing technologies, in particular RNA-seq experiments, has been increasing over the last few yea...
Article
BioStudies (www.ebi.ac.uk/biostudies) is a new public database that organizes data from biological studies. Typically, but not exclusively, a study is associated with a publication. BioStudies offers a simple way to describe the study structure, and provides flexible data deposition tools and data access interfaces. The actual data can be stored...
Poster
Full-text available
Expression Atlas (https://www.ebi.ac.uk/gxa) is a database and web-service at EMBL-EBI that curates, re-analyses and displays gene expression data across species and biological conditions such as different tissues, cell types, developmental stages and diseases among others. Currently, it provides gene expression results on more than 3,000 experimen...
Article
Full-text available
Access to primary research data is vital for the advancement of science. To extend the data types supported by community repositories, we built a prototype Image Data Resource (IDR). IDR links data from several imaging modalities, including high-content screening, multi-dimensional microscopy and digital pathology, with public genetic or chemical d...
Article
Full-text available
Biomedical data are being produced at an unprecedented rate owing to the falling cost of experiments and wider access to genomics, transcriptomics, proteomics and metabolomics platforms [1, 2]. As a result, public deposition of omics data is on the increase. This presents new challenges, including finding ways to store, organize and access different t...
Preprint
Full-text available
Access to primary research data is vital for the advancement of science. To extend the data types supported by community repositories, we built a prototype Image Data Resource (IDR) that collects and integrates imaging data acquired across many different imaging modalities. IDR links high-content screening, super-resolution microscopy, time-lapse a...
Article
Full-text available
Background Translational researchers need robust IT solutions to access a range of data types, varying from public data sets to pseudonymised patient information with restricted access, provided on a case by case basis. The reason for this complication is that managing access policies to sensitive human data must consider issues of data confidentia...
Article
Full-text available
The BioStudies database is a new EMBL‐EBI resource that holds descriptions of biological studies, links to supporting data in other databases, and archives data files that do not fit in existing public structured archives.
Article
Full-text available
Motivation: The Cellular Phenotype Database (CPD) is a repository for data derived from high-throughput systems microscopy studies. The aims of this resource are: (i) to provide easy access to cellular phenotype and molecular localization data for the broader research community; (ii) to facilitate integration of independent phenotypic studies by me...
Article
Full-text available
The field of toxicogenomics (the application of '-omics' technologies to risk assessment of compound toxicities) has expanded in the last decade, partly driven by new legislation, aimed at reducing animal testing in chemical risk assessment but mainly as a result of a paradigm change in toxicology towards the use and integration of genome wide data...
Article
Full-text available
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42 000 array-...
Article
Full-text available
The BioSamples database at the EBI (http://www.ebi.ac.uk/biosamples) provides an integration point for BioSamples information between technology specific databases at the EBI, projects such as ENCODE and reference collections such as cell lines. The database delivers a unified query interface and API to query sample information across EBI’s databas...
Article
Autism spectrum disorder (ASD) is a complex heterogeneous neurodevelopmental disorder with onset during early childhood and typically a life-long course. The majority of ASD cases stems from complex, 'multiple-hit', oligogenic/polygenic underpinnings involving several loci and possibly gene-environment interactions. These multiple layers of complex...
Article
Full-text available
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and...
Article
Full-text available
Spreadsheet-like tabular formats are ever more popular in the biomedical field as a means for experimental reporting. The problem of converting the graph of an experimental workflow into a table-based representation occurs in many such formats and is not easy to solve. We describe graph2tab, a library that implements methods to realise such a conver...
Article
Full-text available
The BioSample Database (http://www.ebi.ac.uk/biosamples) is a new database at EBI that stores information about biological samples used in molecular experiments, such as sequencing, gene expression or proteomics. The goals of the BioSample Database include: (i) recording and linking of sample information consistently within EBI databases such as E...
Data
Comparison of estimates of effect size of mQTL SNPs for metabolic traits measured on the Biocrates platform. Effect sizes are compared between Illig et al. (estimates are taken from Table 1 of [14]), and the current paper's replication of Illig et al.'s findings. The comparison is made using proportions of total phenotypic variance, because this wa...
Data
Comparison of estimates of effect size of mQTL SNPs on metabolite concentrations measured by 1H NMR. Effect sizes are compared between the discovery stage (MolTWIN cohort) and the replication stage (MolOBB cohort). The MolTWIN estimates and credible intervals are as shown in Figure 4. The MolOBB estimates had to be calculated differently to the Mol...
Data
Plot of spectra for which mQTL-driven metabolites, labelled top, are determined as missing. Missing-peak spectra are plotted in black. For comparison, an arbitrarily selected set of 25 present-peak spectra is plotted in grey. Vertical green lines delimit the corresponding peak's bin (Materials and Methods). (TIF)
Data
Distribution of the ratio of TMAOu concentration to the combined concentration of TMAOu and TMAu (includes both MolTWIN and MolOBB cohorts). Trimethylaminuria controls have relatively high values of TMAOu/(TMAOu + TMAu), typically greater than 0.8 [50], whilst values for cases are considerably lower (the two cases examined in [50] have values 0.11...
Data
Peaks in urine 1H NMR spectra that are driven by mQTL variation. In the bottom panel we plotted 50 spectra over a subset of the ppm axis (note that, conventionally, the ppm axis is plotted increasing from right to left). The top panels are zoomed-in views of peaks from the three mQTL-driven urine metabolites of the current paper. The vertical scale...
Data
Details of statistical association between each mQTL-driven metabolite and the SNPs within 200 kb of its hit region. Genomic locations are given in NCBI build 37 coordinates. Columns labelled ‘Beta,’ ‘S.E. Beta’ (S.E. = standard error) and ‘p-value’ (for the test of the null hypothesis that Beta = 0) give details of the fit of the non-Bayesian mixed-effect...
Data
Hit region for BAIBu. Top: location of genes, with rectangles denoting the position of exons. Middle: log-transformed p-values for the test of association of the metabolite's concentration with each SNP in the region. Bottom: LD between each pair of SNPs in the region, with the colour scale superimposed. (TIF)
Data
Hit region for DMAp. Top: location of genes, with rectangles denoting the position of exons. Middle: log-transformed p-values for the test of association of the metabolite's concentration with each SNP in the region. Bottom: LD between each pair of SNPs in the region, with the colour scale superimposed. (TIF)
Data
Previously discovered eQTLs within 200 kb of mQTL hit regions. (DOC)
Data
(A) Non-synonymous SNPs in LD with mQTL SNPs. (B) Corresponding residue changes and predicted functional effects of non-synonymous SNPs. (DOC)
Article
Full-text available
We have performed a metabolite quantitative trait locus (mQTL) study of the ¹H nuclear magnetic resonance spectroscopy (¹H NMR) metabolome in humans, building on recent targeted knowledge of genetic drivers of metabolic regulation. Urine and plasma samples were collected from two cohorts of individuals of European descent, with one cohort compr...
Article
Full-text available
¹H Nuclear Magnetic Resonance spectroscopy (¹H NMR) is increasingly used to measure metabolite concentrations in sets of biological samples for top-down systems biology and molecular epidemiology. For such purposes, knowledge of the sources of human variation in metabolite concentrations is valuable, but currently sparse. We conducted and analysed a...
Article
Full-text available
The ArrayExpress Archive (http://www.ebi.ac.uk/arrayexpress) is one of the three international public repositories of functional genomics data supporting publications. It includes data generated by sequencing or array-based technologies. Data are submitted by users and imported directly from the NCBI Gene Expression Omnibus. The ArrayExpress Archiv...
Article
Full-text available
SIMBioMS is a web-based open source software system for managing data and information in biomedical studies. It provides a solution for the collection, storage, management and retrieval of information about research subjects and biomedical samples, as well as experimental data obtained using a range of high-throughput technologies, including gene e...
Article
ArrayExpress http://www.ebi.ac.uk/arrayexpress consists of three components: the ArrayExpress Repository--a public archive of functional genomics experiments and supporting data, the ArrayExpress Warehouse--a database of gene expression profiles and other bio-measurements and the ArrayExpress Atlas--a new summary database and meta-analytical tool...
Article
Full-text available
ArrayExpress http://www.ebi.ac.uk/arrayexpress consists of three components: the ArrayExpress Repository—a public archive of functional genomics experiments and supporting data, the ArrayExpress Warehouse—a database of gene expression profiles and other bio-measurements and the ArrayExpress Atlas—a new summary database and meta-analytical tool of r...
Article
ArrayExpress at the European Bioinformatics Institute is a public database for MIAME-compliant microarray and transcriptomics data. It consists of two parts: the ArrayExpress Repository, which is a public archive of microarray data, and the ArrayExpress Warehouse of Gene Expression Profiles, which contains additionally curated subsets of data from...
Chapter
Gene expression databases store information about the absolute or relative abundance of gene transcription products in various biological samples, such as cells from a particular tissue in a particular organism, or a particular cell line. These databases allow one to access, select, retrieve and combine for analysis gene expression datasets generat...
Article
The Functional Genomics Experiment data model (FuGE) has been developed to facilitate convergence of data standards for high-throughput, comprehensive analyses in biology. FuGE models the components of an experimental activity that are common across different technologies, including protocols, samples and data. FuGE provides a foundation for descri...
Data
Standalone Person Management Tool (PMT). PMT is intended for registering confidential information about the research subjects from whom samples have been taken. .zip file contains Windows executable, .xml database file and installation description in a README file.
Data
Sample management database. .zip contains sql version of the database, documentation and the files necessary for the installation of the system.
Article
Full-text available
One of the crucial aspects of day-to-day laboratory information management is collection, storage and retrieval of information about research subjects and biomedical samples. An efficient link between sample data and experiment results is absolutely imperative for a successful outcome of a biomedical study. Currently available software solutions ar...
Article
Full-text available
ArrayExpress is a public database for high throughput functional genomics data. ArrayExpress consists of two parts—the ArrayExpress Repository, which is a MIAME supportive public archive of microarray data, and the ArrayExpress Data Warehouse, which is a database of gene expression profiles selected from the repository and consistently re-annotated...
Chapter
ArrayExpress is a public repository for microarray data developed and maintained at the European Bioinformatics Institute. ArrayExpress contains data from over 3400 hybridizations from ten different organisms, including human and all major model organisms. Minimum information about a microarray experiment (MIAME) specifies the minimum information t...
Article
Full-text available
High-throughput technologies are generating large amounts of complex data that have to be stored in databases, communicated to various data analysis tools and interpreted by scientists. Data representation and communication standards are needed to implement these steps efficiently. Here we give a classification of various standards related to syste...
Chapter
Expression Profiler (EP, http://ep.ebi.ac.uk/) is a set of tools for the analysis and interpretation of gene expression and other functional genomics data. These tools perform expression data clustering, visualization, and analysis, integration of expression data with protein interaction data and functional annotations, such as GeneOntology, and th...
Article
Full-text available
In this article, we analyze the current state of application of ontologies in biology, try to reveal the reasons for the existing difficulties, recommend a possible solution to the current problems and describe prospects and future challenges for the application of ontological engineering in biological domains. We focus on the example of the ontolo...
Article
Full-text available
Sharing of microarray data within the research community has been greatly facilitated by the development of the disclosure and communication standards MIAME and MAGE-ML by the MGED Society. However, the complexity of the MAGE-ML format has made its use impractical for laboratories lacking dedicated bioinformatics support. We propose a simple tab-de...
Article
ArrayExpress is a public resource for microarray data that has two major goals: to serve as an archive providing access to microarray data supporting publications and to build a knowledge base of gene expression profiles. ArrayExpress consists of two tightly integrated databases: ArrayExpress repository, which is an archive, and ArrayExpress data w...
Article
Full-text available
ArrayExpress is a public microarray repository founded on the Minimum Information About a Microarray Experiment (MIAME) principles that stores MIAME-compliant gene expression data. Plant-based data sets represent approximately one-quarter of the experiments in ArrayExpress. The majority are based on Arabidopsis (Arabidopsis thaliana); however, ther...
Article
Full-text available
The lack of microarray data management systems and databases is still one of the major problems faced by many life sciences laboratories. While developing the public repository for microarray data ArrayExpress we had to find novel solutions to many non-trivial software engineering problems. Our experience will be both relevant and useful for most b...
Article
Full-text available
ArrayExpress is a public repository for microarray data that supports the MIAME (Minimum Information About a Microarray Experiment) requirements and stores well-annotated raw and normalized data. As of November 2004, ArrayExpress contains data from ∼12 000 hybridizations covering 35 species. Data can be submitted online or directly from local data...
Article
Full-text available
Expression Profiler (EP, http://www.ebi.ac.uk/expressionprofiler) is a web-based platform for microarray gene expression and other functional genomics-related data analysis. The new architecture, Expression Profiler: next generation (EP:NG), modularizes the original design and allows individual analysis-task-related components to be developed by di...