About
199
Publications
23,951
Reads
5,431
Citations
Publications (199)
The Integrated Database of Small Molecules (IDSM) integrates data from small-molecule datasets, making them accessible through the SPARQL query language. Its unique feature is the ability to search for compounds through SPARQL based on their molecular structure. We extended IDSM to enable mass spectra databases to be integrated and searched for bas...
Ameloblastin is a protein involved in the biomineralization of tooth enamel. However, recent results indicate that this is probably not its only role in an organism. Enamel matrix formation is a complex process enabled by specific crosslinking of two proteins: the most abundant, amelogenin, and ameloblastin (AMBN). The human AMBN (hAMBN) gene possesse...
Background
The specific recognition of a DNA locus by a given transcription factor is a widely studied issue. It is generally agreed that the recognition can be influenced not only by the binding motif but by the larger context of the binding site. In this work, we present a novel heuristic algorithm that can reconstruct the unique binding sites ca...
Certain peptide sequences, some of them as short as amino acid triplets, are significantly overpopulated in specific secondary structure motifs in folded protein structures. For example, 74% of the EAM triplet is found in α-helices, and only 3% occurs in the extended parts of proteins (typically β-sheets). In contrast, other triplets (such as VIV a...
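The triplet statistics described above can be illustrated with a minimal sketch. It assumes a per-residue secondary-structure string aligned to the sequence (H = helix, E = extended, C = coil) and counts a triplet toward a state only when all three residues share it; both the function name and this counting rule are illustrative assumptions, not the authors' protocol, and the input is toy data.

```python
from collections import Counter, defaultdict


def triplet_structure_fractions(sequence, structure):
    """Return {triplet: {state: fraction}} for aligned sequence/structure strings."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    counts = defaultdict(Counter)
    for i in range(len(sequence) - 2):
        triplet = sequence[i:i + 3]
        states = structure[i:i + 3]
        # Assumption: a triplet belongs to a state only if all three residues share it.
        if states[0] == states[1] == states[2]:
            counts[triplet][states[0]] += 1
        else:
            counts[triplet]["mixed"] += 1
    return {
        triplet: {state: n / sum(c.values()) for state, n in c.items()}
        for triplet, c in counts.items()
    }


# Toy input, not real data: H = helix, E = extended, C = coil.
seq = "AEAMKEAMGVIVT"
sst = "HHHHHHHHCEEEE"
fractions = triplet_structure_fractions(seq, sst)
```

Applied to a large set of folded structures, the per-triplet fractions would correspond to figures such as the share of EAM triplets found in α-helices.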
Current biological and chemical research is increasingly dependent on the reusability of previously acquired data, which typically come from various sources. Consequently, there is a growing need for database systems and databases stored in them to be interoperable with each other. One of the possible solutions to address this issue is to use syste...
Welcome to Prague Protein Spring 2023 meeting
Highly specialized enamel matrix proteins (EMPs) are predominantly expressed in odontogenic tissues and diverged from a common ancestral gene. They are crucial for the maturation of enamel and its extreme complexity in multiple independent lineages. However, divergence of EMPs occurred already before the true enamel evolved and their conservancy in to...
Protein tunnels play an essential role in transporting small molecules into the active sites of enzymes. Tunnels' geometrical and physico-chemical properties influence the transport process. The tunnels are attractive hot spots for protein engineering and drug development. However, studying the ligand binding and unbinding using experimental techni...
Computer simulations of biomolecules such as molecular dynamics often suffer from insufficient sampling. Due to limited computational resources, insufficient sampling prevents obtaining proper equilibrium distributions of observed properties. To deal with this problem, we proposed a simulation protocol for efficient resampling of collected off-equi...
Proteins are naturally formed by domains edging their functional and structural properties. A domain taken out of the context of an entire protein can retain its structure and, to some extent, its function. These properties rationalize the construction of artificial multi-domain fusion proteins with a unique combination of various functions. Informat...
The earliest proteins had to rely on amino acids available on early Earth before the biosynthetic pathways for more complex amino acids evolved. In extant proteins, a significant fraction of the ‘late’ amino acids (such as Arg, Lys, His, Cys, Trp and Tyr) belong to essential catalytic and structure-stabilizing residues. How (or if) early proteins c...
Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges for data access, both within the discipline and for...
Dimensionality reduction methods have found vast application as visualization tools in diverse areas of science. Although many different methods exist, their performance is often insufficient for providing quick insight into many contemporary datasets, and the unsupervised mode of use prevents the users from utilizing the methods for dataset explor...
With the recent explosion of information, Natural Products (NP) research critically needs efficient ways to access and share knowledge, and to keep precious knowledge from being lost [1]. The reporting and sharing of NP occurrences in biological organisms are relevant to numerous scientific fields ranging from drug discovery to chemical ecology or chem...
Transient receptor potential melastatin 7 (TRPM7) is a melastatin TRP channel with two significant functions, cation permeability and kinase activity. TRPM7 is widely expressed among tissues and is therefore involved in a variety of cellular functions, mainly Mg²⁺ homeostasis, cellular Ca²⁺ flickering, and the regulation of DNA tr...
Natural proteins represent numerous but tiny structure/function islands in a vast ocean of possible protein sequences not challenged by biological evolution and are yet to be explored by research. Recent studies have suggested this uncharted sequence space endows a surprisingly high structural propensity but understanding of this phenomenon has bee...
Most of the structural proteins known today are composed of domains that carry their own functions while keeping their structural properties. It is supposed that such domains, when taken out of the context of the whole protein, can retain their original structure and function to a certain extent. Information on the specific functional and structura...
Interactions among amino acid residues are the principal contributor to the stability of the three-dimensional structure of a protein. The Amino Acid Interactions (INTAA) web server (https://bioinfo.uochb.cas.cz/INTAA/) has established itself as a unique computational resource, which enables users to calculate the contribution of individual residue...
The Resource Description Framework (RDF), together with well-defined ontologies, significantly increases data interoperability and usability. The SPARQL query language was introduced to retrieve requested RDF data and to explore links between them. Among other useful features, SPARQL supports federated queries that combine multiple independent data...
The wide variety of protein structures and functions results from the diverse properties of the 20 canonical amino acids. The generally accepted hypothesis is that early protein evolution was associated with enrichment of a primordial alphabet, thereby enabling increased protein catalytic efficiencies and functional diversification. Aromatic amino...
Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges to data access, either within the discipline or to...
Constantly increasing attention to bioengineered proteins has led to the rapid development of new functional targets. Here we present the biophysical and functional characteristics of the newly designed CaM/AMBN-Ct fusion protein. The two-domain artificial target consists of calmodulin (CaM) and ameloblastin C-terminus (AMBN-Ct). CaM as a well-char...
Ameloblastin (Ambn), an intrinsically disordered protein (IDP), plays an important role in the formation of enamel, the hardest biomineralized tissue commonly formed in vertebrates. Human ameloblastin (AMBN) is expressed in two isoforms: full-length isoform I (AMBN ISO I) and isoform II (AMBN ISO II), which is about 15 amino acid residues...
Background
The amount of data generated in large clinical and phenotyping studies that use single-cell cytometry is constantly growing. Recent technological advances allow the easy generation of data with hundreds of millions of single-cell data points with >40 parameters, originating from thousands of individual samples. The analysis of that amoun...
It is well-known that the large diversity of protein functions and structures is derived from the broad spectrum of physicochemical properties of the 20 canonical amino acids. According to the generally accepted hypothesis, protein evolution was continuously associated with enrichment of this alphabet, increasing stability, specificity and spectrum...
Background: The amount of data generated in large clinical and phenotyping studies that use single-cell cytometry is constantly growing. Recent technological advances allow the easy generation of data with hundreds of millions of single-cell data points with more than 40 parameters, originating from thousands of individual samples. The analysis of that...
Molecular determinants of the binding of various endogenous modulators to transient receptor potential (TRP) channels are crucial for the understanding of necessary cellular pathways, as well as new paths for rational drug designs. The aim of this study was to characterise interactions between the TRP cation channel subfamily melastatin member 4 (T...
For processing massive flow and mass cytometry datasets, we have implemented a modern, extremely scalable and flexible version of the popular FlowSOM tool, called GigaSOM.jl. GigaSOM.jl simplifies the use of commonly available compute infrastructure, and allows almost-interactive processing of datasets that comprise thousands of individual samples...
Interaction with the DNA minor groove is a significant contributor to specific sequence recognition in selected families of DNA-binding proteins. Based on a statistical analysis of 3D structures of protein–DNA complexes, we propose that distortion of the DNA minor groove resulting from interactions with hydrophobic amino acid residues is a universa...
EmbedSOM is a simple and fast dimensionality reduction algorithm, originally developed for its applications in single-cell cytometry data analysis. We present an updated version of EmbedSOM, viewed as an algorithm for landmark-directed embedding enrichment, and demonstrate that it works well even with manifold-learning techniques other than the sel...
Estimation of binding free energies is one of the central aims of simulations of biomolecular complexes. We explore the accuracy and efficiency of setups based on non-equilibrium pulling simulations applied to estimation of binding affinities of DNA-binding proteins. Absolute binding free energies are calculated over a range of temperatures and com...
It was highlighted that the original article [1] contained an error in the last paragraph of the section ‘Structure search using SPARQL’, specifically in the radius of the used fingerprint. This Correction article shows the incorrect and correct paragraph of this section.
ShinySOM offers a user-friendly interface for reproducible, high-throughput analysis of high-dimensional flow and mass cytometry data guided by self-organizing maps. The software implements a FlowSOM-style workflow, with improvements in performance, visualizations and data dissection possibilities. The outputs of the analysis include precise statis...
EmbedSOM is a simple and fast dimensionality reduction algorithm, originally developed for its applications in single-cell cytometry data analysis. We present an updated version of EmbedSOM, viewed as an algorithm for landmark-based embedding enrichment, and demonstrate that it works well even with manifold-learning techniques other than the self-o...
Bioinformaticians and biologists rely increasingly upon workflows for the flexible utilization of the many life science tools that are needed to optimally convert data into knowledge. We outline a pan-European enterprise to provide a catalogue (https://bio.tools) of tools and databases that can be used in these workflows. bio.tools not onl...
Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are now recognised as major determinants in cellular regulation. This white paper presents a roadmap for future e-infrastructure developments in the field of IDP research within the ELIXIR framework. The goal of these developments is to drive the creation of high-q...
Transient receptor potential (TRP) channels are crucial downstream targets of calcium signalling cascades. They can be modulated by calcium itself and/or by calcium-binding proteins (CBPs). Intracellular messengers usually interact with binding domains present at the most variable TRP regions, the cytoplasmic N- and C-termini. Calmodulin (CaM) i...
Intrinsically disordered proteins (IDPs) represent a distinct class of proteins and are distinguished from globular proteins by conformational plasticity, high evolvability and a broad functional repertoire. Some of their properties are reminiscent of early proteins, but their abundance in eukaryotes, functional properties and compositional bias su...
Motivation:
The existing connections between large databases of chemicals, proteins, metabolites and assays offer valuable resources for research in fields ranging from drug design to metabolomics. Transparent search across multiple databases provides a way to efficiently utilize these resources. To simplify such searches, many databases have adop...
We systematically investigate the applicability of a molecular dynamics-based setup for the calculations of standard binding free energies of biologically relevant protein–DNA complexes. The free energies are extracted from a potential of mean force calculated using umbrella sampling simulations. Two protein–DNA systems derived from a homeodomain...
By combining bioinformatics with quantum-chemical calculations, we attempt to address quantitatively some of the physical principles underlying protein folding. The former allowed us to identify tripeptide sequences in existing protein three-dimensional structures with a strong preference for either helical or extended structure. The selected repre...
Efficient unbiased data analysis is a major challenge for laboratories handling large cytometry datasets. We present EmbedSOM, a non-linear embedding algorithm based on FlowSOM that improves the analyses by providing high-performance visualization of complex single cell distributions within cellular populations and their transition states. The algo...
Objective
Transcriptional regulatory elements in the ameloblastin (AMBN) promoter indicate that adipogenesis may influence its expression. The objective was to investigate whether AMBN is expressed in adipose tissue and has a role during the differentiation of adipocytes.
Design
AMBN expression was examined in adipose tissue and adipocytes by real-t...
Phosphorylation of serine, threonine, and tyrosine is one of the most frequently occurring and crucial post-translational modifications of proteins often associated with important structural and functional changes. We investigated the direct effect of phosphorylation on the intrinsic conformational preferences of amino acids as a potential trigger...
The sesquiterpenoid juvenile hormone (JH) is vital to insect development and reproduction. Intracellular JH receptors have recently been established as basic helix-loop-helix transcription factor (bHLH)-PAS proteins in Drosophila melanogaster known as germ cell-expressed (Gce) and its duplicate paralog, methoprene-tolerant (Met). Upon binding JH, G...
Domains are distinct units within proteins that typically can fold independently into recognizable three-dimensional structures to facilitate their functions. The structural and functional independence of protein domains is reflected by their apparent modularity in the context of multi-domain proteins. In this work, we examined the coupling of evol...
Distribution of the numbers of sequences in the MSAs.
Total number of architectures (MSAs) N = 2,063.
(TIF)
Primary data archive for the studied architectures.
This archive contains a file presenting the list of UniProt identifiers of the URPs sequences included in the multi-domain MSAs for each of the 2,063 studied multi-domain architectures, and files containing the values of MI¯ and nMI¯ for individual domain pairs before and after domain sequence shu...
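As a rough illustration of the kind of quantity stored in such an archive, the sketch below computes the mutual information between two columns of a multiple sequence alignment. It assumes that MI in the caption above stands for mutual information and that the overbar denotes an average over column pairs; the normalization behind nMI is not given in the text and is therefore not reproduced. Function names and the toy columns are illustrative only.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have equal length")
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    # sum over observed pairs of p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
    return sum(
        (c / n) * math.log2(c * n / (pa[a] * pb[b]))
        for (a, b), c in pab.items()
    )


def mean_mutual_information(cols_a, cols_b):
    """Average MI over all column pairs between two domains (assumed meaning of the overbar)."""
    pairs = list(product(cols_a, cols_b))
    return sum(mutual_information(a, b) for a, b in pairs) / len(pairs)


# Toy columns, one character per aligned sequence; not real data.
coupled = mutual_information("AAVV", "LLII")
independent = mutual_information("AAVV", "LILI")
average = mean_mutual_information(["AAVV"], ["LLII", "LILI"])
```

In the toy example, the perfectly co-varying columns carry one bit of mutual information and the independent columns carry none.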
Amino acid residues showing above background levels of conservation are often indicative of functionally significant regions within a protein. Understanding how the sequence conservation profile relates in space requires projection onto a protein structure, a potentially time-consuming process. 3DPatch is a web application that streamlines this tas...
Background
Structure search is one of the valuable capabilities of small-molecule databases. Fingerprint-based screening methods are usually employed to enhance the search performance by reducing the number of calls to the verification procedure. In substructure search, fingerprints are designed to capture important structural aspects of the molecu...
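The screening idea above can be sketched with a toy analogue in which strings stand in for molecular graphs, character 2-grams stand in for structural features, and a substring test stands in for the costly verification procedure (subgraph isomorphism in a real system). All names and the fingerprint design are assumptions made for illustration, not the fingerprints discussed in the paper.

```python
import zlib

N_BITS = 64  # fingerprint length; an arbitrary choice for this sketch


def fingerprint(text):
    """Fold all character 2-grams of a string into an integer bitmask."""
    bits = 0
    for i in range(len(text) - 1):
        bits |= 1 << (zlib.crc32(text[i:i + 2].encode()) % N_BITS)
    return bits


def substructure_search(query, database):
    """Return (hits, number of verification calls) for a screened search."""
    query_fp = fingerprint(query)
    hits, verified = [], 0
    for entry, entry_fp in database:
        # Screen: every bit set for the query must also be set for the entry.
        # An entry failing this test cannot contain the query, so it is skipped.
        if (query_fp & entry_fp) != query_fp:
            continue
        verified += 1
        if query in entry:  # stand-in for the expensive verification step
            hits.append(entry)
    return hits, verified


# Toy database with fingerprints precomputed once, as an index would store them.
entries = ["CCOC(=O)C", "c1ccccc1O", "CCN(CC)CC", "OC(=O)CCl"]
database = [(e, fingerprint(e)) for e in entries]
hits, verified = substructure_search("C(=O)", database)
```

The screen can produce false positives (bit collisions) but never false negatives, which is why a verification step is still required and why better fingerprints reduce the number of verification calls.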
Background
Serine proteases are important virulence factors for many pathogens. Recently, we discovered a group of trypsin-like serine proteases with domain organization unique to flatworm parasites and containing a thrombospondin type 1 repeat (TSR-1). These proteases are recognized as antigens during host infection and may prove useful as anthelm...
Multiple sequence alignment of SmSP2 with orthologs from other platyhelminth parasites.
Trematode sequences: Schistosoma japonicum (GenBank: AAW24683.1), Schistosoma haematobium (XP_012796372.1), Fasciola hepatica (sequence identified in the transcriptome database; Young et al. (2010), Biotechnol Adv 28, 222–231), Opisthorchis viverrini (XP_009167...
Detailed micrograph of SmSP2 localization in the tegument of adult male S. mansoni.
The tissue section was probed with anti-SmSP2 IgG followed by an anti-rabbit IgG Alexa 594-labeled secondary antibody (red). DAPI was used to label the nuclear DNA (blue). The left image shows merged fluorescent channels; on the right, schematic depiction of the adu...
Pre-immune serum is not reactive.
As a negative control, semi-thin sections of adult S. mansoni males and females were probed with a pre-immune serum (A-F) followed by reaction with an anti-rabbit IgG Alexa 647-labeled secondary antibody (red). DAPI was used to label nuclear DNA (blue). The first and third columns show merged fluorescent channels;...
Multiple sequence alignment of the TSR-1 domain of SmSP2 with selected TSR-1 domains of human proteins.
Sequences are: TSP-1-1-3—thrombospondin-1 (TSP) type-1 domains 1, 2 and 3 (Uniprot accession number: P07996), properdin-TSR1-6—properdin thrombospondin type-1 domains 1–6 (P27918), ADAMTS13 (Q76LX8), and spondin-TSP-1—spondin-1 thrombospondin typ...
Multiple sequence alignment of the SmSP2 protease domain with catalytic domains of selected human and bovine S1 family proteases.
Human proteases: mannan-binding lectin serine protease 1 (MASP-1, Uniprot accession number: P48740), tissue plasminogen activator (tPA, P00750), urokinase plasminogen activator (uPA, P00749), plasmin (P00747), kallikrein...
Binding of native SmSP2 to a Ni²⁺-ion affinity column.
A protein extract of adult schistosomes (Extract) was applied to a HiTrap IMAC FF column containing immobilized Ni²⁺ ions, and native SmSP2 was eluted using 0.5 M imidazole. The extract, unbound material (FT) and eluted material (Elution) were resolved by SDS-PAGE, electrophoretically transferred on...
We tested the role of substituents at the C3' and C3'N positions of the taxane molecule to identify taxane derivatives capable of overcoming acquired resistance to paclitaxel. Paclitaxel-resistant sublines SK-BR-3/PacR and MCF-7/PacR as well as the original paclitaxel-sensitive breast cancer cell lines SK-BR-3 and MCF-7 were used for testing. Incre...
Pepsin-family aspartic peptidases are biosynthesized as inactive zymogens in which the propeptide blocks the active site until its proteolytic removal upon enzyme activation. Here, we describe a novel dual regulatory function for the propeptide using a set of crystal structures of the parasite cathepsin D IrCD1. In the IrCD1 zymogen, intramolecular...
The transient receptor potential channel of melastatin 4 (TRPM4) belongs to a group of large ion receptors that are involved in countless cell signalling cascades. This unique member is ubiquitously expressed in many human tissues, especially in cardiomyocytes, where it plays an important role in cardiovascular processes. Transient receptor potenti...
The protein sequences found in nature represent a tiny fraction of the potential sequences that could be constructed from the 20-amino-acid alphabet. To help define the properties that shaped proteins to stand out from the space of possible alternatives, we conducted a systematic computational and experimental exploration of random (unevolved) s...
Ameloblastin (AMBN), an important component of the self-assembled enamel extracellular matrix, contains several in silico predicted phosphorylation sites. However, to what extent these sites are actually phosphorylated and the possible effects of such post-translational modifications are still largely unknown. Here we report on in vitro experiment...
Human DHRS7 (SDR34C1) is one of the insufficiently described enzymes of the short-chain dehydrogenase/reductase superfamily. The members of this superfamily often play an important (patho)physiological role in the human body, participating in the metabolism of diverse substrates (e.g. retinoids, steroids, xenobiotics). A systematic approach to the identi...
Computational approaches have been major drivers behind the progress of proteomics in recent years. The aim of this white paper is to provide a framework for integrating computational proteomics into ELIXIR in the near future, and thus to broaden the portfolio of omics technologies supported by this European distributed infrastructure. This white p...
Large biomolecules (proteins and nucleic acids) are composed of building blocks which define their identity, properties and binding capabilities. In order to shed light on the energetic side of interactions of amino acids between themselves and with deoxyribonucleotides, we present the Amino Acid Interaction web server (http://bioinfo.uochb.cas.cz/IN...