Vince Grolmusz

Eötvös Loránd University · Institute of Mathematics

Ph.D., D.Sc.

About

186
Publications
20,816
Reads
2,029
Citations
Introduction
As a professor of mathematics, I work on bioinformatics problems from a mathematical perspective. Our most recent discoveries: (1) women's braingraphs are much better connected than those of men; (2) we have discovered the phenomenon of Consensus Connectome Dynamics (CCD), which most probably describes the development of cerebral connections between gray matter areas of the human brain; (3) we have mapped the individual variability of the connections in the brain. http://grolmusz.pitgroup.org
Additional affiliations
October 1991 - December 1993
Max Planck Institute for Informatics
Position
  • PostDoc Position
January 1999 - June 1999
University of Chicago
Position
  • Visiting associate professor
September 1986 - June 1987
University of Chicago
Position
  • PhD Student

Publications

Article
Full-text available
The human braingraph or the connectome is the object of intensive research today. The advantage of the graph approach to brain science is that the rich structures, algorithms and definitions of graph theory can be applied to the anatomical networks of the connections of the human brain. In these graphs, the vertices correspond to the small (1-1....
Article
Full-text available
In the last two decades, graph theory has penetrated sociology, molecular biology, genetics, chemistry, computer engineering, and numerous other fields of science. One of its more recent areas of application is the study of the connections of the human brain. With the development of diffusion magnetic resonance imaging (diffusion MRI), it is possible...
Article
Full-text available
In applications of graph theory, it is unusual to consider numerous, pairwise different graphs on the very same set of vertices. In the case of human braingraphs or connectomes, however, this is the standard situation: the nodes correspond to anatomically identified cerebral regions, and two vertices are connected by an edge if a diff...
Article
Full-text available
In our previous study we have shown that the female connectomes have significantly better, deep graph-theoretical parameters, related to superior "connectivity", than the connectome of the males. Since the average female brain is smaller than the average male brain, one cannot rule out that the significant advantages are due to the size- and not to...
Article
Full-text available
We construct a system H of exp(c log^2 n / log log n) subsets of a set of n elements such that the size of each set is divisible by 6 but their pairwise intersections are not divisible by 6. The result generalizes to all non-prime-power moduli m in place of m = 6. This result is in sharp contrast with results of Frankl and Wilson (1981) for prime po...
Preprint
Full-text available
Human braingraphs or connectomes have been widely studied in the last decade to understand the structural and functional properties of our brain. In the last several years our research group has computed and deposited thousands of human braingraphs to the braingraph.org site, by applying public structural (diffusion) MRI data from young and healthy subje...
Article
Full-text available
Polycyclic aromatic hydrocarbons (PAHs) are highly toxic, carcinogenic substances. On soils contaminated with PAHs, crop cultivation, animal husbandry and even the survival of microflora in the soil are greatly perturbed, depending on the degree of contamination. Most microorganisms cannot tolerate PAH-contaminated soils; however, some microbial st...
Article
Full-text available
Mutated genes may lead to cancer development in numerous tissues. While more than 600 cancer-causing genes are known today, some of the most widespread mutations are connected to the RAS gene; RAS mutations are found in approximately 25% of all human tumors. Specifically, KRAS mutations are involved in the three most lethal cancers in the U.S., nam...
Article
Full-text available
We consider the 1015-vertex human consensus connectome computed from the diffusion MRI data of 1064 subjects. We define seven different orders on these 1015 graph vertices, where the orders depend on parameters derived from the brain circuitry, that is, from the properties of the edges (or connections) incident to the vertices ordered. We order the...
Article
Full-text available
Enzymatic processes play an increasing role in synthetic organic chemistry, which requires access to a broad and diverse set of enzymes. Metagenome mining is a valuable and efficient way to discover novel enzymes with unique properties for biotechnological applications. Here, we report the discovery and biocatalytic characterization of six novel...
Article
Full-text available
The LogRank conjecture of Lovász and Saks (1988) is the most famous open problem in communication complexity theory. The statement is as follows: suppose that two players intend to compute a Boolean function f(x,y) when x is known to the first and y to the second player, and they may send and receive messages encoded with bits, then they can comp...
Preprint
Full-text available
The LogRank conjecture of Lovász and Saks from 1988 is the most famous open problem in communication complexity theory. The statement is as follows: Suppose that two players intend to compute a Boolean function f(x, y) when x is known to the first and y to the second player, and they may send and receive messages encoded with bits, then they...
Preprint
Full-text available
Hexapeptides are increasingly applied as model systems for studying the amyloidogenicity properties of oligo- and polypeptides. It is possible to construct 64 million different hexapeptides from the twenty proteinogenic amino acid residues. Today's experimental amyloid databases contain only a fraction of these annotated hexapeptides. For labeling...
Preprint
Full-text available
Polycyclic aromatic hydrocarbons (PAHs) are highly toxic, carcinogenic substances. On soils contaminated with PAHs, crop cultivation, animal husbandry and even the survival of microflora in the soil are greatly perturbed, depending on the degree of contamination. Most microorganisms cannot tolerate PAH-contaminated soils; however, some microbial st...
Preprint
Full-text available
We consider the 1015-vertex human consensus connectome computed from the diffusion MRI data of 1064 subjects. We define seven different orders on these 1015 graph vertices, where the orders depend on parameters derived from the brain circuitry, that is, from the properties of the edges (or connections) incident to the vertices ordered. We order the...
Preprint
Full-text available
Methods from artificial intelligence (AI), in general, and machine learning, in particular, have kept conquering new territories in numerous areas of science. Most of the applications of these techniques are restricted to the classification of large data sets, but new scientific knowledge can seldom be inferred from these tools. Here we show that a...
Article
Full-text available
How the cuticles of the roughly 4.5 million species of ecdysozoan animals are constructed is not well understood. Here, we systematically mine gene expression datasets to uncover the spatiotemporal blueprint for how the chitin-based pharyngeal cuticle of the nematode Caenorhabditis elegans is built. We demonstrate that the blueprint correctly predi...
Article
Determining important vertices in large graphs (e.g., Google’s PageRank in the case of the graph of the World Wide Web) facilitated the construction of excellent web search engines, returning the most important hits corresponding to the submitted user queries. Interestingly, finding important edges – instead of vertices – in large graphs has receiv...
Article
Full-text available
Hexapeptides are widely applied as a model system for studying the amyloid-forming properties of polypeptides, including proteins. Recently, large experimental databases have become publicly available with amyloidogenic labels. Using these data sets for training and testing purposes, one may build artificial intelligence (AI)-based classifiers for...
Preprint
Full-text available
Roughly 4.5 million species of ecdysozoan animals repeatedly shed their old cuticle and construct a new one underneath to accommodate growth. How cuticles are constructed is not well understood. Here, we systematically mine gene expression datasets to uncover the spatiotemporal blueprint for how the chitin-based pharyngeal cuticle of the nematode C...
Preprint
Full-text available
Hexapeptides are widely applied as a model system for studying amyloid-forming properties of polypeptides, including proteins. Recently, large experimental databases have become publicly available with amyloidogenic labels. Using these datasets for training and testing purposes, one may build artificial intelligence (AI)-based classifiers for predi...
Article
Full-text available
Gaussian blurring is a well-established method for image data augmentation: it may generate a large set of images from a small set of pictures for training and testing purposes for Artificial Intelligence (AI) applications. When we apply AI for non-imagelike biological data, hardly any related method exists. Here we introduce the “Newtonian blurrin...
Article
Full-text available
The analysis of enormous datasets with missing data entries is a standard task in biological and medical data processing. Large-scale, multi-institution clinical studies are the typical examples of such datasets. These sets make possible the search for multi-parametric relations since from the plenty of the data one is likely to find a satisfying n...
Article
Full-text available
The Protein Data Bank (PDB) today contains more than 174,000 entries with the 3-dimensional structures of biological macromolecules. Using the rich resources of this repository, it is possible to identify subsets with specific, interesting properties for different applications. Our research group prepared an automatically updated list of amyloid- a...
Article
Full-text available
For more than a decade now, we have been able to discover and study thousands of cerebral connections with the application of diffusion magnetic resonance imaging (dMRI) techniques and the accompanying algorithmic workflow. While numerous connectomical results were published illuminating the relation between the braingraph and certain biological, medical, and ps...
Preprint
Full-text available
Determining important vertices in large graphs (e.g., Google's PageRank in the case of the graph of the World Wide Web) facilitated the construction of excellent web search engines, returning the most important hits corresponding to the submitted user queries. Interestingly, finding important edges -- instead of vertices -- in large graphs has rece...
Article
Full-text available
The multiple sequence alignment (MSA) is an increasingly important task in bioinformatics as we have to deal with the constantly increasing gene‐ and protein‐sequence databases. MSA is applied in phylogenetic analysis, in discovering conservative protein domains, in the assignment of secondary and tertiary structural features in proteins, or in the...
Article
Full-text available
The amyloid state of proteins is widely studied with relevance to neurology, biochemistry, and biotechnology. In contrast with nearly amorphous aggregation, the amyloid state has a well-defined structure, consisting of parallel and antiparallel β-sheets in a periodically repeated formation. The understanding of the amyloid state is growing with the...
Article
Full-text available
The human brain is the most complex object of study we encounter today. Mapping the neuronal-level connections between the more than 80 billion neurons in the brain is a hopeless task for science. By the recent advancement of magnetic resonance imaging (MRI), we are able to map the macroscopic connections between about 1000 brain areas. The MRI dat...
Preprint
The amyloid state of proteins is widely studied with relevance to neurology, biochemistry, and biotechnology. In contrast with amorphous aggregation, the amyloid state has a well-defined structure, consisting of parallel and anti-parallel β-sheets in a periodically repeated formation. The understanding of the amyloid state is growing with the...
Preprint
Gaussian blurring is a well-established method for image data augmentation: it may generate a large set of images from a small set of pictures for training and testing purposes for Artificial Intelligence (AI) applications. When we apply AI for non-imagelike biological data, hardly any related method exists. Here we introduce the “Newtonian blurri...
Preprint
Full-text available
The human brain is the most complex object of study we encounter today. Mapping the neuronal-level connections between the more than 80 billion neurons in the brain is a hopeless task for science. By the recent advancement of magnetic resonance imaging (MRI), we are able to map the macroscopic connections between about 1000 brain areas. The MRI dat...
Article
Full-text available
While it is still not possible to describe the neuronal-level connections of the human brain, we can map the human connectome with several hundred vertices, by the application of diffusion-MRI based techniques. In these graphs, the nodes correspond to anatomically identified gray matter areas of the brain, while the edges correspond to the axonal f...
Article
Full-text available
The human connectome has become a very frequent subject of study of brain scientists, psychologists and imaging experts in the last decade. With diffusion magnetic resonance imaging techniques, united with advanced data processing algorithms, today we are able to compute braingraphs with several hundred, anatomically identified nodes and thousand...
Preprint
Full-text available
The Protein Data Bank (PDB) today contains more than 153,000 entries with the 3-dimensional structures of biological macromolecules. Using the rich resources of this repository, it is possible to identify subsets with specific, interesting properties for different applications. Our research group prepared an automatically updated list of amyloid- a...
Article
Full-text available
In the study of the human connectome, the vertices and the edges of the network of the human brain are analyzed: the vertices of the graphs are the anatomically identified gray matter areas of the subjects; this set is exactly the same for all the subjects. The edges of the graphs correspond to the axonal fibers, connecting these areas. In the biol...
Preprint
For more than a decade now, we have been able to discover and study thousands of cerebral connections with the application of diffusion magnetic resonance imaging (dMRI) techniques and the accompanying algorithmic workflow. While numerous connectomical results were published illuminating the relation between the braingraph and certain biological, medical, and ps...
Article
Full-text available
We analyzed correlations between more than 700 psychological, anatomical and connectome properties, originating from the Human Connectome Project's (HCP) 500-subject dataset. Apart from numerous natural correlations, which describe parameters computable or approximable from one another, we have discovered numerous significant correlations in the...
Article
Full-text available
In mapping the human structural connectome, we are in a very fortunate situation: one can compute and compare graphs, describing the cerebral connections between the very same, anatomically identified small regions of the gray matter among hundreds of human subjects. The comparison of these graphs has led to numerous recent results, as the (1) disc...
Preprint
The human connectome has become a very frequent subject of study of brain scientists, psychologists, and imaging experts in the last decade. With diffusion magnetic resonance imaging techniques, unified with advanced data processing algorithms, today we are able to compute braingraphs with several hundred, anatomically identified nodes and thousa...
Chapter
While it is still not possible to describe the neuronal-level connections of the human brain, we can map the human connectome with several hundred vertices, by the application of diffusion-MRI based techniques. In these graphs, the nodes correspond to anatomically identified gray matter areas of the brain, while the edges correspond to the axonal f...
Article
Full-text available
Here we show a method of directing the edges of the connectomes, prepared from HARDI datasets from the human brain. Before the present work, no high-definition directed braingraphs were published, because the tractography methods in use are not capable of assigning directions to the neural tracts discovered. Previous work on the functional connecto...
Data
The table gives the C# source code of the program that computes the directions of the connectome edges. (PDF)
Preprint
Full-text available
While it is still not possible to describe the neural-level connections of the human brain, we can map the human connectome with several hundred vertices, by the application of diffusion-MRI based techniques. In these graphs, the nodes correspond to anatomically identified gray matter areas of the brain, while the edges correspond to the axonal fib...
Article
Full-text available
Deep, classical graph-theoretical parameters, like the size of the minimum vertex cover, the chromatic number, or the eigengap of the adjacency matrix of the graph were studied widely by mathematicians in the last century. Most researchers today study much simpler parameters of braingraphs or connectomes which were defined in the last twenty years...
Article
Full-text available
The Protein Data Bank (PDB) contains more than 135,000 entries at present. From these, relatively few amyloid structures can be identified, since amyloids are insoluble in water. Therefore, most amyloid structures deposited in the PDB are solid state NMR data. Based on the geometric analysis of these deposited structures we have prepared an automat...
Preprint
Full-text available
In the study of the human connectome, the vertices and the edges of the network of the human brain are analyzed: the vertices of the graphs are the anatomically identified gray matter areas of the subjects; this set is exactly the same for all the subjects. The edges of the graphs correspond to the axonal fibers, connecting these areas. In the biolo...
Preprint
Full-text available
The Protein Data Bank (PDB) contains more than 135 000 entries today. From these, relatively few amyloid structures can be identified, since amyloids are insoluble in water. Therefore, mostly solid state NMR-recorded amyloid structures are deposited in the PDB. Based on the geometric analysis of these deposited structures we have prepared an automa...
Article
Using solely geometric conditions for beta‐sheets, we have prepared and maintain an automatically updated list of amyloid structures from the Protein Data Bank (PDB). Our list contains the insoluble amyloid structures, and, additionally, those globular proteins, which contain amyloid‐like sub‐structures. We assume that these globular proteins are t...
Article
Full-text available
Consensus Connectome Dynamics (CCD) is a remarkable phenomenon of the human connectomes (braingraphs) that was discovered by continuously decreasing the minimum confidence-parameter at the graphical interface of the Budapest Reference Connectome Server, which depicts the cerebral connections of n = 418 subjects with a frequency-parameter k: For any...
Article
Full-text available
The fast and affordable sequencing of large clinical and environmental metagenomic datasets opens up new horizons in medical and biotechnological applications. It is believed that today we have described only about 1% of the microorganisms on the Earth; therefore, metagenomic analysis mostly deals with unknown species in the samples. Microbial com...
Preprint
The fast and affordable sequencing of large clinical and environmental metagenomic datasets opens up new horizons in medical and biotechnological applications. It is believed that today we have described only about 1% of the microorganisms on the Earth; therefore, metagenomic analysis mostly deals with unknown species in the samples. Microbial com...
Article
Full-text available
Artificial intelligence (AI) tools are gaining more and more ground each year in bioinformatics. Learning algorithms can be taught for specific tasks by using the existing enormous biological databases, and the resulting models can be used for the high-quality classification of novel, un-categorized data in numerous areas, including biological sequ...
Article
Full-text available
Mimivirus was identified in 2003 from a biofilm of an industrial water-cooling tower in England. Later, numerous new giant viruses were found in oceans and freshwater habitats, some of them having 2,500 genes. We have demonstrated their likely presence in four soil samples taken from the Kutch Desert (Gujarat, India). Here we describe a bioinformat...
Article
Full-text available
Artificial neural networks (ANNs) have gained a well-deserved popularity among machine learning tools upon their recent successful applications in image- and sound processing and classification problems. ANNs have also been applied for predicting the family or function of a protein, knowing its residue sequence. Here we present two new ANNs with mul...
Preprint
Artificial neural networks (ANNs) have gained a well-deserved popularity among machine learning tools upon their recent successful applications in image- and sound processing and classification problems. ANNs have also been applied for predicting the family or function of a protein, knowing its residue sequence. Here we present two new ANNs with mu...
Article
Full-text available
Fine-tuned regulation of the cellular nucleotide pools is indispensable for faithful replication of Deoxyribonucleic Acid (DNA). The genetic information is also safeguarded by DNA damage recognition and repair processes. Uracil is one of the most frequently occurring erroneous bases in DNA; it can arise from cytosine deamination or thymine-replacin...
Data
List of prokaryotic genomes with the simultaneous lack of the dut and ung genes (dut–, ung– genotype). The table gives the list of the prokaryotic (bacterial/archaeal) genomes that lack both the dUTPase and UNG genes.
Data
List of prokaryotic genomes where the dut gene is absent and the ung gene is present (dut–, ung+ genotype). The table gives the list of the prokaryotic (bacterial/archaeal) genomes without the dUTPase but with the UNG gene. The second column shows the presence of UNG inhibitors in the genome.
Data
The distribution of bacterial/archaeal genomes with and without dUTPase at the family level. Only those families are shown that have at least 15 genomes examined. Each node of the tree is labeled by three numbers: the first is the number of genomes with dUTPase under the node (lilac color on the pie graph segment); the second is the number of genom...
Article
Full-text available
The increasing quantity and quality of the publicly available human cerebral diffusion MRI data make it possible to study the brain in ways that were unimaginable before. Consensus Connectome Dynamics (CCD) is a remarkable phenomenon that was discovered by continuously decreasing the minimum confidence-parameter at the graphical interface of the Budap...
Article
Full-text available
Based on the data of the NIH-funded Human Connectome Project, we have computed structural connectomes of 426 human subjects in five different resolutions of 83, 129, 234, 463 and 1015 nodes and several edge weights. The graphs are given in anatomically annotated GraphML format that facilitates better further processing and visualization. For 96 sub...
Article
Full-text available
Background: Metagenomic analysis of environmental and clinical samples is gaining considerable importance in today's literature. Changes in the composition of the intestinal microbial communities, relative to the healthy control, are reported in numerous conditions. Methods: We have carefully analyzed the frequencies of the short nucleotide sequ...
Article
Full-text available
DNA sequencing technologies are applied widely and frequently today to describe metagenomes, i.e., microbial communities in environmental or clinical samples, without the need for culturing them. These technologies usually return short (100-300 base-pairs long) DNA reads, and these reads are processed by metagenomic analysis software that assign ph...
Article
Full-text available
The average human brain volume of the males is larger than that of the females. Several MRI voxel-based morphometry studies show that the gray matter/white matter ratio is larger in females. Here we have analyzed the recent public release of the Human Connectome Project, and by using the diffusion MRI data of 511 subjects (209 men and 302 women), w...
Article
Connections of the living human brain, on a macroscopic scale, can be mapped by a diffusion MR imaging based workflow. Since the same anatomic regions can be corresponded between distinct brains, one can compare the presence or the absence of the edges, connecting the very same two anatomic regions, among multiple cortices. Previously, we have cons...
Preprint
Connections of the living human brain, on a macroscopic scale, can be mapped by a diffusion MR imaging based workflow. Since the same anatomic regions can be corresponded between distinct brains, one can compare the presence or the absence of the edges, connecting the very same two anatomic regions, among multiple cortices. Previously, we have cons...
Article
Full-text available
The human braingraph or the connectome is the object of an intensive research today. The advantage of the graph-approach to brain science is that the rich structures, algorithms and definitions of graph theory can be applied to the 1000-node anatomical networks of the connections of the human brain. In these graphs, the vertices correspond to the s...
Article
Full-text available
Fine-tuned regulation of the cellular nucleotide pools is indispensable for faithful replication of DNA. The genetic information is also safeguarded by DNA damage recognition and repair processes. Uracil is one of the most frequently occurring erroneous bases in DNA; it can arise from cytosine deamination or thymine-replacing incorporation. Two enzy...