Vince Grolmusz
Eötvös Loránd University · Institute of Mathematics
Ph.D., D.Sc.
About
186 Publications
20,816 Reads
2,029 Citations
Introduction
As a professor of mathematics, I work on bioinformatics problems from a mathematical perspective.
Our most recent discoveries:
--- Women's braingraphs are much better connected than those of men;
--- We have discovered the phenomenon of Consensus Connectome Dynamics (CCD) that, most probably, describes the development of cerebral connections between gray matter areas of the human brain;
--- We have mapped the individual variability of the connections in the brain.
http://grolmusz.pitgroup.org
Additional affiliations
October 1991 - December 1993
January 1999 - June 1999
September 1986 - June 1987
Publications (186)
The human braingraph, or connectome, is the object of intensive research today. The advantage of the graph approach to brain science is that the rich structures, algorithms and definitions of graph theory can be applied to the anatomical networks of the connections of the human brain. In these graphs, the vertices correspond to the small (1-1....
In the last two decades, graph theory has penetrated sociology, molecular biology, genetics, chemistry, computer engineering, and numerous other fields of science. One of the more recent areas of its applications is the study of the connections of the human brain. With the development of diffusion magnetic resonance imaging (diffusion MRI), it is possible...
In applications of graph theory, it is unusual to consider numerous, pairwise different graphs on the very same set of vertices. In the case of human braingraphs or connectomes, however, this is the standard situation: the nodes correspond to anatomically identified cerebral regions, and two vertices are connected by an edge if a diff...
In our previous study we have shown that the female connectomes have significantly better, deep graph-theoretical parameters, related to superior "connectivity", than the connectome of the males. Since the average female brain is smaller than the average male brain, one cannot rule out that the significant advantages are due to the size- and not to...
We construct a system H of $\exp(c \log^2 n / \log\log n)$ subsets of a set of n elements such that the size of each set is divisible by 6 but the sizes of their pairwise intersections are not divisible by 6. The result generalizes to all non-prime-power moduli m in place of m = 6. This result is in sharp contrast with results of Frankl and Wilson (1981) for prime po...
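The property described in the abstract above (every set size divisible by 6, no pairwise intersection size divisible by 6) can be checked mechanically on a small family. The following is only a minimal sketch of such a check on a hand-picked toy family; it is not the construction from the paper, and the function name `has_restricted_intersections` and the toy sets are illustrative assumptions.

```python
from itertools import combinations


def has_restricted_intersections(family, m=6):
    """Return True if every set size is divisible by m and
    no pairwise intersection size is divisible by m."""
    sets = [frozenset(s) for s in family]
    # Condition 1: each set size must be 0 modulo m.
    if any(len(s) % m != 0 for s in sets):
        return False
    # Condition 2: each pairwise intersection size must be nonzero modulo m.
    return all(len(a & b) % m != 0 for a, b in combinations(sets, 2))


# Hypothetical toy family: three 6-element sets with intersections of size 3, 1 and 4.
toy_family = [set(range(0, 6)), set(range(3, 9)), set(range(5, 11))]
print(has_restricted_intersections(toy_family))
```

Two disjoint 6-element sets fail the check, because their empty intersection has size 0, which is divisible by 6.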
Human braingraphs, or connectomes, have been widely studied in the last decade to understand the structural and functional properties of our brain. In the last several years, our research group has computed and deposited thousands of human braingraphs to the braingraph.org site, by applying public structural (diffusion) MRI data from young and healthy subje...
Polycyclic aromatic hydrocarbons (PAHs) are highly toxic, carcinogenic substances. On soils contaminated with PAHs, crop cultivation, animal husbandry and even the survival of microflora in the soil are greatly perturbed, depending on the degree of contamination. Most microorganisms cannot tolerate PAH-contaminated soils, however, some microbial st...
Mutated genes may lead to cancer development in numerous tissues. While more than 600 cancer-causing genes are known today, some of the most widespread mutations are connected to the RAS gene; RAS mutations are found in approximately 25% of all human tumors. Specifically, KRAS mutations are involved in the three most lethal cancers in the U.S., nam...
We consider the 1015-vertex human consensus connectome computed from the diffusion MRI data of 1064 subjects. We define seven different orders on these 1015 graph vertices, where the orders depend on parameters derived from the brain circuitry, that is, from the properties of the edges (or connections) incident to the vertices ordered. We order the...
Enzymatic processes play an increasing role in synthetic organic chemistry, which requires access to a broad and diverse set of enzymes. Metagenome mining is a valuable and efficient way to discover novel enzymes with unique properties for biotechnological applications. Here, we report the discovery and biocatalytic characterization of six novel...
The LogRank conjecture of Lovász and Saks (1988) is the most famous open problem in communication complexity theory. The statement is as follows: suppose that two players intend to compute a Boolean function f(x,y) when x is known for the first and y for the second player, and they may send and receive messages encoded with bits, then they can comp...
The LogRank conjecture of Lovász and Saks from 1988 is the most famous open problem in communication complexity theory. The statement is as follows: Suppose that two players intend to compute a Boolean function f (x, y) when x is known for the first and y for the second player, and they may send and receive messages encoded with bits, then they...
Hexapeptides are increasingly applied as model systems for studying the amyloidogenicity properties of oligo- and polypeptides. It is possible to construct 64 million different hexapeptides from the twenty proteinogenic amino acid residues. Today's experimental amyloid databases contain only a fraction of these annotated hexapeptides. For labeling...
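The count of 64 million quoted in the abstract above follows from choosing one of 20 residues independently at each of 6 positions, that is, 20^6 sequences. A minimal arithmetic sketch (the function name `count_peptides` is an illustrative assumption, not something taken from the paper):

```python
def count_peptides(alphabet_size: int = 20, length: int = 6) -> int:
    """Number of distinct linear sequences of the given length
    over an alphabet of the given size."""
    return alphabet_size ** length


hexapeptide_count = count_peptides()
print(hexapeptide_count)  # 64000000
```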
Methods from artificial intelligence (AI), in general, and machine learning, in particular, have kept conquering new territories in numerous areas of science. Most of the applications of these techniques are restricted to the classification of large data sets, but new scientific knowledge can seldom be inferred from these tools. Here we show that a...
How the cuticles of the roughly 4.5 million species of ecdysozoan animals are constructed is not well understood. Here, we systematically mine gene expression datasets to uncover the spatiotemporal blueprint for how the chitin-based pharyngeal cuticle of the nematode Caenorhabditis elegans is built. We demonstrate that the blueprint correctly predi...
Determining important vertices in large graphs (e.g., Google’s PageRank in the case of the graph of the World Wide Web) facilitated the construction of excellent web search engines, returning the most important hits corresponding to the submitted user queries. Interestingly, finding important edges – instead of vertices – in large graphs has receiv...
Hexapeptides are widely applied as a model system for studying the amyloid-forming properties of polypeptides, including proteins. Recently, large experimental databases have become publicly available with amyloidogenic labels. Using these data sets for training and testing purposes, one may build artificial intelligence (AI)-based classifiers for...
Roughly 4.5 million species of ecdysozoan animals repeatedly shed their old cuticle and construct a new one underneath to accommodate growth. How cuticles are constructed is not well understood. Here, we systematically mine gene expression datasets to uncover the spatiotemporal blueprint for how the chitin-based pharyngeal cuticle of the nematode C...
Hexapeptides are widely applied as a model system for studying amyloid-forming properties of polypeptides, including proteins. Recently, large experimental databases have become publicly available with amyloidogenic labels. Using these datasets for training and testing purposes, one may build artificial intelligence (AI)-based classifiers for predi...
Gaussian blurring is a well-established method for image data augmentation: it may generate a large set of images from a small set of pictures for training and testing purposes for Artificial Intelligence (AI) applications. When we apply AI for non-imagelike biological data, hardly any related method exists. Here we introduce the “Newtonian blurrin...
The analysis of enormous datasets with missing data entries is a standard task in biological and medical data processing. Large-scale, multi-institution clinical studies are typical examples of such datasets. These sets make it possible to search for multi-parametric relations, since from the abundance of data one is likely to find a satisfying n...
The Protein Data Bank (PDB) today contains more than 174,000 entries with the 3-dimensional structures of biological macromolecules. Using the rich resources of this repository, it is possible to identify subsets with specific, interesting properties for different applications. Our research group prepared an automatically updated list of amyloid- a...
For more than a decade now, we can discover and study thousands of cerebral connections with the application of diffusion magnetic resonance imaging (dMRI) techniques and the accompanying algorithmic workflow. While numerous connectomical results were published enlightening the relation between the braingraph and certain biological, medical, and ps...
Determining important vertices in large graphs (e.g., Google's PageRank in the case of the graph of the World Wide Web) facilitated the construction of excellent web search engines, returning the most important hits corresponding to the submitted user queries. Interestingly, finding important edges, instead of vertices, in large graphs has rece...
Multiple sequence alignment (MSA) is an increasingly important task in bioinformatics, as we have to deal with constantly growing gene‐ and protein‐sequence databases. MSA is applied in phylogenetic analysis, in discovering conserved protein domains, in the assignment of secondary and tertiary structural features in proteins, or in the...
The amyloid state of proteins is widely studied with relevance to neurology, biochemistry, and biotechnology. In contrast with nearly amorphous aggregation, the amyloid state has a well-defined structure, consisting of parallel and antiparallel β-sheets in a periodically repeated formation. The understanding of the amyloid state is growing with the...
The human brain is the most complex object of study we encounter today. Mapping the neuronal-level connections between the more than 80 billion neurons in the brain is a hopeless task for science. By the recent advancement of magnetic resonance imaging (MRI), we are able to map the macroscopic connections between about 1000 brain areas. The MRI dat...
The amyloid state of proteins is widely studied with relevance to neurology, biochemistry, and biotechnology. In contrast with amorphous aggregation, the amyloid state has a well-defined structure, consisting of parallel and anti-parallel $\beta$-sheets in a periodically repeated formation. The understanding of the amyloid state is growing with the...
Gaussian blurring is a well-established method for image data augmentation: it may generate a large set of images from a small set of pictures for training and testing purposes for Artificial Intelligence (AI) applications. When we apply AI for non-imagelike biological data, hardly any related method exists. Here we introduce the “Newtonian blurri...
While it is still not possible to describe the neuronal-level connections of the human brain, we can map the human connectome with several hundred vertices, by the application of diffusion-MRI based techniques. In these graphs, the nodes correspond to anatomically identified gray matter areas of the brain, while the edges correspond to the axonal f...
The human connectome has become a very frequent subject of study for brain scientists, psychologists and imaging experts in the last decade. With diffusion magnetic resonance imaging techniques, united with advanced data processing algorithms, today we are able to compute braingraphs with several hundred, anatomically identified nodes and thousand...
The Protein Data Bank (PDB) today contains more than 153,000 entries with the 3-dimensional structures of biological macromolecules. Using the rich resources of this repository, it is possible to identify subsets with specific, interesting properties for different applications. Our research group prepared an automatically updated list of amyloid- a...
In the study of the human connectome, the vertices and the edges of the network of the human brain are analyzed: the vertices of the graphs are the anatomically identified gray matter areas of the subjects; this set is exactly the same for all the subjects. The edges of the graphs correspond to the axonal fibers, connecting these areas. In the biol...
We analyzed correlations between more than 700 psychological, anatomical and connectome properties, originating from the Human Connectome Project's (HCP) 500-subject dataset. Apart from numerous natural correlations, which describe parameters computable or approximable from one another, we have discovered numerous significant correlations in the...
In mapping the human structural connectome, we are in a very fortunate situation: one can compute and compare graphs, describing the cerebral connections between the very same, anatomically identified small regions of the gray matter among hundreds of human subjects. The comparison of these graphs has led to numerous recent results, as the (1) disc...
The human connectome has become a very frequent subject of study for brain scientists, psychologists, and imaging experts in the last decade. With diffusion magnetic resonance imaging techniques, unified with advanced data processing algorithms, today we are able to compute braingraphs with several hundred, anatomically identified nodes and thousa...
Here we show a method of directing the edges of the connectomes, prepared from HARDI datasets from the human brain. Before the present work, no high-definition directed braingraphs were published, because the tractography methods in use are not capable of assigning directions to the neural tracts discovered. Previous work on the functional connecto...
Table gives the source C# code of the program, which computes the directions of the connectome edges. (PDF)
While it is still not possible to describe the neural-level connections of the human brain, we can map the human connectome with several hundred vertices, by the application of diffusion-MRI based techniques. In these graphs, the nodes correspond to anatomically identified gray matter areas of the brain, while the edges correspond to the axonal fib...
Deep, classical graph-theoretical parameters, like the size of the minimum vertex cover, the chromatic number, or the eigengap of the adjacency matrix of the graph were studied widely by mathematicians in the last century. Most researchers today study much simpler parameters of braingraphs or connectomes which were defined in the last twenty years...
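Two of the classical parameters named in the abstract above, the size of the minimum vertex cover and the chromatic number, can be illustrated on a tiny graph by exhaustive search. This is only a brute-force sketch for graphs with a handful of vertices, not the method used for braingraphs in the paper; the function names and the 5-cycle example are illustrative assumptions.

```python
from itertools import combinations, product


def min_vertex_cover_size(n, edges):
    """Smallest number of vertices touching every edge (exhaustive search)."""
    for k in range(n + 1):
        for cover in combinations(range(n), k):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return k
    return n


def chromatic_number(n, edges):
    """Smallest number of colors in a proper vertex coloring (exhaustive search)."""
    if n == 0:
        return 0
    for k in range(1, n + 1):
        for coloring in product(range(k), repeat=n):
            if all(coloring[u] != coloring[v] for u, v in edges):
                return k
    return n


# Hypothetical example: the cycle on 5 vertices.
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(min_vertex_cover_size(5, cycle5), chromatic_number(5, cycle5))
```

Both searches take exponential time in the number of vertices, so they are suitable only for toy inputs; graphs with hundreds of nodes need dedicated solvers or approximations.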
The Protein Data Bank (PDB) contains more than 135,000 entries at present. From these, relatively few amyloid structures can be identified, since amyloids are insoluble in water. Therefore, most amyloid structures deposited in the PDB are solid state NMR data. Based on the geometric analysis of these deposited structures we have prepared an automat...
In the study of the human connectome, the vertices and the edges of the network of the human brain are analyzed: the vertices of the graphs are the anatomically identified gray matter areas of the subjects; this set is exactly the same for all the subjects. The edges of the graphs correspond to the axonal fibers, connecting these areas. In the biolo...
The Protein Data Bank (PDB) contains more than 135 000 entries today. From these, relatively few amyloid structures can be identified, since amyloids are insoluble in water. Therefore, mostly solid state NMR-recorded amyloid structures are deposited in the PDB. Based on the geometric analysis of these deposited structures we have prepared an automa...
Using solely geometric conditions for beta‐sheets, we have prepared and maintain an automatically updated list of amyloid structures from the Protein Data Bank (PDB). Our list contains the insoluble amyloid structures, and, additionally, those globular proteins, which contain amyloid‐like sub‐structures. We assume that these globular proteins are t...
Consensus Connectome Dynamics (CCD) is a remarkable phenomenon of the human connectomes (braingraphs) that was discovered by continuously decreasing the minimum confidence-parameter at the graphical interface of the Budapest Reference Connectome Server, which depicts the cerebral connections of n = 418 subjects with a frequency-parameter k: For any...
The fast and affordable sequencing of large clinical and environmental metagenomic datasets opens up new horizons in medical and biotechnological applications. It is believed that today we have described only about 1% of the microorganisms on the Earth, therefore, metagenomic analysis mostly deals with unknown species in the samples. Microbial com...
Artificial intelligence (AI) tools are gaining more and more ground each year in bioinformatics. Learning algorithms can be taught for specific tasks by using the existing enormous biological databases, and the resulting models can be used for the high-quality classification of novel, un-categorized data in numerous areas, including biological sequ...
Mimivirus was identified in 2003 from a biofilm of an industrial water-cooling tower in England. Later, numerous new giant viruses were found in oceans and freshwater habitats, some of them having 2,500 genes. We have demonstrated their likely presence in four soil samples taken from the Kutch Desert (Gujarat, India). Here we describe a bioinformat...
Artificial neural networks (ANNs) have gained a well-deserved popularity among machine learning tools upon their recent successful applications in image- and sound-processing and classification problems. ANNs have also been applied for predicting the family or function of a protein, knowing its residue sequence. Here we present two new ANNs with mul...
Artificial neural networks (ANNs) have gained a well-deserved popularity among machine learning tools upon their recent successful applications in image- and sound processing and classification problems. ANNs have also been applied for predicting the family or function of a protein, knowing its residue sequence. Here we present two new ANNs with mu...
Fine-tuned regulation of the cellular nucleotide pools is indispensable for faithful replication of Deoxyribonucleic Acid (DNA). The genetic information is also safeguarded by DNA damage recognition and repair processes. Uracil is one of the most frequently occurring erroneous bases in DNA; it can arise from cytosine deamination or thymine-replacin...
List of prokaryotic genomes with the simultaneous lack of the dut and ung genes (dut–, ung– genotype). The table gives the list of the prokaryotic (bacterial/archaeal) genomes that lack both the dUTPase and UNG genes.
List of prokaryotic genomes where the dut gene is absent and the ung gene is present (dut–, ung+ genotype). The table gives the list of the prokaryotic (bacterial/archaeal) genomes without the dUTPase but with the UNG gene. The second column shows the presence of UNG inhibitors in the genome.
The distribution of bacterial/archaeal genomes with and without dUTPase at the family level. Only those families are shown that have at least 15 genomes examined. Each node of the tree is labeled by three numbers: the first is the number of genomes with dUTPase under the node (lilac color on the pie graph segment); the second is the number of genom...
The increasing quantity and quality of the publicly available human cerebral diffusion MRI data make it possible to study the brain in ways that were unimaginable before. The Consensus Connectome Dynamics (CCD) is a remarkable phenomenon that was discovered by continuously decreasing the minimum confidence-parameter at the graphical interface of the Budap...
Based on the data of the NIH-funded Human Connectome Project, we have computed structural connectomes of 426 human subjects in five different resolutions of 83, 129, 234, 463 and 1015 nodes and several edge weights. The graphs are given in anatomically annotated GraphML format that facilitates better further processing and visualization. For 96 sub...
Background:
Metagenomic analysis of environmental and clinical samples is gaining considerable importance in today's literature. Changes in the composition of the intestinal microbial communities, relative to the healthy control, are reported in numerous conditions.
Methods:
We have carefully analyzed the frequencies of the short nucleotide sequ...
DNA sequencing technologies are applied widely and frequently today to describe metagenomes, i.e., microbial communities in environmental or clinical samples, without the need for culturing them. These technologies usually return short (100-300 base-pairs long) DNA reads, and these reads are processed by metagenomic analysis software that assigns ph...
The average brain volume of males is larger than that of females. Several MRI voxel-based morphometry studies show that the gray matter/white matter ratio is larger in females. Here we have analyzed the recent public release of the Human Connectome Project, and by using the diffusion MRI data of 511 subjects (209 men and 302 women), w...
Connections of the living human brain, on a macroscopic scale, can be mapped by a diffusion MR imaging based workflow. Since the same anatomic regions can be matched between distinct brains, one can compare the presence or the absence of the edges, connecting the very same two anatomic regions, among multiple cortices. Previously, we have cons...
The human braingraph, or connectome, is the object of intensive research today. The advantage of the graph approach to brain science is that the rich structures, algorithms and definitions of graph theory can be applied to the 1000-node anatomical networks of the connections of the human brain. In these graphs, the vertices correspond to the s...
Fine-tuned regulation of the cellular nucleotide pools is indispensable for faithful replication of DNA. The genetic information is also safeguarded by DNA damage recognition and repair processes. Uracil is one of the most frequently occurring erroneous bases in DNA; it can arise from cytosine deamination or thymine-replacing incorporation. Two enzy...