Aleksandra Gruca
Silesian University of Technology · Institute of Computer Science
DSc. PhD. Eng.
About
62
Publications
13,665
Reads
896
Citations
Introduction
I obtained my PhD in Technical Sciences (Bioinformatics) from the Silesian University of Technology (SUT) in 2009. My research interests focus on the application of data mining and machine learning methods to the automated functional interpretation of high-throughput biological experiments. I am also interested in industrial data analysis. Since 2010 I have been a Member of the Board of the Polish Bioinformatics Society. I am also a Chair of the Organising Committee of the International Conference on Man-Machine Interactions.
Publications (62)
The Polish Bioinformatics Society (PTBI) Symposium convenes annually at leading Polish universities, and in 2023 the Silesian University of Technology hosted participants from all over the world. The 15th PTBI Symposium, spanning three days and divided into four scientific sessions, gathered around 100 participants and centered on researc...
The rapid development of machine learning (ML) techniques has opened up the data-dense field of microbiome research for novel therapeutic, diagnostic, and prognostic applications targeting a wide range of disorders, which could substantially improve healthcare practices in the era of precision medicine. However, several challenges must be addressed...
Weather4cast again advanced modern algorithms in AI and machine learning through a highly topical interdisciplinary competition challenge: the prediction of hi-res rain radar movies from multi-band satellite sensors, requiring data fusion, multi-channel video frame prediction, and super-resolution. Accurate predictions of rain events are becoming ev...
[This corrects the article DOI: 10.3389/fmolb.2022.828674.].
Low complexity regions are fragments of protein sequences composed of only a few types of amino acids. These regions frequently occur in proteins and can play an important role in their functions. However, scientists are mainly focused on regions characterized by high diversity of amino acid composition. Similarity between regions of protein sequen...
The IARAI Traffic4cast competitions at NeurIPS 2019 and 2020 showed that neural networks can successfully predict future traffic conditions 1 hour into the future on simply aggregated GPS probe data in time and space bins. We thus reinterpreted the challenge of forecasting traffic conditions as a movie completion task. U-Nets proved to be the winni...
Deficiency in a principal epidermal barrier protein, filaggrin (FLG), is associated with multiple allergic manifestations, including atopic dermatitis and contact allergy to nickel. Toxicity caused by dermal and respiratory exposures of the general population to nickel-containing objects and particles is a deleterious side effect of modern technolo...
Patient multi-omics datasets are often characterized by high dimensionality; however, usually only a small fraction of the features is informative, that is, changes in their values are directly related to the disease outcome or patient survival. In medical sciences, in addition to a robust feature selection procedure, the ability to discover hu...
New diseases constantly endanger the lives of populations, and, nowadays, they can spread easily and constitute a global threat. The COVID-19 pandemic has shown that the fight against a new disease may be difficult, especially at the initial stage of the epidemic, when medical knowledge is not complete and the symptoms are ambiguous. The use of mac...
High-resolution remote sensing technology for Earth Observation (EO) has radically changed how we monitor the state of our planet around the clock. An effective interpretation of the resulting complex large-scale time series adopts the best machine learning techniques from signal processing, computer vision, pattern recognition, and artificial inte...
In the DECODE project, data were collected from 3,114 surveys filled in by symptomatic patients RT-qPCR tested for SARS-CoV-2 in a single university centre in March-September 2020. The population demonstrated balanced sex and age with 759 SARS-CoV-2(+) patients. The most discriminative symptoms in SARS-CoV-2(+) patients at early infection stage were...
Efforts of the scientific community led to the development of multiple screening approaches for COVID-19 that rely on machine learning methods. However, there is a lack of works showing how to tune the classification models used for such a task and what the tuning effect is in terms of various classification quality measures. Understanding the impa...
Today, academic researchers benefit from the changes driven by digital technologies and the enormous growth of knowledge and data, on globalisation, enlargement of the scientific community, and the linkage between different scientific communities and the society. To fully benefit from this development, however, information needs to be shared openly...
Background
The rapid spread of COVID-19 demands an immediate response from the scientific communities. Appropriate countermeasures mean a thoughtful and educated choice of viral targets (epitopes). There are several articles that discuss such choices in the SARS-CoV-2 proteome; others focus on phylogenetic traits and history of the Coronaviridae geno...
The number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to deeply explore host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed...
Distinguishing COVID-19 from other flu-like illnesses can be difficult due to ambiguous symptoms and doctors' still limited experience. At the same time, it is crucial to filter out those sick patients who do not need to be tested for SARS-CoV-2 infection, especially in the event of an overwhelming increase in cases. As a part of the presented resea...
The rapid spread of COVID-19 demands an immediate response from the scientific communities. Appropriate countermeasures mean a thoughtful and educated choice of viral targets (epitopes). There are several articles that discuss such choices in the SARS-CoV-2 proteome; others focus on phylogenetic traits and history of the Coronaviridae genome/proteome...
Low complexity regions (LCRs) in protein sequences are characterized by a less diverse amino acid composition compared to typically observed sequence diversity. Recent studies have shown that LCRs may co-occur with intrinsically disordered regions, are highly conserved in many organisms, and often play important roles in protein functions and in di...
Modern multi-omics studies introduce a major challenge for biomedical data storage, processing and integration. The ability to utilize multiple different measurement techniques in order to provide a comprehensive view on the studied processes is becoming a standard in molecular biology and medicine, increasing the need for the development of new st...
Low Complexity Regions (LCRs) are fragments of protein sequences that are characterized by a small diversity in amino acid composition. LCRs could play important roles in protein functions or they could be relevant to protein structure. However, for many years, low complexity regions were ignored by the scientific community which resulted in lack o...
This book includes a selection of papers describing the latest advances and discoveries in the field of human-computer interactions, which were presented at the 6th International Conference on Man-Machine Interactions, ICMMI 2019, held in Cracow, Poland, in October 2019.
Human-computer interaction is a multidisciplinary field concerned with the desig...
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a conse...
Electrostatic interactions play important roles in the functional mechanisms exploited by intrinsically disordered proteins (IDPs). The atomic resolution description of long-range and local structural propensities that can both be crucial for the function of highly charged IDPs presents significant experimental challenges. Here we investigate the co...
There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence...
Modern, high-throughput methods for the analysis of genetic information, gene and metabolic products and their interactions offer new opportunities to gain comprehensive information on life processes. The data and knowledge generated open diverse application possibilities with enormous innovation potential. To unlock that potential skills in genera...
We present Cancer Clonal Evolution Simulator (CCES), a simulation system of the clonal growth of cancer cell populations with arbitrary selection patterns. It is possible to record the life history of each separate cancer cell and to track the propagation of mutations over cancer cell phylogenies at the scale of hundreds of thousands of cancer cells. Than...
Modern high-throughput technologies based on genome, transcriptome or proteome profiling provide an abundance of data that needs to be processed, analyzed and, finally, interpreted. Effective and efficient analysis of data coming from molecular profiling is crucial for a detailed diagnosis, prognosis, and prediction of therapy outcome. Meaningful conc...
Proteomics profiling of tissue specimens representative for major types of thyroid cancers: papillary (classical and follicular variant), follicular, anaplastic and medullary, as well as benign follicular adenoma, was performed using shotgun LC-MS/MS approaches. A combination of Orbitrap and MALDI-TOF approach allowed to identify protein products o...
Background
High-throughput methods in molecular biology have provided researchers with an abundance of experimental data that need to be interpreted in order to understand the experimental results. Manual methods of functional gene/protein group interpretation are expensive and time-consuming; therefore, there is a need to develop new efficient data mining...
In the fuel industry, as in any other, it is important to have control over the resources that directly generate profit. In the case of a petrol station this is gasoline, which follows a very complicated path from the terminal, through the underground tank, up to the tank inside the car. At most stages of its journey, the system can control the volume, h...
This book provides an overview of the current state of research on development and application of methods, algorithms, tools and systems associated with the studies on man-machine interaction. Modern machines and computer systems are designed not only to process information, but also to work in dynamic environment, supporting or even replacing huma...
Glyceraldehyde-3-phosphate dehydrogenase from human sperm (GAPDHS) provides energy to the sperm flagellum, and is therefore essential for sperm motility and male fertility. This isoform is distinct from somatic GAPDH, not only in being specific for the testis but also because it contains an additional amino-terminal region that encodes a proline-r...
In this paper we apply two data dimensionality reduction methods to an eye movement dataset and analyse how the feature reduction method improves classification accuracy. Due to the specificity of the recording process, eye movement datasets are characterized by both large size and high dimensionality, which make them difficult to analyse and classify usi...
The paper presents an approach to identifying clusters of genes represented both by expression values and by Gene Ontology annotations, where cluster membership should not conflict with either representation. The method makes it possible to identify genes that are clustered differently in the different representations, which can lead to further a...
Man-Machine Interaction is an interdisciplinary field of research that covers many aspects of science focused on a human and a machine in conjunction. The basic goal of the study is to improve and invent new ways of communication between users and computers, and many different subjects are involved to reach the long-term research objective of an intuitiv...
Object-relational mapping is a technology that connects relationships with object-oriented entities, with the aim of eliminating duplicate layers together with their maintenance costs and any errors arising from their existence. Many tools and technologies have been designed to support and implement the idea of object-relational mapping. In this paper...
Systems for re-annotations of DNA microarray data for supporting analysis of results of DNA microarray experiments are becoming important elements of bioinformatics aspects of gene expression based studies. However, due to the computational problems related to the whole genome browsing projects, available services and data for re-annotation of micr...
In this paper we present a new extension of the RuleGO rule generation method. The method was designed to discover logical rules including combinations of GO terms in their premises in order to provide a functional description of analyzed gene signatures. As the number of obtained rules is typically huge, a filtration algorithm is required to select only the...
In the paper, a new modification of the rule induction method for describing gene groups using Gene Ontology, based on the FP-growth algorithm, is proposed. The modification takes advantage of the hierarchical structure of the GO graph, a specific property of a single prefix-path FP-tree, and the fact that if we generate rules for description purposes we do...
Methods for automatic functional description of gene groups are useful tools supporting the interpretation of biological experiments. The RuleGO algorithm provides functional interpretation of gene groups in a form of logical rules including combinations of Gene Ontology terms in their premises. The number of rules generated by the algorithm is usu...
In this paper we present the results of an analysis of whether (and how) the functional similarity of genes can be compared to the similarity resulting from raw experimental data. We assume that the information provided by the Gene Ontology database can be regarded as expert knowledge on genes and their functions, and therefore it should be correlated with genes similar...
Systems for re-annotations of DNA microarray data for supporting analysis of results of DNA microarray experiments are becoming important elements of bioinformatics aspects of gene expression based studies [10]. However, due to the computational problems related to the whole genome browsing projects, available services and data for re-annotation of...
Genome-wide expression profiles obtained with the use of DNA microarray technology provide an abundance of experimental data on biological and molecular processes. Such an amount of data needs to be further analyzed and interpreted in order to obtain biological conclusions on the basis of experimental results. The analysis requires a lot of experience and...
In this paper we present the results of research verifying how the functional description of genes contained in the Gene Ontology database is related to gene expression values recorded during biological experiments. We compare several different gene similarity measures and semantic term similarity measures, and evaluate how the similarity of gene...
By mistake, the following funding information was not included in the original version of the paper:
Acknowledgments. This paper was partially supported by the European Community through the European Social Fund.
A rules induction algorithm dedicated to describe groups of genes with similar expression profiles by means of Gene Ontology terms is discussed in the paper. The presented algorithm takes into consideration information contained in the Gene Ontology graph. A huge number of created rules requires defining the rules quality and similarity measures, t...
Quality improvement of rule-based gene group descriptions using information about GO terms importance occurring in premises of determined rules
In this paper we present a method for evaluating the importance of GO terms which compose multi-attribute rules. The rules are generated for the purpose of biological interpretation of gene groups. Each mul...
This paper presents an Internet application that allows remote statistical analysis of the data from the GENEPI-ENTB database. The database includes tissues from irradiated patients with different types of cancer, linked to a detailed description of treatment and outcome. The main purpose of the system presented in the paper is to...
The paper presents the results of research verifying whether gene clustering that takes into consideration both gene expression values and the similarity of GO terms improves the quality of the rule-based description of gene groups. The obtained results show that application of the Conditional Robust Fuzzy C-Medoids algorithm makes it possible to obtain gene grou...
In this paper we present the architecture of RuleGO – a grid-based Internet application for describing gene groups using decision rules based on Gene Ontology terms. Due to the complexity of the rule induction algorithm, there is a need to use sophisticated mechanisms for supporting multiple requests from the application users and to perform sim...
This paper presents a method for evaluating image quality using the similarity of image phase spectra. The authors introduce the phase correlation coefficient as an objective image quality index, which is compared to the subjective distortion evaluation using a stimulus impairment scale. Artificially distorted images produced by proportio...
Multimedia streaming transmission might be affected by jitter – short-term packet delivery delay variation. To avoid transmission errors, de-jitter buffers are used; it is therefore necessary to estimate the current jitter, although it is impossible to calculate its value directly on-line. The situation is complicated by various jitter causes and i...
Multimedia real-time streaming transmission might be affected by jitter – short-term packet delivery delay variation. To avoid playout 'hiccups', de-jitter buffers are used; it is therefore necessary to estimate the current jitter, although it is impossible to calculate its value directly on-line. Nowadays several running value estimators based...
In this paper, a novel method for characterizing the Gene Ontology (GO) composition of gene clusters on the basis of decision rules is presented. The rules are expressed as logical functions of the Gene Ontology terms, which are interpreted as binary attributes. A new method for evaluating the quality of decision rules based on statistical signi...
In this paper, we describe a system that groups, classifies and finds the latent semantic features in a database composed of a large number of documents. The database will be constantly growing, as the users who co-create it will keep adding new documents. Users require a system to provide them information, both about a specific document, an...