Luis Rueda

University of Windsor · School of Computer Science

Doctor of Philosophy


Publications: 210
Reads: 47,293
Citations: 1,565
Citations since 2017
59 Research Items
757 Citations
Introduction
My main research interests and contributions focus on devising new machine learning algorithms for interactomics and transcriptomics data analysis. The main applications are general problems in these fields and their role in finding biomarkers for breast and prostate cancer. Currently, my research focuses on multi-omics data analysis, aiming to find relevant biomarkers associated with different diseases and disease states from different types of data, including copy number variations, DNA methylation, mutations, gene expression, microRNA, and others.
Additional affiliations
July 2008 - June 2013
University of Windsor
Position
  • Professor (Associate)
January 2006 - July 2008
University of Concepción
July 2003 - present
University of Windsor
Education
January 1999 - May 2002
Carleton University
Field of study
January 1998 - December 1998
Carleton University
Field of study
April 1987 - March 1993


Publications (210)
Cover Page
Full-text available
Invitation: Book chapter contribution. Dear Colleagues, We are currently editing a book titled "Machine Learning Methods for Multi-Omics Data Integration". This book is to be published by Springer Nature. Knowing that you have contributed significantly to this research area, I would like to cordially invite you to write a book chapter on any...
Conference Paper
Full-text available
SLiMs (Short Linear Motifs) are patterns of three to 20 amino acids within proteins that are sufficient to fulfill certain functions. SLiMs play a critical role in many biological processes. Hence, with the increasing quantity of biological data, it is important to develop algorithms that can quickly find patterns in large databases of DNA, RNA and...
Chapter
Recent studies on Single-cell RNA sequencing (scRNA-seq) technology have been widely applied in biological research and drug discovery. Before in-depth investigations of the functionality of single cells for pathological goals, identification of cell types is an essential step. Recently, several unsupervised learning methods have been developed to...
Conference Paper
Discovering clusters in social networks is of fundamental and practical interest. This paper presents a novel clustering strategy for large-scale highly-connected social networks. We propose a new hybrid clustering technique based on non-negative matrix factorization and independent component analysis for finding complex relationships among users...
Article
Full-text available
‘De novo’ drug discovery is costly, slow, and with high risk. Repurposing known drugs for treatment of other diseases offers a fast, low-cost/risk and highly-efficient method toward development of efficacious treatments. The emergence of large-scale heterogeneous biomolecular networks, molecular, chemical and bioactivity data, and genomic and pheno...
Article
Condition Monitoring (CM) is an essential element of securing reliable operating conditions of Wind Turbines (WT) in a wind farm. CM helps optimize maintenance by providing Remaining Useful Life (RUL) forecast. However, the expected RUL is not often reliable due to uncertainty associated with the prediction horizon. In this paper, we employ high-le...
Article
Full-text available
Identifying relevant disease modules such as target cell types is a significant step for studying diseases. High-throughput single-cell RNA-Seq (scRNA-seq) technologies have advanced in recent years, enabling researchers to investigate cells individually and understand their biological mechanisms. Computational techniques such as clustering are th...
Article
Full-text available
Background: Circadian rhythms are daily physiological oscillations driven by the circadian clock: a 24-hour transcriptional timekeeper that regulates hormones, inflammation, and metabolism. Circadian rhythms are known to be important for health, but whether their loss contributes to colorectal cancer is not known. Aims: We tested the non-redunda...
Article
Chromatin immunoprecipitation sequencing (ChIP-Seq) has emerged as a superior alternative to microarray technology as it provides higher resolution, less noise, greater coverage and wider dynamic range. While ChIP-Seq enables probing of DNA-protein interaction over the entire genome, it requires the use of sophisticated tools to recognize hidden patterns and e...
Preprint
Full-text available
'De novo' drug discovery is costly, slow, and with high risk. Repurposing known drugs for treatment of other diseases offers a fast, low-cost/risk and highly-efficient method toward development of efficacious treatments. The emergence of large-scale heterogeneous biomolecular networks, molecular, chemical and bioactivity data, and genomic and pheno...
Article
Online public reviews have significantly influenced customers who purchase products or seek services. Fake reviews are posted online to promote or demote targeted products or the reputation of organizations and businesses. Spam review detection has been the focus of many researchers in recent years. As online services have been growing rapidly, t...
Chapter
The human cell is a complex of interacting small molecules that work together to perform the daily tasks of the cell. The reading and measurement of these different molecules are called omics, where any dysfunction among these omics may cause different diseases, and cancer is no exception. The advances in biomedical technology in general and in s...
Conference Paper
Full-text available
The recent emergence of a new coronavirus, SARS-CoV-2, has caused the disease COVID-19 and has been declared a worldwide pandemic. Identification of relevant modules such as target cells is a significant step for characterizing diseases and consequently leads to better diagnosis, treatment and prognosis. High-throughput single-cell RNA-Seq (scRNA-seq) t...
Article
Full-text available
Background: Increasing the survival rates for breast cancer has gained significant researcher interest. However, current studies reveal that a small subset of gene markers can predict survivability for people with different breast cancer subtypes. In these studies, the selected genes are not necessarily functionally related, and hence, they may not c...
Conference Paper
Breast cancer is the most common cancer among North American women and worldwide. In this paper, we present a deep learning model based on multiomics data integration to predict the five-year interval survival of breast cancer InClust 5. The data was selected from METABRIC dataset that contains three omic datasets: gene expression, copy number alte...
Poster
Abstract: A tool that integrates next-generation sequencing data pre-processing with machine learning techniques to extract meaningful biomarkers from next-generation sequencing data.
Article
Motivation: One of the main challenges in applying graph convolutional neural networks on gene-interaction data is the lack of understanding of the vector space to which they belong, and also the inherent difficulties involved in representing those interactions on a significantly lower dimension, viz. Euclidean spaces. The challenge becomes more pr...
Chapter
We introduce a network-based approach to identify subnets of functionally-related genes for predicting 5-year survivability of breast cancer patients treated with chemotherapy, hormone therapy, and a combination of these. A gene expression dataset and a protein-protein interaction network are integrated to construct a weighted graph, where edge wei...
Chapter
Breast cancer starts when cells in the breast begin to grow out of control. These cells usually form a tumor that can often be seen on an x-ray or felt as a lump. The tumor is malignant (cancer) if the cells can grow into (invade) surrounding tissues or spread (metastasize) to distant areas of the body. The challenge of this project was to build an...
Article
Full-text available
Background: Finding the tumor location in the prostate is an essential pathological step for prostate cancer diagnosis and treatment. The location of the tumor - the laterality - can be unilateral (the tumor affects one side of the prostate) or bilateral (the tumor affects both sides). Nevertheless, the tumor can be overestimated or underestimated by standar...
Article
Background: Bladder cancer is the fifth most common cancer and eighth leading cause of cancer-related death in North America. It can present as non-muscle invasive bladder cancer (NMIBC) and/or muscle invasive bladder cancer (MIBC). Although genomic profiling studies have established that low-grade NMIBC and MIBC are genetically distinct, high-grade N...
Article
Full-text available
Prostate cancer (PCa) is one of the most common cancers among men worldwide. Current screening methods, such as prostate-specific antigen (PSA) testing and magnetic resonance imaging (MRI), lack effectiveness, and others, such as biopsy, are painful. Understanding the genomic behavior of the disease may play a key part in designing more effective,...
Article
Full-text available
(1) Background: One of the most common cancers that affect North American men and men worldwide is prostate cancer. The Gleason score is a pathological grading system to examine the potential aggressiveness of the disease in the prostate tissue. Advancements in computing and next-generation sequencing technology now allow us to study the genomic pro...
Preprint
Full-text available
(1) Background: One of the deadliest cancers that affect men worldwide and North American men is prostate cancer. This disease motivates parts of the cells in the prostate to lose control of their growth and division. (2) Methods: We are proposing a machine learning method used to analyze gene expressions of prostate tumors with different Gleason sco...
Conference Paper
Diagnosing and understanding the molecular mechanisms of cancer initiation and progression is critical for effective management in patients with sporadic colorectal cancer (CRC) in young adults. However, the lack of relevant biomarkers to identify this vital group of patients has been a major challenge. The main goal of this work is to devise a dee...
Conference Paper
Full-text available
Identifying biomarkers that can be used to classify certain disease stages, or identify when a disease becomes more aggressive is one of the most important applications of machine learning. Traditional biomarker identification approaches, typically, use machine learning techniques to identify a number of genes and macromolecules as biomarkers that...
Article
Full-text available
Genomic profiles among different breast cancer survivors who received similar treatment may provide clues about the key biological processes involved in the cells and finding the right treatment. More specifically, such profiling may help personalize the treatment based on the patients' gene expression. In this paper, we present a hierarchical mach...
Article
Full-text available
Prostate cancer is one of the most common types of cancer among Canadian men. Next-generation sequencing using RNA-Seq provides large amounts of data that may reveal novel and informative biomarkers. We introduce a method that uses machine learning techniques to identify transcripts that correlate with prostate cancer development and progression. W...
Article
Constructing a boundary representation for an AM part is challenging due to the large number of CSG operations that need to be performed. To tackle the problem, we begin with a review of different numeric representations and their suitability for solving geometric problems. We then review the state of the art in explicit boundary representations, e...
Article
Full-text available
Background: The prediction of calmodulin-binding (CaM-binding) proteins plays a very important role in the fields of biology and biochemistry, because the calmodulin protein binds and regulates a multitude of protein targets affecting different cellular processes. Computational methods that can accurately identify CaM-binding proteins and CaM-bindin...
Article
Full-text available
Analyzing the genetic activity of breast cancer survival for a specific type of therapy provides a better understanding of the body's response to the treatment and helps select the best course of action while leading to the design of drugs based on gene activity. In this work, we use supervised and unsupervised machine learning methods to deal w...
Data
SupplementaryMaterial_CLN – Supplemental material for A Novel Approach for Identifying Relevant Genes for Breast Cancer Survivability on Specific Therapies
Conference Paper
Constructing a boundary representation for an AM part is challenging due to the large number of CSG operations that need to be performed. To tackle the problem, we begin with a review of different numeric representations and their suitability for solving geometric problems. We then review the state of the art in explicit boundary representations, exp...
Conference Paper
Full-text available
This paper focuses on the effective classification of the behavior of users accessing computing devices to authenticate them. The authentication is based on keystroke dynamics which captures the user's behavioral biometric and applies machine learning concepts to classify them. The users type a strong passcode ".tie5Roanl" to record their typing pa...
Conference Paper
Full-text available
Machine learning techniques are widely used for diagnosing faults to guarantee the safe and reliable operation of systems. Among various techniques, semi-supervised learning can help in diagnosing faulty states and decision making in partially labeled data, where only a small number of labeled observations along with a large number of unlabeled o...
Conference Paper
Full-text available
Studying gene expression through various time intervals of breast cancer survival may provide new insights into the recovery from the disease. In this work, we propose a hierarchical clustering method to separate dissimilar groups of gene time-series profiles, which have the furthest distances from the rest of the profiles throughout different time...
Conference Paper
Gene expression data have been used in many studies to help reveal the underlying mechanisms of many diseases. In this study, we applied feature selection techniques on breast cancer patients in the METABRIC Study to predict whether patients will be disease free or not, under different treatments. Our models for prediction are of high performance...
Article
Breast cancer is a complex disease that can be classified into at least 10 different molecular subtypes. Appropriate diagnosis of specific subtypes is critical for ensuring the best possible patient treatment and response to therapy. Current computational methods for determining the subtypes are based on identifying differentially expressed genes (...
Article
Full-text available
Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemot...
Data
The predicted and expected response to treatment for each individual METABRIC patient for each analysis listed in Table 1, Table 2 and Table 3 are indexed. Patients sensitive to treatment are labeled with ‘0’ while resistant patients are labeled ‘1’.
Article
Full-text available
Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences...
Conference Paper
Full-text available
Prediction of Calmodulin-binding (CaM-binding) proteins plays a very important role in the fields of biology and biochemistry, because Calmodulin binds and regulates a multitude of protein targets affecting different cellular processes. Short linear motifs (SLiMs), on the other hand, have been effectively used as features for analyzing protein-prot...
Conference Paper
Prostate cancer is a leading cause of death worldwide and the third leading cause of cancer death in North American men. Prostate cancer causes parts of the prostate cells to lose normal control of growth and division. The Gleason classification system is one of the known systems used to grade the aggressiveness of the prostate progression. In...
Article
Full-text available
Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemot...
Conference Paper
An optimal classification model for a given problem should comprise a classifier, a proper feature subset and a parameter set such that the classifier can attain as high a prediction performance as possible. Many recent feature selection methods are either too exhaustive or too greedy. Besides, many classification approaches conduct pa...
Conference Paper
Full-text available
We developed a new tool that can identify open reading frames (ORFs) for a given transcript and reconstruct protein isoforms using RNA-Seq data. Moreover, we use a modified version of the measure of abundance Fragments Per Kilobase of transcript per Million mapped reads (FPKM), aka adaptive FPKM (AFPKM), which in addition to using information about...
Article
Full-text available
Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemot...
Poster
Full-text available
Translation, as the second step of the central dogma in molecular biology, is a process for transforming mRNAs into amino acid chains. An open reading frame (ORF) is a continuous sequence of codons that begins with a start codon and ends with one of the stop codons. Finding ORFs corresponding to a given mRNA transcript is an important step in recons...
Conference Paper
Full-text available
Clustering is a prominent method to identify similar patterns in large groups of data and can be beneficial in bioinformatics studies due to this property. Classical methods such as k-means and maximum likelihood consider a mixture of Gaussian probability density functions (PDFs) of data and find clusters based on maximizing the PDF. However, cor...
Conference Paper
Full-text available
Many bioinformatics data sets have class-imbalanced data, where the number of samples in each class is not equal. Since most data sets contain usual versus unusual cases, e.g. cancer versus normal or miRNAs versus other non-coding RNA, where the minority class with the least number of samples is the interesting class that contains the unusual ca...
Article
Background: In cancer, alternative RNA splicing represents one mechanism for flexible gene regulation, whereby protein isoforms can be created to promote cell growth, division and survival. Detecting novel splice junctions in the cancer transcriptome may reveal pathways driving tumourigenic events. In this regard, RNA-Seq, a high-throughput sequenc...
Chapter
Interactomics aims to study the main aspects of protein interactions in living systems. To understand the complex cellular mechanisms involved in a biological system, it is necessary to study the nature of these interactions at the molecular level, in which prediction of protein-protein interactions (PPIs) plays a significant role. This chapter foc...
Conference Paper
Worldwide, one in nine women is diagnosed with breast cancer in her lifetime, and breast cancer is the second leading cause of death among women. Accurate diagnosis of the specific subtypes of this disease is vital to ensure that the patients will have the best possible response to therapy. Using the newly proposed ten subtypes of breast cancer...
Conference Paper
Worldwide, one in nine women is diagnosed with breast cancer in her lifetime, and breast cancer is the second leading cause of death among women. Accurate diagnosis of the specific subtypes of this disease is vital to ensure that patients will have the best possible response to therapy. One way to discriminate subtypes of breast cancer is to stu...