Markus Heinonen

Aalto University · Department of Computer Science

PhD

About

62
Publications
7,993
Reads
766
Citations
Introduction
Machine learning and bioinformatics researcher

Publications (62)
Preprint
We introduce TCRconv, a deep learning model for predicting recognition between T-cell receptors and epitopes. TCRconv uses a deep protein language model and convolutions to extract contextualized motifs and provides state-of-the-art TCR-epitope prediction accuracy. Using TCR repertoires from COVID-19 patients, we demonstrate that TCRconv can provid...
Article
Surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations. The current state-of-the-art performance for this task has been achieved by Bayesian Optimization with Gaussian Processes (GPs). While this combination works well for unimodal target distributions, it is restricting the flexibi...
Preprint
Full-text available
Approximate Bayesian inference estimates descriptors of an intractable target distribution - in essence, an optimization problem within a family of distributions. For example, Langevin dynamics (LD) extracts asymptotically exact samples from a diffusion process because the time evolution of its marginal distributions constitutes a curve that minimi...
Preprint
Recent machine learning advances have proposed black-box estimation of unknown continuous-time system dynamics directly from data. However, earlier works are based on approximative ODE solutions or point estimates. We propose a novel Bayesian nonparametric model that uses Gaussian processes to infer posteriors of unknown ODE systems directly from d...
Preprint
Sample-efficient domain adaptation is an open problem in robotics. In this paper, we present affine transport, a variant of optimal transport that models the mapping between the state transition distributions of the source and target domains with an affine transformation. First, we derive the affine transport framework; then, we extend the bas...
Article
Full-text available
The adaptive immune system uses T cell receptors (TCRs) to recognize pathogens and to consequently initiate immune responses. TCRs can be sequenced from individuals, and methods analyzing the specificity of the TCRs can help us better understand individuals’ immune status in different disorders. For this task, we have developed TCRGP, a novel Gaussian p...
Preprint
Model-based reinforcement learning (MBRL) approaches rely on discrete-time state transition models whereas physical systems and the vast majority of control tasks operate in continuous-time. To avoid time-discretization approximation of the underlying process, we propose a continuous-time MBRL framework based on a novel actor-critic method. Our app...
Article
Full-text available
In this work, deoxyribose-5-phosphate aldolase (Ec DERA, EC 4.1.2.4) from Escherichia coli was chosen as the protein engineering target for improving the substrate preference towards smaller, non-phosphorylated aldehyde donor substrates, in particular towards acetaldehyde. The initial broad set of mutations was directed to 24 amino acid positions i...
Preprint
Reinforcement learning provides a framework for learning to control which actions to take towards completing a task through trial-and-error. In many applications observing interactions is costly, necessitating sample-efficient learning. In model-based reinforcement learning efficiency is improved by learning to simulate the world dynamics. The chal...
Preprint
In machine learning and computer vision, optimal transport has had significant success in learning generative models and defining metric distances between structured and stochastic data objects that can be cast as probability measures. The key element of optimal transport is the so-called lifting of an exact cost (distance) function, define...
Preprint
In recent years, surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations. The current state-of-the-art performance for this task has been achieved by Bayesian Optimization with Gaussian Processes (GPs). While this combination works well for unimodal target distributions, it is restri...
Preprint
The behavior of many dynamical systems follows complex, yet still unknown partial differential equations (PDEs). While several machine learning methods have been proposed to learn PDEs directly from data, previous methods are limited to discrete-time approximations or make the limiting assumption of the observations arriving at regular grids. We pro...
Chapter
We propose deep convolutional Gaussian processes, a deep Gaussian process architecture with convolutional structure. The model is a principled Bayesian framework for detecting hierarchical combinations of local features for image classification. We demonstrate greatly improved image classification performance compared to current convolutional Gauss...
Preprint
Full-text available
Variational inference techniques based on inducing variables provide an elegant framework for scalable posterior estimation in Gaussian process (GP) models. Most previous works treat the locations of the inducing variables, i.e. the inducing inputs, as variational hyperparameters, and these are then optimized together with GP covariance hyper-param...
Preprint
We present Ordinary Differential Equation Variational Auto-Encoder (ODE²VAE), a latent second order ODE model for high-dimensional sequential data. Leveraging the advances in deep generative models, ODE²VAE can simultaneously learn the embedding of high dimensional trajectories and infer arbitrarily complex continuous-time latent dynamics. Ou...
Preprint
Full-text available
We introduce the convolutional spectral kernel (CSK), a novel family of interpretable and non-stationary kernels derived from the convolution of two imaginary radial basis functions. We propose the input-frequency spectrogram as a novel tool to analyze nonparametric kernels as well as the kernels of deep Gaussian processes (DGPs). Observing through...
Preprint
Full-text available
T cell receptors (TCRs) can recognize various pathogens and consequently start immune responses. TCRs can be sequenced from individuals and methods that can analyze the specificity of the TCRs can help us better understand the individual's immune status in different diseases. We have developed TCRGP, a novel Gaussian process (GP) method that can pr...
Article
Background. Acquired aplastic anemia (AA) is a bone marrow failure syndrome, in which patients' hematopoietic stem cells are destroyed, resulting in pancytopenia. The exact mechanism and biological process leading to AA remain largely unknown. Bone marrow destruction is perceived as an immune-mediated process, which is supported by elevated cytotox...
Preprint
Full-text available
Standard kernels such as Matérn or RBF kernels only encode simple monotonic dependencies within the input space. Spectral mixture kernels have been proposed as general-purpose, flexible kernels for learning and discovering more complicated patterns in the data. Spectral mixture kernels have recently been generalized into non-stationary kernels by...
Preprint
Full-text available
The expressive power of Gaussian processes depends heavily on the choice of kernel. In this work we propose the novel harmonizable mixture kernel (HMK), a family of expressive, interpretable, non-stationary kernels derived from mixture models on the generalized spectral representation. As a theoretically sound treatment of non-stationary kernels, H...
Preprint
We propose a novel deep learning paradigm of differential flows that learn stochastic differential equation transformations of inputs prior to a standard classification or regression function. The key property of differential Gaussian processes is the warping of inputs through infinitely deep, but infinitesimal, differential fields, that generali...
Preprint
We propose deep convolutional Gaussian processes, a deep Gaussian process architecture with convolutional structure. The model is a principled Bayesian framework for detecting hierarchical combinations of local features for image classification. We demonstrate greatly improved image classification performance compared to current Gaussian process ap...
Article
Full-text available
The vascular endothelium is considered a key cell compartment for the response to ionizing radiation of normal tissues and tumors, and a promising target to improve the differential effect of radiotherapy in the future. Following radiation exposure, the global endothelial cell response covers a wide range of gene, miRNA, protein and metabolit...
Data
The spectral k-means clustering algorithm, in which outlier-resistant k-means clustering was used in the eigenspace of the graph Laplacian. (PPTX)
Data
Comparison of the proposed kernels (Bhattacharyya, expected likelihood, Kullback-Leibler, and overlap coefficient) within a simulated gene expression study. The OVL and BH kernels achieve a consistently high performance. (PPTX)
Data
Complete list of the 49 differential genes found in the 43 clusters. The names, descriptions and Swiss-Prot IDs of the 49 statistically differentially expressed genes (determined by the GPR model as previously published in [7]) found in the 49 clusters of temporal expression are given. (XLSX)
Data
Complete list of the 47 transcription factors associated with the 49 differential genes. The names, descriptions and Swiss-Prot IDs of the 47 TFs predicted from the 49 differential genes using the MotifMap system are given in this table. (XLSX)
Data
List of clusters, genes, transcription motifs and associated transcription factors. This table gives the names of the genes, the motif IDs and the names (in MotifMap and their corresponding names in Pathway Studio) of the predicted TFs for each cluster of differential genes. (XLSX)
Data
Subnetwork enrichment of BIRC5, CXCL8, CXCL10, CXCL12, PTGS2 (regulating cell processes). The table presents the result of the subnetwork enrichment of the five node genes BIRC5, CXCL8, CXCL10, CXCL12, PTGS2 searching for regulating cell processes using the Pathway Studio software. Ranks of hits are based on the p-values. (XLSX)
Data
Comparison of the proposed spectral k-means-clustering with varying outlier ratio against standard spectral k-means and spectral EM clustering algorithms within a simulated experiment. The outlier approach achieves an overall performance similar to that of standard k-means, but with higher precision and lower recall. (PPTX)
Data
Motifs and transcription factors associated with the 301 measured genes (MotifMap analysis). This table presents the results of the MotifMap system analysis using an FDR of 0.1. The motifs, their location with respect to the start codon and their location in the genome, as well as the predicted TFs and their Bayesian Branch Length Score (BBLS) are...
Data
Motifs and transcription factors associated with the 78 differential genes (MotifMap analysis). This table presents the results of the MotifMap system analysis using an FDR of 0.1. The motifs, their location with respect to the start codon and their location in the genome, as well as the predicted TFs and their Bayesian Branch Length Score (BBLS) a...
Data
Occurrences of predicted transcription factors (per day and per time window). We report here the number of times each TF was respectively predicted for each day and each time window (days 1–4, 4–7, 7–10, 10–14, 14–17 and 17–21) post-irradiation. Cluster numbers are also indicated for each day and time window. (XLSX)
Article
Full-text available
Motivation: Many inference problems in bioinformatics, including drug bioactivity prediction, can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g. drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and especially...
Article
Full-text available
Motivation: Metabolic flux balance analysis (FBA) is a standard tool in analyzing metabolic reaction rates compatible with measurements, steady-state and the metabolic reaction network stoichiometry. Flux analysis methods commonly place model assumptions on fluxes due to the convenience of formulating the problem as a linear programming model, whil...
Article
Zero-inflated datasets, which have an excess of zero outputs, are commonly encountered in problems such as climate or rare event modelling. Conventional machine learning approaches tend to overestimate the non-zeros leading to poor performance. We propose a novel model family of zero-inflated Gaussian processes (ZiGP) for such zero-inflated dataset...
Article
In conventional ODE modelling, coefficients of an equation driving the system state forward in time are estimated. However, for many complex systems it is practically impossible to determine the equations or interactions governing the underlying dynamics. In these settings, a parametric ODE model cannot be formulated. Here, we overcome this issue by i...
Article
Full-text available
Motivation: Proteins are commonly used by the biochemical industry for numerous processes. Refining these proteins' properties via mutations causes stability effects as well. An accurate computational method to predict how mutations affect protein stability is necessary to facilitate efficient protein design. However, the accuracy of predictive models is ult...
Article
Computationally modeling changes in binding free energies upon mutation (interface ΔΔG) allows large-scale prediction and perturbation of protein-protein interactions. Additionally, methods that consider and sample relevant conformational plasticity should be able to achieve higher prediction accuracy over methods that do not. To test this hypothes...
Conference Paper
Full-text available
We propose non-stationary spectral kernels for Gaussian process regression. We propose to model the spectral density of a non-stationary kernel function as a mixture of input-dependent Gaussian process frequency density surfaces. We solve the generalised Fourier transform with such a model, and present a family of non-stationary and non-monotonic k...
Preprint
Full-text available
Computationally modeling changes in binding free energies upon mutation (interface ΔΔG) allows large-scale prediction and perturbation of protein-protein interactions. Additionally, methods that consider and sample relevant conformational plasticity should be able to achieve higher prediction accuracy over methods that do not. To test this hypoth...
Conference Paper
Full-text available
We introduce a novel kernel that models input-dependent couplings across multiple latent processes. The pairwise joint kernel measures covariance along inputs and across different latent signals in a mutually-dependent fashion. A latent correlation Gaussian process (LCGP) model combines these non-stationary latent components into multiple outputs b...
Article
Full-text available
Background: The filamentous fungus Trichoderma reesei (teleomorph Hypocrea jecorina) is a widely used industrial host organism for protein production. In industrial cultivations, it can produce over 100 g/l of extracellular protein, mostly consisting of cellulases and hemicellulases. In order to improve protein production of T. reesei the transcri...
Article
Full-text available
Devoted to multi-task learning and structured output learning, operator-valued kernels provide a flexible tool to build vector-valued functions in the context of Reproducing Kernel Hilbert Spaces. To scale up these methods, we extend the celebrated Random Fourier Feature methodology to get an approximation of operator-valued kernels. We propose a g...
Article
Full-text available
We present a novel approach for fully non-stationary Gaussian process regression (GPR), where all three key parameters -- noise variance, signal variance and lengthscale -- can be simultaneously input-dependent. We develop gradient-based inference methods to learn the unknown function and the non-stationary model parameters, without requiring any m...
Article
Full-text available
Modeling dynamical systems with ordinary differential equations implies a mechanistic view of the process underlying the dynamics. However, in many cases, this knowledge is not available. To overcome this issue, we introduce a general framework for nonparametric ODE models using penalized regression in Reproducing Kernel Hilbert Spaces (RKHS) based...
Article
Full-text available
Motivation: Identifying the set of genes differentially expressed along time is an important task in two-sample time course experiments. Furthermore, estimating at which time periods the differential expression is present can provide additional insight into temporal gene functions. The current differential detection methods are designed to detect...
Article
Full-text available
Metabolite identification is a major bottleneck in metabolomics due to the number and diversity of the molecules. To alleviate this bottleneck, computational methods and tools that reliably filter the set of candidates are needed for further analysis by human experts. Recent efforts in assembling large public mass spectral databases such as MassBan...
Article
Full-text available
Metabolite identification from tandem mass spectra is an important problem in metabolomics, underpinning subsequent metabolic modelling and network analysis. Yet, currently this task requires matching the observed spectrum against a database of reference spectra originating from similar equipment and closely matching operating parameters, a conditi...
Conference Paper
Reflection seismic data acquired in hard-rock terrains are often difficult to interpret due to complex geological architecture of the target areas. Even fairly simple geological structures, such as folds, can be difficult to identify from the seismic profiles because the reflection method is only able to image the sub-horizontal fold hinges, and no...
Article
Full-text available
Kernels for structured data are rapidly becoming an essential part of the machine learning toolbox. Graph kernels provide similarity measures for complex relational objects, such as molecules and enzymes. Graph kernels based on walks are popular due to their fast computation but their predictive performance is often not satisfactory, while kernels bas...
Article
Full-text available
The ability to trace the fate of individual atoms through the metabolic pathways is needed in many applications of systems biology and drug discovery. However, this information is not immediately available from the most common metabolome studies and needs to be separately acquired. Automatic discovery of correspondence of atoms in biochemical react...
Conference Paper
Full-text available
We present a structured output prediction approach for classifying potential anti-cancer drugs. Our QSAR model takes as input a description of a molecule and predicts the activity against a set of cancer cell lines in one shot. Statistical dependencies between the cell lines are encoded by a Markov network that has cell lines as nodes and edges rep...
Conference Paper
Full-text available
We present a multilabel learning approach for molecular classification, an important task in drug discovery. We use a conditional random field to model the dependencies between drug targets and discriminative training to separate correct multilabels from incorrect ones with a large margin. Efficient training of the model is ensured by conditional g...
Article
We present FiD (Fragment iDentificator), a software tool for the structural identification of product ions produced with tandem mass spectrometric measurement of low molecular weight organic compounds. Tandem mass spectrometry (MS/MS) has proven to be an indispensable tool in modern, cell-wide metabolomics and fluxomics studies. In such studies, th...
Article
Tandem mass spectrometry (MS/MS) is an indispensable method for fast and accurate analysis of the metabolism of a cell. In metabolomics studies, the structural information contained in the MS/MS spectra helps in the identification and quantitative analysis of unknown metabolites. Commercially available software for the structural elucidation...
Conference Paper
Full-text available
Mass spectrometry is one of the key enabling measurement technologies for systems biology, due to its ability to quantify molecules in small concentrations. Tandem mass spectrometers tackle the main shortcoming of mass spectrometry, the fact that molecules with an equal mass-to-charge ratio are not separated. In a tandem mass spectrometer molecules...

Projects (2)
Project
We have developed an array of methods and tools for the analysis of metabolic networks, including Metabolite Flux analysis, Metabolic Network Reconstruction and Pathway analysis. Our methods combine combinatorial network analysis with advanced machine learning.