Anna Cichonska

Nightingale Health

PhD

About

Publications: 52
Reads: 10,814
Citations: 423
Additional affiliations
January 2019 - present
University of Turku
Position
  • PostDoc Position
August 2018 - December 2018
University of Helsinki
Position
  • PostDoc Position
September 2013 - July 2018
Aalto University
Position
  • PhD Student
Description
  • 1) Helsinki Institute for Information Technology HIIT, Aalto University 2) Institute for Molecular Medicine Finland FIMM, University of Helsinki https://www.fimm.fi/en/training/doctoral-training/fimm-embl-international-phd
Education
September 2013 - July 2018
Aalto University and Institute for Molecular Medicine, Finland
Field of study
  • Machine Learning in Bioinformatics
February 2012 - July 2012
Politecnico di Milano
Field of study
  • Computer Science
February 2011 - September 2012
Silesian University of Technology
Field of study
  • Biotechnology with major in Bioinformatics

Publications (52)
Article
Introduction: System-wide identification of both on- and off-targets of chemical probes provides improved understanding of their therapeutic potential and possible adverse effects, thereby accelerating and de-risking the drug discovery process. Given the high costs of experimental profiling of the complete target space of drug-like compounds, computat...
Article
Full-text available
Motivation: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohort...
Article
Full-text available
Due to relatively high costs and labor required for experimental profiling of the full target space of chemical compounds, various machine learning models have been proposed as cost-effective means to advance this process in terms of predicting the most potent compound-target interactions for subsequent verification. However, most of the model pred...
Article
Full-text available
Motivation: Many inference problems in bioinformatics, including drug bioactivity prediction, can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g. drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and especially...
Thesis
Full-text available
Systems pharmacology aims to transform large-scale heterogeneous clinical and biological data into actionable therapeutic strategies. This thesis develops practical machine learning frameworks that contribute to different aspects of systems pharmacology, including the determination of therapeutic drug targets for complex diseases through genome-wide...
Article
Pleiotropy and genetic correlation are widespread features in GWAS, but they are often difficult to interpret at the molecular level. Here, we perform GWAS of 16 metabolites clustered at the intersection of amino acid catabolism, glycolysis, and ketone body metabolism in a subset of UK Biobank. We utilize the well-documented biochemistry jointly im...
Article
Full-text available
Background: The direct effects of general adiposity (body mass index (BMI)) and central adiposity (waist-to-hip-ratio (WHR)) on circulating lipoproteins, lipids, and metabolites are unknown. Methods: We used new metabolic data from UK Biobank (N=109,532, a five-fold higher N over previous studies). EDTA-plasma was used to quantify 249 traits wit...
Preprint
Blood lipids and metabolites are both markers of current health and indicators of risk for future disease. Here, we describe plasma nuclear magnetic resonance (NMR) biomarker data for 118,461 participants in the UK Biobank, an open resource for public health research with extensive clinical and genomic data. The biomarkers cover 249 measures of lip...
Article
Context: While Asians have a higher risk of type 2 diabetes (T2D) than Europeans for a given BMI, it remains unclear whether the same markers of metabolic pathways are associated with diabetes. Objective: We evaluated associations between metabolic biomarkers and incident T2D in three major Asian ethnic groups (Chinese, Malay, and Indian) and a Euro...
Preprint
Pleiotropy and genetic correlation are widespread features in GWAS, but they are often difficult to interpret at the molecular level. Here, we perform GWAS of 16 metabolites clustered at the intersection of amino acid catabolism, glycolysis, and ketone body metabolism in a subset of UK Biobank. We utilize the well-documented biochemistry jointly im...
Article
Full-text available
Despite decades of intensive search for compounds that modulate the activity of particular protein targets, a large proportion of the human kinome remains as yet undrugged. Effective approaches are therefore required to map the massive space of unexplored compound–kinase interactions for novel and potent activities. Here, we carry out a crowdsource...
Article
Full-text available
Motivation: Combination therapies have emerged as a powerful treatment modality to overcome drug resistance and improve treatment efficacy. However, the number of possible drug combinations increases very rapidly with the number of individual drugs in consideration, which makes the comprehensive experimental screening infeasible in practice. Machine...
Preprint
Aims/hypothesis: We aimed to evaluate metabolic biomarkers in relation to incident type 2 diabetes (T2D) representing three major ethnic groups in Asia (Chinese, Malay, and Indian) and a European population. Methods: We used data from male and female adult participants of multiple cohorts, including two cohorts from Singapore (n = 6,393 Asians) con...
Preprint
Full-text available
Background: The causal impact of excess adiposity on systemic metabolism is unclear. We used multivariable Mendelian randomization to compare the direct effects of total adiposity (using body mass index (BMI)) and abdominal adiposity (using waist-to-hip-ratio (WHR)) on circulating lipoproteins, lipids, and metabolites with a five-fold increase in s...
Article
Full-text available
Biomarkers of low-grade inflammation have been associated with susceptibility to a severe infectious disease course, even when measured prior to disease onset. We investigated whether metabolic biomarkers measured by nuclear magnetic resonance (NMR) spectroscopy could be associated with susceptibility to severe pneumonia (2507 hospitalised or fatal...
Preprint
Full-text available
Motivation: Combination therapies have emerged as a powerful treatment modality to overcome drug resistance and improve treatment efficacy. However, the number of possible drug combinations increases very rapidly with the number of individual drugs in consideration which makes the comprehensive experimental screening infeasible in practice. Machine...
Article
Full-text available
We present comboFM, a machine learning framework for predicting the responses of drug combinations in pre-clinical studies, such as those based on cell lines or patient-derived cells. comboFM models the cell context-specific drug interactions through higher-order tensors, and efficiently learns latent factors of the tensor using powerful factorizat...
Article
Multivariate methods are known to increase the statistical power to detect associations in the case of shared genetic basis between phenotypes. They have, however, lacked essential analytic tools to follow-up and understand the biology underlying these associations. We developed a novel computational workflow for multivariate GWAS follow-up analyse...
Preprint
Full-text available
We present comboFM, a machine learning framework for predicting the responses of drug combinations in preclinical studies, such as those based on cell lines or patient-derived cells. comboFM models the cell context-specific drug interactions through higher-order tensors, and efficiently learns latent factors of the tensor using powerful factorizati...
Preprint
Full-text available
Background: Identification of healthy people at high risk for severe COVID-19 is a global health priority. We investigated whether blood biomarkers measured by high-throughput metabolomics could be predictive of severe pneumonia and COVID-19 hospitalisation years after the blood sampling. Methods: Nuclear magnetic resonance metabolomics was used to...
Preprint
Full-text available
This paper proposes a novel method for learning highly nonlinear, multivariate functions from examples. Our method takes advantage of the property that continuous functions can be approximated by polynomials, which in turn are representable by tensors. Hence the function learning problem is transformed into a tensor reconstruction problem, an inver...
Preprint
Full-text available
Despite decades of intensive search for compounds that modulate the activity of particular proteins, there are currently small-molecule probes available only for a small proportion of the human proteome. Effective approaches are therefore required to map the massive space of unexplored compound-target interactions for novel and potent activities. H...
Preprint
Full-text available
Multivariate methods are known to increase the statistical power of association detection, but they have lacked essential follow-up analysis tools necessary for understanding the biology underlying these associations. We developed a novel computational workflow for multivariate GWAS follow-up analyses, including fine-mapping and identification of t...
Presentation
Full-text available
Many real world prediction problems can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g. drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and especially multiple kernel learning (MKL) offers promising benefits as...
Data
Scatter plots between the measured compound-kinase binding affinities from the Metz et al. study and their model predictions obtained using KronRLS with the KD-GIP drug kernel and the KP-GS-domain protein kernel. r indicates Pearson correlation and p-values were calculated using a Student's t distribution for a transformation of the correlation, as implemented...
Data
Interaction map between 152 compounds (rows) and 138 kinases (columns) profiled in the study of Metz et al. The white cells represent unmeasured binding affinities. The higher the pKi value, the stronger the affinity between the compound and kinase. (PDF)
Data
(a) Leave-one-out and (b) leave-drug-out cross-validation results. The prediction accuracy was assessed with root mean squared error (RMSE) between binding affinities (pKi) from the study by Metz et al. and those predicted using the KronRLS model with different pairs of drug (rows) and protein (columns) molecular descriptors encoded as kernel matrices...
Data
Distribution of 152 drug-wise Pearson correlation values between compound-kinase binding affinities (pKi) measured in the Metz et al. study and their model predictions under the New Drug scenario. The predictions were made using the KronRLS algorithm with the best pair of drug and protein kernels (KD-sp and KP-GS) under the leave-drug-out cross-validat...
Data
(a) Leave-target-out cross-validation results. The prediction accuracy was evaluated with Pearson correlation (r) between binding affinities (pKi) from the study by Metz et al. and those predicted using KronRLS algorithm with different pairs of drug (rows) and protein (columns) molecular descriptors encoded as kernel matrices (b). (PDF)
Data
Distribution of 16,265 compound-kinase binding affinities measured in the study of Metz et al. (PDF)
Data
Predicted target profile of tivozanib and the results of experimental validation. (XLSX)
Data
Kinase inhibitors used in our experimental assays. (PDF)
Data
Comparison between model-predicted (based on the data from the Metz et al. study) and experimentally measured (in the study by Davis et al.) compound-kinase bioactivities. (a) The Bioactivity Imputation scenario. The KronRLS model was trained using the Metz et al. dataset together with the best-performing (under the Bioactivity Imputation scenario) dru...
Data
Kinase dendrogram created by hierarchical clustering of 138 kinases from Metz et al. study based on their bioactivity data (pKi). The bar length is proportional to the number of compound interactions for each kinase (pKi ≥ 7 M). On-targets of tivozanib are marked with red lines (FLT1, FLT4, KDR), fedratinib–green lines (JAK2, JAK3, TYK2), vx11e and...
Data
Schematic illustration of the nested leave-one-out cross-validation (LOO-CV) procedure, consisting of the inner loop for model selection and the outer loop for model performance estimation. A single round of the outer CV is shown, where one compound-protein pair is removed from the training data and used as a test fold. The inner leave-one-out CV i...
Data
Results of our kinase assay for testing bioactivities predicted to fill the experimental gaps in the large-scale kinase inhibitor target profiling study by Metz et al.; examples of drug response curves obtained as described in Materials and Methods section of the main paper. Corresponding pIC50 values are summarized in S2 Table. (PDF)
Data
Results of our kinase assay for testing predicted target interactions for a new investigational kinase inhibitor tivozanib; drug response curves obtained as described in Materials and Methods section of the main paper. Corresponding pIC50 values are summarized in S3 Table. (PDF)
Data
Possible drug-protein interaction prediction scenarios. In this work, we focused on the two most practical ones, namely the Bioactivity Imputation (Fig 2A) and the New Drug (Fig 2B) scenarios. Additionally, we included the results under the New Target setup (Fig 2C) in S7 Fig. (dx, px) denotes the query drug-protein pair, the binding affinity of which...
Data
Model parameters tested. In the case of Gaussian kernels (KD-GIP, KP-GIP), the values for the kernel width parameter σ were selected by computing pairwise distances between all data points and taking the 0.1, 0.5 and 0.9 quantiles. (PDF)
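The quantile-based choice of σ described in this caption can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation: the function names, the Euclidean distance, the kernel form exp(-d²/(2σ²)), and the random feature matrix `X` are all assumptions made for the example. The caption itself states only that pairwise distances were computed and that the 0.1, 0.5 and 0.9 quantiles were taken.

```python
import numpy as np

def pairwise_distances(X):
    # Euclidean distances between all rows of X (the metric is an assumption).
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def candidate_sigmas(X, quantiles=(0.1, 0.5, 0.9)):
    # Count each unordered pair once: upper triangle, diagonal excluded.
    d = pairwise_distances(X)
    iu = np.triu_indices(X.shape[0], k=1)
    return np.quantile(d[iu], quantiles)

def gaussian_kernel(X, sigma):
    # Assumed kernel form: K[i, j] = exp(-d(i, j)^2 / (2 * sigma^2)).
    d = pairwise_distances(X)
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

# Hypothetical data: 20 points with 5 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))

sigmas = candidate_sigmas(X)                       # three candidate widths
kernels = [gaussian_kernel(X, s) for s in sigmas]  # one kernel matrix per width
```

Each candidate width yields one kernel matrix; choosing among them would then be left to the model selection loop of the cross-validation procedure.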
Data
(a,b) Scatter plots between compound-kinase binding affinities (pKi) measured in the Metz et al. study and their model predictions under the (a) Bioactivity Imputation and (b) New Drug setups, using the KronRLS algorithm with the best pairs of drug and protein kernels. r indicates Pearson correlation and p-values were calculated using a Student's t distri...
Data
Comparison between model-predicted bioactivities and bioactivities measured experimentally in different assays for the 100 compound-kinase pairs included in our experimental validation. (a,c) Technical variability between two experimental kinase assays. Scatter plots between (a) 82 pKi values measured in the Metz et al. study and pIC50 values from our experimental as...
Data
Kinome map of 138 kinases used in our work. Figure was created with KinMap (http://kinhub.org/kinmap). (PDF)
Data
Schematic illustration of the nested leave-drug-out cross-validation (LDO-CV) procedure, consisting of the inner loop for model selection and the outer loop for model performance estimation. A single round of the outer CV is shown, where all binding affinities of a selected compound are removed from the training data and used as a test fold. The inner...
Data
Visualisation of the binding between tivozanib and ABL1 (PDB code: 2e2b). The docking was performed with Rosetta (https://www.rosettacommons.org/) and the figure was created using UCSF Chimera (https://www.cgl.ucsf.edu/chimera/). The docking radius was set to 5 Å around the centre of the ATP-binding site. (PNG)
Data
Results of the experimental validation of bioactivities predicted to fill the gaps in the study of Metz et al. (XLSX)
Data
Kinases, substrates, and their concentrations used in tivozanib’s off-target testing. (PDF)
Article
Full-text available
Background: New treatment options are needed to maintain and improve therapy for tuberculosis, which caused the death of 1.5 million people in 2013 despite potential for an 86% treatment success rate. A greater understanding of Mycobacterium tuberculosis (M.tb) bacilli that persist through drug therapy will aid drug development programs. Predicti...
Research
Full-text available
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analysing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts and re...
Preprint
Full-text available
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analysing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts and restric...

Projects (2)
Project
This Challenge seeks to evaluate the power of statistical and machine learning models as a systematic and cost-effective means for catalyzing compound-target interaction mapping efforts by prioritizing the most potent interactions for further experimental evaluation. The Challenge will focus on kinase inhibitors, due to their clinical importance, and will be implemented in a screening-based, pre-competitive drug discovery project in collaboration with the NIH-funded IDG Kinase-DRGC consortium, with the aim of establishing kinome-wide target profiles of small-molecule agents and extending the druggability of the human kinome space.
Project
We are developing new computational methods for mapping the drug-target interactions and bioactivity profiles of drugs in different diseases, in particular taking into account the many-to-many relationships between drugs, targets and diseases.