Mateusz Staniak

University of Wroclaw | WROC ·  Instytut Matematyczny

Master of Science

About

11 Publications
8,869 Reads
75 Citations
Introduction
Second-year PhD student at the University of Wrocław. I am interested in biostatistics and bioinformatics, in particular statistical methods for mass-spectrometry-based proteomics. I also write statistical software in R.


Publications (11)
Article
Full-text available
Complex models are commonly used in predictive modeling. In this paper we present R packages that can be used to explain predictions from complex black box models and attribute parts of these predictions to input features. We introduce two new approaches and corresponding packages for such attribution, namely live and breakDown. We also compare the...
Preprint
Full-text available
The increasing availability of large but noisy data sets with a large number of heterogeneous variables leads to the increasing interest in the automation of common tasks for data analysis. The most time-consuming part of this process is the Exploratory Data Analysis, crucial for better domain understanding, data cleaning, data validation, and feat...
Preprint
Full-text available
The VARCLUST algorithm is proposed for clustering variables under the assumption that variables in a given cluster are linear combinations of a small number of hidden latent variables, corrupted by random noise. The entire clustering task is viewed as the problem of selection of the statistical model, which is defined by the number of clusters, the...
Article
Full-text available
Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is the only potential cure for PNH; however, the data on its utility in PNH are limited. Retrospective analysis of patients with PNH who underwent allo-HSCT in 11 Polish centers in 2002-2016 included 78 PNH patients: 27 classic (cPNH), 51 bone-marrow-failure-associated (BMF/PNH), 59%-mal...
Article
Full-text available
The increasing availability of large but noisy data sets with a large number of heterogeneous variables leads to the increasing interest in the automation of common tasks for data analysis. The most time-consuming part of this process is the Exploratory Data Analysis, crucial for better domain understanding, data cleaning, data validation, and feat...
Presentation
Full-text available
A presentation I gave at the Faculty of Computer and Information Science, University of Ljubljana (https://www.fri.uni-lj.si/en). It is based on my presentation at the TFML 2019 conference, but it adds several new insights about local explanations in the tabular data setting. I am grateful to the Laboratory for Cognitive Modeling (https://www.fri...
Presentation
Full-text available
Short talk given at the TFML 2019 conference (tfml.gmum.net) that describes a particular challenge in explainability for tabular data: finding meaningful interpretable features.
Preprint
Full-text available
The paper deals with fluctuations of Kendall random walks, which are extremal Markov chains. We give the joint distribution of the first ascending ladder epoch and height over any level $a \geq 0$ and the distribution of the maximum and minimum for these extremal Markovian sequences. We show that the distribution of the first crossing time of level $a \geq 0$ i...
Poster
Full-text available
From online ad targeting to aiding medical and legal decision making, complex machine learning models such as deep neural networks are pervasive in the digital era. However, decisions made by black box models can’t be justified and explained, which often makes them untrustworthy. In this poster, we present methods that help explain decision ma...
Presentation
Full-text available
Presentation on local interpretability of machine learning models, which summarizes findings from https://arxiv.org/abs/1804.01955



Projects (3)
Project
Develop novel statistical methods for protein inference and quantification that make use of shared peptides based on label-free and TMT data.
Archived project
Apply AI methods to make the exploration of data and models easier, faster and more efficient. The project will focus strongly on visualization and high-dimensional data.
Archived project
Explore methods of explaining single predictions made by machine learning models, in particular visualization techniques.