Thomas Burger

French National Centre for Scientific Research | CNRS · ProFI (FR2048)

PhD
All the PDFs of my papers/preprints can be freely accessed through my personal webpage: download them without request!

102
Publications
21,502
Reads
1,865
Citations

Publications (102)
Article
Full-text available
Phosphorylation is a major post-translational modification (PTM) of proteins which is finely tuned by the activity of several hundred kinases and phosphatases. It controls most if not all cellular pathways including anti-viral responses. Accordingly, viruses often induce important changes in the phosphorylation of host factors that can either promote...
Article
Full-text available
Background Metabolic dysfunction-associated steatotic liver disease (MASLD) is estimated to affect 30% of the world’s population, and its prevalence is increasing in line with obesity. Liver fibrosis is closely related to mortality, making it the most important clinical parameter for MASLD. It is currently assessed by liver biopsy – an invasive pro...
Preprint
Phosphorylation is a major post-translational modification (PTM) of proteins, and small molecules, which is finely tuned by the activity of several hundred kinases and phosphatases. It controls most if not all cellular pathways including anti-viral responses. Accordingly, viruses often induce important changes in the phosphorylation of host factors t...
Article
Full-text available
Selecting omic biomarkers using both their effect size and their differential status significance (i.e., selecting the “volcano-plot outer spray”) has long been equally biologically relevant and statistically troublesome. However, recent proposals are paving the way to resolving this dilemma.
Preprint
Full-text available
Label-free bottom-up proteomics using mass spectrometry and liquid chromatography has long been established as one of the most popular high-throughput analysis workflows for proteome characterization. However, it produces data hindered by complex and heterogeneous missing values, whose imputation has long remained problematic. To cope with this, we intro...
Article
Full-text available
Cullin-RING finger ligases (CRLs) represent the largest family of ubiquitin ligases. They are responsible for the ubiquitination of ∼20% of cellular proteins degraded through the proteasome, by catalyzing the transfer of E2-loaded ubiquitin to a substrate. Seven Cullins are described in vertebrates. Among them, CUL4 associates with DDB1 to form the...
Article
In discovery proteomics, as well as many other "omic" approaches, the possibility to test for the differential abundance of hundreds (or of thousands) of features simultaneously is appealing, despite requiring specific statistical safeguards, among which controlling for the false discovery rate (FDR) has become standard. Moreover, when more than tw...
Article
In their recent article, Madej et al. (Madej, D.; Wu, L.; Lam, H. Common Decoy Distributions Simplify False Discovery Rate Estimation in Shotgun Proteomics. J. Proteome Res. 2022, 21 (2), 339-348) proposed an original way to solve the recurrent issue of controlling for the false discovery rate (FDR) in peptide-spectrum-match (PSM) validation. Briefly...
Preprint
In their recent article, Madej et al. 1 proposed an original way to solve the recurrent issue of controlling for the false discovery rate (FDR) in peptide-spectrum-match (PSM) validation. Briefly, they proposed to derive a single precise distribution of decoy matches termed the Common Decoy Distribution (CDD) and to use it to control for FDR during...
Preprint
Full-text available
In discovery proteomics, as well as many other "omic" approaches, more than two biological conditions or group treatments can be compared in the one-way Analysis of Variance (OW-ANOVA) framework. The subsequent possibility to test for the differential abundance of hundreds (or thousands) of features simultaneously is appealing, despite requiring sp...
Article
Full-text available
Background Proteogenomics aims to identify variant or unknown proteins in bottom-up proteomics, by searching transcriptome- or genome-derived custom protein databases. However, empirical observations reveal that these large proteogenomic databases produce lower-sensitivity peptide identifications. Various strategies have been proposed to avoid this...
Article
In their recent review (J. Proteome Res. 2022, 21 (4), 849-864), Crook et al. diligently discuss the basics (and less basic aspects) of Bayesian modeling, survey its various applications to proteomics, and highlight its potential for the improvement of computational proteomic tools. Despite its interest and comprehensiveness on these aspects, the pitfall...
Article
Full-text available
Genes are pleiotropic and getting a better knowledge of their function requires a comprehensive characterization of their mutants. Here, we generated multi-level data combining phenomic, proteomic and metabolomic acquisitions from plasma and liver tissues of two C57BL/6 N mouse models lacking the Lat (linker for activation of T cells) and the Mx2 (...
Preprint
Full-text available
Background Proteogenomics aims to identify variant or unknown proteins in bottom-up proteomics, by searching transcriptome- or genome-derived custom protein databases. However, empirical observations reveal that these large proteogenomic databases produce lower-sensitivity peptide identifications. Various strategies have been proposed to avoid this...
Preprint
Full-text available
In proteomic differential analysis, FDR control is often performed through a multiple test correction (i.e., the adjustment of the original p-values). In this protocol, we apply a recent and alternative method, based on so-called knockoff filters. It shares interesting conceptual similarities with the target-decoy competition procedure, classically...
Article
Factorization of large data corpora has emerged as an essential technique to extract dictionaries (sets of patterns that are meaningful for sparse encoding). Following this line, we present a novel algorithm based on compressive learning theory. In this framework, the (arbitrarily large) dataset of interest is replaced by a fixed‐size sketch result...
Chapter
In proteomic differential analysis, FDR control is often performed through a multiple test correction (i.e., the adjustment of the original p-values). In this protocol, we apply a recent and alternative method, based on so-called knockoff filters. It shares interesting conceptual similarities with the target–decoy competition procedure, classically...
Chapter
Prostar is a software tool dedicated to the processing of quantitative data resulting from mass spectrometry-based label-free proteomics. Practically, once biological samples have been analyzed by bottom-up proteomics, the raw mass spectrometer outputs are processed by bioinformatics tools, so as to identify peptides and quantify them, notably by m...
Article
Full-text available
Background The clustering of data produced by liquid chromatography coupled to mass spectrometry analyses (LC-MS data) has recently gained interest to extract meaningful chemical or biological patterns. However, recent instrumental pipelines deliver data whose size, dimensionality and expected number of clusters are too large to be processed by cla...
Article
Summary: Many factors can influence results in clinical research, in particular bias in the distribution of samples prior to biochemical preparation. Well Plate Maker is a user-friendly application to design single- or multiple-well plate assays. It allows multiple group experiments to be randomized and therefore helps to reduce possible batch effe...
Article
In bottom-up discovery proteomics, target-decoy competition (TDC) is the most popular method for false discovery rate (FDR) control. Despite unquestionable statistical foundations, this method has drawbacks, including its hitherto unknown intrinsic lack of stability vis-à-vis practical conditions of application. Although some consequences of this i...
Preprint
Full-text available
Motivation Quantitative mass spectrometry-based proteomics data are characterized by high rates of missing values, which may be of two kinds: missing completely-at-random (MCAR) and missing not-at-random (MNAR). Despite numerous imputation methods available in the literature, none account for this duality, for it would require to diagnose the missi...
Article
Wilson’s disease (WD), a rare genetic disease caused by mutations in the ATP7B gene, is associated with altered expression and/or function of the copper-transporting ATP7B protein, leading to massive toxic accumulation of copper in the liver and brain. The Atp7b-/- mouse, a genetic and phenotypic model of WD, was developed to provide new insights i...
Preprint
Full-text available
Target-decoy competition (TDC) is the most popular method for false discovery rate (FDR) control in bottom-up discovery proteomics. Despite unquestionable statistical foundations, we unveil a so far unknown weakness of TDC: its intrinsic lack of stability vis-à-vis practical conditions of application. Although some consequences of this instability...
Article
Full-text available
Results from mass spectrometry based quantitative proteomics analysis correspond to a subset of proteins which are considered differentially abundant relative to a control. Their selection is delicate and often requires some statistical expertise in addition to a refined knowledge of the experimental data. To facilitate the selection process, we ha...
Chapter
ProStaR is a software tool dedicated to differential analysis in label-free quantitative proteomics. Practically, once biological samples have been analyzed by bottom-up mass spectrometry-based proteomics, the raw mass spectrometer outputs are processed by bioinformatics tools, so as to identify peptides and quantify them, by means of precursor ion...
Article
The term “spectral clustering” is sometimes used to refer to the clustering of mass spectrometry data. However, it also classically refers to a family of popular clustering algorithms. To avoid confusion, a more specific term could advantageously be coined.
Article
We propose a new hypothesis test for the differential abundance of proteins in mass-spectrometry based relative quantification. An important feature of this type of high-throughput analyses is that it involves an enzymatic digestion of the sample proteins into peptides prior to identification and quantification. Due to numerous homology sequences,...
Article
The vocabulary of theoretical statistics can be difficult to embrace from the viewpoint of computational proteomics research, even though the notions it conveys are essential to publication guidelines. For example, “adjusted p-values”, “q-values” and “false discovery rates” are essentially similar concepts, whereas “false discovery rate” and “false...
Preprint
Full-text available
We propose a new hypothesis test for the differential abundance of proteins in mass-spectrometry based relative quantification. An important feature of this type of high-throughput analyses is that it involves an enzymatic digestion of the sample proteins into peptides prior to identification and quantification. Due to numerous homology sequences,...
Article
Full-text available
DAPAR and ProStaR are software tools to perform the statistical analysis of label-free XIC-based quantitative discovery proteomics experiments. DAPAR contains procedures to filter, normalize, impute missing values, aggregate peptide intensities, perform null hypothesis significance tests and select the most likely differentially abundant proteins wi...
Article
Full-text available
Selecting proteins with significant differential abundance is the cornerstone of many relative quantitative proteomics experiments. To do so, a trade-off between p-value thresholding and fold-change thresholding can be performed thanks to a specific parameter, named fudge factor, and classically denoted s(0). We have observed that this fudge factor...
Article
Missing values are a genuine issue in label-free quantitative proteomics. Recent works have surveyed the different statistical methods to conduct imputation and have compared them on real or simulated datasets, and recommended a list of missing value imputation methods for proteomics application. Although insightful, these comparisons do not accoun...
Article
Recently, several works have focused on the study of conflict among belief functions with a geometric approach, trying to elaborate on the intuition that distant belief functions are more conflicting than neighboring ones. In this article, I discuss the extent to which the mathematical properties of a metric are compliant with what can be expected...
Article
Full-text available
In mass-spectrometry based quantitative proteomics, the false discovery rate control (i.e. the limitation of the number of proteins which are wrongly claimed as differentially abundant between several conditions) is a major post-analysis step. It is classically achieved thanks to a specific statistical procedure which computes the adjusted p-values...
Article
Full-text available
Machine learning is a quickly evolving field which now looks really different from what it was 15 years ago, when classification and clustering were major issues. This document proposes several trends to explore the new questions of modern machine learning, with the strong afterthought that the belief function framework has a major role to play.
Article
Full-text available
Combining pieces of information provided by several sources without or with little prior knowledge about the behavior of the sources is an old yet still important and rather open problem in the belief function theory. In this paper, we propose an approach to select the behavior of sources based on a very general and expressive fusion scheme, that h...
Article
Full-text available
Dempster-Shafer Theory (DST) is particularly efficient in combining multiple information sources providing incomplete, imprecise, biased, and conflictive knowledge. In this work, we focused on the improvement of the accuracy rate and the reliability of a HMM based handwriting recognition system, by the use of Dempster-Shafer Theory (DST). The syste...
Article
This paper presents a new non-negative matrix factorization technique which (1) allows the decomposition of the original data on multiple latent factors accounting for the geometrical structure of the manifold embedding the data; (2) provides an optimal representation with a controllable level of sparsity; (3) has an overall linear complexity allow...
Conference Paper
Recently, several works have focused on the study of conflict among belief functions with a geometrical approach. In such a framework, a cornerstone is to endow the set of belief functions with an appropriate metric, and to consider that distant belief functions are more conflicting than neighboring ones. This article discusses such approaches, cav...
Article
Full-text available
Photosynthesis has shaped atmospheric and ocean chemistries and probably changed the climate as well, as oxygen is released from water as part of the photosynthetic process. In photosynthetic eukaryotes, this process occurs in the chloroplast, an organelle containing the most abundant biological membrane, the thylakoids. The thylakoids of plants an...
Article
Full-text available
Quantitative mass spectrometry based spatial proteomics involves elaborate, expensive and time consuming experimental procedures and considerable effort is invested in the generation of such data. Multiple research groups have described a variety of approaches to establish high quality proteome-wide data sets. However, data analysis is as critical...
Article
Full-text available
Hyperspectral data analysis has been given a growing attention due to the scientific challenges it raises and the wide set of applications that can benefit from it. Classification of hyperspectral images has been identified as one of the hottest topics in this context, and has been mainly addressed by discriminative methods such as SVM. In this pap...
Article
Full-text available
Experimental spatial proteomics, i.e., the high-throughput assignment of proteins to sub-cellular compartments based on quantitative proteomics data, promises to shed new light on many biological processes given adequate computational tools. Here we present pRoloc, a complete infrastructure to support and guide the sound analysis of quantitative mass...
Article
Full-text available
As Dempster-Shafer theory spreads in different application fields, and as mass functions are involved in more and more complex systems, the need for algorithms randomly generating mass functions arises. Such algorithms can be used, for instance, to evaluate some statistical properties or to simulate the uncertainty in some systems (e.g., data base...
Article
Full-text available
Automation of smart homes for ambient assisted living is currently based on a widespread use of sensors. As efficient as it seems to be, this solution can sometimes be problematic when one focuses on user acceptability, which is intimately related to cost and intrusiveness. In this paper, we propose a context-aware system based on the semantic analysis of each us...
Conference Paper
Full-text available
Combining pieces of information provided by several sources without prior knowledge about the behavior of the sources is an old yet still important and rather open problem in belief function theory. In this paper, we propose a general approach to select the behavior of sources, based on two cornerstones of information fusion that are the notions of...
Article
In the Hilbert space reproducing the Gaussian kernel, projected data points are located on a hypersphere. Following some recent works on geodesic analysis on that particular manifold, we propose a method whose purpose is to select a subset of input data by sampling the corresponding hypersphere. The selected data should represent correctly the inp...
Thesis
Full-text available
Population ageing is widespread across the world. Unprecedented in the history of mankind, this demographic trend leads to a number of social and economic issues related to ageing/disabled people whose number increases considerably over the years. As the number of caregivers can not evolve accordingly, we must now think of alternatives allowing the...
Conference Paper
Full-text available
Using kernels to embed non linear data into high dimensional spaces where linear analysis is possible has become utterly classical. In the case of the Gaussian kernel however, data are distributed on a hypersphere in the corresponding Reproducing Kernel Hilbert Space (RKHS). Inspired by previous works in non-linear statistics, this article investig...
Article
Full-text available
Recently, the problem of measuring the conflict between two bodies of evidence represented by belief functions has seen renewed interest. In most works related to this issue, Dempster's rule plays a central role. In this paper, we propose to study the notion of conflict from a different perspective. We start by examining consistency and confli...
Conference Paper
Full-text available
A new classification technique, PerTurbo, has been investigated in the context of hyperspectral remote sensing images. In this framework, each class is characterised by its Laplace-Beltrami operator, then approximated by the spectrum of K(S), whose terms are derived from the Gaussian kernel. The method is very simple, easy to implement and...
Conference Paper
Full-text available
Automation of smart homes for ambient assisted living is currently based on a widespread use of sensors. In this paper, we propose a monitoring system based on the semantic analysis of home automation logs (user requests). Our goal is to replace as many sensors as possible by using advanced tools to infer information usually sensed. To take up th...
Conference Paper
Full-text available
The automation and supervision of pervasive systems is currently based mainly on the massive use of sensors distributed in the environment. In this article, we propose an interaction supervision model based on the semantic analysis of home automation logs (commands issued by the user), aiming to limit...
Chapter
In the proteomics field, the production and publication of reliable mass spectrometry (MS)-based label-free quantitative results is a major concern. Due to the intrinsic complexity of bottom-up proteomics experiments (requiring aggregation of data relating to both precursor and fragment peptide ions into protein information, and matching this data...
Conference Paper
Full-text available
As Dempster-Shafer theory spreads in different application fields involving complex systems, the need for algorithms randomly generating mass functions arises. As such random generation is often perceived as secondary, most proposed algorithms use procedures whose sample statistical properties are difficult to characterize. Thus, although they pro...
Conference Paper
Full-text available
The problem of conflict measurement between information sources has seen renewed interest. In most works related to this issue, Dempster's rule plays a central role. In this paper, we propose to revisit conflict from a different perspective. We do not make a priori assumption about dependencies and start from the definition of conflicting sets, stu...
Conference Paper
Full-text available
Population ageing is set to affect European countries over the coming decades, increasing the number of dependent people. In this context, Ambient Assisted Living (AAL) is one solution to enable these people to stay in their preferred environment longer, thus delaying hospitalization. To take up these challenges, this paper proposes an original lin...
Article
Full-text available
To study chloroplast metabolism and functions, subplastidial localization is a prerequisite to achieve protein functional characterization. As the accurate localization of many chloroplast proteins often remains hypothetical, we set up a proteomics strategy in order to assign the accurate subplastidial localization. A comprehensive study of Arabido...
Conference Paper
Full-text available
PerTurbo, an original, non-parametric and efficient classification method is presented here. In our framework, the manifold of each class is characterized by its Laplace-Beltrami operator, which is evaluated with classical methods involving the graph Laplacian. The classification criterion is established thanks to a measure of the magnitude of the...
Conference Paper
Full-text available
In this paper, a novel rejection strategy is proposed to optimize the reliability of a handwritten word recognition system. The proposed approach is based on several steps. First, we combine the outputs of several HMM classifiers using the Dempster-Shafer theory (DST). Then, we take advantage of the expressivity of mass functions (the counterpart...
Conference Paper
The search for new simplified interaction techniques is mainly motivated by the improvements of the communication with interactive devices. In this paper, we present an interactive TVs module capable of recognizing human gestures through the PS3Eye low-cost camera. We recognize gestures by the tracking of human skin blobs and analyzing the corresponding mo...
Conference Paper
Full-text available
The search for new simplified interaction techniques is mainly motivated by the improvements of the communication with interactive devices. In this paper, we present an interactive TVs module capable of recognizing human gestures through the PS3Eye low-cost camera. We recognize gestures by the tracking of human skin blobs and analyzing the correspond...
Conference Paper
Full-text available
The Dempster-Shafer theory (DST) is particularly interesting to deal with imprecise information. However, it is known for its high computational cost, as dealing with a frame of discernment Ω involves the manipulation of up to 2^|Ω| elements. Hence, classification problems where the number of classes is too large cannot be considered. In this paper,...
Conference Paper
Full-text available
People with disabilities sometimes have considerable difficulties, or even physical incapacities, performing daily tasks independently. Many research works have introduced home automation as a useful way to overcome these activity limitations. However, very few of these accomplishments have focused on the design of intelligent systems which would a...
Conference Paper
Full-text available
The classification process in handwriting recognition is designed to provide lists of results rather than single results, so that context models can be used as post-processing. Most of the time, the length of the list is determined once and for all the items to classify. Here, we present a method based on Dempster-Shafer theory that allows a differ...
Conference Paper
Full-text available
People with reduced mobility experience significant difficulties, if not a total physical inability, in performing daily tasks independently. One solution for compensating for these motor impairments is to rely on home automation technologies. Many initiatives...
Conference Paper
Full-text available
In this work, we focus on an improvement of a multi-script handwriting recognition system using an HMM-based classifier combination. The improvement relies on the use of Dempster-Shafer theory to combine in a finer way the probabilistic outputs of the HMM classifiers. The experiments are conducted on two public databases written in two different scripts...
Article
Full-text available
In this paper, we consider the dominance properties of the set of the pignistic k-additive belief functions. Then, given k, we conjecture the shape of the polytope of all the k-additive belief functions dominating a given belief function, starting from an analogy with the case of dominating probability measures. Under such conjecture, we compute th...
Article
Full-text available
Approximating a belief function (with a probability distribution or with another belief function with a restricted number of focal elements) is an important issue in Dempster-Shafer Theory. The reason is that such approximations are really useful in two different situations: (1) decision making and (2) computational saving. In this paper, we propo...
Article
Full-text available
Considering handwriting recognition, we compare the accuracy of probabilistic and evidential methods for ensemble HMM classifier combination. The recognition performances show that, in the case of a simple database, the probabilistic methods are more efficient. On the other hand, for more difficult recognition tasks (large vocabulary, weak classifiers, et...
Article
Full-text available
People with disabilities sometimes experience real difficulties, or even a total physical inability, in performing the activities of daily living independently. Many initiatives have introduced home automation and smart homes as a possible solution to compensate for this disability. However, very few...
Conference Paper
Full-text available
The Transferable Belief Model is a powerful interpretation of belief function theory where decision making is based on the pignistic transform. Smets has proposed a generalization of the pignistic transform which appears to be equivalent to the Shapley value in the transferable utility model. It corresponds to the situation where the decision m...
Article
Most of the research on sign language recognition concentrates on recognizing only manual signs (hand gestures and shapes), discarding a very important component: the non-manual signals (facial expressions and head/shoulder motion). We address the recognition of signs with both manual and non-manual components using a sequential belief-based fusion...
Conference Paper
Full-text available
As part of our work on hand gesture interpretation, we present our results on hand shape recognition. Our method is based on attribute extraction and multiple partial classifications. The novelty lies in the way the fusion of all the partial classification results is performed. This fusion is (1) more efficient in terms of information theory a...
Book
Full-text available
Gestural interfaces, besides providing natural means of human computer interaction for everyone, enable the hearing impaired to use sign language or better understand speech through vision. This chapter overviews (1) the various modalities involved in gestured languages (2) the means to automatically apprehend them individually and (3) to fuse them...