Tapabrata Maiti’s research while affiliated with Michigan State University and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (130)


Spike-and-Slab Shrinkage Priors for Structurally Sparse Bayesian Neural Networks
  • Article

October 2024 · 5 Reads · 4 Citations · IEEE Transactions on Neural Networks and Learning Systems

Tapabrata Maiti

Network complexity and computational efficiency have become increasingly significant aspects of deep learning. Sparse deep learning addresses these challenges by recovering a sparse representation of the underlying target function from a heavily overparameterized deep neural network. Specifically, deep neural architectures compressed via structured sparsity (e.g., node sparsity) provide low-latency inference, higher data throughput, and reduced energy consumption. In this article, we explore two well-established shrinkage techniques, Lasso and Horseshoe, for model compression in Bayesian neural networks (BNNs). To this end, we propose structurally sparse BNNs, which systematically prune excessive nodes with the following: 1) spike-and-slab group Lasso (SS-GL) and 2) SS group Horseshoe (SS-GHS) priors, and develop computationally tractable variational inference, including continuous relaxation of Bernoulli variables. We establish the contraction rates of the variational posterior of our proposed models as a function of the network topology, layerwise node cardinalities, and bounds on the network weights. We empirically demonstrate the competitive performance of our models compared with the baseline models in prediction accuracy, model compression, and inference latency.
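The continuous relaxation mentioned in the abstract can be illustrated with a short sketch. Below is a minimal example of gating a layer's nodes with a Concrete (relaxed Bernoulli) variable so the inclusion indicators stay differentiable during variational training; all names, shapes, and the temperature value are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch: node-level spike-and-slab gating with a continuous
# (Concrete / Gumbel-Softmax) relaxation of the Bernoulli inclusion
# variables, a standard device for making such objectives differentiable.
import torch

def relaxed_bernoulli_gate(logit_pi: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """Sample a differentiable surrogate z in (0, 1) for Bernoulli(sigmoid(logit_pi))."""
    u = torch.rand_like(logit_pi).clamp(1e-6, 1 - 1e-6)
    logistic_noise = torch.log(u) - torch.log1p(-u)  # Logistic(0, 1) sample
    return torch.sigmoid((logit_pi + logistic_noise) / temperature)

# Gate an entire hidden layer's nodes: one inclusion variable per node,
# multiplying all activations of that node (structured, node-level sparsity).
hidden = torch.randn(32, 400)                    # batch of layer activations
logit_pi = torch.zeros(400, requires_grad=True)  # variational inclusion logits
z = relaxed_bernoulli_gate(logit_pi)             # soft node on/off indicators
gated = hidden * z                               # nodes with z near 0 are pruned
```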


Table/figure previews:
Table 10: Proportion of correct variable selection with 0-4 correct variables in the model, for different cases over 1000 repetitions (means reported).
Average selection false positive rates of ENNS, DNP, and LassoNet under different numbers of true variables over 101 repetitions (standard deviations in parentheses).
Variable selection capacity of ENNS and other methods with low signal strength in the regression (top) and classification (bottom) setups; the numbers reported are the average numbers of selected variables that are truly nonzero (standard errors in parentheses).
Prediction results on the testing set using neural networks with and without ℓ1 … (+3 more previews)

ENNS: Variable Selection, Regression, Classification and Deep Neural Network for High-Dimensional Data
  • Article
  • Full-text available

October 2024 · 30 Reads · Journal of Machine Learning Research

High-dimensional, low-sample-size (HDLSS) data have long attracted attention, and many approaches have been proposed to deal with this setting, among which variable selection is a prominent strategy. Meanwhile, neural networks have been used to model complicated relationships. This paper discusses current variable selection techniques with neural networks. We show that the stage-wise algorithm with neural networks suffers from disadvantages, such as inconsistency of the variables that enter the model in later stages. We also propose an ensemble method to achieve better variable selection and prove that the probability of selecting a false variable tends to zero. Moreover, we discuss further regularization to deal with overfitting. Simulations and real-data examples are given to support the theory.
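As a rough illustration of the ensemble idea (not the paper's ENNS algorithm), the sketch below aggregates a base variable selector over bootstrap resamples and keeps variables whose selection frequency clears a vote threshold, one standard way an ensemble can drive the false selection probability toward zero; base_selector, n_runs, and threshold are hypothetical placeholders.

```python
# Hedged sketch: ensemble variable selection by bootstrap voting.
# `base_selector` stands in for any stage-wise / neural selection routine
# and is assumed to return unique indices of the chosen variables.
import numpy as np

def ensemble_select(X, y, base_selector, n_runs=50, threshold=0.6, rng=None):
    rng = np.random.default_rng(rng)
    n, p = X.shape
    votes = np.zeros(p)
    for _ in range(n_runs):
        idx = rng.choice(n, size=n, replace=True)    # bootstrap resample
        selected = base_selector(X[idx], y[idx])     # indices chosen on this resample
        votes[selected] += 1
    # Keep variables selected in at least `threshold` fraction of runs;
    # spurious variables, selected only occasionally, are voted out.
    return np.where(votes / n_runs >= threshold)[0]
```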


CMPLE: Correlation Modeling to Decode Photosynthesis Using the Minorize–Maximize Algorithm

May 2024 · 26 Reads · Journal of Agricultural Biological and Environmental Statistics

Abhijnan Chattopadhyay · [...] · Samiran Sinha

In plant genomic experiments, correlations among various biological traits (phenotypes) give new insights into how genetic diversity may have tuned biological processes to enhance fitness under diverse conditions. Consequently, knowing how the correlations are affected by genetic (G) and environmental (E) factors helps develop climate-resilient plants. However, the current literature lacks any method for assessing the effect of predictors on pairwise correlations among multiple phenotypes together with easily interpretable model parameters. To address this need, we propose to model pairwise correlations directly in terms of G and E and develop a computationally efficient inference procedure. Two major novelties in our methodology are (1) the use of a composite pairwise likelihood method to avoid the positive definiteness restriction on the correlation matrix and (2) the use of a novel Minorize–Maximize (MM) algorithm for the efficient estimation of a large number of parameters. The proposed method shows excellent numerical performance on synthetic datasets. The analysis of the motivating data on cowpea reveals that the rates of solar energy storage by photosynthesis (the aggregate trait) are differentially affected by different genetic loci through two distinct processes: “photoinhibition” which results from photodamage caused by excess light, and “photoprotection” which protects plants from photodamage but also results in energy loss. Supplementary material to this paper is provided online.
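The MM principle at the heart of the estimation procedure can be sketched generically: at each iterate, construct a surrogate that minorizes the objective and is tight at the current point, then maximize the surrogate. The skeleton below is a minimal illustration under that assumption; build_minorizer and argmax_surrogate are placeholders for the problem-specific composite pairwise likelihood steps and are not from the paper.

```python
# Hedged sketch of a generic Minorize-Maximize (MM) loop. The minorizer
# g(. | theta_t) satisfies g <= f everywhere and g = f at theta_t, so
# maximizing g guarantees monotone ascent of the objective f.
import numpy as np

def mm_maximize(f, build_minorizer, argmax_surrogate, theta0, tol=1e-8, max_iter=500):
    theta = np.asarray(theta0, dtype=float)
    prev = f(theta)
    for _ in range(max_iter):
        surrogate = build_minorizer(theta)    # g(. | theta): minorizes f, tight at theta
        theta = argmax_surrogate(surrogate)   # exact / closed-form maximizer of g
        cur = f(theta)
        assert cur >= prev - 1e-12            # the MM ascent property
        if cur - prev < tol:                  # stop when the objective stabilizes
            break
        prev = cur
    return theta
```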


Error‐controlled feature selection for ultrahigh‐dimensional and highly correlated feature space using deep learning

March 2024 · 32 Reads · Statistical Analysis and Data Mining

Deep learning has been at the center of analytics in recent years due to its impressive empirical success in analyzing complex data objects. Despite this success, most existing tools behave like black-box machines; hence the increasing interest in interpretable, reliable, and robust deep learning models applicable to a broad class of applications. Feature-selected deep learning has emerged as a promising tool in this realm. However, recent developments do not accommodate ultrahigh-dimensional and highly correlated features or high noise levels. In this article, we propose a novel screening and cleaning method with the aid of deep learning for a data-adaptive multi-resolutional discovery of highly correlated predictors with a controlled error rate. Extensive empirical evaluations over a wide range of simulated scenarios and several real datasets demonstrate the effectiveness of the proposed method in achieving high power while keeping the false discovery rate at a minimum.
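For intuition only, here is a hedged two-stage sketch in the same spirit as screening-and-cleaning: marginal screening shrinks the ultrahigh-dimensional feature space, then a permutation-importance check under a fitted network "cleans" the survivors. This is a generic stand-in, not the authors' procedure, and every threshold below is an illustrative assumption.

```python
# Hedged two-stage sketch: marginal screening, then permutation-based cleaning.
import numpy as np
from sklearn.neural_network import MLPRegressor

def screen_then_clean(X, y, keep=200, n_perm=20, rng=0):
    rng = np.random.default_rng(rng)
    # Stage 1: screen by absolute marginal correlation with the response.
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    screened = np.argsort(corr)[-keep:]
    # Stage 2: clean the survivors via permutation importance on a fitted net.
    net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500).fit(X[:, screened], y)
    base = net.score(X[:, screened], y)        # R^2 on the training features
    keep_mask = []
    for j in range(len(screened)):
        drops = []
        for _ in range(n_perm):
            Xp = X[:, screened].copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # break the feature-response link
            drops.append(base - net.score(Xp, y))
        keep_mask.append(np.mean(drops) > 0)       # keep if importance beats noise
    return screened[np.array(keep_mask)]
```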


Comprehensive study of variational Bayes classification for dense deep neural networks

October 2023 · 33 Reads · 2 Citations · Statistics and Computing

Although Bayesian deep neural network models are ubiquitous in classification problems, their Markov chain Monte Carlo (MCMC) based implementation suffers from high computational cost, limiting the use of this powerful technique in large-scale studies. Variational Bayes (VB) has emerged as a competitive alternative to overcome some of these computational issues. This paper focuses on the variational Bayesian deep neural network estimation methodology and discusses the related statistical theory and algorithmic implementations in the context of classification. For dense deep neural network-based classification, the paper compares and contrasts the consistency and contraction rates of the true posterior and the corresponding variational posterior. Based on the complexity of the deep neural network (DNN), this paper provides an assessment of the loss in classification accuracy due to the use of VB and guidelines on the characterization of the prior distributions and the variational family. The difficulty of the numerical optimization for obtaining the variational Bayes solution has also been quantified as a function of the complexity of the DNN. The development is motivated by an important biomedical engineering application, namely building predictive tools for the transition from mild cognitive impairment to Alzheimer’s disease. The predictors are multi-modal and may involve complex interactive relations.
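The VB machinery being analyzed can be made concrete with a minimal mean-field sketch: a Gaussian variational family over the network weights, the reparameterization trick for the likelihood term, and a closed-form KL against a standard normal prior. The shapes, the prior choice, and the forward callable are all illustrative assumptions, not the paper's setup.

```python
# Hedged sketch of a mean-field variational objective (ELBO) for a
# Bayesian neural network classifier, using one Monte Carlo draw.
import torch
import torch.nn.functional as F

def elbo(mu, rho, x, y, forward):
    """mu, rho: variational parameters for all weights (flattened);
    forward(weights, x): returns classifier logits for a given weight draw."""
    sigma = F.softplus(rho)                  # ensure positive std dev
    eps = torch.randn_like(mu)
    w = mu + sigma * eps                     # reparameterized weight sample
    log_lik = -F.cross_entropy(forward(w, x), y, reduction="sum")
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over weights (closed form).
    kl = 0.5 * torch.sum(sigma**2 + mu**2 - 1.0 - 2.0 * torch.log(sigma))
    return log_lik - kl                      # maximize this during training
```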


Figure/table previews:
Figure 2: SS-GL ς² choice experiment, comparing fixed ς² = 1 with ς² ~ Γ(c = 4, d = 2): (a) classification accuracy on the test data; (b), (c) proportion of active nodes (node sparsity) in layers 1 and 2 of the network. Placing a prior on ς² yields better classification accuracy.
Figure 3: SS-GHS c_reg choice experiment, comparing regularization constants c_reg = 1 and c_reg = k_l + 1 = 401: (a) test classification accuracy; (b), (c) node sparsity in layers 1 and 2. Both choices lead to similar classification accuracies, with c_reg = 1 having better layer-1 node sparsity.
Figure 4: Main MLP-MNIST experiment, comparing SS-GL (ς² ~ Γ(c = 4, d = 2)) and SS-GHS (c_reg = 1) against the SS-IG model: (a) test classification accuracy; (b), (c) node sparsity in layers 1 and 2. SS-GHS yields the most compact network with the best classification accuracy.
Figure 5: LeNet-5-Caffe experiment results on MNIST (top row, (a)-(c)) and Fashion-MNIST (bottom row, (d)-(f)).
Table: ResNet-CIFAR-10 experiment results; each method's results are averaged over 3 independent runs (standard deviations in parentheses). For the BNN_cs and VBNN models, predefined percentages of pruned parameters are used for magnitude pruning, as given in Sun et al. (2021).
A comprehensive study of spike and slab shrinkage priors for structurally sparse Bayesian neural networks

August 2023 · 81 Reads

Network complexity and computational efficiency have become increasingly significant aspects of deep learning. Sparse deep learning addresses these challenges by recovering a sparse representation of the underlying target function from a heavily over-parameterized deep neural network. Specifically, deep neural architectures compressed via structured sparsity (e.g., node sparsity) provide low latency inference, higher data throughput, and reduced energy consumption. In this paper, we explore two well-established shrinkage techniques, Lasso and Horseshoe, for model compression in Bayesian neural networks. To this end, we propose structurally sparse Bayesian neural networks which systematically prune excessive nodes with (i) Spike-and-Slab Group Lasso (SS-GL), and (ii) Spike-and-Slab Group Horseshoe (SS-GHS) priors, and develop computationally tractable variational inference including continuous relaxation of Bernoulli variables. We establish the contraction rates of the variational posterior of our proposed models as a function of the network topology, layer-wise node cardinalities, and bounds on the network weights. We empirically demonstrate the competitive performance of our models compared to the baseline models in prediction accuracy, model compression, and inference latency.
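To complement the relaxed-Bernoulli gate sketched earlier, the snippet below illustrates the regularized (group) horseshoe scale construction often paired with priors like SS-GHS, where a constant (playing the role of c_reg in the figure previews above) softly truncates the heavy Cauchy tails; parameter names and defaults are assumptions for illustration, not the paper's settings.

```python
# Hedged sketch: sampling per-group slab scales from a regularized
# horseshoe construction (half-Cauchy local and global scales, with a
# soft truncation constant c2 bounding how large any scale can get).
import numpy as np

def regularized_horseshoe_scale(n_groups, tau0=1e-2, c2=1.0, rng=None):
    rng = np.random.default_rng(rng)
    lam = np.abs(rng.standard_cauchy(n_groups))        # local half-Cauchy scales
    tau = np.abs(rng.standard_cauchy()) * tau0         # global half-Cauchy scale
    # Soft truncation: lam2_tilde -> lam^2 for small scales, -> c2/tau^2 for large.
    lam2_tilde = c2 * lam**2 / (c2 + tau**2 * lam**2)
    return np.sqrt(tau**2 * lam2_tilde)                # per-group slab std dev
```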


Layer Adaptive Node Selection in Bayesian Neural Networks: Statistical Guarantees and Implementation Details

August 2023 · 10 Reads · 7 Citations · Neural Networks

Sparse deep neural networks have proven to be efficient for predictive model building in large-scale studies. Although several works have studied theoretical and numerical properties of sparse neural architectures, they have primarily focused on edge selection. Sparsity through edge selection might be intuitively appealing; however, it does not necessarily reduce the structural complexity of a network. Instead, pruning excessive nodes leads to a structurally sparse network with significant computational speedup during inference. To this end, we propose a Bayesian sparse solution using spike-and-slab Gaussian priors to allow for automatic node selection during training. The use of the spike-and-slab prior alleviates the need for an ad hoc thresholding rule for pruning. In addition, we adopt a variational Bayes approach to circumvent the computational challenges of a traditional Markov chain Monte Carlo (MCMC) implementation. In the context of node selection, we establish the fundamental result of variational posterior consistency together with the characterization of prior parameters. In contrast to previous works, our theoretical development relaxes the assumptions of an equal number of nodes and uniform bounds on all network weights, thereby accommodating sparse networks with layer-dependent node structures or coefficient bounds. With a layer-wise characterization of prior inclusion probabilities, we discuss the optimal contraction rates of the variational posterior. We empirically demonstrate that our proposed approach outperforms the edge selection method in computational complexity with similar or better predictive performance. Our experimental evidence further substantiates that our theoretical work facilitates layer-wise optimal node recovery.
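One practical consequence of spike-and-slab node selection is that pruning can follow the variational inclusion probabilities directly, rather than an ad hoc magnitude threshold. Below is a minimal sketch under that assumption; the 0.5 cut-off is the usual median-probability-model rule, not necessarily the paper's choice, and layer sizes are illustrative.

```python
# Hedged sketch: structural pruning from variational inclusion probabilities.
import torch

def prune_nodes(inclusion_probs: torch.Tensor, weight: torch.Tensor):
    """inclusion_probs: (n_nodes,) variational q(z_j = 1) for each node;
    weight: (n_nodes, fan_in) weights of the layer to compress."""
    keep = inclusion_probs > 0.5        # median probability model rule
    return weight[keep], keep           # structurally smaller layer + mask

probs = torch.tensor([0.98, 0.03, 0.76, 0.10])
W = torch.randn(4, 8)
W_small, kept = prune_nodes(probs, W)   # keeps nodes 0 and 2 only
```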



Statistically Valid Variational Bayes Algorithm for Ising Model Parameter Estimation

May 2023 · 58 Reads · 1 Citation


Figure previews:
Performance of different methods on various datasets (100 genes each), quantified by AUPRC ratio: the left panel varies the number of cells, the middle panel varies dropout ratios, and the right panel varies the degree of view similarity (percentage of common edges across views in the ground-truth graphs); top, Erdős-Rényi model; bottom, Barabási-Albert model.
Performance of scMSGL without any kernel (first row) and with different kernels on datasets generated from the BA model and studied in Fig. 1.
Genes with the highest node degrees; orange and blue bars indicate degrees calculated using activating and inhibitory edges, respectively. Only genes whose activating or inhibitory degree is among the top 15 in any view are shown.
Connections of the MYC (top) and OTX2 (bottom) genes; edge widths are proportional to connection weights, with orange and blue indicating activating and inhibitory connections, respectively. Only the top third of connections across all views of the multiview graph are shown.
Kernelized multiview signed graph learning for single-cell RNA sequencing data

April 2023 · 45 Reads · 1 Citation · BMC Bioinformatics

Background: Characterizing the topology of gene regulatory networks (GRNs) is a fundamental problem in systems biology. The advent of single-cell technologies has made it possible to construct GRNs at finer resolutions than bulk and microarray datasets. However, cellular heterogeneity and the sparsity of single-cell datasets invalidate the standard Gaussian assumptions used to construct GRNs. Additionally, most GRN reconstruction approaches estimate a single network for the entire dataset, which can lose information when single-cell data are generated from multiple treatment conditions or disease states.
Results: To better characterize single-cell GRNs under different but related conditions, we propose the joint estimation of multiple networks using multiple signed graph learning (scMSGL). The proposed method builds on recently developed graph signal processing (GSP) based graph learning, where GRNs and gene expressions are modeled as signed graphs and graph signals, respectively. scMSGL learns multiple GRNs by optimizing the total variation of gene expressions with respect to the GRNs while ensuring that the learned GRNs are similar to each other through regularization with respect to a learned signed consensus graph. We further kernelize scMSGL, with the kernel selected to suit the structure of single-cell data.
Conclusions: scMSGL is shown to have superior performance over existing state-of-the-art methods in GRN recovery on simulated datasets. Furthermore, scMSGL successfully identifies well-established regulators in a mouse embryonic stem cell differentiation study and a cancer clinical study of medulloblastoma.
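The signed total-variation criterion can be written down compactly: smoothness is rewarded over activating (positive) edges and dissimilarity over inhibitory (negative) edges. Below is a minimal, illustrative computation of such an objective for one signed graph; the actual scMSGL objective, with its consensus-graph regularization and kernelization, is richer than this sketch.

```python
# Hedged sketch of a signed total-variation objective: for a signed
# adjacency W (positive entries activating, negative inhibitory) and a
# gene-expression signal matrix X (genes x cells), penalize differences
# across activating edges and similarities across inhibitory edges.
import numpy as np

def signed_total_variation(W: np.ndarray, X: np.ndarray) -> float:
    """Sum over edges: w_ij * ||x_i - x_j||^2 for w_ij > 0,
    and |w_ij| * ||x_i + x_j||^2 for w_ij < 0."""
    tv = 0.0
    n = W.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            if W[i, j] > 0:
                tv += W[i, j] * np.sum((X[i] - X[j]) ** 2)   # smooth over activation
            elif W[i, j] < 0:
                tv += -W[i, j] * np.sum((X[i] + X[j]) ** 2)  # dissimilar over inhibition
    return tv
```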


Citations (59)


... [36], [37]. Consequently, approximate Bayesian methods such as variational inference [38], sparse learning [39], [40], and dimension reduction [41] have been explored. Alternately, ensemble-based methods like deep ensembles [42], SWAG [43], and SeBayS [44] also provide principled strategies for capturing predictive variability. ...

Reference:

Uncertainty-Aware Adaptation of Large Language Models for Protein-Protein Interaction Analysis
Spike-and-Slab Shrinkage Priors for Structurally Sparse Bayesian Neural Networks
  • Citing Article
  • October 2024

IEEE Transactions on Neural Networks and Learning Systems

... Such as Kim et al. (2021); Shen et al. (2022); Hu et al. (2022); Bos and Schmidt-Hieber (2022); Wang and Shang (2022); Kohler et al. (2022); Meyer (2023); Kohler and Langer (2025). While variational Bayes classification for dense deep neural networks has been studied (Bhattacharya et al., 2024), the prediction error for classification using Bayesian sparse deep learning remains unexplored. This study aims to address this gap by presenting an effort to shed light on this aspect. ...

Comprehensive study of variational Bayes classification for dense deep neural networks

Statistics and Computing

... [36], [37]. Consequently, approximate Bayesian methods such as variational inference [38], sparse learning [39], [40], and dimension reduction [41] have been explored. Alternately, ensemble-based methods like deep ensembles [42], SWAG [43], and SeBayS [44] also provide principled strategies for capturing predictive variability. ...

Layer Adaptive Node Selection in Bayesian Neural Networks: Statistical Guarantees and Implementation Details
  • Citing Article
  • August 2023

Neural Networks

... As expected, the copula-based approach M_mvt-ClayCop has an increased computational time compared to both M_uni and M_mvt-Gauss, due to the additional MCMC step for estimating the copula parameter. A potentially faster alternative for carrying out the copula-estimation procedure would be replacing the MCMC step with variational Bayes (Kejzlar and Maiti 2023). The computational cost of estimating the HPR is independent of the modeling procedure and is negligible compared to the latter: 0.54 s (on average) for a bivariate predictive set of size M* = 1000. ...

Variational inference with vine copulas: an efficient approach for Bayesian computer model calibration

Statistics and Computing

... Many tools are available to study the ADMET mechanisms in situ with spatial resolution, such as whole-body autoradiography [2], positron emission tomography (PET) [3], spectroscopy [4], and recently, spatial transcriptomics [5]. Mass spectrometry imaging (MSI) provides label-free, highly multiplexed, and high-throughput measurements to characterize xenobiotics, their metabolites, and endogenous molecules in situ on various types of samples [6][7][8][9][10][11]. Matrix-assisted laser desorption/ionization MS (MALDI-MS) imaging uses highly focused laser pulses to desorb and ionize chemical matrix and sample materials to achieve chemical imaging. ...

Single cell transcriptomics shows dose-dependent disruption of hepatic zonation by TCDD in mice
  • Citing Article
  • October 2022

Toxicological Sciences

... Existing works on tensor regression. Recent research has demonstrated the efficacy of tensor decomposition in a range of domains, including neuroscience, computer vision and many others (Zhou, Li and Zhu, 2013;Li and Zhang, 2021;Cichocki et al., 2014;Ju et al., 2017;Sidiropoulos et al., 2017;Chen et al., 2016;Yuankai et al., 2018;Li et al., 2022). Tensor decomposition techniques which allow for dimension reduction in the feature space include Canonic Polyadic (CP) (Kolda and Bader, 2009), Tucker (Hitchcock, 1927), Hierarchical Tucker (HT) (Silva and Herrmann, 2014) and Tensor Train (TT) (Oseledets, 2011) decomposition. ...

Coupled support tensor machine classification for multimodal neuroimaging data
  • Citing Article
  • May 2022

Statistical Analysis and Data Mining

... In this paper, we present a multiple signed graph learning algorithm (scMSGL) for joint inference of GRNs from multiple classes (conditions/disease states). Based on the method developed in [31], scMSGL learns multiple GRNs by deriving an optimization problem using three assumptions: (i) expressions of genes connected with activating edges are similar to each other, (ii) expressions of genes connected with inhibitory edges are dissimilar to each other, and (iii) GRNs corresponding to the different datasets are related to each other. Thus, scMSGL optimizes the total variation of graph signals to learn signed graphs while ensuring that the learned signed graphs are similar to each other through regularization with respect to a learned signed consensus graph. ...

scSGL: Kernelized Signed Graph Learning for Single-Cell Gene Regulatory Network Inference
  • Citing Article
  • April 2022

Bioinformatics

... As alluded to earlier in this section, the VI approach requires a modification to allow for multimodal posteriors as the full rank Gaussian variational family (5) is not flexible enough to describe multimodality directly. The VI-based multimodal posterior distributions in figures 4 and 5 were obtained using the Black Box Variational Bayesian Model Averaging (BBVBMA) algorithm [32]. The BBVBMA posterior variational distribution is a mixture distribution, where each mixture component is a full rank Gaussian produced by the standard VI approach (as described in section 2) with random initialization. ...

Black Box Variational Bayesian Model Averaging
  • Citing Article
  • March 2022

... We assess the efficacy and versatility of our method across diverse scenarios by evaluating it on multiple datasets encompassing various cell types and perturbations. Specifically, we employ five distinct datasets: human peripheral blood mononuclear cells (PBMCs) treated with interferon β (IFN-β) published by Kang et al. [22], McFarland dataset [3], Chang dataset [4], single-dose TCDD liver by Nault et al. [23], and a subset of the cross-individual dataset (NeurIPS) from the Open Problems in Single-cell competition [24]. ...

Benchmarking of a Bayesian single cell RNAseq differential gene expression test for dose–response study designs

Nucleic Acids Research

... MultispeQ v2.0 (Photosynthesis RIDES 2.0 protocol) [36] was used to study rice photoprotection capacity by determining NPQ. In addition, we also focused on NPQ sensitivity under ambient conditions during the day, defined as the responsiveness of NPQ to LEF [37]. ...

Light potentials of photosynthetic energy storage in the field: what limits the ability to use or dissipate rapidly increased light energy?