Eric P. Xing

Tsinghua University, Beijing, Beijing Shi, China

Publications (15) · 10.38 Total impact

  • Article: Bayesian Inference with Posterior Regularization and Infinite Latent SVMs
    Jun Zhu, Ning Chen, Eric P. Xing
    ABSTRACT: Existing Bayesian models, especially nonparametric Bayesian methods, rely heavily on specially conceived priors to incorporate domain knowledge for discovering improved latent representations. While priors can affect posterior distributions through Bayes' theorem, imposing posterior regularization is arguably more direct and in some cases can be more natural and easier. In this paper, we present regularized Bayesian inference (RegBayes), a computational framework to perform posterior inference with a convex regularization on the desired post-data posterior distributions. RegBayes covers both directed Bayesian networks and undirected Markov networks whose Bayesian formulation results in hybrid chain graph models. When the convex regularization is induced from a linear operator on the posterior distributions, RegBayes can be solved with convex analysis theory. Furthermore, we present two concrete examples of RegBayes, infinite latent support vector machines (iLSVM) and multi-task infinite latent support vector machines (MT-iLSVM), which explore the large-margin idea in combination with a nonparametric Bayesian model for discovering predictive latent features for classification and multi-task learning, respectively. We present efficient inference methods and report empirical studies on several benchmark datasets, which appear to demonstrate the merits inherited from both large-margin learning and Bayesian nonparametrics. Such results were not available until now, and they help push forward the interface between these two important subfields, which have been largely treated as isolated in the community.
    10/2012;
  • Article: Efficient Algorithm for Extremely Large Multi-task Regression with Massive Structured Sparsity
    Seunghak Lee, Eric P. Xing
    ABSTRACT: We develop a highly scalable optimization method called "hierarchical group-thresholding" for solving a multi-task regression model with complex structured sparsity constraints on both input and output spaces. Despite the recent emergence of several efficient optimization algorithms for tackling complex sparsity-inducing regularizers, true scalability in practical high-dimensional problems where a huge number (e.g., millions) of sparsity patterns need to be enforced remains an open challenge, because all existing algorithms must deal with all such patterns exhaustively in every iteration, which is computationally prohibitive. Our proposed algorithm addresses the scalability problem by screening out multiple groups of coefficients simultaneously and systematically. We employ a hierarchical tree representation of group constraints to accelerate the process of removing irrelevant constraints by taking advantage of the inclusion relationships between group sparsities, thereby avoiding dealing with all constraints in every optimization step, and necessitating optimization operations only on a small number of outstanding coefficients. In our experiments, we demonstrate the efficiency of our method on simulation datasets, and in an application of detecting genetic variants associated with gene expression traits.
    08/2012;
  • Article: Leveraging input and output structures for joint mapping of epistatic and marginal eQTLs.
    Seunghak Lee, Eric P Xing
    ABSTRACT: As many complex disease and expression phenotypes are the outcome of intricate perturbation of molecular networks underlying gene regulation resulting from interdependent genome variations, association mapping of causal QTLs or expression quantitative trait loci must consider both additive and epistatic effects of multiple candidate genotypes. This problem poses a significant challenge to contemporary genome-wide-association (GWA) mapping technologies because of its computational complexity. Fortunately, a plethora of recent developments in the biological network community, especially the availability of genetic interaction networks, make it possible to construct informative priors of complex interactions between genotypes, which can substantially reduce the complexity and increase the statistical power of GWA inference. In this article, we consider the problem of learning a multitask regression model while taking advantage of the prior information on structures on both the inputs (genetic variations) and outputs (expression levels). We propose a novel regularization scheme over multitask regression called jointly structured input-output lasso based on an ℓ1/ℓ2 norm, which allows shared sparsity patterns for related inputs and outputs to be optimally estimated. Such patterns capture multiple related single nucleotide polymorphisms (SNPs) that jointly influence multiple related expression traits. In addition, we generalize this new multitask regression to structurally regularized polynomial regression to detect epistatic interactions with manageable complexity by exploiting the prior knowledge on candidate SNPs for epistatic effects from biological experiments. We demonstrate our method on simulated and yeast eQTL datasets. Software is available at http://www.sailing.cs.cmu.edu/.
    Bioinformatics 06/2012; 28(12):i137-46. · 5.47 Impact Factor
  • Article: Structured Input-Output Lasso, with Application to eQTL Mapping, and a Thresholding Algorithm for Fast Estimation
    Seunghak Lee, Eric P. Xing
    ABSTRACT: We consider the problem of learning a high-dimensional multi-task regression model, under sparsity constraints induced by the presence of grouping structures on the input covariates and on the output predictors. This problem is primarily motivated by expression quantitative trait locus (eQTL) mapping, whose goal is to discover genetic variations in the genome (inputs) that influence the expression levels of multiple co-expressed genes (outputs), either epistatically, or pleiotropically, or both. A structured input-output lasso (SIOL) model based on an intricate ℓ1/ℓ2-norm penalty over the regression coefficient matrix is employed to enable discovery of complex sparse input/output relationships; and a highly efficient new optimization algorithm called hierarchical group thresholding (HiGT) is developed to solve the resultant non-differentiable, non-separable, and ultra high-dimensional optimization problem. We show on both simulated data and a yeast eQTL dataset that our model leads to significantly better recovery of the structured sparse relationships between the inputs and the outputs, and our algorithm significantly outperforms other optimization techniques under the same model. Additionally, we propose a novel approach for efficiently and effectively detecting input interactions by exploiting the prior knowledge available from biological experiments.
    05/2012;
  • Article: Large-margin Predictive Latent Subspace Learning for Multi-view Data Analysis.
    Ning Chen, Jun Zhu, Fuchun Sun, Eric P Xing
    ABSTRACT: Learning from multi-view data is important in many applications such as image classification, retrieval and annotation. Standard predictive methods, such as support vector machines that are built with all the variables available without taking into consideration the presence of distinct views, would sacrifice predictive performance and may also be incapable of performing view-level analysis. In this paper, we present a statistical method to learn a predictive subspace representation shared by multiple views when supervising side information is provided and perform view-level predictions. Our approach is based on a multi-view latent subspace Markov network (MN) which fulfills a weak conditional independence assumption that multi-view observations and response variables are conditionally independent given a set of latent variables. To learn the latent subspace multi-view MN, we develop a large-margin approach which jointly maximizes data likelihood and minimizes a prediction loss on training data. The learning and inference problems are efficiently solved with a contrastive divergence method. Finally, we extensively evaluate the large-margin multi-view latent subspace MN on real TRECVID video, Flickr web image and hotel review datasets for classification, regression, image annotation and retrieval. Our results demonstrate that the large-margin approach can achieve significant improvements in terms of prediction performance and discovering predictive latent subspace representations.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 02/2012; · 4.91 Impact Factor
  • Conference Proceeding: Infinite SVM: a Dirichlet Process Mixture of Large-margin Kernel Machines.
    Jun Zhu, Ning Chen, Eric P. Xing
    Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Bellevue, Washington, USA, June 28 - July 2, 2011; 01/2011
  • Conference Proceeding: Sparse Topical Coding.
    Jun Zhu, Eric P. Xing
    UAI 2011, Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, Barcelona, Spain, July 14-17, 2011; 01/2011
  • Conference Proceeding: Conditional Topic Random Fields.
    Jun Zhu, Eric P. Xing
    Proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21-24, 2010, Haifa, Israel; 01/2010
  • Conference Proceeding: Adaptive Multi-Task Lasso: with Application to eQTL Detection.
    Seunghak Lee, Jun Zhu, Eric P. Xing
    Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada.; 01/2010
  • Conference Proceeding: Predictive Subspace Learning for Multi-view Data: a Large Margin Approach.
    Ning Chen, Jun Zhu, Eric P. Xing
    Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada.; 01/2010
  • Conference Proceeding: Grafting-light: fast, incremental feature selection and structure learning of Markov random fields.
    Jun Zhu, Ni Lao, Eric P. Xing
    Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, July 25-28, 2010; 01/2010
  • Conference Proceeding: On primal and dual sparsity of Markov networks.
    Jun Zhu, Eric P. Xing
    Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June 14-18, 2009; 01/2009
  • Conference Proceeding: Primal sparse Max-margin Markov networks.
    Jun Zhu, Eric P. Xing, Bo Zhang
    Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28 - July 1, 2009; 01/2009
  • Conference Proceeding: Laplace maximum margin Markov networks.
    Jun Zhu, Eric P. Xing, Bo Zhang
    Machine Learning, Proceedings of the Twenty-Fifth International Conference (ICML 2008), Helsinki, Finland, June 5-9, 2008; 01/2008
  • Conference Proceeding: Partially Observed Maximum Entropy Discrimination Markov Networks.
    Jun Zhu, Eric P. Xing, Bo Zhang
    Advances in Neural Information Processing Systems 21, Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 8-11, 2008; 01/2008

Institutions

  • 2012
    • Tsinghua University
      • Department of Computer Science and Technology
      Beijing, Beijing Shi, China
  • 2008–2012
    • Carnegie Mellon University
      • Computer Science Department
      Pittsburgh, PA, USA