IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE T PATTERN ANAL )

Publisher: IEEE Computer Society; Institute of Electrical and Electronics Engineers, Institute of Electrical and Electronics Engineers


Theory and application of computers in pattern analysis and machine intelligence. Topics include computer vision and image processing; knowledge representation, inference systems, and probabilistic reasoning. Extensive bibliographies.

Impact factor 5.69

  • Hide impact factor history
    Impact factor
  • 5-year impact
  • Cited half-life
  • Immediacy index
  • Eigenfactor
  • Article influence
  • Website
    IEEE Transactions on Pattern Analysis and Machine Intelligence website
  • Other titles
    IEEE transactions on pattern analysis and machine intelligence, Institute of Electrical and Electronics Engineers transactions on pattern analysis and machine intelligence
  • ISSN
  • OCLC
  • Material type
    Periodical, Internet resource
  • Document type
    Journal / Magazine / Newspaper, Internet Resource

Publisher details

Institute of Electrical and Electronics Engineers

  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author can archive a post-print version
  • Conditions
    • Author's pre-print on Author's personal website, employers website or publicly accessible server
    • Author's post-print on Author's server or Institutional server
    • Author's pre-print must be removed upon publication of final version and replaced with either full citation to IEEE work with a Digital Object Identifier or link to article abstract in IEEE Xplore or replaced with Authors post-print
    • Author's pre-print must be accompanied with set-phrase, once submitted to IEEE for publication ("This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible")
    • Author's pre-print must be accompanied with set-phrase, when accepted by IEEE for publication ("(c) 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.")
    • IEEE must be informed as to the electronic address of the pre-print
    • If funding rules apply authors may post Author's post-print version in funder's designated repository
    • Author's Post-print - Publisher copyright and source must be acknowledged with citation (see above set statement)
    • Author's Post-print - Must link to publisher version with DOI
    • Publisher's version/PDF cannot be used
    • Publisher copyright and source must be acknowledged
  • Classification
    ​ green

Publications in this journal

  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: We propose the supervised hierarchical Dirichlet process (sHDP), a nonparametric generative model for the joint distribution of a group of observations and a response variable directly associated with that whole group. We compare the sHDP with another leading method for regression on grouped data, the supervised latent Dirichlet allocation (sLDA) model. We evaluate our method on two real-world classification problems and two real-world regression problems. Bayesian nonparametric regression models based on the Dirichlet process, such as the Dirichlet process-generalised linear models (DP-GLM) have previously been explored; these models allow flexibility in modelling nonlinear relationships. However, until now, Hierarchical Dirichlet Process (HDP) mixtures have not seen significant use in supervised problems with grouped data since a straightforward application of the HDP on the grouped data results in learnt clusters that are not predictive of the responses. The sHDP solves this problem by allowing for clusters to be learnt jointly from the group structure and from the label assigned to each group.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 12/2014; 99.
  • IEEE Transactions on Pattern Analysis and Machine Intelligence 12/2014;
  • IEEE Transactions on Pattern Analysis and Machine Intelligence 12/2014;
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Correlation filters (CFs) are a class of classifiers that are attractive for object localization and tracking applications. Traditionally, CFs have been designed in the frequency domain using the discrete Fourier transform (DFT), where correlation is efficiently implemented. However, existing CF designs do not account for the fact that the multiplication of two DFTs in the frequency domain corresponds to a circular correlation in the time/spatial domain. Because this was previously unaccounted for, prior CF designs are not truly optimal, as their optimization criteria do not accurately quantify their optimization intention. In this paper, we introduce new zero-aliasing constraints that completely eliminate this aliasing problem by ensuring that the optimization criterion for a given CF corresponds to a linear correlation rather than a circular correlation. This means that previous CF designs can be significantly improved by this reformulation. We demonstrate the benefits of this new CF design approach with several important CFs. We present experimental results on diverse data sets and present solutions to the computational challenges associated with computing these CFs. Code for the CFs described in this paper and their respective zero-aliasing versions is available at
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2014;
  • [Show abstract] [Hide abstract]
    ABSTRACT: Many computer vision and pattern recognition problems may be posed as the analysis of a set of dissimilarities between objects. For many types of data, these dissimilarities are not euclidean (i.e., they do not represent the distances between points in a euclidean space), and therefore cannot be isometrically embedded in a euclidean space. Examples include shape-dissimilarities, graph distances and mesh geodesic distances. In this paper, we provide a means of embedding such non-euclidean data onto surfaces of constant curvature. We aim to embed the data on a space whose radius of curvature is determined by the dissimilarity data. The space can be either of positive curvature (spherical) or of negative curvature (hyperbolic). We give an efficient method for solving the spherical and hyperbolic embedding problems on symmetric dissimilarity data. Our approach gives the radius of curvature and a method for approximating the objects as points on a hyperspherical manifold without optimisation. For objects which do not reside exactly on the manifold, we develop a optimisation-based procedure for approximate embedding on a hyperspherical manifold. We use the exponential map between the manifold and its local tangent space to solve the optimisation problem locally in the euclidean tangent space. This process is efficient enough to allow us to embed data sets of several thousand objects. We apply our method to a variety of data including time warping functions, shape similarities, graph similarity and gesture similarity data. In each case the embedding maintains the local structure of the data while placing the points in a metric space.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2014; 36(11):2255-2269.
  • [Show abstract] [Hide abstract]
    ABSTRACT: This paper presents a novel approach to recovering estimates of 3D structure and motion of a dynamic scene from a sequence of binocular stereo images. The approach is based on matching spatiotemporal orientation distributions between left and right temporal image streams, which encapsulates both local spatial and temporal structure for disparity estimation. By capturing spatial and temporal structure in this unified fashion, both sources of information combine to yield disparity estimates that are naturally temporal coherent, while helping to resolve matches that might be ambiguous when either source is considered alone. Further, by allowing subsets of the orientation measurements to support different disparity estimates, an approach to recovering multilayer disparity from spacetime stereo is realized. Similarly, the matched distributions allow for direct recovery of dense, robust estimates of 3D scene flow. The approach has been implemented with real-time performance on commodity GPUs using OpenCL. Empirical evaluation shows that the proposed approach yields qualitatively and quantitatively superior estimates in comparison to various alternative approaches, including the ability to provide accurate multilayer estimates in the presence of (semi)transparent and specular surfaces.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2014; 36(11):2241-2254.
  • [Show abstract] [Hide abstract]
    ABSTRACT: In this work, we address the problem of estimating 2d human pose from still images. Articulated body pose estimation is challenging due to the large variation in body poses and appearances of the different body parts. Recent methods that rely on the pictorial structure framework have shown to be very successful in solving this task. They model the body part appearances using discriminatively trained, independent part templates and the spatial relations of the body parts using a tree model. Within such a framework, we address the problem of obtaining better part templates which are able to handle a very high variation in appearance. To this end, we introduce parts dependent body joint regressors which are random forests that operate over two layers. While the first layer acts as an independent body part classifier, the second layer takes the estimated class distributions of the first one into account and is thereby able to predict joint locations by modeling the interdependence and co-occurrence of the parts. This helps to overcome typical ambiguities of tree structures, such as self-similarities of legs and arms. In addition, we introduce a novel data set termed $FashionPose$ that contains over $7{,}000$ images with a challenging variation of body part appearances due to a large variation of dressing styles. In the experiments, we demonstrate that the proposed parts dependent joint regressors outperform independent classifiers or regressors. The method also performs better or similar to the state-of-the-art in terms of accuracy, while running with a couple of frames per second.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2014; 36(11):2131-2143.
  • [Show abstract] [Hide abstract]
    ABSTRACT: Designing effective features is a fundamental problem in computer vision. However, it is usually difficult to achieve a great tradeoff between discriminative power and robustness. Previous works shown that spatial co-occurrence can boost the discriminative power of features. However the current existing co-occurrence features are taking few considerations to the robustness and hence suffering from sensitivity to geometric and photometric variations. In this work, we study the Transform Invariance (TI) of co-occurrence features. Concretely we formally introduce a Pairwise Transform Invariance (PTI) principle, and then propose a novel Pairwise Rotation Invariant Co-occurrence Local Binary Pattern (PRICoLBP) feature, and further extend it to incorporate multi-scale, multi-orientation, and multi-channel information. Different from other LBP variants, PRICoLBP can not only capture the spatial context co-occurrence information effectively, but also possess rotation invariance. We evaluate PRICoLBP comprehensively on nine benchmark data sets from five different perspectives, e.g., encoding strategy, rotation invariance, the number of templates, speed, and discriminative power compared to other LBP variants. Furthermore we apply PRICoLBP to six different but related applications—texture, material, flower, leaf, food, and scene classification, and demonstrate that PRICoLBP is efficient, effective, and of a well-balanced tradeoff between the discriminative power and robustness.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2014; 36.
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2014; 36(11):2227-2240.
  • [Show abstract] [Hide abstract]
    ABSTRACT: The search for a low-dimensional structure in high-dimensional data is one of the fundamental tasks in machine learning and pattern recognition. Manifold learning algorithms have recently emerged as alternatives to traditional linear dimension reduction techniques. In this paper, we propose a novel projection method that can be combined with any manifold learning methods to improve their dimension reduction performance when applied to high-dimensional data with a high level of noise. The method first builds a dispersion function that describes the distribution of dispersed manifold where the data lie. It then projects the noisy data onto a region wrapping the true manifold sufficiently close to it by applying a dynamical projection system associated with the constructed dispersion function. The effectiveness of the proposed projection method is validated by applying it to some real-world data sets with promising results.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2014; 36(11):2303-2309.
  • [Show abstract] [Hide abstract]
    ABSTRACT: Hidden Markov chains have been shown to be inadequate for data modeling under some complex conditions. In this work, we address the problem of statistical modeling of phenomena involving two heterogeneous system states. Such phenomena may arise in biology or communications, among other fields. Namely, we consider that a sequence of meaningful words is to be searched within a whole observation that also contains arbitrary one-by-one symbols. Moreover, a word may be interrupted at some site to be carried on later. Applying plain hidden Markov chains to such data, while ignoring their specificity, yields unsatisfactory results. The Phasic triplet Markov chain, proposed in this paper, overcomes this difficulty by means of an auxiliary underlying process in accordance with the triplet Markov chains theory. Related Bayesian restoration techniques and parameters estimation procedures according to the new model are then described. Finally, to assess the performance of the proposed model against the conventional hidden Markov chain model, experiments are conducted on synthetic and real data.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 11/2014; 36(11):2310-2316.