IEEE Transactions on Pattern Analysis and Machine Intelligence Journal Impact Factor & Information

Publisher: IEEE Computer Society; Institute of Electrical and Electronics Engineers, Institute of Electrical and Electronics Engineers

Journal description

Theory and application of computers in pattern analysis and machine intelligence. Topics include computer vision and image processing; knowledge representation, inference systems, and probabilistic reasoning. Extensive bibliographies.

Current impact factor: 5.69

Impact Factor Rankings

2015 Impact Factor Available summer 2015
2013 / 2014 Impact Factor 5.694
2012 Impact Factor 4.795
2011 Impact Factor 4.908
2010 Impact Factor 5.027
2009 Impact Factor 4.378
2008 Impact Factor 5.96
2007 Impact Factor 3.579
2006 Impact Factor 4.306
2005 Impact Factor 3.81
2004 Impact Factor 4.352
2003 Impact Factor 3.823
2002 Impact Factor 2.923
2001 Impact Factor 2.289
2000 Impact Factor 2.094
1999 Impact Factor 1.882
1998 Impact Factor 1.417
1997 Impact Factor 1.668
1996 Impact Factor 2.085
1995 Impact Factor 1.94
1994 Impact Factor 2.006
1993 Impact Factor 1.917
1992 Impact Factor 1.906

Impact factor over time

Impact factor

Additional details

5-year impact 6.14
Cited half-life 0.00
Immediacy index 0.63
Eigenfactor 0.05
Article influence 3.24
Website IEEE Transactions on Pattern Analysis and Machine Intelligence website
Other titles IEEE transactions on pattern analysis and machine intelligence, Institute of Electrical and Electronics Engineers transactions on pattern analysis and machine intelligence
ISSN 0162-8828
OCLC 4253074
Material type Periodical, Internet resource
Document type Journal / Magazine / Newspaper, Internet Resource

Publisher details

Institute of Electrical and Electronics Engineers

  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author can archive a post-print version
  • Conditions
    • Author's pre-print on Author's personal website, employers website or publicly accessible server
    • Author's post-print on Author's server or Institutional server
    • Author's pre-print must be removed upon publication of final version and replaced with either full citation to IEEE work with a Digital Object Identifier or link to article abstract in IEEE Xplore or replaced with Authors post-print
    • Author's pre-print must be accompanied with set-phrase, once submitted to IEEE for publication ("This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible")
    • Author's pre-print must be accompanied with set-phrase, when accepted by IEEE for publication ("(c) 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.")
    • IEEE must be informed as to the electronic address of the pre-print
    • If funding rules apply authors may post Author's post-print version in funder's designated repository
    • Author's Post-print - Publisher copyright and source must be acknowledged with citation (see above set statement)
    • Author's Post-print - Must link to publisher version with DOI
    • Publisher's version/PDF cannot be used
    • Publisher copyright and source must be acknowledged
  • Classification
    ​ green

Publications in this journal

  • [Show abstract] [Hide abstract]
    ABSTRACT: We present a method to track the shape of an object from video. The method uses a joint shape and appearance model of the object, which is propagated to match shape and radiance in subsequent frames, determining object shape. Self-occlusions and dis-occlusions of the object from camera and object motion pose difficulties to joint shape and appearance models in tracking. They are unable to adapt to new shape and appearance information, leading to inaccurate shape detection. In this work, we model self-occlusions and dis-occlusions in a joint shape and appearance tracking framework. Self-occlusions and the warp to propagate the model are coupled, thus we formulate a joint optimization problem. We derive a coarse-to-fine optimization method, advantageous in tracking, that initially perturbs the model by coarse perturbations before transitioning to finer-scale perturbations seamlessly. This coarse-to-fine behavior is automatically induced by gradient descent on a novel infinite-dimensional Riemannian manifold that we introduce. The manifold consists of planar parameterized regions, and the metric that we introduce is a novel Sobolev metric. Experiments on video exhibiting occlusions/dis-occlusions, complex radiance and background show that occlusion/dis-occlusion modeling leads to superior shape accuracy.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):1053-1066. DOI:10.1109/TPAMI.2014.2360380
  • [Show abstract] [Hide abstract]
    ABSTRACT: The problem of real-time multiclass object recognition is of great practical importance in object recognition. In this paper, we describe a framework that simultaneously utilizes shared representation, reconstruction sparsity, and parallelism to enable real-time multiclass object detection with deformable part models at 5Hz on a laptop computer with almost no decrease in task performance. Our framework is trained in the standard structured output prediction formulation and is generically applicable for speeding up object recognition systems where the computational bottleneck is in multiclass, multi-convolutional inference. We experimentally demonstrate the efficiency and task performance of our method on PASCAL VOC, subset of ImageNet, Caltech101 and Caltech256 dataset.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):1001-1012. DOI:10.1109/TPAMI.2014.2353631
  • [Show abstract] [Hide abstract]
    ABSTRACT: Statistical optimality in multipartite ranking is investigated as an extension of bipartite ranking. We consider the optimality of ranking algorithms through minimization of the theoretical risk which combines pairwise ranking errors of ordinal categories with differential ranking costs. The extension shows that for a certain class of convex loss functions including exponential loss, the optimal ranking function can be represented as a ratio of weighted conditional probability of upper categories to lower categories, where the weights are given by the misranking costs. This result also bridges traditional ranking methods such as proportional odds model in statistics with various ranking algorithms in machine learning. Further, the analysis of multipartite ranking with different costs provides a new perspective on non-smooth listwise ranking measures such as the discounted cumulative gain and preference learning. We illustrate our findings with simulation study and real data analysis.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):1080-1094. DOI:10.1109/TPAMI.2014.2360397
  • [Show abstract] [Hide abstract]
    ABSTRACT: Sparse representation provides an effective tool for classification under the conditions that every class has sufficient representative training samples and the training data are uncorrupted. These conditions may not hold true in many practical applications. Face identification is an example where we have a large number of identities but sufficient representative and uncorrupted training images cannot be guaranteed for every identity. A violation of the two conditions leads to a poor performance of the sparse representation-based classification (SRC). This paper addresses this critic issue by analyzing the merits and limitations of SRC. A sparse- and dense-hybrid representation (SDR) framework is proposed in this paper to alleviate the problems of SRC. We further propose a procedure of supervised low-rank (SLR) dictionary decomposition to facilitate the proposed SDR framework. In addition, the problem of the corrupted training data is also alleviated by the proposed SLR dictionary decomposition. The application of the proposed SDR-SLR approach in face recognition verifies its effectiveness and advancement to the field. Extensive experiments on benchmark face databases demonstrate that it consistently outperforms the state-of-the-art sparse representation based approaches and the performance gains are significant in most cases.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):1067-1079. DOI:10.1109/TPAMI.2014.2359453
  • [Show abstract] [Hide abstract]
    ABSTRACT: Many established classifiers fail to identify the minority class when it is much smaller than the majority class. To tackle this problem, researchers often first rebalance the class sizes in the training dataset, through oversampling the minority class or undersampling the majority class, and then use the rebalanced data to train the classifiers. This leads to interesting empirical patterns. In particular, using the rebalanced training data can often improve the area under the receiver operating characteristic curve (AUC) for the original, unbalanced test data. The AUC is a widely-used quantitative measure of classification performance, but the property that it increases with rebalancing has, as yet, no theoretical explanation. In this note, using Gaussian-based linear discriminant analysis (LDA) as the classifier, we demonstrate that, at least for LDA, there is an intrinsic, positive relationship between the rebalancing of class sizes and the improvement of AUC. We show that the largest improvement of AUC is achieved, asymptotically, when the two classes are fully rebalanced to be of equal sizes.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):1109-1112. DOI:10.1109/TPAMI.2014.2359660
  • [Show abstract] [Hide abstract]
    ABSTRACT: Clothing recognition is a societally and commercially important yet extremely challenging problem due to large variations in clothing appearance, layering, style, and body shape and pose. In this paper, we tackle the clothing parsing problem using a retrieval-based approach. For a query image, we find similar styles from a large database of tagged fashion images and use these examples to recognize clothing items in the query. Our approach combines parsing from: pre-trained global clothing models, local clothing models learned on the fly from retrieved examples, and transferred parse-masks (Paper Doll item transfer) from retrieved examples. We evaluate our approach extensively and show significant improvements over previous state-of-the-art for both localization (clothing parsing given weak supervision in the form of tags) and detection (general clothing parsing). Our experimental results also indicate that the general pose estimation problem can benefit from clothing parsing.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):1028-1040. DOI:10.1109/TPAMI.2014.2353624
  • [Show abstract] [Hide abstract]
    ABSTRACT: We propose a new family of message passing techniques for MAP estimation in graphical models which we call Sequential Reweighted Message Passing (SRMP). Special cases include well-known techniques such as Min-Sum Diffusion (MSD) and a faster Sequential Tree-Reweighted Message Passing (TRW-S). Importantly, our derivation is simpler than the original derivation of TRW-S, and does not involve a decomposition into trees. This allows easy generalizations. The new family of algorithms can be viewed as a generalization of TRW-S from pairwise to higher-order graphical models. We test SRMP on several real-world problems with promising results.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):919-930. DOI:10.1109/TPAMI.2014.2363465
  • [Show abstract] [Hide abstract]
    ABSTRACT: Modeling intensity of facial action units from spontaneously displayed facial expressions is challenging mainly because of high variability in subject-specific facial expressiveness, head-movements, illumination changes, etc. These factors make the target problem highly context-sensitive. However, existing methods usually ignore this context-sensitivity of the target problem. We propose a novel Conditional Ordinal Random Field (CORF) model for context-sensitive modeling of the facial action unit intensity, where the W5+ (who, when , what, where, why and how) definition of the context is used. While the proposed model is general enough to handle all six context questions, in this paper we focus on the context questions: who (the observed subject), how (the changes in facial expressions), and when (the timing of facial expressions and their intensity). The context questions who and how are modeled by means of the newly introduced context-dependent covariate effects, and the context question when is modeled in terms of temporal correlation between the ordinal outputs, i.e., intensity levels of action units. We also introduce a weighted softmax-margin learning of CRFs from data with skewed distribution of the intensity levels, which is commonly encountered in spontaneous facial data. The proposed model is evaluated on intensity estimation of pain and facial action units using two recently published datasets (UNBC Shoulder Pain and DISFA) of spontaneously displayed facial expressions. Our experiments show that the proposed model performs significantly better on the target tasks compared to the state-of-the-art approaches. Furthermore, compared to traditional learning of CRFs, we show that the proposed weighted learning results in more robust parameter estimation from th- imbalanced intensity data.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):944-958. DOI:10.1109/TPAMI.2014.2356192
  • [Show abstract] [Hide abstract]
    ABSTRACT: Soft-constraint semi-supervised affinity propagation (SCSSAP) adds supervision to the affinity propagation (AP) clustering algorithm without strictly enforcing instance-level constraints. Constraint violations lead to an adjustment of the AP similarity matrix at every iteration of the proposed algorithm and to addition of a penalty to the objective function. This formulation is particularly advantageous in the presence of noisy labels or noisy constraints since the penalty parameter of SCSSAP can be tuned to express our confidence in instance-level constraints. When the constraints are noiseless, SCSSAP outperforms unsupervised AP and performs at least as well as the previously proposed semi-supervised AP and constrained expectation maximization. In the presence of label and constraint noise, SCSSAP results in a more accurate clustering than either of the aforementioned established algorithms. Finally, we present an extension of SCSSAP which incorporates metric learning in the optimization objective and can further improve the performance of clustering.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 05/2015; 37(5):1041-1052. DOI:10.1109/TPAMI.2014.2359454
  • [Show abstract] [Hide abstract]
    ABSTRACT: This paper describes a novel framework for computing geodesic paths in shape spaces of spherical surfaces under an elastic Riemannian metric. The novelty lies in defining this Riemannian metric directly on the quotient (shape) space, rather than inheriting it from pre-shape space, and using it to formulate a path energy that measures only the normal components of velocities along the path. In other words, this paper defines and solves for geodesics directly on the shape space and avoids complications resulting from the quotient operation. This comprehensive framework is invariant to arbitrary parameterizations of surfaces along paths, a phenomenon termed as gauge invariance. Additionally, this paper makes a link between different elastic metrics used in the computer science literature on one hand, and the mathematical literature on the other hand, and provides a geometrical interpretation of the terms involved. Examples using real and simulated 3D objects are provided to help illustrate the main ideas.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 04/2015;
  • [Show abstract] [Hide abstract]
    ABSTRACT: In this paper, we propose a tracking algorithm based on a multi-feature joint sparse representation. The templates for the sparse representation can include pixel values, textures, and edges. In the multi-feature joint optimization, noise or occlusion is dealt with using a set of trivial templates. A sparse weight constraint is introduced to dynamically select the relevant templates from the full set of templates. A variance ratio measure is adopted to adaptively adjust the weights of different features. The multi-feature template set is updated adaptively. We further propose an algorithm for tracking multi-objects with occlusion handling based on the multi-feature joint sparse reconstruction. The observation model based on sparse reconstruction automatically focuses on the visible parts of an occluded object by using the information in the trivial templates. The multi-object tracking is simplified into a joint Bayesian inference. The experimental results show the superiority of our algorithm over several state-of-the-art tracking algorithms.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 04/2015; 37(4):816-833. DOI:10.1109/TPAMI.2014.2353628
  • [Show abstract] [Hide abstract]
    ABSTRACT: Reconstructing transparent objects is a challenging problem. While producing reasonable results for quite complex objects, existing approaches require custom calibration or somewhat expensive labor to achieve high precision. When an overall shape preserving salient and fine details is sufficient, we show in this paper a significant step toward solving the problem when the object’s silhouette is available and simple user interaction is allowed, by using a video of a transparent object shot under varying illumination. Specifically, we estimate the normal map of the exterior surface of a given solid transparent object, from which the surface depth can be integrated. Our technical contribution lies in relating this normal estimation problem to one of graph-cut segmentation. Unlike conventional formulations, however, our graph is dual-layered, since we can see a transparent object’s foreground as well as the background behind it. Quantitative and qualitative evaluation are performed to verify the efficacy of this practical solution.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 04/2015; 37(4):890-897. DOI:10.1109/TPAMI.2014.2346195
  • [Show abstract] [Hide abstract]
    ABSTRACT: In this paper we introduce a new theory of blur invariants. Blur invariants are image features which preserve their values if the image is convolved by a point-spread function (PSF) of a certain class. We present the invariants to convolution with an arbitrary $N$ -fold symmetric PSF, both in Fourier and image domain. We introduce a notion of a primordial image as a canonical form of all blur-equivalent images. It is defined in spectral domain by means of projection operators. We prove that the moments of the primordial image are invariant to blur and we derive recursive formulae for their direct computation without actually constructing the primordial image. We further prove they form a complete set of invariants and show how to extent their invariance also to translation, rotation and scaling. We illustrate by simulated and real-data experiments their invariance and recognition power. Potential applications of this method are wherever one wants to recognize objects on blurred images.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 04/2015; 37(4):1-1. DOI:10.1109/TPAMI.2014.2353644
  • [Show abstract] [Hide abstract]
    ABSTRACT: Elastic distortion of fingerprints is one of the major causes for false non-match. While this problem affects all fingerprint recognition applications, it is especially dangerous in negative recognition applications, such as watchlist and deduplication applications. In such applications, malicious users may purposely distort their fingerprints to evade identification. In this paper, we proposed novel algorithms to detect and rectify skin distortion based on a single fingerprint image. Distortion detection is viewed as a two-class classification problem, for which the registered ridge orientation map and period map of a fingerprint are used as the feature vector and a SVM classifier is trained to perform the classification task. Distortion rectification (or equivalently distortion field estimation) is viewed as a regression problem, where the input is a distorted fingerprint and the output is the distortion field. To solve this problem, a database (called reference database) of various distorted reference fingerprints and corresponding distortion fields is built in the offline stage, and then in the online stage, the nearest neighbor of the input fingerprint is found in the reference database and the corresponding distortion field is used to transform the input fingerprint into a normal one. Promising results have been obtained on three databases containing many distorted fingerprints, namely FVC2004 DB1, Tsinghua Distorted Fingerprint database, and the NIST SD27 latent fingerprint database.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 03/2015; 37(3):555-568. DOI:10.1109/TPAMI.2014.2345403
  • [Show abstract] [Hide abstract]
    ABSTRACT: In spite of many interesting attempts, the problem of automatically finding alignments in a 2D set of points seems to be still open. The difficulty of the problem is illustrated here by very simple examples. We then propose an elaborate solution. We show that a correct alignment detection depends on not less than four interlaced criteria, namely the amount of masking in texture, the relative bilateral local density of the alignment, its internal regularity, and finally a redundancy reduction step. Extending tools of the a contrario detection theory, we show that all of these detection criteria can be naturally embedded in a single probabilistic a contrario model with a single user parameter, the number of false alarms. Our contribution to the a contrario theory is the use of sophisticated conditional events on random point sets, for which expectation we nevertheless find easy bounds. By these bounds the mathematical consistency of our detection model receives a simple proof. Our final algorithm also includes a new formulation of the exclusion principle in Gestalt theory to avoid redundant detections. Aiming at reproducibility, a source code and an online demo open to any data point set are provided. The method is carefully compared to three state-of-the-art algorithms and an application to real data is discussed. Limitations of the final method are also illustrated and explained.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 03/2015; 37(3):499-512. DOI:10.1109/TPAMI.2014.2345389
  • [Show abstract] [Hide abstract]
    ABSTRACT: This paper presents a general formulation for a minimum cost data association problem which associates data features via one-to-one, m-to-one and one-to-n links with minimum total cost of the links. A motivating example is a problem of tracking multiple interacting nanoparticles imaged on video frames, where particles can aggregate into one particle or a particle can be split into multiple particles. Many existing multitarget tracking methods are capable of tracking non-interacting targets or tracking interacting targets of restricted degrees of interactions. The proposed formulation solves a multitarget tracking problem for general degrees of inter-object interactions. The formulation is in the form of a binary integer programming problem. We propose a polynomial time solution approach that can obtain a good relaxation solution of the binary integer programming, so the approach can be applied for multitarget tracking problems of a moderate size (for hundreds of targets over tens of time frames). The resulting solution is always integral and obtains a better duality gap than the simple linear relaxation solution of the corresponding problem. The proposed method was validated through applications to simulated multitarget tracking problems and a real multitarget tracking problem.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 03/2015; 37(3):611-624. DOI:10.1109/TPAMI.2014.2346202
  • [Show abstract] [Hide abstract]
    ABSTRACT: We show how discrete Morse theory provides a rigorous and unifying foundation for defining skeletons and partitions of grayscale digital images. We model a grayscale image as a cubical complex with a real-valued function defined on its vertices (the voxel values). This function is extended to a discrete gradient vector field using the algorithm presented in Robins, Wood, Sheppard TPAMI 33:1646 (2011). In the current paper we define basins (the building blocks of a partition) and segments of the skeleton using the stable and unstable sets associated with critical cells. The natural connection between Morse theory and homology allows us to prove the topological validity of these constructions; for example, that the skeleton is homotopic to the initial object. We simplify the basins and skeletons via Morse-theoretic cancellation of critical cells in the discrete gradient vector field using a strategy informed by persistent homology. Simple working Python code for our algorithms for efficient vector field traversal is included. Example data are taken from micro-CT images of porous materials, an application area where accurate topological models of pore connectivity are vital for fluid-flow modelling.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 03/2015; 37(3):654-666. DOI:10.1109/TPAMI.2014.2346172
  • [Show abstract] [Hide abstract]
    ABSTRACT: We present a new approach for matching sets of branching curvilinear structures that form graphs embedded in ${mathbb {R}}^2$ or ${mathbb {R}}^3$ and may be subject to deformations. Unlike earlier methods, ours does not rely on local appearance similarity nor does require a good initial alignment. Furthermore, it can cope with non-linear deformations, topological differences, and partial graphs. To handle arbitrary non-linear deformations, we use Gaussian process regressions to represent the geometrical mapping relating the two graphs. In the absence of appearance information, we iteratively establish correspondences between points, update the mapping accordingly, and use it to estimate where to find the most likely correspondences that will be used in the next step. To make the computation tractable for large graphs, the set of new potential matches considered at each iteration is not selected at random as with many RANSAC-based algorithms. Instead, we introduce a so-called Active Testing Search strategy that performs a priority search to favor the most likely matches and speed-up the process. We demonstrate the effectiveness of our approach first on synthetic cases and then on angiography data, retinal fundus images, and microscopy image stacks acquired at very different resolutions.
    IEEE Transactions on Pattern Analysis and Machine Intelligence 03/2015; 37(3):625-638. DOI:10.1109/TPAMI.2014.2343235