IEEE Transactions on Image Processing (IEEE T IMAGE PROCESS)

Publisher: IEEE Signal Processing Society; Institute of Electrical and Electronics Engineers

Journal description

This journal will focus on the signal processing aspects of image acquisition, processing, and display, especially where it concerns modeling, design, and analysis having a strong mathematical basis.

Current impact factor: 3.63

Impact Factor Rankings

2016 Impact Factor Available summer 2017
2014 / 2015 Impact Factor 3.625
2013 Impact Factor 3.111
2012 Impact Factor 3.199
2011 Impact Factor 3.042
2010 Impact Factor 2.606
2009 Impact Factor 2.848
2008 Impact Factor 3.315
2007 Impact Factor 2.462
2006 Impact Factor 2.715
2005 Impact Factor 2.428
2004 Impact Factor 2.011
2003 Impact Factor 2.642
2002 Impact Factor 2.553
2001 Impact Factor 2.185
2000 Impact Factor 2.078
1999 Impact Factor 2.695
1998 Impact Factor 1.364
1997 Impact Factor 1.063

[Figure: impact factor over time]

Additional details

5-year impact: 4.48
Cited half-life: 7.70
Immediacy index: 0.44
Eigenfactor: 0.04
Article influence: 1.58
Website: IEEE Transactions on Image Processing website
Other titles: IEEE transactions on image processing, Institute of Electrical and Electronics Engineers transactions on image processing, Image processing
ISSN: 1941-0042
OCLC: 24103523
Material type: Periodical, Internet resource
Document type: Journal / Magazine / Newspaper, Internet Resource

Publisher details

Institute of Electrical and Electronics Engineers

  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author can archive a post-print version
  • Conditions
    • Author's pre-print on Author's personal website, employer's website or publicly accessible server
    • Author's post-print on Author's server or Institutional server
    • Author's pre-print must be removed upon publication of final version and replaced with either full citation to IEEE work with a Digital Object Identifier or link to article abstract in IEEE Xplore or replaced with Author's post-print
    • Author's pre-print must be accompanied with set-phrase, once submitted to IEEE for publication ("This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible")
    • Author's pre-print must be accompanied with set-phrase, when accepted by IEEE for publication ("(c) 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.")
    • IEEE must be informed as to the electronic address of the pre-print
    • If funding rules apply authors may post Author's post-print version in funder's designated repository
    • Author's Post-print - Publisher copyright and source must be acknowledged with citation (see above set statement)
    • Author's Post-print - Must link to publisher version with DOI
    • Publisher's version/PDF cannot be used
    • Publisher copyright and source must be acknowledged

Publications in this journal

  • ABSTRACT: In this paper, we propose interactive image segmentation using adaptive constraint propagation (ACP), called ACP Cut. In interactive image segmentation, the interactive inputs provided by users play an important role in guiding image segmentation. However, these simple inputs often cause bias which leads to failure in preserving object boundaries. To effectively use this limited interactive information, we employ ACP for semisupervised kernel matrix learning (SS-KML) which adaptively propagates the interactive information into the whole image while successfully keeping the original data coherence. Moreover, ACP Cut adopts seed propagation to achieve discriminative structure learning and reduce the computational complexity. Experimental results demonstrate that ACP Cut extracts foreground objects successfully from the background and outperforms the state-of-the-art methods for interactive image segmentation in terms of both effectiveness and efficiency.
    Article · Jan 2016 · IEEE Transactions on Image Processing
  • ABSTRACT: Content-based image retrieval (CBIR) has attracted much attention during the past decades for its potential practical applications to image database management. A variety of relevance feedback (RF) schemes have been designed to bridge the gap between low-level visual features and high-level semantic concepts for an image retrieval task. In the process of RF, it would be impractical or too expensive to provide explicit class label information for each image. Instead, similar or dissimilar pairwise constraints between two images can be acquired more easily. However, most of the conventional RF approaches can only deal with training images with explicit class label information. In this paper, we propose a novel discriminative semantic subspace analysis (DSSA) method, which can directly learn a semantic subspace from similar and dissimilar pairwise constraints without using any explicit class label information. In particular, DSSA can effectively integrate the local geometry of labeled similar images, the discriminative information between labeled similar and dissimilar images, and the local geometry of labeled and unlabeled images together to learn a reliable subspace. Compared with the popular distance metric analysis approaches, our method can also learn a distance metric but perform more effectively when dealing with high-dimensional images. Extensive experiments on both synthetic datasets and a real-world image database demonstrate the effectiveness of the proposed scheme in improving the performance of CBIR.
    Article · Jan 2016 · IEEE Transactions on Image Processing
  • ABSTRACT: In this paper we propose a novel model for computational color constancy, inspired by the amazing ability of the human vision system (HVS) to perceive the color of objects largely constant as the light source color changes. The proposed model imitates the color processing mechanisms in the specific level of the retina, the first stage of the HVS, from the adaptation emerging in the layers of cone photoreceptors and horizontal cells (HCs) to the color-opponent mechanism and disinhibition effect of the non-classical receptive field (non-CRF) in the layer of retinal ganglion cells (RGCs). In particular, HC modulation provides a global color correction with cone-specific lateral gain control, and the following RGCs refine the processing with iterative adaptation till all the three opponent channels reach their stable states (i.e., obtain stable outputs). Instead of explicitly estimating the scene illuminant(s) like most existing algorithms, our model directly removes the effect of scene illuminant. Evaluations on four commonly used color constancy datasets show that the proposed model produces competitive results in comparison to the state-of-the-art methods for the scenes under either single or multiple illuminants. The results indicate that single-opponency, especially the disinhibitory effect emerging in the receptive field's subunit-structured surround of RGCs, plays an important role in removing scene illuminant(s) by inherently distinguishing the spatial structures of surfaces from extensive illuminant(s).
    Article · Jan 2016 · IEEE Transactions on Image Processing
  • ABSTRACT: We propose a novel reconstruction based transfer learning method called Latent Sparse Domain Transfer (LSDT) for domain adaptation and visual categorization of heterogeneous data. For handling cross-domain distribution mismatch, we advocate reconstructing the target domain data with the combined source and target domain data points based on ℓ1-norm sparse coding. Furthermore, we propose a joint learning model for simultaneous optimization of the sparse coding and the optimal subspace representation. Additionally, we generalize the proposed LSDT model into a kernel based linear/nonlinear basis transformation learning framework for tackling nonlinear subspace shifts in Reproducing Kernel Hilbert Space. The proposed methods have three advantages: 1) the latent space and reconstruction are jointly learned for pursuit of an optimal subspace transfer; 2) with the theory of sparse subspace clustering (SSC), a few valuable source and target data points are formulated to reconstruct the target data with noise (outliers) from source domain removed during domain adaptation, such that the robustness is guaranteed; 3) a nonlinear projection of some latent space with kernel is easily generalized for dealing with highly nonlinear domain shift (e.g. face poses). Extensive experiments on several benchmark vision datasets demonstrate that the proposed approaches outperform other state-of-the-art representation based domain adaptation methods.
    Article · Jan 2016 · IEEE Transactions on Image Processing
  • ABSTRACT: The paper proposes a new method of calculating a matching metric for motion estimation. The proposed method splits the information in the source images into multiple scale and orientation subbands, reduces the subband values to a binary representation via an adaptive thresholding algorithm, and uses mutual information to model the similarity of corresponding square windows in each image. A moving window strategy is applied to recover a dense estimated motion field whose properties are explored. The proposed matching metric is a sum of mutual information scores across space, scale and orientation. This facilitates the exploitation of information diversity in the source images. Experimental comparisons are performed amongst several related approaches, revealing that the proposed matching metric is better able to exploit information diversity, generating more accurate motion fields.
    Article · Jan 2016 · IEEE Transactions on Image Processing
  • ABSTRACT: In this paper, we present Multinomial Latent Logistic Regression (MLLR), a new learning paradigm that introduces latent variables to logistic regression. By inheriting the advantages of logistic regression, MLLR is efficiently optimized using second-order derivatives and provides effective probabilistic analysis on output predictions. MLLR is particularly effective in weakly-supervised settings where the latent variable has an exponential number of possible values. The effectiveness of MLLR is demonstrated on four different image understanding applications, including a new challenging architectural style classification task. Furthermore, we show that MLLR can be generalized to general structured output prediction, and in doing so we provide a thorough investigation of the connections and differences between MLLR and existing related algorithms, including latent structural SVMs and hidden CRFs.
    Article · Dec 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: Blur in facial images significantly impedes the efficiency of recognition approaches. However, most existing blind deconvolution methods cannot generate satisfactory results, due to their dependence on strong edges which are sufficient in natural images but not in facial images. In this paper, we represent a point spread function (PSF) by the linear combination of a set of pre-defined orthogonal PSFs and similarly, an estimated intrinsic sharp face image (EI) is represented by the linear combination of a set of pre-defined orthogonal face images. In doing so, PSF and EI estimation is simplified to discovering two sets of linear combination coefficients which are simultaneously found by our proposed coupled learning algorithm. To make our method robust to different kinds of blurry face images, we generate several candidate PSFs and EIs for a test image, and then a non-blind deconvolution method is adopted to generate more EIs by those candidate PSFs. Finally, we deploy a blind image quality assessment metric to automatically select the optimal EI. Thorough experiments on the Facial Recognition Technology (FERET) Database, extended Yale Face Database B, CMU Pose, Illumination, and Expression (PIE) database and Face Recognition Grand Challenge (FRGC) Database version 2.0 demonstrate that the proposed approach effectively restores intrinsic sharp face images and consequently improves the performance of face recognition.
    Article · Dec 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: We introduce a variational phase retrieval algorithm for the imaging of transparent objects. Our formalism is based on the transport-of-intensity equation (TIE) which relates the phase of an optical field to the variation of its intensity along the direction of propagation. TIE practically requires one to record a set of defocus images to measure the variation of intensity. We first investigate the effect of the defocus distance on the retrieved phase map. Based on our analysis, we propose a weighted phase reconstruction algorithm yielding a phase map that minimizes a convex functional. The method is nonlinear and combines different ranges of spatial frequencies, depending on the defocus value of the measurements, in a regularized fashion. The minimization task is solved iteratively via the alternating-direction method of multipliers. Our simulations outperform commonly used linear and nonlinear TIE solvers. We also illustrate and validate our method on real microscopy data of HeLa cells.
    Article · Dec 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: A focus profile depicts the image sharpness (or focus value) as the lens sweeps along the optical axis of a camera. Accurate modeling of the focus profile is important to many imaging tasks. In this paper, we present an approach to focus profile modeling that makes the search of in-focus lens position a mathematically tractable problem and thereby improves the efficiency and accuracy of image acquisition. The proposed approach entails a transformation that converts the representation of a focus profile to quadratic form. An important feature of the approach is that no prior knowledge of the focus measurement technique is required. Experimental results are provided to demonstrate the effectiveness of the approach.
    Article · Dec 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: Human action recognition in videos has been extensively studied in recent years due to its wide range of applications. Instead of classifying video sequences into a number of action categories, in this paper, we focus on a particular problem of action similarity labeling, which aims at verifying whether a pair of videos contain the same type of action or not. To address this challenge, a novel approach called Compressive Sequential Learning (CSL) is proposed by leveraging the compressive sensing theory and sequential learning. We first project data points to a low dimensional space by effectively exploring an important property in compressive sensing: the Restricted Isometry Property (RIP). In particular, a very sparse measurement matrix is adopted to reduce the dimensionality efficiently. We then learn an ensemble classifier for measuring similarities between pair-wise videos by iteratively minimizing its empirical risk with the AdaBoost strategy on the training set. Unlike conventional AdaBoost, the weak learner for each iteration is not explicitly defined and its parameters are learned through greedy optimization. Furthermore, an alternative of CSL named Compressive Sequential Encoding (CSE) is developed as an encoding technique and followed by a linear classifier to address the similarity labeling problem. Our method has been systematically evaluated on four action data sets: ASLAN, KTH, HMDB51 and Hollywood2, and the results show the effectiveness and superiority of our method for action similarity labeling.
    Article · Dec 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: In this paper, we propose a novel tracking framework based on a sparse and discriminative hashing method. Different from the previous work, we treat object tracking as an Approximate Nearest Neighbor searching process in a binary space. Using the hash functions, the target templates and candidates can be projected into the Hamming space, facilitating the distance calculation and tracking efficiency. First, we integrate both the inter-class and intra-class information to train multiple hash functions for better classification, while most classifiers in previous tracking methods usually neglect the inter-class correlation, which may cause inaccuracy. Then, we introduce sparsity into the hash coefficient vectors for dynamic feature selection, which is crucial to select the discriminative and stable features to adapt to visual variations during the tracking process. Extensive experiments on various challenging sequences show that the proposed algorithm performs favorably against the state-of-the-art methods.
    Article · Dec 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: Local Binary Pattern (LBP) has been successfully used in computer vision and pattern recognition applications such as texture recognition. It could effectively address gray-scale and rotation variation. However, it failed to get desirable performance for texture classification with scale transformation. In this paper, a new method based on dominant LBP in scale space is proposed to address scale variation for texture classification. First, a scale space of a texture image is derived by a Gaussian filter. Then, a histogram of pre-learned dominant LBPs is built for each image in the scale space. Finally, for each pattern, the maximal frequency among different scales is considered as the scale invariant feature. Extensive experiments on five public texture databases (UIUC, CUReT, KTH-TIPS, UMD, ALOT) validate the efficiency of the proposed feature extraction scheme. Coupled with the nearest subspace classifier (NSC), the proposed method could yield competitive results, which are 99.36%, 99.51%, 99.39%, 99.46% and 99.71% for UIUC, CUReT, KTH-TIPS, UMD and ALOT respectively. Meanwhile, the proposed method inherits the simplicity and efficiency of LBP; for example, it could extract scale-robust features for a 200×200 image within 0.24 seconds, which is applicable for many real-time applications.
    Article · Dec 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: Horror content sharing on the Web is a growing phenomenon that can interfere with our daily life and affect the mental health of those involved. As an important form of expression, horror images have their own characteristics that can evoke extreme emotions. In this paper, we present a novel context-aware multi-instance learning (CMIL) algorithm for horror image recognition. The CMIL algorithm identifies horror images and picks out the regions that cause the sensation of horror in these horror images. It obtains contextual cues among adjacent regions in an image using a random walk on a contextual graph. Borrowing the strength of the fuzzy support vector machine (FSVM), we define a heuristic optimization procedure based on the FSVM to search for the optimal classifier for the CMIL. To improve the initialization of the CMIL, we propose a novel visual saliency model based on the tensor analysis. The average saliency value of each segmented region is set as its initial fuzzy membership in the CMIL. The advantage of the tensor-based visual saliency model is that it not only adaptively selects features, but also dynamically determines fusion weights for saliency value combination from different feature subspaces. The effectiveness of the proposed CMIL model is demonstrated by its use in horror image recognition on two large-scale image sets collected from the Internet.
    Article · Dec 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: One of the light field capturing techniques is the focused plenoptic capturing. By placing a microlens array in front of the photosensor, the focused plenoptic cameras capture both spatial and angular information of a scene in each microlens image and across microlens images. The capturing results in a significant amount of redundant information, and the captured image is usually of a large resolution. A coding scheme that removes the redundancy before coding can be of advantage for efficient compression, transmission and rendering. In this paper, we propose a lossy coding scheme to efficiently represent plenoptic images. The format contains a sparse image set and its associated disparities. The reconstruction is performed by disparity-based interpolation and inpainting, and the reconstructed image is later employed as a prediction reference for the coding of the full plenoptic image. As an outcome of the representation, the proposed scheme inherits a scalable structure with three layers. The results show that plenoptic images are compressed efficiently with over 60 percent bit rate reduction compared to HEVC intra, and with over 20 percent compared to HEVC block copying mode.
    Article · Nov 2015 · IEEE Transactions on Image Processing
  • ABSTRACT: Invertible image representation methods (transforms) are routinely employed as low-level image processing operations based on which feature extraction and recognition algorithms are developed. Most transforms in current use (e.g. Fourier, Wavelet, etc.) are linear transforms, and, by themselves, are unable to substantially simplify the representation of image classes for classification. Here we describe a nonlinear, invertible, low-level image processing transform based on combining the well known Radon transform for image data, and the 1D Cumulative Distribution Transform proposed earlier. We describe a few of the properties of this new transform, and with both theoretical and experimental results show that it can often render certain problems linearly separable in transform space.
    Article · Nov 2015 · IEEE Transactions on Image Processing
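
Illustrative note on the last abstract above: the 1D Cumulative Distribution Transform it mentions can be sketched in a few lines. This is a minimal sketch and not the authors' implementation. It assumes the commonly used definition with a uniform reference density, in which the transform of a normalized positive signal is its inverse CDF minus the reference coordinate; the function name `cdt_1d` and the parameters `n_quantiles` and `eps` are hypothetical. The Radon step (applying this transform to each projection of an image) is omitted.

```python
import numpy as np


def cdt_1d(signal, x, n_quantiles=200, eps=1e-6):
    """Sketch of a 1D CDT against a uniform reference on [x[0], x[-1]].

    Assumption: the constant sqrt(reference density) factor that appears
    in some definitions is dropped, since it only rescales the output.
    """
    # Treat the signal as a strictly positive, normalized density.
    density = np.asarray(signal, dtype=float) + eps
    density = density / density.sum()
    cdf = np.cumsum(density)

    # Quantile levels, kept away from 0 and 1 to avoid the flat tails.
    q = np.linspace(0.05, 0.95, n_quantiles)
    # Coordinates at which the uniform reference CDF equals q.
    x0 = x[0] + q * (x[-1] - x[0])

    # f(x0): inverse of the signal CDF evaluated at the same levels q.
    f = np.interp(q, cdf, x)

    # Transform values are displacements from the reference coordinates.
    return x0, f - x0


# Two Gaussian bumps that differ only by a translation of 0.2.
x = np.linspace(0.0, 1.0, 1000)


def bump(center):
    return np.exp(-0.5 * ((x - center) / 0.05) ** 2)


_, t_a = cdt_1d(bump(0.3), x)
_, t_b = cdt_1d(bump(0.5), x)
shift = t_b - t_a  # a translation of the signal becomes a constant offset
```

Under these assumptions, translating the signal only adds a constant to its transform, which gives some intuition for how a nonlinear, invertible transform can make translated signal classes easier to separate with a linear classifier.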