IEEE Journal of Selected Topics in Signal Processing Impact Factor & Information

Publisher: IEEE Signal Processing Society, Institute of Electrical and Electronics Engineers

Current impact factor: 3.63

Impact Factor Rankings

2015 Impact Factor Available summer 2015
2013 / 2014 Impact Factor 3.629
2012 Impact Factor 3.297
2011 Impact Factor 2.88
2010 Impact Factor 2.571
2009 Impact Factor 1.2

[Figure: impact factor over time]

Additional details

5-year impact 3.82
Cited half-life 3.50
Immediacy index 0.32
Eigenfactor 0.02
Article influence 1.99
Other titles IEEE journal of selected topics in signal processing, Selected topics in signal processing
ISSN 1932-4553
OCLC 158906070
Material type Periodical, Internet resource
Document type Internet Resource, Computer File, Journal / Magazine / Newspaper

Publisher details

Institute of Electrical and Electronics Engineers

  • Pre-print
    • Author can archive a pre-print version
  • Post-print
    • Author can archive a post-print version
  • Conditions
    • Author's pre-print on Author's personal website, employer's website or publicly accessible server
    • Author's post-print on Author's server or Institutional server
    • Author's pre-print must be removed upon publication of final version and replaced with either full citation to IEEE work with a Digital Object Identifier or link to article abstract in IEEE Xplore or replaced with Author's post-print
    • Author's pre-print must be accompanied by set-phrase, once submitted to IEEE for publication ("This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible")
    • Author's pre-print must be accompanied by set-phrase, when accepted by IEEE for publication ("(c) 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.")
    • IEEE must be informed as to the electronic address of the pre-print
    • If funding rules apply, authors may post Author's post-print version in funder's designated repository
    • Author's Post-print - Publisher copyright and source must be acknowledged with citation (see above set statement)
    • Author's Post-print - Must link to publisher version with DOI
    • Publisher's version/PDF cannot be used
    • Publisher copyright and source must be acknowledged
  • Classification
    • green

Publications in this journal

  • ABSTRACT: This paper presents a parametric method for perceptual sound field recording and reproduction from a small-sized microphone array to arbitrary loudspeaker layouts. The applied parametric model has been found to be effective and well-correlated with perceptual attributes in the context of directional audio coding, and here it is generalized and extended to higher orders of spherical harmonic signals. Higher order recordings are used for estimation of the model parameters inside angular sectors that provide increased separation between simultaneous sources and reverberation. The perceptual synthesis according to the combined properties of these sector parameters is achieved with an adaptive least-squares mixing technique. Furthermore, considerations regarding practical microphone arrays are presented and a frequency-dependent scheme is proposed. A realization of the system is described for an existing spherical microphone array and for a target loudspeaker setup similar to NHK 22.2. It is demonstrated through listening tests that, compared to a reference scene, the perceived difference is greatly reduced with the proposed higher order analysis model. The results further indicate that, on the same task, the method outperforms linear reproduction with the same recordings available.
    IEEE Journal of Selected Topics in Signal Processing 08/2015; 9(5). DOI:10.1109/JSTSP.2015.2415762
  • ABSTRACT: This paper addresses the problem of rumor source detection with multiple observations, from a statistical point of view, for a rumor spreading over a network under the susceptible-infectious model. For tree networks, multiple independent observations can dramatically improve the detection probability. For the case of a single rumor source, we propose a unified inference framework based on the joint rumor centrality, and provide explicit detection performance for degree-regular tree networks. Surprisingly, even with merely two observations, the detection probability at least doubles that of a single observation, and further approaches one, i.e., reliable detection, with increasing degree. This indicates that a richer diversity enhances detectability. Furthermore, we consider the case of multiple connected sources and investigate the effect of diversity. For general graphs, a detection algorithm using a breadth-first search strategy is also proposed and evaluated. Besides rumor source detection, our results can be used in network forensics to combat recurring epidemic-like information spreading such as online anomalies and fraudulent email spam.
    IEEE Journal of Selected Topics in Signal Processing 06/2015; 9(4):1-1. DOI:10.1109/JSTSP.2015.2389191
  • ABSTRACT: With the vast availability of traffic sensors from which traffic information can be derived, a lot of research effort has been devoted to developing traffic prediction techniques, which in turn improve route navigation, traffic regulation, urban area planning, etc. One key challenge in traffic prediction is how much to rely, in real-time traffic situations, on prediction models constructed from historical data, since those situations may differ from the ones in the historical data and change over time. In this paper, we propose a novel online framework that can learn from the current traffic situation (or context) in real-time and predict the future traffic by matching the current situation to the most effective prediction model trained using historical data. As real-time traffic arrives, the traffic context space is adaptively partitioned in order to efficiently estimate the effectiveness of each base predictor in different situations. We obtain and prove both short-term and long-term performance guarantees (bounds) for our online algorithm. The proposed algorithm also works effectively in scenarios where the true labels (i.e., realized traffic) are missing or become available with delay. Using the proposed framework, the context dimension that is the most relevant to traffic prediction can also be revealed, which can further reduce the implementation complexity as well as inform traffic policy making. Our experiments with real-world data in real-life conditions show that the proposed approach significantly outperforms existing solutions.
    IEEE Journal of Selected Topics in Signal Processing 06/2015; 9(4):1-1. DOI:10.1109/JSTSP.2015.2389196
  • ABSTRACT: We propose three novel algorithms for simultaneous dimensionality reduction and clustering of data lying in a union of subspaces. Specifically, we describe methods that learn the projection of data and find the sparse and/or low-rank coefficients in the low-dimensional latent space. Cluster labels are then assigned by applying spectral clustering to a similarity matrix built from these representations. Efficient optimization methods are proposed and their non-linear extensions based on kernel methods are presented. Various experiments show that the proposed methods perform better than many competitive subspace clustering methods.
    IEEE Journal of Selected Topics in Signal Processing 06/2015; 9(4):691-701. DOI:10.1109/JSTSP.2015.2402643
  • ABSTRACT: We present structured data fusion (SDF) as a framework for the rapid prototyping of knowledge discovery in one or more possibly incomplete data sets. In SDF, each data set (stored as a dense, sparse, or incomplete tensor) is factorized with a matrix or tensor decomposition. Factorizations can be coupled, or fused, with each other by indicating which factors should be shared between data sets. At the same time, factors may be constrained to have any type of structure that can be constructed as an explicit function of some underlying variables. With the right choice of decomposition type and factor structure, even well-known matrix factorizations such as the eigenvalue decomposition, singular value decomposition and QR factorization can be computed with SDF. A domain-specific language (DSL) for SDF is implemented as part of the software package Tensorlab, with which we offer a library of tensor decompositions and factor structures to choose from. The versatility of the SDF framework is demonstrated by means of four diverse applications, which are all solved entirely within Tensorlab’s DSL.
    IEEE Journal of Selected Topics in Signal Processing 06/2015; 9(4):586-600. DOI:10.1109/JSTSP.2015.2400415
  • ABSTRACT: Sparsity-based techniques have been widely popular in signal processing applications such as compression, denoising, and compressed sensing. Recently, the learning of sparsifying transforms for data has received interest. The advantage of the transform model is that it enables cheap and exact computations. In Part I of this work, efficient methods for online learning of square sparsifying transforms were introduced and investigated (by numerical experiments). The online schemes process signals sequentially, and can be especially useful when dealing with big data and for real-time or limited-latency signal processing applications. In this paper, we prove that although the associated optimization problems are non-convex, the online transform learning algorithms are guaranteed to converge to the set of stationary points of the learning problem. The guarantee relies on a few simple assumptions. In practice, the algorithms work well, as demonstrated by examples of applications to representing and denoising signals.
    IEEE Journal of Selected Topics in Signal Processing 06/2015; 9(4):637-646. DOI:10.1109/JSTSP.2015.2407860
  • ABSTRACT: With the Internet, social media, wireless mobile devices, and pervasive sensors continuously collecting massive amounts of data, we undoubtedly live in an era of “data deluge.” Learning from such huge volumes of data, however, promises ground-breaking advances in science and engineering along with consequent improvements in quality of life. Indeed, mining information from big data could limit the spread of epidemics and diseases, identify trends in financial and e-markets, unveil topologies and dynamics of emergent social-computational systems, accelerate brain imaging, neuroscience and systems biology models, and also protect critical infrastructure including the power grid and the Internet's backbone network. While Big Data can definitely be perceived as a big blessing, big challenges also arise with large-scale datasets. Given these challenges, ample signal processing opportunities arise. The articles in this special section explore novel modeling approaches, algorithmic advances along with their performance analysis, as well as representative applications of Big Data analytics to address practical challenges, while revealing fundamental limits and insights on the analytical trade-offs involved.
    IEEE Journal of Selected Topics in Signal Processing 06/2015; 9(4):583-585. DOI:10.1109/JSTSP.2015.2418393
  • ABSTRACT: Automatic speech recognition (ASR) systems are used daily by millions of people worldwide to dictate messages, control devices, initiate searches or to facilitate data input in small devices. The user experience in these scenarios depends on the quality of the speech transcriptions and on the responsiveness of the system. For multilingual users, a further obstacle to natural interaction is the monolingual character of many ASR systems, in which users are constrained to a single preset language. In this work, we present an end-to-end multi-language ASR architecture, developed and deployed at Google, that allows users to select arbitrary combinations of spoken languages. We leverage recent advances in language identification and a novel method of real-time language selection to achieve recognition accuracy similar to, and latency characteristics nearly identical to, those of a monolingual system.
    IEEE Journal of Selected Topics in Signal Processing 06/2015; 9(4):749-759. DOI:10.1109/JSTSP.2014.2364559
  • ABSTRACT: Techniques exploiting the sparsity of signals in a transform domain or dictionary have been popular in signal processing. Adaptive synthesis dictionaries have been shown to be useful in applications such as signal denoising and medical image reconstruction. More recently, the learning of sparsifying transforms for data has received interest. The sparsifying transform model allows for cheap and exact computations. In this paper, we develop a methodology for online learning of square sparsifying transforms. Such online learning can be particularly useful when dealing with big data, and for signal processing applications such as real-time sparse representation and denoising. The proposed transform learning algorithms are shown to have a much lower computational cost than online synthesis dictionary learning. In practice, the sequential learning of a sparsifying transform typically converges faster than batch mode transform learning. Preliminary experiments show the usefulness of the proposed schemes for sparse representation and denoising.
    IEEE Journal of Selected Topics in Signal Processing 06/2015; 9(4):625-636. DOI:10.1109/JSTSP.2015.2417131
  • ABSTRACT: One of the most challenging ongoing issues in the field of 3D visual research is how to interpret human 3D perception over virtual 3D space between the human eye and a 3D display. When a human being perceives a 3D structure, the brain classifies the scene into the binocular or monocular vision region depending on the availability of binocular depth perception in the unit of a certain region (coarse 3D perception). The details of the scene are then perceived by applying visual sensitivity to the classified 3D structure (fine 3D perception) with reference to the fixation. Furthermore, we include the coarse and fine 3D perception in the quality assessment, and propose a human 3D Perception-based Stereo image quality pooling (3DPS) model. In 3DPS we divide the stereo image into segment units, and classify each segment as either the binocular or monocular vision region. We assess the stereo image according to the classification by applying different visual weights to the pooling method to achieve more accurate quality assessment. In particular, it is demonstrated that 3DPS performs remarkably well for quality assessment of stereo images distorted by coding and transmission errors.
    IEEE Journal of Selected Topics in Signal Processing 04/2015; 9(3):1-1. DOI:10.1109/JSTSP.2015.2393296
  • ABSTRACT: Phase shifting profilometry (PSP) and Fourier transform profilometry (FTP) are two well-known fringe analysis methods for 3D sensing. PSP offers high accuracy but requires multiple images; FTP uses a single image but is limited in its accuracy. In this paper, we propose a novel Fourier-assisted phase shifting (FAPS) method for accurate dynamic 3D sensing. Our key observation is that the motion vulnerability of multi-shot PSP can be overcome through single-shot FTP, while the high accuracy of PSP is preserved. Moreover, to solve the phase ambiguity of complex scenes without additional images, we propose an efficient parallel spatial unwrapping strategy that embeds a sparse set of markers in the fringe patterns. Our dynamic 3D sensing system based on the above principles demonstrates superior performance over previous structured light techniques, including PSP, FTP, and Kinect.
    IEEE Journal of Selected Topics in Signal Processing 04/2015; 9(3):396-408. DOI:10.1109/JSTSP.2014.2378217
  • ABSTRACT: Structured light techniques are widely used for depth sensing. In this paper, we propose a single-shot, dual-frequency structured-light method to achieve dense depth in dynamic scenes. The projected pattern is a mixture of two different periodical waves whose phases are related to the change of color and intensity respectively, which avoids the Fourier spectra separation required by other multi-frequency patterns. A Gabor filter is adopted to interpolate the phases. Number theory is used to resolve the phase ambiguity of phase-based methods conveniently and quickly. A dense depth can be achieved because of the phase-based encoding mode. The proposed method is suitable for dense depth acquisition of moving objects. Experimental results show that the proposed method achieves higher accuracy in depth acquisition than the Kinect and higher resolution than the ToF (Time of Flight) depth camera. Meanwhile, the proposed method can also acquire the depth of color scenes and is robust to the surface texture of objects.
    IEEE Journal of Selected Topics in Signal Processing 04/2015; 9(3):384-395. DOI:10.1109/JSTSP.2015.2403794
  • ABSTRACT: The papers in this special section focus on interactive media processing for immersive communication applications and services.
    IEEE Journal of Selected Topics in Signal Processing 04/2015; 9(3):381-383. DOI:10.1109/JSTSP.2015.2406092
  • ABSTRACT: Several multiview video coding standards have been developed to efficiently compress images from different camera views capturing the same scene by exploiting the spatial, temporal and inter-view correlations. However, the compressed texture and depth data typically have many inter-view coding dependencies, which may not suit interactive multiview video streaming (IMVS) systems, where the user requests only one view at a time. In this context, this paper proposes an algorithm for the effective selection of the inter-view prediction structures (PSs) and associated texture and depth quantization parameters (QPs) for IMVS under relevant constraints. These PSs and QPs are selected such that the visual distortion is minimized, given some storage and point-to-point transmission rate constraints, and a user interaction behavior model. Simulation results show that the novel algorithm has near-optimal compression efficiency with low computational complexity, so that it offers an effective encoding solution for IMVS applications.
    IEEE Journal of Selected Topics in Signal Processing 04/2015; 9(3):487-500. DOI:10.1109/JSTSP.2015.2407320
  • ABSTRACT: We study forward error correction codes for low-delay, real-time streaming communication over packet erasure channels. Our encoder operates on a stream of source packets in a sequential fashion, and the decoder must output each packet in the source stream within a fixed delay. We consider a class of practical channel models with correlated erasures and introduce new “streaming codes” for efficient error correction over these channels. For our analysis, we propose a simplified class of erasure channels that introduce both burst and isolated erasures within the same decoding window. We demonstrate that the previously proposed streaming codes can lead to a significant number of packet losses over such channels. Our proposed constructions involve a layered coding approach, where a burst-erasure code is first constructed, and additional layers of parity-checks are concatenated to recover from the isolated erasure patterns. We also introduce another construction that requires a significantly smaller field-size and decoding complexity, but incurs some performance loss. Numerical simulations over the Gilbert-Elliott and Fritchman channel models indicate that by addressing patterns involving both burst and isolated erasures within the same window, our proposed codes achieve significant gains over previously proposed streaming codes.
    IEEE Journal of Selected Topics in Signal Processing 04/2015; 9(3):501-516. DOI:10.1109/JSTSP.2014.2388191
  • ABSTRACT: Applications involving indirect interpersonal communication, such as collaborative design/assembly/exploration of physical objects, can benefit strongly from the transmission of contact-based haptic media, in addition to the more traditional audiovisual media. Inclusion of haptic media has been shown to improve immersiveness, task performance, and the overall experience of task execution. While several decades of research have been dedicated to the acquisition, processing, coding, and display of audio and video streams, similar aspects for haptic streams have been addressed only recently. Simultaneous masking is a perceptual phenomenon widely exploited in the compression of audio data. In the first part of this paper, to the best of our knowledge, we present first-time empirical evidence for masking in the perception of wideband vibrotactile signals. Our results show that this phenomenon for haptics is very similar to its auditory analog. Signals closer in frequency to a powerful masker (25 dB above detection threshold) are masked more strongly (peak threshold-shifts of up to 28 dB) than those away from the masker (threshold-shifts of 15–20 dB). The masking curves approximately follow the masker's spectral profile. In the second part of this paper, we present a bitrate scalable haptic texture codec, which incorporates the masking model, and describe its subjective and objective performance evaluation. Experiments show that we can drive down the codec output bitrate to a very low value of 2.3 kbps, without the subjects being able to reliably discriminate between the codec input and distorted output texture signals.
    IEEE Journal of Selected Topics in Signal Processing 04/2015; 9(3):462-473. DOI:10.1109/JSTSP.2014.2374574
  • ABSTRACT: An adaptive range image coding algorithm for the geometry compression of large-scale 3D point clouds (LS3DPCs) is proposed in this work. A terrestrial laser scanner generates an LS3DPC by measuring the radial distances of objects in a real-world scene, which can be mapped into a range image. In general, the range image exhibits different characteristics from an ordinary luminance or color image, and thus conventional image coding techniques are not suitable for range image coding. We propose a hybrid range image coding algorithm, which predicts the radial distance of each pixel using previously encoded neighbors adaptively in one of three coordinate domains: range image domain, height image domain, and 3D domain. We first partition an input range image into blocks of various sizes. For each block, we apply multiple prediction modes in the three domains and compute their rate-distortion costs. Then, we perform the prediction of all pixels using the optimal mode and encode the resulting prediction residuals. Experimental results show that the proposed algorithm provides significantly better compression performance on various range images than conventional image or video coding techniques.
    IEEE Journal of Selected Topics in Signal Processing 04/2015; 9(3):422-434. DOI:10.1109/JSTSP.2014.2370752